November 7, 2014

CRISP-DM

The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a framework that outlines the main tasks of a data-mining project. It is written mostly from a business point of view and does not prescribe technical details such as specific algorithms or tools.


Here are the six phases, with the corresponding tasks:

1) Business Understanding:
     - defining business goals
     - assessing the situation
     - defining data-mining goals
     - creating the project plan

2) Data Understanding:
     - gathering data
     - describing data (overall)
     - exploring data (variable by variable)
     - verifying data quality

3) Data Preparation:
     - selecting data
     - cleaning data
     - constructing data (e.g., deriving new variables)
     - integrating data (merging data into one dataset)
     - formatting data

4) Modeling:
     - selecting modeling techniques
     - designing tests
     - building models
     - assessing models

5) Evaluation:
     - evaluating results
     - reviewing the process
     - determining the next step

6) Deployment:
     - planning deployment
     - planning monitoring and maintenance
     - reporting final results
     - reviewing the project
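
CRISP-DM itself stays tool-agnostic, but the more technical phases (2 to 4) can be made concrete with a small sketch. The Python example below is only an illustration under stated assumptions: the customer records, the variable names (age, income, churned), the derived variable, and the one-variable threshold model are all hypothetical and are not part of the framework.

```python
import statistics

# Hypothetical customer records (made up for illustration); None marks a missing value.
records = [
    {"age": 22, "income": 18000, "churned": 1},
    {"age": 35, "income": 52000, "churned": 0},
    {"age": 41, "income": None, "churned": 0},
    {"age": 29, "income": 24000, "churned": 1},
    {"age": 50, "income": 61000, "churned": 0},
    {"age": 33, "income": 30000, "churned": 1},
    {"age": 46, "income": 58000, "churned": 0},
    {"age": 27, "income": 21000, "churned": 1},
]

# --- 2) Data Understanding ---
def missing_counts(rows):
    """Verify data quality: count missing values per variable."""
    return {key: sum(r[key] is None for r in rows) for key in rows[0]}

def describe(rows, key):
    """Explore one variable: summary statistics over its non-missing values."""
    values = [r[key] for r in rows if r[key] is not None]
    return {"n": len(values), "mean": statistics.mean(values),
            "min": min(values), "max": max(values)}

# --- 3) Data Preparation ---
def prepare(rows):
    """Clean (drop incomplete rows) and construct one derived variable."""
    clean = [dict(r) for r in rows if None not in r.values()]
    for r in clean:
        r["income_per_year_of_age"] = r["income"] / r["age"]
    return clean

# --- 4) Modeling ---
def split(rows, test_every=3):
    """Design the test: hold out every `test_every`-th row for assessment."""
    train = [r for i, r in enumerate(rows) if (i + 1) % test_every != 0]
    test = [r for i, r in enumerate(rows) if (i + 1) % test_every == 0]
    return train, test

def fit_threshold(train, key="income"):
    """Build a model: the midpoint between the two class means of one variable."""
    churn = [r[key] for r in train if r["churned"] == 1]
    stay = [r[key] for r in train if r["churned"] == 0]
    return (statistics.mean(churn) + statistics.mean(stay)) / 2

def accuracy(test, threshold, key="income"):
    """Assess the model: share of held-out rows classified correctly.

    Assumption: values below the threshold are predicted as churned.
    """
    hits = sum((r[key] < threshold) == bool(r["churned"]) for r in test)
    return hits / len(test)

quality = missing_counts(records)
prepared = prepare(records)
train, test = split(prepared)
threshold = fit_threshold(train)
score = accuracy(test, threshold)
print(quality, describe(records, "income"), len(prepared), threshold, score)
```

The functions map loosely onto individual tasks from the lists above (verifying data quality, exploring data, cleaning and constructing data, designing tests, building and assessing models). Business Understanding, Evaluation, and Deployment are left out on purpose, since they are mostly organizational rather than computational.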
