The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a framework that outlines the main tasks underlying the data-mining process. It is written mostly from the business point of view and does not prescribe technical details.
Here are the six phases, with the corresponding tasks:
1) Business Understanding:
- defining business goals
- assessing the situation
- defining data-mining goals
- creating the project plan
2) Data Understanding:
- gathering data
- describing data (overall)
- exploring data (variable by variable)
- verifying data quality
3) Data Preparation:
- selecting data
- cleaning data
- constructing data (e.g., deriving new variables)
- integrating data (merging data into one dataset)
- formatting data
4) Modeling:
- selecting modeling techniques
- designing tests
- building models
- assessing models
5) Evaluation:
- evaluating results
- reviewing the process
- determining the next step
6) Deployment:
- planning deployment
- planning monitoring and maintenance
- reporting final results
- reviewing final results
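
CRISP-DM itself leaves the technical side open, so the following is only a rough sketch of what the five Data Preparation tasks from phase 3 might look like in plain Python. The records, the field names (`age`, `amount`, `is_senior`, `total_spent`) and the `prepare` helper are all made up for illustration; a real project would more likely use a data-frame library and its own rules for cleaning.

```python
# Hypothetical toy data: every field name and value here is invented.
customers = [
    {"id": 1, "name": "Ann", "age": "34"},
    {"id": 2, "name": "Bob", "age": ""},   # missing age
    {"id": 3, "name": "Cid", "age": "51"},
]
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.5},
    {"customer_id": 3, "amount": 12.0},
]


def prepare(customers, orders):
    """Walk through the Data Preparation tasks on the toy records."""
    # Selecting data: keep only the fields assumed to matter for modeling.
    selected = [{"id": c["id"], "age": c["age"]} for c in customers]

    # Cleaning data: drop records with a missing age, convert age to int.
    cleaned = [{"id": c["id"], "age": int(c["age"])} for c in selected if c["age"]]

    # Constructing data: derive a new variable (the threshold is an assumption).
    for row in cleaned:
        row["is_senior"] = row["age"] >= 50

    # Integrating data: merge per-customer order totals into one dataset.
    totals = {}
    for o in orders:
        totals[o["customer_id"]] = totals.get(o["customer_id"], 0.0) + o["amount"]
    merged = [{**row, "total_spent": totals.get(row["id"], 0.0)} for row in cleaned]

    # Formatting data: fixed column order, rows sorted by id.
    columns = ["id", "age", "is_senior", "total_spent"]
    ordered = sorted(merged, key=lambda r: r["id"])
    return [tuple(row[col] for col in columns) for row in ordered]


prepared = prepare(customers, orders)
for row in prepared:
    print(row)
```

Each comment names the task it stands in for, in the same order as the list above. Dropping incomplete records is just one possible cleaning choice; imputing the missing value would be another.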
