Machine Learning Workflow - ignacio-alorre/Data-Science GitHub Wiki
Workflow to follow in Applied Machine Learning
1 - Problem Definition
- Problem Description
- Informal description
- Formal description
- Assumptions
- Provided Data
- Constraints imposed on data
- Attribute definition
- Motivation
- Motivation
- Benefits
- Use
- Manual Solution
2 - Analyze Data
- Summarize Data
- Data Structure
- Data Distribution
- Visualize Data
- Attribute Histograms
- Pairwise scatterplots of attributes
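As a rough illustration of the "Summarize Data" and "Attribute Histograms" steps, the sketch below computes basic distribution statistics and equal-width bin counts for one attribute using only the Python standard library. The toy dataset, the attribute names and the helpers `summarize` and `histogram` are assumptions made for this example; in practice a plotting library would normally draw the histograms and scatterplots.

```python
import statistics
from collections import Counter

# Hypothetical toy dataset: each row is (height_cm, weight_kg, label).
rows = [
    (150.0, 50.0, "a"),
    (160.0, 60.0, "a"),
    (170.0, 65.0, "b"),
    (180.0, 80.0, "b"),
    (190.0, 95.0, "b"),
]

def summarize(values):
    """Return basic distribution statistics for one numeric attribute."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

def histogram(values, bins):
    """Count values in equal-width bins; the maximum falls in the last bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant attribute
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts

heights = [r[0] for r in rows]
height_summary = summarize(heights)        # data distribution
height_hist = histogram(heights, bins=2)   # text stand-in for a histogram plot
class_counts = Counter(r[2] for r in rows) # class balance
```

Checking the class balance alongside the per-attribute summary is a cheap way to spot skewed data before any algorithm is evaluated.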
3 - Prepare Data
- Select Data
- Preprocess Data
- Formatting
- Cleaning
- Sampling
- Transform Data
- Scaling
- Decomposition
- Aggregation
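The "Cleaning" and "Scaling" steps above can be sketched as follows. This is a minimal example under stated assumptions: missing values are represented as `None`, incomplete rows are simply dropped, and the helper names `drop_incomplete`, `min_max_scale` and `standardize` are hypothetical. Other cleaning strategies, such as imputation, may suit a given problem better.

```python
import statistics

def drop_incomplete(rows):
    """Cleaning: discard rows containing a missing (None) value."""
    return [row for row in rows if all(v is not None for v in row)]

def min_max_scale(values):
    """Scaling: map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # constant attribute has no spread
    return [(v - lo) / span for v in values]

def standardize(values):
    """Scaling: shift to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

# Hypothetical raw data with one incomplete row.
raw = [(1.0, 10.0), (2.0, None), (3.0, 30.0), (5.0, 50.0)]
clean = drop_incomplete(raw)
first_attribute = [row[0] for row in clean]
scaled = min_max_scale(first_attribute)
standardized = standardize(first_attribute)
```

Min-max scaling keeps the original shape of the distribution within fixed bounds, while standardization is often preferred when an algorithm assumes roughly centered inputs.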
4 - Evaluate Algorithms
- Test Harness and Options
- Explore and select algorithms
- Interpret and report results
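One common form of test harness is k-fold cross-validation, sketched below with a majority-class baseline standing in for a real algorithm. The page does not prescribe this particular harness, so treat it as an assumption. The labels and the helper names `k_fold_indices`, `majority_class` and `cross_validate` are hypothetical, and the sketch assumes `k` is no larger than the number of rows.

```python
from collections import Counter

def k_fold_indices(n_rows, k):
    """Split row indices 0..n_rows-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_rows, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def majority_class(train_labels):
    """Baseline 'algorithm': always predict the most common training label."""
    return Counter(train_labels).most_common(1)[0][0]

def cross_validate(labels, k):
    """Return the accuracy of the baseline on each held-out fold."""
    scores = []
    for fold in k_fold_indices(len(labels), k):
        held_out = set(fold)
        train = [y for i, y in enumerate(labels) if i not in held_out]
        prediction = majority_class(train)
        correct = sum(1 for i in fold if labels[i] == prediction)
        scores.append(correct / len(fold))
    return scores

# Hypothetical labels for nine rows.
labels = ["ham", "spam", "ham", "ham", "ham", "ham", "spam", "ham", "ham"]
fold_scores = cross_validate(labels, k=3)
mean_accuracy = sum(fold_scores) / len(fold_scores)
```

A baseline score like this gives the reference point against which candidate algorithms are compared when interpreting and reporting results. Real data would usually be shuffled before it is split into folds.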
5 - Improve Results
- Algorithm Tuning
- Ensemble methods
- Bagging
- Boosting
- Blending
- Extreme Feature Engineering
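Of the ensemble methods listed above, bagging is the simplest to sketch: train several models on bootstrap samples of the data and combine them by majority vote. The example below uses a one-dimensional threshold classifier (a decision stump) as the base model. The dataset, the seed and the helper names `train_stump`, `bagging_fit` and `bagging_predict` are all assumptions made for illustration. Boosting and blending are not shown.

```python
import random
from collections import Counter

def train_stump(sample):
    """Fit a 1-D threshold classifier that predicts 1 when x >= threshold.

    Tries each observed x as the threshold and keeps the one with fewest errors.
    """
    best_threshold, best_errors = None, None
    for threshold, _ in sample:
        errors = sum(1 for x, y in sample if (1 if x >= threshold else 0) != y)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def bagging_fit(data, n_models, rng):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]
        models.append(train_stump(sample))
    return models

def bagging_predict(models, x):
    """Combine the stumps by majority vote."""
    votes = Counter(1 if x >= threshold else 0 for threshold in models)
    return votes.most_common(1)[0][0]

# Hypothetical data: (x, label) pairs, separable around x = 5.
data = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0),
        (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
rng = random.Random(0)  # fixed seed so the run is repeatable
models = bagging_fit(data, n_models=25, rng=rng)
low_prediction = bagging_predict(models, 0.0)
high_prediction = bagging_predict(models, 10.0)
```

Using an odd number of models avoids tied votes in a two-class problem.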
6 - Present Results
- Present Results
- Context
- Problem
- Solution
- Findings
- Limitations
- Conclusions
- Operationalize Algorithm