M_q - arcturus9/useful-link GitHub Wiki
Data Analysis
- Per-stock-code statistics ==> compare the mean and standard deviation of Y between two stock codes
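
A minimal sketch of the per-code statistics, assuming the data sits in a pandas DataFrame. The column names `code` and `Y` and the toy values are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy data: `code` is the stock code, `Y` is the target.
df = pd.DataFrame({
    "code": ["A", "A", "A", "B", "B", "B"],
    "Y":    [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})

# Mean and sample standard deviation of Y for each stock code.
stats = df.groupby("code")["Y"].agg(["mean", "std"])

# Side-by-side comparison of two codes.
comparison = stats.loc[["A", "B"]]
print(comparison)
```

If the two codes differ strongly in mean or spread, that is an argument for fitting a separate scaler per code in the preprocessing step.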
Data Preprocessing
- Scale features with a MinMax or Standard scaler ==> X_train_s2, X_train_s3, X_train_s4, ...
- Per-stock-code statistics ==> fit a separate scaler for each stock code ==> X_train_ScaleByCode1
- Generate 2-day and 3-day sets: {[D-0] + [D-1]}, {[D-0] + [D-1] + [D-2]}, ... ==> X_train_2days, X_train_3days
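
The per-code scaling and the multi-day sets could be built roughly as follows. This is a sketch under assumptions: the column names (`code`, `f1`, `f2`), the helper names `scale_by_code` and `stack_days`, and the toy values are hypothetical, and rows are assumed to be sorted by day within each stock code.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy frame: one row per (code, day), sorted by day within each code.
df = pd.DataFrame({
    "code": ["A"] * 4 + ["B"] * 4,
    "f1":   [1.0, 2.0, 3.0, 4.0, 100.0, 200.0, 300.0, 400.0],
    "f2":   [0.0, 1.0, 0.0, 1.0, 5.0, 5.0, 6.0, 6.0],
})
features = ["f1", "f2"]


def scale_by_code(frame, cols, key="code"):
    """Fit one StandardScaler per stock code and scale that code's rows with it."""
    out = frame.copy()
    scalers = {}
    for code, idx in frame.groupby(key).groups.items():
        scaler = StandardScaler()
        out.loc[idx, cols] = scaler.fit_transform(frame.loc[idx, cols].to_numpy())
        scalers[code] = scaler  # keep the fitted scaler to transform test rows later
    return out, scalers


def stack_days(frame, cols, n_days, key="code"):
    """Concatenate [D-0], [D-1], ..., [D-(n_days-1)] features for each row."""
    parts = []
    for lag in range(n_days):
        # Shift within each code so one stock's history never leaks into another's.
        shifted = frame[cols] if lag == 0 else frame.groupby(key)[cols].shift(lag)
        parts.append(shifted.rename(columns={c: f"{c}_d{lag}" for c in cols}))
    # Rows without a full n_days history contain NaN and are dropped.
    return pd.concat(parts, axis=1).dropna()


X_train_ScaleByCode1, scalers = scale_by_code(df, features)
X_train_2days = stack_days(df, features, 2)
X_train_3days = stack_days(df, features, 3)
```

Because `stack_days` drops the first rows of every code, the target vector has to be re-aligned with the surviving index (for example `y.loc[X_train_2days.index]`) before training.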
Selecting Baseline Model
- Cross validation (ExtraTreesClassifier)
- Model selection among ExtraTreesClassifier, AdaBoost, Gradient Boosting (GBM), and Random Forests (using GridSearch and EstimatorHelper)
- Hyper-parameter Optimization for Best Model
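
The three steps above can be sketched with scikit-learn as follows. The synthetic data, the parameter grids, and the plain loop over a dictionary of candidates are assumptions for illustration; the loop does not reproduce the EstimatorHelper mentioned above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import GridSearchCV, cross_val_score

# Synthetic stand-in for X_train / y_train.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Step 1: cross-validate a single ExtraTreesClassifier as a first reference point.
baseline_scores = cross_val_score(
    ExtraTreesClassifier(n_estimators=50, random_state=0), X, y, cv=3
)

# Step 2: grid-search each candidate family (small illustrative grids).
candidates = {
    "ExtraTrees": (
        ExtraTreesClassifier(random_state=0),
        {"n_estimators": [25, 50], "max_depth": [None, 5]},
    ),
    "RandomForest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [25, 50], "max_depth": [None, 5]},
    ),
    "AdaBoost": (
        AdaBoostClassifier(random_state=0),
        {"n_estimators": [25, 50], "learning_rate": [0.5, 1.0]},
    ),
    "GradientBoosting": (
        GradientBoostingClassifier(random_state=0),
        {"n_estimators": [25, 50], "learning_rate": [0.05, 0.1]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=3, scoring="accuracy")
    search.fit(X, y)
    results[name] = search

# Step 3: pick the family with the best cross-validated score.
best_name = max(results, key=lambda n: results[n].best_score_)
best_model = results[best_name].best_estimator_
print(best_name, results[best_name].best_params_)
```

A finer grid (or a randomized search) around `best_params_` would then serve as the hyper-parameter optimization for the selected model.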
Compare Different X_train_xx Sets for Selected Baseline Models
- X_train_s2, X_train_ScaleByCode1, X_train_2days...
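
One way to run this comparison is to score every feature set with the same model and the same folds. The sketch below uses synthetic data and a hypothetical helper `compare_feature_sets`; the variant names only stand in for X_train_s2, X_train_ScaleByCode1, and so on.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic stand-in for the raw training matrix and labels.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Pre-built variants of the same rows. For simplicity the scalers are fit on all
# rows here; a stricter setup would fit them inside each training fold.
variants = {
    "raw": X,
    "minmax": MinMaxScaler().fit_transform(X),
    "standard": StandardScaler().fit_transform(X),
}


def compare_feature_sets(variants, y, model, cv=3):
    """Return the mean cross-validated accuracy for each feature set."""
    scores = {}
    for name, X_variant in variants.items():
        # clone() gives every variant a fresh, unfitted copy of the same model.
        fold_scores = cross_val_score(clone(model), X_variant, y, cv=cv)
        scores[name] = fold_scores.mean()
    return scores


model = ExtraTreesClassifier(n_estimators=50, random_state=0)
scores = compare_feature_sets(variants, y, model)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

Tree-based models are largely insensitive to monotonic rescaling of individual features, so global scaling may change little for them; per-code scaling and the multi-day sets alter the information in each row and are more likely to move the score.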
Deep learning
Ensemble DNN + (AdaBoost, GBM)
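
The notes do not say how the models are combined. As one possibility, the sketch below averages predicted probabilities (soft voting) with scikit-learn's `VotingClassifier`. The `MLPClassifier` is only a small stand-in for the DNN, and the data and all hyper-parameters are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    VotingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the training data.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Stand-in for the DNN: a small multilayer perceptron on standardized inputs.
dnn = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0),
)

# Soft voting averages the class probabilities of the three models.
ensemble = VotingClassifier(
    estimators=[
        ("dnn", dnn),
        ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
        ("gbm", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)

accuracy = ensemble.score(X_test, y_test)
proba = ensemble.predict_proba(X_test)
print(f"ensemble accuracy: {accuracy:.3f}")
```

A DNN trained in a separate framework can be combined in the same spirit by averaging its predicted probabilities with those of the boosted models by hand.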