Train Regression Model in Shifu - ShifuML/shifu GitHub Wiki
In most cases, Shifu is used for 0-1 regression, and its pipeline (data binning, data normalization, and variable selection) is designed around that. However, Shifu can also do Linear Regression.
There are two ways to train a regression model in Shifu.
Method 1: Make temporary 0-1 Tag
- Create a temporary 0-1 target column from the original target (how you derive it is up to you).
- Run shifu stats, shifu norm, and shifu varsel as normal.
- After ColumnConfig.json is generated and the final variables are selected, change the temporary target column back to the original target column, and remove the tags in posTags and negTags.
- Add OutputActivationFunc to ModelConfig.json -> train -> params. The value of OutputActivationFunc can be Linear, ReLU, LeakyReLU, or Swish, depending on what you need.
- Rerun the shifu norm and shifu train steps to build the model.
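The first step of Method 1 leaves the choice of 0-1 rule open. As one possible illustration, the sketch below tags a row as 1 when its target is above the median and 0 otherwise. The column names `target` and `tmp_tag`, the helper `add_temporary_tag`, and the median cutoff are all assumptions made for this example; they are not part of Shifu.

```python
from statistics import median


def add_temporary_tag(rows, target="target", tag="tmp_tag"):
    """Return copies of the rows with a temporary 0-1 column added.

    A row gets 1 if its target value is above the median, else 0.
    The median cutoff is only one possible rule (an assumption here).
    """
    cutoff = median(float(r[target]) for r in rows)
    return [{**r, tag: 1 if float(r[target]) > cutoff else 0} for r in rows]


# Hypothetical sample data with a continuous target.
rows = [{"target": 3.5}, {"target": 10.0}, {"target": 7.2}, {"target": 1.1}]
for row in add_temporary_tag(rows):
    print(row)
```

A median split keeps the two classes roughly balanced, which may help the stats and variable selection steps, but any rule that suits your data would do.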
Method 2: Native
- Keep posTags and negTags empty in ModelConfig.json. (Attention: "" is not empty, [] is empty.)
- Use EqualTotal to do binning when running shifu stats.
- Use ONEHOT or ZSCALE_ONEHOT to do data normalization.
- Since IV/KS are all zeros, you can use SE to do variable selection. Or you can use shifu varsel -f <variables.names.file> to select variables manually.
- Add OutputActivationFunc to ModelConfig.json -> train -> params. The value of OutputActivationFunc can be Linear, ReLU, LeakyReLU, or Swish, depending on what you need.
- Rerun the shifu norm and shifu train steps to build the model.
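The two ModelConfig.json changes in the steps above (empty tag lists and the output activation) can be sketched as a small Python helper that patches a config loaded as a dict. This is only an illustration: the helper name `prepare_regression_config` is hypothetical, and placing posTags and negTags under a `dataSet` key is an assumption about the file layout, so check it against your own ModelConfig.json.

```python
import json

# Values listed on this page for OutputActivationFunc.
ALLOWED_ACTIVATIONS = {"Linear", "ReLU", "LeakyReLU", "Swish"}


def prepare_regression_config(config, activation="Linear"):
    """Return a patched copy of a ModelConfig-like dict for native regression."""
    if activation not in ALLOWED_ACTIVATIONS:
        raise ValueError(f"unsupported OutputActivationFunc: {activation}")
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    # Assumption: the tag lists live under "dataSet" in ModelConfig.json.
    data_set = updated.setdefault("dataSet", {})
    data_set["posTags"] = []  # must be an empty list, not ""
    data_set["negTags"] = []
    # ModelConfig.json -> train -> params, as described in the steps above.
    params = updated.setdefault("train", {}).setdefault("params", {})
    params["OutputActivationFunc"] = activation
    return updated


# Hypothetical minimal config, used only to show the effect of the helper.
sample = {"dataSet": {"posTags": ["1"], "negTags": ["0"]}, "train": {"params": {}}}
print(json.dumps(prepare_regression_config(sample, "ReLU"), indent=2))
```

Writing the tags as `[]` in Python serializes to `[]` in JSON, which is the form the first step requires.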
GBDT Regression Support
GBDT natively supports regression if impurity is set to variance. Follow the steps above to prepare the data and configuration, then run 'shifu train' with GBDT to train a regression model. In the 'eval' step, gbtScoreConvertStrategy needs to be set to RAW so that the final output is not passed through a sigmoid:
"evals" : [ {
"name" : "Eval1",
"dataSet" : {
...
},
"gbtScoreConvertStrategy" : "RAW",
...
} ]
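To see why the raw score matters for regression, the sketch below applies a standard logistic sigmoid to a few made-up raw scores. It is not Shifu's scoring code, and the exact conversion Shifu applies is not shown on this page; the point is only that any sigmoid squashes values into the 0 to 1 range, which discards the scale of a continuous target.

```python
import math


def sigmoid(score):
    """Numerically stable logistic sigmoid."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


# Hypothetical raw regression outputs on the scale of a continuous target.
raw_scores = [-120.0, 0.0, 35.5, 250.0]
for raw in raw_scores:
    print(f"raw={raw:8.1f}  sigmoid={sigmoid(raw):.6f}")
```

Every converted value lands between 0 and 1, so very different raw predictions become nearly indistinguishable after conversion.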