Building and Evaluating Multicollinearity Models - ISEL-HGU/IntegratedDPModel GitHub Wiki
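This page describes how to reproduce the experiments from the multicollinearity paper. To test whether defect prediction performance degrades when a dataset exhibits multicollinearity, the experiment sets up 11 types of models that either apply or do not apply a multicollinearity removal technique: None, Default-PCA, NSVIF10, NSVIF5, NSVIF4, NSVIF2.5, SVIF10, SVIF5, SVIF4, SVIF2.5, and VCRR. Each model is built and its performance is evaluated.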
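For orientation, the numeric suffixes in the NSVIF and SVIF model names (10, 5, 4, 2.5) presumably denote variance inflation factor (VIF) cutoffs. That reading is an assumption, since this page does not define the names. Under the standard definition, the VIF of metric j is derived from R_j^2, the coefficient of determination obtained by regressing metric j on all the other metrics:

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\qquad\Longleftrightarrow\qquad
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j}
```

Under that definition, cutoffs of 10, 5, 4, and 2.5 correspond to R_j^2 values of 0.9, 0.8, 0.75, and 0.6, so a lower cutoff removes metrics that are less strongly explained by the others.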
git clone https://github.com/ISEL-HGU/MulticollinearityExpTool.git
cd MulticollinearityExpTool
gradle distZip
unzip build/distributions/MulticollinearityExpTool.zip
- Run cd MulticollinearityExpTool/bin/ and, in that directory, execute multisearch_None_PCA_VIF_Eval.sh and multisearch_VCRR_Eval.sh.
- These scripts build the models with parameter tuning applied (the -e, -u, and -v options).
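The step above can be sketched as a small helper that runs both evaluation scripts in order and stops at the first failure. The script names come from this page; the function name run_eval_scripts, the existence check, and the use of sh as the interpreter are illustrative assumptions (use whatever interpreter the scripts' shebang lines name if it differs).

```shell
# Hypothetical helper: run both evaluation scripts from the given bin directory.
run_eval_scripts() {
  dir=$1
  ran=0
  for script in multisearch_None_PCA_VIF_Eval.sh multisearch_VCRR_Eval.sh; do
    if [ ! -f "$dir/$script" ]; then
      echo "missing: $dir/$script" >&2
      return 1
    fi
    # Run from inside the directory in case the script relies on relative paths.
    (cd "$dir" && sh "./$script") || return 1
    ran=$((ran + 1))
  done
  echo "scripts completed: $ran"
}

# Usage, from the directory where the zip was extracted:
#   run_eval_scripts MulticollinearityExpTool/bin
```

Running the scripts one after the other keeps their output separate and makes it obvious which of the two failed.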
usage: MulticollinearityExpTool -c <csv file location>
       -d <data unbalancing mode> [-e <MultiSearch Evaluation option>]
       -f <the number of cross-validation folds> [-h]
       -i <number of cross-validation iterations>
       -m <machine learning model> -o <path> -p <thread pool size>
       -s <file> -t <attribute value> [-u <parameter tuning option>]
       [-v <flag of parameter tuning>]
Multicollinearity paper experiment tool
-c,--csv <csv file location>
    file path of output to output file.
-d,--dataUnbalancingMode <data unbalancing mode>
    1 is noHandling data unbalance or 2 is applying spread subsampling or 3 is applying smote
-e,--evaluation <MultiSearch Evaluation option>
    1 is AUC or 2 is Fmeasure or 3 is MCC or 4 is Precision or 5 is Recall
-f,--fold <the number of cross-validation folds>
    the number of cross-validation folds
-h,--help
    Help
-i,--iter <number of cross-validation iterations>
    number of cross-validation iterations
-m,--model <machine learning model>
    machine learning model
-o,--originaldata <path>
    path to original data before creating cross-validation data
-p,--pool <thread pool size>
    thread pool size
-s,--source <file>
    source arff file path to train a prediction model
-t,--type <attribute value>
    1 is a original dataset or applying PCA or VIF to remove multicollinearity or
    2 is applying Correlation-based feature selection or
    3 is applying Wrapper-based feature selection or
    4 is applying Variable clustering and removing redundant metrics.
-u,--tuning <parameter tuning option>
    parameter tuning option. 1 is GridSearch or 2 is CVParameterSelection or 3 is MultiSearch
-v,--tuningflag <flag of parameter tuning>
    true or false
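As a rough illustration of how the options above combine, the following sketch assembles one invocation and prints it instead of running it. Only the flag names and the numeric codes come from the usage text. The file paths, the fold, iteration, and pool values, the MODEL placeholder, and the launcher name ./MulticollinearityExpTool are all assumptions. The usage text does not list the values that -m accepts, so the placeholder must be replaced before a real run.

```shell
# Hypothetical inputs; replace with real paths and a real model name.
SOURCE=data/example.arff             # -s: source ARFF file used to train the model
ORIGINAL=data/example_original.arff  # -o: original data before cross-validation splits
CSV_OUT=results/example.csv          # -c: CSV output location
MODEL=MODEL_NAME_HERE                # -m: accepted values are not listed in the usage text

# -t 1: original dataset (or PCA/VIF)   -d 1: no handling of data imbalance
# -f 10 -i 10: folds and iterations (illustrative values)
# -p 4: thread pool size (illustrative value)
# -v true -u 3 -e 1: parameter tuning on, MultiSearch, optimizing AUC
set -- \
  -s "$SOURCE" -o "$ORIGINAL" -c "$CSV_OUT" \
  -m "$MODEL" \
  -t 1 -d 1 \
  -f 10 -i 10 \
  -p 4 \
  -v true -u 3 -e 1

# Dry run: print the command. To execute it, run ./MulticollinearityExpTool "$@" instead.
cmd="./MulticollinearityExpTool $*"
echo "$cmd"
```

Printing the command first makes it easy to check the flag values against the option list before starting a long cross-validation run.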