CohMetrix Model 3exp Variable Importance - shmercer/writeAlizer GitHub Wiki

Ensemble Weightings and Metric Importance

Coh-Metrix Model 3exp

This model used Coh-Metrix scores from 15-minute expository writing samples produced by 200 students in Grades 2-5 to predict the holistic writing quality of those samples (theta scores calculated from paired comparisons).

Highly correlated Coh-Metrix metrics (|r| > .90) were excluded during pre-processing (see the section on Scoring Model Development for more details).
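
To illustrate the kind of filter described above, here is a minimal Python sketch of a greedy correlation filter with a .90 cutoff. It is not the procedure used to build this model, which is documented in the Scoring Model Development section; the function names, the greedy keep-first rule, and the toy data are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_highly_correlated(columns, cutoff=0.90):
    """Keep a column only if |r| <= cutoff with every column already kept.

    Hypothetical helper: a simple greedy rule, not necessarily the rule
    used in the original pre-processing.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= cutoff for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: metric_b is an exact multiple of metric_a (r = 1).
toy = {
    "metric_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "metric_b": [2.0, 4.0, 6.0, 8.0, 10.0],
    "metric_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(drop_highly_correlated(toy))
```

In this toy example, metric_b is dropped because it is perfectly correlated with metric_a, while metric_c (r = -.30 with metric_a) is retained. Which member of a correlated pair is kept depends on column order under this greedy rule.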

Algorithm Weightings in Ensemble

Abbreviations:

  • overall = ensemble model
  • pls = partial least squares regression
  • gbm = stochastic gradient boosted trees
  • mars = bagged multivariate adaptive regression splines
  • cube = cubist regression

The table below presents the intercept and the linear weight of each algorithm in the ensemble model.

Intercept pls mars gbm cube
-0.0577 0.1306 0.3136 0.3991 0.1752
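
As a rough illustration of how a linear ensemble of this kind combines its base learners, the Python sketch below applies the intercept and weights from the table to a set of per-algorithm predictions. It assumes each base prediction is already on the same theta scale as the outcome; the function name and the example predictions are hypothetical.

```python
# Intercept and weights copied from the table above.
INTERCEPT = -0.0577
WEIGHTS = {"pls": 0.1306, "mars": 0.3136, "gbm": 0.3991, "cube": 0.1752}


def ensemble_predict(base_predictions):
    """Return intercept + weighted sum of the per-algorithm predictions."""
    missing = set(WEIGHTS) - set(base_predictions)
    if missing:
        raise ValueError(f"missing base predictions: {sorted(missing)}")
    return INTERCEPT + sum(
        weight * base_predictions[name] for name, weight in WEIGHTS.items()
    )


# Hypothetical base-learner predictions for one writing sample.
example = {"pls": 0.5, "mars": 0.5, "gbm": 0.5, "cube": 0.5}
print(ensemble_predict(example))
```

Because gbm and mars carry the largest weights, their predictions move the ensemble score the most under this formula.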

Metric Importance in Each Algorithm and Ensemble

Each column sums to 100 (within rounding), so each value can be read as a metric's percent contribution to that model.

Metric overall mars pls gbm cube
DESWC 26.13 25.33 5.56 47.08 15.82
LSAGN 3.77 0 2.75 5.56 4.35
LDTTRa 3.68 0 4.12 4.55 3.68
DESSLd 3.13 9.32 1.26 2.42 3.51
DESWLlt 2.98 10.88 0.99 1.35 4.35
WRDPRP2 2.25 0 2.09 2.72 3.18
LDVOCD 2.16 2.46 3.82 0.71 2.26
DRPP 2.11 0 2.07 3.28 1.09
WRDPOLc 2.1 12.49 0.93 0.31 0.5
LDTTRc 1.96 0 3.31 1.89 1.17
SMCAUSwn 1.62 0 0.98 2.12 2.85
WRDNOUN 1.61 3.23 1.38 0.53 3.26
WRDPRP1s 1.55 5.37 1.05 0.83 1.26
WRDPRP1p 1.5 5.2 0.08 0.9 2.68
CNCTemp 1.39 7.18 1.01 0.27 0.33
PCNARz 1.37 0 2.21 0.63 2.59
LSASS1d 1.29 4.14 1.67 0.51 0.25
PCREFz 1.27 5.37 1.14 0.18 0.92
PCCONNz 1.25 0 1.16 2.11 0.42
DRNP 1.22 3.67 0.17 1.18 1.34
WRDMEAc 1.2 5.37 0.64 0.47 0.75
WRDFRQa 1.19 0 1.04 1.52 1.59
PCCONNp 1.19 0 2.2 0.62 1.59
SYNMEDpos 1.19 0 1.84 0.24 3.1
WRDHYPn 1.18 0 0.89 1.18 2.59
DESPL 1.03 0 2.71 0.38 0.25
WRDHYPnv 1.02 0 0.83 0.73 2.76
PCCNCz 0.94 0 1.35 0.26 2.43
RDL2 0.93 0 1.89 0.83 0.17
LSASSp 0.9 0 1.68 0.69 0.67
PCVERBz 0.9 0 1.38 0.38 1.92
WRDHYPv 0.87 0 0.92 0.94 1.26
LSASSpd 0.84 0 1.66 0.21 1.42
WRDADJ 0.82 0 1.39 0.89 0.25
CRFCWOa 0.81 0 1.79 0.37 0.67
CRFANPa 0.78 0 1.28 0.52 1.09
LSAGNd 0.77 0 2.06 0.08 0.59
PCREFp 0.77 0 0.97 0 2.76
CRFAOa 0.74 0 1.87 0.02 0.92
DESSL 0.74 0 0.94 0.46 1.59
CRFCWO1d 0.69 0 1.77 0.29 0.17
SYNNP 0.69 0 1.25 0.31 1.09
CRFNOa 0.64 0 1.2 0.47 0.5
PCTEMPp 0.62 0 1.26 0.49 0.25
WRDAOAc 0.61 0 1.21 0.47 0.33
CNCNeg 0.6 0 1.7 0.2 0
DRAP 0.58 0 1.08 0.21 0.92
DRGERUND 0.57 0 0.97 0.69 0
PCDCz 0.55 0 1.33 0.16 0.42
PCDCp 0.53 0 1.35 0.09 0.42
DESWLsy 0.53 0 0.58 0.34 1.26
PCSYNz 0.51 0 0.75 0.12 1.34
DESWLltd 0.5 0 0.25 0.53 1.26
SMCAUSr 0.47 0 1.38 0.11 0
WRDFRQmc 0.47 0 1.21 0.11 0.33
WRDIMGc 0.45 0 0.32 0.27 1.42
LDMTLD 0.45 0 0.51 0.64 0.25
SMCAUSlsa 0.44 0 0.33 0.24 1.42
SYNSTRUTa 0.43 0 1.13 0.22 0
WRDADV 0.42 0 0.56 0.12 1.17
CNCTempx 0.37 0 1.07 0.09 0
WRDFRQc 0.37 0 0.68 0.16 0.59
CNCLogic 0.36 0 0.95 0.17 0
SMINTEr 0.36 0 1.12 0.03 0
DRINF 0.36 0 0.89 0.21 0
DESWLsyd 0.36 0 0.28 0.55 0.33
SMINTEp 0.35 0 0.99 0.07 0.08
SMCAUSvp 0.33 0 0.99 0.06 0
CNCPos 0.33 0 0.17 0.61 0.25
PCVERBp 0.31 0 0.49 0.01 0.92
WRDPRP3s 0.29 0 0.42 0.26 0.33
PCCNCp 0.29 0 0.91 0.03 0
WRDPRO 0.29 0 0.64 0.14 0.25
WRDFAMc 0.26 0 0.55 0.24 0
DRVP 0.25 0 0.57 0.18 0
SMCAUSv 0.23 0 0.69 0.04 0
SYNLE 0.2 0 0.03 0.33 0.33
PCSYNp 0.2 0 0.53 0.02 0.17
WRDPRP3p 0.19 0 0.25 0.29 0
CNCCaus 0.18 0 0.47 0.09 0
WRDVERB 0.17 0 0.1 0.34 0
DRNEG 0.03 0 0 0.08 0
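
As a quick check of the statement that each column sums to 100, the Python sketch below adds up the nonzero mars values copied from the table above; every other metric is listed as 0 for mars. The variable names are illustrative only.

```python
# Nonzero mars importance values copied from the table above.
mars_importance = {
    "DESWC": 25.33,
    "WRDPOLc": 12.49,
    "DESWLlt": 10.88,
    "DESSLd": 9.32,
    "CNCTemp": 7.18,
    "WRDPRP1s": 5.37,
    "PCREFz": 5.37,
    "WRDMEAc": 5.37,
    "WRDPRP1p": 5.2,
    "LSASS1d": 4.14,
    "DRNP": 3.67,
    "WRDNOUN": 3.23,
    "LDVOCD": 2.46,
}

total = sum(mars_importance.values())

# The total differs slightly from 100 because table values are rounded.
print(f"mars column total: {total:.2f}")

# Rank metrics from most to least important for mars.
ranked = sorted(mars_importance.items(), key=lambda item: item[1], reverse=True)
print([name for name, _ in ranked[:3]])
```

Only 13 metrics have nonzero mars importance, so that column is far more concentrated than the pls column, where almost every metric receives some weight.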