CohMetrix Model 2b Variable Importance - shmercer/writeAlizer GitHub Wiki

Ensemble Weightings and Metric Importance

Coh-Metrix Model 2b

This model used Coh-Metrix scores from 7-minute narrative writing samples (prompt: "I once had a magic pencil and ...") written by 131 students in the winter of Grades 2-5 (Mercer et al., 2019) to predict holistic writing quality on those samples (Elo ratings calculated from paired comparisons).

Highly correlated Coh-Metrix metrics (|r| > .90) were excluded during pre-processing (see the section on Scoring Model Development for more details).
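As a rough illustration of this kind of exclusion rule, the sketch below keeps a metric only if its absolute Pearson correlation with every previously kept metric is at most .90. The greedy order, the function names, and the toy data are all assumptions made for illustration; the procedure actually used for Model 2b is described in the Scoring Model Development section and may differ.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def drop_highly_correlated(columns, cutoff=0.90):
    """Greedy filter: keep a metric only if |r| <= cutoff with every kept metric.

    `columns` maps a metric name to its list of scores (one per writing sample).
    Illustrative rule only, not necessarily the one used for Model 2b.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= cutoff for k in kept):
            kept.append(name)
    return kept


# Made-up scores for three hypothetical metrics across five samples.
toy = {
    "metric_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "metric_b": [2.1, 4.0, 6.2, 8.1, 9.9],  # nearly a multiple of metric_a
    "metric_c": [3.0, 1.0, 4.0, 1.0, 5.0],
}
kept_metrics = drop_highly_correlated(toy)
```

In this toy data, metric_b is dropped because it is almost perfectly correlated with metric_a, while metric_c is retained.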

Mercer, S. H., Keller-Margulis, M. A., Faith, E. L., Reid, E. K., & Ochs, S. (2019). The potential for automated text evaluation to improve the technical adequacy of written expression curriculum-based measurement. Learning Disability Quarterly, 42, 117-128. https://doi.org/10.1177/0731948718803296

Algorithm Weightings in Ensemble

Abbreviations:

  • overall = ensemble model (all algorithms combined)
  • mars = bagged multivariate adaptive regression splines
  • gbm = stochastic gradient boosted trees
  • svm = support vector machines
  • cube = cubist regression

The table below gives the intercept and the linear weight assigned to each algorithm's predictions in the ensemble model.

Intercept mars gbm svm cube
-7.2585 0.2289 0.5300 0.1527 0.1150
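A minimal sketch of how such weights could be applied, assuming the ensemble prediction is the intercept plus the weighted sum of the four algorithms' predictions. The function name and the example predictions are hypothetical; only the intercept and weights come from the table above.

```python
# Intercept and weights copied from the table above.
INTERCEPT = -7.2585
WEIGHTS = {"mars": 0.2289, "gbm": 0.5300, "svm": 0.1527, "cube": 0.1150}


def ensemble_score(base_predictions):
    """Combine per-algorithm predictions into one ensemble prediction.

    Assumes a plain linear combination: intercept + sum(weight * prediction).
    """
    missing = set(WEIGHTS) - set(base_predictions)
    if missing:
        raise ValueError(f"missing predictions for: {sorted(missing)}")
    return INTERCEPT + sum(WEIGHTS[name] * base_predictions[name] for name in WEIGHTS)


# Hypothetical base-learner predictions for one writing sample (not real data).
example = {"mars": 100.0, "gbm": 100.0, "svm": 100.0, "cube": 100.0}
score = ensemble_score(example)
```

Because the four weights sum to 1.0266 rather than exactly 1, the negative intercept partly offsets the slight upscaling of the base predictions.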

Metric Importance in Each Algorithm and Ensemble

Each column sums to 100, so each value can be read as that metric's percentage contribution to the corresponding model.

Detailed information on Coh-Metrix abbreviations and indices is available here.

Metric overall mars gbm svm cube
DESWC 30.39 45.46 34.5 4.37 16.04
LSAGN 7.18 0 9.31 2.85 17.43
DESWLlt 6.73 19 2.42 1.37 9.31
LDMTLD 5.59 0 8.91 2.47 5.54
SYNLE 5.43 13.58 3.65 1.79 2.18
WRDIMGc 3.6 9.36 1.84 1.22 3.37
WRDNOUN 3.19 6.64 1.38 1.77 6.53
CNCAdd 1.87 5.96 0.45 1.02 1.39
WRDVERB 1.42 0 2.02 1.67 1.19
SMCAUSwn 1.25 0 1.83 2.04 0
DESWLltd 1.16 0 1.21 1.53 2.77
CRFCWO1d 1.09 0 1.18 2.17 1.39
WRDHYPnv 1.02 0 0.98 1.36 2.77
CRFNOa 0.99 0 1.37 1.9 0
RDFRE 0.99 0 1.54 0.25 1.39
LSAGNd 0.85 0 0.53 1.76 2.77
DESWLsy 0.77 0 1.11 1.28 0
SYNMEDpos 0.76 0 0.29 2 2.77
WRDPRP3s 0.74 0 0.91 1.53 0.4
DESPL 0.73 0 0.44 2.34 1.39
CNCAll 0.72 0 0.44 0.96 3.17
RDL2 0.71 0 0.6 1.65 1.39
PCCNCz 0.71 0 0.62 1.6 1.39
PCVERBz 0.69 0 0.82 1.78 0
WRDPRO 0.69 0 0.69 1.18 1.39
CNCTemp 0.65 0 0.71 1.87 0
WRDFRQc 0.65 0 0.98 0.99 0
WRDFRQmc 0.63 0 0.56 1.25 1.39
DRVP 0.62 0 0.93 0.92 0
LSASS1d 0.61 0 0.65 1.87 0
SMCAUSlsa 0.6 0 0.94 0.77 0
CRFCWOad 0.6 0 0.61 1.92 0
WRDMEAc 0.59 0 0.53 1.4 0.99
PCTEMPp 0.56 0 0.78 1.02 0
SMCAUSv 0.56 0 0.85 0.83 0
PCCNCp 0.56 0 0.02 1.62 2.77
WRDFRQa 0.53 0 0.73 1.01 0
LSASSp 0.53 0 0.54 1.71 0
LDTTRc 0.52 0 0.64 1.3 0
DRPP 0.51 0 0.7 1.02 0
PCREFp 0.5 0 0 0.7 3.56
CRFCWO1 0.47 0 0.4 1.75 0
SMCAUSvp 0.46 0 0.47 1.42 0
PCNARz 0.45 0 0.26 1.64 0.59
SYNNP 0.45 0 0.3 0.91 1.39
PCSYNz 0.45 0 0.32 0.87 1.39
SMINTEp 0.45 0 0.4 1.66 0
LDTTRa 0.43 0 0.23 1.06 1.39
DESWLsyd 0.42 0 0.35 1.62 0
CRFANPa 0.41 0 0.34 1.59 0
SMINTEr 0.41 0 0.6 0.69 0
CNCLogic 0.4 0 0.51 0.89 0
WRDAOAc 0.4 0 0.58 0.69 0
WRDHYPv 0.4 0 0.36 1.45 0
CNCNeg 0.4 0 0.32 1.17 0.59
CNCCaus 0.38 0 0.47 0.92 0
WRDFAMc 0.37 0 0.46 0.91 0
SYNSTRUTa 0.36 0 0.36 1.2 0
CRFAOa 0.35 0 0.19 1.68 0
WRDADV 0.34 0 0.32 1.19 0
SMCAUSr 0.33 0 0.61 0.1 0
DESSLd 0.32 0 0.18 1.5 0
PCCONNp 0.29 0 0.46 0.37 0
WRDPOLc 0.28 0 0.28 0.93 0
WRDADJ 0.28 0 0.4 0.51 0
DRAP 0.26 0 0.24 0.87 0
DRNP 0.26 0 0.33 0.57 0
WRDHYPn 0.26 0 0.27 0.84 0
CNCTempx 0.25 0 0.23 0.86 0
DRNEG 0.23 0 0.13 1.08 0
PCREFz 0.23 0 0.24 0.7 0
PCVERBp 0.22 0 0 1.48 0
PCNARp 0.2 0 0.01 1.35 0
PCDCp 0.19 0 0.11 0.9 0
PCSYNp 0.09 0 0.02 0.56 0
WRDPRP3p 0 0 0 0 0
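The overall column appears consistent with a weighted average of the four algorithm columns, using the ensemble weights rescaled to sum to 1. This is an inference from the published numbers, not something the page states. The sketch below checks it for the first three rows, allowing a small tolerance because the table values are rounded, and also confirms that the nonzero mars values sum to 100. The function name is hypothetical.

```python
# Ensemble weights from the Algorithm Weightings table (intercept not needed here).
WEIGHTS = {"mars": 0.2289, "gbm": 0.5300, "svm": 0.1527, "cube": 0.1150}

# First three rows of the importance table above.
IMPORTANCE = {
    "DESWC": {"overall": 30.39, "mars": 45.46, "gbm": 34.5, "svm": 4.37, "cube": 16.04},
    "LSAGN": {"overall": 7.18, "mars": 0.0, "gbm": 9.31, "svm": 2.85, "cube": 17.43},
    "DESWLlt": {"overall": 6.73, "mars": 19.0, "gbm": 2.42, "svm": 1.37, "cube": 9.31},
}


def weighted_importance(row):
    """Weighted average of per-algorithm importance, weights rescaled to sum to 1.

    Assumed relationship, inferred from the table rather than documented.
    """
    total = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * row[name] for name in WEIGHTS) / total


# Compare the recomputed value with the published "overall" value per metric.
differences = {
    metric: abs(weighted_importance(row) - row["overall"])
    for metric, row in IMPORTANCE.items()
}

# Nonzero mars importances from the table (every other row is 0).
mars_total = sum([45.46, 19.0, 13.58, 9.36, 6.64, 5.96])
```

For these three rows the recomputed values agree with the published overall column to within rounding.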