ReaderBench Model 2c Variable Importance - shmercer/writeAlizer GitHub Wiki
Ensemble Weightings and Metric Importance
ReaderBench Model 2c
This model used ReaderBench scores from 7-min narrative writing samples (prompt: "I once had a magic pencil and ...") from 124 students in the spring of Grades 2-5 (Mercer et al., 2019) to predict holistic writing quality on the samples (Elo ratings calculated from paired comparisons).
Highly correlated ReaderBench metrics (|r| > .90) were excluded during pre-processing (see section on Scoring Model Development for more details).
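To make the exclusion rule concrete, here is a minimal Python sketch of one way to drop metrics whose absolute correlation with an already-kept metric exceeds .90. It is a simplified greedy filter written for illustration, not the procedure used to build this model (that procedure is described in the Scoring Model Development section). The function names `pearson` and `drop_highly_correlated` and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists.

    Assumes neither list is constant (a constant list would divide by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def drop_highly_correlated(columns, threshold=0.90):
    """Return the metric names kept by a greedy filter.

    columns: dict mapping metric name -> list of values (one per sample).
    A metric is kept only if its absolute correlation with every
    previously kept metric is at or below the threshold.
    This keep-first rule is an assumption made for illustration.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy data: metric_b is almost a rescaled copy of metric_a.
toy = {
    "metric_a": [1, 2, 3, 4, 5],
    "metric_b": [2, 4, 6, 8, 10.5],
    "metric_c": [5, 1, 4, 2, 3],
}
print(drop_highly_correlated(toy))
```

With a keep-first rule, which metric of a correlated pair survives depends on column order. Other tools use different tie-breaking rules.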
Mercer, S. H., Keller-Margulis, M. A., Faith, E. L., Reid, E. K., & Ochs, S. (2019). The potential for automated text evaluation to improve the technical adequacy of written expression curriculum-based measurement. Learning Disability Quarterly, 42, 117-128. https://doi.org/10.1177/0731948718803296
Algorithm Weightings in Ensemble
Abbreviations:
- overall = ensemble model (all algorithms combined)
- pls = partial least squares regression
- rf = random forest regression
- mars = bagged multivariate adaptive regression splines
- gbm = stochastic gradient boosted trees
- svm = support vector machines
- cube = cubist regression
The table below presents the intercept and the linear weight of each algorithm in the ensemble model.
Intercept | pls | rf | mars | gbm | svm | cube |
---|---|---|---|---|---|---|
-7.3027 | 0.2354 | 0.1868 | 0.1595 | 0.1816 | 0.2191 | 0.0704 |
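
As a minimal sketch of how these weights could be applied, the Python snippet below combines per-algorithm predictions. It assumes the ensemble prediction is the intercept plus the weighted sum of the six algorithms' predictions. That is the usual reading of "linear weightings", but the exact stacking procedure is not described on this page. The function name `ensemble_prediction` and the input values are hypothetical.

```python
# Intercept and weights copied from the table above.
INTERCEPT = -7.3027
WEIGHTS = {
    "pls": 0.2354,
    "rf": 0.1868,
    "mars": 0.1595,
    "gbm": 0.1816,
    "svm": 0.2191,
    "cube": 0.0704,
}


def ensemble_prediction(preds):
    """Combine per-algorithm predictions into one ensemble score.

    preds: dict mapping algorithm abbreviation -> that algorithm's
    predicted quality score for one writing sample.
    Assumption: ensemble = intercept + sum(weight * prediction).
    """
    return INTERCEPT + sum(WEIGHTS[name] * preds[name] for name in WEIGHTS)


# Hypothetical example: every algorithm predicts a score of 100.
example = {name: 100.0 for name in WEIGHTS}
print(round(ensemble_prediction(example), 2))  # 97.98
```

The six weights sum to 1.0528 rather than exactly 1, so under this assumption the intercept offsets the slight upward scaling of the combined predictions.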
Metric Importance in Each Algorithm and Ensemble
Each column sums to 100 (so values can be interpreted as % contribution to the model).
Metric | overall | pls | rf | mars | gbm | svm | cube |
---|---|---|---|---|---|---|---|
Content.words | 11.99 | 4.55 | 5.81 | 30.16 | 21.71 | 4.24 | 11.11 |
WdEnt | 7.28 | 4.3 | 5.74 | 0 | 21.09 | 4.12 | 12.09 |
AvgDepsSen_compound | 3.97 | 2.07 | 1.98 | 13.22 | 2.22 | 1.52 | 6.82 |
AvgWdLen | 3.87 | 2.64 | 2.65 | 7.11 | 4.85 | 2.04 | 7.02 |
LxcDiv | 3.77 | 4.06 | 4.13 | 0 | 7.72 | 3.59 | 0.78 |
AvgChainSpan | 3.36 | 3.09 | 2.64 | 5.1 | 4.15 | 2.66 | 2.34 |
TCorefChainBigSpan | 2.64 | 2.23 | 1.48 | 10.59 | 0.43 | 0.93 | 0 |
Sentences | 2.37 | 3.33 | 2.15 | 0 | 2.63 | 2.24 | 4.87 |
AvgDepsSen_mark | 2.21 | 0.38 | 1.17 | 10.59 | 0.08 | 1.45 | 0 |
AvgDepsSen_dobj | 2 | 0.81 | 0.96 | 8.72 | 0.14 | 1.13 | 0.97 |
AvgSenAdjCoh_LSA | 1.95 | 2.68 | 1.87 | 0 | 3.17 | 2.26 | 0 |
AvgCorefChain | 1.94 | 2.2 | 1 | 5.1 | 0.28 | 1.28 | 2.73 |
WdDiffWdStem | 1.92 | 2.4 | 1.86 | 0 | 2.95 | 2.09 | 1.56 |
LexChainMaxSp | 1.82 | 3.13 | 2.35 | 0 | 1.28 | 2.01 | 0.97 |
WdLettStdDev | 1.79 | 3 | 1.66 | 0 | 1.64 | 2.28 | 0.97 |
TCorefChainDoc | 1.62 | 3.23 | 1.85 | 0 | 0.17 | 1.92 | 2.14 |
CharEnt | 1.59 | 2.56 | 0.9 | 0 | 0.29 | 2.1 | 5.46 |
WdSylCnt | 1.53 | 2.45 | 1.7 | 0 | 1.52 | 1.55 | 1.36 |
FrqRhythmId | 1.47 | 2.67 | 1.7 | 0 | 1.03 | 1.59 | 0.97 |
AvgDepsSen_punct | 1.36 | 1.82 | 1.57 | 0 | 0.72 | 1.83 | 2.53 |
AvgAOEDoc_InverseLinearRegressionSlope | 1.32 | 1.31 | 0.73 | 4.26 | 0.24 | 0.89 | 0.39 |
RdbltyDaleChall | 1.25 | 1.81 | 1.27 | 0 | 1.04 | 1.02 | 3.51 |
AvgAOADoc_Shock | 1.2 | 2.2 | 1.24 | 0 | 0.69 | 1.8 | 0 |
LangRhythmCoeff | 1.06 | 1.61 | 1.41 | 0 | 1.03 | 1.24 | 0.19 |
LexChainAvgSpan | 1.05 | 1.94 | 1.36 | 0 | 0.16 | 1.66 | 0 |
SenAsson | 1.05 | 1.63 | 0.58 | 3.07 | 0.02 | 0.56 | 0 |
AvgVoice | 1 | 2.62 | 0.58 | 0 | 0 | 1.36 | 0.39 |
AvgNounSen | 0.97 | 1.09 | 1.59 | 0 | 0.47 | 1.06 | 2.14 |
WdDiffLemmaStem | 0.94 | 1.65 | 1.01 | 0 | 0.36 | 1.32 | 0.78 |
TActCorefChainWd | 0.94 | 0.93 | 0.74 | 0 | 1.05 | 0.71 | 4.09 |
AvgAOADoc_Cortese | 0.93 | 1.16 | 0.9 | 0 | 0.56 | 1.89 | 0.39 |
AvgAOASen_Bristol | 0.92 | 0.35 | 1.28 | 2.08 | 1.15 | 0.34 | 0.39 |
SenStdDevWd | 0.92 | 1.6 | 0.97 | 0 | 0.09 | 1.78 | 0 |
AvgDepsSen_xcomp | 0.83 | 0.41 | 1.61 | 0 | 1.42 | 0.99 | 0 |
AvgAdjectiveSen | 0.83 | 1.24 | 1 | 0 | 0.18 | 1.21 | 1.36 |
AvgDepsSen_nmod | 0.81 | 0.16 | 1.23 | 0 | 0.38 | 1.25 | 3.51 |
AvgAOADoc_Kuperman | 0.8 | 0.7 | 1.18 | 0 | 0.65 | 1.44 | 0.39 |
AvgDepsSen_amod | 0.79 | 1.11 | 0.91 | 0 | 0.1 | 1.33 | 1.36 |
AvgDepsSen_ccomp | 0.78 | 1.06 | 1.41 | 0 | 0.27 | 1.18 | 0 |
AvgAOASen_Kuperman | 0.78 | 0.78 | 0.83 | 0 | 0.41 | 0.99 | 2.73 |
AvgNmdEntSen | 0.78 | 0.93 | 1 | 0 | 1.05 | 1.05 | 0 |
AvgAOESen_IndexPolynomialFitAboveThreshold.0.3. | 0.76 | 0.58 | 1.12 | 0 | 0.4 | 0.84 | 2.73 |
AvgConnSen_simple_subordinators | 0.74 | 0.25 | 0.86 | 0 | 1.12 | 1.52 | 0.39 |
AvgPronounSen | 0.72 | 1.09 | 1.13 | 0 | 0.02 | 0.99 | 0.97 |
AvgAOASen_Shock | 0.69 | 0.48 | 1.51 | 0 | 0.21 | 1.32 | 0 |
AvgConnSen_reason_and_purpose | 0.68 | 0.16 | 1.31 | 0 | 0.82 | 1.29 | 0 |
AvgAOASen_Cortese | 0.66 | 1.25 | 0.45 | 0 | 0.24 | 1.22 | 0 |
AvgAOESen_InverseLinearRegressionSlope | 0.66 | 0.99 | 1.02 | 0 | 0.31 | 0.68 | 0.97 |
AvgAOEDoc_InflectionPointPolynomial | 0.65 | 0.64 | 0.36 | 0 | 0.37 | 0.61 | 3.7 |
AvgConnSen_addition | 0.65 | 0.88 | 0.88 | 0 | 0.12 | 1.21 | 0.39 |
AvgConnSen_order | 0.64 | 0.44 | 0.48 | 0 | 0.92 | 1.41 | 0 |
AvgInferenceDistChain | 0.64 | 0.8 | 0.91 | 0 | 0.7 | 0.83 | 0 |
WdPolysemyCnt | 0.62 | 0.27 | 0.93 | 0 | 0.37 | 1.61 | 0 |
AvgAOEDoc_IndexPolynomialFitAboveThreshold.0.3. | 0.61 | 0.83 | 0.71 | 0 | 0.07 | 1.07 | 0.97 |
AvgRhythmUnits | 0.61 | 0.3 | 1.24 | 0 | 0.32 | 1.3 | 0 |
AvgDepsSen_aux | 0.57 | 0.03 | 1.26 | 0 | 0.38 | 1.32 | 0 |
SynSoph | 0.57 | 0.59 | 0.85 | 0 | 0.08 | 1.02 | 0.97 |
AvgDepsSen_cop | 0.55 | 0.87 | 0.48 | 0 | 0.05 | 1.25 | 0 |
AvgRhythmUnitStreesSyll | 0.52 | 0.76 | 1.17 | 0 | 0.14 | 0.56 | 0 |
AvgDepsSen_advmod | 0.48 | 0.33 | 0.6 | 0 | 0.2 | 1.26 | 0 |
AvgDepsSen_det | 0.48 | 0.22 | 1.04 | 0 | 0.45 | 0.68 | 0.39 |
AggPronSen_third_person | 0.47 | 0.86 | 0.8 | 0 | 0.08 | 0.58 | 0 |
AvgAOADoc_Bristol | 0.45 | 0.36 | 0.71 | 0 | 0.12 | 1.02 | 0.19 |
AvgDepsSen_acl | 0.44 | 1.28 | 0.29 | 0 | 0.16 | 0.36 | 0 |
AvgAOADoc_Bird | 0.44 | 0.38 | 0.84 | 0 | 0.13 | 0.89 | 0 |
WdAvgDpthHypernymTree | 0.43 | 0.79 | 0.71 | 0 | 0.06 | 0.54 | 0 |
RdbltyFlesch | 0.43 | 0.42 | 1.35 | 0 | 0.03 | 0.44 | 0 |
AvgDepsSen_dep | 0.42 | 0.68 | 0.6 | 0 | 0.02 | 0.75 | 0 |
AggPronSen_indefinite | 0.41 | 0.34 | 0.51 | 0 | 0.14 | 1.05 | 0 |
AvgConnSen_semi_coordinators | 0.39 | 0 | 1.01 | 0 | 0.13 | 0.92 | 0 |
AvgDepsSen_mwe | 0.38 | 0.6 | 1.17 | 0 | 0.1 | 0.07 | 0 |
AvgDepsSen_advcl | 0.38 | 0.06 | 0.44 | 0 | 0.01 | 1.4 | 0 |
AvgDepsSen_neg | 0.37 | 0.51 | 0.97 | 0 | 0.4 | 0.05 | 0 |
WdPathCntHypernymTree | 0.36 | 0.89 | 0.33 | 0 | 0.19 | 0.35 | 0 |
AvgAOESen_IndexAboveThreshold.0.3. | 0.35 | 0.27 | 0 | 0 | 0.41 | 1.05 | 0 |
AvgAOASen_Bird | 0.33 | 0.04 | 0.75 | 0 | 0.59 | 0.42 | 0 |
LxcSoph | 0.31 | 0.02 | 0.61 | 0 | 0.16 | 0.18 | 1.95 |
AvgConnSen_oppositions | 0.26 | 0.11 | 0.98 | 0 | 0.36 | 0 | 0 |
LangRhythmDiameter | 0.24 | 0.3 | 0.84 | 0 | 0.12 | 0.03 | 0 |
AvgConnSen_temporal_connectors | 0.17 | 0.23 | 0.58 | 0 | 0.09 | 0.01 | 0 |
LangRhythmId | 0.09 | 0.22 | 0.23 | 0 | 0.02 | 0.01 | 0 |
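
The overall column appears to be consistent with an average of the per-algorithm importances, weighted by the ensemble weights (intercept excluded) and normalized by their sum. This page does not state that relationship. It is an inference checked by hand against the first two rows of the table only, so the Python sketch below is illustrative and not a description of how the values were produced. The function name `overall_importance` is hypothetical.

```python
# Ensemble weights from the algorithm weightings table (intercept excluded).
WEIGHTS = {
    "pls": 0.2354,
    "rf": 0.1868,
    "mars": 0.1595,
    "gbm": 0.1816,
    "svm": 0.2191,
    "cube": 0.0704,
}


def overall_importance(per_algorithm):
    """Weight-normalized average of one metric's per-algorithm importances.

    per_algorithm: dict mapping algorithm abbreviation -> importance (%).
    Assumption: overall = sum(w * importance) / sum(w).
    """
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[name] * per_algorithm[name] for name in WEIGHTS)
    return weighted / total_weight


# Per-algorithm importances copied from the first two table rows.
content_words = {"pls": 4.55, "rf": 5.81, "mars": 30.16,
                 "gbm": 21.71, "svm": 4.24, "cube": 11.11}
wd_ent = {"pls": 4.3, "rf": 5.74, "mars": 0.0,
          "gbm": 21.09, "svm": 4.12, "cube": 12.09}

print(round(overall_importance(content_words), 2))  # 11.99, as in the table
print(round(overall_importance(wd_ent), 2))         # 7.28, as in the table
```

If the relationship holds for the remaining rows, a metric can rank high overall mainly because one or two heavily weighted algorithms rely on it, so the per-algorithm columns are worth reading alongside the overall value.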