Why is the score for a core factor not the mean of its lower-order factors

Observant users have wondered why the scores of the various core factors differ from the means of their lower-order factors. The screenshot provides a nice example of this. The scores for the lower-order factors of Team Effectiveness are 72, 72, and 62 respectively. The mean of these is 68.7, which is indeed different from the reported score of 72. This isn't a bug but a conscious decision on our part.

A challenge of data aggregation is that you're trying to summarize detailed, low-level data with a single number. The quality of the resulting number depends on the quality and spread of the underlying data points. For example, most of us know that averages are sensitive to extreme scores. A mean average can take a nosedive when even a single very low score exists in a larger set of high scores. One way to dampen this kind of bias is to maximize the number of data points used in analyses, which is what we always strive for.
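To make this concrete, here is a small Python sketch of how a single low score drags down a mean. The scores and the 1-7 scale are made up purely for illustration and are not taken from Columinity data:

```python
# Six high scores on a hypothetical 1-7 scale, plus a single very low score.
high_scores = [6, 6, 7, 6, 7, 6]
with_outlier = high_scores + [1]

mean_high = sum(high_scores) / len(high_scores)     # 6.33
mean_all = sum(with_outlier) / len(with_outlier)    # 5.57: one low score pulls the mean down

print(round(mean_high, 2), round(mean_all, 2))
```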

In the aforementioned simple calculation of Team Effectiveness, we would calculate its score for each participant by averaging that participant's scores for Team Morale, Stakeholder Happiness, and Stakeholders: Team Value. But this average would be based on only 3 data points and therefore prone to bias. Instead, for each participant we take all 8 responses to the lower-order factors and calculate a mean average from those. This gives us 8 data points per participant and less bias.
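The Python sketch below contrasts the two strategies for a single participant. The response values and the way the 8 responses are split across the three factors (3 + 2 + 3) are assumptions made purely for illustration; the actual questionnaire may distribute its questions differently:

```python
# Hypothetical responses of one participant, grouped by lower-order factor.
# The 3 + 2 + 3 split into 8 responses is assumed for illustration only.
responses = {
    "Team Morale":              [5, 6, 5],
    "Stakeholder Happiness":    [6, 6],
    "Stakeholders: Team Value": [2, 3, 2],
}

# Strategy 1: average the three factor averages (only 3 data points).
factor_means = [sum(v) / len(v) for v in responses.values()]
mean_of_means = sum(factor_means) / len(factor_means)           # 4.56

# Strategy 2 (the approach described above): pool all 8 responses
# and take a single mean (8 data points, less bias).
all_responses = [r for v in responses.values() for r in v]
pooled_mean = sum(all_responses) / len(all_responses)           # 4.38

print(round(mean_of_means, 2), round(pooled_mean, 2))
```

In this made-up example the two strategies differ because the pooled mean weights each response equally, whereas the mean of factor means treats every factor as a single data point regardless of how many questions it contains.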

Statistical example

The histogram below shows the scores for three participants on five sub-factors of "Stakeholder Concern" (15 data points). It is based on a real team.

[Histogram: distribution of the 15 "Stakeholder Concern" scores from three participants across five sub-factors]

The histogram clearly shows substantial disagreement among participants, or at least very different perspectives: most of the scores fall either on the left or on the right rather than in the middle. With this distribution, what is the most accurate single number to summarize the "Stakeholder Concern" score?

  • Our current algorithm yields 3.15
  • The median average of all scores is 3.67
  • The mean average of all scores is 2.71
  • If we calculate median averages for each sub-factor first and then take the median average of those 5 values, we get 4.1

Each strategy gives a slightly different number. Which is the most accurate? The most straightforward approach is the fourth option: take a calculator, compute the median for each of the five sub-factors, and then take the median of those five values. However, this approach takes averages of averages and loses a lot of information about the underlying distribution in the process. The resulting 4.1 also gives more weight to the six higher scores than to the nine lower scores (a sub-factor's median is pulled up as soon as two of its three scores are high) and is thus biased. Similarly, the mean average (third approach) is biased too far toward the other end. The 3.15 of our current approach best reflects the true "middle" of the scores, although it is less intuitive to calculate manually.
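For reference, the Python sketch below shows how the three straightforward aggregates from the list above are computed: the mean of all scores, the median of all scores, and the median of the per-sub-factor medians. The score matrix is hypothetical (the real data behind the histogram isn't published here), and our own algorithm is deliberately not reproduced:

```python
from statistics import mean, median

# Hypothetical scores: 3 participants (rows) x 5 sub-factors (columns).
# Six of the fifteen scores are high and nine are low, mimicking the
# split described above, but these are NOT the real data.
scores = [
    [4.7, 1.3, 4.3, 2.0, 1.7],  # participant 1
    [4.3, 4.3, 1.7, 1.3, 2.0],  # participant 2
    [1.7, 4.7, 4.0, 2.3, 1.3],  # participant 3
]

all_scores = [s for row in scores for s in row]               # 15 data points

print("mean of all scores:          ", round(mean(all_scores), 2))
print("median of all scores:        ", round(median(all_scores), 2))

# Median of the five per-sub-factor medians (one median per column).
sub_factor_medians = [median(col) for col in zip(*scores)]
print("median of sub-factor medians:", round(median(sub_factor_medians), 2))
```

With these made-up numbers the three strategies yield roughly 2.77, 2.0, and 4.0 respectively, which mirrors how the median of sub-factor medians drifts toward the six higher scores while the other two stay closer to the nine lower ones.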