Interpreting clusters with dark areas off of the diagonal - jonathanbrecher/sharedclustering GitHub Wiki

Dark areas off of the diagonal represent a special type of overlap. This overlap could be very important and should not be overlooked.

Overlap off of the diagonal

In this example, there is a series of clusters on the diagonal as expected, marked in blue. Additionally, there are two dark areas off of the diagonal, marked in green.

Overlap off of the diagonal, labeled

If you follow the dark areas off of the diagonal horizontally and vertically, you can see that they are aligned with two of the normal clusters on the diagonal. You can see that the two green areas here align with clusters 3 and 6.

Overlap off of the diagonal, aligned
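The alignment step can also be done by position rather than by eye. The following is only a rough sketch: the cluster boundaries in CLUSTER_STARTS are invented for illustration and are not taken from the diagram above, and the function names are hypothetical, not part of Shared Clustering.

```python
from bisect import bisect_right

# Hypothetical first row/column index of each cluster along the diagonal
# (0-based match positions). These numbers are made up for illustration.
CLUSTER_STARTS = [0, 12, 20, 41, 55, 63]  # clusters 1 through 6


def cluster_of(index):
    """Return the 1-based number of the cluster containing this position."""
    return bisect_right(CLUSTER_STARTS, index)


def clusters_for_cell(row, col):
    """Follow a dark cell horizontally and vertically back to the diagonal."""
    return cluster_of(row), cluster_of(col)


# A dark cell at row 25, column 66 lines up with clusters 3 and 6.
print(clusters_for_cell(25, 66))
```

With these made-up boundaries, a dark cell in the rows of cluster 3 and the columns of cluster 6 maps back to that pair of clusters, which is the same reading as following the green areas to the diagonal.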

Analysis

There is overlap between clusters 2 and 3, and also between clusters 3 and 4. The dark areas off of the diagonal show that cluster 3 also overlaps with cluster 6. In this portion of the cluster diagram, the closest match is in cluster 3, sharing 206 cM with the test taker. All of that information can be combined:

Cluster 3

  • Closest match: 206 cM
  • Overlaps with clusters 2, 4, 6

Cluster 2

  • Closest match: 148 cM
  • Overlaps with cluster 3
  • Does not overlap with clusters 4, 6

Cluster 4

  • Closest match: 147 cM
  • Overlaps with cluster 3
  • Does not overlap with clusters 2, 6

Cluster 6

  • Closest match: 138 cM
  • Overlaps with cluster 3
  • Does not overlap with clusters 2, 4

The key feature is that there is one cluster that overlaps three others, while those three do not overlap with each other.
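That pattern can be checked mechanically once the overlaps have been written down. The sketch below uses the three overlaps from this example and assumes the diagram contains six clusters numbered 1 through 6. The function names are hypothetical and are not part of Shared Clustering.

```python
from itertools import combinations

# Overlaps read from the diagram in this example, as unordered cluster pairs.
overlaps = {frozenset(pair) for pair in [(2, 3), (3, 4), (3, 6)]}


def neighbors(cluster):
    """Return the set of clusters that overlap with the given cluster."""
    return {c for pair in overlaps if cluster in pair for c in pair if c != cluster}


def is_hub(cluster):
    """True if the cluster overlaps three or more others that do not overlap each other."""
    others = neighbors(cluster)
    return len(others) >= 3 and all(
        frozenset(pair) not in overlaps for pair in combinations(others, 2)
    )


# Assumption: six clusters, numbered 1 through 6.
hubs = [c for c in range(1, 7) if is_hub(c)]
print(hubs)  # [3]
```

Only cluster 3 qualifies, which matches the summary above: it overlaps clusters 2, 4, and 6, and none of those three overlap each other.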

One possible hypothesis is that the 206 cM match is a second cousin to the test taker, suggesting that cluster 3 contains other matches related to the test taker through one pair of great-grandparents.

Because clusters 2, 4, and 6 all overlap with cluster 3, it is very likely that they contain matches related to the test taker through the same pair of great-grandparents... somehow. But since those clusters do not overlap with each other, they are probably related to the test taker in different ways. Those clusters could be related to the test taker through three of the four great-great-grandparents who were parents to those great-grandparents.

Or, maybe not. There are many other possibilities. The clusters give hints about how the matches might be related to the test taker. The next step is research, to see if the hints lead anywhere that can be proved through other sources.

Theory: Why dark areas off of the diagonal?

All of that leads to a reasonable question: Why were there dark areas off of the diagonal in the first place, when almost everything else happens on the diagonal?

The important detail comes down to the same thing: Cluster 3 overlaps three other clusters. In a normal two-dimensional cluster diagram, each cluster on the diagram has only two direct neighbors, the clusters immediately before and after it. Biology doesn't work that neatly. When one cluster overlaps three (or more!) clusters, the extra clusters can't all be direct neighbors, yet they still have to go somewhere.

Dark areas off of the diagonal are a special type of overlap that appears when one cluster overlaps three or more other clusters. The dark areas point to the other overlapping clusters that couldn't be direct neighbors because there wasn't room for them.
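The "no room" argument can be illustrated with a small sketch. It assumes the six clusters are laid out along the diagonal in numeric order and uses the three overlaps from the example. The function name is hypothetical, and this is not a description of how Shared Clustering actually orders clusters.

```python
from itertools import permutations


def off_diagonal_overlaps(order, overlaps):
    """Return the overlaps whose two clusters are not direct neighbors in the given order."""
    position = {cluster: i for i, cluster in enumerate(order)}
    return [(a, b) for a, b in overlaps if abs(position[a] - position[b]) > 1]


order = [1, 2, 3, 4, 5, 6]            # assumed order along the diagonal
overlaps = [(2, 3), (3, 4), (3, 6)]   # overlaps from the example

print(off_diagonal_overlaps(order, overlaps))  # [(3, 6)]

# Try every possible ordering: at least one overlap is always left over,
# because cluster 3 has three overlaps but only two neighboring positions.
fewest = min(len(off_diagonal_overlaps(p, overlaps)) for p in permutations(order))
print(fewest)  # 1
```

The overlap between clusters 3 and 6 is the one that cannot sit next to the diagonal. Assuming the diagram is symmetric about the diagonal, that single overlap would show up as two dark areas, one on each side.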