Tool: cath cluster - UCLOrengoGroup/cath-tools GitHub Wiki
Possible Future Features
Consider implementing the algorithm for the S40 NR reps procedure (wiki:CathReleaseProtocol#Generatesingle-linkageS40repsabout6hours) that attempts to generate as many clusters as possible with no redundancy between reps.
Consider implementing an MMseqs / cd-hit style clustering. Think about how to handle thresholds, with reference to the way cd-hit's -s ("length difference cutoff") parameter works, ensuring the threshold isn't violated across any pair in the cluster.
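As a rough illustration of the pairwise-threshold concern, here is a minimal Python sketch of greedy incremental clustering in the cd-hit style. It is not cath-cluster or cd-hit code: the names `greedy_cluster`, `is_similar` and `min_len_ratio` are hypothetical, and `is_similar` stands in for whatever sequence comparison would really be used. Each cluster tracks its shortest and longest member, so the length-ratio cutoff holds for every pair in the cluster rather than only against the representative.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    rep: str                       # first (longest) member, used as representative
    members: list = field(default_factory=list)
    min_len: int = 0               # shortest member length seen so far
    max_len: int = 0               # longest member length seen so far


def greedy_cluster(lengths, is_similar, min_len_ratio):
    """Greedy incremental clustering (hypothetical sketch).

    lengths       -- dict mapping sequence id to sequence length
    is_similar    -- callable(rep_id, seq_id) -> bool; an assumed stand-in
                     for a real sequence comparison
    min_len_ratio -- minimum shorter/longer length ratio allowed between
                     ANY two members of a cluster
    """
    clusters = []
    # Process longest first; ties are broken by id to keep runs deterministic
    for seq_id in sorted(lengths, key=lambda s: (-lengths[s], s)):
        n = lengths[seq_id]
        for cluster in clusters:
            lo = min(cluster.min_len, n)
            hi = max(cluster.max_len, n)
            # lo / hi is the worst length ratio over all pairs if seq_id joins
            if lo / hi >= min_len_ratio and is_similar(cluster.rep, seq_id):
                cluster.members.append(seq_id)
                cluster.min_len, cluster.max_len = lo, hi
                break
        else:
            # No existing cluster accepted the sequence, so it founds a new one
            clusters.append(Cluster(rep=seq_id, members=[seq_id], min_len=n, max_len=n))
    return clusters


if __name__ == "__main__":
    example = {"a": 100, "b": 95, "c": 50, "d": 48}
    for c in greedy_cluster(example, lambda rep, seq: True, 0.9):
        print(c.rep, c.members)
```

With longest-first ordering, the check against the representative already implies the bound for every pair, because each new member is no longer than any existing one. Tracking the minimum and maximum keeps the guarantee if the processing order is ever changed.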
Consider trying a clustering that uses complete-linkage criteria but single-linkage search.
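One possible reading of this last idea, sketched in Python under stated assumptions: candidate merges are discovered the single-linkage way, by walking individual links from strongest to weakest, but a merge is only accepted if every cross-cluster pair passes the threshold, which is the complete-linkage criterion. The function name, the `links` dictionary format (higher score means more similar) and the treatment of missing pairs as failing are all assumptions made for illustration, not part of cath-cluster.

```python
def complete_criteria_single_search(items, links, threshold):
    """Cluster using single-linkage search with a complete-linkage acceptance test.

    items     -- iterable of item ids
    links     -- dict mapping (a, b) pairs to a similarity score (higher is
                 more similar); pairs absent from the dict are assumed to fail
    threshold -- minimum score required between every pair in a cluster
    """
    items = list(items)

    def passes(a, b):
        score = links.get((a, b), links.get((b, a)))
        return score is not None and score >= threshold

    # Every item starts in its own cluster; merged items share one list object
    cluster_of = {item: [item] for item in items}

    # Single-linkage search: visit links from strongest to weakest
    for (a, b), score in sorted(links.items(), key=lambda kv: (-kv[1], kv[0])):
        if score < threshold:
            break  # all remaining links are weaker still
        ca, cb = cluster_of[a], cluster_of[b]
        if ca is cb:
            continue  # already in the same cluster
        # Complete-linkage criterion: every cross pair must pass the threshold
        if all(passes(x, y) for x in ca for y in cb):
            ca.extend(cb)
            for y in cb:
                cluster_of[y] = ca

    # Collect each distinct cluster once, in order of first appearance
    seen, result = set(), []
    for item in items:
        cluster = cluster_of[item]
        if id(cluster) not in seen:
            seen.add(id(cluster))
            result.append(sorted(cluster))
    return result


if __name__ == "__main__":
    example_links = {("a", "b"): 0.9, ("b", "c"): 0.8, ("a", "c"): 0.3, ("c", "d"): 0.7}
    print(complete_criteria_single_search("abcd", example_links, 0.5))
```

In the example, plain single-linkage at a threshold of 0.5 would chain all four items together through the b-c link. Here that merge is rejected because the a-c pair scores only 0.3, so the result is two clusters.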