Disease Network - serratus-bio/open-virome GitHub Wiki

Human Disease Ontology

The Human Disease Ontology (DO) forms a monopartite network of Disease nodes connected by HAS_PARENT edges, taken from a human-curated ontology. This network is a Directed Acyclic Graph (DAG). The nodes and edges are extracted from the DO OWL data.
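As a rough illustration of the extraction step, the sketch below reads an RDF/XML OWL document with the Python standard library and collects one node per named class and one child-to-parent edge per `rdfs:subClassOf` reference. It is not the project's actual pipeline: the inline `SAMPLE_OWL` snippet, the `example.org` identifiers, and the function names `extract_disease_graph` and `is_dag` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespaces used by RDF/XML serialisations of OWL ontologies.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Tiny hand-written stand-in for an OWL file (hypothetical ids, not real DO content).
SAMPLE_OWL = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/TERM_1">
    <rdfs:label>disease</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/TERM_2">
    <rdfs:label>infectious disease</rdfs:label>
    <rdfs:subClassOf rdf:resource="http://example.org/TERM_1"/>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/TERM_3">
    <rdfs:label>viral infectious disease</rdfs:label>
    <rdfs:subClassOf rdf:resource="http://example.org/TERM_2"/>
  </owl:Class>
</rdf:RDF>
"""

def extract_disease_graph(owl_xml):
    """Return (nodes, edges): nodes maps term id -> label, edges lists (child, parent)."""
    root = ET.fromstring(owl_xml)
    nodes, edges = {}, []
    for cls in root.findall("owl:Class", NS):
        term = cls.get(RDF_ABOUT)
        if term is None:  # skip anonymous classes
            continue
        term_id = term.rsplit("/", 1)[-1]
        nodes[term_id] = cls.findtext("rdfs:label", default="", namespaces=NS)
        for sub in cls.findall("rdfs:subClassOf", NS):
            parent = sub.get(RDF_RESOURCE)
            if parent is not None:  # skip parents that are not plain class references
                edges.append((term_id, parent.rsplit("/", 1)[-1]))
    return nodes, edges

def is_dag(edges):
    """Kahn's algorithm: acyclic iff every node can be removed in topological order."""
    out, indegree = {}, {}
    for child, parent in edges:
        out.setdefault(child, []).append(parent)
        out.setdefault(parent, [])
        indegree[parent] = indegree.get(parent, 0) + 1
        indegree.setdefault(child, 0)
    queue = [n for n, d in indegree.items() if d == 0]
    removed = 0
    while queue:
        node = queue.pop()
        removed += 1
        for nxt in out[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    return removed == len(indegree)

nodes, edges = extract_disease_graph(SAMPLE_OWL)
```

Here each `(child, parent)` pair plays the role of a HAS_PARENT edge, and `is_dag(edges)` gives a quick check of the acyclicity property on whatever edge list is extracted.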

A bipartite network can be formed from SRA Run nodes and their associated Disease nodes via HAS_DISEASE_METADATA edges. These edges are mined from the BioSample metadata associated with each run, which is mapped to a matching term in the DO.
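The mapping from free-text BioSample values to DO terms could look roughly like the sketch below. It assumes a simple normalized exact match against term names and synonyms; the matching method actually used is not described here and may well differ. The `DO_TERMS` lookup, the run accessions, and the term ids are hypothetical placeholders.

```python
import re

# Hypothetical lookup from term names/synonyms to term ids (illustrative ids only).
DO_TERMS = {
    "influenza": "TERM_A",
    "flu": "TERM_A",
    "hepatitis c": "TERM_B",
}

def normalize(text):
    """Lowercase, replace punctuation with spaces, and trim surrounding whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def map_runs_to_diseases(biosample_rows, do_terms):
    """Return sorted, de-duplicated (run, term_id) pairs for values that match a term."""
    lookup = {normalize(name): term_id for name, term_id in do_terms.items()}
    matched = set()
    for run, value in biosample_rows:
        term_id = lookup.get(normalize(value))
        if term_id is not None:  # unmatched values produce no edge
            matched.add((run, term_id))
    return sorted(matched)

# (run accession, free-text metadata value) pairs; all values are made up.
rows = [
    ("RUN_1", "Influenza"),
    ("RUN_2", "Hepatitis-C"),
    ("RUN_3", "not applicable"),
    ("RUN_1", "flu"),
]
metadata_edges = map_runs_to_diseases(rows, DO_TERMS)
```

Each resulting pair corresponds to one candidate HAS_DISEASE_METADATA edge. Collecting the pairs in a set means a run that mentions the same disease under two synonyms yields a single edge.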

Summary stats

Total number of Disease nodes: 14,172

Total number of Disease HAS_PARENT relationships: 11,724

Total number of HAS_DISEASE_METADATA relationships: 525,032

Communities

In the monopartite disease network, 14,172 disease nodes form a connected component with no isolated nodes.
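Connectivity and isolated nodes can be checked with a breadth-first traversal that ignores edge direction, as in the sketch below. The function name and the toy node and edge lists are assumptions for illustration. As a quick preliminary check, a single weakly connected component over n nodes requires at least n - 1 edges.

```python
from collections import defaultdict, deque

def weakly_connected_components(nodes, edges):
    """Group nodes into components, treating each (child, parent) edge as undirected."""
    adjacency = defaultdict(set)
    for child, parent in edges:
        adjacency[child].add(parent)
        adjacency[parent].add(child)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components

# Toy graph: two components and one isolated node.
toy_nodes = ["a", "b", "c", "d", "e", "f"]
toy_edges = [("b", "a"), ("c", "a"), ("e", "d")]
components = weakly_connected_components(toy_nodes, toy_edges)
isolated = [c for c in components if len(c) == 1]
```

A network is a single connected component with no isolated nodes exactly when `len(components) == 1` and `isolated` is empty (for a graph with more than one node).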

Visualizing the entire network with a force-directed layout shows naturally forming communities of closely related disease terms. We can use hierarchical community detection algorithms to reduce the number of labels during a feature engineering step.

[image placeholder]
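To make the label-reduction idea concrete, the sketch below uses a deliberately simple stand-in rather than a community detection algorithm: each term is replaced by its ancestor at a fixed level below the root of the HAS_PARENT hierarchy. Changing the level gives coarser or finer labels, which is the effect a hierarchical method would provide. The `parents` mapping, the function names, and the rule of following the alphabetically first parent when a term has several are all assumptions for the example.

```python
def path_to_root(term, parents):
    """Return the chain root -> ... -> term, following the alphabetically first parent.

    Assumes the hierarchy is acyclic, so the walk always terminates.
    """
    path = [term]
    while parents.get(path[-1]):
        path.append(sorted(parents[path[-1]])[0])
    return path[::-1]

def coarse_label(term, parents, level):
    """Label `term` by its ancestor `level` steps below the root (or itself if shallower)."""
    path = path_to_root(term, parents)
    return path[min(level, len(path) - 1)]

# Toy hierarchy: term -> list of parent terms (illustrative only).
parents = {
    "disease": [],
    "infectious disease": ["disease"],
    "viral infectious disease": ["infectious disease"],
    "influenza": ["viral infectious disease"],
    "cancer": ["disease"],
    "lung cancer": ["cancer"],
}

# Collapse every term to its level-1 ancestor to obtain a smaller label set.
labels = {term: coarse_label(term, parents, level=1) for term in parents}
```

In this toy case six terms collapse to three labels at level 1. A community-based version would swap `coarse_label` for a lookup from term to detected community while keeping the same feature engineering step.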