Analyzing Network Metrics - labbces/sugarcane_RNAome GitHub Wiki

The degree of each node (gene) in each network was calculated using this script. This measurement is essential for understanding connectivity within the network: the degree of a node is the number of connections it has to other nodes. Highly connected nodes, or hubs, can significantly influence network dynamics and robustness.
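As an illustration of the degree calculation described above, here is a minimal sketch in Python. It is not the linked script. It assumes the network is available as an undirected edge list, and the function name `node_degrees`, the gene IDs, and the choice to ignore self-loops and duplicate edges are all assumptions made for this example.

```python
from collections import defaultdict

def node_degrees(edges):
    """Return {node: degree} for an undirected edge list.

    Assumption for this sketch: self-loops and duplicate edges are ignored.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # skip self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(adj) for node, adj in neighbors.items()}

# Toy edge list with made-up gene IDs; the last edge repeats geneA-geneC.
edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneB", "geneC"),
    ("geneC", "geneA"),
]
degrees = node_degrees(edges)

# Print as gene<TAB>degree, highest degree first.
for gene, deg in sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{gene}\t{deg}")
```

Storing neighbors in sets keeps repeated edges from inflating a gene's degree, which may or may not match how the original script treats them.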

The degree distribution was then analyzed using this script, producing the plot below. The degree distribution gives insight into the network's topology and helps identify whether the network follows a common pattern such as a normal distribution, a power-law distribution, or another type.

Note: Before plotting the degree distribution, sort the genes by degree in descending order. For example:

sort -k2,2nr Perlo2022_counts_filters_VST_CNC_CV_above0.6_mcl_degree.tsv > Perlo2022_counts_filters_VST_CNC_CV_above0.6_mcl_degree_sorted.tsv
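Given such a degree table, the distribution itself is the fraction of genes observed at each degree. The sketch below assumes a two-column, tab-separated layout (gene ID, then degree), which is consistent with the `sort` key above but is not confirmed by the source. The function name `degree_distribution` and the toy rows are hypothetical, and the tally does not depend on row order.

```python
import csv
import io
from collections import Counter

def degree_distribution(degrees):
    """Map each degree k to the fraction of nodes with exactly that degree."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: counts[k] / n for k in sorted(counts)}

# Stand-in for a degree file (gene<TAB>degree); the IDs and values are made up.
toy_tsv = "geneA\t3\ngeneB\t2\ngeneC\t2\ngeneD\t1\n"
rows = csv.reader(io.StringIO(toy_tsv), delimiter="\t")
degree_values = [int(row[1]) for row in rows]

pk = degree_distribution(degree_values)
for k, p in pk.items():
    print(f"{k}\t{p}")
```

To run this on a real file, the `io.StringIO` object could be replaced by an opened file handle.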

Degree distribution - Network Perlo CV > 0.6

To further investigate the nature of the network, a scale-free network analysis was performed using this script. Scale-free networks are characterized by a power-law degree distribution, where a few nodes (hubs) have many connections, while most nodes have few connections. This property has been observed in many biological networks and is believed to contribute to their robustness and resilience against random failures.

Degree distribution by panRNAome category - Network Perlo CV > 0.6

Degree distribution (log-log) - Network Perlo CV > 0.6

The degree distribution of a scale-free graph can be described by a power-law function, particularly for large values of 𝑘.

P(K = k) = C * k^(-γ)

When this function is plotted on a double logarithmic scale, it appears as a straight line with a slope of -γ. This property has been used as the standard criterion for determining whether a graph is scale-free and for estimating the value of γ. As shown in the figure above, however, the empirical probability distribution function (PDF) is very noisy for large k, because very few nodes have such high degrees, so estimates of γ obtained this way are unreliable.
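The straight-line criterion can be illustrated with an ordinary least-squares fit on log-transformed values. This is a sketch of that simple regression only and is not necessarily what the linked script does. The function name `fit_loglog_slope` and the synthetic, noise-free data with γ = 2.5 are assumptions chosen so the expected slope is known in advance.

```python
import math

def fit_loglog_slope(ks, ps):
    """Least-squares fit of log10(p) against log10(k); returns (slope, intercept)."""
    xs = [math.log10(k) for k in ks]
    ys = [math.log10(p) for p in ps]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic power law P(K = k) = C * k^(-gamma) on k = 1..100 (assumed values).
true_gamma = 2.5
ks = list(range(1, 101))
raw = [k ** -true_gamma for k in ks]
C = 1 / sum(raw)  # normalising constant so the probabilities sum to 1
ps = [C * r for r in raw]

slope, intercept = fit_loglog_slope(ks, ps)
gamma_hat = -slope  # the slope on log-log axes is -gamma
print(f"estimated gamma: {gamma_hat:.3f}")
```

On real data, degrees with zero observed nodes must be dropped before taking logarithms, and the sparse, noisy tail tends to dominate the fit, which is the weakness noted above.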

CDF of degree distribution (log-log) - Network Perlo CV > 0.6

A recommended approach for analyzing the degree distribution is to use the cumulative distribution instead. The Cumulative Distribution Function (CDF), P(K ≤ k), gives the probability that a degree is at most k; its complement, P(K > k), gives the probability that a degree is greater than k. For a power-law degree distribution, this complementary CDF also follows a power-law form. The following CDFs were calculated using this script.
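A short derivation shows why the cumulative tail of a power law is again a power law. It is a sketch that approximates the sum over degrees by an integral and assumes γ > 1, using the same C and γ as in the formula above:

```latex
P(K > k) \;=\; \sum_{k' > k} C\,k'^{-\gamma}
\;\approx\; \int_{k}^{\infty} C\,x^{-\gamma}\,dx
\;=\; \frac{C}{\gamma - 1}\,k^{-(\gamma - 1)},
\qquad \gamma > 1 .
```

Under this approximation, the tail probability plotted on a double logarithmic scale is a straight line with slope -(γ - 1), so γ can be recovered by adding 1 to the magnitude of the fitted slope.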

The CDF represents the cumulative probability of the degree distribution, which increases with k until it reaches 1. A logarithmic axis cannot display 0, so only the nonzero values appear: in the plot they begin at about 10^(-1) (which is 0.1) and increase to 10^(0) (which is 1).

The reference article shows the complementary CDF, P(K > k), which represents the probability that a node's degree is greater than k. This is a decreasing function because, as k increases, fewer nodes have degrees greater than k.
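The difference between the two cumulative forms can be seen on a small example. The sketch below computes both from a list of degrees. The function name `cdf_and_ccdf` and the toy degree values are hypothetical, and this is not the linked script.

```python
from collections import Counter

def cdf_and_ccdf(degrees):
    """Return the sorted distinct degrees with P(K <= k) and P(K > k) for each."""
    counts = Counter(degrees)
    n = len(degrees)
    ks = sorted(counts)
    cdf, ccdf = [], []
    seen = 0  # number of nodes with degree <= current k
    for k in ks:
        seen += counts[k]
        cdf.append(seen / n)
        ccdf.append((n - seen) / n)
    return ks, cdf, ccdf

# Toy degrees for four made-up genes.
degree_values = [3, 2, 2, 1]
ks, cdf, ccdf = cdf_and_ccdf(degree_values)

for k, c, cc in zip(ks, cdf, ccdf):
    print(f"k={k}\tP(K<=k)={c}\tP(K>k)={cc}")
```

The CDF rises to 1 while the complementary CDF falls to 0 at the largest degree. That final zero cannot be drawn on a logarithmic axis, so it is usually dropped before plotting.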