3. Estimating a Network Model - GenomicNetworkAnalysis/GNA GitHub Wiki

This page contains a guide on how to estimate a genetic network model for a set of traits using GNA. The example below is taken from the introductory GNA manuscript (doi), in which we estimated a network for Type 2 diabetes and 5 related cardio-metabolic traits in individuals of East Asian ancestry.

To estimate a network model, the only thing you need is the genetic covariance structure for your set of traits (such as that obtained from multivariable LDSC; see Estimating a genetic covariance structure).

The genetic covariance structure for the example below is included in the GNA package and can be extracted as follows:

# Load the GNA package
require(GNA)

# Extract the example data (will create directory 'example_data' in your current working directory)
refData("example")

# Load the genetic covariance structure into R
LDSC_MET <- readRDS("example_data/LDSC_MET.RDS")

Estimate the genomic network

The traitNET function in GNA is used to estimate a network model from a genetic covariance structure.

The function takes 9 arguments:

covstruc: The genetic covariance structure obtained from multivariable LDSC.
fix_omega: Specifies which elements of the edge weight matrix (omega) are to be estimated. Set to "full" to estimate every element freely (i.e., estimate every possible edge in the network). Alternatively, this can be a matrix of dimensions node x node, with 0 encoding an element fixed to zero and a nonzero value encoding a freely estimated element. Typically this argument should be set to "full" unless you are testing a specific hypothesis about the network structure.
prune: TRUE/FALSE argument indicating whether to prune the network by removing non-significant edges.
p.adjust: P-value adjustment method to use to prune edges. Default is "fdr" (false discovery rate), which we find to provide a sensible multiple testing correction threshold in simulations described in the first GNA paper. Can take any of the options available through the p.adjust function from the stats R package.
alpha: Significance level to use for pruning when estimating the genomic network (default = 0.05).
reestimate: TRUE/FALSE to indicate whether network parameters should be reestimated after pruning.
recursive: TRUE/FALSE to indicate whether the network model should be pruned and reestimated recursively.
graph_layout: Specifies the layout for the network graph. Common layouts include "circle" and "mds" (multi-dimensional scaling).
toler: Tolerance used for matrix inversion of the S or V matrices at different steps of model estimation. If the function returns an error that begins with "system is computationally singular...", toler can be used to set a lower tolerance threshold. This error can indicate that one or more of the traits are underpowered.
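
The pruning arguments above (prune, p.adjust, alpha) can be pictured with base R alone. The sketch below is not GNA's internal code: the edge names and p-values are made up for illustration, and it only shows how the p.adjust function from the stats package with method "fdr", combined with alpha, would decide which edges survive.

```r
# Hypothetical edge p-values (illustrative only, not GNA output)
edge_p <- c(A_B = 0.001, A_C = 0.010, B_C = 0.015, A_D = 0.045, B_D = 0.600)

# Adjust for multiple testing; "fdr" is an alias for Benjamini-Hochberg
p_fdr <- p.adjust(edge_p, method = "fdr")

# Keep only edges whose adjusted p-value is below alpha
alpha <- 0.05
keep <- p_fdr < alpha

# Adjusted p-values and the names of the retained edges
print(round(p_fdr, 4))
print(names(edge_p)[keep])
```

In this toy example the edge A_D has a raw p-value below 0.05 but is no longer significant after adjustment, so it would be pruned.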

Example code:

# Load the GNA package
require(GNA)

# Covariance structure from LDSC
covstruc <- LDSC_MET

# Estimate every possible edge freely (no elements of omega fixed to zero)
fix_omega <- "full"

# Prune the network
prune <- TRUE

# Method to adjust p-values
p.adjust <- "fdr"

# Significance level for edge inclusion
alpha <- 0.05

# Re-estimate edges after pruning
reestimate <- TRUE

# Use recursive estimation
recursive <- TRUE

# Layout of the graph
graph_layout <- "circle"

# Tolerance used for matrix inversion 
toler <- NULL

# Run traitNET with the specified arguments
METnetwork <- traitNET(covstruc=covstruc, fix_omega=fix_omega,
                      prune=prune, p.adjust=p.adjust, alpha=alpha, reestimate=reestimate,
                      recursive=recursive, graph_layout=graph_layout, toler=toler)
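
The call above frees every edge with fix_omega <- "full". When testing a specific structure, the argument list says a node x node matrix can be supplied instead, with 0 marking an element fixed to zero and a nonzero value marking a freely estimated element. A minimal sketch of building such a matrix in base R is shown here. The trait labels are placeholders, and whether traitNET expects dimnames, a symmetric matrix, or a particular diagonal is an assumption to check against the package documentation.

```r
# Placeholder trait labels; replace with the traits in your covariance structure
traits <- c("trait1", "trait2", "trait3", "trait4")
k <- length(traits)

# Start with every element free (nonzero = freely estimated)
omega_spec <- matrix(1, nrow = k, ncol = k, dimnames = list(traits, traits))

# Assumption of this sketch: no self-edges, so the diagonal is fixed to zero
diag(omega_spec) <- 0

# Hypothesis: no direct edge between trait1 and trait4 (set both triangles)
omega_spec["trait1", "trait4"] <- 0
omega_spec["trait4", "trait1"] <- 0

print(omega_spec)
```

A matrix like omega_spec would then be passed as fix_omega in place of "full", presumably with rows and columns in the same trait order as the covariance structure.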

The output stored in METnetwork is a list with two primary items:

model_results contains the model fit and estimated edges (partial genetic correlations) for the saturated network and, if pruning was requested, for the sparse network pruned for significance.
network contains the graph theory metrics that can be used to characterize nodes (centrality and clustering coefficients) and to evaluate clusters of nodes using global metrics. These metrics are described in more detail in the primary paper.
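
Because the edges are described as partial genetic correlations, it may help to see how partial correlations relate to a covariance matrix in general. The sketch below applies the standard identity based on the inverse covariance (precision) matrix to a made-up 3 x 3 matrix. It is not GNA's estimator (there is no model fitting and there are no standard errors), and the values are illustrative only.

```r
# Made-up covariance matrix for three traits (illustrative values only)
S <- matrix(c(1.0, 0.5, 0.3,
              0.5, 1.0, 0.4,
              0.3, 0.4, 1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Precision matrix (inverse of the covariance matrix)
K <- solve(S)

# Partial correlation of i and j given all other variables:
# -K[i, j] / sqrt(K[i, i] * K[j, j])
pcor <- -cov2cor(K)
diag(pcor) <- 1

print(round(pcor, 3))
```

Each off-diagonal entry of pcor is the association between two traits after conditioning on the remaining trait, which is the quantity an edge in the network is meant to represent.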

We focus here on the data stored in model_results. Because we requested a sparse network that prunes edges based on significance, we focus only on the sparse results. We can view the sparse results for this example, and specific pieces of output within them, using the code below:

Example code:


#the full set of sparse results
METnetwork$model_results$sparse

#the estimated network edges
METnetwork$model_results$sparse$parameters

#the fit of the sparse model
METnetwork$model_results$sparse$modelfit

METnetwork$model_results$sparse$parameters is the primary output of this function and should print as below:

These results should be plotted by the package in the requested circular layout as below: