volesti vs cobra toolbox - GeomScale/volesti GitHub Wiki

In this thread we usevolesti to sample feasible flux solutions in a metabolic network and we compare its efficiency with Matlab toolbox cobra. Given the stoichiometric matrix of a metabolic network the set of feasible flux distributions (steady states) is defined by a convex polytope. Sampling uniformly distributed steady states is a powerful tool to study metabolism under changing conditions. In particular, one could estimate the probability density mass of the marginal distribution of a specific reductase flux.

We present this topic to state that volesti could be a good alternative to cobra for studying metabolic networks. More interestingly, using volesti could lead in more efficient scaling with respect to the dimension of polytopes and thus possible to study currently intractable models. To this end, the three following steps should be implemented:

  • Implement the rounding algorithms in here and here.
  • Implement preprocess methods to compute a high dimensional polytope from the data set.
  • Implement an interface to illustrate sampling outcomes.

We uniformly sample steady states from the metabolic network of Escherichia coli given here to illustrate the marginal distribution of ATP synthase reductase flux. In this example e-coli results to a 24-dimensional polytope that we have to sample uniformly distributed points from it.

The following Matlab script samples uniformly distributed points from the convex polytope induced by the metabolic network of Escherichia coli and creates the histogram of the values of coordinates that correspond to ATP synthase reductase flux (12th coordinate). It uses the CHRR algorithm.

load('e_coli_core.mat')
model = e_coli_core;
ATP = 'ATPS4r'; 
ibm = find(ismember(model.rxns, ATP)); % column index of the ATP demand reaction
model.c = zeros(length(model.c),1);
model.csense=model.csense';
[~, roundedPolytope, ~, ~] = chrrSampler(model, 1, 1,1);
nbins = 20;
sampling_times = [];
counter = 1;

for numSteps = [1 5:5:40]
  tic
  samples = genSamples( roundedPolytope,numSteps, 2000);
  sampling_times = [sampling_times toc];
  [yUn, xUn] = hist(samples(ibm, :), nbins);
  subplot(3, 3, counter);
  plot(xUn, yUn);
  xlabel('Flux (mmol/gDW/h)')
  ylabel('# samples')
  title(strcat('number of Steps = ', num2str(numSteps)));
  counter = counter + 1;
end

We get the following plots

e_coli_plots2

volesti does not support low dimensional sampling neither the rounding method implemented in cobra. The following Matlab script saves in a cell the matrices we need to construct the polytope in R and then map the sampled points to the low dimensional polytope in up dimension.

  poly=cell(4,1);
  poly{1} = roundedPolytope.A;
  poly{2} = roundedPolytope.b;
  poly{5} = roundedPolytope.N;
  poly{6} = roundedPolytope.p_shift;
  save('roundedPolytope.mat','poly')
end

Now we use R package R.matlab to load the matrices in R and to construct the polytope.

library(volesti)
library(R.matlab)
modelmat = readMat('roundedPolytope.mat')
roundedPoly = modelmat$poly

A = roundedPoly[1]
A = A[1](/GeomScale/volesti/wiki/1)
A = A[1](/GeomScale/volesti/wiki/1)

b = roundedPoly[2]
b = b[1](/GeomScale/volesti/wiki/1)
b = b[1](/GeomScale/volesti/wiki/1)
           
G = roundedPoly[5]
G = G[1](/GeomScale/volesti/wiki/1)
G = G[1](/GeomScale/volesti/wiki/1)
            
shift = roundedPoly[6]
shift = shift[1](/GeomScale/volesti/wiki/1)
shift = shift[1](/GeomScale/volesti/wiki/1)
             
P = Hpolytope$new(A=A, b=b)

Now we sample with both Coordinate Hit and Run (CDHR), which is the same with CHRR as we sample in the rounded polytope, and Billiard walk.

N=2000

sampling_time_cdhr = c()
for (walk_length in c(1, seq(from=5,to=40,by=5))) {
  tim = system.time({ points =  sample_points(P, n=N, random_walk = list("walk" = "CDHR", 
           "walk_length"=walk_length))
  points = G%*%points + matrix(rep(shift,N),ncol=N) })
  sampling_time_cdhr = c(sampling_time_cdhr, as.numeric(tim)[3])
}

sampling_time_biw  = c()
for (walk_length in 1:9) {
     
  tim = system.time({ points =  sample_points(P, n=N, random_walk = list("walk" = "aBiW", 
          "walk_length"=walk_length))
  points = G%*%points + matrix(rep(shift,N),ncol=N) })
  sampling_time_biw  = c(sampling_time_biw, as.numeric(tim)[3])
}

For CDHR we get the following histograms, which behave almost the same with those of cobra.
cdhr_figs

For Billiard walk we get the following histograms.
biw_figs

Notice that the histogram implies the correct distribution even the walk length equals to one. The latter implies that Billiard walk mixes faster than CDHR (and CHRR).

--- volesti volesti COBRA
Walk length Billiard walk time CDHR time CHRR time
1/1/1 0.099 0.007 3.5299
2/5/5 0.179 0.011 3.5976
3/10/10 0.268 0.015 3.7210
4/15/15 0.347 0.019 3.8905
5/20/20 0.443 0.023 4.0119
6/25/25 0.522 0.028 4.0606
7/30/30 0.607 0.032 4.2878
8/35/35 0.685 0.036 4.5809
9/40/40 0.775 0.040 5.0799

The implementation of CDHR in volesti is 100 times faster than CDRR. However, we expect that billiard walk will be much more efficient than both CDHR and CHRR for high dimensional metabolic networks due to its superior mixing rate.