Runtime of Higashi - ma-compbio/Higashi GitHub Wiki

Runtime of the latest Higashi

The batch size is now chosen automatically as `int(256 * max((1000000 / hic_resolution), 1) * max(cell_number / 6000, 1))`. We are still working on benchmarking the runtime of the latest Higashi (2022/01/26).
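As a minimal sketch of how this heuristic behaves, the helper below (hypothetical, not part of the Higashi codebase) evaluates the formula above for a couple of made-up resolution and cell-number combinations:

```python
def auto_batch_size(hic_resolution, cell_number):
    # Heuristic from the formula above: start at 256 and scale up for finer
    # resolutions (bin size below 1Mb) and larger datasets (above 6,000 cells).
    return int(256 * max((1000000 / hic_resolution), 1) * max(cell_number / 6000, 1))

# Illustrative inputs only (not from the manuscript):
print(auto_batch_size(1000000, 4000))   # 1Mb resolution, 4,000 cells   -> 256
print(auto_batch_size(500000, 12000))   # 500Kb resolution, 12,000 cells -> 1024
```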

Runtime of Higashi reported in manuscript

The runtime analysis was carried out on a Linux machine with 8 NVIDIA RTX 2080 Ti GPU cards, a 16-core Intel Xeon Silver 4110 CPU, and 252 GB of memory. The batch size was set to 192. Since the number of hyperedges varies across datasets, we use the time per 1000 batches as the unit for measuring runtime. For simplicity, we refer to 1000 batches as one epoch, which differs from the conventional definition in which one pass over the whole training dataset is one epoch. For the reported runtime, the Higashi program was set to use all available CPU cores and one GPU card during training. Parallel imputation was disabled, although for smaller datasets one GPU card can hold multiple Higashi models. The runtime of the core operations of Higashi is reported in the table below.
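As a rough illustration of this epoch definition (a sketch with made-up numbers, assuming each batch draws `batch_size` hyperedges), the snippet below converts the number of hyperedges in a dataset into the number of 1000-batch epochs needed for one pass over the data:

```python
import math

BATCH_SIZE = 192          # batch size used in the runtime analysis
BATCHES_PER_EPOCH = 1000  # "epoch" as defined above: 1000 batches

def epochs_per_pass(num_hyperedges, batch_size=BATCH_SIZE):
    # Number of 1000-batch "epochs" needed to iterate once over all hyperedges,
    # assuming each batch draws batch_size hyperedges.
    return math.ceil(num_hyperedges / batch_size) / BATCHES_PER_EPOCH

# Illustrative dataset size only (not from the manuscript):
print(epochs_per_pass(5_000_000))  # ~26 "epochs" for one pass over 5M hyperedges
```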

| Operation | Average runtime |
| --- | --- |
| Training without cd-GNN | 61.3s / epoch |
| Training with cd-GNN ($k=0$) | 95.2s / epoch |
| Training with cd-GNN ($k=4$, fast mode) | 109.6s / epoch |
| Training with cd-GNN ($k=4$, memory saving mode) | 221.4s / epoch |
| Imputation (1Mb resolution, hg38, autosomal chromosomes) | 0.2s / cell |
| Imputation (500Kb resolution, hg38, autosomal chromosomes) | 0.8s / cell |
| Imputation (100Kb resolution, hg38, autosomal chromosomes) | 21.3s / cell |
| Imputation (50Kb resolution, hg38, autosomal chromosomes) | 76.8s / cell |
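
To translate the per-cell imputation runtimes into a whole-dataset estimate, one can simply multiply by the number of cells. The sketch below (hypothetical helper with an illustrative cell count) assumes cells are imputed sequentially on a single GPU, without parallel imputation:

```python
# Per-cell imputation runtimes from the table above (seconds per cell)
SECONDS_PER_CELL = {"1Mb": 0.2, "500Kb": 0.8, "100Kb": 21.3, "50Kb": 76.8}

def imputation_hours(cell_number, resolution):
    # Total imputation time in hours, assuming cells are imputed one by one
    # on a single GPU (parallel imputation disabled, as in the analysis above).
    return cell_number * SECONDS_PER_CELL[resolution] / 3600

# Illustrative cell number only (not a dataset from the manuscript):
print(f"{imputation_hours(4000, '100Kb'):.1f} h")  # ~23.7 h
print(f"{imputation_hours(4000, '1Mb'):.2f} h")    # ~0.22 h
```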