Runtime of Higashi - ma-compbio/Higashi GitHub Wiki
Runtime of the latest Higashi
The batch size is now automatically chosen as `int(256 * max((1000000 / hic_resolution), 1) * max(cell_number / 6000, 1))`.
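As a quick illustration of the batch-size expression above, the following minimal sketch evaluates it for a few settings. The function name `auto_batch_size` is hypothetical and is not part of the Higashi code base; only the arithmetic is taken from the expression itself.

```python
def auto_batch_size(hic_resolution: int, cell_number: int) -> int:
    """Evaluate the batch-size expression quoted above.

    hic_resolution: bin size in base pairs (e.g. 1000000 for 1Mb).
    cell_number: number of cells in the dataset.
    Note: this helper is an illustrative assumption, not a Higashi API.
    """
    # Finer than 1Mb resolution scales the batch size up; coarser leaves it unchanged.
    resolution_factor = max(1000000 / hic_resolution, 1)
    # More than 6000 cells scales the batch size up; fewer leaves it unchanged.
    cell_factor = max(cell_number / 6000, 1)
    return int(256 * resolution_factor * cell_factor)


if __name__ == "__main__":
    for res, cells in [(1000000, 1000), (500000, 6000), (100000, 12000)]:
        print(res, cells, auto_batch_size(res, cells))
```

Under this expression the batch size never drops below 256, and it grows linearly with both finer resolution and larger cell counts.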
We are still working on the benchmarking of Higashi runtime (2022/01/26).
Runtime of Higashi reported in manuscript
The runtime analysis was carried out on a Linux machine with 8 NVIDIA RTX 2080 Ti GPU cards, a 16-core Intel Xeon Silver 4110 CPU, and 252GB of memory. The batch size is set to 192. Because the number of hyperedges varies across datasets, we measure runtime as the operation time per 1000 batches. For simplicity, we refer to 1000 batches as one epoch; this differs from the conventional definition, in which one epoch is one pass over the whole training dataset. For the reported runtime, the Higashi program is set to use all available CPU cores and one GPU card during training. Parallel imputation is disabled, although for smaller datasets one GPU card could fit multiple Higashi models. The runtime of the core operations of Higashi is reported in the table below.
Operation | Average runtime |
---|---|
Training without cd-GNN | 61.3s / epoch |
Training with cd-GNN ($k=0$) | 95.2s / epoch |
Training with cd-GNN ($k=4$, fast mode) | 109.6s / epoch |
Training with cd-GNN ($k=4$, memory saving mode) | 221.4s / epoch |
Imputation (1Mb resolution, hg38, autosomal chromosomes) | 0.2s / cell |
Imputation (500Kb resolution, hg38, autosomal chromosomes) | 0.8s / cell |
Imputation (100Kb resolution, hg38, autosomal chromosomes) | 21.3s / cell |
Imputation (50Kb resolution, hg38, autosomal chromosomes) | 76.8s / cell |
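
For a rough sense of scale, the per-epoch and per-cell figures in the table can be multiplied out. The sketch below assumes that runtime scales linearly with the number of epochs and cells, and that the hardware and batch size match the setup described above; the function names and mode labels are hypothetical, and the number of epochs is an input you must supply.

```python
# Seconds per cell for imputation, copied from the table (keyed by resolution in bp).
IMPUTE_SEC_PER_CELL = {1_000_000: 0.2, 500_000: 0.8, 100_000: 21.3, 50_000: 76.8}

# Seconds per epoch for training, copied from the table.
# One "epoch" here means 1000 batches; the keys are illustrative labels only.
TRAIN_SEC_PER_EPOCH = {
    "no_cdgnn": 61.3,
    "k0": 95.2,
    "k4_fast": 109.6,
    "k4_memory_saving": 221.4,
}


def imputation_hours(n_cells: int, resolution: int) -> float:
    """Estimated imputation time in hours, assuming linear scaling in cells."""
    return n_cells * IMPUTE_SEC_PER_CELL[resolution] / 3600


def training_hours(n_epochs: int, mode: str) -> float:
    """Estimated training time in hours for n_epochs of 1000 batches each."""
    return n_epochs * TRAIN_SEC_PER_EPOCH[mode] / 3600


if __name__ == "__main__":
    print(f"{imputation_hours(4000, 100_000):.1f} h imputation")
    print(f"{training_hours(60, 'k4_fast'):.1f} h training")
```

For example, imputing 4000 cells at 100Kb resolution comes to roughly 23.7 hours on one GPU under these assumptions, which is where parallel imputation on smaller datasets may help.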