Output of Higashi analysis - ma-compbio/Higashi GitHub Wiki

The output path, naming convention, and related settings are defined in the configuration file (see "configuration of parameters" for details). For instance, all output of Higashi-analysis is stored in the temp_dir given in the configuration file. See the tutorials for examples of how to use these output files.

Single cell TAD calling & calibration results

The single cell TAD calling and calibration results are stored in scTAD.hdf5, which contains single cell insulation scores, single cell boundaries, and calibrated single cell boundaries.

This hdf5 file has a structure similar to the Cooler format.

.
├── insulation
│   ├── bin (information about entries in the insulation file)
│   │   ├── chrom (vector of size k)
│   │   ├── start (vector of size k)
│   │   └── end (vector of size k)
│   ├── bulk (vector of size k, continuous insulation scores of the pooled imputed scHi-C)
│   ├── cell 0 (vector of size k, continuous insulation scores of cell 0)
│   ├── ...
│   └── cell N
├── tads (same structure as "insulation" above but without "bulk"; each cell is stored as a binary vector of size k indicating whether each bin is a TAD boundary)
└── calib_tads (same structure as "tads" above)
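
As a rough illustration of the layout above, the following sketch reads one top-level group into NumPy arrays with h5py. It is not part of Higashi: the helper `read_cell_vectors` is hypothetical, and the per-cell dataset names (`cell_0`, `cell_1`, ...) are an assumption, since the tree only shows them as "cell 0". To stay self-contained, the example first writes a small synthetic file that mimics the tree; with real output you would point `path` at the scTAD.hdf5 in your temp_dir and inspect the keys first.

```python
import os
import tempfile

import h5py
import numpy as np


def read_cell_vectors(path, group, prefix="cell"):
    """Return (bins, bulk, cells) for one top-level group of the file.

    Per-cell datasets are matched by prefix, because the exact separator
    between "cell" and the index is an assumption of this sketch.
    """
    with h5py.File(path, "r") as f:
        grp = f[group]
        bins = {key: grp["bin"][key][()] for key in ("chrom", "start", "end")}
        # "bulk" exists under "insulation" but not under "tads"/"calib_tads".
        bulk = grp["bulk"][()] if "bulk" in grp else None
        names = [n for n in grp.keys() if n.startswith(prefix)]
        # Sort by the numeric cell index so that cell 10 follows cell 9.
        names.sort(key=lambda n: int("".join(ch for ch in n if ch.isdigit())))
        cells = np.stack([grp[n][()] for n in names])  # shape: (n_cells, k)
    return bins, bulk, cells


# Build a toy file that mimics the documented tree (synthetic values only).
k, n_cells = 6, 3
rng = np.random.default_rng(0)
path = os.path.join(tempfile.mkdtemp(), "scTAD.hdf5")
with h5py.File(path, "w") as f:
    for group in ("insulation", "tads", "calib_tads"):
        grp = f.create_group(group)
        b = grp.create_group("bin")
        b.create_dataset("chrom", data=np.array([b"chr1"] * k))
        b.create_dataset("start", data=np.arange(k) * 1_000_000)
        b.create_dataset("end", data=(np.arange(k) + 1) * 1_000_000)
        if group == "insulation":
            grp.create_dataset("bulk", data=rng.random(k))
        for i in range(n_cells):
            if group == "insulation":
                values = rng.random(k)
            else:
                values = rng.integers(0, 2, size=k)
            grp.create_dataset(f"cell_{i}", data=values)

bins, bulk, insulation = read_cell_vectors(path, "insulation")
_, no_bulk, calib = read_cell_vectors(path, "calib_tads")

# Fraction of cells in which each bin is marked in calib_tads.
boundary_freq = calib.mean(axis=0)
print(insulation.shape, boundary_freq.shape)
```

Stacking the per-cell vectors into a cell-by-bin matrix makes population summaries, such as the per-bin frequency computed at the end, a single reduction along the cell axis.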

Single cell A/B compartment calling results

The single cell A/B compartment calling results are stored in scCompartment.hdf5.

This hdf5 file has a structure similar to the Cooler format.

Note: The +/- signs do not necessarily correspond to A/B compartments, nor are they consistent across different chromosomes or even different arms of the same chromosome. We do provide a method to calibrate the signs using sequence features (GC content or CpG frequencies). See Single cell A B compartment calling for details.

.
├── compartment
│   ├── bin (information about entries in the signal file)
│   │   ├── chrom (vector of size k)
│   │   ├── start (vector of size k)
│   │   └── end (vector of size k)
│   ├── bulk (vector of size k, compartment values called on pooled raw scHi-C contact maps)
│   ├── real_bulk (optional; vector of size k, compartment values called on true bulk Hi-C)
│   ├── cell 0 (vector of size k, continuous single cell A/B compartment values for cell 0)
│   ├── ...
│   └── cell N
├── compartment_raw (same structure as "compartment"; stores the single cell compartment scores without normalization)
└── compartment_zscore (same structure as "compartment"; stores the single cell compartment scores with z-score normalization)
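
Because the signs are not guaranteed to be consistent across chromosomes, it is usually safer to work with the compartment vectors one chromosome at a time. The sketch below is a minimal illustration under stated assumptions, not Higashi's own code. It splits the vectors by the `chrom` entries in `bin` and computes, for every cell, the Pearson correlation with the `bulk` vector. The function `per_chrom_cell_bulk_corr` is hypothetical, the dataset names `cell_0`, `cell_1`, ... are an assumption, and the file it reads is a small synthetic one written by the example itself.

```python
import os
import tempfile

import h5py
import numpy as np


def per_chrom_cell_bulk_corr(path, group="compartment", prefix="cell"):
    """Pearson correlation of each cell with "bulk", per chromosome."""
    with h5py.File(path, "r") as f:
        grp = f[group]
        chrom = grp["bin"]["chrom"][()]
        bulk = grp["bulk"][()]
        names = sorted(
            (n for n in grp.keys() if n.startswith(prefix)),
            key=lambda n: int("".join(ch for ch in n if ch.isdigit())),
        )
        cells = np.stack([grp[n][()] for n in names])  # (n_cells, k)

    out = {}
    for c in np.unique(chrom):
        mask = chrom == c
        # h5py returns byte strings for string datasets; decode for display.
        name = c.decode() if isinstance(c, bytes) else str(c)
        out[name] = np.array(
            [np.corrcoef(row[mask], bulk[mask])[0, 1] for row in cells]
        )
    return out


# Toy file with two chromosomes of five bins each (synthetic values only).
rng = np.random.default_rng(0)
chrom = np.array([b"chr1"] * 5 + [b"chr2"] * 5)
k, n_cells = len(chrom), 3
bulk = rng.normal(size=k)
starts = np.tile(np.arange(5), 2) * 1_000_000
path = os.path.join(tempfile.mkdtemp(), "scCompartment.hdf5")
with h5py.File(path, "w") as f:
    grp = f.create_group("compartment")
    b = grp.create_group("bin")
    b.create_dataset("chrom", data=chrom)
    b.create_dataset("start", data=starts)
    b.create_dataset("end", data=starts + 1_000_000)
    grp.create_dataset("bulk", data=bulk)
    for i in range(n_cells):
        noisy = bulk + 0.1 * rng.normal(size=k)
        grp.create_dataset(f"cell_{i}", data=noisy)

corr = per_chrom_cell_bulk_corr(path)
for name, r in corr.items():
    print(name, np.round(r, 2))
```

The same function can be pointed at "compartment_raw" or "compartment_zscore" through the `group` argument, since those groups are documented as sharing the structure of "compartment".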