Training
This walkthrough uses data from the GATA3 transcription factor (TF) model.
Four cell types are available for the GATA3 model:
Jurkat
MCF-7
A549
SK-N-SH
With the GATA3 model, we can train on three of the cell types and evaluate model performance on an independent, held-out cell type. Like most maxATAC models, the GATA3 model draws its data from multiple sources: both the GEO and ENCODE databases.
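The train/held-out idea above can be sketched in a few lines of plain Python. This is only an illustration, not part of maxATAC: the function name `leave_one_out_split` is hypothetical, and the choice of SK-N-SH as the held-out cell type is an arbitrary assumption made for the example.

```python
from typing import Dict, List

# The four cell types available for the GATA3 model.
CELL_TYPES = ["Jurkat", "MCF-7", "A549", "SK-N-SH"]


def leave_one_out_split(cell_types: List[str], held_out: str) -> Dict[str, List[str]]:
    """Return the training cell types and the single held-out cell type.

    Hypothetical helper for illustration only; it is not a maxATAC function.
    """
    if held_out not in cell_types:
        raise ValueError(f"{held_out!r} is not one of the available cell types")
    train = [cell_type for cell_type in cell_types if cell_type != held_out]
    return {"train": train, "test": [held_out]}


# Assumption: SK-N-SH is held out here purely as an example.
split = leave_one_out_split(CELL_TYPES, held_out="SK-N-SH")
print("train on:", split["train"])
print("evaluate on:", split["test"])
```

Keeping the held-out cell type entirely out of the training list is what makes the evaluation independent: no experiment from that cell type is seen during training.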
The following table lists the metadata for the experiments used:
ENCODE_ACCESSION | CELL_TYPE | ENCODE_BIO_REP | TECH_REP | BIO_REP | SOURCE |
---|---|---|---|---|---|
NaN | Jurkat | 1 | SRR8820028 | SRX5608483 | GEO |
NaN | Jurkat | 1 | SRR8820029 | SRX5608483 | GEO |
NaN | Jurkat | 1 | SRR8820030 | SRX5608483 | GEO |
NaN | Jurkat | 1 | SRR13199795 | SRX9633376 | GEO |
NaN | Jurkat | 1 | SRR13199794 | SRX9633376 | GEO |
NaN | Jurkat | 1 | SRR13199792 | SRX9633375 | GEO |
NaN | Jurkat | 1 | SRR13199793 | SRX9633375 | GEO |
ENCSR422SUG | MCF-7 | 1 | SRR14103347 | SRX10475183 | ENCODE |
ENCSR422SUG | MCF-7 | 2 | SRR14103348 | SRX10475184 | ENCODE |
NaN | MCF-7 | 1 | SRR13199806 | SRX9633382 | GEO |
NaN | MCF-7 | 1 | SRR13199807 | SRX9633382 | GEO |
NaN | MCF-7 | 1 | SRR13199804 | SRX9633381 | GEO |
NaN | MCF-7 | 1 | SRR13199805 | SRX9633381 | GEO |
ENCSR032RGS | A549 | 1 | SRR14103424 | SRX10475232 | ENCODE |
ENCSR032RGS | A549 | 1 | SRR14103425 | SRX10475232 | ENCODE |
ENCSR032RGS | A549 | 1 | SRR14103426 | SRX10475232 | ENCODE |
ENCSR032RGS | A549 | 1 | SRR14103427 | SRX10475232 | ENCODE |
ENCSR032RGS | A549 | 2 | SRR14103420 | SRX10475231 | ENCODE |
ENCSR032RGS | A549 | 2 | SRR14103421 | SRX10475231 | ENCODE |
ENCSR032RGS | A549 | 2 | SRR14103422 | SRX10475231 | ENCODE |
ENCSR032RGS | A549 | 2 | SRR14103423 | SRX10475231 | ENCODE |
ENCSR032RGS | A549 | 3 | SRR14103428 | SRX10475233 | ENCODE |
ENCSR032RGS | A549 | 3 | SRR14103429 | SRX10475233 | ENCODE |
ENCSR032RGS | A549 | 3 | SRR14103430 | SRX10475233 | ENCODE |
ENCSR032RGS | A549 | 3 | SRR14103431 | SRX10475233 | ENCODE |
ENCSR587TRP | SK-N-SH | 1 | SRR14305486 | SRX10661032 | ENCODE |
ENCSR587TRP | SK-N-SH | 1 | SRR14305487 | SRX10661032 | ENCODE |
ENCSR587TRP | SK-N-SH | 1 | SRR14305488 | SRX10661032 | ENCODE |
ENCSR587TRP | SK-N-SH | 1 | SRR14305489 | SRX10661032 | ENCODE |
ENCSR587TRP | SK-N-SH | 2 | SRR14305490 | SRX10661033 | ENCODE |
ENCSR587TRP | SK-N-SH | 2 | SRR14305491 | SRX10661033 | ENCODE |
ENCSR587TRP | SK-N-SH | 2 | SRR14305492 | SRX10661033 | ENCODE |
ENCSR587TRP | SK-N-SH | 2 | SRR14305493 | SRX10661033 | ENCODE |
ENCSR587TRP | SK-N-SH | 3 | SRR14305494 | SRX10661034 | ENCODE |
ENCSR587TRP | SK-N-SH | 3 | SRR14305495 | SRX10661034 | ENCODE |
ENCSR587TRP | SK-N-SH | 3 | SRR14305496 | SRX10661034 | ENCODE |
ENCSR587TRP | SK-N-SH | 3 | SRR14305497 | SRX10661034 | ENCODE |
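
In the table above, several technical replicates (`TECH_REP`, SRR run accessions) share one biological replicate (`BIO_REP`, SRX experiment accessions). As a minimal sketch of how that grouping could be inspected, the snippet below parses a few rows copied from the table and collects the runs for each cell type and biological replicate. The helper name `group_runs` is hypothetical and not part of maxATAC, and only a subset of rows is included to keep the example short.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A few rows copied from the metadata table above (not the full table).
METADATA_ROWS = """\
NaN | Jurkat | 1 | SRR8820028 | SRX5608483 | GEO |
NaN | Jurkat | 1 | SRR8820029 | SRX5608483 | GEO |
NaN | Jurkat | 1 | SRR8820030 | SRX5608483 | GEO |
NaN | Jurkat | 1 | SRR13199795 | SRX9633376 | GEO |
NaN | Jurkat | 1 | SRR13199794 | SRX9633376 | GEO |
ENCSR422SUG | MCF-7 | 1 | SRR14103347 | SRX10475183 | ENCODE |
ENCSR422SUG | MCF-7 | 2 | SRR14103348 | SRX10475184 | ENCODE |
"""


def group_runs(table_text: str) -> Dict[Tuple[str, str], List[str]]:
    """Group TECH_REP run accessions by (CELL_TYPE, BIO_REP).

    Hypothetical helper for illustration; assumes six pipe-separated columns
    in the same order as the table header.
    """
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for line in table_text.splitlines():
        # Drop the trailing pipe, then split the row into its columns.
        fields = [field.strip() for field in line.strip().strip("|").split("|")]
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        _accession, cell_type, _encode_rep, tech_rep, bio_rep, _source = fields
        groups[(cell_type, bio_rep)].append(tech_rep)
    return dict(groups)


runs_by_sample = group_runs(METADATA_ROWS)
for (cell_type, bio_rep), runs in runs_by_sample.items():
    print(f"{cell_type} {bio_rep}: {len(runs)} technical replicate(s)")
```

The same function should work on the full table text, provided the header and separator rows are left out or fail the six-column check.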