Entry 7: Process of Finding a Dataset - bcb420-2025/Izumi_Ando GitHub Wiki
Criteria
- Cancer-related dataset from Japan, or from a lab at a university I might want to work at in the future
- 6+ samples, at least 3 per comparable condition, all from Homo sapiens
- bulk RNA-seq
- published in or after 2022
- ideally associated with a publication in a notable journal
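
The hard criteria above (species, assay type, year, sample counts) can be checked mechanically before opening any data files. Below is a minimal sketch, assuming each candidate is summarized by hand into a small record. The `Candidate` class, its field names, `unmet_criteria`, and the two example entries are all hypothetical and are not real GEO records. The softer criteria (location, journal) are left to manual judgment.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hand-entered summary of one candidate dataset (hypothetical structure)."""
    label: str
    organism: str
    assay: str
    year: int
    conditions: List[str]  # one condition label per sample


def unmet_criteria(c, min_samples=6, min_per_condition=3, min_year=2022):
    """Return a list of hard criteria that the candidate fails (empty if none)."""
    problems = []
    if c.organism.lower() != "homo sapiens":
        problems.append("not Homo sapiens")
    if c.assay != "bulk RNA-seq":
        problems.append("not bulk RNA-seq")
    if c.year < min_year:
        problems.append("published before %d" % min_year)
    if len(c.conditions) < min_samples:
        problems.append("fewer than %d samples" % min_samples)
    # Every condition needs enough replicates to be comparable.
    per_condition = Counter(c.conditions)
    if any(n < min_per_condition for n in per_condition.values()):
        problems.append("fewer than %d samples in some condition" % min_per_condition)
    return problems


# Made-up examples for illustration only.
ok = Candidate("example A", "Homo sapiens", "bulk RNA-seq", 2023,
               ["control"] * 3 + ["knockdown"] * 3)
too_small = Candidate("example B", "Homo sapiens", "bulk RNA-seq", 2023,
                      ["control", "treated"])

print(unmet_criteria(ok))
print(unmet_criteria(too_small))
```

Returning the list of failed criteria, rather than a plain yes/no, keeps a record of why each candidate was rejected, which matches the "Issue:" notes kept for each dataset.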
- Japan
  - publication in Gynecologic Oncology, which has a modest but not too low impact factor (IF)
  - Concern: the main analysis done on this dataset is differential expression analysis. Is this okay?
    - (Update) Yes, according to Prof. Isserlin
- Stanford
  - publication in Integrative Biology
  - uses enteroids (a type of organoid) to test the anti-tumor effect of FLASH radiation therapy
  - Issue: the data file is broken
- ETH Zurich
  - publication in a Nature Portfolio journal
  - Issue: this dataset is scRNA-seq; the publication also has bulk RNA-seq data, but only 2 samples
- ETH Zurich
  - data file is well organized, and the paper is readable
  - dataset of Panc1 cells treated with either control siRNA or siRNA targeting SF3B1, 3 samples of each
  - publication in a Cell Press journal