Benchmarks: Data - simonrharris/SKA GitHub Wiki

Benchmarking Data

The raw sequence data used in the benchmarks in this wiki are from two sources.

  1. 65 Staphylococcus aureus samples from Harris SR et al. 2012. Whole-genome sequencing for analysis of an outbreak of meticillin-resistant Staphylococcus aureus: a descriptive study. The Lancet Infectious Diseases, Volume 13, Issue 2, pages 130-136. The 65 samples include some that are diverse rather than closely related. To illustrate the improvements in speed, memory usage and file size when applying SKA to outbreak data, analyses of the 45 samples that were deemed part of the outbreak are also presented separately.
  2. 22 Campylobacter jejuni, 9 Escherichia coli, 31 Listeria monocytogenes and 23 Salmonella enterica samples from Timme RE et al. 2017. Benchmark datasets for phylogenomic pipeline validation, applications for foodborne pathogen surveillance. PeerJ, Volume 5, e3893. The data are available on GitHub.

Accession Numbers

Staphylococcus aureus

All samples

ERR070033  ERR070043  ERR072250  ERR124434  ERR128712  ERR131801  ERR131811
ERR070034  ERR070044  ERR072251  ERR124435  ERR128713  ERR131802  ERR131812
ERR070035  ERR070045  ERR072252  ERR124436  ERR128714  ERR131803  ERR131813
ERR070036  ERR070046  ERR072253  ERR128705  ERR128715  ERR131804  ERR131814
ERR070037  ERR070047  ERR108054  ERR128706  ERR128716  ERR131805  ERR131815
ERR070038  ERR070048  ERR124429  ERR128707  ERR128717  ERR131806
ERR070039  ERR072246  ERR124430  ERR128708  ERR128718  ERR131807
ERR070040  ERR072247  ERR124431  ERR128709  ERR128719  ERR131808
ERR070041  ERR072248  ERR124432  ERR128710  ERR128720  ERR131809
ERR070042  ERR072249  ERR124433  ERR128711  ERR131800  ERR131810

Outbreak samples

ERR070033 ERR070043 ERR072247 ERR124434 ERR128712 ERR128719 ERR131813
ERR070034 ERR070044 ERR108054 ERR124435 ERR128713 ERR128720 ERR131814
ERR070036 ERR070045 ERR124429 ERR128707 ERR128714 ERR131808 ERR131815
ERR070038 ERR070046 ERR124430 ERR128708 ERR128715 ERR131809
ERR070039 ERR070047 ERR124431 ERR128709 ERR128716 ERR131810
ERR070040 ERR070048 ERR124432 ERR128710 ERR128717 ERR131811
ERR070042 ERR072246 ERR124433 ERR128711 ERR128718 ERR131812
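
The two Staphylococcus aureus lists above can be sanity-checked before running any benchmark. The following is a minimal sketch, not part of SKA: the function names `parse_accessions` and `check_lists` are hypothetical, and the code assumes the lists have been copied as whitespace-delimited text. It counts unique accessions, reports duplicates, and flags any outbreak accession missing from the full set.

```python
from collections import Counter


def parse_accessions(block: str) -> list[str]:
    """Split a whitespace-delimited block of accessions into a flat list."""
    return block.split()


def check_lists(all_block: str, outbreak_block: str) -> dict:
    """Summarise two accession lists (hypothetical helper, for illustration)."""
    all_ids = parse_accessions(all_block)
    outbreak_ids = parse_accessions(outbreak_block)
    counts = Counter(all_ids)
    return {
        # Number of distinct accessions in each list
        "n_all": len(set(all_ids)),
        "n_outbreak": len(set(outbreak_ids)),
        # Accessions that appear more than once in the full list
        "duplicates": sorted(a for a, n in counts.items() if n > 1),
        # Outbreak accessions that are absent from the full list
        "not_in_all": sorted(set(outbreak_ids) - set(all_ids)),
    }


if __name__ == "__main__":
    # Only the first rows of each list are pasted here to keep the sketch short.
    all_rows = """
    ERR070033  ERR070043  ERR072250  ERR124434  ERR128712  ERR131801  ERR131811
    ERR070034  ERR070044  ERR072251  ERR124435  ERR128713  ERR131802  ERR131812
    """
    outbreak_rows = """
    ERR070033 ERR070043 ERR124434 ERR128712
    """
    print(check_lists(all_rows, outbreak_rows))
```

Applied to the two complete lists above, the check should report 65 and 45 unique accessions, with no duplicates and no outbreak sample missing from the full set.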

Campylobacter jejuni

SRR1993270  SRR1999661  SRR3214715  SRR3215123  SRR3215210  SRR3216186
SRR1993271  SRR2984947  SRR3215024  SRR3215124  SRR3215211  SRR3216366
SRR1993272  SRR2985018  SRR3215107  SRR3215135  SRR3216118
SRR1999649  SRR2985019  SRR3215108  SRR3215209  SRR3216133

Escherichia coli

SRR1609861  SRR1609871  SRR1610029  SRR1610032  SRR1610034
SRR1609862  SRR1610028  SRR1610031  SRR1610033

Listeria monocytogenes

SRR1206159  SRR1553774  SRR1553821  SRR1553882  SRR1556293  SRR1597487
SRR1393979  SRR1553788  SRR1553826  SRR1553907  SRR1556294
SRR1534987  SRR1553791  SRR1553827  SRR1556288  SRR1556295
SRR1553739  SRR1553792  SRR1553851  SRR1556289  SRR1556296
SRR1553756  SRR1553804  SRR1553856  SRR1556290  SRR1556297
SRR1553773  SRR1553816  SRR1553867  SRR1556291  SRR1562157

Salmonella enterica

SRR1258439  SRR498276  SRR498399  SRR498422  SRR498433  SRR498444
SRR1258440  SRR498369  SRR498402  SRR498423  SRR498434  SRR500493
SRR1258442  SRR498373  SRR498403  SRR498425  SRR498436  SRR500494
SRR1258443  SRR498397  SRR498404  SRR498431  SRR498442
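
This page does not say how the reads were downloaded. One possible route, shown here only as a sketch, is the `fasterq-dump` tool from the NCBI SRA Toolkit. The code below assumes the toolkit is installed and that every accession listed above, including the ERR runs, can be resolved by it. The helper name `fasterq_dump_commands` and the output directory `reads` are hypothetical. The commands are only built and printed, not executed.

```python
import shlex


def fasterq_dump_commands(accessions, outdir="reads"):
    """Build one fasterq-dump command per accession (nothing is run here).

    --split-files writes paired reads to separate files;
    --outdir sets the output directory (the name 'reads' is an assumption).
    """
    return [
        ["fasterq-dump", "--split-files", "--outdir", outdir, acc]
        for acc in accessions
    ]


if __name__ == "__main__":
    # Two Escherichia coli accessions from the list above, as an example.
    for cmd in fasterq_dump_commands(["SRR1609861", "SRR1609862"]):
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the command lists can be passed to `subprocess.run` once the toolkit is confirmed to be available.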