Home - rrwick/Perfect-bacterial-genome-tutorial GitHub Wiki

This tutorial is the companion to our paper: Wick RR, Judd LM, Holt KE. Assembling the perfect bacterial genome using Oxford Nanopore and Illumina sequencing. PLOS Computational Biology. 2023. doi:10.1371/journal.pcbi.1010905.

It provides sample data and detailed instructions so you can try the paper's assembly method for yourself. This tutorial is not a general-purpose guide to bacterial genome assembly. Rather, it assumes that you want the best possible assembly (ideally zero errors) and are willing to put in extra time and effort to achieve this level of accuracy.

For an example genome where we followed this method (and the source of the sample data), see: Wick RR, Judd LM, Monk IR, Seemann T, Stinear TP. Improved Genome Sequence of Australian Methicillin-Resistant Staphylococcus aureus Strain JKD6159. Microbiology Resource Announcements. 2023. doi:10.1128/mra.01129-22.

Before you begin, check out the Requirements and Sample data pages to make sure that you have everything you need.

You can then proceed to the tutorial, choosing your difficulty level:

  • EASY. This will take you step-by-step through the entire process of assembling the sample S. aureus data. Exact commands are provided (so you shouldn't have to read the documentation for any of the software) and expected results are described after each step. Not too much thinking or problem-solving is required.
  • MEDIUM. This will give you moderately detailed instructions. Exact commands are not given (so you will need to consult software documentation), but there are tips and guidelines along the way to help you stay on track.
  • HARD. This will only give brief and high-level instructions. You'll have to figure things out on your own!

Stepping through the easy tutorial won't be very educational, so I'd encourage you to try at least medium. You can always consult the easy version if you get stuck. Also note that the easy and medium tutorials assume you're assembling the S. aureus sample data, while the hard tutorial is sufficiently high-level to work with any decent ONT+Illumina dataset.
