Quick Usage Guide - saitomics/metatryp GitHub Wiki
1: Digest and ingest data
- Run the script bin/digest_and_ingest.sh, passing the FASTA proteome files you wish to digest and ingest, e.g.:
bin/digest_and_ingest.sh file1.fasta file2.fasta ...
This script reads the FASTA files and runs digestions on their sequences. Expect a fair amount of output while the files are processed.
2: Generate redundancy tables
- List the available taxon IDs by querying the database, e.g.:
bin/list_taxon_ids.sh
- Generate redundancy tables for groups of taxa, e.g.:
bin/generate_redundancy_tables.sh --taxon-ids syn8102 syn7502 syn7503 --output-dir exampleRedundancyTables
Note that you can also specify a file that contains a list of taxon IDs, e.g.:
bin/generate_redundancy_tables.sh --taxon-id-file taxon_id_list.txt --output-dir exampleRedundancyTables
- View the resulting files in exampleRedundancyTables:
- redundancy.db.sqlite is generated with the redundancy information
- counts.csv contains counts of redundant peptides
- percents.csv contains the values in counts.csv, each divided by the number of unique peptides in the union of the digestions of a taxon pair.
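To make the percents.csv definition concrete, here is a minimal shell sketch of the arithmetic for one taxon pair. It is not the metatryp implementation. The peptide sequences, file names, and variable names are hypothetical. It assumes a "redundant" peptide is one that occurs in the digestions of both taxa. Whether metatryp stores the result as a fraction or multiplies it by 100 is not stated here, so the sketch stops at the plain ratio.

```shell
# Use the C locale so sort, comm, and awk behave consistently.
export LC_ALL=C

# Hypothetical peptide sets for two taxa (made-up sequences, one per line).
workdir=$(mktemp -d)
printf '%s\n' AAK CDEFR GHIK LMNPR > "$workdir/taxon_a.txt"
printf '%s\n' CDEFR GHIK QSTVK > "$workdir/taxon_b.txt"

# Sort and de-duplicate each set so that comm can compare them.
sort -u "$workdir/taxon_a.txt" > "$workdir/a.sorted"
sort -u "$workdir/taxon_b.txt" > "$workdir/b.sorted"

# Assumed meaning of "redundant": peptides present in both digestions.
shared=$(comm -12 "$workdir/a.sorted" "$workdir/b.sorted" | wc -l | tr -d ' ')

# Unique peptides in the union of the two digestions.
union=$(sort -u "$workdir/a.sorted" "$workdir/b.sorted" | wc -l | tr -d ' ')

# Count divided by the size of the union, as described for percents.csv.
ratio=$(awk -v s="$shared" -v u="$union" 'BEGIN { printf "%.2f", s / u }')

echo "shared=$shared union=$union ratio=$ratio"
rm -rf "$workdir"
```

With these made-up inputs, two peptides are shared out of five unique peptides in the union, so the ratio is 0.40.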
3: Remove taxa from the database
To delete the data for a given set of taxa from the database, run a command like the following, where taxa_name is a placeholder for the taxon ID(s) to remove:
bin/clear_taxon_data.sh --taxon-ids taxa_name