SB Delete records - mendessoares/BuddySuite GitHub Wiki
--delete_records, -dr
Description
Delete all sequences with IDs matching regular expression patterns. The remaining sequences are returned along with a list of IDs for the deleted sequences (the IDs are sent to stderr).
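The behavior described above can be pictured with a minimal Python sketch. It is not SeqBuddy's actual code. The helpers `parse_fasta` and `delete_records` are hypothetical names, and the sketch assumes that patterns are tested against the record ID only (the first whitespace-delimited token of the header) using `re.search`.

```python
import re
import sys


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line[1:], ""])
        elif records:
            records[-1][1] += line
    return [(header, seq) for header, seq in records]


def delete_records(records, patterns):
    """Return (kept_records, deleted_ids) for the given regex patterns.

    Assumption: a record is deleted if ANY pattern is found in its ID.
    """
    compiled = [re.compile(p) for p in patterns]
    kept, deleted_ids = [], []
    for header, seq in records:
        tokens = header.split()
        rec_id = tokens[0] if tokens else ""
        if any(rx.search(rec_id) for rx in compiled):
            deleted_ids.append(rec_id)
        else:
            kept.append((header, seq))
    return kept, deleted_ids


fasta = """>Dme-Panxδ1
YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPI
>Mle-Panxα1 cDNA - ML078817.
MYWIFEICQEIKRAQSCRKFAIDGPFDWTNRIIMPTLMVICCFLQTFTFM
"""

kept, deleted_ids = delete_records(parse_fasta(fasta), ["Dme"])

# Deleted IDs go to stderr, the remaining records go to stdout.
print("\n".join(deleted_ids), file=sys.stderr)
for header, seq in kept:
    print(f">{header}\n{seq}")
```

Keeping the two streams separate means the remaining sequences can be redirected to a file without the ID list being mixed in.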
Arguments
Pattern [pattern ...] ( regex )
One or more strings or regular expressions. To avoid issues with special characters, make a habit of adding 'single quotes' around the search term.
File path ( path ) (Available in V 1.3)
Optional. If searching for many different records, it can be easier to put the search terms in a separate file. Put each term on its own line, but remember that SeqBuddy is searching for regular expressions! If you are looking for exact ID matches, it is good practice to include the '^' and '$' anchors on each term (e.g., ^id_1234$).
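To see why the anchors matter, here is a small Python sketch that uses the standard `re` module as a stand-in for the regex search. The ID list and the helpers `matching` and `exact_term` are made up for illustration, and SeqBuddy's own matching may differ in detail.

```python
import re

# Hypothetical IDs that all contain the text "id_1234".
ids = ["id_1234", "id_12345", "old_id_1234"]


def matching(pattern, candidates):
    """Return the candidates in which the regex is found anywhere."""
    return [c for c in candidates if re.search(pattern, c)]


loose = matching("id_1234", ids)    # unanchored: substring hits count
exact = matching("^id_1234$", ids)  # anchored: whole ID must match


def exact_term(record_id):
    """Build an anchored term, escaping regex special characters."""
    return "^" + re.escape(record_id) + "$"


# One anchored term per line, ready to be saved as a search terms file.
terms_text = "\n".join(exact_term(i) for i in ["id_1234", "id.99"])
print(loose)
print(exact)
print(terms_text)
```

Escaping with `re.escape` is worth the extra step when IDs contain characters such as `.` or `|`, which would otherwise be read as regex operators.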
Columns ( int )
Optional. The list of deleted IDs sent to stderr will be output as a single column by default. An integer passed in as the FINAL argument will change the number of output columns. If you need to search with something that could be interpreted as an integer, make it an explicit regex (e.g., sb foo.fa -dr "foo|bar" "(4563)").
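The rule for the final argument can be sketched as follows. This is only an illustration of the idea, not SeqBuddy's argument parser. The function name `split_columns_arg` is hypothetical, and treating a lone argument as a pattern is an assumption of this sketch.

```python
def split_columns_arg(args, default_columns=1):
    """Split the arguments following -dr into (patterns, columns).

    If the FINAL argument parses as an integer it is taken as the
    column count; otherwise every argument is treated as a pattern.
    Assumption: a single argument is always kept as a pattern.
    """
    if len(args) > 1:
        try:
            return list(args[:-1]), int(args[-1])
        except ValueError:
            pass
    return list(args), default_columns


# "3" parses as an integer, so it sets the number of columns.
print(split_columns_arg([".*Panx[αδ][1-2]", "3"]))

# "(4563)" does not parse as an integer, so it stays a search term.
print(split_columns_arg(["foo|bar", "(4563)"]))
```

Wrapping the number in parentheses changes nothing about what the regex matches, but it stops the term from being read as a column count.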
Modifier
--quiet/-q
Suppress the stderr list of deleted IDs.
Examples
Input file: C-terms.fa
>Dme-Panxδ1
YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPI
>Dme-Panxδ2
MDVFGSVKGLLKIDQVDNNVFRMHYKATVIILIAFSLLVTSRQYIGDPID
>Dme-Panxδ3
GFIKIDNMVFRCHYRITAILFTCCIIVTANNLIGDPISCIIPMHVINTFC
>Dme-Panxδ4
MAAVKPLSKYLQFKVHIYDAIFTLHSKVTVALLLACTFLLSSKQYFGDPI
>Mle-Panxα1 cDNA - ML078817.
MYWIFEICQEIKRAQSCRKFAIDGPFDWTNRIIMPTLMVICCFLQTFTFM
>Mle-Panxα5 cDNA - ML223536a.
MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAYLIDYGIIAG
>Mle-Panxα6 cDNA - ML25993a.
MLLEILANFKGATPFKEIVLDDKWDQINRCYMFLLCVIFGTVVTFRQYTG
>Mle-Panxα9 cDNA - ML47742a.
MLDILSKFKGVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVTVRQYTGS
Usage example 1
$: sb C-terms.fa -dr 'Dme'
Output
# ####################### Deleted records ######################## #
Dme-Panxδ1
Dme-Panxδ2
Dme-Panxδ3
Dme-Panxδ4
# ################################################################ #
>Mle-Panxα1 cDNA - ML078817.
MYWIFEICQEIKRAQSCRKFAIDGPFDWTNRIIMPTLMVICCFLQTFTFM
>Mle-Panxα5 cDNA - ML223536a.
MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAYLIDYGIIAG
>Mle-Panxα6 cDNA - ML25993a.
MLLEILANFKGATPFKEIVLDDKWDQINRCYMFLLCVIFGTVVTFRQYTG
>Mle-Panxα9 cDNA - ML47742a.
MLDILSKFKGVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVTVRQYTGS
Usage example 2
$: sb C-terms.fa -dr '.*Panx[αδ][1-2]' 3
Output
# ####################### Deleted records ######################## #
Dme-Panxδ1 Dme-Panxδ2 Mle-Panxα1
# ################################################################ #
>Dme-Panxδ3
GFIKIDNMVFRCHYRITAILFTCCIIVTANNLIGDPISCIIPMHVINTFC
>Dme-Panxδ4
MAAVKPLSKYLQFKVHIYDAIFTLHSKVTVALLLACTFLLSSKQYFGDPI
>Mle-Panxα5 cDNA - ML223536a.
MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAYLIDYGIIAG
>Mle-Panxα6 cDNA - ML25993a.
MLLEILANFKGATPFKEIVLDDKWDQINRCYMFLLCVIFGTVVTFRQYTG
>Mle-Panxα9 cDNA - ML47742a.
MLDILSKFKGVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVTVRQYTGS
Usage example 3
$: sb C-terms.fa -dr 'Mle' -q
Output
>Dme-Panxδ1
YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPI
>Dme-Panxδ2
MDVFGSVKGLLKIDQVDNNVFRMHYKATVIILIAFSLLVTSRQYIGDPID
>Dme-Panxδ3
GFIKIDNMVFRCHYRITAILFTCCIIVTANNLIGDPISCIIPMHVINTFC
>Dme-Panxδ4
MAAVKPLSKYLQFKVHIYDAIFTLHSKVTVALLLACTFLLSSKQYFGDPI
Usage example 4
Read from a file of search terms
Search terms file: names.txt
^Dme-Panxδ1$
Dme-Panxδ[34]
$: sb Panx-ends.fa -dr names.txt
Output
# ####################### Deleted records ######################## #
Dme-Panxδ1
Dme-Panxδ3
Dme-Panxδ4
# ################################################################ #
>Dme-Panxδ2
MDVFGSVKGLLKIDQVDNNVFRMHYKATVIILIAFSLLVTSRQYIGDPID