SB Pull records - mendessoares/BuddySuite GitHub Wiki

--pull_records, -pr

Description

Return all records whose IDs match a regular expression. The search also covers the 'description' fields if you specify the 'full' keyword.

Argument

One or more search strings ( regex )

Pass as many simple strings or regular expressions as you want. To avoid issues with special characters, make a habit of adding 'single quotes' around the search terms.

'full' ( exact string )

Optional. By default, only the record IDs are searched. If the records have a description field, then you can pass in the word 'full' to expand the search to this metadata. In the rare case that you must search for the exact word 'full' in your IDs, turn it into an explicit regular expression by enclosing it in parentheses --> '(full)'
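The difference between the default ID-only search and a 'full' search can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not SeqBuddy's actual implementation; the function name `pull_records`, the `(id, description, sequence)` tuple layout, and the choice to join ID and description with a space are all assumptions made for the example.

```python
import re


def pull_records(records, patterns, full=False):
    """Return the (id, description, sequence) tuples matching any pattern.

    Hypothetical helper for illustration; not part of the SeqBuddy API.
    """
    compiled = [re.compile(p) for p in patterns]
    hits = []
    for rec_id, description, seq in records:
        # By default only the ID is searched; full=True adds the description.
        target = f"{rec_id} {description}" if full else rec_id
        if any(rx.search(target) for rx in compiled):
            hits.append((rec_id, description, seq))
    return hits


# Truncated toy records in the style of Panx-ends.fa
records = [
    ("Dme-Panxδ1", "", "YKLLGSLKSY"),
    ("Mle-Panxα1", "cDNA - ML078817.", "MYWIFEICQE"),
    ("Mle-Panxα5", "cDNA - ML223536a.", "MIYWVWAVFK"),
]

ids_only = [r[0] for r in pull_records(records, ["ML[0-9]*a"])]
with_desc = [r[0] for r in pull_records(records, ["ML[0-9]*a"], full=True)]
print(ids_only)   # []
print(with_desc)  # ['Mle-Panxα5']
```

The pattern only appears in the description text, so it finds nothing until the description is included in the search.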

File path ( path ) (Available in V 1.3)

Optional. If searching for many different records, it can be easier to put the search terms in a separate file. Put each term on its own line, but remember that SeqBuddy is searching for regular expressions! If you are looking for exact ID matches, it is good practice to include the '^' and '$' anchors on each term (e.g., ^id_1234$).
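To see why the anchors matter, here is a small Python sketch using the standard `re` module. It assumes the search behaves like `re.search` (a match anywhere in the ID); the IDs and the helper names `matching_ids` and `exact_term` are made up for the example.

```python
import re

ids = ["id_123", "id_1234", "my_id_1234", "seq.1", "seqX1"]


def matching_ids(term, ids):
    # re.search finds the pattern anywhere in the ID, like an unanchored term.
    return [i for i in ids if re.search(term, i)]


def exact_term(literal_id):
    # Escape regex metacharacters, then anchor both ends for an exact match.
    return "^" + re.escape(literal_id) + "$"


unanchored = matching_ids("id_1234", ids)
anchored = matching_ids("^id_1234$", ids)
print(unanchored)  # ['id_1234', 'my_id_1234']
print(anchored)    # ['id_1234']

# An unescaped '.' matches any character, so 'seq.1' also pulls 'seqX1'.
loose = matching_ids("^seq.1$", ids)
strict = matching_ids(exact_term("seq.1"), ids)
print(loose)   # ['seq.1', 'seqX1']
print(strict)  # ['seq.1']

# One anchored term per line, ready to be saved as a search terms file.
terms_file_text = "\n".join(exact_term(i) for i in ["id_1234", "seq.1"])
```

Generating the terms file this way is one option when the list of wanted IDs comes from another program and may contain characters such as '.' or '|'.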

Examples

Input file: Panx-ends.fa

>Dme-Panxδ1
YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPI
>Dme-Panxδ2
MDVFGSVKGLLKIDQVDNNVFRMHYKATVIILIAFSLLVTSRQYIGDPID
>Dme-Panxδ3
GFIKIDNMVFRCHYRITAILFTCCIIVTANNLIGDPISCIIPMHVINTFC
>Dme-Panxδ4
MAAVKPLSKYLQFKVHIYDAIFTLHSKVTVALLLACTFLLSSKQYFGDPI
>Mle-Panxα1 cDNA - ML078817.
MYWIFEICQEIKRAQSCRKFAIDGPFDWTNRIIMPTLMVICCFLQTFTFM
>Mle-Panxα5 cDNA - ML223536a.
MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAYLIDYGIIAG
>Mle-Panxα6 cDNA - ML25993a.
MLLEILANFKGATPFKEIVLDDKWDQINRCYMFLLCVIFGTVVTFRQYTG
>Mle-Panxα9 cDNA - ML47742a.
MLDILSKFKGVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVTVRQYTGS

Usage example 1

$: sb Panx-ends.fa -pr 'Dme'

Output

>Dme-Panxδ1
YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPI
>Dme-Panxδ2
MDVFGSVKGLLKIDQVDNNVFRMHYKATVIILIAFSLLVTSRQYIGDPID
>Dme-Panxδ3
GFIKIDNMVFRCHYRITAILFTCCIIVTANNLIGDPISCIIPMHVINTFC
>Dme-Panxδ4
MAAVKPLSKYLQFKVHIYDAIFTLHSKVTVALLLACTFLLSSKQYFGDPI

Usage example 2

$: sb Panx-ends.fa -pr '.*Panx[αδ][1-2]'

Output

>Dme-Panxδ1
YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPI
>Dme-Panxδ2
MDVFGSVKGLLKIDQVDNNVFRMHYKATVIILIAFSLLVTSRQYIGDPID
>Mle-Panxα1 cDNA - ML078817.
MYWIFEICQEIKRAQSCRKFAIDGPFDWTNRIIMPTLMVICCFLQTFTFM

Usage example 3

$: sb Panx-ends.fa -pr 'δ1' 'α5'

Output

>Dme-Panxδ1
YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPI
>Mle-Panxα5 cDNA - ML223536a.
MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAYLIDYGIIAG

Usage example 4

Include the description metadata in the search with the 'full' keyword

$: sb Panx-ends.fa -pr 'δ1' 'ML[0-9]*a' 'full'

Output

>Dme-Panxδ1
YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPI
>Mle-Panxα5 cDNA - ML223536a.
MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAYLIDYGIIAG
>Mle-Panxα6 cDNA - ML25993a.
MLLEILANFKGATPFKEIVLDDKWDQINRCYMFLLCVIFGTVVTFRQYTG
>Mle-Panxα9 cDNA - ML47742a.
MLDILSKFKGVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVTVRQYTGS

Usage example 5

Read from a file of search terms

Search terms file: names.txt
^Dme-Panxδ1$
Mle-Panxα[59]

$: sb Panx-ends.fa -pr names.txt

Output

>Dme-Panxδ1
YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPI
>Mle-Panxα5 cDNA - ML223536a.
MIYWVWAVFKRMAPFKVVTLDDRWDQMNRSFMMPLTMSFAYLIDYGIIAG
>Mle-Panxα9 cDNA - ML47742a.
MLDILSKFKGVTPFKGITIDDGWDQLNRSFMFVLLVVMGTTVTVRQYTGS