AB Delete Records - mendessoares/BuddySuite GitHub Wiki

--delete_records, -dr

Description

Strip out entire rows from an alignment. The tool takes one or more strings as input and searches for them in the record IDs; every matching record is deleted. If removing a row leaves columns that contain only gaps, those columns are deleted as well.
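The gap-column clean-up described above can be sketched in plain Python. This is only an illustration of the behaviour, not AlignBuddy's own code: the function name `drop_all_gap_columns`, the dict-based alignment, and the shortened sequences are all assumptions made for the example.

```python
def drop_all_gap_columns(rows, gap="-"):
    """Remove every column that holds only gap characters.

    rows: dict mapping record ID -> aligned sequence (all the same length).
    Hypothetical helper for illustration; not part of BuddySuite.
    """
    if not rows:
        return {}
    seqs = list(rows.values())
    # Keep a column index if at least one remaining row has a non-gap there.
    keep = [i for i in range(len(seqs[0])) if any(s[i] != gap for s in seqs)]
    return {rec_id: "".join(seq[i] for i in keep) for rec_id, seq in rows.items()}


# Shortened stand-in for the second alignment in the examples.
alignment = {
    "Ael_PanxβA": "----ATGGTAGTCATTC",
    "Ael_PanxβB": "----ATGGTTGTCATAC",
    "Ael_PanxβC": "ATGCATGGTTGTAGTGC",
}

# Delete the row whose ID contains the search string, then trim the columns.
remaining = {k: v for k, v in alignment.items() if "PanxβC" not in k}
trimmed = drop_all_gap_columns(remaining)
```

Once `Ael_PanxβC` is gone, the four leading columns contain nothing but gaps, so they are dropped and the two remaining rows shrink to 13 columns each.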

Arguments

Search pattern ( regex )

At least one simple string or regular expression is required. If any search pattern is found in a record ID, that row will be deleted.
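A minimal sketch of this matching rule, assuming the patterns are applied with `re.search` semantics (a match anywhere in the ID) and that several patterns are combined with a logical OR. The helper name `split_records` is hypothetical and is not part of BuddySuite.

```python
import re


def split_records(record_ids, patterns):
    """Return (kept, deleted) ID lists.

    An ID is deleted when any pattern is found anywhere inside it.
    Assumption: search-anywhere semantics, patterns OR-ed together.
    """
    combined = re.compile("|".join(f"(?:{p})" for p in patterns))
    kept, deleted = [], []
    for rec_id in record_ids:
        if combined.search(rec_id):
            deleted.append(rec_id)
        else:
            kept.append(rec_id)
    return kept, deleted


ids = ["Mle-Panxα9", "Mle-Panxα4", "Mle-Panxα6"]
kept, deleted = split_records(ids, ["α[46]", "βA"])
```

Here `α[46]` matches the IDs ending in `α4` and `α6`, while `βA` matches nothing in this list, so only `Mle-Panxα9` is kept.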

Columns ( int )

Optional. Lists of deleted records are printed to stderr so you know what has been removed. By default these lists are ordered in a single column, but you can pass in an integer to flatten the list out into more columns.
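The column layout can be pictured with a short sketch that fills rows left to right and joins the IDs with tabs. The function name `format_deleted` is hypothetical, and the tab separator is an assumption based on the sample output in usage example 3.

```python
import sys


def format_deleted(ids, columns=1):
    """Lay out record IDs in rows of `columns` entries, tab-separated.

    Hypothetical helper for illustration; not part of BuddySuite.
    """
    rows = [ids[i:i + columns] for i in range(0, len(ids), columns)]
    return "\n".join("\t".join(row) for row in rows)


deleted = ["Mle-Panxα9", "Mle-Panxα4", "Mle-Panxα6"]
one_column = format_deleted(deleted)
two_columns = format_deleted(deleted, 2)

# The report goes to stderr, so it stays out of the alignment on stdout.
print(two_columns, file=sys.stderr)
```

Keeping the report on stderr means the remaining alignment on stdout can still be redirected to a file or piped into another tool.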

Examples

Input file: Panx_C-terms.physr

 3 100
Mle-Panxα9  ---atgttagacatactttcaaagtttaaaggagttactccttttaaaggtataacgatagatgacgggtgggatcaactcaatcggagttttatgttcg
Mle-Panxα4  atggttattgagctgctagctggatacaaaggtctgtccccgtttaaagacgcgactgttgacgactcatgggaccaaataaaccgatgttacgtgttca
Mle-Panxα6  atgttattggagatattagcgaacttcaaaggagcgacacctttcaaagaaatagttctagatgacaagtgggaccagattaaccgatgttacatgttcc

 3 100
Ael_PanxβA  ---------------------------------------------------------------------------------------ATGGTAGTCATTC
Ael_PanxβB  ---------------------------------------------------------------------------------------ATGGTTGTCATAC
Ael_PanxβC  ATGCCCAACAACATATACCCAAACAGACTATTCGTGAAGACTAATGATATCCCGGAAAAATTAAACACTCCGTGGTCATACGAAAAAATGGTTGTAGTGC

Usage example 1

$: alb Panx_C-terms.physr -dr "PanxβC"

Output

# ####################### Deleted records ######################## #
# Alignment 2
Ael_PanxβC
# ################################################################ #
 3 100
Mle-Panxα9  ---atgttagacatactttcaaagtttaaaggagttactccttttaaaggtataacgatagatgacgggtgggatcaactcaatcggagttttatgttcg
Mle-Panxα4  atggttattgagctgctagctggatacaaaggtctgtccccgtttaaagacgcgactgttgacgactcatgggaccaaataaaccgatgttacgtgttca
Mle-Panxα6  atgttattggagatattagcgaacttcaaaggagcgacacctttcaaagaaatagttctagatgacaagtgggaccagattaaccgatgttacatgttcc

 2 13
Ael_PanxβA  ATGGTAGTCATTC
Ael_PanxβB  ATGGTTGTCATAC

Usage example 2

Regular expressions are understood, and multiple search terms can be included.

$: alb Panx_C-terms.physr -dr "α[46]" "βA"

Output

# ####################### Deleted records ######################## #
# Alignment 1
Mle-Panxα4
Mle-Panxα6

# Alignment 2
Ael_PanxβA
# ################################################################ #
 1 97
Mle-Panxα9  atgttagacatactttcaaagtttaaaggagttactccttttaaaggtataacgatagatgacgggtgggatcaactcaatcggagttttatgttcg

 2 100
Ael_PanxβB  ---------------------------------------------------------------------------------------ATGGTTGTCATAC
Ael_PanxβC  ATGCCCAACAACATATACCCAAACAGACTATTCGTGAAGACTAATGATATCCCGGAAAAATTAAACACTCCGTGGTCATACGAAAAAATGGTTGTAGTGC

Usage example 3

Include an integer to specify how many columns the deleted record IDs should be arranged in.

$: alb Panx_C-terms.physr -dr α 2

Output

# ####################### Deleted records ######################## #
# Alignment 1
Mle-Panxα9	Mle-Panxα4
Mle-Panxα6
# ################################################################ #
 3 100
Ael_PanxβA  ---------------------------------------------------------------------------------------ATGGTAGTCATTC
Ael_PanxβB  ---------------------------------------------------------------------------------------ATGGTTGTCATAC
Ael_PanxβC  ATGCCCAACAACATATACCCAAACAGACTATTCGTGAAGACTAATGATATCCCGGAAAAATTAAACACTCCGTGGTCATACGAAAAAATGGTTGTAGTGC