Command line examples - edanssandes/MASA-Core GitHub Wiki
On this page, the MASA extension binary is called ./masa-extension; in real scenarios, the binary has a name such as ./masa-cudalign or ./masa-openmp. All MASA extensions share common parameters. To see the command line arguments, use the --help parameter.
Retrieving only the optimal local score.
./masa-extension --stage-1 seq1.fasta seq2.fasta
The best score can be found in the ./work.tmp/statistics_01.00 file.
Retrieving the optimal local alignment.
./masa-extension --ram-size=100M --disk-size=20M seq1.fasta seq2.fasta
In this example, the memory used to store temporary data is restricted to 100 MB of RAM and 20 MB of disk space. The alignment file will be saved as ./work.tmp/alignment.00.txt.
Retrieving the optimal global alignment.
./masa-extension --ram-size=50M --alignment-edges=++ seq1.fasta seq2.fasta
In this example, the alignment will contain all the characters of sequences 1 and 2. Only 50 MB of RAM will be used to store temporary data.
Retrieving the optimal semi-global alignment.
./masa-extension --ram-size=50M --alignment-edges=33 seq1.fasta seq2.fasta
The --alignment-edges parameter defines where the alignment can start and end. Option '33' allows the alignment to start and end at the edges of either sequence. The 25 allowed combinations are listed below.
--alignment-edges=[*|1|2|3|+][*|1|2|3|+] (start,end)
- *: any location.
- 1: start/end of sequence 1.
- 2: start/end of sequence 2.
- 3: start/end of sequences 1 or 2.
- +: start/end of sequences 1 and 2.
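The 25 combinations follow from pairing each of the five start symbols with each of the five end symbols. A minimal sketch that enumerates them is shown here; the function name list_edge_combinations is only illustrative and is not part of MASA.

```shell
#!/bin/sh
# Print every --alignment-edges value: one start symbol followed by one end symbol.
# list_edge_combinations is a hypothetical helper name, not a MASA command.
list_edge_combinations() {
    for start in '*' 1 2 3 +; do
        for end in '*' 1 2 3 +; do
            printf '%s%s%s\n' '--alignment-edges=' "$start" "$end"
        done
    done
}

list_edge_combinations
```

Remember to quote any value that contains '*' when passing it to the binary, so that the shell does not expand it as a glob pattern.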
Disabling Block-Pruning optimization.
./masa-extension --ram-size=50M --no-block-pruning seq1.fasta seq2.fasta
Block pruning can reduce execution time by more than 50% in some situations. This optimization is enabled by default. To check the effect of disabling it, use the --no-block-pruning parameter.
Forking two processes for parallel execution on the same machine.
./masa-extension --fork=2 seq1.fasta seq2.fasta
You can fork any number of processes, but the actual behavior is platform dependent; some platforms may limit the number of processes. The alignment result will be placed in one of the FORK.xx subfolders inside the work directory (e.g. work.tmp/FORK.00/alignment.01.txt).
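Because the result may land in any of the FORK.xx subfolders, it can help to search for it instead of guessing the path. The sketch below assumes the file naming seen in the example path (FORK.*/alignment.*.txt); the function name find_fork_alignments is hypothetical.

```shell
#!/bin/sh
# List alignment files written under FORK.xx subfolders of a work directory.
# The name pattern is inferred from the example path work.tmp/FORK.00/alignment.01.txt
# and is an assumption about how other runs name their output.
find_fork_alignments() {
    work_dir=${1:-work.tmp}
    find "$work_dir" -type f -path '*/FORK.*/alignment.*.txt' | sort
}

# Usage, after a run with --fork has finished:
#   find_fork_alignments work.tmp
```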
Forking parallel processes on two separate machines.
machine1:~$ ./masa-extension --split=2 --part=1 --flush-column=socket://localhost:9001 --shared-dir=shared_folder seq1.fasta seq2.fasta
machine2:~$ ./masa-extension --split=2 --part=2 --load-column=socket://machine1:9001 --shared-dir=shared_folder seq1.fasta seq2.fasta
The 'machine1' hostname can be replaced by its IP address. You can use any available port instead of 9001. The 'shared_folder' must be a shared folder accessible from both machines.
Forking parallel processes on three separate machines.
machine1:~$ ./masa-extension --split=3 --part=1 --flush-column=socket://localhost:9001 --shared-dir=shared_folder seq1.fasta seq2.fasta
machine2:~$ ./masa-extension --split=3 --part=2 --flush-column=socket://localhost:9002 --load-column=socket://machine1:9001 --shared-dir=shared_folder seq1.fasta seq2.fasta
machine3:~$ ./masa-extension --split=3 --part=3 --load-column=socket://machine2:9002 --shared-dir=shared_folder seq1.fasta seq2.fasta
You can extend this idea to any number of machines. On each machine, you can also use the --fork parameter to launch more than one process (e.g. machine1 may fork 2 processes and machines 2 and 3 may fork 3 processes each, producing a total of 8 processes in a chain).
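The chain pattern can be generated mechanically: every part except the last flushes a column to a local socket, and every part except the first loads the column from the previous machine. The sketch below only prints the commands and does not run them. It assumes hosts named machine1..machineN and that part i uses port 9000+i, which matches the examples above but is otherwise an arbitrary choice; print_chain_commands is a hypothetical helper name.

```shell
#!/bin/sh
# Print (do not run) the command for each machine in an N-machine chain.
# Assumptions: hosts are named machine1..machineN and part i listens on port 9000+i.
print_chain_commands() {
    n=$1
    i=1
    while [ "$i" -le "$n" ]; do
        cmd="./masa-extension --split=$n --part=$i"
        # Every part except the last flushes its column to a local socket.
        if [ "$i" -lt "$n" ]; then
            cmd="$cmd --flush-column=socket://localhost:$((9000 + i))"
        fi
        # Every part except the first loads the column flushed by the previous machine.
        if [ "$i" -gt 1 ]; then
            prev=$((i - 1))
            cmd="$cmd --load-column=socket://machine$prev:$((9000 + prev))"
        fi
        printf 'machine%s:~$ %s --shared-dir=shared_folder seq1.fasta seq2.fasta\n' "$i" "$cmd"
        i=$((i + 1))
    done
}

print_chain_commands 3
```

With an argument of 3, the output reproduces the three-machine commands listed earlier; a larger argument extends the same chain.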
Forking parallel processes with different loads.
machine1:~$ ./masa-extension --split=20,20,10,50 --part=1 ... seq1.fasta seq2.fasta
machine2:~$ ./masa-extension --split=20,20,10,50 --part=2 ... seq1.fasta seq2.fasta
machine3:~$ ./masa-extension --split=20,20,10,50 --part=3 ... seq1.fasta seq2.fasta
machine4:~$ ./masa-extension --split=20,20,10,50 --part=4 ... seq1.fasta seq2.fasta
Use the --split parameter to specify the proportional weights, separated by commas. In the example above, the processes receive 20%, 20%, 10% and 50% of the load, respectively.
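To check a weight list before launching, the share of each part can be computed by dividing its weight by the sum of all weights. This is a sketch under the assumption that "proportional" means exactly that; whether MASA requires the weights to add up to 100 is not stated here. The function name split_shares is hypothetical.

```shell
#!/bin/sh
# Print the percentage share implied by each comma-separated weight,
# assuming each share is the weight divided by the sum of all weights.
split_shares() {
    weights=$(printf '%s' "$1" | tr ',' ' ')
    total=0
    for w in $weights; do
        total=$((total + w))
    done
    part=1
    for w in $weights; do
        # Integer arithmetic: the percentage is rounded down.
        printf 'part %s: %s%%\n' "$part" "$((100 * w / total))"
        part=$((part + 1))
    done
}

split_shares 20,20,10,50
```

For the weights used in the example, this prints shares of 20%, 20%, 10% and 50% for parts 1 to 4.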