Assembling with WENGAN on EC2 - Green-Biome-Institute/AWS GitHub Wiki
Early documentation on WENGAN use; this page is still being updated.
Following the WENGAN instructions:
Installation:
- $ wget https://github.com/adigenova/wengan/releases/download/v0.2/wengan-v0.2-bin-Linux.tar.gz
- $ tar zxvf wengan-v0.2-bin-Linux.tar.gz
- $ rm wengan-v0.2-bin-Linux.tar.gz
- $ export WG=$PWD/wengan-v0.2-bin-Linux/wengan.pl
This last step does not persist: export only sets WG for the current shell session, so the variable is gone after the EC2 instance is rebooted/restarted or a new session is opened. This is the first thing to check if the perl ${WG} command doesn't work. To check whether the variable WG holds the correct path:
- $ echo $WG
If /home/ubuntu/wengan-v0.2-bin-Linux/wengan.pl
doesn't show up, go back to the home directory:
- $ cd
and redo the export command:
- $ export WG=$PWD/wengan-v0.2-bin-Linux/wengan.pl
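To avoid redoing the export after every reboot, the line can be added to the shell startup file so that new interactive sessions pick it up. This is a minimal sketch, not part of the WENGAN instructions: it assumes bash is the login shell, that wengan was unpacked in the home directory as above, and it uses a hypothetical RC_FILE variable just to make the target file easy to change.

```shell
# Assumption: wengan was unpacked in the home directory, as in the steps above.
# RC_FILE is a hypothetical variable; by default it points at ~/.bashrc.
RC_FILE="${RC_FILE:-$HOME/.bashrc}"
WG_LINE='export WG=$HOME/wengan-v0.2-bin-Linux/wengan.pl'

# Append the export line only if it is not already in the file
touch "$RC_FILE"
grep -qxF "$WG_LINE" "$RC_FILE" || echo "$WG_LINE" >> "$RC_FILE"

# Set WG in the current shell as well, then print it to check
export WG="$HOME/wengan-v0.2-bin-Linux/wengan.pl"
echo "$WG"
```

Because of the grep check, running this more than once does not add duplicate lines to the file.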
Now on to actually doing an assembly.
Example command for an Arabidopsis thaliana assembly using:
short reads: SRR1946554_1.fastq.gz, SRR1946554_2.fastq.gz
long reads: SRR11968809.fastq.gz
The options used are:
-p = prefix (used to name the output files)
-t = number of threads
-x = mode for the long-read type (nanopore, pacbio, etc.; pacraw is used below)
-a = short-read assembler (DiscovarDenovo 'D', ABySS 'A', Minia3 'M')
-s = short-read input data (the paired files separated by a comma)
-l = long-read input data
-g = [estimated] genome size in Mb (135 for A. thaliana here)
- $ perl ${WG} -x pacraw -a A -s ../data/short-read/SRR1946554_1.fastq.gz,../data/short-read/SRR1946554_2.fastq.gz -l ../data/long-read/SRR11968809.fastq.gz -p athal-assem1 -t 4 -g 135
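Since an assembly can run for a long time, it may be worth checking that WG and the read files are in place before launching it. The following is only a sketch: check_inputs is a hypothetical helper written for this page (it is not part of WENGAN), and the paths and options are simply the ones from the example command above.

```shell
# check_inputs is a hypothetical helper: it returns non-zero if any of the
# files passed to it is missing or empty.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Input paths taken from the example command above
SHORT1=../data/short-read/SRR1946554_1.fastq.gz
SHORT2=../data/short-read/SRR1946554_2.fastq.gz
LONG=../data/long-read/SRR11968809.fastq.gz

# Only start the assembly if wengan.pl and all three read files are found
if [ -f "${WG:-}" ] && check_inputs "$SHORT1" "$SHORT2" "$LONG"; then
    perl "$WG" -x pacraw -a A -s "$SHORT1,$SHORT2" -l "$LONG" -p athal-assem1 -t 4 -g 135
else
    echo "WG or input files not found; not starting the assembly" >&2
fi
```

If the message about WG appears, redo the export step from the installation section and run the check again.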
Resources:
https://github.com/adigenova/wengan
https://github.com/adigenova/wengan_demo#running-the-ecoli-demo