Assembling with WENGAN on EC2 - Green-Biome-Institute/AWS GitHub Wiki

Go back to GBI AWS Wiki

Early documentation on WENGAN use; this page is still being updated.

The steps below follow the WENGAN instructions (linked under Resources).

Installation:
    - $ wget https://github.com/adigenova/wengan/releases/download/v0.2/wengan-v0.2-bin-Linux.tar.gz
    - $ tar zxvf wengan-v0.2-bin-Linux.tar.gz
    - $ rm wengan-v0.2-bin-Linux.tar.gz 
    - $ export WG=$PWD/wengan-v0.2-bin-Linux/wengan.pl

The export step did not survive a reboot/restart of the EC2 instance, because a variable set with export only lasts for the current shell session. If the perl ${WG} command doesn't work, this is the first thing to check. To see which path is currently stored in the variable WG, run:

    - $ echo $WG

If /home/ubuntu/wengan-v0.2-bin-Linux/wengan.pl is not printed, go back to the home directory:

    - $ cd

Then redo the export command:

    - $ export WG=$PWD/wengan-v0.2-bin-Linux/wengan.pl
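To avoid redoing the export after every reboot, one option is to store it in the shell startup file. The following is a minimal sketch, assuming bash is the login shell (the Ubuntu default) and that wengan was unpacked in the home directory as above; persist_wg is a hypothetical helper name, not part of wengan.

```shell
# persist_wg RCFILE: append the WG export line to RCFILE unless it is already there.
# (Hypothetical helper for illustration only.)
persist_wg() {
    wg_line='export WG="$HOME/wengan-v0.2-bin-Linux/wengan.pl"'
    grep -qxF "$wg_line" "$1" 2>/dev/null || echo "$wg_line" >> "$1"
}

# New interactive bash sessions read ~/.bashrc, so WG will be set after a reboot.
persist_wg "$HOME/.bashrc"

# Set WG for the current session too, then confirm the file really exists.
export WG="$HOME/wengan-v0.2-bin-Linux/wengan.pl"
if [ -f "$WG" ]; then
    echo "WG OK: $WG"
else
    echo "WG is set, but no file exists at: $WG" >&2
fi
```

Running the helper more than once is harmless, because the line is only appended when it is not already in the file.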

Now on to actually running an assembly.

Example command for an Arabidopsis thaliana assembly using:
short reads: SRR1946554_1.fastq.gz, SRR1946554_2.fastq.gz
long reads: SRR11968809.fastq.gz

Options:
-p = prefix
-t = number of threads
-x = mode, i.e. the long-read type (nanopore, pacbio, etc.; pacraw is used below)
-a = short-read assembler (DiscovarDenovo 'D', ABySS 'A', Minia3 'M')
-s = short-read input data (the two paired files separated by a comma)
-l = long-read input data
-g = [estimated] genome length (135 below, i.e. roughly 135 Mb for A. thaliana)

    - $ perl ${WG} -x pacraw -a A -s ../data/short-read/SRR1946554_1.fastq.gz,../data/short-read/SRR1946554_2.fastq.gz -l ../data/long-read/SRR11968809.fastq.gz -p athal-assem1 -t 4 -g 135
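Since a wrong path only shows up once wengan has started, it can help to check the inputs first. The sketch below wraps the same command as above; preflight is a hypothetical helper name (not part of wengan), and the file paths are the ones from this example, so adjust them to your own layout.

```shell
#!/bin/sh
# Input files from the example above (relative to the current directory).
SHORT1=../data/short-read/SRR1946554_1.fastq.gz
SHORT2=../data/short-read/SRR1946554_2.fastq.gz
LONG=../data/long-read/SRR11968809.fastq.gz

# preflight FILE...: returns 0 only if every argument is an existing regular file.
# (Hypothetical helper for illustration only.)
preflight() {
    status=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: '$f'" >&2
            status=1
        fi
    done
    return "$status"
}

# Run the assembly only when wengan.pl and all three read files are found.
if preflight "${WG:-}" "$SHORT1" "$SHORT2" "$LONG"; then
    perl "$WG" -x pacraw -a A -s "$SHORT1,$SHORT2" -l "$LONG" -p athal-assem1 -t 4 -g 135
else
    echo "not running wengan until the paths above are fixed" >&2
fi
```

An empty path in the "missing" output for the first argument means WG is not set in the current session.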

Resources:

https://github.com/adigenova/wengan

https://github.com/adigenova/wengan_demo#running-the-ecoli-demo
