Supported Bioinformatic Analysis Softwares - Genetalks/gtz GitHub Wiki


1. BWA for gtz

  • How to Install?

    To install, run the following command (recommended):

    sudo curl -sSL https://gtz.io/bwagtz_latest.run -o /tmp/bwagtz.run && sudo sh /tmp/bwagtz.run

    or

    download the installation file (GTX.Zip bwa-gtz-) and run the following command in the directory that contains it:

    sudo sh bwagtz_latest.run

    Then complete the installation by following the prompts.
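
    After the installer finishes, a quick check that the tools are reachable can save a confusing "command not found" later. The sketch below assumes the installer places executables named bwa-gtz and bwa-opt-gtz on the PATH; the helper name missing_tools is hypothetical.

```shell
#!/bin/sh
# Print the name of every requested tool that cannot be found on PATH.
# Assumption: the installer exposes the tools as "bwa-gtz" and "bwa-opt-gtz".
missing_tools() {
    for tool in "$@"; do
        # command -v fails (and prints nothing) when the tool is not found
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Prints nothing when both tools are found.
missing_tools bwa-gtz bwa-opt-gtz
```

    If a name is printed, re-run the installer or check where it placed the binaries.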

  • How to Use?

    The GTX.Zip support package for BWA includes two programs, bwa-gtz and bwa-opt-gtz, both based on bwa version 0.7.17. Both add the ability to read GTZ files directly, and otherwise behave the same as the main bwa code. bwa-opt-gtz additionally optimizes the structure of the BWA lookup table, which can save more than one third of the run time without changing the alignment results. Because this changes the data structure of the lookup table, bwa-opt-gtz is incompatible with index files generated by the original bwa. Follow the standard bwa steps, but first regenerate the index with bwa-opt-gtz and then run the alignment with bwa-opt-gtz.

    The difference between bwa-gtz and bwa-opt-gtz is as follows:

    (1) bwa-gtz can directly use an index built by the official BWA, and its performance matches that of the official BWA.

    (2) bwa-opt-gtz cannot use an index built by the official BWA. The index must be rebuilt with bwa-opt-gtz, but performance improves by about 1/3 over the official BWA.
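
    Because the two index formats are incompatible, it can help to keep them apart when both tools are used on one reference. The sketch below links the reference into a separate directory before indexing. It assumes that bwa-opt-gtz, like bwa, writes its index files next to the reference path it is given; the helper name link_reference and the directory name opt_index are hypothetical.

```shell
#!/bin/sh
# Link a reference FASTA into its own directory so that an index built there
# cannot overwrite index files that sit next to the original reference.
# Prints the path of the linked reference.
link_reference() {
    ref=$1        # existing reference, e.g. ref.fa
    outdir=$2     # hypothetical directory reserved for the bwa-opt-gtz index
    mkdir -p "$outdir" || return 1
    # Use an absolute target so the symlink stays valid from any directory
    abs_ref="$(cd "$(dirname "$ref")" && pwd)/$(basename "$ref")"
    ln -sf "$abs_ref" "$outdir/$(basename "$ref")" || return 1
    printf '%s\n' "$outdir/$(basename "$ref")"
}

# Example (not run here):
#   opt_ref=$(link_reference ref.fa opt_index)
#   bwa-opt-gtz index "$opt_ref"
```

    The original ref.fa and any official BWA index beside it stay usable by bwa-gtz, while bwa-opt-gtz works from the linked copy.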

  • Usage examples

    bwa-gtz

    export GTZ_RBIN_PATH=/path/rbin

    bwa-gtz mem ref.fa read1.fq.gtz read2.fq.gtz -o aln-pe.sam

    In this example, the environment variable GTZ_RBIN_PATH specifies the path of the RBIN file. The line "export GTZ_RBIN_PATH=/path/rbin" is optional, but if you know where the RBIN file is, setting it is recommended because it speeds up bwa-gtz. When bwa-gtz needs an RBIN file and cannot find it under the default path ~/.config/gtz, it downloads the file over the network, which takes time.
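
    In a script shared between machines, the variable can be set only when a known location actually exists, leaving the tool's default lookup in place otherwise. This is a minimal sketch: /path/rbin is the placeholder from the example above, and the helper name first_existing_path is hypothetical.

```shell
#!/bin/sh
# Print the first candidate path that exists; fail if none does.
first_existing_path() {
    for candidate in "$@"; do
        if [ -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Export GTZ_RBIN_PATH only when a known location is present; otherwise the
# variable stays unset and the default lookup described above applies.
if rbin_path=$(first_existing_path /path/rbin); then
    export GTZ_RBIN_PATH="$rbin_path"
fi
```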

    bwa-opt-gtz

    Step one: Rebuild the index (required)

    bwa-opt-gtz index ref.fa

    Step two: Run the alignment

    export GTZ_RBIN_PATH=/path/rbin

    bwa-opt-gtz mem ref.fa read1.fq.gtz read2.fq.gtz -t 4 -o aln-pe.sam

  • Performance

    With sufficient server resources, bwa-opt-gtz is stated to perform about 1/3 better than the official bwa. The following test data were collected in a single environment with 4 threads; in this run, bwa-opt-gtz finished in about 22% less time than bwa while using more memory.

    Test commands

    bwa mem ref.fa read1.fq.gz read2.fq.gz -t 4 -o aln-pe.sam

    bwa-gtz mem ref.fa read1.fq.gtz read2.fq.gtz -t 4 -o aln-pe.sam

    bwa-opt-gtz mem ref.fa read1.fq.gtz read2.fq.gtz -t 4 -o aln-pe.sam

    Test environment

    Server configuration: 16-core CPU, 64G memory. File sizes: read1.fq.gz (1.8G), read2.fq.gz (1.8G), read1.fq.gtz (0.3G), read2.fq.gtz (0.3G)

    Performance data
    Software             bwa          bwa-gtz      bwa-opt-gtz
    Time consumption     50m14.06s    51m37.67s    39m18.86s
    Memory consumption   5.888G       10.56G       19.84G
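
    The relative differences can be worked out directly from the run times in the table. The sketch below converts each duration to seconds and prints the percentage change against the official bwa run; the helper names to_seconds and percent_change are hypothetical.

```shell
#!/bin/sh
# Convert a duration such as 50m14.06s into seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# Percentage change of a run time relative to a baseline (negative = faster).
percent_change() {
    awk -v base="$(to_seconds "$1")" -v t="$(to_seconds "$2")" \
        'BEGIN { printf "%+.1f\n", (t - base) / base * 100 }'
}

percent_change 50m14.06s 51m37.67s   # bwa-gtz vs bwa: prints +2.8
percent_change 50m14.06s 39m18.86s   # bwa-opt-gtz vs bwa: prints -21.7
```

    In this run, bwa-gtz took about 2.8% longer than bwa and bwa-opt-gtz took about 21.7% less time.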