STAR Fusion FAQ - STAR-Fusion/STAR-Fusion GitHub Wiki

Which version of STAR is compatible with which version of STAR-Fusion?

It can be hard to tell which versions of STAR and STAR-Fusion work together. Because both have been changing quickly, we try to support only the latest version of STAR-Fusion, which should ideally be compatible with the latest version of STAR.

We do, however, have Docker images for each of the major releases, each bundled with the version of the STAR aligner that release targets. You can pull these from Docker Hub: https://hub.docker.com/r/trinityctat/ctatfusion/tags/

The Dockerfile itself is under version control: https://github.com/STAR-Fusion/STAR-Fusion/blob/master/Docker/Dockerfile

The CTAT genome libs currently fall into two categories, depending on the release they correspond to: those for STAR-Fusion versions before v1.3, and those for v1.3 and later: https://data.broadinstitute.org/Trinity/CTAT_RESOURCE_LIB/

Starting with v1.4.0, STAR-Fusion checks the version of STAR being used and reports when that version is too old to be compatible.
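If you want to check a STAR version yourself before launching a run, a minimal sketch follows. It is not STAR-Fusion's own check. It assumes version strings look like `2.7.2b` or `STAR_2.5.3a` and that the STAR binary on your PATH answers `STAR --version`. The helper names and the version numbers in the usage note are hypothetical and are not STAR-Fusion's actual requirements.

```python
import re
import subprocess


def parse_star_version(text):
    """Turn a string such as '2.7.2b' or 'STAR_2.5.3a' into a sortable tuple."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)([a-z]?)", text)
    if match is None:
        raise ValueError(f"unrecognized STAR version string: {text!r}")
    major, minor, patch, letter = match.groups()
    # Numeric parts are compared as integers so that 2.7.10 sorts after 2.7.2.
    return (int(major), int(minor), int(patch), letter)


def installed_star_version(star_exe="STAR"):
    """Ask the STAR binary for its version (assumes `STAR --version` works)."""
    result = subprocess.run(
        [star_exe, "--version"], capture_output=True, text=True, check=True
    )
    return parse_star_version(result.stdout)


def is_at_least(found, minimum):
    """Return True when version string `found` is not older than `minimum`."""
    return parse_star_version(found) >= parse_star_version(minimum)
```

For example, `is_at_least("2.7.10a", "2.7.2b")` is true, whereas a plain string comparison of the same two values would get the order wrong.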

How can I run many RNA-seq samples through STAR-Fusion?

The best way to run a lot of RNA-seq samples is on a cloud computing platform such as Terra: https://app.terra.bio/#workspaces/ctat-firecloud/STAR-Fusion

It ends up costing around $1 per sample, but you can get $300 worth of free credits to explore the system.

Otherwise, if you want to run many samples locally, STAR-Fusion includes a helper script:

STAR-Fusion/util/STAR-Fusion.run_many_samples.pl samples.txt num_parallel [starF options passthru] [--do_remove]

This loads the genome into memory once and then runs STAR-Fusion (with STAR) in shared-memory mode, processing 'num_parallel' samples in parallel.

The samples.txt file lists all the samples to be processed in a simple tab-delimited format:

sample_name (tab) /path/to/left.fq (tab) /path/to/right.fq ...
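As an illustration of the samples.txt layout described above, here is a minimal sketch that pairs up FASTQ files and renders the tab-delimited rows. It assumes one line per sample with exactly three columns, and the `_1.fq` / `_2.fq` suffixes are a hypothetical naming convention, not something STAR-Fusion requires. The function names are made up for this example.

```python
import os


def pair_fastqs(fastq_paths, left_suffix="_1.fq", right_suffix="_2.fq"):
    """Group FASTQ paths into (sample_name, left, right) tuples.

    The suffixes are an assumed naming convention; adjust them to your files.
    """
    lefts = {}
    rights = {}
    for path in fastq_paths:
        name = os.path.basename(path)
        if name.endswith(left_suffix):
            lefts[name[: -len(left_suffix)]] = path
        elif name.endswith(right_suffix):
            rights[name[: -len(right_suffix)]] = path
    # Any sample seen with only one of its two mate files is an error.
    unpaired = sorted(set(lefts) ^ set(rights))
    if unpaired:
        raise ValueError(f"samples missing a mate file: {unpaired}")
    return [(name, lefts[name], rights[name]) for name in sorted(lefts)]


def samples_table(rows):
    """Render rows as tab-delimited text: sample_name, left.fq, right.fq."""
    return "".join("\t".join(row) + "\n" for row in rows)
```

Writing the result of `samples_table(pair_fastqs(paths))` to samples.txt gives a file you can pass as the first argument to STAR-Fusion.run_many_samples.pl.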