Walkthrough - MatthewHiggins2017/bioconda-PrimedRPA GitHub Wiki

PrimedRPA Walkthrough

Estimated Time: 10 minutes

The following walkthrough examples demonstrate the flexibility of the PrimedRPA software. In this tutorial, we shall attempt to identify primers to target the Human papillomavirus (HPV).

Example One

Overview

Utilise parameters file
Single sequence input file
Generate sets of viable primers

Step 1

Prepare the necessary work environment as follows:

mkdir Walk_Through_RPA_Primers
cd ./Walk_Through_RPA_Primers

# Download HPV-126 Virus Genome From NCBI
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/896/435/GCF_000896435.1_ViralProj76727/GCF_000896435.1_ViralProj76727_genomic.fna.gz

gunzip GCF_000896435.1_ViralProj76727_genomic.fna.gz

# Download Parameters File
wget https://raw.githubusercontent.com/MatthewHiggins2017/bioconda-PrimedRPA/master/PrimedRPA_Parameters.txt
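Before editing the parameters file, it can help to confirm how many sequences the downloaded FASTA file holds, since this determines the SS/MS/MAS classification. The sketch below is not part of PrimedRPA; `count_fasta_records` is a hypothetical helper that simply counts header lines.

```shell
# Hypothetical helper (not part of PrimedRPA): count FASTA records
# by counting header lines, i.e. lines that start with ">".
count_fasta_records() {
  grep -c '^>' "$1"
}

# Example call, using the file name from the download step:
# count_fasta_records GCF_000896435.1_ViralProj76727_genomic.fna
```

A file intended for the SS (single sequence) classification should report 1.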

Edit the PrimedRPA_Parameters.txt parameters file to represent the following:

This parameters file will guide the PrimedRPA-based primer and probe design process. Please follow the instructions outlined below:

----####----
Important Note - Do not remove any of the “>” and write your input directly after this symbol.
----####----


Please define the reference name for this PrimedRPA run:
>HPV_Run_1

Please indicate if you would like to use a previously generated Alignment File: [NO or File path]
>NO

Please indicate if you would like to use the previously generated Binding Sites: [NO or File path]
>NO

Please enter the path, from your current working directory, to the input fasta file:
>GCF_000896435.1_ViralProj76727_genomic.fna

Please classify the contents of the input fasta file as one of the following options: [SS, MS, MAS]. Whereby:
 SS = Single sequence
 MS = Multiple unaligned sequences
 MAS = Multiple aligned sequences

>SS

If multiple sequences are present in the input fasta file (Classification of MS or MAS), please indicate below the
percentage identity required for the primers and probes target binding sites:
>99

Please indicate if a primer identity anchor is required. [NO or length of anchor]
>NO

Desired primer length (This can be a range: 28-32 or fixed value: 32):
>32

Please state if you require a probe to be designed and if so what type [NO,EXO,NFO]
>NO

Desired probe length (This can be a range: 45-50 or fixed value: 50):
>50

Below please define your max amplicon length.
>300

Below please state the repeat nucleotide cut-off in bp (e.g. 5bp will exclude sequences containing GGGGG).

>5

Below please insert the minimum percentage GC content for primer/probe:
>30

Below please insert the maximum percentage GC content for primer/probe:
>70

Below please indicate the percentage match tolerance for primer-probe dimerisation and secondary structure formation:
>80

Please enter [No or Path to Background file] below to identify if you want to perform a background DNA binding check:
>NO

Below please insert the percentage background cross reactivity threshold:
>65

Below please indicate if you would like to implement a Background Hard Fail Filter [NO,YES]:
>NO

Please define the maximum number of sets you would like to identify:
>5

Please define the number of threads available:
>2

Blastn Cross Reactivity Search Settings [Basic or Advanced or Fast]
>Fast


Blastn Evalue
>1000
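The parameters file can be edited by hand, but the same change can also be scripted. The sketch below assumes only what the file itself shows: each answer sits on the first line beginning with ">" after its prompt. `set_param` is a hypothetical helper, not a PrimedRPA feature, and it is meant for simple values such as names, numbers and plain file paths.

```shell
# Hypothetical helper (not part of PrimedRPA).
# $1 = parameters file
# $2 = fixed text that appears in the prompt line
# $3 = new value to write after the ">" symbol
set_param() {
  awk -v key="$2" -v val="$3" '
    # A prompt line containing the key arms the replacement.
    index($0, key) && !/^>/ { armed = 1; print; next }
    # The next answer line (starting with ">") receives the new value.
    armed && /^>/ { print ">" val; armed = 0; next }
    { print }
  ' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Example calls:
# set_param PrimedRPA_Parameters.txt "reference name for this PrimedRPA run" "HPV_Run_1"
# set_param PrimedRPA_Parameters.txt "Desired primer length" "32"
```

Because the ">" symbol is always written back, the file keeps the layout that the important note asks you to preserve.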

Step 2

Now the parameters file has been adjusted, we can begin our analysis via the following command:

PrimedRPA PrimedRPA_Parameters.txt

First, an alignment summary will be generated. As we are only using a single sequence in this first example, we can ignore it for now.

HPV_Run_1_Alignment_Summary.csv

Next a file will be generated containing all of the potential oligo binding sites:

HPV_Run_1_PrimedRPA_Oligo_Binding_Sites.csv

Finally, the output file will be generated:

HPV_Run_1_Output_Sets.csv

On inspection of the output file, the Max Dimerisation Score appears rather high for most sets.
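One way to inspect a single column of the output file from the terminal is sketched below. It assumes the CSV has a header row, that the column header is exactly "Max Dimerisation Score", and that no field contains a quoted comma; none of these is confirmed here, so check the header of your own file first. `csv_column` is a hypothetical helper, not part of PrimedRPA.

```shell
# Hypothetical helper (not part of PrimedRPA).
# $1 = CSV file, $2 = exact header name of the column to print
# Assumes a simple CSV with no quoted commas inside fields.
csv_column() {
  awk -F',' -v name="$2" '
    { sub(/\r$/, "") }                      # tolerate Windows line endings
    NR == 1 {
      for (i = 1; i <= NF; i++) if ($i == name) col = i
      if (!col) { print "column not found: " name > "/dev/stderr"; exit 1 }
      next
    }
    { print $col }
  ' "$1"
}

# Example call:
# csv_column HPV_Run_1_Output_Sets.csv "Max Dimerisation Score" | sort -n
```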

Step 3

To obtain better candidates, we can increase the filter stringency and rerun the analysis. In addition, to save computation time, we can load the previously generated binding sites rather than regenerating them.

To achieve this, edit the parameters file as follows:

This parameters file will guide the PrimedRPA-based primer and probe design process. Please follow the instructions outlined below:

----####----
Important Note - Do not remove any of the “>” and write your input directly after this symbol.
----####----


Please define the reference name for this PrimedRPA run:
>HPV_Run_2

Please indicate if you would like to use a previously generated Alignment File: [NO or File path]
>NO

Please indicate if you would like to use the previously generated Binding Sites: [NO or File path]
>HPV_Run_1_PrimedRPA_Oligo_Binding_Sites.csv

Please enter the path, from your current working directory, to the input fasta file:
>GCF_000896435.1_ViralProj76727_genomic.fna

Please classify the contents of the input fasta file as one of the following options: [SS, MS, MAS]. Whereby:
 SS = Single sequence
 MS = Multiple unaligned sequences
 MAS = Multiple aligned sequences

>SS

If multiple sequences are present in the input fasta file (Classification of MS or MAS), please indicate below the
percentage identity required for the primers and probes target binding sites:
>99

Please indicate if a primer identity anchor is required. [NO or length of anchor]
>NO

Desired primer length (This can be a range: 28-32 or fixed value: 32):
>32

Please state if you require a probe to be designed and if so what type [NO,EXO,NFO]
>NO

Desired probe length (This can be a range: 45-50 or fixed value: 50):
>50

Below please define your max amplicon length.
>300

Below please state the repeat nucleotide cut-off in bp (e.g. 5bp will exclude sequences containing GGGGG).

>5

Below please insert the minimum percentage GC content for primer/probe:
>30

Below please insert the maximum percentage GC content for primer/probe:
>70

Below please indicate the percentage match tolerance for primer-probe dimerisation and secondary structure formation:
>80

Please enter [No or Path to Background file] below to identify if you want to perform a background DNA binding check:
>NO

Below please insert the percentage background cross reactivity threshold:
>35

Below please indicate if you would like to implement a Background Hard Fail Filter [NO,YES]:
>NO

Please define the maximum number of sets you would like to identify:
>5

Please define the number of threads available:
>2

Blastn Cross Reactivity Search Settings [Basic or Advanced or Fast]
>Fast


Blastn Evalue
>1000

Then re-run the analysis:

PrimedRPA PrimedRPA_Parameters.txt

This time only a single output file will be generated (shown below), because we have reused the binding sites from the previous run. In addition, all candidates look more suitable for testing in the lab owing to their lower dimerisation scores.

HPV_Run_2_Output_Sets.csv

Example Two

Overview

Utilise command line parameters
Generate primer sets
Run a cross reactivity check

Step 1

Due to an unforeseen labelling error, we have clinical samples that could contain either HIV or HPV. To distinguish the HPV-containing samples, we want to design primers that have no HIV cross-reactivity potential.

# Download HIV-2 Genome From NCBI
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/003/098/135/GCA_003098135.1_ASM309813v1/GCA_003098135.1_ASM309813v1_genomic.fna.gz

gunzip GCA_003098135.1_ASM309813v1_genomic.fna.gz

Step 2

This time we are going to perform the analysis using only the command line options. In addition, as we are now adding a background binding check, we have to regenerate the binding sites (please see Parameters Options for more information). However, as the input file has not changed, we can use the alignment file generated previously (HPV_Run_1).


PrimedRPA --RunID HPV_Run_3 \
  --PriorAlign HPV_Run_1_Alignment_Summary.csv \
  --PrimerLength 32 \
  --AmpliconSizeLimit 500 \
  --NucleotideRepeatLimit 5 \
  --MinGC 30 \
  --MaxGC 70 \
  --DimerisationThresh 40 \
  --BackgroundCheck GCA_003098135.1_ASM309813v1_genomic.fna \
  --CrossReactivityThresh 40

Please look through HPV_Run_3_Output_Sets.csv to inspect the potential candidates generated. The additional column, Max Background Cross Reactivity Score, is the highest cross-reactivity score across all oligos within a given candidate set.

Also, as we have included a cross-reactivity check, the blastn output files for all oligo binding sites that fell below the threshold are stored in the following location:

./GCA_003098135_Blastn_DB_PrimedRPA/HPV_Run_3/<Oligo_Binding_Site_Sequence>_Blastn_Output.csv
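To see which binding sites have a stored blastn result, the files in that location can be listed by their `_Blastn_Output.csv` suffix. This is a minimal sketch; `list_blastn_outputs` is a hypothetical helper, not part of PrimedRPA, and the directory to pass is the run folder shown in the path above.

```shell
# Hypothetical helper (not part of PrimedRPA).
# $1 = run directory inside the Blastn database folder
list_blastn_outputs() {
  find "$1" -type f -name '*_Blastn_Output.csv' | sort
}

# Example: count the stored results for a run directory
# list_blastn_outputs <run_directory> | wc -l
```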

Example Three

Overview

Utilise parameters file
Generate primers & Exo probes

Step 1

Now we want to quantify the concentration of HPV DNA in our samples; to do this, we will need fluorescent Exo probes. Therefore, we need to re-run the analysis as follows:

Please alter the parameters file to include the Exo probe preference as follows:

This parameters file will guide the PrimedRPA-based primer and probe design process. Please follow the instructions outlined below:

----####----
Important Note - Do not remove any of the “>” and write your input directly after this symbol.
----####----


Please define the reference name for this PrimedRPA run:
>HPV_Run_4

Please indicate if you would like to use a previously generated Alignment File: [NO or File path]
>HPV_Run_1_Alignment_Summary.csv

Please indicate if you would like to use the previously generated Binding Sites: [NO or File path]
>NO

Please enter the path, from your current working directory, to the input fasta file:
>GCF_000896435.1_ViralProj76727_genomic.fna

Please classify the contents of the input fasta file as one of the following options: [SS, MS, MAS]. Whereby:
 SS = Single sequence
 MS = Multiple unaligned sequences
 MAS = Multiple aligned sequences

>SS

If multiple sequences are present in the input fasta file (Classification of MS or MAS), please indicate below the
percentage identity required for the primers and probes target binding sites:
>99

Please indicate if a primer identity anchor is required. [NO or length of anchor]
>NO

Desired primer length (This can be a range: 28-32 or fixed value: 32):
>32

Please state if you require a probe to be designed and if so what type [NO,EXO,NFO]
>EXO

Desired probe length (This can be a range: 45-50 or fixed value: 50):
>50

Below please define your max amplicon length.
>300

Below please state the repeat nucleotide cut-off in bp (e.g. 5bp will exclude sequences containing GGGGG).

>5

Below please insert the minimum percentage GC content for primer/probe:
>30

Below please insert the maximum percentage GC content for primer/probe:
>70

Below please indicate the percentage match tolerance for primer-probe dimerisation and secondary structure formation:
>80

Please enter [No or Path to Background file] below to identify if you want to perform a background DNA binding check:
>NO

Below please insert the percentage background cross reactivity threshold:
>65

Below please indicate if you would like to implement a Background Hard Fail Filter [NO,YES]:
>NO

Please define the maximum number of sets you would like to identify:
>5

Please define the number of threads available:
>2

Blastn Cross Reactivity Search Settings [Basic or Advanced or Fast]
>Fast

Blastn Evalue
>1000

Again, trigger analysis with the following command:

PrimedRPA PrimedRPA_Parameters.txt

Inspect the output file (below) and you will see potential candidate sets. In each probe sequence generated, two thymine residues will be situated approximately two-thirds of the way along the probe; these can be exchanged for the fluorescent marker and the quencher respectively.

HPV_Run_4_Output_Sets.csv
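When choosing which thymines to label, it can help to see where every T falls in a probe. The sketch below only reports the 1-based positions of thymines in a sequence you paste in; it does not reproduce how PrimedRPA selects probe sites, and `thymine_positions` is a hypothetical helper.

```shell
# Hypothetical helper (not part of PrimedRPA).
# $1 = probe sequence copied from the output file
# Prints the 1-based positions of every thymine, separated by spaces.
thymine_positions() {
  printf '%s\n' "$1" | awk '{
    s = toupper($0); out = ""
    for (i = 1; i <= length(s); i++)
      if (substr(s, i, 1) == "T") out = out (out == "" ? "" : " ") i
    print out
  }'
}

# Example with a short made-up sequence (not a real probe):
# thymine_positions ACGTTA
```

For a 50 bp probe, positions in the low-to-mid 30s correspond to roughly two-thirds of the way along the sequence.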

Example Four

Overview

Utilise parameters file
Input multiple unaligned fasta sequences
Generate primers & Exo probes

Coming Soon!