in silico assembly of pTA1_TDH3_ATF1_PGI1
This document will show you how to assemble an expression vector called pTA1_TDH3_ScATF1_PGI1
using the Yeast Pathway Kit.
This vector is a combination of:
- The strong TDH3/YGR192C promoter from the S. cerevisiae glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene
- The ATF1 gene from the pYPKa_A_ATF1 plasmid
- A terminator consisting of the intergenic sequence upstream of the phosphoglucose isomerase gene (PGI1/YBR196C)
- The pTA1 vector, which provides the selection markers and the origin of replication
The first step is to collect all sequences needed for the assembly. For a Yeast Pathway Kit single-gene expression vector (a transcriptional unit, TU, vector), this means:
- A linear plasmid sequence
- A promoter PCR product
- A gene PCR product
- A terminator PCR product
The pTA1 vector is available here. It should be linearized with the ZraI restriction enzyme. In ApE (A plasmid Editor), use Enzymes>Enzyme selector to find where this enzyme cuts. The linear plasmid sequence can then be obtained with the ApE Edit>"Linearize @ insert site" function.
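The linearization step can also be sketched in a few lines of Python. This is a minimal illustration, not what ApE does internally: it assumes the plasmid contains exactly one ZraI site (GACGTC, cut blunt between GAC and GTC), and the 15 bp `toy_plasmid` is a made-up sequence, not pTA1. The function name `linearize_at_site` is hypothetical.

```python
# Minimal sketch: linearize a circular sequence at a blunt ZraI site (GAC^GTC).
# Assumption: the site occurs exactly once; the toy sequence below is NOT pTA1.

ZRAI_SITE = "GACGTC"   # ZraI recognition sequence (palindromic)
ZRAI_CUT_OFFSET = 3    # blunt cut between GAC and GTC


def linearize_at_site(circular_seq, site=ZRAI_SITE, cut_offset=ZRAI_CUT_OFFSET):
    """Return the linear sequence obtained by cutting a circular sequence once."""
    seq = circular_seq.upper()
    # Append the first len(site) - 1 bases so a site spanning the origin is found.
    wrapped = seq + seq[: len(site) - 1]
    positions = []
    start = wrapped.find(site)
    while start != -1:
        positions.append(start)
        start = wrapped.find(site, start + 1)
    if len(positions) != 1:
        raise ValueError(f"expected exactly one site, found {len(positions)}")
    cut = (positions[0] + cut_offset) % len(seq)
    # Rotate so that the linear molecule starts right after the cut.
    return seq[cut:] + seq[:cut]


toy_plasmid = "aaaccgacgtcttgg"  # hypothetical 15 bp circular sequence
linear = linearize_at_site(toy_plasmid)
print(linear)
```

The linear sequence starts with GTC and ends with GAC, the two halves of the ZraI site, which is a quick way to check a linearized vector by eye.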
The PCR products can be obtained using WebPCR and the PCR primers indicated in the table below.
Target | Template | Forward primer | Reverse primer |
---|---|---|---|
Promoter | pYPKa_Z_TDH3 | 577 | 567 |
Gene | pYPKa_A_ATF1 | 468 | 467 |
Terminator | pYPKa_E_PGI1 | 568 | 578 |
All primer sequences are available here.
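To make clear what an in silico PCR produces, here is a rough sketch of the idea. It is not the WebPCR algorithm: it assumes that the last `anneal_len` bases of each primer match the template exactly and only once, and the template and primers are invented toy sequences rather than the primers in the table. The names `simulate_pcr` and `anneal_len` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def simulate_pcr(template, forward_primer, reverse_primer, anneal_len=12):
    """Return the PCR product for primers whose 3' ends match the template.

    Only the last `anneal_len` bases of each primer have to anneal, so primers
    may carry 5' tails; the tails become part of the product.
    """
    template = template.upper()
    fwd = forward_primer.upper()
    rev = reverse_primer.upper()
    fwd_foot = fwd[-anneal_len:]
    # The reverse primer appears on the top strand as its reverse complement.
    rev_foot = reverse_complement(rev[-anneal_len:])
    if template.count(fwd_foot) != 1 or template.count(rev_foot) != 1:
        raise ValueError("each primer must anneal exactly once")
    fwd_end = template.find(fwd_foot) + len(fwd_foot)
    rev_start = template.find(rev_foot)
    if rev_start < fwd_end:
        raise ValueError("primers overlap or point away from each other")
    return fwd + template[fwd_end:rev_start] + reverse_complement(rev)


# Toy data (assumptions, not the real templates or primers):
toy_template = "TTTTACGTACGGATCCGGGAAACCCTTTCAGTCAGTTGCAAAAA"
toy_forward = "GGACGTACGGATCC"   # 2 nt tail + 12 nt annealing part
toy_reverse = "CCTGCAACTGACTG"   # 2 nt tail + 12 nt annealing part
product = simulate_pcr(toy_template, toy_forward, toy_reverse)
print(len(product), product)
```

The product begins with the full forward primer and ends with the reverse complement of the full reverse primer, which is how primer tails add the short flanking sequences used later in the assembly.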
Collect the linear vector sequence and the three PCR product sequences in FASTA format in a text editor such as Notepad like so:
Paste the four sequences into the Assembly simulator tool:
Select circular assembly and click "submit". The result should yield a figure and a sequence for the assembly and for the reverse complement. The reverse complement sequence is a by-product of the algorithm used. The assembled sequence is marked by an orange line in the figure below.
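The principle behind the circular assembly can be illustrated with a small sketch. It is a deliberate simplification and not the algorithm used by the Assembly simulator: it assumes that the fragments are given in the right order and orientation and that each fragment ends with the same bases that start the next one. Because it never considers the opposite orientation, it returns only one sequence and no reverse complement. All sequences and function names are hypothetical.

```python
def terminal_overlap(left, right, min_len=4):
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def assemble_circular_in_order(fragments, min_len=4):
    """Join fragments in the given order and close the circle.

    Assumption: the fragments are already in the correct order and orientation.
    """
    circular = ""
    for index, fragment in enumerate(fragments):
        following = fragments[(index + 1) % len(fragments)]
        size = terminal_overlap(fragment, following, min_len)
        if size == 0:
            raise ValueError(f"fragment {index} shares no end with the next one")
        # Keep each shared end only once by trimming it from the current fragment.
        circular += fragment[: len(fragment) - size]
    return circular


# Toy fragments with 6 bp shared ends (made-up sequences):
toy_vector = "CTGCAGTTTTTTTTAAGCTT"
toy_promoter = "AAGCTTACACACGGATCC"
toy_gene = "GGATCCATGAAATAAGAATTC"
toy_terminator = "GAATTCGTGTGTCTGCAG"

toy_fragments = [toy_vector, toy_promoter, toy_gene, toy_terminator]
toy_circle = assemble_circular_in_order(toy_fragments, min_len=6)
print(len(toy_circle), toy_circle)
```

The length of the circular product equals the summed fragment lengths minus the summed lengths of the shared ends, which is a useful sanity check on any assembly result.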
The resulting sequence should be around 9646 bp long and have the short seguid checksum cdseguid=y6oBCE
Compare the size and the complete seguid checksum with those of your colleagues.
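Comparing checksums only works if the checksum of a circular, double-stranded sequence does not depend on where the sequence starts or which strand is written down. The sketch below illustrates that property with a toy checksum. It is not the seguid algorithm and will not reproduce the cdseguid values given in this document; `toy_circular_checksum` is a hypothetical name.

```python
import base64
import hashlib


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def smallest_rotation(seq):
    """Return the alphabetically smallest rotation of a circular sequence."""
    # Simple quadratic approach; fine for a sketch and for plasmid-sized input.
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


def toy_circular_checksum(seq):
    """Checksum that ignores the start position and the strand (NOT seguid)."""
    seq = seq.upper()
    canonical = min(
        smallest_rotation(seq),
        smallest_rotation(reverse_complement(seq)),
    )
    digest = hashlib.sha1(canonical.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


toy_plasmid = "ATGCCGTAAGCT"                    # made-up circular sequence
rotated = toy_plasmid[5:] + toy_plasmid[:5]     # same circle, other start
flipped = reverse_complement(toy_plasmid)       # same circle, other strand

print(toy_circular_checksum(toy_plasmid))
print(toy_circular_checksum(rotated))
print(toy_circular_checksum(flipped))
```

All three calls print the same string, while a sequence with a single base changed gives a different one.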
The assembly can also be done using pydna directly. For this exercise, we will use pydna in Google Colab, which you can use if you have a free Google account. Colab is a hosted Jupyter Notebook service that requires no setup: it lets you write and execute Python in your browser without installing any software. A Jupyter notebook is a Python program file that can also show comments and images as well as intermediate results.
Go to Google Colab in your web browser. Create a new notebook by clicking on the "New notebook" button (see the image below):
Copy the code below into the first cell. This code tells the Python package manager pip to install the pydna package, which has the functionality we need.
!pip install pydna
Run the first cell by clicking the arrow button and wait for the execution to finish (see below).
You can ignore the output from this cell.
Click on the button to get a new code cell (see below):
Copy the code below into the new cell and execute.
from pydna import logo
from pydna.parsers import parse
from pydna.assembly import Assembly
logo()
You should now have a printout of the pydna logo:
Create a new code cell and paste the code below. Replace the sequences with your own and execute. Make sure that you adhere to the FASTA sequence format.
sequences = """\
>pTA1 linear
replace-this-text-with-your-own-DNA-sequence
>promoter
replace-this-text-with-your-own-DNA-sequence
>gene
replace-this-text-with-your-own-DNA-sequence
>terminator
replace-this-text-with-your-own-DNA-sequence
"""
There should be no output from the code cell above.
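Since that cell gives no feedback, a quick format check can catch mistakes such as leftover placeholder text. The sketch below is an optional, pure-Python helper and not part of pydna. It assumes four records and a strict A, C, G, T alphabet, so ambiguity codes such as N would be reported. The names `parse_fasta` and `check_records` are hypothetical.

```python
def parse_fasta(text):
    """Return a list of (name, sequence) tuples from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_records(records, expected_count=4, alphabet="ACGT"):
    """Return a list of problems; an empty list means nothing was found."""
    problems = []
    if len(records) != expected_count:
        problems.append(f"expected {expected_count} records, found {len(records)}")
    for name, seq in records:
        unexpected = sorted(set(seq.upper()) - set(alphabet))
        if not seq:
            problems.append(f"{name}: empty sequence")
        elif unexpected:
            problems.append(f"{name}: unexpected characters {''.join(unexpected)}")
    return problems


# Toy input (made-up sequences) in the same layout as the cell above:
toy_sequences = """\
>pTA1 linear
GTCTTGGAAACCGAC
>promoter
AAGCTTACACAC
>gene
ATGAAATAA
>terminator
GTGTGTCTGCAG
"""

print(check_records(parse_fasta(toy_sequences)))
```

An empty list means that the text contains four non-empty records made up of A, C, G and T only. Running the same check on the unedited template reports one problem per record.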
Create a new code cell, paste the code below and execute.
linear_vector, promoter, gene, terminator = parse(sequences)
linear_vector, promoter, gene, terminator
For this example, the output should list the four parsed sequences. Next, create a new code cell, paste the code below and execute.
asm = Assembly((linear_vector, promoter, gene, terminator))
candidates = asm.assemble_circular()
candidate, *rest = candidates
candidate.figure()
You should get a figure like this one as a result:
Create a new code cell, paste the code below and execute.
result = candidate.synced("gttctgatcctcgagcatcttaagaattc")
result.name = "pTA1_TDH3_ScATF1_PGI1" # Change this name as needed
print(result.format())
This should give you the sequence of the plasmid in GenBank format:
Create a new code cell, paste the code below and execute.
result.seguid()
Compare the sequence length and seguid code (short cdseguid=y6oBCE) with those of your colleagues.
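A circular sequence has no natural first base, so two correct results can look different as text simply because they start at different positions. The sketch below shows the general idea of rotating a circular sequence to a fixed starting point. It is a simplified stand-in and not pydna's implementation of `synced`: it assumes the anchor sequence occurs exactly in the plasmid, on either strand. The function name `rotate_to_start` and the toy sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def rotate_to_start(circular_seq, anchor):
    """Rotate a circular sequence so that it starts with `anchor`.

    If the anchor is only found on the other strand, the reverse complement
    is rotated and returned instead.
    """
    if not anchor:
        raise ValueError("anchor must not be empty")
    anchor = anchor.upper()
    for seq in (circular_seq.upper(), reverse_complement(circular_seq)):
        # Extend the sequence so an anchor spanning the origin is found too.
        wrapped = seq + seq[: len(anchor) - 1]
        index = wrapped.find(anchor)
        if index != -1:
            return seq[index:] + seq[:index]
    raise ValueError("anchor not found on either strand")


toy_plasmid = "GGATCCTTTAAAGAATTC"   # made-up circular sequence
print(rotate_to_start(toy_plasmid, "AAAGAA"))   # anchor inside the sequence
print(rotate_to_start(toy_plasmid, "TTCGGA"))   # anchor spanning the origin
print(rotate_to_start(toy_plasmid, "AAAGGA"))   # anchor on the other strand
```

After such a rotation, two text versions of the same plasmid can be compared directly, in addition to comparing their checksums.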