in silico assembly of a pYPKa vector - MetabolicEngineeringGroupCBMA/MetabolicEngineeringGroupCBMA.github.io GitHub Wiki

Yeast Pathway Kit vectors can be assembled by hand in ApE or another DNA editor. However, manually copying and pasting every sequence is slow, tedious and error-prone.

If you have a Google account, you can use Google Colab and pydna to do this automatically from a collection of linear sequences.

Collect all sequences for the assembly. For a Yeast Pathway Kit single gene expression TU vector, this means:

  1. A linear plasmid sequence
  2. A promoter PCR product
  3. A gene PCR product
  4. A terminator PCR product

The linear plasmid sequence can be obtained by converting the plasmid sequence with the ApE function Edit > "Linearize @ insert site". If you have problems, see ApE.

The PCR products can be obtained using WebPCR and the primers below.

| Template | Forward primer | Reverse primer |
|---|---|---|
| pYPKa_Z_Promoter | 577 | 567 |
| pYPKa_A_Gene | 468 | 467 |
| pYPKa_E_Terminator | 568 | 578 |

Collect the four sequences in FASTA format in a text editor such as Notepad:

Go to Google Colab in your web browser. Google Colab is a free service offered by Google for running Jupyter notebooks. A [Jupyter notebook](https://nbviewer.org) is a Python program that can also show comments and images as well as intermediate results.

Create a new notebook by clicking on the "New notebook" button, see the image below:

Copy the code below into the first cell. It tells pip, the Python package manager, to install the pydna package, which provides the functionality we need.

!pip install pydna

Run the first cell by clicking the arrow button and wait for the execution to finish (see below). You can ignore the output it prints.

Then click on the button to get a new code cell (see below):

Copy the code below into the new cell and execute.

from pydna import logo
from pydna.parsers import parse
from pydna.assembly import Assembly
logo()

You should now have a printout of the pydna logo:

Create a new code cell and paste the code below. Replace the sequences with your own and execute.

sequences = """\

>pTA1 linear
gtcatgcgcatgatatcttcacaggcggtt...

>promoter
GTTCTGATCCTCGAGCATCTTAAGAATTCG...

>gene
GTCGAGGAACGCCAGGTTGCCCACTTTCTC...

>terminator
GTGCCATCTGTGCAGACAAACGCATCAGGA...

"""

There should be no output from the code cell above.
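Because that cell prints nothing, a typo in the pasted text can go unnoticed until the assembly step. The following is a minimal sketch of a sanity check that uses only the Python standard library. The helper names `read_fasta` and `check_records` are hypothetical and not part of pydna, and the short sequences are made up for illustration. The tutorial still uses pydna's `parse` to read the sequences.

```python
def read_fasta(text):
    """Return a list of (name, sequence) tuples from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines between records are ignored
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_records(records, expected=4):
    """Return a list of problems found; an empty list means the text looks fine."""
    problems = []
    if len(records) != expected:
        problems.append(f"expected {expected} records, found {len(records)}")
    for name, seq in records:
        unexpected = set(seq.upper()) - set("ACGT")
        if not seq:
            problems.append(f"{name}: empty sequence")
        elif unexpected:
            problems.append(f"{name}: unexpected characters {sorted(unexpected)}")
    return problems


# Short made-up sequences, for illustration only
example = """\
>vector
gtcatgcgcatgatatc
>promoter
GTTCTGATCCTCGAG
>gene
GTCGAGGAACGCCAG
>terminator
GTGCCATCTGTGCAG
"""

records = read_fasta(example)
print(check_records(records))  # an empty list means no problems were found
```

To check your own text, pass the `sequences` string from the cell above to `read_fasta`. Any character other than A, C, G or T is reported, including dots left over from a truncated sequence.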

Create a new code cell and paste the code below and execute.

linear_vector, promoter, gene, terminator = parse(sequences)

asm = Assembly((linear_vector, promoter, gene, terminator))

candidates = asm.assemble_circular()

candidate, *rest = candidates

candidate.figure()

The output should be a figure of the circular assembly.
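The circular assembly is only found when neighbouring fragments share sequence at their ends. If no candidate is returned, checking these overlaps can help locate the problem. The sketch below illustrates the idea on plain strings. It assumes the fragments are listed in the order vector, promoter, gene, terminator and all in the same orientation. The function `terminal_overlap`, the `min_len` parameter and the toy sequences are hypothetical, and this is not how pydna searches for shared sequences internally.

```python
def terminal_overlap(left, right, min_len=20):
    """Length of the longest suffix of `left` that is also a prefix of `right`.

    Returns 0 if no overlap of at least `min_len` bases exists. Case is ignored.
    """
    left, right = left.upper(), right.upper()
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


# Toy fragments (made up): each one ends with the first six bases of the next,
# and the terminator ends with the first six bases of the vector.
fragments = [
    ("vector", "GGATCCAAAAAACTCGAG"),
    ("promoter", "CTCGAGTTTTTTGAATTC"),
    ("gene", "GAATTCATGCCCAAGCTT"),
    ("terminator", "AAGCTTCCCCCCGGATCC"),
]

overlaps = []
for index, (name, seq) in enumerate(fragments):
    # The modulo wraps around so the last fragment is compared with the first
    next_name, next_seq = fragments[(index + 1) % len(fragments)]
    size = terminal_overlap(seq, next_seq, min_len=4)
    overlaps.append(size)
    print(f"{name} -> {next_name}: {size} bp")
```

A reported overlap of 0 between two neighbours would point to the junction that prevents the circle from closing, for example a PCR product made with the wrong primer pair.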

Create a new code cell, paste the code below and execute.

result = candidate.synced("gttctgatcctcgagcatcttaagaattc")

result.name = "pTA1_TDH3_ScATF1_PGI1"

print(result.format())

This should give you the sequence of the plasmid in GenBank format.
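The `synced` call matters because a circular sequence has no natural first base. Starting every copy of the plasmid at the same sequence makes the printed files directly comparable. The sketch below shows only the rotation idea on a plain string and a single strand. It is not pydna's implementation, and `rotate_to_start` is a hypothetical helper.

```python
def rotate_to_start(circular_seq, start):
    """Rotate a circular sequence, given as a string, so that it begins with `start`."""
    seq = circular_seq.lower()
    start = start.lower()
    # Searching in the doubled string also finds a match that spans the origin
    position = (seq + seq).find(start)
    if len(start) > len(seq) or position == -1:
        raise ValueError("start sequence not found on this strand")
    return seq[position:] + seq[:position]


# Two made-up printouts of the same circle, starting at different positions
print(rotate_to_start("ccggaattc", "gaat"))
print(rotate_to_start("ttcccggaa", "gaat"))
```

Both calls print the same string, because the two inputs describe the same circle and differ only in where the sequence was opened.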

Create a new code cell, paste the code below and execute. Compare the resulting cSEGUID checksum with those of your colleagues; identical sequences give identical checksums:

result.cseguid()
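A checksum for a circular sequence is useful for this comparison when it does not depend on where the sequence starts or which strand was written down. The sketch below illustrates that concept with the standard library only. It does not reproduce pydna's cSEGUID values, and the names `circular_digest`, `smallest_rotation` and `reverse_complement` are hypothetical. Use `result.cseguid()` for the actual comparison.

```python
import hashlib

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def smallest_rotation(seq):
    """Alphabetically smallest rotation (simple quadratic version, fine for a demo)."""
    seq = seq.upper()
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


def circular_digest(seq):
    """Digest that is the same for every rotation of a circle and for either strand."""
    canonical = min(
        smallest_rotation(seq),
        smallest_rotation(reverse_complement(seq)),
    )
    return hashlib.sha1(canonical.encode("ascii")).hexdigest()


# The same made-up circle written three ways, plus one with a single base changed
original = "GAATTCCCG"
rotated = "CCCGGAATT"
other_strand = reverse_complement(original)
mutated = "GAATTCCCA"

print(circular_digest(original) == circular_digest(rotated))
print(circular_digest(original) == circular_digest(other_strand))
print(circular_digest(original) == circular_digest(mutated))
```

The first two comparisons are true and the third is false: a single changed base gives a different digest, while the choice of start position or strand does not.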