Canu - weaponsforge/fastractor GitHub Wiki
canu notes
Notes on how to install and use canu (https://canu.readthedocs.io/en/latest/quick-start.html) on CentOS 8.
Content
- Requirements
- Install the canu Dependencies
- Install canu
- Download a Dataset
- canu Sample Usage
- References
Requirements
The following requirements and dependencies were used for this project. Other system and software configurations are open for testing.
- Virtual Box 6.14 (for Windows OS)
- Windows 10 Pro (host OS)
- Version 1909 (OS Build 18363.1082)
- Processor: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz 2.60 GHz
- GPU: NVIDIA GeForce GTX 1060, 6 GB Dedicated GPU Memory
- Memory: 16 GB
- System type: 64-bit OS, x64-based processor
- CentOS Linux release 8.1.1911 (Core) - VM (guest OS) running on VMWare Player
- Memory: 8 GB
- Processors: 4
- Hard Disk: 40 GB
- kernel 4.18.0-147.5.1.el8_2.x86_64
- canu v.2.1 (binary release) dependencies
- Gcc (GCC) 8.3.1 20190507 (Red Hat 8.3.1-4)
- Perl v5.26.3
- Java jdk 8
- Gnuplot 5.2
Install the canu Dependencies
System Update
(Optional) Update the system to get the latest security fixes and binary packages if it is not yet up to date. The CentOS system used for this demo was already up to date, so these steps were skipped.
- Check if there are available updates.
sudo yum check-update
- Update the OS kernel package.
sudo yum update -y kernel
- Update all packages.
sudo yum update
- Reboot.
sudo shutdown -r now
canu Dependencies Installation
The following dependencies must first be installed and configured before proceeding to use canu.
- Install gcc and perl.
- No need to install these dependencies because CentOS 8.0 already has gcc 8.3.1 and perl 5.26.3 pre-installed.
- Install Java JDK 8
- Install the OpenJDK version.
sudo yum install -y java-1.8.0-openjdk
sudo yum install -y java-1.8.0-openjdk-devel
java -version
- Set the Java environment variables.
cat <<EOF | sudo tee /etc/profile.d/java8.sh
export JAVA_HOME=/usr/lib/jvm/jre-openjdk
export PATH=\$PATH:\$JAVA_HOME/bin
export CLASSPATH=.:\$JAVA_HOME/jre/lib:\$JAVA_HOME/lib:\$JAVA_HOME/lib/tools.jar
EOF
- Activate the Java environment.
source /etc/profile.d/java8.sh
- Verify the installed Java version.
java -version
The above command should output something similar to:
openjdk version "1.8.0_265"
OpenJDK Runtime Environment (build 1.8.0_265-b01)
OpenJDK 64-bit Server VM (build 25.265-b01, mixed mode)
- Install gnuplot. Version 5.2 patch 4 was used for this demo.
sudo yum install gnuplot
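Once the dependencies above are installed, it can help to confirm in one pass that each of them is reachable from the command line. The following is a minimal sketch, not part of the canu distribution; the function name check_deps is hypothetical, and it only tests that a command exists on PATH, not that its version matches the ones listed under Requirements.

```shell
# Report which of the given commands are reachable on PATH.
# Prints one line per command and returns non-zero if any is missing.
# check_deps is a hypothetical helper name, not a canu or yum tool.
check_deps() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found:   $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Usage for this guide (left as a comment so that pasting the block
# only defines the function):
# check_deps gcc perl java gnuplot
```

If any line reads "missing", revisit the matching install step before moving on to the canu installation.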
Install canu
- Download the canu binary release for Linux. canu 2.1 was used for this demo.
wget https://github.com/marbl/canu/releases/download/v2.1/canu-2.1.Linux-amd64.tar.xz
- Install from the binary distribution. When run from /home/adminuser, the following command extracts canu into /home/adminuser/canu-2.1/bin:
tar -xJf canu-2.1.Linux-amd64.tar.xz
- Verify that the canu executable canu-2.1/bin/canu is present. If there is no canu-2.1/bin/ directory or the canu-2.1/bin/canu file is missing, retrace the previous installation steps for errors before proceeding to the next step.
- Permanently add canu's bin directory to the PATH environment variable to make canu available globally from the command line.
- Create a canu.sh file.
sudo nano /etc/profile.d/canu.sh
- Enter your canu installation path and save.
INFO: Use your canu's full installation directory for the PATH variable. The sample code uses canu installed in /home/adminuser/canu-2.1/bin.
export PATH=$PATH:/home/adminuser/canu-2.1/bin
- Source the canu.sh file.
source /etc/profile.d/canu.sh
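The verification and PATH steps above can also be sketched as two small shell functions. This is an illustrative sketch under assumptions: the names canu_bin_ok and add_to_path_once are hypothetical, and the install path in the usage comment is the one from this demo. The PATH helper skips directories that are already listed, so sourcing a profile script several times does not keep lengthening PATH.

```shell
# Succeed only if <install dir>/bin/canu exists and is executable.
# canu_bin_ok is a hypothetical helper name.
canu_bin_ok() {
    [ -x "$1/bin/canu" ]
}

# Append a directory to PATH unless it is already listed there.
# add_to_path_once is a hypothetical helper name.
add_to_path_once() {
    dir=$1
    case ":$PATH:" in
        *":$dir:"*) ;;                # already present, nothing to do
        *) PATH="$PATH:$dir" ;;
    esac
    export PATH
}

# Example with the install path used in this demo (adjust to your own):
# canu_bin_ok /home/adminuser/canu-2.1 && add_to_path_once /home/adminuser/canu-2.1/bin
```

This only changes PATH for the current shell session; the /etc/profile.d/canu.sh file is still what makes the setting permanent.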
Download a Dataset
- Navigate back to /home/adminuser.
cd ~/
- Download a sample dataset (233 MB) to test with canu.
curl -L -o pacbio.fastq http://gembox.cbcb.umd.edu/mhap/raw/ecoli_p6_25x.filtered.fastq
- Verify that the file pacbio.fastq was downloaded to /home/adminuser/pacbio.fastq.
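Beyond checking that the file exists, a quick structural check can catch a truncated or failed download before starting a long canu run. The sketch below assumes the common FASTQ layout of exactly four lines per record (header starting with "@", sequence, "+" line, qualities); the function names are hypothetical and the check is a rough sanity test, not a full format validator.

```shell
# Count reads in a FASTQ file, assuming four lines per record.
count_fastq_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Succeed only if the file is non-empty, its line count is a multiple
# of four, and every record header line starts with "@".
fastq_looks_valid() {
    [ -s "$1" ] && awk '
        BEGIN { bad = 0 }
        NR % 4 == 1 && substr($0, 1, 1) != "@" { bad = 1; exit }
        END { if (bad || NR % 4 != 0) exit 1; exit 0 }
    ' "$1"
}

# Usage for this guide:
# fastq_looks_valid pacbio.fastq && count_fastq_reads pacbio.fastq
```

A file that fails this check is worth downloading again before proceeding.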
canu Sample Usage
- Process the dataset. The directory path of pacbio.fastq may need to be specified if canu is called from a directory other than the one where pacbio.fastq was downloaded.
canu \
 -p ecoli -d ecoli-pacbio \
 genomeSize=4.8m \
 -pacbio-raw pacbio.fastq
INFO: If canu is not yet globally available, use its full installation path instead:
/home/adminuser/canu-2.1/bin/canu \
 -p ecoli -d ecoli-pacbio \
 genomeSize=4.8m \
 -pacbio-raw pacbio.fastq
- Wait for the process to finish.
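After the run finishes, a short summary of the assembled contigs gives a first impression of the result. The sketch below is an assumption-laden example: fasta_summary is a hypothetical helper name, and the output path ecoli-pacbio/ecoli.contigs.fasta is inferred from the -d and -p values used above together with canu's usual `<prefix>.contigs.fasta` naming, so confirm the actual file name in your output directory.

```shell
# Print the number of sequences and the total number of bases in a FASTA file.
# fasta_summary is a hypothetical helper name, not a canu tool.
fasta_summary() {
    awk '
        /^>/ { n++; next }            # header line: one more sequence
        { bases += length($0) }       # sequence line: add its length
        END { printf "%d sequences, %d bases\n", n, bases }
    ' "$1"
}

# Usage for this guide (path is an assumption, see above):
# fasta_summary ecoli-pacbio/ecoli.contigs.fasta
```

A total base count in the neighborhood of the genomeSize value passed to canu (4.8m here) would suggest the assembly covers most of the genome, though this is only a rough indicator.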
References
[1] https://github.com/marbl/canu/releases - canu binary releases
[2] https://canu.readthedocs.io/en/latest/ - canu documentation
@weaponsforge
20201015