Microbiome Helper 2 Useful code - LangilleLab/microbiome_helper GitHub Wiki
Authors: Robyn Wright Modifications by: NA
Please note: We think that everything here should work, but we are still testing/developing this so use with caution :)
Introduction
The code shown here is a collection of snippets that I have used at various points. They have lived in a "Useful_code" document of my own for a long time, so I thought that they may as well migrate here.
Random useful code
htop
Check processes:
htop
Exit:
F10 or fn+F10 (Mac)
Change file/folder permissions
sudo chmod -R ugo+rw folder_path #the -R flag will do this recursively to everything inside this folder
chmod -R ugo-rw folder_path #the same, but removing read/write permissions
Change file/folder owner:
sudo chown -R USER folder_path
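To see what the symbolic modes actually do, here is a small sketch that runs on a throwaway folder. The folder and file names are made up for the example, and no sudo is needed because we own everything that gets created; sudo only matters for files that belong to someone else.

```shell
# Throwaway folder so no real permissions are changed
workdir=$(mktemp -d)
mkdir "$workdir/folder_path"
printf 'test\n' > "$workdir/folder_path/file.txt"

# Give everyone (user, group, other) read/write on the folder and its contents
chmod -R ugo+rw "$workdir/folder_path"
perms_open=$(ls -l "$workdir/folder_path/file.txt" | cut -c1-10)

# Take read/write away again from group and other, leaving the owner untouched
chmod -R go-rw "$workdir/folder_path"
perms_closed=$(ls -l "$workdir/folder_path/file.txt" | cut -c1-10)

echo "after ugo+rw: $perms_open"
echo "after go-rw:  $perms_closed"
rm -rf "$workdir"
```

The first ten characters of `ls -l` are the file type followed by the user, group and other permission triplets, which is a quick way to check that a change did what you expected.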
Count lines in a file
wc -l < file_name.txt
Or files in a directory:
ls directory | wc -l
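A slightly fuller sketch of the same counting ideas, run on a temporary folder with made-up file names. Reading the file on standard input makes `wc -l` print only the number, and `find` is one way to count just the files that match a pattern rather than everything in the directory.

```shell
# Work in a throwaway directory so nothing real is touched
workdir=$(mktemp -d)
printf 'line1\nline2\nline3\n' > "$workdir/file_name.txt"
touch "$workdir/a.fasta" "$workdir/b.fasta"

# Count lines; reading from stdin means only the number is printed
n_lines=$(wc -l < "$workdir/file_name.txt")

# Count everything in the directory
n_files=$(ls "$workdir" | wc -l)

# Count only the regular files that match a pattern
n_fasta=$(find "$workdir" -maxdepth 1 -type f -name '*.fasta' | wc -l)

echo "lines: $n_lines, entries: $n_files, fasta files: $n_fasta"
rm -rf "$workdir"
```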
See most recent files added to current directory
ls -Artlh | tail -n 10
Show a certain number of files in a directory (head shows 10 by default)
ls | head -n 20
List size of files
du -h | sort -h
du -sh #total size of the current directory only
Show free space on server
df -h
Zipping files
Unzip files:
gunzip raw_data/*gz
tar -xf filename
Zip files:
gzip file_to_zip
tar -czvf name-of-archive.tar.gz /path/to/directory-or-file
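The commands above can be tried end to end on a small test folder before using them on real data. This is only a sketch: the folder and file names are invented, and it uses `gzip -c` so that the original file is kept (plain `gzip file_to_zip` replaces the file with the compressed version).

```shell
workdir=$(mktemp -d)
mkdir "$workdir/raw_data"
printf 'ACGT\n' > "$workdir/raw_data/sample1.txt"

# Archive and compress the folder; -C switches to the parent folder first so
# the archive stores the relative path raw_data/ rather than the full path
tar -czf "$workdir/raw_data.tar.gz" -C "$workdir" raw_data

# List what is inside without extracting anything
listing=$(tar -tzf "$workdir/raw_data.tar.gz")

# Extract into a separate folder and read a file back
mkdir "$workdir/extracted"
tar -xzf "$workdir/raw_data.tar.gz" -C "$workdir/extracted"
restored=$(cat "$workdir/extracted/raw_data/sample1.txt")

# gzip replaces the input file by default; writing to stdout with -c keeps it
gzip -c "$workdir/raw_data/sample1.txt" > "$workdir/sample1.txt.gz"
unzipped=$(gunzip -c "$workdir/sample1.txt.gz")

echo "$listing"
rm -rf "$workdir"
```

Listing an archive with `tar -tzf` before extracting is a cheap way to check where the files will end up.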
rsync
Using rsync to copy files:
rsync --partial --progress W0.tar.bz2 USER@server_address:destination_path #USER, server_address and destination_path are placeholders
Combine files:
cat folder/*.fasta > combined.fasta
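After combining, it is worth checking that no sequences went missing. A minimal sketch with two toy fasta files (names and sequences are made up): the number of header lines across the inputs should match the number in the combined file. Note that the output is written outside of the input folder, so it is not picked up by the `*.fasta` pattern itself.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/folder"
printf '>seq1\nACGT\n' > "$workdir/folder/a.fasta"
printf '>seq2\nGGCC\n>seq3\nTTAA\n' > "$workdir/folder/b.fasta"

# Combine, writing the output outside of the folder being read
cat "$workdir"/folder/*.fasta > "$workdir/combined.fasta"

# Count header lines in the inputs (-h hides the file names) and in the output
n_input=$(grep -h '^>' "$workdir"/folder/*.fasta | wc -l)
n_combined=$(grep -c '^>' "$workdir/combined.fasta")

echo "input sequences: $n_input, combined sequences: $n_combined"
rm -rf "$workdir"
```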
Convert fastq to fasta
sed -n '1~4s/^@/>/p;2~4p' cat_reads/cDNA-N1-neg.fastq > cat_reads/cDNA-N1-neg.fasta
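The `1~4` address in the sed command is a GNU extension, so it may not work with the sed that ships on a Mac. As a possible alternative, here is an awk sketch that should do the same thing for standard four-line fastq records; the two reads are invented for the example.

```shell
workdir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n' > "$workdir/reads.fastq"

# Line 1 of each record: swap the leading @ for >; line 2: print the sequence
awk 'NR % 4 == 1 { print ">" substr($0, 2) } NR % 4 == 2 { print }' \
    "$workdir/reads.fastq" > "$workdir/reads.fasta"

fasta=$(cat "$workdir/reads.fasta")

# Sanity check: the number of fasta headers should equal fastq lines / 4
n_reads=$(( $(wc -l < "$workdir/reads.fastq") / 4 ))
n_headers=$(grep -c '^>' "$workdir/reads.fasta")

echo "$fasta"
rm -rf "$workdir"
```

Both versions assume that every record is exactly four lines, so neither will work on fastq files where sequences are wrapped over several lines.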
Convert bam to fastq
samtools bam2fq SAMPLE.bam > SAMPLE.fastq
Split a file
split -b 200G hash.k2d hash_split #by size
split -l 1000 hash.k2d hash_split #by line number
#last argument here is the prefix to give the new files (no suffix given)
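The pieces can be put back together with cat, because split names them with suffixes (aa, ab, ac, ...) that sort in the original order. A small sketch with a made-up 2500-line file, checking that the rejoined file is identical to the original:

```shell
workdir=$(mktemp -d)
seq 1 2500 > "$workdir/big_file.txt"

# Split into pieces of 1000 lines each; the last argument is the prefix
split -l 1000 "$workdir/big_file.txt" "$workdir/big_file_split_"
n_parts=$(ls "$workdir"/big_file_split_* | wc -l)

# Rejoin: the shell expands the pattern in alphabetical order
cat "$workdir"/big_file_split_* > "$workdir/big_file_rejoined.txt"

# cmp -s is silent and only reports through its exit status
if cmp -s "$workdir/big_file.txt" "$workdir/big_file_rejoined.txt"; then
    rejoined_ok=yes
else
    rejoined_ok=no
fi

echo "parts: $n_parts, identical after rejoining: $rejoined_ok"
rm -rf "$workdir"
```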
Make md5 sums
md5sum opts.k2d taxo.k2d unmapped.txt > kraken2_RefSeqV205_Complete_500GB_2.md5
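The .md5 file is only useful if it gets checked after the files have been copied or downloaded, which `md5sum -c` does. A sketch with toy stand-ins for the database files (the contents and the database.md5 name are made up):

```shell
workdir=$(mktemp -d)
# Toy stand-ins for the real database files
printf 'options\n' > "$workdir/opts.k2d"
printf 'taxonomy\n' > "$workdir/taxo.k2d"

# Write the checksums; running inside the folder keeps the paths relative
(cd "$workdir" && md5sum opts.k2d taxo.k2d > database.md5)

# Verify: prints one "file: OK" line for every file that still matches
check_output=$(cd "$workdir" && md5sum -c database.md5)
echo "$check_output"

# Simulate a corrupted transfer: md5sum -c now exits with a non-zero status
printf 'corrupted\n' >> "$workdir/taxo.k2d"
if (cd "$workdir" && md5sum -c database.md5 > /dev/null 2>&1); then
    corruption_detected=no
else
    corruption_detected=yes
fi

echo "corruption detected: $corruption_detected"
rm -rf "$workdir"
```

Because the file names are stored as written on the command line, the check needs to be run from the same relative location as the files.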
Edit text document with vi
vi $file_name #enter text editor
i #enter insert mode - make any changes
esc #exit insert mode
:x #save changes and exit document
:q #exit document (no changes made)
:q! #exit document without saving changes
BLAST
Make database:
makeblastdb -in TARA_004_DCM_0.22-1.6.16SrRNA.miTAG.fna -dbtype nucl
BLAST:
blastn -db TARA/test_blast/TARA_004_DCM_0.22-1.6.16SrRNA.miTAG.fna -query Bacillus_16S.fna -out bacillus_test.txt -perc_identity 90 -outfmt 6
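The -outfmt 6 output is a tab-separated table without a header line; by default the columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue and bitscore. Here is a sketch of how the table could be filtered further with awk. The two rows are invented purely to have something to run on, and the thresholds and the bacillus_filtered.tsv name are arbitrary choices for the example.

```shell
workdir=$(mktemp -d)
hits="$workdir/bacillus_test.txt"

# Two invented rows in the default -outfmt 6 layout (values are illustrative only)
printf 'Bacillus_16S\tsubjectA\t98.000\t250\t5\t0\t1\t250\t10\t259\t1e-120\t435\n' > "$hits"
printf 'Bacillus_16S\tsubjectB\t90.000\t80\t8\t0\t1\t80\t5\t84\t2e-20\t95.3\n' >> "$hits"

# Keep hits with >= 97% identity (column 3) over >= 200 aligned bases (column 4),
# and put the default column names on the first line
{
    printf 'qseqid\tsseqid\tpident\tlength\tmismatch\tgapopen\tqstart\tqend\tsstart\tsend\tevalue\tbitscore\n'
    awk -F'\t' '$3 >= 97 && $4 >= 200' "$hits"
} > "$workdir/bacillus_filtered.tsv"

# Number of hits kept (lines minus the header) and the subject of the first one
n_hits=$(( $(wc -l < "$workdir/bacillus_filtered.tsv") - 1 ))
first_subject=$(awk -F'\t' 'NR == 2 { print $2 }' "$workdir/bacillus_filtered.tsv")

echo "$n_hits hit(s) kept, first subject: $first_subject"
rm -rf "$workdir"
```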
Barrnap
ssu-align dereplicated_marref_assembly_16S.fasta marref_align_DNA --dna #align
ssu-mask marref_align_DNA --pf 0.001 --pt 0 #mask
ssu-mask -a --stk2afa marref_align_DNA #stockholm > fasta
hmmbuild marref.hmm marref_align_DNA/marref_align_DNA.bacteria.mask.stk #build HMM with bacterial 16S
#looking at identifying 16S
cd tools/
git clone https://github.com/tseemann/barrnap.git
cd barrnap/bin
./barrnap --help
RAxML
conda install -c genomedk raxml-ng
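raxmlHPC -s sequence_file -n run_name -m GTRGAMMA -p 12345 #-n sets the suffix of the output files; -p is the random number seed that raxmlHPC asks for
raxmlHPC -s marref_align_DNA.bacteria.mask.afa -n marref_tree_2 -m GTRGAMMA -p 12345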
raxml-ng --evaluate --msa $REF_MSA --tree $TREE --prefix info --model GTR+G --threads 2
Build and run HMM
#first align sequence file using https://www.ebi.ac.uk/Tools/msa/clustalo/ (choose stockholm alignment)
hmmbuild output_file.hmm input_aligned_sequences.sto
hmmsearch hmm_file.hmm fasta_input.fa > output_file.out