Page Index - soedinglab/MMseqs2 GitHub Wiki

162 page(s) in this GitHub Wiki:

Home
MMseqs2 User Guide
Table of Contents
Summary
System requirements
Check system requirements under Linux
Check system requirements under macOS
Check system requirements under Windows
Installation
Install MMseqs2 for Linux
Install with Homebrew
Install static Linux version
Compile from source under Linux
Compile from source for Linux with GPU support
Install MMseqs2 for macOS
Install with Homebrew
Install static macOS version
Compile from source under macOS
Compiling under Clang
Compiling using GCC
Install MMseqs2 for Windows (preview)
Install static Windows version
Compile from source under Windows
Use the Docker container
Building the Docker container
Set up the Bash/Zsh command completion
Customizing compilation through CMake
Getting started
Usage of MMseqs2 modules
Easy Workflows in MMseqs2
Easy-Search
Easy-Cluster
Easy-Linclust
Using MMseqs2 Workflows and Modules
Downloading databases
Searching
convertalis columns
GPU-accelerated search
Multi-GPU usage
Database larger than GPU memory
Clustering
Linclust
Updating a clustered database
Overview of folders in MMseqs2
Overview of important MMseqs2 Modules
Description of workflows
Batch sequence searching using mmseqs search
Expanded cluster searches
Downloading precomputed expandable profile databases
Searching against expandable profile databases
Using a precomputed index with expandable profile databases
Translated sequence searching
Mapping very similar sequences using mmseqs map
Clustering databases using mmseqs cluster or mmseqs linclust
Clustering criteria
Cascaded clustering
Clustering modes
Linear time clustering using mmseqs linclust
Run Linclust
Updating a clustering database using mmseqs clusterupdate
Taxonomy assignment
Terminology
Creating a seqTaxDB
Filtering a seqTaxDB
The concept of LCA
Using seqTaxDB for taxonomy assignment
Taxonomy output and TSV
Taxonomic ranks
Taxonomy report in Kraken or Krona style
Taxonomy top hit report
Filtering taxonomy output
Taxonomy annotation of search/cluster results
Create a seqTaxDB from an existing BLAST database
Create a seqTaxDB for SILVA
Create a seqTaxDB for GTDB
Create a seqTaxDB by manual annotation of a sequence database
Reciprocal best hit using mmseqs rbh
Behind the scenes
Description of core modules
Computation of prefiltering scores using mmseqs prefilter
Set sensitivity -s parameter
Local alignment of prefiltered sequence pairs using mmseqs align
Clustering sequence database using mmseqs clust
File formats
MMseqs2 database format
Manipulating databases
Sequence database format
Prefiltering format
Alignment format
Internal alignment format
Custom alignment format with convertalis
Clustering format
Internal cluster format
Cluster TSV format
Cluster FASTA-like format
Extract representative sequence
Taxonomy format
Internal taxonomy format
Taxonomy report in Kraken or Krona style
LCA TSV
Profile format
Parameters that affect profile construction
Convert a result database into a profile
Convert an external MSA into a profile
Extract consensus or sequence information from a profile
Convert HHsuite HMMs into a profile
Identifier parsing
Optimizing sensitivity and consumption of resources
Prefiltering module
Memory consumption
Database splitting runtime slowdown
Runtime
Disk space
Important options for tuning the memory, runtime and disk space usage
Alignment module
Memory consumption
Runtime
Disk space
Clustering module
Memory consumption
Runtime
Disk space
Workflows
How to run MMseqs2 on multiple servers using MPI
Write temporary files to local disk when running with MPI
How to run MMseqs2 on multiple servers using batch systems
Frequently Asked Questions
How to set the right alignment coverage to cluster
Bidirectional coverage
Target coverage
Query coverage
How do parameters of CD-HIT relate to MMseqs2
How does MMseqs2 compute the sequence identity
How to restart a search or clustering workflow
How to control the speed of the search
How to find the best hit the fastest way
How does MMseqs2 handle low complexity
How to redundancy filter sequences with identical length and 100% length overlap
How to add sequence identities and other alignment information to a clustering result
How to run external tools for each database entry
How to compute a multiple alignment for each cluster
How to manually cascade cluster
How to cluster using profiles
How to create a HHblits database
How to create a target profile database (from PFAM)
How to cluster a graph given as tsv or m8 file
How to search small query sets fast
What is the difference between the map and search workflow
How to build your own MMseqs2 compatible substitution matrices
How to create a fake prefiltering for all-vs-all alignments
How to compute the lowest common ancestor (LCA) of a given set of sequences
Workflow control parameters
Search workflow
Clustering workflow
Updating workflow
Environment variables used by MMseqs2
External libraries used in MMseqs2
License terms
MMseqs2 Developer Guide
Tutorials