FAQ - BGIGPD/BestPractices4Pathogenomics GitHub Wiki

Frequently Asked Questions

What does Virtual Environment mean?

A virtual environment is an isolated environment created for a single project, with its own set of installed packages. This means that you can have multiple projects on one machine, each with its own dependencies, without any conflicts between them.
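
As a minimal sketch of this isolation, the snippet below uses Python's standard-library `venv` module to create a separate environment inside a project folder. The function name `create_project_env` and the `.venv` directory name are illustrative assumptions, not something this wiki prescribes; the same idea applies to other environment managers such as conda.

```python
import venv
from pathlib import Path


def create_project_env(project_dir: Path) -> Path:
    """Create an isolated environment in <project_dir>/.venv and return its path.

    The name `.venv` is a common convention, assumed here for illustration.
    """
    env_dir = project_dir / ".venv"
    # with_pip=False keeps the sketch fast; pass with_pip=True to have pip
    # installed inside the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir
```

Calling the function for two different project folders yields two independent environments, each with its own `pyvenv.cfg` and package directory, so packages installed in one are not visible in the other.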

What do "local" and "remote" refer to?

Local refers to the machine you are currently using, while remote refers to another machine, typically a server, that you access over a network such as the internet. When you run a command on a remote server, it is executed on that server, not on your local machine.
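
The difference can be sketched in a few lines of Python: the same command is either run directly on the local machine or handed to `ssh`, which executes it on the remote server. The helper names `build_command` and `run` are hypothetical, and the host alias `uomc` is assumed to be defined in your `~/.ssh/config`.

```python
import shlex
import subprocess
from typing import List, Optional


def build_command(command: List[str], remote_host: Optional[str] = None) -> List[str]:
    """Return the argument list that runs `command` locally or on `remote_host`."""
    if remote_host is None:
        # Local: the command is executed by this machine as-is.
        return command
    # Remote: ssh sends the quoted command line to the server, which runs it there.
    return ["ssh", remote_host, shlex.join(command)]


def run(command: List[str], remote_host: Optional[str] = None) -> str:
    """Run the command and return its standard output as text."""
    result = subprocess.run(
        build_command(command, remote_host),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `run(["hostname"])` reports the name of your own machine, whereas `run(["hostname"], remote_host="uomc")` reports the name of the server, provided you can log in to it.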

Why do data manipulation and visualization locally, but run bioinformatics pipelines remotely?

Local environments (e.g., laptops and PCs) allow for more interactive exploration of data. Users can quickly iterate on and adjust their analysis without waiting for a remote server to respond.
In addition, high-resolution visualizations and complex data manipulations can be computationally intensive, and running them locally can be faster than sending data back and forth over a network.

Remote servers (e.g., HPC clusters, ECS, AWS, and other cloud servers), on the other hand, provide access to more computational resources than a local machine, which is essential for large-scale bioinformatics analyses that require significant processing power or storage. Remote pipelines can be accessed by multiple users simultaneously, which benefits collaborative projects where data and results need to be shared. Most importantly, datasets that are too large to store locally can be kept on remote servers, which may also offer better backup and disaster recovery options.

In practice, the choice between local and remote processing often depends on the specific requirements of the project, the available resources, and the preferences of the researchers involved. Sometimes, a hybrid approach is used, where preliminary analysis is done locally and more intensive computations are handled by remote servers.
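
In a hybrid workflow, a common step is to copy a small result file from the remote server to the local machine for plotting. The sketch below only builds the `rsync` command for such a transfer; the function name `build_fetch_command` and the example paths are hypothetical, and the host alias `uomc` is assumed to exist in your `~/.ssh/config`.

```python
import subprocess
from typing import List


def build_fetch_command(remote_host: str, remote_path: str, local_dir: str) -> List[str]:
    """Return an rsync command that copies remote_path from remote_host into local_dir."""
    # -a preserves file attributes and recurses into directories,
    # -z compresses the data while it travels over the network.
    return ["rsync", "-az", f"{remote_host}:{remote_path}", local_dir]


def fetch(remote_host: str, remote_path: str, local_dir: str) -> None:
    """Run the transfer; raises CalledProcessError if rsync fails."""
    subprocess.run(build_fetch_command(remote_host, remote_path, local_dir), check=True)
```

With this in place, something like `fetch("uomc", "results/summary.tsv", "./data/")` would bring a summary table to the laptop, while the raw sequencing data stays on the server.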

What is a bastion host?

A bastion host is a server that acts as a gateway between a private network and the internet. It is used to allow secure access to the private network from the internet.
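
To illustrate the gateway role, OpenSSH can route a connection through a bastion with its `-J` (ProxyJump) option. The sketch below builds such a command; the function name `build_jump_command` and both host names are placeholders, not the actual hosts used in this course.

```python
import subprocess
from typing import List


def build_jump_command(bastion: str, target: str) -> List[str]:
    """Return an ssh command that reaches `target` by first connecting to `bastion`."""
    # -J tells OpenSSH (7.3 or newer) to hop through the bastion host.
    return ["ssh", "-J", bastion, target]


def connect(bastion: str, target: str) -> None:
    """Open an interactive session on the target inside the private network."""
    subprocess.run(build_jump_command(bastion, target), check=True)
```

For instance, `build_jump_command("user@bastion.example.org", "user@worker01.internal")` produces the equivalent of `ssh -J user@bastion.example.org user@worker01.internal`: only the bastion is exposed to the internet, and the internal machine is reached through it.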

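Cannot access UOMC, and the error suggests adding 'HostkeyAlgorithms +ssh-dss' to ~/.ssh/config

Recent OpenSSH clients disable the legacy ssh-dss (DSA) host key algorithm by default, so it has to be re-enabled for this host. Open ~/.ssh/config and add the following entry: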

Host uomc
    HostName uomc-worker01.genomics.cn
    HostkeyAlgorithms +ssh-dss

Then try connecting again, for example with ssh uomc.