Methodology - BGIGPD/BestPractices4Pathogenomics GitHub Wiki
A wiki guide that includes:
- Content / Schedule of this course
- A rapid introduction to describe core concepts
- A step-by-step guide that walks you through practicing the analysis
A git repository where:
- We prepare the necessary scripts in advance for you
- The repository acts as a demo project for practicing bioinformatics analysis
- Install or load environments with the necessary packages
- Organize and store dataset
- Edit and execute scripts
- Commit changes to code
- Push and pull changes to keep your work up to date
- Sync and share your work with your team
Shells are command-line interpreters that provide a way for users to interact with the operating system. They are used to execute commands, run programs, and manipulate files and directories.
Here are the most popular shells:
- Bash (Bourne Again SHell): The most commonly used shell in Linux and Unix-like operating systems.
- Zsh (Z Shell): A more advanced shell with features like syntax highlighting, command completion, and themes.
- Fish (Friendly Interactive Shell): A user-friendly shell with features like syntax highlighting, command completion, and auto-suggestions.
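Before going further, it can help to know which of these shells you are actually using. The following is a minimal sketch, assuming a Linux or macOS terminal; the function name `describe_shell` is made up for this illustration.

```shell
# Print basic facts about the shell in use (describe_shell is a hypothetical helper)
describe_shell() {
    # Login shell recorded for your account
    echo "Login shell: ${SHELL:-unknown}"
    # BASH_VERSION is only set when the running interpreter is Bash
    echo "Bash version: ${BASH_VERSION:-not running under bash}"
    # Many systems list the installed shells in /etc/shells
    if [ -r /etc/shells ]; then
        echo "Installed shells:"
        cat /etc/shells
    fi
}

describe_shell
```

If the second line reports that you are not running under Bash, typing `bash` usually starts a Bash session inside the current terminal.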
To standardize teaching and keep code execution and exercises consistent, I recommend that everyone use the Bash shell on our bastion host. Here is how to log in:
From local terminal:
ssh uomc-worker01.genomics.cn
Using Xshell:
For more details, please check our guide on How to use bastion host.
The following are some extra guidelines for working without our bastion server.
For Linux and macOS users, Bash is already installed on your system. Simply open Terminal and you are ready to go (on recent macOS versions, where Zsh is the default shell, type bash to switch).
For Windows users, getting Bash takes a little more work, but you have several options:
- Install a Linux distribution on your Windows machine, such as Ubuntu or Fedora.
- Use a terminal emulator that supports Bash, such as Git Bash or Cygwin.
- [recommended] Windows Subsystem for Linux (WSL): A feature of Windows 10 and later that allows you to run a Linux distribution directly on your Windows machine.
More details to learn shell:
Conda is a popular package and environment management system used primarily for installing software packages, managing dependencies, and creating isolated environments.
If you have struggled with installing packages manually, conda is worth trying.
The easy way:
bash /home/fangchao/Miniconda.sh
The hard way:
- Search for conda
- Find the installation guide
- Download and install
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh
If the installation succeeds, your terminal should display messages like the following:
[fangchao@localhost ~]$ bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
PREFIX=/home/fangchao/miniconda3
Unpacking payload ...
Installing base environment...
Preparing transaction: ...working... done
Executing transaction: ...working... done
installation finished.
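Once the installer has finished, conda still has to be made available in your shell. The sketch below assumes the `~/miniconda3` prefix used in the commands above; the environment name `demo-env` is a placeholder invented for this example, not something the course requires.

```shell
CONDA_DIR="$HOME/miniconda3"   # prefix passed to the installer with -p
ENV_NAME="demo-env"            # hypothetical environment name, pick your own

if [ -x "$CONDA_DIR/bin/conda" ]; then
    # Load the conda shell function into the current session
    . "$CONDA_DIR/etc/profile.d/conda.sh"
    conda --version
    # Register conda in ~/.bashrc so that new terminals find it automatically
    conda init bash
    # Create an empty isolated environment and switch to it
    conda create --yes --name "$ENV_NAME"
    conda activate "$ENV_NAME"
    conda env list
else
    echo "conda not found under $CONDA_DIR; run the installer first"
fi
```

After `conda init bash`, open a new terminal (or run `source ~/.bashrc`) so that the change takes effect in later sessions.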
Git is a widely used distributed version control system for tracking changes in source code during software development. It is designed to handle everything from small to very large projects with speed and efficiency.
After installing conda and activating the base environment, git should already be available.
Before using git for the first time, you need to configure your author information so that we can distinguish your work from others'.
git config --global user.name "YourName"    # Replace YourName with your own name
git config --global user.email "YourEmail"  # Replace YourEmail with your own email
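To confirm that the author information was stored, you can read the values back. This is a small sketch; the helper name `show_git_identity` is made up for illustration.

```shell
# Print the globally configured identity, or a hint when a value is missing
show_git_identity() {
    for key in user.name user.email; do
        # --get exits non-zero when the key is unset, so fall back to empty
        value="$(git config --global --get "$key" || true)"
        echo "$key = ${value:-(not set)}"
    done
}

show_git_identity
```

If either line shows `(not set)`, rerun the matching `git config --global` command with your own value.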
Go to the folder where you want to store the course repository. We assume the default home folder /home/<yourname> (also known as $HOME or ~), which is where you land once you log in to the bastion server.
Then run the following command:
git clone https://github.com/BGIGPD/BestPractices4Pathogenomics.git
You should see the following output if nothing goes wrong:
Cloning into 'BestPractices4Pathogenomics'...
remote: Enumerating objects: 6, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 6 (delta 0), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (6/6), done.
After cloning the repository, you should see a new folder named BestPractices4Pathogenomics
in your current directory.
Go into it:
cd BestPractices4Pathogenomics
Create a new branch for your own work.
IMPORTANT: Replace YourName
with your own name or any other name you want to use.
git checkout -b YourName
You should see the following output if nothing goes wrong:
Switched to a new branch 'YourName'
Create a new folder and a README file in it.
mkdir YourName
Then enter this folder:
cd YourName
Create and edit README.md using vi (on many systems vi launches vim; learn more basic usage here):
vi README.md
In the vi editor, press 'i' first to enter --INSERT-- mode.
Add a Markdown h1 title "About this project", write something about your work, then save it. Here is an example:
# About this project
This is my work.
To save your content, press 'ESC' to leave --INSERT-- mode, then type:
:wq
to save and quit the vi editor. Here w means write (save) and q means quit.
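If vi feels uncomfortable at first, the same file can also be written without an editor. The sketch below uses a here-document and assumes you are inside the folder you just created; it overwrites any existing README.md in the current directory.

```shell
# Write the example content to README.md in the current directory
cat > README.md <<'EOF'
# About this project
This is my work.
EOF

# Show what was written
cat README.md
```

The quoted `'EOF'` marker keeps the shell from expanding anything inside the text, so the content is written exactly as typed.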
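Then check, stage, commit, and push your work:
git status                                # the new content shows up as untracked
git add README.md                         # stage the file
git status                                # README.md is now staged for commit
git commit -m 'My work begin' README.md   # record the change
git push origin YourName                  # upload your branch; replace YourName with your branch name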
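The same branch, add, commit, and push sequence can be rehearsed safely in a throwaway sandbox before touching the course repository. This is only a sketch under stated assumptions: a local bare repository stands in for GitHub, and the names `DemoName`, `Demo User`, and `demo@example.com` are placeholders invented for the example.

```shell
# Throwaway sandbox; nothing here touches the course repository
SANDBOX="$(mktemp -d)"

# A bare repository on local disk plays the role of the remote host
git init --quiet --bare "$SANDBOX/remote.git"
git clone --quiet "$SANDBOX/remote.git" "$SANDBOX/work"
cd "$SANDBOX/work"

# Identity stored only in this sandbox repository (placeholder values)
git config user.name "Demo User"
git config user.email "demo@example.com"

# Same steps as in the course workflow
git checkout --quiet -b DemoName
mkdir DemoName
printf '# About this project\nThis is my work.\n' > DemoName/README.md
git status --short                 # the new folder is reported as untracked
git add DemoName/README.md
git status --short                 # the file is now reported as staged
git commit --quiet -m 'My work begin'
git push --quiet origin DemoName

# List the branches that now exist on the stand-in remote
git ls-remote --heads origin
```

When you are done experimenting, leave the directory and delete the sandbox with `rm -rf "$SANDBOX"`.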
Well done! You have now walked through a git-style version control workflow.
R (The R Project for Statistical Computing) is a programming language and environment commonly used for statistical computing, data analysis, and visualization. It is widely used by statisticians, data miners, and researchers for a variety of tasks involving data manipulation and analysis.
Since displaying images on a remote server is often complex and inconvenient, we recommend installing R on your local system (laptop).
Find the download page, select a mirror close to your location, and download the proper version of R for your operating system.
Go to the RStudio website (RStudio) and download the proper version of RStudio for your operating system.
Method1: Temporarily use another git code host
git remote add gitea https://gitea.biochao.cc/fangchao/BestPractices4Pathogenomics.git
git push gitea YourName
Replace YourName with the branch name you chose.
Method2: For this demo, I worked out an easy (also temporary) way for us to do so.
Copy the RSA key from the bastion host to your local machine, under ~/.ssh/.
Edit ~/.ssh/config:
Host github.com
    HostName github.com
    IdentityFile ~/.ssh/bot4demo_id_rsa
Please remove the above content from ~/.ssh/config after this course or after you register your own GitHub account. Otherwise it will affect your future work.
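OpenSSH generally refuses to use a private key that other users on the machine can read, so it is worth tightening the permissions after copying the key. A minimal sketch, assuming the key file name from the config above; the helper name `secure_private_key` is made up for this example.

```shell
# Restrict a private key and its directory to the owner only
secure_private_key() {
    key_path="$1"
    chmod 700 "$(dirname "$key_path")"   # directory: owner only
    chmod 600 "$key_path"                # key file: owner read/write only
}

KEY="$HOME/.ssh/bot4demo_id_rsa"   # file name taken from the config above
if [ -f "$KEY" ]; then
    secure_private_key "$KEY"
    ls -l "$KEY"
fi
```

If the key has not been copied yet, the block does nothing; run it again once the file is in place.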
git remote set-url origin git@github.com:BGIGPD/BestPractices4Pathogenomics.git
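To see what switching the remote URL does, the change can be tried on an empty scratch repository first. This is a sketch, not part of the course steps: `DEMO_REPO` is a temporary directory created only for the illustration, and no network access is involved.

```shell
# Scratch repository used only to illustrate the URL switch
DEMO_REPO="$(mktemp -d)"
git init --quiet "$DEMO_REPO"

# Start with the HTTPS address used by git clone
git -C "$DEMO_REPO" remote add origin https://github.com/BGIGPD/BestPractices4Pathogenomics.git
git -C "$DEMO_REPO" remote get-url origin

# Point the same remote at the SSH address instead
git -C "$DEMO_REPO" remote set-url origin git@github.com:BGIGPD/BestPractices4Pathogenomics.git
git -C "$DEMO_REPO" remote get-url origin
```

In your real clone, `git remote -v` shows the fetch and push addresses currently in use, which is a quick way to confirm the switch before pushing again.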
Go back to section 3.5 and try again.
Read the following sections from the Git Book and try them out!
- Getting Started
1.2 A Short History of Git
1.3 What is Git?
1.4 The Command Line
1.5 Installing Git
1.6 First-Time Git Setup
1.7 Getting Help
1.8 Summary
- Git Basics
2.1 Getting a Git Repository
2.2 Recording Changes to the Repository
2.3 Viewing the Commit History
2.4 Undoing Things
2.5 Working with Remotes