103. AWS EC2 03 - qyjohn/AWS_Tutorials GitHub Wiki
This section covers the various shared storage options, such as NFS, EFS, and distributed file systems.
(1) NFS
Network File System (NFS) is a distributed file system protocol originally developed by Sun Microsystems in 1984, allowing a user on a client computer to access files over a computer network much like local storage is accessed. It is a highly mature technology, and most system administrators with a Unix/Linux background are quite familiar with it.
In this exercise, we will learn how to set up NFS on Ubuntu 16.04. The experimental setup includes one EC2 instance running the NFS server and two EC2 instances acting as NFS clients. To keep things simple, we use the same security group on all of these EC2 instances, allowing all network traffic among EC2 instances in the same security group.
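If you prefer the AWS CLI to the console, the security group described above can be sketched as follows. This is only a sketch, assuming the CLI is configured with credentials and a default region; the group name nfs-lab is an arbitrary choice, and the self-referencing ingress rule is what allows all traffic among members of the group.

```shell
# Create the security group (add --vpc-id if your account has more
# than one VPC; otherwise the default VPC is used).
SG_ID=$(aws ec2 create-security-group \
    --group-name nfs-lab \
    --description "NFS tutorial - all traffic among members" \
    --query 'GroupId' --output text)

# Allow all traffic (protocol -1 means all protocols and ports)
# from other members of the same security group.
aws ec2 authorize-security-group-ingress \
    --group-id "$SG_ID" --protocol -1 --source-group "$SG_ID"
```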
- STEP 1 - Launch an EC2 instance with Ubuntu 16.04, with an additional EBS volume as the NFS server.
Install the necessary software to run the NFS server.
$ sudo apt-get update
$ sudo apt-get install nfs-kernel-server
Format the second EBS volume with Ext4 and mount it as /nfs_share.
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda 202:0 0 8G 0 disk
└─xvda1 202:1 0 8G 0 part /
xvdb 202:16 0 50G 0 disk
$ sudo mkfs.ext4 /dev/xvdb
$ sudo mkdir /nfs_share
$ sudo mount /dev/xvdb /nfs_share
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda 202:0 0 8G 0 disk
└─xvda1 202:1 0 8G 0 part /
xvdb 202:16 0 50G 0 disk /nfs_share
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 489M 0 489M 0% /dev
tmpfs 100M 3.1M 96M 4% /run
/dev/xvda1 7.8G 978M 6.4G 14% /
tmpfs 496M 0 496M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 496M 0 496M 0% /sys/fs/cgroup
tmpfs 100M 0 100M 0% /run/user/1000
/dev/xvdb 50G 52M 47G 1% /nfs_share
$ cd /nfs_share
$ mkdir test
$ sudo chown -R ubuntu:ubuntu test
$ ls
lost+found test
Edit /etc/exports (with sudo) using your favorite editor and add the following line to the file. Note that 172.31.0.0/16 is the CIDR range of your VPC; if the CIDR range of your VPC is different, please use the actual value. Here we are exporting the /nfs_share folder so that any EC2 instance in your VPC can use it.
/nfs_share 172.31.0.0/16(rw,fsid=0,insecure,no_subtree_check,async)
Restart the NFS server with the following command:
$ sudo service nfs-kernel-server restart
You can use the showmount command to view the NFS export:
$ showmount -e localhost
Export list for localhost:
/nfs_share 172.31.0.0/16
Now we will try to mount the NFS shared folder to another mount point on the same EC2 instance. Please replace 172.31.13.51 with the actual private IP address of your NFS server. As you can see, under /nfs_mount you see the same content as in /nfs_share. This proves that the NFS server is working.
$ sudo mkdir /nfs_mount
$ sudo mount 172.31.13.51:/nfs_share /nfs_mount
$ cd /nfs_mount
$ ls
lost+found test
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 489M 0 489M 0% /dev
tmpfs 100M 3.1M 96M 4% /run
/dev/xvda1 7.8G 978M 6.4G 14% /
tmpfs 496M 0 496M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 496M 0 496M 0% /sys/fs/cgroup
tmpfs 100M 0 100M 0% /run/user/1000
/dev/xvdb 50G 52M 47G 1% /nfs_share
172.31.13.51:/nfs_share 50G 52M 47G 1% /nfs_mount
- STEP 2 - Launch two EC2 instances with Ubuntu 16.04 as NFS clients
Now we launch two more EC2 instances acting as NFS clients. SSH into these EC2 instances and run the following commands to mount the NFS shared folder. As you can see, under /nfs_mount you see the same content as in /nfs_share, which proves that the clients can reach the NFS server.
$ sudo apt-get update
$ sudo apt-get install nfs-common
$ sudo mkdir /nfs_mount
$ sudo mount 172.31.13.51:/nfs_share /nfs_mount
$ cd /nfs_mount/
$ ls
lost+found test
At this point, you should use the dd command to read from a big file and write to a big file over the NFS connection, and observe the performance with iostat. Compare the results with the performance of reading from and writing to the same folder locally (/nfs_share) on the NFS server. Also, change the instance types of the NFS server and the NFS client to understand the performance impact of different instance types.
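A minimal sketch of such a test is shown below. The mount point /nfs_mount and the file name bigfile follow the steps above; adjust them if yours differ. The direct I/O flags bypass the client page cache so the numbers reflect actual NFS throughput rather than memory speed.

```shell
# Write a 1 GiB file over NFS; dd prints the achieved throughput.
dd if=/dev/zero of=/nfs_mount/bigfile bs=1M count=1024 oflag=direct

# Read it back, again bypassing the page cache.
dd if=/nfs_mount/bigfile of=/dev/null bs=1M iflag=direct

# In a second terminal, watch per-device utilization while dd runs
# (refresh every 2 seconds, extended statistics, MB units).
iostat -xm 2
```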
- STEP 3 - Other Considerations
Instead of allowing all traffic between your EC2 instances, you might want to allow only NFS traffic between them. This is tricky because the NFS protocol depends on a collection of ports to work. On your NFS server, run the following command to see which ports are being used. As you can see, you will need to open port 111 for both TCP and UDP (portmapper), port 2049 for both TCP and UDP (nfs), plus a set of dynamically assigned ports for both TCP and UDP (mountd, nlockmgr, and status). This makes it difficult to maintain an NFS server behind a stringent firewall.
$ rpcinfo -p
program vers proto port service
100000 4 tcp 111 portmapper
100000 3 tcp 111 portmapper
100000 2 tcp 111 portmapper
100000 4 udp 111 portmapper
100000 3 udp 111 portmapper
100000 2 udp 111 portmapper
100005 1 udp 59115 mountd
100005 1 tcp 42101 mountd
100005 2 udp 43226 mountd
100005 2 tcp 57476 mountd
100005 3 udp 48633 mountd
100005 3 tcp 40737 mountd
100003 2 tcp 2049 nfs
100003 3 tcp 2049 nfs
100003 4 tcp 2049 nfs
100227 2 tcp 2049
100227 3 tcp 2049
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
100003 4 udp 2049 nfs
100227 2 udp 2049
100227 3 udp 2049
100021 1 udp 46383 nlockmgr
100021 3 udp 46383 nlockmgr
100021 4 udp 46383 nlockmgr
100021 1 tcp 35596 nlockmgr
100021 3 tcp 35596 nlockmgr
100021 4 tcp 35596 nlockmgr
100024 1 udp 35448 status
100024 1 tcp 33858 status
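If you do need tight firewall rules, the dynamic ports can be pinned to fixed values on Ubuntu 16.04. The snippet below is a sketch; the port numbers are arbitrary choices, and after editing these files you need to restart nfs-kernel-server (and reboot, or reload the lockd module, for the module options to take effect).

```shell
# In /etc/default/nfs-kernel-server: pin mountd to a fixed port.
RPCMOUNTDOPTS="--manage-gids --port 33333"

# In /etc/default/nfs-common: pin statd's listening and outgoing ports.
STATDOPTS="--port 33334 --outgoing-port 33335"

# In /etc/modprobe.d/lockd.conf: pin the in-kernel lock manager (nlockmgr).
options lockd nlm_udpport=33336 nlm_tcpport=33336
```

With these in place, the security group only needs ports 111, 2049, and the four fixed ports above.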
Running an NFS server also creates a single point of failure in the system. According to AWS's best practice, the #1 design principle is "everything fails, all the time". The EC2 instance running the NFS server, as well as the EBS volume to host the shared data, can fail at any time. As such, hosting your own NFS server is not a highly available solution.
(2) EFS
AWS offers an NFS-compatible managed service: EFS. At the time this tutorial was prepared, the EFS service is only available in the eu-west-1 (Ireland), us-east-1 (N. Virginia), us-east-2 (Ohio), and us-west-2 (Oregon) regions.
In your AWS console, switch to the us-west-2 (Oregon) region and follow this tutorial to create an EFS file system. For the EFS file system, use a new security group with only one inbound rule, allowing TCP access to port 2049 from the CIDR range of your VPC. Then launch an EC2 instance running Ubuntu 16.04 and mount the EFS file system to /efs. As you can see, needing only one port for access simplifies the management of your security group.
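Mounting EFS from Ubuntu looks much like the self-managed NFS case, just with NFSv4.1 and the file system's DNS name. A sketch, where fs-12345678 is a placeholder for your own file system ID; the mount options follow AWS's recommendations for EFS:

```shell
# Ubuntu uses the stock NFSv4.1 client for EFS.
sudo apt-get install -y nfs-common
sudo mkdir -p /efs

# Replace fs-12345678 (and the region) with your file system's DNS name,
# shown in the EFS console.
sudo mount -t nfs4 \
    -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
    fs-12345678.efs.us-west-2.amazonaws.com:/ /efs
```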
From your EC2 instance, use dd and iostat to observe the read and write performance of your EFS file system. Compare the results with your self-managed NFS server.
Assume that you have multiple EC2 instances using the same EFS file system. The EC2 instances might be rebooted from time to time for various reasons, and you would like the EFS file system to be mounted to /mnt automatically when an EC2 instance boots up. This can be achieved by adding an entry to /etc/fstab. You should learn the basics of how /etc/fstab works, then add the entry necessary to mount your EFS file system automatically.
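As a sketch of what such an entry might look like (fs-12345678 and the region are placeholders for your own file system's DNS name; _netdev delays the mount until the network is up at boot):

```shell
# One line to append to /etc/fstab:
# <device>                                   <mount> <type> <options>                                                              <dump> <pass>
fs-12345678.efs.us-west-2.amazonaws.com:/  /mnt  nfs4  nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,_netdev  0  0
```

You can verify the entry without rebooting by running `sudo mount -a` and checking the output of `df -h`.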
(3) Distributed File System
In High Performance Computing (HPC) use cases, there is a need to provide a shared storage with both large capacity and high throughput. This is usually achieved by setting up a distributed file system using the local storage volumes available on all computing nodes. Some of the commonly used distributed file systems in the HPC community include GlusterFS, MooseFS, and XtreemFS (relatively old).
- MooseFS
- Distributed File System on Amazon Linux — MooseFS
- GlusterFS
- Distributed File System on Amazon Linux — GlusterFS
- XtreemFS
- Distributed File System on Amazon Linux — XtreemFS
You should pick one of the above-mentioned distributed file systems and go through the setup and configuration process yourself. Use the latest version of the operating system and the latest version of the software, and produce an installation and configuration document when you are done.
Troubleshooting Exercises
- You are running an application on a Linux machine, and you have a feeling that the application is running a little bit slow. Treating the application as a black box, how would you determine why it is slow, using the various commands available on Linux?
- You delete a huge file using the rm command, hoping to release some disk space. Before running rm you use df to check the available disk space, and you do the same after the file is deleted. You notice that although the file has been deleted, you are not getting the disk space back. What might be happening?
- Your application writes a large number of small files into a directory. It works fine at the beginning, but gradually fails to write to disk. What might be happening?
- You maintain a service (it can be a web server powered by Apache or Nginx, or a database server powered by MySQL or PostgreSQL), but the service daemon keeps failing. What might be happening?