GlusterFS - shawfdong/hyades GitHub Wiki

GlusterFS is a scale-out network-attached storage file system. It aggregates various storage servers over Ethernet or InfiniBand RDMA interconnects into one large parallel network file system. By default, files are stored whole, but striping of files across multiple remote volumes is also supported. The final volume may then be mounted by the client host using its own native protocol via the FUSE mechanism, mounted with the NFSv3 protocol using a built-in server translator, or accessed via the gfapi client library. Native-protocol mounts may then be re-exported, e.g. via the kernel NFSv4 server, Samba, or the object-based OpenStack Storage (Swift) protocol using the "UFO" (Unified File and Object) translator.

We'll test GlusterFS by creating a virtual cluster with VirtualBox on my HP xw8600 workstation.


Networking

We use a VirtualBox Host-Only Network for the cluster[1]. The subnet is 192.168.56.0/24, and the DHCP server is configured as follows:

Server Address: 192.168.56.100
Server Mask: 255.255.255.0
Lower Address Bound: 192.168.56.101
Upper Address Bound: 192.168.56.254

Note that the VirtualBox NAT network mode won't work, because all VMs would be assigned the same IP address (e.g., 10.0.2.15)!
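The same host-only network and DHCP server can also be set up from the command line with VBoxManage. A minimal sketch, run on the host; the interface name vboxnet0 and the host address 192.168.56.1 are assumptions:

$ VBoxManage hostonlyif create
$ VBoxManage hostonlyif ipconfig vboxnet0 --ip 192.168.56.1 --netmask 255.255.255.0
$ VBoxManage dhcpserver add --ifname vboxnet0 --ip 192.168.56.100 --netmask 255.255.255.0 --lowerip 192.168.56.101 --upperip 192.168.56.254 --enable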

GlusterFS repos

We run CentOS 6 on both the server and client VMs.

Install the EPEL repo[2][3]:

# yum install epel-release

However, EPEL no longer provides glusterfs-server, so install the glusterfs-epel repo as well:

# wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/glusterfs-epel.repo
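A quick sanity check that the new repository is enabled:

# yum repolist enabled | grep -i gluster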

GlusterFS servers

There are 4 servers in the virtual cluster.

Install the glusterfs-server package on one VM:

# yum install glusterfs-server
# chkconfig glusterd on

Shutdown the VM:

# halt -p

Make 3 clones of the VM. For each clone, select the Full Clone type, and check "Reinitialize the MAC address of all network cards".
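Cloning can also be scripted with VBoxManage; a rough sketch, assuming the original VM is registered as c6-1 (clonevm reinitializes MAC addresses by default unless --options keepallmacs is passed):

$ VBoxManage clonevm c6-1 --name c6-2 --register
$ VBoxManage clonevm c6-1 --name c6-3 --register
$ VBoxManage clonevm c6-1 --name c6-4 --register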

Modify the following files on each clone (a sample network configuration is sketched after the /etc/hosts entries below):

  • /etc/udev/rules.d/70-persistent-net.rules
  • /etc/sysconfig/network
  • /etc/sysconfig/network-scripts/ifcfg-eth0
  • /etc/sysconfig/network-scripts/ifcfg-eth1

Also add the following lines to /etc/hosts:

192.168.56.11 c6-1.local c6-1
192.168.56.12 c6-2.local c6-2
192.168.56.13 c6-3.local c6-3
192.168.56.14 c6-4.local c6-4
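For example, on c6-2 the host-only interface could be configured with its static address like this (a sketch only; which of eth0/eth1 is the host-only adapter, and the exact HWADDR, depend on your VM setup):

# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.168.56.12
NETMASK=255.255.255.0

# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=c6-2.local

On each clone, the stale udev rules can simply be deleted so that they are regenerated with the new MAC addresses at the next boot:

# rm -f /etc/udev/rules.d/70-persistent-net.rules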

Add an 8 GB second drive (/dev/sdb) to each server.
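With the VMs powered off, the disk can be created and attached with VBoxManage; a sketch, assuming the VM name c6-1 and a SATA controller named "SATA":

$ VBoxManage createhd --filename c6-1-brick.vdi --size 8192
$ VBoxManage storageattach c6-1 --storagectl "SATA" --port 1 --device 0 --type hdd --medium c6-1-brick.vdi

Boot the server and partition the new drive: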

# fdisk /dev/sdb

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1044, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-1044, default 1044):
Using default value 1044

Command (m for help): p

Disk /dev/sdb: 8589 MB, 8589934592 bytes
255 heads, 63 sectors/track, 1044 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x91f128b2

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1        1044     8385898+  83  Linux


Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Create an XFS file system on the second drive:

# mkfs.xfs -i size=512 /dev/sdb1
meta-data=/dev/sdb1              isize=512    agcount=4, agsize=524119 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=2096474, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

# xfs_admin -L brick /dev/sdb1
writing all SBs
new label = "brick"

# mkdir /brick

Add the following line to /etc/fstab:

/dev/sdb1               /brick                  xfs     defaults        1 2
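Because the file system was labeled brick above, LABEL=brick could be used in the fstab entry instead of /dev/sdb1. Mount it and verify:

# mount /brick
# df -h /brick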

Let's disable iptables so that the GlusterFS ports are not blocked between the VMs:

# chkconfig iptables off

Reboot all the servers.

Adding Servers to Trusted Storage Pool
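The probes below are run from c6-1 and require the glusterd management daemon to be running on every server; a quick check (SysV init script shipped with the glusterfs-server package):

# service glusterd status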

[root@c6-1 ~]# gluster peer probe c6-2
peer probe: success.
[root@c6-1 ~]# gluster peer probe c6-3
peer probe: success.
[root@c6-1 ~]# gluster peer probe c6-4
peer probe: success.

Verifying the peer status

On c6-1:

[root@c6-1 ~]# gluster peer status
Number of Peers: 3

Hostname: c6-2
Uuid: e02a3b8f-5d6d-4cfe-a162-9c44c0967b5e
State: Peer in Cluster (Connected)

Hostname: c6-3
Uuid: 7ce91fee-ccb5-4f4e-a5b6-928aad1c49e7
State: Peer in Cluster (Connected)

Hostname: c6-4
Uuid: 4fe2f9df-36d3-4bcb-870c-372c3dc8aaf7
State: Peer in Cluster (Connected)

On c6-3:

[root@c6-3 ~]# gluster peer status
Number of Peers: 3

Hostname: c6-1.local
Uuid: e4253080-7d4c-47e1-ba55-715fb311aa3c
State: Peer in Cluster (Connected)

Hostname: c6-2
Uuid: e02a3b8f-5d6d-4cfe-a162-9c44c0967b5e
State: Peer in Cluster (Connected)

Hostname: c6-4
Uuid: 4fe2f9df-36d3-4bcb-870c-372c3dc8aaf7
State: Peer in Cluster (Connected)

Creating a Distributed-Replicated Volume

# gluster volume create gfs replica 2 c6-1:/brick c6-2:/brick c6-3:/brick c6-4:/brick
volume create: gfs: failed: The brick c6-1:/brick is a mount point. Please create a sub-directory under the mount point and use that as the brick directory. Or use 'force' at the end of the command if you want to override this behavior.

Let's create a sub-directory under the mount point on each server:

# mkdir /brick/gfs

Try to create a replicated volume again:

[root@c6-1 ~]# gluster volume create gfs replica 2 c6-1:/brick/gfs c6-2:/brick/gfs c6-3:/brick/gfs c6-4:/brick/gfs
volume create: gfs: success: please start the volume to access data

[root@c6-1 ~]# gluster volume start gfs
volume start: gfs: success

[root@c6-1 ~]# gluster volume info

Volume Name: gfs
Type: Distributed-Replicate
Volume ID: d3e62a2a-dc8f-4649-84a9-9d7407a3243a
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: c6-1:/brick/gfs
Brick2: c6-2:/brick/gfs
Brick3: c6-3:/brick/gfs
Brick4: c6-4:/brick/gfs
Options Reconfigured:
performance.readdir-ahead: on
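To confirm that a brick process is online on each server, check the volume status (output omitted here):

[root@c6-1 ~]# gluster volume status gfs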

GlusterFS clients

Native GlusterFS client

# yum install epel-release

# wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/glusterfs-epel.repo

# yum install glusterfs-client

# mkdir /gfs

# mount -t glusterfs c6-1:/gfs /gfs

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_c60-lv_root
                       14G  826M   13G   7% /
tmpfs                 1.9G     0  1.9G   0% /dev/shm
/dev/sda1             477M   57M  395M  13% /boot
c6-1:/gfs              16G   65M   16G   1% /gfs
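A quick way to see the 2-way replication at work is to create a file on the client (the file name hello is arbitrary):

# touch /gfs/hello

With replica 2, the file should then appear under /brick/gfs on exactly two of the four servers:

# ls /brick/gfs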

NFS client

Install nfs-utils:

[root@c6-0 ~]# yum install nfs-utils

Make sure nfslock is running, then mount the GlusterFS volume using the NFSv3 protocol:

[root@c6-0 ~]# /etc/init.d/nfslock status
rpc.statd (pid  1220) is running...
[root@c6-0 ~]# mount -t nfs -o tcp c6-2:/gfs /nfs
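As with the native mount, the mount point must already exist on the client; create it first if necessary:

# mkdir /nfs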

You can even mount the same GlusterFS volume simultaneously via the native protocol:

[root@c6-0 ~]# mount -t glusterfs c6-4:/gfs /gfs

References

  1. VirtualBox - Virtual networking
  2. High-Availability Storage With GlusterFS 3.2.x On CentOS 6.3 - Automatic File Replication (Mirror) Across Two Storage Servers
  3. Introduction to GlusterFS (File System) and Installation on RHEL/CentOS and Fedora