EMR 017 Rescue HDFS Data - qyjohn/AWS_Tutorials GitHub Wiki

Suppose you have misconfigured your EMR cluster to the point that you can no longer start the HDFS namenode process on the master node. You can launch an identical EMR cluster to rescue your data on HDFS. Assuming that there is one data EBS volume on each node, and that the data in the namenode and datanode folders is still intact, you can do the following:

  • Launch a new EMR cluster with an identical configuration (in the same availability zone). Conceptually, create a 1:1 mapping between the nodes of the two clusters. For example, the master node in each cluster is called Node-0, the first core node in each cluster is called Node-1, the second core node in each cluster is called Node-2, and so on.

  • On the original cluster, for each node, create a snapshot of the data EBS volume, then create a new EBS volume from the snapshot (in the same availability zone). Attach the new EBS volume to the corresponding node in the new EMR cluster as /dev/sdf. For example:

Node-0 (old cluster) -- Volume-0 -> Snapshot-0 -> new Volume-0 -> Node-0 (new cluster, /dev/sdf)
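The snapshot-and-attach step can be sketched with the AWS CLI. The script below is a dry run that only echoes the commands it would issue; all volume, snapshot, and instance IDs are hypothetical placeholders that you would substitute with the real ones from your two clusters (and in practice you would capture the new volume ID from the create-volume output before attaching):

```shell
#!/bin/sh
# Dry-run sketch: for each node pair, snapshot the old node's data volume,
# restore it as a new volume, and attach it to the matching new node as
# /dev/sdf. Remove the leading "echo" on each aws command to execute.
AZ="us-east-1a"   # availability zone shared by both clusters (assumption)

# old_volume_id:new_instance_id pairs, one per node (hypothetical IDs)
for pair in vol-0aaa:i-0bbb vol-0ccc:i-0ddd; do
  old_vol="${pair%%:*}"
  new_inst="${pair##*:}"
  echo aws ec2 create-snapshot --volume-id "$old_vol"
  # ...wait for the snapshot to complete, then (snapshot and new volume IDs
  # below are placeholders for the IDs returned by the previous calls):
  echo aws ec2 create-volume --snapshot-id snap-0eee --availability-zone "$AZ"
  echo aws ec2 attach-volume --volume-id vol-0fff --instance-id "$new_inst" --device /dev/sdf
done
```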
  • SSH into the master node on the new EMR cluster and do the following:
sudo initctl stop hadoop-hdfs-namenode
sudo mkdir /mnt2
sudo mount /dev/nvme2n1p2 /mnt2
sudo rm /mnt2/namenode/in_use.lock

Edit /etc/hadoop/conf/hdfs-site.xml with the following changes:

  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///mnt2/namenode</value>
  </property>

  <property>
    <name>dfs.name.dir</name>
    <value>/mnt2/namenode</value>
  </property>
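The hdfs-site.xml edit above could also be scripted. Below is a sketch using sed against a local demo copy of the file; on the real master node the target would be /etc/hadoop/conf/hdfs-site.xml (edited with sudo, ideally after taking a backup), and it assumes the `<value>` line immediately follows the `<name>` line as in the standard layout:

```shell
#!/bin/sh
# Sketch: point dfs.namenode.name.dir at the rescued volume with sed.
# Works on a demo copy here; substitute the real config path on the cluster.
CONF=hdfs-site-demo.xml
cat > "$CONF" <<'EOF'
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///mnt/namenode</value>
  </property>
EOF
# On the line after the property name, rewrite the value.
sed -i '/<name>dfs.namenode.name.dir<\/name>/{n;s|<value>.*</value>|<value>file:///mnt2/namenode</value>|;}' "$CONF"
grep '<value>' "$CONF"
```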
  • SSH into both core nodes on the new EMR cluster and do the following:
sudo initctl stop hadoop-hdfs-datanode
sudo mkdir /mnt2
sudo mount /dev/nvme2n1p2 /mnt2
sudo rm /mnt2/hdfs/in_use.lock

Edit /etc/hadoop/conf/hdfs-site.xml with the following changes:

  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///mnt2/hdfs</value>
  </property>

  <property>
    <name>dfs.data.dir</name>
    <value>/mnt2/hdfs</value>
  </property>
  • On the master node, start the HDFS namenode:
sudo initctl start hadoop-hdfs-namenode
  • On the core nodes, start the HDFS datanode:
sudo initctl start hadoop-hdfs-datanode

At this point, the old data should be available on the new EMR cluster.
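One quick sanity check is to confirm that the rescued namenode storage directory actually contains an fsimage before trusting the cluster with the data. The helper below is an illustrative sketch (the function name and messages are not part of any Hadoop tooling); on the new master the directory would be /mnt2/namenode:

```shell
#!/bin/sh
# Sketch: verify that a namenode storage directory contains an fsimage file.
# The path is a parameter so the check can be tried anywhere.
check_namenode_dir() {
  if ls "$1"/current/fsimage* >/dev/null 2>&1; then
    echo "ok: fsimage present in $1/current"
  else
    echo "missing: no fsimage in $1/current -- do not start the namenode"
  fi
}

check_namenode_dir /mnt2/namenode
```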
