ICP1 - Hiresh12/Big-Data-Programming GitHub Wiki
ICP 1:
Topic: Installing Cloudera and visualizing Hadoop data with Hue
Task:
- Install Cloudera
- Load datasets into HDFS
- Append both files
- Visualize the result file with Hue
- Display first and last 5 lines of the result file
- Load a new file and append the data of all 3 datasets
Features:
- Cloudera
- Hadoop
- Hue
Questions:
- Create a new directory BDP:
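
A minimal sketch of this step, assuming the directory is created under the current user's HDFS home directory (a relative path such as `BDP` resolves there). The check for the `hdfs` client is only there so the snippet ends cleanly on a machine without Hadoop.

```shell
# Directory name from the step above. A relative path resolves to the
# current user's HDFS home directory (assumed to exist already).
BDP_DIR="BDP"

if command -v hdfs >/dev/null 2>&1; then
  # -p: create parents as needed and do not fail if BDP already exists
  hdfs dfs -mkdir -p "$BDP_DIR"
  # -d: list the directory entry itself to confirm it was created
  hdfs dfs -ls -d "$BDP_DIR"
else
  echo "hdfs client not found; run this on the Cloudera VM"
fi
```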

- Copy the files to Hadoop HDFS:
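
One way to do the copy is `hdfs dfs -put`. The sketch below assumes two local files named `dataset1.txt` and `dataset2.txt`; these names are placeholders, not the names used in the original run.

```shell
# Hypothetical local file names; substitute the actual dataset files.
FILE1="dataset1.txt"
FILE2="dataset2.txt"

if command -v hdfs >/dev/null 2>&1; then
  # -put accepts several local sources when the destination is a directory
  hdfs dfs -put "$FILE1" "$FILE2" BDP/
  # List the directory to confirm both files arrived
  hdfs dfs -ls BDP
else
  echo "hdfs client not found; run this on the Cloudera VM"
fi
```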

appendToFile:

appendToFile – appends one or more files from the local file system to a destination file in HDFS
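
A sketch of the append, assuming the first dataset is already in HDFS as `BDP/dataset1.txt` and the second is still available locally as `dataset2.txt` (both names are placeholders).

```shell
# Hypothetical names: a local source and a file already stored in HDFS.
LOCAL_SRC="dataset2.txt"
HDFS_DST="BDP/dataset1.txt"

if command -v hdfs >/dev/null 2>&1; then
  # Append the local file's contents to the end of the HDFS file
  hdfs dfs -appendToFile "$LOCAL_SRC" "$HDFS_DST"
  # Print the HDFS file to check that the new lines come last
  hdfs dfs -cat "$HDFS_DST"
else
  echo "hdfs client not found; run this on the Cloudera VM"
fi
```

Passing `-` as the source makes `-appendToFile` read from standard input, which is useful when the data to append is itself in HDFS and is piped in with `hdfs dfs -cat`.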
The files can also be appended locally with the cat command and the output moved to HDFS with the put command.
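
The cat-then-put route could look like the sketch below. The two small files are stand-in data created only so the snippet runs on its own, and `merged.txt` is an assumed name for the result file.

```shell
# Stand-in data so the sketch runs on its own; replace with the real datasets.
WORK_DIR="$(mktemp -d)"
printf 'a1\na2\na3\n' > "$WORK_DIR/dataset1.txt"
printf 'b1\nb2\n' > "$WORK_DIR/dataset2.txt"

# Concatenate locally: the second file's lines follow the first file's lines
cat "$WORK_DIR/dataset1.txt" "$WORK_DIR/dataset2.txt" > "$WORK_DIR/merged.txt"
MERGED_LINES="$(wc -l < "$WORK_DIR/merged.txt" | tr -d ' ')"
echo "merged lines: $MERGED_LINES"

if command -v hdfs >/dev/null 2>&1; then
  # -f overwrites BDP/merged.txt if it already exists
  hdfs dfs -put -f "$WORK_DIR/merged.txt" BDP/merged.txt
else
  echo "hdfs client not found; skipping the upload"
fi
```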

- View the first 5 lines of the merged dataset using appropriate HDFS commands:
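
A sketch of one way to do this: stream the file with `hdfs dfs -cat` and let the local `head` cut it to five lines. `BDP/merged.txt` is an assumed path for the merged dataset.

```shell
# Hypothetical HDFS path of the merged dataset.
MERGED="BDP/merged.txt"

if command -v hdfs >/dev/null 2>&1; then
  # head stops reading after five lines, so only the start of the file is shown
  hdfs dfs -cat "$MERGED" | head -n 5
else
  echo "hdfs client not found; run this on the Cloudera VM"
fi
```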

Output:

- View the last 5 lines of the merged dataset using appropriate HDFS commands:
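
The task list also asks for the last 5 lines of the result file. A sketch of that, again assuming the merged dataset is stored at `BDP/merged.txt`:

```shell
# Hypothetical HDFS path of the merged dataset.
MERGED="BDP/merged.txt"

if command -v hdfs >/dev/null 2>&1; then
  # Stream the whole file and keep only the final five lines
  hdfs dfs -cat "$MERGED" | tail -n 5
else
  echo "hdfs client not found; run this on the Cloudera VM"
fi
```

`hdfs dfs -tail` is a built-in alternative, but it prints the last kilobyte of the file rather than a fixed number of lines.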

Output:

- Create a new text file, load it into HDFS, and try to append all three datasets:
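
One possible reading of this step, sketched under assumptions: the new file is a third dataset (`dataset3.txt`, a placeholder name), and `BDP/merged.txt` already holds the first two datasets, so appending the third gives a file with all three.

```shell
# Hypothetical names: the new local file and the existing merged file in HDFS.
NEW_FILE="dataset3.txt"
MERGED="BDP/merged.txt"

if command -v hdfs >/dev/null 2>&1; then
  # Load the new file into HDFS next to the other datasets
  hdfs dfs -put "$NEW_FILE" BDP/
  # Append it to the file that already contains datasets 1 and 2
  hdfs dfs -appendToFile "$NEW_FILE" "$MERGED"
  # The final lines should now come from the third dataset
  hdfs dfs -cat "$MERGED" | tail -n 5
else
  echo "hdfs client not found; run this on the Cloudera VM"
fi
```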


- Visualize the file with Hue:
