ICP 1 - bhargavi1411/BigDataProgramming GitHub Wiki

Name: Bhargavi Saipoojitha Chennupati

Class ID: 12

Topic: Cloudera (Configuration / Hue)

Task: Cloudera installation and visualization of Hadoop files with Hue

Datasets: shakespeare.txt, word_list.txt

We performed this ICP task on Cloudera, using the given datasets shakespeare.txt and word_list.txt. First, we created an HDFS directory named hadoopfiles using the following command:

hdfs dfs -mkdir /hadoopfiles/
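Running the mkdir command a second time fails because the directory already exists. A minimal sketch of a rerunnable version is shown here; the function name ensure_hdfs_dir is hypothetical and not part of the original task, while `-test -d` and `-mkdir -p` are standard `hdfs dfs` options.

```shell
# Hypothetical helper (not part of the original ICP): create an HDFS
# directory only when it is missing, so the step can be rerun safely.
# `hdfs dfs -test -d PATH` exits 0 when PATH exists and is a directory.
# `hdfs dfs -mkdir -p PATH` also creates any missing parent directories.
ensure_hdfs_dir() {
    dir="$1"
    if hdfs dfs -test -d "$dir"; then
        echo "exists: $dir"
    else
        hdfs dfs -mkdir -p "$dir" && echo "created: $dir"
    fi
}

# Usage (needs a running HDFS):
#   ensure_hdfs_dir /hadoopfiles
```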

To load the files shakespeare.txt and word_list.txt into HDFS, we used the following commands:

hdfs dfs -copyFromLocal Desktop/ICP1/shakespeare.txt /hadoopfiles/

hdfs dfs -copyFromLocal Desktop/ICP1/word_list.txt /hadoopfiles/
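The upload step can also be wrapped so that several files are loaded in one call and the result is listed afterwards. This is only a sketch: the function name upload_to_hdfs is hypothetical, and the `-f` option (overwrite an existing copy) is an addition that the original commands did not use.

```shell
# Hypothetical helper: upload every local file given after the target
# directory, then list the directory to confirm the files arrived.
# -f overwrites an existing HDFS copy (an assumption added here so the
# function can be rerun; the original commands did not use it).
upload_to_hdfs() {
    target="$1"
    shift
    for f in "$@"; do
        hdfs dfs -copyFromLocal -f "$f" "$target" || return 1
    done
    hdfs dfs -ls "$target"
}

# Usage (needs a running HDFS):
#   upload_to_hdfs /hadoopfiles Desktop/ICP1/shakespeare.txt Desktop/ICP1/word_list.txt
```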

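Next, we concatenated the two files and wrote the result to a new file, output.txt. The -cat command streams both files in order, and -put with - as the source reads that stream from standard input:

hdfs dfs -cat /hadoopfiles/shakespeare.txt /hadoopfiles/word_list.txt | hdfs dfs -put - /hadoopfiles/output.txt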
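The same pipeline can be generalized to any number of input files. The sketch below assumes a hypothetical function name, concat_in_hdfs, and adds `-f` to `-put` so that an existing destination file is overwritten instead of causing an error; the original command did not overwrite.

```shell
# Hypothetical helper: concatenate HDFS files in the order given and
# write the result to a single HDFS destination file.
# First argument: destination path. Remaining arguments: source paths.
concat_in_hdfs() {
    dest="$1"
    shift
    # -cat streams the sources in order; "-put -f -" reads that stream
    # from standard input and overwrites dest if it already exists.
    hdfs dfs -cat "$@" | hdfs dfs -put -f - "$dest"
}

# Usage (needs a running HDFS):
#   concat_in_hdfs /hadoopfiles/output.txt /hadoopfiles/shakespeare.txt /hadoopfiles/word_list.txt
```

Note that the data passes through the client machine on its way back to HDFS, which is fine for small text files like these.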

To read the first 10 lines of the output file produced by concatenating the two files, we used the following command:

hdfs dfs -cat /hadoopfiles/output.txt | head -10

Similarly, to read the last 10 lines, we used the following command:

hdfs dfs -cat /hadoopfiles/output.txt | tail -10
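Both checks can be combined into one small function. This is a sketch under assumptions: the name peek_hdfs and the default of 10 lines are hypothetical choices, and the file is streamed twice, which is acceptable for small files.

```shell
# Hypothetical helper: print the first and last N lines of an HDFS file.
# First argument: HDFS path. Second argument: line count (default 10).
peek_hdfs() {
    path="$1"
    n="${2:-10}"
    echo "== first $n lines of $path =="
    hdfs dfs -cat "$path" | head -n "$n"
    echo "== last $n lines of $path =="
    hdfs dfs -cat "$path" | tail -n "$n"
}

# Usage (needs a running HDFS):
#   peek_hdfs /hadoopfiles/output.txt 10
```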

Next, we had to create a third file and append it to the first two. For that, we created a file named sample.txt containing some text, and loaded it into the hadoopfiles directory using the following command:

hdfs dfs -copyFromLocal Desktop/ICP1/sample.txt /hadoopfiles

We then concatenated sample.txt with the other two files and saved the result into a new file, output2.txt, using the following command:

hdfs dfs -cat /hadoopfiles/shakespeare.txt /hadoopfiles/word_list.txt /hadoopfiles/sample.txt | hdfs dfs -put - /hadoopfiles/output2.txt

To check that the three files were concatenated successfully, we can display the first 5 and last 5 lines of output2.txt using the commands:

hdfs dfs -cat /hadoopfiles/output2.txt | head -5

hdfs dfs -cat /hadoopfiles/output2.txt | tail -5
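Looking at the first and last lines only confirms the two ends of the file. A stricter check, sketched here with hypothetical function names, compares line counts: because concatenation preserves every byte, the number of newlines in the combined file should equal the sum over its parts.

```shell
# Hypothetical helper: count the lines of an HDFS file.
# tr removes the padding that some wc implementations add.
hdfs_line_count() {
    hdfs dfs -cat "$1" | wc -l | tr -d ' '
}

# Hypothetical helper: check that a combined file has as many lines as
# its parts together. First argument: combined file. Rest: the parts.
verify_concat() {
    combined="$1"
    shift
    expected=0
    for part in "$@"; do
        expected=$((expected + $(hdfs_line_count "$part")))
    done
    actual=$(hdfs_line_count "$combined")
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $combined has $actual lines"
    else
        echo "MISMATCH: expected $expected lines, found $actual"
        return 1
    fi
}

# Usage (needs a running HDFS; pass the combined file first, then the
# files that were concatenated into it):
#   verify_concat /hadoopfiles/output.txt /hadoopfiles/shakespeare.txt /hadoopfiles/word_list.txt
```

A matching count does not prove the order of the parts, so it complements the head and tail checks and does not replace them.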

Finally, we can visualize the files using Hue.

Browsing to the hadoopfiles folder in Hue shows all the files we created.