ICP 1 - Murarishetti-Shiva-Kumar/Big-Data-Programming GitHub Wiki
Lesson 1: Cloudera (configuration/Hue)
Use the given datasets (shakespeare.txt, word_list.txt) and load them into Hadoop HDFS
Creating directory:
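A minimal sketch of this step, assuming the `hdfs` client is available on the PATH of the Cloudera VM. The function name `make_hdfs_dir` and the example path are illustrative and do not come from the original page.

```shell
# Create a working directory in HDFS.
# -p also creates any missing parent directories and does not fail
# if the directory already exists.
# Usage: make_hdfs_dir <hdfs_dir>
make_hdfs_dir() {
  hdfs dfs -mkdir -p "$1"
}

# Example (hypothetical path):
# make_hdfs_dir /user/cloudera/icp1
```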
Copying files from the local file system to Hadoop HDFS:
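The copy can be sketched as follows, assuming both datasets sit in the current local directory. The function name `put_files` and the target directory in the example are hypothetical.

```shell
# Copy one or more local files into an HDFS directory, then list the
# directory to confirm that the upload worked.
# Usage: put_files <hdfs_dir> <local_file>...
put_files() {
  dest="$1"
  shift
  hdfs dfs -put "$@" "$dest"
  hdfs dfs -ls "$dest"
}

# Example (hypothetical HDFS path):
# put_files /user/cloudera/icp1 shakespeare.txt word_list.txt
```

Note that `-put` reports an error if a file with the same name already exists at the destination.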
Append the second file to the first file
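One possible way to do the append when both files already live in HDFS. `-appendToFile` expects local input, so this sketch streams the source file through standard input (`-`). The function name `append_hdfs_file` and the example paths are assumptions, and the original page may have appended the local copy directly instead.

```shell
# Append the contents of one HDFS file to the end of another HDFS file.
# "hdfs dfs -cat" writes the source to stdout; "-appendToFile -" reads
# from stdin and appends to the destination.
# Usage: append_hdfs_file <hdfs_src> <hdfs_dst>
append_hdfs_file() {
  hdfs dfs -cat "$1" | hdfs dfs -appendToFile - "$2"
}

# Example (hypothetical paths):
# append_hdfs_file /user/cloudera/icp1/word_list.txt /user/cloudera/icp1/shakespeare.txt
```

If the second file is still on the local disk, `hdfs dfs -appendToFile word_list.txt <hdfs_dst>` does the same thing without the pipe.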
Visualize the file with Hue
First lines of the file:
Last lines of the file:
View the first and last lines (approximately 5) of the merged dataset using appropriate HDFS commands
First lines of the file:
Last lines of the file:
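A hedged sketch of how the first and last lines can be printed from the command line. The function name `peek_hdfs_file` and the example path are illustrative. The original screenshots may have used different commands.

```shell
# Print the first and last N lines (default 5) of an HDFS file.
# "hdfs dfs -tail" prints only the last kilobyte of the file, so the
# final N lines are taken from that kilobyte; this assumes those lines
# fit within it, which holds for short text lines.
# Usage: peek_hdfs_file <hdfs_file> [n]
peek_hdfs_file() {
  n="${2:-5}"
  echo "First $n lines:"
  hdfs dfs -cat "$1" | head -n "$n"
  echo "Last $n lines:"
  hdfs dfs -tail "$1" | tail -n "$n"
}

# Example (hypothetical path):
# peek_hdfs_file /user/cloudera/icp1/shakespeare.txt 5
```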
Renaming the appended file:
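Renaming inside HDFS is a move within the same directory. A minimal sketch, with a hypothetical function name `rename_hdfs_file` and made-up file names:

```shell
# Rename (move) a file within HDFS.
# Usage: rename_hdfs_file <old_hdfs_path> <new_hdfs_path>
rename_hdfs_file() {
  hdfs dfs -mv "$1" "$2"
}

# Example (hypothetical names):
# rename_hdfs_file /user/cloudera/icp1/shakespeare.txt /user/cloudera/icp1/merged.txt
```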
Copying the original first file to Hadoop HDFS again:
Create a new text file, load it into HDFS, and append all three datasets
Merging the 3 files into a single file:
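Creating and loading the third file might look like the sketch below. The function name `create_and_put`, the file name, and the sample text are all assumptions made for illustration.

```shell
# Write a line of text to a new local file, then upload that file
# to an HDFS directory.
# Usage: create_and_put <local_file> <hdfs_dir> <text>
create_and_put() {
  printf '%s\n' "$3" > "$1"
  hdfs dfs -put "$1" "$2"
}

# Example (hypothetical file name, path, and content):
# create_and_put new_file.txt /user/cloudera/icp1 "This is the third dataset."
```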
getmerge - Takes a source directory and a destination file as input and concatenates files in src into the destination local file.
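The `getmerge` call described above can be sketched as follows, assuming the three files are the only contents of the source directory. The function name `merge_to_local` and the example paths are hypothetical.

```shell
# Concatenate every file in an HDFS directory into one file on the
# local file system.
# Usage: merge_to_local <hdfs_src_dir> <local_dst_file>
merge_to_local() {
  hdfs dfs -getmerge "$1" "$2"
}

# Example (hypothetical paths):
# merge_to_local /user/cloudera/icp1 merged_all.txt
```

Adding the `-nl` option before the source inserts a newline after each file, which helps when a file does not end with one.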
Destination local file:
Because the concatenated destination file is generated in the local file system, we have to move it from the local file system to Hadoop HDFS:
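A minimal sketch of this move, assuming `-moveFromLocal` is preferred over `-put` so that the local copy is removed after the upload. The function name `upload_merged_file` and the example paths are illustrative.

```shell
# Move a local file into HDFS. Unlike -put, -moveFromLocal deletes
# the local source once the copy has succeeded.
# Usage: upload_merged_file <local_file> <hdfs_dir>
upload_merged_file() {
  hdfs dfs -moveFromLocal "$1" "$2"
}

# Example (hypothetical paths):
# upload_merged_file merged_all.txt /user/cloudera/icp1
```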
Final concatenated file in Hue