ICP 1 - PavankumarManchala/BigDataProgrammingICPs GitHub Wiki
Submitted by:
Pavankumar Manchala Class Id: 17
Installations:
Oracle VM VirtualBox, Cloudera, IntelliJ
Datasets:
- Shakespeare.txt https://umkc.box.com/s/208ehts7vn8ls5yhsea0x0ht6rgkrnnp
- Word_list.txt https://umkc.box.com/s/bcurc4qjbpx5hpb7pni8950os78enf0e
Task:
Load both datasets into Hadoop HDFS, append the second file (Word_list.txt) to the first (Shakespeare.txt), and display the result.
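The task can be sketched as a short sequence of `hdfs dfs` commands. The Python snippet below only builds and (optionally) runs those commands; it is a minimal sketch, not the exact steps used in this ICP. The HDFS directory `/user/cloudera/icp1` and the assumption that both files sit in the current local directory are hypothetical and should be adjusted to the actual setup.

```python
import shlex
import subprocess
from typing import Callable, List


def build_commands(local_first: str, local_second: str, hdfs_dir: str) -> List[List[str]]:
    """Return the hdfs dfs commands that upload two local files,
    append the second to the first in HDFS, and show the end of the result."""
    hdfs_first = f"{hdfs_dir}/{local_first}"
    hdfs_second = f"{hdfs_dir}/{local_second}"
    return [
        # Create the target directory (hypothetical path) if it does not exist.
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        # Upload both datasets from the local file system to HDFS.
        ["hdfs", "dfs", "-put", local_first, hdfs_first],
        ["hdfs", "dfs", "-put", local_second, hdfs_second],
        # Append the local copy of the second file to the first file in HDFS.
        ["hdfs", "dfs", "-appendToFile", local_second, hdfs_first],
        # Display the last kilobyte of the appended file.
        ["hdfs", "dfs", "-tail", hdfs_first],
    ]


def run_all(commands: List[List[str]], runner: Callable = subprocess.run) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)


if __name__ == "__main__":
    # Assumed locations; only prints the commands so nothing runs by accident.
    for cmd in build_commands("Shakespeare.txt", "Word_list.txt", "/user/cloudera/icp1"):
        print(shlex.join(cmd))
```

To execute the commands on a machine where the `hdfs` client is on the PATH (for example, inside the Cloudera VM), call `run_all(build_commands(...))`. The `-tail` step is used instead of `-cat` because Shakespeare.txt is large and the appended words appear at the end of the file.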
Shakespeare.txt viewed in Hue after the contents of Word_list.txt were appended.
The end of the Word_list.txt dataset viewed in Hue.
ICP video explanation: https://drive.google.com/open?id=19zCcfC0hrZxVMLsL7a3o5hvfs-NiQ20w
Link to all ICP videos: https://drive.google.com/open?id=1racqWkfI10T-CpLYEDYCvJRSRhhLGsWL