Module 1: ICP #1 - VidyullathaKaza/BigData_Programming_Spring2020 GitHub Wiki
ICP-1
Cloudera
Description:
As part of the exercise, we installed the following tools for Big Data programming:
- Cloudera setup
- IntelliJ Community edition
Downloaded in Cloudera
Downloaded in LocalRepository
Exercise
The following steps were performed to complete the exercise:
- Downloaded the provided datasets.
- Loaded one of the datasets into HDFS using HDFS commands.
- Appended the other dataset to the first file.
- Finally, displayed the data using Hue.
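
The load and append steps above can be sketched with the HDFS shell. This is only an illustration under stated assumptions, not the exact commands used in the exercise: the file names `dataset1.txt` and `dataset2.txt` and the HDFS directory `/user/cloudera/icp1` are hypothetical, and the final `-cat` is a command-line alternative to viewing the file in Hue. The `run` helper is also hypothetical; it prints each command by default, and setting `DRY_RUN=0` executes the commands against a real cluster.

```shell
#!/bin/sh
# Hypothetical names: the wiki does not state the actual files or paths.
LOCAL_FIRST="dataset1.txt"
LOCAL_SECOND="dataset2.txt"
HDFS_DIR="/user/cloudera/icp1"

# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create the target directory in HDFS (no error if it already exists).
run hdfs dfs -mkdir -p "$HDFS_DIR"

# Load the first dataset from the local file system into HDFS.
run hdfs dfs -put "$LOCAL_FIRST" "$HDFS_DIR/"

# Append the second local dataset to the file already stored in HDFS.
run hdfs dfs -appendToFile "$LOCAL_SECOND" "$HDFS_DIR/$LOCAL_FIRST"

# Print the combined file to check the result from the terminal.
run hdfs dfs -cat "$HDFS_DIR/$LOCAL_FIRST"
```

To run the commands on the Cloudera VM rather than just print them, the script could be invoked as `DRY_RUN=0 sh load_datasets.sh`, where `load_datasets.sh` is an assumed file name for the script.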
Learning Outcomes
- Understood the importance of Cloudera.
- Were introduced to Hue, a technology that was new to us in this exercise.
- Executed HDFS commands.