ICP 1 - gracesyl/big-data-hadoop GitHub Wiki
Welcome to the big-data-hadoop wiki!
ICP-1
Name: Grace Stalin-ID-30
Introduction:
Cloudera is a software platform for data engineering, data warehousing, machine learning, and analytics that runs in the cloud or on premises. Here we load the datasets into Cloudera from the command prompt, append them in Cloudera, and visualize the result in HUE (Hadoop User Experience).
We first installed Cloudera, as shown in the following:
We then entered the following commands at the Cloudera command prompt:
Coding:
hdfs dfs -put shakespeare.txt /user/cloudera
hdfs dfs -put word_list.txt /user/cloudera
hdfs dfs -appendToFile shakespeare.txt word_list.txt /user/cloudera
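The upload and append steps can also be collected into a small script. The following is only a sketch under a few assumptions: `HDFS_CMD`, `HDFS_DIR`, `COMBINED`, and the file name `combined.txt` are hypothetical names introduced for illustration, and the script defaults to a dry run that prints each command instead of executing it. It also assumes that `-appendToFile` expects a file path as its destination rather than a directory, so the two local files are appended to one named HDFS file.

```shell
#!/bin/sh
# Sketch only: HDFS_CMD defaults to "echo hdfs" (a dry run that prints each
# command), so the script can be checked without a cluster.
# On the Cloudera VM, set HDFS_CMD=hdfs to execute the commands for real.
HDFS_CMD="${HDFS_CMD:-echo hdfs}"
HDFS_DIR="/user/cloudera"              # target directory used in this ICP
COMBINED="$HDFS_DIR/combined.txt"      # hypothetical name for the appended file

upload_and_append() {
    # Upload both local datasets into the HDFS directory.
    $HDFS_CMD dfs -put shakespeare.txt "$HDFS_DIR"
    $HDFS_CMD dfs -put word_list.txt "$HDFS_DIR"

    # Append both local files, in order, to a single HDFS file.
    # Assumption: the destination is a file path rather than a directory.
    $HDFS_CMD dfs -appendToFile shakespeare.txt word_list.txt "$COMBINED"

    # List the directory to confirm that the files are present.
    $HDFS_CMD dfs -ls "$HDFS_DIR"
}

upload_and_append
```

Saved under a hypothetical name such as `upload.sh`, the script prints the four commands when run as `sh upload.sh`, and runs them against HDFS when started as `HDFS_CMD=hdfs sh upload.sh`.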
These commands upload the datasets into Cloudera and append both of them; the result can then be visualized as follows:
This is the first ICP of Big Data Programming, learned on June 6th, 2019.
Datasource: