section 1 - heda491001/udemy-hadoop-course GitHub Wiki

Just a memo of the Udemy Hadoop course.

5. Installing Hadoop

  • use a virtual machine (VirtualBox)

  • download the HDP sandbox image from Cloudera

  • open the web browser and go to localhost:8888

  • log in to the dashboard with user: maria_dev, password: maria_dev

  • download the data; this course uses movie rating data from IMDb

  • sample demo:

    • upload the data to hive
    • create a table in hive
    • run a query in hive
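
The three demo steps above can be sketched outside the sandbox. This is only a rough stand-in: it uses Python's built-in sqlite3 module instead of Hive, so the SQL is close to but not the same as HiveQL. The table name `ratings`, its column names, the sample rows, and the helper `top_rated_movies` are all assumptions made up for illustration, not the course's actual data or query.

```python
import sqlite3

# Hypothetical sample rows: (user_id, movie_id, rating, rating_time).
# These values are invented for the sketch, not taken from the course data.
SAMPLE_RATINGS = [
    (1, 50, 5, 881250949),
    (2, 50, 4, 891717742),
    (3, 172, 5, 878887116),
    (4, 50, 3, 880606923),
    (5, 172, 4, 886397596),
    (6, 133, 1, 884182806),
]


def top_rated_movies(rows, limit=2):
    """Return (movie_id, rating_count) pairs, most-rated movies first."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # "create a table": column names here are assumptions.
        conn.execute(
            "CREATE TABLE ratings ("
            "user_id INTEGER, movie_id INTEGER, "
            "rating INTEGER, rating_time INTEGER)"
        )
        # "upload the data": in the sandbox this is a file upload,
        # here it is a plain bulk insert.
        conn.executemany("INSERT INTO ratings VALUES (?, ?, ?, ?)", rows)
        # "run a query": count the ratings per movie, most-rated first.
        cursor = conn.execute(
            "SELECT movie_id, COUNT(movie_id) AS rating_count "
            "FROM ratings GROUP BY movie_id "
            "ORDER BY rating_count DESC, movie_id ASC LIMIT ?",
            (limit,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for movie_id, rating_count in top_rated_movies(SAMPLE_RATINGS):
        print(movie_id, rating_count)
```

In the sandbox the same idea would be typed into the Hive query editor against the uploaded table; the GROUP BY / ORDER BY part of the query should carry over with little change.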

6. Hortonworks and Cloudera

  • just some background on the two companies
  • Hortonworks was a company built around open source Hadoop; it merged with Cloudera in 2019
  • this course uses the sandbox from Hortonworks, called HDP (Hortonworks Data Platform)
  • after the merger, Cloudera Data Platform (CDP) became the new platform
    • when the course was filmed, HDP was still available; support for HDP seemed set to end in 2022
    • but I am taking this course (and writing this memo) in 2024, so that support has already expired
    • the course teacher provides an HDP image