1 Introduction - ignacio-alorre/Hive GitHub Wiki

What is Hive?

Hive is a data warehouse infrastructure tool to process structured and semi-structured data in Hadoop. It resides on top of Hadoop and it is mostly used for querying and analyzing large datasets. Initially, you have to write complex Map-Reduce jobs, but now you just need to submit merely SQL-like queries. Hive use language called HiveQL (HQL), which is similar to SQL. HiveQL automatically translates SQL-like queries into MapReduce jobs.

What is not?

A relational database
A design for OnLine Transaction Processing (OLTP)
A language for real-time queries and row-level updates

Features of Hive

It stores schema in a database and processed data into HDFS
It is designed for OLAP
It provides SQL type language for querying called HiveQL or HQL
It is familiar, fast, scalable and extensible

Sources

https://www.tutorialspoint.com/hive/hive_introduction.htm
https://data-flair.training/blogs/apache-hive-tutorial/