Transactions in Hive [ACID] - ignacio-alorre/Hive GitHub Wiki
- Since version 0.13, Hive provides atomicity for SQL operations at the row level rather than at the table/partition level
- This allows a Hive client to read from a partition at the same time that another Hive client is adding rows to the same partition
- It also provides a mechanism for streaming clients to rapidly update Hive tables and partitions
- Each Hive Transaction has an identifier. Multiple transactions are grouped into a single transaction batch
- A client requests a set of transaction IDs after connecting to Hive and then uses these transaction IDs one at a time
- Clients write one or more records for each transaction and either commit or abort a transaction before moving to the next transaction
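The batch workflow described above can be sketched with a small in-memory model. This is only an illustrative toy, not the Hive streaming API: the names `ToyTable`, `ToyTransactionBatch`, `fetch_transaction_batch` and `begin_next_transaction` are hypothetical and chosen for this example. It assumes that rows written in a transaction become visible only on commit and are discarded on abort.

```python
from itertools import count


class ToyTable:
    """Hypothetical stand-in for a Hive table: readers only see committed rows."""

    def __init__(self):
        self.committed = []      # rows visible to readers
        self._ids = count(1)     # source of transaction IDs (assumption: 1, 2, 3, ...)

    def fetch_transaction_batch(self, size):
        # Hand out `size` transaction IDs grouped into a single batch.
        return ToyTransactionBatch(self, [next(self._ids) for _ in range(size)])


class ToyTransactionBatch:
    """Hypothetical batch: its transaction IDs are used one at a time."""

    def __init__(self, table, txn_ids):
        self.table = table
        self.pending_ids = list(txn_ids)
        self.current_id = None
        self.buffer = []         # rows written but not yet committed

    def begin_next_transaction(self):
        if self.current_id is not None:
            raise RuntimeError("commit or abort the open transaction first")
        if not self.pending_ids:
            raise RuntimeError("transaction batch is exhausted")
        self.current_id = self.pending_ids.pop(0)
        return self.current_id

    def write(self, record):
        if self.current_id is None:
            raise RuntimeError("no open transaction")
        self.buffer.append((self.current_id, record))

    def commit(self):
        # Publish all buffered rows at once, then close the transaction.
        self.table.committed.extend(self.buffer)
        self._close_current()

    def abort(self):
        # Drop buffered rows so that no partial data becomes visible.
        self._close_current()

    def _close_current(self):
        self.buffer = []
        self.current_id = None


table = ToyTable()
batch = table.fetch_transaction_batch(3)

batch.begin_next_transaction()   # transaction 1
batch.write("row-a")
batch.write("row-b")
batch.commit()

batch.begin_next_transaction()   # transaction 2
batch.write("row-c")
batch.abort()                    # row-c is discarded

batch.begin_next_transaction()   # transaction 3
batch.write("row-d")
batch.commit()

print(table.committed)
# [(1, 'row-a'), (1, 'row-b'), (3, 'row-d')]
```

In this sketch the aborted transaction leaves no trace in `table.committed`, and a client cannot start the next transaction until the current one has been committed or aborted.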
ACID is an acronym for four required traits of database transactions: atomicity, consistency, isolation and durability
- Atomicity: An operation either succeeds completely or fails. It does not leave partial data
- Consistency: Once an application performs an operation, the results of that operation are visible to the application in every subsequent operation
- Isolation: Operations by one user do not cause unexpected side effects for other users
- Durability: Once an operation is complete, it is preserved even in case of machine or system failure