QA HBase MOB - cto-bdt-qa/bdt-qa GitHub Wiki
This is an optimized solution for storing records that contain MOB (Medium Object) files in HBase.
MOB: Medium Object. The term usually covers BMOB (binary medium object) and CMOB (character medium object), such as a PDF document, Word document, image, or multimedia object. Unlike typical records, a MOB is typically several hundred KB to several MB in size (100KB ~ 5MB).
MOB data is typically stored together with its metadata. For example, in the ITS (Intelligent Transportation System) field, the metadata (car speed, color, direction, plate number, etc.) of a picture taken by a lane camera is extracted before the MOB (the picture) is stored, and the metadata is then stored together with the MOB.
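
To make the size range and the metadata pairing concrete, here is a minimal sketch in Python. The bounds come from the "100KB ~ 5MB" range quoted above; the names `is_mob` and `make_record`, and the record layout, are hypothetical illustrations and are not part of HBase.

```python
KB = 1024
MB = 1024 * KB

# Assumed bounds, taken from the "100KB ~ 5MB" range in the text.
MOB_MIN_BYTES = 100 * KB
MOB_MAX_BYTES = 5 * MB


def is_mob(value: bytes, lower: int = MOB_MIN_BYTES, upper: int = MOB_MAX_BYTES) -> bool:
    """Return True when the value's size falls inside the assumed MOB range."""
    return lower <= len(value) <= upper


def make_record(row_key: str, metadata: dict, payload: bytes) -> dict:
    """Pair a payload with its metadata, as in the lane-camera example.

    The dictionary layout is an illustration only, not an HBase schema.
    """
    return {
        "row": row_key,
        "metadata": dict(metadata),
        "payload": payload,
        "is_mob": is_mob(payload),
    }


if __name__ == "__main__":
    picture = bytes(300 * KB)  # stand-in for a lane-camera picture
    record = make_record("plate-0001", {"speed_kmh": 72, "color": "red"}, picture)
    print(record["is_mob"])
```

A real deployment would decide what counts as a MOB through table or column family configuration; the fixed bounds here only mirror the range stated in this page.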
Write characteristics:
- Write performance is quite critical. MOBs and their metadata need to be stored efficiently.
- The total data size is quite big.
- MOBs are seldom updated.
Read characteristics:
- MOBs are accessed much less often than their corresponding metadata. Typically the metadata is accessed for analysis, while MOBs serve as archives and are accessed only when a user explicitly requests them.
- MOBs are typically accessed randomly. Users typically do not scan all MOBs in one batch.
- High performance: (1) Write: stable low latency and high throughput, with minimal MOB impact on HBase during operations such as split and compaction. (2) Read: low read latency and good concurrent read performance, with fast scans on the metadata and fast random reads on the MOB data.
- Consistency for MOB data reads and writes.
- Transparency when reading and writing MOB data.

| Document | Description |
|---|---|
| HBASE-11339 | JIRA entry for the MOB feature in the Apache HBase project |
| Design Doc | Design document for MOB, submitted to JIRA entry HBASE-11339 |
| Current Dev Branch | Dev branch on GitHub (intel-hadoop account): https://github.com/intel-hadoop/HBase-LOB |

| QA Start Date | QA Plan Ready | Functionality Test Case Ready | Performance Test Case Ready | QA End Date |
|---|---|---|---|---|
| | | | | |

| Test Category | Test Scenario | Description | Planned Test Case No. | Test Case Dev Status | Test Result |
|---|---|---|---|---|---|
| Functionality Test | HTable API Test (CRUD, Batch, Scan, etc.) | Test MOB functionality via the HTable API to validate whether there is any conformance issue when MOB is used. | 14 | - | - |
| | HBase Administration Operation Test | Test MOB functionality via HBase administration operations to validate whether there is any conformance issue when MOB is used. | 10 | - | - |
| | MOB Housekeeping Test | Validate the correctness of the two housekeeping mechanisms that MOB supports: deleting expired MOB files according to the TTL, and using the sweeper tool to clean unused and expired MOB data. | 8 | - | - |
| Performance Test | MOB data write performance | Collect performance data for MOB table writes at different MOB data sizes (e.g. 100K, 3M, 5M), and compare it with non-MOB table writes. | 4 | - | - |
| | MOB data random read performance | Collect performance data for MOB table random reads at different MOB data sizes (e.g. 100K, 3M, 5M), and compare it with non-MOB table random reads. | 4 | - | - |
| | Metadata scan | Collect performance data for metadata scans at different corresponding MOB data sizes (e.g. 100K, 3M, 5M), and compare it with non-MOB table scans. | 4 | - | - |
| Failure Test | HBase Region Server Down | Validate the correctness of all the functionality test cases when 1 to 2 HBase region servers are down (assuming a data replication factor of 3). | 32 | - | - |
| | HDFS Datanode Down | Validate the correctness of all the functionality test cases when 1 to 2 HDFS datanode servers are down (assuming a data replication factor of 3). | 32 | - | - |

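
The planned counts above can be cross-checked with a little arithmetic: each failure scenario re-runs all functionality test cases, so its count should equal the functionality total (14 + 10 + 8 = 32). The sketch below encodes the plan as a plain dictionary; the structure and function names are hypothetical and exist only for this check.

```python
# Planned test case counts, copied from the test plan table above.
PLAN = {
    "Functionality Test": {
        "HTable API Test": 14,
        "HBase Administration Operation Test": 10,
        "MOB Housekeeping Test": 8,
    },
    "Performance Test": {
        "MOB data write performance": 4,
        "MOB data random read performance": 4,
        "Metadata scan": 4,
    },
    "Failure Test": {
        "HBase Region Server Down": 32,
        "HDFS Datanode Down": 32,
    },
}


def category_totals(plan: dict) -> dict:
    """Sum the planned test cases per test category."""
    return {category: sum(scenarios.values()) for category, scenarios in plan.items()}


def failure_counts_match_functionality(plan: dict) -> bool:
    """Check that every failure scenario covers all functionality test cases."""
    functionality_total = sum(plan["Functionality Test"].values())
    return all(count == functionality_total for count in plan["Failure Test"].values())


if __name__ == "__main__":
    print(category_totals(PLAN))
    print(failure_counts_match_functionality(PLAN))
```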
| # | Test Suite | Test Case | Description | Covered HTable Operation | Test Result |
|---|---|---|---|---|---|
| 1 | CRUD Test | testPutDataBasic | Validate the basic functionality for MOB Put/Get | Put, Get | - |
| 2 | | testPutDataTwice | Validate the functionality for MOB Put/Get by putting to one row twice | Put, Get | - |
| 3 | | testCheckAndPut | Validate the functionality for MOB CheckAndPut: put one row first, use checkAndPut to update it, and then check the updated result | CheckAndPut, Get | - |
| 4 | | testCheckAndDelete | Validate the functionality for MOB CheckAndDelete: put one row first, use checkAndDelete to delete it, and then check the result | CheckAndDelete, Get | - |
| 5 | | testAppend | Validate the functionality for MOB Append: put one row first, use Append to update it, and then check the updated result | Append, Get | - |
| 6 | | testMutateRowPut | Validate the functionality for MOB MutateRow (Put) | MutateRowAdd, Get | - |
| 7 | | testMutateRowDelete | Validate the functionality for MOB MutateRow (Delete) | MutateRowDelete, Get | - |
| 8 | | testIncrement | Compatibility test only: check whether Increment is impacted when a column family has MOB enabled | Increment, Get | - |
| 9 | | testIncrementColumnValueWithLog | Compatibility test only: check whether IncrementColumnValue is impacted when a column family has MOB enabled | incrementColumnValue, Get | - |
| 10 | | testIncrementColumnValueOverflow | Compatibility test only: check whether IncrementColumnValue is impacted when a column family has MOB enabled | incrementColumnValue, Get | - |
| 11 | Batch Test | testBatch | Validate the functionality for MOB Put/Get/Append/Increment etc. by calling the HTable batch API | Batch (Put, Append, Increment, Get, etc.) | - |
| 12 | Scan Test | testScanByQualifier | Validate the functionality for MOB Scan by column qualifier | Put, Scan | - |
| 13 | | testScanWithFilter | Validate the functionality for MOB Scan by setting a specific filter for the scan | Put, Scan | - |
| 14 | | testPutAndScanForMultipleMOBs | Perform multiple put operations to insert multiple MOBs into different rows, and then use scan to query the result | Put, Scan | - |

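
As an illustration of the shape of one of these cases, the sketch below walks through the testCheckAndPut flow (put, conditional update, verify) against a small in-memory stand-in. `FakeTable` and its methods are hypothetical and are not the HBase client API; the real test cases would presumably run against an HBase cluster with a MOB-enabled column family.

```python
class FakeTable:
    """In-memory stand-in for a table. NOT the HBase client API."""

    def __init__(self):
        self._rows = {}

    def put(self, row: str, column: str, value: bytes) -> None:
        self._rows.setdefault(row, {})[column] = value

    def get(self, row: str, column: str):
        return self._rows.get(row, {}).get(column)

    def check_and_put(self, row: str, column: str, expected, new_value: bytes) -> bool:
        """Write new_value only if the current value equals expected."""
        if self.get(row, column) != expected:
            return False
        self.put(row, column, new_value)
        return True


def test_check_and_put() -> None:
    table = FakeTable()
    mob_v1 = b"a" * (100 * 1024)  # 100K payload, the smallest size in the plan
    mob_v2 = b"b" * (100 * 1024)

    # Put one row first.
    table.put("row1", "cf:mob", mob_v1)

    # The conditional update succeeds when the expected value matches.
    assert table.check_and_put("row1", "cf:mob", mob_v1, mob_v2)
    assert table.get("row1", "cf:mob") == mob_v2

    # A stale expected value must leave the stored value untouched.
    assert not table.check_and_put("row1", "cf:mob", mob_v1, b"c")
    assert table.get("row1", "cf:mob") == mob_v2


if __name__ == "__main__":
    test_check_and_put()
    print("testCheckAndPut sketch passed")
```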
| # | Test Suite | Test Case | Test Result |
|---|---|---|---|
| 1 | Table Create/Drop/Enable/Disable | Create Table Normally | - |
| 2 | | Drop Table Normally | - |
| 3 | | Create Table Twice | - |
| 4 | | Drop Table Twice | - |
| 5 | | Re-create Table (Create -> Drop -> Create) | - |
| 6 | | Enable/Disable Table Normally | - |
| 7 | Alter Table Schema | Column Family Modification | - |
| 8 | Compaction | Major Compaction | - |
| 9 | | Minor Compaction | - |
| 10 | Split | Table Split | - |

| # | Test Suite | Test Case | Test Result |
|---|---|---|---|
| 1 | Delete expired MOB files according to the TTL | MOB files are expired | - |
| 2 | | MOB files are actually NOT expired | - |
| 3 | Use sweeper tool to clean the unused and expired MOB data | Start sweeper tool to do housekeeping (MOB files are expired) | - |
| 4 | | Start sweeper tool to do housekeeping (MOB files are actually NOT expired) | - |
| 5 | | Perform update operation during sweeper tool running | - |
| 6 | | Perform delete operation during sweeper tool running | - |
| 7 | | Perform delete & major compaction during sweeper tool running | - |
| 8 | | Perform major compaction during sweeper tool running | - |
| 9 | | Start sweeper tool during major compaction | - |
| 10 | | There is no MOB file unused or expired | - |

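
The expected outcomes of the housekeeping cases can be reasoned about with a simplified decision model. The sketch below is an assumption-laden illustration, not the HBase implementation: `MobFile`, its `referenced` flag, and both functions are hypothetical names. A file counts as expired when its age exceeds the TTL, and as unused when no row references it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MobFile:
    name: str
    created_at: float        # creation time, in seconds
    referenced: bool = True  # hypothetical flag: False means no row points at the file


def expired_files(files, ttl_seconds: float, now: float) -> list:
    """TTL housekeeping: names of files whose age exceeds the TTL."""
    return [f.name for f in files if now - f.created_at > ttl_seconds]


def sweep_candidates(files, ttl_seconds: float, now: float) -> list:
    """Sweeper housekeeping: names of files that are expired or unused."""
    return [
        f.name
        for f in files
        if now - f.created_at > ttl_seconds or not f.referenced
    ]


if __name__ == "__main__":
    files = [
        MobFile("old", created_at=0),
        MobFile("fresh", created_at=90),
        MobFile("orphan", created_at=95, referenced=False),
    ]
    print(expired_files(files, ttl_seconds=50, now=100))
    print(sweep_candidates(files, ttl_seconds=50, now=100))
```

Under this model, the "NOT expired" cases and the "no MOB file unused or expired" case correspond to both functions returning an empty list.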
Methodology: Develop test plugins and add them to the SOAK reliability/performance test framework.
The following test scenarios will be covered:

MOB data write performance:

| # | Test Case | Throughput |
|---|---|---|
| 1 | MOB (Data Size: 100K) | - |
| 2 | MOB (Data Size: 3M) | - |
| 3 | MOB (Data Size: 5M) | - |
| 4 | non-MOB | - |


MOB data random read performance:

| # | Test Case | Latency |
|---|---|---|
| 1 | MOB (Data Size: 100K) | - |
| 2 | MOB (Data Size: 3M) | - |
| 3 | MOB (Data Size: 5M) | - |
| 4 | non-MOB | - |


Metadata scan performance:

| # | Test Case | Latency |
|---|---|---|
| 1 | MOB (Data Size: 100K) | - |
| 2 | MOB (Data Size: 3M) | - |
| 3 | MOB (Data Size: 5M) | - |
| 4 | non-MOB | - |

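
The throughput and latency figures in these tables could be collected with a harness along the following lines. This is a minimal sketch that times writes into an in-memory dictionary rather than HBase, so its absolute numbers are meaningless; the function names, the iteration count, and the 1 KB non-MOB payload size are assumptions made for illustration. It is also not the SOAK framework's plugin interface.

```python
import statistics
import time

KB = 1024
MB = 1024 * KB

# Payload sizes from the tables above; the non-MOB size is an assumption.
PAYLOAD_SIZES = {
    "MOB (Data Size: 100K)": 100 * KB,
    "MOB (Data Size: 3M)": 3 * MB,
    "MOB (Data Size: 5M)": 5 * MB,
    "non-MOB": 1 * KB,
}


def measure(operation, payloads) -> dict:
    """Run operation once per payload and report throughput and latency."""
    if not payloads:
        raise ValueError("at least one payload is required")
    latencies = []
    total_bytes = 0
    start = time.perf_counter()
    for payload in payloads:
        t0 = time.perf_counter()
        operation(payload)
        latencies.append(time.perf_counter() - t0)
        total_bytes += len(payload)
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    return {
        "ops": len(latencies),
        "throughput_ops_per_s": len(latencies) / elapsed,
        "throughput_bytes_per_s": total_bytes / elapsed,
        "mean_latency_s": statistics.fmean(latencies),
        "max_latency_s": max(latencies),
    }


def run_write_benchmark(iterations: int = 3) -> dict:
    """Measure writes for every payload size against an in-memory store."""
    results = {}
    for label, size in PAYLOAD_SIZES.items():
        store = {}
        payload = bytes(size)

        def write(value, store=store):
            store[len(store)] = value  # stand-in for a table put

        results[label] = measure(write, [payload] * iterations)
    return results


if __name__ == "__main__":
    for label, stats in run_write_benchmark().items():
        print(label, stats)
```

Swapping the `write` stand-in for a real client call would be the natural extension; the measurement loop itself does not depend on the storage backend.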
The failure test validates the correctness of all the MOB functionality test cases when a non-critical service is down on a cluster node (e.g. an HBase region server or an HDFS datanode).
| # | Test Scenario | Functionality Test Cases to Validate | Test Result |
|---|---|---|---|
| 1 | HBase Region Server Down | HTable CRUD Test | - |
| 2 | | HBase Administration Operation Test | - |
| 3 | | MOB Housekeeping Test | - |
| 4 | HDFS Datanode Server Down | HTable CRUD Test | - |
| 5 | | HBase Administration Operation Test | - |
| 6 | | MOB Housekeeping Test | - |
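
The failure test matrix above is the cross product of the failure scenarios and the functionality test suites. A short sketch of generating it follows; the function name and row layout are hypothetical.

```python
from itertools import product

# Scenario and suite names copied from the failure test table above.
SCENARIOS = ["HBase Region Server Down", "HDFS Datanode Server Down"]
SUITES = [
    "HTable CRUD Test",
    "HBase Administration Operation Test",
    "MOB Housekeeping Test",
]


def build_failure_matrix(scenarios, suites) -> list:
    """Pair every failure scenario with every functionality test suite."""
    return [
        {"no": number, "scenario": scenario, "suite": suite, "result": "-"}
        for number, (scenario, suite) in enumerate(product(scenarios, suites), start=1)
    ]


if __name__ == "__main__":
    for row in build_failure_matrix(SCENARIOS, SUITES):
        print(row["no"], row["scenario"], "|", row["suite"], "|", row["result"])
```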