GCP Bigtable
- Originally used by Google for its web search index.
Src: https://cloud.google.com/bigtable/docs/schema-design
- Each table has only one index, the row key.
- Rows are sorted lexicographically by row key.
- Columns are grouped by column family and sorted in lexicographic order within the column family.
- All operations are atomic at the row level.
- Ideally, both reads and writes should be distributed evenly across the row key space.
- Related entities should be stored in adjacent rows (see the row-key sketch after this list).
- Cloud Bigtable tables are sparse; a column takes up no space in a row that does not use it.
- It's better to have a few large tables than many small tables.
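As a sketch of the row-key guidance above: the commands below assume a hypothetical `sensors` table with a `metrics` column family already exists, and use a `<device_id>#<timestamp>` row-key pattern so that a device's readings land in adjacent rows while writes for many devices spread across the key space.

```sh
# Hypothetical row keys: the device-ID prefix keeps related rows adjacent,
# and many distinct device IDs avoid concentrating writes on one key range.
cbt set sensors device-0042#2024-01-01T00:00 metrics:temp=21.4
cbt set sensors device-0042#2024-01-01T01:00 metrics:temp=21.9
cbt set sensors device-0043#2024-01-01T00:00 metrics:temp=19.7

# Rows come back in lexicographic row-key order.
cbt read sensors prefix=device-0042
```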
Create an instance with a single one-node cluster:

```sh
cbt createinstance instance "instance" instance-c1 asia-east2-a 1 HDD
```
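The positional arguments to `cbt createinstance` are, in order: instance ID, display name, initial cluster ID, zone, node count, and storage type (SSD or HDD). Listing instances is one way to confirm the result:

```sh
# List the Bigtable instances in the configured project.
cbt listinstances
```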
Define `cbt` default settings in `~/.cbtrc`:

```
# ~/.cbtrc
project = playground-s-11-d40072
instance = instance
```
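With these defaults in place, `cbt` commands no longer need explicit flags; individual commands can still override them (the project and instance names below are placeholders):

```sh
# Override the ~/.cbtrc defaults for a single invocation.
cbt -project another-project -instance another-instance ls
```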
Common `cbt` commands (a worked example follows the table):

Action | Command
---|---
Create table | `cbt createtable <table_name>`
Create family | `cbt createfamily <table_name> <column_family>`
Add row | `cbt set <table_name> <row_identifier> <column_family>:<column_name>=<value>`
List tables | `cbt ls`
List table's families | `cbt ls <table_name>`
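A minimal end-to-end session using the commands above (the table, family, row key, and value are made-up examples):

```sh
# Create a table with one column family, write a single cell, then read it back.
cbt createtable catalog
cbt createfamily catalog details
cbt set catalog sku#001 details:name=widget

cbt ls                      # list tables in the instance
cbt ls catalog              # list the table's column families
cbt lookup catalog sku#001  # read one row by its key

# Remove the example table when finished.
cbt deletetable catalog
```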
Src: https://cloud.google.com/bigtable/docs/monitoring-instance#disk
CPU Metrics
Metric | Description
---|---
Average CPU utilization | If the cluster exceeds the recommended maximum for your configuration for more than a few minutes, add nodes to the cluster.
CPU utilization of hottest node | If the hottest node is regularly above the recommended maximum: 1. use the Key Visualizer tool to identify hotspots; 2. review the schema design.
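If either CPU metric stays above the recommended maximum, one remediation is to add nodes to the affected cluster; a sketch using `cbt`, where the cluster ID matches the one created above and the node count is a placeholder:

```sh
# Resize the cluster to 3 nodes (placeholder value) to relieve CPU pressure.
cbt updatecluster instance-c1 num-nodes=3
```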
Disk Usage Metrics
Metric | Description
---|---
Storage utilization (bytes) | Amount of data stored in the cluster; this value affects your cost.
Storage utilization (% max) | Storage used as a percentage of the maximum the cluster's nodes can hold (based on the number of nodes). Add nodes to the cluster if utilization exceeds 70%.
Disk load (HDD clusters only) | If you experience increased latency, add nodes to the cluster to reduce the disk load percentage.
Monitor storage utilization for your clusters to make sure they have enough nodes to support the amount of data in the cluster, based on the following limits:
- SSD clusters: 2.5 TB per node
- HDD clusters: 8 TB per node
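For example, a 3-node SSD cluster can hold at most 3 × 2.5 TB = 7.5 TB; to stay below the 70% storage-utilization guideline above, keep its data under roughly 5.25 TB or add nodes.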
Src: https://cloud.google.com/bigtable/quotas#storage-per-node