HBase Shell Cheat Sheet - tenji/ks GitHub Wiki

Common HBase Shell Commands

Command Groups

general

status, table_help, version, whoami

ddl

alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, locate_region, show_filters

namespace

alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables

dml

append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve

tools

assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, compact_mob, compact_rs, flush, major_compact, major_compact_mob, merge_region, move, normalizer_enabled, normalize, normalizer_switch, split, trace, unassign, wal_roll, zk_dump

replication

add_peer, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, remove_peer_tableCFs, set_peer_tableCFs, show_peer_tableCFs, update_peer_config

snapshots

clone_snapshot, delete_all_snapshot, delete_snapshot, list_snapshots, restore_snapshot, snapshot

configuration

update_all_config, update_config

quotas

list_quotas, set_quota

security

grant, list_security_capabilities, revoke, user_permission

procedures

abort_procedure, list_procedures

visibility labels

add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility

rsgroup

add_rsgroup, balance_rsgroup, get_rsgroup, get_server_rsgroup, get_table_rsgroup, list_rsgroups, move_servers_rsgroup, move_tables_rsgroup, remove_rsgroup
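
Every group and command listed above has built-in help in the shell. A minimal sketch, using group and command names exactly as listed above:

    # Show help for a whole command group
    > help 'ddl'
    # Show help for a single command
    > help 'scan'
    # Print cluster status; 'summary' is the default level of detail
    > status 'summary'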

1. Table Management

  • List all tables

    > list
    
  • Count the rows in a table (slow on large tables)

    # Syntax: count <table>, { INTERVAL => <interval> }; INTERVAL is the number of rows between progress reports, default 1000
    > count 'stream', { INTERVAL => 10000 }
    
  • Create a table

    # Syntax: create <table>, {NAME => <family>, VERSIONS => <versions>}
    # Example: create table t1 with two column families, f1 and f2, each keeping 2 versions
    > create 't1', {NAME => 'f1', VERSIONS => 2}, {NAME => 'f2', VERSIONS => 2}
    
  • Drop a table

    # Two steps: disable the table first, then drop it
    > disable 't1'
    > drop 't1'
    
  • Change a column family's TTL

    # TTL is in seconds; if column family 'cf' does not exist yet, alter adds it
    > alter 't1', NAME => 'cf', TTL => '500'
    

2. Table Data Management

2.1 Adding Data

$ hbase(main)> put 'stream','rowkey001','f1:col1','value01'

2.2 Querying Data

1. Get a single row
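
A minimal sketch of reading one row with get, reusing the stream table, the rowkey001 row key, and the f1:col1 column from the put example in 2.1; the VERSIONS value is only illustrative:

    # All columns of one row
    > get 'stream', 'rowkey001'
    # A single column of that row
    > get 'stream', 'rowkey001', 'f1:col1'
    # Up to 2 versions of that column
    > get 'stream', 'rowkey001', {COLUMN => 'f1:col1', VERSIONS => 2}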

2. Scan a table

The examples use the stream table, whose structure is as follows:
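
A table's structure (its column families and their settings) can be printed with describe. A minimal sketch for the stream table:

    > describe 'stream'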

1). STARTROW
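
STARTROW restricts a scan to row keys at or after the given key. It is commonly paired with STOPROW (exclusive) and LIMIT. A minimal sketch, with hypothetical row keys:

    # Rows from 'rowkey001' (inclusive) up to 'rowkey100' (exclusive), at most 10 rows
    > scan 'stream', {STARTROW => 'rowkey001', STOPROW => 'rowkey100', LIMIT => 10}
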
2). TIMERANGE
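
TIMERANGE limits a scan to cells whose timestamp falls in [min, max), given in milliseconds since the epoch. A minimal sketch; the timestamps are illustrative values:

    > scan 'stream', {TIMERANGE => [1303668804000, 1303668904000]}
    # Can be combined with a column restriction
    > scan 'stream', {COLUMNS => ['f1:col1'], TIMERANGE => [1303668804000, 1303668904000]}
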
3). FILTER
  • ValueFilter

Cells whose value equals sku188:

$ hbase(main)> scan 'test1', {FILTER=>"ValueFilter(=,'binary:sku188')"}

Cells whose value contains 88:

$ hbase(main)> scan 'test1', {FILTER=>"ValueFilter(=,'substring:88')"}

  • ColumnPrefixFilter
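
ColumnPrefixFilter keeps only cells whose column qualifier starts with the given prefix. A minimal sketch, assuming qualifiers such as col1 as in the put example in 2.1:

    # Only columns whose qualifier starts with 'col'
    > scan 'stream', {FILTER => "ColumnPrefixFilter('col')"}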

  • PrefixFilter

Rows whose RowKey starts with 6352bba7aaab443aa1d9943efc586a68:

$ hbase(main)> scan 'stream', {FILTER => "PrefixFilter('6352bba7aaab443aa1d9943efc586a68')"}

  • FirstKeyOnlyFilter
  • PageFilter

$ hbase(main)> scan 'stream',{FILTER => "PageFilter(10)"}
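
FirstKeyOnlyFilter returns only the first cell of each row, which is useful when only the row keys matter. A minimal sketch; the AND combination with PageFilter only illustrates the filter language and is not something the other examples rely on:

    # First cell of every row
    > scan 'stream', {FILTER => "FirstKeyOnlyFilter()"}
    # Combine two filters with AND
    > scan 'stream', {FILTER => "FirstKeyOnlyFilter() AND PageFilter(10)"}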

  • RowFilter

RowFilter requires importing the relevant classes first:

import org.apache.hadoop.hbase.filter.RegexStringComparator
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.filter.RowFilter

Rows whose RowKey matches the regular expression ".*\\.[2]\\..*" (RegexStringComparator):

$ hbase(main)> import org.apache.hadoop.hbase.filter.RegexStringComparator
$ hbase(main)> import org.apache.hadoop.hbase.filter.CompareFilter
$ hbase(main)> import org.apache.hadoop.hbase.filter.SubstringComparator
$ hbase(main)> import org.apache.hadoop.hbase.filter.RowFilter
$ hbase(main)> scan 'stream', {FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'), RegexStringComparator.new('.*\\.[2]\\..*'))}

Rows whose RowKey contains the string .2. (SubstringComparator):

$ hbase(main)> import org.apache.hadoop.hbase.filter.CompareFilter
$ hbase(main)> import org.apache.hadoop.hbase.filter.SubstringComparator
$ hbase(main)> import org.apache.hadoop.hbase.filter.RowFilter
$ hbase(main)> scan 'stream', {FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'), SubstringComparator.new('.2.'))}
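
The same two scans can likely also be written in the shell's filter-language string syntax, which needs no import statements. A sketch, assuming the regexstring and substring comparator prefixes of the filter language; inside a double-quoted Ruby string each backslash must itself be escaped:

    # RowKey matches the regular expression .*\.[2]\..*
    > scan 'stream', {FILTER => "RowFilter(=, 'regexstring:.*\\.[2]\\..*')"}
    # RowKey contains the substring .2.
    > scan 'stream', {FILTER => "RowFilter(=, 'substring:.2.')"}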

3. Count the rows in a table
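
Counting rows uses the same count command shown under table management. A minimal sketch; the CACHE value (rows fetched per scanner call) is an illustrative choice:

    # Report progress every 100000 rows and fetch 10000 rows per scanner call
    > count 'stream', INTERVAL => 100000, CACHE => 10000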

2.3 Deleting Data

  • Delete one column of a row

    # Syntax: delete <table>, <rowkey>, <family:column>, <timestamp>; the column name is required
    > delete 'stream','rowkey001','f1:col1'
    
  • Delete a row

    # Syntax: deleteall <table>, <rowkey>, <family:column>, <timestamp>; the column name may be omitted, in which case the whole row is deleted
    > deleteall 'stream','rowkey001'
    
  • Delete all data in a table

    # What it does: disable table -> drop table -> create table
    > truncate 'stream'
    

2.4 Updating Data

  • Update one column of a row
    # Syntax: put 'table name', 'row', 'column family:column name', 'new value'
    > put 'stream', 'rowkey001', 'f1:col1', 'newValue'
    

3. Permission Management
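
A minimal sketch of the commands in the security group, assuming access control (the AccessController coprocessor) is enabled on the cluster; the user name bobsmith is hypothetical:

    # Permissions are letters from the set RWXCA: Read, Write, eXec, Create, Admin
    # Grant read/write on table t1 to a user
    > grant 'bobsmith', 'RW', 't1'
    # Grant on a single column family and qualifier
    > grant 'bobsmith', 'RW', 't1', 'f1', 'col1'
    # Show who holds which permissions on t1
    > user_permission 't1'
    # Revoke the user's permissions on t1
    > revoke 'bobsmith', 't1'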

4. Configuration Management
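
A minimal sketch of the two commands in the configuration group, which ask servers to reload their configuration from disk without a restart. The server name is a hypothetical value in host,port,startcode form, and only settings that support dynamic reloading take effect:

    # Reload configuration on one server
    > update_config 'host187.example.com,60020,1289493121758'
    # Reload configuration on all servers in the cluster
    > update_all_config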

