
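1. HBase Shell Overview
The Apache HBase Shell is (J)Ruby's IRB with some HBase-specific commands added. Anything you can do in IRB, you should also be able to do in the HBase Shell.
0. First create the operating user hbase_test for the HBase cluster
1. As root, add the hbase_test user on the local client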
[root@10-90-50-77-jhdxyjd ~]# useradd hbase_test
2. Switch to the cluster superuser, create the /user/hbase_test directory, and change its ownership
[hdfs@10-90-50-77-jhdxyjd ~]# hdfs dfs -mkdir /user/hbase_test
[hdfs@10-90-50-77-jhdxyjd ~]# hdfs dfs -chown hbase_test:hbase_test /user/hbase_test
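3. Note that HBase authorization is not enabled here; it is covered in detail later.
1. Starting the HBase shell is simple: on a node where HBase is installed, run ./hbase shell.
For the shell commands documented on the HBase website, see the Apache HBase™ Reference Guide.
2. Summary of all HBase shell commands by category
Running help in the HBase shell lists every command, grouped by category. If you are unsure how to use a command, run help on it.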
| Category | Command | Description | Syntax / example |
| --- | --- | --- | --- |
| 1. General commands (basic HBase operations and basic cluster information) | status | Returns the status of the HBase cluster | hbase(main):053:0> status 1 active master, 1 backup masters, 7 servers, 0 dead, 2.1429 average load Took 0.0082 seconds |
| | processlist | Lists the tasks running on the RegionServers, with several levels of detail | hbase> processlist |
| | table_help | Shows how to work with tables | table_help explains the CRUD command syntax for tables, with examples such as: Or, if you have already created the table, you can get a reference to it: hbase> t = get_table 't' You can do things like call 'put' on the table: |
| | version | Returns the HBase version | hbase(main):068:0> version 2.1.0-cdh6.1.0, rUnknown, Thu Dec 6 16:59:59 PST 2018 Took 0.0003 seconds |
| | whoami | Shows the current HBase user | hbase(main):069:0> whoami hbase_test (auth:SIMPLE) groups: hbase_test Took 0.0137 seconds |
| 2. Namespace commands | create_namespace | Creates a namespace, similar to a database | create_namespace 'myns_test' |
| | describe_namespace | Shows information about a namespace | hbase(main):070:0> describe_namespace 'myns_test' DESCRIPTION {NAME => 'myns_test', PROPERTY_NAME => 'PROPERTY_VALUE'} => 1 |
| | drop_namespace | Drops a namespace; its tables must be dropped first, otherwise an exception is raised | drop_namespace 'myns_test' |
| | alter_namespace | Changes the properties of a namespace | alter_namespace 'myns_test',{METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'} |
| | list_namespace | Lists the namespaces in HBase | hbase(main):071:0> list_namespace NAMESPACE default hbase myns_test |
| | list_namespace_tables | Lists all tables in a namespace | hbase(main):073:0> list_namespace_tables 'myns_test' TABLE t1 tb2 2 row(s) Took 0.0043 seconds => ["t1", "tb2"] |
| 3. Table DDL statements (table operations; the ones to master) | alter | Alters the table schema: column families and all table attributes, similar to ALTER TABLE | alter 't1', METHOD => 'table_conf_unset', NAME => 'hbase.hregion.majorcompaction' |
| | create, drop, drop_all | Create a table; drop a table; drop all tables whose names match a regular expression | See the detailed walkthrough below. create 't1', 'f1', 'f2', 'f3' drop 't1' hbase> drop_all 't.*' |
| | list | Lists all tables in HBase; also accepts a regular expression to filter the names | hbase> list hbase> list 'abc.*' hbase> list 'ns:abc.*' hbase> list 'ns:.*' |
| | exists | Checks whether a table exists; returns a boolean | hbase(main):076:0> exists 't1' Table t1 does exist Took 0.0279 seconds => true hbase(main):077:0> exists 'ns1:t3' Table ns1:t3 does not exist => false |
| | describe / desc | Both show the detailed table structure | desc 'myns:t1' |
| | disable, disable_all, enable, enable_all | Set a table to the disabled or enabled state; the _all variants apply the change to every table matching a regular expression | hbase> disable 't1' hbase> disable_all 't.*' |
| | is_disabled, is_enabled | Check whether a table is disabled or enabled; return a boolean | hbase> is_disabled 't1' hbase> is_disabled 'ns1:t1' |
| | get_table | Gets a table and returns it as an object that you can then operate on | hbase> t1.help |
| | list_regions | Lists all regions of a table as an array, showing the server name, region name, start key, end key, region size (in MB), request count, and locality. The regions of a table can also be viewed in the web UI. | hbase> list_regions 'table_name' hbase> list_regions 'table_name', 'server_name' hbase> list_regions 'table_name', {SERVER_NAME => 'server_name', LOCALITY_THRESHOLD => 0.8} hbase> list_regions 'table_name', {SERVER_NAME => 'server_name', LOCALITY_THRESHOLD => 0.8}, ['SERVER_NAME'] |
| | locate_region | Locates the region that holds a given row key of a table | hbase> locate_region 'tableName', 'key0' |
| | show_filters | Lists all filters available in the HBase cluster. Filters are used in get and scan to select data, much like a WHERE clause in a relational database; covered later. | hbase(main):096:0> show_filters DependentColumnFilter ............ |
| | alter_status | Shows the status of an alter command | hbase> alter_status 't1' hbase> alter_status 'ns1:t1' hbase(main):098:0> alter_status 't1' 1/1 regions updated. Done. Took 1.0209 seconds |
| | alter_async, clone_table_schema | alter_async is the asynchronous form of alter (it does not wait for all regions to pick up the change). clone_table_schema copies a table's schema to a new table, similar to SQL's LIKE. | hbase> clone_table_schema 'table_name', 'new_table_name' hbase> clone_table_schema 'table_name', 'new_table_name', false Note: passing false means the table's split points are not preserved. |
| 4. Table DML statements | count | Counts the rows in a table, similar to count in Hive or a relational database; accepts options such as INTERVAL, CACHE, FILTER, COLUMNS, STARTROW, and STOPROW | hbase> count 'ns1:t1' hbase> count 't1' hbase> count 't1', INTERVAL => 100000 hbase> count 't1', CACHE => 1000 hbase> count 't1', INTERVAL => 10, CACHE => 1000 hbase> count 't1', FILTER => " (QualifierFilter (>=, 'binary:xyz')) AND (TimestampsFilter ( 123, 456))" hbase> count 't1', COLUMNS => ['c1', 'c2'], STARTROW => 'abc', STOPROW => 'xyz' |
| | put, get, append | put writes a cell value at the given table, row, and column; get reads the values of the given row; append appends to the existing value of a cell at the given table, row, and column | hbase(main):119:0> put 'tt1', 'rowkey1','f1:name','tom' hbase(main):120:0> get 'tt1','rowkey1' COLUMN CELL f1:name timestamp=1631876854951, value=tom 1 row(s) hbase(main):121:0> append 'tt1', 'rowkey1','f1:name','tom2' CURRENT VALUE = tomtom2 hbase(main):122:0> get 'tt1','rowkey1' COLUMN CELL f1:name timestamp=1631876898153, value=tomtom2 |
| | delete, deleteall | delete removes one cell value at the given table, row, and column; deleteall removes all cells of the given row, optionally restricted to a column or timestamp | |
| | scan (important; covered in detail later) | Scans a table; the scan is controlled through a range of options | hbase> scan 'hbase:meta' hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'} hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'} hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'} hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804000, 1303668904000]} hbase> scan 't1', {REVERSED => true} hbase> scan 't1', {ALL_METRICS => true} |
| | truncate, truncate_preserve | Empties a table while keeping its structure; under the hood the table is disabled, dropped, and recreated | |
| | get_counter | Reads the value of a counter | # Hit counts: daily, weekly, monthly create 'counters', 'daily', 'weekly', 'monthly' incr 'counters', '20110101', 'daily:hits', 1 incr 'counters', '20110101', 'daily:hits', 1 get_counter 'counters', '20110101', 'daily:hits' |
| | incr | Increments a counter cell. incr works on a row key or column that does not exist yet (the counter starts from 0) and on an existing counter. It fails when the existing cell does not hold a counter value, for example after the value has been overwritten with put from the shell. | # Syntax: incr 'table', 'row key', 'family:qualifier', step hbase(main):171:0> incr 'ns1:t1','r3','cf1:name2',10 |
| | get_splits | Returns the split points of a table | hbase(main):176:0> create 'ns1:t3', 'f1', SPLITS => ['10', '20', '30', '40'] Created table ns1:t3 => Hbase::Table - ns1:t3 hbase(main):177:0> get_splits 'ns1:t3' Total number of splits = 5 10 20 30 40 |
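The incr row above notes that incr fails after a counter has been overwritten with put. A likely explanation, stated here as an assumption rather than taken from the original post, is that counters are stored as 8-byte big-endian longs, while a shell put stores the bytes of the string it is given. The plain-Ruby sketch below illustrates the size mismatch; encode_counter and decode_counter are hypothetical helpers, not HBase functions.

```ruby
# Hypothetical helper: encode an integer as a signed 64-bit big-endian
# value, which is how a counter cell is assumed to be stored.
def encode_counter(value)
  [value].pack('q>')
end

# Hypothetical helper: decode a counter cell, rejecting anything that is
# not exactly 8 bytes wide.
def decode_counter(bytes)
  unless bytes.bytesize == 8
    raise ArgumentError, "counter cell must be 8 bytes, got #{bytes.bytesize}"
  end
  bytes.unpack1('q>')
end

cell = encode_counter(10)
puts cell.bytesize               # 8
puts decode_counter(cell) + 1    # 11

string_cell = '10'               # bytes a shell put of '10' would store
begin
  decode_counter(string_cell)
rescue ArgumentError => e
  puts "incr would fail: #{e.message}"
end
```

Under this model, a value written by incr can always be incremented again, while a two-character string written by put cannot be read back as a counter.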
The remaining HBase shell commands are covered later. Most of them are operations commands that developers rarely use.
| Category | Commands |
| --- | --- |
| Cluster tools (important and commonly used in operations; covered in detail later) | assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, cleaner_chore_enabled, cleaner_chore_run, cleaner_chore_switch, clear_block_cache, clear_compaction_queues, clear_deadservers, close_region, compact, compact_rs, compaction_state, flush, is_in_maintenance_mode, list_deadservers, major_compact, merge_region, move, normalize, normalizer_enabled, normalizer_switch, split, splitormerge_enabled, splitormerge_switch, stop_master, stop_regionserver, trace, unassign, wal_roll, zk_dump |
| Security and permission management (used by operations) | grant, list_security_capabilities, revoke, user_permission |
| procedures | list_locks, list_procedures |
| visibility labels | add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility |
| rsgroup commands | add_rsgroup, balance_rsgroup, get_rsgroup, get_server_rsgroup, get_table_rsgroup, list_rsgroups, move_namespaces_rsgroup, move_servers_namespaces_rsgroup, move_servers_rsgroup, move_servers_tables_rsgroup, move_tables_rsgroup, remove_rsgroup, remove_servers_rsgroup |
| quotas (operations commands; in production, quotas are usually set per tenant so that a single user cannot take up a large share of resources) | list_quota_snapshots, list_quota_table_sizes, list_quotas, list_snapshot_sizes, set_quota |
| Configuration update commands (operations) | update_all_config, update_config |
| snapshots | clone_snapshot, delete_all_snapshot, delete_snapshot, delete_table_snapshots, list_snapshots, list_table_snapshots, restore_snapshot, snapshot |
| replication | add_peer, append_peer_namespaces, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, remove_peer_namespaces, remove_peer_tableCFs, set_peer_bandwidth, set_peer_exclude_namespaces, set_peer_exclude_tableCFs, set_peer_namespaces, set_peer_replicate_all, set_peer_serial, set_peer_tableCFs, show_peer_tableCFs, update_peer_config |
3. HBase Shell commands with worked examples
3.0 The indispensable help command
3.0.1 List all HBase shell commands, grouped by category
hbase(main):019:0> help
HBase Shell, version 2.1.0-cdh6.1.0, rUnknown, Thu Dec 6 16:59:59 PST 2018
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
All HBase commands are listed here, grouped by category.
COMMAND GROUPS:
Group name: general // general commands
Commands: processlist, status, table_help, version, whoami
.............................
3.0.2 View the detailed usage of a specific command
hbase(main):020:0> help 'create'
Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily
including NAME attribute.
Examples:
Create a table with namespace=ns1 and table qualifier=t1
hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}
Create a table with namespace=default and table qualifier=t1
hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
hbase> # The above in shorthand would be the following:
hbase> create 't1', 'f1', 'f2', 'f3'
hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 1000000, MOB_COMPACT_PARTITION_POLICY => 'weekly'}
Table configuration options can be put at the end.
Examples:
hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
hbase> # Optionally pre-split the table into NUMREGIONS, using
hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
hbase> create 't1', {NAME => 'f1', DFS_REPLICATION => 1}
You can also keep around a reference to the created table:
hbase> t1 = create 't1', 'f1'
Which gives you a reference to the table named 't1', on which you can then
call methods.
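3.1 Namespace overview: create, alter, drop
A namespace is a logical grouping of tables, analogous to a database in a relational database system. This abstraction lays the groundwork for multi-tenancy features. Put simply, a namespace is HBase's equivalent of a database: it isolates users and supports quota and security management such as the following.
- Quota management (HBASE-8410): restrict the amount of resources (regions, tables) a namespace can consume.
- Namespace security administration (HBASE-9206): provide another level of security administration for tenants.
- Region server groups (HBASE-6721): a namespace or table can be pinned to a subset of RegionServers, guaranteeing a coarse level of isolation.
Note: when an HBase cluster is created, two special namespaces are predefined:
- hbase: the system namespace, used to contain HBase internal tables
- default: tables with no explicitly specified namespace automatically fall into this namespace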
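The rule that an unqualified table name falls into the default namespace can be sketched in plain Ruby. split_table_name is a hypothetical helper written for this illustration; it models the naming convention and is not HBase's own parser.

```ruby
# Hypothetical helper: split a shell-style table name into
# [namespace, qualifier]. Names without a 'ns:' prefix go to 'default'.
def split_table_name(name)
  namespace, qualifier = name.split(':', 2)
  qualifier.nil? ? ['default', namespace] : [namespace, qualifier]
end

p split_table_name('myns_test:t1')   # ["myns_test", "t1"]
p split_table_name('t1')             # ["default", "t1"]
p split_table_name('hbase:meta')     # ["hbase", "meta"]
```

This is why 't1' and 'myns_test:t1' in the examples that follow refer to two different tables.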
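Key takeaway: in production, HBase is rarely operated through the HBase shell; the shell is mostly used for learning, testing, and troubleshooting. Working with HBase comes down to writing data and then querying it: writes go through the API, bulk load, and similar mechanisms, and queries go through API calls.
3.1.1 Namespace operations
1. Create a namespace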
hbase(main):008:0> create_namespace 'myns_test'
2. Create a table in a specified namespace
hbase(main):009:0> create 'myns_test:t1','cl1'
Created table myns_test:t1
=> Hbase::Table - myns_test:t1
3. Drop a namespace. All tables in it must be dropped first, otherwise the command fails, just as with a database.
hbase(main):010:0> drop_namespace 'myns_test'
ERROR: org.apache.hadoop.hbase.constraint.ConstraintException: Only empty namespaces can be removed. Namespace myns_test has 1 tables
at org.apache.hadoop.hbase.master.procedure.DeleteNamespaceProcedure.prepareDelete(DeleteNamespaceProcedure.java:217)
...............
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:2140)
For usage try 'help "drop_namespace"'
Took 0.7591 seconds
5. Alter the properties of a namespace
hbase(main):011:0> alter_namespace 'myns_test',{METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
Took 0.5489 seconds
7. List all namespaces, including the predefined default and hbase namespaces. A table created without a namespace goes into default.
hbase(main):014:0> create_namespace 'myns_test1'
Took 0.2504 seconds
hbase(main):015:0> list_namespace
NAMESPACE
default
hbase
myns_test
myns_test1
4 row(s)
hbase(main):016:0> drop_namespace 'myns_test1' # drop an empty namespace
Took 0.2248 seconds
hbase(main):017:0> list_namespace
NAMESPACE
default
hbase
myns_test
3 row(s)
Took 0.0113 seconds
8. View the details of a namespace
hbase(main):018:0> describe_namespace 'myns_test'
DESCRIPTION
{NAME => 'myns_test', PROPERTY_NAME => 'PROPERTY_VALUE'}
Took 0.0058 seconds
=> 1
9. List all tables in a namespace
hbase(main):023:0> list_namespace_tables 'myns_test'
TABLE
t1
1 row(s)
Took 0.0237 seconds
=> ["t1"]
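Key takeaway: a basic understanding of namespaces is enough. They are rarely touched in day-to-day production work and are usually created by the operations team; developers mostly work at the table level.
3.2 HBase table CRUD
To create a table you must pass at least two values: the table name and one column family name. All other table options are optional. They are constraints on the table (in practice, on its column families), such as compression, timestamps, and version count, and are added according to production requirements. Attributes can be set individually; any attribute that is not specified takes its default value.
3.2.1 Listing tables and viewing table structure
1. List all tables in a namespace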
hbase(main):023:0> list_namespace_tables 'myns_test'
TABLE
t1
1 row(s)
Took 0.0237 seconds
=> ["t1"]
2. List all tables across all namespaces
hbase(main):024:0> list
TABLE
myns_test:t1
test
2 row(s)
Took 0.0041 seconds
=> ["myns_test:t1", "test"]
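list also accepts a regular expression, as in list 'abc.*' or list 'ns:.*'. The sketch below models that filtering in plain Ruby under the assumption that the pattern has to match the whole table name; list_tables is a hypothetical helper, and the table names are the ones used in this post.

```ruby
# Hypothetical helper: return the table names that match the pattern.
# Assumption: the pattern must match the entire name, not a substring.
def list_tables(all_tables, pattern = '.*')
  regex = Regexp.new("\\A(?:#{pattern})\\z")
  all_tables.select { |name| regex.match?(name) }
end

tables = ['myns_test:t1', 'test', 'tb1', 'tb2']

p list_tables(tables)                   # ["myns_test:t1", "test", "tb1", "tb2"]
p list_tables(tables, 'tb.*')           # ["tb1", "tb2"]
p list_tables(tables, 'myns_test:.*')   # ["myns_test:t1"]
p list_tables(tables, 't')              # []
```

The same kind of pattern drives disable_all, enable_all, and drop_all, so it is worth checking a pattern with list before handing it to a destructive command.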
3. View the table structure. describe shows the table's structure, default values, and parameters, similar to show create table.
hbase(main):027:0> describe 't1'
Table t1 is ENABLED
t1
COLUMN FAMILIES DESCRIPTION
{NAME => 'f1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS
=> 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', RE
PLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_
WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '6553
6'}
{NAME => 'f2', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS
=> 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', RE
PLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_
WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '6553
6'}
{NAME => 'f3', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS
=> 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', RE
PLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_
WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '6553
6'}
3 row(s)
Took 0.1012 seconds
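3.2.2 Common ways of creating tables
1. Basic create statement: a table name plus a column family name. Below, table tb1 is created in the default namespace with column family cf, and table tb2 is created with three column families. Compare the structure of the two tables.
1. Create table tb1 with one column family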
hbase(main):028:0> create 'tb1','cf'
Created table tb1
Took 0.7221 seconds
=> Hbase::Table - tb1
hbase(main):029:0> describe 'tb1'
Table tb1 is ENABLED
tb1
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS
=> 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', RE
PLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_
WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '6553
6'}
2. Create table tb2 with three column families
hbase(main):036:0> create 'tb2','cf1','cf2','cf3'
Created table tb2
hbase(main):038:0> describe 'tb2'
Table tb2 is ENABLED
tb2
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELL
S => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', R
EPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON
_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '655
36'}
{NAME => 'cf2', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELL
S => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', R
EPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON
_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '655
36'}
{NAME => 'cf3', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELL
S => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', R
EPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON
_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '655
36'}
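As shown above, after creating the tables with the minimal syntax, describe reveals that tb1 has the following default attributes.
HBase table and column family attributes:
- NAME => 'cf': the column family name.
- VERSIONS => '1': the number of versions. By default one version of the data is kept and older ones are removed. This is a commonly used production setting and can be raised.
- EVICT_BLOCKS_ON_CLOSE => 'false': whether cached blocks are evicted from the block cache when a file is closed.
- NEW_VERSION_BEHAVIOR => 'false': optional new version behavior, an HBase 2 feature related to deletes; covered later.
- KEEP_DELETED_CELLS => 'FALSE': whether the column family keeps deleted cells. When true, deleted cells can still be retrieved; by default (FALSE) cells are not kept once deleted. See the Apache HBase™ Reference Guide.
- CACHE_DATA_ON_WRITE => 'false': whether data blocks are cached on write.
- DATA_BLOCK_ENCODING => 'NONE': the encoding used for data blocks. The encodings commonly cited for HBase are PREFIX, DIFF, FAST_DIFF, and PREFIX_TREE; note that PREFIX_TREE was removed in HBase 2.0.
- TTL => 'FOREVER': time to live. A column family can set a TTL length in seconds, and HBase automatically deletes rows once the expiration time is reached. This applies to all versions of a row, even the current one. The TTL time encoded in HBase for the row is specified in UTC. This setting is very common: production tables usually set a TTL, comparable to a data lifecycle in a data warehouse (one month, for example), otherwise the data keeps growing. See the Apache HBase™ Reference Guide.
- MIN_VERSIONS => '0': the minimum number of versions to keep. It only takes effect when a TTL is set on the column family.
- REPLICATION_SCOPE => '0': a column-family-level attribute whose value is 0 or 1. 0 disables replication and 1 enables it; the default is 0. For HBase replication, see the following article (covered in detail later): https://clouderatemp.wpengine.com/blog/2012/07/hbase-replication-overview-2/ Apache HBase Replication: Operational Overview - Cloudera Blog
- BLOOMFILTER => 'ROW': the Bloom filter level; the default is row level.
- CACHE_INDEX_ON_WRITE => 'false': whether index blocks are cached on write.
- IN_MEMORY => 'false': whether the column family is given higher priority in the block cache. The default is false. When set to true, HBase caches the family's blocks more aggressively and tries to keep them in memory; the data is still persisted to disk as usual.
- CACHE_BLOOMS_ON_WRITE => 'false': whether Bloom filter blocks are cached on write.
- PREFETCH_BLOCKS_ON_OPEN => 'false': whether blocks are prefetched when a file is opened; the default is false.
- COMPRESSION => 'NONE': whether data is compressed and with which algorithm, such as snappy. It is configured per column family, so different families of one table can use different compression algorithms.
- BLOCKCACHE => 'true': whether the block cache is enabled; enabled by default. Covered later.
- BLOCKSIZE => '65536': the HFile data block size (64 KB by default). It is usually left unchanged; when it is changed, that is normally done uniformly at the cluster level rather than for a single table or column family.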
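Since TTL is given in seconds, a value such as the 2592000 used in the create examples in this post is easier to read as a number of days. The sketch below shows the arithmetic in plain Ruby. ttl_seconds and expired? are hypothetical helpers, and expired? is a simplified model of expiry (cell age compared with the TTL), not HBase's internal logic.

```ruby
# Convert a retention period in days to the seconds value that TTL expects.
def ttl_seconds(days)
  days * 24 * 60 * 60
end

# Simplified model (an assumption): a cell is expired once its age exceeds
# the TTL. Timestamps are in milliseconds, as shown by the shell's get.
def expired?(cell_timestamp_ms, now_ms, ttl_s)
  now_ms - cell_timestamp_ms > ttl_s * 1000
end

ttl = ttl_seconds(30)
puts ttl                                                   # 2592000

written_at = 1_631_876_854_951   # sample millisecond timestamp
day_ms = 24 * 60 * 60 * 1000
puts expired?(written_at, written_at + 29 * day_ms, ttl)   # false
puts expired?(written_at, written_at + 31 * day_ms, ttl)   # true
```

So TTL => 2592000 corresponds to a 30-day lifecycle.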
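Key takeaway: the three-family table tb2 above shows that all of these attributes are set per column family. You can set them on individual column families or rely on the defaults.
2. Creating a table with explicit column family attributes
1. Create a table using the NAME attribute to specify the column family names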
hbase(main):040:0> create 't4', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
hbase(main):040:0> create 't4', 'f1','f2','f3'
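Note that these two forms are equivalent and produce the same table structure. Run only one of them, since both use the table name t4.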
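Because the HBase shell is JRuby, the column family arguments to create are ordinary Ruby strings and hashes. The plain-Ruby sketch below shows why the short form and the hash form can be treated as the same thing. normalize_family and default_family_attributes are hypothetical helpers, string keys such as 'NAME' stand in for the shell's NAME constant, and the default values are copied from the describe output earlier in this post.

```ruby
# Hypothetical defaults, taken from the describe output shown earlier.
def default_family_attributes
  { 'VERSIONS' => '1', 'TTL' => 'FOREVER', 'COMPRESSION' => 'NONE', 'BLOCKCACHE' => 'true' }
end

# Hypothetical helper: accept either 'f1' or { 'NAME' => 'f1', ... } and
# return a full attribute hash. This models the idea; it is not HBase code.
def normalize_family(spec)
  spec = { 'NAME' => spec } if spec.is_a?(String)
  raise ArgumentError, 'a column family needs a NAME' unless spec.key?('NAME')
  default_family_attributes.merge(spec)
end

short = %w[f1 f2 f3].map { |family| normalize_family(family) }
long  = [{ 'NAME' => 'f1' }, { 'NAME' => 'f2' }, { 'NAME' => 'f3' }].map { |family| normalize_family(family) }
puts short == long   # true

# Explicit attributes override the defaults for that family only.
p normalize_family({ 'NAME' => 'f1', 'VERSIONS' => '5' })['VERSIONS']   # "5"
```

The hash form is the one to reach for as soon as a family needs a non-default attribute.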
2. Other ways of creating a table with column family attributes
hbase> create 't1', 'f1', 'f2', 'f3'
hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 1000000, MOB_COMPACT_PARTITION_POLICY => 'weekly'}
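Column families have many more attributes than the defaults shown above, and many of them can be set when the table is created, for example pre-splitting the table into regions. See the Apache HBase™ Reference Guide on the HBase website for details.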
hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
hbase(main):047:0> describe 'tb12'
Table tb12 is ENABLED
tb12
COLUMN FAMILIES DESCRIPTION
{NAME => 'f1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS
=> 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', RE
PLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_
WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '6553
6'}
1 row(s)
Took 0.0152 seconds
hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
hbase> # Optionally pre-split the table into NUMREGIONS, using
hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
hbase> create 't1', {NAME => 'f1', DFS_REPLICATION => 1}
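The SPLITS option in the examples above pre-splits a table at the given row keys, and four split points yield five regions. The plain-Ruby sketch below shows how split points translate into key ranges and which range a row key falls into. region_ranges and locate are hypothetical helpers; an empty string marks an open start or end, and keys are compared as strings, which is an assumption that mirrors byte-wise ordering for ASCII keys.

```ruby
# Hypothetical helper: turn split points into [start_key, end_key) ranges.
# An empty string stands for "no lower bound" or "no upper bound".
def region_ranges(split_keys)
  boundaries = [''] + split_keys + ['']
  boundaries.each_cons(2).to_a
end

# Hypothetical helper: find the range that contains the row key.
def locate(split_keys, row_key)
  region_ranges(split_keys).find do |start_key, end_key|
    row_key >= start_key && (end_key.empty? || row_key < end_key)
  end
end

splits = %w[10 20 30 40]
p region_ranges(splits).length   # 5
p region_ranges(splits)          # [["", "10"], ["10", "20"], ["20", "30"], ["30", "40"], ["40", ""]]
p locate(splits, '25')           # ["20", "30"]
p locate(splits, '5')            # ["40", ""]
```

The last call shows a common surprise: '5' sorts after '40' because keys are ordered character by character rather than numerically, so numeric row keys are usually zero-padded to a fixed width.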


