Hbase
Hbase sink connector
Description
Writes data to Hbase.
Key features
Options
Name | Type | Required | Default value |
---|---|---|---|
zookeeper_quorum | string | yes | - |
table | string | yes | - |
rowkey_column | list | yes | - |
family_name | config | yes | - |
rowkey_delimiter | string | no | "" |
version_column | string | no | - |
null_mode | string | no | skip |
wal_write | boolean | no | false |
write_buffer_size | int | no | 8 * 1024 * 1024 |
encoding | string | no | utf8 |
hbase_extra_config | config | no | - |
ttl | long | no | - |
common-options | | no | - |
zookeeper_quorum [string]
Zookeeper cluster hosts of Hbase, for example: "hadoop001:2181,hadoop002:2181,hadoop003:2181"
table [string]
Name of the table to write to, for example: "seatunnel"
rowkey_column [list]
List of column names that make up the row key, for example: ["id", "uuid"]
family_name [config]
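Mapping from fields to column family names. For example, suppose an upstream row looks like this:
id | name | age |
---|---|---|
1 | tyrantlucifer | 27 |
To use id as the row key and write the other fields to different column families, configure:
family_name {
  name = "info1"
  age = "info2"
}
This means name is written to column family info1 and age is written to column family info2.
To write all the other fields to the same column family, configure:
family_name {
  all_columns = "info"
}
This means every field is written to column family info.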
rowkey_delimiter [string]
Delimiter used to join the row key columns when the row key is built from more than one column, default ""
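As an illustration of a composite row key, the fragment below is a minimal sketch that combines rowkey_column and rowkey_delimiter. The table name "user_events" and the upstream columns id and uuid are hypothetical; only the option names come from this page.

```hocon
Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  # Hypothetical table name, used only for illustration
  table = "user_events"
  # Assumes the upstream rows carry columns named id and uuid
  rowkey_column = ["id", "uuid"]
  # The two column values are joined with this delimiter to form the row key
  rowkey_delimiter = "_"
  family_name {
    all_columns = "info"
  }
}
```

Going by the option descriptions, a row with id 1 and uuid abc would presumably get the row key 1_abc; with the default empty delimiter the values would be joined with nothing between them.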
version_column [string]
Name of the version column; its value is used as the timestamp of the Hbase record
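A minimal sketch of version_column, assuming the upstream schema has a BIGINT column named time that holds an epoch timestamp in milliseconds (as in the FakeSource rows used elsewhere on this page). The table name is hypothetical.

```hocon
Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  # Hypothetical table name, used only for illustration
  table = "versioned_table"
  rowkey_column = ["name"]
  # Assumption: "time" is an upstream BIGINT column with epoch milliseconds
  version_column = "time"
  family_name {
    all_columns = "info"
  }
}
```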
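null_mode [string]
Mode for writing null values, supports [skip, empty], default skip
- skip: when a field is null, the connector does not write the field to Hbase
- empty: when a field is null, the connector writes an empty value for the field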
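The two modes above can be selected as in this minimal sketch, which switches from the default skip to empty. The table name is hypothetical.

```hocon
Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  # Hypothetical table name, used only for illustration
  table = "nullable_table"
  rowkey_column = ["id"]
  # Write an empty value for null fields instead of omitting them (default: skip)
  null_mode = "empty"
  family_name {
    all_columns = "info"
  }
}
```

Choosing empty is useful when downstream readers expect every column to be present in each row; skip keeps rows smaller.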
wal_write [boolean]
Flag that controls whether writes go to the WAL (write-ahead log), default false
write_buffer_size [int]
Write buffer size of the Hbase client in bytes, default 8 * 1024 * 1024 (8 MiB)
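A minimal sketch that enables WAL writes and doubles the write buffer. The value is written as a plain integer because this page does not say whether an expression such as 16 * 1024 * 1024 is accepted in the config; the table name is hypothetical.

```hocon
Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  # Hypothetical table name, used only for illustration
  table = "durable_table"
  rowkey_column = ["id"]
  # Enable the WAL write flag (default: false)
  wal_write = true
  # 16 * 1024 * 1024 = 16777216, twice the default buffer size
  write_buffer_size = 16777216
  family_name {
    all_columns = "info"
  }
}
```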
encoding [string]
Encoding of string fields, supports [utf8, gbk], default utf8
hbase_extra_config [config]
Extra Hbase configuration
ttl [long]
TTL of the data written to Hbase, in milliseconds; defaults to the TTL configured on the table
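Since ttl is given in milliseconds, a retention of 24 hours is 24 * 60 * 60 * 1000 = 86400000. The fragment below is a minimal sketch under that reading; the table name is hypothetical.

```hocon
Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  # Hypothetical table name, used only for illustration
  table = "expiring_table"
  rowkey_column = ["id"]
  # 24 hours expressed in milliseconds
  ttl = 86400000
  family_name {
    all_columns = "info"
  }
}
```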
Common options
Common parameters of sink plugins; see Sink Common Options for details
Example
Hbase {
  zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
  table = "seatunnel_test"
  rowkey_column = ["name"]
  family_name {
    all_columns = seatunnel
  }
}
Writing to multiple tables
env {
  # You can set engine configuration here
  execution.parallelism = 1
  job.mode = "BATCH"
}
source {
  FakeSource {
    tables_configs = [
      {
        schema = {
          table = "hbase_sink_1"
          fields {
            name = STRING
            c_string = STRING
            c_double = DOUBLE
            c_bigint = BIGINT
            c_float = FLOAT
            c_int = INT
            c_smallint = SMALLINT
            c_boolean = BOOLEAN
            time = BIGINT
          }
        }
        rows = [
          {
            kind = INSERT
            fields = ["label_1", "sink_1", 4.3, 200, 2.5, 2, 5, true, 1627529632356]
          }
        ]
      },
      {
        schema = {
          table = "hbase_sink_2"
          fields {
            name = STRING
            c_string = STRING
            c_double = DOUBLE
            c_bigint = BIGINT
            c_float = FLOAT
            c_int = INT
            c_smallint = SMALLINT
            c_boolean = BOOLEAN
            time = BIGINT
          }
        }
        rows = [
          {
            kind = INSERT
            fields = ["label_2", "sink_2", 4.3, 200, 2.5, 2, 5, true, 1627529632357]
          }
        ]
      }
    ]
  }
}
sink {
  Hbase {
    zookeeper_quorum = "hadoop001:2181,hadoop002:2181,hadoop003:2181"
    table = "${table_name}"
    rowkey_column = ["name"]
    family_name {
      all_columns = info
    }
  }
}
Writing to specified column families
Hbase {
  zookeeper_quorum = "hbase_e2e:2181"
  table = "assign_cf_table"
  rowkey_column = ["id"]
  family_name {
    c_double = "cf1"
    c_bigint = "cf2"
  }
}
Change Log
Change | Commit | Version |
---|---|---|
[Improve] hbase options (#8923) | https://github.com/apache/seatunnel/commit/b6a702b58 | 2.3.10 |
[Improve] restruct connector common options (#8634) | https://github.com/apache/seatunnel/commit/f3499a6ee | 2.3.10 |
[Improve][dist]add shade check rule (#8136) | https://github.com/apache/seatunnel/commit/51ef80001 | 2.3.9 |
[Feature][Restapi] Allow metrics information to be associated to logical plan nodes (#7786) | https://github.com/apache/seatunnel/commit/6b7c53d03 | 2.3.9 |
[Fix][Connector-V2] Fix known directory create and delete ignore issues (#7700) | https://github.com/apache/seatunnel/commit/e2fb67957 | 2.3.8 |
[Feature][Connector-V2][Hbase] implement hbase catalog (#7516) | https://github.com/apache/seatunnel/commit/b978792cb | 2.3.8 |
[Feature][Connector-V2] Support multi-table sink feature for HBase (#7169) | https://github.com/apache/seatunnel/commit/025fa3bb8 | 2.3.8 |
[hotfix][connector-v2-hbase]fix and optimize hbase source problem (#7148) | https://github.com/apache/seatunnel/commit/34a6b8e9f | 2.3.7 |
[Improve][hbase] The specified column is written to the specified column family (#5234) | https://github.com/apache/seatunnel/commit/49d397c61 | 2.3.6 |
[feature][connector-v2-hbase-sink] Support Connector v2 HBase sink TTL data writing (#7116) | https://github.com/apache/seatunnel/commit/adafd8025 | 2.3.6 |
[E2E][HBase]Refactor hbase e2e (#6859) | https://github.com/apache/seatunnel/commit/1da9bd6ce | 2.3.6 |
[Connector]Add hbase source connector (#6348) | https://github.com/apache/seatunnel/commit/f108a5e65 | 2.3.6 |
[Feature][HbaseSink]support array data. (#6100) | https://github.com/apache/seatunnel/commit/b59201476 | 2.3.4 |
[Improve][Common] Introduce new error define rule (#5793) | https://github.com/apache/seatunnel/commit/9d1b2582b | 2.3.4 |
[Improve] Remove use SeaTunnelSink::getConsumedType method and mark it as deprecated (#5755) | https://github.com/apache/seatunnel/commit/8de740810 | 2.3.4 |
[Hotfix][Connector-v2][HbaseSink]Fix default timestamp (#4958) | https://github.com/apache/seatunnel/commit/3d8f3bf90 | 2.3.3 |
[Improve][build] Give the maven module a human readable name (#4114) | https://github.com/apache/seatunnel/commit/d7cd60105 | 2.3.1 |
[Improve][Project] Code format with spotless plugin. (#4101) | https://github.com/apache/seatunnel/commit/a2ab16656 | 2.3.1 |
[Feature][Connector-V2][Hbase] Introduce hbase sink connector (#4049) | https://github.com/apache/seatunnel/commit/68bda94a4 | 2.3.1 |