Hive
Hive source connector
Description
Read data from Hive.
To use this connector, you must ensure that your Spark/Flink cluster is already integrated with Hive. The tested Hive version is 2.3.9.
Tips: The Hive Sink Connector cannot currently add partition fields to the output data.
Key features
All the data in a split is read in a single pollNext call. The splits that have been read are saved in the snapshot.
- schema projection
- parallelism
- support user-defined split
- file format
  - text
  - csv
  - parquet
  - orc
  - json
Options
name | type | required | default value |
---|---|---|---|
table_name | string | yes | - |
metastore_uri | string | yes | - |
hdfs_site_path | string | no | - |
schema | config | no | - |
common-options | | no | - |
table_name [string]
Target Hive table name, e.g. db1.table1
metastore_uri [string]
Hive metastore URI
hdfs_site_path [string]
The path of hdfs-site.xml, used to load the HA configuration of the NameNodes
schema [Config]
fields [Config]
The schema fields of the upstream data
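As a rough illustration of how `fields` nests inside `schema`, the fragment below declares two columns. The column names (`id`, `name`) and their types are hypothetical and chosen only for this sketch; check the types supported by your version before using it.

```hocon
schema {
  fields {
    # hypothetical column names and types, not taken from a real table
    id = bigint
    name = string
  }
}
```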
common options
Source plugin common parameters. Please refer to Source Common Options for details.
Example
Hive {
  table_name = "default.seatunnel_orc"
  metastore_uri = "thrift://namenode001:9083"
}
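The sketch below shows how the options above might fit into a complete job file for a cluster with NameNode HA. Only the Hive option names come from this page. The `env` setting, the `hdfs_site_path` value, and the choice of a Console sink are assumptions made for illustration, and the exact `env` keys can differ between versions.

```hocon
env {
  # assumed setting; the key name may vary with the engine and version
  execution.parallelism = 1
}

source {
  Hive {
    table_name = "default.seatunnel_orc"
    metastore_uri = "thrift://namenode001:9083"
    # hypothetical path; point this at your cluster's own hdfs-site.xml
    hdfs_site_path = "/etc/hadoop/conf/hdfs-site.xml"
  }
}

sink {
  # Console is used here only to print the rows that were read
  Console {}
}
```

Printing to the console first is a cheap way to confirm that the metastore URI and table name resolve before wiring in a real sink.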
Changelog
2.2.0-beta 2022-09-26
- Add Hive Source Connector