IoTDB
IoTDB sink connector
Description
Used to write data to IoTDB.
There is a conflict of thrift version between IoTDB and Spark.Therefore, you need to execute rm -f $SPARK_HOME/jars/libthrift*
and cp $IOTDB_HOME/lib/libthrift* $SPARK_HOME/jars/
to resolve it.
Key features
IoTDB supports the exactly-once
feature through idempotent writing. If two pieces of data have
the same key
and timestamp
, the new data will overwrite the old one.
Options
name | type | required | default value |
---|---|---|---|
node_urls | list | yes | - |
username | string | yes | - |
password | string | yes | - |
key_device | string | yes | - |
key_timestamp | string | no | processing time |
key_measurement_fields | array | no | exclude device & timestamp |
storage_group | string | no | - |
batch_size | int | no | 1024 |
batch_interval_ms | int | no | - |
max_retries | int | no | - |
retry_backoff_multiplier_ms | int | no | - |
max_retry_backoff_ms | int | no | - |
default_thrift_buffer_size | int | no | - |
max_thrift_frame_size | int | no | - |
zone_id | string | no | - |
enable_rpc_compression | boolean | no | - |
connection_timeout_in_ms | int | no | - |
common-options | no | - |
node_urls [list]
IoTDB
cluster address, the format is ["host:port", ...]
username [string]
IoTDB
user username
password [string]
IoTDB
user password
key_device [string]
Specify field name of the IoTDB
deviceId in SeaTunnelRow
key_timestamp [string]
Specify field-name of the IoTDB
timestamp in SeaTunnelRow. If not specified, use processing-time as timestamp
key_measurement_fields [array]
Specify field-name of the IoTDB
measurement list in SeaTunnelRow. If not specified, include all fields but exclude device
& timestamp
storage_group [string]
Specify device storage group(path prefix)
example: deviceId = ${storage_group} + "." + ${key_device}
batch_size [int]
For batch writing, when the number of buffers reaches the number of batch_size
or the time reaches batch_interval_ms
, the data will be flushed into the IoTDB
batch_interval_ms [int]
For batch writing, when the number of buffers reaches the number of batch_size
or the time reaches batch_interval_ms
, the data will be flushed into the IoTDB
max_retries [int]
The number of retries to flush failed
retry_backoff_multiplier_ms [int]
Using as a multiplier for generating the next delay for backoff
max_retry_backoff_ms [int]
The amount of time to wait before attempting to retry a request to IoTDB
default_thrift_buffer_size [int]
Thrift init buffer size in IoTDB
client
max_thrift_frame_size [int]
Thrift max frame size in IoTDB
client
zone_id [string]
java.time.ZoneId in IoTDB
client
enable_rpc_compression [boolean]
Enable rpc compression in IoTDB
client
connection_timeout_in_ms [int]
The maximum time (in ms) to wait when connecting to IoTDB
common options
Sink plugin common parameters, please refer to Sink Common Options for details
Examples
Case1
Common options:
sink {
IoTDB {
node_urls = ["localhost:6667"]
username = "root"
password = "root"
batch_size = 1024
batch_interval_ms = 1000
}
}
When you assign key_device
is device_name
, for example:
sink {
IoTDB {
...
key_device = "device_name"
}
}
Upstream SeaTunnelRow data format is the following:
device_name | field_1 | field_2 |
---|---|---|
root.test_group.device_a | 1001 | 1002 |
root.test_group.device_b | 2001 | 2002 |
root.test_group.device_c | 3001 | 3002 |
Output to IoTDB
data format is the following:
IoTDB> SELECT * FROM root.test_group.* align by device;
+------------------------+------------------------+-----------+----------+
| Time| Device| field_1| field_2|
+------------------------+------------------------+----------+-----------+
|2022-09-26T17:50:01.201Z|root.test_group.device_a| 1001| 1002|
|2022-09-26T17:50:01.202Z|root.test_group.device_b| 2001| 2002|
|2022-09-26T17:50:01.203Z|root.test_group.device_c| 3001| 3002|
+------------------------+------------------------+----------+-----------+
Case2
When you assign key_device
、key_timestamp
、key_measurement_fields
, for example:
sink {
IoTDB {
...
key_device = "device_name"
key_timestamp = "ts"
key_measurement_fields = ["temperature", "moisture"]
}
}
Upstream SeaTunnelRow data format is the following:
ts | device_name | field_1 | field_2 | temperature | moisture |
---|---|---|---|---|---|
1664035200001 | root.test_group.device_a | 1001 | 1002 | 36.1 | 100 |
1664035200001 | root.test_group.device_b | 2001 | 2002 | 36.2 | 101 |
1664035200001 | root.test_group.device_c | 3001 | 3002 | 36.3 | 102 |
Output to IoTDB
data format is the following:
IoTDB> SELECT * FROM root.test_group.* align by device;
+------------------------+------------------------+--------------+-----------+
| Time| Device| temperature| moisture|
+------------------------+------------------------+--------------+-----------+
|2022-09-25T00:00:00.001Z|root.test_group.device_a| 36.1| 100|
|2022-09-25T00:00:00.001Z|root.test_group.device_b| 36.2| 101|
|2022-09-25T00:00:00.001Z|root.test_group.device_c| 36.3| 102|
+------------------------+------------------------+--------------+-----------+
Changelog
2.2.0-beta 2022-09-26
- Add IoTDB Sink Connector