Version: Next

Clickhouse

Clickhouse source connector

Supported engines

Spark
Flink
SeaTunnel Zeta

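Key features

Supports SQL queries, which can be used to project (select) specific columns.

Description

Reads data from Clickhouse.

Supported data source info

To use the Clickhouse connector, the following dependencies are required. They can be downloaded via install-plugin.sh or from the Maven Central Repository.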

| Data source | Supported versions | Dependency |
|---|---|---|
| Clickhouse | universal | Download |

Data type mapping

| Clickhouse data type | SeaTunnel data type |
|---|---|
| String / Int128 / UInt128 / Int256 / UInt256 / Point / Ring / Polygon / MultiPolygon | STRING |
| Int8 / UInt8 / Int16 / UInt16 / Int32 | INT |
| UInt64 / Int64 / IntervalYear / IntervalQuarter / IntervalMonth / IntervalWeek / IntervalDay / IntervalHour / IntervalMinute / IntervalSecond | BIGINT |
| Float64 | DOUBLE |
| Decimal | DECIMAL |
| Float32 | FLOAT |
| Date | DATE |
| DateTime | TIME |
| Array | ARRAY |
| Map | MAP |

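Source options

| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| host | String | Yes | - | `ClickHouse` cluster address in `host:port` format. Multiple hosts can be configured, for example `"host1:8123,host2:8123"`. |
| username | String | Yes | - | `ClickHouse` user name. |
| password | String | Yes | - | `ClickHouse` user password. |
| table_list | Array | No | - | List of tables to read; multiple tables can be configured. |
| clickhouse.config | Map | No | - | In addition to the required parameters above that must be specified for `clickhouse-jdbc`, users can also specify multiple optional parameters, which cover all the parameters provided by `clickhouse-jdbc`. |
| server_time_zone | String | No | ZoneId.systemDefault() | Session time zone of the database server. If not set, ZoneId.systemDefault() is used as the server time zone. |
| common-options | | No | - | Common source plugin parameters; see Source Common Options for details. |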
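As a rough illustration of how the connection-level options fit together, the fragment below sketches a source block that lists two hosts and passes one option through to `clickhouse-jdbc`. The host names `host1` and `host2` are placeholders, and the credentials and table name are assumptions for illustration only.

```hocon
source {
  Clickhouse {
    # Placeholder host names; multiple hosts are comma-separated
    host = "host1:8123,host2:8123"
    username = "xxx"
    password = "xxx"
    # Hypothetical table used only for this sketch
    table_path = "default.table"
    # If omitted, ZoneId.systemDefault() is used
    server_time_zone = "UTC"
    # Optional parameters handed to clickhouse-jdbc
    clickhouse.config = {
      "socket_timeout": "300000"
    }
  }
}
```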

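Multi-table configuration:

| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| table_path | String | No | - | Full path of the table, for example `default.table`. |
| sql | String | No | - | Query SQL used to retrieve data through the Clickhouse server. |
| filter_query | String | No | - | Data filter condition, in the form "field = value", for example: filter_query = "id > 2 and type = 1" |
| partition_list | Array | No | - | List of partitions used to filter data. For a partitioned table, this option can be set to read only the specified partitions, for example: partition_list = ["20250615", "20250616"] |
| batch_size | int | No | 1024 | Maximum number of rows that can be fetched from Clickhouse in a single read. |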

Note: when this configuration covers only a single table, the options inside table_list can be flattened to the outer level.
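The flattening rule can be sketched with two alternative source blocks that are intended to describe the same single-table read. The table name and filter are illustrative assumptions; use one form or the other, not both in the same job.

```hocon
# Form 1: a single entry inside table_list
source {
  Clickhouse {
    host = "localhost:8123"
    username = "xxx"
    password = "xxx"
    table_list = [
      {
        table_path = "default.table"
        filter_query = "id > 2 and type = 1"
        batch_size = 1024
      }
    ]
  }
}
```

```hocon
# Form 2: the same per-table options flattened to the outer level
source {
  Clickhouse {
    host = "localhost:8123"
    username = "xxx"
    password = "xxx"
    table_path = "default.table"
    filter_query = "id > 2 and type = 1"
    batch_size = 1024
  }
}
```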

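Parallel reading

The Clickhouse source connector supports reading data in parallel.

When only the table_path parameter is specified, the connector reads in parallel based on the table's part files, which it obtains from the system.parts system table.

When only the sql parameter is specified, the connector reads concurrently by executing the query against the local table on each shard of the cluster. If sql references a distributed table, the shard list is obtained from the cluster name of the distributed table engine. If sql references a local table, the nodes configured in the host parameter are treated as the cluster's shard list.

If both table_path and sql are set, the connector runs in sql mode. When specifying sql, it is recommended to also configure table_path so that the table's metadata can be identified more reliably.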
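To make the local-table case more concrete, here is a minimal sketch in which the nodes listed in host would be treated as the shard list. The host names `shard1` and `shard2` and the table `default.table_local` are hypothetical; it assumes each listed node holds that local table.

```hocon
env {
  job.mode = "BATCH"
  parallelism = 2
}

source {
  Clickhouse {
    # Assumption: each node below is one shard holding default.table_local
    host = "shard1:8123,shard2:8123"
    username = "xxx"
    password = "xxx"
    # Configured alongside sql so the table metadata can be identified
    table_path = "default.table_local"
    # Because sql is set, the connector runs in sql mode
    sql = "select * from default.table_local where id > 2"
  }
}
```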

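Tips

When table_path is specified and you do not want to read the whole table, use partition_list or filter_query to restrict the data to specific partitions or conditions.

partition_list: reads only the specified partitions
filter_query: filters the data by the given condition

The batch_size parameter controls how many rows are read per query, which helps avoid OOM exceptions when reading large amounts of data. Increasing this value appropriately can improve read performance.

When reading a single table, it is recommended to use table_path instead of sql.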

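How to create a Clickhouse data synchronization job

Single-table configuration

The following examples show how to create a data synchronization job that reads data from Clickhouse and prints it on the local client.

Case 1: Parallel reading based on the part-file read strategy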

```hocon
env {
  job.mode = "BATCH"
  parallelism = 5
}

source {
  Clickhouse {
    host = "localhost:8123"
    username = "xxx"
    password = "xxx"
    table_path = "default.table"
    server_time_zone = "UTC"
    partition_list = ["20250615", "20250616"]
    filter_query = "id > 2 and type = 1"
    batch_size = 1024
    clickhouse.config = {
      "socket_timeout": "300000"
    }
  }
}

# Console printing of the read Clickhouse data
sink {
  Console {
    parallelism = 1
  }
}
```

Case 2: Parallel reading based on the SQL read strategy

Note: parallel reading in SQL mode currently supports only single-table queries with a where condition.

```hocon
env {
  job.mode = "BATCH"
  parallelism = 5
}

source {
  Clickhouse {
    host = "localhost:8123"
    username = "xxx"
    password = "xxx"
    table_path = "default.table"
    server_time_zone = "UTC"
    sql = "select * from default.table where id > 2 and type = 1"
    batch_size = 1024
    clickhouse.config = {
      "socket_timeout": "300000"
    }
  }
}

# Console printing of the read Clickhouse data
sink {
  Console {
    parallelism = 1
  }
}
```

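Case 3: Single-concurrency reading for complex SQL scenarios

When executing complex SQL queries (for example, queries with join, group by, or subqueries), the connector automatically switches to single-concurrency execution, even if a higher parallelism value is configured.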

```hocon
env {
  job.mode = "BATCH"
  parallelism = 1
}

source {
  Clickhouse {
    host = "localhost:8123"
    username = "xxx"
    password = "xxx"
    server_time_zone = "UTC"
    sql = "select t1.id, t2.category from default.table1 t1 global join default.table2 t2 on t1.id = t2.id where t1.age > 18"
    batch_size = 1024
    clickhouse.config = {
      "socket_timeout": "300000"
    }
  }
}

# Console printing of the read Clickhouse data
sink {
  Console {
    parallelism = 1
  }
}
```

Multi-table configuration

```hocon
env {
  job.mode = "BATCH"
  parallelism = 5
}

source {
  Clickhouse {
    host = "localhost:8123"
    username = "xxx"
    password = "xxx"
    table_list = [
      {
        table_path = "default.table1"
        sql = "select * from default.table1 where id > 2 and type = 1"
      },
      {
        table_path = "default.table2"
        sql = "select * from default.table2 where age > 18"
      }
    ]
    server_time_zone = "UTC"
    clickhouse.config = {
      "socket_timeout": "300000"
    }
  }
}

# Console printing of the read Clickhouse data
sink {
  Console {
    parallelism = 1
  }
}
```

Change Log

| Change | Commit | Version |
|---|---|---|
| [Improve][Connector-Clickhouse] improve ck batch parallel read by using last batch row sorting value approach, instead of limit offset. (#9801) | https://github.com/apache/seatunnel/commit/5e9990afd5 | dev |
| [Feature][Connector-Clickhouse] Support Clickhouse multi table source read (#9704) | https://github.com/apache/seatunnel/commit/6e323743ea | 2.3.12 |
| [Improve][API] Optimize the enumerator API semantics and reduce lock calls at the connector level (#9671) | https://github.com/apache/seatunnel/commit/9212a77140 | 2.3.12 |
| [Fix][Connector-clickhouse] Fix SeaTunnelRow tableId set error (#9585) | https://github.com/apache/seatunnel/commit/01f1caa6fb | 2.3.12 |
| [Improve][connector-clickhouse] Clickhouse support parallelism reading schema (#9446) | https://github.com/apache/seatunnel/commit/3ee0fab3a8 | 2.3.12 |
| [Feature][Connector-V2] Support multi-table sink feature for ClickHouse (#9301) | https://github.com/apache/seatunnel/commit/3524895136 | 2.3.11 |
| [Fix][Connector-V2] Fix the problem that missing options configuration when building ClickHouse Nodes (#9277) | https://github.com/apache/seatunnel/commit/051d19c3a9 | 2.3.11 |
| [Feature][Transform] Support define sink column type (#9114) | https://github.com/apache/seatunnel/commit/ab7119e507 | 2.3.11 |
| [Feature][Checkpoint] Add check script for source/sink state class serialVersionUID missing (#9118) | https://github.com/apache/seatunnel/commit/4f5adeb1c7 | 2.3.11 |
| [Fix][API] Fixed not invoke the `SinkAggregatedCommitter`'s init method (#9070) | https://github.com/apache/seatunnel/commit/df0d11d632 | 2.3.11 |
| [Fix][Clickhouse] Parallelism makes data duplicate (#8916) | https://github.com/apache/seatunnel/commit/45345f2738 | 2.3.10 |
| [Fix][Connector-V2]Fix Descriptions for CUSTOM_SQL in Connector (#8778) | https://github.com/apache/seatunnel/commit/96b610eb7e | 2.3.10 |
| [improve] update clickhouse connector config option (#8755) | https://github.com/apache/seatunnel/commit/b964189b75 | 2.3.10 |
| [Fix][Connector-V2] fix starRocks automatically creates tables with comment (#8568) | https://github.com/apache/seatunnel/commit/c4cb1fc4a3 | 2.3.10 |
| [Fix][Connector-V2] Fixed adding table comments (#8514) | https://github.com/apache/seatunnel/commit/edca75b0d6 | 2.3.10 |
| [hotfix] fix exceptions caused by operator priority in connector-clickhouse when using sharding_key (#8162) | https://github.com/apache/seatunnel/commit/5560e3dab2 | 2.3.9 |
| [Imporve][ClickhouseFile] Directly connect to each shard node to obtain the corresponding path (#8449) | https://github.com/apache/seatunnel/commit/757641bada | 2.3.9 |
| [Feature][ClickhouseFile] Support add publicKey to identity (#8351) | https://github.com/apache/seatunnel/commit/287b8c8219 | 2.3.9 |
| [Improve][ClickhouseFile] Improve rsync log output (#8332) | https://github.com/apache/seatunnel/commit/179223e3c2 | 2.3.9 |
| [Improve][ClickhouseFile] Added attach sql log for better debugging (#8315) | https://github.com/apache/seatunnel/commit/ade428c5fa | 2.3.9 |
| [Chore] delete chinese desc in code (#8306) | https://github.com/apache/seatunnel/commit/a50a8b925f | 2.3.9 |
| [Improve][ClickhouseFile Connector] Unified specifying clickhouse file generation path (#8302) | https://github.com/apache/seatunnel/commit/455f1ed760 | 2.3.9 |
| [Improve][ClickhouseFile] Clickhouse supports option configuration when connecting to shard nodes (#8297) | https://github.com/apache/seatunnel/commit/1ded1b6206 | 2.3.9 |
| [Imporve][ClickhouseFile] Improve clickhousefile generation parameter configuration (#8293) | https://github.com/apache/seatunnel/commit/753e058fee | 2.3.9 |
| [Improve][ClickhouseFile] ClickhouseFile Connector's rsync transmission supports specifying users (#8236) | https://github.com/apache/seatunnel/commit/e012bd0a4f | 2.3.9 |
| [Feature][Clickhouse] Support sink savemode (#8086) | https://github.com/apache/seatunnel/commit/e6f92fd79b | 2.3.9 |
| [Improve][dist]add shade check rule (#8136) | https://github.com/apache/seatunnel/commit/51ef800016 | 2.3.9 |
| [Fix][Connecotr-V2] Fix clickhouse sink does not support composite primary key (#8021) | https://github.com/apache/seatunnel/commit/24d0542595 | 2.3.9 |
| [Improve] update clickhouse connector, use factory to create source/sink (#7946) | https://github.com/apache/seatunnel/commit/b69fceceee | 2.3.9 |
| [Fix][Connector-V2] Fixed clickhouse connectors cannot stop under multiple parallelism (#7921) | https://github.com/apache/seatunnel/commit/8d9c6a3714 | 2.3.9 |
| Bump commons-io:commons-io from 2.11.0 to 2.14.0 in /seatunnel-connectors-v2/connector-clickhouse (#7784) | https://github.com/apache/seatunnel/commit/f4393a02bf | 2.3.9 |
| [Feature][Restapi] Allow metrics information to be associated to logical plan nodes (#7786) | https://github.com/apache/seatunnel/commit/6b7c53d03c | 2.3.9 |
| [Improve] Improve some connectors prepare check error message (#7465) | https://github.com/apache/seatunnel/commit/6930a25edd | 2.3.8 |
| [Improve][Connector-V2] Close all ResultSet after used (#7389) | https://github.com/apache/seatunnel/commit/853e973212 | 2.3.8 |
| [Feature][Connector-V2][Clickhouse] Add clickhouse.config to the source connector (#7143) | https://github.com/apache/seatunnel/commit/f7994d9ae9 | 2.3.6 |
| [Improve] Make ClickhouseFileSinker support tables containing materialized columns (#6956) | https://github.com/apache/seatunnel/commit/87c6adcc2e | 2.3.6 |
| [Improve][Clickhouse] Remove check when set allow_experimental_lightweight_delete false(#6727) (#6728) | https://github.com/apache/seatunnel/commit/b25e1b1ae5 | 2.3.6 |
| [Improve][Common] Adapt `FILE_OPERATION_FAILED` to `CommonError` (#5928) | https://github.com/apache/seatunnel/commit/b3dc0bbc21 | 2.3.4 |
| [Improve][Connector-V2] Replace CommonErrorCodeDeprecated.JSON_OPERATION_FAILED (#5978) | https://github.com/apache/seatunnel/commit/456cd17714 | 2.3.4 |
| [Feature][Core] Upgrade flink source translation (#5100) | https://github.com/apache/seatunnel/commit/5aabb14a94 | 2.3.4 |
| [Improve] Speed up ClickhouseFile Local generate a mmap object (#5822) | https://github.com/apache/seatunnel/commit/cf39e29dad | 2.3.4 |
| [Improve][Common] Introduce new error define rule (#5793) | https://github.com/apache/seatunnel/commit/9d1b2582b2 | 2.3.4 |
| [Improve] Remove use `SeaTunnelSink::getConsumedType` method and mark it as deprecated (#5755) | https://github.com/apache/seatunnel/commit/8de7408100 | 2.3.4 |
| [Hotfix][connector-v2][clickhouse] Fixed an out-of-order BUG with output data fields of clickhouse-sink (#5346) | https://github.com/apache/seatunnel/commit/fce9ddaa2b | 2.3.4 |
| [Bugfix][Clickhouse] Fix clickhouse sink flush bug (#5448) | https://github.com/apache/seatunnel/commit/cef03f6673 | 2.3.4 |
| [Hotfix][Clickhouse] Fix clickhouse old version compatibility (#5326) | https://github.com/apache/seatunnel/commit/1da49f5a2b | 2.3.4 |
| [Improve][CheckStyle] Remove useless 'SuppressWarnings' annotation of checkstyle. (#5260) | https://github.com/apache/seatunnel/commit/51c0d709ba | 2.3.4 |
| [Hotfix] Fix com.google.common.base.Preconditions to seatunnel shade one (#5284) | https://github.com/apache/seatunnel/commit/ed5eadcf73 | 2.3.3 |
| [Feature][Connector-V2][Clickhouse] Add clickhouse connector time zone key,default system time zone (#5078) | https://github.com/apache/seatunnel/commit/309b58d12d | 2.3.3 |
| [Bugfix]fix clickhouse source connector read Nullable() type is not null,example:Nullable(Float64) while value is null the result is 0.0 (#5080) | https://github.com/apache/seatunnel/commit/cf3d0bba2e | 2.3.3 |
| [Feature][Connector-V2][Clickhouse] clickhouse writes with checkpoints (#4999) | https://github.com/apache/seatunnel/commit/f8fefa1e57 | 2.3.3 |
| [Hotfix][Connector-V2][ClickhouseFile] Fix ClickhouseFile write file failed when field value is null (#4937) | https://github.com/apache/seatunnel/commit/06671474ca | 2.3.3 |
| [Hotfix][connector-clickhouse] fix get clickhouse local table name with closing bracket from distributed table engineFull (#4710) | https://github.com/apache/seatunnel/commit/e5e0cba26d | 2.3.2 |
| [Bug][Connector-V2] Clickhouse File Connector failed to sink to table with settings like storage_policy (#4172) | https://github.com/apache/seatunnel/commit/e120dc44bc | 2.3.1 |
| [Improve][build] Give the maven module a human readable name (#4114) | https://github.com/apache/seatunnel/commit/d7cd601051 | 2.3.1 |
| [Improve][Project] Code format with spotless plugin. (#4101) | https://github.com/apache/seatunnel/commit/a2ab166561 | 2.3.1 |
| [Bug][Connector-V2] Clickhouse File Connector not support split mode for write data to all shards of distributed table (#4035) | https://github.com/apache/seatunnel/commit/3f1dcfc915 | 2.3.1 |
| [Hotfix][Connector-V2] Fix connector source snapshot state NPE (#4027) | https://github.com/apache/seatunnel/commit/e39c4988cc | 2.3.1 |
| [Hotfix][Connector-v2][Clickhouse] Fix clickhouse write cdc changelog update event (#3951) | https://github.com/apache/seatunnel/commit/67e6027970 | 2.3.1 |
| [Feature][shade][Jackson] Add seatunnel-jackson module (#3947) | https://github.com/apache/seatunnel/commit/5d8862ec9c | 2.3.1 |
| [Improve][Connector-V2][Clickhouse] Improve performance (#3910) | https://github.com/apache/seatunnel/commit/aeceb855f6 | 2.3.1 |
| [Improve][Connector-V2] Remove Clickhouse Fields Config (#3826) | https://github.com/apache/seatunnel/commit/74704c362a | 2.3.1 |
| [Improve][Connector-V2][clickhouse] Special characters in column names are supported (#3881) | https://github.com/apache/seatunnel/commit/9069609c17 | 2.3.1 |
| [Feature][Connector] add get source method to all source connector (#3846) | https://github.com/apache/seatunnel/commit/417178fb84 | 2.3.1 |
| [Improve][Connector-V2] Change Connector Custom Config Prefix To Map (#3719) | https://github.com/apache/seatunnel/commit/ef1b8b1bb5 | 2.3.1 |
| [Feature][API & Connector & Doc] add parallelism and column projection interface (#3829) | https://github.com/apache/seatunnel/commit/b9164b8ba1 | 2.3.1 |
| [Bug][Connector-V2] Fix ClickhouseFile Committer Serializable Problems (#3803) | https://github.com/apache/seatunnel/commit/1b26192cb3 | 2.3.1 |
| [feature][connector-v2][clickhouse] Support write cdc changelog event in clickhouse sink (#3653) | https://github.com/apache/seatunnel/commit/6093c213bf | 2.3.0 |
| [Connector-V2][Clickhouse] Improve Clickhouse File Connector (#3416) | https://github.com/apache/seatunnel/commit/e07e9a7cc2 | 2.3.0 |
| [Hotfix][OptionRule] Fix option rule about all connectors (#3592) | https://github.com/apache/seatunnel/commit/226dc6a119 | 2.3.0 |
| [Improve][Connector-V2][Clickhouse] Unified exception for Clickhouse source & sink connector (#3563) | https://github.com/apache/seatunnel/commit/04e1743d9e | 2.3.0 |
| options in conditional need add to required or optional options (#3501) | https://github.com/apache/seatunnel/commit/51d5bcba10 | 2.3.0 |
| [Feature][Connector-V2][Clickhouse]Optimize clickhouse connector data type inject (#3471) | https://github.com/apache/seatunnel/commit/9bd0fc8ee2 | 2.3.0 |
| [improve][connector-v2][clickhouse] Fix DoubleInjectFunction (#3441) | https://github.com/apache/seatunnel/commit/9781a6a385 | 2.3.0 |
| [feature][api] add option validation for the ReadonlyConfig (#3417) | https://github.com/apache/seatunnel/commit/4f824fea36 | 2.3.0 |
| [improve][connector] The Factory#factoryIdentifier must be consistent with PluginIdentifierInterface#getPluginName (#3328) | https://github.com/apache/seatunnel/commit/d9519d696a | 2.3.0 |
| [Improve][Connector-V2] Add Clickhouse and Assert Source/Sink Factory (#3306) | https://github.com/apache/seatunnel/commit/9e4a128381 | 2.3.0 |
| [Improve][Clickhouse-V2] Clickhouse Support Geo type (#3141) | https://github.com/apache/seatunnel/commit/01cdc4e336 | 2.3.0 |
| [Improve][Connector-V2][Clickhouse] Support nest type and array (#3047) | https://github.com/apache/seatunnel/commit/97b5727ec6 | 2.3.0 |
| [Feature][Connector-V2-Clickhouse] Clickhouse Source random use host when config multi-host (#3108) | https://github.com/apache/seatunnel/commit/c9583b7f63 | 2.3.0-beta |
| [Improve][Clickhouse-V2] Clickhouse Support Int128,Int256 Type (#3067) | https://github.com/apache/seatunnel/commit/e118ccea0a | 2.3.0-beta |
| [Improve][all] change Log to @Slf4j (#3001) | https://github.com/apache/seatunnel/commit/6016100f12 | 2.3.0-beta |
| [Connector-V2][Clickhouse] Fix Clickhouse Type Mapping and Spark Map reconvert Bug (#2767) | https://github.com/apache/seatunnel/commit/f0a1f5013a | 2.2.0-beta |
| [DEV][Api] Replace SeaTunnelContext with JobContext and remove singleton pattern (#2706) | https://github.com/apache/seatunnel/commit/cbf82f755c | 2.2.0-beta |
| [#2606]Dependency management split (#2630) | https://github.com/apache/seatunnel/commit/fc047be69b | 2.2.0-beta |
| [Feature][Connector-V1 & V2] Support unauthorized ClickHouse (#2393) | https://github.com/apache/seatunnel/commit/0e4e2b1230 | 2.2.0-beta |
| [Feature][connector] clickhousefile sink connector support non-root username for fileTransfer (#2263) | https://github.com/apache/seatunnel/commit/704661f1fd | 2.2.0-beta |
| StateT of SeaTunnelSource should extend `Serializable` (#2214) | https://github.com/apache/seatunnel/commit/8c426ef850 | 2.2.0-beta |
| [Bug][connector-v2] When outputting data to clickhouse, a ClassCastException was encountered (#2160) | https://github.com/apache/seatunnel/commit/a3a2b5d189 | 2.2.0-beta |
| [API-DRAFT][MERGE] fix merge error | https://github.com/apache/seatunnel/commit/736ac01c89 | 2.2.0-beta |
| merge dev to api-draft | https://github.com/apache/seatunnel/commit/d265597c64 | 2.2.0-beta |
| [api-draft][connector] support Rsync to transfer clickhouse data file (#2080) | https://github.com/apache/seatunnel/commit/02a41902a8 | 2.2.0-beta |
| [api-draft][Optimize] Optimize module name (#2062) | https://github.com/apache/seatunnel/commit/f79e3112b1 | 2.2.0-beta |
