Version: 2.3.10

StarRocks

StarRocks source connector

Description

Read data from an external data source through StarRocks. Internally, the StarRocks source connector obtains the query plan from the frontend (FE), passes the query plan as a parameter to the backend (BE) nodes, and then fetches the data results from the BE nodes.

Key features

Options

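| name                    | type   | required | default value     |
|-------------------------|--------|----------|-------------------|
| nodeUrls                | list   | yes      | -                 |
| username                | string | yes      | -                 |
| password                | string | yes      | -                 |
| database                | string | yes      | -                 |
| table                   | string | no       | -                 |
| scan_filter             | string | no       | -                 |
| schema                  | config | no       | -                 |
| table_list              | array  | no       | -                 |
| request_tablet_size     | int    | no       | Integer.MAX_VALUE |
| scan_connect_timeout_ms | int    | no       | 30000             |
| scan_query_timeout_sec  | int    | no       | 3600              |
| scan_keep_alive_min     | int    | no       | 10                |
| scan_batch_rows         | int    | no       | 1024              |
| scan_mem_limit          | long   | no       | 2147483648        |
| max_retries             | int    | no       | 3                 |
| scan.params.*           | string | no       | -                 |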

nodeUrls [list]

The addresses of the StarRocks cluster, in the format ["fe_ip:fe_http_port", ...]

username [string]

The username of the StarRocks user

password [string]

The password of the StarRocks user

database [string]

The name of StarRocks database

table [string]

The name of StarRocks table

scan_filter [string]

The filter expression of the query, which is passed through to StarRocks unchanged. StarRocks uses this expression to filter data on the source side.

e.g.

"tinyint_1 = 100"

schema [config]

fields [Config]

The schema of the StarRocks data that you want to generate

e.g.

schema {
  fields {
    name = string
    age = int
  }
}

table_list [array]

The list of tables to read. You can use this option instead of table.

request_tablet_size [int]

The maximum number of StarRocks tablets assigned to one partition. The smaller this value, the more partitions are generated. This increases parallelism on the engine side, but also puts more pressure on StarRocks.

The following example shows how request_tablet_size controls the generation of partitions.

Assume the tablets of a StarRocks table are distributed across the cluster as follows:

be_node_1 tablet[1, 2, 3, 4, 5]
be_node_2 tablet[6, 7, 8, 9, 10]
be_node_3 tablet[11, 12, 13, 14, 15]

1. If request_tablet_size is not set, there is no limit on the number of tablets in a single partition. The partitions are generated as follows:

partition[0] read data of tablet[1, 2, 3, 4, 5] from be_node_1
partition[1] read data of tablet[6, 7, 8, 9, 10] from be_node_2
partition[2] read data of tablet[11, 12, 13, 14, 15] from be_node_3

2. If request_tablet_size=3 is set, a single partition holds at most 3 tablets. The partitions are generated as follows:

partition[0] read data of tablet[1, 2, 3] from be_node_1
partition[1] read data of tablet[4, 5] from be_node_1
partition[2] read data of tablet[6, 7, 8] from be_node_2
partition[3] read data of tablet[9, 10] from be_node_2
partition[4] read data of tablet[11, 12, 13] from be_node_3
partition[5] read data of tablet[14, 15] from be_node_3
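
The splitting shown above can be reproduced with a few lines of Python. This is only an illustrative sketch, not the connector's actual implementation: the function name plan_partitions and the dictionary layout are hypothetical, and it assumes that tablets are grouped per BE node and then cut into chunks of at most request_tablet_size.

```python
from typing import Dict, List, Tuple


def plan_partitions(
    tablets_by_node: Dict[str, List[int]],
    request_tablet_size: int,
) -> List[Tuple[str, List[int]]]:
    """Split each BE node's tablets into chunks of at most request_tablet_size.

    Hypothetical helper for illustration; the real connector logic may differ.
    """
    if request_tablet_size < 1:
        raise ValueError("request_tablet_size must be at least 1")
    partitions = []
    for node, tablets in tablets_by_node.items():
        # Walk the node's tablet list in steps of request_tablet_size.
        for start in range(0, len(tablets), request_tablet_size):
            partitions.append((node, tablets[start:start + request_tablet_size]))
    return partitions


# Tablet distribution taken from the example above.
layout = {
    "be_node_1": [1, 2, 3, 4, 5],
    "be_node_2": [6, 7, 8, 9, 10],
    "be_node_3": [11, 12, 13, 14, 15],
}

for index, (node, tablets) in enumerate(plan_partitions(layout, 3)):
    print(f"partition[{index}] read data of tablet{tablets} from {node}")
```

With request_tablet_size=3 this prints the six partitions listed in case 2. Passing 2**31 - 1 (the value of Integer.MAX_VALUE) yields one partition per BE node, as in case 1.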

scan_connect_timeout_ms [int]

The connection timeout for requests sent to StarRocks, in milliseconds

scan_query_timeout_sec [int]

The timeout for queries against StarRocks, in seconds. The default value is 3600 (1 hour); -1 means no timeout limit.

scan_keep_alive_min [int]

The keep-alive duration of the query task, in minutes. The default value is 10. We recommend that you set this parameter to a value greater than or equal to 5.

scan_batch_rows [int]

The maximum number of data rows to read from BE at a time. Increasing this value reduces the number of connections established between the engine and StarRocks and therefore mitigates the overhead caused by network latency.
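
As a rough back-of-the-envelope illustration of this trade-off, the sketch below computes the minimum number of reads needed to fetch a given number of rows for a given batch size. The function name min_fetches and the row counts are hypothetical; the actual number of requests also depends on how rows are spread across partitions and tablets.

```python
def min_fetches(total_rows: int, scan_batch_rows: int) -> int:
    """Lower bound on the number of reads needed to fetch total_rows rows."""
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    if scan_batch_rows < 1:
        raise ValueError("scan_batch_rows must be at least 1")
    # Integer ceiling division: every read returns at most scan_batch_rows rows.
    return (total_rows + scan_batch_rows - 1) // scan_batch_rows


print(min_fetches(1_000_000, 1024))  # 977 reads with the default batch size
print(min_fetches(1_000_000, 8192))  # 123 reads with a larger batch size
```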

scan_mem_limit [long]

The maximum memory space allowed for a single query in the BE node, in bytes. The default value is 2147483648 (2 GB).
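
Because the value is given in bytes, it can be easier to compute it than to type it by hand. A minimal sketch, assuming binary units (1 GB = 1024^3 bytes, which is consistent with the default of 2147483648); the helper name gb_to_bytes is hypothetical.

```python
def gb_to_bytes(gigabytes: float) -> int:
    """Convert a size in GB (binary, 1024**3 bytes each) to whole bytes."""
    if gigabytes <= 0:
        raise ValueError("size must be positive")
    return int(gigabytes * 1024 ** 3)


print(gb_to_bytes(2))    # 2147483648, the documented default
print(gb_to_bytes(0.5))  # 536870912
```

The printed number can then be used as the value of scan_mem_limit in the source configuration.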

max_retries [int]

The maximum number of retries for requests sent to StarRocks

scan.params.* [string]

Additional parameters for scanning data from the BE nodes

Example

source {
  StarRocks {
    nodeUrls = ["starrocks_e2e:8030"]
    username = root
    password = ""
    database = "test"
    table = "e2e_table_source"
    scan_batch_rows = 10
    max_retries = 3
    schema {
      fields {
        BIGINT_COL = BIGINT
        LARGEINT_COL = STRING
        SMALLINT_COL = SMALLINT
        TINYINT_COL = TINYINT
        BOOLEAN_COL = BOOLEAN
        DECIMAL_COL = "DECIMAL(20, 1)"
        DOUBLE_COL = DOUBLE
        FLOAT_COL = FLOAT
        INT_COL = INT
        CHAR_COL = STRING
        VARCHAR_11_COL = STRING
        STRING_COL = STRING
        DATETIME_COL = TIMESTAMP
        DATE_COL = DATE
      }
    }
    scan.params.scanner_thread_pool_thread_num = "3"
  }
}

Example 2: Multiple tables

source {
  StarRocks {
    nodeUrls = ["starrocks_e2e:8030"]
    username = root
    password = ""
    database = "test"
    table_list = [
      {
        table = "e2e_table_source"
        schema = {
          fields {
            BIGINT_COL = BIGINT
            LARGEINT_COL = STRING
            SMALLINT_COL = SMALLINT
            TINYINT_COL = TINYINT
            BOOLEAN_COL = BOOLEAN
            DECIMAL_COL = "DECIMAL(20, 1)"
            DOUBLE_COL = DOUBLE
            FLOAT_COL = FLOAT
            INT_COL = INT
            CHAR_COL = STRING
            VARCHAR_11_COL = STRING
            STRING_COL = STRING
            DATETIME_COL = TIMESTAMP
            DATE_COL = DATE
          }
        }
      },
      {
        table = "e2e_table_source_2"
        schema = {
          fields {
            BIGINT_COL_2 = BIGINT
            LARGEINT_COL_2 = STRING
            SMALLINT_COL_2 = SMALLINT
            TINYINT_COL_2 = TINYINT
            BOOLEAN_COL_2 = BOOLEAN
            DECIMAL_COL_2 = "DECIMAL(20, 1)"
            DOUBLE_COL_2 = DOUBLE
            FLOAT_COL_2 = FLOAT
            INT_COL_2 = INT
            CHAR_COL_2 = STRING
            VARCHAR_11_COL_2 = STRING
            STRING_COL_2 = STRING
            DATETIME_COL_2 = TIMESTAMP
            DATE_COL_2 = DATE
          }
        }
      }
    ]
    scan_batch_rows = 10
    max_retries = 3
    scan.params.scanner_thread_pool_thread_num = "3"
  }
}

Changelog

| Change | Commit | Version |
|--------|--------|---------|
| [Fix][Connector-V2] Fix StarRocksCatalogTest#testCatalog() NPE (#8987) | https://github.com/apache/seatunnel/commit/53f0a9eb5 | 2.3.10 |
| [Improve][Connector-V2] Random pick the starrocks fe address which can be connected (#8898) | https://github.com/apache/seatunnel/commit/bef76078f | 2.3.10 |
| [Feature][Connector-v2] Support multi starrocks source (#8789) | https://github.com/apache/seatunnel/commit/26b5529aa | 2.3.10 |
| [Fix][Connector-V2] Fix possible data loss in scenarios of request_tablet_size is less than the number of BUCKETS (#8768) | https://github.com/apache/seatunnel/commit/3c6f21613 | 2.3.10 |
| [Fix][Connector-V2]Fix Descriptions for CUSTOM_SQL in Connector (#8778) | https://github.com/apache/seatunnel/commit/96b610eb7 | 2.3.10 |
| [Improve] restruct connector common options (#8634) | https://github.com/apache/seatunnel/commit/f3499a6ee | 2.3.10 |
| [improve] add StarRocks options (#8639) | https://github.com/apache/seatunnel/commit/da8d9cbd3 | 2.3.10 |
| [Fix][Connector-V2] fix starRocks automatically creates tables with comment (#8568) | https://github.com/apache/seatunnel/commit/c4cb1fc4a | 2.3.10 |
| [Fix][Connector-V2] Fixed adding table comments (#8514) | https://github.com/apache/seatunnel/commit/edca75b0d | 2.3.10 |
| [Feature][Connector-V2] Starrocks implements multi table sink (#8467) | https://github.com/apache/seatunnel/commit/55eebfa8a | 2.3.9 |
| [Improve][Connector-V2] Add pre-check starrocks version before exeucte alter table field name (#8237) | https://github.com/apache/seatunnel/commit/c24e3b12b | 2.3.9 |
| [Fix][Connector-starrocks] Fix drop column bug for starrocks (#8216) | https://github.com/apache/seatunnel/commit/082814da1 | 2.3.9 |
| [Feature][Core] Support read arrow data (#8137) | https://github.com/apache/seatunnel/commit/4710ea0f8 | 2.3.9 |
| [Feature][Clickhouse] Support sink savemode (#8086) | https://github.com/apache/seatunnel/commit/e6f92fd79 | 2.3.9 |
| [Feature][Connector-V2] StarRocks-sink support schema evolution (#8082) | https://github.com/apache/seatunnel/commit/d33b0da8a | 2.3.9 |
| [Improve][dist]add shade check rule (#8136) | https://github.com/apache/seatunnel/commit/51ef80001 | 2.3.9 |
| [Improve][Connector-V2] Add doris/starrocks create table with comment (#7847) | https://github.com/apache/seatunnel/commit/207b8c16f | 2.3.9 |
| [Feature][Restapi] Allow metrics information to be associated to logical plan nodes (#7786) | https://github.com/apache/seatunnel/commit/6b7c53d03 | 2.3.9 |
| [Improve][API] Move catalog open to SaveModeHandler (#7439) | https://github.com/apache/seatunnel/commit/8c2c5c79a | 2.3.8 |
| [Improve][Connector-V2] Reuse connection in StarRocksCatalog (#7342) | https://github.com/apache/seatunnel/commit/8ee129d20 | 2.3.8 |
| [Improve][Connector-V2] Remove system table limit (#7391) | https://github.com/apache/seatunnel/commit/adf888e00 | 2.3.8 |
| [Improve][Connector-V2] Close all ResultSet after used (#7389) | https://github.com/apache/seatunnel/commit/853e97321 | 2.3.8 |
| [Feature][Core] Support using upstream table placeholders in sink options and auto replacement (#7131) | https://github.com/apache/seatunnel/commit/c4ca74122 | 2.3.6 |
| [Fix][Connector-V2] Fix starrocks Content-Length header already present error (#7034) | https://github.com/apache/seatunnel/commit/a485a74ef | 2.3.6 |
| [Feature][Connector-V2]Support StarRocks Fe Node HA | https://github.com/apache/seatunnel/commit/9c36c4581 | 2.3.6 |
| [Fix][Connector-v2] Fix the sql statement error of create table for doris and starrocks (#6679) | https://github.com/apache/seatunnel/commit/88263cd69 | 2.3.6 |
| [Fix][StarRocks] Fix NPE when upstream catalogtable table path only have table name part (#6540) | https://github.com/apache/seatunnel/commit/5795b265c | 2.3.5 |
| [Fix][Connector-V2] Fixed doris/starrocks create table sql parse error (#6580) | https://github.com/apache/seatunnel/commit/f2ed1fbde | 2.3.5 |
| [Fix][Connector-V2] Fix connector support SPI but without no args constructor (#6551) | https://github.com/apache/seatunnel/commit/5f3c9c36a | 2.3.5 |
| [Improve] Add SaveMode log of process detail (#6375) | https://github.com/apache/seatunnel/commit/b0d70ce22 | 2.3.5 |
| [Improve][Connector-V2] Support TableSourceFactory on StarRocks (#6498) | https://github.com/apache/seatunnel/commit/aded56299 | 2.3.5 |
| [Improve] StarRocksSourceReader use the existing client (#6480) | https://github.com/apache/seatunnel/commit/1a02c571a | 2.3.5 |
| [Improve][API] Unify type system api(data & type) (#5872) | https://github.com/apache/seatunnel/commit/b38c7edcc | 2.3.5 |
| [Feature][Connector] add starrocks save_mode (#6029) | https://github.com/apache/seatunnel/commit/66b0f1e1d | 2.3.4 |
| [Feature] Add unsupported datatype check for all catalog (#5890) | https://github.com/apache/seatunnel/commit/b9791285a | 2.3.4 |
| [Improve] StarRocks support create table template with unique key (#5905) | https://github.com/apache/seatunnel/commit/25b01125e | 2.3.4 |
| [Improve][StarRocksSink] add http socket timeout. (#5918) | https://github.com/apache/seatunnel/commit/febdb262b | 2.3.4 |
| [Improve] Support create varchar field type in StarRocks (#5911) | https://github.com/apache/seatunnel/commit/602589516 | 2.3.4 |
| [Improve]Change System.out.println to log output. (#5912) | https://github.com/apache/seatunnel/commit/bbedb07a9 | 2.3.4 |
| [Improve][Common] Introduce new error define rule (#5793) | https://github.com/apache/seatunnel/commit/9d1b2582b | 2.3.4 |
| [Improve] Remove use SeaTunnelSink::getConsumedType method and mark it as deprecated (#5755) | https://github.com/apache/seatunnel/commit/8de740810 | 2.3.4 |
| [Improve][Connector] Add field name to DataTypeConvertor to improve error message (#5782) | https://github.com/apache/seatunnel/commit/ab60790f0 | 2.3.4 |
| [feature][connector-jdbc]Add Save Mode function and Connector-JDBC (MySQL) connector has been realized (#5663) | https://github.com/apache/seatunnel/commit/eff17ccbe | 2.3.4 |
| [Improve] Add default implement for SeaTunnelSink::setTypeInfo (#5682) | https://github.com/apache/seatunnel/commit/86cba8745 | 2.3.4 |
| Support config column/primaryKey/constraintKey in schema (#5564) | https://github.com/apache/seatunnel/commit/eac76b4e5 | 2.3.4 |
| [Improve] Refactor CatalogTable and add SeaTunnelSource::getProducedCatalogTables (#5562) | https://github.com/apache/seatunnel/commit/41173357f | 2.3.4 |
| [Hotfix][Connector-V2][StarRocks] fix starrocks template sql parser #5071 (#5332) | https://github.com/apache/seatunnel/commit/23d79b0d1 | 2.3.4 |
| [Improve][Connector-V2] Remove scheduler in StarRocks sink (#5269) | https://github.com/apache/seatunnel/commit/cb7b79491 | 2.3.4 |
| [Improve][CheckStyle] Remove useless 'SuppressWarnings' annotation of checkstyle. (#5260) | https://github.com/apache/seatunnel/commit/51c0d709b | 2.3.4 |
| [Hotfix] Fix com.google.common.base.Preconditions to seatunnel shade one (#5284) | https://github.com/apache/seatunnel/commit/ed5eadcf7 | 2.3.3 |
| Fix StarRocksJsonSerializer will transform array/map/row to string (#5281) | https://github.com/apache/seatunnel/commit/f94195377 | 2.3.3 |
| [Improve] Improve savemode api (#4767) | https://github.com/apache/seatunnel/commit/4acd370d4 | 2.3.3 |
| [Improve][Connector-V2] Improve StarRocks Auto Create Table To Support Use Primary Key Template In Field (#4487) | https://github.com/apache/seatunnel/commit/e601cd4c3 | 2.3.2 |
| Revert "[Improve][Catalog] refactor catalog (#4540)" (#4628) | https://github.com/apache/seatunnel/commit/2d1933195 | 2.3.2 |
| [hotfix][starrocks] fix error on get starrocks source typeInfo (#4619) | https://github.com/apache/seatunnel/commit/f7b094f9e | 2.3.2 |
| [Improve][Catalog] refactor catalog (#4540) | https://github.com/apache/seatunnel/commit/b0a701cb8 | 2.3.2 |
| [Improve][Connector-V2] Throw StarRocks Serialize Error To Client (#4484) | https://github.com/apache/seatunnel/commit/e2c107323 | 2.3.2 |
| [Improve][Connector-V2] Improve StarRocks Serialize Error Message (#4458) | https://github.com/apache/seatunnel/commit/465e75cbf | 2.3.2 |
| [Hotfix][Zeta] Adapt StarRocks With Multi-Table And Single-Table Mode (#4324) | https://github.com/apache/seatunnel/commit/c11c171d3 | 2.3.1 |
| [improve][zeta] fix zeta bugs | https://github.com/apache/seatunnel/commit/3a82e8b39 | 2.3.1 |
| [Improve][Zeta] Improve Client Job Info Message | https://github.com/apache/seatunnel/commit/56febf011 | 2.3.1 |
| [Fix][Connector-V2] Fix StarRocksSink Without Format Field In Header | https://github.com/apache/seatunnel/commit/463ae6437 | 2.3.1 |
| [Improve] Support StarRocksCatalog Use JDBC URL With Custom Suffix | https://github.com/apache/seatunnel/commit/d00ced6ec | 2.3.1 |
| [Improve] Support MySqlCatalog Use JDBC URL With Custom Suffix | https://github.com/apache/seatunnel/commit/210d0ff1f | 2.3.1 |
| [Improve] Change StarRocks Sink Default Format To Json | https://github.com/apache/seatunnel/commit/870335783 | 2.3.1 |
| [Fix] Fix StarRocks Default Url Can't Use | https://github.com/apache/seatunnel/commit/67c45d353 | 2.3.1 |
| [hotfix] fixed schema options import error | https://github.com/apache/seatunnel/commit/656805f2d | 2.3.1 |
| [chore] Code format with spotless plugin. | https://github.com/apache/seatunnel/commit/291214ad6 | 2.3.1 |
| Merge branch 'dev' into merge/cdc | https://github.com/apache/seatunnel/commit/4324ee191 | 2.3.1 |
| [Improve][Project] Code format with spotless plugin. | https://github.com/apache/seatunnel/commit/423b58303 | 2.3.1 |
| [Fix] Fix StarRocks Default Url Can't Use (#4229) | https://github.com/apache/seatunnel/commit/ed74d1109 | 2.3.1 |
| [Bug] Remove StarRocks Auto Creat Table Default Value (#4220) | https://github.com/apache/seatunnel/commit/80b5cd40a | 2.3.1 |
| [Feature] Add SaveMode For StarRocks (#4217) | https://github.com/apache/seatunnel/commit/0674f10a5 | 2.3.1 |
| [Improve] Improve StarRocks Catalog Base Url (#4215) | https://github.com/apache/seatunnel/commit/6632a4047 | 2.3.1 |
| [Improve] Improve StarRocks Sink Config (#4212) | https://github.com/apache/seatunnel/commit/8d5712c1d | 2.3.1 |
| [Hotfix][Zeta] keep deleteCheckpoint method synchronized (#4209) | https://github.com/apache/seatunnel/commit/061f9b587 | 2.3.1 |
| [Improve] Improve StarRocks Auto Create Table (#4208) | https://github.com/apache/seatunnel/commit/bc9cd6bf6 | 2.3.1 |
| [hotfix][zeta] fix zeta multi-table parser error (#4193) | https://github.com/apache/seatunnel/commit/98f2ad0c1 | 2.3.1 |
| [feature][starrocks] add StarRocks factories (#4191) | https://github.com/apache/seatunnel/commit/c485d887e | 2.3.1 |
| [Feature] Change StarRocks CreatTable Template (#4184) | https://github.com/apache/seatunnel/commit/4cf07f3be | 2.3.1 |
| [Feature][Connector-V2] StarRocks source connector (#3679) | https://github.com/apache/seatunnel/commit/9681173b1 | 2.3.1 |
| [Improve][Connector-V2] [StarRocks] Starrocks Support Auto Create Table (#4177) | https://github.com/apache/seatunnel/commit/7e0008e6f | 2.3.1 |
| [Improve][build] Give the maven module a human readable name (#4114) | https://github.com/apache/seatunnel/commit/d7cd60105 | 2.3.1 |
| [Improve][Project] Code format with spotless plugin. (#4101) | https://github.com/apache/seatunnel/commit/a2ab16656 | 2.3.1 |
| [Feature][Connector-v2][StarRocks] Support write cdc changelog event(INSERT/UPDATE/DELETE) (#3865) | https://github.com/apache/seatunnel/commit/8e3d158c0 | 2.3.1 |
| [Improve][Connector-V2] Change Connector Custom Config Prefix To Map (#3719) | https://github.com/apache/seatunnel/commit/ef1b8b1bb | 2.3.1 |
| [Improve][Connector-V2][StarRocks] Unified exception for StarRocks source and sink (#3593) | https://github.com/apache/seatunnel/commit/612d0297a | 2.3.0 |
| [Improve][Connector-V2][StarRocks] Delete the Mapper may not be used (#3579) | https://github.com/apache/seatunnel/commit/1e868ecf2 | 2.3.0 |
| [Hotfix][OptionRule] Fix option rule about all connectors (#3592) | https://github.com/apache/seatunnel/commit/226dc6a11 | 2.3.0 |
| [Improve][Connector-V2][StarRocks]Add StarRocks connector option rules (#3402) | https://github.com/apache/seatunnel/commit/5d187f69b | 2.3.0 |
| [Bugfix][Connector-V2][StarRocks]Fix StarRocks StreamLoad retry bug and fix doc (#3406) | https://github.com/apache/seatunnel/commit/071f9aa05 | 2.3.0 |
| [Feature][Connector-V2] Starrocks sink connector (#3164) | https://github.com/apache/seatunnel/commit/3e6caf705 | 2.3.0 |