Version: Next

Multi-Table Transform Capability Boundary

Overview

SeaTunnel's multi-table transform feature allows a single transform node to process multiple tables flowing from an upstream source (typically a CDC connector) in one pipeline. This page documents precisely what is supported, what is not, and what alternatives to use when you hit a capability boundary.

1. What Is a Multi-Table Transform?

In a standard single-table pipeline, one Source feeds one Transform chain feeds one Sink. In a multi-table pipeline, a single Source (e.g., MySQL-CDC) emits records from many tables simultaneously, and each downstream Transform or Sink must declare which table(s) it applies to.

2. Capability Boundary Table

Capability	Supported	Notes
Per-table field rename / map	✅ Yes	Use `FieldMapper` with `plugin_input` and `table_match_regex`
Per-table column filtering	✅ Yes	Use `Filter` with `plugin_input` and `table_match_regex`
Per-table type casting	✅ Yes	Use `FieldMapper` with `define_sink_type` option
Per-table SQL transform (single table)	✅ Yes	Use `SQL` transform with `plugin_input`; scope it with `table_match_regex` when needed
`TableMerge` — merge multiple tables into one	✅ Yes	Tables must share compatible schema
`TableRename` — rename tables in the stream	✅ Yes	Works well for routing to Sink by table name
Row-level filtering (filter by `rowkind`)	✅ Yes	Use `FilterRowKind` transform
Cross-table SQL JOIN	❌ Not supported	See Section 4 for alternatives
Aggregation across multiple CDC tables	❌ Not supported	Aggregate downstream in OLAP engine
Generating new tables from a JOIN result	❌ Not supported	Use a dedicated SQL engine
Applying one transform to ALL tables wildcard	⚠️ Partial	`TableMerge` then single transform; schema must be compatible
Changing schema mid-stream (DDL events)	⚠️ Limited	Depends on sink; some sinks handle schema evolution; transforms do not
Nested JSON field extraction per table	✅ Yes	Use `JsonPath` transform with `plugin_input` and `table_match_regex`

3. TableMerge vs SQL Join

These two are the most commonly confused features.

3.1 `TableMerge` Transform

TableMerge merges the row streams of multiple tables into a single result table. All input tables must have the same (or compatible) schema. Use it to route all tables to one Sink.

{
  "plugin_name": "TableMerge",
  "plugin_input": ["orders_2023", "orders_2024"],
  "plugin_output": "all_orders",
  "merge_by_field": true
}

When to use: Fan-in from multiple source tables that have the same structure (e.g., sharded tables, multi-year partitions, or multi-database tables with identical schema).

When NOT to use: When tables have different schemas that you need to correlate or enrich from each other — that requires a JOIN.

3.2 SQL JOIN (not natively supported in multi-table pipelines)

A SQL JOIN correlates rows from two different tables based on a key. SeaTunnel's SQL transform does NOT support cross-table JOIN inside a multi-table streaming pipeline.

Attempting to JOIN records from two upstream tables inside a single SQL transform is not supported and will result in a configuration error.

Recommended alternatives:

Write both tables to a shared data lake or warehouse (e.g., Hudi, Iceberg, ClickHouse), then run the JOIN there
Use Apache Flink with SeaTunnel's Flink connector for stateful JOIN operations
Materialise the "dimension" table into a lookup cache (e.g., Redis, RocksDB) and use a custom transform for enrichment

4. Cross-Source JOIN Limitation

SeaTunnel does not support streaming JOINs where the two input sides come from different sources (e.g., joining MySQL-CDC with a PostgreSQL-CDC stream).

Scenario	Supported
Single-source multi-table pass-through	✅
Single-source TableMerge (same schema)	✅
Cross-source JOIN (MySQL-CDC + PG-CDC)	❌
Cross-source JOIN (CDC + JDBC batch)	❌
Same-source JOIN on two different tables	❌

Workaround: Write both sources to a common sink (Kafka, Iceberg, etc.) and perform the JOIN downstream in a dedicated SQL engine (Flink, Spark, ClickHouse, etc.).

5. Per-Table Transform Configuration Pattern

When you need different transforms for different tables from the same source, declare separate transform blocks that share the same plugin_input and use different table_match_regex rules:

{
  "env": {
    "job.name": "cdc-multi-table",
    "job.mode": "STREAMING"
  },
  "source": [
    {
      "plugin_name": "MySQL-CDC",
      "plugin_output": "cdc_stream",
      "base-url": "jdbc:mysql://localhost:3306/mydb",
      "username": "cdc_user",
      "password": "password",
      "database-names": ["mydb"],
      "table-names": ["mydb.orders", "mydb.users", "mydb.products"]
    }
  ],
  "transform": [
    {
      "plugin_name": "FieldMapper",
      "plugin_input": ["cdc_stream"],
      "plugin_output": "orders_mapped",
      "field_mapper": { "order_id": "id", "order_amount": "amount" },
      "table_match_regex": "mydb\\.orders"
    },
    {
      "plugin_name": "FieldMapper",
      "plugin_input": ["cdc_stream"],
      "plugin_output": "users_mapped",
      "field_mapper": { "user_id": "id", "user_email": "email" },
      "table_match_regex": "mydb\\.users"
    }
  ],
  "sink": [
    {
      "plugin_name": "Kafka",
      "plugin_input": ["orders_mapped"],
      "topic": "orders"
    },
    {
      "plugin_name": "Kafka",
      "plugin_input": ["users_mapped"],
      "topic": "users"
    },
    {
      "plugin_name": "Kafka",
      "plugin_input": ["cdc_stream"],
      "topic": "products",
      "table_match_regex": "mydb\\.products"
    }
  ]
}

6. Common Fields Example (Shared Schema)

If multiple tables share a common set of fields, you can use TableMerge to combine them and apply a single transform:

"transform": [
  {
    "plugin_name": "TableMerge",
    "plugin_input": ["cdc_stream"],
    "plugin_output": "all_events",
    "table_match_regex": "mydb\\.(orders|payments|refunds)"
  },
  {
    "plugin_name": "FieldMapper",
    "plugin_input": ["all_events"],
    "plugin_output": "all_events_mapped",
    "field_mapper": { "created_at": "event_time", "event_type": "type" }
  }
]

This works only when all three tables (orders, payments, refunds) share created_at and event_type fields. If schemas differ, TableMerge will fail at runtime.

7. EtLT Patterns with Multi-Table Transform

EtLT (Extract, light-transform, Load, then Transform in the warehouse) is the recommended pattern when SeaTunnel's transform layer cannot fulfil the full transformation requirement:

Use SeaTunnel's transform layer for:

Field rename / filtering
Type normalisation
Row-level filtering
Schema routing (different tables → different topics/tables)

Offload to downstream:

Cross-table JOINs
Aggregations
Pivot / unpivot
Enrichment from dimension tables

8. FAQ

Q: Can I apply one transform to ALL tables without specifying each one?

Not directly. Use TableMerge to combine tables with compatible schemas first, then apply a single transform to the merged result. If schemas differ, you must use separate transform blocks, typically with different table_match_regex rules.

Q: Does the SQL transform support JOIN?

No. The SQL transform only supports single-table queries (SELECT, WHERE, expressions). For JOINs, use an external SQL engine after loading the data into a sink.

Q: What happens to tables that are not matched by any transform?

Unmatched tables continue flowing through the pipeline and can be captured by a Sink that uses the right plugin_input and table_match_regex.

Q: Can I add a new table to an existing CDC pipeline without downtime?

This depends on the connector. MySQL-CDC supports dynamic table discovery in some configurations, but adding a transform for a new table requires a pipeline restart. Use stop-with-savepoint to minimise data loss (see REST API v2 Reference).

Multi-Table Transform Capability Boundary

Overview​

1. What Is a Multi-Table Transform?​

2. Capability Boundary Table​

3. TableMerge vs SQL Join​

3.1 TableMerge Transform​

3.2 SQL JOIN (not natively supported in multi-table pipelines)​

4. Cross-Source JOIN Limitation​

5. Per-Table Transform Configuration Pattern​

6. Common Fields Example (Shared Schema)​

7. EtLT Patterns with Multi-Table Transform​

8. FAQ​

See Also​