
Job Env Config

This document describes the env configuration. The common parameters can be used with all engines. To distinguish engine-specific parameters, the additional parameters of a particular engine carry a prefix. For the Flink engine, the prefix is "flink." (including the trailing dot). The Spark engine uses no additional prefix, because the official Spark parameters already start with "spark.".

Common Parameter

The following configuration parameters are common to all engines.

job.name

This parameter configures the job name.

jars

Third-party packages can be loaded via jars, given as a semicolon-separated list, like jars="file://local/jar1.jar;file://local/jar2.jar".

job.mode

You can configure whether the job runs in batch or streaming mode through job.mode, like job.mode = "BATCH" or job.mode = "STREAMING".
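As a minimal sketch, the three parameters described so far can be combined in the env block of a job config file. The job name is a hypothetical placeholder, and the jar paths are the ones used in the text above:

```hocon
env {
  # Hypothetical job name, for illustration only
  job.name = "example_job"
  # Semicolon-separated list of third-party jars to load
  jars = "file://local/jar1.jar;file://local/jar2.jar"
  # Either "BATCH" or "STREAMING"
  job.mode = "BATCH"
}
```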

checkpoint.interval

The interval (in milliseconds) at which checkpoints are periodically scheduled.

In STREAMING mode, checkpoints are required. If you do not set this parameter, its value is taken from the application configuration file seatunnel.yaml. In BATCH mode, you can disable checkpoints by not setting this parameter. In Zeta STREAMING mode, the default value is 30000 milliseconds.

checkpoint.timeout

The timeout (in milliseconds) for a checkpoint. If the checkpoint is not completed before the timeout, the job will fail. In Zeta, the default value is 30000 milliseconds.
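A sketch of how the two checkpoint parameters might be set together for a streaming job. The numeric values are illustrative assumptions, not recommendations:

```hocon
env {
  job.mode = "STREAMING"
  # Schedule a checkpoint every 10 seconds (value in milliseconds, illustrative)
  checkpoint.interval = 10000
  # Fail the job if a checkpoint takes longer than 60 seconds (illustrative)
  checkpoint.timeout = 60000
}
```

Setting the timeout well above the interval leaves room for a slow checkpoint to finish before the job is failed.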

parallelism

This parameter configures the parallelism of source and sink.

shade.identifier

Specifies the encryption method. If you do not need to encrypt or decrypt config files, this option can be ignored.

For more details, refer to the Config Encryption Decryption documentation.
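For illustration, the option is set in the env block like any other parameter. The value "base64" is an assumption here; the identifiers actually available depend on your SeaTunnel build and are described in the Config Encryption Decryption documentation:

```hocon
env {
  # Assumed identifier; check the Config Encryption Decryption
  # documentation for the values your installation supports
  shade.identifier = "base64"
}
```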

Zeta Engine Parameter

job.retry.times

Controls the default number of retries when a job fails. The default value is 3. This parameter only takes effect in the Zeta engine.

job.retry.interval.seconds

Controls the default interval between retries when a job fails. The default value is 3 seconds. This parameter only takes effect in the Zeta engine.

savemode.execute.location

This parameter specifies where the savemode is executed when the job runs in the Zeta engine. The default value is CLUSTER, which means that the savemode is executed on the cluster. To execute the savemode on the client, set it to CLIENT. Prefer CLUSTER mode wherever possible, because CLIENT mode will be removed once CLUSTER mode is confirmed to work without problems.
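The Zeta-specific parameters above can be sketched as follows. The values shown are the documented defaults written out explicitly, so this fragment should behave the same as omitting them:

```hocon
env {
  # Retry a failed job up to 3 times (default)
  job.retry.times = 3
  # Wait 3 seconds between retries (default)
  job.retry.interval.seconds = 3
  # Execute the savemode on the cluster (default); the alternative is "CLIENT"
  savemode.execute.location = "CLUSTER"
}
```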

Flink Engine Parameter

The table below lists some, but not all, SeaTunnel parameter names and the Flink names they correspond to. For the complete list, refer to the official Flink documentation.

Flink Configuration Name	SeaTunnel Configuration Name
pipeline.max-parallelism	flink.pipeline.max-parallelism
execution.checkpointing.mode	flink.execution.checkpointing.mode
execution.checkpointing.timeout	flink.execution.checkpointing.timeout
...	...
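A sketch of the prefix rule in practice: each Flink option is written in the env block with "flink." in front of its Flink name. The values are illustrative assumptions, and the accepted values for each option are defined by the Flink documentation:

```hocon
env {
  job.mode = "STREAMING"
  # Flink option pipeline.max-parallelism (illustrative value)
  flink.pipeline.max-parallelism = 128
  # Flink option execution.checkpointing.mode
  flink.execution.checkpointing.mode = "EXACTLY_ONCE"
}
```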

Spark Engine Parameter

Because Spark configuration items have not been modified, they are not listed here. Please refer to the official Spark documentation.
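For example, standard Spark properties can be placed in the env block under their usual names, with no extra prefix. The property values below are illustrative assumptions:

```hocon
env {
  # Standard Spark properties, written exactly as Spark names them
  spark.app.name = "example_job"
  spark.executor.memory = "1g"
  spark.executor.cores = 1
}
```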
