Which conf and config options are available in Spark's configuration?
1. Summary of Structured Streaming state configuration options
Config Name
Description
Default Value
spark.sql.streaming.stateStore.rocksdb.compactOnCommit
Whether to perform a range compaction of the RocksDB instance on each commit operation.
False
spark.sql.streaming.stateStore.rocksdb.blockSizeKB
Approximate size in KB of user data packed per block for a RocksDB BlockBasedTable, which is RocksDB's default SST file format.
4
spark.sql.streaming.stateStore.rocksdb.blockCacheSizeMB
The size capacity in MB for a cache of blocks.
8
spark.sql.streaming.stateStore.rocksdb.lockAcquireTimeoutMs
The time in milliseconds to wait when acquiring the lock during the load operation of a RocksDB instance.
60000
spark.sql.streaming.stateStore.rocksdb.resetStatsOnLoad
Whether to reset all ticker and histogram stats for RocksDB on load.
True
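The RocksDB options above can be collected in one place and rendered as `--conf` arguments. The following is a minimal sketch rather than an official recipe: the helper names `rocksdb_state_conf` and `to_submit_args` are hypothetical, the default values are copied from the rows above, and it assumes the RocksDB provider class documented in Spark's Structured Streaming guide is selected through `spark.sql.streaming.stateStore.providerClass`, since the `rocksdb.*` options only matter for that provider.

```python
# Hypothetical helpers; only the config keys and defaults come from the table above.
# Assumption: the RocksDB provider class name below, taken from Spark's
# Structured Streaming guide, is the one used by your Spark version.
ROCKSDB_PROVIDER = (
    "org.apache.spark.sql.execution.streaming.state.RocksDBStateStoreProvider"
)
PREFIX = "spark.sql.streaming.stateStore.rocksdb."


def rocksdb_state_conf(**overrides):
    """Return RocksDB state store settings, starting from the listed defaults."""
    defaults = {
        "compactOnCommit": False,
        "blockSizeKB": 4,
        "blockCacheSizeMB": 8,
        "lockAcquireTimeoutMs": 60000,
        "resetStatsOnLoad": True,
    }
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown RocksDB option(s): {sorted(unknown)}")
    merged = {**defaults, **overrides}
    conf = {"spark.sql.streaming.stateStore.providerClass": ROCKSDB_PROVIDER}
    for name, value in merged.items():
        # Render booleans in the conventional lowercase form.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        conf[PREFIX + name] = text
    return conf


def to_submit_args(conf):
    """Flatten a config dict into spark-submit style --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


conf = rocksdb_state_conf(blockCacheSizeMB=64, compactOnCommit=True)
print(" ".join(to_submit_args(conf)))
```

The same key/value pairs could instead be passed one by one to `SparkSession.builder.config(key, value)` before the streaming query is started.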
spark.sql.streaming.stateStore.providerClass
The class used to manage state data in stateful streaming queries. This class must
be a subclass of StateStoreProvider, and must have a zero-arg constructor.
Note: For structured streaming, this configuration cannot be changed between query
restarts from the same checkpoint location.
org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider
spark.sql.streaming.stateStore.stateSchemaCheck
When true, Spark will validate the state schema against the schema of the existing state and
fail the query if it is incompatible.
true
spark.sql.streaming.stateStore.minDeltasForSnapshot
Minimum number of state store delta files that need to be generated before they are consolidated into snapshots.
10
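To get a feel for what `minDeltasForSnapshot` controls, here is a deliberately simplified model, not the provider's actual implementation. It assumes every commit writes exactly one delta file, that maintenance runs after every commit, and that a snapshot is written as soon as at least `min_deltas` delta files have accumulated since the previous snapshot. The real provider's exact threshold comparison and maintenance timing may differ, and the function name `snapshot_versions` is hypothetical.

```python
def snapshot_versions(num_commits, min_deltas=10):
    """Versions at which this simplified model would write a snapshot."""
    if min_deltas < 1:
        raise ValueError("min_deltas must be at least 1")
    snapshots = []
    deltas_since_snapshot = 0
    for version in range(1, num_commits + 1):
        deltas_since_snapshot += 1  # assumption: one delta file per commit
        if deltas_since_snapshot >= min_deltas:
            snapshots.append(version)  # deltas are consolidated here
            deltas_since_snapshot = 0
    return snapshots


print(snapshot_versions(25))                # default threshold of 10
print(snapshot_versions(25, min_deltas=5))  # lower threshold, more snapshots
```

Under this model, a lower value produces snapshots more often, which shortens the chain of delta files to replay on recovery at the cost of more snapshot writes.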
spark.sql.streaming.stateStore.formatValidation.enabled
When true, checks whether the data from the state store is valid when running streaming
queries. Invalid data can occur if the state store format has been changed. Note that the feature
is only effective in the built-in
