Default Settings
The SMA-Checkpoints feature has several settings, each with a default value.
Default Values
Turn the whole feature on or off: Enabled.
Collect user-defined methods returning DataFrame type: False.
List of relevant PySpark functions to collect: see "Default PySpark functions to collect" below.
Sample: 100%.
Mode: Schema.
Enabled: Always True.
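As a rough illustration, the defaults above could be modeled as a plain Python dictionary with overrides layered on top. This is only a sketch: the key names and the `resolve_settings` helper are hypothetical and do not reflect the SMA's actual configuration format or API.

```python
# Hypothetical model of the default values listed above. The key names are
# illustrative assumptions, not the SMA's real configuration keys.
DEFAULT_SETTINGS = {
    "feature_enabled": True,                          # whole feature: Enabled
    "collect_user_defined_dataframe_methods": False,  # user-defined methods returning DataFrame: False
    "sample_percent": 100,                            # Sample: 100%
    "mode": "Schema",                                 # Mode: Schema
    "enabled": True,                                  # Enabled: Always True
}


def resolve_settings(overrides=None):
    """Return a new dict with the defaults and any overrides applied on top."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_SETTINGS)
    if unknown:
        # Reject keys that are not part of the known defaults.
        raise KeyError(f"Unknown setting(s): {sorted(unknown)}")
    return {**DEFAULT_SETTINGS, **overrides}
```

For example, `resolve_settings({"sample_percent": 50})` would keep every default except the sample size, and `DEFAULT_SETTINGS` itself is left unchanged.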
Default PySpark functions to collect
Creation
pyspark.sql.session.SparkSession.createDataFrame
pyspark.sql.readwriter.DataFrameReader.csv
pyspark.sql.readwriter.DataFrameReader.jdbc
pyspark.sql.readwriter.DataFrameReader.json
pyspark.sql.readwriter.DataFrameReader.load
pyspark.sql.readwriter.DataFrameReader.orc
pyspark.sql.readwriter.DataFrameReader.parquet
pyspark.sql.readwriter.DataFrameReader.table
pyspark.sql.readwriter.DataFrameReader.text
pyspark.rdd.RDD.toDF
Transformation
pyspark.sql.dataframe.DataFrame.union
pyspark.sql.dataframe.DataFrame.intersect
pyspark.sql.dataframe.DataFrame.join
pyspark.sql.group.GroupedData.pivot
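The two groups above can be treated as a simple lookup from a fully qualified function name to its category. The sketch below is a minimal illustration of that idea; `DEFAULT_FUNCTIONS` and `category_of` are hypothetical names introduced here and are not part of the SMA or PySpark.

```python
# Default PySpark functions to collect, copied from the lists above and
# grouped by category. The dictionary and helper are illustrative only.
DEFAULT_FUNCTIONS = {
    "Creation": [
        "pyspark.sql.session.SparkSession.createDataFrame",
        "pyspark.sql.readwriter.DataFrameReader.csv",
        "pyspark.sql.readwriter.DataFrameReader.jdbc",
        "pyspark.sql.readwriter.DataFrameReader.json",
        "pyspark.sql.readwriter.DataFrameReader.load",
        "pyspark.sql.readwriter.DataFrameReader.orc",
        "pyspark.sql.readwriter.DataFrameReader.parquet",
        "pyspark.sql.readwriter.DataFrameReader.table",
        "pyspark.sql.readwriter.DataFrameReader.text",
        "pyspark.rdd.RDD.toDF",
    ],
    "Transformation": [
        "pyspark.sql.dataframe.DataFrame.union",
        "pyspark.sql.dataframe.DataFrame.intersect",
        "pyspark.sql.dataframe.DataFrame.join",
        "pyspark.sql.group.GroupedData.pivot",
    ],
}


def category_of(qualified_name):
    """Return the category of a default function, or None if it is not listed."""
    for category, names in DEFAULT_FUNCTIONS.items():
        if qualified_name in names:
            return category
    return None
```

With this lookup, `category_of("pyspark.sql.dataframe.DataFrame.join")` falls under "Transformation", while a function that is not in the default list, such as `pyspark.sql.dataframe.DataFrame.filter`, returns `None`.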