SPRKPY1054

pyspark.sql.readwriter.DataFrameReader.format

Message: pyspark.sql.readwriter.DataFrameReader.format is not supported.

Category: Warning.

Description

This issue appears when pyspark.sql.readwriter.DataFrameReader.format is called with an argument that Snowpark does not support.

Scenarios

The outcome depends on the format you are trying to load, which can be either supported or unsupported.

Scenario 1

Input

The tool analyzes the format that the code is trying to load. The supported formats are:

  • CSV

  • JSON

  • Parquet

  • ORC

The example below shows input in which a CSV value is passed to the format method.

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

df1 = spark.read.format('csv').load('/path/to/file')

Output

The tool replaces the format call with a call to the csv method.
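As a rough illustration of this rewrite, the sketch below maps a supported format literal to the reader method call that replaces it. The helper name rewrite_read and the set SUPPORTED_FORMATS are hypothetical and exist only for this example; they are not part of the tool or of Snowpark, and matching on the exact lower-case literal is an assumption.

```python
# Hypothetical illustration only; this is not the migration tool's code.
# The set mirrors the supported formats listed for this scenario.
SUPPORTED_FORMATS = {"csv", "json", "parquet", "orc"}


def rewrite_read(fmt: str, path: str) -> str:
    """Return the reader call that a supported format literal maps to.

    Assumes the literal is written in lower case, as in format('csv').
    """
    if fmt not in SUPPORTED_FORMATS:
        # Unsupported values are the cases that receive the EWI instead.
        raise ValueError(f"format {fmt!r} is not in the supported list")
    return f"spark.read.{fmt}({path!r})"


print(rewrite_read("csv", "/path/to/file"))  # spark.read.csv('/path/to/file')
```

Passing a value outside the list, such as 'jdbc', raises ValueError in this sketch, which stands in for the case where the EWI is shown.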

Recommended fix

In this case, the tool does not show the EWI, meaning there is no fix necessary.

Scenario 2

Input

This scenario covers a format call that receives a jdbc value, for example spark.read.format('jdbc').

Output

The tool shows the EWI SPRKPY1054, indicating that the value "jdbc" is not supported.

Recommended fix

For unsupported formats, there is no specific fix, since the right approach depends on the files being read.

Scenario 3

Input

This scenario covers a format call that receives a CSV value through a variable instead of a string literal.

Output

Because the tool cannot determine the value that the variable will hold at run time, it shows the EWI SPRKPY1054, indicating that the value "" is not supported.
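To see why a variable is harder to resolve than a literal, the sketch below inspects source code with Python's standard ast module and reports whether the argument of a read.format call is a string literal. This is only an illustration of static analysis in general, not a description of how the tool works; the function name format_argument_kind and the variable name my_format are hypothetical. It requires Python 3.9 or later for ast.unparse.

```python
import ast


def format_argument_kind(source: str) -> list:
    """For each `<x>.read.format(arg)` call, report whether arg is a literal."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Match only calls shaped like <something>.read.format(...),
        # so that str.format calls are ignored.
        if not (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "format"
                and isinstance(node.func.value, ast.Attribute)
                and node.func.value.attr == "read"
                and node.args):
            continue
        arg = node.args[0]
        if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
            findings.append(("literal", arg.value))
        else:
            # A name or expression: its value is only known when the code runs.
            findings.append(("unresolved", ast.unparse(arg)))
    return findings


literal_call = "df1 = spark.read.format('csv').load('/path/to/file')"
variable_call = (
    "my_format = 'csv'\n"
    "df3 = spark.read.format(my_format).load('/path/to/file')"
)

print(format_argument_kind(literal_call))   # [('literal', 'csv')]
print(format_argument_kind(variable_call))  # [('unresolved', 'my_format')]
```

In the second case the syntax tree only contains the name my_format, not the string it refers to, which matches the empty value reported in the EWI message.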

Recommended fix

As a workaround, you can check the value of the variable and pass it to the format call as a string literal.
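The workaround is normally applied by hand, but for simple cases it can be sketched as a small source rewrite. The example below, which assumes Python 3.9 or later for ast.unparse, replaces a variable passed to read.format with the string literal assigned to it earlier. The names InlineReadFormat, inline_format_variable, and my_format are hypothetical, and the sketch deliberately ignores reassignment, scoping, and values computed at run time, so treat it as an illustration rather than a safe general-purpose transformation.

```python
import ast


class InlineReadFormat(ast.NodeTransformer):
    """Inline string literals into `<x>.read.format(name)` calls."""

    def __init__(self):
        self.literals = {}  # variable name -> string literal assigned to it

    def visit_Assign(self, node):
        # Record simple assignments such as: my_format = 'csv'
        target = node.targets[0]
        if (len(node.targets) == 1
                and isinstance(target, ast.Name)
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)):
            self.literals[target.id] = node.value.value
        return self.generic_visit(node)

    def visit_Call(self, node):
        self.generic_visit(node)
        func = node.func
        if (isinstance(func, ast.Attribute) and func.attr == "format"
                and isinstance(func.value, ast.Attribute)
                and func.value.attr == "read"
                and len(node.args) == 1
                and isinstance(node.args[0], ast.Name)
                and node.args[0].id in self.literals):
            # Swap the variable reference for the literal it was assigned.
            node.args[0] = ast.Constant(value=self.literals[node.args[0].id])
        return node


def inline_format_variable(source: str) -> str:
    tree = InlineReadFormat().visit(ast.parse(source))
    return ast.unparse(tree)


before = (
    "my_format = 'csv'\n"
    "df3 = spark.read.format(my_format).load('/path/to/file')"
)
after = inline_format_variable(before)
print(after)
```

After the rewrite, the format call receives 'csv' directly, which is the shape described in Scenario 1.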

Additional recommendations
