SPRKPY1027
pyspark.sql.readwriter.DataFrameReader.json
This issue code has been deprecated since Spark Conversion Core 4.5.2
Message: pyspark.sql.readwriter.DataFrameReader.json has a workaround, see documentation for more info
Category: Warning
Description
This issue appears when the SMA detects a use of the pyspark.sql.readwriter.DataFrameReader.json function, which has a workaround.
Scenario
Input
Below is an example of the pyspark.sql.readwriter.DataFrameReader.json function that generates this EWI. In this example, the json function reads multiple .json files with a given schema and uses extra options, such as primitiveAsString and dateFormat, to fine-tune how the files are read.
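The following is a minimal sketch of such a call, not the exact snippet the SMA documentation uses. The file paths and option values are hypothetical, `spark` is assumed to be an existing pyspark.sql.SparkSession, and `my_schema` is assumed to be a StructType or DDL string describing the files.

```python
# Hypothetical input files; replace with your own paths.
file_paths = [
    "path/to/your/file1.json",
    "path/to/your/file2.json",
    "path/to/your/file3.json",
]


def read_json_with_pyspark(spark, my_schema):
    """Read several .json files with PySpark, passing schema and options inline.

    `spark` is assumed to be a pyspark.sql.SparkSession and `my_schema`
    a pyspark.sql.types.StructType (or a DDL string).
    """
    return spark.read.json(
        file_paths,
        schema=my_schema,
        primitiveAsString=True,   # read every primitive value as a string
        dateFormat="yyyy-MM-dd",  # pattern used to parse date columns
    )
```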
Output
The SMA adds the EWI SPRKPY1027 to the output code to let you know that this function is not directly supported by Snowpark, but it has a workaround.
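As a rough illustration, the converted code might look like the sketch below. The exact wording and placement of the EWI comment are an assumption based on the message shown at the top of this page. The paths and option values are hypothetical, and `session` is assumed to be a snowflake.snowpark.Session.

```python
def read_json_after_conversion(session, my_schema):
    """Converted code: the call is unchanged and only flagged with the EWI.

    Passing schema and options directly to json() is the part that still
    needs the workaround described in this page.
    """
    file_paths = [
        "path/to/your/file1.json",
        "path/to/your/file2.json",
        "path/to/your/file3.json",
    ]
    #EWI: SPRKPY1027 => pyspark.sql.readwriter.DataFrameReader.json has a workaround, see documentation for more info
    return session.read.json(
        file_paths,
        schema=my_schema,
        primitiveAsString=True,
        dateFormat="yyyy-MM-dd",
    )
```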
Recommended fix
This section explains how to configure the path parameter, the schema parameter, and some options to make them work in Snowpark.
1. path parameter
Snowpark requires the path parameter to be a stage location. As a workaround, you can create a temporary stage and add each .json file to that stage using the file:// prefix.
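The stage workaround can be sketched as follows. The stage name `my_temp_json_stage` and the helper name are hypothetical, and `session` is assumed to be an existing snowflake.snowpark.Session.

```python
def upload_json_files_to_temp_stage(session, local_paths, stage_name="my_temp_json_stage"):
    """Create a temporary stage and upload local .json files to it.

    `stage_name` is a hypothetical name chosen for this example.
    Returns the stage location to pass to the Snowpark reader.
    """
    session.sql(f"CREATE TEMPORARY STAGE IF NOT EXISTS {stage_name}").collect()
    stage_location = f"@{stage_name}"
    for local_path in local_paths:
        # Local files are referenced with the file:// prefix.
        session.file.put(f"file://{local_path}", stage_location)
    return stage_location
```

The returned value, for example `@my_temp_json_stage`, is what you then pass as the path when reading.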
2. schema parameter
Snowpark does not allow defining the schema as a parameter of the json function. As a workaround, you can use the snowflake.snowpark.DataFrameReader.schema function.
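A minimal sketch of this change, assuming `session` is a snowflake.snowpark.Session, `my_schema` is a snowflake.snowpark.types.StructType, and `stage_location` is a stage path such as the hypothetical `@my_temp_json_stage`:

```python
def read_json_with_schema(session, my_schema, stage_location):
    """Set the schema on the reader first, then call json() with only the path."""
    return session.read.schema(my_schema).json(stage_location)
```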
3. options parameters
Snowpark does not allow defining the extra options as parameters of the json function. As a workaround, for many of them you can use the snowflake.snowpark.DataFrameReader.option function to specify those parameters as options of the DataFrameReader.
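One way to apply this is to chain one option call per setting, as sketched below. The helper name is hypothetical and `session` is assumed to be a snowflake.snowpark.Session. The option names and values you pass must be ones Snowflake accepts for JSON files; check the Snowflake file format documentation rather than assuming the PySpark name carries over.

```python
def reader_with_options(session, options):
    """Return a DataFrameReader with each entry of `options` applied via option()."""
    reader = session.read
    for name, value in options.items():
        reader = reader.option(name, value)
    return reader
```

For example, `reader_with_options(session, {"DATE_FORMAT": "YYYY-MM-DD"}).json(stage_location)` would read with a date format set, assuming `DATE_FORMAT` is the appropriate counterpart of the PySpark `dateFormat` option for your data.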
The following options are not supported by Snowpark:
allowBackslashEscapingAnyCharacter
allowComments
allowNonNumericNumbers
allowNumericLeadingZero
allowSingleQuotes
allowUnquotedControlChars
allowUnquotedFieldNames
columnNameOfCorruptRecord
dropFieldIfAllNull
encoding
ignoreNullFields
lineSep
locale
mode
multiline
prefersDecimal
primitiveAsString
samplingRatio
timestampNTZFormat
timeZone
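When migrating many calls, it can help to separate the options in the list above from the rest before rewriting the code. The helper below is a hypothetical sketch in plain Python, not part of the SMA or Snowpark. It compares names case-insensitively, and the options it returns as candidates still need to be checked individually, since only many of them (not all) can be set through option.

```python
# PySpark json() options listed on this page as not supported by Snowpark.
UNSUPPORTED_JSON_OPTIONS = frozenset({
    "allowBackslashEscapingAnyCharacter", "allowComments",
    "allowNonNumericNumbers", "allowNumericLeadingZero", "allowSingleQuotes",
    "allowUnquotedControlChars", "allowUnquotedFieldNames",
    "columnNameOfCorruptRecord", "dropFieldIfAllNull", "encoding",
    "ignoreNullFields", "lineSep", "locale", "mode", "multiline",
    "prefersDecimal", "primitiveAsString", "samplingRatio",
    "timestampNTZFormat", "timeZone",
})


def split_json_options(options):
    """Split PySpark json() options into (candidates, unsupported) dicts."""
    unsupported_lower = {name.lower() for name in UNSUPPORTED_JSON_OPTIONS}
    candidates, unsupported = {}, {}
    for name, value in options.items():
        target = unsupported if name.lower() in unsupported_lower else candidates
        target[name] = value
    return candidates, unsupported
```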
Below is a full example of what the input code should look like after applying the suggestions above to make it work in Snowpark:
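The sketch below combines the three steps under stated assumptions: `session` is a snowflake.snowpark.Session, `my_schema` is a snowflake.snowpark.types.StructType, the stage name is hypothetical, and `DATE_FORMAT` is assumed to be the Snowflake counterpart of the PySpark `dateFormat` option. The `primitiveAsString` option is dropped because it appears in the unsupported list.

```python
def read_json_in_snowpark(session, local_paths, my_schema, stage_name="my_temp_json_stage"):
    """Read local .json files through a temporary stage with schema and options."""
    # 1. path: create a temporary stage and upload each file with the file:// prefix.
    session.sql(f"CREATE TEMPORARY STAGE IF NOT EXISTS {stage_name}").collect()
    stage_location = f"@{stage_name}"
    for local_path in local_paths:
        session.file.put(f"file://{local_path}", stage_location)

    # 2. schema and 3. options: set both on the reader instead of on json().
    return (
        session.read
        .schema(my_schema)
        .option("DATE_FORMAT", "YYYY-MM-DD")  # assumed counterpart of dateFormat
        .json(stage_location)
    )
```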
Additional recommendations
For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.