The tool adds the EWI SPRKPY1089 to indicate that validation is required.
df = spark.createDataFrame([(1, "myVal"), (2, "myVal2"), (None, "myVal3")])
#EWI: SPRKPY1089 => The pyspark.sql.readwriter.DataFrameWriter.options values in Snowpark may be different, so required validation might be needed.
df.write.options(nullValue="myVal", sep=",").csv("some_path")
Recommended fix
The Snowpark API supports these parameters, so the only action needed is to check the behavior after the migration. Please refer to the Equivalences table below to see the supported parameters.
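For reference, the sketch below shows what the validated write could look like in Snowpark, using the Snowflake option names from the Equivalences table. The stage path @my_stage/out, the connection placeholders, and the column names are illustrative assumptions, and DataFrameWriter.csv with format_type_options is assumed to be available in your Snowpark version:

from snowflake.snowpark import Session

# Placeholder connection parameters; replace them with real values.
session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
}).create()

# Same sample data as the scenario above; column names are assumed.
df = session.create_dataframe([(1, "myVal"), (2, "myVal2"), (None, "myVal3")], schema=["id", "value"])

# sep="," maps to FIELD_DELIMITER; nullValue would map to NULL_IF (see the Equivalences table).
# Validate the unloaded files to confirm the behavior matches the Spark output.
df.write.csv("@my_stage/out", format_type_options={"FIELD_DELIMITER": ","})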
Scenario 2
Input
This scenario shows a usage of options that adds a header option, which is not supported:

df = spark.createDataFrame([(1, "myVal"), (2, "myVal2"), (None, "myVal3")])
df.write.options(header=True, sep=",").csv("some_path")
Output
The tool adds the EWI SPRKPY1089 to indicate that validation is required.

df = spark.createDataFrame([(1, "myVal"), (2, "myVal2"), (None, "myVal3")])
#EWI: SPRKPY1089 => The pyspark.sql.readwriter.DataFrameWriter.options values in Snowpark may be different, so required validation might be needed.
df.write.csv("some_path")
Recommended fix
For this scenario, it is recommended to evaluate the Snowpark format type options to see whether they can be adjusted to your needs. Also, check the behavior after the change.
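For reference, Snowflake supports a HEADER copy option when unloading data, so one possible rewrite is sketched below. It assumes df is a Snowpark DataFrame (for example, the one created in the sketch under Scenario 1) and that @my_stage/out is an illustrative stage path:

# Sketch only: header is passed as a copy option, and sep becomes FIELD_DELIMITER.
df.write.copy_into_location(
    "@my_stage/out",
    file_format_type="csv",
    format_type_options={"FIELD_DELIMITER": ","},
    header=True,
)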
Scenario 3
Input
This scenario adds a sep option, which is supported, but uses the JSON method:

df = spark.createDataFrame([(1, "myVal"), (2, "myVal2"), (None, "myVal3")])
df.write.options(nullValue="myVal", sep=",").json("some_path")
Output
The tool adds the EWI SPRKPY1089 to indicate that validation is required.

df = spark.createDataFrame([(1, "myVal"), (2, "myVal2"), (None, "myVal3")])
#EWI: SPRKPY1089 => The pyspark.sql.readwriter.DataFrameWriter.options values in Snowpark may be different, so required validation might be needed.
df.write.json("some_path")
Note: this scenario also applies to PARQUET.
Recommended fix
The JSON file format does not support the parameter sep, so it is recommended to evaluate the Snowpark format type options to see whether the write can be adjusted to your needs. Also, check the behavior after the change.
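For reference, one possible rewrite is sketched below; it drops the sep parameter, which has no JSON equivalent. Note that Snowflake unloads JSON from a single column, so the sketch packs each row into one object first. The stage path and the column names id and value are illustrative assumptions:

from snowflake.snowpark.functions import col, lit, object_construct

# Snowflake unloads JSON from a single column, so build one OBJECT per row;
# "id" and "value" are assumed column names.
json_df = df.select(object_construct(lit("id"), col("id"), lit("value"), col("value")))

# Sketch only: there is no sep equivalent for the JSON file format.
json_df.write.copy_into_location("@my_stage/out", file_format_type="json")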
Additional recommendations
Since some parameters are not supported, it is recommended to check the Equivalences table below and verify the behavior after the transformation.
Equivalences table

Snowpark supports the following equivalences for some parameters:
| PySpark Option | Snowflake Option | Supported File Formats | Description |
|---|---|---|---|
| SEP | FIELD_DELIMITER | CSV | One or more single byte or multibyte characters that separate fields in an input file. |
| LINESEP | RECORD_DELIMITER | CSV | One or more characters that separate records in an input file. |
| QUOTE | FIELD_OPTIONALLY_ENCLOSED_BY | CSV | Character used to enclose strings. |
| NULLVALUE | NULL_IF | CSV | String used to convert to and from SQL NULL. |
| DATEFORMAT | DATE_FORMAT | CSV | String that defines the format of date values in the data files to be loaded. |
| TIMESTAMPFORMAT | TIMESTAMP_FORMAT | CSV | String that defines the format of timestamp values in the data files to be loaded. |
If the parameter used is not in the list, the API throws an error.
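To apply the table's mapping when rewriting options calls by hand, a small helper like the one below can be used. The rename_options function and its option names are illustrative only and are not part of the SMA or Snowpark APIs:

# Hypothetical helper: renames PySpark CSV writer options to the Snowflake
# format type options listed in the Equivalences table. Unsupported names
# raise an error, mirroring the API behavior described above.
PYSPARK_TO_SNOWFLAKE = {
    "sep": "FIELD_DELIMITER",
    "lineSep": "RECORD_DELIMITER",
    "quote": "FIELD_OPTIONALLY_ENCLOSED_BY",
    "nullValue": "NULL_IF",
    "dateFormat": "DATE_FORMAT",
    "timestampFormat": "TIMESTAMP_FORMAT",
}

def rename_options(pyspark_options: dict) -> dict:
    renamed = {}
    for name, value in pyspark_options.items():
        if name not in PYSPARK_TO_SNOWFLAKE:
            raise ValueError(f"Unsupported writer option: {name}")
        renamed[PYSPARK_TO_SNOWFLAKE[name]] = value
    return renamed

# Example: yields {"FIELD_DELIMITER": ",", "NULL_IF": "myVal"}
print(rename_options({"sep": ",", "nullValue": "myVal"}))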