SPRKPY1029

pyspark.sql.readwriter.DataFrameReader.parquet has a workaround

Description

This issue appears when the tool detects a use of pyspark.sql.readwriter.DataFrameReader.parquet, which has a workaround.

Input code:

accounts = sparkSession.read.parquet(path, mergeSchema="True")

Output code:

#EWI: SPRKPY1029 => pyspark.sql.readwriter.DataFrameReader.parquet has a workaround, see documentation for more info
accounts = sparkSession.read.parquet(path, mergeSchema="True")

Scenario:

parquet(
    # Path
    path: str,
    # Options
    mode: Optional[str],
    partitionBy: Optional[Union[str, List[str]]],
    compression: Optional[str]
)

A couple of workarounds are possible in this scenario.

Path: The first parameter, "path", must be a stage to have an equivalent in Snowpark, so it is recommended to create a temporary stage and add each ".parquet" file to it using the "file://" prefix, as follows.

Source:

stringmap = sparkSession.read.parquet(["./data/file1.parquet", "./data/file2.parquet"])

Expected:

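# Build a fully qualified name for a temporary stage.
# Note: _generate_prefix is assumed to be available in scope; this snippet does not define it.
stage = f'{sparkSession.get_fully_qualified_current_schema()}.{_generate_prefix("TEMP_STAGE")}'
sparkSession.sql(f'CREATE TEMPORARY STAGE IF NOT EXISTS {stage}').show()
# Upload each local file to the stage using the "file://" prefix.
sparkSession.file.put("file://./data/file1.parquet", f"@{stage}")
sparkSession.file.put("file://./data/file2.parquet", f"@{stage}")
# Read from the stage instead of the list of local paths.
stringmap = sparkSession.read.parquet(stage)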
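
When several local files are involved, the staging steps above can be wrapped in a small helper. The following is a minimal sketch, not output produced by the tool: to_file_uri and stage_local_parquet_files are hypothetical names, the stage name is supplied by the caller, and the session argument is assumed to expose sql(...), file.put(...), and read.parquet(...) as they are used in the example above.

```python
from typing import Iterable


def to_file_uri(local_path: str) -> str:
    """Prefix a local path with "file://" unless it already has the prefix."""
    if local_path.startswith("file://"):
        return local_path
    return f"file://{local_path}"


def stage_local_parquet_files(session, local_paths: Iterable[str], stage: str):
    """Upload each local .parquet file to a temporary stage, then read the stage.

    Hypothetical helper: `session` is assumed to behave like the session
    object in the example above; `stage` is a name chosen by the caller.
    """
    paths = list(local_paths)
    not_parquet = [p for p in paths if not p.endswith(".parquet")]
    if not_parquet:
        raise ValueError(f"Expected .parquet files, got: {not_parquet}")

    # Create the stage once, then upload every file to it.
    session.sql(f"CREATE TEMPORARY STAGE IF NOT EXISTS {stage}").collect()
    for path in paths:
        session.file.put(to_file_uri(path), f"@{stage}")

    # Read from the stage instead of the original list of local paths.
    return session.read.parquet(stage)
```

Passing the session in as an argument keeps the helper independent of any global session object, which also makes it easy to exercise with a stub before running it against a real connection.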

Options: Snowpark does not accept the additional parameters as parameters either, but many of them can be specified through the "option" function instead, as follows.

Source:

stringmap = sparkSession.read.parquet(path, mergeSchema="True")

Expected:

stringmap = sparkSession.read.parquet(path)

The following options are not supported for Snowpark and have to be removed from the call, as mergeSchema is in the example above: compression, datetimeRebaseMode, int96RebaseMode, mergeSchema.
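
One way to apply the list above is to separate the keyword arguments before building the reader. This is a minimal sketch under stated assumptions: split_parquet_options and read_parquet_with_options are hypothetical helpers, the unsupported set is copied from the list above, and the reader returned by session.read is assumed to accept the remaining options through option(key, value). Whether a particular remaining option has any effect in Snowpark should still be checked against the Snowpark documentation.

```python
from typing import Any, Dict, List, Tuple

# Options listed above as not supported for Snowpark.
UNSUPPORTED_PARQUET_OPTIONS = frozenset(
    {"compression", "datetimeRebaseMode", "int96RebaseMode", "mergeSchema"}
)


def split_parquet_options(options: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return (options to keep, sorted names of options to drop)."""
    kept = {k: v for k, v in options.items() if k not in UNSUPPORTED_PARQUET_OPTIONS}
    dropped = sorted(k for k in options if k in UNSUPPORTED_PARQUET_OPTIONS)
    return kept, dropped


def read_parquet_with_options(session, path: str, **options: Any):
    """Read `path`, passing only the non-listed options via option()."""
    kept, dropped = split_parquet_options(options)
    if dropped:
        # Surface what was removed so the change in behavior is visible.
        print(f"Ignoring options not supported for Snowpark: {dropped}")
    reader = session.read
    for key, value in kept.items():
        reader = reader.option(key, value)
    return reader.parquet(path)
```

With this helper, a call such as read_parquet_with_options(sparkSession, path, mergeSchema="True") reduces to sparkSession.read.parquet(path), which matches the expected code shown above.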

Recommendation

For more support, you can email us at [email protected]. If you have a contract for support with Snowflake, reach out to your sales engineer and they can direct your support needs.
