Format/Load

The transformation replaces the format(...).load(...) chain with the reader method named by the parameter passed to format. For example, read.format('json').load(path) becomes read.json(path). If that parameter is not supported, the transformation is not performed and an EWI is generated instead.

Input Code:

from pyspark.sql import SparkSession

df  = SparkSession.read.format('json').load(path)
df2 = SparkSession.read.format('parquet').load(path)
df3 = SparkSession.read.format('orc').load(path)
df4 = SparkSession.read.format('csv').load(path)
df5 = SparkSession.read.format('jdbc') \
                       .option("driver","com.mysql.cj.jdbc.Driver") \
                       .option("url", "jdbc:mysql://localhost:3306/emp") \
                       .load()

Output Code:

import snowpark_extensions
from snowflake.snowpark import Session

df  = Session.read.json(path)
df2 = Session.read.parquet(path)
df3 = Session.read.orc(path)
df4 = Session.read.csv(path)
#EWI: SPRKPY1054 => pyspark.sql.readwriter.DataFrameReader.format with argument(s) value(s) "jdbc" is not supported
#EWI: SPRKPY1002 => pyspark.sql.readwriter.DataFrameReader.load is not supported
df5 = Session.read.format('jdbc') \
                       .option("driver","com.mysql.cj.jdbc.Driver") \
                       .option("url", "jdbc:mysql://localhost:3306/emp") \
                       .load()
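
The rule illustrated above can be sketched as a small string-rewriting function. This is only an illustration under stated assumptions, not the conversion tool's actual implementation: the function name rewrite_format_load and the SUPPORTED_FORMATS set are hypothetical, and the set contains only the four formats that the example above shows being converted.

```python
# Hypothetical sketch of the Format/Load rule; not the tool's real code.
# Assumption: only the formats converted in the example above are treated as supported.
SUPPORTED_FORMATS = {"json", "parquet", "orc", "csv"}


def rewrite_format_load(fmt: str, load_args: str = "") -> str:
    """Return the source text that replaces read.format(fmt).load(load_args)."""
    if fmt in SUPPORTED_FORMATS:
        # Supported: the format name becomes the reader method itself.
        return f"Session.read.{fmt}({load_args})"

    # Unsupported: keep the original chain and prepend an EWI comment.
    ewi = (
        "#EWI: SPRKPY1054 => pyspark.sql.readwriter.DataFrameReader.format "
        f'with argument(s) value(s) "{fmt}" is not supported'
    )
    original = f"Session.read.format('{fmt}').load({load_args})"
    return f"{ewi}\n{original}"


if __name__ == "__main__":
    print(rewrite_format_load("json", "path"))
    print(rewrite_format_load("jdbc"))
```

Calling rewrite_format_load("parquet", "path") yields the direct reader call, while an unsupported value such as "jdbc" leaves the format/load chain in place and flags it with the SPRKPY1054 message for manual review.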
