SPRKPY1009

pyspark.sql.dataframe.DataFrame.approxQuantile

Message: pyspark.sql.dataframe.DataFrame.approxQuantile has a workaround

Category: Warning.

Description

This issue appears when the tool detects the usage of pyspark.sql.dataframe.DataFrame.approxQuantile, which has a workaround.

Scenario

Input

It's important to understand that PySpark has two different approxQuantile functions. This example uses the DataFrame approxQuantile version.

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
data = [['Sun', 10],
        ['Mon', 64],
        ['Thr', 12],
        ['Wen', 15],
        ['Thu', 68],
        ['Fri', 14],
        ['Sat', 13]]

columns = ['Day', 'Ammount']
df = spark.createDataFrame(data, columns)
df.approxQuantile('Ammount', [0.25, 0.5, 0.75], 0)

Output

SMA places the EWI SPRKPY1009 above the line where approxQuantile is used, so you can use it to identify where a fix is needed.
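To locate the flagged statements in converted code, a small scan for the issue code can help. The following is a minimal sketch, not part of SMA: the helper name find_ewi_lines is hypothetical, and it assumes only that the EWI appears as a `#` comment containing the issue code on the line directly above the flagged statement. The marker text in the sample string is a placeholder, not the exact text SMA emits.

```python
from typing import List


def find_ewi_lines(source: str, code: str = "SPRKPY1009") -> List[int]:
    """Return 1-based line numbers of statements that follow an EWI comment."""
    flagged = []
    for index, line in enumerate(source.splitlines()):
        # Assumption: the EWI is a '#' comment that contains the issue code.
        if line.lstrip().startswith("#") and code in line:
            # index is 0-based, so the statement after the comment is index + 2.
            flagged.append(index + 2)
    return flagged


# Placeholder sample of converted code; the comment text is illustrative only.
converted = """df = spark.createDataFrame(data, columns)
# SPRKPY1009 (placeholder marker text)
df.approxQuantile('Ammount', [0.25, 0.5, 0.75], 0)
"""

print(find_ewi_lines(converted))
```

In this sample the comment sits on line 2, so the helper reports line 3 as the statement to review.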

Recommended fix

Use the Snowpark approxQuantile method. Some parameters don't match, so they require manual adjustment. For the example above, a recommended fix is to call the Snowpark method without the third argument.

The relativeError parameter of pyspark.sql.dataframe.DataFrame.approxQuantile doesn't exist in Snowpark.
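One way to make that adjustment is sketched below. It assumes df is a Snowpark DataFrame, whose approx_quantile method takes a column and a list of percentiles. The helper approx_quantile_compat is hypothetical and exists only to illustrate dropping relativeError while keeping PySpark-style call sites unchanged. In most cases the simpler fix is to remove the third argument directly, as shown in the comment.

```python
from typing import Any, Optional, Sequence


def approx_quantile_compat(
    df: Any,
    col: str,
    probabilities: Sequence[float],
    relative_error: Optional[float] = None,
):
    """Forward a PySpark-style approxQuantile call to Snowpark.

    Hypothetical helper: relative_error is accepted only so existing call
    sites keep working. It is ignored because Snowpark has no equivalent.
    """
    return df.approx_quantile(col, list(probabilities))


# Direct fix for the example in this page (df is a Snowpark DataFrame):
#   df.approx_quantile('Ammount', [0.25, 0.5, 0.75])
#
# Same call through the hypothetical helper, keeping the old signature:
#   approx_quantile_compat(df, 'Ammount', [0.25, 0.5, 0.75], 0)
```

Because the precision can no longer be requested through relativeError, the Snowpark result may differ slightly from a PySpark call that asked for exact quantiles with a relativeError of 0.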

Additional recommendations
