SPRKPY1011

pyspark.sql.dataframe.DataFrameStatFunctions.approxQuantile

Message: pyspark.sql.dataframe.DataFrameStatFunctions.approxQuantile has a workaround

Category: Warning.

Description

This issue appears when the tool detects the usage of pyspark.sql.dataframe.DataFrameStatFunctions.approxQuantile, which has a workaround.

Scenario

Input

It's important to understand that PySpark has two different approxQuantile functions (one on DataFrame and one on DataFrameStatFunctions). This example uses the DataFrameStatFunctions approxQuantile version.

from pyspark.sql import SparkSession, DataFrameStatFunctions

spark = SparkSession.builder.getOrCreate()

# Sample data: gain per quarter
data = [['Q1', 300000],
        ['Q2', 60000],
        ['Q3', 500002],
        ['Q4', 130000]]

columns = ['Quarter', 'Gain']
df = spark.createDataFrame(data, columns)

# Third argument is relativeError; 0 requests exact quantiles in PySpark
approx_quantiles = DataFrameStatFunctions(df).approxQuantile('Gain', [0.25, 0.5, 0.75], 0)
print(approx_quantiles)

Output

SMA adds the EWI SPRKPY1011 over the line where approxQuantile is used, so you can use it to identify where a fix is needed.

Recommended fix

You can use the Snowpark approxQuantile method instead. Some parameters don't match, so the call requires manual adjustment. For the example above, the recommended fix is to remove the third argument (relativeError, passed as 0) from the approxQuantile call.

The relativeError parameter of pyspark.sql.dataframe.DataFrame.approxQuantile doesn't exist in Snowpark.
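As a minimal sketch of that adjustment, the helper below keeps a PySpark-style signature but forwards only the column and the probabilities to Snowpark. The function name approx_quantile_compat is hypothetical (it is not part of SMA or Snowpark), and df is assumed to be a snowflake.snowpark.DataFrame whose stat attribute exposes approx_quantile(col, percentile).

```python
from typing import Any, List, Optional, Sequence


def approx_quantile_compat(
    df: Any,
    col: str,
    probabilities: Sequence[float],
    relative_error: Optional[float] = None,
) -> List[float]:
    """Hypothetical helper: PySpark-style arguments, Snowpark-style call.

    `df` is assumed to be a snowflake.snowpark.DataFrame.
    `relative_error` is accepted only so existing call sites keep working;
    it is ignored because Snowpark has no equivalent parameter.
    """
    # Direct equivalent for the example in this page (assumption: `df` is
    # already a Snowpark DataFrame):
    #   df.stat.approx_quantile('Gain', [0.25, 0.5, 0.75])
    return df.stat.approx_quantile(col, list(probabilities))
```

Because the relativeError argument is dropped, the result is always an approximation. A PySpark call that passed relativeError=0 to get exact quantiles may therefore return slightly different values after migration, so it can be worth comparing results on a known dataset.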

Additional recommendations
