SPRKPY1016
pyspark.sql.functions.collect_set has a workaround
This issue code has been deprecated since Spark Conversion Core Version 0.11.7
Message: pyspark.sql.functions.collect_set has a workaround
Category: Warning.
Description
This issue appears when the tool detects a use of pyspark.sql.functions.collect_set, which has a workaround.
Scenario
Input
Using collect_set to get the elements of colname without duplicates.
Output
SMA adds the EWI SPRKPY1016 at the line where collect_set is used, so you can identify where a fix is needed.
Recommended fix
Use the array_agg function and pass True as its second argument so that only distinct values are collected.
Additional recommendation
For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.