SPRKSCL1155

org.apache.spark.sql.functions.countDistinct has a workaround.

Description

This issue appears when the tool detects a use of the org.apache.spark.sql.functions.countDistinct function, which has a workaround.

Input code

val result = countDistinct("columnName1", "columnName2")

Output code

/*EWI: SPRKSCL1155 => org.apache.spark.sql.functions.countDistinct has a workaround, see documentation for more info*/
val result = countDistinct("columnName1", "columnName2")

Scenarios

This function has two overloads.

1. countDistinct(expr: Column, exprs: Column*)

Action: rename the function to count_distinct.
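
For example, the rename for this overload might look like the sketch below. It assumes the arguments are already Column values built with com.snowflake.snowpark.functions.col, and the column names are placeholders:

```scala
import com.snowflake.snowpark.functions.{col, count_distinct}

// Before (Spark): countDistinct(col("columnName1"), col("columnName2"))
// After (Snowpark): only the function name changes for the Column overload.
val result = count_distinct(col("columnName1"), col("columnName2"))
```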

2. countDistinct(columnName: String, columnNames: String*)

Action: rename the function to count_distinct and convert each String parameter to a Column using the com.snowflake.snowpark.functions.col function. For example:

val result = count_distinct(col("columnName1"), col("columnName2"))
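
When the String overload is called in many places, one option is to wrap the conversion in a small helper. The following is only a sketch: countDistinctByName is a hypothetical name, not part of Snowpark, and df in the usage comment is assumed to be an existing Snowpark DataFrame.

```scala
import com.snowflake.snowpark.Column
import com.snowflake.snowpark.functions.{col, count_distinct}

// Hypothetical helper (not a Snowpark API): keeps the String-based call shape
// and converts each column name to a Column before calling count_distinct.
def countDistinctByName(columnName: String, columnNames: String*): Column =
  count_distinct(col(columnName), columnNames.map(name => col(name)): _*)

// Usage, assuming df is an existing Snowpark DataFrame:
// df.select(countDistinctByName("columnName1", "columnName2"))
```

A helper like this keeps the call sites close to the original Spark code, at the cost of adding one extra function to maintain.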

Recommendation

  • For more support, you can email us at snowconvert-info@snowflake.com. If you have a contract for support with Snowflake, reach out to your sales engineer and they can direct your support needs.