SPRKSCL1101
org.apache.spark.sql.functions.broadcast, org.apache.spark.broadcast
This issue code has been deprecated since Spark Conversion Core 2.3.22
Message: Broadcast is not supported
Category: Warning
Description
This issue appears when the SMA detects a use of the org.apache.spark.sql.functions.broadcast function, which is not supported by Snowpark. This function is not supported because Snowflake does not support broadcast variables.
Scenario
Input
This scenario uses the org.apache.spark.sql.functions.broadcast function to create a broadcast object for use on each node of the Spark cluster.
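A minimal sketch of such an input, assuming a small lookup DataFrame is joined to a larger one. The name collegeDF comes from the recommended fix on this page; studentDF, the column names, the sample rows, and the local SparkSession settings are hypothetical and only serve to make the example self-contained.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.broadcast

object BroadcastJoinExample {
  // Joins students to colleges. Wrapping collegeDF in broadcast() is the
  // call that the SMA flags with SPRKSCL1101.
  def joinWithBroadcast(studentDF: DataFrame, collegeDF: DataFrame): DataFrame =
    studentDF.join(broadcast(collegeDF), Seq("CollegeName"))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (hypothetical settings).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("BroadcastJoinExample")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data.
    val studentDF = Seq(
      ("James", "Orozco", "Science"),
      ("Andrea", "Larson", "Business")
    ).toDF("FirstName", "LastName", "CollegeName")

    val collegeDF = Seq(
      ("Arts", 1),
      ("Business", 2),
      ("Science", 3)
    ).toDF("CollegeName", "CollegeCode")

    joinWithBroadcast(studentDF, collegeDF).show()
    spark.stop()
  }
}
```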
Output
The SMA adds the EWI SPRKSCL1101 to the output code to let you know that this function is not supported by Snowpark.
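As a rough illustration, the annotated output for a join like the one described in the input might look like the fragment below. The issue code and message are the ones listed on this page; the exact layout and placement of the comment are an assumption, and studentDF and the column name are hypothetical. The fragment is not expected to compile as-is, because broadcast has no Snowpark counterpart.

```scala
// Fragment of converted output (illustrative only; not compilable on its own).
val joinedDF = studentDF.join(
  /*EWI: SPRKSCL1101 => Broadcast is not supported*/
  broadcast(collegeDF),
  Seq("CollegeName")
)
```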
Recommended fix
Snowflake manages storage and workload distribution across its clusters, which makes broadcast objects inapplicable. In many cases the broadcast call may not be needed at all, but each case requires further analysis.
The recommended approach is to replace a broadcast Spark DataFrame with a regular Snowpark DataFrame, or to use a DataFrame method such as join.
For the proposed input, the fix is to adapt the join to use the DataFrame collegeDF directly, without wrapping it in broadcast.
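A minimal sketch of that fix in Snowpark Scala, assuming the join key is a column named CollegeName. Only collegeDF is named on this page; studentDF, the sample rows, and the profile.properties connection file are hypothetical, and a real Snowflake connection is needed to run it.

```scala
import com.snowflake.snowpark.{DataFrame, Session}

object JoinWithoutBroadcast {
  // Same join as before, but collegeDF is passed directly:
  // no broadcast wrapper is needed because Snowflake decides
  // how to distribute the data.
  def joinColleges(studentDF: DataFrame, collegeDF: DataFrame): DataFrame =
    studentDF.join(collegeDF, Seq("CollegeName"))

  def main(args: Array[String]): Unit = {
    // Hypothetical connection file; replace with your own settings.
    val session = Session.builder.configFile("profile.properties").create

    // Hypothetical sample data.
    val studentDF = session.createDataFrame(Seq(
      ("James", "Orozco", "Science"),
      ("Andrea", "Larson", "Business")
    )).toDF("FirstName", "LastName", "CollegeName")

    val collegeDF = session.createDataFrame(Seq(
      ("Arts", 1),
      ("Business", 2),
      ("Science", 3)
    )).toDF("CollegeName", "CollegeCode")

    joinColleges(studentDF, collegeDF).show()
    session.close()
  }
}
```

Whether dropping the wrapper is sufficient depends on why the original code used broadcast, so treat this as a starting point rather than a guaranteed equivalent.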
Additional recommendations
Snowflake's architecture guide provides insight into how Snowflake manages storage.
The Snowpark DataFrame reference can help you adapt a particular broadcast scenario.
For more support, you can email us at sma-support@snowflake.com or post an issue in the SMA.