SPRKPY1065

pyspark.context.SparkContext.broadcast

Message: The pyspark.context.SparkContext.broadcast does not apply since snowflake use data-clustering mechanism to compute the data.

Category: Warning

Description

This issue appears when the tool detects the usage of element pyspark.context.SparkContext.broadcast, which is not necessary due to the use of data-clustering of Snowflake.

Input code

In this example a broadcast variable is created, these variables allows data to be share more efficiently through all nodes.

sc = SparkContext(conf=conf_spark)

mapping = {1: 10001, 2: 10002}

bc = sc.broadcast(mapping)

Output code

The SMA adds an EWI message indicating that the broadcast it's not required.

sc = conf_spark

mapping = {1: 10001, 2: 10002}
#EWI: SPRKPY1065 => The element does not apply since snowflake use data-clustering mechanism to compute the data.

bc = sc.broadcast(mapping)

Recommended fix

Remove any usages of pyspark.context.SparkContext.broadcast.

sc = conf_spark

mapping = {1: 10001, 2: 10002}

Additional recommendations

Last updated