SPRKPY1042
pyspark.sql.functions.posexplode has a workaround
This issue appears when the tool detects a use of pyspark.sql.functions.posexplode, which has a workaround.
Input code:
df.select(posexplode(colList))
df.select(posexplode(colDict))
Output code:
#EWI: SPRKPY1042 => pyspark.sql.functions.posexplode has a workaround, see documentation for more info
df.select(posexplode(colList))
#EWI: SPRKPY1042 => pyspark.sql.functions.posexplode has a workaround, see documentation for more info
df.select(posexplode(colDict))
posexplode(col: ColumnOrName) -> pyspark.sql.column.Column
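For reference, the behavior being replaced can be modeled in plain Python. This is only a sketch of the semantics (one output row per element, with a zero-based position), not Spark or Snowpark code. The helper name posexplode_rows is hypothetical and introduced purely for illustration.

```python
from collections.abc import Mapping


def posexplode_rows(value):
    """Plain-Python model of posexplode applied to a single cell.

    Hypothetical helper for illustration only; not part of PySpark or Snowpark.
    """
    if isinstance(value, Mapping):
        # Map column: one (pos, key, value) row per entry, pos starts at 0.
        return [(pos, key, val) for pos, (key, val) in enumerate(value.items())]
    # Array column: one (pos, col) row per element, pos starts at 0.
    return [(pos, item) for pos, item in enumerate(value)]


list_rows = posexplode_rows([10, 20, 30])
map_rows = posexplode_rows({"a": 1, "b": 2})

print(list_rows)  # [(0, 10), (1, 20), (2, 30)]
print(map_rows)   # [(0, 'a', 1), (1, 'b', 2)]
```

The two branches correspond to the two scenarios described next: a list yields pos and col, while a map yields pos, key, and value.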
When column contains a list of values
Action: Use functions.row_number to get the position, and snowflake.snowpark.Session.flatten with the name of the field to get the values (for lists) or the keys and values (for dictionaries). Example:
df.select(row_number().as_("pos"), flatten(colList)["value"].as_("col"))
When column contains a map/dictionary (keys/values)
Action: Use snowflake.snowpark.Session.flatten with the name of the field to get the keys and values of the dictionary. Example:
df.select(row_number().as_("pos"), flatten(colDict)["key"], flatten(colDict)["value"])
# or
flattened = flatten(colDict)
df.select(row_number().as_("pos"), flattened["key"], flattened["value"])
Note: row_number is not fully equivalent to posexplode, because it starts at 1, whereas the Spark method starts at 0.
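The offset described in the note can be illustrated in plain Python. This is a minimal sketch with made-up values, not Snowpark code, and the helper name to_zero_based is hypothetical. It assumes the only difference to correct is the starting index, so subtracting 1 from each row number recovers the positions posexplode would have produced.

```python
def to_zero_based(row_numbers):
    """Shift one-based row numbers to zero-based positions.

    Hypothetical helper for illustration only.
    """
    return [n - 1 for n in row_numbers]


values = ["a", "b", "c"]

# What a row_number-style counter yields: 1, 2, 3
one_based = list(range(1, len(values) + 1))

# What posexplode would have reported as "pos": 0, 1, 2
zero_based = to_zero_based(one_based)

print(list(zip(one_based, values)))   # [(1, 'a'), (2, 'b'), (3, 'c')]
print(list(zip(zero_based, values)))  # [(0, 'a'), (1, 'b'), (2, 'c')]
```

In migrated code, the analogous adjustment would presumably be to subtract 1 from the row number expression before aliasing it as "pos"; the source does not state this, so treat it as an assumption and verify it against your data.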
- For more support, you can email us at [email protected]. If you have a contract for support with Snowflake, reach out to your sales engineer and they can direct your support needs.