SQL Embedded code

Transform the SQL Code embedded in your Scala / Python Code


The current use case supported by SMA is the pyspark.sql function.

Python or Scala code sometimes contains embedded SQL code that also requires transformation. SMA parses embedded SQL code in files with the following extensions:

  • Python files (.py)

  • Scala files (.scala)

  • Jupyter Notebook (.ipynb)

  • Databricks (.python, .scala)

  • Databricks Notebooks (.dbc)

Embedded SQL Code Transformation Samples

Supported Case

# Original in Spark
spark.sql("""MERGE INTO people_target pt
USING people_source ps
ON (pt.person_id1 = ps.person_id2)
WHEN NOT MATCHED BY SOURCE THEN DELETE""")

Unsupported Cases

When the SMA identifies an unsupported case, it generates an EWI in the output code.

Some unsupported scenarios are:

  • Using string variables with SQL code

  • Using basic concatenation to create SQL code

  • Using interpolations to create SQL code

  • Using functions that create SQL queries
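
The four scenarios listed above might look like the following in Python. This is an illustrative sketch, not SMA input or output taken from the tool: FakeSession is a hypothetical stand-in for a Spark session so the snippet runs without Spark installed, and the table name and the build_query helper are assumptions made up for the example.

```python
# Hypothetical stand-in for a SparkSession; it only records the query text.
class FakeSession:
    def __init__(self):
        self.received = []

    def sql(self, query: str) -> str:
        # A real session would execute the query; here we just keep the string.
        self.received.append(query)
        return query


spark = FakeSession()
table = "people_target"  # assumed table name, for illustration only

# 1. String variable holding the SQL code
query = "SELECT * FROM people_target"
spark.sql(query)

# 2. Basic concatenation used to create the SQL code
spark.sql("SELECT * FROM " + table)

# 3. Interpolation (an f-string) used to create the SQL code
spark.sql(f"SELECT * FROM {table}")


# 4. Function that creates the SQL query (hypothetical helper)
def build_query(name: str) -> str:
    return "SELECT * FROM " + name


spark.sql(build_query(table))
```

In each call the SQL text is assembled at runtime instead of being written as a single string literal inside the call, which is what distinguishes these patterns from the supported case.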

Unsupported Cases and EWI messages

  • In Scala code, the error code for unsupported embedded SQL is SPRKSCL1173.

  • In Python code, the error code for unsupported embedded SQL is SPRKPY1077.
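
To find where these EWIs landed after a conversion, one option is to search the output code for the two codes. The sketch below assumes only that the code string appears literally somewhere on the affected line; the exact layout of the EWI message is not assumed. The function name and the choice to scan only .py and .scala files are illustrative choices, not part of SMA.

```python
from pathlib import Path

# The two codes listed above for unsupported embedded SQL.
EMBEDDED_SQL_EWI_CODES = ("SPRKSCL1173", "SPRKPY1077")


def find_embedded_sql_ewis(root):
    """Return (path, line number, code) for each line that mentions one of the codes."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only plain-text source files are scanned in this sketch.
        if not path.is_file() or path.suffix not in {".py", ".scala"}:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            for code in EMBEDDED_SQL_EWI_CODES:
                if code in line:
                    hits.append((str(path), number, code))
    return hits
```

Calling find_embedded_sql_ewis on the converted output folder returns one tuple per match, which can serve as a checklist of embedded SQL statements to review by hand.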
