SQL Embedded code

Transform the SQL Code embedded in your Scala / Python Code

Currently, the only use case SMA supports is SQL passed to the pyspark.sql function, as in a spark.sql(...) call.

Python or Scala code sometimes contains embedded SQL that requires transformation. SMA parses embedded SQL code in the following file types:

  • Python files (.py)

  • Scala files (.scala)

  • Jupyter Notebook (.ipynb)

  • Databricks (.python, .scala)

  • Databricks Notebooks (.dbc)

Embedded SQL Code Transformation Samples

Supported Case

# Original in Spark
spark.sql("""MERGE INTO people_target pt
USING people_source ps
ON (pt.person_id1 = ps.person_id2)
WHEN NOT MATCHED BY SOURCE THEN DELETE""")

Unsupported Cases

When SMA identifies an unsupported case, it generates an EWI (error, warning, or issue message) in the output code.

Some unsupported scenarios are:

  • Passing a string variable that holds the SQL code.

  • Using basic concatenation to create the SQL code.

  • Using interpolation to create the SQL code.

  • Using functions that create SQL queries.
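The four scenarios can be sketched in Python as follows. This is an illustrative sketch, not output from SMA: the `FakeSpark` class is a hypothetical stand-in for a SparkSession so that the snippet runs without Spark installed, and the table name, the column name, and the `build_query` helper are assumptions chosen to echo the supported sample.

```python
# Hypothetical stand-in for a SparkSession so this sketch runs without Spark.
# In real code, `spark` would be a pyspark.sql.SparkSession.
class FakeSpark:
    def __init__(self):
        self.queries = []

    def sql(self, query):
        # Record the SQL text instead of executing it.
        self.queries.append(query)
        return query


spark = FakeSpark()
table = "people_target"  # assumed table name, for illustration only

# 1. String variable: the SQL text is assigned first, then passed by name.
query = "SELECT person_id1 FROM people_target"
spark.sql(query)

# 2. Basic concatenation: the SQL text is assembled with the + operator.
spark.sql("SELECT person_id1 FROM " + table)

# 3. Interpolation: the SQL text is assembled with an f-string.
spark.sql(f"SELECT person_id1 FROM {table}")


# 4. Function: a (hypothetical) helper returns the SQL text.
def build_query(table_name):
    return "SELECT person_id1 FROM " + table_name


spark.sql(build_query(table))
```

All four calls send the same SQL text at run time, but in none of them is that text written as a single string literal directly inside the spark.sql(...) call, which is the form shown in the supported case.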

Unsupported Cases and EWI messages

  • In Scala code, the EWI code for unsupported embedded SQL is SPRKSCL1173.

  • In Python code, the EWI code for unsupported embedded SQL is SPRKPY1077.
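One way to locate these cases after a conversion is to search the output code for the two codes. The sketch below is a minimal example under stated assumptions: the function name `find_embedded_sql_ewis` and the choice to scan only .py and .scala files are hypothetical, and the search matches the code strings themselves rather than assuming any particular EWI comment format.

```python
from pathlib import Path

# EWI codes for unsupported embedded SQL, as listed above.
EMBEDDED_SQL_EWI_CODES = ("SPRKSCL1173", "SPRKPY1077")


def find_embedded_sql_ewis(root):
    """Return (path, line_number, code) for each line that mentions a code.

    Only .py and .scala files under `root` are scanned (an assumption made
    for this sketch; extend the suffix tuple for other output file types).
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in (".py", ".scala"):
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            for code in EMBEDDED_SQL_EWI_CODES:
                if code in line:
                    hits.append((str(path), number, code))
    return hits
```

Calling `find_embedded_sql_ewis("path/to/output")` on a converted project (the path is a placeholder) returns one tuple per matching line, which can then be reviewed by hand.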
