Output code setup
Before running the migrated Spark source code, there are a couple of things to consider.
The Snowpark and Snowpark Extensions libraries must be referenced from the migrated project.
Snowpark Extensions is a support library that extends the standard Snowpark library with functionality that is present in Apache Spark but not currently supported by Snowpark. Its goal is to ease the conversion of projects from Apache Spark to Snowpark.
Here are the steps to reference the Snowpark and Snowpark Extensions libraries from the migrated code.
The tool will try to add these dependencies to the project configuration file. Once the references have been added to the project configuration file, the build tool takes care of resolving them.
Based on the extension of the project configuration file, the tool adds the references as follows:
build.gradle
dependencies {
  implementation 'com.snowflake:snowpark:1.6.2'
  implementation 'net.mobilize.snowpark-extensions:snowparkextensions:0.0.9'
  ...
}
build.sbt
...
libraryDependencies += "com.snowflake" % "snowpark" % "1.6.2"
libraryDependencies += "net.mobilize.snowpark-extensions" % "snowparkextensions" % "0.0.9"
...
pom.xml
<dependencies>
  <dependency>
    <groupId>com.snowflake</groupId>
    <artifactId>snowpark</artifactId>
    <version>1.6.2</version>
  </dependency>
  <dependency>
    <groupId>net.mobilize.snowpark-extensions</groupId>
    <artifactId>snowparkextensions</artifactId>
    <version>0.0.9</version>
  </dependency>
  ...
</dependencies>
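After the build tool has resolved the references, a quick classpath check can confirm that both libraries are visible to the migrated project. The following is a minimal sketch, not part of the tool: `DependencyCheck` and `missingClasses` are hypothetical names, and the two class names are assumptions based on the package names used on this page (`com.snowflake.snowpark` and `com.snowflake.snowpark_extensions.Extensions`).

```scala
import scala.util.Try

// Hypothetical helper, not generated by the migration tool.
object DependencyCheck {
  // Returns the class names that cannot be found on the current classpath.
  // The class is looked up without being initialized.
  def missingClasses(classNames: Seq[String]): Seq[String] =
    classNames.filter { name =>
      Try(Class.forName(name, false, getClass.getClassLoader)).isFailure
    }

  def main(args: Array[String]): Unit = {
    // Assumed entry points; adjust if the library versions in use differ.
    val required = Seq(
      "com.snowflake.snowpark.Session",
      "com.snowflake.snowpark_extensions.Extensions$"
    )
    val missing = missingClasses(required)
    if (missing.isEmpty) println("All dependencies resolved.")
    else missing.foreach(name => println(s"Not on classpath: $name"))
  }
}
```

Running `DependencyCheck` from the migrated project should list any library that the build tool failed to resolve.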
The tool adds these two import statements to every output .scala file:
import com.snowflake.snowpark_extensions.Extensions._
import com.snowflake.snowpark_extensions.Extensions.functions._
In the following example, hex and isin are supported by Spark but not by Snowpark. The migrated code still works because both functions are provided by Snowpark Extensions.
Input code:
package com.mobilize.spark

import org.apache.spark.sql._

object Main {
  def main(args: Array[String]): Unit = {
    var languageArray = Array("Java");
    var languageHex = hex(col("language"));
    col("language").isin(languageArray: _*);
  }
}
Output code:
package com.mobilize.spark

import com.snowflake.snowpark._
import com.snowflake.snowpark_extensions.Extensions._
import com.snowflake.snowpark_extensions.Extensions.functions._

object Main {
  def main(args: Array[String]): Unit = {
    var languageArray = Array("Java");
    // hex does not exist in Snowpark. It is an extension.
    var languageHex = hex(col("language"));
    // isin does not exist in Snowpark. It is an extension.
    col("language").isin(languageArray: _*)
  }
}
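One common way for a Scala library to add a method such as isin to an existing class is an implicit class that a wildcard import brings into scope. The sketch below illustrates that general mechanism only. `Column`, `SketchExtensions`, and `ExtensionDemo` are hypothetical stand-ins, and nothing here is taken from the actual Snowpark Extensions source.

```scala
// Hypothetical stand-in for a library column type that has no isin method.
final case class Column(name: String)

object SketchExtensions {
  // Importing SketchExtensions._ brings this implicit class into scope,
  // which makes isin callable as if it were defined on Column itself.
  implicit class ColumnOps(private val c: Column) extends AnyVal {
    // Renders a SQL-like membership test; purely illustrative.
    def isin(values: String*): String =
      values.map(v => s"'$v'").mkString(s"${c.name} IN (", ", ", ")")
  }
}

object ExtensionDemo {
  import SketchExtensions._

  def render(): String = {
    val languageArray = Array("Java", "Scala")
    // Compiles only because the wildcard import above is present.
    Column("language").isin(languageArray: _*)
  }

  def main(args: Array[String]): Unit = println(render())
}
```

Removing the `import SketchExtensions._` line makes the call to `isin` fail to compile. If the real library relies on the same mechanism, the same would happen when the extension imports are dropped from an output file.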