SMA CLI Walkthrough

A walkthrough on using the SMA CLI

The Snowpark Migration Accelerator (SMA) takes code written in Python or Scala for Spark, reports on its compatibility with Snowpark, and converts some references from the Spark API to the Snowpark API. Where conversion is not possible, the tool identifies what cannot be converted and reports on it. Beyond the Spark API, the SMA also inventories third-party library imports in script files and notebooks, and produces an editable report that can be used to share how compatible a codebase's Spark code is with Snowpark.

Recently, Snowflake published the CLI for the SMA. Let’s walk through how to use it by itself and in a script.

Using the CLI

You can download the CLI from the Download and Access section of this documentation. Choose the build that matches the operating system of the machine or container where you will be using the CLI. Once downloaded, the CLI can be placed in any location or container that you can access.

NOTE: This walkthrough was done using a Mac. The screenshots shown here are from a Mac computer, but Windows and Linux users can expect a similar experience.

Unzip the downloaded package file (.zip or .tar depending on your OS). The package contains several folders; the CLI itself is in the orchestrator folder.

If you launch a terminal or command prompt window in this folder, you can run the CLI. To test this, check the version number of the CLI by running ./sma --version
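A common stumbling block at this step is running the command from the wrong folder. The sketch below checks for the binary before calling it. The helper name sma_is_runnable is hypothetical (it is not part of the SMA), and the sketch assumes the binary is named sma as in the command above.

```shell
# Hypothetical helper (not part of the SMA): succeeds only if an executable
# file named "sma" exists in the given folder (default: current folder).
sma_is_runnable() {
    dir="${1:-.}"
    [ -f "$dir/sma" ] && [ -x "$dir/sma" ]
}

if sma_is_runnable .; then
    # Same version check as in the walkthrough.
    ./sma --version
else
    echo "No executable ./sma found here; change into the orchestrator folder first." >&2
fi
```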

The CLI should respond with its version number.

The SMA CLI (like the SMA application) is a local application. To scan code files with the SMA, the files must be on a file system that the CLI can read. The CLI processes the same file types as the SMA application. You can find the supported file types in the SMA documentation.

NOTE: If you need a sample codebase to run the CLI on, you can use the sample codebase mentioned in the Assessment or Conversion walkthroughs in the SMA documentation.

The list of arguments that can be passed to the CLI is available in the SMA documentation, but we’ll walk through some of the key ones here.

The SMA CLI defaults to running in Conversion mode, not Assessment mode. (You can tell the CLI to run in Assessment mode by passing the -a argument.) Running a conversion requires a valid access code. You can check whether you have one by running the following command:

./sma show-ac

If you do not have a valid access code, you can request one by following the instructions in the SMA documentation. One will be emailed to you, and you can use the install access code parameter in the CLI to install it.

Once you are activated, you can run a conversion. You must specify an input and an output directory. If you have not set up a project file already, you also need to specify a user email, organization name, and project name. On later runs these three parameters are not necessary as long as you keep the same values, so you can pass only the input and output directories. A first run could look like this:

./sma -i '/your/INput/directory/path/here' -o '/your/OUTput/directory/path/here' -e your-email@example.com -c Your-Organization -p Your-Project-Name
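If you run conversions often, the command can be wrapped in a small shell function. This is only a sketch: run_conversion and the SMA_BIN variable are hypothetical names introduced here, and only the -i, -o, -e, -c, and -p flags come from this walkthrough.

```shell
# Hypothetical wrapper (not part of the SMA): checks that the input directory
# exists, then calls the CLI with the -i and -o flags from this walkthrough.
# SMA_BIN is an assumed variable; it defaults to ./sma in the current folder.
run_conversion() {
    if [ "$#" -lt 2 ]; then
        echo "usage: run_conversion INPUT_DIR OUTPUT_DIR [extra sma arguments]" >&2
        return 2
    fi
    input_dir="$1"
    output_dir="$2"
    shift 2
    if [ ! -d "$input_dir" ]; then
        echo "Input directory not found: $input_dir" >&2
        return 1
    fi
    # Remaining arguments (for example -e, -c, -p) are passed through unchanged.
    "${SMA_BIN:-./sma}" -i "$input_dir" -o "$output_dir" "$@"
}

# First run, with project details:
#   run_conversion '/your/INput/directory/path/here' '/your/OUTput/directory/path/here' \
#       -c Your-Organization -p Your-Project-Name
# Later runs, reusing the stored project details:
#   run_conversion '/your/INput/directory/path/here' '/your/OUTput/directory/path/here'
```

Quoting every path keeps directories with spaces in their names intact.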

The CLI then prints a summary of the execution information that you entered and asks whether you’d like to continue.

If you’d rather not be prompted to confirm, you can pass an additional parameter (--yes or -y) to bypass the confirmation. This is essential when calling the CLI programmatically from a script, where nobody is present to answer the prompt.

The tool will print out a lot of information on what it is doing.

As long as it continues to print, it is still running. The prompt will show up again when the tool has finished. The amount that is printed might seem overwhelming, as the tool prints every process it runs through, every issue it encounters, and every step that it completes (or fails to complete). You do not need to spend much time poring over this. All of this will also be available in the Logs output folder.
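If you would like your own copy of the console output in addition to the Logs folder, a generic shell pattern is to pipe the run through tee. The sketch below is not SMA-specific: run_and_capture is a hypothetical helper, and the capture file name in the usage comment is an arbitrary choice.

```shell
# Hypothetical helper (not part of the SMA): runs any command, shows its
# output live, and saves a copy (stdout and stderr together) to a file.
# Note: the function's exit status is that of tee, not of the command.
run_and_capture() {
    capture_file="$1"
    shift
    "$@" 2>&1 | tee "$capture_file"
}

# Usage with the conversion command from this walkthrough:
#   run_and_capture sma-console.txt ./sma \
#       -i '/your/INput/directory/path/here' -o '/your/OUTput/directory/path/here' -y
```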

Viewing the Output

The SMA CLI’s output is identical to the SMA application’s output. In the output directory that you specified when running the tool, you will find the Reports, Logs, and Output folders. The converted code is in the Output folder.
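After a run, a quick way to confirm that the three folders were produced is a check like the one sketched here. check_sma_output is a hypothetical helper, and the sketch assumes the folders on disk are spelled exactly as they are named above (Reports, Logs, Output).

```shell
# Hypothetical check (not part of the SMA): reports which of the three folders
# named in this walkthrough exist under the given output directory.
# Assumption: on-disk names match "Reports", "Logs", and "Output" exactly.
check_sma_output() {
    output_dir="$1"
    missing=0
    for name in Reports Logs Output; do
        if [ -d "$output_dir/$name" ]; then
            echo "found:   $output_dir/$name"
        else
            echo "missing: $output_dir/$name"
            missing=1
        fi
    done
    # Exit status 0 only when all three folders are present.
    return "$missing"
}

# Usage:
#   check_sma_output '/your/OUTput/directory/path/here'
```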

For more information on working through converted code output by the SMA, you can view the conversion walkthrough.

Running the CLI Programmatically

Coming soon! The SMA team will share a script that allows you to call the SMA CLI programmatically on multiple directories.
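Until that script is published, the loop below gives a rough idea of what calling the CLI on multiple directories could look like. It is not the SMA team's script. convert_all and SMA_BIN are hypothetical names, and the sketch assumes that the project details (email, organization, project name) are already stored and that the CLI returns a non-zero exit status when a run fails, which this walkthrough does not confirm.

```shell
# Hypothetical batch runner (NOT the SMA team's script): converts every
# immediate subdirectory of input_root into a same-named subdirectory of
# output_root. SMA_BIN is an assumed variable; it defaults to ./sma.
convert_all() {
    input_root="$1"
    output_root="$2"
    failures=0
    for project_dir in "$input_root"/*/; do
        # If nothing matched, the pattern is left as-is; skip it.
        [ -d "$project_dir" ] || continue
        name=$(basename "$project_dir")
        # -y bypasses the confirmation prompt so the loop never waits for input.
        # Assumption: the CLI exits non-zero on failure.
        if ! "${SMA_BIN:-./sma}" -i "$input_root/$name" -o "$output_root/$name" -y; then
            echo "Conversion failed for $name" >&2
            failures=$((failures + 1))
        fi
    done
    # Succeed only if every run succeeded.
    [ "$failures" -eq 0 ]
}

# Usage:
#   convert_all '/path/with/one/folder/per/codebase' '/path/for/all/outputs'
```

Writing each codebase to its own output subdirectory keeps the Reports, Logs, and Output folders of different runs from mixing.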

You can give the CLI a try today. If you have any questions, you can reach out to the SMA team at [email protected].

