Release Notes for the Snowpark Migration Accelerator (SMA)
The release notes below are organized by release date. Version numbers for both the application and the conversion core are listed for each release.
July 17, 2025
Application & CLI Version 2.7.6
Included SMA Core Versions
Snowpark Conversion Core 8.0.30
Added
Adjusted mappings for `spark.DataReader` methods:
- `DataFrame.union` is now `DataFrame.unionAll`.
- `DataFrame.unionByName` is now `DataFrame.unionAllByName`.
Added multi-level artifact dependency columns to the artifact inventory.
Added new Pandas EWI documentation, from `PNDSPY1005` to `PNDSPY1010`.
Added a specific EWI for `pandas.core.series.Series.apply`.
Changed
Bumped the version of the Snowpark Pandas API supported by the SMA from `1.27.0` to `1.30.0`.
Fixed
Fixed an issue with missing values in the formula to get the SQL readiness score.
Fixed a bug that was causing some Pandas elements to have the default EWI message from PySpark.
July 2nd, 2025
Application & CLI Version 2.7.5
Included SMA Core Versions
Snowpark Conversion Core 8.0.19
Changed
Refactored Pandas imports: Pandas imports now use `modin.pandas` instead of `snowflake.snowpark.modin.pandas`.
Improved `dbutils` and magic commands transformation:
- A new `sfutils.py` file is now generated, and all `dbutils` prefixes are replaced with `sfutils`.
- For Databricks (DBX) notebooks, an implicit import for `sfutils` is automatically added.
- The `sfutils` module simulates various `dbutils` methods, including file system operations (`dbutils.fs`) via a defined Snowflake FileSystem (SFFS) stage, and handles notebook execution (`dbutils.notebook.run`) by transforming it into `EXECUTE NOTEBOOK` SQL functions.
- `dbutils.notebook.exit` is removed, as it is not required in Snowflake.
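As an illustration of the kind of rewrite described above, here is a minimal line-based sketch. It is only illustrative: the SMA's actual replacers operate on parsed code, and the helper name here is an assumption.

```python
import re

def rewrite_dbutils(source: str) -> str:
    """Illustrative sketch: replace dbutils.* prefixes with sfutils.*
    and drop dbutils.notebook.exit calls, which Snowflake does not need."""
    lines = []
    for line in source.splitlines():
        # Drop exit calls entirely, mirroring the removal described above.
        if "dbutils.notebook.exit" in line:
            continue
        # Rewrite the prefix on whole-word matches only.
        lines.append(re.sub(r"\bdbutils\.", "sfutils.", line))
    # The SMA also adds an implicit import for the generated sfutils.py.
    return "import sfutils\n" + "\n".join(lines)

converted = rewrite_dbutils("dbutils.fs.ls('/data')\ndbutils.notebook.exit('done')")
print(converted)
```

A real migration would also generate the `sfutils.py` module itself; this sketch only shows the prefix rewrite and exit removal.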
Fixed
Updates in SnowConvert Reports: SnowConvert reports now include the CellId column when instances originate from SMA, and the FileName column displays the full path.
Updated Artifacts Dependency for SnowConvert Reports: The SMA's artifact inventory report, which was previously impacted by the integration of SnowConvert, has been restored. This update enables the SMA tool to accurately capture and analyze Object References and Missing Object References directly from SnowConvert reports, thereby ensuring the correct retrieval of SQL dependencies for the inventory.
June 26th, 2025
Application & CLI Version 2.7.4
Desktop App
Added
Added telemetry improvements.
Fixed
Fixed documentation links in the conversion settings pop-up and Pandas EWIs.
Included SMA Core Versions
Snowpark Conversion Core 8.0.16
Added
Transforming Spark XML to Snowpark
Databricks SQL option in the SQL source language
Transform JDBC read connections.
Changed
All the SnowConvert reports are copied to the backup Zip file.
The reports folder is renamed from `SqlReports` to `SnowConvertReports`.
`SqlFunctionsInventory` is moved to the `Reports` folder.
All the SnowConvert reports are sent to telemetry.
Fixed
Fixed a non-deterministic issue with the SQL readiness score.
Fixed a false-positive critical result that caused the desktop app to crash.
Fixed an issue that caused the artifact dependency report not to show the SQL objects.
June 10th, 2025
Application & CLI Version 2.7.2
Included SMA Core Versions
Snowpark Conversion Core 8.0.2
Fixed
Addressed an issue with SMA execution on the latest Windows OS, as previously reported. This fix resolves the issues encountered in version 2.7.1.
June 9th, 2025
Application & CLI Version 2.7.1
Included SMA Core Versions
Snowpark Conversion Core 8.0.1
Added
The Snowpark Migration Accelerator (SMA) now orchestrates SnowConvert to process SQL found in user workloads, including embedded SQL in Python / Scala code, Notebook SQL cells, .sql files, and .hql files.
SnowConvert now enhances the previous SMA capabilities:
A new folder in Reports, called SqlReports, contains the reports generated by SnowConvert.
Known Issues
The SQL report files from the previous SMA version will appear empty:
- `Reports/SqlElementsInventory.csv` is partially covered by `Reports/SqlReports/Elements.yyyymmdd.hhmmss.csv`.
- For `Reports/SqlFunctionsInventory.csv`, refer to the new location with the same name at `Reports/SqlReports/SqlFunctionsInventory.csv`.
The artifact dependency inventory:
- In the `ArtifactDependencyInventory`, the column for the SQL object will appear empty.
May 5th, 2025
Application & CLI Version 2.6.10
Included SMA Core Versions
Snowpark Conversion Core 7.4.0
Fixed
Fixed wrong values in the `checkpoints.json` file:
- The `sample` value lacked decimals (for integer values) and quotes.
- The `entryPoint` value had dots instead of slashes and was missing the file extension.
Updated the default value of the 'Convert DBX notebooks to Snowflake notebooks' setting to TRUE.
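The `entryPoint` fix above can be illustrated with a hypothetical normalization helper. The function name and the `.py` extension are assumptions for illustration, not the SMA's actual code.

```python
def normalize_entry_point(entry_point: str) -> str:
    """Hypothetical sketch of the fix: an entryPoint recorded as a dotted
    path (e.g. 'src.jobs.main') becomes a slash-separated path with the
    file extension restored ('src/jobs/main.py')."""
    return entry_point.replace(".", "/") + ".py"

print(normalize_entry_point("src.jobs.main"))  # src/jobs/main.py
```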
April 28th, 2025
Application & CLI Version 2.6.8
Desktop App
Added recognition of the checkpoints execution settings mechanism.
Added a mechanism to collect DBX magic commands into DbxElementsInventory.csv.
Added 'checkpoints.json' generation into the input directory.
Added a new EWI for all unsupported magic commands.
Added the collection of `dbutils` usages from Scala source notebooks into DbxElementsInventory.csv.
Included SMA Core Versions
Snowpark Conversion Core 7.2.53
Changed
Updated handling of transformations from DBX Scala elements to Jupyter Python elements, commenting out the entire code of the cell.
Updated handling of transformations from `dbutils.notebook.run` and "r" commands; for the latter, the entire code of the cell is also commented out.
Updated the name and the key letter used for the conversion of the notebook files.
Fixed
Fixed the bug that was causing the transformation of DBX notebooks into .ipynb files to have the wrong format.
Fixed the bug that was causing .py DBX notebooks to not be transformable into .ipynb files.
Fixed a bug that was causing comments to be missing in the output code of DBX notebooks.
Fixed a bug that was causing raw Scala files to be converted into .ipynb files.
April 21st, 2025
Application & CLI Version 2.6.7
Included SMA Core Versions
Snowpark Conversion Core 7.2.42
Changed
Updated DataFramesInventory to fill EntryPoints column
April 7th, 2025
Application & CLI Version 2.6.6
Desktop App
Added
Update DBx EWI link in the UI results page
Included SMA Core Versions
Snowpark Conversion Core 7.2.39
Added
Added Execution Flow inventory generation.
Added implicit session setup in every DBx notebook transformation
Changed
Renamed the DbUtilsUsagesInventory.csv to DbxElementsInventory.csv
Fixed
Fixed a bug that caused a Parsing error when a backslash came after a type hint.
Fixed relative imports that do not start with a dot and relative imports with a star.
March 27th, 2025
Application & CLI Version 2.6.5
Desktop App
Added
Added a new conversion setting toggle to enable or disable the SMA checkpoints feature.
Fixed a report issue that caused a crash when the POST API returned a 500 error.
Included SMA Core Versions
Snowpark Conversion Core 7.2.26
Added
Added generation of the checkpoints.json file into the output folder based on the DataFramesInventory.csv.
Added "disableCheckpoints" flag into the CLI commands and additional parameters of the code processor.
Added a new replacer for Python to transform the dbutils.notebook.run node.
Added new replacers to transform the magic %run command.
Added new replacers (Python and Scala) to remove the dbutils.notebook.exit node.
Added Location column to artifacts inventory.
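The checkpoints.json generation described above (derived from DataFramesInventory.csv) might look roughly like the following sketch. The column and key names are assumptions for illustration; the real inventory schema may differ.

```python
import csv
import io
import json

# Assumed inventory fragment; real column names may differ.
inventory_csv = (
    "DataFrameName,EntryPoint\n"
    "df_sales,src/jobs/main.py\n"
    "df_users,src/jobs/etl.py\n"
)

# Read the inventory rows and emit a checkpoints structure.
rows = list(csv.DictReader(io.StringIO(inventory_csv)))
checkpoints = {
    "checkpoints": [
        {"df": r["DataFrameName"], "entryPoint": r["EntryPoint"]} for r in rows
    ]
}
print(json.dumps(checkpoints, indent=2))
```

In the tool itself the resulting JSON is written to the output folder; here it is only printed.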
Changed
Refactored the normalized directory separator used in some parts of the solution.
Centralized the DBC extraction working folder name handling.
Updated Snowpark and Pandas version to v1.27.0
Updated the artifacts inventory columns to:
Name -> Dependency
File -> FileId
Status -> Status_detail
Added new column to the artifacts inventory:
Success
Fixed
Fixed an issue where the DataFrames inventory was not uploaded to the stage correctly.
March 12th, 2025
Application & CLI Version 2.6.4
Included SMA Core Versions
Snowpark Conversion Core 7.2.0
Added
An Artifact Dependency Inventory.
A replacer and EWI for the `pyspark.sql.types.StructType.fieldNames` method, mapped to the `snowflake.snowpark.types.StructType.fieldNames` attribute.
The following PySpark functions with the status NotSupported:
- `pyspark.sql.functions.map_contains_key`
- `pyspark.sql.functions.position`
- `pyspark.sql.functions.regr_r2`
- `pyspark.sql.functions.try_to_binary`
The following Pandas functions with status:
- `pandas.core.series.Series.str.ljust`
- `pandas.core.series.Series.str.center`
- `pandas.core.series.Series.str.pad`
- `pandas.core.series.Series.str.rjust`
Updated the following PySpark functions with the status:
- From WorkAround to Direct:
  - `pyspark.sql.functions.acosh`
  - `pyspark.sql.functions.asinh`
  - `pyspark.sql.functions.atanh`
  - `pyspark.sql.functions.instr`
  - `pyspark.sql.functions.log10`
  - `pyspark.sql.functions.log1p`
  - `pyspark.sql.functions.log2`
- From NotSupported to Direct:
  - `pyspark.sql.functions.bit_length`
  - `pyspark.sql.functions.cbrt`
  - `pyspark.sql.functions.nth_value`
  - `pyspark.sql.functions.octet_length`
  - `pyspark.sql.functions.base64`
  - `pyspark.sql.functions.unbase64`
Updated the following Pandas functions with the status:
- From NotSupported to Direct:
  - `pandas.core.frame.DataFrame.pop`
  - `pandas.core.series.Series.between`
  - `pandas.core.series.Series.pop`
March 6th, 2025
Application & CLI Version 2.6.3
Included SMA Core Versions
Snowpark Conversion Core 7.1.13
Added
Added a CSV generator class for new inventory creation.
Added a "full_name" column to the import usages inventory.
Added a transformation from `pyspark.sql.functions.concat_ws` to `snowflake.snowpark.functions._concat_ws_ignore_nulls`.
Added logic for the generation of checkpoints.json.
Added the inventories:
- DataFramesInventory.csv
- CheckpointsInventory.csv
February 21st, 2025
Application & CLI Version 2.6.0
Desktop App
Updated the licensing agreement, acceptance is required.
Included SMA Core Versions
Snowpark Conversion Core 7.1.2
Added
Updated the mapping status for the following Pandas elements, from NotSupported to Direct:
- `pandas.io.html.read_html`
- `pandas.io.json._normalize.json_normalize`
- `pandas.core.groupby.generic.DataFrameGroupBy.pct_change`
- `pandas.core.groupby.generic.SeriesGroupBy.pct_change`
Updated the mapping status for the following PySpark elements, from Rename to Direct:
- `pyspark.sql.functions.collect_list`
- `pyspark.sql.functions.size`
Fixed
Standardized the format of the version number in the inventories.
February 5th, 2025
Hotfix: Application & CLI Version 2.5.2
Desktop App
Fixed an issue when converting in the sample project option.
Included SMA Core Versions
Snowpark Conversion Core 5.3.0
February 4th, 2025
Application & CLI Version 2.5.1
Desktop App
Added a new modal when the user does not have write permission.
Updated the licensing agreement; acceptance is required.
CLI
Fixed the year in the CLI screen when showing `--version` or `-v`.
Included SMA Core Versions
Snowpark Conversion Core 5.3.0
Changed
Update .NET version to v9.0.0.
Improved EWI SPRKPY1068.
Bumped the version of Snowpark Python API supported by the SMA from 1.24.0 to 1.25.0.
Updated the detailed report template; it now includes the Snowpark version for Pandas.
Changed the following libraries from ThirdPartyLib to BuiltIn:
- `configparser`
- `dataclasses`
- `pathlib`
- `readline`
- `statistics`
- `zlib`
Known Issue
Converting the sample project does not work in this version; this will be fixed in the next release.
January 9th, 2025
Application & CLI Version 2.4.3
Desktop App
Added link to the troubleshooting guide in the crash report modal.
Included SMA Core Versions
Snowpark Conversion Core 4.15.0
Added
Added the following PySpark elements to the ConversionStatusPySpark.csv file as NotSupported:
- `pyspark.sql.streaming.readwriter.DataStreamReader.table`
- `pyspark.sql.streaming.readwriter.DataStreamReader.schema`
- `pyspark.sql.streaming.readwriter.DataStreamReader.options`
- `pyspark.sql.streaming.readwriter.DataStreamReader.option`
- `pyspark.sql.streaming.readwriter.DataStreamReader.load`
- `pyspark.sql.streaming.readwriter.DataStreamReader.format`
- `pyspark.sql.streaming.query.StreamingQuery.awaitTermination`
- `pyspark.sql.streaming.readwriter.DataStreamWriter.partitionBy`
- `pyspark.sql.streaming.readwriter.DataStreamWriter.toTable`
- `pyspark.sql.streaming.readwriter.DataStreamWriter.trigger`
- `pyspark.sql.streaming.readwriter.DataStreamWriter.queryName`
- `pyspark.sql.streaming.readwriter.DataStreamWriter.outputMode`
- `pyspark.sql.streaming.readwriter.DataStreamWriter.format`
- `pyspark.sql.streaming.readwriter.DataStreamWriter.option`
- `pyspark.sql.streaming.readwriter.DataStreamWriter.foreachBatch`
- `pyspark.sql.streaming.readwriter.DataStreamWriter.start`
Changed
Updated Hive SQL EWIs format.
Updated Spark SQL EWIs format.
Fixed
Fixed a bug that was causing some PySpark elements not to be identified by the tool.
Fixed a mismatch between the number of identified ThirdParty calls and the number of ThirdParty import calls.
December 13th, 2024
Application & CLI Version 2.4.2
Included SMA Core Versions
Snowpark Conversion Core 4.14.0
Added
Added the following Spark elements to ConversionStatusPySpark.csv:
- `pyspark.broadcast.Broadcast.value`
- `pyspark.conf.SparkConf.getAll`
- `pyspark.conf.SparkConf.setAll`
- `pyspark.conf.SparkConf.setMaster`
- `pyspark.context.SparkContext.addFile`
- `pyspark.context.SparkContext.addPyFile`
- `pyspark.context.SparkContext.binaryFiles`
- `pyspark.context.SparkContext.setSystemProperty`
- `pyspark.context.SparkContext.version`
- `pyspark.files.SparkFiles`
- `pyspark.files.SparkFiles.get`
- `pyspark.rdd.RDD.count`
- `pyspark.rdd.RDD.distinct`
- `pyspark.rdd.RDD.reduceByKey`
- `pyspark.rdd.RDD.saveAsTextFile`
- `pyspark.rdd.RDD.take`
- `pyspark.rdd.RDD.zipWithIndex`
- `pyspark.sql.context.SQLContext.udf`
- `pyspark.sql.types.StructType.simpleString`
Changed
Updated the documentation of the Pandas EWIs `PNDSPY1001`, `PNDSPY1002`, and `PNDSPY1003`, and of `SPRKSCL1137`, to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the documentation of the following Scala EWIs: `SPRKSCL1106` and `SPRKSCL1107`, to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Fixed
Fixed the bug that was causing UserDefined symbols to show in the third-party usages inventory.
December 4th, 2024
Application & CLI Version 2.4.1
Included SMA Core Versions
Snowpark Conversion Core 4.13.1
Command Line Interface
Changed
Added timestamp to the output folder.
Snowpark Conversion Core 4.13.1
Added
Added 'Source Language' column to Library Mappings Table
Added `Others` as a new category in the Pandas API Summary table of the DetailedReport.docx.
Changed
Updated the documentation for the Python EWI `SPRKPY1058`.
Updated the message for the Pandas EWI `PNDSPY1002` to show the related Pandas element.
Updated the way the .csv reports are created; they are now overwritten after a second run.
Fixed
Fixed a bug that was causing Notebook files not to be generated in the output.
Fixed the replacer for the `get` and `set` methods from `pyspark.sql.conf.RuntimeConfig`; the replacer now matches the correct full names.
Fixed an incorrect query tag version.
Fixed UserDefined packages reported as ThirdPartyLib.
November 14th, 2024
Application & CLI Version 2.3.1
Included SMA Core Versions
Snowpark Conversion Core 4.12.0
Desktop App
Fixed
Fixed case-sensitivity issues in the `--sql` options.
Removed
Removed the platform name from the `show-ac` message.
Snowpark Conversion Core 4.12.0
Added
Added support for Snowpark Python 1.23.0 and 1.24.0.
Added a new EWI for the `pyspark.sql.dataframe.DataFrame.writeTo` function. All the usages of this function will now have the EWI SPRKPY1087.
Changed
Updated the documentation of the Scala EWIs from `SPRKSCL1137` to `SPRKSCL1156` to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the documentation of the Scala EWIs from `SPRKSCL1117` to `SPRKSCL1136` to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the message that is shown for the following EWIs:
Updated the documentation of the Scala EWIs from `SPRKSCL1100` to `SPRKSCL1105`, from `SPRKSCL1108` to `SPRKSCL1116`, and from `SPRKSCL1157` to `SPRKSCL1175`, to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the mapping status of the following PySpark elements from NotSupported to Direct with EWI:
- `pyspark.sql.readwriter.DataFrameWriter.option` => `snowflake.snowpark.DataFrameWriter.option`: all the usages of this function now have the EWI SPRKPY1088.
- `pyspark.sql.readwriter.DataFrameWriter.options` => `snowflake.snowpark.DataFrameWriter.options`: all the usages of this function now have the EWI SPRKPY1089.
Updated the mapping status of the following PySpark elements from Workaround to Rename:
- `pyspark.sql.readwriter.DataFrameWriter.partitionBy` => `snowflake.snowpark.DataFrameWriter.partition_by`
Updated EWI documentation: SPRKSCL1000, SPRKSCL1001, SPRKSCL1002, SPRKSCL1100, SPRKSCL1101, SPRKSCL1102, SPRKSCL1103, SPRKSCL1104, SPRKSCL1105.
Removed
Removed the `pyspark.sql.dataframe.DataFrameStatFunctions.writeTo` element from the conversion status; this element does not exist.
Deprecated
Deprecated the following EWI codes:
October 30th, 2024
Application & CLI Version 2.3.0
Snowpark Conversion Core 4.11.0
Added
Added a new column called `Url` to the `Issues.csv` file, which redirects to the corresponding EWI documentation.
Added new EWIs for the following Spark elements:
[SPRKPY1082] pyspark.sql.readwriter.DataFrameReader.load
[SPRKPY1083] pyspark.sql.readwriter.DataFrameWriter.save
[SPRKPY1084] pyspark.sql.readwriter.DataFrameWriter.option
[SPRKPY1085] pyspark.ml.feature.VectorAssembler
[SPRKPY1086] pyspark.ml.linalg.VectorUDT
Added 38 new Pandas elements:
pandas.core.frame.DataFrame.select
pandas.core.frame.DataFrame.str
pandas.core.frame.DataFrame.str.replace
pandas.core.frame.DataFrame.str.upper
pandas.core.frame.DataFrame.to_list
pandas.core.frame.DataFrame.tolist
pandas.core.frame.DataFrame.unique
pandas.core.frame.DataFrame.values.tolist
pandas.core.frame.DataFrame.withColumn
pandas.core.groupby.generic._SeriesGroupByScalar
pandas.core.groupby.generic._SeriesGroupByScalar[S1].agg
pandas.core.groupby.generic._SeriesGroupByScalar[S1].aggregate
pandas.core.indexes.datetimes.DatetimeIndex.year
pandas.core.series.Series.columns
pandas.core.tools.datetimes.to_datetime.date
pandas.core.tools.datetimes.to_datetime.dt.strftime
pandas.core.tools.datetimes.to_datetime.strftime
pandas.io.parsers.readers.TextFileReader.apply
pandas.io.parsers.readers.TextFileReader.astype
pandas.io.parsers.readers.TextFileReader.columns
pandas.io.parsers.readers.TextFileReader.copy
pandas.io.parsers.readers.TextFileReader.drop
pandas.io.parsers.readers.TextFileReader.drop_duplicates
pandas.io.parsers.readers.TextFileReader.fillna
pandas.io.parsers.readers.TextFileReader.groupby
pandas.io.parsers.readers.TextFileReader.head
pandas.io.parsers.readers.TextFileReader.iloc
pandas.io.parsers.readers.TextFileReader.isin
pandas.io.parsers.readers.TextFileReader.iterrows
pandas.io.parsers.readers.TextFileReader.loc
pandas.io.parsers.readers.TextFileReader.merge
pandas.io.parsers.readers.TextFileReader.rename
pandas.io.parsers.readers.TextFileReader.shape
pandas.io.parsers.readers.TextFileReader.to_csv
pandas.io.parsers.readers.TextFileReader.to_excel
pandas.io.parsers.readers.TextFileReader.unique
pandas.io.parsers.readers.TextFileReader.values
pandas.tseries.offsets
October 24th, 2024
Application Version 2.2.3
Included SMA Core Versions
Snowpark Conversion Core 4.10.0
Desktop App
Fixed
Fixed a bug that caused the SMA to show the label SnowConvert instead of Snowpark Migration Accelerator in the menu bar of the Windows version.
Fixed a bug that caused the SMA to crash when it did not have read and write permissions to the `.config` directory on macOS and the `AppData` directory on Windows.
Command Line Interface
Changed
Renamed the CLI executable from `snowct` to `sma`.
Removed the source language argument, so you no longer need to specify whether you are running a Python or Scala assessment / conversion.
Expanded the command line arguments supported by the CLI by adding the following new arguments:
- `--enableJupyter` | `-j`: Flag to indicate whether the conversion of Databricks notebooks to Jupyter is enabled.
- `--sql` | `-f`: Database engine syntax to be used when a SQL command is detected.
- `--customerEmail` | `-e`: Configure the customer email.
- `--customerCompany` | `-c`: Configure the customer company.
- `--projectName` | `-p`: Configure the customer project.
Updated some texts to reflect the correct name of the application, ensuring consistency and clarity in all the messages.
Updated the terms of use of the application.
Updated and expanded the documentation of the CLI to reflect the latest features, enhancements, and changes.
Updated the text that is shown before proceeding with the execution of the SMA to improve
Updated the CLI to accept “Yes” as a valid argument when prompting for user confirmation.
Allowed the CLI to continue execution without waiting for user interaction by specifying the argument `-y` or `--yes`.
Updated the help information of the `--sql` argument to show the values that this argument expects.
Snowpark Conversion Core Version 4.10.0
Added
Added a new EWI for the `pyspark.sql.readwriter.DataFrameWriter.partitionBy` function. All the usages of this function will now have the EWI SPRKPY1081.
Added a new column called `Technology` to the `ImportUsagesInventory.csv` file.
Changed
Updated the Third-Party Libraries readiness score to also take into account the `Unknown` libraries.
Updated the `AssessmentFiles.zip` file to include `.json` files instead of `.pam` files.
Improved the CSV-to-JSON conversion mechanism to make the processing of inventories more performant.
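As a hedged illustration of how counting `Unknown` libraries can change a readiness score (the SMA's exact formula is not documented here; this is only a plausible ratio sketch):

```python
def readiness_score(supported: int, unsupported: int, unknown: int) -> float:
    """Illustrative only: a readiness ratio where Unknown libraries now
    count against the score by enlarging the denominator."""
    total = supported + unsupported + unknown
    return supported / total if total else 0.0

# Including 2 Unknown libraries lowers the score from 8/10 to 8/12.
print(round(readiness_score(8, 2, 2), 3))  # 0.667
```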
Improved the documentation of the following EWIs:
Updated the mapping status of the following Spark Scala elements from Direct to Rename:
- `org.apache.spark.sql.functions.shiftLeft` => `com.snowflake.snowpark.functions.shiftleft`
- `org.apache.spark.sql.functions.shiftRight` => `com.snowflake.snowpark.functions.shiftright`
Updated the mapping status of the following Spark Scala elements from Not Supported to Direct:
- `org.apache.spark.sql.functions.shiftleft` => `com.snowflake.snowpark.functions.shiftleft`
- `org.apache.spark.sql.functions.shiftright` => `com.snowflake.snowpark.functions.shiftright`
Fixed
Fixed a bug that caused the SMA to incorrectly populate the `Origin` column of the `ImportUsagesInventory.csv` file.
Fixed a bug that caused the SMA not to classify imports of the libraries `io`, `json`, `logging`, and `unittest` as Python built-in imports in the `ImportUsagesInventory.csv` file and in the `DetailedReport.docx` file.
October 11th, 2024
Application Version 2.2.2
Feature Updates include:
Snowpark Conversion Core 4.8.0
Snowpark Conversion Core Version 4.8.0
Added
Added `EwiCatalog.csv` and .md files to reorganize the documentation.
Added the mapping status of `pyspark.sql.functions.ln` as Direct.
Added a transformation for `pyspark.context.SparkContext.getOrCreate`. Please check the EWI SPRKPY1080 for further details.
Added an improvement for the SymbolTable, inferring types for parameters in functions.
Added SymbolTable support for static methods, no longer assuming the first parameter is self for them.
Changed
Updated the mapping status of `pyspark.sql.functions.array_remove` from NotSupported to Direct.
Fixed
Fixed the Code File Sizing table in the Detail Report to exclude .sql and .hql files, and added the Extra Large row to the table.
Fixed missing `update_query_tag` when `SparkSession` is defined across multiple lines in Python.
Fixed missing `update_query_tag` when `SparkSession` is defined across multiple lines in Scala.
Fixed the missing EWI `SPRKHVSQL1001` on some SQL statements with parsing errors.
Fixed keeping newline values inside string literals.
Fixed the total lines of code shown in the File Type Summary table.
Fixed the parsing score being shown as 0 when files were recognized successfully.
Fixed the LOC count in the cell inventory for Databricks magic SQL cells.
September 26th, 2024
Application Version 2.2.0
Feature Updates include:
Snowpark Conversion Core 4.6.0
Snowpark Conversion Core Version 4.6.0
Added
Added a transformation for `pyspark.sql.readwriter.DataFrameReader.parquet`.
Added a transformation for `pyspark.sql.readwriter.DataFrameReader.option` when it is a Parquet method.
Changed
Updated the mapping status of:
- `pyspark.sql.types.StructType.fields` from NotSupported to Direct.
- `pyspark.sql.types.StructType.names` from NotSupported to Direct.
- `pyspark.context.SparkContext.setLogLevel` from Workaround to Transformation. More detail can be found in EWIs SPRKPY1078 and SPRKPY1079.
- `org.apache.spark.sql.functions.round` from WorkAround to Direct.
- `org.apache.spark.sql.functions.udf` from NotDefined to Transformation. More detail can be found in EWIs SPRKSCL1174 and SPRKSCL1175.
Updated the mapping status of the following Spark elements from DirectHelper to Direct:
- `org.apache.spark.sql.functions.hex`
- `org.apache.spark.sql.functions.unhex`
- `org.apache.spark.sql.functions.shiftleft`
- `org.apache.spark.sql.functions.shiftright`
- `org.apache.spark.sql.functions.reverse`
- `org.apache.spark.sql.functions.isnull`
- `org.apache.spark.sql.functions.unix_timestamp`
- `org.apache.spark.sql.functions.randn`
- `org.apache.spark.sql.functions.signum`
- `org.apache.spark.sql.functions.sign`
- `org.apache.spark.sql.functions.collect_list`
- `org.apache.spark.sql.functions.log10`
- `org.apache.spark.sql.functions.log1p`
- `org.apache.spark.sql.functions.base64`
- `org.apache.spark.sql.functions.unbase64`
- `org.apache.spark.sql.functions.regexp_extract`
- `org.apache.spark.sql.functions.expr`
- `org.apache.spark.sql.functions.date_format`
- `org.apache.spark.sql.functions.desc`
- `org.apache.spark.sql.functions.asc`
- `org.apache.spark.sql.functions.size`
- `org.apache.spark.sql.functions.locate`
- `org.apache.spark.sql.functions.ntile`
Fixed
Fixed the value shown for the Percentage of Total Pandas API.
Fixed the total percentage in the ImportCalls table in the DetailReport.
Deprecated
Deprecated the following EWI code:
September 12th, 2024
Application Version 2.1.7
Feature Updates include:
Snowpark Conversion Core 4.5.7
Snowpark Conversion Core 4.5.2
Snowpark Conversion Core Version 4.5.7
Hotfixed
Fixed the Total row added to the Spark Usages Summaries when there are no usages.
Bumped the Python Assembly version to `1.3.111`.
Parsed trailing commas in multiline arguments.
Snowpark Conversion Core Version 4.5.2
Added
Added a transformation for `pyspark.sql.readwriter.DataFrameReader.option`:
- When the chain is from a CSV method call.
- When the chain is from a JSON method call.
Added a transformation for `pyspark.sql.readwriter.DataFrameReader.json`.
Changed
Executed SMA on SQL strings passed to Python/Scala functions
Create AST in Scala/Python to emit temporary SQL unit
Create SqlEmbeddedUsages.csv inventory
Deprecate SqlStatementsInventory.csv and SqlExtractionInventory.csv
Integrate EWI when the SQL literal could not be processed
Create new task to process SQL-embedded code
Collect info for SqlEmbeddedUsages.csv inventory in Python
Replace SQL transformed code to Literal in Python
Update test cases after implementation
Create table, views for telemetry in SqlEmbeddedUsages inventory
Collect info for SqlEmbeddedUsages.csv report in Scala
Replace SQL transformed code to Literal in Scala
Check line number order for Embedded SQL reporting
Filled the `SqlFunctionsInfo.csv` with the SQL functions documented for SparkSQL and HiveSQL.
Updated the mapping status for:
- `org.apache.spark.sql.SparkSession.sparkContext` from NotSupported to Transformation.
- `org.apache.spark.sql.Builder.config` from NotSupported to Transformation. With this new mapping status, the SMA will remove all the usages of this function from the source code.
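The embedded-SQL work above (running the SMA on SQL strings passed to Python/Scala functions) can be pictured with a toy detector built on Python's standard `ast` module. This is a stand-in for illustration only, not the SMA's parser.

```python
import ast

def embedded_sql_literals(source: str) -> list:
    """Toy sketch: collect string literals passed as the first argument
    to any `.sql(...)` method call in Python source code."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "sql"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            found.append(node.args[0].value)
    return found

code = "df = spark.sql('SELECT id FROM users')\nspark.sql('DROP TABLE tmp')"
print(embedded_sql_literals(code))
```

A real implementation would also track line numbers for reporting (as the SqlEmbeddedUsages.csv inventory does) and handle non-literal SQL expressions, which this sketch ignores.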
September 5th, 2024
Application Version 2.1.6
Hotfix change for Snowpark Engines Core version 4.5.1
Spark Conversion Core Version 4.5.1
Hotfix
Added a mechanism to convert the temporary Databricks notebooks generated by the SMA into exported Databricks notebooks.
August 29th, 2024
Application Version 2.1.5
Feature Updates include:
Updated Spark Conversion Core: 4.3.2
Spark Conversion Core Version 4.3.2
Added
Added a mechanism (via decoration) to get the line and the column of the elements identified in notebook cells.
Added an EWI for pyspark.sql.functions.from_json.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.csv.
Enabled the query tag mechanism for Scala files.
Added the Code Analysis Score and additional links to the Detailed Report.
Added a column called OriginFilePath to InputFilesInventory.csv
Changed
Updated the mapping status of pyspark.sql.functions.from_json from Not Supported to Transformation.
Updated the mapping status of the following Spark elements from Workaround to Direct:
org.apache.spark.sql.functions.countDistinct
org.apache.spark.sql.functions.max
org.apache.spark.sql.functions.min
org.apache.spark.sql.functions.mean
Deprecated
Deprecated the following EWI codes:
SPRKSCL1135
SPRKSCL1136
SPRKSCL1153
SPRKSCL1155
Fixed
Fixed a bug that caused an incorrect calculation of the Spark API score.
Fixed an error that prevented empty or commented SQL files from being copied to the output folder.
Fixed a bug in the DetailedReport where the notebook LOC and cell count stats were not accurate.
August 14th, 2024
Application Version 2.1.2
Feature Updates include:
Updated Spark Conversion Core: 4.2.0
Spark Conversion Core Version 4.2.0
Added
Added a technology column to the SparkUsagesInventory.
Added an EWI for undefined SQL elements.
Added the SqlFunctions inventory.
Collected information for the SqlFunctions inventory.
Changed
The engine now processes and prints partially parsed Python files instead of leaving the original file unmodified.
Python notebook cells that have parsing errors will also be processed and printed.
Fixed
Fixed `pandas.core.indexes.datetimes.DatetimeIndex.strftime` being reported wrongly.
Fixed a mismatch between the SQL readiness score and SQL Usages by Support Status.
Fixed a bug that caused the SMA to report `pandas.core.series.Series.empty` with an incorrect mapping status.
Fixed a mismatch where Spark API Usages Ready for Conversion in DetailedReport.docx differed from the UsagesReadyForConversion row in Assessment.json.
August 8th, 2024
Application Version 2.1.1
Feature Updates include:
Updated Spark Conversion Core: 4.1.0
Spark Conversion Core Version 4.1.0
Added
Added the following information to the `AssessmentReport.json` file:
- The third-party libraries readiness score.
- The number of third-party library calls that were identified.
- The number of third-party library calls that are supported in Snowpark.
- The color code associated with the third-party readiness score, the Spark API readiness score, and the SQL readiness score.
Transformed `SqlSimpleDataType` in Spark create tables.
Added the mapping of `pyspark.sql.functions.get` as direct.
Added the mapping of `pyspark.sql.functions.to_varchar` as direct.
As part of the changes after unification, the tool now generates an execution info file in the Engine.
Added a replacer for `pyspark.sql.SparkSession.builder.appName`.
Changed
Updated the mapping status for the following Spark elements from Not Supported to Direct mapping:
- `pyspark.sql.functions.sign`
- `pyspark.sql.functions.signum`
Changed the Notebook Cells Inventory report to indicate the kind of content for every cell in the column Element
Added a `SCALA_READINESS_SCORE` column that reports the readiness score as related only to references to the Spark API in Scala files.
Partial support to transform table properties in `ALTER TABLE` and `ALTER VIEW`.
Updated the conversion status of the node `SqlSimpleDataType` from Pending to Transformation in Spark create tables.
Updated the version of the Snowpark Scala API supported by the SMA from `1.7.0` to `1.12.1`. Updated the mapping status of:
- `org.apache.spark.sql.SparkSession.getOrCreate` from Rename to Direct.
- `org.apache.spark.sql.functions.sum` from Workaround to Direct.
Updated the version of the Snowpark Python API supported by the SMA from `1.15.0` to `1.20.0`. Updated the mapping status of:
- `pyspark.sql.functions.arrays_zip` from Not Supported to Direct.
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.frame.DataFrame.anypandas.core.frame.DataFrame.applymap
Updated the mapping status for the following Pandas elements:
From Not Supported to Direct mapping:
pandas.core.frame.DataFrame.groupbypandas.core.frame.DataFrame.indexpandas.core.frame.DataFrame.Tpandas.core.frame.DataFrame.to_dict
From Not Supported to Rename mapping:
pandas.core.frame.DataFrame.map
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.frame.DataFrame.wherepandas.core.groupby.generic.SeriesGroupBy.aggpandas.core.groupby.generic.SeriesGroupBy.aggregatepandas.core.groupby.generic.DataFrameGroupBy.aggpandas.core.groupby.generic.DataFrameGroupBy.aggregatepandas.core.groupby.generic.DataFrameGroupBy.apply
Not Supported mappings:
pandas.core.frame.DataFrame.to_parquet
pandas.core.generic.NDFrame.to_csv
pandas.core.generic.NDFrame.to_excel
pandas.core.generic.NDFrame.to_sql
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.series.Series.empty
pandas.core.series.Series.apply
pandas.core.reshape.tile.qcut
Direct mappings with EWI:
pandas.core.series.Series.fillna
pandas.core.series.Series.astype
pandas.core.reshape.melt.melt
pandas.core.reshape.tile.cut
pandas.core.reshape.pivot.pivot_table
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.series.Series.dt
pandas.core.series.Series.groupby
pandas.core.series.Series.loc
pandas.core.series.Series.shape
pandas.core.tools.datetimes.to_datetime
pandas.io.excel._base.ExcelFile
Not Supported mappings:
pandas.core.series.Series.dt.strftime
Updated the mapping status for the following Pandas elements:
From Not Supported to Direct mapping:
pandas.io.parquet.read_parquet
pandas.io.parsers.readers.read_csv
Updated the mapping status for the following Pandas elements:
From Not Supported to Direct mapping:
pandas.io.pickle.read_pickle
pandas.io.sql.read_sql
pandas.io.sql.read_sql_query
Updated the description of Understanding the SQL Readiness Score.
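The SQL readiness score referenced here is, at heart, the percentage of identified elements that are supported. A simplified sketch of that idea — the actual SMA formula may weight or filter elements differently, and the missing-values fix this release mentions suggests additional handling:

```python
def readiness_score(supported_usages, total_usages):
    """Percentage of identified usages that are supported.

    A simplified stand-in for SMA's readiness formula, shown only to
    illustrate the concept; the real score may weight elements differently.
    """
    if total_usages == 0:
        return None  # no usages found: the score is reported as N/A
    return round(100.0 * supported_usages / total_usages, 2)

print(readiness_score(45, 50))  # → 90.0
```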
Updated
PyProgramCollector to collect the packages and populate the current packages inventory with data from Python source code.
Updated the mapping status of
pyspark.sql.SparkSession.builder.appName from Rename to Transformation.
Removed the following Scala integration tests:
AssesmentReportTest_AssessmentMode.ValidateReports_AssessmentMode
AssessmentReportTest_PythonAndScala_Files.ValidateReports_PythonAndScala
AssessmentReportTestWithoutSparkUsages.ValidateReports_WithoutSparkUsages
Updated the mapping status of
pandas.core.generic.NDFrame.shape from Not Supported to Direct.
Updated the mapping status of
pandas.core.series from Not Supported to Direct.
Deprecated
Deprecated the EWI code
SPRKSCL1160 since org.apache.spark.sql.functions.sum is now a direct mapping.
Fixed
Fixed a bug related to Custom Magics without arguments in Jupyter Notebook cells by no longer supporting them.
Fixed incorrect generation of EWIs in the issues.csv report when parsing errors occur.
Fixed a bug that caused the SMA not to process Databricks exported notebooks as Databricks notebooks.
Fixed a stack overflow error while processing clashing type names of declarations created inside package objects.
Fixed the processing of complex lambda type names involving generics, e.g.,
def func[X,Y](f: (Map[Option[X], Y] => Map[Y, X]))...
Fixed a bug that caused the SMA to add a PySpark EWI code instead of a Pandas EWI code to Pandas elements that are not yet recognized.
Fixed a typo in the detailed report template: renaming a column from "Percentage of all Python Files" to "Percentage of all files".
Fixed a bug where
pandas.core.series.Series.shape was wrongly reported.
July 19th, 2024
Application Version 2.1.0
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 3.2.0
Spark Conversion Core Version 3.2.0
Changed
New Readiness Score for SQL in the results screen
Settings were added to the desktop application to enable or disable Pandas to Snowpark Pandas API conversion
July 11th, 2024
Application Version 2.0.2
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 3.0.0
Spark Conversion Core Version 3.0.0
Breaking Changes
This new version includes major changes:
New download links; previous versions won’t be auto-updated.
A single access code is now required for Python, Scala, and SparkSQL. Previous access codes for Python will continue working; however, Scala access codes will no longer work, and you will need to request a new access code.
There is no need to select a language before analysis.
After executing the tool, you will no longer receive the Snowpark Qualification Report email, since the report information is available locally to the user.
Removed
Unify Python/Scala conversion tools.
Remove Select Source from Inquire form.
Remove Select Source from NewProject/SeeSample project.
Remove table generation from SMA.Desktop.
Changed
Unify Python/Scala conversion tools.
Update to remove Python and Scala Conversion Core Version and have just an Engine Conversion Core Version.
Update results screen.
Access Code toast has information related to the Source Language.
The summary Report screen has references to the Source Language.
Input folder path validation no longer displays the wrong message.
Deprecated Scala licenses; the tool now uses only Python access codes.
June 27, 2024
Application Version 1.3.1
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.48.0
Spark Conversion Core Version 2.48.0
Added
Improved the parsing recovery mechanism for Scala files and Scala notebook cells to orphan fewer lines of code
Added support for
HADOOP shell command related to HiveQL
Added support for
HDFS shell command related to HiveQL
Added support for
TBLPROPERTIES in ALTER VIEW statements
Updated the conversion status for SQL nodes in HiveQL that don't need conversion
Updated the conversion status for SQL nodes in SparkSQL that don't need conversion
The SQL nodes without a migration status were updated to
PENDING
Improved the Jupyter parser to support the filename and the package name as parameters
Fixed
Fixed a bug that caused the SMA to not show the readiness score even though there were uses of the Spark API
Fixed a bug that caused the EWI
SPRKSCL1000 to show a wrong description in the issue list table of the detailed report
Fixed the parsing of
Comment clauses in SQL statements with new lines
Fixed the parsing of statements after a
Lateral View clause in HiveQL
June 13, 2024
Application Version 1.3.0
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.47.0
Spark Conversion Core Version 2.47.0
Added
Added transformation for Hive Table Comment.
Added transformation for adding or replacing Comment on Create View, Create Table, and Create Function.
Added tag to comments for
CREATE FUNCTION nodes.
Removed the generation of the
conversion_rates.csv, files.csv, and parse_errors.csv inventories.
Fixed
Fixed DotNames (such as in this example:
select * from id.12_id3) that start with numbers.
Parsed and refactored the Comment clause in Create View.
Fixed missing columns on empty inventories.
May 30, 2024
Application Version 1.2.5
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.46.0
Spark Conversion Core Version 2.46.0
Added
Added a parsing score indicator that shows the percentage of all the files that were successfully parsed.
Added SPRKPY1074 EWI for mixed indentation errors.
Updates to the Detailed Report
Updated the look and feel of the report for both Python and Scala.
Added a Total row for Code File Sizing table in the detailed report.
Added files with Pandas usages table and Pandas API usages summaries table.
Added the new File Type Summary table
Added a new table called Files with Spark Usages.
Added a new table called Files with Spark Usages by Support Status.
Added SQL usages by file type table.
Added SQL usages by status table.
Transposed the Notebook Stats by Language table
Updated the detailed docx report to classify the readiness scores with N/A values as a green result
Reindexed the order of tables in the detailed report.
Updated the conversion status for SQL nodes in HiveQL and SparkSQL that don't need conversion
Updates to SQL parsing support
Identify and register mixed indentation error.
Parse IS as Binary Operator
Support RLike as Binary Operator
Support DotNames that start with numbers
Parse Lateral View Clause
Parse Parquet as Name in the Using table option
Parse IF as function name
Parse query parameters as expressions in SparkSQL.
Parse IMAGE as alias
Parse modulo (%) operator
Parse ALL as alias
Parse SQL notebook cells with %% in magic commands
Added a core library mapping table to support the third party library analysis
Added ConversionStatusLibraries.csv
Fixed
Comment out remaining semicolon in top level statement in HiveQL.
Fixed Parse Lateral View with multiple AsClauses
Fixed Parse Lateral View parsing order
May 16, 2024
Application Version 1.2.4
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.45.1
Spark Conversion Core Version 2.45.1
Added
Argument/parameter information in Python listed in the usages inventories
Added mappings:
General PySpark
pyspark.sql.functions.map_from_arrays
pyspark.sql.dataframe.DataFrame.toPandas
ML related Spark mappings for:
pyspark.ml
pyspark.ml.classification
pyspark.ml.clustering
pyspark.ml.feature
pyspark.ml.regression
pyspark.ml.feature StringIndexer
pyspark.ml.clustering KMeans
pyspark.ml.feature OneHotEncoder
pyspark.ml.feature MinMaxScaler
pyspark.ml.regression LinearRegression
pyspark.ml.feature StandardScaler
pyspark.ml.classification RandomForestClassifier
pyspark.ml.classification LogisticRegression
pyspark.ml.feature PCA
pyspark.ml.classification GBTClassifier
pyspark.ml.classification DecisionTreeClassifier
pyspark.ml.classification LinearSVC
pyspark.ml.feature RobustScaler
pyspark.ml.feature Binarizer
pyspark.ml.feature MaxAbsScaler
pyspark.ml.feature Normalizer
Pandas API mappings have begun to the new Snowpark implementation of Pandas. These will not be converted, but will now be reported in the Pandas Usages Inventory. 82 mappings for the Pandas API were mapped. All are direct mappings with the exception of the first one:
pandas.core.series.Series.transpose [rename]
pandas
pandas.core.frame.DataFrame
pandas.core.frame.DataFrame.abs
pandas.core.frame.DataFrame.add_suffix
pandas.core.frame.DataFrame.axes
pandas.core.frame.DataFrame.columns
pandas.core.frame.DataFrame.copy
pandas.core.frame.DataFrame.cummax
pandas.core.frame.DataFrame.cummin
pandas.core.frame.DataFrame.describe
pandas.core.frame.DataFrame.diff
pandas.core.frame.DataFrame.drop
pandas.core.frame.DataFrame.drop_duplicates
pandas.core.frame.DataFrame.dtypes
pandas.core.frame.DataFrame.duplicated
pandas.core.frame.DataFrame.empty
pandas.core.frame.DataFrame.first
pandas.core.frame.DataFrame.first_valid_index
pandas.core.frame.DataFrame.head
pandas.core.frame.DataFrame.iloc
pandas.core.frame.DataFrame.isin
pandas.core.frame.DataFrame.isna
pandas.core.frame.DataFrame.isnull
pandas.core.frame.DataFrame.iterrows
pandas.core.frame.DataFrame.itertuples
pandas.core.frame.DataFrame.keys
pandas.core.frame.DataFrame.last
pandas.core.frame.DataFrame.last_valid_index
pandas.core.frame.DataFrame.max
pandas.core.frame.DataFrame.mean
pandas.core.frame.DataFrame.median
pandas.core.frame.DataFrame.min
pandas.core.frame.DataFrame.ndim
pandas.core.frame.DataFrame.notna
pandas.core.frame.DataFrame.notnull
pandas.core.frame.DataFrame.rename_axis
pandas.core.frame.DataFrame.reset_index
pandas.core.frame.DataFrame.select_dtypes
pandas.core.frame.DataFrame.set_axis
pandas.core.frame.DataFrame.set_index
pandas.core.frame.DataFrame.shape
pandas.core.frame.DataFrame.size
pandas.core.frame.DataFrame.squeeze
pandas.core.frame.DataFrame.sum
pandas.core.frame.DataFrame.tail
pandas.core.frame.DataFrame.take
pandas.core.frame.DataFrame.update
pandas.core.frame.DataFrame.value_counts
pandas.core.frame.DataFrame.values
pandas.core.groupby.generic.DataFrameGroupBy.count
pandas.core.groupby.generic.DataFrameGroupBy.max
pandas.core.groupby.generic.DataFrameGroupBy.sum
pandas.core.series.Series.abs
pandas.core.series.Series.add_prefix
pandas.core.series.Series.add_suffix
pandas.core.series.Series.array
pandas.core.series.Series.axes
pandas.core.series.Series.cummax
pandas.core.series.Series.cummin
pandas.core.series.Series.describe
pandas.core.series.Series.diff
pandas.core.series.Series.dtype
pandas.core.series.Series.dtypes
pandas.core.series.Series.first_valid_index
pandas.core.series.Series.hasnans
pandas.core.series.Series.idxmax
pandas.core.series.Series.idxmin
pandas.core.series.Series.keys
pandas.core.series.Series.last
pandas.core.series.Series.last_valid_index
pandas.core.series.Series.median
pandas.core.series.Series.notna
pandas.core.series.Series.rename_axis
pandas.core.series.Series.set_axis
pandas.core.series.Series.squeeze
pandas.core.series.Series.T
pandas.core.series.Series.tail
pandas.core.series.Series.take
pandas.core.series.Series.to_list
pandas.core.series.Series.to_numpy
pandas.core.series.Series.update
Updated Mappings:
Added transformation for csv, json, and parquet functions including:
pyspark.sql.readwriter.DataFrameWriter.json
pyspark.sql.readwriter.DataFrameWriter.csv
pyspark.sql.readwriter.DataFrameWriter.parquet
Updated mapping for
pyspark.rdd.RDD.getNumPartitions to transformation
Updated mapping for
pyspark.storagelevel.StorageLevel to transformation
Added end-to-end test infrastructure and input/output validations
Changed the import statement transformation: unsupported imports are removed and EWI messages are no longer generated in the code
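One way to picture this change: instead of annotating an unsupported import with an EWI comment, the converted file simply drops it. A toy line-filter sketch — the unsupported-import set is hypothetical, and SMA works on a parsed AST rather than raw lines:

```python
# Toy sketch of removing unsupported imports from converted source, as this
# release describes. The UNSUPPORTED set and line-based matching are
# illustrative assumptions, not SMA's actual mechanism.
UNSUPPORTED = {"pyspark.streaming"}

def strip_unsupported_imports(source):
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("import ") and stripped[len("import "):] in UNSUPPORTED:
            continue  # removed silently: no EWI comment is emitted
        kept.append(line)
    return "\n".join(kept)

src = "import pyspark.streaming\nimport math\n"
print(strip_unsupported_imports(src))  # → import math
```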
Updated the conversion status for SQL nodes in Hive that don't need conversion (multiple expressions - part 02)
Update the SqlElementsInfo.csv with new identified elements
Updated Replacer and SqlElementsInfo items to include Transformation
Enable decorations in transformation to comment out unsupported nodes
Fixed the groupBy function in the source code of
org.apache.spark.sql.DataFrame to place it correctly in the symbol table
toPandas added as pyspark in the ThirdPartyLibs
Fixed
Fixed some scenarios where EWI comments were not being added to the output code
Fixed processing of empty source cells present in Jupyter Notebooks
Fixed parsing error message not being added in the output code
Fixed issue of
pyspark.sql.functions.udf requiring the return_type parameter
May 2, 2024
Application Version 1.2.2
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.44.0
Spark Conversion Core Version 2.44.0
Added
Argument information available in the Python usages inventory
Updated the conversion status for SQL nodes in Hive that don't need conversion
Operators - numeric expressions
Function expressions
Multiple expressions
Name expressions and literals
Parsing improvements in SparkSQL:
DESCRIBE TABLE Clause
REFRESH Clause
Add the groupBy parameters in the analysis of org.apache.spark.sql.DataFrame
Improved the logging mechanism to indicate if the logs are only written when errors happened or if all messages were logged (introduced the DebugLogger to log all messages)
Updated the default value of Scala parser timeout from 150ms to 300ms
Update SqlElementsInfo.csv to Direct Status
Changed order in the SqlElementsInfo.csv
Update parsing error message when a SQL statement is not parsed
Statements without recovery are now added to Issues.csv
Changed SqlElements mapping status to Direct and Partial
Updated the fully qualified names for the following Spark elements in the conversion status file:
pyspark.sql.streaming.readwriter.DataStreamReader
pyspark.sql.streaming.readwriter.DataStreamWriter
pyspark.sql.streaming.query.StreamingQuery
Added the following Spark elements to the conversion status file as NotSupported:
pyspark.sql.streaming.readwriter.DataStreamReader.format
pyspark.sql.streaming.readwriter.DataStreamReader.table
pyspark.sql.streaming.readwriter.DataStreamWriter.partitionBy
pyspark.sql.streaming.query.StreamingQuery.awaitTermination
Removed the generation of the SummaryReport.docx, SummaryReport.html, and DetailedReport.html report files. Only the DetailedReport.docx will be generated.
Fixed
Fixed the issue of the SMA tool not detecting Python cells (%magic) in .scala notebooks
Fixed EWI comments not being added to the output code
Fixed processing of empty source cells present in Jupyter notebooks
Fixed categorization of Spark identified usages and data display in Spark API usage summary table.
April 19, 2024
Application Version 1.0.4
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.42.1
Spark Conversion Core Version 2.42.1
Added
ThirdPartyLibrary to Report Additional Third Party Library Indicator.
Added Transform for Hive Set Statement.
Removed warning related to Unsupported .hql files in Symbol Table Loader for Python.
Added Transform for Hive Drop Table Statement.
Added ConversionBeginTaskBase and refactored tasks.
Added Transform for session.read("query", qry) to session.sql(qry).
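For illustration, the session.read("query", qry) transform above amounts to a rewrite like the following toy regex version; SMA performs this on a parsed AST rather than raw text, so this is only a sketch:

```python
import re

# Toy regex version of the session.read("query", qry) -> session.sql(qry)
# transform this release describes. The regex approach is an illustrative
# assumption; SMA rewrites the call on its syntax tree.
PATTERN = re.compile(r'session\.read\(\s*"query"\s*,\s*([^)]+)\)')

def rewrite_read_query(source):
    return PATTERN.sub(r"session.sql(\1)", source)

print(rewrite_read_query('df = session.read("query", qry)'))
# → df = session.sql(qry)
```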
Added handling for ImplicitImports node from JsonObjects.
Updated the parsing errors mechanism to avoid commenting out files with parsing errors.
Updated reporting mechanism to generate empty SQL reports when no SQL is found.
Updated the status conversion for the nodes (Create statements) that do not need conversion for Hive Inventory.
Updated the status conversion for the nodes that do not need conversion for Hive Inventory.
Changed EWI SPRKHVSQL1004 to indicate 'Information from underlying data files cannot be recovered' instead of 'Purge removed from DROP TABLE statement', and changed the DROP TABLE transformation to add EWI SPRKHVSQL1004 when the PURGE statement is not present.
Collapse SqlNames and SqlJoins in the SQL Usages Inventory.
Updated several SQL statements with status and transformations:
Nodes related to MERGE.
Nodes with INSERT, ALTER, DROP TABLE, and CTEs.
Nodes with create table, function, view, and table.
Direct transformations for SqlSelect and related nodes.
Add support for DBC implicit imports.
Fixed
Updated the parsing errors mechanism to avoid commenting out notebooks cells with parsing errors.
Updated the CallFunction parse rule to check for a backslash or new line, avoiding a parsing error when a return statement has an id and the next statement is a destructured tuple assignment.
Fixed an issue that caused the Import Calls section of the reports to calculate incorrect percentage values.
Fixed issue related to not generating the detailed report.
Fixed EWI SPRKHVSQL1004 not being added to DROP TABLE transformation.
Fixed parsing error about return statement with id and deconstructed tuple assignment.
Fixed an issue that caused the Issues.csv and the notifications.pam files to not show the line, column, and file id of the files with parsing errors.
Fixed the text about ranges of readiness score.
March 19, 2024
Application Version 1.0.4
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.40.1
Spark Conversion Core 2.40.1
Added
Parsing support for HiveQL including support for HiveSql files (.hql)
Remove the import for
snowpark-extensions in Python
Logo updated in the Detailed Report
Ignored files are now noted in the Detailed Report
SQL elements calculator and SQL elements table added to the detailed report
Added transformation for
WHEN NOT MATCHED BY SOURCE when multiple match conditions exist
Site-packages, pip, dist, venv, and hidden directories now excluded from processing by the SMA
Renamed Supported to IsSnowparkAnacondaSupported in the Import Usages spreadsheet
Added SQL elements to the SqlElementsInfo.csv catalog
Added a new column named Flavor to the SqlElementsInfo.csv inventory to distinguish between SparkSQL and HiveQL
Added parsing errors for SQL code to the Issues.csv file
New EWIs added for
org.apache.spark.sql.functions.split related parameter errors
36 additional RDD elements added to the core mapping table (currently listed as unsupported)
Transformation and conversion support for:
org.apache.spark.sql.types.StructField
org.apache.spark.sql.functions.translate
org.apache.spark.sql.Builder.enableHiveSupport
pyspark.sql.functions.split
org.apache.spark.sql.functions.split
Adjusted the replacer for
pyspark.sql.functions.unix_timestamp
Fixed
Modified the source concatenation process to ensure that magic commands are kept distinct. Now, strings are concatenated continuously until a magic command is encountered, at which point each magic command is handled separately.
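The concatenation rule above can be sketched as: accumulate lines until a magic command appears, then emit the magic command as its own chunk. This is an illustrative reading of the release note, not SMA's actual implementation:

```python
# Sketch of the described concatenation rule: lines are joined continuously
# until a magic command (a line starting with '%') is encountered, at which
# point the magic command becomes its own chunk. Illustrative only.
def split_on_magics(lines):
    chunks, buf = [], []
    for line in lines:
        if line.lstrip().startswith("%"):
            if buf:
                chunks.append("\n".join(buf))
                buf = []
            chunks.append(line)  # each magic command handled separately
        else:
            buf.append(line)
    if buf:
        chunks.append("\n".join(buf))
    return chunks

cells = ["x = 1", "y = 2", "%sql SELECT 1", "z = 3"]
print(split_on_magics(cells))  # → ['x = 1\ny = 2', '%sql SELECT 1', 'z = 3']
```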
Removed new lines when printing single-line SQL
Path for the generation of assessment zip files has been corrected
Corrected unnecessary imports of
org.apache.spark.sql.Dataset
Conversion now removes Apache Spark imports that remain after migration
March 18, 2024
Application Version 1.0.0
Feature Updates include:
New Snowpark Migration Accelerator logo.
Improved Assessment reports.
Updated Spark (Scala and Python) Conversion Core: 2.33.0
Spark Conversion Core 2.33.0
Added
Added additional inventory elements to the core mapping tables (currently, listed as not supported):
Pandas not supported cases in the pandas mappings
Added ML, Streaming and Blank not supported cases
Updated custom EWIs for Micro-partition, clustering, and streaming cases
February 12, 2024
Application Version 0.38.0
Feature Updates include:
Automatic license provisioning: you can now request a new SMA license directly from the app and receive it in your email.
Updated Spark (Scala and Python) Conversion Core: 2.29.0
Spark Conversion Core 2.29.0
Added
Added SQL elements inventory
Reports are no longer filtered by readiness score or Snowflake user
Group Import Call Summary table in Assessment Report by package
Added support Snowpark API Versions:
Snowpark API version 1.10.0 on Python
Snowpark API version 1.9.0 on Python
Snowpark API version 1.8.0 on Python
Added/Updated mappings for:
Pyspark
pyspark.sql.functions.pandas_udf
pyspark.sql.group.GroupedData.pivot
pyspark.sql.functions.unix_timestamp
Scala
Multiple scenarios of
contains functions, including org.apache.spark.sql.Column.contains(scala.Any)
org.apache.spark.sql.types.StructField.name
org.apache.spark.sql.types.StructField.fields
org.apache.spark.sql.function.array_agg
Recollection of Pandas data:
Created Inventory for Pandas Usages
Supported Pandas at ConversionStatus
Added Pandas Information in reports
Generates assessment zip file
Support for parsing of an empty interpolation scenario (${})
Updated examples of the DetailedReport template in Appendix A for Python and Scala
Avoid adding hardcoded credentials to SparkConf transformation
Add JSON inventory conversion logic to code processor
Fixed
Fixed inconsistencies in the Notebook Sizing by Language table
Fixed issue with try/except in sprocs creation
Exclude internal imports in Assessment Report and add origin to import inventory
Improve EWI message for parsing errors
Fixed an error with missing .map files in Scala
Fixed the missing file type summary for other code extensions
Fixed parsing errors for methods named 'match'.
Fixed an error that omitted some files in the File Sizing table
Remove useless statement after removal of not required functions
Fix replacer to remove unsupported clearCache function
Fix parsing for *args and **kwargs with backslash
Fix scenario where alias of column with brackets was removed in transformation due to bad resolution
November 27, 2023
Application Version 0.33.1
Feature Updates include:
Name Change: SnowConvert for Spark -> Snowpark Migration Accelerator (SMA)
Updated Spark (Scala and Python) Conversion Core: 2.20.0
Trial Mode Enabled
Code Compare
See the Code Compare section of the documentation.
Updated assessment report in the UI
Walk through the updated assessment report in the application.
Updated support email available: [email protected]
Spark Conversion Core 2.20.0
Added
Add support to convert from Databricks to Jupyter (.dbc -> .ipynb)
Add line number of the error when there is a parsing error
Add company written by the user to the execution info in the assessment summary
Add mappings for:
org.apache.spark.sql.Column.contains(scala.Any)
Example: Spark: col("a").contains(col("b")) → Snowpark: contains(col("a"), col("b"))
Add needed data to display detailed report info in the Desktop tool reports
Updates to the assessment JSON file to accommodate the detailed assessment report
DataFrames saved as tables using a Hive format are now converted to not be specific to Hive
Add automated generation of stored procedures for Spark entry points
Add preprocess step in Python files to identify combination of spaces and tabs, and normalize them with spaces to prevent parsing errors
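The tabs-and-spaces preprocessing step above can be approximated with Python's own str.expandtabs; the tab size of 4 is an assumed convention here, not SMA's documented setting:

```python
# Sketch of normalizing mixed tabs/spaces before parsing, as this release
# describes. str.expandtabs replaces each tab with spaces up to the next
# tab stop; tabsize=4 is an illustrative assumption.
def normalize_indentation(source, tabsize=4):
    return "\n".join(line.expandtabs(tabsize) for line in source.splitlines())

mixed = "def f():\n\tif True:\n\t    return 1"
print(normalize_indentation(mixed))
```

After normalization, the source contains only space indentation, so a parser no longer has to reconcile ambiguous tab widths.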
Inventories uploaded to telemetry even if the tool crashes
Adjust new tool name (Snowpark Migration Accelerator) in DOCX and HTML reports to accommodate the rebranding
Fixed
Fix Import call summary table in the report not matching the total value
Fix timeout issue in application for StructType with multiple fields
Fix indentation scenarios that do not require normalization in Scala
Fix 'Load Symbol Table' crash when the base class is not defined
Fix an issue causing the 'Python File Sizing' and 'Scala File Sizing' tables in the reports to display wrong values
Fix tool getting stuck when processing SQL files in Scala
November 09, 2023
Application Version 0.27.5
Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.16.0
Update to the license request mechanism inside the application.
Spark Conversion Core 2.16.0
Updates include:
Add support for DataFrame alias at joins for Spark Scala.
Import Call Summary table in Assessment Report truncated and ordered.
Turn off by default the condensed file ID feature.
November 02, 2023
Application Version 0.26.0
Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.14.0
The logger mechanism has been updated.
October 25, 2023
Application Version 0.25.11
Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.14.0
Improved crash report flow
Fixes in Code Compare component
The button “View Reports” was changed to open the expected folder
Spark Conversion Core 2.14.0
Updates include:
Add condensed ID for filenames and use it in the log.
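One plausible shape for a condensed file ID is a short, stable hash of the path, so logs can reference files compactly; the notes don't specify SMA's scheme, so this hash-based approach is purely an assumption:

```python
import hashlib

def condensed_file_id(path, length=8):
    """Illustrative sketch: derive a short, stable ID for a file path.

    SMA's actual condensed-ID scheme is not documented in these notes;
    this SHA-256 prefix is an assumption for demonstration.
    """
    return hashlib.sha256(path.encode("utf-8")).hexdigest()[:length]

fid = condensed_file_id("notebooks/etl/load_sales.py")
print(len(fid))  # → 8
```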
Refactor output folder hierarchy of the TrialMode.
Generate Reports locally in Assessment mode when the score hits 90 or higher.
Generate Reports locally in Assessment mode when it's a Snowflake user.
Create inventories as .csv files (as shown below).
Move inventories to the Reports folder (as shown below).

Reports folder now available in Assessment Mode
October 19, 2023
Version 0.25.6 (Oct 19, 2023)
Included SnowConvert Core Versions
Fixes
Inconsistencies with Spark-supported file extensions
CLI Terms and Conditions and Show Access Code options
Visual fixes
Features
SnowConvert Client separation
Version 0.24.0 (Oct 04, 2023)
Included SnowConvert Core Versions
Fixes
Conversion settings persistence in project files.
Inconsistencies in SQL Assessment and Conversion reports were fixed.
Features
Feature Flags for CLI
Version 0.20.3 (Sept 14, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.2.63
Oracle
Teradata
SQLServer
Scala Conversion Core 2.6.0
Python Conversion Core 2.6.0
Features
Analyzing sub-folders and Converting sub-folders are now available.
Include the Disable topological level reorder flag as part of the Teradata conversion settings.
Fixes
Conversion finished successfully but reporting a crashed status.
SQL Server schema was set to PUBLIC automatically.
Missing generic scanner files on Spark/Python assessment.
Updated EULA.
Version 0.19.7 (Sept 7, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.2.48
Oracle
Teradata
SQLServer
Scala Conversion Core 2.5.0
Python Conversion Core 2.5.0
Version 0.19.1 (Sept 4, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.2.30
Oracle
Teradata
SQLServer
Scala Conversion Core 2.4.0
Python Conversion Core 2.4.0
Fixes
Changed default Conversion Rate on Reports to Lines of Code Conversion Rate.
Fixed issues with the list of Recently opened projects.
Fixed issue when trying to open an invalid .snowct file
Version 0.17.0 (Aug 24, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.2.9
Oracle
Teradata
SQLServer
Scala Conversion Core 2.3.31
Python Conversion Core 2.3.31
Fixes
Assessment Conversion settings on the correct platforms.
Input Folder validations.
Creating a project with an existing name in the input folder blocked the application.
Version 0.16.1 (Aug 21, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.0.47
Oracle
Teradata
SQLServer
Scala Conversion Core 2.3.31
Python Conversion Core 2.3.31
Fixes
A unified CLI version is now available.
Fix displayed data on SQL Conversion reports.
Open recent project issues when starting a new project.
Assessment settings.
Version 0.15.2 (Aug 17, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.0.47
Oracle
Teradata
SQLServer
Scala Conversion Core 2.3.31
Python Conversion Core 2.3.31
Fixes
An auto-update issue with the x64 version for macOS. (Requires manual reinstallation).
Fix links displayed in report pages.
Minor updates in texts and labels.
Version 0.14.5 (Aug 10, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.0.32
Oracle
Teradata
SQLServer
Scala Conversion Core 2.3.31
Python Conversion Core 2.3.31
Hotfix change for Snowpark Engines.
Version 0.14.1 (Aug 9, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.0.32
Oracle
Teradata
SQLServer
Scala Conversion Core 2.3.22
Python Conversion Core 2.3.22
Fixes
Fixed visual bugs on reports.
Changes on the Request an Access Code page
Rename the access-code field on the .snowct files.
Don't create empty output folders.
Version 0.13.1 (Aug 3, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.0.17
Oracle
Teradata
SQLServer
Scala Conversion Core 2.3.22
Python Conversion Core 2.3.22
Fixes
Improvements in Assessment and Conversion Reports
Updates in the reports layouts.
Collapsible sections.
Order in Card Components.
Version 0.11.7 (July 27, 2023)
Included SnowConvert Core Versions
Fixes
Fixing Conversion Rate by LoC.
Adding % to SQL LoC Conversion Rate
Output path validation was added in the report viewer.
Telemetry can be disabled once a valid license is selected.
Version 0.11.3 (July 19, 2023)
Included SnowConvert Core Versions
Fixes
Conversion settings reset after changing the current step.
Minor visual improvements.
Wording changes.
Version 0.9.2 (July 12, 2023)
Included SnowConvert Core Versions
Fixes
Included preview header.
Minor visual improvements.
Version 0.8.2 (July 10, 2023)
Included SnowConvert Core Versions
Fixes
Reset the timer on the progress bar in alerts.
Fixing styles on displayed alert notifications.
Added preview banner on application header.
Improved exception handling mechanism.
Version 0.7.6 (July 03, 2023)
Included SnowConvert Core Versions
Fixes
Updated the notarization tool.
Fix the conversion rate issue when using conversion settings.
Fix the open new project flow after opening an old project.
Remove the .mobilize folder from outputs.
Improve alerts and notifications.
Windows certificate naming issue. (Requires manual reinstallation).
Version 0.6.1 (June 23, 2023)
Included SnowConvert Core Versions
Fixes
Sign Windows binaries with Snowflake certificates.
Fixed issue when creating a new project after opening an existing one.
Minor styling and wording improvements.
Version 0.4.1 (June 21, 2023)
Included SnowConvert Core Versions
Fixes
The report did not display the correct information.
Keep the conversion failed status when reopening the project.
Update texts and documentation links.
Version 0.3.0 (June 16, 2023)
Included SnowConvert Core Versions
Fixes
Added tool version in error logs.
Included custom installation wizard for Windows version.
Assessment report tables not processing numbers with commas.
The code signing certificate was changed. This affects the OTA Update, manual installation of this version is required.
Version 0.2.9 (June 15, 2023)
Included SnowConvert Core Versions
Fixes
Missing information in telemetry reports
Fix the auto-saving issue with .snowct project files.
Telemetry enabled for conversion flows.
Error is shown when trying to convert without supported files.