Release Notes
Release Notes for the Snowpark Migration Accelerator (SMA)
Note that the release notes below are organized by release date. Version numbers for both the application and the conversion core will appear below.
November 14th, 2024
Application & CLI Version 2.3.1
Included SMA Core Versions
Snowpark Conversion Core 4.12.0
Desktop App
Fixed
Fix case-sensitive issues in --sql options.
Removed
Remove platform name from show-ac message.
Snowpark Conversion Core 4.12.0
Added
Added support for Snowpark Python 1.23.0 and 1.24.0.
Added a new EWI for the pyspark.sql.dataframe.DataFrame.writeTo function. All the usages of this function will now have the EWI SPRKPY1087.
Changed
Updated the documentation of the Scala EWIs from SPRKSCL1137 to SPRKSCL1156 to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the documentation of the Scala EWIs from SPRKSCL1117 to SPRKSCL1136 to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the message that is shown for the following EWIs:
Updated the documentation of the Scala EWIs from SPRKSCL1100 to SPRKSCL1105, from SPRKSCL1108 to SPRKSCL1116, and from SPRKSCL1157 to SPRKSCL1175 to align with a standardized format, ensuring consistency and clarity across all the EWIs.
Updated the mapping status of the following PySpark elements from NotSupported to Direct with EWI:
pyspark.sql.readwriter.DataFrameWriter.option => snowflake.snowpark.DataFrameWriter.option: All the usages of this function now have the EWI SPRKPY1088.
pyspark.sql.readwriter.DataFrameWriter.options => snowflake.snowpark.DataFrameWriter.options: All the usages of this function now have the EWI SPRKPY1089.
Updated the mapping status of the following PySpark elements from Workaround to Rename:
pyspark.sql.readwriter.DataFrameWriter.partitionBy => snowflake.snowpark.DataFrameWriter.partition_by
Updated EWI documentation: SPRKSCL1000, SPRKSCL1001, SPRKSCL1002, SPRKSCL1100, SPRKSCL1101, SPRKSCL1102, SPRKSCL1103, SPRKSCL1104, SPRKSCL1105.
Removed
Removed the pyspark.sql.dataframe.DataFrameStatFunctions.writeTo element from the conversion status, since this element does not exist.
Deprecated
Deprecated the following EWI codes:
October 30th, 2024
Application & CLI Version 2.3.0
Snowpark Conversion Core 4.11.0
Snowpark Conversion Core 4.11.0
Added
Added a new column called Url to the Issues.csv file, which redirects to the corresponding EWI documentation.
Added new EWIs for the following Spark elements:
[SPRKPY1082] pyspark.sql.readwriter.DataFrameReader.load
[SPRKPY1083] pyspark.sql.readwriter.DataFrameWriter.save
[SPRKPY1084] pyspark.sql.readwriter.DataFrameWriter.option
[SPRKPY1085] pyspark.ml.feature.VectorAssembler
[SPRKPY1086] pyspark.ml.linalg.VectorUDT
Added 38 new Pandas elements:
pandas.core.frame.DataFrame.select
pandas.core.frame.DataFrame.str
pandas.core.frame.DataFrame.str.replace
pandas.core.frame.DataFrame.str.upper
pandas.core.frame.DataFrame.to_list
pandas.core.frame.DataFrame.tolist
pandas.core.frame.DataFrame.unique
pandas.core.frame.DataFrame.values.tolist
pandas.core.frame.DataFrame.withColumn
pandas.core.groupby.generic._SeriesGroupByScalar
pandas.core.groupby.generic._SeriesGroupByScalar[S1].agg
pandas.core.groupby.generic._SeriesGroupByScalar[S1].aggregate
pandas.core.indexes.datetimes.DatetimeIndex.year
pandas.core.series.Series.columns
pandas.core.tools.datetimes.to_datetime.date
pandas.core.tools.datetimes.to_datetime.dt.strftime
pandas.core.tools.datetimes.to_datetime.strftime
pandas.io.parsers.readers.TextFileReader.apply
pandas.io.parsers.readers.TextFileReader.astype
pandas.io.parsers.readers.TextFileReader.columns
pandas.io.parsers.readers.TextFileReader.copy
pandas.io.parsers.readers.TextFileReader.drop
pandas.io.parsers.readers.TextFileReader.drop_duplicates
pandas.io.parsers.readers.TextFileReader.fillna
pandas.io.parsers.readers.TextFileReader.groupby
pandas.io.parsers.readers.TextFileReader.head
pandas.io.parsers.readers.TextFileReader.iloc
pandas.io.parsers.readers.TextFileReader.isin
pandas.io.parsers.readers.TextFileReader.iterrows
pandas.io.parsers.readers.TextFileReader.loc
pandas.io.parsers.readers.TextFileReader.merge
pandas.io.parsers.readers.TextFileReader.rename
pandas.io.parsers.readers.TextFileReader.shape
pandas.io.parsers.readers.TextFileReader.to_csv
pandas.io.parsers.readers.TextFileReader.to_excel
pandas.io.parsers.readers.TextFileReader.unique
pandas.io.parsers.readers.TextFileReader.values
pandas.tseries.offsets
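The new Url column described above links each reported issue to its EWI documentation page. As a minimal sketch of how a consumer might use it, the snippet below reads an Issues.csv-style file; the extra headers and the URLs themselves are illustrative placeholders, not the SMA's actual schema.

```python
import csv
import io

# Hypothetical Issues.csv excerpt: only the Url column is documented in the
# release note; the other headers and URLs are illustrative placeholders.
issues_csv = io.StringIO(
    "Code,Description,Url\n"
    "SPRKPY1082,DataFrameReader.load usage,https://docs.example.com/ewi/SPRKPY1082\n"
    "SPRKPY1085,VectorAssembler usage,https://docs.example.com/ewi/SPRKPY1085\n"
)

# Map each issue code to the documentation page it now links to.
ewi_links = {row["Code"]: row["Url"] for row in csv.DictReader(issues_csv)}
print(ewi_links["SPRKPY1082"])
```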
October 24th, 2024
Application Version 2.2.3
Included SMA Core Versions
Snowpark Conversion Core 4.10.0
Desktop App
Fixed
Fixed a bug that caused the SMA to show the label SnowConvert instead of Snowpark Migration Accelerator in the menu bar of the Windows version.
Fixed a bug that caused the SMA to crash when it did not have read and write permissions to the .config directory on macOS and the AppData directory on Windows.
Command Line Interface
Changed
Renamed the CLI executable from snowct to sma.
Removed the source language argument, so you no longer need to specify whether you are running a Python or Scala assessment / conversion.
Expanded the command line arguments supported by the CLI by adding the following new arguments:
--enableJupyter | -j: Flag to indicate whether the conversion of Databricks notebooks to Jupyter is enabled.
--sql | -f: Database engine syntax to be used when a SQL command is detected.
--customerEmail | -e: Configure the customer email.
--customerCompany | -c: Configure the customer company.
--projectName | -p: Configure the customer project.
Updated some texts to reflect the correct name of the application, ensuring consistency and clarity in all the messages.
Updated the terms of use of the application.
Updated and expanded the documentation of the CLI to reflect the latest features, enhancements, and changes.
Updated the text that is shown before proceeding with the execution of the SMA to improve clarity.
Updated the CLI to accept “Yes” as a valid argument when prompting for user confirmation.
Allowed the CLI to continue the execution without waiting for user interaction by specifying the -y or --yes argument.
Updated the help information of the --sql argument to show the values that this argument expects.
Snowpark Conversion Core Version 4.10.0
Added
Added a new EWI for the pyspark.sql.readwriter.DataFrameWriter.partitionBy function. All the usages of this function will now have the EWI SPRKPY1081.
Added a new column called Technology to the ImportUsagesInventory.csv file.
Changed
Updated the Third-Party Libraries readiness score to also take into account the Unknown libraries.
Updated the AssessmentFiles.zip file to include .json files instead of .pam files.
Improved the CSV-to-JSON conversion mechanism to make the processing of inventories more performant.
Improved the documentation of the following EWIs:
Updated the mapping status of the following Spark Scala elements from Direct to Rename:
org.apache.spark.sql.functions.shiftLeft => com.snowflake.snowpark.functions.shiftleft
org.apache.spark.sql.functions.shiftRight => com.snowflake.snowpark.functions.shiftright
Updated the mapping status of the following Spark Scala elements from Not Supported to Direct:
org.apache.spark.sql.functions.shiftleft => com.snowflake.snowpark.functions.shiftleft
org.apache.spark.sql.functions.shiftright => com.snowflake.snowpark.functions.shiftright
Fixed
Fixed a bug that caused the SMA to incorrectly populate the Origin column of the ImportUsagesInventory.csv file.
Fixed a bug that caused the SMA to not classify imports of the io, json, logging, and unittest libraries as Python built-in imports in the ImportUsagesInventory.csv file and in the DetailedReport.docx file.
October 11th, 2024
Application Version 2.2.2
Feature Updates include:
Snowpark Conversion Core 4.8.0
Snowpark Conversion Core Version 4.8.0
Added
Added EwiCatalog.csv and .md files to reorganize the documentation.
Added the mapping status of pyspark.sql.functions.ln as Direct.
Added a transformation for pyspark.context.SparkContext.getOrCreate. Please check the EWI SPRKPY1080 for further details.
Added an improvement to the SymbolTable to infer the types of function parameters.
The SymbolTable now supports static methods and does not assume the first parameter will be self for them.
Changed
Updated the mapping status of pyspark.sql.functions.array_remove from NotSupported to Direct.
Fixed
Fixed the Code File Sizing table in the Detailed Report to exclude .sql and .hql files, and added the Extra Large row to the table.
Fixed the missing update_query_tag when SparkSession is defined in multiple lines in Python.
Fixed the missing update_query_tag when SparkSession is defined in multiple lines in Scala.
Fixed the missing EWI SPRKHVSQL1001 on some SQL statements with parsing errors.
Fixed the handling of new-line values inside string literals so they are kept.
Fixed the Total Lines of Code shown in the File Type Summary table.
Fixed the Parsing Score being shown as 0 when files were recognized successfully.
Fixed the LOC count in the cell inventory for Databricks Magic SQL cells.
September 26th, 2024
Application Version 2.2.0
Feature Updates include:
Snowpark Conversion Core 4.6.0
Snowpark Conversion Core Version 4.6.0
Added
Added a transformation for pyspark.sql.readwriter.DataFrameReader.parquet.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.option when it is a Parquet method.
Changed
Updated the mapping status of:
pyspark.sql.types.StructType.fields from NotSupported to Direct.
pyspark.sql.types.StructType.names from NotSupported to Direct.
pyspark.context.SparkContext.setLogLevel from Workaround to Transformation. More detail can be found in EWIs SPRKPY1078 and SPRKPY1079.
org.apache.spark.sql.functions.round from Workaround to Direct.
org.apache.spark.sql.functions.udf from NotDefined to Transformation. More detail can be found in EWIs SPRKSCL1174 and SPRKSCL1175.
Updated the mapping status of the following Spark elements from DirectHelper to Direct:
org.apache.spark.sql.functions.hex
org.apache.spark.sql.functions.unhex
org.apache.spark.sql.functions.shiftleft
org.apache.spark.sql.functions.shiftright
org.apache.spark.sql.functions.reverse
org.apache.spark.sql.functions.isnull
org.apache.spark.sql.functions.unix_timestamp
org.apache.spark.sql.functions.randn
org.apache.spark.sql.functions.signum
org.apache.spark.sql.functions.sign
org.apache.spark.sql.functions.collect_list
org.apache.spark.sql.functions.log10
org.apache.spark.sql.functions.log1p
org.apache.spark.sql.functions.base64
org.apache.spark.sql.functions.unbase64
org.apache.spark.sql.functions.regexp_extract
org.apache.spark.sql.functions.expr
org.apache.spark.sql.functions.date_format
org.apache.spark.sql.functions.desc
org.apache.spark.sql.functions.asc
org.apache.spark.sql.functions.size
org.apache.spark.sql.functions.locate
org.apache.spark.sql.functions.ntile
Fixed
Fixed the value shown in the Percentage of Total Pandas API.
Fixed the Total percentage in the ImportCalls table of the DetailedReport.
Deprecated
Deprecated the following EWI code:
September 12th, 2024
Application Version 2.1.7
Feature Updates include:
Snowpark Conversion Core 4.5.7
Snowpark Conversion Core 4.5.2
Snowpark Conversion Core Version 4.5.7
Hotfixed
Fixed the Total row added to the Spark Usages Summaries when there are no usages.
Bumped the Python Assembly to Version 1.3.111.
Added parsing of trailing commas in multiline arguments.
Snowpark Conversion Core Version 4.5.2
Added
Added a transformation for pyspark.sql.readwriter.DataFrameReader.option:
When the chain is from a CSV method call.
When the chain is from a JSON method call.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.json.
Changed
Executed the SMA on SQL strings passed to Python/Scala functions:
Created an AST in Scala/Python to emit a temporary SQL unit.
Created the SqlEmbeddedUsages.csv inventory.
Deprecated the SqlStatementsInventory.csv and SqlExtractionInventory.csv inventories.
Integrated an EWI when the SQL literal could not be processed.
Created a new task to process SQL-embedded code.
Collected the info for the SqlEmbeddedUsages.csv inventory in Python.
Replaced the transformed SQL code with a literal in Python.
Updated the test cases after the implementation.
Created tables and views for telemetry in the SqlEmbeddedUsages inventory.
Collected the info for the SqlEmbeddedUsages.csv report in Scala.
Replaced the transformed SQL code with a literal in Scala.
Checked the line number order for the embedded SQL reporting.
Filled the SqlFunctionsInfo.csv with the SQL functions documented for SparkSQL and HiveSQL.
Updated the mapping status for:
org.apache.spark.sql.SparkSession.sparkContext from NotSupported to Transformation.
org.apache.spark.sql.Builder.config from NotSupported to Transformation. With this new mapping status, the SMA will remove all the usages of this function from the source code.
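The embedded-SQL processing described above targets SQL string literals passed to functions such as spark.sql. As a minimal sketch of the kind of usage the SqlEmbeddedUsages.csv inventory records, the regex heuristic below is an illustration only, not the SMA's real parser-based detection.

```python
import re

# Hypothetical sketch: find SQL string literals passed to spark.sql(...) in
# Python source text. The regex is an assumed, simplified heuristic.
SQL_CALL = re.compile(r"""spark\.sql\(\s*["']([^"']+)["']\s*\)""")

source = (
    'df = spark.sql("SELECT id, name FROM users WHERE active = 1")\n'
    'n = spark.sql("SELECT COUNT(*) FROM orders")\n'
)

# Each match is an embedded SQL statement that would be inventoried.
embedded_sql = SQL_CALL.findall(source)
for stmt in embedded_sql:
    print(stmt)
```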
September 5th, 2024
Application Version 2.1.6
Hotfix change for Snowpark Engines Core version 4.5.1
Spark Conversion Core Version 4.5.1
Hotfix
Added a mechanism to convert the temporary Databricks notebooks generated by the SMA into exported Databricks notebooks.
August 29th, 2024
Application Version 2.1.5
Feature Updates include:
Updated Spark Conversion Core: 4.3.2
Spark Conversion Core Version 4.3.2
Added
Added a mechanism (via decoration) to get the line and the column of the elements identified in notebook cells.
Added an EWI for pyspark.sql.functions.from_json.
Added a transformation for pyspark.sql.readwriter.DataFrameReader.csv.
Enabled the query tag mechanism for Scala files.
Added the Code Analysis Score and additional links to the Detailed Report.
Added a column called OriginFilePath to InputFilesInventory.csv
Changed
Updated the mapping status of pyspark.sql.functions.from_json from Not Supported to Transformation.
Updated the mapping status of the following Spark elements from Workaround to Direct:
org.apache.spark.sql.functions.countDistinct
org.apache.spark.sql.functions.max
org.apache.spark.sql.functions.min
org.apache.spark.sql.functions.mean
Deprecated
Deprecated the following EWI codes:
SPRKSCL1135
SPRKSCL1136
SPRKSCL1153
SPRKSCL1155
Fixed
Fixed a bug that caused an incorrect calculation of the Spark API score.
Fixed an error that prevented empty or commented SQL files from being copied to the output folder.
Fixed a bug in the DetailedReport where the notebook stats for LOC and cell count were not accurate.
August 14th, 2024
Application Version 2.1.2
Feature Updates include:
Updated Spark Conversion Core: 4.2.0
Spark Conversion Core Version 4.2.0
Added
Added a technology column to the SparkUsagesInventory.
Added an EWI for SQL elements that are not defined.
Added the SqlFunctions inventory.
Collected the info for the SqlFunctions inventory.
Changed
The engine now processes and prints partially parsed Python files instead of leaving the original file without modifications.
Python notebook cells that have parsing errors will also be processed and printed.
Fixed
Fixed pandas.core.indexes.datetimes.DatetimeIndex.strftime being reported wrongly.
Fixed a mismatch between the SQL readiness score and the SQL Usages by Support Status.
Fixed a bug that caused the SMA to report pandas.core.series.Series.empty with an incorrect mapping status.
Fixed a mismatch where the Spark API Usages Ready for Conversion in DetailedReport.docx differed from the UsagesReadyForConversion row in Assessment.json.
August 8th, 2024
Application Version 2.1.1
Feature Updates include:
Updated Spark Conversion Core: 4.1.0
Spark Conversion Core Version 4.1.0
Added
Added the following information to the AssessmentReport.json file:
The third-party libraries readiness score.
The number of third-party library calls that were identified.
The number of third-party library calls that are supported in Snowpark.
The color code associated with the third-party readiness score, the Spark API readiness score, and the SQL readiness score.
Transformed SqlSimpleDataType in Spark create tables.
Added the mapping of pyspark.sql.functions.get as direct.
Added the mapping of pyspark.sql.functions.to_varchar as direct.
As part of the changes after unification, the tool now generates an execution info file in the Engine.
Added a replacer for pyspark.sql.SparkSession.builder.appName.
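The AssessmentReport.json fields listed above can be sketched as follows; the key names, formula, and color thresholds are all assumptions for illustration, not the SMA's published schema or buckets.

```python
# Hypothetical sketch of the third-party section of AssessmentReport.json.
# Key names, formula, and color thresholds are assumed, for illustration only.
def third_party_section(identified: int, supported: int) -> dict:
    score = 100.0 * supported / identified if identified else 0.0
    # Assumed traffic-light buckets for the color code.
    color = "GREEN" if score >= 80 else "YELLOW" if score >= 60 else "RED"
    return {
        "thirdPartyCallsIdentified": identified,
        "thirdPartyCallsSupported": supported,
        "thirdPartyReadinessScore": round(score, 2),
        "thirdPartyReadinessColor": color,
    }

section = third_party_section(identified=40, supported=30)
print(section)
```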
Changed
Updated the mapping status for the following Spark elements from Not Supported to Direct mapping:
pyspark.sql.functions.sign
pyspark.sql.functions.signum
Changed the Notebook Cells Inventory report to indicate the kind of content for every cell in the Element column.
Added a SCALA_READINESS_SCORE column that reports the readiness score as related only to references to the Spark API in Scala files.
Partial support to transform table properties in ALTER TABLE and ALTER VIEW.
Updated the conversion status of the SqlSimpleDataType node from Pending to Transformation in Spark create tables.
Updated the version of the Snowpark Scala API supported by the SMA from 1.7.0 to 1.12.1. Updated the mapping status of:
org.apache.spark.sql.SparkSession.getOrCreate from Rename to Direct.
org.apache.spark.sql.functions.sum from Workaround to Direct.
Updated the version of the Snowpark Python API supported by the SMA from 1.15.0 to 1.20.0. Updated the mapping status of:
pyspark.sql.functions.arrays_zip from Not Supported to Direct.
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.frame.DataFrame.any
pandas.core.frame.DataFrame.applymap
Updated the mapping status for the following Pandas elements:
From Not Supported to Direct mapping:
pandas.core.frame.DataFrame.groupby
pandas.core.frame.DataFrame.index
pandas.core.frame.DataFrame.T
pandas.core.frame.DataFrame.to_dict
From Not Supported to Rename mapping:
pandas.core.frame.DataFrame.map
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.frame.DataFrame.where
pandas.core.groupby.generic.SeriesGroupBy.agg
pandas.core.groupby.generic.SeriesGroupBy.aggregate
pandas.core.groupby.generic.DataFrameGroupBy.agg
pandas.core.groupby.generic.DataFrameGroupBy.aggregate
pandas.core.groupby.generic.DataFrameGroupBy.apply
Not Supported mappings:
pandas.core.frame.DataFrame.to_parquet
pandas.core.generic.NDFrame.to_csv
pandas.core.generic.NDFrame.to_excel
pandas.core.generic.NDFrame.to_sql
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.series.Series.empty
pandas.core.series.Series.apply
pandas.core.reshape.tile.qcut
Direct mappings with EWI:
pandas.core.series.Series.fillna
pandas.core.series.Series.astype
pandas.core.reshape.melt.melt
pandas.core.reshape.tile.cut
pandas.core.reshape.pivot.pivot_table
Updated the mapping status for the following Pandas elements:
Direct mappings:
pandas.core.series.Series.dt
pandas.core.series.Series.groupby
pandas.core.series.Series.loc
pandas.core.series.Series.shape
pandas.core.tools.datetimes.to_datetime
pandas.io.excel._base.ExcelFile
Not Supported mappings:
pandas.core.series.Series.dt.strftime
Updated the mapping status for the following Pandas elements:
From Not Supported to Direct mapping:
pandas.io.parquet.read_parquet
pandas.io.parsers.readers.read_csv
Updated the mapping status for the following Pandas elements:
From Not Supported to Direct mapping:
pandas.io.pickle.read_pickle
pandas.io.sql.read_sql
pandas.io.sql.read_sql_query
Updated the description of Understanding the SQL Readiness Score.
Updated PyProgramCollector to collect the packages and populate the current packages inventory with data from Python source code.
Updated the mapping status of pyspark.sql.SparkSession.builder.appName from Rename to Transformation.
Removed the following Scala integration tests:
AssesmentReportTest_AssessmentMode.ValidateReports_AssessmentMode
AssessmentReportTest_PythonAndScala_Files.ValidateReports_PythonAndScala
AssessmentReportTestWithoutSparkUsages.ValidateReports_WithoutSparkUsages
Updated the mapping status of pandas.core.generic.NDFrame.shape from Not Supported to Direct.
Updated the mapping status of pandas.core.series from Not Supported to Direct.
Deprecated
Deprecated the EWI code SPRKSCL1160 since org.apache.spark.sql.functions.sum is now a direct mapping.
Fixed
Fixed a bug caused by not supporting Custom Magics without arguments in Jupyter Notebook cells.
Fixed incorrect generation of EWIs in the issues.csv report when parsing errors occur.
Fixed a bug that caused the SMA to not process Databricks exported notebooks as Databricks notebooks.
Fixed a stack overflow error while processing clashing type names of declarations created inside package objects.
Fixed the processing of complex lambda type names involving generics, e.g., def func[X,Y](f: (Map[Option[X], Y] => Map[Y, X]))...
Fixed a bug that caused the SMA to add a PySpark EWI code instead of a Pandas EWI code to the Pandas elements that are not yet recognized.
Fixed a typo in the detailed report template: renaming a column from "Percentage of all Python Files" to "Percentage of all files".
Fixed a bug where pandas.core.series.Series.shape was wrongly reported.
July 19th, 2024
Application Version 2.1.0
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 3.2.0
Spark Conversion Core Version 3.2.0
Changed
New Readiness Score for SQL in the results screen
Settings were added to the desktop application to enable or disable Pandas to Snowpark Pandas API conversion
July 11th, 2024
Application Version 2.0.2
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 3.0.0
Spark Conversion Core Version 3.0.0
Breaking Changes
This new version includes major changes:
New download links; previous versions won't be auto-updated.
A single access code is required for Python, Scala, and SparkSQL. Previous access codes for Python will continue working; however, Scala ones won't work anymore, so you need to request a new access code.
No need to select a language to analyze.
After executing the tool, you won't receive the Snowpark Qualification Report email, as the report information is available locally to the user.
Removed
Unified the Python/Scala conversion tools.
Removed Select Source from the Inquire form.
Removed Select Source from the New Project / See Sample Project screens.
Removed table generation from SMA.Desktop.
Changed
Unified the Python/Scala conversion tools.
Updated to remove the Python and Scala Conversion Core versions and have just an Engine Conversion Core version.
Updated the results screen.
The Access Code toast now has information related to the source language.
The Summary Report screen now has references to the source language.
Fixed the input folder path validation displaying the wrong message.
Deprecated Scala licenses so that the tool only uses Python access codes.
June 27, 2024
Application Version 1.3.1
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.48.0
Spark Conversion Core Version 2.48.0
Added
Improved the parsing recovery mechanism for Scala files and Scala notebook cells to orphan fewer lines of code
Added support for the HADOOP shell command related to HiveQL
Added support for the HDFS shell command related to HiveQL
Added support for TBLPROPERTIES in ALTER VIEW statements
Updated the conversion status for SQL nodes in HiveQL that don't need conversion
Updated the conversion status for SQL nodes in SparkSQL that don't need conversion
The SQL nodes without a migration status were updated to PENDING
Improved the Jupyter parser to support the filename and the package name as parameters
Fixed
Fixed a bug that caused the SMA to not show the readiness score even though there were uses of the Spark API
Fixed a bug that caused the EWI SPRKSCL1000 to show a wrong description in the issue list table of the detailed report
Fixed the parsing of Comment clauses in SQL statements with new lines
Fixed the parsing of statements after a Lateral View clause in HiveQL
June 13, 2024
Application Version 1.3.0
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.47.0
Spark Conversion Core Version 2.47.0
Added
Added a transformation for Hive table comments.
Added a transformation for adding or replacing the comment on Create View, Create Table, and Create Function.
Added a tag to comments for CREATE FUNCTION nodes.
Removed the generation of the conversion_rates.csv, files.csv, and parse_errors.csv inventories.
Fixed
Fixed DotNames that start with numbers (such as in this example: select * from id.12id.12_id3).
Parsed and refactored the Comment clause in Create View.
Fixed missing columns in empty inventories.
May 30, 2024
Application Version 1.2.5
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.46.0
Spark Conversion Core Version 2.46.0
Added
Added a parsing score indicator that shows the percentage of all the files that were successfully parsed.
Added SPRKPY1074 EWI for mixed indentation errors.
Updates to the Detailed Report
Updated the look and feel of the report for both Python and Scala.
Added a Total row for Code File Sizing table in the detailed report.
Added files with Pandas usages table and Pandas API usages summaries table.
Added the new File Type Summary table
Added a new table called Files with Spark Usages.
Added a new table called Files with Spark Usages by Support Status.
Added SQL usages by file type table.
Added SQL usages by status table.
Transposed the Notebook stats by language table
Updated the detailed docx report to classify the readiness scores with N/A values as a green result
Reindexed the order of tables in the detailed report
Updated the conversion status for SQL nodes in HiveSql and SparkSql that don't need conversion
Updates to SQL parsing support
Identify and register mixed indentation errors.
Parse IS as a binary operator
Support RLike as a binary operator
Support DotNames that start with numbers
Parse the Lateral View clause
Parse Parquet as a name in the Using table option
Parse IF as a function name
Parse query parameters as expressions in SparkSQL.
Parse IMAGE as an alias
Parse the modulo (%) operator
Parse ALL as an alias
Parse SQL notebook cells with %% in magic commands
Added a core library mapping table to support the third-party library analysis
Added ConversionStatusLibraries.csv
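The mixed-indentation errors flagged by the new SPRKPY1074 EWI can be sketched with a small detector; the heuristic below is an assumption for illustration, not the SMA's actual check.

```python
# Hypothetical sketch of detecting lines whose indentation mixes tabs and
# spaces, the kind of error the new SPRKPY1074 EWI reports (assumed heuristic).
def mixed_indentation_lines(source: str) -> list:
    flagged = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Leading whitespace of the line.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if " " in indent and "\t" in indent:
            flagged.append(lineno)
    return flagged

code = "def f():\n    x = 1\n\t y = 2\n"
flagged = mixed_indentation_lines(code)  # line 3 mixes a tab with a space
```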
Fixed
Commented out the remaining semicolon in top-level statements in HiveQL.
Fixed the parsing of Lateral View with multiple AsClauses.
Fixed the Lateral View parsing order.
May 16, 2024
Application Version 1.2.4
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.45.1
Spark Conversion Core Version 2.45.1
Added
Argument/parameter information for Python is now listed in the usages inventories
Added mappings:
General PySpark
pyspark.sql.functions.map_from_arrays
pyspark.sql.dataframe.DataFrame.toPandas
ML related Spark mappings for:
pyspark.ml
pyspark.ml.classification
pyspark.ml.clustering
pyspark.ml.feature
pyspark.ml.regression
pyspark.ml.feature StringIndexer
pyspark.ml.clustering KMeans
pyspark.ml.feature OneHotEncoder
pyspark.ml.feature MinMaxScaler
pyspark.ml.regression LinearRegression
pyspark.ml.feature StandardScaler
pyspark.ml.classification RandomForestClassifier
pyspark.ml.classification LogisticRegression
pyspark.ml.feature PCA
pyspark.ml.classification GBTClassifier
pyspark.ml.classification DecisionTreeClassifier
pyspark.ml.classification LinearSVC
pyspark.ml.feature RobustScaler
pyspark.ml.feature Binarizer
pyspark.ml.feature MaxAbsScaler
pyspark.ml.feature Normalizer
Pandas API mappings to the new Snowpark implementation of Pandas have begun. These will not be converted, but will now be reported in the Pandas Usages Inventory. 82 mappings for the Pandas API were added. All are direct mappings with the exception of the first one:
pandas.core.series.Series.transpose [rename]
pandas
pandas.core.frame.DataFrame
pandas.core.frame.DataFrame.abs
pandas.core.frame.DataFrame.add_suffix
pandas.core.frame.DataFrame.axes
pandas.core.frame.DataFrame.columns
pandas.core.frame.DataFrame.copy
pandas.core.frame.DataFrame.cummax
pandas.core.frame.DataFrame.cummin
pandas.core.frame.DataFrame.describe
pandas.core.frame.DataFrame.diff
pandas.core.frame.DataFrame.drop
pandas.core.frame.DataFrame.drop_duplicates
pandas.core.frame.DataFrame.dtypes
pandas.core.frame.DataFrame.duplicated
pandas.core.frame.DataFrame.empty
pandas.core.frame.DataFrame.first
pandas.core.frame.DataFrame.first_valid_index
pandas.core.frame.DataFrame.head
pandas.core.frame.DataFrame.iloc
pandas.core.frame.DataFrame.isin
pandas.core.frame.DataFrame.isna
pandas.core.frame.DataFrame.isnull
pandas.core.frame.DataFrame.iterrows
pandas.core.frame.DataFrame.itertuples
pandas.core.frame.DataFrame.keys
pandas.core.frame.DataFrame.last
pandas.core.frame.DataFrame.last_valid_index
pandas.core.frame.DataFrame.max
pandas.core.frame.DataFrame.mean
pandas.core.frame.DataFrame.median
pandas.core.frame.DataFrame.min
pandas.core.frame.DataFrame.ndim
pandas.core.frame.DataFrame.notna
pandas.core.frame.DataFrame.notnull
pandas.core.frame.DataFrame.rename_axis
pandas.core.frame.DataFrame.reset_index
pandas.core.frame.DataFrame.select_dtypes
pandas.core.frame.DataFrame.set_axis
pandas.core.frame.DataFrame.set_index
pandas.core.frame.DataFrame.shape
pandas.core.frame.DataFrame.size
pandas.core.frame.DataFrame.squeeze
pandas.core.frame.DataFrame.sum
pandas.core.frame.DataFrame.tail
pandas.core.frame.DataFrame.take
pandas.core.frame.DataFrame.update
pandas.core.frame.DataFrame.value_counts
pandas.core.frame.DataFrame.values
pandas.core.groupby.generic.DataFrameGroupBy.count
pandas.core.groupby.generic.DataFrameGroupBy.max
pandas.core.groupby.generic.DataFrameGroupBy.sum
pandas.core.series.Series.abs
pandas.core.series.Series.add_prefix
pandas.core.series.Series.add_suffix
pandas.core.series.Series.array
pandas.core.series.Series.axes
pandas.core.series.Series.cummax
pandas.core.series.Series.cummin
pandas.core.series.Series.describe
pandas.core.series.Series.diff
pandas.core.series.Series.dtype
pandas.core.series.Series.dtypes
pandas.core.series.Series.first_valid_index
pandas.core.series.Series.hasnans
pandas.core.series.Series.idxmax
pandas.core.series.Series.idxmin
pandas.core.series.Series.keys
pandas.core.series.Series.last
pandas.core.series.Series.last_valid_index
pandas.core.series.Series.median
pandas.core.series.Series.notna
pandas.core.series.Series.rename_axis
pandas.core.series.Series.set_axis
pandas.core.series.Series.squeeze
pandas.core.series.Series.T
pandas.core.series.Series.tail
pandas.core.series.Series.take
pandas.core.series.Series.to_list
pandas.core.series.Series.to_numpy
pandas.core.series.Series.update
Updated Mappings:
Added transformation for csv, json, and parquet functions including:
pyspark.sql.readwriter.DataFrameWriter.json
pyspark.sql.readwriter.DataFrameWriter.csv
pyspark.sql.readwriter.DataFrameWriter.parquet
Updated the mapping for pyspark.rdd.RDD.getNumPartitions to transformation
Updated the mapping for pyspark.storagelevel.StorageLevel to transformation
Added end-to-end test infrastructure and input/output validations
Changed the import statement transformation: unsupported imports are removed and EWI messages are not generated in the code
Updated the conversion status for SQL nodes in Hive that don't need conversion (multiple expressions - part 02)
Updated the SqlElementsInfo.csv with newly identified elements
Updated the Replacer and SqlElementsInfo items to include Transformation
Enabled decorations in transformation to comment out unsupported nodes
Fixed the groupBy function in the source code of org.apache.spark.sql.DataFrame to place it correctly in the symbol table
Added toPandas as pyspark in the ThirdPartyLibs
Fixed
Fixed some scenarios where EWI comments were not being added to the output code
Fixed the processing of empty source cells present in Jupyter Notebooks
Fixed the parsing error message not being added to the output code
Fixed an issue with pyspark.sql.functions.udf requiring the return_type parameter
May 2, 2024
Application Version 1.2.2
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.44.0
Spark Conversion Core Version 2.44.0
Added
Argument information available in in Python usages inventory
Updated conversion Status for SQL nodes in Hive that don't need conversion
Operators - numeric expressions
Function expressions
Multiple expressions
Name expressions and literals
Parsing improvments in SparkSQL:
DESCRIBE TABLE Clause
REFRESH Clause
Add the groupBy parameters in the analysis of org.apache.spark.sql.DataFrame
Improved the logging mechanism to indicate if the logs are only written when errors happened or if all messages were logged (introduced the DebugLogger to log all messages)
Updated the default value of Scala parser timeout from 150ms to 300ms
Updated SqlElementsInfo.csv entries to Direct status
Changed order in the SqlElementsInfo.csv
Updated the parsing error message shown when a SQL statement is not parsed
Statements without recovery are now added to Issues.csv
Changed SqlElements mapping status to Direct and Partial
Updated the fully qualified names for the following Spark elements in the conversion status file:
pyspark.sql.streaming.readwriter.DataStreamReader
pyspark.sql.streaming.readwriter.DataStreamWriter
pyspark.sql.streaming.query.StreamingQuery
Added the following Spark elements to the conversion status file as **NotSupported**:
pyspark.sql.streaming.readwriter.DataStreamReader.format
pyspark.sql.streaming.readwriter.DataStreamReader.table
pyspark.sql.streaming.readwriter.DataStreamWriter.partitionBy
pyspark.sql.streaming.query.StreamingQuery.awaitTermination
Removed the generation of the SummaryReport.docx, SummaryReport.html, and DetailedReport.html report files. Only the DetailedReport.docx will be generated.
Fixed
Fixed the issue of the SMA tool not detecting Python cells (%magic) in .scala notebooks
Fixed EWI comments not being added to the output code
Fixed processing of empty source cells present in Jupyter notebooks
Fixed categorization of Spark identified usages and data display in Spark API usage summary table.
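The magic-cell fix above concerns notebooks whose primary language is Scala but that contain Python cells marked with a magic command. A minimal sketch of how such cells might be detected (illustrative only; the SMA's actual detection logic is not documented here):

```python
# Illustrative sketch: detect a Python magic cell inside a .scala notebook
# by inspecting the first non-blank line for a %python / %%python marker.
def is_python_magic_cell(cell_source: str) -> bool:
    stripped = cell_source.lstrip()
    if not stripped:
        return False
    first_line = stripped.splitlines()[0]
    return first_line.startswith("%python") or first_line.startswith("%%python")

print(is_python_magic_cell("%python\nprint('hello')"))  # True
print(is_python_magic_cell("val x = 1"))                # False
```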
April 19, 2024
Application Version 1.0.4
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.42.1
Spark Conversion Core Version 2.42.1
Added
ThirdPartyLibrary to Report Additional Third Party Library Indicator.
Added Transform for Hive Set Statement.
Removed warning related to Unsupported .hql files in Symbol Table Loader for Python.
Added Transform for Hive Drop Table Statement.
Added ConversionBeginTaskBase and refactored tasks.
Added Transform for session.read("query", qry) to session.sql(qry).
Added handling for ImplicitImports node from JsonObjects.
Updated the parsing errors mechanism to avoid commenting out files with parsing errors.
Updated reporting mechanism to generate empty SQL reports when no SQL is found.
Updated the status conversion for the nodes (Create statements) that do not need conversion for Hive Inventory.
Updated the status conversion for the nodes that do not need conversion for Hive Inventory.
Changed EWI SPRKHVSQL1004 to indicate 'Information from underlying data files cannot be recovered' instead of 'Purge removed from DROP TABLE statement', and changed the DROP TABLE transformation to add EWI SPRKHVSQL1004 when the PURGE statement is not present.
Collapsed SqlNames and SqlJoins in the SQL Usages Inventory.
Updated several SQL statements with status and transformations:
Nodes related to MERGE.
Nodes with INSERT, ALTER, DROP TABLE, and CTEs.
Nodes with CREATE statements for tables, functions, and views.
Direct transformations for SqlSelect and related nodes.
Added support for DBC implicit imports.
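The session.read("query", qry) to session.sql(qry) transform listed above can be sketched as a simple rewrite. A regex-based version is shown purely for illustration; the actual converter operates on a syntax tree, not on text:

```python
import re

# Illustrative sketch of the session.read("query", qry) -> session.sql(qry)
# rewrite; a real converter works on a parsed syntax tree, not regex.
def rewrite_read_query(line: str) -> str:
    return re.sub(
        r'session\.read\(\s*"query"\s*,\s*([^)]+)\)',
        r'session.sql(\1)',
        line,
    )

print(rewrite_read_query('df = session.read("query", qry)'))
# df = session.sql(qry)
```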
Fixed
Updated the parsing errors mechanism to avoid commenting out notebook cells with parsing errors.
Updated the CallFunction parse rule to check for a backslash versus a newline, avoiding a parsing error when a return statement has an identifier and the next statement is a deconstructed tuple assignment.
Fixed an issue that caused the Import Calls section of the reports to calculate incorrect percentage values.
Fixed issue related to not generating the detailed report.
Fixed EWI SPRKHVSQL1004 not being added to DROP TABLE transformation.
Fixed a parsing error involving a return statement with an identifier followed by a deconstructed tuple assignment.
Fixed an issue that caused the Issues.csv and the notifications.pam files to not show the line, column, and file id of the files with parsing errors.
Fixed the text about ranges of readiness score.
March 19, 2024
Application Version 1.0.4
Feature Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.40.1
Spark Conversion Core 2.40.1
Added
Parsing support for HiveQL including support for HiveSql files (.hql)
Removed the import for
snowpark-extensions
in Python
Logo updated in the Detailed Report
Ignored files are now noted in the Detailed Report
SQL elements calculator and SQL elements table added to the detailed report
Added transformation for
WHEN NOT MATCHED BY SOURCE
when multiple match conditions exist
Site-packages, pip, dist, venv, and hidden directories now excluded from processing by the SMA
Renamed Supported to IsSnowparkAnacondaSupported in the Import Usages spreadsheet
Added SQL elements to the SqlElementsInfo.csv catalog
Added a new column named Flavor to the SqlElementsInfo.csv inventory to distinguish between SparkSQL and HiveQL
Added parsing errors for SQL code to the Issues.csv file
New EWIs added for
org.apache.spark.sql.functions.split
related parameter errors
36 additional RDD elements added to the core mapping table (currently listed as unsupported)
Transformation and conversion support for:
org.apache.spark.sql.types.StructField
org.apache.spark.sql.functions.translate
org.apache.spark.sql.Builder.enableHiveSupport
pyspark.sql.functions.split
org.apache.spark.sql.functions.split
Adjusted the replacer for
pyspark.sql.functions.unix_timestamp
Fixed
Modified the source concatenation process to ensure that magic commands are kept distinct. Now, strings are concatenated continuously until a magic command is encountered, at which point each magic command is handled separately.
Removed newlines when printing single-line SQL
Path for the generation of assessment zip files has been corrected
Corrected unnecessary imports of
org.apache.spark.sql.Dataset
Conversion now removes Apache Spark imports that remain after migration
March 18, 2024
Application Version 1.0.0
Feature Updates include:
New Snowpark Migration Accelerator logo.
Improved Assessment reports.
Updated Spark (Scala and Python) Conversion Core: 2.33.0
Spark Conversion Core 2.33.0
Added
Added additional inventory elements to the core mapping tables (currently, listed as not supported):
Not supported Pandas cases in the pandas mappings
Added ML, Streaming, and Blank not supported cases
Updated custom EWIs for Micro-partition, clustering, and streaming cases
February 12, 2024
Application Version 0.38.0
Feature Updates include:
Automatic license provisioning: you can now request a new SMA license directly from the app and receive it by email.
Updated Spark (Scala and Python) Conversion Core: 2.29.0
Spark Conversion Core 2.29.0
Added
Added SQL elements inventory
Reports are no longer filtered by readiness score or Snowflake user
Group Import Call Summary table in Assessment Report by package
Added support for Snowpark API versions:
Snowpark API version 1.10.0 on Python
Snowpark API version 1.9.0 on Python
Snowpark API version 1.8.0 on Python
Added/Updated mappings for:
Pyspark
pyspark.sql.functions.pandas_udf
pyspark.sql.group.GroupedData.pivot
pyspark.sql.functions.unix_timestamp
Scala
Multiple scenarios of
contains
functions, including
org.apache.spark.sql.Column.contains(scala.Any)
org.apache.spark.sql.types.StructField.name
org.apache.spark.sql.types.StructField.fields
org.apache.spark.sql.function.array_agg
Collection of Pandas data:
Created Inventory for Pandas Usages
Supported Pandas at ConversionStatus
Added Pandas Information in reports
Generates assessment zip file
Support for parsing of an empty interpolation scenario (${})
Updated examples of the DetailedReport template in Appendix A for Python and Scala
Avoid adding hardcoded credentials to SparkConf transformation
Add JSON inventory conversion logic to code processor
Fixed
Fixed inconsistencies in the notebook sizing by language table
Fixed issue with try/except in sprocs creation
Exclude internal imports in Assessment Report and add origin to import inventory
Improve EWI message for parsing errors
Fixed an error with missing .map files in Scala
Fixed missing file type summary for other code extensions
Fixed parsing errors for methods named 'match'.
Fixed an error that omitted some files in the File Sizing table
Removed useless statements after removal of not required functions
Fix replacer to remove unsupported clearCache function
Fix parsing for *args and **kwargs with backslash
Fix scenario where the alias of a column with brackets was removed during transformation due to bad resolution
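The *args and **kwargs fix above concerns explicit backslash line continuations in function signatures. A contrived but valid example of the construct whose parsing was fixed (the function itself is hypothetical):

```python
# A contrived but valid example of the construct whose parsing was fixed:
# an explicit backslash continuation in a signature mixing *args and **kwargs.
def combine(*args, \
            **kwargs):
    return len(args) + len(kwargs)

print(combine(1, 2, x=3))  # 3
```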
November 27, 2023
The tool's name has changed from SnowConvert for Spark to the Snowpark Migration Accelerator (SMA).
Application Version 0.33.1
Feature Updates include:
Name Change: SnowConvert for Spark -> Snowpark Migration Accelerator (SMA)
Updated Spark (Scala and Python) Conversion Core: 2.20.0
Trial Mode Enabled
Code Compare
See the Code Compare section of the documentation.
Updated assessment report in the UI
Walk through the updated assessment report in the application.
Updated support email available: sma-support@snowflake.com
Spark Conversion Core 2.20.0
Added
Add support to convert from Databricks to Jupyter (.dbc -> .ipynb)
Add line number of the error when there is a parsing error
Add company written by the user to the execution info in the assessment summary
Add mappings for:
org.apache.spark.sql.Column.contains(scala.Any)
Add needed data to display detailed report info in the Desktop tool reports
Updates to the assessment JSON file to accommodate the detailed assessment report
DataFrames saved as tables using a Hive format are now converted to not be specific to Hive
Add automated generation of stored procedures for Spark entry points
Add preprocess step in Python files to identify combination of spaces and tabs, and normalize them with spaces to prevent parsing errors
Inventories uploaded to telemetry even if the tool crashes
Adjust new tool name (Snowpark Migration Accelerator) in DOCX and HTML reports to accommodate the rebranding
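The space/tab preprocess step above can be sketched as a small normalization pass: replace tabs in leading whitespace with spaces so that mixed indentation parses consistently. This is a minimal sketch under the assumption of a four-space tab width; the SMA's actual normalization rule may differ:

```python
# Minimal sketch of the indentation preprocess: expand leading tabs to spaces
# so files mixing tabs and spaces parse without errors. Tab width of 4 is an
# assumption, not the SMA's documented behavior.
def normalize_indentation(source: str, tab_width: int = 4) -> str:
    out_lines = []
    for line in source.splitlines():
        stripped = line.lstrip("\t ")
        indent = line[: len(line) - len(stripped)]
        out_lines.append(indent.replace("\t", " " * tab_width) + stripped)
    return "\n".join(out_lines)

mixed = "def f():\n\tif True:\n\t    return 1"
print(normalize_indentation(mixed))
```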
Fixed
Fix Import call summary table in the report not matching the total value
Fix timeout issue in application for StructType with multiple fields
Fix indentation scenarios that do not require normalization in Scala
Fix 'Load Symbol Table' crash when the base class is not defined
Fix an issue causing the 'Python File Sizing' and 'Scala File Sizing' tables in the reports to display wrong values
Fix tool getting stuck when processing SQL files in Scala
November 09, 2023
Application Version 0.27.5
Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.16.0
Update to the license request mechanism inside the application.
Spark Conversion Core 2.16.0
Updates include:
Add support for DataFrame alias at joins for Spark Scala.
Import Call Summary table in Assessment Report truncated and ordered.
Turn off by default the condensed file ID feature.
November 02, 2023
Application Version 0.26.0
Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.14.0
The logger mechanism has been updated.
October 25, 2023
Application Version 0.25.11
Updates include:
Updated Spark (Scala and Python) Conversion Core: 2.14.0
Improved crash report flow
Fixes in Code Compare component
The button “View Reports” was changed to open the expected folder
Spark Conversion Core 2.14.0
Updates include:
Add condensed ID for filenames and use it in the log.
Refactor output folder hierarchy of the TrialMode.
Generate Reports locally in Assessment mode when the score hits 90 or higher.
Generate Reports locally in Assessment mode when it's a Snowflake user.
Create inventories as .csv files.
Move inventories to the Reports folder.
October 19, 2023
Version 0.25.6 (Oct 19, 2023)
Included SnowConvert Core Versions
Fixes
Inconsistencies with Spark-supported file extensions
CLI Terms and Conditions and Show Access Code options
Visual fixes
Features
SnowConvert Client separation
Version 0.24.0 (Oct 04, 2023)
Included SnowConvert Core Versions
Fixes
Conversion settings persistence in project files.
Inconsistencies in SQL Assessment and Conversion reports were fixed.
Features
Feature Flags for CLI
Version 0.20.3 (Sept 14, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.2.63
Oracle
Teradata
SQLServer
Scala Conversion Core 2.6.0
Python Conversion Core 2.6.0
Features
Analyzing sub-folders and Converting sub-folders are now available.
Include the Disable topological level reorder flag as part of the Teradata conversion settings.
Fixes
Conversion finished successfully but reported a crashed status.
SQL Server schema was set to PUBLIC automatically.
Missing generic scanner files on Spark/Python assessment.
Updated EULA.
Version 0.19.7 (Sept 7, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.2.48
Oracle
Teradata
SQLServer
Scala Conversion Core 2.5.0
Python Conversion Core 2.5.0
Version 0.19.1 (Sept 4, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.2.30
Oracle
Teradata
SQLServer
Scala Conversion Core 2.4.0
Python Conversion Core 2.4.0
Fixes
Changed default Conversion Rate on Reports to Lines of Code Conversion Rate.
Fixed issues with the list of Recently opened projects.
Fixed issue when trying to open an invalid .snowct file
Version 0.17.0 (Aug 24, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.2.9
Oracle
Teradata
SQLServer
Scala Conversion Core 2.3.31
Python Conversion Core 2.3.31
Fixes
Assessment Conversion settings on the correct platforms.
Input Folder validations.
Creating a project with an existing name in the input folder blocked the application.
Version 0.16.1 (Aug 21, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.0.47
Oracle
Teradata
SQLServer
Scala Conversion Core 2.3.31
Python Conversion Core 2.3.31
Fixes
A unified CLI version is now available.
Fix displayed data on SQL Conversion reports.
Open recent project issues when starting a new project.
Assessment settings.
Version 0.15.2 (Aug 17, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.0.47
Oracle
Teradata
SQLServer
Scala Conversion Core 2.3.31
Python Conversion Core 2.3.31
Fixes
An auto-update issue with the x64 version for macOS. (Requires manual reinstallation).
Fix links displayed in report pages.
Minor updates in texts and labels.
Version 0.14.5 (Aug 10, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.0.32
Oracle
Teradata
SQLServer
Scala Conversion Core 2.3.31
Python Conversion Core 2.3.31
Hotfix change for Snowpark Engines.
Version 0.14.1 (Aug 9, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.0.32
Oracle
Teradata
SQLServer
Scala Conversion Core 2.3.22
Python Conversion Core 2.3.22
Fixes
Fixed visual bugs on reports.
Changes on the Request an Access Code page
Rename the access-code field on the .snowct files.
Don't create empty output folders.
Version 0.13.1 (Aug 3, 2023)
Included SnowConvert Core Versions
SQL Conversion Core: 22.0.17
Oracle
Teradata
SQLServer
Scala Conversion Core 2.3.22
Python Conversion Core 2.3.22
Fixes
Improvements in Assessment and Conversion Reports
Updates in the reports layouts.
Collapsible sections.
Order in Card Components.
Version 0.11.7 (July 27, 2023)
Included SnowConvert Core Versions
Fixes
Fixed Conversion Rate by LoC.
Added % to SQL LoC Conversion Rate
Output path validation was added in the report viewer.
Telemetry can be disabled once a valid license is selected.
Version 0.11.3 (July 19, 2023)
Included SnowConvert Core Versions
Fixes
Conversion settings reset after changing the current step.
Minor visual improvements.
Wording changes.
Version 0.9.2 (July 12, 2023)
Included SnowConvert Core Versions
Fixes
Included preview header.
Minor visual improvements.
Version 0.8.2 (July 10, 2023)
Included SnowConvert Core Versions
Fixes
Reset the timer on the progress bar in alerts.
Fixed styles on displayed alert notifications.
Added preview banner on application header.
Improved exception handling mechanism.
Version 0.7.6 (July 03, 2023)
Included SnowConvert Core Versions
Fixes
Updated notarization tool.
Fix the conversion rate issue when using conversion settings.
Fix the open new project flow after opening an old project.
Remove the .mobilize folder from outputs.
Improve alerts and notifications.
Windows certificate naming issue. (Requires manual reinstallation).
Version 0.6.1 (June 23, 2023)
Included SnowConvert Core Versions
Fixes
Sign Windows binaries with Snowflake certificates.
Fixed issue when creating a new project after opening an existing one.
Minor styling and wording improvements.
Version 0.4.1 (June 21, 2023)
Included SnowConvert Core Versions
Fixes
The report information did not display the correct information.
Keep the conversion failed status when reopening the project.
Update texts and documentation links.
Version 0.3.0 (June 16, 2023)
Included SnowConvert Core Versions
Fixes
Added tool version in error logs.
Included custom installation wizard for Windows version.
Assessment report tables not processing numbers with commas.
The code signing certificate was changed. This affects the OTA update; manual installation of this version is required.
Version 0.2.9 (June 15, 2023)
Included SnowConvert Core Versions
Fixes
Missing information in telemetry reports
Fix the auto-saving issue with .snowct project files.
Telemetry enabled for conversion flows.
Error is shown when trying to convert without supported files.