
Curated Reports

Reporting to guide you on the road to a successful migration


The Snowpark Migration Accelerator (SMA) builds several reports that combine information from the detailed data generated by the assessment. Those reports are listed here.

The detailed spreadsheets with the inventoried elements from the assessment are listed on the next couple of pages.

Detailed Report

The DetailedReport.html file has been deprecated since Spark Conversion Core V2.43.0.

Note: This page covers all sections of the detailed report as it appears in the document file.

The detailed report is the primary report generated by the SMA. This report contains multiple sections.

Below are the sections of the assessment report with descriptions of each section:

The first page in the detailed report has a brief description of the SMA tool.

This page has one subsection:

  • Execution Summary: The execution summary shows the organization name and email address that you entered on the project creation page, along with a unique ID for each execution of the SMA (which will appear often in the inventories section), a timestamp, and version information for both the SMA and the Snowpark API.

Readiness Scores Summary

The readiness scores summary starts on the next page. It provides the readiness scores for the Spark API and for third-party libraries, with detailed information on how to interpret them. These scores are simple metrics used to show the user how "ready" a codebase is for Snowflake. Note that most of this information is also available in the assessment summary presented in the application itself.

This section also provides more information for each readiness score (a small worked example follows the list):

  • Spark API: This gives you the count of references (usages) to the Spark API and the count that are ready for conversion. The readiness score is [usages ready for conversion] / [identified usages].

  • Third Party Libraries: This gives you the percentage of imported third-party libraries that are categorized as supported in Snowflake. The readiness score is [third-party imports supported in Snowflake] / [all third-party imports].
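
To make the arithmetic concrete, here is a minimal sketch of how these two ratios work. The function and the counts are illustrative only, not SMA source code:

```python
# Minimal sketch of the readiness ratios described above.
# The counts below are hypothetical, not SMA output.

def readiness_score(ready_count: int, total_count: int) -> float:
    """Return a readiness percentage, guarding against an empty workload."""
    return 100.0 * ready_count / total_count if total_count else 0.0

# Spark API: [usages ready for conversion] / [identified usages]
spark_score = readiness_score(ready_count=3120, total_count=3470)

# Third-party libraries: [supported imports] / [all third-party imports]
library_score = readiness_score(ready_count=28, total_count=35)

print(f"Spark API readiness:        {spark_score:.2f}%")    # 89.91%
print(f"Third-party lib readiness:  {library_score:.2f}%")  # 80.00%
```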

File Summary

The file summary starts on the next page. Depending on the number of unique filetypes present in this execution of the tool, this section may span multiple pages in the report.

  • File Type Summary: Gives the count of files for each technology that was recognized, the total lines of code in those files, and the percentage of all files that belong to that technology.

  • File Extension Summary: Gives the count of each file extension that was recognized, the total lines of code in those files, and the percentage of all files that match that file extension.

  • Code File Sizing: Gives a "t-shirt" sizing for the code files present in this execution of the SMA. The size buckets are given in the table, along with the count of files in each bucket and the percentage of all code files that fall into that bucket.

  • Notebook Stats by Language: A count of the lines of code and cells belonging to each technology across all scanned notebooks.

  • Notebook Sizing by Language: Gives a "t-shirt" sizing for each notebook file based on the lines of code present in that file. The notebook "type" (Python, Scala, or SQL) is determined by the count of cells written in each of those languages. The sizing is determined as follows (a sketch of this bucketing appears after the list):

    • XS - less than 50 lines of code

    • S - between 50 and 200 lines of code

    • M - between 200 and 500 lines of code

    • L - between 500 and 1000 lines of code

    • XL - greater than 1000 lines of code
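
The bucketing above can be read as a simple lookup. The function below is a hedged sketch of that logic; the SMA's exact boundary handling may differ, since the bucket edges listed above overlap:

```python
# Illustrative sketch of the notebook "t-shirt" sizing buckets
# described above; not the SMA's actual implementation.

def notebook_size(lines_of_code: int) -> str:
    if lines_of_code < 50:
        return "XS"
    if lines_of_code < 200:
        return "S"
    if lines_of_code < 500:
        return "M"
    if lines_of_code < 1000:
        return "L"
    return "XL"

for loc in (10, 150, 350, 800, 2500):
    print(loc, notebook_size(loc))  # XS, S, M, L, XL
```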

Spark API Summary

The Spark API Summary is a deeper dive into what makes up the readiness score presented in the Readiness Scores section. There are four tables in this section: the first summarizes which files have Spark API references, the second summarizes what is supported or unsupported, the third breaks the readiness score down by Spark API category, and the fourth breaks it down by mapping status.

Each table uses the concepts of supported and unsupported usages (references to the Spark API). To be clear on the definition of supported vs. unsupported:

  • Supported: The SMA knows of a conversion or workaround that can take the listed API element to the Snowpark API.

  • Unsupported: The SMA does not know of a conversion or workaround that can take the listed API element to the Snowpark API, or it does not recognize the element. This does not mean that there is no conversion path forward. It simply means that the conversion cannot be automated.

  • Files with Spark Usages: This table breaks down the files containing Spark usages by technology, giving you an idea of how many Spark references are in the whole workload.

  • Files with Spark Usages by Support Status: This table gives the count of supported vs. unsupported usages in the source codebase, ordered by technology.

  • Spark API Usage Summaries: This table breaks the usages down by category of the Spark API. Each category shows the count of supported and unsupported usages for Python and Scala. The final reported value is the Spark API Readiness Score, which should match the value reported in the Readiness Scores section (a sketch of this rollup appears after the list).

  • Spark API Usage by Support Category: This breaks the count of references to the Spark API (usages) out by the mapping status or category that the tool defines. These are listed and described on the Spark Reference Categories page of this documentation.
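
To illustrate how the per-category table and the overall score relate, here is a hedged sketch of the rollup. The categories and usage data are hypothetical, not SMA inventories:

```python
# Hypothetical rollup of Spark API usages by category
# (illustrative data only, not SMA output).
from collections import defaultdict

usages = [  # (category, is_supported)
    ("DataFrame", True), ("DataFrame", True), ("DataFrame", False),
    ("RDD", False), ("Functions", True), ("Functions", True),
]

counts = defaultdict(lambda: {"supported": 0, "unsupported": 0})
for category, supported in usages:
    counts[category]["supported" if supported else "unsupported"] += 1

for category, c in sorted(counts.items()):
    print(f"{category:10} supported={c['supported']} unsupported={c['unsupported']}")

# The overall readiness score is the supported share of all usages.
total_supported = sum(c["supported"] for c in counts.values())
print(f"Spark API readiness score: {100 * total_supported / len(usages):.2f}%")  # 66.67%
```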

Pandas API Usage Summary

Note that the Pandas API Usage Summary appears only for executions that include Python files.

Much like the Spark API Summary shown above, the Pandas API Summary lists the references to the Pandas API.

  • Files with Pandas Usages: This table breaks down the files containing Pandas usages by technology, giving you an idea of how many Pandas references are in the whole workload.

  • Pandas API Usages Summary: This table is ordered by the Pandas library found in the source codebase and gives you an idea of how many times a specific Pandas library has been used.

Import Reference Summary

This section shows anything imported into a file in the codebase. These could be third-party libraries or other elements imported into any file in the codebase. This table should exclude imports of other files in the workload.

The table shows the imported packages, whether each package is supported in the Anaconda distribution in Snowpark, the count of how many times it is imported (likely correlated with the number of files using that import), and the percentage of all files with that import. Note that while the percent column shows a total value of 100%, the percent values above it do not necessarily add up to 100%, since multiple imports often occur in the same files. A worked example follows.
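
The sketch below shows why those per-import percentages can total well over 100%. The files and imports are hypothetical:

```python
# Why the per-import "percent of files" values need not sum to 100%:
# one file can contain several imports (hypothetical data).

files = {
    "etl.py":    {"pandas", "numpy"},
    "report.py": {"pandas"},
    "train.py":  {"numpy", "sklearn"},
}

for lib in sorted(set().union(*files.values())):
    pct = 100 * sum(lib in imports for imports in files.values()) / len(files)
    print(f"{lib:8} {pct:5.1f}% of files")
# numpy 66.7%, pandas 66.7%, sklearn 33.3% -> totals well over 100%
```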

SQL Reference Summary

  • SQL Usages by File Type: This table breaks down SQL usages by technology, giving you an idea of how many SQL files or SQL cells were identified in the whole workload.

  • SQL Usages by Support Status: This table is ordered by whether or not an equivalent exists in Snowflake.

Snowpark Migration Accelerator (SMA) Issue Summary

The SMA generates an issue each time it needs to report a warning, conversion error, or parsing error for the scanned codebase. Working through these issues is the basis for completing a successful migration using the Snowpark Migration Accelerator.

In this summary, each issue is listed along with its issue code (including a link to the documentation site with more information on that issue), the count of how many times the issue occurs in the workload, and its severity level.

The severity levels (Warning, Conversion Error, and Parsing Error) are each described, and a summary organized by severity level is also included.

As general advice, parsing errors should be resolved immediately, conversion errors should be resolved programmatically, and warnings should be noted and watched as the migration moves forward. For more detailed information on analyzing the issues, review the issue analysis section of this documentation.

Appendixes

There is currently only one appendix. This shows a description of each mapping status category.


This is the full detailed report. All of the information in the report comes from the inventory files generated by the SMA.

Looking for more information in the detailed report? Reach out to the SMA team at sma-support@snowflake.com.

The Summary Report has been deprecated since Spark Conversion Core V2.43.0.

These are the output reports generated by the SMA. Next up are the detailed spreadsheets available in the output.