Generic Inventories

Additional data not specific to a particular source language

When the Snowpark Migration Accelerator (SMA) runs, it uses two scanners: one specific to the language of the source code, and a generic scanner that pulls basic information about the files and keywords present in the codebase. The data generated by the source-specific scanner is described in the SMA Inventories section of this documentation. The data generated by the generic scanner is described on this page.

Some of these files have a .pam extension, but they are still comma-separated files, like a .csv. Some entries are also duplicated across these files, because the data is organized differently in each file to support analysis.
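Because the .pam files are plain comma-separated text, any CSV reader can load them. The following is a minimal sketch in Python using only the standard library. The helper name read_inventory is hypothetical, and the sketch assumes each file starts with a header row and is UTF-8 encoded, which this page does not confirm.

```python
import csv
from pathlib import Path


def read_inventory(path):
    """Load an inventory file (.csv or .pam) as a list of dicts keyed by column name.

    Assumptions (not confirmed by the SMA documentation): the first row is a
    header row and the file is UTF-8 encoded.
    """
    # newline="" lets the csv module handle line endings and quoted fields itself.
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

The file extension plays no role here: the same helper can be pointed at files.pam or FilesInventory.csv. All values come back as strings, so numeric columns need an explicit conversion before arithmetic.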

File Summary

The files.pam file contains an inventory of each file present in that execution of the tool, including its file type and size. This file is a duplicate of the files.csv described in the SMA Inventories section of this documentation.

Generic File Inventory

The FilesInventory.csv file contains the categorization of each file and its line counts.

  • Filename: includes both the filename and path from the root input directory

  • Extension: the file’s extension

  • Technology: the source file’s technology based on extension

  • Status: “OK” if the tool could identify the file (unidentified files won’t be listed so this is the only value)

  • isBinary: TRUE if the file is a binary, FALSE if not, UNKNOWN if the tool does not recognize the extension

  • Bytes: size of the file in bytes

  • ContentType: whether the line of code is Code, Comment, Blank, or Other (if the tool does not recognize the line type)

  • ContentLines: total count of code lines in that file

  • CommentLines: total count of comment lines in that file

  • BlankLines: total count of blank lines in that file
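As an illustration of how these columns might be used, the sketch below totals the three line-count columns per technology. It assumes the rows have already been loaded as dictionaries keyed by the column names listed above (for example with csv.DictReader); the function name lines_by_technology is hypothetical and not part of the SMA.

```python
from collections import defaultdict

# Line-count columns taken from the FilesInventory.csv column list above.
LINE_COLUMNS = ("ContentLines", "CommentLines", "BlankLines")


def lines_by_technology(rows):
    """Sum the line-count columns of FilesInventory rows, grouped by Technology."""
    totals = defaultdict(lambda: dict.fromkeys(LINE_COLUMNS, 0))
    for row in rows:
        bucket = totals[row["Technology"]]
        for column in LINE_COLUMNS:
            # CSV values arrive as strings; an empty cell is treated as 0 (an assumption).
            bucket[column] += int(row[column] or 0)
    return dict(totals)
```

Grouping by Technology rather than Extension keeps related extensions together, though either column would work as the grouping key.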

Keyword Counts

The KeywordCounts.csv file lists every keyword found in each file, by technology. This is not limited to source languages supported by the SMA; it can be any source language that the generic scanner is able to read.

  • FileId: file where the keyword was found and the relative path to that file.

  • Technology: the source technology for this file

  • Keyword: text that corresponds to a keyword. Ex: from, import, DataFrame, etc.

  • Count: the number of times that keyword shows up in a single line.
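One possible use of this inventory is to find the most frequent keywords for each technology. The sketch below assumes the rows are dictionaries keyed by the column names above; top_keywords is a hypothetical helper name, not something the SMA provides.

```python
from collections import Counter, defaultdict


def top_keywords(rows, n=3):
    """Return the n most frequent keywords per Technology from KeywordCounts rows."""
    per_technology = defaultdict(Counter)
    for row in rows:
        # Count is read from CSV as text, so convert it before adding.
        per_technology[row["Technology"]][row["Keyword"]] += int(row["Count"])
    return {tech: counts.most_common(n) for tech, counts in per_technology.items()}
```

Each technology maps to a list of (keyword, total) pairs, highest total first.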

Lines Inventory

The line_counts.pam file describes whether a line in a scanned file has code, comments, or is blank, and gives a count of each one.

  • FileId: the filename.

  • LineKind: what type of text is in the line (code, comment, or blank)

  • Count: the number of lines for that FileId and LineKind
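Since this file holds one row per FileId and LineKind, it can be convenient to pivot it into one record per file. The sketch below assumes rows keyed by the three column names above; pivot_line_counts is a hypothetical helper name, and the lower-casing of LineKind is a defensive assumption about the casing in the file.

```python
def pivot_line_counts(rows):
    """Turn (FileId, LineKind, Count) rows into {FileId: {LineKind: Count}}."""
    table = {}
    for row in rows:
        kinds = table.setdefault(row["FileId"], {})
        # Normalize the kind so "Code" and "code" land in the same bucket.
        kind = row["LineKind"].strip().lower()
        kinds[kind] = kinds.get(kind, 0) + int(row["Count"])
    return table
```

A file with no rows for a given kind simply lacks that key, so callers may want to read values with .get(kind, 0).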

Tool Execution Inventory

The tool_execution.pam file has some basic information about this run of the SMA tool. It is a duplicate of the tool_execution.csv file described in the SMA Inventories section of this documentation.

Word Counts

The word_counts.pam file reports the count of a given keyword in each file in the scanned codebase.

  • FileId: file where the keyword was found and the relative path to that file.

  • Keyword: text that corresponds to a keyword. Ex: from, import, DataFrame, etc.

  • Count: the number of times that keyword shows up in a single line.
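As a small example of querying this file, the sketch below lists the files that contain a given keyword, ordered by how often it appears. It assumes rows keyed by the column names above and an exact, case-sensitive keyword match; files_using is a hypothetical helper name.

```python
def files_using(rows, keyword):
    """Return (FileId, total count) pairs for files containing the keyword, most frequent first."""
    hits = {}
    for row in rows:
        if row["Keyword"] == keyword:
            hits[row["FileId"]] = hits.get(row["FileId"], 0) + int(row["Count"])
    # Sort by descending count, then by FileId so ties come out in a stable order.
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))
```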
