Data Validation

SnowConvert AI, as part of the end-to-end experience, offers the option to validate the migrated data to Snowflake to ensure that it matches the data that was originally hosted in your database.

Data Validation Modes

The data validation feature employs two distinct validation levels, Schema Validations and Metrics Validations, to ensure your data is successfully migrated to Snowflake.

Schema Validations

Schema validation confirms that the basic structure of your migrated table is preserved in Snowflake. It validates the following table attributes:

  • Table name

  • Column names

  • Ordinal position of each column

  • Data types

  • Character maximum length for text columns

  • Numeric precision and scale for numeric columns

  • Row Count

Metrics Validations

Metrics validation focuses on verifying that the data itself matches the original source. This is achieved by comparing aggregate metrics between the original table and the new Snowflake table. While the specific metrics can vary by column data type, the feature generally evaluates the following:

  • Minimum value

  • Maximum value

  • Average

  • Nulls count

  • Distinct count

  • Standard deviation

  • Variance

How to use this feature


This feature is an optional step within the end-to-end SQL Server migration process. It provides data validations after you've used SnowConvert AI to move your data.

Prerequisites

This feature requires you to have installed Python ≥ 3.9 or <12 and have it available in your PATH

Usage

  1. When the Data migration step is finalized, a Data Validation Step will be available to verify that your data is correct.

Data validation step
  1. Once the Validate Data step initiates, it will verify that you have Python available and will set up a virtual environment to execute the Data Validation tool.

    Python environment setup

  2. If the Python environment is successful, it will initiate the data validation by performing the validation modes explained above.

Validating a table
  1. After it finishes, a report will be generated, and a summary table will be displayed so you can see the discrepancies in the table. Also, a CSV file report is generated so you can visualize and share it.

Generated report in SnowConvert AI

The validation results are classified into three categories:

Category
Descriptio

Values match exactly between the source database and Snowflake.

Snowflake table has minor differences (e.g. higher numeric precision) that don't impact data.

Values do not match between the original and Snowflake databases.

Finally, you can open the reports folder and get access to the generated CSV reports.

Generated reports in CSV format

Last updated