Normalize

Translation reference to convert Teradata Normalize functionality to Snowflake

Description

NORMALIZE specifies that period values in the first-period column that meet or overlap are combined to form a period that encompasses the individual period values. For more information about Normalize clause, see the Teradata documentation.

Sample Source Patterns

Some parts in the output code are omitteed for clarity reasons.

Sample data

IN -> Teradata_01.sql
CREATE TABLE project (
    emp_id INTEGER,
    project_name VARCHAR(20),
    dept_id INTEGER,
    duration PERIOD(DATE)
);

INSERT INTO project
VALUES
    (
        10,
        'First Phase',
        1000,
        PERIOD(DATE '2010-01-10', DATE '2010-03-20')
    );

INSERT INTO project
VALUES
    (
        10,
        'First Phase',
        2000,
        PERIOD(DATE '2010-03-20', DATE '2010-07-15')
    );

INSERT INTO project
VALUES
    (
        10,
        'Second Phase',
        2000,
        PERIOD(DATE '2010-06-15', DATE '2010-08-18')
    );

INSERT INTO project
VALUES
    (
        20,
        'First Phase',
        2000,
        PERIOD(DATE '2010-03-10', DATE '2010-07-20')
    );

INSERT INTO project
VALUES
    (
        20,
        'Second Phase',
        1000,
        PERIOD(DATE '2020-05-10', DATE '2020-09-20')
    );
OUT -> Teradata_01.sql
CREATE OR REPLACE TABLE project (
    emp_id INTEGER,
    project_name VARCHAR(20),
    dept_id INTEGER,
    duration VARCHAR(24) !!!RESOLVE EWI!!! /*** SSC-EWI-TD0053 - SNOWFLAKE DOES NOT SUPPORT THE PERIOD DATATYPE, ALL PERIODS ARE HANDLED AS VARCHAR INSTEAD ***/!!!
)
COMMENT = '{"origin":"sf_sc","name":"snowconvert","version":{"major":1, "minor":0},{"attributes":{"component":"teradata"}}'
;

INSERT INTO project
VALUES (
10,
        'First Phase',
        1000, PUBLIC.PERIOD_UDF(DATE '2010-01-10', DATE '2010-03-20') !!!RESOLVE EWI!!! /*** SSC-EWI-TD0053 - SNOWFLAKE DOES NOT SUPPORT THE PERIOD DATATYPE, ALL PERIODS ARE HANDLED AS VARCHAR INSTEAD ***/!!!);

INSERT INTO project
VALUES (
10,
        'First Phase',
        2000, PUBLIC.PERIOD_UDF(DATE '2010-03-20', DATE '2010-07-15') !!!RESOLVE EWI!!! /*** SSC-EWI-TD0053 - SNOWFLAKE DOES NOT SUPPORT THE PERIOD DATATYPE, ALL PERIODS ARE HANDLED AS VARCHAR INSTEAD ***/!!!);

INSERT INTO project
VALUES (
10,
        'Second Phase',
        2000, PUBLIC.PERIOD_UDF(DATE '2010-06-15', DATE '2010-08-18') !!!RESOLVE EWI!!! /*** SSC-EWI-TD0053 - SNOWFLAKE DOES NOT SUPPORT THE PERIOD DATATYPE, ALL PERIODS ARE HANDLED AS VARCHAR INSTEAD ***/!!!);

INSERT INTO project
VALUES (
20,
        'First Phase',
        2000, PUBLIC.PERIOD_UDF(DATE '2010-03-10', DATE '2010-07-20') !!!RESOLVE EWI!!! /*** SSC-EWI-TD0053 - SNOWFLAKE DOES NOT SUPPORT THE PERIOD DATATYPE, ALL PERIODS ARE HANDLED AS VARCHAR INSTEAD ***/!!!);

INSERT INTO project
VALUES (
20,
        'Second Phase',
        1000, PUBLIC.PERIOD_UDF(DATE '2020-05-10', DATE '2020-09-20') !!!RESOLVE EWI!!! /*** SSC-EWI-TD0053 - SNOWFLAKE DOES NOT SUPPORT THE PERIOD DATATYPE, ALL PERIODS ARE HANDLED AS VARCHAR INSTEAD ***/!!!);

Normalize Clause

Suppose you want to use Normalize clause with the employee id.

IN -> Teradata_02.sql
SELECT
    NORMALIZE emp_id,
    duration
FROM
    project;
OUT -> Teradata_02.sql
!!!RESOLVE EWI!!! /*** SSC-EWI-TD0079 - THE REQUIRED PERIOD TYPE COLUMN WAS NOT FOUND ***/!!!
WITH NormalizeCTE AS
(
    SELECT
        T1.*,
        SUM(GroupStartFlag)
        OVER (
        PARTITION BY
            emp_id, duration
        ORDER BY
            PeriodColumn_begin
        ROWS UNBOUNDED PRECEDING) GroupID
    FROM
        (
            SELECT
                emp_id,
                duration,
                PUBLIC.PERIOD_BEGIN_UDF(PeriodColumn) PeriodColumn_begin,
                PUBLIC.PERIOD_END_UDF(PeriodColumn) PeriodColumn_end,
                (CASE
                    WHEN PeriodColumn_begin <= LAG(PeriodColumn_end)
                    OVER (
                    PARTITION BY
                        emp_id, duration
                    ORDER BY
                        PeriodColumn_begin,
                        PeriodColumn_end)
                        THEN 0
                    ELSE 1
                END) GroupStartFlag
            FROM
                project
        ) T1
)
SELECT
    emp_id,
    duration,
    PUBLIC.PERIOD_UDF(MIN(PeriodColumn_begin), MAX(PeriodColumn_end))
FROM
    NormalizeCTE
GROUP BY
    emp_id,
    duration,
    GroupID;

Known Issues

Normalize clause can use ON MEETS OR OVERLAPS, ON OVERLAPS or ON OVERLAPS OR MEETS, for these cases SnowConvert will add an error that this translation is planned for the future.

IN -> Teradata_03.sql
SELECT NORMALIZE ON MEETS OR OVERLAPS emp_id, duration FROM table1;
OUT -> Teradata_03.sql
SELECT
       !!!RESOLVE EWI!!! /*** SSC-EWI-0073 - PENDING FUNCTIONAL EQUIVALENCE REVIEW FOR 'NORMALIZE SET QUANTIFIER' NODE ***/!!!
       NORMALIZE ON MEETS OR OVERLAPS emp_id,
duration FROM
table1;
  1. SSC-EWI-0073: Pending Functional Equivalence Review.

  2. SSC-EWI-TD0079: The required period type column was not found.

  3. SSC-EWI-TD0053: Snowflake does not support the period datatype, all periods are handled as varchar instead.

Last updated