Regex functions

Translation specification for the regular expresions from Teradata to their Snowflake equivalents

Description

Both Teradata and Snowflake offer support for functions that apply regular expressions over varchar inputs. See the Teradata documentation and Snowflake documentation for more details.

REGEXP_SUBSTR(source. regexp [, position, occurrence, match])
REGEXP_REPLACE(source. regexp [, replace_string, position, occurrence, match])
REGEXP_INSTR(source. regexp [, position, occurrence, return_option, match])
REGEXP_SIMILAR(source. regexp [, match])
REGEXP_SPLIT_TO_TABLE(inKey. source. regexp, match)

Sample Source Patterns

Setup data

Teradata

IN -> Teradata_01.sql
CREATE TABLE regexpTable
(
    col1 CHAR(35)
);

INSERT INTO regexpTable VALUES('hola');

Snowflake

OUT -> Teradata_01.sql
CREATE TABLE regexpTable
(
    col1 CHAR(35)
)
COMMENT = '{"origin":"sf_sc","name":"snowconvert","version":{"major":1, "minor":0},{"attributes":{"component":"teradata"}}'
;

INSERT INTO regexpTable
VALUES ('hola');

Regex transformation example

Teradata

IN -> Teradata_02.sql
SELECT
REGEXP_REPLACE(col1,'.*(h(i|o))','ha', 1, 0, 'x'),
REGEXP_SUBSTR(COL1,'.*(h(i|o))', 2, 1, 'x'),
REGEXP_INSTR(COL1,'.*(h(i|o))',1, 1, 0, 'x'),
REGEXP_SIMILAR(COL1,'.*(h(i|o))', 'xl')
FROM regexpTable;

Snowflake

OUT -> Teradata_02.sql
--** SSC-FDM-0007 - MISSING DEPENDENT OBJECT "regexpTable" **
SELECT
REGEXP_REPLACE(col1, '.*(h(i|o))', 'ha', 1, 0),
REGEXP_SUBSTR(COL1, '.*(h(i|o))', 2, 1),
REGEXP_INSTR(COL1, '.*(h(i|o))', 1, 1, 0),
--** SSC-FDM-TD0016 - VALUE 'l' FOR PARAMETER 'match_arg' IS NOT SUPPORTED IN SNOWFLAKE **
REGEXP_LIKE(COL1, '.*(h(i|o))')
FROM
regexpTable;

Known Issues

1. Snowflake only supports POSIX regular expressions

The user will be warned when SnowConvert finds a non-POSIX regular expression.

2. Teradata "match_arg" option 'l' is unsupported in Snowflake

The option 'l' has no counterpart in Snowflake and the user will be warned if SnowConvert finds them.

3. Fixed size of the CHAR datatype may cause different behavior

Some regex functions in Teradata will try to match the whole column of CHAR datatype in a table even if some of the characters in the column were left empty due to a smaller string being inserted. In Snowflake this does not happen because the CHAR datatype is of variable size.

4. REGEXP_SPLIT_TO_TABLE not supported

The function is currently not supported by Snowflake.

  1. SSC-FDM-0007: Element with missing dependencies.

  2. SSC-FDM-TD0016: Value 'l' for parameter 'match_arg' is not supported in Snowflake.

Last updated