Regex functions

Translation specification for the regular expresions from Teradata to their Snowflake equivalents

Description

Both Teradata and Snowflake offer support for functions that apply regular expressions over varchar inputs. See the Teradata documentation and Snowflake documentation for more details.

REGEXP_SUBSTR(source. regexp [, position, occurrence, match])
REGEXP_REPLACE(source. regexp [, replace_string, position, occurrence, match])
REGEXP_INSTR(source. regexp [, position, occurrence, return_option, match])
REGEXP_SIMILAR(source. regexp [, match])
REGEXP_SPLIT_TO_TABLE(inKey. source. regexp, match)

Sample Source Patterns

Setup data

Teradata

CREATE TABLE regexpTable
(
    col1 CHAR(35)
);

INSERT INTO regexpTable VALUES('hola');

Snowflake

CREATE TABLE regexpTable
(
    col1 CHAR(35)
);

INSERT INTO regexpTable VALUES('hola');

Regex transformation example

Teradata

SELECT
REGEXP_REPLACE(col1,'.*(h(i|o))','ha', 1, 0, 'x'),
REGEXP_SUBSTR(COL1,'.*(h(i|o))', 2, 1, 'x'),
REGEXP_INSTR(COL1,'.*(h(i|o))',1, 1, 0, 'x'),
REGEXP_SIMILAR(COL1,'.*(h(i|o))', 'x')
FROM regexpTable;

Snowflake

SELECT
--** MSC-WARNING - MSCEWI2021 - VALUE 'x' FOR PARAMETER 'match_arg' IS NOT SUPPORTED IN SNOWFLAKE **
REGEXP_REPLACE(col1, '.*(h(i|o))', 'ha', 1, 0),
--** MSC-WARNING - MSCEWI2021 - VALUE 'x' FOR PARAMETER 'match_arg' IS NOT SUPPORTED IN SNOWFLAKE **
REGEXP_SUBSTR(COL1, '.*(h(i|o))', 2, 1),
--** MSC-WARNING - MSCEWI2021 - VALUE 'x' FOR PARAMETER 'match_arg' IS NOT SUPPORTED IN SNOWFLAKE **
REGEXP_INSTR(COL1, '.*(h(i|o))', 1, 1, 0),
--** MSC-WARNING - MSCEWI2021 - VALUE 'x' FOR PARAMETER 'match_arg' IS NOT SUPPORTED IN SNOWFLAKE **
REGEXP_LIKE(COL1, '.*(h(i|o))')
FROM
PUBLIC.regexpTable;

Known Issues

1. Snowflake only supports POSIX regular expressions

The user will be warned when SnowConvert finds a non-POSIX regular expression.

2. Teradata "match_arg" has some options unsupported in Snowflake

The options 'l' and 'x' have no counterpart in Snowflake and the user will be warned if SnowConvert finds them.

3. Fixed size of the CHAR datatype may cause different behavior

Some regex functions in Teradata will try to match the whole column of CHAR datatype in a table even if some of the characters in the column were left empty due to a smaller string being inserted. In Snowflake this does not happen because the CHAR datatype is of variable size.

4. REGEXP_SPLIT_TO_TABLE not supported

The function is currently not supported by Snowflake.

  1. MSCEWI2020: Only Posix regex supported.

  2. MSCEWI2021: Unsupported use of the 'match_arg' parameter.

  3. MSCEWI2031: The result may differ due to the char type having a fixed length in Teradata.

  4. MSCEWI1031: Function not supported.

Last updated