REGEXP LIKE UDF
Some parts in the output code are omitted for clarity reasons.
Description
REGEXP_LIKE
performs regular expression matching. This condition evaluates strings using characters as defined by the input character set. (Oracle Language Regerence REGEXP_LIKE Condition)
Oracle REGEXP_LIKE
and Snowflake REGEXP_LIKE
condition, have some functional differences, to minimize these differences and replicate Oracle REGEXP_LIKE
function better, we added a custom UDF. The main idea is to escape the backslash symbol from the regular expression where it is required. These are the special characters that need to be escaped when they come with a backslash: 'd', 'D', 'w', 'W', 's', 'S', 'A', 'Z', 'n'
. Also, the backreference expression (matches the same text as most recently matched by the "number specified" capturing group) needs to be escaped.
Custom UDF overloads
REGEXP_LIKE_UDF(string, string)
Parameters
COL: is the character expression that serves as the search value.
PATTERN: is the regular expression.
Oracle
Snowflake
REGEXP_LIKE_UDF(string, string, string)
Parameters
COL: is the character expression that serves as the search value.
PATTERN: is the regular expression.
MATCHPARAM: is a character expression that let's change the default matching behavior of the condition. In the following table, there are the Oracle characters with their description and their equivalent in the UDF.
Match Parameter | Description | UDF Equivalent |
---|---|---|
'i' | Specifies case-insensitive matching, even if the determined collation of the condition is case-sensitive. | 'i' |
'c' | Specifies case-sensitive and accent-sensitive matching, even if the determined collation of the condition is case-insensitive or accent-insensitive. | Does not have an equivalent. It is being removed from the parameter.. |
'n' | Allows the period (.), which is the match-any-character wildcard character, to match the newline character. If you omit this parameter, then the period does not match the newline character. | 's' |
'm' | Treats the source string as multiple lines. Oracle interprets | 'm' |
'x' | Ignores whitespace characters. By default, whitespace characters match themselves. | Does not have an equivalent. It is being removed from the parameter. |
Oracle
Snowflake
Known Issues
1. UDF match parameter may not behave as expected
Due to all the characters available in the Oracle match parameter does not have their equivalent in the user-defined function, the query result may have some functional differences compared to Oracle.
2. UDF pattern parameter does not allow only '\\' as a regular expression
If as a pattern parameter the regular expression used is only '\\' an exception will be thrown like this: JavaScript execution error: Uncaught SyntaxError: Invalid regular expression: //: \ at end of pattern in REGEXP_LIKE_UDF at 'return COL.match(new RegExp(PATTERN));' position 17 stackstrace: REGEXP_LIKE_UDF
Last updated