SIMILAR TO
Pattern-matching conditions
Description
The SIMILAR TO operator matches a string expression, such as a column name, with a SQL standard regular expression pattern. A SQL regular expression pattern can include a set of pattern-matching metacharacters, including the two supported by the LIKE operator. (Redshift SQL Language reference SIMILAR TO).
Grammar Syntax
Pattern-matching metacharacters
Matches any sequence of zero or more characters. To achieve full equivalence in Snowflake, we need to replace the '%' operator with '.*' in the pattern.
Matches any single character. To achieve full equivalence in Snowflake, we need to replace the _
operator with .
and add the s
parameter to enable the POSIX wildcard character .
to match newline characters.
Denotes alternation. This case is fully supported in Snowflake.
Repeat the previous item zero or more times. This can have a different behavior when newline characters are included.
Repeat the previous item one or more times. This can have a different behavior when newline characters are included.
Repeat the previous item zero or one time. This can have a different behavior when newline characters are included.
Repeat the previous item exactly m times and it is fully supported in Snowflake.
Repeat the previous item at least m and not more than n times and it is fully supported in Snowflake.
Repeat the previous item m or more times and it is fully supported in Snowflake.
Parentheses group items into a single logical item and it is fully supported in Snowflake.
A bracket expression specifies a character class, just as in POSIX regular expressions.
Sample Source Patterns
Setup data
% : Matches any sequence of zero or more characters
Input Code:
Output Code:
_ : Matches any single character
Input Code:
Output Code:
| : Denotes alternation
Input Code:
Output Code:
{m, n} : Repeat the previous item exactly m times.
Input Code:
Output Code:
+ : Repeat the previous item one or more times
Input Code:
Output Code:
* : Repeat the previous item zero or more times
Input Code:
Output Code:
? : Repeat the previous item zero or one time
Input Code:
Output Code:
() : Parentheses group items into a single logical item
Input Code:
Output Code:
[...] : Specifies a character class
Input Code:
Output Code:
Escape characters
The following characters will be escaped if they appear in the pattern and are not the escape character itself:
.
$
^
Input Code:
Output Code:
Pattern stored in a variable
If these patterns are stored in a variable, the required adjustments for equivalence will not be applied. You can refer to the recommendations outlined in the table at the beginning of this document for additional equivalence guidelines.
Input Code:
Output Code:
Usage of collate columns
Arguments with COLLATE specifications are not currently supported in the RLIKE function. As a result, the COLLATE clause must be disabled to use this function. However, this may lead to differences in the results.
Input Code:
Output Code:
If you require equivalence for these scenarios, you can manually add the following parameters to the function to achieve functional equivalence:
Parameter
Description
c
Case-sensitive matching
i
Case-insensitive matching
Known Issues
The behavior of fixed char types may differ. Click here for more information.
The
RLIKE
function uses POSIX extended regular expressions, which may result in different behavior in certain cases, especially when line breaks are involved. It appears that when line breaks are present in the string and a match occurs on one line, it returns a positive result for the entire string, even though the match only occurred on a single line and not across the whole string. For example:
To achieve maximum equivalence, some modifications are made to the pattern operators.
If these patterns are stored in a variable, SnowConvert does not apply the necessary adjustments for equivalence.
Arguments with COLLATE specifications are not currently supported in the RLIKE function.
Related EWIs
SSC-FDM-0032: Parameter is not a literal value, transformation could not be fully applied.
SSC-FDM-PG0011: The use of the COLLATE column constraint has been disabled for this pattern-matching condition.
Last updated