A subquery is a query within another query. Subqueries in a FROM or WHERE clause are used to provide data that will be used to limit or compare/evaluate the data returned by the containing query. (Snowflake subqueries documentation).
Subqueries can be correlated/uncorrelated as well as scalar/non-scalar.
Correlated subqueries reference columns from the outer query. In Snowflake, correlated subqueries execute for each row in the query. On the other hand, Uncorrelated subqueries do not reference the outer query and are executed once for the entire query.
Scalar subqueries return a single value as result, otherwise the subquery is non-scalar.
The following patterns are based on these categories.
Snowflake evaluates correlated subqueries at compile time to determine if they are scalar and therefore valid in the context were a single return value is expected. To solve this, the ANY_VALUE aggregate function is added to the returned column when the result is not an aggregate function. This allows the compiler to determine a single value return is expected. Since scalar subqueries are expected to return a single value the function ANY_VALUE will not change the result, it will just return the original value as is.
Teradata
SELECT col2
FROM tableA
WHERE col1 = (SELECT col3 FROM tableB WHERE col2 = col4);
+------+
| col2 |
+------+
| Hey |
+------+
Snowflake
SELECT
col2
FROM
tableA
WHERE col1 =
--** SSC-FDM-0002 - CORRELATED SUBQUERIES MAY HAVE SOME FUNCTIONAL DIFFERENCES. **
(
SELECT
ANY_VALUE(col3) FROM
tableB
WHERE
RTRIM( col2) = RTRIM(col4));
Non-scalar subqueries specified inside subquery operators (ANY/ALL/IN/EXISTS) are supported.
Non-scalar subqueries used as derived tables are also supported.
Teradata
SELECT col2
FROM tableA
WHERE col1 IN (SELECT col3 FROM tableB);
SELECT col2
FROM tableA
WHERE col1 >= ALL(SELECT col3 FROM tableB);
SELECT col2, myDerivedTable.col4
FROM tableA, (SELECT * FROM tableB) AS myDerivedTable
WHERE col1 = myDerivedTable.col3;
SELECT
col2
FROM
tableA
WHERE col1 IN (
SELECT
col3 FROM
tableB
);
SELECT
col2
FROM
tableA
WHERE col1 >= ALL(
SELECT
col3 FROM
tableB
);
SELECT
col2,
myDerivedTable.col4
FROM
tableA, (
SELECT
* FROM
tableB
) AS myDerivedTable
WHERE col1 = myDerivedTable.col3;
1. Subqueries with FETCH first that are not uncorrelated scalar
Oracle allows using the FETCH clause in subqueries, Snowflake only allows using this clause if the subquery is uncorrelated scalar, otherwise an exception will be generated.
SnowConvert will mark any inalid usage of FETCH in subqueries with SSC-EWI-0108
Oracle:
-- Correlated scalar
SELECT col2
FROM tableA
WHERE col2 = (SELECT col4 FROM tableB WHERE col3 = col1 FETCH FIRST ROW ONLY);
-- Uncorrelated scalar
SELECT col2
FROM tableA
WHERE col2 = (SELECT col4 FROM tableB FETCH FIRST ROW ONLY);
Snowflake:
-- Correlated scalar
SELECT col2
FROM
tableA
WHERE col2 =
--** SSC-FDM-0002 - CORRELATED SUBQUERIES MAY HAVE SOME FUNCTIONAL DIFFERENCES. **
!!!RESOLVE EWI!!! /*** SSC-EWI-0108 - THE FOLLOWING SUBQUERY MATCHES AT LEAST ONE OF THE PATTERNS CONSIDERED INVALID AND MAY PRODUCE COMPILATION ERRORS ***/!!! (SELECT
ANY_VALUE( col4) FROM
tableB
WHERE col3 = col1
FETCH FIRST 1 ROW ONLY);
-- Uncorrelated scalar
SELECT col2
FROM
tableA
WHERE col2 = (SELECT col4 FROM
tableB
FETCH FIRST 1 ROW ONLY);
Related EWIs
SSC-FDM-0002: Correlated subquery may have functional differences
SSC-EWI-0108: The following subquery matches at least one of the patterns considered invalid and may produce compilation errors