SELECT

Important Notice: Migration of Documentation Website

Please be advised that our documentation website is currently undergoing a migration to a new platform. To ensure you have access to the most up-to-date information, we kindly request that you visit our new documentation website located at:

Official Snowflake Snowconvert Documentation

For any immediate assistance or if you encounter any issues, please contact our support team at [email protected].

Thank you for your understanding.

Applies to

Hive SQL
Spark SQL
Databricks SQL

Description

Spark supports a SELECT statement and conforms to the ANSI SQL standard. Queries are used to retrieve result sets from one or more tables. (Spark SQL Language Reference SELECT)

This grammar is partially supported in Snowflake. Translation pending for these CREATE VIEW elements:

[ SORT BY { expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ... ] } ]
[ CLUSTER BY { expression [ , ... ] } ]
[ DISTRIBUTE BY { expression [, ... ] } ]
[ WINDOW { named_window [ , WINDOW named_window, ... ] } ]
[ PIVOT clause ]
[ UNPIVOT clause ]
[ LATERAL VIEW clause ] [ ... ]
[ regex_column_names ]
[ TRANSFORM (...) ]
[ LIMIT non_literal_expression ]

from_item :=
join_relation
table_value_function
LATERAL(subquery)
file_format.`file_path`

select_statement { INTERSECT | EXCEPT } { ALL | DISTINCT } select_statement

Grammar Syntax

[ WITH with_query [ , ... ] ]
select_statement [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select_statement, ... ]
    [ ORDER BY { expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ... ] } ]
    [ SORT BY { expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ... ] } ]
    [ CLUSTER BY { expression [ , ... ] } ]
    [ DISTRIBUTE BY { expression [, ... ] } ]
    [ WINDOW { named_window [ , WINDOW named_window, ... ] } ]
    [ LIMIT { ALL | expression } ]

select_statement :=
SELECT [ hints , ... ] [ ALL | DISTINCT ] { [ [ named_expression | regex_column_names ] [ , ... ] | TRANSFORM (...) ] }
    FROM { from_item [ , ... ] }
    [ PIVOT clause ]
    [ UNPIVOT clause ]
    [ LATERAL VIEW clause ] [ ... ] 
    [ WHERE boolean_expression ]
    [ GROUP BY expression [ , ... ] ]
    [ HAVING boolean_expression ]
    
with_query :=
expression_name [ ( column_name [ , ... ] ) ] [ AS ] ( query )

from_item :=
table_relation |
join_relation |
table_value_function |
inline_table |
LATERAL(subquery) |
file_format.`file_path`

Sample Source Patterns

GROUP BY

The WITH { CUBE | ROLLUP } syntax is transformed to its CUBE(expr1, ...) or ROLLUP(expr1, ...) equivalent

Input Code:

IN -> Databricks_01.sql

-- Basic case of GROUP BY
SELECT id, sum(quantity) FROM dealer GROUP BY 1;

-- Grouping by GROUPING SETS
SELECT city, car_model, sum(quantity) AS sum FROM dealer
    GROUP BY GROUPING SETS ((city, car_model), (city), (car_model), ());

-- Grouping by ROLLUP
SELECT city, car_model, sum(quantity) AS sum FROM dealer
    GROUP BY ROLLUP(city, car_model);

SELECT city, car_model, sum(quantity) AS sum FROM dealer
    GROUP BY city, car_model WITH ROLLUP;

-- Grouping by CUBE
SELECT city, car_model, sum(quantity) AS sum FROM dealer
    GROUP BY CUBE(city, car_model);

SELECT city, car_model, sum(quantity) AS sum FROM dealer
    GROUP BY city, car_model WITH CUBE;

Output Code:

OUT -> Databricks_01.sql

-- Basic case of GROUP BY
SELECT id,
    SUM(quantity) FROM
    dealer
GROUP BY 1;

-- Grouping by GROUPING SETS
SELECT city, car_model,
    SUM(quantity) AS sum FROM
    dealer
    GROUP BY GROUPING SETS ((city, car_model), (city), (car_model), () !!!RESOLVE EWI!!! /*** SSC-EWI-0073 - PENDING FUNCTIONAL EQUIVALENCE REVIEW FOR 'EmptyGroupingSet' NODE ***/!!!);

-- Grouping by ROLLUP
SELECT city, car_model,
    SUM(quantity) AS sum FROM
    dealer
    GROUP BY
    ROLLUP(city, car_model);

SELECT city, car_model,
    SUM(quantity) AS sum FROM
    dealer
GROUP BY
    ROLLUP(city, car_model);

-- Grouping by CUBE
SELECT city, car_model,
    SUM(quantity) AS sum FROM
    dealer
    GROUP BY CUBE(city, car_model) !!!RESOLVE EWI!!! /*** SSC-EWI-0073 - PENDING FUNCTIONAL EQUIVALENCE REVIEW FOR 'CUBE' NODE ***/!!!;

SELECT city, car_model,
    SUM(quantity) AS sum FROM
    dealer
GROUP BY
    CUBE(city, car_model);

Hints

Snowflake performs automatic optimization of JOINs and partitioning, meaning that hints are unnecessary, they are preserved as comments in the output code.

Input Code:

IN -> Databricks_02.sql

SELECT
/*+ REBALANCE */ /*+ COALESCE(2) */
*
FROM my_table;

Output Code:

OUT -> Databricks_02.sql

SELECT
/*+ REBALANCE */ /*+ COALESCE(2) */
*
FROM
my_table;

CTE

The AS keyword is optional in Spark/Databricks, however in Snowflake is required so it is added.

Input Code:

IN -> Databricks_03.sql

WITH my_cte (
   SELECT id, name FROM my_table
)
SELECT *
FROM my_cte
WHERE id = 1;

Output Code:

OUT -> Databricks_03.sql

WITH my_cte AS (
     SELECT id, name FROM
        my_table
  )
SELECT *
FROM
     my_cte
WHERE id = 1;

LIMIT

LIMIT ALL is removed as it is not needed in Snowflake, LIMIT with a literal value is preserved as-is.

Input Code:

IN -> Databricks_04.sql

SELECT * FROM my_table LIMIT ALL;

SELECT * FROM my_table LIMIT 5;

Output Code:

OUT -> Databricks_04.sql


SELECT * FROM
my_table;

SELECT * FROM
my_table
LIMIT 5;