Skewed By

Create table option

Description

The SKEWED BY clause in the CREATE TABLE statement is used to handle skewed data distributions more efficiently. Skewed data occurs when a few values in one or more columns account for a disproportionate share of the rows, which leads to an uneven distribution of data across the underlying storage. This can cause performance problems, especially during query processing. For more information, refer to Skewed Tables.

The SKEWED BY clause is not supported by Snowflake.

Grammar Syntax

skewed_by:
    SKEWED BY (col_name, ...)
    ON ((col_value, ...), ...)
    [STORED AS DIRECTORIES]
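
The grammar above also covers skew on more than one column, where each entry in the ON list is a tuple with one value per skewed column. The following Hive statement is a minimal sketch of that form; the table name, column names, and values are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table: skew is declared on the column pair (col1, col2).
-- Each tuple in the ON list supplies one value per skewed column, in order.
CREATE TABLE table2 (
    col1 STRING,
    col2 INT,
    col3 STRING
)
SKEWED BY (col1, col2)
ON (('test1', 1), ('test2', 2))
STORED AS DIRECTORIES;
```

With a single skewed column, as in the sample source below, the values can be listed directly without the inner parentheses.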

Sample Source

Hive

CREATE TABLE table1 (
    col1 STRING 
)
SKEWED BY (COL1)
ON ('test1', 'test2')
STORED AS DIRECTORIES;

Snowflake

CREATE TABLE table1 (
    col1 STRING 
) 
----** MSC-HVXXXX: THE SKEWED BY CLAUSE IS NOT SUPPORTED IN SNOWFLAKE.
--SKEWED BY (COL1)
--ON ('test1', 'test2')
--STORED AS DIRECTORIES
;

Known Issues

  1. In Snowflake, data distribution and optimization are handled automatically, so there is no need to specify clauses such as SKEWED BY. Snowflake uses a multi-cluster, shared data architecture and automatically partitions, distributes, and manages data for optimal query performance. For more information, refer to Clustering Keys & Clustered Tables.

  2. MSC-HVXXXX: THE SKEWED BY CLAUSE IS NOT SUPPORTED IN SNOWFLAKE.
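
The first issue above points to clustering keys as the related Snowflake feature. As a hedged sketch only, the following shows how a clustering key could be declared manually on the sample table after conversion. This is an assumption about a possible follow-up step, not output of the conversion and not a direct equivalent of SKEWED BY; a clustering key is optional and is generally worth considering only for very large tables.

```sql
-- Assumption: manual follow-up on the converted table, not generated by the conversion.
-- Option A: declare the clustering key when creating the table.
CREATE TABLE table1 (
    col1 STRING
)
CLUSTER BY (col1);

-- Option B: add or change the clustering key on an existing table.
ALTER TABLE table1 CLUSTER BY (col1);

-- Inspect how well the table is clustered on col1.
SELECT SYSTEM$CLUSTERING_INFORMATION('table1', '(col1)');
```

Options A and B are alternatives; use whichever matches the state of the table. Whether a clustering key helps depends on table size and query patterns, so it is best evaluated against the actual workload.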
