CREATE EXTERNAL TABLE

Description

External Tables defines a new table using a Data Source. (Spark SQL Language Reference CREATE DATASOURCE TABLE)

CREATE TABLE [ IF NOT EXISTS ] table_identifier
[ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ]
USING data_source
[ OPTIONS ( key1=val1, key2=val2, ... ) ]
[ PARTITIONED BY ( col_name1, col_name2, ... ) ]
[ CLUSTERED BY ( col_name3, col_name4, ... ) 
    [ SORTED BY ( col_name [ ASC | DESC ], ... ) ] 
    INTO num_buckets BUCKETS ]
[ LOCATION path ]
[ COMMENT table_comment ]
[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ]
[ AS select_statement ]

The CREATE EXTERNAL TABLE statement from Spark/Databricks will be transformed to a CREATE EXTERNAL TABLE statement from Snowflake; however, this transformation requires user intervention.

In order to complete the transformation performed by SnowConvert, it is necessary to define a Storage Integration, an External Stage, and (optionally) a Notification Integration that have access to the external source where files are located. Please refer to the following guides on how to set up the connection for each provider:

Important considerations for the transformations shown on this page:

  • The @EXTERNAL_STAGE placeholder must be replaced with the external stage created after following the previous guide.

  • It is assumed that the external stage will point to the root of the bucket. This is important to consider because the PATTERN clause generated for each table specifies the file/folder paths starting at the base of the bucket, defining the external stage pointing to a different location in the bucket might produce undesired behavior.

  • The AUTO_REFRESH = FALSE clause is generated to avoid errors. Please note that automatic refresh of external table metadata is only valid if your Snowflake account cloud provider and the bucket provider are the same, and a Notification Integration was created.

Sample Source Patterns

Create External Table with explicit column list

When the column list is provided, SnowConvert will automatically generate the AS expression column options for each column in order to extract the file values.

IN -> spark_01.sql
CREATE EXTERNAL TABLE IF NOT EXISTS external_table
(
  order_id int,
  date string,
  client_name string,
  total float
)
USING AVRO
LOCATION 'gs://sc_external_table_bucket/folder_with_avro/orders.avro';
OUT -> spark_01.sql
CREATE EXTERNAL TABLE IF NOT EXISTS external_table
(
  order_id int AS CAST(GET_IGNORE_CASE($1, 'order_id') AS int),
  date string AS CAST(GET_IGNORE_CASE($1, 'date') AS string),
  client_name string AS CAST(GET_IGNORE_CASE($1, 'client_name') AS string),
  total float AS CAST(GET_IGNORE_CASE($1, 'total') AS float)
)
!!!RESOLVE EWI!!! /*** SSC-EWI-0032 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS gs:, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
FILE_FORMAT = (TYPE = AVRO)
PATTERN = '/sc_external_table_bucket/folder_with_avro/orders.avro'
COMMENT = '{ "origin": "sf_sc", "name": "snowconvert", "version": {  "major": 0,  "minor": 0,  "patch": "0" }, "attributes": {  "component": "spark",  "convertedOn": "06/18/2025",  "domain": "no-domain-provided" }}';

CREATE EXTERNAL TABLE without an explicit column list

When the column list is not provided, BigQuery automatically detects the schema of the columns from the file structure. To replicate this behavior, SnowConvert will generate a USING TEMPLATE clause that makes use of the INFER_SCHEMA function to generate the column definitions.

Since the INFER_SCHEMA function requires a file format to work, SnowConvert will generate a temporary file format for this purpose. This file format is only required when running the CREATE EXTERNAL TABLE statement, and it will be automatically dropped when the session ends.

IN -> spark_02.sql
CREATE EXTERNAL TABLE IF NOT EXISTS external_table_No_Columns
using AVRO
LOCATION 'gs://sc_external_table_bucket/folder_with_avro/orders.avro';
OUT -> spark_02.sql
CREATE OR REPLACE TEMPORARY FILE FORMAT SC_HIVE_FORMAT_ORDERS_NO_COLUMNS_FORMAT
TYPE = AVRO;
CREATE EXTERNAL TABLE IF NOT EXISTS hive_format_orders_No_Columns USING TEMPLATE (
SELECT
  ARRAY_AGG(OBJECT_CONSTRUCT('COLUMN_NAME', COLUMN_NAME, 'TYPE', TYPE, 'NULLABLE', NULLABLE, 'EXPRESSION', EXPRESSION))
FROM
  --** SSC-FDM-0035 - THE INFER_SCHEMA FUNCTION REQUIRES A FILE PATH WITHOUT WILDCARDS TO GENERATE THE TABLE TEMPLATE, REPLACE THE FILE_PATH PLACEHOLDER WITH IT **
  TABLE(INFER_SCHEMA(LOCATION => '@EXTERNAL_STAGE/FILE_PATH', FILE_FORMAT => 'SC_HIVE_FORMAT_ORDERS_NO_COLUMNS_FORMAT'))
)
!!!RESOLVE EWI!!! /*** SSC-EWI-0032 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS gs:, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
FILE_FORMAT = (TYPE = AVRO)
PATTERN = '/sc_external_table_bucket/folder_with_avro/orders.avro'
COMMENT = '{ "origin": "sf_sc", "name": "snowconvert", "version": {  "major": 0,  "minor": 0,  "patch": "0" }, "attributes": {  "component": "spark",  "convertedOn": "06/18/2025",  "domain": "no-domain-provided" }}';

CREATE EXTERNAL TABLE using Hive format

The creation of External Tables using Hive Format is also supported. They will have an FDM added informing the user that inserting into those tables is not supported.

IN -> spark_03.sql
CREATE EXTERNAL TABLE IF NOT EXISTS External_table_hive_format
(
  order_id int,
  date string,
  client_name string,
  total float
)
stored as AVRO
LOCATION 'gs://sc_external_table_bucket/folder_with_avro/orders.avro';
OUT -> spark_02.sql
--** SSC-FDM-HV0001 - INSERTING VALUES INTO AN EXTERNAL TABLE IS NOT SUPPORTED IN SNOWFLAKE **
CREATE EXTERNAL TABLE IF NOT EXISTS hive_format_orders_Andres
(
  order_id int AS CAST(GET_IGNORE_CASE($1, 'order_id') AS int),
  date string AS CAST(GET_IGNORE_CASE($1, 'date') AS string),
  client_name string AS CAST(GET_IGNORE_CASE($1, 'client_name') AS string),
  total float AS CAST(GET_IGNORE_CASE($1, 'total') AS float)
)
!!!RESOLVE EWI!!! /*** SSC-EWI-0032 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS gs:, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
FILE_FORMAT = (TYPE = AVRO)
PATTERN = '/sc_external_table_bucket/folder_with_avro/orders.avro'
COMMENT = '{ "origin": "sf_sc", "name": "snowconvert", "version": {  "major": 0,  "minor": 0,  "patch": "0" }, "attributes": {  "component": "spark",  "convertedOn": "06/18/2025",  "domain": "no-domain-provided" }}';

Known Issues

1. External tables with unsupported file formats

Snowflake supports the following Spark formats:

  • CSV

  • PARQUET

  • ORC

  • XML

  • JSON

  • AVRO

Other formats will be marked as not supported.

2. Unsupported table options

Some table options are not supported by Snowconvert and are marked with an EWI.

IN -> spark_04.sql
CREATE EXTERNAL TABLE IF NOT EXISTS hive_format_orders_Andres
(
  order_id int,
  date string,
  client_name string,
  total float
)
using AVRO
LOCATION 'gs://sc_external_table_bucket/folder_with_avro/orders.avro'
Tblproperties (
    'unsupported_table_option' = 'value'
);
OUT -> spark_02.sql
CREATE EXTERNAL TABLE IF NOT EXISTS hive_format_orders_Andres
(
  order_id int AS CAST(GET_IGNORE_CASE($1, 'order_id') AS int),
  date string AS CAST(GET_IGNORE_CASE($1, 'date') AS string),
  client_name string AS CAST(GET_IGNORE_CASE($1, 'client_name') AS string),
  total float AS CAST(GET_IGNORE_CASE($1, 'total') AS float)
)
    !!!RESOLVE EWI!!! /*** SSC-EWI-0032 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS gs:, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
    LOCATION = @EXTERNAL_STAGE
    AUTO_REFRESH = false
    PATTERN = '/sc_external_table_bucket/folder_with_avro/orders.avro'
    FILE_FORMAT = (TYPE = AVRO)
    !!!RESOLVE EWI!!! /*** SSC-EWI-0016 - SNOWFLAKE DOES NOT SUPPORT THE OPTIONS: 'UNSUPPORTED_TABLE_OPTION'. ***/!!!
    TBLPROPERTIES (
  'unsupported_table_option' = 'value'
    )
    COMMENT = '{ "origin": "sf_sc", "name": "snowconvert", "version": {  "major": 0,  "minor": 0,  "patch": "0" }, "attributes": {  "component": "spark",  "convertedOn": "06/19/2025",  "domain": "no-domain-provided" }}';
  1. SSC-EWI-0029: External table data format not supported in Snowflake

  2. SSC-EWI-0032: External table requires an external stage to access an external location, define and replace the EXTERNAL_STAGE placeholder

  3. SSC-FDM-0034: The INFER_SCHEMA function requires a file path without wildcards to generate the table template, replace the FILE_PATH placeholder with it

  4. SSC-EWI-0016: Snowflake does not support the options clause.

  5. SSC-FDM-HV0001: Inserting values into an external table is not supported in Snowflake

Last updated