The CREATE EXTERNAL TABLE statement from BigQuery is transformed to a CREATE EXTERNAL TABLE statement in Snowflake; however, this transformation requires user intervention.
To complete the transformation performed by SnowConvert, it is necessary to define a Storage Integration, an External Stage, and (optionally) a Notification Integration that have access to the external location where the files are stored. Please refer to the corresponding guides on how to set up the connection for each cloud provider.
Important considerations for the transformations shown in this page:
The @EXTERNAL_STAGE placeholder must be replaced with the external stage created by following the previous guide.
It is assumed that the external stage points to the root of the bucket. This is important because the PATTERN clause generated for each table specifies file/folder paths starting at the base of the bucket; defining the external stage to point to a different location in the bucket might produce undesired behavior.
The AUTO_REFRESH = FALSE clause is generated to avoid errors. Note that automatic refresh of external table metadata is only valid when your Snowflake account's cloud provider and the bucket's provider are the same and a Notification Integration has been created.
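The snippet below is a minimal sketch of the Storage Integration and External Stage that the @EXTERNAL_STAGE placeholder expects; the integration name is an assumption and the bucket URL is taken from the examples on this page, so adjust both to your environment:
-- Minimal sketch (assumed names): integration and stage pointing to the bucket root
CREATE STORAGE INTEGRATION gcs_integration
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'GCS'
  ENABLED = TRUE
  STORAGE_ALLOWED_LOCATIONS = ('gcs://sc_external_table_bucket/');

CREATE STAGE EXTERNAL_STAGE
  URL = 'gcs://sc_external_table_bucket/'
  STORAGE_INTEGRATION = gcs_integration;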
Sample Source Patterns
CREATE EXTERNAL TABLE with explicit column list
When the column list is provided, SnowConvert automatically generates the AS expression column option for each column to extract the corresponding value from the file.
Input Code:
IN -> BigQuery_01.sql
CREATE OR REPLACE EXTERNAL TABLE test.Employees_test
(
Employee_id INTEGER,
Name STRING,
Mail STRING,
Position STRING,
Salary INTEGER
)
OPTIONS(
FORMAT='CSV',
SKIP_LEADING_ROWS=1,
URIS=['gs://sc_external_table_bucket/folder_with_csv/Employees.csv']
);
Output Code:
OUT -> BigQuery_01.sql
CREATE OR REPLACE EXTERNAL TABLE test.Employees_test
(
Employee_id INTEGER AS CAST(GET_IGNORE_CASE($1, 'c1') AS INTEGER),
Name STRING AS CAST(GET_IGNORE_CASE($1, 'c2') AS STRING),
Mail STRING AS CAST(GET_IGNORE_CASE($1, 'c3') AS STRING),
Position STRING AS CAST(GET_IGNORE_CASE($1, 'c4') AS STRING),
Salary INTEGER AS CAST(GET_IGNORE_CASE($1, 'c5') AS INTEGER)
)
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0015 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS gs://sc_external_table_bucket, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
PATTERN = 'folder_with_csv/Employees.csv'
FILE_FORMAT = (TYPE = CSV SKIP_HEADER =1);
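Because AUTO_REFRESH is disabled, the external table metadata has to be refreshed manually before the files become visible. A minimal usage sketch, assuming the @EXTERNAL_STAGE placeholder has already been replaced:
-- Register the files matched by the PATTERN clause, then query the table
ALTER EXTERNAL TABLE test.Employees_test REFRESH;

SELECT Employee_id, Name, Salary
FROM test.Employees_test;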
CREATE EXTERNAL TABLE without explicit column list
When the column list is not provided, BigQuery automatically detects the column schema from the file structure. To replicate this behavior, SnowConvert generates a USING TEMPLATE clause that uses the INFER_SCHEMA function to build the column definitions.
Since the INFER_SCHEMA function requires a file format to work, SnowConvert generates a temporary file format for this purpose. This file format is only required when running the CREATE EXTERNAL TABLE statement and is dropped automatically when the session ends.
Input Code:
IN -> BigQuery_02.sql
CREATE OR REPLACE EXTERNAL TABLE test.my_external_table_json
OPTIONS(
FORMAT='JSON',
URIS=['gs://sc_external_table_bucket/folder_with_json/Cars.jsonl']
);
Output Code:
OUT -> BigQuery_02.sql
CREATE OR REPLACE TEMPORARY FILE FORMAT SC_TEST_MY_EXTERNAL_TABLE_JSON_FORMAT
TYPE = JSON;
CREATE OR REPLACE EXTERNAL TABLE test.my_external_table_json USING TEMPLATE (
SELECT
ARRAY_AGG(OBJECT_CONSTRUCT('COLUMN_NAME', COLUMN_NAME, 'TYPE', TYPE, 'NULLABLE', NULLABLE, 'EXPRESSION', EXPRESSION))
FROM
TABLE(INFER_SCHEMA(LOCATION => '@EXTERNAL_STAGE/folder_with_json/Cars.jsonl', FILE_FORMAT => 'SC_TEST_MY_EXTERNAL_TABLE_JSON_FORMAT'))
)
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0015 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS gs://sc_external_table_bucket, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
PATTERN = 'folder_with_json/Cars.jsonl'
FILE_FORMAT = (TYPE = JSON);
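To preview the columns that the USING TEMPLATE clause will produce, the INFER_SCHEMA query can also be run on its own; this sketch assumes the @EXTERNAL_STAGE placeholder has been replaced and the temporary file format above still exists in the session:
-- Inspect the inferred column names, types, and extraction expressions
SELECT COLUMN_NAME, TYPE, NULLABLE, EXPRESSION
FROM TABLE(INFER_SCHEMA(
  LOCATION => '@EXTERNAL_STAGE/folder_with_json/Cars.jsonl',
  FILE_FORMAT => 'SC_TEST_MY_EXTERNAL_TABLE_JSON_FORMAT'));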
CREATE EXTERNAL TABLE with multiple URIs
When multiple source URIs are specified, they are joined into the regular expression of the PATTERN clause in Snowflake; any wildcard * characters are transformed to their .* equivalent.
Input Code:
IN -> BigQuery_03.sql
CREATE OR REPLACE EXTERNAL TABLE test.multipleFilesTable
(
Name STRING,
Code STRING,
Price NUMERIC,
Expiration_date DATE
)
OPTIONS(
format="CSV",
skip_leading_rows = 1,
uris=['gs://sc_external_table_bucket/folder_with_csv/Food.csv', 'gs://sc_external_table_bucket/folder_with_csv/other_products/*']
);
Output Code:
OUT -> BigQuery_03.sql
CREATE OR REPLACE EXTERNAL TABLE test.multipleFilesTable
(
Name STRING AS CAST(GET_IGNORE_CASE($1, 'c1') AS STRING),
Code STRING AS CAST(GET_IGNORE_CASE($1, 'c2') AS STRING),
Price NUMERIC AS CAST(GET_IGNORE_CASE($1, 'c3') AS NUMERIC),
Expiration_date DATE AS CAST(GET_IGNORE_CASE($1, 'c4') AS DATE)
)
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0015 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS gs://sc_external_table_bucket, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
PATTERN = 'folder_with_csv/Food.csv|folder_with_csv/other_products/.*'
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
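To check which files in the stage actually match the generated regex, the stage can be listed with the same pattern (a sketch, assuming the @EXTERNAL_STAGE placeholder has been replaced):
-- Files matched by the PATTERN regex
LIST @EXTERNAL_STAGE PATTERN = 'folder_with_csv/Food.csv|folder_with_csv/other_products/.*';

-- After the table is created and refreshed, the source file of each row can be inspected
SELECT DISTINCT METADATA$FILENAME
FROM test.multipleFilesTable;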
WITH CONNECTION clause
The WITH CONNECTION clause is removed because the connection information is already provided to Snowflake using the Storage Integration.
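The BigQuery source for this example is not shown in this section; the statement below is an illustrative input of the same shape, where the connection identifier is hypothetical:
CREATE OR REPLACE EXTERNAL TABLE test.songs_test
(
Name STRING,
Release_date INTEGER,
Songs INT,
Genre STRING
)
WITH CONNECTION `my_project.us.my_connection`
OPTIONS(
format = 'CSV',
encoding = 'UTF-8',
skip_leading_rows = 1,
field_delimiter = '|',
compression = 'GZIP',
uris = ['gs://sc_external_table_bucket/folder_with_csv/Albums.csv']
);
For a source like this, the generated output is: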
CREATE OR REPLACE EXTERNAL TABLE test.songs_test
(
Name STRING AS CAST(GET_IGNORE_CASE($1, 'c1') AS STRING),
Release_date INTEGER AS CAST(GET_IGNORE_CASE($1, 'c2') AS INTEGER),
Songs INT AS CAST(GET_IGNORE_CASE($1, 'c3') AS INT),
Genre STRING AS CAST(GET_IGNORE_CASE($1, 'c4') AS STRING)
)
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0015 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS gs://sc_external_table_bucket, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
PATTERN = 'folder_with_csv/Albums.csv'
FILE_FORMAT = (TYPE = CSV
ENCODING= 'UTF8' SKIP_HEADER =1
FIELD_DELIMITER='|'
COMPRESSION= GZIP);
Known Issues
1. CREATE EXTERNAL TABLE without explicit column list and CSV file format
Currently, Snowflake external tables do not support parsing the header of CSV files. When an external table with no explicit column list and a CSV file format is found, SnowConvert generates the SKIP_HEADER file format option to avoid runtime errors; however, this causes the table columns to receive the autogenerated names c1, c2, ..., cN.
An FDM is generated to warn that the header cannot be parsed and that the columns must be renamed manually to preserve the original names.
Input Code:
IN -> BigQuery_06.sql
CREATE OR REPLACE EXTERNAL TABLE test.my_external_table_csv
OPTIONS(
FORMAT='CSV',
URIS=['gs://sc_external_table_bucket/folder_with_csv/Employees.csv']
);
Output Code:
OUT -> BigQuery_06.sql
CREATE OR REPLACE TEMPORARY FILE FORMAT SC_TEST_MY_EXTERNAL_TABLE_CSV_FORMAT
TYPE = CSV
SKIP_HEADER = 1;
CREATE OR REPLACE EXTERNAL TABLE test.my_external_table_csv
--** SSC-FDM-BQ0005 - PARSING THE CSV HEADER IS NOT SUPPORTED IN EXTERNAL TABLES, COLUMNS MUST BE RENAMED TO MATCH THE ORIGINAL NAMES **
USING TEMPLATE (
SELECT
ARRAY_AGG(OBJECT_CONSTRUCT('COLUMN_NAME', COLUMN_NAME, 'TYPE', TYPE, 'NULLABLE', NULLABLE, 'EXPRESSION', EXPRESSION))
FROM
TABLE(INFER_SCHEMA(LOCATION => '@EXTERNAL_STAGE/folder_with_csv/Employees.csv', FILE_FORMAT => 'SC_TEST_MY_EXTERNAL_TABLE_CSV_FORMAT'))
)
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0015 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS gs://sc_external_table_bucket, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
PATTERN = 'folder_with_csv/Employees.csv'
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
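A possible manual fix, sketched under the assumption that the file has the same columns as the Employees.csv file from the first example, is to rewrite the generated statement with an explicit column list, reusing the AS expression pattern that SnowConvert emits when columns are provided:
CREATE OR REPLACE EXTERNAL TABLE test.my_external_table_csv
(
Employee_id INTEGER AS CAST(GET_IGNORE_CASE($1, 'c1') AS INTEGER),
Name STRING AS CAST(GET_IGNORE_CASE($1, 'c2') AS STRING),
Mail STRING AS CAST(GET_IGNORE_CASE($1, 'c3') AS STRING),
Position STRING AS CAST(GET_IGNORE_CASE($1, 'c4') AS STRING),
Salary INTEGER AS CAST(GET_IGNORE_CASE($1, 'c5') AS INTEGER)
)
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
PATTERN = 'folder_with_csv/Employees.csv'
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);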
2. External tables referencing Google Drive sources
Snowflake does not support reading data from files hosted in Google Drive. An FDM is generated to report this and to request that the files be uploaded to the bucket and accessed through the external stage.
The PATTERN clause will contain autogenerated placeholders FILE_PATH0, FILE_PATH1, ..., FILE_PATHN that must be replaced with the file/folder paths once the files have been moved to the external location.
Input Code:
IN -> BigQuery_07.sql
CREATE OR REPLACE EXTERNAL TABLE test.my_external_table_drive_test
OPTIONS(
FORMAT='JSON',
URIS=['https://drive.google.com/open?id=someFileId']
);
Output Code:
OUT -> BigQuery_07.sql
CREATE OR REPLACE TEMPORARY FILE FORMAT SC_TEST_MY_EXTERNAL_TABLE_DRIVE_TEST_FORMAT
TYPE = JSON;
CREATE OR REPLACE EXTERNAL TABLE test.my_external_table_drive_test USING TEMPLATE (
SELECT
ARRAY_AGG(OBJECT_CONSTRUCT('COLUMN_NAME', COLUMN_NAME, 'TYPE', TYPE, 'NULLABLE', NULLABLE, 'EXPRESSION', EXPRESSION))
FROM
--** SSC-FDM-BQ0008 - THE INFER_SCHEMA FUNCTION REQUIRES A FILE PATH WITHOUT WILDCARDS TO GENERATE THE TABLE TEMPLATE, REPLACE THE FILE_PATH PLACEHOLDER WITH IT **
TABLE(INFER_SCHEMA(LOCATION => '@EXTERNAL_STAGE/FILE_PATH', FILE_FORMAT => 'SC_TEST_MY_EXTERNAL_TABLE_DRIVE_TEST_FORMAT'))
)
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0015 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS A EXTERNAL LOCATION, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
--** SSC-FDM-BQ0006 - READING FROM GOOGLE DRIVE IS NOT SUPPORTED IN SNOWFLAKE, UPLOAD THE FILES TO THE EXTERNAL LOCATION AND REPLACE THE FILE_PATH PLACEHOLDERS **
PATTERN = 'FILE_PATH0'
FILE_FORMAT = (TYPE = JSON);
3. External tables with the GOOGLE_SHEETS file format
Snowflake does not support Google Sheets as a file format; however, its structure is similar to that of CSV files, which are supported by Snowflake.
When SnowConvert detects an external table using the GOOGLE_SHEETS format, it produces an external table with the CSV file format instead.
Since Google Sheets are stored in Google Drive, the files must be uploaded as CSV to the external location and their paths specified in the PATTERN clause, as mentioned in the previous issue.
Input Code:
IN -> BigQuery_08.sql
CREATE OR REPLACE EXTERNAL TABLE test.spreadsheetTable
(
Name STRING,
Code INTEGER,
Price INTEGER,
Expiration_date DATE
)
OPTIONS(
format="GOOGLE_SHEETS",
skip_leading_rows = 1,
uris=['https://docs.google.com/spreadsheets/d/someFileId/edit?usp=sharing']
);
Output Code:
OUT -> BigQuery_08.sql
--** SSC-FDM-BQ0007 - READING FROM SPREADSHEETS IS NOT SUPPORTED IN SNOWFLAKE, USE THE CSV FILE TYPE INSTEAD **
CREATE OR REPLACE EXTERNAL TABLE test.spreadsheetTable
(
Name STRING AS CAST(GET_IGNORE_CASE($1, 'c1') AS STRING),
Code INTEGER AS CAST(GET_IGNORE_CASE($1, 'c2') AS INTEGER),
Price INTEGER AS CAST(GET_IGNORE_CASE($1, 'c3') AS INTEGER),
Expiration_date DATE AS CAST(GET_IGNORE_CASE($1, 'c4') AS DATE)
)
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0015 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS A EXTERNAL LOCATION, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
--** SSC-FDM-BQ0006 - READING FROM GOOGLE DRIVE IS NOT SUPPORTED IN SNOWFLAKE, UPLOAD THE FILES TO THE EXTERNAL LOCATION AND REPLACE THE FILE_PATH PLACEHOLDERS **
PATTERN = 'FILE_PATH0'
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
4. External tables with unsupported file formats
Snowflake supports the following BigQuery formats:
BigQuery | Snowflake
AVRO | AVRO
CSV | CSV
GOOGLE_SHEETS | CSV
NEWLINE_DELIMITED_JSON | JSON
JSON | JSON
ORC | ORC
PARQUET | PARQUET
Other formats will be marked as not supported.
Input Code:
IN -> BigQuery_09.sql
CREATE OR REPLACE EXTERNAL TABLE test.backup_restore_table
OPTIONS (
format = 'DATASTORE_BACKUP',
uris = ['gs://backup_bucket/backup_folder/*']
);
Output Code:
OUT -> BigQuery_09.sql
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0013 - EXTERNAL TABLE DATA FORMAT NOT SUPPORTED IN SNOWFLAKE ***/!!!
CREATE OR REPLACE EXTERNAL TABLE test.backup_restore_table
OPTIONS (
format = 'DATASTORE_BACKUP',
uris = ['gs://backup_bucket/backup_folder/*']
);
5. Hive partitioned external tables
Snowflake does not support Hive partitioned external tables; the WITH PARTITION COLUMNS clause will be marked as not supported.
CREATE OR REPLACE TEMPORARY FILE FORMAT SC_TEST_CUSTOMHIVEPARTITIONEDTABLE_FORMAT
TYPE = PARQUET;
CREATE EXTERNAL TABLE test.CustomHivePartitionedTable USING TEMPLATE (
SELECT
ARRAY_AGG(OBJECT_CONSTRUCT('COLUMN_NAME', COLUMN_NAME, 'TYPE', TYPE, 'NULLABLE', NULLABLE, 'EXPRESSION', EXPRESSION))
FROM
--** SSC-FDM-BQ0008 - THE INFER_SCHEMA FUNCTION REQUIRES A FILE PATH WITHOUT WILDCARDS TO GENERATE THE TABLE TEMPLATE, REPLACE THE FILE_PATH PLACEHOLDER WITH IT **
TABLE(INFER_SCHEMA(LOCATION => '@EXTERNAL_STAGE/FILE_PATH', FILE_FORMAT => 'SC_TEST_CUSTOMHIVEPARTITIONEDTABLE_FORMAT'))
)
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0015 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS gs://sc_external_table_bucket, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0014 - HIVE PARTITIONED EXTERNAL TABLES ARE NOT SUPPORTED IN SNOWFLAKE ***/!!!
WITH PARTITION COLUMNS (
field_1 STRING,
field_2 INT64)
PATTERN = 'folder_with_parquet/.*'
FILE_FORMAT = (TYPE = PARQUET)
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0001 - SNOWFLAKE DOES NOT SUPPORT THE OPTIONS: HIVE_PARTITION_URI_PREFIX, REQUIRE_HIVE_PARTITION_FILTER. ***/!!!
OPTIONS(
hive_partition_uri_prefix = 'gs://sc_external_table_bucket/folder_with_parquet',
require_hive_partition_filter = false
);
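Although the Hive partitioning clauses cannot be migrated automatically, Snowflake external tables do support partition columns derived from METADATA$FILENAME, which can emulate Hive-style directory partitions by hand. The sketch below is not SnowConvert output; it assumes Hive-style paths such as folder_with_parquet/field_1=<value>/field_2=<value>/file.parquet:
-- Manual emulation of Hive partitions by parsing the file path
CREATE OR REPLACE EXTERNAL TABLE test.CustomHivePartitionedTable
(
field_1 STRING AS SPLIT_PART(SPLIT_PART(METADATA$FILENAME, 'field_1=', 2), '/', 1),
field_2 INTEGER AS TO_NUMBER(SPLIT_PART(SPLIT_PART(METADATA$FILENAME, 'field_2=', 2), '/', 1))
)
PARTITION BY (field_1, field_2)
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
PATTERN = 'folder_with_parquet/.*'
FILE_FORMAT = (TYPE = PARQUET);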
6. External table without a column list and no valid file URI for the INFER_SCHEMA function
The INFER_SCHEMA function requires a LOCATION parameter that specifies the path to a file or folder used to construct the table columns; this path does not accept regular expressions, meaning the wildcard * character is not supported.
When the table has no columns, SnowConvert checks all URIs looking for one that does not use wildcards and uses it in the INFER_SCHEMA function. When no URI meets this criterion, an FDM and a FILE_PATH placeholder are generated; the placeholder has to be replaced with the path of one of the files referenced by the external table so the table columns can be generated.
Input Code:
IN -> BigQuery_11.sql
CREATE OR REPLACE EXTERNAL TABLE test.my_external_table_json2
OPTIONS(
FORMAT='JSON',
URIS=['gs://sc_external_table_bucket/folder_with_json/*']
);
Output Code:
OUT -> BigQuery_11.sql
CREATE OR REPLACE TEMPORARY FILE FORMAT SC_TEST_MY_EXTERNAL_TABLE_JSON2_FORMAT
TYPE = JSON;
CREATE OR REPLACE EXTERNAL TABLE test.my_external_table_json2 USING TEMPLATE (
SELECT
ARRAY_AGG(OBJECT_CONSTRUCT('COLUMN_NAME', COLUMN_NAME, 'TYPE', TYPE, 'NULLABLE', NULLABLE, 'EXPRESSION', EXPRESSION))
FROM
--** SSC-FDM-BQ0008 - THE INFER_SCHEMA FUNCTION REQUIRES A FILE PATH WITHOUT WILDCARDS TO GENERATE THE TABLE TEMPLATE, REPLACE THE FILE_PATH PLACEHOLDER WITH IT **
TABLE(INFER_SCHEMA(LOCATION => '@EXTERNAL_STAGE/FILE_PATH', FILE_FORMAT => 'SC_TEST_MY_EXTERNAL_TABLE_JSON2_FORMAT'))
)
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0015 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS gs://sc_external_table_bucket, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
PATTERN = 'folder_with_json/.*'
FILE_FORMAT = (TYPE = JSON);
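In this example, the Cars.jsonl file shown earlier lives under folder_with_json, so the FILE_PATH placeholder could be replaced with that concrete path; a sketch of the corrected statement:
CREATE OR REPLACE EXTERNAL TABLE test.my_external_table_json2 USING TEMPLATE (
SELECT
ARRAY_AGG(OBJECT_CONSTRUCT('COLUMN_NAME', COLUMN_NAME, 'TYPE', TYPE, 'NULLABLE', NULLABLE, 'EXPRESSION', EXPRESSION))
FROM
TABLE(INFER_SCHEMA(LOCATION => '@EXTERNAL_STAGE/folder_with_json/Cars.jsonl', FILE_FORMAT => 'SC_TEST_MY_EXTERNAL_TABLE_JSON2_FORMAT'))
)
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
PATTERN = 'folder_with_json/.*'
FILE_FORMAT = (TYPE = JSON);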
7. Unsupported table options
Any other table option not mentioned in the previous patterns will be marked as not supported.
Input Code:
IN -> BigQuery_12.sql
CREATE OR REPLACE EXTERNAL TABLE dataset.CsvTable
(
x INTEGER,
y STRING
)
OPTIONS (
format = 'CSV',
uris = ['gs://bucket/example.csv'],
field_delimiter = '|',
max_bad_records = 5
);
Output Code:
OUT -> BigQuery_12.sql
CREATE OR REPLACE EXTERNAL TABLE dataset.CsvTable
(
x INTEGER AS CAST(GET_IGNORE_CASE($1, 'c1') AS INTEGER),
y STRING AS CAST(GET_IGNORE_CASE($1, 'c2') AS STRING)
)
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0015 - EXTERNAL TABLE REQUIRES AN EXTERNAL STAGE TO ACCESS gs://bucket, DEFINE AND REPLACE THE EXTERNAL_STAGE PLACEHOLDER ***/!!!
LOCATION = @EXTERNAL_STAGE
AUTO_REFRESH = false
PATTERN = 'example.csv'
FILE_FORMAT = (TYPE = CSV
field_delimiter = '|')
!!!RESOLVE EWI!!! /*** SSC-EWI-BQ0001 - SNOWFLAKE DOES NOT SUPPORT THE OPTIONS: MAX_BAD_RECORDS. ***/!!!
OPTIONS(
max_bad_records = 5
);
Related EWIs
SSC-EWI-BQ0013: External table data format not supported in Snowflake.
SSC-EWI-BQ0014: Hive partitioned external tables are not supported in Snowflake.
SSC-EWI-BQ0015: External table requires an external stage to access an external location, define and replace the EXTERNAL_STAGE placeholder.
SSC-FDM-BQ0008: The INFER_SCHEMA function requires a file path without wildcards to generate the table template, replace the FILE_PATH placeholder with it.
SSC-FDM-BQ0005: Parsing the CSV header is not supported in external tables, columns must be renamed to match the original names.
SSC-FDM-BQ0006: Reading from Google Drive is not supported in Snowflake, upload the files to the external location and replace the FILE_PATH placeholders.
SSC-FDM-BQ0007: Reading from spreadsheets is not supported in Snowflake, use the CSV file type instead.