The NCHAR data type specifies a fixed-length character string in the national character set.
NCHAR stores special characters so that their Unicode representation is preserved in any usage. These characters may need more bits to be stored, which is why the NCHAR character set is AL16UTF16 by default, unlike the character set commonly used for CHAR, which is usually AL32UTF8.
NCHAR is preserved as NCHAR in Snowflake, but in the background Snowflake uses VARCHAR. The transformation information for CHAR is also valid for NCHAR.
SELECT
OCTET_LENGTH(nchar_column1) /*** SSC-FDM-OR0015 - LENGTHB TRANSFORMED TO OCTET_LENGTH RESULTS MAY VARY DUE TO MEMORY MANAGEMENT OF DBMS ***/,
OCTET_LENGTH(nchar_column2) /*** SSC-FDM-OR0015 - LENGTHB TRANSFORMED TO OCTET_LENGTH RESULTS MAY VARY DUE TO MEMORY MANAGEMENT OF DBMS ***/
FROM
nchar_data_types;
Note that the number specified in the column declaration is the size in characters, not in bytes. That is why more space is used to store those special characters.
In Snowflake, VARCHAR uses UTF-8, so the stored size varies with the Unicode character, which can be represented in 1, 2, 3, or 4 bytes. In this case, the Cambodian character takes 3 bytes to store.
Besides these slight differences, the integrity of the data is preserved.
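The byte counts discussed above can be checked outside either database. The following is a minimal Python sketch, not part of the migration tooling. It assumes that AL16UTF16 corresponds to big-endian UTF-16 and that Snowflake VARCHAR corresponds to plain UTF-8. The sample characters and the helper name `byte_sizes` are chosen for illustration only and are not taken from the original data set.

```python
# Illustrative sample characters (assumptions, not the original test data).
samples = {
    "Latin letter": "a",                 # U+0061
    "Latin letter with acute": "\u00e9",  # U+00E9
    "Khmer (Cambodian) letter": "\u1780", # U+1780
    "Emoji": "\U0001F600",                # U+1F600, outside the Basic Multilingual Plane
}


def byte_sizes(ch):
    """Return (utf8_bytes, utf16_bytes) for the given string.

    UTF-8 stands in for Snowflake VARCHAR storage and big-endian UTF-16
    stands in for Oracle's AL16UTF16 (both are assumptions of this sketch).
    """
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-be"))


for label, ch in samples.items():
    utf8_len, utf16_len = byte_sizes(ch)
    print(f"{label}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

Under these assumptions, the Khmer letter takes 3 bytes in UTF-8 but 2 bytes in UTF-16, which matches the kind of difference described above.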
Known Issues
1. Results obtained from some built-in functions may vary
As explained in the previous section, built-in functions applied to these columns may return different results in some cases, for example when getting the length of a column.
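To make the discrepancy concrete, the sketch below approximates both sides in Python. It is a rough model and does not describe how either engine is implemented. It assumes that Oracle blank-pads an NCHAR value to its declared length and stores it as big-endian UTF-16, that the value reaches Snowflake without trailing blanks and is stored as UTF-8, and that all characters are in the Basic Multilingual Plane. The function names are hypothetical helpers, not database functions.

```python
def oracle_lengthb_nchar(value, declared_chars):
    """Approximate LENGTHB on an NCHAR(declared_chars) column.

    Assumption: the value is blank-padded to the declared length and each
    character occupies 2 bytes (UTF-16, Basic Multilingual Plane only).
    """
    padded = value.ljust(declared_chars)
    return len(padded.encode("utf-16-be"))


def snowflake_octet_length(value):
    """Approximate OCTET_LENGTH on a VARCHAR value.

    Assumption: no padding is applied and the value is stored as UTF-8.
    """
    return len(value.encode("utf-8"))


# Hypothetical column values with their declared NCHAR lengths.
for value, declared in [("\u1780", 1), ("a", 5)]:
    before = oracle_lengthb_nchar(value, declared)
    after = snowflake_octet_length(value)
    print(f"{value!r} in NCHAR({declared}): Oracle-like = {before}, Snowflake-like = {after}")
```

With these assumptions the same value can report a larger or a smaller byte length after migration, so queries that depend on byte lengths are worth reviewing individually.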