#EWI: SPRKPY1075 => The parse_json does not apply schema validation, if you need to filter/validate based on schema you might need to introduce some logic.
df.select(parse_json(df.value))
With the from_json function, the schema is not passed for inference; it is used for validation. See these examples:
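from pyspark.sql.functions import col, from_json
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

# `spark` is assumed to be an existing SparkSession
data = [
    ('{"name": "John", "age": 30, "city": "New York"}',),
    ('{"name": "Jane", "age": "25", "city": "San Francisco"}',)
]
df = spark.createDataFrame(data, ["json_str"])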
Example 1: Enforce Data Types and Change Column Names:
# Assumed schema: it is not defined in the original snippet and is inferred from the output below
schema = StructType([
    StructField("name", StringType(), True),
    StructField("age", IntegerType(), True),
    StructField("city", StringType(), True)
])

# Parse JSON column with schema
parsed_df = df.withColumn("parsed_json", from_json(col("json_str"), schema))
parsed_df.show(truncate=False)
# +------------------------------------------------------+---------------------------+
# |json_str                                              |parsed_json                |
# +------------------------------------------------------+---------------------------+
# |{"name": "John", "age": 30, "city": "New York"}       |{John, 30, New York}       |
# |{"name": "Jane", "age": "25", "city": "San Francisco"}|{Jane, null, San Francisco}|
# +------------------------------------------------------+---------------------------+
# Notice that values outside the schema are dropped and fields that do not match the schema are returned as null
Example 2: Select Specific Columns:
# Define a schema with only the columns we want to use
partial_schema = StructType([
    StructField("name", StringType(), True),
    StructField("city", StringType(), True)
])

# Parse JSON column with partial schema
partial_df = df.withColumn("parsed_json", from_json(col("json_str"), partial_schema))
partial_df.show(truncate=False)
# +------------------------------------------------------+---------------------+
# |json_str                                              |parsed_json          |
# +------------------------------------------------------+---------------------+
# |{"name": "John", "age": 30, "city": "New York"}       |{John, New York}     |
# |{"name": "Jane", "age": "25", "city": "San Francisco"}|{Jane, San Francisco}|
# +------------------------------------------------------+---------------------+
# Fields not listed in the schema (here, age) are also filtered out automatically
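Because parse_json does not enforce a schema, the two from_json behaviors shown in the examples (dropping fields outside the schema and returning null for values of the wrong type) would have to be reproduced by hand where they matter. The following is a minimal pure-Python sketch of that logic, not Snowpark code. The helper name apply_schema and the dict-based schema format are hypothetical and exist only for illustration; how such logic would be wired into a Snowpark pipeline (for example through a UDF) is not shown and depends on your workload.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical schema description: field name -> expected Python type.
SCHEMA: Dict[str, type] = {"name": str, "age": int, "city": str}


def apply_schema(json_str: str, schema: Dict[str, type]) -> Optional[Dict[str, Any]]:
    """Keep only the schema fields; use None for missing or mistyped values."""
    try:
        record = json.loads(json_str)
    except json.JSONDecodeError:
        return None  # malformed JSON yields no record at all
    if not isinstance(record, dict):
        return None  # only JSON objects can be matched against the schema

    result: Dict[str, Any] = {}
    for field, expected in schema.items():
        value = record.get(field)
        # bool is a subclass of int in Python, so reject it for int fields.
        is_bool_for_int = expected is int and isinstance(value, bool)
        if isinstance(value, expected) and not is_bool_for_int:
            result[field] = value
        else:
            result[field] = None
    return result


rows = [
    '{"name": "John", "age": 30, "city": "New York"}',
    '{"name": "Jane", "age": "25", "city": "San Francisco"}',
]
parsed = [apply_schema(row, SCHEMA) for row in rows]
# parsed[0] == {"name": "John", "age": 30, "city": "New York"}
# parsed[1] == {"name": "Jane", "age": None, "city": "San Francisco"}
```

Passing a smaller schema such as {"name": str, "city": str} keeps only those two fields, which mirrors the partial-schema example.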
Recommendations
For more support, you can email us at sma-support@snowflake.com. If you have a contract for support with Snowflake, reach out to your sales engineer and they can direct your support needs.