SPRKPY1073
pyspark.sql.functions.udf
Description
Scenarios
Scenario 1
from pyspark.sql import SparkSession, DataFrameStatFunctions
from pyspark.sql.functions import col, udf
spark = SparkSession.builder.getOrCreate()
data = [['Q1', 'Test 1'],
        ['Q2', 'Test 2'],
        ['Q3', 'Test 1'],
        ['Q4', 'Test 1']]
columns = ['Quadrant', 'Value']
df = spark.createDataFrame(data, columns)
# udf is called without an explicit return type, so PySpark defaults to StringType
my_udf = udf(lambda s: len(s))
df.withColumn('Len Value', my_udf(col('Value'))).show()
Scenario 2
Additional recommendations
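A minimal sketch related to Scenario 1, assuming the concern there is that `udf` is called without a return type. In PySpark, omitting the return type makes it default to `StringType`, so one option is to declare the type explicitly. The names `value_length` and `show_len_value` are hypothetical and introduced only for this illustration; this is not presented as the tool's official fix.

```python
def value_length(s):
    """Return the number of characters in s (plain Python, no Spark needed)."""
    return len(s)


def show_len_value():
    """Rebuild the Scenario 1 frame and apply the UDF with an explicit return type."""
    # Imported inside the function so value_length stays usable without PySpark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import IntegerType

    spark = SparkSession.builder.getOrCreate()
    data = [['Q1', 'Test 1'], ['Q2', 'Test 2'], ['Q3', 'Test 1'], ['Q4', 'Test 1']]
    df = spark.createDataFrame(data, ['Quadrant', 'Value'])

    # Assumption: returnType is stated explicitly instead of relying on the StringType default.
    typed_udf = udf(value_length, returnType=IntegerType())
    df.withColumn('Len Value', typed_udf(col('Value'))).show()
```

Calling `show_len_value()` in an environment with PySpark installed prints the frame with `Len Value` declared as an integer column.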
