Spark Native Functions
Spark native functions offer ease-of-use, flexibility and performance benefits far beyond what Spark user-defined functions (UDFs) can do. To learn more about Spark native functions, read this.
Once you build some native functions, you have to decide whether you want to use them only from Scala or also make them available via SparkSQL. Using them from Scala simply requires creating a Column-oriented instantiation API, as done in Spark's functions object (see the sketch below). Using them from SparkSQL requires registration, which spark-alchemy makes easy.
Once you have built one or more native functions, you create a registration object that extends NativeFunctionRegistration and implements expressions, e.g., the way HLLFunctionRegistration does. As the comment in NativeFunctionRegistration says, this is code pulled from FunctionRegistry in OSS Spark.
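The following is a minimal sketch of such a registration object, assuming MyNativeExpression is a Catalyst Expression you have implemented. The expression[...] helper, the FunctionBuilder type, and the import path are taken from the pattern HLLFunctionRegistration follows; check that object for the precise signatures in your spark-alchemy version:

```scala
import org.apache.spark.sql.catalyst.expressions.ExpressionInfo
import com.swoop.alchemy.spark.expressions.NativeFunctionRegistration

object MyFunctionRegistration extends NativeFunctionRegistration {
  // MyNativeExpression is a placeholder for your own Expression subclass
  val expressions: Map[String, (ExpressionInfo, FunctionBuilder)] = Map(
    expression[MyNativeExpression]("my_native_function")
  )
}
```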
To use the functions from SparkSQL, you have to register them by calling the equivalent of HLLFunctionRegistration.registerFunctions(spark).
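For example, a small sketch using the HLL functions (the hll_init_agg and hll_cardinality names and the import path come from spark-alchemy's HLL module; verify them against the version you are using):

```scala
import org.apache.spark.sql.SparkSession
import com.swoop.alchemy.spark.expressions.hll.HLLFunctionRegistration

object SqlRegistrationExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("hll-sql-demo").getOrCreate()

    // Register the HLL native functions with this session's function registry
    HLLFunctionRegistration.registerFunctions(spark)

    // The functions are now callable by name from SparkSQL
    spark.range(100000).createOrReplaceTempView("ids")
    spark.sql("SELECT hll_cardinality(hll_init_agg(id)) AS approx_distinct FROM ids").show()
  }
}
```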
That's all there is to it.
spark-alchemy is maintained by the team at Swoop. If you'd like to contribute to our open-source efforts, whether by joining our team or from your own company, let us know at spark-interest at swoop dot com.