-
Notifications
You must be signed in to change notification settings - Fork 321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exposed Column.Apply() API #323
Exposed Column.Apply() API #323
Conversation
/// </summary> | ||
/// <param name="colObject">Column object to apply</param> | ||
/// <returns>Column object</returns> | ||
public Column Apply(object colObject) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this only going to be exposed for Column type or anything like how scala has it defined
/**
* Extracts a value or values from a complex type.
* The following types of extraction are supported:
* <ul>
* <li>Given an Array, an integer ordinal can be used to retrieve a single value.</li>
* <li>Given a Map, a key of the correct type can be used to retrieve an individual value.</li>
* <li>Given a Struct, a string fieldName can be used to extract that field.</li>
* <li>Given an Array of Structs, a string fieldName can be used to extract filed
* of every struct in that array, and return an Array of fields.</li>
* </ul>
* @group expr_ops
* @since 1.4.0
*/
def apply(extraction: Any): Column = withExpr {
UnresolvedExtractValue(expr, lit(extraction).expr)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any which is why it is an object, I will change the parameter name to avoid confusion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The summary description should be updated to be somewhat similar to scala's.
/// <returns>Column object</returns> | ||
public Column Apply(object colObject) | ||
{ | ||
return Apply("apply", colObject); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a unit test
/// </summary> | ||
/// <param name="colObject">Column object to apply</param> | ||
/// <returns>Column object</returns> | ||
public Column Apply(object colObject) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're revealing this public method Apply()
then I think we should rename the private method Apply()
which has totally different semantics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestions on what to call the private Apply function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about ApplyMethod
@imback82 it looks like python's |
Actually, this is a bug in Spark. I am fixing it: https://issues.apache.org/jira/browse/SPARK-29664 |
…d unit test for Apply() Column API
…ta/spark into nidutta/ExposeColumnApplyAPI
|
||
/// <summary> | ||
/// Extracts a value or values from a complex type. | ||
///The following types of extraction are supported: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
///The following types of extraction are supported: | |
/// The following types of extraction are supported: |
Same for the next 5 lines
|
||
col = col1.Name("alias"); | ||
//col.Explain(true); -> Do I want to do this on col1 or col2? Or remove this line. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//col.Explain(true); -> Do I want to do this on col1 or col2? Or remove this line. | |
col1.Explain(true); |
/// </summary> | ||
/// <param name="Obj">object to apply</param> | ||
/// <returns>Column object</returns> | ||
public Column Apply(object Obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there anything more meaningful than Obj
? Same comment regarding the param description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change the paramname and description to something more meaningful
/// </summary> | ||
/// <param name="mappingObject">Object to use to map the values on returning Column object </param> | ||
/// <returns>Column object</returns> | ||
public Column Apply(object mappingObject) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's use extraction
. Update description as well.
/// of every struct in that array, and return an Array of fields. | ||
/// | ||
/// </summary> | ||
/// <param name="mappingObject">Object to use to map the values on returning Column object </param> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// <param name="mappingObject">Object to use to map the values on returning Column object </param> | |
/// <param name="mappingObject">Object to use to map the values on returning Column object</param> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, merging to master.
Fixes #201 |
Spark side change is also merged: apache/spark#26351 |
We are excited to review your PR.
So we can do the best job, please check:
Fixes #nnnn
in your description to cause GitHub to automatically close the issue(s) when your PR is merged.This PR addresses feature request #314 and fixes issue #201