Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Python bindings for string literal support in AST #13073

Conversation

karthikeyann
Copy link
Contributor

@karthikeyann karthikeyann commented Apr 5, 2023

Description

Depends on #13061
Add Python bindings for string scalar support in AST
Add unit test for string comparison - column vs column, column vs literal.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@karthikeyann karthikeyann added feature request New feature or request 3 - Ready for Review Ready for review by team Python Affects Python cuDF API. 4 - Needs cuDF (Python) Reviewer non-breaking Non-breaking change labels Apr 5, 2023
@karthikeyann karthikeyann requested a review from a team as a code owner April 5, 2023 21:45
@karthikeyann karthikeyann self-assigned this Apr 5, 2023
@karthikeyann karthikeyann requested a review from a team as a code owner April 5, 2023 21:45
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Apr 5, 2023
@karthikeyann karthikeyann requested a review from vyasr April 18, 2023 14:16
@karthikeyann karthikeyann requested a review from vyasr April 21, 2023 14:07
@PointKernel
Copy link
Member

Removing libcudf review requests since it's no longer relevant

@karthikeyann karthikeyann requested a review from a team April 25, 2023 17:55
python/cudf/cudf/core/dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/tests/test_dataframe.py Show resolved Hide resolved
self.c_scalar.int_ptr = make_unique[numeric_scalar[int64_t]](
intval, True
)
self.c_scalar.reset(new numeric_scalar[int64_t](value, True))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Shouldn't every new have a corresponding delete? Note that if we need to call delete, it can be done in the __dealloc__ method of this class
  • Can we continue to use a unique/shared pointer so we don't have to worry about defining __dealloc__ or calling delete?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I think I see what's going here, we're resetting the value of the unique pointer to a new numeric_scalar. But why do it this way instead of make_unique?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c_scalar is a unique pointer here. This line is calling unique_ptr constructor where argument is the object pointer. This unique pointer will delete the pointing object when it's destroyed.

Copy link
Contributor Author

@karthikeyann karthikeyann May 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make_unique creates the numeric_scalar unique pointer. It can't be type casted and stored to generic base pointer. Hence this approach. I saw similar approach in another Cython file as well.
https://github.com/rapidsai/cudf/blob/branch-23.06/python/cudf/cudf/_lib/scalar.pyx#L250

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I approved the PR, but still have a question. Why prefer:

c_scalar.reset(new ....)

versus:

c_scalar = make_unique(...)

?

Copy link
Contributor Author

@karthikeyann karthikeyann May 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would require releasing the unique_ptr of derived pointer anyway to store in unique_ptr<scalar> class.
unique_ptr[scalar] c_scalar = unique_ptr(static_cast<scalar*>(make_unique(...).release()));

Hence using the new operator directly.
Is 'releaseing the unique_ptr and typecasting' safer than using reset(new... ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it - thanks for clarifying! Definitely a TIL for me

Is 'releaseing the unique_ptr and typecasting' safer than using reset(new... ?

Do you mean is it safer from a Cython perspective? If so, the answer is that it should be no different than in C++. I would say whatever pattern you prefer in C++, we should use here as well.

@karthikeyann
Copy link
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 3883682 into rapidsai:branch-23.06 May 8, 2023
davidwendt pushed a commit to davidwendt/cudf that referenced this pull request May 8, 2023
Depends on rapidsai#13061
Add Python bindings for string scalar support in AST
Add unit test for string comparison - column vs column, column vs literal.

Authors:
  - Karthikeyan (https://github.com/karthikeyann)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Ashwin Srinath (https://github.com/shwina)

URL: rapidsai#13073
@vyasr vyasr added 4 - Needs Review Waiting for reviewer to review or respond and removed 4 - Needs cuDF (Python) Reviewer labels Feb 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team 4 - Needs Review Waiting for reviewer to review or respond feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants