[BUG] Use of applymap on StringColumns #3646
Comments
Hey @dmitra79, an option is to use If a), we may need to tweak the process a bit. Please try:
Output is:
If b), you'll bring in numpy as well, but only for typecasting in this instance:
Output is:
Please let me know if this helps! Just an FYI, I couldn't get your
Hi @taureandyernv, thank you for your response! My goal was to use just the integer part of each id (the id is essentially an integer with a single-letter code, and I don't need the letter) to sort the ids and partition the set systematically. We ended up doing this without converting to integers (but had to convert to pandas). Ex: Thanks for mentioning
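For readers following along, the goal described above (sort ids by their integer part, then partition the set) can be sketched in plain Python. This is only an illustration under stated assumptions, not the code used in this thread and not a cudf example: the helper names `id_number` and `partition` are hypothetical, the sample ids are made up, and every id is assumed to be one letter followed by digits.

```python
def id_number(id_str):
    """Return the integer part of an id such as 'A101' (one letter, then digits)."""
    return int(id_str[1:])


def partition(items, n_parts):
    """Split items into n_parts contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


ids = ["B236", "A101", "C7", "A1500", "B12"]  # made-up sample ids
sorted_ids = sorted(ids, key=id_number)       # numeric order, letter ignored
parts = partition(sorted_ids, 2)

print(sorted_ids)
print(parts)
```

Sorting with `key=id_number` leaves the original strings intact, so the letter code is ignored for ordering but is still available afterwards.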
@dmitra79, courtesy of @VibhuJawa, we do have a string accessor for dataframes
and then
Outputs
Here is a great blog post for reference: https://medium.com/rapids-ai/show-me-the-word-count-3146e1173801
Great - thank you!
@dantegd if @dmitra79 agrees, the use case behind this issue is solved by using string accessors instead of
outputs:
Thoughts?
I agree that this resolves the issue. Having this functionality in applymap would be great, or alternatively better documentation describing how to handle strings. I was not aware of the string functionality or of nvstrings from the cudf documentation.
This is being explored for the long term, but string UDFs remain difficult because of memory and branching constraints.
As noted, this is a challenging problem. I'm going to close this issue to consolidate further discussion in #3802.
I am trying to use the applymap method on a string column to convert its values to integers (the values are, e.g., 'A101', 'B236', etc.). I am using cudf 0.10.0 on Ubuntu 18.04.2 LTS (GNU/Linux 4.15.0-45-generic x86_64).
My code:
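```python
import cudf

def id2int(x):
    # Drop the leading letter code and parse the remaining digits
    return int(x[1:])

s = cudf.Series(id_df['id'])  # id_df holds the string 'id' column
z = s.applymap(id2int)        # this call raises the error reported here
```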
The error is: