Is your feature request related to a problem? Please describe.
Now that the row offset iterator is written, the next step toward string support in the row-to-column and column-to-row conversion code is to implement one direction. This issue tracks the column-to-row half of the work: accept a table that contains string columns and convert it into the JCUDF row format used by the spark-rapids plugin.
Describe the solution you'd like
The kernel will assign one warp to each row. Thread 0 of the warp will write the offset/length of the string data, and then all threads in the warp will participate in the memcpy_async call that copies the actual string data.
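To make the division of labor concrete, here is a minimal host-side C++ model of the warp-per-row idea. It is a sketch, not the kernel: the function `pack_row`, the constant `kWarpSize`, and the simplified row layout (one 4-byte offset and one 4-byte length per string column, followed by the string bytes) are assumptions made for illustration and do not reproduce the full JCUDF format. The strided inner loop stands in for the cooperative memcpy_async copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

constexpr int kWarpSize = 32;  // lanes cooperating on one row (assumed)

// Hypothetical simplified row: one (offset, length) pair per string column,
// followed by the raw string bytes. This is NOT the full JCUDF layout.
std::vector<std::uint8_t> pack_row(const std::vector<std::string>& strings) {
    const std::size_t header_bytes = strings.size() * 2 * sizeof(std::uint32_t);
    std::size_t total = header_bytes;
    for (const auto& s : strings) total += s.size();
    assert(total <= UINT32_MAX);  // offsets are stored as 32-bit values
    std::vector<std::uint8_t> row(total);

    // Role of lane 0: write the offset/length pair for every string column.
    std::vector<std::uint32_t> offsets;
    std::uint32_t offset = static_cast<std::uint32_t>(header_bytes);
    for (std::size_t c = 0; c < strings.size(); ++c) {
        const std::uint32_t pair[2] = {
            offset, static_cast<std::uint32_t>(strings[c].size())};
        std::memcpy(row.data() + c * sizeof(pair), pair, sizeof(pair));
        offsets.push_back(offset);
        offset += pair[1];
    }

    // Role of all lanes: each lane copies a strided share of every string.
    // On the device this loop nest would be one cooperative copy per string.
    for (std::size_t c = 0; c < strings.size(); ++c) {
        for (int lane = 0; lane < kWarpSize; ++lane) {
            for (std::size_t i = lane; i < strings[c].size(); i += kWarpSize) {
                row[offsets[c] + i] = static_cast<std::uint8_t>(strings[c][i]);
            }
        }
    }
    return row;
}
```

Because a row is owned by exactly one warp, every byte has a single, known destination, and no search is needed to find where a string belongs.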
Describe alternatives you've considered
Other ways to parallelize the work were considered, including having each thread copy a fixed number of bytes to the proper destination. That approach would require lower_bound calls to determine the destination of each chunk of data, and it had problems with chunks spanning multiple destinations. Given that complexity, we chose the current solution in the interest of time.
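The difficulty with the fixed-bytes-per-thread alternative can be illustrated with a small host-side C++ sketch. The names `Piece` and `split_chunk` are hypothetical, and the sketch uses `std::upper_bound` over an offsets array as a stand-in for the lower_bound search described above; it only models how one thread's chunk maps onto destination strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// The part of one thread's chunk that lands in a single destination string.
struct Piece {
    std::size_t string_index;
    std::size_t begin;  // byte position in the concatenated character data
    std::size_t end;    // one past the last byte of this piece
};

// `offsets` has one entry per string plus a final total, so string i
// occupies bytes [offsets[i], offsets[i + 1]) of the character data.
std::vector<Piece> split_chunk(const std::vector<std::size_t>& offsets,
                               std::size_t thread_id,
                               std::size_t chunk_bytes) {
    assert(chunk_bytes > 0 && offsets.size() >= 2);
    const std::size_t total = offsets.back();
    std::size_t pos = std::min(thread_id * chunk_bytes, total);
    const std::size_t chunk_end = std::min(pos + chunk_bytes, total);

    std::vector<Piece> pieces;
    while (pos < chunk_end) {
        // Binary search: the first offset greater than pos ends the string
        // that owns the byte at pos.
        const auto it = std::upper_bound(offsets.begin(), offsets.end(), pos);
        const std::size_t index =
            static_cast<std::size_t>(it - offsets.begin()) - 1;
        const std::size_t end = std::min(*it, chunk_end);
        pieces.push_back({index, pos, end});
        pos = end;
    }
    return pieces;
}
```

With offsets `{0, 3, 5, 12}` and 4-byte chunks, thread 0 gets two pieces (bytes 0 to 3 of string 0 and byte 3 of string 1), so a single thread needs a search per piece and must write to more than one destination.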
Additional context
This is part of the larger feature tracked in #10033.
This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.