Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Refactor aggregation APIs #7456

Closed
ttnghia opened this issue Feb 26, 2021 · 5 comments
Closed

[FEA] Refactor aggregation APIs #7456

ttnghia opened this issue Feb 26, 2021 · 5 comments
Labels
improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. wontfix This will not be worked on

Comments

@ttnghia
Copy link
Contributor

ttnghia commented Feb 26, 2021

Currently, aggregation APIs (groupby, reductions, rolling, etc.) are scattered around in multiple files and there are inconsistencies between the directory structures in cpp/include/, cpp/src/, cpp/tests/, and cpp/benchmarks/. For example:

cpp/include/:

  • include/cudf/aggregation.hpp
  • include/cudf/groupby.hpp
  • include/cudf/rolling.hpp
  • ....

cpp/src/:

  • src/aggregation/.
  • src/groupby/.
  • src/rolling/.
  • ....

cpp/tests/:

  • tests/aggregation/.
  • tests/groupby/.
  • tests/grouped_rolling/.
  • tests/lead_lag/.
  • ....

I suggest to group all aggregation APIs together, so those APIs will be in include/aggreation/*.*, src/aggregation/*.*, tests/aggregation/*.* etc.

@ttnghia ttnghia added Needs Triage Need team to review and classify feature request New feature or request labels Feb 26, 2021
@ttnghia ttnghia added 4 - Needs Review Waiting for reviewer to review or respond code quality good first issue Good for newcomers libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels Feb 26, 2021
@harrism
Copy link
Member

harrism commented Mar 3, 2021

I think this should be covered as part of #7106 and #6895

@harrism harrism removed 4 - Needs Review Waiting for reviewer to review or respond Needs Triage Need team to review and classify feature request New feature or request good first issue Good for newcomers non-breaking Non-breaking change labels Mar 3, 2021
@jrhemstad
Copy link
Contributor

I don't think it makes sense to group everything that touches aggregation under a aggregation/ directory. It's too broad a category to be useful. Grouping groupby,reduction,scan,rolling,etc. into the same directory is more harmful than helpful.

@ttnghia
Copy link
Contributor Author

ttnghia commented Mar 3, 2021

Yes, it would be complex to combine everything to one. So, how about make them sub-categories? For example, we can have

aggregation/grouby/
aggregation/reduction/
aggregation/rolling/

Grouping the related APIs together make things easier to access and refer to.

@jrhemstad
Copy link
Contributor

I still think that hurts more than it helps. The fact that a very diverse set of algorithms share a common aggregation input parameter is an implementation detail. Adding another layer of indirection to group things by an implementation detail doesn't seem worth it.

@harrism
Copy link
Member

harrism commented Mar 4, 2021

Good point. Moreover, not all groupby operations are aggregations. Same goes for rolling. Suggest we close this as wontfix.

@ttnghia ttnghia added the wontfix This will not be worked on label Mar 4, 2021
@ttnghia ttnghia closed this as completed Mar 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

3 participants