[FR] Option to add a name to grouping in by, especially for boolean expressions #2504

Closed
samukweku opened this issue Jun 28, 2020 · 2 comments · Fixed by #3333
Labels: groupby (Group-by functionality and Reducers), improve (Improvement of an existing functionality)

Comments

@samukweku
Contributor

samukweku commented Jun 28, 2020

Instead of the default C0, it would be nice to be able to give the grouping column a relevant name.

Example:

from datatable import dt, f, by

grades = [48, 99, 75, 80, 42, 80, 72, 68, 36, 78]
data = {'ID': ["x%d" % r for r in range(10)],
        'Gender': ['F', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'M'],
        'ExamYear': [2007, 2007, 2007, 2008, 2008,
                     2008, 2008, 2009, 2009, 2009],
        'Class': ['algebra', 'stats', 'bio', 'algebra',
                  'algebra', 'stats', 'stats', 'algebra', 'bio', 'bio'],
        'Participated': ['yes', 'yes', 'yes', 'yes', 'no',
                         'yes', 'yes', 'yes', 'yes', 'yes'],
        'Passed': ['yes' if x > 50 else 'no' for x in grades],
        'Employed': [True, True, True, False,
                     False, False, False, True, True, False],
        'Grade': grades}

df = dt.Frame(data)
# Group by a boolean expression; the grouping column gets the default name C0
df[:, dt.mean(f.Grade), by(f.ExamYear < 2009)]

   | C0 | Grade
---+----+---------
 0 |  0 | 60.6667
 1 |  1 | 70.8571

Suggested form:

df[:, dt.mean(f.Grade), by(name = f.ExamYear < 2009)]
@st-pasha st-pasha self-assigned this Jun 28, 2020
@st-pasha st-pasha added groupby Group-by functionality and Reducers improve Improvement of an existing functionality labels Jun 28, 2020
@st-pasha st-pasha added this to the Release 0.11.0 milestone Jun 28, 2020
@st-pasha st-pasha removed this from the Release 0.11.0 milestone Aug 25, 2020
@st-pasha st-pasha removed their assignment Sep 24, 2020
@pradkrish
Contributor

@samukweku what if no name is provided (as is the case now)? Any suggestions for which column name to use then?

@samukweku
Contributor Author

samukweku commented May 29, 2021

@pradkrish if no name is provided, we fall back to datatable's default naming (C0, C1, ...), as in the example shared above.

@oleksiyskononenko oleksiyskononenko added this to the Release 1.1.0 milestone Sep 19, 2022
oleksiyskononenko pushed a commit that referenced this issue Sep 20, 2022
This PR implements column aliasing as proposed in #2684. We couldn't name the method `.as()`, because `as` is a reserved Python keyword, so we use `.alias()` instead. Column aliasing is now also available in the group-by clause.

Closes #2504