[FR] Option to add a name to grouping in `by`, especially for boolean expressions #2504

samukweku · 2020-06-28T06:34:32Z

Instead of the default `C0`, it would be nice to be able to give the grouping column a meaningful name.

Example:

from datatable import dt, f, by

grades = [48, 99, 75, 80, 42, 80, 72, 68, 36, 78]
data = {'ID': ["x%d" % r for r in range(10)],
        'Gender': ['F', 'M', 'F', 'M', 'F', 'M', 'F', 'M', 'M', 'M'],
        'ExamYear': [2007, 2007, 2007, 2008, 2008,
                     2008, 2008, 2009, 2009, 2009],
        'Class': ['algebra', 'stats', 'bio', 'algebra',
                  'algebra', 'stats', 'stats', 'algebra', 'bio', 'bio'],
        'Participated': ['yes', 'yes', 'yes', 'yes', 'no',
                         'yes', 'yes', 'yes', 'yes', 'yes'],
        'Passed': ['yes' if x > 50 else 'no' for x in grades],
        'Employed': [True, True, True, False,
                     False, False, False, True, True, False],
        'Grade': grades}

df = dt.Frame(data)
df[:, dt.mean(f.Grade), by(f.ExamYear < 2009)]

   | C0 | Grade
---+----+---------
 0 |  0 | 60.6667
 1 |  1 | 70.8571

Suggested form:

df[:, dt.mean(f.Grade), by(name = f.ExamYear < 2009)]


pradkrish · 2021-05-28T20:57:15Z

@samukweku what if no name is provided (as is the case now)? Any suggestions for which column name to use then?

samukweku · 2021-05-29T11:14:17Z

@pradkrish if no name is provided, then we fall back to datatable's default naming (`C0`, `C1`, ...), as in the example shared above.

This PR implements column aliasing as proposed in #2684. We couldn't name the method `.as()`, though, because `as` is a reserved Python keyword, so we use `.alias()` instead. Column aliasing is now also available in the group-by clause. Closes #2504

st-pasha self-assigned this Jun 28, 2020

st-pasha added the groupby (Group-by functionality and Reducers) and improve (Improvement of an existing functionality) labels Jun 28, 2020

st-pasha added this to the Release 0.11.0 milestone Jun 28, 2020

st-pasha removed this from the Release 0.11.0 milestone Aug 25, 2020

st-pasha removed their assignment Sep 24, 2020

samukweku mentioned this issue Jul 15, 2022

[ENH] Column renaming #3313

Closed

samukweku mentioned this issue Aug 10, 2022

[ENH] Column aliasing #3333

Merged

oleksiyskononenko added this to the Release 1.1.0 milestone Sep 19, 2022

oleksiyskononenko assigned samukweku Sep 19, 2022

oleksiyskononenko closed this as completed in #3333 Sep 20, 2022
