Add mode #453

cigrainger · 2022-12-17T03:39:07Z

See: #452. This is a WIP because I'm not sure how to handle the fact that a series can have no mode or multiple modes. This gets particularly hairy in a groupby scenario where each group may have different lengths (and thus would cause problems alongside other summary statistics like mean and median, where you know the length will be 1).

Pandas:
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.mode.html
https://pandas.pydata.org/docs/reference/api/pandas.Series.mode.html#pandas.Series.mode
https://stackoverflow.com/a/54304691

R:
https://stackoverflow.com/questions/66972590/how-to-find-mean-median-mode-based-on-distinctive-groups-in-r
https://cran.r-project.org/web/packages/modeest/modeest.pdf

Okay! Thanks to #725 this is good to go. Closes #452.

philss

> This gets particularly hairy in a groupby scenario where each group may have different lengths

I think this is fine, because I think the computation is going to occur inside the group's context. Maybe adding a test (or rather, changing an existing one to include another column) for summarise or mutate would clarify things a little. WDYT?

lib/explorer/series.ex

cigrainger · 2022-12-17T04:05:24Z

> I think this is fine, because I think the computation is going to occur inside the group's context

Yes, between groups, but not within groups. So take the example where you summarise down to mean and mode. If mode is length 0 or >1, then you have a problem where within groups the aggregations are different lengths. Adding a test or two now to clarify as suggested 👍.

cigrainger · 2022-12-17T04:25:12Z

Well it clarified in a different direction: summarise w/ mode returns a single value per group -- a list! We'll have to pick up #401 anyway.
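
To make the shape issue concrete, here is a minimal plain-Elixir sketch. It is not Explorer's implementation: `ModeSketch` and `modes/1` are hypothetical names, and it only uses the standard library to show that a per-group mode can have length 0, 1, or more, which is why each group ends up holding a list.

```elixir
defmodule ModeSketch do
  # Hypothetical helper for illustration only; not part of Explorer.
  # Returns every value tied for the highest count, sorted.
  # An empty input has no mode, so the result is an empty list.
  def modes([]), do: []

  def modes(values) do
    freqs = Enum.frequencies(values)
    max_count = freqs |> Map.values() |> Enum.max()

    freqs
    |> Enum.filter(fn {_value, count} -> count == max_count end)
    |> Enum.map(fn {value, _count} -> value end)
    |> Enum.sort()
  end
end

# Three made-up groups: one clear mode, a tie, and an empty group.
groups = %{"a" => [1, 1, 2], "b" => [3, 4], "c" => []}

# One list per group, but the lists have different lengths.
per_group = Map.new(groups, fn {key, values} -> {key, ModeSketch.modes(values)} end)
IO.inspect(per_group)
# => %{"a" => [1], "b" => [3, 4], "c" => []}
```

A mean over the same groups would always yield exactly one value per group, so only the mode needs a list-typed result.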

josevalim

LGTM although it seems we are waiting on lists?

josevalim · 2023-11-12T14:16:41Z

lib/explorer/series.ex

+  @doc """
+  Gets the most common value of the series.
+
+  ## Supported dtypes


Let’s just say all except lists? Otherwise it is easy for this to go out of date!

cigrainger

Yep fair 😂

josevalim · 2023-11-12T14:18:55Z

lib/explorer/shared.ex

-    non_list_dtypes = [
+    non_list_dtypes = non_list_types()
+    list_dtypes = for dtype <- non_list_dtypes, do: {:list, dtype}
+    non_list_dtypes ++ list_dtypes


List types are recursive so this is theoretically incomplete. Instead of doing this list, what if we just append {:list, :any} instead?
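
A small plain-Elixir sketch of why the one-level expansion is incomplete. The module `DtypeSketch`, its scalar dtype list, and `valid?/1` are hypothetical and exist only for illustration; this is not how Explorer checks dtypes.

```elixir
defmodule DtypeSketch do
  # Hypothetical scalar dtypes for illustration; not Explorer's actual list.
  @scalar_dtypes [:integer, :float, :boolean, :string]

  # Mirrors the approach in the diff: scalars plus one level of {:list, dtype}.
  def one_level do
    @scalar_dtypes ++ for(dtype <- @scalar_dtypes, do: {:list, dtype})
  end

  # A recursive check accepts lists nested to any depth.
  def valid?({:list, inner}), do: valid?(inner)
  def valid?(dtype), do: dtype in @scalar_dtypes
end

nested = {:list, {:list, :integer}}

IO.inspect(nested in DtypeSketch.one_level())
# => false (the enumerated list stops at one level of nesting)

IO.inspect(DtypeSketch.valid?(nested))
# => true (the recursive check follows the nesting)
```

Any finite enumeration misses some nesting depth, which is the motivation for a wildcard-style entry rather than a longer list.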

cigrainger

Sure, fine by me 👍

cigrainger

So it looks like this is used for tests? I'm going to leave it how it was (I just wrapped the non-list ones into a separate function) and then I think we should revisit it in a future PR.

billylanchantin

Woo! It's cool to see this PR get across the finish line!

I just had one suggestion about the docs: I think it's worth calling out ties.

josevalim

Ship it and I can look at the dtypes stuff. :)

philss reviewed Dec 17, 2022

cigrainger force-pushed the cg/mode branch from 201075c to 4fe633c Compare December 17, 2022 04:00

cigrainger force-pushed the cg/mode branch from 4fe633c to 846b404 Compare December 17, 2022 04:18

josevalim approved these changes Dec 17, 2022

kylewhite21 mentioned this pull request Dec 21, 2022

compute mode on grouped DataFrame #452

Closed

liamdiprose pushed a commit to liamdiprose/explorer that referenced this pull request Feb 16, 2023

Merge pull request elixir-explorer#453 from SeokminHong/impl-encoder

adc0b9d

Change map helper functions' arguments

cigrainger force-pushed the cg/mode branch from 846b404 to 449cae8 Compare November 12, 2023 06:41

josevalim reviewed Nov 12, 2023

josevalim reviewed Nov 12, 2023

cigrainger force-pushed the cg/mode branch from 6fb2b72 to 8aec840 Compare November 12, 2023 12:01

cigrainger added 8 commits November 12, 2023 14:18

Add mode

8a353e1

Return series rather than value

0103b3b

Update flake

9470642

Add mode feature

1db01d7

Set correct guard

6850427

Add mode expr

c980560

Allow mode for all dtypes

b57802f

Add s_mode

7e14624

cigrainger force-pushed the cg/mode branch from 8aec840 to 303207f Compare November 12, 2023 13:23

Add tests

f3b8b95

cigrainger force-pushed the cg/mode branch from 303207f to f3b8b95 Compare November 12, 2023 13:34

cigrainger marked this pull request as ready for review November 12, 2023 13:34

Exclude list types

5e006c5

cigrainger requested review from billylanchantin, philss and josevalim November 12, 2023 13:50

josevalim reviewed Nov 12, 2023

billylanchantin approved these changes Nov 12, 2023

cigrainger force-pushed the cg/mode branch from b433498 to 13179b6 Compare November 12, 2023 15:16

cigrainger and others added 2 commits November 12, 2023 16:19

Apply suggestions

7f837a2

Call out ties in docs

ff81098

Co-authored-by: Billy Lanchantin

cigrainger force-pushed the cg/mode branch from 13179b6 to ff81098 Compare November 12, 2023 15:19

josevalim approved these changes Nov 12, 2023

cigrainger merged commit e0c02a4 into main Nov 12, 2023
4 checks passed

cigrainger deleted the cg/mode branch November 12, 2023 17:19
