ENH: followup for resample, #11841 #12140

Closed
8 tasks done
jreback opened this issue Jan 25, 2016 · 1 comment

jreback commented Jan 25, 2016

@jorisvandenbossche comments for followup on #11841

  • the repr: can we follow PEP8 here as well? :-) (I mean spaces after the commas, which I think would make it a bit more readable; maybe also quotes around the strings)
  • upsample has been removed? (but is still in the documentation) Or is this now asfreq?

The docs are updated; it IS now .asfreq
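
As a rough illustration of upsampling through .asfreq, here is a minimal sketch. The two-point series and the frequencies are made up for the example and are not taken from the PR; the point is only that the newly created slots come back as NaN rather than being filled.

```python
import pandas as pd

# Hypothetical series with points two minutes apart (illustration only).
s = pd.Series(
    [1.0, 2.0],
    index=pd.date_range("2012-01-01", periods=2, freq="2min"),
)

# Upsample to one-minute spacing; asfreq() inserts the missing
# timestamp and leaves its value as NaN instead of filling it.
up = s.resample("1min").asfreq()
print(up)
```

If filled values are wanted, a fill method such as .ffill() on the resampler would be the usual alternative to .asfreq().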

  • apply gives Exception: Must produce aggregated value (for a series) if you pass it a function that does not return an aggregated value. For groupby this works, so it would be nice to have this consistent.
    When applying it on a resampled dataframe, you get the cryptic message ValueError: cannot copy sequence with size 3 to array axis with dimension 180. A typical example is just .apply(lambda x: x)
  • Resampler.fillna has no explanation
  • DatetimeIndexResampler has no docstring yet
  • apply, agg, aggregate have no docstring; transform has only a very brief one
    In theory it would be best if this were addressed in this PR, but given the PEP8 changes waiting on this PR, it's OK for me to leave this for a follow-up PR
  • Further, an inconsistency between r.agg() and r[].agg() in:
In [81]: df = pd.DataFrame(np.random.randn(1000, 3),
   ....:                  index=pd.date_range('1/1/2012', freq='S', periods=1000),
   ....:                  columns=['A', 'B', 'C'])

In [82]: r = df.resample('3T')

In [83]: r.agg({'r1':'mean', 'r2':'sum'})
SpecificationError: nested dictionary is ambiguous in aggregation

In [84]: r[['A', 'B']].agg({'r1':'mean', 'r2':'sum'})
Out[84]:
                           r1                   r2
                            A         B          A          B
2012-01-01 00:00:00 -0.059093  0.040993 -10.636800   7.378766
2012-01-01 00:03:00 -0.037877  0.055133  -6.817820   9.923978
2012-01-01 00:06:00 -0.101921  0.061913 -18.345769  11.144373
2012-01-01 00:09:00  0.097927 -0.074492  17.626838 -13.408612
2012-01-01 00:12:00 -0.027901  0.035319  -5.022220   6.357459
2012-01-01 00:15:00 -0.037696  0.022259  -3.769580   2.225884
  • validate that we have tests for all combinations of aggregation
grouped.agg({'C': 'mean', 'D': 'sum'})
grouped[['C', 'D']].agg({'C': 'mean', 'D': 'sum'})
grouped.agg({'C': ['mean', 'sum'], 'D': ['mean', 'sum']})
grouped.agg({'C': {'r': 'mean', 'r2': 'sum'}, 'D': {'r': 'mean', 'r2': 'sum'}})
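
A minimal sketch of what a consistency check over these combinations could look like for a resampler. It assumes synthetic data shaped like the example above and covers only the flat dict forms; the nested renaming form is left out because newer pandas versions may not accept it. The variable names are hypothetical and not from the test suite.

```python
import numpy as np
import pandas as pd

# Synthetic data shaped like the example in this issue (random values).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.standard_normal((1000, 3)),
    index=pd.date_range("2012-01-01", periods=1000, freq=pd.Timedelta(seconds=1)),
    columns=["A", "B", "C"],
)

r = df.resample("3min")

# One function per column, on the full resampler and on a column selection.
per_col = r.agg({"A": "mean", "B": "sum"})
per_col_sel = r[["A", "B"]].agg({"A": "mean", "B": "sum"})

# Several functions per column; the result has two-level columns.
multi = r.agg({"A": ["mean", "sum"], "B": ["mean", "sum"]})

# Both spellings should give the same frame.
pd.testing.assert_frame_equal(per_col, per_col_sel)
print(multi.columns.tolist())
```

The same pattern could be repeated with a groupby object in place of the resampler to compare the two APIs side by side.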
@jreback jreback added this to the 0.18.0 milestone Jan 25, 2016

jreback commented Jan 26, 2016

@jorisvandenbossche
this should address all of the remaining points: jreback@2cccc70

@jreback jreback closed this as completed in 1dc49f5 Feb 2, 2016