DataFrame pivot/melt-cast/etc. #29

Closed
HarlanH opened this issue Jul 16, 2012 · 10 comments

@HarlanH
Contributor

HarlanH commented Jul 16, 2012

Functionality to take a DF in wide form and make it long, and vice-versa

@HarlanH
Contributor Author

HarlanH commented Aug 8, 2012

We've got stack/unstack. Someone should sit down and figure out whether that's good enough or whether we need additional functionality.

@StefanKarpinski
Member

I'm sure you guys are aware of the reshape and reshape2 packages for R. Also worth considering are the find/sparse functions in Matlab, which do similar things and are a really classic design.

@johnmyleswhite
Contributor

I find that the stack / unstack API is much less helpful than the cast / melt API. At some point, I'd like to go through and clean this up.

tshort pushed a commit that referenced this issue Jan 22, 2013
@tshort
Contributor

tshort commented Jan 22, 2013

The existing stack(df, ["colX", "colY"]) is the equivalent of Hadley's melt(df, measure.vars = c("colX", "colY")). I added versions of melt and melt_df that use stack and stack_df. The only real difference is that the melt functions prefer id_vars.

That was the easy part. cast is the tough one.

@johnmyleswhite
Contributor

It would be really nice to use id_vars instead. That's just a set complement, right?

@tshort
Contributor

tshort commented Jan 22, 2013

Yes it is. That's what the new melt does. If df has columns x1 through x4, melt(df, ["x1", "x2"]) is the same as stack(df, ["x3","x4"]). Both functions also allow three arguments.
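
To make the set-complement relationship concrete, here is a minimal, dependency-free sketch. It is not the package's code: `stack_sketch` and `melt_sketch` are hypothetical names, the table is assumed to be a plain NamedTuple of column vectors, and columns are named with Symbols rather than the strings used above.

```julia
# Hypothetical helpers for illustration only; NOT the package's stack/melt.
# A "table" here is a NamedTuple of equal-length column vectors.

# Stack the measure columns into long form: one (variable, value) pair per
# original cell, with the remaining (id) columns repeated alongside.
function stack_sketch(tbl::NamedTuple, measure_vars::Vector{Symbol})
    id_vars = [c for c in keys(tbl) if !(c in measure_vars)]
    out = Dict{Symbol,Vector{Any}}(:variable => Any[], :value => Any[])
    for c in id_vars
        out[c] = Any[]
    end
    nrows = length(first(tbl))
    for m in measure_vars, i in 1:nrows
        push!(out[:variable], m)
        push!(out[:value], tbl[m][i])
        for c in id_vars
            push!(out[c], tbl[c][i])
        end
    end
    return out
end

# melt takes the id columns instead; the measure columns are whatever is left.
function melt_sketch(tbl::NamedTuple, id_vars::Vector{Symbol})
    measure_vars = [c for c in keys(tbl) if !(c in id_vars)]
    return stack_sketch(tbl, measure_vars)
end

df = (x1 = ["a", "b"], x2 = [1, 2], x3 = [10.0, 20.0], x4 = [30.0, 40.0])
melted = melt_sketch(df, [:x1, :x2])
stacked = stack_sketch(df, [:x3, :x4])
```

In this sketch `melted` and `stacked` hold the same long-form columns, which is the equivalence described above.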

@johnmyleswhite
Contributor

Ah. Much better! This is looking really promising.

@tshort
Contributor

tshort commented Jan 24, 2013

More changes here. I added a simplistic pivot_table function. The purpose is like Hadley's dcast or Wes's pivot_table. The format is pivot_table(d, R, C, D, fun) where R and C are vectors indicating columns that are to be pivoted to either rows or columns in the results. D is the column to take for data. fun is the function to apply to aggregate (defaults to mean if left off).
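
A rough sketch of the idea, under simplifying assumptions: `pivot_sketch` is a hypothetical name, it takes a single row key and a single column key rather than the vectors R and C described above, the table is a plain NamedTuple of column vectors, and the result is a Dict keyed by (row, column) rather than a DataFrame. It is not the actual pivot_table implementation.

```julia
using Statistics  # standard library; provides mean

# Hypothetical sketch only; NOT the package's pivot_table.
# r and c name the columns whose values become row and column keys,
# d names the data column, and fun aggregates the values in each cell.
function pivot_sketch(tbl::NamedTuple, r::Symbol, c::Symbol, d::Symbol, fun = mean)
    groups = Dict{Tuple{Any,Any},Vector{Any}}()
    for i in eachindex(tbl[d])
        key = (tbl[r][i], tbl[c][i])
        # Collect every data value that falls into the same (row, column) cell.
        push!(get!(groups, key, Any[]), tbl[d][i])
    end
    # Reduce each cell to a single value with the aggregation function.
    return Dict(k => fun(v) for (k, v) in groups)
end

sales = (region  = ["north", "north", "south", "south", "north"],
         quarter = ["q1", "q1", "q1", "q2", "q2"],
         amount  = [1.0, 3.0, 5.0, 7.0, 9.0])

wide = pivot_sketch(sales, :region, :quarter, :amount)          # mean per cell
totals = pivot_sketch(sales, :region, :quarter, :amount, sum)   # sum per cell
```

Here the two "north"/"q1" rows collapse into one cell, so `wide[("north", "q1")]` is their mean and `totals[("north", "q1")]` is their sum.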

@tshort
Contributor

tshort commented Jan 24, 2013

Also, I changed the argument order for unstack to match that of pivot_table.

@tshort
Contributor

tshort commented Feb 14, 2013

I'm closing as we've got most basics between stack, unstack, melt, and pivot_table. pivot_table is like Hadley's cast. It's still fairly limited, so folks can add specific feature requests.

@tshort tshort closed this as completed Feb 14, 2013
nalimilan pushed a commit that referenced this issue Jul 8, 2017
* don't bundle with my other PR

* use sprint to test output

* try and account for different line endings when hashing

* fix tests

* hashing
rofinn pushed a commit that referenced this issue Aug 17, 2017
nalimilan pushed a commit that referenced this issue Aug 25, 2017
quinnj pushed a commit that referenced this issue Sep 2, 2017