requesting new feature which covers stack, unstack and permutedims in a simpler way (at least conceptually) #2732
Comments
If I understand your example correctly you want this (I am using your data):
Is this correct? (except that this pattern will not automatically fill missing levels with
That is correct, simple transposing within each group. Just as you mentioned
This is intentional. In particular, in your examples above it was not clear to me what the rules for generating column names should be when explicit information about the column name is missing.
I think some generic names can be used, for example c1, c2, ....
Anyway - I think the best way to move forward in this case is:
When developing the functionality, ideally you can think about how your proposal plays with the following earlier related requests: #1181, #2698, #2422, #2215, #2205, #2148, #1839
I am not sure if this has been discussed or not, but I thought it may be an interesting feature to add.
The `stack()` and `unstack()` functions are very good tools for reshaping a DataFrame; however, whenever I am going to use these two functions I scratch my head about how to use them. `permutedims()` is ok but not very flexible. In the following I am proposing a new feature for transposing a DataFrame, call it the `T` function, that may simplify the reshaping of a DataFrame.

Let the first argument of this function be a DataFrame (`df`), and the second argument be a list of variables (`[:v1,...,:vp]`) that I am interested in transposing. In practice these variables should be somehow homogeneous (i.e. all numeric or all strings). It means that `df[!, [:v1,...,:vp]]` is a rectangular array of homogeneous data (in some sense).

In the simplest form, `T(df, [:v1,...,:vp])` returns a new DataFrame in which the rows of `df[!, [:v1,...,:vp]]` become its columns and the columns of `df[!, [:v1,...,:vp]]` become its rows. Since `df` doesn't have row names, the column names for the new DataFrame can be generated, e.g. `[:c1,..., :cq]` (or whatever). However, the new DataFrame has an extra column which holds the names of the transposed variables. E.g. if `df` is

`T(df,[:v1,:v2,:v3])` is

Now suppose that my data are grouped by some variables, e.g. in the following data my data are grouped by variable
`:Country`.

For this example, `T` accepts a third argument which is the grouping variable(s); it does the same thing as in the simplest case but within each group, and the output has an extra column which keeps track of the group information, i.e. the output of `T(df,[:Pop2000,:Pop2010,:Pop2020], [:Country])` is

However, for this example `:c1` and `:c2` are actually meaningful: they are the populations for male and female. So `T` can accept an optional argument giving the column whose values are used as names instead of `c1`, .... I.e. the output of `T(df,[:Pop2000,:Pop2010,:Pop2020], [:Country],:name)` is

When a group has fewer rows, the output table just fills it with missings.
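The grouped transpose described above can be approximated with functions that already exist in DataFrames.jl. The following is only a sketch under assumptions: `transpose_within_groups`, its `names_from` keyword, the helper column `_newcol` and the sample data are hypothetical stand-ins for the proposed `T` and for the tables in this issue, not an existing API.

```julia
using DataFrames

# Hypothetical data: country "B" has one row fewer than country "A".
df = DataFrame(Country = ["A", "A", "B"],
               Sex     = ["male", "female", "male"],
               Pop2000 = [10, 11, 20],
               Pop2010 = [12, 13, 22])

# Hypothetical helper standing in for the proposed T(df, cols, groups, name).
function transpose_within_groups(df, cols, groups; names_from = nothing)
    tmp = select(df, vcat(groups, cols))
    if names_from === nothing
        # Generic names c1, c2, ... numbered within each group.
        tmp = transform(groupby(tmp, groups),
                        groups[1] => (x -> string.("c", 1:length(x))) => :_newcol)
    else
        # Take the new column names from an existing column of df.
        tmp[!, :_newcol] = string.(df[!, names_from])
    end
    # Long format with columns: groups..., _newcol, variable, value.
    long = stack(tmp, cols)
    # One row per (group, transposed variable), one column per _newcol value.
    return unstack(long, vcat(groups, :variable), :_newcol, :value)
end

generic = transpose_within_groups(df, [:Pop2000, :Pop2010], [:Country])
named   = transpose_within_groups(df, [:Pop2000, :Pop2010], [:Country];
                                  names_from = :Sex)
```

In this sketch `unstack` is what fills the shorter group with `missing`, so country "B" gets a `missing` entry in the second generated column. The row order of the result is whatever `unstack` produces and is not sorted by group.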
Actually, `T` can handle most of what `stack`, `unstack` and `permutedims` do.

Stack

Let us see how it works when we want to stack. I borrow the following example from help

`stack(df,[:c,:d])` is similar to `T(df, [:c,:d],[:a,:b])`, i.e. I want to transpose `:c` and `:d` in each group constructed by `[:a, :b]`.

unstack
Another example from help, let `df` be (I reordered `:id` and `:a` for the sake of demonstration)

`unstack(long, :id, :variable, :value)` is similar to `T(df,[:value],[:id,:a], :variable)`, i.e. transpose `:value` within each group constructed by `[:id,:a]`.

Another example

If I want to have the information of each patient in one row I can use `T(df, [:measure],[:Hospital,:Patient], :visits)`, i.e. transpose all `measures` of a given `patient` and put `missing` when a group has fewer rows than the others.
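For comparison, the existing `stack` and `unstack` calls that the proposed `T` is meant to cover can be run as follows. The data frames here are hypothetical, since the tables from the help examples are not reproduced in this text; only the column names follow the calls quoted above.

```julia
using DataFrames

# Hypothetical wide data in the spirit of the `stack` help example.
df = DataFrame(a = [1, 1, 2], b = ["x", "y", "x"],
               c = [0.1, 0.2, 0.3], d = [1.1, 1.2, 1.3])

# Existing API: :c and :d are stacked, :a and :b are repeated as identifiers.
long = stack(df, [:c, :d])

# Hypothetical patient data: patient 2 in H1 and patient 1 in H2
# each have a single visit.
patients = DataFrame(Hospital = ["H1", "H1", "H1", "H2"],
                     Patient  = [1, 1, 2, 1],
                     visits   = [1, 2, 1, 1],
                     measure  = [5.0, 6.0, 7.0, 8.0])

# Existing API: one row per (Hospital, Patient), one column per visit number;
# combinations without a second visit are filled with `missing`.
wide = unstack(patients, [:Hospital, :Patient], :visits, :measure)
```

The `unstack` call is the existing counterpart of `T(df, [:measure],[:Hospital,:Patient], :visits)`: the row keys play the role of the grouping variables and the `:visits` column supplies the new column names.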