Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Used Base.names #22

Closed
oxinabox opened this issue May 10, 2019 · 20 comments · Fixed by #83
Closed

Used Base.names #22

oxinabox opened this issue May 10, 2019 · 20 comments · Fixed by #83

Comments

@oxinabox
Copy link
Member

After some reflection, I think we should overload Base.names
DataFrames does it, and we have a way better argument than they do.

@timholy
Copy link

timholy commented Sep 16, 2019

On the other hand, struct uses fieldnames and NamedTuples doesn't define either one. Ref JuliaImages/ImageAxes.jl#33.

I'm not sure I'm against names, but I'm not sure I'm for it either. I know what MyModule.funcname does. I know what mystruct.fldname does. Heck, I know what df.colname does. What does nda.x do?

@oxinabox
Copy link
Member Author

NamedTuples doesn't define either one
NamedTuple instances do define propertynames

julia> a = (a=1, b=2)
(a = 1, b = 2)

julia> propertynames(a)
(:a, :b)

What does nda.x do?

Errors.
We definately should not use fieldnames, or propertynames
nda[x=1, y=2] on the other hand...

I'm not sure I'm against names, but I'm not sure I'm for it either.

My feelings exactly.
In practice though I do want this function often,
but that might be a symptom, of another thing,
and also can always call it dimnames and export it.

@timholy
Copy link

timholy commented Sep 16, 2019

I generally like dimnames. And yes, the functionality is very useful, just wondering what name it should have.

@Tokazama
Copy link
Contributor

I'm coming around to dimnames. It has a succinct and descriptive name. This also leaves the use of names open for concurrent use with dimnames if someone in the future decides that they need to use it concurrently for an array with dimension names.

@Tokazama
Copy link
Contributor

Is this a desirable feature that's just waiting for a pull request, or is it a no go?

@oxinabox
Copy link
Member Author

Yeah, I think lets make dimnames happen.

@nickrobinson251
Copy link
Contributor

nickrobinson251 commented Nov 23, 2019

Any thoughts on dims over dimnames for this function?

I think dims pairs nicely with axes. And I think the useage matches the dims keyword.

I slightly prefer it. But I am also fine with dimnames and think most important is having this functionality in an easier-to-use way than NamedDims.names.

@mcabbott
Copy link
Collaborator

+1 for giving the internal function NamedDims.names a better and non-clashing name (and a dep-warning). My vote would actually be for names_or_defaults to be 100% clear, but dimnames works too.

I still think that in addition to this fix, we should also make Base.names work on NamedDimsArrays only, and make names(A,d) work like size(A,d), axes(A,d) etc.

@Tokazama
Copy link
Contributor

I slightly prefer dimnames because it's more descriptive and internally there is already dim which is really close to dims.

The problem with using Base.names is that it precludes any parent type from using it (e.g., something with Table traits). One could always wrap a NamedDims array inside of another array if you want to change the behavior of Base.names, but that would kind of eliminate the point of using this package then.

@timholy
Copy link

timholy commented Nov 23, 2019

I haven't reviewed this recently, but just on general principles I'd urge caution about supplying default names if the array doesn't already have them---my vote is to return nothing or missing in such cases, and force the coder to then call defaultnames if s/he really wants names. See discussion in JuliaImages/ImageCore.jl#89.

@nickrobinson251
Copy link
Contributor

nickrobinson251 commented Nov 23, 2019

internally there is already dim which is really close to dims.

Actually, that has convinced me in favour of dims! 😆

I think this is nice/good:

julia> nda = NamedDimsArray(rand(2, 3), (:r, :c));

julia> dim(nda, :c)
2

julia> dims(nda)
(:r, :c)

I guess then maybe it'd also be nice to have:

julia> dim(nda, 2)  
:c  

@mcabbott
Copy link
Collaborator

mcabbott commented Nov 23, 2019

Not suggesting changing it now, but dim already seems confusing IMO, since it returns dims:

julia> dim(nda, (:c, :r))
(2, 1)

If the terminology is to call the integers dimensions, and the symbols names (or dimension names), then I think functions should follow what they return.

@timholy
Copy link

timholy commented Nov 23, 2019

So dim returns an Int but dims returns a tuple of symbols? Seems confusing.

@nickrobinson251
Copy link
Contributor

related #10

maybe this should be dims in all cases 😆

that is

julia> nda = NamedDimsArray(rand(2, 3), (:r, :c));

julia> dims(nda)
(:r, :c)

julia> dims(nda, :c)
2

julia> dims(nda, 2)  
:c  

julia> dims(nda, (2, 1))  
(:c , :r)

@Tokazama
Copy link
Contributor

It may make more sense for dim to be finddims so it's consistent with the other find* functions in base.

@mcabbott
Copy link
Collaborator

mcabbott commented Nov 23, 2019

But every second function contains something like numerical_dims = dim(A, dims), which needs to always return the integer(s). You need a function which does this. It could be called finddims or numerical_dims but doubt the churn is worth it. It shouldn’t be called dims as that will clash with the name of the keyword, in every second function.

@oxinabox
Copy link
Member Author

There is a tiering system of things that are hard in software engineering.

  • Easy: hello world
  • Moderate: Travelling Salesman
  • Hard: Naming things
  • Impossible: Video Conferencing software

@Tokazama
Copy link
Contributor

TBH I don't feel strongly at all about renaming dim. My comment was mostly an attempt to reduce the cognitive ambiguity between dim and dims. However, it sounds like dims has other shortcomings.

I haven't reviewed this recently, but just on general principles I'd urge caution about supplying default names if the array doesn't already have them---my vote is to return nothing or missing in such cases, and force the coder to then call defaultnames if s/he really wants names. See discussion in JuliaImages/ImageCore.jl#89.

Given the potential complexity of default names in this package (how they're encoded internally, wild card names, etc.), perhaps it's best to just decide what the user facing API is (e.g., names, dimnames) and punt the specific of how default names will work in another issue. I don't think the choice here will necessarily limit implementing default names under dimnames, default_names, etc..

@mcabbott
Copy link
Collaborator

mcabbott commented Nov 23, 2019

But #83 just renames NamedDims.names to dimnames, right? This does still return default names, as it should e.g. for new_names = unify_names(dimnames(a), dimnames(b), dimnames.(cs)...) to work correctly. Because unify_names still wants :_ wildcards.

Ah it also exports this function. Perhaps the question is whether the exported function should be this one with wildcards, or something else.

@oxinabox
Copy link
Member Author

Perhaps the question is whether the exported function should be this one with wildcards, or something else.

Indeed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants