Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document privacy of type-fields #12064

Closed
mauro3 opened this issue Jul 8, 2015 · 85 comments · Fixed by #40533
Closed

Document privacy of type-fields #12064

mauro3 opened this issue Jul 8, 2015 · 85 comments · Fixed by #40533
Labels
docs This change adds or pertains to documentation

Comments

@mauro3
Copy link
Contributor

mauro3 commented Jul 8, 2015

It is not documented that fields of types are considered private (at least by some people), as @nalimilan mentioned over in julia-users. It's probably worth giving some guidelines in the documentation when fields are considered private. As that is probably not always true, for instance when using a type just to store stuff.

Also, using an leading underscore is sometimes used to denote privacy of methods or (extra?) privacy of fields. When should _ be used vs just assume privacy of fields? When should methods use a _?

It is probably worthwhile to reach some sort of consensus about this and documenting it before going forward with field overloading (#1974 and PR #5848).

@ScottPJones
Copy link
Contributor

Tagging all fields you want to keep private with a _ is a lot more of a bother than simply adding public/private keywords to the definitions where you can actually get a warning or error if misused.
If the default is that fields of types should be considered private (something I've yet to see in the Julia codebase or packages), then maybe using a "public" keyword would be better.
Simply documenting something that is not followed in practice is not really that helpful.

@mauro3
Copy link
Contributor Author

mauro3 commented Jul 8, 2015

@ScottPJones: I think that is a separate issue: language feature vs documentation/convention. (Not sure whether one has been opened yet).

@KristofferC
Copy link
Member

Field overloading would give the same flexibility to the dot notation as the @property annotation in Python, right? In that case we could just adopt the same convention as Python, fields are considered public by default unless they start with an underscore. Direct field access is no longer problematic since the implementation can now change in a backwards compatible way by overloading the field access.

@ScottPJones
Copy link
Contributor

Yes, that should be opened as well. I still stand by my last point, that documenting something that people simply don't follow, or feel is incorrect (see @tknopp's comments about immutables in the same julia-users discussion) isn't going to do much good. The horse has already left the barn.

@rofinn
Copy link
Contributor

rofinn commented Jul 8, 2015

I'm in favor of a _ convention as well. It would also be easy enough to add it to Lint.jl. We might also want to use the same convention for functions that are private to a file. @ScottPJones: The public/private syntax always seemed a bit verbose to me, but I could be convinced otherwise.

@ScottPJones
Copy link
Contributor

If the default for all fields of a type is private, then you are simply adding 7 characters "public " to those fields. On the other hand, if you use a _ convention (which doesn't really buy you any assurance at all that people won't use your internals!), you'll be adding a _ for most every field, and most every field reference. I think _ is a lot more verbose in practice.

Also, for 29 years I worked on a language with probably >100,000 programmers using it, and before we added public/private (default private for functions in a module) (which was about 18 years ago, IIRC), we constantly had problems with programmers using internal interfaces, and then complaining that we broke things they depended on. I'd like the programmer community of julians get to be as large or larger, and I'd hope that julia not repeat mistakes I've had to painfully deal with in the past.

@carnaval
Copy link
Contributor

carnaval commented Jul 8, 2015

Why do you only consider the PoV of the writer of a library (/piece of code) ?

I've been in the opposite situation many times when I'm using someone's code and they made some things private because, after all, it's "good practice" to hide most of your internals. However, I know what I'm doing, I read the source, the compiler could do it but refuses because of some keyword ?

In other words, it's not only mindless idiots using your pristine code, sometimes you are also using some mindless idiot's code ;-)

@ihnorton
Copy link
Member

ihnorton commented Jul 8, 2015

As currently envisioned, I think getfield/setfield overloading would facilitate a privacy convention for people who want it, without any other changes to the language. To @carnaval's point, fields would still technically be accessible using Core.getfield (or whatever it ends up being called), but anyone who does that will own their breakage.

@ScottPJones
Copy link
Contributor

No, I'm actually considering both. As a user of somebody else's code, I'd like to know just what the real API is, what the contract I have with the module / library / package.
That way I don't waste any of my time when the owner of that package decides to rewrite everything.

If you see that the code you are using does not give you some needed functionality in its API, then you can, depending on if it is open source or not, then you'd file an enhancement request (that's what I would get from customers), for open source, you'd raise an issue (if you don't know how to fix it yourself), or submit a PR (which a smart guy like you would probably do!). For open source, you'd even have the option of forking the darn thing if the author(s) are not responsive (or died, or whatever), and people could start using your new improved version that does have the functionality you want.

Mucking around in the internals only helps you, makes your code more fragile, and doesn't help anybody else. Doing the above helps the entire community.

I've been on both sides of the fence throughout my career, and I've had to deal with my own share of mindless idiots! (luckily, I haven't seen run into any yet in the julia community [we may disagree, yes, but I do know they are brilliant])

Maybe this could be handled like deprecations. privatewarn == 0 means no warnings,
privatewarn == 1 means a single warning, privatewarn == 2 means give an error if some mindless idiot is mucking about in my beautiful code! 😀
Would that make this not such a bother to you?

@nolta
Copy link
Member

nolta commented Jul 8, 2015

The road to C++/Java is paved with good intentions. -1e6 to any sort of public/private wankery.

@ScottPJones
Copy link
Contributor

@ihnorton The problem with that is, it is still just a convention, and is not easy to find out if people are breaking that convention. Also, having to use .. every time I just want to access the fields in my own types directly would be incredibly annoying, IMO.
@nolta The Julia code base and packages already have problems, because it has no mechanism to keep the abstraction and the implementation separate. This has nothing to do with C++/Java.
This is more about avoiding the "object orgy" that happens in a lot of dynamic languages. See https://en.wikipedia.org/wiki/Object_orgy
I also see this as allowing people to get a bit closer to the niceness that CLU had, if they want to.

@ihnorton
Copy link
Member

ihnorton commented Jul 8, 2015

getfield(::MyType, Any) = error("No!") would be something more than a convention. The existence of Core.getfield is an escape-hatch. If people use that, it's not my/your/our problem.

Also, having to use .. every time I just want to access the fields in my own types directly would be incredibly annoying, IMO.

so use setters and getters...

@lobingera
Copy link

I tried to follow this already at the mailing list and questioned myself: where did i follow this or the other convention in the last ~30 years of programming? I entered object orientation late and always found the private/public differentiation as something obscure. I understand where it comes from and why it's really, really needed, but in writing code and especially in rapid prototyping it's defining a speed limit.

Two things come here to my mind:

  1. A real programmer can write fortran programs in any language -> If you try to stop people to express their ideas with certain language constructs, they'll certainly find ways around.
  2. In SW Engineering all problems are communication problems -> If you cannot transport the message, don't use this, well...

tl;dr: a convention for the name that can be checked by a lint should be enough.

@ScottPJones
Copy link
Contributor

@ihnorton Why would I want the extra complication of adding setters and getters, just to access my own internal structures? That seems like a waste of my time.

@lobingera How do you run lint on the programs that hundreds of thousands of people have written, that you have no access to?
What happens if you make what you think is a minor internal change, and you break software running all over the world? (which can have severe economic effects as well, if your company is selling software).

@lobingera
Copy link

@ScottPJones

Well, i somehow believe in the superiority of Open Source. Therefore the situation that i have no access to the 'other' code doesn't happen.
But actually i meant, having a strong warning and structural checking system on my side, when i contribute code should be enough.

@rofinn
Copy link
Contributor

rofinn commented Jul 8, 2015

@ScottPJones I'm still in favor of the convention and linting approach cause it seems like the path of least resistance. However, I suppose we could have a pub keyword that autogenerates the setters and getters for you. I don't think adding public/private is really going to solve your problem if you aren't using tests cases, linters, etc that should be finding most of these issues. Anyone who has used C++ knows that it can be extremely easy to break your software with a minor internal change ;) (ie: a memory leak).

@malmaud
Copy link
Contributor

malmaud commented Jul 8, 2015

-1 to language-level enforcement of private/public. This is a language
primarily for rapid implementation of scientific algorithms; not Java.

On Wed, Jul 8, 2015 at 11:33 AM, Andreas Lobinger [email protected]
wrote:

@ScottPJones https://github.com/ScottPJones

Well, i somehow believe in the superiority of Open Source. Therefore the
situation that i have no access to the 'other' code doesn't happen.
But actually i meant, having a strong warning and structural checking
system on my side, when i contribute code should be enough.


Reply to this email directly or view it on GitHub
#12064 (comment).

@Keno
Copy link
Member

Keno commented Jul 8, 2015

Yeah, a huge -1 to mandatory access enforcement from me too. In C++, they only get in the way if you're trying out something new before you know what the right interface is. And if you do know what the right interface is, having to access private fields is a code smell that could be caught by Lint.

@ScottPJones
Copy link
Contributor

@Rory-Finnegan the convention and linting approach does nothing to help if you don't have access to the code using the modules/packages that you write. Before we added the ability to optionally use public/private tags for functions, properties, and methods, we used to waste a very large amount of time due to customers having abused some code that was supposed to be internal only (to say nothing of the problem of said customers being upset that their wonderful code stopped working).

@malmaud Why should julia be limited to scientific computing? That seems very narrow. Julia seems to me to be able to replace C, C++, Java, and Python for most of my programming (and I'm not doing scientific computing).

@Keno What is the problem with an optional feature, that the author of a module can use or not as they see fit? Also, if it is controlled via a switch like depwarn, it can be turned complete off for "rapid application development" (that could even be the default setting). There is nothing "mandatory" about what I've been proposing for Julia.

@ScottPJones
Copy link
Contributor

I understand where it comes from and why it's really, really needed, but in writing code and especially in rapid prototyping it's defining a speed limit.

@lobingera So, you do understand that encapsulation is really, really needed.
I've never found that the ability to mark things as internal (private) or externally visible as ever slowing down my writing any code, nor did it slow down any of the customers (they were very happy about it, after it was introduced) (and the language I worked on was for precisely "rapid application development").
The customers were all writing code that had to be maintained reliably for decades, in mission critical applications (hospitals, banks, on-line trading, etc.).

To me, always having to run Lint while I'm developing something, is a much bigger speed limit than simply having the compiler warn me immediately.

@mikewl
Copy link
Contributor

mikewl commented Jul 8, 2015

I think, this could be done somewhat like this:

  1. Non-annontated fields are a grey area and are accessible always (Lint can have a switch to add warnings for non-annotated fields as that warning should not be default).

  2. A keyword private or some other keyword is created that marks it as a private field. This field can be accessed then by implementing either getfield, setfield! or both depending on your needs. This would then allow for the field to be validated as it is being set externally for example or creating a read-only field.

    So this would be something extra that you only would add if you felt it was necessary for your work. I personally would never use it myself. However, if I wanted to use Julia in a field such as finance... I would be far more careful about encapsulation. This would simply be a way of enforcing encapsulation when necessary. I think it would be a useful feature but one that would not and should not be used very often. This way would also not break anything and not change anyone's workflow.

However, this isn't urgently needed. Right now I don't think very long running or critical systems are being written in Julia. I also feel that Julia is useful outside of scientific computing. That currently though is not its focus. This probably should be revisited at a later date once field overloading has been implemented as without it this can't be implemented as nicely.

@vtjnash
Copy link
Member

vtjnash commented Jul 8, 2015

Yeah, a huge -1 to mandatory access enforcement from me too. In C++, they only get in the way if you're trying out something new before you know what the right interface is. And if you do know what the right interface is, having to access private fields is a code smell that could be caught by Lint.

it certainly doesn't stop LLVM from iterating their API continuously. additionally, in many cases, they enforce the public/private distinction via public / private headers – i'm not sure how well that transfers to a language based on dynamic reflection instead of headers, however.

@hayd
Copy link
Member

hayd commented Jul 8, 2015

Could this be done in a package? E.g. with a @private macro? Taking a keyword seems a mite premature.

As mentioned _ prefix is used in python to denote "private", without enforcement. Python's done without.

@ScottPJones
Copy link
Contributor

@vtjnash No, this isn't a cure-all, but at least in my experience, having the option of enforcing at least some level of encapsulation is critical to building reliable systems. Since I also stated that the default could be to simply not even warn or give error messages (a la depwarn), it wouldn't effect absolutely anybody who didn't need this functionality.
Just because the LLVM developers seem to like to change things every release (and there seems to be an endless stream of bugs that Julia has to deal with) doesn't mean that that is a good thing!
How LLVM deals with public/private distinctions I think is totally orthogonal to how Julia might deal with it.

@hayd if it can be done with say @public and @Private macros, that would be just fine, I really don't care so much about the syntax, but rather the functionality I need to be able to deliver reliable systems that will still be working in 30 years time. I still don't have my head wrapped around how to deal with Julia meta-programming yet, so I have no idea what is possible.

@Mike43110 From conversations I had at JuliaCon, I think there are actually a number of people who would welcome anything that can be done to help writing maintainable, reliable, large applications in Julia. We (myself and the Belgian company I'm working for) are definitely interested in using Julia already for critical systems (which is why I take these issues so seriously).

@ScottPJones
Copy link
Contributor

Although Python gets by fine without any enforced privacy, Python is also notorious for being poorly suited for writing large scale reliable systems - lack of privacy of internals is only one factor among many, but it contributes.

@hayd As @tkelman noted above (in the julia-users thread), Python's unenforced _ convention really is not enough.

@tkelman
Copy link
Contributor

tkelman commented Jul 8, 2015

You also neglected to quote my other point. Prototype implementation or bust. Not worth spilling bits talking about it.

@ScottPJones
Copy link
Contributor

I responded to that in the julia-users group.
It's not spilling bits, I think it's been quite useful.
I wouldn't have thought that it would be possible to do this as a macro, not yet knowing all their ins-and-outs yet, but if so, then that would be fine for a prototype (like traits now, I suppose).

@mikewl
Copy link
Contributor

mikewl commented Jul 8, 2015

@ScottPJones @public and @private please! I am sure those people aren't interested in this.

I don't know enough about macros to know if it would be possible. It would make for a good prototype if possible though.

@tknopp
Copy link
Contributor

tknopp commented Jul 8, 2015

@ScottPJones: Please stop quoting things out of the context this is very misleading. I have made pretty clear that for mutable types it is common practice in julia that fields are private and that for immutables this is not entirely clear.

+1 for documenting the common practice and working on better interface support

Julia is already an excellent language for writing maintainable large scale applications. The type system including subtyping of abstract types helps a lot in this.

@ghost
Copy link

ghost commented Jul 8, 2015

For what little that my opinion matters, -1 to making information hiding a part of the core language. +1 for better documenting what we consider to be idiomatic Julia, perhaps even a manual page on "Writing Idiomatic Julia Code".

I would also like to propose "Access Equality" to replace "Consenting Adults", putting myself firmly in the pro-equality camp.

@tkelman
Copy link
Contributor

tkelman commented Jul 10, 2015

I don't see the point in manually adding getter methods for every field of every type and changing all other code to use them for those cases where the data structure is an integral part of the API, and there's no way to make non-breaking changes anyway.

I imagine a grep of package code that uses sparse matrices would show that 90% guess to be overly optimistic.

@mikewl
Copy link
Contributor

mikewl commented Jul 11, 2015

I agree with @tkelman. When you have a data storage type, why would you want to write getter/setters which are just mydata(myobject) = myobject.data.

I do have a thought though. Is this necessary if name aliasing is allowed in #1974? If you wanted to mimic a private field you could just make a getter/setter return an error. Read-only fields would then be fields with only a get method added. You could still access the raw fields from .. or whatever operator is used for grabbing the raw value. A macro @private could be used to automatically generate the error methods.

For methods, modules already exist and using should be recommended as the default way to import methods. Public methods are then the ones exported and private are all the unexported methods.

@rofinn
Copy link
Contributor

rofinn commented Jul 11, 2015

@Mike43110 While, I'm fine with the public API of a module being defined by what it exports, I disagree with using being the mechanism by which that is enforced. My reasoning is that I like the explicitness and control over my namespace that import provides over using, but then I can obviously import anything I like without any warnings.

@ssfrr
Copy link
Contributor

ssfrr commented Jul 11, 2015

I agree that there are definitely times where the fields really are integral to the API and it seems like extra boilerplate to add accessor methods. I've always liked the opening to python's PEP8 which makes it explicit that the style guidelines are for encouraging default behavior and consistency, but not to be followed at the expense of clarity and good design.

So far it seems like @tkelman's sparse matrix example is the most concrete counter-example. Can you point me/us to some of the code that exemplifies where accessing the fields is the better API? I think it would be great to have some more examples of when the "methods-over-fields" rule should be broken.

@tkelman
Copy link
Contributor

tkelman commented Jul 11, 2015

Maybe that 90% number was reasonably accurate after all, since most quick searches for uses of the colptr field out in packages do indeed look like they could get away with using nzrange if they were targeting 0.4-only. Though there are a lot of times in personal code where I want to do things like append or delete or modify a column in-place, that require pushing or popping to colptr and appending/deleting from rowval and nzval. Sometimes you just don't have all that much abstraction over the underlying data structure, if you want to write high-performance code.

@tknopp
Copy link
Contributor

tknopp commented Jul 11, 2015

Where exactly is the issue? Don't you get rowval and nzval by the respective getter functions? It seems to be that the only thing missing is a getter for colptr. Although this entire example is a little tricky because these functions can be used to bring a Sparse matrix into an inconsistent state.

The Image type from Images.jl is a similar example, which enhances an ordinary array be additional information. There is however never a need to directly access a member of the Image type. Instead one calls data(im) and by this obtains the underlaying array.

@tkelman
Copy link
Contributor

tkelman commented Jul 11, 2015

The issue is adding getter functions seems like a pointless exercise if they'll only ever be used for a single type.

@tknopp
Copy link
Contributor

tknopp commented Jul 11, 2015

You may call it pointless but its common practice to decouple the interface from the implementation. The Image implementation shows that there are people that care for this distinction.

But this is all a little subjective. One can also have in C++/Java public member variables. Its simply discouraged to do so.

One could follow the Python/C# property business in order to have clean public properties. I am just not sure if its worth the complexity.

@tkelman
Copy link
Contributor

tkelman commented Jul 11, 2015

My argument is it's not always necessary and it doesn't always buy you anything except polluting your namespace. You aren't really decoupling the interface from the implementation if the getters are all trivial - you've just moved any potential renaming breakage from the field name to the getter name.

@tknopp
Copy link
Contributor

tknopp commented Jul 11, 2015

We don't have to argue about this point. It is very subjective and I am not a fan of enforcing anything in that direction.

The suggestion that fields are private is for people very used to OO. In Java/C# it is absolutely standard to program against interfaces and these don't expose fields.

(But again, the sparse matrix type is tricky because IMHO rowval and colptr are not really public because you can can render the object into an inconsistent state by direct access.)

@tkelman
Copy link
Contributor

tkelman commented Jul 11, 2015

If it were impossible to do anything risky, we'd end up in a situation like Matlab where you have to jump through hoops like writing a mex file to actually touch the data structures you sometimes need to work with.

I'm less convinced about safety and privacy on a language level as the discussion goes on. I just don't think the large-system, strong static safety guarantees design space is something that Julia will ever be perfectly suited for. To do that right it needs to be a design consideration from the very start, and it leads you to something that looks a lot like Ada or Rust.

@nalimilan
Copy link
Member

We should probably recommend avoiding direct field access and using accessor functions in general, but still consider it OK for some types to define an official fields API and tell users to use them directly. Indeed, for most types you shouldn't play with the internal representation, but for some, like sparse matrices, the code sometimes needs to be tightly related with the representation to be efficient. There would be little benefit in making it use accessors, as it only makes sense when applied to this particular representation of sparse matrices.

But I still think it's worth recommending avoiding direct field access in most cases, i.e. "in absence of strong reasons against it, provide users with accessors; only require them to directly access fields for operations that are necessarily dependent on the internal representation of the object".

@KristofferC
Copy link
Member

With getfield overloading, access to fields using the dot notation is no longer mixing interface and implementation. The implementation can change in backwards compatible ways by simply overloading the field. If you want to add some error checking later, just overload the field and add the error check before returning the field. The nice thing is that you a priori do not need to write getters and setters. Only if they are later deemed necessary they are added and it is backwards compatible. Python seems to do perfectly fine using this paradigm. Not that we should strive to be Python but having to add six getters and setters for something that is basically a storage type because "fields are private" is soul crushing.

@tknopp
Copy link
Contributor

tknopp commented Jul 12, 2015

@KristofferC: Julia does not allow field overloading so this is just speculation. Properties are what seem to block #1974 because it will mix size(x) and x.size (see Jeffs comments in that thread)

@tkelman
Copy link
Contributor

tkelman commented Jul 12, 2015

I don't think anything is blocking #1974 other than implementation effort of properly finishing #5848. It will happen, though possibly not in time for 0.4, we'll have to see.

@tknopp
Copy link
Contributor

tknopp commented Jul 12, 2015

If this is so clear then this issue and #12083 are obsolete and can be closed.

@toivoh
Copy link
Contributor

toivoh commented Jul 13, 2015

I don't think that dot operator overloading resolves this entire issue. When I write code, I establish an interface (through documentation, and possibly aided by appropriate language features). That is my contract with the user. I want to make it as small as possible so that I am free to change the implementation afterwards.

With dot operator overloading, it will be a lot safer to establish an interface that includes field access. But for my implementation, I should still be able to use fields that are private (in the sense that I am free to change their meaning or remove them), without jumping through hoops such as creating dummy getfield overloads that throw an error when trying to access them.

@tknopp
Copy link
Contributor

tknopp commented Jul 13, 2015

@toivoh: This is all correct and fine. But until these things are landed it does not make sense to document the privacy of fields as it simply does not hold anymore after #1974. This issue is (as the title indicates) is about privacy of fields.

@taqtiqa-mark
Copy link

In case it helps.... the following Discourse comment provides and example of how to create private properties for a type.

vtjnash pushed a commit to vtjnash/julia that referenced this issue Apr 19, 2021
fredrikekre pushed a commit that referenced this issue Apr 20, 2021
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this issue May 4, 2021
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this issue May 9, 2021
johanmon pushed a commit to johanmon/julia that referenced this issue Jul 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs This change adds or pertains to documentation
Projects
None yet