String constructor truncates data. #32528
Comments
There is the issue that changing the current stealing behavior is technically breaking. It may be that no one is relying on this behavior so that it could be considered a "minor change" but we'd have to test that out. I'm guessing that the most common case is that after a byte vector has been "stolen" it is never used again so the code won't care if it's truncated or not.
Breaking this guarantee is to me "breaking" even if the code still (eventually) produces the same answer.
I wouldn't be surprised or even necessarily upset if this was considered too big a change for a minor release (though if possible, I'd still prefer it in a minor release). However, I think this is confusing semantics that goes against standard naming conventions and should at minimum be considered for Telling people that mutating functions are marked by a |
It's worth remembering what we did before, which was share data (if possible) but not truncate the vector. After all, truncating the vector is not necessary. Then you just need to tell people not to mutate the vector later. Frankly I liked that a bit better, since |
There was also an intermediate time when this would sometimes truncate the byte array and sometimes not, depending on details of how the byte array was constructed. For 2.0, this would be a great use case for @Keno's freeze/thaw stuff since you could just say that |
How much of the surprise factor here would be cured if the function were called |
This exact behavior once cost me an hour of debugging, before I finally figured out what was going on 😄. I would much prefer if |
Would it be crazy to just deprecate (and eventually delete) the |
That's definitely a breaking change, so unfortunately, yes, it's "crazy" for a 1.x release. We could, however, leave |
Ah, yes, sorry, I was predicating that on saving this for 2.0, per @MasonProtter's point.
Yeah, that's what I was hoping to avoid by forcing everyone to do the replacement from String to String! for those cases. |
If we're willing to wait until 2.0 to do this then yes, we can deprecate |
One thing to keep in mind with this discussion is that people often automatically generate constructors via a On one hand, we don't want to suddenly make that pattern less performant when applied to strings, but on the other hand, I think it's fair to say that I would be pretty baffled if I wrote something like |
In Keno's proposal you get that by making a copy of the vector. So we could also just use the existing |
Can this be marked for the 2.0 milestone so that it is not missed accidentally?
Could we pretend that (or implement it that way)
struct String
    utf8_codeunits::Vector{UInt8}
end
from which Not being different from any other
That's how it used to be, with the issue that any mutation to the input vector (to which, in this case, you would still hold a reference) is basically undefined behavior, since strings are assumed to be immutable. The alternative is unconditionally copying.
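The aliasing hazard described in this comment can be sketched with a stand-in type. `SharedBytesString` and `text` are hypothetical names made up for illustration; this is not how Base's `String` is implemented, only a minimal model of "share the buffer and never truncate".

```julia
# Hypothetical wrapper (NOT Base.String): it keeps a reference to the caller's buffer.
struct SharedBytesString
    utf8_codeunits::Vector{UInt8}
end

# Hypothetical accessor: snapshot the current bytes as a real String.
# The copy keeps the wrapped buffer intact.
text(s::SharedBytesString) = String(copy(s.utf8_codeunits))

bytes = UInt8[0x61, 0x62, 0x63]   # the bytes of "abc"
s = SharedBytesString(bytes)

before = text(s)    # contents as seen before any mutation
bytes[1] = 0x7a     # mutate through the reference the caller still holds ('z')
after = text(s)     # the "immutable" value has changed underneath us
```

In this model `before` and `after` differ even though `s` itself was never touched, which is the problem that either truncating the input or unconditionally copying it is meant to avoid.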
So a discussion came up on Slack that this behaviour is fairly surprising.
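For reference, the truncation named in the title can be reproduced with a short sketch. It assumes a Julia 1.x session, where `String(v)` on a `Vector{UInt8}` is documented to take over the buffer and truncate `v` to zero length. The variable names are illustrative, and this is not the snippet from the original discussion.

```julia
# Build a byte vector holding the UTF-8 bytes of "abc".
v = UInt8[0x61, 0x62, 0x63]

# The constructor takes ownership of the buffer...
s = String(v)

# ...and leaves the input vector truncated, despite the lack of a `!` in the name.
v_is_empty = isempty(v)

# Copying first leaves the caller's vector untouched.
w = UInt8[0x61, 0x62, 0x63]
t = String(copy(w))
w_length = length(w)
```

The `String(copy(w))` form is the documented way to avoid the truncation, at the cost of one extra allocation.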
One normally expects function calls to be explicit if they mutate data, and I think constructors like `String` should be held to similar standards. I see there was some discussion in #26093, and the conclusion seemed to be that since there is no type `String!`, naming the constructor `String!` doesn't make sense, especially because (according to those more knowledgeable than I) nearly 100% of uses of `String` want the memory-stealing behaviour.

I think it is worth revisiting this issue. I'd argue that even though one usually wants memory stealing, it's best to always be explicit when that is happening, either through a keyword argument, e.g. `String(x, steal=true)`, or through a different constructor name, `String!`.

The current behaviour seems like a real footgun to me, even if it's convenient when you know what you're doing.
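As a rough sketch of what the explicit opt-in could look like in user code today, the two spellings proposed above can be approximated with wrapper functions. `make_string` and `make_string!` are hypothetical names chosen for this example; neither exists in Base, and Base's `String` does not accept a `steal` keyword.

```julia
# Hypothetical helper (not part of Base): copy by default, steal only on request.
function make_string(v::Vector{UInt8}; steal::Bool=false)
    # With steal=true the buffer is handed to String and `v` is emptied;
    # otherwise a copy is handed over and `v` keeps its contents.
    return steal ? String(v) : String(copy(v))
end

# Hypothetical bang-named variant, signalling that the argument is mutated.
make_string!(v::Vector{UInt8}) = make_string(v; steal=true)

a = UInt8[0x68, 0x69]    # the bytes of "hi"
s1 = make_string(a)      # `a` is left as it was

b = UInt8[0x68, 0x69]
s2 = make_string!(b)     # `b` is emptied, as the `!` warns
```

Either spelling makes the mutation visible at the call site; the trade-off is that the non-stealing default pays for a copy.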