Values, variables, pointers, and references #821
pointers. Explicitly discuss the use cases of references in C++ and propose specific approaches to address those use cases.

This is something that we've been discussing across the team for a long time, and while there are definitely still challenges in this space we will need to address going forward, I want to try to codify where we are at and provide for a few fundamentals that haven't really been spelled out previously.

That said, I've been staring at this document for *far* too long in a draft, and so I may be missing parts that are confusing or need work, so any help from folks to make this a coherent story is definitely appreciated. The current structure and wording is heavily informed by several reviews and suggestions from @zygoloid, @josh11b, and @wolffg with much appreciation. =]

Some core examples of the consequence of this proposal:

- Using `let` where we currently use `var` to declare a locally scoped immutable view of a value:
  ```
  let index: i32 = 42;
  ```
- Specifying that parameters are, by default, immutable views of values like `let`. These should behave like C++ `const` references but allowing copies under *as-if*.
- Specifying that `var` creates an *L-value* and binds names to it.
- Defining `var` as being allowed to nest within `let` to mark a part of a pattern as an L-value:
  ```
  let (x: i64, var y: i64) = (1, 2);
  // Ok to mutate `y`:
  y += F();
  ```
  When the entire declaration is a `var` the `let` can be omitted. This works with function parameters as well to mark *consuming* an input into a locally mutable L-value:
  ```
  fn RegisterName(var name: String) {
    // `name` is a local L-value in this function and can be mutated.
  }
  ```
- Implementing operators by rewriting into method calls through an interface, which can then use `[addr me: Self*]` to implicitly obtain a mutable pointer to an object for mutating operators.
- Providing user-defined pointer-like types and the implementation of both the `*`-operator and `->` member access in terms of rewriting into member calls through an interface and then forming L-values.
- Providing indexed access through rewrites into method calls as well.

Beyond these use cases, thread-safe interfaces and more complex lifetime based dispatch are deferred for future work.

See the proposal for details here, and looking forward to feedback!
today. These overlap, but are meaningfully distinct.

1. An _immutable view_ of a value
2. The _thread-safe interface_ of a
I don't think you should use "safe" or "thread-safe" in contexts where you are talking about the "thread-compatible" contract. Those are explicitly different things. Thread-compatible types don't have thread-safe interfaces, they have const and non-const interfaces and a contract that says what safe usage of those interfaces are. Those interfaces are only ever conditionally safe, not safe in the sense of the normal usage of that term.
Referring to this as the thread-safe interface is common, and I think was popularized by this talk by Herb Sutter: https://channel9.msdn.com/posts/C-and-Beyond-2012-Herb-Sutter-You-dont-know-blank-and-blank (summary at 27:30 onwards). I think the viewpoint is that if the const interface is the only interface you use on an object, then it is (or should be!) thread-safe, even if the non-const interface is not.
If you want to say that an interface is only thread-safe if concurrent usage of other (non-thread-safe) interfaces on the same object would be safe, then even code that protected all member accesses with a mutex wouldn't be thread-safe (because you might concurrently destroy the object), so I think it's more useful to say that an interface is thread-safe if concurrent use of that interface (and no others) is safe, even though this is non-composable (an object might have two interfaces that are individually thread-safe but that can't be used safely at the same time, such as a getter that doesn't take a lock and a setter that does).
There was at least one place where I was using "safe" in a much more dubious way that I've removed. It could easily imply that it was checked to be safe by ensuring that no other interface was being concurrently used. I agree that usage was confusing, and I'll try to see if I repeated this anywhere else.
I largely agree with @zygoloid about "thread-safe interface" being an OK term, but if it is really confusing folks, I can try to come up with different terms...
proposals/p0821.md
Outdated
performing this refactoring, there is a need to translate between local
variables and parameters in both directions. In order to ensure these
translations are unsurprising and don't face significant expressive gaps or
behavioral differences, it is important to have strong conceptual integrity
Is this "conceptual integrity" or "semantic consistency"?
I think someone suggested "conceptual integrity" here. I'm not sure what the difference between these two terms would be -- one indicates a consistent set of concepts, the other a consistent set of semantics (likely through a consistent set of concepts)?
Anyways, if semantic consistency reads better to you, happy to use it.
proposals/p0821.md
Outdated
evolutionary space for safe primitives to be added. There is no specific goal to
radically change the overarching patterns that emerge currently in C++ API
design. At most, the hope is to simplify and address their shortcomings, not to
shift to a completely new model. For example, moving to a model of everything
I feel like this design is a significant divergence from C++. Which is not to say I disagree with it or that it is as different as the Java model, but I think it would be more honest to describe this as an experiment with doing things differently in the hope that it is sufficiently better to be worth the interop and retraining costs.
Yeah, I was trying to talk about the patterns of API design, but you're absolutely right that the building blocks used within those APIs are very different. I've tried to be more explicit now, does this help?
proposals/p0821.md
Outdated
```

This _immutable view_ can be thought of as requiring that the semantics of the
program be exactly the same whether it is implemented in terms of a view of the
I'm not sure what "view" means here. You are defining the phrase "immutable view" and I feel like I understand "immutable" already pretty well so I was reading this text to understand the "view" part.
I think this is really an issue of my example. I've changed it to more clearly model a view. Does this help?
proposals/p0821.md
Outdated
- The view must not be used to mutate the value, or those mutations would be
  lost if made to a copy.

Put differently, it makes a copy valid under the as-if rules of C++.
I feel like this statement is a bit subtle with respect to causation. I think you are saying: "we want an immutable view to enforce the conditions of the as-if rules, so that a copy would be valid in C++" not "the as-if rules are applicable, so a copy is valid in C++", but you might also mean "we want to use an immutable view in the cases where the as-if rules would allow a copy in C++". I was unsure what "it" refers to here and that made me unsure of how to interpret the word "makes".
The "C++" here was just to identify what I meant by "the as-if" rule. I've used a link instead to make this less ambiguous.
And I've removed the "it" which was also confusing. Is this better?
proposals/p0821.md
Outdated
- _immutable views_
- _thread-safe interfaces_
- _smart pointers_
- _consuming input_
- _lifetime overloading_
- _mutable operands_
- _user-defined dereference_
- _indexed access syntax_
- _member and subobject accessors_
- _non-null pointers_
- _syntax-free dereferencing_
I'm not sure that listing the use cases up front is providing much value. Perhaps the section headings can be tweaked so that the table of contents is filling this role?
I'm open to suggestions here, listing them was one from @wolffg.
While some correspond nicely to section headings, others don't. I spent some time thinking about how to restructure the sections to arrange for there to always be a section heading that matched and wasn't able to come up with anything satisfying. I'm not at all opposed to such a structure if you or others see a good way to get from here to there though.
proposals/p0821.md
Outdated
```
void SomeFunction(...) {
  // ...

  constinit const int id = ...;

  // Cannot mutate `id` here accidentally.
  // ...
}
```
I expect that you are correct that this example technically has the immutable view semantics, but is unfamiliar at least to me and so does not serve the expository function that it is meant to serve. Const references and pass-by-copy/value are where I expect users to be looking for immutable view semantics, even though C++ doesn't deliver these specific semantics. I think it might be more honest to say this is a place where we are diverging from C++ because this is what we think users want, rather than asking them to choose between const reference and pass-by-copy/value.
Anchoring on C++ here is awkward because Rust, for types without interior mutability, has immutable borrows which are a bit closer I think.
fn Example(a: Point, b: Point, dest: Point*) -> Float {
  if (...) {
    // Rewritten to: (*dest).(Assignable.Assign)(a);
Technically this should be rewritten in such a way as to support different RHS types.
// Rewritten to: (*dest).(Assignable.Assign)(a);
// Rewritten to: (*dest).(Assignable(typeof(a)).Assign)(a);
My hope was to just use simpler homogeneous interfaces for exposition here, and let the actual operator proposal fully dig into this. Does that make sense to you?
The direction of this proposal looks good to me.
I'm somewhat nervous about conflating mutability and addressability. I'm also a little worried that not having `const` as an immutable view of mutable data will create migration complexity. But I think we can use this as a basis for exploring those questions.
Detailed review on the assumption that the bulk of the proposal wording will end up in the design documents largely unchanged.
There are two different semantic models that underpin how `const` is used in C++
today. These overlap, but are meaningfully distinct.
These are both about a `const` view of an object (eg, a pointer or reference to a `const`-qualified type), rather than about an actually-immutable `const` object. It might be worth teasing those apart so it's clear you're only talking about the former here.
Actually-immutable `const` objects are also mostly used to have an immutable view of some value?
I tried separating this out anyways, and it made it a bit more complex. Not sure what change you're thinking would help here?
Yes, I think you're right; const objects fit the immutable view model. I think the cases I'm trying to tease apart are an immutable view of a value, and a handle to an object where we cannot modify the object via that handle. The latter isn't a view of a value, though, because the value of the object may change over time, and because we do have pointer identity.
I wonder if perhaps "thread-safe" is too narrow of a term for the second category. That is, I wonder if the two models are:
- immutable view of an unchanging value: const is a promise that the value will not change, neither by being modified through this handle nor by being modified by another handle.
- immutable interface to a potentially changing value: const is an interface guarantee by which a client of a handle promises that they will not change the value through that handle, but the value may change in other ways. Often the const interface is thread-safe, but this is also used to model immutability in single-threaded situations. This is a special case of a more general desire for a type to provide different interfaces to different clients.
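The difference between the two models can be made concrete in C++. The following is a minimal sketch, not taken from the proposal; `ConstViewChanged` is a hypothetical helper introduced only for illustration. It suggests that a `const` reference only promises the second model: the value seen through it can still change when another handle to the same object is used.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration, not from the proposal).
// Reads `view` before and after writing through `out`, and reports whether the
// value observed through the const reference changed in between.
bool ConstViewChanged(const std::string& view, std::string& out) {
  const std::string before = view;  // Explicit copy: a stable snapshot.
  out += "!";                       // Mutation through a different handle.
  return view != before;            // True only if `view` and `out` alias.
}
```

Passing the same object for both arguments makes the function return `true`, which an immutable view of an unchanging value would rule out. Passing two distinct objects makes it return `false`.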
the function body. There is no such easy assurance for return values. As a
consequence, this proposal suggests return values are copied initially for
safety and predictability.
I've not thought this all the way through, bear with me... do we want to guarantee that a copy happens, or should we give a weaker guarantee that there is no lifetime issue, but that we may or may not make a copy?
What I'm thinking is: we could say that a function that returns by value may return a reference to anything that it knows lives at least as long as any of its parameters (including `me`). Then at the call site, we can check whether the returned value outlives any of the function parameters and make a copy if so. Then:

```
// F doesn't make a copy; returned value lives at least as long as parameter s
fn F(s: String) -> String { return s; }
// G makes a copy: v doesn't live long enough
fn G(s: String) -> String {
  let v: String = "foo" + s + "bar";
  return v;
}
fn Print(s: String);
fn H() {
  // Parameter of `F` can be kept alive until `;`, so we know
  // the return value lives at least that long and don't need
  // to make any copies.
  Print(F("hello"));
  // `x` outlives the function argument, so we make a copy here.
  let x: String = F("hello");
  Print(x);
}
```

I suppose we can avoid making a copy even in the second case in `H` by lifetime-extending the `"hello"` temporary. I'm not sure that's a good idea; it might be too unpredictable.

One problem with this is that the function return is creating an immutable view, and we need the callee to know how long it's promising that returned value will remain immutable for. I suppose this is nothing new; this is analogous to a classic C++ issue:

```
const string &s = v[i];
v.push_back("x");
use(s);
```

... where this either works or fails depending on whether `v[i]` produces a reference to an existing object (eg, `v` is `vector<string>`) or ends up binding `s` to a temporary (eg, `v` is `vector<const char*>`). We might want some simple syntax to force a copy and end a chain of immutable views. (You could use `var` for that, but that also implies mutability, which might be undesirable.)

There's also a calling convention complexity issue with this kind of approach: if `F` can either return a handle to some existing object or copy to some caller-provided storage, then the caller always needs to provide the storage and may need to perform a branch to tell whether it should provide a copy. That seems like something we could handle but I'm not sure whether it'll be worthwhile unless we get to avoid a lot of copies.
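The `vector<string>` versus `vector<const char*>` distinction above can be observed directly. This is a small sketch under the assumption that comparing addresses is an acceptable way to show what the reference binds to; `BindsToElement` is a hypothetical name, not something from the proposal or the standard library. It avoids calling `push_back`, so it never touches a dangling reference.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: reports whether `const std::string& s = v[i];` names
// the stored element itself, or a separate temporary std::string that was
// created by an implicit conversion and lifetime-extended by the reference.
template <typename Vec>
bool BindsToElement(const Vec& v, std::size_t i) {
  const std::string& s = v[i];
  return static_cast<const void*>(&s) == static_cast<const void*>(&v[i]);
}
```

For a `std::vector<std::string>` this returns `true`, so a later `push_back` that reallocates would leave `s` dangling. For a `std::vector<const char*>` it returns `false`, because `s` is an independent temporary that the vector's storage does not affect.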
Basically, yes.
This is exactly the direction I'd like to explore here, as an incremental performance improvement by reducing copies. This is what I'm alluding to in the paragraph below around "lifetime tracking" system. Ideally such a system would let the cases such as the ones you mention Just Work by letting the caller copy only when necessary, and the API use annotations to give it the maximum knowledge of how late it can wait to create such a copy.
I also agree about forcing a copy at some point. A `var` is I think going to work out reasonably well in practice, but we could also have a `copy` operation (in the expression space) to make it easier to chain into something that doesn't need mutability and to avoid creating a statement. Technically, I think we can already do this:

```
fn Copy[T:! Type](var v: T) -> T { return v; }
```

But it seems likely better to give the language more visibility into it rather than doing it like this.
Anyways, all of this is for future work IMO. How much of this should I record as ideas so we don't lose them?
Filed as #828. I think filing the issue is enough for now, but a link to it from here might be useful I suppose.
One use case not obviously or fully addressed by the tools proposed here is
overloading function calls by observing the lifetime of arguments. The use case
here would be selecting different implementation strategies for the same
function or operation based on whether an argument lifetime happens to be ending
and viable to move-from.
The example of this that immediately springs to mind is overloading `operator+` for strings to reuse storage where possible. But I think that's because that was the example from the C++ paper that introduced the facility. I can't say that I've ever seen this kind of overloading actually be done in practice outside of that example -- and even in the case of that example it's not obvious to me that the optimization is worth the complexity, and maybe the problem is that we're providing the wrong interface. Do we have any more such examples?
That optimization becomes more important as the size of the data increases. TensorFlow provides this optimization transparently for its (typically large) tensor values, but not at the C++ API layer.
The few places where I have seen this optimization in C++ APIs where it was very important, it was exactly as Josh said -- due to the size of the data being large.
However, in those cases (in LLVM mostly), relying on operator overloading was too subtle and the APIs evolved to be very explicit even at the call site, and forced separate code paths when needed for different scenarios. It wasn't a big ergonomic burden given the importance of not adding a heap allocation.
This is part of why I'm a bit skeptical about addressing this with overloading. But I also don't want to absolutely preclude revisiting it -- I think this too is part of the experiment I'm suggesting.
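As a point of reference for this thread, the C++ pattern being discussed looks roughly like the sketch below. The names `Append`, `Result`, and `Path` are hypothetical and exist only to make the selected overload observable; real code would typically return just the string.

```cpp
#include <string>
#include <utility>

// Hypothetical tag used only to report which overload ran.
enum class Path { kCopied, kReusedStorage };

struct Result {
  std::string text;
  Path path;
};

// Chosen when the left operand is an lvalue: its storage must be preserved,
// so a new buffer is built.
Result Append(const std::string& lhs, const std::string& rhs) {
  std::string out;
  out.reserve(lhs.size() + rhs.size());
  out += lhs;
  out += rhs;
  return {std::move(out), Path::kCopied};
}

// Chosen when the left operand's lifetime is ending (an rvalue): the operand
// is appended to in place and then moved into the result.
Result Append(std::string&& lhs, const std::string& rhs) {
  lhs += rhs;
  return {std::move(lhs), Path::kReusedStorage};
}
```

The call site decides between the two implicitly: `Append(a, "x")` takes the copying path, while `Append(std::move(a), "x")` or `Append(std::string("a"), "x")` takes the storage-reusing path. That implicitness is the subtlety the comments above are weighing against explicit, separately named APIs.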
proposals/p0821.md
Outdated
Should we immediately provide the escape hatch of an unsafe address-of operation
on immutable views? Even if "no", we can always revisit this later and add the
operation.
What's the alternative right now? You'd presumably either change a `let` to `var`, or add a new local `var`, and then use the address of the variable, I suppose. I think we should not add this for now, but once we are writing Carbon code we should be on the lookout for that pattern and take it as evidence to reconsider.
That is the alternative, and I agree. I've moved this to an alternative considered.
- Pointers are expected to be deeply familiar to C++ programmers and easily
  [interoperate with C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code).
For C++ interoperability and migration, I don't have a great picture of how we'll map C++ pointer and reference types, and especially pointers and references to `const`-qualified types, into Carbon. In some cases we'll want an immutable view, but in other cases we'll want a weaker "can't be modified through this handle but can change while this handle exists" view.
My idea is that the default migration and interop of references (including const references) in C++ would be to pointers in Carbon. For migrating the `const`-qualified interface shift, something like a facet type but still using pointers.
Then we can look for specific patterns that can be reliably recognized and instead migrated to the immutable value views. For example, by-value parameters that are clearly never mutated. Or `const`-reference parameters without `const_cast`s. Maybe some others. But the fallback for references would always be pointers here.
Would it be useful to write this up in the proposal? In how much detail?
I think including something like your above comment in the proposal would be helpful from an anchoring perspective, even if it's marked as provisional.
Just responding to threads where I have questions or there is more discussion, and landing easy suggestions. I'll then make a second pass trying to make the more substantial changes needed and responding to other threads.
with pointers. While this has some ergonomic cost, it seems minimal and isolated
to a relatively rare use case.

### Indexed access syntax
FWIW, I agree this is an important use case.
I would definitely consider a design which allowed those kinds of different variations to all be expressed. Not sure if it would be through prioritization or letting the type choose which style of interface it wants to expose for its indexed syntax.
I'm very interested in all three of the options you mention. I'd at a minimum like for types to be able to choose between the first and last options. I think the middle option would be very interesting to investigate to understand the value compared to the third.
How much should this happen in this proposal? (Also happy to grab some open discussion time to dive deep here.)
Co-authored-by: josh11b <[email protected]> Co-authored-by: Richard Smith <[email protected]>
I think either responded or responded-and-updated-proposal for all the comment threads now. If I missed anything, let me know!
proposals/p0821.md
Outdated
performing this refactoring, there is a need to translate between local | ||
variables and parameters in both directions. In order to ensure these | ||
translations are unsurprising and don't face significant expressive gaps or | ||
behavioral differences, it is important to have strong conceptual integrity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think someone suggested "conceptual integrity" here. I'm not sure what the difference between these two terms would be -- one indicates a consistent set of concepts, the other a consistent set of semantics (likely through a consistent set of concepts)?
Anyways, if semantic cosistency reads better to you, happy to use it.
proposals/p0821.md
Outdated
evolutionary space for safe primitives to be added. There is no specific goal to | ||
radically change the overarching patterns that emerge currently in C++ API | ||
design. At most, the hope is to simplify and address their shortcomings, not to | ||
shift to a completely new model. For example, moving to a model of everything |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was trying to talk about the patterns of API design, but you're absolutely right that the building blocks used within those APIs are very different. I've tried to be more explicit now, does this help?
There are two different semantic models that underpin how `const` is used in C++
today. These overlap, but are meaningfully distinct.
Actually-immutable `const` objects are also mostly used to have an immutable view of some value?
I tried separating this out anyway, and it made things a bit more complex. I'm not sure what change you're thinking would help here?
proposals/p0821.md
Outdated
``` | ||

This _immutable view_ can be thought of as requiring that the semantics of the
program be exactly the same whether it is implemented in terms of a view of the
I think this is really an issue of my example. I've changed it to more clearly model a view. Does this help?
proposals/p0821.md
Outdated
- The view must not be used to mutate the value, or those mutations would be
  lost if made to a copy.

Put differently, it makes a copy valid under the as-if rules of C++.
The "C++" here was just to identify what I meant by "the as-if" rule. I've used a link instead to make this less ambiguous.
And I've removed the "it" which was also confusing. Is this better?
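As a rough illustration of why the quoted rule matters, here is a minimal C++ sketch, not taken from the proposal, with hypothetical function names. It shows a case where a reference and a copy give different results because the referenced value is mutated through an alias while it is being viewed, and a non-aliased case where the two agree, which is the situation in which a copy is valid under as-if.

```cpp
#include <cassert>

// Hypothetical helper: reads `x` through a reference while mutating `dest`.
int AddTwiceByRef(int& dest, const int& x) {
  dest += x;
  dest += x;  // If `dest` and `x` alias, `x` now observes the first update.
  return dest;
}

// Same logic, but `x` is a private copy that cannot observe changes to `dest`.
int AddTwiceByCopy(int& dest, int x) {
  dest += x;
  dest += x;
  return dest;
}

// Without aliasing, both versions compute the same result, so an
// implementation could pick either one without changing program behavior.
void CheckNonAliasedAgree() {
  int d1 = 1, d2 = 1, x = 5;
  assert(AddTwiceByRef(d1, x) == AddTwiceByCopy(d2, x));
}
```

Calling `AddTwiceByRef(a, a)` with `a == 1` yields 4, while `AddTwiceByCopy(b, b)` with `b == 1` yields 3; the difference only appears because the viewed value is mutated during the call.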
proposals/p0821.md
Outdated
*dest = b;
}

// Rewritten to: return a.(Subtractable(Float).Subtract)(b);
Hopefully fixed now?
proposals/p0821.md
Outdated
fundamental scaling problem in this style of overloading: it creates a
combinatorial explosion of possible overloads. Consider a function with N
parameters that would benefit from lifetime overloading. If each one benefits
_independently_ from the others, we would need N\*N overloads to express all the
Done.
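To make the scaling concern in the quoted passage concrete, here is a small C++ sketch under stated assumptions: `Table`, `Record`, `DemoInsert`, and `OverloadCount` are hypothetical names, and the overloading shown is the familiar `const&` / `&&` pair rather than anything specified by the proposal. With two choices per parameter, each independent parameter doubles the number of overloads, so two parameters already need four.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type used only for this illustration.
struct Record {
  std::string key;
  std::string value;
};

// Two parameters, each independently taken by `const&` (copy) or `&&` (move),
// require four overloads to cover every combination.
class Table {
 public:
  void Insert(const std::string& k, const std::string& v) { rows_.push_back({k, v}); }
  void Insert(const std::string& k, std::string&& v) { rows_.push_back({k, std::move(v)}); }
  void Insert(std::string&& k, const std::string& v) { rows_.push_back({std::move(k), v}); }
  void Insert(std::string&& k, std::string&& v) {
    rows_.push_back({std::move(k), std::move(v)});
  }
  const std::vector<Record>& rows() const { return rows_; }

 private:
  std::vector<Record> rows_;
};

// Overloads needed for n such parameters: two choices for each parameter.
constexpr unsigned long long OverloadCount(unsigned n) { return 1ULL << n; }

// Exercises each of the four overloads once and returns the row count.
std::size_t DemoInsert() {
  Table t;
  std::string k = "key";
  std::string v = "value";
  t.Insert(k, v);                                // copy, copy
  t.Insert(k, std::string("tmp"));               // copy, move
  t.Insert(std::string("tmp"), v);               // move, copy
  t.Insert(std::string("a"), std::string("b"));  // move, move
  assert(t.rows().front().key == "key");
  return t.rows().size();
}
```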
proposals/p0821.md
Outdated
Should we immediately provide the escape hatch of an unsafe address-of operation
on immutable views? Even if "no", we can always revisit this later and add the
operation.
That is the alternative, and I agree. I've moved this to an alternative considered.
- Pointers are expected to be deeply familiar to C++ programmers and easily
  [interoperate with C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code).
My idea is that the default migration and interop of references (including const references) in C++ would be to pointers in Carbon. For migrating the `const`-qualified interface shift, we would use something like a facet type but still using pointers.
Then we can look for specific patterns that can be reliably recognized and instead migrated to the immutable value views. For example, by-value parameters that are clearly never mutated. Or `const`-reference parameters without `const_cast`s. Maybe some others. But the fallback for references would always be pointers here.
Would it be useful to write this up in the proposal? In how much detail?
- Pointers are expected to be deeply familiar to C++ programmers and easily
  [interoperate with C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code).

## Alternatives considered
I feel like this document does only a little to explain what is wrong with the C++ model for our purposes. This document is mostly "see how this other thing addresses many of the same use cases" but with a model that is different enough that it is definitely going to be an interop and training issue (for example, what happens to types with `const` instance variables?). The only explanation I recall is: we don't want both pointers and references, since that introduces complexity in the same place we are going to want to add safety. I feel like there are a lot more changes that deserve some explanation of how this is an improvement for our purposes.
Beyond properties, Carbon is expected to explore some lifetime tracking system,
and when that happens it should be considered for enabling non-copy returns of
immutable values with a tracked lifetime. These might provide for more general
or complex forms of read-only _member and subobject accessors_ than can be
represented through properties.
Why do non-copy returns need to be coupled to lifetime tracking? In particular, suppose I want to expose mutable access to some complex sub-object that's part of my logical state (i.e. that's copied when I am copied), but I also want clients to be able to read it when I'm immutable. IIUC, under this proposal that would look something like:
fn Foo[me: Self]() -> FooType;
fn MutableFoo[addr me: Self*]() -> FooType*;
This paragraph suggests that in the future there could be a way to modify `Foo` so that it doesn't copy the underlying object, so long as its lifetime is properly annotated. But why is lifetime-tracking more important for `Foo` than for `MutableFoo`?
When you write `MutableFoo` it's clear that there's a lifetime issue: you're returning a pointer, and that pointer needs to still point to something when it's used. But when you write `Foo`, you're notionally returning by value, and the fact that we may choose to do that without making a copy is an implementation detail. While that implementation detail is exposed to the programmer (there are -- presumably -- restrictions on concurrent mutation of the `FooType` object just like there are when passing a parameter by value), the fact that it's up to the implementation to make this decision to some extent shifts the burden for checking the lifetime rules from the programmer to the implementation.
As an extreme example:
fn Foo[me: Self]() -> FooType {
var x: FooType = ...;
return x;
}
... obviously should not return a non-copied handle to a stack variable whose lifetime ends when the function returns.
interface that it implements, and for APIs to use this narrow interface to only
interact with the underlying type in particular ways. While they are presented
as a way to make code _generic_ over multiple types, they can also be used to
simply enforce constraints on the interface exposed.
There's a problem with this approach, though: generic interfaces can carry a nontrivial performance penalty (e.g. the time costs of dynamic dispatch, and possibly the storage costs of witness table pointers) that isn't needed when we're merely subsetting the API of a specific, known type. For example, in C++ if a class contains an array of pointers to mutable `T`, it can define an accessor that exposes it as an array of pointers to read-only `T`, with zero space or time overhead. I don't see how we can achieve that in Carbon if we're modeling "read-only `T`" as a generic interface.
Only using dynamic trait objects involves dynamic dispatch and witness table pointers, not generics generally. Normally you would expect generics to act more like templates for code generation purposes.
But wouldn't we need something like a dynamic trait object to support use cases like the one I mentioned?
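For readers less familiar with the C++ pattern raised in this thread, here is a minimal sketch of it, with `int` standing in for `T`. `Registry`, `items`, and `SumReadOnly` are hypothetical names chosen for illustration. The read-only accessor relies only on the implicit conversion from `int* const*` to `const int* const*`, so it returns the same storage with no copy and no wrapper object.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class: holds an array of pointers to mutable ints.
class Registry {
 public:
  static constexpr std::size_t kSize = 3;

  Registry(int* a, int* b, int* c) : items_{a, b, c} {}

  // Mutable access: the pointed-to ints can be modified.
  int* const* items() { return items_; }

  // Read-only access: the same array viewed as pointers to const ints.
  // This is a qualification conversion, so no data is copied.
  const int* const* items() const { return items_; }

 private:
  int* items_[kSize];
};

// Reads every element through the read-only view.
int SumReadOnly(const Registry& r) {
  int sum = 0;
  const int* const* p = r.items();
  for (std::size_t i = 0; i < Registry::kSize; ++i) {
    assert(p[i] != nullptr);
    sum += *p[i];
  }
  return sum;
}
```

Both accessors return the address of the same member array, which is the "zero space or time overhead" property the comment describes.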
OtherFunction(other_id);

// We can also pass ephemeral values:
OtherFunction(other_id + 2);

// Or values that may be backed by read-only memory:
static const int fixed_id = 42;
OtherFunction(fixed_id);
Suggested change:
- OtherFunction(other_id);
- // We can also pass ephemeral values:
- OtherFunction(other_id + 2);
- // Or values that may be backed by read-only memory:
- static const int fixed_id = 42;
- OtherFunction(fixed_id);
+ SomeFunction(other_id);
+ // We can also pass ephemeral values:
+ SomeFunction(other_id + 2);
+ // Or values that may be backed by read-only memory:
+ static const int fixed_id = 42;
+ SomeFunction(fixed_id);
Pointer traversal sometimes has a noticeable cost. Having explicit pointers will
show developers in the codebase where they are explicitly traversing memory and
allow them to optimize them when necessary.
There's a tension between this paragraph and the preceding one, because this argument seems like it applies just as much when the data is immutable, whereas the previous paragraph suggests that pointers can only point to mutable data.
proves this is an important pattern to support without the contortions of
manually creating a local copy (or changing to pointers).

### References in addition to pointers
Here's a doc I wrote up a long time ago: https://docs.google.com/document/d/1grW9NXTZl1UsdytoE-Q2N3WwRQEDQUPg9HWOzyiQcjA/edit#
(No conclusions or deep analysis, really just writing down some alternatives.)
temporary. However, the rules for parameters and locals are the same in C++ and
so this would create serious lifetime bugs. This is fixed in C++ by applying
_lifetime extension_ to the temporary. The result is that `const` references are
quite different from other references, but they are also quite useful: they are
I don't understand why `const&` is "quite different from other references" - all declarations of reference type do lifetime extension of prvalues, excepting non-const lvalue references, since that's an error.
`int&& x = 0;` also does lifetime extension.
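The point about `const&` and `&&` both extending lifetimes can be checked with a small C++ sketch. `Tracked`, `g_destroyed`, and `LifetimeExtensionDemo` are hypothetical names introduced only for this illustration; the sketch counts destructor calls to show that neither temporary is destroyed until its reference goes out of scope.

```cpp
#include <cassert>

// Counts how many Tracked objects have been destroyed so far.
int g_destroyed = 0;

// Hypothetical helper type whose destructor records its own destruction.
struct Tracked {
  int value;
  explicit Tracked(int v) : value(v) {}
  ~Tracked() { ++g_destroyed; }
};

// Binds two temporaries to local references and returns the number of
// destructor calls observed once the enclosing block has ended.
int LifetimeExtensionDemo() {
  g_destroyed = 0;
  {
    const Tracked& a = Tracked(1);  // Lifetime extended to match `a`.
    Tracked&& b = Tracked(2);       // Lifetime extended to match `b`.
    // Both temporaries are still alive here.
    assert(g_destroyed == 0);
    assert(a.value == 1 && b.value == 2);
  }
  // Both temporaries were destroyed when the references left scope.
  return g_destroyed;
}
```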
would become Carbon code such as:

``` | ||
fn LogSize(large_data: Container) { |
This seems to take `Container` by value; how does this not require the caller to copy a `Container` (or, alternatively, how does a function take ownership of a `Container`)?
It seems specifically here that Carbon is avoiding ownership types a la Rust/C++, which I don't personally love... How do you plan on doing ownership?
We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active, please comment or remove the
We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active or becomes active again, please reopen it.
Migrated to #2006
Flesh out and solidify the design around values, variables, and
pointers. Explicitly discuss the use cases of references in C++ and
propose specific approaches to address those use cases.
This is something that we've been discussing across the team for a long
time, and while there are definitely still challenges in this space we
will need to address going forward, I want to try to codify where we are
at and provide for a few fundamentals that haven't really been spelled
out previously.
That said, I've been staring at this document for far too long in
a draft, and so I may be missing parts that are confusing or need work,
so any help from folks to make this a coherent story is definitely
appreciated. The current structure and wording is heavily informed by
several reviews and suggestions from @zygoloid, @josh11b, and @wolffg
with much appreciation. =]
Some core examples of the consequence of this proposal:

- Using `let` where we currently use `var` to declare a locally scoped immutable view of a value, for example `let index: i32 = 42;`.
- Specifying the expected semantics of parameters to be, by default, these immutable views of values like `let`. These should behave like C++ `const` references but allow copies under as-if.
- Specifying that `var` creates an L-value and binds names to it.
- Defining that `var` patterns are allowed to nest within `let` to mark a part of a pattern as an L-value, for example `let (x: i64, var y: i64) = (1, 2);`, after which `y` can be mutated. When the entire declaration is a `var`, the `let` can be omitted. This works with function parameters as well to mark consuming an input into a locally mutable L-value, for example `fn RegisterName(var name: String)`.
- Implementing operators by rewriting into method calls through an interface, which can then use `[addr me: Self*]` to implicitly obtain a mutable pointer to an object for mutating operators.
- Providing user-defined pointer-like types and the implementation of both the `*` operator and `->` member access in terms of rewriting into member calls through an interface and then forming L-values.
- Providing indexed access through rewrites into method calls as well.

Beyond these use cases, thread-safe interfaces and more complex lifetime-based dispatch are deferred for future work.

See the proposal for details here, and looking forward to feedback!