VIP: stateful singleton modules with ownership hierarchy #3722
Comments
Having intensively discussed the different tradeoffs with @charles-cooper over the last months, I think this proposal is the "better" one of the two (the other being #3723). The "ownership hierarchy" concept is pretty common in languages that deal with memory management and resource allocation (e.g. C, C++, or Rust) and has shown its virtues. I think this can be carried over to contract-oriented programming. Some open questions:
|
@DanielSchiavini suggests that ownership should be the default, whereas borrowing needs to be marked |
as an example:

    # library1.vy
    import Library2
    library2: borrows(Library2)
    def __init__(self):
        pass

    # contract
    import Library1
    import Library2
    library1: Library1[library2]
    library2: Library2
    def __init__():
        self.library2.__init__()
        self.library1.__init__()

|
i kind of agree (at least at this moment in time -- the two approaches both have their merits and i have gone back and forth on them many times). typically when people put state in a module, they intend for it to be global. this is especially familiar for people coming from a python background. put another way, the multiple instantiation paradigm is more elegant, but actually more error prone if you consider the global lock use case. it's too easy to forget to tie two instances together when the library designer intended for a piece of state to be global (which is the default design attitude for somebody who is writing a module).
i think private variable declarations hurt composability. more generally, my current design philosophy is that importers should set constraints, not the importees. this design philosophy maximizes composability.
i will need to think about this more, but i think that delegatecall use cases are kind of orthogonal to this proposal.
|
Exactly - generally, it's not a straightforward exercise to immediately understand the implications of multiple instances and stateful actions. Singletons optimise better IMHO for safety & reasonability, which again, is an important design principle for any Vyper feature.
Hmm, I don't think |
after ruminating on this for a few days, i favor a system which marks ownership and additionally requires annotation of write dependencies as is proposed in #3723. borrowship may also be marked, although this is a relatively small detail and can be added or removed in the future. in the following examples, i renamed the keywords [...]

    # Library1.vy
    import Library2 as library2
    uses: library2
    def __init__():
        pass

    # contract
    import Library1 as lib1
    import Library2 as lib2
    initializes: lib2
    initializes: lib1[library2 := lib2]
    def __init__():
        lib2.__init__()
        lib1.__init__()

more formally, the ownership hierarchy as exposed to the user is therefore:

as an implementation detail, i settled on using the walrus operator (`:=`) [...]

as a larger example, i wrote up the token example using this syntax here: https://gist.github.com/charles-cooper/fb5caff4eee8bbf92ed86cefaa39a855

|
How about [...]

Also, is [...]

Your example could look like:

    import Library1
    import Library2
    def __init__():
        Library2()
        Library1(library2=Library2)
|
i don't think we need to restrict ourselves to solidity protected keywords, we should rather choose the word which best represents the semantics. the biggest "dent" to UX (if you can call it that) here is that programmers won't be able to have state variables named
i considered not requiring the [...]

one, it lets the programmer control where the library goes in the storage layout. i think this is important, since the other options are to (somewhat arbitrarily) either choose storage layout order depending on import order, or where the initializations occur in the source code. this way the storage layout is clear from the order in which storage variable declarations [...]

and second, it allows compile-time resolution of the [...]

dependency resolution could be done in source code, but it starts to get weird once source code is not just straight-line, e.g.:

    def __init__():
        if block.number % 2 == 0:
            Library2()
            Library1(param1, param2, library2=Library2)
        else:
            Library1(param2, param1, library2=Libraryyy2)  # probably a user typo, need to throw an error even though the first call to Library() is well-formed.
            Library2()  # how does this affect storage layout?

added to the list above, although i am not really a fan of |
this commit implements "singleton modules with ownership hierarchy" as described in #3722.

to accomplish this, two new language constructs are added: `UsesDecl` and `InitializesDecl`. these are exposed to the user as `uses:` and `initializes:`. they are also accompanied by new `AnalysisResult` data structures: `UsesInfo` and `InitializesInfo`.

`uses` and `initializes` can be thought of as a constraint system on the module system. a `uses: my_module` annotation is required if `my_module`'s state is accessed (read or written), and `initializes: my_module` is required to call `my_module.__init__()`. a module can be `use`d any number of times; it can only be `initialize`d once. a module which has been used (directly, or transitively) by the compilation target (main entry point module), must be `initialize`d exactly once.

`initializes:` is also required to declare which modules it has been `initialize`d with. for example, if `mod1` declares it `uses: mod2`, then any `initializes: mod1` statement must declare *which* instance of `mod2` it has been initialized with. although there is only ever a single instance of `mod2`, this user-facing requirement improves readability by forcing the user to be aware of what the state access dependencies are for a given, `initialize`d module.

the `NamedExpr` node ("walrus operator") has been added to the AST to support the initializer syntax. (note: the walrus operator is used, because the originally proposed syntax, `mod1[mod2 = mod2]` is rejected by the python parser).

a new compiler pass, `vyper/semantics/analysis/global.py` has been added to implement the global initializer constraint, as it cannot be defined recursively (without a global context).

since `__init__()` functions can now be called from other `__init__()` functions (which is not allowed for normal `@external` functions!), a new `@deploy` visibility has been added to vyper's visibility system. `@deploy` functions can be called from other `@deploy` functions, and never from `@external` or `@internal` functions. they also have special treatment in the ABI relative to other `@external` functions.

`initializes:` is useful since it also serves the purpose of being a storage allocator directive. wherever `initializes:` is placed, is where the module will be placed in storage (and code, transient storage, or any other future storage locations).

this commit refactors the storage allocator so that it recurses into child modules whenever it sees an `initializes:` statement. it refactors several data structures surrounding the storage allocator, including removing inheritance on the `DataPosition` data structure (which has also been renamed to `VarOffset`). some utility functions have been added for calculating the size of a given variable, which also get used in codegen (`get_element_ptr()`).

additional work/refactoring in this commit:
- new analysis machinery for detecting reads/writes for all `ExprInfo`s
- dynamic programming on the `get_expr_info()` routine
- refactoring of `visit_Expr`, which fixes call mutability analysis
- move `StringEnum` back to vyper/utils.py
- remove the "TYPE_DEFINITION" kludge in certain builtins, replace with usage of `TYPE_T`
- improve `tag_exceptions()` formatting
- remove `Context.globals`, as we rely on the results of the front-end analyser now.
- remove dead variable: `Context.in_assertion`
- refactor `generate_ir_for_function` into `generate_ir_for_external_function` and `generate_ir_for_internal_function`
- move `get_nonreentrant_lock` to `function_definitions/common.py`
- simplify layout allocation across locations into single function
- add `VyperType.get_size_in()` and `VarInfo.get_size()` helper functions so we don't need to do as much switch/case in implementation functions
- refactor `codegen/core.py` functions to use `VyperType.get_size()`
- fix interfaces access from `.vyi` files
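
The storage-allocator behaviour described in this commit message (an `initializes:` statement fixes where the child module lands, and the allocator recurses into it) can be modelled in a few lines. This is a rough sketch under stated assumptions and not the compiler's implementation: the `Module` class, the tuple-based declaration format, the slot sizes, and the `allocate` function are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    # Ordered top-level declarations, each one of:
    #   ("var", variable_name, size_in_slots)
    #   ("initializes", module_name)
    decls: list


def allocate(module, modules, start=0, prefix=""):
    """Return (layout, next_free_slot) for `module`.

    Variables are laid out in declaration order. When an "initializes"
    declaration is seen, the allocator recurses into that child module
    at the current offset, so the child's position follows the position
    of the declaration.
    """
    layout = {}
    offset = start
    for decl in module.decls:
        if decl[0] == "var":
            _, var_name, n_slots = decl
            layout[prefix + var_name] = offset
            offset += n_slots
        elif decl[0] == "initializes":
            child = modules[decl[1]]
            child_layout, offset = allocate(
                child, modules, offset, f"{prefix}{child.name}."
            )
            layout.update(child_layout)
        else:
            raise ValueError(f"unknown declaration: {decl!r}")
    return layout, offset


# Hypothetical modules: one variable each, sizes chosen arbitrarily.
modules = {
    "lib2": Module("lib2", [("var", "counter", 1)]),
    "lib1": Module("lib1", [("var", "owner", 1)]),
}
contract = Module("contract", [
    ("var", "a", 1),
    ("initializes", "lib2"),
    ("var", "b", 2),
    ("initializes", "lib1"),
])
layout, end = allocate(contract, modules)
```

In this toy model, moving an `("initializes", ...)` entry up or down the declaration list moves the child module's variables with it, which is the property the commit message attributes to `initializes:` as a storage allocator directive.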
implemented in #3729 |
Simple Summary
extend the import system by allowing "stateful modules" (that is, modules with top-level state variables). introduce a constraint system on the import system which maximizes safety + usability.
this is one of two proposals exploring the stateful module design space; the other is #3723.
Motivation
re-using code which encapsulates state is in general a useful feature to have for a language! however, in a contract-oriented programming context, this is a double edged sword because reasoning about storage is fundamentally difficult, especially when storage accesses are hidden behind a layer of abstraction. consider two basic approaches to the problem:
this has a further issue which we will discuss in a bit, which is that access to `dep1`'s `__init__()` function is uncontrolled. that is, it could be called multiple times in the import graph. this is a correctness problem, because programmers expect constructors to be called at most one time.

the other benefit here would be clear access to imported `__init__()` functions. since each instantiation is local, it is straightforward to enforce that `__init__()` is called one time for each instantiation. (in the above example, `self._lock.__init__(...)` and `self._foo.__init__(...)` would have to be called in the main `__init__()` function.)

enumerated, the issues brought up above are:
- [...] `__init__()`
this proposal proposes a third option, which draws inspiration from linear type systems and the rust borrow checker.
the design proposed here is to enforce the one-def rule, but to address the issues above, additionally introduce an ownership system which allows the compiler to enforce constraints on how module state is written and initialized.
note on a design choice: `owns: some_module` is a design requirement which allows the programmer to control where the module is laid out in storage.

Useful Definitions/Terminology
Specification
Final Specification.
this proposal introduces an effects hierarchy for interacting with modules: `initializes` and `uses`. these correspond to the terminology `owns` and `borrows` from linear type systems, respectively.

the basic rules here are:
- [...] `initialize`d exactly one time in the import graph.
- if `module` initializes `module2`, then `module2.__init__()` must be called in `module.__init__()`. declaring ownership "seals off" access to `module2.__init__()`. it is envisioned that it will probably be used sparingly or near the top of the import graph.
- [...] `__init__()` function unless they are already owned.
- [...] `use`d.
- `initializes` implies `uses`.
- if `module1` declares `uses: module2`, then the initializer for `module1` must be declared like `initializes: module1[module2 := module2]`.
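
The rules above amount to a global check over the import graph. The following is a minimal sketch of such a check, assuming a simplified model in which each module is a dictionary listing what it `uses` and what it `initializes` (with the dependencies bound in each initializer). The data format and the `check_initializers` function are hypothetical and only illustrate the constraints; they do not reproduce the compiler's analysis pass.

```python
from collections import Counter


def check_initializers(modules, target):
    """Check the initializer rules for everything reachable from `target`.

    `modules` maps a module name to a dict with optional keys:
      "uses":        list of module names whose state it touches
      "initializes": {module_name: [dependencies bound in the initializer]}
    Returns a list of human-readable error strings (empty if all is well).
    """
    errors = []

    # Collect every module reachable from the compilation target.
    reachable, stack = set(), [target]
    while stack:
        name = stack.pop()
        if name in reachable:
            continue
        reachable.add(name)
        decl = modules[name]
        stack.extend(decl.get("uses", []))
        stack.extend(decl.get("initializes", {}))

    init_count = Counter()
    used = set()
    for name in sorted(reachable):
        decl = modules[name]
        used.update(decl.get("uses", []))
        for child, bound in decl.get("initializes", {}).items():
            init_count[child] += 1
            # An initializer must bind every module the child uses.
            missing = set(modules[child].get("uses", [])) - set(bound)
            for dep in sorted(missing):
                errors.append(
                    f"{name}: initializes {child} without binding {dep}"
                )

    # Every used module must be initialized exactly once, globally.
    for name in sorted(used | set(init_count)):
        n = init_count[name]
        if n == 0:
            errors.append(f"{name} is used but never initialized")
        elif n > 1:
            errors.append(f"{name} is initialized {n} times")
    return errors


# Mirrors the lib1/lib2 example from the discussion (names are illustrative).
good = {
    "lib2": {},
    "lib1": {"uses": ["lib2"]},
    "contract": {"initializes": {"lib2": [], "lib1": ["lib2"]}},
}
```

With this model, `check_initializers(good, "contract")` returns an empty list, while dropping the `lib2` initializer or initializing `lib2` from two different modules produces an error entry for each violated rule.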

Original Specification
for historical/research purposes, the original spec is below. this was the design with `seals:` but not `uses:`. this original design is superseded by the design described here: #3722 (comment).

this proposal introduces an effects hierarchy for interacting with modules: `owns` and `seals`. an alternative name for `owns` could be `initializes`. `owns` is used here since it is the terminology used in linear type systems.

the basic rules here are:
- [...] `own`ed exactly one time in the import graph.
- if `module` owns `module2`, then `module2.__init__()` must be called in `module.__init__()`. declaring ownership "seals off" access to `module2.__init__()`. it is envisioned that it will probably be used sparingly or near the top of the import graph.
- [...] `__init__()` function unless they are already owned.
- [...] `module2`, no other modules can write to it (or directly call mutating functions on `module2`).
- [...] `own`ed once.
- `seals:` implies ownership.

note that `seals:` can be considered as an extension to the ownership system. in other words, the `seals:` semantics is not required to be implemented.

some examples, with a tentative syntax:

an obligatory token example:

note an alternative design for this hypothetical project could be for `Mint` to `own: Owned` and be responsible for calling its constructor. then `Contract.vy` would not be able to `own: Owned`. this is left as a design choice to library writers, when to "seal" ownership of modules and when to leave them open. for illustration, this is what that design would look like:

Backwards Compatibility
does not change any existing language features, fully backwards compatible
Dependencies
References
Copyright
Copyright and related rights waived via CC0