RFC: Improve the module system #11745
CC @bstrie because I think I saw you chime in on the complaints on IRC yesterday |
Crates are pretty similar to Haskell's module system. (Admittedly we don't have the tool support to use them like that easily, yet.) |
They may be similar, but they are harder to use. Maybe mostly due to aspect "1"? |
+1 on this topic from my side. The main thing that bothered me is the binding between files and modules: each new file automatically generates a new module. When you try to divide your code into many files (e.g. one file per object, as is often done in C++/C#/others and even enforced in Java), you don't just get files but also modules. With the additional modules you have to reimport everything and you can no longer access private fields/methods. That access is often required when you have some objects that tightly interact. IMHO that leads to either very large source code files or to making many more things public than should be. As a backwards compatible improvement to this I could imagine the following:
The incompatible way is to do it simply like C#, Java, AS3, ...:
Besides that I would think that crate local visibility for types would also help to overcome some of the visibility issues that creating new files->modules involves and would really like to see that feature. |
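For readers following along with a current compiler: crate-local visibility of the kind asked for here can be written as `pub(crate)` in today's Rust (it was not available when this comment was posted). The sketch below is only an illustration; the `account` and `audit` modules and their items are hypothetical names, and inline modules stand in for separate files.

```rust
mod account {
    pub struct Account {
        // Visible anywhere inside this crate, but not to downstream crates.
        pub(crate) balance: i64,
    }

    impl Account {
        pub fn new(balance: i64) -> Self {
            Account { balance }
        }
    }
}

mod audit {
    use crate::account::Account;

    // A tightly coupled sibling module reads the field directly,
    // without the field having to be fully public.
    pub fn is_overdrawn(acct: &Account) -> bool {
        acct.balance < 0
    }
}

fn main() {
    let acct = account::Account::new(-5);
    println!("overdrawn: {}", audit::is_overdrawn(&acct));
}
```

The field stays hidden from other crates, while modules that interact closely within the crate do not need accessor boilerplate.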
Please no. Haskell has many great attributes but the module system is emphatically not one of them. As far as I'm aware Rust's module system is strictly more expressive than Haskell's. Rust decouples the unit of visibility (a module) from the unit of compilation (a crate), which I think is a smooth move. In Haskell the two concepts are tied together. You can recover a Haskell-like system in Rust by basically just never ever using submodules. Put all your definitions at the top level, and if you want to start a new module, start a new crate as well, and they will be compiled separately. So if that's what you want, you can do it today. |
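A minimal sketch of the distinction drawn above, written in current Rust syntax: one crate (the unit of compilation) contains two modules (units of visibility), and privacy is enforced between the modules even though they are compiled together. The module and function names are made up for illustration.

```rust
// One crate, two modules.
mod geometry {
    // Private helper: visible only inside `geometry` and its child modules.
    fn square(x: i32) -> i32 {
        x * x
    }

    // Public entry point of the module.
    pub fn area_of_square(side: i32) -> i32 {
        square(side)
    }
}

mod report {
    pub fn describe(side: i32) -> String {
        // Calling `crate::geometry::square(side)` here would be rejected,
        // because `square` is private to `geometry`.
        format!("area = {}", crate::geometry::area_of_square(side))
    }
}

fn main() {
    println!("{}", report::describe(4));
}
```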
Currently I have the impression that Rust's is even less practical.
What does this solve? Can you give an example of the day-to-day advantages of this? Currently I only see more code and more complexity := cost.
I don't think that we want to encourage putting unrelated code in a giant crate file. Your other solution would make Rust projects have twice the number of files than a comparable project in many other programming languages. |
A couple of points:
*assuming you only declare |
Re aspect 1 and crates: you import a crate with Re aspect 1 in general: |
I'm not interested in a complete overhaul of the module system at this point. It's just too late in the game. What I am interested in is making the current system slightly more restrictive than it currently is, in order to reduce the amount of complexity that a user has to keep in their head. I propose two restrictions:
|
+1 @bstrie. I'm new to Rust, but I'm having trouble understanding the current module system. The use-before-mod rule is completely unintuitive to me as a new user (at least with the current state of the documentation). Maybe it will be less confusing in the future. I also agree that globs should not make it into 1.0; tooling can be built around the compiler to help automatically insert use/extern mod statements. |
Coming from C#, the Rust module system did not work like I expected, but since Rust is not a verbose class-oriented language like C#, the idea of modules becomes more important. I actually like the way a module is located in a specific file by rigid rules. In C#, one has to rely on the IDE to find the declaration of an object. What makes Rust more complicated is having to put 'mod' and 'use' at the top of each file. This becomes a burden when the hierarchy is still in flux. My idea is to add a new "header"-like file named "rust.rh" per folder, in which one can declare the 'mod' and 'use' items for the source files. All source files are compiled as if "rust.rh" were included directly at the top of the source, but it cannot contain code or declarations of new modules. No explicit declaration is necessary to use "rust.rh". This means getting started is easier: one can simply create a new empty document in an existing project and start typing. The compiler must ignore circular modules pointing to the same file, and to do this it needs to strip "rust.rh" for the file it is processing. For future development the "rust.rh" file can contain instructions to flatten all the files in the folder, making them part of the same module. This will make source easier to manage per folder. My wish is to not have to type 'mod' and 'use' at all per file. |
That won't work, since a folder contains many modules (as many as there are files in it), and each module could be in a completely different position in the module tree and have different needs for imports. I agree with others that explicit imports through "use" can get easier in the future as soon as there is a decent IDE. |
+1 to @bstrie from me too. Leaving the shadowing rules like they currently are would block that path in the future. |
@Kimundi , while allowing shadowing post-1.0 is theoretically backwards-compatible, it would be impossible to reintroduce use-before-mod in a backwards-compatible fashion. So we'd have to be content with either never allowing shadowing, OR allowing shadowing but not being able to enforce the perceived "order" with use-before-mod, OR breaking compatibility. |
I would also like to caution everyone again against proposing drastic, sweeping changes to the module system at this stage. This is a highly complex topic with far-reaching implications. We've been through a lot of iteration to arrive at our current system, and while it could be better it's also not nearly as bad as it used to be (who else remembers Remember that there are only three hard problems in programming language design:
|
Yeah, we might be able to relax the ordering restrictions a bit.
Crates do not support mutual recursion. Modules do.
Totally opposed. Haskell modules do not support mutual recursion, which is a tremendous burden. |
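To make the point about mutual recursion concrete, here is a small sketch in current Rust syntax in which two modules of the same crate call into each other. Doing the same across two separate crates would require a circular crate dependency, which is not allowed. The `even` and `odd` modules are illustrative names only.

```rust
mod even {
    // `even` depends on `odd` ...
    pub fn is_even(n: u32) -> bool {
        if n == 0 {
            true
        } else {
            crate::odd::is_odd(n - 1)
        }
    }
}

mod odd {
    // ... and `odd` depends on `even`. Within one crate this is fine.
    pub fn is_odd(n: u32) -> bool {
        if n == 0 {
            false
        } else {
            crate::even::is_even(n - 1)
        }
    }
}

fn main() {
    println!("{} {}", even::is_even(10), odd::is_odd(7));
}
```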
I would be very concerned about limiting one's horizons to Haskell only. Modules are an incredibly rich area of computer science. The more I learn about how powerful the module systems are in SML and OCaml, the more I'm aware of how little I know about the language construct. From the Reddit:
|
I would also add, that it might be important to think about how modules play with traits a little more. SML and OCaml have amazing module systems, but don't have type classes, Haskell has poor modules, but great type classes. What happens when you put them both in the same language? Are there any precedents for that? Here is a good page describing a bit about OCaml's modules: https://realworldocaml.org/v1/en/html/first-class-modules.html I don't know if Rust needs that kind of power due to the fact we have type classes. Or they could be two sides of the same coin. I don't know. Just opening up the conversation. |
Maybe someone should first clarify what a module is. But maybe we have different expectations, as I'm coming from an OO world. |
Couldn't we replace the ordering rule for |
I guess what I'm getting at is: where do types, traits, and type parameters fit in with paths and use statements? The associated item syntax Sorry if I am derailing the original intent of the original post – I guess part of this is more a response to the title 'Improve the module system'. Syntax is sometimes informed by the underlying semantics, so if something's weird there, then maybe it is suggesting that we have some deeper, unresolved design questions to consider. @pcwalton and @nikomatsakis may already have ideas on this though. Some interesting papers: |
Yes, that is an issue. Perhaps the OP is indeed thinking of modules as more of 'namespaces' which are more for organizing large sets of items into manageable, self-documenting chunks, rather than the powerful language features displayed in languages like SML that are designed to help enable the modularization of code. We need some clarification on terminology here, otherwise we will be talking past each other. If I recall correctly @nikomatsakis and I have talked about type parameterisation of modules. It could allow for some powerful patterns, but I think there were issues with it making things like resolve much more complex, and it might also overlap some of the functionality of traits (see the papers I posted above). |
@bstrie I know, that's why I didn't talk about allowing arbitrary order of imports and module definitions here. :) Forbidding both import and module shadowing and arbitrary ordering would be a backward compatible ruleset to tweak in the future. |
As I understand it, crates are developed top-down, while many other languages support bottom-up development. What I mean by this is that in other languages it is not unusual to first develop types and functions/methods in independent files/modules and then bind them together into a library or executable, or not at all (object file). If this view is correct, the confusion could be solved by documentation. |
This is some of the latest module related research that I know of: "ML modules provide hierarchical namespace management, as well as fine-grained control over the propagation of type information, but they do not allow modules to be broken up into mutually recursive, separately compilable components. Mixin modules facilitate recursive linking of separately compiled components, but they are not hierarchically composable and typically do not support type abstraction. We synthesize the complementary advantages of these two mechanisms in a novel module system design we call MixML. " and haskells take on MixML: " Module systems like that of Haskell permit only a weak form of modularity in which module implementations directly depend on other implementations and must be processed in dependency order. Module systems like that of ML, on the other hand, permit a stronger form of modularity in which explicit interfaces express assumptions about dependencies, and each module can be typechecked and reasoned about independently. In this paper, we present Backpack, a new language for building separately-typecheckable packages on top of a weak module system like Haskell's. The design of Backpack is inspired by the MixML module calculus of Rossberg and Dreyer, but differs significantly in detail. Like MixML, Backpack supports explicit interfaces and recursive linking. Unlike MixML, Backpack supports a more flexible applicative semantics of instantiation. Moreover, its design is motivated less by foundational concerns and more by the practical concern of integration into Haskell, which has led us to advocate simplicity—in both the syntax and semantics of Backpack—over raw expressive power. " This still seems to be of research status but might give inspiration. At least it is the state of the art AFAIK. |
I don't care one way or the other, but I am very much against any changes that breaks mutual recursion of modules. Thanks. |
Also related is the Backpack extension to Haskell's module system: |
Is replacing or extending our module system with one whose scope is not just namespacing but also abstraction (like ML and Backpack) seriously under consideration? |
On Sat, Jan 25, 2014 at 01:35:43AM -0800, Gábor Lehel wrote:
Not in the short term, I don't think. |
I'm not really advocating this right now – there is already enough to do for 1.0, but I do think it would be prudent to consider this so that we don't back ourselves into a corner in the future. |
My 2c: to me the Rust module system is surprisingly counter-intuitive, having not used anything similar. But could you simply educate users through more elaborate error messages? When you try to reference something that doesn't exist, the compiler could look in places that correspond to common errors users make and suggest corrective action ("did you mean ::foo::bar()" or "add 'use foo::bar' to make bar visible"). I would have guessed something more like Haskell's would be easier to get into, but I gather there are good reasons not to do it that way ("use mod ...." "use mod .... as " .. and just keep a graph of modules..). I usually end up making a "common.rs" with a load of use's and just use common::* all over the place; as mentioned above, it makes life easier when a system is in flux. It seems one can avoid making a separate source file for that by saying "use super::*"; not sure if that's a good or bad idea. At the minute I'm basically shying away from heavily using the module system because of the lack of an IDE; I still try to make symbol names unambiguous across my (so far, small) projects. With polymorphism going on you shouldn't need so many symbols. Having said that, in C++ I use nested classes a lot, and even C++ people say "avoid that, make a namespace.." ... so I am beginning to see the rationale behind the Rust method. |
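The "common.rs" habit described in this comment might look roughly like the following in current Rust syntax. This is a sketch under assumptions: the `common` and `words` modules, the `Count` alias and the `tally` function are hypothetical, and inline modules stand in for separate files.

```rust
// Stand-in for `common.rs`: one place that collects shared imports.
mod common {
    pub use std::collections::HashMap;
    pub type Count = usize;
}

// Stand-in for any other file in the project.
mod words {
    // A single glob import brings in everything `common` re-exports.
    use crate::common::*;

    pub fn tally(text: &str) -> HashMap<&str, Count> {
        let mut counts = HashMap::new();
        for word in text.split_whitespace() {
            *counts.entry(word).or_insert(0) += 1;
        }
        counts
    }
}

fn main() {
    let counts = words::tally("a b a");
    println!("a appears {} times", counts["a"]);
}
```

The trade-off is the one raised elsewhere in the thread: glob imports are convenient while a design is in flux, but they hide where a name comes from.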
I started writing multi-file Rust today, and I think the way it currently works is completely counterintuitive to newcomers. Documentation is no substitute for sane defaults. If something doesn't change, I believe you'll end up with include!() as the preferred way of managing modules.
Proposal for Unsurprising Modules
Summary
Source layout
Now step back and look at that source layout again. I bet you understood exactly how my proposed modules work without having to read any documentation.
In Detail
Backward Compatibility
Open Questions
|
Here is one proposal: At the moment all imports come from the root of the crate by default in order to remove ambiguities when resolving paths. How about making paths relative to the current module the default? Say we have a crate set up like:
mod baz {
    mod foo {
        fn bar() {}
    }
}
Let's say we are working within the
This might make things less surprising – I still get tripped up by having to use |
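For comparison, the sketch below reuses the `baz`/`foo`/`bar` layout from this comment and shows how absolute and relative paths can be spelled in current Rust. It is not the syntax that was under discussion in 2014; `foo` and `bar` are made `pub` here so that the calls compile, and the three wrapper functions and the `sibling` module are hypothetical additions.

```rust
mod baz {
    pub mod foo {
        pub fn bar() -> &'static str {
            "bar"
        }
    }

    // Absolute path: starts at the crate root.
    pub fn via_crate_root() -> &'static str {
        crate::baz::foo::bar()
    }

    // Relative path: starts at the current module (`baz`).
    pub fn via_self() -> &'static str {
        self::foo::bar()
    }

    pub mod sibling {
        // Relative path that first steps up to the parent module.
        pub fn via_super() -> &'static str {
            super::foo::bar()
        }
    }
}

fn main() {
    println!(
        "{} {} {}",
        baz::via_crate_root(),
        baz::via_self(),
        baz::sibling::via_super()
    );
}
```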
Woops, I pressed "Close and comment" by accident! Sorry! |
@o11c - I'm the same in that every time I pick it up after a break, what it does is completely surprising. [1] Educating in error messages: [1.1] When you expected to be able to reference a symbol and it's in the wrong place, the compiler looks for the closest match and tells you what the correct path was. You just need to be reminded of the fact that 'use' is crate relative and code is mod relative. This automatic search would also be hugely useful in this early time without an IDE. [1.2] When you start out saying "mod foo.." wherever you want foo, like I always did, the compiler can warn you: "warning, multiple copies of mod foo, prefer to bring modules in the crate root and 'use' them elsewhere". [3] fixing the glob import bugs (currently use super::) doesn't work. I've been adding a 'common.rs', and if I could just opt to carry common 'pub uses' declared in the crate root that would streamline it. I'm happy with the logic of how the system actually works ... it's a combination of namespacing and modules... I've not used that much in C++ voluntarily because you've got the alternative of classes-in-classes, and it's an extra layer. The Rust way gives more meaning to the directory tree. There's one more change I would suggest; I suspect it won't be popular. |
I really think we should either have |
I suggested making @bjz The relative paths for |
@o11c, I'm all for simplifications of the module system, but I don't actually think it's that complicated:
There is now a significant amount of Rust code in the wild, including large multifile libraries (e.g. most of the libs in this repo, many of the projects here) and there are not many uses of Of your open questions, these really need to be answered:
(The answer to the second one is yes: the stdlib reuses things from other non-sibling modules a lot, e.g. random example of |
The biggest pain IMO is splitting a module across multiple files. Right now you need reexports or Another minor pain is that I frequently find myself saying |
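As an illustration of the re-export approach mentioned here, the sketch below splits one logical module into two private submodules and re-exports their items so callers see a single flat namespace. It uses current Rust syntax; `shapes`, `square`, `rect` and the types inside them are hypothetical, and in a real project the inner modules would live in their own files.

```rust
mod shapes {
    // Would live in `shapes/square.rs`.
    mod square {
        pub struct Square {
            pub side: u32,
        }

        impl Square {
            pub fn area(&self) -> u32 {
                self.side * self.side
            }
        }
    }

    // Would live in `shapes/rect.rs`.
    mod rect {
        pub struct Rect {
            pub w: u32,
            pub h: u32,
        }

        impl Rect {
            pub fn area(&self) -> u32 {
                self.w * self.h
            }
        }
    }

    // Re-exports: callers write `shapes::Square`, not `shapes::square::Square`.
    pub use self::rect::Rect;
    pub use self::square::Square;
}

fn main() {
    let s = shapes::Square { side: 3 };
    let r = shapes::Rect { w: 2, h: 5 };
    println!("{} {}", s.area(), r.area());
}
```

Note that re-exports only flatten the names. Private items in `square` are still not reachable from `rect`, which is the pain point this comment describes.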
I'd just like to point out that @o11c's module system (at least the "Source Layout" section) is pretty much how modules work in D. That system works well and it's pretty intuitive. The filesystem is always going to be there and we'll be putting code inside files inside folders forever; might as well have that hierarchy be reflected as modules/packages and reduce confusion. |
@cmr I like the "inherit" idea, does a better job than the "common.rs / use common::* everywhere" hack.
fewer keywords for the newcomer to throw in, and more control by adding more keywords. |
...another thing where "I thought it already worked that way" (and not just for (And I know I've relied on this mistaken assumption in at least one other comment somewhere, and no one called me on it. Please do, if you notice!) |
It'd be sad to use On Mon, Mar 3, 2014 at 8:02 AM, Gábor Lehel wrote:
|
cc @flaper87 |
(To be clear I wasn't implying the C++ behavior is better.) @flaper87 you know there's a |
@glaebhoerl yes, but 1) I'm subscribed to all rust notifications (yes, I go through pretty much all of them) and 2) the cc helps with my email filters, tags etc ;) |
On Sun, Mar 02, 2014 at 03:04:00AM -0800, Huon Wilson wrote:
+1 |
It's not that it's complicated; it's that it just works in a way completely unexpected by anyone coming from a different language. But error messages could fix that. |
On March 3, 2014 at 11:18 PM, "Niko Matsakis" wrote:
|
That's why it's a bug. No amount of error messages or documentation will fix that. |
If we could sum up the Rust module system with " Unfortunately the module system can't be so succinctly explained because it's far more complex than that. |
This should be written up as a proper RFC now that we have the process: https://github.com/rust-lang/rfcs/ |
An RFC has been opened: rust-lang/rfcs#18 Further discussion should migrate to that pull request (and alternate proposal should become new RFCs). Closing. |
A year ago, I selected Go from among Rust, C++, and Go. After a year, I am trying Rust again because Go does not support building binary libraries. Last year, I learned about the complexity of “mod” and “use”. This year, in order to avoid the confusions and limitations that come with “mod” and “use”, I will put all structs and functions into main.rs. If I use multiple files, the project code must be arranged in a tree structure, with calls to functions/structs going in one direction, down the tree branches. I have tested it and it does not support cross-calling functions/structs from different files the way Go and C# do. I strongly suggest that Rust implement Go-style package naming. It allows very flexible function/struct calls, as if everything were in a single file. This improvement could expand the user base significantly. Or recommend that the Visual Studio Code editor allow users to define different blocks of a single file and view them as if they were multiple files. |
I am not sure what the chances are, but if it could be improved, it should be before 1.0.
Rust's module system has a steep learning/doing curve. I think there are two reasons for this:
1. `mod` and `use`. This leads to the shadowing rules, which have weird consequences: you need to import a path via `use` which is not even defined at this point in time, because `use` needs to come before `mod`. I find this not only unfamiliar but also counterintuitive.
2. `use` statements are relative to the crate, the meaning changes depending on which file you currently compile. The consequence is that it makes it more difficult than necessary to first write file `a.rs`, then `b.rs` (uses `a.rs`), and then `c.rs` (uses `b.rs`) and make them compile individually. Even in Haskell that is easy. In Rust you either have to go back and forth and adjust the `use` statements or only put the `mod` statements in a dedicated top-level crate (`lib.rs` or `main.rs`), which defies encapsulation.

This could be an issue that is fixable with more/better documentation. However, when you need a lot of words/actions to describe/do something in one system which does not need that care in another system, that could be an indication that there is some accidental complexity. I think this complexity is accidental since I am not yet aware of which gains compensate for it.
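For readers trying the `a.rs` / `b.rs` / `c.rs` scenario with a current compiler, the arrangement where the top-level file declares every module might be sketched as follows. This uses today's syntax rather than the 2014 syntax under discussion; inline modules stand in for the three files, and the function names are hypothetical.

```rust
// Stand-in for `main.rs`: the crate root declares each module once.
// With separate files these would be `mod a;`, `mod b;` and `mod c;`.
mod a {
    // Would live in `a.rs`.
    pub fn base() -> u32 {
        1
    }
}

mod b {
    // Would live in `b.rs`. The `use` path is written from the crate root,
    // so it reads the same no matter which file it appears in.
    use crate::a;

    pub fn doubled() -> u32 {
        a::base() * 2
    }
}

mod c {
    // Would live in `c.rs`.
    use crate::b;

    pub fn doubled_plus_one() -> u32 {
        b::doubled() + 1
    }
}

fn main() {
    println!("{}", c::doubled_plus_one());
}
```

None of the three files can be compiled on its own in this arrangement; the crate root is always the entry point, which is the encapsulation concern raised above.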