This repository has been archived by the owner on Mar 25, 2018. It is now read-only.

My 2 cents on modules — v2 #20

Open
leo opened this issue Jul 29, 2015 · 21 comments

Comments

@leo

leo commented Jul 29, 2015

I recently wrote something about how modules in node currently work (using NPM) and @Fishrock123 pointed me here, since the node repo probably wasn't the right place to post something like this.

I was basically asking why NPM doesn't simply put all packages in one place (like most package managers do it), because this method would make the whole thing a bit more continuous while keeping it less complex.

After I did some research, I also found out that NPM already introduced the "flat" feature in version 3.0.0, which simply puts all modules on one level within the node_modules directory (as long as there are no version incompatibilities, otherwise they'll be nested as before).

Since that already solves one of the problems I've mentioned earlier ("nesting inception"), I'm glad that they made such a decision. It's already going into the direction which I pointed out and moves all modules into one level.

Talking about the other things I've mentioned in my post: Why don't we go another step and move all modules into a single, global directory? I mean, they're already in a single one (for each project).

Although I've already read things like this, nobody really pointed out what's the problem here (and why they don't want to use the method of keeping everything globally). Everywhere I look, I just see something like this:

It's not going to happen, because it's a terrible idea that causes more problems than it solves.

But ... why?

I'm assuming that one problem might be the fact that some modules don't work with the latest version of a module which they depend on. That's true, right? And if that's the main reason why you don't want to switch to a different setup, why don't we simply do it like this:

  1. Let's create a global directory for all node modules.

  2. Within this directory, there are sub-directories (a.k.a. "modules")

  3. And now (to solve the version-problem): Each "modules" shouldn't simply contain the contents of the specific module. It should contain another set of sub-directories which are named like the version tags which they're representing. For example: One folder within there is called "1.0.0", a second one "1.0.1" and another one "1.0.2".

  4. Now the clue about this method: Every module that's getting installed is currently also automatically installing its own dependencies, right? Then let's keep that like it is! Every time a module gets installed, it looks into the global module directory and searches for the module which is needed. As soon as it finds the needed module's directory, it takes a look into this directory and looks for the needed version tag. If this version tag exists, it doesn't need to install it again. But if it doesn't exist, the package manager will simply download this version from the repo and put it into its own folder (which is then named like the version - e.g. "1.0.0"). Summed up: Like this, we're able to keep multiple versions of a single module on our machine, but at the same time, everything is global and in one place.

  5. Now the only question which is left is: "How can a module/app then require a module?" — Easy: If it depends on the latest version (aka.: higher than "..."), it can simply use require( 'module-name' ). And that, gentlemen, should be the standard. And if the creators of a module simply aren't able to keep up with the updates of their module's dependencies and need to use an older version of a module, they could use something like this:

    require( 'module-1.2.3' )
    

    or this:

    require( 'module' ).v( '1.2.3' )
    

    (I know that those solutions are very dirty, but I'm sure that someone is able to come up with something better as soon as he understands my intention with this post).
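A minimal sketch of the lookup described in steps 1 to 5, under stated assumptions: the store layout is `<storeRoot>/<name>/<version>`, and `locate`, `compareVersions` and the `/opt/node-store` path are hypothetical names chosen for illustration, not anything npm or node provides. It only handles plain `major.minor.patch` version folders, not semver ranges.

```javascript
const path = require('path');

// Compare two plain "major.minor.patch" strings numerically.
// (A plain string sort would wrongly rank "4.9.1" above "4.13.0".)
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// `installed` maps a module name to the version folders present in the store.
// Returns the folder to load from, or null when a download would be needed.
// Without an explicit version, the highest installed version is used.
function locate(storeRoot, installed, name, version) {
  const versions = installed[name] || [];
  if (versions.length === 0) return null;
  const wanted = version || [...versions].sort(compareVersions).pop();
  return versions.includes(wanted)
    ? path.posix.join(storeRoot, name, wanted)
    : null;
}

const installed = { express: ['4.12.0', '4.13.0', '4.9.1'] };
console.log(locate('/opt/node-store', installed, 'express'));          // /opt/node-store/express/4.13.0
console.log(locate('/opt/node-store', installed, 'express', '4.9.1')); // /opt/node-store/express/4.9.1
console.log(locate('/opt/node-store', installed, 'express', '5.0.0')); // null
```

A `null` result corresponds to the "download this version from the repo" branch of step 4.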

diagram

I know that we're not able to change the way how modules work NOW. And that's also not what I want. I just want you to think about how we could shape this whole thing into a more constant, less confusing and less app-specific direction in some future.

Also thank you for taking the time to read this. If you have your own opinion about it, please drop it into the comment section down below!

@kobalicek

The current nesting approach also doesn't work well with native addons, which should be installed in a flat hierarchy by default I guess.

@serapath

Regarding (4.)

  • I guess that would also decrease traffic on npm, because instead of me downloading all kinds of modules all the time, the ones I use often will just be "fetched" from my global modules directory and npm linked into my local module directories (or copied as a fallback?)

Regarding (5.)

  • There is another open issue here, "Install multiple versions of a dependency" (npm/npm#5499), which asks whether it would be a great thing to install multiple versions of a dependency into the same project - which would often simplify the lives of frontend developers, so adding that feature too might not be so bad? :-)
  • What if, instead of require( 'module-1.2.3' ) or something similar, the information is just read from the local package.json file?
    • To not be stuck with require('moduleName@version') across source files, a suffix could be defined to reference a potentially different version of a module, e.g. require('moduleName:a') and/or require('moduleName:b'). Without the suffix the latest version will be required.

Example:
package.json

  "dependencies": {
    "mydependency": "mydependency^1.3.0",
    "mydependency:a": "mydependency^1.1.0",
    "mydependency:b": "mydependency^1.0.0"
  }

So if the author wants to update require('moduleName:a'), there are two choices. Either update the package.json or search&replace "moduleName:a" across the project.

@leo
Author

leo commented Jul 29, 2015

I guess that would also decrease traffic on npm, because instead of me downloading all kinds of modules all the time, the ones I use often will just be "fetched" from my global modules directory and npm linked into my local module directories (or copied as a fallback?)

The first thing you've said is true and another great advantage of this method. Thanks for pointing it out! But the last thing about npm link isn't how I thought we would do it. To explain my post up there a bit more clearly: I don't necessarily want the node_modules folder to exist anymore. All modules should be fetched from the global directory and also loaded from there.

I'm not sure, but I think on OS X, this directory could be something like ~/npm or ~/node_modules (within the user directory of the OS).

I also noticed that the modules which are currently stored in ~/npm already contain multiple version tags:


screenshot

So I think my suggestion won't entirely change how NPM works. Maybe just a few improvements are enough to make it fit to our solution.



What if, instead of require( 'module-1.2.3' ) or something similar, the information is just read from the local package.json file?

So if the author wants to update require('moduleName:a'), there are two choices. Either update the package.json or search&replace "moduleName:a" across the project.

That sounds like a really great idea, @serapath. But I would just do it like this ...

"dependencies": {
    "handlebars": "^3.0.3",
    "express": "^4.13.0",
    "browserify": {
        "a": "^2.3.0",
        "b": "^11.0.0"
    }
}

... to avoid the repetition of the module name.

And I would also suggest that those a and b strings should be completely free to be chosen. That means if someone needs a special version of a module, he doesn't need to use this syntax. He could also use a different, random key. For example:

"browserify": {
    "phoenix": "^2.3.0",
    "latest-beta": "^11.0.0"
}

This way, he's free to use his own identifier (the codename of the project for example).

But we also need to keep in mind that this means that node would need to parse the package.json file every time the module is being used. Therefore I'm not sure if that's the best way to do it when talking about performance (but maybe we could also use an internal cache to store the relation between a suffix and a dependency version).
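The internal cache mentioned above could look roughly like this. It is only a sketch under assumptions: the nested `dependencies` shape is the one proposed in this comment (package.json does not support it today), the `name:alias` key follows the suffix idea from earlier in the thread, and `buildAliasCache` is a hypothetical helper name.

```javascript
// Flatten a "dependencies" object into a Map keyed by the string that would
// be passed to require(): "name" for plain entries, "name:alias" for
// entries that list several aliased versions.
function buildAliasCache(dependencies) {
  const cache = new Map();
  for (const [name, spec] of Object.entries(dependencies)) {
    if (typeof spec === 'string') {
      cache.set(name, spec);
      continue;
    }
    for (const [alias, range] of Object.entries(spec)) {
      cache.set(`${name}:${alias}`, range);
    }
  }
  return cache;
}

// Built once when package.json is first read, then reused on every require.
const cache = buildAliasCache({
  handlebars: '^3.0.3',
  express: '^4.13.0',
  browserify: { phoenix: '^2.3.0', 'latest-beta': '^11.0.0' },
});

console.log(cache.get('browserify:phoenix')); // ^2.3.0
console.log(cache.get('express'));            // ^4.13.0
```

With a map like this, package.json would only need to be parsed once per process instead of on every module load.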



There is another open issue here npm/npm#5499, which talks about the question, whether it would be a great thing to install multiple versions of a dependency ...

That's great! 😊

Maybe the folks from there will contribute something to this idea here, so that we could "kill two birds with one stone" (wow, that sounds pretty strange in English).

@Qard
Member

Qard commented Jul 29, 2015

A few things:

  • npm already has a cache, so changing where modules are located would have no impact on traffic to the registry.
  • Changing the code semantics of how require(...) works or how dependencies are structured in package.json is hugely ecosystem-breaking and thus unlikely to ever happen.
  • node/io.js itself has no concept of module versions. It simply does a recursive path resolve, searching for node_modules folders upward from the path of the current file. This means the path resolution for module loading would need to change substantially, in a way that is not at all backwards compatible. Again, probably not going to happen.
  • Storing the modules directly in the app directory can simplify deployment, making it easy to tar up the folder, scp it to the server and have it just work. (Other than native modules)
  • Having the modules easily accessible also makes issues much easier to debug, as you can dig through the dependencies and inspect the state easily and/or log it to the console.
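To make the "recursive path resolve" point concrete, here is a simplified sketch of which folders get searched for a bare `require('name')`. It is an illustration only, not node's actual implementation: `candidateDirs` is a hypothetical name, POSIX-style paths are assumed, and extra locations such as `NODE_PATH` are left out.

```javascript
const path = require('path').posix;

// List the node_modules folders that would be searched, starting at the
// directory of the requiring file and walking up to the filesystem root.
function candidateDirs(fromDir) {
  const dirs = [];
  let current = path.resolve(fromDir);
  while (true) {
    // Never append node_modules to a folder that is itself named node_modules.
    if (path.basename(current) !== 'node_modules') {
      dirs.push(path.join(current, 'node_modules'));
    }
    const parent = path.dirname(current);
    if (parent === current) break; // reached the root
    current = parent;
  }
  return dirs;
}

console.log(candidateDirs('/home/app/lib'));
// [ '/home/app/lib/node_modules',
//   '/home/app/node_modules',
//   '/home/node_modules',
//   '/node_modules' ]
```

Nothing in this walk knows about versions, which is why a versioned global store would need a different resolution step.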

@leo
Author

leo commented Jul 29, 2015

Thanks for your feedback, @Qard. I really appreciate hearing more opinions from people who are familiar with how node works.

Here are some of my thoughts on your words:


Changing the code semantics of how require(...) works or how dependencies are structured in package.json is hugely ecosystem-breaking and thus unlikely to ever happen.

That's true. And because we don't want that to happen, I'd suggest to firstly leave the whole node_modules structure as it is and just add our method as another feature (Since it doesn't require people to change their package.json or something like that. They're free to leave it as it is. Only if they want to use suffixes, they would need to add them there).

But all in all, I guess it wouldn't be that much of a problem to just add it on top of the current structure. I mean, there's already the method to install modules globally. We only need to find a way to make this the default for the future and add an option to require them.


node/io.js itself has no concept of module versions. It simply does a recursive path resolve, searching for node_modules folders upward from the path of the current file. This means the path resolution for module loading would need to change substantially, in a way that is not at all backwards compatible.

Huh? Why? My suggestion was keeping all modules in a single, global directory. And it should be the same on all machines (maybe different between OS', but npm could differ there). Therefore, the path won't change ever.


Storing the modules directly in the app directory can simplify deployment, making it easy to tar up the folder, scp it to the server and have it just work. (Other than native modules)

Yep, that's something we would need to sacrifice for this method. But I feel like it's more useful and intended by the creators to just move the main app up and then npm install on the project, since many modules use this process to prepare for the use on a different environment.


Having the modules easily accessible also makes issues much easier to debug, as you can dig through the dependencies and inspect the state easily and/or log it to the console.

I agree with you there. But I also think that keeping them all in a global directory will make debugging easier, rather than harder (since the user only needs to open one folder to access all of them). Another great thing would be that he doesn't need to go through many nesting levels if he wants to access a module. Only through two.

@mikeal
Contributor

mikeal commented Jul 30, 2015

A little bit of history here:

In the early days of node we didn't have a node_modules directory. You either imported packages relatively require('./blah') or you had them somewhere in your NODE_PATH. Nor did we have the "local" import semantics that will reverse their way up the directory tree to find local packages by name.

Early versions of npm were built to this module system and, as a result, had many of the semantics you are suggesting. Much work was put in to allowing two packages to depend on two different versions of the same module while still having packages installed more or less globally. The result was something that worked but could be quite painful at times and the code was quite a mess with lots of pretty terrible edge cases. Also, because you had to use either shims or symlinks to route all these packages to the right module it was hard to understand as a developer how this stuff was really working and very difficult to intervene and forcibly change or update anything.

Eventually @isaacs started working directly on improving node's own module system to better support the use cases npm was introducing. Still, it's important to note that while you are talking about the installation of packages and the resolution scheme for those packages together there has always been and continues to be a separation of those concerns. Node has a clearly defined way that it resolves package names when you require them and the node_modules system is part of that, as is the preference for "local" rather than "global" resolution of a particular package by name.

The changes you note landing in npm are distinctly improvements to the way packages are installed and managed with no changes whatsoever to how node resolves packages during import. They are effectively optimizations to automatically "dedupe" and "flatten" installed packages without making any changes to require's resolution semantics.

@leo
Author

leo commented Aug 1, 2015

Much work was put in to allowing two packages to depend on two different versions of the same module while still having packages installed more or less globally. The result was something that worked but could be quite painful at times and the code was quite a mess with lots of pretty terrible edge cases.

Yeah, I already thought something like that. — The reason why I'm proposing this method again is because I want to build on top of it from here (instead of just kicking this functionality out because it doesn't work fine with some other standards).

If you ask me, this method of managing modules is so important in terms of keeping code sanely and non-repeatedly structured, that we should say something like: "Okay, then let's introduce this method and see how we can work from here (in terms of adjusting other parts of the system to fit this functionality)". I would be happy to help you in improving other parts of the system until those sporadically occurring and terrible edge cases don't exist anymore. But just going the easy way and saying "no" because it's a huge effort to think about different solutions for those problems shouldn't be an option. 😸

Because if we constantly use the excuse that we're not able to introduce such things because they don't fit into the existing structure, the whole project will always suffer from not being able to change in drastic ways, which is the basic essence of innovation and progress (at least that's my opinion).


Also, because you had to use either shims or symlinks to route all these packages to the right module it was hard to understand as a developer how this stuff was really working and very difficult to intervene and forcibly change or update anything.

I don't know if this is something related to my proposed change of the require() functionality, but I just want to make clear here that I never intended the new method to work like this. I don't want to link something from a local project to the global modules directory using symlinks or something similar (like npm link does it).

I just want apps to depend on global modules and to be able to require them from there, for now.

My proposal: Maybe we could just re-introduce this feature without removing the node_modules method and then go on from there and see what we can do in terms of fixing those edge cases? And while we're doing that, people who're experiencing those edge cases could simply switch to the node_modules way until everything fully works without problems.

Of course, it's a bit chancy. — But as I already said, I think that's exactly what constitutes open-source projects and what projects like node need to stay alive in the long run. And since we won't break the existing structure already, I don't think it's too risky.

@mikeal
Contributor

mikeal commented Aug 3, 2015

@leo you have two options today:

  1. Set the NODE_PATH environment variable to a place that you are installing and managing global packages.
  2. Rather than making node_modules a directory create it as a symlink to a place that you are installing and managing global packages.

My own experience with global packages in many languages and with node in the early days leaves me with some skepticism about the success you might have with this but we did leave enough of the old package resolution semantics around for you to do this if you wish.
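As an illustration of option 2 only, the sketch below builds a throwaway "global" folder in the temp directory, makes a project's node_modules a symlink to it, and requires a module through that link. The folder names and the `greeter` module are made up for the demo; `module.createRequire` needs Node 12.2 or later. Option 1 (`NODE_PATH`) is not shown because it has to be set before the process starts.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { createRequire } = require('module');

// Throwaway sandbox: one shared store, one project.
const sandbox = fs.mkdtempSync(path.join(os.tmpdir(), 'global-modules-'));
const store = path.join(sandbox, 'store');
const project = path.join(sandbox, 'project');

// A made-up module that lives only in the shared store.
fs.mkdirSync(path.join(store, 'greeter'), { recursive: true });
fs.writeFileSync(
  path.join(store, 'greeter', 'index.js'),
  "module.exports = () => 'hello from the store';\n"
);

// The project's node_modules is a symlink to the store, not a real folder.
// ('junction' only matters on Windows and is ignored elsewhere.)
fs.mkdirSync(project);
fs.symlinkSync(store, path.join(project, 'node_modules'), 'junction');

// Resolve as if the call came from a file inside the project.
const projectRequire = createRequire(path.join(project, 'index.js'));
const greeter = projectRequire('greeter');
console.log(greeter()); // hello from the store
```

Note that every project linked this way sees the same single version of each module, which is exactly the conflict case discussed further down.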

@isaacs

isaacs commented Aug 3, 2015

@leo

I'd like to dig into some of the pushback you're getting here.

Consider a dependency conflict like so:

A depends on B 2.5.3 and C 1.2.3. C depends on B version 1.2.3. Oh noes, conflict, can't dedupe.

It IS possible to install packages into a structure like this:

{root}
+- a
| +- 1.2.3
|   +- node_modules
|     +- b@ -> {root}/b/2.5.3
|     +- c@ -> {root}/c/1.2.3
+- b
| +- 1.2.3
| +- 2.5.3
+- c
  +- 1.2.3
    +- node_modules
      +- b@ -> {root}/b/1.2.3

This isn't that hard to lay out on disk, and then you can link the modules into your project's node_modules folder, and thanks to the magic of realpath and caching, you'll get only the duplication that is absolutely necessary.
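The layout above can be tried out in a temp directory. This is a sketch under assumptions, not what npm does: `writePackage` is a hypothetical helper, the package sources are one-line stand-ins, and it relies on node resolving symlinked modules to their real paths (the default, i.e. without `--preserve-symlinks`).

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const root = fs.mkdtempSync(path.join(os.tmpdir(), 'store-'));

// Create {root}/<name>/<version>/index.js and link each dependency into
// that version's own node_modules folder.
function writePackage(name, version, source, deps) {
  const dir = path.join(root, name, version);
  fs.mkdirSync(path.join(dir, 'node_modules'), { recursive: true });
  fs.writeFileSync(path.join(dir, 'index.js'), source);
  for (const [depName, depVersion] of Object.entries(deps)) {
    fs.symlinkSync(
      path.join(root, depName, depVersion),
      path.join(dir, 'node_modules', depName),
      'junction' // only relevant on Windows
    );
  }
  return dir;
}

writePackage('b', '1.2.3', "module.exports = 'b@1.2.3';\n", {});
writePackage('b', '2.5.3', "module.exports = 'b@2.5.3';\n", {});
writePackage('c', '1.2.3', "module.exports = { b: require('b') };\n", { b: '1.2.3' });
const aDir = writePackage(
  'a', '1.2.3',
  "module.exports = { b: require('b'), c: require('c') };\n",
  { b: '2.5.3', c: '1.2.3' }
);

// A sees B 2.5.3 directly, while C (loaded by A) still sees B 1.2.3.
const a = require(aDir);
console.log(a.b, a.c.b); // b@2.5.3 b@1.2.3
```

Each version of B exists once on disk, and both dependents get the one they asked for.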

This isn't what npm does today. It's similar to what npm did in the 0.x days, but a bit different. Since we have pretty good "symlink" support in Windows using Junctions, I'm even willing to believe that such a model would be a little bit easier to work with, and perhaps marginally better than what we have today (or will have tomorrow, with npm@3.)

However, "install" is not the sum total of what a package manager does, and all human-interfacing software programs are eventually full of compromises. As they say, the first 90% is easy.

Inspecting packages, updating them, removing them (without breaking other deps) and so on, requires a lot more code to be rewritten. What does "bundleDependencies" mean in this model? How does it handle shrinkwrap and peerDependencies? ("It doesn't" might be an acceptable answer; but then again, might not.) What gets published? That's the "second 90%" of the problem.

The "third 90%" is the need to handle all the unexpected edge cases that come from users not knowing where things are, and helping them to understand. This is why npm@3 has been a lot more work than just progress bars and auto-deduping. You also need to handle stuff like logical dep trees (which now don't match the tree on disk) and so on.

I'm not saying it can't be done. Obviously it can be done. But it's a pretty big expense just to get feature parity with what we have now, so even if it ends up being better, if you want someone else to do it for you, you have to convince them that it'll be better enough to be worth that cost.

Or you could just go write it yourself. I don't mean this as a GFY answer. When there's something that requires some technical exploration to really show that you understand the trade-offs you're suggesting, and no one else is convinced that it's worthwhile, going off to write it yourself is a much better use of your time than dropping $0.02 on an issues list. The feature costs more than 2 cents, as it turns out, and either you can fund it or you can live with it not happening. It could be that there are incremental steps to getting from here to there, and some of those might even be independently valuable enough to convince someone else to pitch in with. But that's work, and since you're the one who wants it done, you may have to just do it. And it might not work out.

I have done this myself, many times; that's why npm exists, and why node's module system is what it is. I didn't manage to do this because I had political capital in the Node.js community; I got political capital in the Node.js community by doing this. Sometimes you just have to go do the crazy thing, no matter who tells you it isn't a good idea.

I am cheering for you, whatever you decide.

@isaacs

isaacs commented Aug 4, 2015

Oh, also, I realize that the layout I proposed isn't exactly what you're suggesting. But I think it's more or less where you'll end up, if you keep the constraint of handling meta-dependency conflicts.

Any radically different-from-current layout will incur most of the same tradeoffs, so the gist of what I was getting at is still pretty much the same.

@leo
Author

leo commented Aug 8, 2015

@isaacs

First of all: Thank you for those kind words. Looking at my past experience, many communities don't have that sentiment. 👌

Obviously it can be done. But it's a pretty big expense just to get feature parity with what we have now, so even if it ends up being better, if you want someone else to do it for you, you have to convince them that it'll be better enough to be worth that cost.

I'd like to try this first. Mostly because I'm not sure if I would be able to do this completely by myself already. I mean, I know how npm and node work but getting help from someone who knows it better will also lead to a better output.

As I already mentioned: Before I proposed this method, I was really bugged by the way it currently works. In many projects, it worked fine. But one day, I started developing something which worked a bit differently than a usual web app. I tried many different things which were suggested by some cool people around the web. And yeah, they all worked fine. But to me, they simply didn't make any sense. And suddenly, the whole node_modules structure started feeling a bit grotesque to me and I knew that deep in there, there's a major problem which needed to be solved.

Because of this, I started playing with the thought of proposing this in here. Firstly, I thought "No, that's probably useless. Those node guys are already working hard and they'll definitely have a good reason for keeping it the current way!". But then my mind turned: "What are you waiting for? Thoughts from a guy who is relatively new to node and who came from completely different roots (in terms of languages, modules and project structures) are exactly what they want. Your opinion not only represents you, but all those people (rookies in node) for whom the community wants to optimize node (to let them understand everything more easily and make it all more accessible)." Regarding:

The "third 90%" is the need to handle all the unexpected edge cases that come from users not knowing where things are, and helping them to understand. This is why npm@3 has been a lot more work than just progress bars and auto-deduping.

All of this describes exactly where I was before I made my way through this jungle of thoughts and figured out how everything works and where it lacks a better solution. At the same time, it's also the reason why I'm proposing this method.

And if that reason of "Making everything more clear and accessible for rookies and people who aren't involved in the process of developing node's or npm's core (and those are the biggest part of the node community)" wasn't already enough, I also wrote about many other benefits this method will bring to all developers (even professionals in this area) in the above posts.

And YES, there will always be some of those edge cases. We can only juggle between a bad and a good core method. Both of those methods will create some nasty edge cases. We will always have to work our way through from a specific point, until all those cases are removed. And currently, we're doing exactly this with the node_modules method. When it was introduced, some edge cases popped up, which are now being solved one by one. Don't tell me, people aren't reporting issues which are somehow related to this. And it will be exactly the same with my method: It will be introduced, some nasty problems will pop up (or not) and we will work our way through them and solve them one by one.

So for me, it's the same with both methods. The only thing that's different is that the edge cases are different ones and what's even more important: The core solution is different.

And in my opinion, it will feel better for users.

I hope you understand which point I'm trying to make here and why I think this method will have a better impact on the project in the long run. I tried to avoid talking too much about the short run, because it's simply not important for me and I guess for you, it also isn't.

@leo
Author

leo commented Aug 8, 2015

@mikeal

... we did leave enough of the old package resolution semantics around for you to do this if you wish.

That's good to know, thank you! Maybe there's still hope. 😄

Regarding your suggestions: This issue isn't only about solving my problem. It's about solving a core problem for all users. But since you told me about those old package resolution semantics, I guess you already understood this.

@leo
Author

leo commented Aug 8, 2015

@isaacs And if you aren't convinced already:

I don't know if you want to keep this hush-hush because it allows you to manage opinions in this issue more easily (since there are not many people saying something), but it might be great to get more people involved in this. Maybe I'm not the only one who thinks that this method is great.

And if so, then a larger number of people wanting it might already be a good reason to implement it, right? Hmmm... But "simply wanting it" shouldn't be a reason, you're right. But maybe some of those people will also contribute some other great reasons, who knows?

@leo
Author

leo commented Aug 29, 2015

I've tried out npm 3.3.0 recently. Yes, that comes very close to the solution I've proposed. The only difference in mine would be that the packages are all located in a single place and each module folder contains multiple versions of the same module (sorted in sub-folders which are named "1.0.0", "1.0.1" and so on - just like the version tag itself is written).

How will this go on? Does anyone here have interest in developing this change? @isaacs maybe? If not, I will come back to this as soon as I have enough skills to realize it myself.

@isaacs

isaacs commented Aug 29, 2015

@leo I'm not trying to keep anything "hush hush". If there was a grand conspiracy to prevent dissent, we probably would have locked this issue and deleted your comments or something equally nefarious ;)

You're seeing a lack of interest because people lack interest. The current solution works for millions of developers who use it every day.

You will have a very hard time trying to get anyone else to write this for you, because any breaking change is just simply too costly for too little value to even consider.

No one will build this for you. If it's not worth you spending your time building it, then why would I think that it's worth my time? It's your idea! If you can't even convince yourself, then it must not be very good! ;)

Go do it yourself. When you have something to show, show it.

@leo
Author

leo commented Aug 31, 2015

Hehe, I never thought that I wouldn't believe in it. I definitely do. You must have misunderstood something up there. Anyway. Thanks for taking the time and answering me.

I will come back to this as soon as I'm able to develop it. 😊

@dmail

dmail commented Sep 3, 2015

I pray for this every day of my life.

I mean the flat, self explanatory, directory structure already used in the folder at npm config get cache

@Marak

Marak commented Sep 2, 2016

@leo

Been hitting this same issue for years.

Mikeal and Isaac are right though, the additional complexity this would introduce into the package model doesn't outweigh the potential benefits.

There are legitimate use cases for requiring and installing specific version of packages in same app, but it would increase complexity of both code and mental model for npm. Would cause too much trouble for the 99% of the users who don't need it.

I'm currently evaluating https://github.com/scott113341/npm-install-version , but it hasn't been tested in production yet.

It would be nice to at least have a reliable option to require('foo@2.1.2') or require('foo').v('2.1.2') using a third party pool.

@Trott
Member

Trott commented Sep 2, 2016

This has been dormant for almost a year so I'm going to close this. Feel free to comment anyway if anyone feels strongly about something, or re-open the issue, or request that it be re-opened if you think it should be and don't have the permissions on the repo to do it.

@Trott Trott closed this as completed Sep 2, 2016
@leo
Author

leo commented Sep 19, 2016

@Trott Many people spent a lot of effort on this. IMHO, we should just keep it only so that people who'll be thinking about sth like this in the future don't repeat their thoughts again. 😊

(=> please re-open it and let it sit there)

@Trott Trott reopened this Sep 19, 2016
@Marak

Marak commented Oct 5, 2016

@leo

I've researched the APIs for this a bit now.

It seems to me the best way to expose the API is something like:

require('colors')('1.1.2')

With npm version 3, we now have available a flat folder of all installed packages and versions. Building in this additional require functionality with npm 3 should be relatively easy.

If someone was willing to make a new npm package that patches require, we can start experimenting with the functionality in user-land. If there is good adoption, we can broaden the discussion and see about potentially getting this into core.
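A user-land experiment along these lines does not need to patch require at all to get started. The sketch below is hypothetical throughout: `createVersionedRequire`, the `<pool>/<name>@<version>` folder layout and the `palette` package are assumptions made for the demo, and scoped package names (containing a slash) are not handled.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Returns name => version => module, loading from <poolDir>/<name>@<version>.
function createVersionedRequire(poolDir) {
  return (name) => (version) =>
    require(path.join(poolDir, `${name}@${version}`));
}

// Build a throwaway pool holding two versions of a made-up package.
const pool = fs.mkdtempSync(path.join(os.tmpdir(), 'pool-'));
for (const version of ['1.0.0', '1.1.2']) {
  const dir = path.join(pool, `palette@${version}`);
  fs.mkdirSync(dir);
  fs.writeFileSync(
    path.join(dir, 'index.js'),
    `module.exports = { version: '${version}' };\n`
  );
}

const versioned = createVersionedRequire(pool);
console.log(versioned('palette')('1.1.2').version); // 1.1.2
console.log(versioned('palette')('1.0.0').version); // 1.0.0
```

Because each version lives at its own path, node's module cache keeps the two copies apart, so both can be loaded in the same process.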

9 participants