Running Haskell script with Stack uses multiple versions of one package #1957
Comments
interestingly following your steps i can't reproduce this error, and i have the same version of stack
is there something else going on? can you provide more information about your system? |
It's reasonably up-to-date (updated yesterday) Arch Linux.
PostgreSQL version 9.5.1-2.
flags: {}
packages: []
extra-deps: []
resolver: lts-3.7 |
I tried it on a fresh Ubuntu VM, couldn't reproduce it either. I was able to reproduce it in a separate directory on Arch. I'm going to work some more to do so. Maybe add a |
Updating the resolver in I don't want to close this until we figure out what went wrong, though. There is clearly a bug somewhere. I opened a merge request at Snowdrift to update the resolver: https://git.gnu.io/snowdrift/snowdrift/merge_requests/177. |
I don't think there's necessarily a bug here. The root cause is that AFAIK, stack only has duplicate versions of the package when it exists in the global DB. Have you used cabal with your package databases? The shadowing of package DBs should cause a global DB version of text to fail, though. What does Have you used cabal-install with your package DBs? That's one way this could happen. |
As the author of the sdb.hs in question, let me know if there's something I could do differently. Can I force stack to use the package-local sandbox when run as a shebang interpreter? This script is only designed to be used with the Snowdrift project anyway. (Though I suppose it might be useful to other Yesoders... it makes a ton of sense to me that people would use a local cluster for dev, rather than depending on a system-level instance of Postgres.)
|
I just ran into this myself.
Verbose output of
Weirdly,
yet
Now, I have no idea why I have Thanks. |
It really should. On the surface, it looks like ghc is preferring the global DB, for some bizarre reason. I asked some diagnostic questions in an early comment, and got no reply. I will reiterate them. Anyone affected by this, feel free to provide the output:
Furthermore, are you using a system install of ghc? I have a suspicion that this is much less likely to happen with stack managed ghc installs, because you're less likely to use cabal-install on the global DB |
Ah, I didn't catch that comment. What is |
Oh I meant |
Ah, here you go:
I do have a system |
Here's what happens when I run
I'm also on Arch Linux, FWIW. I have used this installation since before stack existed, so there are several packages installed with For me, updating the resolver fixed this for some reason. No idea why. |
@pharpend Doesn't seem to be a problem there; |
Hmm I believe your compiler changed from 7.10.2 to 7.10.3 when updating the resolver from lts-3.7 to lts-5ish and therefore different global pkgdb testing instead
indeed! curious to see if |
On Tue, Apr 19, 2016 at 10:16:03PM -0700, Luigy Leon wrote:
Oh, that's probably it! I managed to fix the issue, though. It seems like a |
I was just lurking this thread, and it doesn't seem like this got resolved. (Well, @pharpend seems to have fixed their problem but it seems like @mitchellwrosen did not.) Am I misreading something? Personally, I've run into this many times. I'm in a stack project, and I write a quick script to test some of my library functions, and then run it with |
On 07/18/2016 11:42 PM, Sid Kapur wrote:
It might be worth looking into Snowdrift's SDB: They use Stack for everything. |
This is pretty trivial to reproduce. I have a global project using lts-6.7 as the resolver, with a different version of text as an extra dep. The lts-6.7 snapshot has text-1.2.2.1, so I put text-1.2.2.0 as an extra dep.
Now run this script:
import Turtle
import qualified Data.Text as T
main :: IO ()
main = sh $ echo $ T.pack "hello"
In fact we can just run ghc with appropriate GHC_PACKAGE_PATH (retrieved by running
This is what seems to be happening:
Another weird behavior that I see is that using Are these bugs in GHC or something that I am missing? There seems to be something more to picking packages from databases than just GHC_PACKAGE_PATH. |
We can fix this by passing specific versions of packages to runghc instead of relying on GHC_PACKAGE_PATH but the ghc behavior with respect to that needs to be explained. |
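To make the two candidate behaviors concrete, here is a toy model in plain Haskell. It is only a sketch of the hypothesis discussed in this thread, not GHC's actual algorithm: the `Pkg` type, both selection functions, and the database layout are made up for illustration. It contrasts "first match in GHC_PACKAGE_PATH order" with "newest version found in any database".

```haskell
import Data.List (maximumBy)
import Data.Maybe (listToMaybe)
import Data.Ord (comparing)

-- A toy package entry: name and version only (no real package IDs).
data Pkg = Pkg { pkgName :: String, pkgVersion :: [Int] }
  deriving (Eq, Show)

-- A database is a list of packages; the list of databases is ordered
-- with the highest-priority database first.
type Db = [Pkg]

-- Policy 1: take the first match in database order.
firstInPathOrder :: String -> [Db] -> Maybe Pkg
firstInPathOrder name dbs =
  listToMaybe [p | db <- dbs, p <- db, pkgName p == name]

-- Policy 2: take the highest version found in any database.
newestAnywhere :: String -> [Db] -> Maybe Pkg
newestAnywhere name dbs =
  case [p | db <- dbs, p <- db, pkgName p == name] of
    [] -> Nothing
    ps -> Just (maximumBy (comparing pkgVersion) ps)

-- Hypothetical layout mirroring the report: the extra-dep version is
-- in the first database, the snapshot version in the second.
exampleDbs :: [Db]
exampleDbs =
  [ [Pkg "text" [1, 2, 2, 0]]
  , [Pkg "text" [1, 2, 2, 1]]
  ]

main :: IO ()
main = do
  print (firstInPathOrder "text" exampleDbs)
  print (newestAnywhere "text" exampleDbs)
```

If GHC behaved like the second policy, a script would see text-1.2.2.1 even though the first database holds text-1.2.2.0, which matches the symptom described here.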
So, you've compiled Looking at the output of
So |
In my second invocation stack is not in the picture at all; I am using ghc directly with an explicit GHC_PACKAGE_PATH. So there is no confusion about whether stack is doing something wrong. One possibility is that ghc is trying to pick a newer version of a package from all databases without considering that another package depends on an older version. So it uses text-1.2.2.1 from the second db in the path even though the first db has text-1.2.2.0. About the
So you might be interpreting the |
Ah yes, I was interpreting the output of So, looking at the output of |
|
Current mechanism of using GHC_PACKAGE_PATH for runghc and ghc commands does not seem to work well when we have multiple versions of the same package. GHC does not always pick up the packages in the same order as GHC_PACKAGE_PATH. This fix determines the package-ids using ghc-pkg and then passes package-ids on command line of ghc or runghc invocation. This works only when the user explicitly passes --package to runghc or ghc commands. When --package is not specified we have no easy way to determine what all packages will be used by the file being compiled. This will make sure that scripts which explicitly list all or multi-instance packages will always run reliably. fixes #1957 (Requires all packages to be listed explicitly)
@ezyang Would you know if
|
I sent an email to ghc-devs as well. |
Relevant code from ghc-pkg:
Relevant GHC code
Haven't read the rest of this ticket yet. |
OK this has nothing to do with I think there may have been behavior changes in 7.10 (and again in 8.0) because 7.10 introduced package keys and 8.0 got rid of them. |
Current mechanism of using GHC_PACKAGE_PATH for runghc and ghc commands does not seem to work well when we have multiple versions of the same package. GHC does not always pick up the packages in the same order as GHC_PACKAGE_PATH. This fix determines the package-ids using ghc-pkg and then passes package-ids on command line of ghc or runghc invocation. This works only when the user explicitly passes --package to runghc or ghc commands. When --package is not specified we have no easy way to determine what all packages will be used by the file being compiled. This will make sure that scripts which explicitly list all packages will always run reliably even in presence of packages which have multiple instances of the same version or multiple versions installed. fixes #1957 (Requires all packages to be listed explicitly)
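The command-line half of that fix can be sketched as a small pure function. This is not the code from the commit: the `Resolved` type, the function name, and the package ID below are hypothetical, and the step that asks ghc-pkg for the IDs is assumed to have happened already. The only real piece is GHC's `-package-id` flag.

```haskell
-- Hypothetical pair: a package name and the installed package ID
-- that was looked up for it beforehand (for example via ghc-pkg).
type Resolved = (String, String)

-- Turn resolved packages into GHC arguments, one -package-id flag
-- per package, so GHC is told exactly which instance to use.
packageIdArgs :: [Resolved] -> [String]
packageIdArgs = concatMap (\(_, pid) -> ["-package-id", pid])

main :: IO ()
main =
  -- The ID is a made-up placeholder, not a real installed package ID.
  mapM_ putStrLn (packageIdArgs [("text", "text-1.2.2.0-PLACEHOLDER")])
```

As the commit message says, this only helps for packages the user names with --package; anything else is still resolved by GHC on its own.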
@ezyang
The manual does not seem to have answers for these or is incorrect. I have not looked at the code yet. Will take a look soon. |
Very interesting, this explains a lot. It's too bad that ghc doesn't have what we need. I guess this is just one more way scripts can break if they lack --standalone (to-be-implemented, should be straightforward) |
|
Thanks a lot @ezyang for explaining the behavior in detail. I think what we need to do is to fix the documentation to make it explicit. I read the documentation carefully but did not get any clear answers from that. The documentation of GHC_PACKAGE_PATH gives an impression that the order is important which does not seem to be the case as per your explanation.
In fact it should not have been called PATH because that has a connotation of order. |
The order is important for shadowing (packages on the top of the stack shadow packages lower down). But shadowing only occurs when two packages have the same installed package ID. In the common case, this never happens. So, I think there was a point in time (7.8 and earlier) when this explanation did make sense, because back then shadowing was computed by package ID rather than package key (7.10) or installed package ID (8.0). But this was quite miserable because you basically could never have multiple copies of the same version of a package in the same database; to manage it, you'd need separate databases for each package. (There was a technical reason behind this too: the symbol name only incorporated package name and version, so GHC could get REALLY confused if you had two of these in the database and visible; it'd think types were equal when they shouldn't be.) |
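A toy model of the shadowing rule described above, assuming shadowing is keyed on the installed package ID alone. The `Installed` type, the `visible` function, and all IDs are invented for illustration; this is not GHC's implementation.

```haskell
import Data.Function (on)
import Data.List (nubBy)

data Installed = Installed
  { ipId      :: String  -- installed package ID (made-up values below)
  , ipName    :: String
  , ipVersion :: String
  } deriving (Eq, Show)

-- Databases are ordered top of the stack first. nubBy keeps the first
-- occurrence, so a higher entry shadows a lower one only when both
-- carry the same installed package ID.
visible :: [[Installed]] -> [Installed]
visible = nubBy ((==) `on` ipId) . concat

-- Hypothetical stack: the lower database repeats one ID from the top
-- database and also holds a different version with its own ID.
exampleStack :: [[Installed]]
exampleStack =
  [ [Installed "text-1.2.2.0-aaa" "text" "1.2.2.0"]
  , [ Installed "text-1.2.2.0-aaa" "text" "1.2.2.0"
    , Installed "text-1.2.2.1-bbb" "text" "1.2.2.1"
    ]
  ]

main :: IO ()
main = mapM_ print (visible exampleStack)
```

Only the entry with the duplicated ID is shadowed. Both versions of text stay visible, so database order alone does not decide which one a script gets.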
For what it's worth, I think one could plausibly argue that GHC 8.2 should implement behavior along the lines of, "If there are multiple packages with the same name and version, if a user says |
Yeah we can externally determine the set of exact package-ids to use. But it requires invoking other programs like ghc-pkg, which is not usually a big deal, but it would be more convenient if ghc itself had a simple way of achieving the same thing since it goes through the dbs anyway. For
One of the problems in running scripts reliably is to specify the consistent versions of all packages used in the script. One way to achieve that would be to first determine all dependencies of the script and then pass correct versions of each package to GHC by first determining those externally. I do not know of a convenient way to determine the packages used (GHC API?, ghc -M?) in the script. On the other hand, if we have a way to tell ghc to always pick the packages in a certain order (e.g. GHC_PACKAGE_PATH) then this problem will be solved nicely for |
I edited my comment to s/LAST/TOP/. For command line The way they are combined is specified by this function:
So I believe
It could.
I don't think this question is well-formed. What do you have to work with? Do you have a Stackage revision and a list of package names? Just some package names? Some package names with dependency bounds? If you have a Stackage revision, shouldn't stack know what the
Where can I learn about Stack's package database organization? I don't know how it works so it is difficult for me to interpret this statement. |
Sorry about that, let me elaborate it a bit. This is in context of running a script using When we run a script we set Now since GHC does not pick packages in the db order what I have done is to figure out package-ids of the packages that we want to use and pass it on the command line to force GHC to use the right packages. But to be able to do that we need to know all the package names that the script needs so that we can pass all the package-ids on command line. We do not have that information in general, we have it only when the user explicitly specifies all packages using When the user does not explicitly specify the packages, how do we figure out the package names? Remember this is just a plain Haskell source file and there is no associated cabal file. So either we extract the package names from the script so that we can determine and pass correct package-ids or ghc does the work for us in choosing the packages in the order we want it to. With what you proposed above ghc itself should be able to pick the right packages for us the way we want. |
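For the "extract the package names from the script" option, a very naive first step could look like the sketch below. It only collects imported module names by scanning lines; the function name is hypothetical, and it ignores CPP, block comments, package-qualified imports, and indented or multi-line imports. Mapping modules to packages is a separate step (for example by querying ghc-pkg) and is not shown.

```haskell
import Data.List (nub)

-- Collect module names from lines whose first word is "import".
-- Deliberately naive; see the caveats in the text above.
importedModules :: String -> [String]
importedModules src = nub
  [ takeWhile (/= '(') m
  | l <- lines src
  , ("import" : rest) <- [words l]
  , m <- take 1 (dropWhile (`elem` ["qualified", "safe"]) rest)
  ]

-- The reproduction script from earlier in this thread.
exampleScript :: String
exampleScript = unlines
  [ "import Turtle"
  , "import qualified Data.Text as T"
  , "main :: IO ()"
  , "main = sh $ echo $ T.pack \"hello\""
  ]

main :: IO ()
main = print (importedModules exampleScript)
```

Even with the module list in hand, a module can be provided by more than one package, so this does not remove the ambiguity by itself.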
So, GHC got a new feature called "environment files" on January 5th this year; commit aa699b94e3a8ec92bcfa8ba3dbd6b0de15de8873 which I think is specifically targeted at your use-case. They were released with 8.0 it seems, and are documented here: https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/packages.html#package-environments I think the model is, as Stack builds and installs packages into the snapshot / other package db, it updates the corresponding environment file. Then a user simply gets to see all the packages in the environment file. I guess this is not exactly what you are asking for. But is it close enough? |
This is pretty similar to what we are doing. The environment file allows us to put the command line stuff in a file, which can be useful if the command line gets too long for the shell to handle for example. The unsolved problem as of now is what I wrote in the last para of my previous update. Since we have no good way of extracting imported packages from a Haskell file, we would want the problem to be solved in a way so that we do not have to do that. That is, influence GHC to choose packages in the way we want. I think what you proposed here will solve this. |
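As a rough sketch of what generating such an environment file could look like: the function below renders the file's text from a list of database paths and installed package IDs. The directive names follow my reading of the users guide section linked above; the function name, the paths, and the IDs are hypothetical, and this is not what stack does today.

```haskell
-- Render the text of a package environment file. Each database and
-- each package ID becomes one directive line.
renderEnvFile :: [FilePath] -> [String] -> String
renderEnvFile dbs pkgIds = unlines $
  ["clear-package-db", "global-package-db"]
    ++ map ("package-db " ++) dbs
    ++ map ("package-id " ++) pkgIds

main :: IO ()
main =
  -- Placeholder path and ID, for illustration only.
  putStr (renderEnvFile ["/path/to/snapshot/pkgdb"] ["text-1.2.2.0-PLACEHOLDER"])
```

This has the same limitation as passing the IDs on the command line: something still has to decide which package IDs go into the file.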
Is having an environment file equivalent to passing in a bunch of Another option would be to just decide that it's rather unprincipled to run ghc without explicitly restricting the set of packages to what's needed for the script. In #1388 , a proposed |
That's like specifying the whole db in the environment file. It might work if the implementation is designed for, or is efficient enough for, thousands of packages. Making |
I like it too! The main issue is backwards compatibility, as this will break people's scripts. I'm not sure if that should hold it back, though, as the current behavior has all of the following downsides:
|
I agree with all your points. I was thinking of a solver-like command for scripts; given a script it will automatically dump a Also once #1944 is fixed then |
Hey @harendra-kumar , could you please open a ticket on GHC Trac (CC me) with a spec of the functionality requested (based on the discussion on this thread)? In particular, I am not sure if you wanted (1) GHC automatically prefers the package on top if the versions are the same, or (2) a new mode for handling |
I will raise a GHC trac ticket with specific details and cc you. |
The GHC issue is now tracked via a GHC trac ticket. |
Hello, there
In Snowdrift, we have a database setup script called sdb.hs. It uses stack along with the turtle package to run as a quasi-shell script.
When I tried to run it today to start the database server, I got this error message:
As you can see, stack is using multiple versions of text, and coming up with this ridiculous error.
Bulleted information:
stack --version
: