Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Precompile should check staleness using a hash instead of a timestamp #17845

Closed
tlnagy opened this issue Aug 5, 2016 · 10 comments
Closed

Precompile should check staleness using a hash instead of a timestamp #17845

tlnagy opened this issue Aug 5, 2016 · 10 comments
Labels
compiler:precompilation Precompilation of modules

Comments

@tlnagy
Copy link
Contributor

tlnagy commented Aug 5, 2016

As discussed in JuliaStats/StatsBase.jl#202, alternating Pkg.add(X) and using X statements can lead to errors. According to @tkelman, this could be fixed by making require check staleness by looking at the timestamp instead of a hash.

Example:

$ rm -rf ~/.julia/v0.4
$ rm -rf ~/.julia/lib/v0.4
$ julia -e "Pkg.update()"
INFO: Initializing package repository /Users/tamasnagy/.julia/v0.4
INFO: Cloning METADATA from git://github.com/JuliaLang/METADATA.jl
INFO: Updating METADATA...
INFO: Computing changes...
INFO: No packages to install, update or remove
$ julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.4.5 (2016-03-18 00:58 UTC)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |  x86_64-apple-darwin15.4.0

julia> Pkg.add("StatsBase")
INFO: Installing BinDeps v0.4.1
INFO: Installing Compat v0.8.6
INFO: Installing Rmath v0.1.2
INFO: Installing SHA v0.2.0
INFO: Installing StatsBase v0.9.0
INFO: Installing StatsFuns v0.3.0
INFO: Installing URIParser v0.1.5
INFO: Building Rmath
INFO: Package database updated

julia> using StatsBase
INFO: Precompiling module StatsBase...

julia> Pkg.add("Distributions")
INFO: Installing Calculus v0.1.15
INFO: Installing Distributions v0.10.0
INFO: Installing PDMats v0.4.2
INFO: Building Rmath
INFO: Package database updated

julia> using Distributions
INFO: Precompiling module Distributions...
INFO: Recompiling stale cache file /Users/tamasnagy/.julia/lib/v0.4/Rmath.ji for module Rmath.
INFO: Recompiling stale cache file /Users/tamasnagy/.julia/lib/v0.4/StatsFuns.ji for module StatsFuns.
INFO: Recompiling stale cache file /Users/tamasnagy/.julia/lib/v0.4/StatsBase.ji for module StatsBase.
INFO: Recompiling stale cache file /Users/tamasnagy/.julia/lib/v0.4/Distributions.ji for module Distributions.
WARNING: Module Rmath uuid did not match cache file
  This is likely because module Rmath does not support  precompilation but is imported by a module that does.
ERROR: __precompile__(true) but require failed to create a precompiled cache file
 in require at /usr/local/Cellar/julia/0.4.5/lib/julia/sys.dylib
@tkelman tkelman added the compiler:precompilation Precompilation of modules label Aug 5, 2016
@tkelman
Copy link
Contributor

tkelman commented Aug 5, 2016

Pkg check staleness

staleness check happens in using (actually require, which is also used by import), not Pkg

@tlnagy
Copy link
Contributor Author

tlnagy commented Aug 5, 2016

Thanks! I updated the original post.

@stevengj
Copy link
Member

stevengj commented Aug 5, 2016

I see, the problem here is that the timestamp has changed because Pkg.build("Rmath") regenerates its deps.jl file, but the code hasn't actually changed so the rebuild (which triggers #12508) is not actually needed.

As a workaround, @BinDeps.install could overwrite the deps.jl file only if it has changed.

@tkelman
Copy link
Contributor

tkelman commented Aug 5, 2016

that was a workaround I suggested in the StatsBase issue, but I think this could happen for other reasons and might be good for making .ji files easier to redistribute, if the checksum is cheap enough. x-ref #12458 (comment)

@vchuravy
Copy link
Member

vchuravy commented Aug 5, 2016

Making staleness depend on the hash would be great for users who work on distributed system (supercomputers and such). @andreasnoack and I have been running into problems sometimes.

@stevengj
Copy link
Member

Here is a nice-looking hardware-accelerated BSD-licensed CRC32c implementation in C by Mark Adler: http://stackoverflow.com/questions/17645167/implementing-sse-4-2s-crc32c-in-software/1

@stevengj
Copy link
Member

And here is a benchmark of different CRC routines, concluding that Adler's routine is pretty much the fastest.

@vtjnash
Copy link
Member

vtjnash commented Aug 11, 2016

I've always been fairly against doing this with checksums. I agree there needs to be more development here, but I haven't been able to develop a complete proposal yet to address these and other similar issues.

@stevengj
Copy link
Member

stevengj commented Aug 11, 2016

@vtjnash, doing if !timestamps_match && !checksums_match then regenerate_cache seems like a strict improvement over if !timestamps_match (if the time for the checksum is negligible, which seems likely), by reducing the chance of false positives. What is the nature of your objection?

@stevengj
Copy link
Member

Should this be closed in light of the discussion at #18127?

@vtjnash vtjnash closed this as completed Aug 25, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:precompilation Precompilation of modules
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants