Improve build tools #58
Codecov Report
@@ Coverage Diff @@
## master #58 +/- ##
==========================================
- Coverage 90.98% 88.65% -2.34%
==========================================
Files 19 25 +6
Lines 799 943 +144
==========================================
+ Hits 727 836 +109
- Misses 72 107 +35
Should probably run a build test of some kind in CI. CodeCov also seems to have choked on the tzdata folder rename.
src/tzdata/download.jl
Outdated
# Examples
```julia
julia> tzdata_url("2017a")
"http://www.iana.org/time-zones/repository/releases/tzdata2017a.tar.gz"
Example should be HTTPS. Doctest?
Good catch. I still need to switch this package over to Documenter.
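For illustration, the HTTPS form of the example above could look like the sketch below. The function body is an assumption reconstructed from the docstring output shown in the diff (only the scheme is changed), not the PR's actual implementation.

```julia
# Hypothetical sketch: build the download URL for a given tzdata version.
# The host and path are copied from the docstring example in the diff,
# with the scheme switched from HTTP to HTTPS as the review suggests.
function tzdata_url(version::AbstractString)
    return "https://www.iana.org/time-zones/repository/releases/tzdata$(version).tar.gz"
end

url = tzdata_url("2017a")
println(url)
```

Once the package moves to Documenter, an example like this in the docstring could be run as a doctest so the expected output cannot drift from the code.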
src/tzdata/download.jl
Outdated
if version == "latest"
    v = latest_version(now_utc)
    if !isnull(v)
        return joinpath(dir, "tzdata$(unsafe_get(v)).tar.gz")
I don't think `unsafe_get` exists on 0.5. This code path should probably be tested.
It doesn't exist on 0.5 but it is in Compat.jl. I added an import at the top of the file to make this clear.
src/tzdata/download.jl
Outdated
archive = Base.download(url, joinpath(dir, basename(url)))  # Overwrites the local file if any

# HTTP 404 Not Found can result in a empty file being created
if !isarchive(archive)
Maybe differentiate between an empty file and an invalid archive file? This would also trigger on a corrupt download, right?
The comment needs to be changed. If `download` encounters a 404 we will still generate a file, which could be empty or contain the site's 404 page. Corrupt downloads will also raise an exception, as the archive shouldn't pass the tests.
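One way to tell these cases apart, sketched below for current Julia 1.x: check the file size first, then look at the first two bytes, which for a gzip stream are `0x1f 0x8b`. The helper `classify_download` and its return symbols are hypothetical and are not the PR's `isarchive`.

```julia
# Gzip streams begin with the magic bytes 0x1f 0x8b (RFC 1952).
const GZIP_MAGIC = UInt8[0x1f, 0x8b]

# Hypothetical helper: classify a downloaded file without unpacking it.
# Returns :missing, :empty, :gzip, or :not_gzip.
function classify_download(path::AbstractString)
    isfile(path) || return :missing
    filesize(path) == 0 && return :empty       # e.g. a 404 that produced no body
    header = open(path, "r") do io
        read(io, 2)                            # at most the first two bytes
    end
    return header == GZIP_MAGIC ? :gzip : :not_gzip  # e.g. an HTML 404 page
end
```

A magic-byte check only shows that the file looks like gzip. A truncated or otherwise corrupt download can still pass it, so a full archive test is still needed before extraction.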
@@ -0,0 +1,71 @@
import Compat: @static, is_windows

"""
Document keyword argument
Done
Also: is Windows CodeCov possible?
deps/build.jl
Outdated
info("Successfully processed TimeZone data")
# ENV variable allows us to only download a single version during CI jobs
build(get(ENV, "JULIA_TZ_VERSION", "latest"))
shouldn't it be the other way around, pinned by default for reproducibility of what gets installed in the usual case, but set latest (maybe as a separate matrix job) on CI to detect problems?
I think people want the latest tz files always
Then make a new package release when there's a new tzdata file available. This has been the cause of repeated test failures over the last few years. Better to stick to something that's known to work correctly, and only release new versions after they've been tested.
> This has been the cause of repeated test failures over the last few years

The test failures you're referring to were mainly to do with counting the number of time zone abbreviations. That problem was fixed in 2a7a307.

> Better to stick to something that's known to work correctly, and only release new versions after they've been tested.

The tests are meant to ensure that the code is working correctly, not to test the tzdata. The compilation process of converting the tz source files into serialized Julia structs also ensures that the tzdata is parsed correctly. The reason for pinning the version of tzdata for testing was to make sure we were just testing for code changes and not data changes.
End-users will want accurate time zone information which would be the latest version. It is possible that the latest version of the tzdata could include something that causes the tz compilation process to break but this has never happened to date. Additionally, with this PR I've tested the tz compilation process against older versions including 1996n which demonstrates that the tz source format is stable and unlikely to change.
> End-users will want accurate time zone information which would be the latest version.
Then release new versions of the package when a new tzdata release comes out. We're talking at most a couple times a month? Users also want to be able to restore past versions of working code and reproduce results, which becomes unnecessarily difficult to do if you're downloading unversioned "latest" content from the internet at install time.
@tkelman Production code that gives correct results without having to update your package dependency tree (or possibly even your Julia version)
This is essentially a dichotomy between correct and reproducible.
it's also a question of library code vs application code. a library just needs to be correct at the time it was released, if it stops being correct due to an external resource changing, the solution is use a newer library version. applications or services that don't care about reproducibility or archival can configure the way they use libraries to depend on changing sources of external data, but a library shouldn't be forcing that choice on all users to give different results from running the same source code 6 months later.
Hmm. One possible solution is to have a separate package which only handles fetching TZ files. That way, people can pin/free that and this package independently.
Linux distributions do mostly have the tzdata packaged in a pretty minimal data-only package, from what I've seen
> One possible solution is to have a separate package which only handles fetching TZ files. That way, people can pin/free that and this package independently.
I think this is probably the ideal solution. Unfortunately, I won't be able to do this work right away, so here is what I'll do for the moment: I'll update this line from "latest" to "2017b" to deal with the reproducibility problem. Users who want to always use the most up-to-date version automatically can set the environment variable `JULIA_TZ_VERSION=latest`. I'll also be keeping a closer eye on tzdata releases so I can keep the default version up to date.
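The resulting behaviour (pinned by default, opt in to "latest" through the environment) can be sketched as follows. The names `DEFAULT_TZDATA_VERSION` and `tzdata_version` are hypothetical; the PR itself passes the result of `get(ENV, ...)` straight to `build`.

```julia
# Pinned default, as proposed in the comment above.
const DEFAULT_TZDATA_VERSION = "2017b"

# Hypothetical helper: choose the tzdata version from an environment-like
# mapping. Taking the mapping as an argument makes the choice easy to test.
function tzdata_version(env = ENV)
    return get(env, "JULIA_TZ_VERSION", DEFAULT_TZDATA_VERSION)
end

println(tzdata_version(Dict{String,String}()))                  # pinned default
println(tzdata_version(Dict("JULIA_TZ_VERSION" => "latest")))   # user opt-in
```

A CI matrix could then add one job with `JULIA_TZ_VERSION=latest` to detect problems with new tzdata releases while ordinary installs stay reproducible.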
Possible but not currently supported. I opened an issue for this.
1ec486f to c66fa47
Windows code coverage is too much trouble for this PR. I almost got it working on Julia 0.4 but newer versions of Julia are problematic: JuliaLang/julia#21289
src/winzone/WindowsTimeZoneIDs.jl
Outdated
open(translation_file, "w") do fp
    serialize(fp, translation)
    serialize(fp, compile(xml_file))
what's the motivation for using the Julia serializer here? that isn't really meant for long-term storage
Serialization is used here to avoid having to re-parse the XML during package loading. I don't strictly need to use serialization in this case.
I do use serialization for storing the TimeZone types. Can serialization change between Julia patch releases?
It shouldn't change between patch releases, but I'm not sure we have good tests ensuring that. It definitely changes between minor releases, and during the course of nightlies.
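Given that the format can change between minor releases, one defensive option is to record the Julia version next to the serialized payload and rebuild the cache when it differs. The sketch below targets current Julia 1.x, where `serialize` lives in the `Serialization` stdlib rather than in `Base`. The helpers `write_cache` and `read_cache` are hypothetical and not part of this PR.

```julia
using Serialization  # stdlib in Julia 1.x

# Hypothetical helper: write a plain-text Julia version header, then the payload.
function write_cache(path::AbstractString, data)
    open(path, "w") do io
        println(io, VERSION)   # readable even if the serializer format changes
        serialize(io, data)
    end
    return path
end

# Hypothetical helper: reuse the cache only when it was written by the same
# Julia version; otherwise call `rebuild()` and overwrite the cache.
function read_cache(rebuild, path::AbstractString)
    if isfile(path)
        cached = open(path, "r") do io
            readline(io) == string(VERSION) ? Some(deserialize(io)) : nothing
        end
        cached === nothing || return something(cached)
    end
    data = rebuild()
    write_cache(path, data)
    return data
end
```

Comparing the full version string is deliberately strict: it also rebuilds across patch releases, trading a little rebuild time for not having to rely on untested format stability.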
0d6e454 to 5770279
c5d66c0 to 73b9d7d
c3dcf65 to abc0fc5
Code is finally in a state I am more or less happy with. The code coverage drop is mostly from the introduction of more Windows specific code. PR #60 should help with that.
Numerous improvements around the TimeZones.jl build process including:
- build.jl code into the module

Additional work to complete:
- Periodically allow downloads of windowsZones.xml (currently requires force) (Support retrieval of older Windows translation XML #62)
- Support retrieval of older versions of windowsZones.xml. There may be incompatibilities between old tzdata versions using the latest version (Support retrieval of older Windows translation XML #62)
- Avoid using Ref for keeping track of latest tzdata version and timestamp
- TimeZones.build
- translation XML at module load

Fixes #55