Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve build tools #58

Merged
merged 34 commits into from
Apr 7, 2017
Merged

Improve build tools #58

merged 34 commits into from
Apr 7, 2017

Conversation

omus
Copy link
Member

@omus omus commented Apr 4, 2017

Numerous improvements around the TimeZones.jl build process including:

  • Move build.jl code into the module
  • Allowing users to optionally specify what version of tzdata to use
  • Reduce requests to IANA and Unicode servers

Additional work to complete:

Fixes #55

@omus omus self-assigned this Apr 4, 2017
@codecov-io
Copy link

codecov-io commented Apr 4, 2017

Codecov Report

Merging #58 into master will decrease coverage by 2.33%.
The diff coverage is 61.39%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #58      +/-   ##
==========================================
- Coverage   90.98%   88.65%   -2.34%     
==========================================
  Files          19       25       +6     
  Lines         799      943     +144     
==========================================
+ Hits          727      836     +109     
- Misses         72      107      +35
Impacted Files Coverage Δ
src/utils.jl 97.56% <ø> (+2.1%) ⬆️
src/accessors.jl 100% <ø> (ø) ⬆️
src/types.jl 97.64% <ø> (ø) ⬆️
src/tzdata/timeoffset.jl 100% <ø> (ø)
src/interpret.jl 100% <ø> (ø) ⬆️
src/local.jl 82.6% <0%> (+3.44%) ⬆️
src/winzone/WindowsTimeZoneIDs.jl 0% <0%> (ø)
src/tzdata/version.jl 100% <100%> (ø)
src/tzdata/compile.jl 93.77% <100%> (ø)
src/tzdata/TZData.jl 100% <100%> (ø)
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0c55948...0bc92df. Read the comment docs.

Copy link
Member

@iamed2 iamed2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably run a build test of some kind in CI. CodeCov also seems to have choked on the tzdata folder rename.

# Examples
```julia
julia> tzdata_url("2017a")
"http://www.iana.org/time-zones/repository/releases/tzdata2017a.tar.gz"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example should be HTTPS. Doctest?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I need to switch this package over to Documenter yet

if version == "latest"
v = latest_version(now_utc)
if !isnull(v)
return joinpath(dir, "tzdata$(unsafe_get(v)).tar.gz")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think unsafe_get exists on 0.5. This code path should probably be tested.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't exist on 0.5 but it is in Compat.jl. I added an import at the top of the file to make this clear.

archive = Base.download(url, joinpath(dir, basename(url))) # Overwrites the local file if any

# HTTP 404 Not Found can result in a empty file being created
if !isarchive(archive)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe differentiate between an empty file and an invalid archive file? This would also trigger on a corrupt download, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment needs to be changed. If download encounters a 404 we will still generate a file which could be empty or contain the sites 404 page. Corrupt downloads will also raise an exception as the archive shouldn't pass tests.

@@ -0,0 +1,71 @@
import Compat: @static, is_windows

"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document keyword argument

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@iamed2
Copy link
Member

iamed2 commented Apr 4, 2017

Also: is Windows CodeCov possible?

deps/build.jl Outdated

info("Successfully processed TimeZone data")
# ENV variable allows us to only download a single version during CI jobs
build(get(ENV, "JULIA_TZ_VERSION", "latest"))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't it be the other way around, pinned by default for reproducibility of what gets installed in the usual case, but set latest (maybe as a separate matrix job) on CI to detect problems?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think people want the latest tz files always

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then make a new package release when there's a new tzdata file available. This has been the cause of repeated test failures over the last few years. Better to stick to something that's known to work correctly, and only release new versions after they've been tested.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been the cause of repeated test failures over the last few years

The test failures you're referring to were mainly to do with counting the number of time zone abbreviations. That problem was fixed in 2a7a307.

Better to stick to something that's known to work correctly, and only release new versions after they've been tested.

The tests are meant to ensure that the code is working correctly and not test the the tzdata. The compilation process of converting the tz source files into serialized Julia structs also ensures that the tzdata is parsed correctly. The reason for pinning the version of tzdata for testing was to make sure we were just testing for code changes and not data changes.

End-users will want accurate time zone information which would be the latest version. It is possible that the latest version of the tzdata could include something that causes the tz compilation process to break but this has never happened to date. Additionally, with this PR I've tested the tz compilation process against older versions including 1996n which demonstrates that the tz source format is stable and unlikely to change.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

End-users will want accurate time zone information which would be the latest version.

Then release new versions of the package when a new tzdata release comes out. We're talking at most a couple times a month? Users also want to be able to restore past versions of working code and reproduce results, which becomes unnecessarily difficult to do if you're downloading unversioned "latest" content from the internet at install time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tkelman Production code that gives correct results without having to update your package dependency tree (or possibly even your Julia version)

This is essentially a dichotomy between correct and reproducible.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's also a question of library code vs application code. a library just needs to be correct at the time it was released, if it stops being correct due to an external resource changing, the solution is use a newer library version. applications or services that don't care about reproducibility or archival can configure the way they use libraries to depend on changing sources of external data, but a library shouldn't be forcing that choice on all users to give different results from running the same source code 6 months later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. One possible solution is have a separate package which only handles fetching TZ files. That way, people can pin/free that and this package independently.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linux distributions do mostly have the tzdata packaged in a pretty minimal data-only package, from what I've seen

Copy link
Member Author

@omus omus Apr 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possible solution is have a separate package which only handles fetching TZ files. That way, people can pin/free that and this package independently.

I think this is probably the ideal solution. Unfortunately, I won't be able to do this work right away so this is what I'll do for the moment. For now, I'll update this line from "latest" to "2017b" to deal with the reproducibility problem. Users who want to always use the most up to date version automatically can currently set an env variable JULIA_TZ_VERSION=latest. I'll also be keeping a closer eye on tzdata releases so I can keep the default version up to date.

@omus
Copy link
Member Author

omus commented Apr 5, 2017

Also: is Windows CodeCov possible?

Possible but not currently supported. I opened an issue for this.

@omus omus force-pushed the tzdata branch 3 times, most recently from 1ec486f to c66fa47 Compare April 5, 2017 19:44
@omus
Copy link
Member Author

omus commented Apr 5, 2017

Window code coverage is too much trouble for this PR. I almost got it working on Julia 0.4 but newer versions of Julia are problematic: JuliaLang/julia#21289

open(translation_file, "w") do fp
serialize(fp, translation)
serialize(fp, compile(xml_file))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the motivation for using the Julia serializer here? that isn't really meant for long-term storage

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Serialization is used here to avoid having to re-parse the XML during the package loading. I don't need to strictly need to use serialization in this case.

I do use serialization for storing the TimeZone types. Can serialization change between Julia patch releases?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't change between patch releases, but I'm not sure we have good tests ensuring that. It definitely changes between minor releases, and during the course of nightlies.

@omus omus mentioned this pull request Apr 6, 2017
@omus omus force-pushed the tzdata branch 4 times, most recently from 0d6e454 to 5770279 Compare April 6, 2017 21:17
@omus omus force-pushed the tzdata branch 2 times, most recently from c5d66c0 to 73b9d7d Compare April 7, 2017 16:27
@omus omus force-pushed the tzdata branch 2 times, most recently from c3dcf65 to abc0fc5 Compare April 7, 2017 18:07
@omus
Copy link
Member Author

omus commented Apr 7, 2017

Code is finally in a state I am more or less happy with. The code coverage drop is mostly from the introduction of more Windows specific code. PR #60 should help with that.

@omus omus merged commit d183a73 into master Apr 7, 2017
@omus omus deleted the tzdata branch April 12, 2017 13:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants