Download test data from an external source during build #1588
The LGTM test fails with the following log messages. The build environment might not have access to the outside world to download the repo.
As documented here, LGTM doesn't support For a workaround, I modified |
Don't the git objects still exist in the repo? I understand that this would adjust the size of the working directory (the files currently checked out), but the repo-clone and on-disk storage would not be affected by this patch, right? |
@jaredgrubb That's correct. The largest directories are To reduce |
@nlohmann : Just to confirm that after this PR happens, we will rewrite the repo? I personally don't object, I just want to make sure that is the PoR, or else this patch really doesn't add any benefit. |
Is this PR useful on its own?
Very good point! I agree that the PR would be less effective if we are not going to reduce the size of
But still, I agree that removing test/benchmark data from the git history would be the way to go.
Rewriting history is probably a bad idea
Regarding the git history rewrite, I think I was too optimistic initially. It could be controversial, to say the least. Leaving contributors aside, the most significant side effect of rewriting history for end users is that all commit hashes will change. This invalidates every existing reference based on a commit hash, which is problematic. For instance, in the Yocto Project (a Linux distribution builder), a library can be pinned to a version using its commit hash (git tags are also supported). They have pinned version 3.3.0 by setting this hash:
A history rewrite would force them to change this commit. It would also permanently invalidate all links like the one above. Put simply, rewriting the history would affect not only the contributors, but also the users and scripts that rely on the current commit hashes.
Possible solution
A safer option would be:
This can also be done the other way around to keep the repository's stars and followers: Rename the current repo to In this way, all of the previous history with the original commit IDs will be kept as before and will remain available as long as desired. The legacy repo can warn users to use the new repo for the latest releases. Any newcomer or current user who is willing to upgrade can switch to the new repo. I believe this is the most viable and least intrusive solution for significantly reducing the size of the git history. The questions of when this should be done and whether it is worth it are up to @nlohmann and the other contributors to discuss and answer. |
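To illustrate why a history rewrite changes every commit hash, here is a minimal Python sketch. It relies on two facts about git: an object ID is the SHA-1 of a short header plus the object body, and a commit body names the ID of its parent. The tree IDs, the author line, and the commit messages below are placeholders made up for the illustration, not data from this repository.

```python
import hashlib
from typing import List, Optional


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 of '<kind> <size>\\0' followed by the body, as git computes it."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree: str, parent: Optional[str], message: str) -> str:
    """Build a commit body and hash it. The identity line is a placeholder."""
    lines = [f"tree {tree}"]
    if parent is not None:
        lines.append(f"parent {parent}")
    lines.append("author A U Thor <author@example.com> 0 +0000")
    lines.append("committer A U Thor <author@example.com> 0 +0000")
    body = "\n".join(lines) + "\n\n" + message + "\n"
    return git_object_id("commit", body.encode())


def build_history(trees: List[str]) -> List[str]:
    """Chain one commit per tree ID and return the commit IDs in order."""
    parent = None
    ids = []
    for index, tree in enumerate(trees):
        parent = commit_id(tree, parent, f"commit {index}")
        ids.append(parent)
    return ids


# Placeholder tree IDs (hypothetical; real trees hash their directory entries).
TREE_WITH_DATA = hashlib.sha1(b"first commit, test data included").hexdigest()
TREE_WITHOUT_DATA = hashlib.sha1(b"first commit, test data removed").hexdigest()
TREE_LATER_1 = hashlib.sha1(b"second commit").hexdigest()
TREE_LATER_2 = hashlib.sha1(b"third commit").hexdigest()

original = build_history([TREE_WITH_DATA, TREE_LATER_1, TREE_LATER_2])
rewritten = build_history([TREE_WITHOUT_DATA, TREE_LATER_1, TREE_LATER_2])

for old, new in zip(original, rewritten):
    print(old[:10], "->", new[:10])
```

Only the first commit's tree differs between the two histories, yet the second and third commit IDs change as well, because each commit hashes its parent's ID. This is why any pinned hash would stop resolving after a rewrite.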
Just want to say that even if the repo remains as-is, I'd want to have release zips with source code which contain no testing data. This for me would be very useful and a good compromise. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
yeah, i dont really worry about repo size. as long as release remains small. |
The problem that remains for me is that release zip doesn't contain CMakeLists.txt, so it's hard to integrate the library into CMake build process (you have to make the target yourself) |
Fixes #96.
This is an attempt to reduce the repository size by moving the test data files in test/data to an external repository. This can cut the size down significantly, by roughly 44% (from ~435 MB to ~245 MB).
To make the build process seamless and minimize the effect on users, the build scripts download the external repository automatically, and only when the user wants to build the tests, with either cmake or make. Therefore, this separation should be transparent to regular users.
A similar PR can be considered for the files in benchmarks/data later.
Note: For now, the test data is downloaded from my personal repository. It should be moved into a new repository owned by @nlohmann before merging this PR.
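The download-on-demand behavior described above can be sketched as follows. This is a Python illustration of the decision logic only, not the CMake or make code from this PR; the function name ensure_test_data, the directory name json_test_data, and the fake_fetch stand-in (used here instead of a real clone or download) are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_test_data(
    data_dir: Path, build_tests: bool, fetch: Callable[[Path], None]
) -> bool:
    """Fetch test data only when tests are requested and the data is missing.

    Returns True if a fetch was performed, False otherwise.
    """
    if not build_tests:
        # Regular users who do not build the tests never trigger a download.
        return False
    if data_dir.is_dir() and any(data_dir.iterdir()):
        # Data is already present from an earlier build; reuse it.
        return False
    data_dir.mkdir(parents=True, exist_ok=True)
    fetch(data_dir)
    return True


def fake_fetch(target: Path) -> None:
    """Stand-in for the real download step (an assumption for this sketch)."""
    (target / "sample.json").write_text("{}")


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "json_test_data"
    skipped = ensure_test_data(data_dir, build_tests=False, fetch=fake_fetch)
    first = ensure_test_data(data_dir, build_tests=True, fetch=fake_fetch)
    second = ensure_test_data(data_dir, build_tests=True, fetch=fake_fetch)
    print(skipped, first, second)
```

The point of the design is that the download cost is paid at most once, and only by those who build the tests. As noted in the discussion, a build environment without outside network access would still fail at the fetch step.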
Pull request checklist
Read the Contribution Guidelines for detailed information.
include/nlohmann directory, run make amalgamate to create the single-header file single_include/nlohmann/json.hpp. The whole process is described here.
Please don't
#ifdefs or other means.