Populate release_date property #42
I have 2 comments without looking too closely at the code
Now I'm not opposed to adding this to the metadata, but we really must flesh out the delineation of responsibility.
I think that's unavoidable. See the comments in KSP-CKAN/CKAN#2916; the Inflator doesn't know whether a file that it has generated already exists, so some other entity has to be in charge of copying forward already established release dates.
What's the reason behind that? The Indexer is built around a Git repo, so the historical data is all right there, and it is responsible for integrating new data into the existing database. Seems like a pretty natural fit.
Yes, I think we're going to have to step away from this slightly. But I feel very strongly that the source of this truth needs to be the inflator. It gives us scope to implement the release dates from the APIs we consume where available, and leaves the indexer with the sole job of handling/publishing changes, as noted in KSP-CKAN/CKAN#2916.
To clarify, I'm not opposed to creating a stub like I did for the restore_status. But it doesn't belong in the main indexing logic, as it will be a one-off task. The method itself looks OK to me, and we could use it in a 'populate_historical_release_dates' if we wanted. It also relies on a deep clone (which we could make optional), which we just don't need in regular operations. Part of my reasoning for using shallow cloning was to reduce our memory usage during startup; as time goes on and we add more services, that'll become more important.
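For illustration, here is a minimal Python sketch of how a file's creation date might be read from Git history for a one-off backfill. The helper names (`creation_date_args`, `earliest_date`, `file_creation_date`) are hypothetical and not taken from this PR; only standard `git log` options are used. It also shows why full history matters: in a shallow clone the history is truncated, so the oldest fetched commit would likely be reported instead of the real creation date.

```python
import subprocess
from datetime import datetime
from typing import List, Optional


def creation_date_args(path: str) -> List[str]:
    # --diff-filter=A keeps only commits that added the file;
    # %aI prints the author date in strict ISO 8601 format.
    return ["git", "log", "--diff-filter=A", "--format=%aI", "--", path]


def earliest_date(git_output: str) -> Optional[datetime]:
    """Return the earliest date found in `git log` output, or None."""
    dates = []
    for line in git_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Some git versions print UTC as "Z", which older Python
        # versions cannot parse, so normalize it to "+00:00".
        if line.endswith("Z"):
            line = line[:-1] + "+00:00"
        dates.append(datetime.fromisoformat(line))
    return min(dates) if dates else None


def file_creation_date(repo_dir: str, path: str) -> Optional[datetime]:
    # Hypothetical wrapper: runs git inside an existing (deep) clone.
    result = subprocess.run(
        creation_date_args(path),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return earliest_date(result.stdout)
```

Keeping the parsing separate from the subprocess call means the date logic can be exercised without a repository on disk.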
Ahh right, shallow clones. FWIW, I think I checked this before, but just now I got 137.7 MiB for a deep clone and 100.1 MiB for shallow, a difference of about 27% of the size of the deep clone.
That's space on disk, which in reality is neither here nor there. Memory usage during clone is where the bigger difference is.
Ahh, OK. Looks like that's on the order of a 45% difference:
$ /usr/bin/time --format='%M KiB' git clone https://github.com/KSP-CKAN/CKAN-meta.git
48804 KiB
$ /usr/bin/time --format='%M KiB' git clone --depth 1 https://github.com/KSP-CKAN/CKAN-meta.git
26616 KiB
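As a quick check of that figure, the relative saving can be computed from the two peak-memory readings quoted above. This is only arithmetic on those numbers; the function name is made up for the example.

```python
def relative_saving(full: float, reduced: float) -> float:
    """Fraction of `full` that is saved by using `reduced` instead."""
    return (full - reduced) / full


deep_kib = 48804     # peak memory reported for the full clone
shallow_kib = 26616  # peak memory reported for the --depth 1 clone

print(f"{relative_saving(deep_kib, shallow_kib):.1%}")  # prints 45.5%
```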
Replaced by KSP-CKAN/CKAN#3059.
Motivation
Users ask fairly frequently for the ability to sort modules by release date. So far we haven't been able to provide that info because the bot is stateless, and keeping track of the first time that we found a version requires state.
Changes
Now the Indexer will add a release_date property to each module that it processes:
- [...] release_date property forward unchanged
- [...] release_date property, it will set it to the file creation date in git
- [...] release_date property to now

Note that the schema/spec currently do not have a release_date property. It will have to be added before this can be merged. In the meantime I want to see how many tests fail and why.

Fixes #6.
Fixes KSP-CKAN/CKAN#2916.
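The three cases listed under Changes could be read as a simple precedence rule. The sketch below is one possible interpretation, not the code from this PR: the function name, its parameters, the exact conditions for each case, and the ISO 8601 string format are all assumptions, since the spec does not yet define release_date.

```python
from datetime import datetime, timezone
from typing import Optional


def resolve_release_date(
    existing: Optional[dict],
    git_created: Optional[datetime],
    now: datetime,
) -> str:
    """Pick a release_date for a module being indexed (hypothetical helper)."""
    # Case 1 (assumed): already-indexed metadata has a release_date,
    # so carry it forward unchanged.
    if existing and existing.get("release_date"):
        return existing["release_date"]
    # Case 2 (assumed): the file is already in the repo but has no
    # release_date, so fall back to its creation date in git.
    if git_created is not None:
        return git_created.isoformat()
    # Case 3 (assumed): nothing is known about the file, so use the current time.
    return now.isoformat()


now = datetime(2020, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
created = datetime(2019, 6, 1, tzinfo=timezone.utc)
print(resolve_release_date({"release_date": "2018-01-01T00:00:00+00:00"}, created, now))
print(resolve_release_date({"identifier": "SomeMod"}, created, now))
print(resolve_release_date(None, None, now))
```

Passing `now` in as a parameter rather than calling `datetime.now()` inside the function keeps the rule deterministic and easy to test.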