Create `requires_python` column on `release_files` to efficiently query `requires_python` with files #1448
Conversation
Pushed a change to try to make the linter happy.
The test on 3.5.2 seems to have failed before starting. Can one of you restart this specific Travis build? Thanks!
@Carreau I restarted it for you.
Why not just add a column to the `release_files` table?
I'm not sure how to cascade on the same table; I can give it a try. @michaelpacer and I basically learned SQL for this, and I can see many ways to get it wrong. I might just misunderstand the features available in SQL, though, and there might be a way to make sure there is consistency.
Updated with @michaelpacer's work; this now adds a column to `release_files`.
Travis seems to be stuck. I improved the code style to pass the linter, and I assume that after rebasing I had to update the database revision number (I assume that's what the following was trying to convey).
Is there a simpler way to update the db revision number than to re-run the revision command?
I think multiple heads means that other changes to the DB have landed since this change, and you'll need to either rebase your DB revision so it follows linearly from that one, or create a merge db revision. See http://alembic.zzzcomputing.com/en/latest/branches.html
Thanks for restarting the build.
Thanks, it appears I still had the revision wrong, but because of a missing key... re-attempting.
Yeah! 🍰 🎊 Tests are passing. Not sure if you prefer commits to be squashed a bit, and/or if extra tests are needed for the DB, as this will mostly be used by legacy PyPI, nor exactly what needs to be tested.
That's now implemented and seems to work. [edit] fixed typo spotted in next comment.
I think what @Carreau meant was that it's now implemented, since it is not not implemented. 😄 Is there anything more that needs to be done on this for it to be included?
Oops, typed too fast. I edited my comment.
I'd appreciate comments or hints on the next steps to take on this, in order to tackle things on the PyPI side as well. Thanks!
Hello there! It's Monday 🎊, usually the day of the week most people hate because it's time to start working again! I hope you are all fine and have spent a nice week-end. I personally went to the beach (in Moss Landing) and saw otters. I made a wrong move and my broken foot is hurting again, so I used that as an excuse to watch Mr. Robot on Sunday. I've also read that Postgres 9.6 is going to have BDR, which is cool I guess. I'm now routinely looking at this piece of code and looking to get any feedback on things to improve (or to fix). Hopefully things get merged soon enough to be able to finish my patch on legacy PyPI. I'm now going to attempt inserting a cat GIF here to cheer you up and give you the strength to review. Thanks for all your hard work!
Morning everyone! It is Monday again! I know, I know, it's often Monday. But according to rough calculations, only about 14.28% of the time! I definitely need more samples to be sure. Also, good news: it 💧 rained 💧 this morning in California! Also, my ankle did not hurt before it started raining. I'm sad because I wanted to be like the old grandpa in movies who can predict rain just depending on whether* his rheumatism is hurting. [*or should I say weather, haha] I'm also kind of anti-participating in Hacktoberfest: instead of submitting PRs I'm trying to get mine merged or closed, as I feel too many open PRs can be scary to new contributors, and depressing for maintainers. I know the feeling. Let me tell you a joke to cheer you up and give you the courage to merge this pull request. You make it hard on me because I have to find one relevant to this pull request...
Well ok, that was lame. But I hope you enjoyed it and got some courage to press the nice green button you can see below. Also, if you are afraid to ask for some change or don't have the courage to write a long reply, do not worry: I can take a blunt and short answer, even a negative one. All I want is for this thing to move forward. And I know that writing a long, nice message that is politically correct can be tough. Anyway, thanks for your hard work, and hoping we can move this forward. Also, I'm still unsure if you are a dog or a cat person, so I'm going to try a dog GIF this time. If it does not work I'll likely try squirrels, llamas, rabbits, raptors, or something else next week. See you before next Monday, hopefully.
Sorry, this is on my list of things to review, I've just had a really crappy couple of weeks and haven't been able to focus much. |
BEGIN
    IF (TG_OP = 'INSERT'
        OR
        OLD.requires_python IS DISTINCT FROM NEW.requires_python)
I think you can ditch the `IF ... THEN` statement here and just rely on the trigger to only evaluate when this needs to be evaluated (see below).
Caught where the if statement ended after I tried to rebuild the db. Now it seems to successfully migrate!
# release_files with the appropriate requires_python values.
op.execute(
    """ CREATE TRIGGER releases_requires_python
        AFTER INSERT OR UPDATE ON releases
If you change this to `AFTER INSERT OR UPDATE OF requires_python ON releases`, then this trigger will only fire on either an insert, or when `requires_python` is updated. This would allow you to remove the logic from the procedure up above.
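The column-targeted trigger idea can be sketched in a self-contained way with Python's built-in `sqlite3`. This is only an illustration under assumptions: the real migration is Postgres, which can combine `INSERT OR UPDATE OF` in one trigger, while SQLite needs them split, so only the `UPDATE OF` half is shown here; table and column names mirror the PR but the schema is simplified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE releases (name TEXT, version TEXT, requires_python TEXT);
CREATE TABLE release_files (name TEXT, version TEXT, filename TEXT,
                            requires_python TEXT);

-- Fires only when the requires_python column itself is updated,
-- so no IF (TG_OP = ... OR OLD.x IS DISTINCT FROM NEW.x) guard is needed
-- in the trigger body.
CREATE TRIGGER releases_requires_python_update
AFTER UPDATE OF requires_python ON releases
BEGIN
    UPDATE release_files
    SET requires_python = NEW.requires_python
    WHERE name = NEW.name AND version = NEW.version;
END;
""")

conn.execute("INSERT INTO releases VALUES ('demo', '1.0', NULL)")
conn.execute("INSERT INTO release_files VALUES ('demo', '1.0', 'demo-1.0.tar.gz', NULL)")
conn.execute("UPDATE releases SET requires_python = '>=3.4' WHERE name = 'demo'")

# The trigger has propagated the new value into release_files.
row = conn.execute("SELECT requires_python FROM release_files").fetchone()
```

Because the firing condition lives in the trigger definition rather than the procedure body, the body stays a plain unconditional `UPDATE`.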
Oh, nice! Thanks for the tip!
That is much more elegant; thank you!
Changes should be addressed by 9061cbe.
# if someone changes the requires_python value, it is regenerated from
# releases.
op.execute(
    """ CREATE TRIGGER release_files_requires_python
I probably wouldn't bother with this. This should be read-only from the Python code in Warehouse, so it shouldn't ever get modified anyway. If it does, we can always repair the data by executing:
UPDATE release_files
SET requires_python = releases.requires_python
FROM releases
WHERE
    release_files.name = releases.name
    AND release_files.version = releases.version
again to fix it.
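The repair query above can be exercised end to end with Python's built-in `sqlite3` as a sketch (assumption: SQLite before 3.33 lacks `UPDATE ... FROM`, so a correlated subquery is used here in place of the Postgres join form quoted above; the schema is a simplified stand-in):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE releases (name TEXT, version TEXT, requires_python TEXT);
CREATE TABLE release_files (name TEXT, version TEXT, requires_python TEXT);
INSERT INTO releases VALUES ('demo', '1.0', '>=2.7');
-- A row that has drifted out of sync with releases:
INSERT INTO release_files VALUES ('demo', '1.0', NULL);
""")

# Portable equivalent of the UPDATE ... FROM repair query: re-derive
# release_files.requires_python from the matching releases row.
conn.execute("""
    UPDATE release_files
    SET requires_python = (
        SELECT releases.requires_python
        FROM releases
        WHERE releases.name = release_files.name
          AND releases.version = release_files.version
    )
""")

repaired = conn.execute("SELECT requires_python FROM release_files").fetchone()[0]
```

Since the column is fully derivable from `releases`, any drift is recoverable by re-running this one statement, which is the reason the reviewer considers the extra trigger unnecessary.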
Does that mean that those commands should be added to the `downgrade()` function? Or just that there's a clean way to address the issue if it does happen to come up.
Thanks for the feedback either way!
No worries, we all go through these phases. Hope things will go better for you. I was secretly hoping you were not reviewing because you liked the weekly email and were hoping to get more. Thanks a lot for your time, we really appreciate it!
This overall looks good. One thing I missed is that you need to add this to the `File` module, something like:
class File(db.Model):
    ...
    requires_python = Column(Text)

    @validates("requires_python")
    def validate_requires_python(self, *args, **kwargs):
        raise RuntimeError("Cannot set File.requires_python")
Is the `validates` decorator that you're suggesting the same as the `sqlalchemy.orm.validates` described here, or is there a place in the code where you define your own `validates` decorator that I'm not finding?
Yea, it's the SQLAlchemy one, and it will just make it so any attempt in the Warehouse ORM to change the value will bomb out (but the trigger will work fine).
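The "bomb out on ORM writes, leave the trigger alone" behavior can be mimicked without SQLAlchemy in a small stdlib-only sketch. This is purely illustrative under assumptions: the class below is a hypothetical stand-in using a plain `property`, not the real `File` model, and `sqlalchemy.orm.validates` works at the ORM layer rather than via properties.

```python
class File:
    """Hypothetical stand-in for the ORM model: reads are allowed,
    Python-level writes raise, and the database trigger (not shown)
    remains free to change the underlying column directly."""

    def __init__(self):
        self._requires_python = None

    @property
    def requires_python(self):
        return self._requires_python

    @requires_python.setter
    def requires_python(self, value):
        # Mirrors the validator's guard: any ORM-side write bombs out.
        raise RuntimeError("Cannot set File.requires_python")


f = File()
try:
    f.requires_python = ">=3.5"
    blocked = False
except RuntimeError:
    blocked = True
```

The point is the division of labor: application code treats the attribute as read-only, while consistency with `releases` is maintained entirely inside the database.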
And it should also be included in the `File` module?
It should only be on the `File` model.
Got it, thank you for the pointers! I wouldn't have known to construct it in this way without your help.
I see now! The validation accomplishes what we were trying to do directly by overwriting with triggers. I agree that raising an error is better than silently overwriting what someone did. Also, I just realised I never imported it.
@dstufft Does the codecov issue mean that a test needs to be added that hits the RuntimeError and expects a failure?
Yes.
Would that go in the …?
Yup.
@dstufft Thanks for helping out the Jupyter folks. Sorry about the last few weeks. Your Python friends ❤️ all that you do. Take care. Feel free to ping any of us if we can help. You know how much I love docs ;-)
Thank you for merging this! When can we rely on this being available from test_pypi and pypi so we can make progress on integrating …?
Now.
The work in PR pypi#1448 was meant to replicate the `requires_python` information from the `releases` table to the `release_files` table, for efficiency when generating the list of available packages for pip. While the work in pypi#1448 seems sufficient for Warehouse itself, it needs to account for the fact that legacy PyPI also accesses the same database, and legacy PyPI violates some constraints. In particular, when using `setup.py register` followed by `twine upload`, the file upload inserts files into `release_files` after inserting into `releases`. Thus the value in `releases` is not propagated, leading to an inconsistency and a pip listing missing information about python-version compatibility. While I doubt there are any packages released between the merge of pypi#1448 and a fix, this updates the tables and binds an already existing trigger to update the information during insertion into `release_files`.
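The ordering problem the commit message describes, and the fix of binding a trigger to `release_files` itself, can be sketched with Python's built-in `sqlite3` (an illustration under assumptions: the real database is Postgres with PL/pgSQL triggers, and the schema here is simplified):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE releases (name TEXT, version TEXT, requires_python TEXT);
CREATE TABLE release_files (name TEXT, version TEXT, requires_python TEXT);

-- Bound to release_files itself, so a file row inserted *after* the
-- release row (the legacy `setup.py register` + `twine upload` order)
-- still picks up the value from releases.
CREATE TRIGGER release_files_requires_python
AFTER INSERT ON release_files
BEGIN
    UPDATE release_files
    SET requires_python = (
        SELECT requires_python FROM releases
        WHERE releases.name = NEW.name AND releases.version = NEW.version
    )
    WHERE rowid = NEW.rowid;
END;
""")

# Legacy ordering: the release exists first, the file arrives later.
conn.execute("INSERT INTO releases VALUES ('demo', '1.0', '>=3.3')")
conn.execute("INSERT INTO release_files VALUES ('demo', '1.0', NULL)")

val = conn.execute("SELECT requires_python FROM release_files").fetchone()[0]
```

With only the `releases`-side trigger, the late-arriving `release_files` row would have kept a `NULL` `requires_python`; the insert-side trigger closes that gap.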
Assuming there are far fewer updates of the `release` and `release_file` tables than reads, we take the overhead of building this materialized view only rarely.
Working with @michaelpacer on pypi/legacy#506, we came up with this solution for now, and would appreciate any feedback and timing statistics on a replica of the production DB.