Revisit headers load fix #751

Merged
merged 2 commits into from
Aug 19, 2022

ikreymer
Member

Description

Revisit records may contain HTTP headers, or they may contain no headers. If they do contain headers, the replay should use the HTTP headers from the revisit record, and the payload from the original record referenced by the revisit.
Unfortunately, this was not working (possibly ever?) as expected!
It appears that pywb was always using the HTTP headers from the original record.
Generally, this is fine, except in the case where the revisit record is of a redirect, and the HTTP headers have changed between the revisit and the original (while the payload is the same, and generally ignored).
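
The intended behavior described above can be sketched roughly as follows. This is only an illustration, not pywb's actual code: the `Record` class and `select_replay_parts` function are hypothetical names invented for this example, and headers are modeled as a plain dict.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Hypothetical stand-in for a parsed WARC record (not pywb's real class)."""
    rec_type: str                  # "response" or "revisit"
    http_headers: Optional[dict]   # None when the record carries no HTTP headers
    payload: bytes = b""


def select_replay_parts(revisit: Record, original: Record):
    """Return the (headers, payload) pair to replay for a revisit record."""
    # Prefer the revisit's own HTTP headers when it has any;
    # otherwise fall back to the headers of the original record.
    if revisit.http_headers is not None:
        headers = revisit.http_headers
    else:
        headers = original.http_headers
    # The payload always comes from the original record.
    return headers, original.payload


# A redirect whose Location changed between the original capture and the revisit.
original = Record("response", {"Location": "https://example.com/old"}, b"moved")
revisit = Record("revisit", {"Location": "https://example.com/new"})

headers, payload = select_replay_parts(revisit, original)
print(headers["Location"])
print(payload)
```

With the bug described here, the equivalent of `headers` would have come from `original` in every case, so the replayed redirect would point at the old location.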

Motivation and Context

This fixes an issue originally found in sul-dlss/was-pywb#64

Added test case which I believe replicates this issue.
cc: @edsu

Types of changes

  • Replay fix (fixes a replay specific issue)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added or updated tests to cover my changes.
  • All new and existing tests passed.

- use http headers from headers record!
- parse records on initial lookup, as may need to use http headers from headers record
- possible fix for sul-dlss/was-pywb#64
tests: update test to use original record, revisit contains no content-length
@edsu
Contributor

edsu commented Aug 16, 2022

Thanks for investigating this!

@ikreymer ikreymer merged commit f190190 into main Aug 19, 2022
ikreymer added a commit that referenced this pull request Aug 19, 2022
- if a revisit is of a redirect (3xx response) and revisit has http headers, return
the http headers with empty payload -- don't bother loading the original record
builds on changes in #751
@ikreymer ikreymer mentioned this pull request Aug 20, 2022
8 tasks
ikreymer added a commit that referenced this pull request Aug 20, 2022
- if a revisit is of a redirect (3xx response) and revisit has http headers, return
the http headers with empty payload -- don't bother loading the original record
builds on changes in #751
- cleanup redirect revisit tests from #751
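
A minimal sketch of the redirect shortcut described in the follow-up commit message, under stated assumptions: `replay_revisit`, `load_original`, and the dict-based headers are hypothetical names for illustration and are not taken from the pywb codebase.

```python
from typing import Callable, Optional, Tuple

Headers = dict
Loaded = Tuple[Headers, bytes]


def replay_revisit(status: int,
                   revisit_headers: Optional[Headers],
                   load_original: Callable[[], Loaded]) -> Loaded:
    """Return (headers, payload) for a revisit, loading the original lazily."""
    # Redirect (3xx) revisit that has its own headers: answer from the
    # revisit alone with an empty payload and skip the original record.
    if revisit_headers is not None and 300 <= status < 400:
        return revisit_headers, b""
    # Otherwise the original record is needed for the payload, and for
    # the headers too when the revisit has none.
    original_headers, original_payload = load_original()
    headers = revisit_headers if revisit_headers is not None else original_headers
    return headers, original_payload


calls = []


def fake_loader() -> Loaded:
    # Records each lookup so we can see whether the original was loaded.
    calls.append("loaded")
    return {"Content-Type": "text/html"}, b"<html></html>"


redirect = replay_revisit(302, {"Location": "https://example.com/new"}, fake_loader)
print(redirect, calls)
```

Passing the loader as a callable keeps the lookup of the original record lazy, so the redirect case never pays for it.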
@ikreymer ikreymer deleted the revisit-headers-load-fix branch August 31, 2022 23:52
2 participants