POST request handling and indexing improvements #636
Merged
Description
This PR updates the indexing of POST and other non-GET requests to include the request body in the URL as a query.
This allows for improved replay fidelity where request->response matching cannot rely on the URL alone and must also compare (parts of) the request body. The changes include:
A JSON body such as {"a": "b", "c": {"a": "b"}} will be converted to the query string a=b&a.2_=b (the .2_ suffix is hopefully one that is not commonly used).
__wb_method=<method> is also added to the query, so POST requests will have __wb_method=POST.
This indexing approach is compatible with cdxj-indexer and the replay used in wabac.js/ReplayWeb.page.
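The JSON-to-query conversion described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names json_body_to_query and append_to_url are hypothetical, and the position of __wb_method in the query, the handling of lists, and the serialization of non-string values are all assumptions made for the example.

```python
import json
from urllib.parse import urlencode


def json_body_to_query(method, body):
    """Flatten a JSON request body into a query string (illustrative only)."""
    # Assumption: the method marker is placed first in the query.
    pairs = [("__wb_method", method.upper())]
    seen = {}  # key name -> number of times it has been seen so far

    def add(key, value):
        count = seen.get(key, 0) + 1
        seen[key] = count
        # The first occurrence keeps the plain key; repeats get a ".<N>_" suffix.
        name = key if count == 1 else "{}.{}_".format(key, count)
        pairs.append((name, value))

    def walk(key, value):
        if isinstance(value, dict):
            # Nested objects are flattened: only the innermost key is kept.
            for child_key, child in value.items():
                walk(child_key, child)
        elif isinstance(value, list):
            # Assumption: list items are emitted under the parent key.
            for item in value:
                walk(key, item)
        else:
            # Assumption: non-string scalars are serialized as JSON text.
            add(key, value if isinstance(value, str) else json.dumps(value))

    walk("", json.loads(body))
    return urlencode(pairs)


def append_to_url(url, query):
    """Append the generated query to a URL that may already have one."""
    separator = "&" if "?" in url else "?"
    return url + separator + query


query = json_body_to_query("POST", '{"a": "b", "c": {"a": "b"}}')
indexed_url = append_to_url("https://example.com/api", query)
```

With the example body from the description, the two "a" keys end up as a=b and a.2_=b, alongside the __wb_method=POST marker.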
This is a breaking change for existing collections that already have POST requests. These will need to be re-indexed to get accurate POST request replay.
Motivation and Context
Many sites require accurate replay of POST and PUT requests in order to successfully match requests to responses and accurately replay pages.
Screenshots (if appropriate):
Types of changes
Checklist: