Add POST data records, for PyWB playback #244
Comments
More information from Ilya: Hi, I am working on trying to standardize POST request indexing across all the different Webrecorder tools, and to support additional improvements. This probably calls for a write-up, but I just wanted to share what the idea is so far:
The CDXJ entry would look like this: org,httpbin)/post?__wb_method=post&another=more^data&test=some+data 20200809195334 {"url": "https://httpbin.org/post", "mime": "application/json", "status": "200", "digest": "7AWVEIPQMCA4KTCNDXWSZ465FITB7LSK", "length": "688", "offset": "0", "filename": "post-test-more.warc", "requestBody": "?__wb_method=POST&test=some+data&another=more%5Edata", "method": "POST"}
the requestBody is for:
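To make the relationship between the request and the index key concrete, here is a minimal Python sketch that reproduces the key from the example entry above. It is an illustration only, not pywb's or jwarc's actual code: the function names `append_post_query` and `simplified_surt_key` are hypothetical, the canonicalization is a deliberately simplified stand-in for real SURT rules, and it assumes a form-urlencoded body. Other content types are not handled.

```python
from urllib.parse import urlsplit, unquote


def append_post_query(url: str, method: str, body: str) -> str:
    """Append the method marker and the form body to the URL as query parameters.

    Hypothetical helper: assumes `body` is already application/x-www-form-urlencoded.
    """
    sep = "&" if "?" in url else "?"
    return f"{url}{sep}__wb_method={method.upper()}&{body}"


def simplified_surt_key(url: str) -> str:
    """Build a simplified SURT-style key: reversed host, lowercased, sorted query.

    This is an assumption-laden approximation of real canonicalization, which
    handles ports, 'www' prefixes, session IDs and many other cases.
    """
    parts = urlsplit(url)
    host = ",".join(reversed(parts.hostname.split(".")))
    # Percent-decode and lowercase each parameter, then sort them.
    params = sorted(unquote(p).lower() for p in parts.query.split("&") if p)
    key = f"{host}){parts.path.lower()}"
    if params:
        key += "?" + "&".join(params)
    return key


if __name__ == "__main__":
    full_url = append_post_query(
        "https://httpbin.org/post", "POST", "test=some+data&another=more%5Edata"
    )
    print(simplified_surt_key(full_url))
```

For the request in the example entry, this sketch yields the same key as the CDXJ line. The linked implementations remain the reference for anything beyond this simple case.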
Thanks @thomasegense - I'm afraid I'm probably going to switch to using the PyWB indexer for now, as modifying this codebase to pull together the request and response records would mean significant changes to the way it works. I don't currently have time to make those changes.
I completely agree with you. It is a hard, time-consuming task with only minor benefits to playback in solrwayback.
Here is the latest work going on between Ilya and Alex:
I have a Java implementation of pywb compatible POST/PUT request body encoding here: https://github.com/iipc/jwarc/blob/master/src/org/netpreserve/jwarc/cdx/CdxRequestEncoder.java
To get playback working, we need to make HEAD/OPTIONS/POST records like PyWB does. See webrecorder/pywb#585 and related tickets.
It's fairly involved! https://github.com/webrecorder/pywb/blob/54d8bccf4a4eebf305012d49cb7330eaddea9eba/pywb/warcserver/inputrequest.py#L183
Will replace/supersede
webarchive-discovery/warc-hadoop-recordreaders/src/main/java/uk/bl/wa/hadoop/mapreduce/cdx/TinyCDXServerReducer.java
Lines 86 to 95 in a166803
Note that to be useful, we need to upgrade to nlagovau/outbackcdx:0.8.0.