
validating responses? #252

Open
Dieterbe opened this issue Apr 17, 2016 · 14 comments
Labels: question (Question about GoReplay and how to use it)

Comments

@Dieterbe

Hi, been using gor for a bit and it proved really helpful in determining why a code change crashed our apps with production traffic (while everything was fine with fake stress-test traffic).

but is there a way to also validate that, when sending the duplicated traffic to a test setup, the test setup returns the same responses as prod? (or maybe "near-identical", by allowing a regex or something to mark a field that is allowed to differ)

this is my biggest hurdle with the tool right now. it's great for sending the traffic, but i really want to validate the responses as well (or a sampled subset of responses).

am i missing something obvious? i couldn't find any references to this.

thanks!

@buger
Owner

buger commented Apr 18, 2016

Hello! I think this is something that should be included in the default installation. I made some steps in this direction, and currently you can implement it on your own using middleware (as it has access to both the request and the response).
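
For anyone who wants to go the middleware route today, here is a rough, untested sketch in Go. It assumes the stdin/stdout protocol described in the middleware docs: hex-encoded payloads, one per line, where the first decoded line is `<type> <id> <timestamp>`, with type 1 = request, 2 = original response, 3 = replayed response. The flags needed for responses to reach the middleware (e.g. --input-raw-track-response and --output-http-track-response) vary by version, so treat this as an illustration rather than a drop-in tool.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/hex"
	"fmt"
	"os"
)

// originals maps a request id to the original (production) response,
// so it can be compared once the replayed response arrives.
var originals = map[string][]byte{}

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	scanner.Buffer(make([]byte, 0, 64*1024), 16*1024*1024) // allow large payloads

	for scanner.Scan() {
		payload, err := hex.DecodeString(scanner.Text())
		if err != nil {
			continue
		}
		headerEnd := bytes.IndexByte(payload, '\n')
		if headerEnd == -1 {
			continue
		}
		meta := bytes.Fields(payload[:headerEnd])
		if len(meta) < 2 {
			continue
		}
		payloadType, id := string(meta[0]), string(meta[1])
		httpPayload := payload[headerEnd+1:]

		switch payloadType {
		case "1": // request: echo it back so gor forwards it to the test setup
			fmt.Println(scanner.Text())
		case "2": // original (production) response: remember it
			originals[id] = httpPayload
		case "3": // replayed (test) response: compare against the original
			if orig, ok := originals[id]; ok {
				if !bytes.Equal(statusLine(orig), statusLine(httpPayload)) {
					fmt.Fprintf(os.Stderr, "DIFF %s: %q vs %q\n",
						id, statusLine(orig), statusLine(httpPayload))
				}
				delete(originals, id)
			}
		}
	}
}

// statusLine returns the first line of a raw HTTP response.
func statusLine(resp []byte) []byte {
	if i := bytes.IndexByte(resp, '\n'); i != -1 {
		return bytes.TrimRight(resp[:i], "\r")
	}
	return resp
}
```

It only compares status lines and prints mismatches to stderr; body comparison, ignore filters, and persisting diffs would slot into the same place.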

@Dieterbe
Author

> I made some steps in this direction

Where can I see this?

@buger
Owner

buger commented Apr 18, 2016

See this section https://github.com/buger/gor#middleware

Also, it would be really helpful if you could help imagine what this built-in functionality might look like. Some fields should probably be ignored. Should it just generate a file with all the differences, or somehow output them to the console? What would you like it to do? Thank you!

@ecourreges-orange

Hello,
For implementing response validation/comparison, you could take inspiration from diffy in Java:
https://github.com/twitter/diffy

or mod_dup + mod_compare in C++:
https://github.com/Orange-OpenSource/mod_dup

We currently use mod_compare at Orange to make sure new code versions of our API are strictly identical to the one we are trying to replicate.
It includes filters to ignore certain differences, and we inject the differences in real time into ES (via Logstash), and then we visualize/group the differences with Kibana.
The diff engine is based on dtl-cpp: https://github.com/cubicdaiya/dtl
It has been very powerful for us, allowing real-time diffing with low impact on the API servers (> 500 req/s per Apache server running the API and mod_compare).

If someone implemented that kind of functionality in gor, we might be tempted to switch to it and perhaps contribute.

Regards,
Emmanuel Courrèges

@Dieterbe
Author

  • diffy says "For safety reasons POST, PUT, DELETE are ignored by default. Add -allowHttpSideEffects=true to your command line arguments to enable these verbs." <- this makes sense.
  • diffy shows diffs in real time in the browser. interesting but not a necessity imho. I would much rather have the data in a file/database.
  • @ecourreges-orange one of the images for mod_compare shows it storing the diff data to a file. is this file format documented anywhere? How about the filters you mention? any docs on those?

@buger, re: how i would want it to look:

  • have you ever looked at vegeta? it's an http load-testing tool, and it has a really nice mechanism where it can stream output data to a file, and that output file can then be used to generate all kinds of reports (to browser, to console, to pngs, etc), after the fact but also in real time, streaming. We could take inspiration from that.
  • for each response that has differences, i would log the request, response-good and response-bad. we can then write simple tools to process that data however we like (some actual diff highlighting would be nice). ideally a binary format, but something like an append-only yaml could also work to get us started (see the sketch after this list for one possible record shape).
  • furthermore, i would say responses that are equal don't have to be logged (this is probably obvious); they can easily be sniffed.
  • if there are a lot of diffs, i don't want to overload the system trying to store them all, so i want to sample. but gor already supports sampling down the volume of requests, which i think is fine. i don't think we need an extra sampling step to control the volume of diffs stored compared to the volume of requests sent. it's ok for them to be tied together, i think, because correctness and load testing are separate activities: you either care about finding diffs and focus on resolving them, or you focus on load testing (after fixing the bugs that caused diffs).
  • to handle exceptions (some header or piece of the response body may be different), we'll need some kind of filters to express what kind of diffs we want to allow. there are so many ways to go about this; we can implement it first without filters and then see what kind of filters people need.
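
To make the logging point above concrete, here is one possible shape for that append-only diff log, written as JSON lines rather than yaml or a binary format purely for illustration. The field names and the DiffRecord/appendDiff helpers are hypothetical, not an existing gor format:

```go
package main

import (
	"encoding/json"
	"os"
	"time"
)

// DiffRecord is a hypothetical record layout: one record per mismatching
// response, holding the request and both responses.
type DiffRecord struct {
	RequestID  string    `json:"request_id"`
	Timestamp  time.Time `json:"timestamp"`
	Request    string    `json:"request"`       // raw HTTP request
	OrigResp   string    `json:"response_good"` // production response
	ReplayResp string    `json:"response_bad"`  // test-setup response
}

// appendDiff appends one record to an append-only JSON-lines file.
func appendDiff(path string, rec DiffRecord) error {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		return err
	}
	defer f.Close()
	return json.NewEncoder(f).Encode(rec) // one JSON object per line
}

func main() {
	_ = appendDiff("diffs.jsonl", DiffRecord{
		RequestID:  "abc123",
		Timestamp:  time.Now(),
		Request:    "GET /users HTTP/1.1\r\nHost: example.com\r\n\r\n",
		OrigResp:   "HTTP/1.1 200 OK\r\n\r\n",
		ReplayResp: "HTTP/1.1 500 Internal Server Error\r\n\r\n",
	})
}
```

Each record is self-contained, so the file can be tailed and turned into reports after the fact or in real time, similar to what vegeta does with its output files.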

@buger to be clear, are you saying you want to build this as a middleware? you mention "built-in" but i'm not sure whether that means as middleware or not. do you have concrete plans to work on this? hopefully i can find some time to help out, but i can't promise anything.

@ecourreges-orange

Hello Dieter,

This page briefly explains the filters:
https://github.com/Orange-OpenSource/mod_dup/wiki/mod_comp_exploit

And following your question, I have started this page to explain the output formats:
https://github.com/Orange-OpenSource/mod_dup/wiki/mod_compare_logs

Regards,
Emmanuel Courrèges


@buger
Owner

buger commented Apr 20, 2016

Thank you all for such detailed feedback! I need some time to read all this stuff :)

@ecourreges-orange not middleware, i want it to be built in.

@ecourreges-orange one more question: using mod_dup, how do you handle dynamic variables like user session ids or auth tokens, which are generated randomly on "production" and will not be the same on staging?

@ecourreges-orange

  • For now we have only used mod_compare on stateless APIs without authentication.
  • I don't know how session ids could be handled. Especially if you duplicate only a percentage of production, you will probably not duplicate the session-initiation request, so it's a problem even if you don't compare!
  • As for authentication tokens, we don't have any issue because we actually duplicate to a "hidden production" which has the same authentication scope/keytab as production, so a token that works for prod also works for hidden prod.
    ◦ The question remains for duplicating to staging. I don't know how to solve this without compromising security => staging probably needs to have auth disabled, in which case you'll only see differences when there is an auth error on prod?
    ◦ We duplicate to hidden production so that the APIs hit the exact same backends. If you hit your staging server, your databases are probably not fully in sync, so you will get a lot of false-positive differences.
    ◦ Of course, to make sure our new code doesn't break our production database, we first test it in pre-production/bench/development.
  • If we just need to ignore headers that are different (like Date, etc.), we use the header-ignore functionality like this: HeaderList "IGNORE" "Date" "." (where "." is a regexp). A sketch of how an equivalent rule might look in code is below.
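
Tying this back to the earlier question about what filters in gor could look like: below is a minimal sketch, in Go, of applying such an ignore rule (header name plus value regexp) before diffing. The IgnoreRule type and stripIgnored function are hypothetical, not part of gor or mod_compare:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// IgnoreRule drops a header from the comparison when its value matches Pattern.
type IgnoreRule struct {
	Header  string
	Pattern *regexp.Regexp
}

// stripIgnored removes ignored headers from a raw HTTP head (status line plus
// headers) so that e.g. Date never produces a false-positive diff.
func stripIgnored(head string, rules []IgnoreRule) string {
	var kept []string
	for i, line := range strings.Split(head, "\r\n") {
		if i == 0 { // always keep the status line
			kept = append(kept, line)
			continue
		}
		name, value, found := strings.Cut(line, ":")
		ignore := false
		if found {
			for _, r := range rules {
				if strings.EqualFold(strings.TrimSpace(name), r.Header) &&
					r.Pattern.MatchString(strings.TrimSpace(value)) {
					ignore = true
					break
				}
			}
		}
		if !ignore {
			kept = append(kept, line)
		}
	}
	return strings.Join(kept, "\r\n")
}

func main() {
	rules := []IgnoreRule{{Header: "Date", Pattern: regexp.MustCompile(".")}}
	a := "HTTP/1.1 200 OK\r\nDate: Mon, 18 Apr 2016 10:00:00 GMT\r\nContent-Type: application/json"
	b := "HTTP/1.1 200 OK\r\nDate: Mon, 18 Apr 2016 10:00:07 GMT\r\nContent-Type: application/json"
	fmt.Println(stripIgnored(a, rules) == stripIgnored(b, rules)) // prints true
}
```

Body-level exceptions would need a similar mechanism, e.g. a regexp or JSON path applied to the body before comparison.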
    


@Dieterbe
Author

Dieterbe commented May 3, 2016

also relevant: https://github.com/dnaeon/go-vcr

@joekiller
Contributor

I generally used the elasticsearch results to validate responses, as they include the response code.

I also recently updated the tool to work with ES 2.x: #333

@buger added the "question" label on Aug 1, 2016
@jauco

jauco commented Aug 30, 2016

Hey,

I wrote a gor middleware in Java to do this. It's fairly untested, but I will be running it against our website in the coming weeks. Code is at https://github.com/HuygensING/gor-tester

cheers.

Jauco

@buger
Owner

buger commented Aug 30, 2016

Terrific! Thank you so much for sharing it!

@mayank-unbxd

@jauco were you able to validate responses against your staging/test server?

@buger - pretty excited to know if you have built something around this:

> Hello! I think this is something that should be included in the default installation. I made some steps in this direction, and currently you can implement it on your own using middleware (as it has access to both the request and the response).

can you share something on this ...

@buger
Owner

buger commented Jul 18, 2018

@mayank-unbxd this doc explains how to access the response and validate it: https://github.com/buger/goreplay/tree/master/middleware
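
One note that may save some time, hedged because flag names have changed between versions: the middleware only receives responses when response tracking is enabled, so the invocation looks roughly like `gor --input-raw :8080 --input-raw-track-response --output-http http://staging.example.com --output-http-track-response --middleware "./your-middleware"`. Here staging.example.com and ./your-middleware are placeholders, and `gor -h` shows the exact flags for a given version.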
