
validating responses? #252

Open
Dieterbe opened this issue Apr 17, 2016 · 14 comments
Labels: question (Question about GoReplay and how to use it)

Comments

@Dieterbe

Hi, been using gor for a bit and it proved really helpful in determining why a code change crashed our apps with production traffic (while everything was fine with fake stress-test traffic).

but is there a way to also validate that, when sending the duplicated traffic to a test setup, the test setup returns the same responses as prod? (or maybe "near-identical", by allowing a regex or something to mark a field that is allowed to differ)

this is my biggest hurdle with the tool right now. it's great for sending the traffic, but i really want to validate the responses as well (or a sampled subset of responses).

am i missing something obvious? i couldn't find any references to this.

thanks!

@buger
Owner

buger commented Apr 18, 2016

Hello! I think this is something that should be included in the default installation. I made some steps in this direction, and currently you can implement it on your own using middleware (as it has access to both the request and the response).
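
For anyone who wants to go the middleware route today, here is a rough, untested sketch in Go. It assumes the stdin/stdout protocol described in the middleware docs: hex-encoded payloads, one per line, where the first decoded line is `<type> <id> <timestamp>`, with type 1 = request, 2 = original response, 3 = replayed response. The flags needed for responses to reach the middleware (e.g. --input-raw-track-response and --output-http-track-response) vary by version, so treat this as an illustration rather than a drop-in tool.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/hex"
	"fmt"
	"os"
)

// originals maps a request id to the original (production) response,
// so it can be compared once the replayed response arrives.
var originals = map[string][]byte{}

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	scanner.Buffer(make([]byte, 0, 64*1024), 16*1024*1024) // allow large payloads

	for scanner.Scan() {
		payload, err := hex.DecodeString(scanner.Text())
		if err != nil {
			continue
		}
		headerEnd := bytes.IndexByte(payload, '\n')
		if headerEnd == -1 {
			continue
		}
		meta := bytes.Fields(payload[:headerEnd])
		if len(meta) < 2 {
			continue
		}
		payloadType, id := string(meta[0]), string(meta[1])
		httpPayload := payload[headerEnd+1:]

		switch payloadType {
		case "1": // request: echo it back so gor forwards it to the test setup
			fmt.Println(scanner.Text())
		case "2": // original (production) response: remember it
			originals[id] = httpPayload
		case "3": // replayed (test) response: compare against the original
			if orig, ok := originals[id]; ok {
				if !bytes.Equal(statusLine(orig), statusLine(httpPayload)) {
					fmt.Fprintf(os.Stderr, "DIFF %s: %q vs %q\n",
						id, statusLine(orig), statusLine(httpPayload))
				}
				delete(originals, id)
			}
		}
	}
}

// statusLine returns the first line of a raw HTTP response.
func statusLine(resp []byte) []byte {
	if i := bytes.IndexByte(resp, '\n'); i != -1 {
		return bytes.TrimRight(resp[:i], "\r")
	}
	return resp
}
```

It only compares status lines and prints mismatches to stderr; body comparison, ignore filters, and persisting diffs would slot into the same place.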

@Dieterbe
Author

> I made some steps in this direction

Where can I see this?

@buger
Owner

buger commented Apr 18, 2016

See this section https://github.com/buger/gor#middleware

Also, it would be really helpful if you could help imagine what this built-in functionality might look like. Some fields should probably be ignored. Should it just generate a file with all the differences, or somehow output them to the console? What would you like it to do? Thank you!

@ecourreges-orange

Hello,
For implementing response validation/comparison, you could take inspiration from diffy in Java:
https://github.com/twitter/diffy

or mod_dup + mod_compare in C++:
https://github.com/Orange-OpenSource/mod_dup

We currently use mod_compare at Orange to make sure new code versions of our API are strictly identical to the one we are trying to replicate.
It includes filters to ignore certain differences, and we inject the differences in real time into ES (via Logstash), and then we visualize/group the differences with Kibana.
The diff engine is based on dtl-cpp: https://github.com/cubicdaiya/dtl
It has been very powerful for us, allowing real-time diffing with low impact on the API servers (> 500 req/s per Apache server running the API and mod_compare).

If someone implemented that kind of functionality in gor, we might be tempted to switch to it and perhaps contribute.

Regards,
Emmanuel Courrèges

@Dieterbe
Author

  • diffy says "For safety reasons POST, PUT, DELETE are ignored by default. Add -allowHttpSideEffects=true to your command line arguments to enable these verbs." <- this makes sense.
  • diffy shows diffs in real time in the browser. interesting but not a necessity imho. I would much rather have the data in a file/database.
  • @ecourreges-orange one of the images for mod_compare shows it storing the diff data to a file. is this file format documented anywhere? How about the filters you mention? any docs on those?

@buger, re: how i would want it to look:

  • have you ever looked at vegeta? it's an http load-testing tool, and it has a really nice mechanism where it can stream output data to a file, and that output file can then be used to generate all kinds of reports (to browser, to console, to pngs, etc), after the fact but also in real time, streaming. We could take inspiration from that.
  • for each response that has differences, i would log the request, response-good and response-bad. we can then write simple tools to process that data however we like (some actual diff highlighting would be nice). ideally a binary format, but something like an append-only yaml could also work to get us started (see the sketch after this list for one possible record shape).
  • furthermore, i would say responses that are equal don't have to be logged (this is probably obvious); they can easily be sniffed.
  • if there are a lot of diffs, i don't want to overload the system trying to store them all, so i want to sample. but gor already supports sampling down the volume of requests, which i think is fine. i don't think we need an extra sampling step to control the volume of diffs stored compared to the volume of requests sent. it's ok for them to be tied together, i think, because correctness and load testing are separate activities: you either care about finding diffs and focus on resolving them, or you focus on load testing (after fixing the bugs that caused diffs).
  • to handle exceptions (some header or piece of the response body may be different), we'll need some kind of filters to express what kind of diffs we want to allow. there are so many ways to go about this; we can implement it first without filters and then see what kind of filters people need.
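
To make the logging point above concrete, here is one possible shape for that append-only diff log, written as JSON lines rather than yaml or a binary format purely for illustration. The field names and the DiffRecord/appendDiff helpers are hypothetical, not an existing gor format:

```go
package main

import (
	"encoding/json"
	"os"
	"time"
)

// DiffRecord is a hypothetical record layout: one record per mismatching
// response, holding the request and both responses.
type DiffRecord struct {
	RequestID  string    `json:"request_id"`
	Timestamp  time.Time `json:"timestamp"`
	Request    string    `json:"request"`       // raw HTTP request
	OrigResp   string    `json:"response_good"` // production response
	ReplayResp string    `json:"response_bad"`  // test-setup response
}

// appendDiff appends one record to an append-only JSON-lines file.
func appendDiff(path string, rec DiffRecord) error {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		return err
	}
	defer f.Close()
	return json.NewEncoder(f).Encode(rec) // one JSON object per line
}

func main() {
	_ = appendDiff("diffs.jsonl", DiffRecord{
		RequestID:  "abc123",
		Timestamp:  time.Now(),
		Request:    "GET /users HTTP/1.1\r\nHost: example.com\r\n\r\n",
		OrigResp:   "HTTP/1.1 200 OK\r\n\r\n",
		ReplayResp: "HTTP/1.1 500 Internal Server Error\r\n\r\n",
	})
}
```

Each record is self-contained, so the file can be tailed and turned into reports after the fact or in real time, similar to what vegeta does with its output files.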

@buger to be clear, are you saying you want to build this as a middleware? you mention "built-in" but i'm not sure whether that means as middleware or not. do you have concrete plans to work on this? hopefully i can find some time to help out, but i can't promise anything.

@ecourreges-orange

Hello Dieter,

This page briefly explains the filters:
https://github.com/Orange-OpenSource/mod_dup/wiki/mod_comp_exploit

And following your question, I have started this page to explain the output formats:
https://github.com/Orange-OpenSource/mod_dup/wiki/mod_compare_logs

Regards,
Emmanuel Courrèges


@buger
Owner

buger commented Apr 20, 2016

Thank you all for such detailed feedback! I need some time to read all this stuff :)

@ecourreges-orange not middleware, i want it to be built in.

@ecourreges-orange one more question: using mod_dup, how do you handle dynamic variables like user session ids or auth tokens, which are generated randomly on "production" and will not be the same on staging?

@ecourreges-orange

  • For now we have only used mod_compare on stateless APIs without authentication.
  • I don't know how session ids could be handled. Especially if you duplicate only a percentage of production, you will probably not duplicate the session-initiation request, so it's a problem even if you don't compare!
  • As for authentication tokens, we don't have any issue because we actually duplicate to a "hidden production" which has the same authentication scope/keytab as production, so a token that works for prod also works for hidden prod.
    ◦ The question remains for duplicating to staging. I don't know how to solve this without compromising security => staging probably needs to have auth disabled, in which case you'll only see differences when there is an auth error on prod?
    ◦ We duplicate to hidden production so that the APIs hit the exact same backends. If you hit your staging server, your databases are probably not fully in sync, so you will get a lot of false-positive differences.
    ◦ Of course, to make sure our new code doesn't break our production database, we first test it in pre-production/bench/development.
  • If we just need to ignore headers that are different (like Date, etc.), we use the header-ignore functionality like this: HeaderList "IGNORE" "Date" "." (where "." is a regexp). A sketch of how an equivalent rule might look in code is below.
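
Tying this back to the earlier question about what filters in gor could look like: below is a minimal sketch, in Go, of applying such an ignore rule (header name plus value regexp) before diffing. The IgnoreRule type and stripIgnored function are hypothetical, not part of gor or mod_compare:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// IgnoreRule drops a header from the comparison when its value matches Pattern.
type IgnoreRule struct {
	Header  string
	Pattern *regexp.Regexp
}

// stripIgnored removes ignored headers from a raw HTTP head (status line plus
// headers) so that e.g. Date never produces a false-positive diff.
func stripIgnored(head string, rules []IgnoreRule) string {
	var kept []string
	for i, line := range strings.Split(head, "\r\n") {
		if i == 0 { // always keep the status line
			kept = append(kept, line)
			continue
		}
		name, value, found := strings.Cut(line, ":")
		ignore := false
		if found {
			for _, r := range rules {
				if strings.EqualFold(strings.TrimSpace(name), r.Header) &&
					r.Pattern.MatchString(strings.TrimSpace(value)) {
					ignore = true
					break
				}
			}
		}
		if !ignore {
			kept = append(kept, line)
		}
	}
	return strings.Join(kept, "\r\n")
}

func main() {
	rules := []IgnoreRule{{Header: "Date", Pattern: regexp.MustCompile(".")}}
	a := "HTTP/1.1 200 OK\r\nDate: Mon, 18 Apr 2016 10:00:00 GMT\r\nContent-Type: application/json"
	b := "HTTP/1.1 200 OK\r\nDate: Mon, 18 Apr 2016 10:00:07 GMT\r\nContent-Type: application/json"
	fmt.Println(stripIgnored(a, rules) == stripIgnored(b, rules)) // prints true
}
```

Body-level exceptions would need a similar mechanism, e.g. a regexp or JSON path applied to the body before comparison.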
    


@Dieterbe
Author

Dieterbe commented May 3, 2016

also relevant: https://github.com/dnaeon/go-vcr

@joekiller
Contributor

I generally used the elasticsearch results to validate responses, as they include the response code.

I also recently updated the tool to work with ES 2.x: #333

@buger added the "question" label on Aug 1, 2016
@jauco

jauco commented Aug 30, 2016

Hey,

I wrote a gor middleware in Java to do this. It's fairly untested, but I will be running it against our website in the coming weeks. Code is at https://github.com/HuygensING/gor-tester

cheers.

Jauco

@buger
Owner

buger commented Aug 30, 2016

Terrific! Thank you so much for sharing it!

@mayank-unbxd

@jauco were you able to validate responses against your staging/test server?

@buger - pretty excited to know if you have built something around this:

> Hello! I think this is something that should be included in the default installation. I made some steps in this direction, and currently you can implement it on your own using middleware (as it has access to both the request and the response).

can you share something on this ...

@buger
Owner

buger commented Jul 18, 2018

@mayank-unbxd this doc explains how to access the response and validate it: https://github.com/buger/goreplay/tree/master/middleware
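
One note that may save some time, hedged because flag names have changed between versions: the middleware only receives responses when response tracking is enabled, so the invocation looks roughly like `gor --input-raw :8080 --input-raw-track-response --output-http http://staging.example.com --output-http-track-response --middleware "./your-middleware"`. Here staging.example.com and ./your-middleware are placeholders, and `gor -h` shows the exact flags for a given version.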
