Karate-Gatling - large delay between the first active user(s) and the first request(s) #1395

Closed
theathlete3141 opened this issue Dec 10, 2020 · 13 comments

@theathlete3141

Using karate version 0.9.9.RC1

Using a basic feature file

Feature:

  Background:
    * url karate.properties['server.url']

  Scenario:
    Given path 'test'
    When method get

With a basic mock server

Feature:

  Scenario: pathMatches('test')

With a basic simulation:

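import com.intuit.karate.core.MockServer
import com.intuit.karate.gatling.PreDef._
import io.gatling.core.Predef._

import scala.concurrent.duration._

class MySimulation extends Simulation {

  // Start the Karate mock server in the same JVM as the simulation
  val server = MockServer.feature("classpath:example/mockServer.feature").build()
  val serverUrl = "http://localhost:" + server.getPort
  // Expose the mock URL to the feature file via karate.properties
  System.setProperty("server.url", serverUrl)

  val example = scenario("example")
    .exec(karateFeature("classpath:example/example.feature"))

  setUp(
    example.inject(rampUsers(100) during (10 seconds))
  )
}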

Looking in the index.html test report, I found that there was a large delay between the first active user(s) and the first request(s).

Attached is an example project which demonstrates four test cases:

  1. Karate-Gatling simulation with Karate mock server
  2. Karate-Gatling simulation with non-Karate mock server
  3. Gatling simulation with Karate mock server
  4. Gatling simulation with non-Karate mock server
    example.zip

For a non-Karate mock server I used Python's built-in HTTP server. In a separate terminal window run the following (assuming Python 3):

python -m http.server

This serves the contents of that directory on localhost:8000. When running the tests, ignore the 404s seen here, as the requests are not expected to succeed.

Then in a new terminal window run the four test cases:

mvn clean test -D gatling.simulationClass=example.KarateSimulation -D karate.env=mock
mvn test -D gatling.simulationClass=example.KarateSimulation
mvn test -D gatling.simulationClass=example.GatlingSimulation -D karate.env=mock
mvn test -D gatling.simulationClass=example.GatlingSimulation

In test cases 2 and 4 against the non-Karate mock server, ignore the errors where it fails to parse html as xml.

Attached are screenshots of the reports from test cases 1 and 3 which demonstrate the effect that making requests with Karate has on the results. As well as the initial pause seen in test case 1, the number of requests per second once it finally starts going is unexpectedly uneven.
karatesimulation-20201210144917763_index html
gatlingsimulation-20201210144959296_index html

This delay is only seen in test cases 1 and 2. Having run these tests a number of times, I found the delay is often a little longer in test case 1 than in test case 2, which could indicate that the issue is partly related to the Karate mock server. However, no delay is seen in test case 3, so the issue must (mostly) be to do with the Karate aspect of Karate-Gatling rather than the Karate mock server.

Ignore the fact that the response times of these initial requests in each test case are very slow - this appears to be an unrelated issue, and seems to be to do with Gatling rather than Karate.

@ptrthomas
Member

@theathlete3141 I'm inclined to mark this as won't fix since it is only for the first request, it could be the warm-up time of the JS engine. also I don't consider this a priority, there's plenty of backlog on the 1.0 release. any investigation from anyone will be greatly appreciated.

@theathlete3141
Author

This issue has two parts:

  • the delay between users becoming active and making their first requests
  • the unevenness in the request per second

The warm-up time of the JS engine sounds like a good suggestion for the cause behind the first part, in which case the solution could be to delay the user activation until everything is initialised.

The second part seems a little more concerning. In this example the user spawn rate is low (10 per second) and the response time is short (< 1 second). Therefore the Gatling example behaves as expected in that the number of active users is always ~10:

  • 10 users become active in a second
  • Each makes its request, receives its response and then terminates in that same second
  • In the next second another 10 users become active

In the Karate-Gatling example however, even after the apparent warm up period has ended and the first request has finally been made, it looks like we sometimes have active users sitting around not doing anything.
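The per-second reasoning above can be checked with a toy model in plain Scala. This is only a sketch: the object name, the method name, and the 200 ms and 3000 ms lifetimes are assumptions chosen for illustration, and the bucketing is a simplification rather than how Gatling itself computes its active-users chart.

```scala
object ActiveUsersModel {

  /** For each one-second bucket, count the users whose lifetime overlaps it.
    * Users start evenly spaced across the ramp and each stays active for `activeMs`.
    */
  def activePerSecond(users: Int, rampMs: Long, activeMs: Long): Vector[Int] = {
    require(users > 0, "need at least one user")
    // Evenly spaced start times in milliseconds (integer maths avoids rounding noise).
    val starts = (0 until users).map(i => i * rampMs / users)
    val lastEnd = starts.last + activeMs
    // Number of one-second buckets needed to cover the last user's lifetime.
    val buckets = ((lastEnd + 999) / 1000).toInt
    Vector.tabulate(buckets) { k =>
      val from = k * 1000L
      val to = from + 1000L
      // A user with lifetime [s, s + activeMs) overlaps the bucket [from, to).
      starts.count(s => s < to && s + activeMs > from)
    }
  }

  def main(args: Array[String]): Unit = {
    // 100 users ramped over 10 s, each assumed to be active for 200 ms.
    println(activePerSecond(100, 10000L, 200L).mkString(" "))
    // Same ramp, but each user assumed to linger for 3 s before finishing.
    println(activePerSecond(100, 10000L, 3000L).mkString(" "))
  }
}
```

With a short lifetime the count per bucket stays close to the arrival rate of 10. With the longer assumed lifetime it climbs well above that, which is the shape one would expect if users became active but then waited before making their request.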

@ptrthomas
Member

@theathlete3141 yes this is a concern, but I'm confident it could be an optimization we have missed. I don't remember if some changes I made to inject an executor service into the feature-runtime are in 0.9.9.RC1. the other thing is that HTML reports and step-results collection may be happening in memory. also, a long-pending optimization we haven't done: the apache http client is still synchronous.

one data point would be useful. do you see the same behavior in 0.9.6 ?

@adrian861

@theathlete3141 I know this was just for demonstration, but I'm curious if you see the same behavior over a longer test run, say a minute? Also do you see the same delay with atOnceUsers? That command should produce immediate requests from every user.

@theathlete3141
Author

TLDR: the same issues are seen when using a longer test run, and also when using a different user injection method. The issues are not present in version 0.9.6.

Running KarateSimulation with a Karate mock server using rampUsers(1000) during (100 seconds)
karatesimulation-20201210163606640_index html

Running KarateSimulation with a Karate mock server using atOnceUsers(100)
karatesimulation-20201210164725560_index html

Using Karate 0.9.6 and running KarateSimulation with a Karate mock server using rampUsers(1000) during (100 seconds)
example.zip
karatesimulation-20201210165230105_index html
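
For reference, the atOnceUsers(100) variation mentioned above might look roughly like the sketch below. It mirrors the simulation from the original report with only the injection profile swapped. The class name is hypothetical, and the simulation in the attached project may differ. It needs the karate-gatling dependency on the classpath.

```scala
package example

import com.intuit.karate.core.MockServer
import com.intuit.karate.gatling.PreDef._
import io.gatling.core.Predef._

// Hypothetical class name; the simulation in the attached project may differ.
class AtOnceSimulation extends Simulation {

  // Start the Karate mock in the same JVM and expose its URL to the feature file.
  val server = MockServer.feature("classpath:example/mockServer.feature").build()
  System.setProperty("server.url", "http://localhost:" + server.getPort)

  val example = scenario("example")
    .exec(karateFeature("classpath:example/example.feature"))

  setUp(
    // All 100 users are injected at the start rather than ramped over time.
    example.inject(atOnceUsers(100))
  )
}
```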

@ptrthomas
Member

Issues not present using version 0.9.6.

great. I'll try to take a look over the weekend but any investigation in the meantime will help

@ptrthomas
Member

ptrthomas commented Dec 11, 2020

@theathlete3141 can you do one more experiment. start the karate mock in a separate JVM (separate process like you do the python one) let me know if there is a difference.

    public static void main(String[] args) {
        MockServer server = MockServer
                .feature("classpath:com/intuit/karate/core/mock/_simple.feature")
                .http(8080).build();
        server.waitSync();
    }

@ptrthomas
Member

ok, so with the karate mock running in a different process, here are my findings. 60 second test

mock:

Feature:

Scenario: pathMatches('/test')
* def response = { success: true }

gatling:

package example

import io.gatling.core.Predef._
import io.gatling.http.Predef._

import scala.concurrent.duration._

class GatlingSimple extends Simulation {

  val protocol = http.baseUrl("http://localhost:8080")

  val example = scenario("example")
    .exec(
      http("GET /test")
        .get("/test")
        .check(status.is(200))
    )
  setUp(
    example.inject(rampUsers(100) during (60 seconds)).protocols(protocol)
  )
}

results:
[image: screenshot of the Gatling report]

karate / simple.feature:

Feature:

  Scenario:
    * url 'http://localhost:8080'
    * path 'test'
    * method get
    * status 200
    * match response == { success: true }

karate-gatling:

package example

import com.intuit.karate.core.MockServer
import com.intuit.karate.gatling.PreDef._
import io.gatling.core.Predef._

import scala.concurrent.duration._
import scala.util.Properties

class KarateSimple extends Simulation {

  val example = scenario("example")
    .exec(karateFeature("classpath:example/simple.feature"))

  setUp(
    example.inject(rampUsers(100) during (60 seconds))
  )
}

result:
[image: screenshot of the Karate-Gatling report]

so I think we are good. closing. do see if you can find what weird threading or thread-local weirdness is happening when the mock is run by the scala JVM

@theathlete3141
Author

@ptrthomas I don't think this issue is related to how the mock server is being run. It appears to be related to user arrival rate. In your simulation above your user count and duration are such that there are only a couple of users per second, whereas in my tests it was of the order of 10 per second.
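
To make the arrival-rate comparison concrete, here is a minimal sketch in plain Scala that computes the average number of new users per second for the injection profiles mentioned in this thread. The object and method names are made up for illustration and are not part of Gatling or Karate.

```scala
import scala.concurrent.duration._

object ArrivalRates {

  /** Average new users per second for a linear ramp of `users` over `window`. */
  def usersPerSecond(users: Int, window: FiniteDuration): Double =
    users * 1000.0 / window.toMillis

  def main(args: Array[String]): Unit = {
    // Injection profiles quoted in this thread.
    val profiles = Seq(
      "rampUsers(100) during (10 seconds)"   -> usersPerSecond(100, 10.seconds),
      "rampUsers(100) during (60 seconds)"   -> usersPerSecond(100, 60.seconds),
      "rampUsers(600) during (60 seconds)"   -> usersPerSecond(600, 60.seconds),
      "rampUsers(1000) during (100 seconds)" -> usersPerSecond(1000, 100.seconds)
    )
    profiles.foreach { case (name, rate) =>
      println(f"$name%-40s $rate%.2f users/s")
    }
  }
}
```

Only the 60 second ramp of 100 users falls below two users per second. The other profiles all work out to 10 users per second.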

Taking your above example but putting the mock server setup back in the simulation class

class KarateSimulation1 extends Simulation {

  val server = MockServer.feature("classpath:example/mockServer.feature").build()
  val serverUrl = "http://localhost:" + server.getPort
  System.setProperty("server.url", serverUrl)

  val example = scenario("example")
    .exec(karateFeature("classpath:example/example.feature"))

  setUp(
    example.inject(rampUsers(100) during (60 seconds))
  )
}

example.feature

Feature:

  Background:
    * url karate.properties['server.url']

  Scenario:
    Given path 'test'
    When method get
    Then status 200
    And match response == { success: true }

mockServer.feature

Feature:

  Scenario: pathMatches('test')
    * def response = { success: true }

Running with

mvn test -D gatling.simulationClass=example.KarateSimulation1

I see the same pattern of active users, requests and responses as you reported above.
It should be noted that even in this case a minor delay can be seen: in your Karate example the first requests are sent one second after the first users become active. I would have expected this to happen within the same second, as in your Gatling example.
This is clearer when you compare the number of requests per second with the number of active users. In your Gatling example above they align (i.e. peaks align with peaks and troughs with troughs), whereas in your Karate example the number of requests lags behind the number of active users.

If I ramp up the same number of users over a shorter duration (rampUsers(100) during (10 seconds))

mvn test -D gatling.simulationClass=example.KarateSimulation2

Or ramp more users over the same duration (rampUsers(600) during (60 seconds))

mvn test -D gatling.simulationClass=example.KarateSimulation3

Then this delay becomes much more obvious, similar to that shown in my original examples.

Example project attached. Just in case the 0.9.9.RC1 tagged version doesn't match the HEAD of develop, I have also tried the above against the source to include your recent commit.
example.zip

@ptrthomas
Member

ptrthomas commented Dec 14, 2020

@theathlete3141 yes I tried with a higher number of users per second, things smooth out after the first 10 or so seconds. the lag you mentioned is also random and does not happen on every run.

considering this a won't fix unless convinced. meanwhile feel free to investigate and find solutions while we work on other areas. also the moment you increase users drastically, all bets are off but there are ways to mitigate - see:

and before you say so - yes pure gatling behaves differently: https://stackoverflow.com/a/63178818/143475

@ptrthomas
Member

@theathlete3141 one more point - it does seem the new karate mock does not perform well for the first few seconds. if you request to re-open kindly focus on testing against a non-karate mock to demo that there is a significant lag or overhead

theathlete3141 pushed a commit to theathlete3141/karate that referenced this issue Dec 21, 2020
@theathlete3141
Author

I found that this issue occurs even when using a real URL (http://date.jsontest.com). I've attached an example project (using 0.9.9.RC2).
example.zip

Karate simulation against a live server:

mvn test -D gatling.simulationClass=example.KarateSimulation

Against a mock server:

mvn test -D gatling.simulationClass=example.KarateSimulation -D karate.env=mock

The equivalent using Gatling:

mvn test -D gatling.simulationClass=example.GatlingSimulation
mvn test -D gatling.simulationClass=example.GatlingSimulation -D karate.env=mock

However, I believe that I've found the issue and will submit a PR.

ptrthomas added a commit that referenced this issue Dec 21, 2020
#1395 - fix delay between the first active user(s) and the first request(s) in Karate-Gatling
@ptrthomas ptrthomas added this to the 1.0.0 milestone Dec 21, 2020
@ptrthomas ptrthomas added codequality and removed bug labels Dec 21, 2020
@ptrthomas
Member

@theathlete3141 thanks for the PR, I'll keep this closed as this pertains to an in-dev version
