[ServiceBus] Align stress tests to cross-language min-bar before GA #14338

Closed
KieranBrantnerMagee opened this issue Oct 7, 2020 · 0 comments · Fixed by #14612 or #14437
Labels: Client, Service Bus

KieranBrantnerMagee commented Oct 7, 2020

Towards GA fit-and-finish: Ensure our stress test coverage is on par with other SDK priorities.

Message lock renewal

Keep sending messages in a stream, keep receiving them and
○ manually keep renewing the lock for X duration
○ Use auto renew for X duration

Variation - load the queue with a set of messages initially

Snapshot
- Time stamp
- Number of operations performed
- Number of successes in X duration
- Number of failures in X duration
- Errors seen (Dump all the errors seen in a separate file at the end)
- Also include the snapshot from scenario 4
- Memory consumed
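
The snapshot fields above could be collected with a small helper like the one below. This is only a sketch using the standard library: the class name `StressSnapshot` and its methods are illustrative and not part of any SDK, and "memory consumed" is approximated with `tracemalloc`, which reports Python-level allocations only and reads zero unless `tracemalloc.start()` has been called.

```python
import json
import time
import tracemalloc
from dataclasses import dataclass, field


@dataclass
class StressSnapshot:
    """Running counters for one stress scenario (illustrative names)."""
    started: float = field(default_factory=time.monotonic)
    operations: int = 0
    successes: int = 0
    failures: int = 0
    errors: list = field(default_factory=list)

    def record(self, operation):
        """Run one zero-argument callable and count its outcome."""
        self.operations += 1
        try:
            result = operation()
        except Exception as exc:  # one failure must not end the stress run
            self.failures += 1
            self.errors.append(f"{type(exc).__name__}: {exc}")
            return None
        self.successes += 1
        return result

    def report(self):
        """Return the current snapshot as a plain dict."""
        current_bytes, _peak_bytes = tracemalloc.get_traced_memory()
        return {
            "timestamp": time.time(),
            "elapsed_seconds": time.monotonic() - self.started,
            "operations": self.operations,
            "successes": self.successes,
            "failures": self.failures,
            "memory_bytes": current_bytes,
        }

    def dump_errors(self, path):
        """Write every error seen so far to a separate file."""
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(self.errors, handle, indent=2)
```

A lock-renewal test would wrap each renew call in `record(...)`, print `report()` periodically, and call `dump_errors(...)` once at the end.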

More thoughts
- Expectation is that it never fails within the X duration; observe whether it does
- How many messages can we handle lock renewals for?
- Lock renewals over a long duration - reliability

Session lock renewal

Multiple sessions
- manually keep renewing the lock for X duration
- Use auto renew for X duration

Similar to above... but on session lock

Single sender

○ Loop over sendMessages for X duration with Y delay in between
○ Loop over Z parallel sendMessages for X duration with Y delay

Large messages
• Array of messages
• Batch message

Snapshot
○ Time stamp
○ Number of messages sent so far
○ Number of messages per sec
○ Number of sends per sec
○ Number of successes in X duration
○ Number of failures in X duration
○ Errors seen (Dump all the errors seen in a separate file at the end)

More Thoughts
○ Client should handle multiple sends in parallel. How many?
○ Client should keep sending for a long duration; watch for any failures - reliability
○ Stretch goal - send latency (requires internal instrumentation; account only for the time the SDK takes, ignoring service/network latencies)
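
The send loop with X duration, Y delay and Z parallel sends might be driven by something like the following. It is a sketch under assumptions: `send` is a hypothetical zero-argument callable that would wrap the real send call, and threads are used for the parallel case purely for illustration (an asyncio-based client would need a different driver).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_send_loop(send, duration_s, delay_s, parallel=1):
    """Call send() `parallel` times per round until duration_s has elapsed.

    `send` is a hypothetical callable wrapping the real send operation.
    Returns (successes, failures, errors).
    """
    successes, failures, errors = 0, 0, []
    deadline = time.monotonic() + duration_s
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        while time.monotonic() < deadline:
            futures = [pool.submit(send) for _ in range(parallel)]
            for future in futures:
                exc = future.exception()  # blocks until this send finishes
                if exc is None:
                    successes += 1
                else:
                    failures += 1
                    errors.append(repr(exc))
            time.sleep(delay_s)  # the Y delay between rounds
    return successes, failures, errors
```

Dividing the returned counts by the elapsed time gives the sends-per-second figure from the snapshot list.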

Single Receiver
(Note the sequence numbers and match them against the received ones)

Keep sending messages in a stream, keep receiving them with a single receiver
○ ReceiveBatch in a loop with a single receiver for X duration (X = 3 hrs)
§ peekLock (random settlement method)
§ receiveAndDelete
§ maxMessageCount = 1 and Y
○ Streaming receiver left open for X duration to keep receiving the messages
§ peekLock (random settlement method)
§ receiveAndDelete
§ maxConcurrentCalls = 1 and Y
(As you increase the number, it should scale up)
Validation

Snapshot
○ Time elapsed
○ Number of messages sent so far
○ Number of messages received so far
○ Number of messages sent/received per sec
○ Number of successes in sending/receiving in X duration
○ Number of failures in sending/receiving in X duration
○ Number of messages per sec
○ Number of sends per sec
○ Number of receives per sec
○ Errors seen (Dump all the errors seen in a separate file at the end)

More Thoughts
○ Expectation is that we don't lose messages
○ Receiver is capable of receiving all the messages without breaking in between - reliability
○ Receive latency
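
The "we don't lose messages" expectation can be checked by comparing what was sent against what was received. The helper below is a minimal sketch: `check_delivery` is an illustrative name, and it works on whatever identifier the test tracks per message (for example a counter stamped by the sender, which is an assumption here rather than something the issue prescribes).

```python
from collections import Counter


def check_delivery(sent_ids, received_ids):
    """Compare identifiers of sent messages with those actually received."""
    received_counts = Counter(received_ids)
    sent_set = set(sent_ids)
    received_set = set(received_counts)
    return {
        # sent but never received: a lost message
        "missing": sorted(sent_set - received_set),
        # received more than once: a redelivery (expected after an abandon)
        "duplicates": sorted(i for i, n in received_counts.items() if n > 1),
        # received but never sent by this run: leftovers from an earlier run
        "unexpected": sorted(received_set - sent_set),
    }
```

With a random settlement method in peekLock mode, duplicates are legitimate whenever a message was abandoned, so only a non-empty `missing` list should fail the run.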

Any of the managementLink operations
(Validate the sequence numbers)

Keep making peekMessage calls for X duration with Y delay in between

Snapshot
○ Time elapsed
○ Number of messages sent
○ Number of messages peeked so far
○ Number of successes in X duration
○ Number of failures in X duration
○ Errors seen (Dump all the errors seen in a separate file at the end)

More Thoughts
○ Stressing the managementLink
○ Difference from scenario 1 is that this deals with the data
○ Use fromSequenceNumber API
○ Client should keep working for a long duration; watch for any failures - reliability
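
Peeking from a sequence number and validating the order could look roughly like this. It is a sketch, not the SDK's API: `peek` is a hypothetical callable taking `(from_sequence_number, count)` and returning the sequence numbers of the peeked messages, and in a real test it would wrap the client's peek call.

```python
def peek_all(peek, start=0, page_size=10, max_calls=1000):
    """Page through an entity with repeated peeks, validating sequence order.

    `peek(from_sequence_number, count)` is a hypothetical callable that
    returns a list of sequence numbers in ascending order.
    """
    seen = []
    next_seq = start
    for _ in range(max_calls):
        page = peek(next_seq, page_size)
        if not page:
            break  # nothing left at or after next_seq
        for seq in page:
            if seen and seq <= seen[-1]:
                raise AssertionError(
                    f"sequence number {seq} is not after {seen[-1]}"
                )
            seen.append(seq)
        next_seq = seen[-1] + 1  # continue just past the last peeked message
    return seen
```

Adding the Y delay between calls and a deadline check turns this into the "peek for X duration" loop.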

Relaxed tests - X is relatively longer (1hr/1day)

Do an operation, wait for X duration, do the operation again, repeat
Operation can be
○ Send
○ Receive
min_duration < X < max_duration

Snapshot
- Same as scenario 4

More Thoughts
- Implementation-wise, scenario 3 and 4 would cover this
- Expectation is that the operation doesn't fail even if done after longer idle intervals
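
The "do an operation, wait, repeat" pattern with a random idle time between the minimum and maximum duration could be sketched as follows. The function name and parameters are illustrative; `operation` is a hypothetical callable wrapping a send or receive, and `sleep` is injectable so the loop can be exercised without waiting an hour.

```python
import random
import time


def run_relaxed(operation, rounds, min_idle_s, max_idle_s, sleep=time.sleep):
    """Run operation, idle for a random interval within the bounds, repeat.

    Returns one entry per round: None on success, the exception on failure.
    """
    outcomes = []
    for round_index in range(rounds):
        try:
            operation()
            outcomes.append(None)
        except Exception as exc:  # record and keep going
            outcomes.append(exc)
        if round_index < rounds - 1:
            # idle between operations; no wait is needed after the last one
            sleep(random.uniform(min_idle_s, max_idle_s))
    return outcomes
```

Any non-`None` entry after a long idle interval is the failure this scenario is meant to surface.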

Closes and Opens

Create, open, close in sequence - repeat for X duration
○ Sender
○ Receiver
○ Session receiver

Variation - add closing the client too

Snapshot
- Include snapshot from scenario 4
- For senders/receivers/session-receivers on a single client
○ Number of close() calls made
○ Number of failures for close()
○ Number of successes for close()
○ Number of create() calls
○ Number of failures for create()
○ Number of successes for create()
- Errors seen (Dump all the errors seen in a separate file at the end)

More Thoughts
- Expectation is that closes and opens are graceful - observe if it fails
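
The create/close counters listed in the snapshot might be gathered with a loop like this one. It is a sketch under assumptions: `create` is a hypothetical factory returning a sender, receiver or session receiver, and the only thing required of the returned object is a `close()` method.

```python
import time
from collections import Counter


def churn(create, duration_s):
    """Create and close an entity repeatedly, counting calls and outcomes."""
    counts = Counter()
    errors = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        counts["create_calls"] += 1
        try:
            entity = create()
            counts["create_successes"] += 1
        except Exception as exc:
            counts["create_failures"] += 1
            errors.append(repr(exc))
            continue  # nothing to close if creation failed
        counts["close_calls"] += 1
        try:
            entity.close()
            counts["close_successes"] += 1
        except Exception as exc:
            counts["close_failures"] += 1
            errors.append(repr(exc))
    return counts, errors
```

Running it once per entity type on a single client gives the per-entity numbers; the errors list is what would be dumped to a separate file at the end.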

Same as above + make minor calls like send, receive in between

Pull receive reconnect

- Iterator timeout - python specific
