[ServiceBus] Align stress tests to cross-language min-bar before GA #14338

KieranBrantnerMagee · 2020-10-07T20:04:42Z

Towards GA fit-and-finish: Ensure our stress test coverage is on par with other SDK priorities.

Message lock renewal

Keep sending messages in a stream, keep receiving them, and either:
○ Manually keep renewing the lock for X duration
○ Use auto renew for X duration

Variation - load the queue with a set of messages initially

Snapshot
- Time stamp
- Number of operations performed
- Number of successes in X duration
- Number of failures in X duration
- Errors seen (Dump all the errors seen in a separate file at the end)
- Also include the snapshot from scenario 4
- Memory consumed
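A minimal sketch of how the snapshot fields above could be collected, using only the standard library. The `StressSnapshot` class and its method names are hypothetical and are not taken from the existing stress test code; memory is read through `tracemalloc`, which is one possible choice among several.

```python
import time
import tracemalloc
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StressSnapshot:
    """Hypothetical container for the snapshot counters listed above."""
    started_at: float = field(default_factory=time.time)
    operations: int = 0
    successes: int = 0
    failures: int = 0
    errors: List[str] = field(default_factory=list)

    def record(self, error: Optional[Exception] = None) -> None:
        """Count one operation; a non-None error marks it as a failure."""
        self.operations += 1
        if error is None:
            self.successes += 1
        else:
            self.failures += 1
            self.errors.append(repr(error))

    def report(self) -> dict:
        """Return a point-in-time view of the counters.

        memory_bytes is 0 unless tracemalloc.start() was called earlier.
        """
        current, _peak = tracemalloc.get_traced_memory()
        return {
            "timestamp": time.time(),
            "operations": self.operations,
            "successes": self.successes,
            "failures": self.failures,
            "memory_bytes": current,
        }

    def dump_errors(self, path: str) -> None:
        """Write every error seen to a separate file at the end of the run."""
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("\n".join(self.errors))
```

Calling `report()` on a timer during the run and `dump_errors()` once at the end would match the split described in the list.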

More thoughts
- Expectation is that it never fails within the X duration; observe whether it does
- How many messages can we handle lock renewals for?
- Lock renewals over a long duration - reliability

Session lock renewal

Multiple sessions
- manually keep renewing the lock for X duration
- Use auto renew for X duration

Similar to above... but on session lock

Single sender

○ Loop over sendMessages for X duration with Y delay in between
○ Loop over Z parallel sendMessages for X duration with Y delay
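The two loops above could be driven by one helper, sketched here under assumptions: `send` is a hypothetical zero-argument callable wrapping whichever send call is under test, and `duration`, `delay`, and `parallel` stand in for X, Y, and Z. Threads are used for the parallel case; an asyncio variant would be equally plausible.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Tuple


def send_loop(send: Callable[[], None], duration: float, delay: float,
              parallel: int = 1) -> Tuple[int, int]:
    """Call `send` in rounds of `parallel` concurrent calls for `duration` seconds.

    Sleeps `delay` seconds between rounds and returns (successes, failures).
    """
    successes = failures = 0
    deadline = time.monotonic() + duration
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        while time.monotonic() < deadline:
            futures = [pool.submit(send) for _ in range(parallel)]
            for future in futures:
                # exception() blocks until the call finishes, then returns
                # the raised exception or None.
                if future.exception() is None:
                    successes += 1
                else:
                    failures += 1
            time.sleep(delay)
    return successes, failures
```

With `parallel=1` this is the plain loop; raising `parallel` gives the Z-way variant without changing the accounting.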

Large messages
• Array of messages
• Batch message

Snapshot
○ Time stamp
○ Number of messages sent so far
○ Number of messages per sec
○ Number of sends per sec
○ Number of successes in X duration
○ Number of failures in X duration
○ Errors seen (Dump all the errors seen in a separate file at the end)

More Thoughts
○ Client should handle multiple sends in parallel? How many?
○ Client should keep sending for a long duration; watch for any failures - reliability
○ Stretch goal - Send latency (requires internal instrumentation - Account only for the time the SDK takes ...to ignore service/network latencies)

Single Receiver
(Note the Sequence number and match with the received ones)

Keep sending messages in a stream, keep receiving them with a single receiver
○ ReceiveBatch in a loop with a single receiver for X duration (X = 3 hrs)
§ peekLock (random settlement method)
§ receiveAndDelete
§ maxMessageCount = 1 and Y
○ Streaming receiver left open for X duration to keep receiving the messages
§ peekLock (random settlement method)
§ receiveAndDelete
§ maxConcurrentCalls = 1 and Y
(As you increase the number, it should scale up)
Validation

Snapshot
○ Time elapsed
○ Number of messages sent so far
○ Number of messages received so far
○ Number of messages sent/received per sec
○ Number of successes in sending/receiving in X duration
○ Number of failures in sending/receiving in X duration
○ Number of messages per sec
○ Number of sends per sec
○ Number of receives per sec
○ Errors seen (Dump all the errors seen in a separate file at the end)

More Thoughts
○ Expectation is that we don't lose messages
○ Receiver is capable of receiving all the messages without breaking in between - reliability
○ Receive latency
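The "don't lose messages" expectation could be checked by comparing what was sent against what was received. This is a sketch only: `check_delivery` is a hypothetical helper, and the integer identifiers are assumed to be whatever the test uses to match messages (sequence numbers or an application-assigned id).

```python
from typing import Iterable, List, Set, Tuple


def check_delivery(sent: Iterable[int],
                   received: Iterable[int]) -> Tuple[Set[int], List[int]]:
    """Compare sent and received message identifiers.

    Returns (lost, duplicates): identifiers that were sent but never
    received, and identifiers that were received more than once.
    """
    sent_ids = set(sent)
    seen: Set[int] = set()
    duplicates: List[int] = []
    for seq in received:
        if seq in seen:
            duplicates.append(seq)
        seen.add(seq)
    lost = sent_ids - seen
    return lost, duplicates
```

For example, `check_delivery([1, 2, 3, 4], [1, 2, 2, 4])` reports `3` as lost and `2` as a duplicate. Duplicates are not necessarily failures in peekLock mode, so a test might only assert on the lost set.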

Any of the managementLink operations
(Validate the sequence numbers)

Keep making peekMessage calls for X duration with Y delay in between

Snapshot
○ Time elapsed
○ Number of messages sent
○ Number of messages peeked so far
○ Number of successes in X duration
○ Number of failures in X duration
○ Errors seen (Dump all the errors seen in a separate file at the end)

More Thoughts
○ Stressing the managementLink
○ Difference from scenario 1 is that this deals with the data
○ Use fromSequenceNumber API
○ Client should keep working for a long duration; watch for any failures - reliability
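One way the sequence-number validation for peeks might look, as a sketch. `peek` is a hypothetical callable that takes a starting sequence number and returns the sequence numbers of the peeked messages in order; it stands in for the real peek call and is not the SDK's signature. The duration and delay handling from the scenario is left out to keep the validation logic visible.

```python
from typing import Callable, List


def peek_all(peek: Callable[[int], List[int]], start: int = 1) -> List[int]:
    """Page through messages by sequence number until a peek returns nothing.

    Raises AssertionError if a peeked sequence number is not strictly
    greater than the previous one.
    """
    seen: List[int] = []
    cursor = start
    while True:
        batch = peek(cursor)
        if not batch:
            return seen
        for seq in batch:
            if seen and seq <= seen[-1]:
                raise AssertionError(
                    f"sequence went backwards: {seq} after {seen[-1]}")
            seen.append(seq)
        # Resume just past the last message seen.
        cursor = seen[-1] + 1
```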

Relaxed tests - X is relatively longer (1hr/1day)

Do an operation, wait for X duration, do the operation again, repeat
Operation can be
○ Send
○ Receive
min_duration < X < max_duration
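The relaxed pattern above can be sketched as a loop that picks a random idle interval each round. Everything here is illustrative: `operation` is a hypothetical zero-argument callable (a send or a receive), and `sleep` and `rng` are injectable only so the loop can be exercised without real waiting.

```python
import random
import time
from typing import Callable, List, Optional


def relaxed_run(operation: Callable[[], None], rounds: int,
                min_duration: float, max_duration: float,
                sleep: Callable[[float], None] = time.sleep,
                rng: Optional[random.Random] = None) -> List[Optional[Exception]]:
    """Run `operation`, idle for a random interval, and repeat `rounds` times.

    Returns one entry per round: None on success, or the exception raised.
    """
    rng = rng or random.Random()
    results: List[Optional[Exception]] = []
    for _ in range(rounds):
        try:
            operation()
            results.append(None)
        except Exception as exc:  # record the failure and keep going
            results.append(exc)
        # Idle interval X drawn between min_duration and max_duration.
        sleep(rng.uniform(min_duration, max_duration))
    return results
```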

Snapshot
- Same as scenario 4

More Thoughts
- Implementation-wise, scenarios 3 and 4 would cover this
- Expectation is that the operation doesn't fail even if done after longer idle intervals

Closes and Opens

Create, open, close in sequence - repeat for X duration
○ Sender
○ Receiver
○ Session receiver

Variation - add closing the client too

Snapshot
- Include snapshot from scenario 4
- For senders/receivers/session-receivers on a single client
○ Number of close() calls made
○ Number of failures for close()
○ Number of successes for close()
○ Number of create() calls
○ Number of failures for create()
○ Number of successes for create()
- Errors seen (Dump all the errors seen in a separate file at the end)
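The create/close counters above could be gathered with a loop like the following sketch. `create` is a hypothetical factory returning any object with a `close()` method (a sender, receiver, or session receiver); the counter key names are made up for illustration.

```python
import time
from collections import Counter
from typing import Callable


def cycle_open_close(create: Callable[[], object], duration: float) -> Counter:
    """Create and close a resource in sequence until `duration` seconds pass.

    Returns a Counter with call, success, and failure counts for both
    create() and close().
    """
    counts: Counter = Counter()
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        counts["create_calls"] += 1
        try:
            resource = create()
        except Exception:
            counts["create_failures"] += 1
            continue
        counts["create_successes"] += 1

        counts["close_calls"] += 1
        try:
            resource.close()
        except Exception:
            counts["close_failures"] += 1
        else:
            counts["close_successes"] += 1
    return counts
```

Because `Counter` returns 0 for missing keys, a clean run can be asserted with `counts["create_failures"] == 0 and counts["close_failures"] == 0`.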

More Thoughts
- Expectation is that closes and opens are graceful - observe if it fails

Same as above + make minor calls like send, receive in between

Pull receive reconnect

- Iterator timeout - python specific

KieranBrantnerMagee added the Service Bus and Client labels Oct 7, 2020

KieranBrantnerMagee added this to the [2020] November milestone Oct 7, 2020

KieranBrantnerMagee self-assigned this Oct 16, 2020

This was linked to pull requests Oct 26, 2020

[ServiceBus] Bring stress-test metrics up to cross-sdk parity. (primarily allow computing of trends) #14612

Merged

[ServiceBus] Add additional stress test coverage to ensure parity with cross-language priorities #14437

Merged

KieranBrantnerMagee closed this as completed in #14612 Oct 29, 2020

github-actions bot locked and limited conversation to collaborators Apr 12, 2023
