Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eventhub Stress Test Onboarding #28320

Merged

Conversation

m-redding
Copy link
Member

Contributing to the Azure SDK

Please see our CONTRIBUTING.md if you are not familiar with contributing to this repository or have questions.

For specific information about pull request etiquette and best practices, see this section.

@check-enforcer
Copy link

This pull request is protected by Check Enforcer.
For more information about how to run a pipeline against this pull request, see this.

Comment on lines 5 to 7
COPY /sdk/eventhub/Azure.Messaging.EventHubs/ /app/sdk/eventhub/Azure.Messaging.EventHubs/
COPY /eng /app/eng
COPY /tools /app/tools
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these also need to be ./ instead of /?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're referencing the root, and aren't relative to the current directory, so I believe they're correct as written.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benbp: Any thoughts on templatizing some of this? The inclusions here will be needed for any .NET library to build due to the engsys dependencies.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsquire I'm definitely in favor of common stress components, either global or language level. For this dockerfile, my initial thought would be to create an intermediate builder image, so you then you can do something like FROM azsdkengsys.azurecr.io/stress/dotnet/base AS build-env?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Development could be somewhat annoying if you're iterating on both, so we may want to provide a way to build multiple dockerfiles in sequence if it starts getting iterated on heavily.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the idea of the intermediate image.

# TODO: how to update registry path

Set-Location -Path ..\..\..\..\
Set-Location -Path .\artifacts\bin\Azure.Messaging.EventHubs.Stress\Debug\netcoreapp3.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script will fail the first time you run it if you have no eventhubs build.

```

### Running Tests
When tests are run locally, Azure resources need to be created prior to running the test. This can be done through the Azure CLI, an ARM template or bicep file, or the Azure Portal. The user is required to input the connection strings upon request on the command line when the test is being run. For more information about what resources are needed for each test, see the "Scenario Information" section below.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ckairen I think we should also support a "deploy resources only" type mode to make local testing easier with resources.

@@ -0,0 +1,165 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid checking this file in and also add a .gitignore for it (example) since it's an intermediate artifact.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Omitting it will help ensure we don't have confusion where someone who wants to make a change edits this template rather than the bicep file and gets confused when it doesn't work.

@description('The location of the resource group.')
param location string = resourceGroup().location

var eventHubsNamespace_var = 'eh-${resourceGroup().name}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you only use the resource group name here, it will be a little simpler to use the auto-generated namespace URL in the network blocking config, but it should still work fine with eh- we'll just have to remember to add it in other places.

@benbp
Copy link
Member

benbp commented Apr 21, 2022

A couple nits/suggestions but the stress configuration pieces LGTM.


Please see our [contributing guide](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/eventhub/Azure.Messaging.EventHubs/CONTRIBUTING.md) for more information.

![Impressions](https://azure-sdk-impressions.azurewebsites.net/api/impressions/azure-sdk-for-net%2Fsdk%2Feventhub%2FAzure.Messaging.EventHubs%2stress%2FREADME.png)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This png fails to load on the readme.

Copy link
Member

@jsquire jsquire Apr 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loads for me; it's one of our tracking pixels, so it will look like nothing. @benbp: Are you seeing it 404 or just nothing show up?

Copy link
Member

@benbp benbp Apr 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsquire it looks like the problem is a missing F character in the url encoding: EventHubs %2 stress. I added a commit suggestion below.

image

It looks like this on the readme, that's how I caught it:

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahhh.... I see it now. Yeah, that's on me. Thanks for the catch!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chrome/Edge do a nice job of fixing the malformed encoding. Checking in the browser worked.

@@ -5,7 +5,7 @@
"_generator": {
"name": "bicep",
"version": "0.4.1318.3566",
"templateHash": "9849786101028970025"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think because you previously checked this in, the .gitignore won't automatically remove it. So you have to manually remove it with git rm and commit the file deletion.

@jsquire
Copy link
Member

jsquire commented Apr 26, 2022

We should add the ConfigureAwait(false) to all the things at some point in the near future. It'll be helpful for ensuring that things keep flowing efficiently since we're in a resource-limited host environment.

sdk/eventhub/Azure.Messaging.EventHubs/stress/Dockerfile Outdated Show resolved Hide resolved

// Create the buffered producer client and register the success and failure handlers

var options = new EventHubBufferedProducerClientOptions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Buffered Producer has some adjustments to the standard defaults for retry policy if the options aren't explicitly passed. We should mirror them for the test, since that'll be our majority case.

Source: https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/eventhub/Azure.Messaging.EventHubs/src/Producer/EventHubBufferedProducerClient.cs#L68

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have ideas on how we can better detect and adjust retry configuration when the individual values aren't set, I'd be very interested in hearing them. The best that I can think of is creating a mirrored type inside the buffered options. Maybe we should explore that as a quick interlude between phase 1 and phase 2.

{
// If this catch is hit, it means the producer has restarted, collect metrics.

_metrics.Client.GetMetric(_metrics.ProducerRestarted).TrackValue(1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this increment the count or are we collecting these as individual events and then aggregating via Kusto?

Copy link
Member Author

@m-redding m-redding Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsquire The GetMetric method is cool it aggregates (in this case sums) up all the values sent via "TrackValue" and sends them all at once:
"In summary GetMetric() is the recommended approach since it does pre-aggregation, it accumulates values from all the Track() calls and sends a summary/aggregate once every minute. This can significantly reduce the cost and performance overhead by sending fewer data points, while still collecting all relevant information." ref
The long-term aggregation like total sum is done in Kusto.

@m-redding m-redding merged commit adf95c7 into Azure:mredding/stress-test-onboarding Apr 29, 2022
zhihaoxue pushed a commit to zhihaoxue/azure-sdk-for-net that referenced this pull request Jul 27, 2022
@m-redding m-redding deleted the eventhub-stress-tests branch January 10, 2024 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Event Hubs pillar-reliability The issue is related to reliability, one of our core engineering pillars. (includes stress testing) Stress This issue is related to stress testing, part of our reliability pillar.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants