[azure logs] add routing integration to use only one azure-eventhub input #11984
Conversation
💔 Build Failed
Failed CI Steps · History
As a first step, this PR introduces a v2 integration that can handle all Azure Logs using a single input.
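For context, a hedged sketch of what a single input feeding one entry data stream could look like; this is an illustration only, not the exact policy this PR generates, and the event hub name is taken from the test setup later in this thread:

```yaml
# Sketch only: one azure-eventhub input sends everything to a single
# entry data stream; routing then fans events out per dataset.
inputs:
  - type: azure-eventhub
    data_stream:
      dataset: azure.events              # entry data stream (see discussion below)
    eventhub: activitylogs               # example name from the test setup
    connection_string: ${EVENTHUB_CONNECTION_STRING}
```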
It doesn't make sense in this data stream.
Used only when users do not provide a custom storage_account_container name.
🚀 Benchmarks report
packages/azure/data_stream/events/elasticsearch/ingest_pipeline/default.yml
I reviewed the packages/azure/docs/events.md file and left a few minor editing suggestions.
If you need to collect raw events from Azure Event Hub, we recommend using the [Custom Azure Logs integration](https://www.elastic.co/docs/current/integrations/azure_logs), which provides more flexibility.
To learn more about the efficiency and routing enhancements introduced in version 1.20.0, please read the [Azure Logs (v2 preview)](https://www.elastic.co/docs/current/integrations/azure/events) documentation.
This is the events.md file that will be rendered, correct? Just the word "efficiency" made me read events.md again, trying to find a comparison between v1 and v2.
Maybe keep it general: "To learn more about the enhancements".
> This is the events.md file that will be rendered, correct?

Yes, the events.md file will eventually be available at https://www.elastic.co/guide/en/integrations/current/azure-events.html (right now it's a 404).

> Maybe keep it general: "To learn more about the enhancements"

Do you mean adding a link to the GitHub issues with more details, or adding a section that explains it? Should we extend the "What's new in v2 preview?" section?
Both can work. I would add some more details on why v2 is better (if you haven't already done that in the latest updates).
```yaml
- target_dataset: azure.provisioning
  if: ctx.event?.dataset == 'azure.provisioning'
  namespace:
    - "{{data_stream.namespace}}"
```
Does this mean that you also want the additional namespaces defined in the policy, along with the default?
Shouldn't we then have a namespace field in the manifest, like this?
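For illustration, a hypothetical sketch of such a namespace variable in the data stream manifest; the variable name, type, and default here are assumptions, not taken from this PR:

```yaml
# Hypothetical data_stream manifest.yml snippet (illustration only):
streams:
  - input: azure-eventhub
    vars:
      - name: namespace                  # assumed variable name
        type: text
        title: Namespace
        description: Namespace used by the routing rules for rerouted events.
        required: false
        default: default
```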
Uhm, good question. I don't know how it works under the hood, but the integration seems to manage data_stream.* on its own.
I ran some tests and everything works fine @zmoog! Some samples:

By the way, I did not manage to run the eh tool; it fails with the following error:

```sh
❯ env | grep EVENTHUB
EVENTHUB_CONNECTION_STRING=Endpoint=sb://gizas-test.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey= ....
EVENTHUB_NAME=activitylogs

❯ cat data_stream/provisioning/_dev/test/pipeline/test-provisioninglogs-raw.log | eh -v eventdata send-batch
Sending 3 events to activitylogs
Traceback (most recent call last):
  File "/opt/homebrew/bin/eh", line 8, in <module>
    sys.exit(cli())
    ~~~^^
  File "/opt/homebrew/lib/python3.13/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.13/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/opt/homebrew/lib/python3.13/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
  File "/opt/homebrew/lib/python3.13/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
           ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
  File "/opt/homebrew/lib/python3.13/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.13/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/opt/homebrew/lib/python3.13/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/homebrew/lib/python3.13/site-packages/eventhubs/cli.py", line 171, in send_batch
    batch = producer.create_batch()
  File "/opt/homebrew/lib/python3.13/site-packages/azure/eventhub/_producer_client.py", line 740, in create_batch
    self._get_max_message_size()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/opt/homebrew/lib/python3.13/site-packages/azure/eventhub/_producer_client.py", line 336, in _get_max_message_size
    )._open_with_retry()
    ~~~~~~~~~~~~~~~~^^
  File "/opt/homebrew/lib/python3.13/site-packages/azure/eventhub/_producer.py", line 155, in _open_with_retry
    return self._do_retryable_operation(self._open, operation_need_param=False)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.13/site-packages/azure/eventhub/_client_base.py", line 608, in _do_retryable_operation
    raise last_exception from None
azure.eventhub.exceptions.AuthenticationError: CBS Token authentication failed.
Status code: None
Error: client-error
CBS Token authentication failed.
Status code: None
```
While building the package I see the following:

```
2024/12/05 16:38:32 INFO Skipped errors: found 5 validation errors:
 1. references found in dashboard kibana/dashboard/azure-1e5c9b50-f24a-11ec-a5a8-bf965bcd5646.json: azure-671ff040-f24e-11ec-a5a8-bf965bcd5646 (search) (SVR00004)
 2. references found in dashboard kibana/dashboard/azure-280493a0-f1a1-11ec-a5a8-bf965bcd5646.json: azure-fb61c4c0-f1a1-11ec-a5a8-bf965bcd5646 (search) (SVR00004)
 3. references found in dashboard kibana/dashboard/azure-8731b980-f1aa-11ec-a5a8-bf965bcd5646.json: azure-252228a0-f1ab-11ec-a5a8-bf965bcd5646 (search) (SVR00004)
 4. references found in dashboard kibana/dashboard/azure-91224490-f1a6-11ec-a5a8-bf965bcd5646.json: azure-70cbce40-f1a7-11ec-a5a8-bf965bcd5646 (search) (SVR00004)
 5. references found in dashboard kibana/dashboard/azure-cad82b40-f251-11ec-a5a8-bf965bcd5646.json: azure-3d1466b0-f252-11ec-a5a8-bf965bcd5646 (search) (SVR00004)
```

(Although I see the `- SVR00004 # references found in dashboard` entry in the validation config, I'm not sure if you need those.) You can probably delete such references from the dashboards if they are not used.

Approving, as the rest looks good. Please have a look mainly at #11984 (comment).
Some minor suggestions:
Co-authored-by: Arianna Laudazzi <[email protected]>
@zmoog - What if a user enables v2 and does not disable v1? Do you think…
Yeah, this is something we need to address. I'll open an issue if I can't fix it quickly in this PR.
They may get duplicate events and increased contention among consumers.
Yeah, I agree. While reading it today, I noticed the text is not clear, and we should be more explicit. Let me share the updated version…
Co-authored-by: muthu-mps <[email protected]>
I added it accidentally. The dataset must be azure.events.
Co-authored-by: muthu-mps <[email protected]>
Oh, great point, I missed it! 🤦 — thank you for the heads up, I'm updating the package-spec version and I'll re-run the tests.
Stack version 8.13.0 is okay.
We finally found the time to update the integration with these long-awaited changes.
packages/azure/data_stream/events/elasticsearch/ingest_pipeline/default.yml
LGTM!
We bump it to 1.20.0 because the v2 integration is in preview.
Quality Gate failed: failed conditions.
💚 Build Succeeded
History
cc @zmoog
Package azure - 1.20.0 containing this change is available at https://epr.elastic.co/package/azure/1.20.0/
Proposed commit message

Switch the integration package from the one-input-per-data-stream model to the one-input model.

One input per data stream model:

One input model:

In the one-input model, there is only one azure-eventhub input running and sending events to the `events` data stream. In the `events` data stream, the ingest pipeline performs these tasks:

1. Discover the value of the `event.dataset` field using the `category` field in the event.
2. Use the `event.dataset` field to reroute the event to the target data stream.

The discovery process uses the following logic (a sketch follows below):

1. Set `event.dataset` to `azure.eventhub` (the generic integration).
2. Set `event.dataset` to `azure.platformlogs` (it's probably an Azure log).
3. Set `event.dataset` to a specific one like `azure.activitylogs` or `azure.signinlogs`.

After the discovery step, the routing rules use the `event.dataset` value to forward the events to the best available target data stream.
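As a rough illustration of that discovery logic, a fallback chain of `set` processors could look like the sketch below; the `azure.category` conditions and values are assumptions for illustration, not the actual pipeline shipped in this PR:

```yaml
processors:
  # Fallback: treat the event as generic Event Hub data by default.
  - set:
      field: event.dataset
      value: azure.eventhub
  # If the event carries a category, it is probably an Azure platform log.
  - set:
      field: event.dataset
      value: azure.platformlogs
      if: ctx.azure?.category != null           # assumed condition
  # Known categories map to a specific dataset.
  - set:
      field: event.dataset
      value: azure.signinlogs
      if: ctx.azure?.category == 'SignInLogs'   # assumed condition
```

Routing rules like the `target_dataset` example discussed earlier in this thread then use the resulting `event.dataset` to pick the destination data stream.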
Checklist

- I have updated the changelog.yml file.

Author's Checklist
How to test this PR locally
Bump the integration version
I'll update the version and changelog later to avoid conflicts. In the meantime, a simple bump would do:
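For example, a sketch of the kind of changelog entry, assuming the bump lands as 1.20.0 as discussed above (the final wording may differ):

```yaml
# packages/azure/changelog.yml (sketch; final entry may differ)
- version: "1.20.0"
  changes:
    - description: Add routing to use only one azure-eventhub input (v2 preview).
      type: enhancement
      link: https://github.com/elastic/integrations/pull/11984
```

The `version` field in packages/azure/manifest.yml would need the matching bump.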
Build and start a local stack
```sh
elastic-package build && elastic-package stack up -d -v --version 8.16.1
```
Install the Azure Logs package
Send test documents and check they are routed to the expected data stream
I use the eventhubs CLI tool to send JSON documents to an event hub.
Set up some environment variables (EVENTHUB_CONNECTION_STRING and EVENTHUB_NAME, as in the sample shown earlier in this thread):
We can then `cat` the test documents and pipe them into an event hub using `eh -v eventdata send-batch`:

```sh
cat data_stream/provisioning/_dev/test/pipeline/test-provisioninglogs-raw.log | eh -v eventdata send-batch
```
All supported documents should land in the expected data stream. Any unsupported document should land in the platform logs data stream.
Related issues
Screenshots
Integration setup
Sample from the integration docs