x-pack/metricbeat/module/openai: Add new module #41516
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
//// | ||
This file is generated! See scripts/mage/docs_collector.go | ||
//// | ||
|
||
:modulename: openai | ||
:edit_url: https://github.com/elastic/beats/edit/main/x-pack/metricbeat/module/openai/_meta/docs.asciidoc | ||
|
||
|
||
[[metricbeat-module-openai]] | ||
[role="xpack"] | ||
== openai module | ||
|
||
beta[] | ||
|
||
This is the openai module. | ||
|
||
|
||
|
||
:edit_url: | ||
|
||
[float] | ||
=== Example configuration | ||
|
||
The openai module supports the standard configuration options that are described | ||
in <<configuration-metricbeat>>. Here is an example configuration: | ||
|
||
[source,yaml] | ||
---- | ||
metricbeat.modules: | ||
- module: openai | ||
  metricsets: ["usage"] | ||
  enabled: false | ||
  period: 1h | ||
|
||
# # Project API Keys - Multiple API keys can be specified for different projects | ||
# api_keys: | ||
# - key: "api_key1" | ||
# - key: "api_key2" | ||
|
||
# # API Configuration | ||
# ## Base URL for the OpenAI usage API endpoint | ||
# api_url: "https://api.openai.com/v1/usage" | ||
# ## Custom headers to be included in API requests | ||
# headers: | ||
# - "k1: v1" | ||
# - "k2: v2" | ||
# ## Rate Limiting Configuration | ||
# rate_limit: | ||
# limit: 60 # requests per second | ||
# burst: 5 # burst size | ||
# ## Request timeout duration | ||
# timeout: 30s | ||
|
||
# # Data Collection Configuration | ||
# collection: | ||
# ## Number of days to look back when collecting usage data | ||
# lookback_days: 30 | ||
# ## Whether to collect usage data in realtime. Defaults to false because, | ||
# # given how OpenAI usage data is reported, realtime collection would add | ||
# # duplicate data to ES and make analytics harder. The best approach is to | ||
# # avoid realtime collection and collect only up to the last day (in UTC), | ||
# # so there is at most a 24h delay. | ||
# realtime: false | ||
---- | ||
|
||
|
||
[float] | ||
=== Metricsets | ||
|
||
The following metricsets are available: | ||
|
||
* <<metricbeat-metricset-openai-usage,usage>> | ||
|
||
include::openai/usage.asciidoc[] | ||
|
||
:edit_url!: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
//// | ||
This file is generated! See scripts/mage/docs_collector.go | ||
//// | ||
:edit_url: https://github.com/elastic/beats/edit/main/x-pack/metricbeat/module/openai/usage/_meta/docs.asciidoc | ||
|
||
|
||
[[metricbeat-metricset-openai-usage]] | ||
[role="xpack"] | ||
=== openai usage metricset | ||
|
||
beta[] | ||
|
||
include::../../../../x-pack/metricbeat/module/openai/usage/_meta/docs.asciidoc[] | ||
|
||
|
||
:edit_url: | ||
|
||
==== Fields | ||
|
||
For a description of each field in the metricset, see the | ||
<<exported-fields-openai,exported fields>> section. | ||
|
||
Here is an example document generated by this metricset: | ||
|
||
[source,json] | ||
---- | ||
include::../../../../x-pack/metricbeat/module/openai/usage/_meta/data.json[] | ||
---- | ||
:edit_url!: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
- module: openai | ||
  metricsets: ["usage"] | ||
  enabled: false | ||
  period: 1h | ||
|
||
# # Project API Keys - Multiple API keys can be specified for different projects | ||
# api_keys: | ||
# - key: "api_key1" | ||
# - key: "api_key2" | ||
|
||
# # API Configuration | ||
# ## Base URL for the OpenAI usage API endpoint | ||
# api_url: "https://api.openai.com/v1/usage" | ||
# ## Custom headers to be included in API requests | ||
# headers: | ||
# - "k1: v1" | ||
# - "k2: v2" | ||
# ## Rate Limiting Configuration | ||
# rate_limit: | ||
# limit: 60 # requests per second | ||
# burst: 5 # burst size | ||
# ## Request timeout duration | ||
# timeout: 30s | ||
|
||
# # Data Collection Configuration | ||
# collection: | ||
# ## Number of days to look back when collecting usage data | ||
# lookback_days: 30 | ||
# ## Whether to collect usage data in realtime. Defaults to false because, | ||
# # given how OpenAI usage data is reported, realtime collection would add | ||
# # duplicate data to ES and make analytics harder. The best approach is to | ||
# # avoid realtime collection and collect only up to the last day (in UTC), | ||
# # so there is at most a 24h delay. | ||
# realtime: false |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
This is the openai module. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
- key: openai | ||
title: "openai" | ||
release: beta | ||
description: > | ||
openai module | ||
fields: | ||
- name: openai | ||
type: group | ||
description: > | ||
fields: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one | ||
// or more contributor license agreements. Licensed under the Elastic License; | ||
// you may not use this file except in compliance with the Elastic License. | ||
|
||
// Package openai is a Metricbeat module that contains MetricSets. | ||
package openai |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
{ | ||
"@timestamp": "2017-10-12T08:05:34.853Z", | ||
"event": { | ||
"dataset": "openai.usage", | ||
"duration": 115000, | ||
"module": "openai" | ||
}, | ||
"metricset": { | ||
"name": "usage", | ||
"period": 10000 | ||
}, | ||
"openai": { | ||
"usage": { | ||
"data": { | ||
"aggregation_timestamp": "2024-11-04T05:01:00Z", | ||
"api_key_id": null, | ||
"api_key_name": null, | ||
"api_key_redacted": null, | ||
"api_key_type": null, | ||
"email": null, | ||
"n_cached_context_tokens_total": 0, | ||
@shmsr, @ishleenk17 - Are we okay with the field names? IMO, updating the field names something similar to
Yeah, we should keep fields more readable and in line with other LLM Integrations.
I was thinking of keeping it like this here and changing the names in the ingest pipelines in Integrations. What do you think?
If we want to have parity between Beats and Integrations, then ideally we should have similar field names.
But we don't want Beats to be used, right? So I felt that ingest pipelines in Integrations are also a good idea. But yes, no strong opinion. Doesn't matter much.
Ideally we try to keep the fields in the Beats module and Integrations the same.
Then if we are following that for other modules, let's do it here. I will rename the fields.
Are we doing the name change of the fields?
I have changed the names based on field names from the Azure OpenAI Elastic Integration docs, although only a few fields were similar. But yes, I have changed the field names. |
||
"n_context_tokens_total": 118, | ||
"n_generated_tokens_total": 35, | ||
"n_requests": 1, | ||
"operation": "completion-realtime", | ||
"organization_id": "org-dummy", | ||
"organization_name": "Personal", | ||
"project_id": null, | ||
"project_name": null, | ||
"request_type": "", | ||
"snapshot_id": "gpt-4o-realtime-preview-2024-10-01" | ||
} | ||
} | ||
}, | ||
"service": { | ||
"type": "openai" | ||
} | ||
} |
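To make the shape of the sample document concrete, here is a small Go sketch that decodes a subset of `openai.usage.data`. The struct and function names (`usageData`, `event`, `parseUsage`) are hypothetical, and the JSON keys are the ones shown in this revision of `data.json`; the review thread indicates the fields were later renamed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usageData mirrors a subset of openai.usage.data from the sample document.
// Pointer fields model values that are null in the sample.
type usageData struct {
	AggregationTimestamp  string  `json:"aggregation_timestamp"`
	APIKeyID              *string `json:"api_key_id"`
	NContextTokensTotal   int64   `json:"n_context_tokens_total"`
	NGeneratedTokensTotal int64   `json:"n_generated_tokens_total"`
	NRequests             int64   `json:"n_requests"`
	Operation             string  `json:"operation"`
	SnapshotID            string  `json:"snapshot_id"`
}

// event captures only the nesting needed to reach the usage data.
type event struct {
	OpenAI struct {
		Usage struct {
			Data usageData `json:"data"`
		} `json:"usage"`
	} `json:"openai"`
}

// parseUsage extracts the usage data from a raw event document.
func parseUsage(raw []byte) (usageData, error) {
	var ev event
	if err := json.Unmarshal(raw, &ev); err != nil {
		return usageData{}, err
	}
	return ev.OpenAI.Usage.Data, nil
}

// sample is an abridged copy of the example document above.
const sample = `{"openai":{"usage":{"data":{
  "aggregation_timestamp":"2024-11-04T05:01:00Z",
  "api_key_id":null,
  "n_context_tokens_total":118,
  "n_generated_tokens_total":35,
  "n_requests":1,
  "operation":"completion-realtime",
  "snapshot_id":"gpt-4o-realtime-preview-2024-10-01"}}}}`

func main() {
	d, err := parseUsage([]byte(sample))
	if err != nil {
		panic(err)
	}
	total := d.NContextTokensTotal + d.NGeneratedTokensTotal
	fmt.Printf("%s: %d request(s), %d tokens\n", d.SnapshotID, d.NRequests, total)
}
```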
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
This is the usage metricset of the module openai. |
Is this to be changed to 12 as well?
Why have we changed the limit from 60 to 12?
I thought 60 was the agreed-upon limit?

I was testing everything from scratch today, and thoroughly. I noticed a slower rate of firing of requests. My understanding of `limit` and `burst` was confused, and I had put incorrect values there, which I have now corrected. This part of the doc needs to be updated with `make update`; I will run that. All the other doc files are updated. So nothing changed; it's just that it wasn't configured properly by default. The rate limit is still 5 req/min as per OpenAI.