feat(spans): Global config for AI model costs #3579

Merged
merged 4 commits into from
May 13, 2024
5 changes: 3 additions & 2 deletions CHANGELOG.md
@@ -8,8 +8,8 @@

**Bug fixes:**

- Properly handle AI metrics from the Python SDK's `@ai_track` decorator ([#3539](https://github.com/getsentry/relay/pull/3539))
- Mitigates occasional slowness and timeouts of the healthcheck endpoint. The endpoint will now respond promptly an unhealthy state. ([#3567](https://github.com/getsentry/relay/pull/3567))
- Properly handle AI metrics from the Python SDK's `@ai_track` decorator. ([#3539](https://github.com/getsentry/relay/pull/3539))
- Mitigate occasional slowness and timeouts of the healthcheck endpoint. The endpoint will now respond promptly an unhealthy state. ([#3567](https://github.com/getsentry/relay/pull/3567))

**Internal**:

@@ -22,6 +22,7 @@
- Emit negative outcomes for denied metrics. ([#3508](https://github.com/getsentry/relay/pull/3508))
- Increase size limits for internal batch endpoints. ([#3562](https://github.com/getsentry/relay/pull/3562))
- Emit negative outcomes when metrics are rejected because of a disabled namespace. ([#3544](https://github.com/getsentry/relay/pull/3544))
- Add AI model costs to global config. ([#3579](https://github.com/getsentry/relay/pull/3579))
- Add support for `event.` in the `Span` `Getter` implementation. ([#3577](https://github.com/getsentry/relay/pull/3577))

## 24.4.2
5 changes: 3 additions & 2 deletions py/CHANGELOG.md
@@ -1,12 +1,13 @@
# Changelog

### Unreleased
## Unreleased

- This release requires Python 3.11 or later. There are no intentionally breaking changes included in this release, but we stopped testing against Python 3.10.
- Add AI model costs to global config. ([#3579](https://github.com/getsentry/relay/pull/3579))

## 0.8.61

- Update data category metirc hours to metric seconds. [#3558](https://github.com/getsentry/relay/pull/3558)
- Update data category metric hours to metric seconds. [#3558](https://github.com/getsentry/relay/pull/3558)

## 0.8.60

73 changes: 73 additions & 0 deletions relay-dynamic-config/src/ai.rs
@@ -0,0 +1,73 @@
//! Configuration for measurements generated from AI model instrumentation.

use relay_common::glob2::LazyGlob;
use serde::{Deserialize, Serialize};

const MAX_SUPPORTED_VERSION: u16 = 1;

#[derive(Clone, Default, Debug, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct ModelCosts {
    pub version: u16,
    #[serde(skip_serializing_if = "Vec::is_empty")]
    pub costs: Vec<ModelCost>,
}

impl ModelCosts {
    /// `false` if measurement and metrics extraction should be skipped.
    pub fn is_enabled(&self) -> bool {
        self.version > 0 && self.version <= MAX_SUPPORTED_VERSION
    }

    /// Gets the cost per 1000 tokens, if defined for the given model.
    pub fn cost_per_1k_tokens(&self, model_id: &str, for_completion: bool) -> Option<f32> {
        self.costs
            .iter()
            .find(|cost| cost.matches(model_id, for_completion))
            .map(|c| c.cost_per_1k_tokens)
    }
}

#[derive(Clone, Debug, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct ModelCost {
    model_id: LazyGlob,
    for_completion: bool,
Maybe this makes sense as an enum, but I'm not familiar enough with AI terminology:

enum Usage {
    Completion,
    NotCompletionButTheDefaultUsecase,
    // MaybeSomeFutureUsecase,
}

@colin-sentry wdyt?

The SDKs only send the static `prompt_tokens`, `completion_tokens`, and `total_tokens`, so it's probably fine to keep this as a boolean.

    cost_per_1k_tokens: f32,
}

impl ModelCost {
    /// `true` if this cost definition matches the given model.
    pub fn matches(&self, model_id: &str, for_completion: bool) -> bool {
        self.for_completion == for_completion && self.model_id.compiled().is_match(model_id)
    }
}

#[cfg(test)]
mod tests {
    use insta::assert_debug_snapshot;

    use super::*;

    #[test]
    fn roundtrip() {
        let original = r#"{"version":1,"costs":[{"modelId":"babbage-002.ft-*","forCompletion":false,"costPer1kTokens":0.0016}]}"#;
        let deserialized: ModelCosts = serde_json::from_str(original).unwrap();
        assert_debug_snapshot!(deserialized, @r###"
        ModelCosts {
            version: 1,
            costs: [
                ModelCost {
                    model_id: LazyGlob("babbage-002.ft-*"),
                    for_completion: false,
                    cost_per_1k_tokens: 0.0016,
                },
            ],
        }
        "###);

        let serialized = serde_json::to_string(&deserialized).unwrap();
        // Patch floating point
        assert_eq!(&serialized, original);
    }
}
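
For readers skimming the diff: `ModelCosts::cost_per_1k_tokens` returns the price of the first entry whose glob and `for_completion` flag both match. Below is a minimal, standard-library-only sketch of that lookup, not Relay's actual code. It assumes a plain `String` pattern with at most one trailing `*` as a simplified stand-in for `LazyGlob`, and `estimate_cost` and `sample_costs` are hypothetical helpers that do not appear in this PR.

```rust
const MAX_SUPPORTED_VERSION: u16 = 1;

/// Simplified stand-in for `ModelCost`; the real struct stores a `LazyGlob`.
struct ModelCost {
    /// Pattern; this sketch only understands a single trailing `*`.
    model_id: String,
    for_completion: bool,
    cost_per_1k_tokens: f32,
}

impl ModelCost {
    /// `true` if both the usage flag and the (simplified) pattern match.
    fn matches(&self, model_id: &str, for_completion: bool) -> bool {
        let pattern_matches = match self.model_id.strip_suffix('*') {
            Some(prefix) => model_id.starts_with(prefix),
            None => self.model_id == model_id,
        };
        self.for_completion == for_completion && pattern_matches
    }
}

struct ModelCosts {
    version: u16,
    costs: Vec<ModelCost>,
}

impl ModelCosts {
    /// Mirrors the version gate from the diff: 0 and unknown versions are disabled.
    fn is_enabled(&self) -> bool {
        self.version > 0 && self.version <= MAX_SUPPORTED_VERSION
    }

    /// First matching entry wins, as in the diff.
    fn cost_per_1k_tokens(&self, model_id: &str, for_completion: bool) -> Option<f32> {
        self.costs
            .iter()
            .find(|cost| cost.matches(model_id, for_completion))
            .map(|cost| cost.cost_per_1k_tokens)
    }
}

/// Hypothetical helper: scales the per-1000-token price by a token count.
fn estimate_cost(
    costs: &ModelCosts,
    model_id: &str,
    for_completion: bool,
    tokens: u32,
) -> Option<f32> {
    if !costs.is_enabled() {
        return None;
    }
    let per_1k = costs.cost_per_1k_tokens(model_id, for_completion)?;
    Some(per_1k * tokens as f32 / 1000.0)
}

/// Hypothetical sample data, reusing the entry from the `roundtrip` test.
fn sample_costs() -> ModelCosts {
    ModelCosts {
        version: 1,
        costs: vec![ModelCost {
            model_id: "babbage-002.ft-*".to_string(),
            for_completion: false,
            cost_per_1k_tokens: 0.0016,
        }],
    }
}
```

With the sample data, a model ID such as `babbage-002.ft-abc` matches only when `for_completion` is `false`; the same ID with `for_completion` set to `true` yields `None`, because the two usages need separate entries.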
14 changes: 13 additions & 1 deletion relay-dynamic-config/src/global.rs
@@ -11,7 +11,8 @@ use relay_quotas::Quota;
use serde::{de, Deserialize, Serialize};
use serde_json::Value;

use crate::{defaults, ErrorBoundary, MetricExtractionGroup, MetricExtractionGroups};
use crate::ai::ModelCosts;
use crate::{ai, defaults, ErrorBoundary, MetricExtractionGroup, MetricExtractionGroups};

/// A dynamic configuration for all Relays passed down from Sentry.
///
@@ -46,6 +47,10 @@ pub struct GlobalConfig {
/// applying.
#[serde(skip_serializing_if = "is_ok_and_empty")]
pub metric_extraction: ErrorBoundary<MetricExtractionGroups>,

/// Configuration for AI span measurements.
#[serde(skip_serializing_if = "is_missing")]
pub ai_model_costs: ErrorBoundary<ai::ModelCosts>,
Will add a skip_serializing_if here.

👍

}

impl GlobalConfig {
@@ -401,6 +406,13 @@ fn is_ok_and_empty(value: &ErrorBoundary<MetricExtractionGroups>) -> bool {
)
}

fn is_missing(value: &ErrorBoundary<ai::ModelCosts>) -> bool {
    matches!(
        value,
        &ErrorBoundary::Ok(ModelCosts { version, ref costs }) if version == 0 && costs.is_empty()
    )
}
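
The `is_missing` predicate above treats only the default value (version 0, no costs) as absent, so a parse error or an explicitly versioned but empty config would still be serialized. A minimal sketch of that behavior follows, using standard-library-only stand-ins: the `ErrorBoundary` enum here carries a `String` error and `ModelCosts` holds bare `f32` values, both simplifications assumed for illustration rather than taken from Relay.

```rust
/// Simplified stand-in for `ai::ModelCosts`; entries are reduced to bare prices.
#[derive(Default)]
struct ModelCosts {
    version: u16,
    costs: Vec<f32>,
}

/// Simplified stand-in for Relay's `ErrorBoundary`; the error type is an assumption.
enum ErrorBoundary<T> {
    Err(String),
    Ok(T),
}

/// Same condition as in the diff: only a successfully parsed default value
/// counts as missing.
fn is_missing(value: &ErrorBoundary<ModelCosts>) -> bool {
    matches!(
        value,
        ErrorBoundary::Ok(ModelCosts { version, costs }) if *version == 0 && costs.is_empty()
    )
}
```

Under these assumptions, `is_missing(&ErrorBoundary::Ok(ModelCosts::default()))` is `true`, while a value with `version: 1` and no costs, or any `ErrorBoundary::Err`, is not considered missing.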

#[cfg(test)]
mod tests {
use super::*;
1 change: 1 addition & 0 deletions relay-dynamic-config/src/lib.rs
@@ -61,6 +61,7 @@
)]
#![allow(clippy::derive_partial_eq_without_eq)]

mod ai;
mod defaults;
mod error_boundary;
mod feature;
6 changes: 1 addition & 5 deletions relay-server/src/services/processor.rs
@@ -3045,7 +3045,6 @@ mod tests {

#[cfg(feature = "processing")]
use {
relay_dynamic_config::Options,
relay_metrics::BucketValue,
relay_quotas::{QuotaScope, ReasonCode},
relay_test::mock_service,
@@ -3071,11 +3070,8 @@ mod tests {
#[test]
fn test_dynamic_quotas() {
let global_config = GlobalConfig {
measurements: None,
quotas: vec![mock_quota("foo"), mock_quota("bar")],
filters: Default::default(),
options: Options::default(),
metric_extraction: Default::default(),
..Default::default()
};

let project_quotas = vec![mock_quota("baz"), mock_quota("qux")];