
Anomaly detection/combined scorer #1995

Merged · 7 commits into main on Mar 5, 2025

Conversation

ram-senth (Member)

No description provided.

@ram-senth ram-senth force-pushed the anomaly-detection/combined-scorer branch 8 times, most recently from 31be8fa to bbccf33 Compare February 28, 2025 07:20
@ram-senth ram-senth force-pushed the anomaly-detection/combined-scorer branch 2 times, most recently from 9a50bc9 to 62f6a83 Compare March 3, 2025 19:13
@ram-senth ram-senth marked this pull request as ready for review March 3, 2025 19:23
@ram-senth ram-senth requested a review from a team as a code owner March 3, 2025 19:23
Comment on lines 121 to 125
prophet_timestamps = np.array([None] * n_predictions)
prophet_ys = np.array([None] * n_predictions)
prophet_yhats = np.array([None] * n_predictions)
prophet_yhat_lowers = np.array([None] * n_predictions)
prophet_yhat_uppers = np.array([None] * n_predictions)
Member:
performance nit: would it be better to use np.full(n_predictions, None)? This would create the array directly rather than building a list and converting it.
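A minimal sketch of the suggested change (the size is illustrative; `n_predictions` comes from the snippet above):

```python
import numpy as np

n_predictions = 5  # illustrative size

# Current approach: build a Python list of Nones, then convert it to an array
list_based = np.array([None] * n_predictions)

# Suggested approach: allocate the object-dtype array directly, skipping the
# intermediate list and the conversion step
direct = np.full(n_predictions, None)

# Both produce the same object-dtype array of Nones
assert list_based.dtype == direct.dtype == np.dtype(object)
assert list_based.tolist() == direct.tolist() == [None] * n_predictions
```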

Member Author:
I think this was something you added? I only changed the values from 0.0 to None, since 0.0 can affect the scores.

Member:

Yes, you're right, I did add these to pass down the Prophet-related info. I was reviewing the _hydrate_alert function prior to this, so I was thinking of potential optimizations. I can address this in a follow-up PR if needed.

Member Author:
It is fine, I can make the change here; it is a minor one.

Comment on lines 89 to 97
forecast_len = algo_config.prophet_forecast_len * (60 // config.time_period)
ts_internal = convert_external_ts_to_internal(timeseries)
prophet_df = prophet_detector.predict(
ts_internal.timestamps,
ts_internal.values,
forecast_len,
config.time_period,
config.sensitivity,
)
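As a hedged illustration of the `forecast_len` arithmetic above (the concrete values, and the assumption that `prophet_forecast_len` is in hours while `time_period` is in minutes, are mine, not confirmed by the PR):

```python
# Hypothetical values: forecast horizon in hours, data granularity in minutes
prophet_forecast_len = 28  # hours (assumed unit)
time_period = 15           # minutes per data point (assumed unit)

# 60 // time_period gives points per hour, so forecast_len is a point count
forecast_len = prophet_forecast_len * (60 // time_period)
assert forecast_len == 28 * 4 == 112
```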
Member:
qq: Could we use the cached predictions in the db rather than re-running predict here?

Member:
and/or does forecast_len need to be > 0 in this case? Specifically, we just want the Prophet predictions for the history we currently have and are not storing the future predictions in this function. I assume this would make the function slightly faster?

Member Author:
This method is called from both the store_data and batch_predict paths. In the store_data path, the result from here is stored in the DB. Agreed that we do not need to predict for the batch_detection path, but that API is not being actively used in prod at this time, so I would not put it high on the list.

raise ServerError("No flags and scores from MP scorer")

if prophet_df is None or prophet_df.empty:
# logger.warning("The prophet_df is None or empty, skipping prophet scoring")
Member:
I think this log should still be included

Member Author:
This would cause a bunch of warnings in the combo_detect case at present, where a batch detection is followed by a series of stream detection calls. I will add this back once I update that in a follow-up PR.

):
return prophet_flag

if (direction == "up" and y >= yhat_upper) or (direction == "down" and y <= yhat_lower):
Member:
Should this be strictly greater/less than? I noticed that's what it was in the original LocationDetectors too. It probably doesn't make a large difference otherwise, though.
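To make the boundary question concrete, here is a hypothetical helper (not from the PR) showing the only case where `>=` and strict `>` disagree, a point exactly on the band edge:

```python
def flag_point(y, yhat_lower, yhat_upper, direction, strict=False):
    """Return True if y falls outside the prediction band in the given direction."""
    if direction == "up":
        return y > yhat_upper if strict else y >= yhat_upper
    return y < yhat_lower if strict else y <= yhat_lower

# A point exactly on the upper band edge is the only case that differs
assert flag_point(10.0, 0.0, 10.0, "up", strict=False) is True
assert flag_point(10.0, 0.0, 10.0, "up", strict=True) is False
# Points clearly inside or outside the band are flagged identically either way
assert flag_point(12.0, 0.0, 10.0, "up", strict=True) is True
assert flag_point(5.0, 0.0, 10.0, "up", strict=False) is False
```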

Member Author:
Doesn't make any noticeable difference.

Comment on lines 231 to 233
if pd.to_datetime(timestamp) in prophet_map["flag"]:
found += 1
pd_dt = pd.to_datetime(timestamp)
Member:
I believe it would be faster to convert the entire timestamp array to pd datetimes first then iterate over the converted values rather than converting each element within the loop
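A sketch of the suggested hoisting (the timestamp values are illustrative):

```python
import pandas as pd

timestamps = [1709600000, 1709600900, 1709601800]  # illustrative epoch seconds

# Per-element conversion inside the loop (as in the snippet above)
per_element = [pd.to_datetime(ts, unit="s") for ts in timestamps]

# Suggested: convert the whole array once, then iterate the converted values
converted = pd.to_datetime(timestamps, unit="s")
per_batch = list(converted)

assert per_element == per_batch  # same values, one vectorized conversion
```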

Member Author:
For batch detection we look up at most 96 timestamps, and for streaming we look up 1 timestamp, while the prediction data can be up to 2688 entries long. So I do not think converting the entire DataFrame up front would be more efficient.

and prophet_flag == "anomaly_higher_confidence"
):
flags.append("anomaly_higher_confidence")
elif prophet_score >= 2.0:
Member:
I noticed this threshold was >= 5.0 in the data-analysis notebook (I can't select the exact lines of code, but if you cmd+F merge_prophet_mp_results, it's in that function). Is there a reason this was dropped to 2.0?

Member Author:
Wasn't that part of your recommendation from your experiments on that one timeseries with an extended flatline at 0? Or am I misremembering? Regardless, all my experiments over these past few days, which we evaluated together and with Tillman, have been using 2.0.

Member:
No, I think my investigation for that was prior to us fixing the stream logic for the scoring. Since 2.0 seems to work, LGTM, and we can adjust later if needed.

Comment on lines 258 to 260
# todo: publish metrics for found/total
# if debug:
# print(f"found {found} out of {len(timestamps)}")
Member:
nits: Should this just be added now as a log.info() call? Also, should the extra commented-out debug lines be removed?

Member Author:
Actually, I'm actively working on this. I enabled this log and noticed that there is a difference in how timestamps are treated by the code that retrieves data from the DB, resulting in stream computations not finding the datapoint. I am troubleshooting it.

Member:
FWIW, it may be the discrepancy between the timestamps stored in the DB (seconds) and the pd timestamps, which I believe default to 'ns': https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html
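A small sketch of the unit mismatch being described (the timestamp value is illustrative):

```python
import pandas as pd

epoch_seconds = 1709600000  # illustrative value stored in the DB as seconds

# Without a unit, an integer is interpreted as nanoseconds since the epoch,
# landing just after 1970-01-01
as_default = pd.to_datetime(epoch_seconds)

# With unit="s" the same integer is interpreted as seconds, a 2024 timestamp
as_seconds = pd.to_datetime(epoch_seconds, unit="s")

assert as_default.year == 1970
assert as_seconds.year == 2024
assert as_default != as_seconds  # a lookup keyed on one will miss the other
```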

ad_config: AnomalyDetectionConfig,
) -> FlagsAndScores:

# todo: return prophet thresholds
Member:
qq: I assume adding this would be a followup PR to support the UI changes for the confidence interval?

Member Author:
Yes

@@ -376,7 +376,6 @@ def ready_check(app_config: AppConfig = injected):
from seer.inference_models import models_loading_status

status = models_loading_status()
logger.info(f"Model loading status: {status}")
Member:
Why is this removed?

Member Author:
This was causing a log entry to be published every second, increasing the noise in logs.

Member:
Just asked jenn/rohan about this. It isn't used for autofix, but may be helpful for grouping?

Would it make more sense to push this log a few lines down, within the if statements (the FAILED/DONE checks on lines 384-387)?

Member Author:
I do not think there is much value in this log. Let me know if it is helpful for grouping.

Member (@aayush-se, Mar 3, 2025):
Just reviewed the entire file; this log can probably just be removed.

@aayush-se (Member) left a review comment:
Looks great! No structural changes; I just had a couple of nits and questions.

@aayush-se (Member) left a review comment:
Changes look good

@ram-senth ram-senth force-pushed the anomaly-detection/combined-scorer branch from aa95433 to 540a511 Compare March 4, 2025 21:23
@ram-senth ram-senth force-pushed the anomaly-detection/combined-scorer branch from 23b9384 to 8b85677 Compare March 5, 2025 18:01
@ram-senth ram-senth merged commit d787f75 into main Mar 5, 2025
22 checks passed
@ram-senth ram-senth deleted the anomaly-detection/combined-scorer branch March 5, 2025 18:12