Versions
river version: development
Python version: Python 3.10.1
Operating system: macOS Sonoma 14.0 (23A344)
Describe the bug
Hello 👋
anomaly.LocalOutlierFactor (LOF) cannot be combined with AnomalyFilter to classify samples. The filter evaluates score_one before learn_one, which raises an IndexError. The error is therefore related to the score_one function implemented in LOF.
I traced the error to _initial_calculations, which cannot be computed when too few samples have been seen. I found a solution which, to the best of my knowledge, might be correct: changing lines 345-346 in anomaly.LocalOutlierFactor HERE
from
to
The originally returned None must also be replaced, because leaving it unchanged raises a further TypeError. It could be 0.0. Because my projects constrain both tails of the score, I set it to 0.5 in this proposed solution.
Let me know if that makes sense, I'd be happy to elaborate on any issues or comments.
Thank you 🙏
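Until a fix lands in LOF itself, the same idea can be sketched at the wrapper level. The snippet below is only an illustration under stated assumptions: GuardedDetector and StubDetector are hypothetical names and are not part of river, and the stub merely imitates the failure modes described above (an IndexError before any sample is seen, then None while samples are still insufficient). The default of 0.5 is the value chosen in this issue; 0.0 would work the same way.

```python
DEFAULT_SCORE = 0.5  # value proposed in this issue; 0.0 is the other option mentioned


class StubDetector:
    """Hypothetical stand-in for a detector that needs a warm-up period."""

    def __init__(self):
        self.seen = []

    def learn_one(self, x):
        self.seen.append(x)

    def score_one(self, x):
        if not self.seen:
            # Imitates the IndexError from the traceback: indexing an empty list.
            return [][-1]
        if len(self.seen) < 2:
            # Imitates the originally returned None.
            return None
        return 1.0


class GuardedDetector:
    """Hypothetical wrapper that always returns a numeric score."""

    def __init__(self, detector, default=DEFAULT_SCORE):
        self.detector = detector
        self.default = default

    def learn_one(self, x):
        self.detector.learn_one(x)

    def score_one(self, x):
        try:
            score = self.detector.score_one(x)
        except IndexError:
            # Not enough samples to compute neighbourhoods yet.
            return self.default
        # A None score would break numeric comparisons in a filter.
        return self.default if score is None else score


guarded = GuardedDetector(StubDetector())
for x in [{"a": 0.5, "b": 1}, {"a": 1, "b": 1}]:
    print(guarded.score_one(x))  # score first, as the filter does
    guarded.learn_one(x)
```

Catching the exception in a wrapper hides the symptom rather than fixing the cause, so it is a stopgap; the change proposed above inside score_one remains the cleaner route.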
Steps/code to reproduce
Full Backtrace of Exception
IndexError: list index out of range
IndexError Traceback (most recent call last)
/river/bug_lof.ipynb Cell 3 line 7
5 X = [{"a": 0.5, "b": 1}, {"a": 1, "b": 1}]
6 for x in X:
----> 7 lof.learn_one(x)
File ~/river/anomaly/filter.py:179, in QuantileFilter.learn_one(self, *args, **learn_kwargs)
178 def learn_one(self, *args, **learn_kwargs):
--> 179 score = self.score_one(*args)
180 if not self.protect_anomaly_detector or not self.classify(score):
181 self.anomaly_detector.learn_one(*args, **learn_kwargs)
File ~/river/anomaly/base.py:146, in AnomalyFilter.score_one(self, *args, **kwargs)
130 def score_one(self, *args, **kwargs):
131 """Return an outlier score.
132
133 A high score is indicative of an anomaly. A low score corresponds to a normal observation.
(...)
144
145 """
--> 146 return self.anomaly_detector.score_one(*args, **kwargs)
File ~/river/anomaly/lof.py:371, in LocalOutlierFactor.score_one(self, x)
348 x_list_copy = self.x_list.copy()
349 (
350 nm,
351 x_list_copy,
(...)
368 self.lof,
369 )
--> 371 neighborhoods, rev_neighborhoods, k_dist, dist_dict = self._initial_calculations(
372 x_list_copy, nm, neighborhoods, rev_neighborhoods, k_dist, dist_dict
373 )
374 (
375 set_new_points,
376 set_neighbors,
(...)
379 set_upd_lof,
380 ) = define_sets(nm, neighborhoods, rev_neighborhoods)
381 reach_dist = calc_reach_dist_new_points(
382 set_new_points, neighborhoods, rev_neighborhoods, reach_dist, dist_dict, k_dist
383 )
File ~/river/anomaly/lof.py:457, in LocalOutlierFactor._initial_calculations(self, x_list, nm, neighborhoods, rev_neighborhoods, k_distances, dist_dict)
455 # Calculate new k-dist for each particle
456 for i, inner_dict in enumerate(dist_dict.values()):
--> 457 k_distances[i] = sorted(inner_dict.values())[min(k, len(inner_dict.values())) - 1]
459 # Only keep particles that are neighbors in distance dictionary
460 dist_dict = {
461 k: {k2: v2 for k2, v2 in v.items() if v2 <= k_distances[k]}
462 for k, v in dist_dict.items()
463 }
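The indexing expression on line 457 can be isolated to show how it fails. This is a minimal sketch that assumes the inner distance dictionary is empty when too few samples have been seen, which is one plausible reading of the traceback and not something confirmed against the river source. kth_distance is a hypothetical helper that copies only that expression.

```python
def kth_distance(inner_dict, k):
    """Hypothetical helper mirroring the expression on line 457."""
    values = sorted(inner_dict.values())
    return values[min(k, len(values)) - 1]


# With two known distances and k=10, the index is min(10, 2) - 1 = 1.
print(kth_distance({"p1": 0.5, "p2": 1.2}, k=10))  # 1.2

# With no distances, the index is min(10, 0) - 1 = -1 on an empty list.
try:
    kth_distance({}, k=10)
except IndexError as exc:
    print(f"IndexError: {exc}")
```

An index of -1 is valid on any non-empty list, so the expression only breaks when there are no distances at all, which fits the observation that the error appears before enough samples have been learned.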