Detection of skew/drift in distribution of numerical feature #101

wrapper228 · 2020-01-15T17:52:44Z

Does .validate_statistics() really detect distribution anomalies only for categorical features? The statements "For now drift detection is only supported for categorical features." and "For now skew detection is only supported for categorical features." seem odd to me.
For example, a numerical feature is distributed as N(0,1) in my training data but as N(10,1) in my serving data. Does TFDV offer a solution for this case?

caveness · 2020-01-29T20:03:38Z

That's correct -- as of now, TFDV supports drift and skew detection only for categorical features. So, unfortunately, we don't currently have a solution for finding such a distribution shift in numeric features. However, we are planning to add support for skew and drift detection for numeric features in the future.

cah-aswini-jalla · 2021-03-11T15:09:27Z

Do we have any update on getting skew/drift anomalies for numerical features?

caveness · 2021-03-11T16:25:16Z

Yes -- support for detecting drift and skew for numeric features has been added to TFDV, as of Version 0.25.0.

To detect drift or distribution skew in numeric features, specify a jensen_shannon_divergence threshold in the drift_comparator or skew_comparator in your schema.

See the TFDV Get Started Guide for more info.
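
To give a feel for what such a threshold is compared against, here is a minimal standard-library sketch of a Jensen-Shannon divergence between two histograms, using the N(0,1) vs. N(10,1) case from the question. It is an illustration only and not TFDV's implementation: the equal-width binning, the base-2 logarithm, the helper names (`histogram`, `js_divergence`, `score`) and the threshold value 0.1 are all assumptions made for this example.

```python
import math
import random


def histogram(samples, lo, hi, num_bins):
    """Return normalized bin frequencies over num_bins equal-width bins on [lo, hi]."""
    counts = [0] * num_bins
    width = (hi - lo) / num_bins
    for x in samples:
        # Clamp so that the maximum value lands in the last bin.
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) in bits; bins where p is zero contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; lies in [0, 1] with base-2 logs."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def score(a, b, num_bins=50):
    """Bin both samples on shared edges and return their JS divergence."""
    lo = min(min(a), min(b))
    hi = max(max(a), max(b))
    return js_divergence(histogram(a, lo, hi, num_bins),
                         histogram(b, lo, hi, num_bins))


rng = random.Random(0)  # fixed seed so the run is repeatable
train = [rng.gauss(0, 1) for _ in range(10_000)]     # N(0,1), as in the question
serving = [rng.gauss(10, 1) for _ in range(10_000)]  # N(10,1), the shifted feature
same = [rng.gauss(0, 1) for _ in range(10_000)]      # a fresh N(0,1) sample

THRESHOLD = 0.1  # hypothetical value, chosen only for this illustration

shifted_score = score(train, serving)
baseline_score = score(train, same)
print("shifted:", shifted_score, "flagged:", shifted_score > THRESHOLD)
print("baseline:", baseline_score, "flagged:", baseline_score > THRESHOLD)
```

The two nearly disjoint distributions score close to the maximum of 1 bit, while two samples from the same distribution score close to 0. In TFDV itself the threshold is set on the schema, along the lines of `tfdv.get_feature(schema, 'x').drift_comparator.jensen_shannon_divergence.threshold = 0.1`, where the feature name `'x'` and the value `0.1` are placeholders.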

wrapper228 changed the title from "sa" to "Detection of skew/drift in distribution of numerical feature" Jan 15, 2020

wrapper228 closed this as completed Jan 15, 2020

wrapper228 reopened this Jan 15, 2020

rmothukuru self-assigned this Jan 20, 2020

rmothukuru added type:feature stat:awaiting tensorflower labels Jan 20, 2020

rmothukuru assigned caveness and unassigned rmothukuru Jan 20, 2020

wrapper228 referenced this issue in wrapper228/data-validation Feb 28, 2020

TODO note looks better now

aee76bf

wrapper228 mentioned this issue Feb 28, 2020

Solution for skew/drift detection in distribution of numerical feature #113

Open

caveness closed this as completed Mar 11, 2021
