All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Added support for new metrics in the Confidence Based Performance Estimator (CBPE). It now estimates `roc_auc`, `f1`, `precision`, `recall` and `accuracy`.
- Added support for multiclass classification. This includes
  - Specifying multiclass classification metadata + support in automated metadata extraction (by introducing a `model_type` parameter).
  - Support for all `CBPE` metrics.
  - Support for realized performance calculation using the `PerformanceCalculator`.
  - Support for all types of drift detection (model inputs, model output, target distribution).
  - A new synthetic toy dataset.
- Removed the `identifier` property from the `ModelMetadata` class. Joining `analysis` data and `analysis target` values should be done upfront or index-based.
- Added an `exclude_columns` parameter to the `extract_metadata` function. Use it to specify the columns that should not be considered as model metadata or features.
- All `fit` methods now return the fitted object. This allows chaining `Calculator`/`Estimator` instantiation and fitting into a single line.
- Custom metrics are no longer supported in the `PerformanceCalculator`. Only the predefined metrics remain supported.
- Big documentation revamp: we've tweaked overall structure, page structure and incorporated lots of feedback.
- Improvements to consistency and readability for the 'hover' visualization in the step plots, including consistent color usage, conditional formatting, icon usage etc.
- Improved indication of "realized" and "estimated" performance in all `CBPE` step plots (changes to hover, axes and legends)
- Updated homepage in project metadata
- Added missing metadata modification to the quickstart
- Perform some additional checks on reference data during preprocessing
- Various documentation suggestions (#58)
- Deal with out-of-time-order data when chunking
- Fix reversed Y-axis and plot labels in continuous distribution plots
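
The entry above about `fit` methods returning the fitted object can be illustrated with a minimal sketch. `MeanCalculator` is a hypothetical class invented for this example, not part of the library; it only shows the pattern of returning `self` from `fit` so that instantiation and fitting can be chained.

```python
from statistics import mean


class MeanCalculator:
    """Hypothetical calculator, used only to illustrate the chaining pattern."""

    def __init__(self, column):
        self.column = column
        self.reference_mean = None

    def fit(self, reference_data):
        # Learn a statistic from the reference data ...
        self.reference_mean = mean(row[self.column] for row in reference_data)
        # ... and return the fitted object itself so calls can be chained.
        return self


reference = [{"age": 30}, {"age": 40}, {"age": 50}]

# Instantiation and fitting in a single line.
calculator = MeanCalculator(column="age").fit(reference)
```

Without the `return self`, the chained expression would evaluate to `None` and the fitted object would be lost.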
- Publishing to PyPI did not like raw sections in ReST; replaced by the Markdown version.
- Added support for both predicted labels and predicted probabilities in `ModelMetadata`.
- Support for monitoring model performance metrics using the `PerformanceCalculator`.
- Support for monitoring target distribution using the `TargetDistributionCalculator`
- Plotting will default to using step plots.
- Restructured the `nannyml.drift` package and subpackages. Breaking changes!
- Metadata completeness check will now fail when there are features of `FeatureType.UNKNOWN`.
- Chunk date boundaries are now calculated differently for a `PeriodBasedChunker`, using the theoretical period for boundaries as opposed to the observed boundaries within the chunk observations.
- Updated version of the `black` pre-commit hook due to breaking changes in its `click` dependency.
- The minimum chunk size will now be provided by each individual `calculator`/`estimator`/`metric`, allowing for each of them to warn the end user when chunk sizes are suboptimal.
- Restrict version of the `scipy` dependency to be `>=1.7.3, <1.8.0`. Planned to be relaxed ASAP.
- Deal with missing values in chunks causing `NaN` values when concatenating.
- Crash when estimating CBPE without a target column present
- Incorrect label in `ModelMetadata` printout
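
The `PeriodBasedChunker` entry above distinguishes theoretical period boundaries from observed ones. The following sketch shows the difference for a monthly period using only the standard library. Both helper functions are hypothetical and written for illustration; they are not the library's implementation, and the monthly period is an assumption.

```python
import calendar
from datetime import datetime, timedelta


def observed_bounds(timestamps):
    """Boundaries taken from the observations actually present in the chunk."""
    return min(timestamps), max(timestamps)


def theoretical_month_bounds(timestamps):
    """Boundaries of the calendar month containing the chunk's first observation."""
    first = min(timestamps)
    start = datetime(first.year, first.month, 1)
    days_in_month = calendar.monthrange(first.year, first.month)[1]
    # Last representable moment of the month.
    end = start + timedelta(days=days_in_month) - timedelta(microseconds=1)
    return start, end


chunk = [
    datetime(2022, 2, 3, 9, 30),
    datetime(2022, 2, 17, 14, 0),
    datetime(2022, 2, 25, 8, 15),
]

observed = observed_bounds(chunk)
theoretical = theoretical_month_bounds(chunk)
```

Here the observed boundaries run from 3 February to 25 February, while the theoretical boundaries cover the whole of February 2022, regardless of when observations happen to fall.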
- Allow calculators/estimators to provide appropriate `min_chunk_size` upon splitting into `chunks`.
- Data reconstruction drift calculation failing when there are no categorical or continuous features (#36)
- Incorrect scaling on continuous feature distribution plot (#39)
- Missing `needs_calibration` checks before performing score calibration in CBPE
- Fix crash on chunking when missing target values in reference data
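
The idea of letting a calculator or estimator supply its own `min_chunk_size` when data is split into chunks can be sketched as follows. `split_into_chunks` and its parameters are hypothetical names chosen for this example, and size-based chunking is an assumption; the point is only that the threshold is passed in by the caller and used to warn about undersized chunks.

```python
import warnings


def split_into_chunks(rows, chunk_size, min_chunk_size):
    """Split rows into consecutive chunks and warn about undersized ones.

    min_chunk_size is supplied by the caller (for example a calculator or
    estimator) rather than being hard-coded in the chunking logic.
    """
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    undersized = [i for i, chunk in enumerate(chunks) if len(chunk) < min_chunk_size]
    if undersized:
        warnings.warn(
            f"{len(undersized)} chunk(s) hold fewer than {min_chunk_size} rows; "
            "results for them may be unreliable."
        )
    return chunks


# Capture warnings so the example can inspect them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    chunks = split_into_chunks(list(range(25)), chunk_size=10, min_chunk_size=8)
```

With 25 rows and a chunk size of 10, the last chunk holds only 5 rows, which is below the caller's minimum of 8, so a single warning is raised while all chunks are still returned.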
- Result classes for Calculators and Estimators.
- Updated the documentation to reflect the changes introduced by result classes, specifically to plotting functionality.
- Add support for imputing of missing values in the `DataReconstructionDriftCalculator`.
- `nannyml.plots.plots` was removed. Plotting is now meant to be done using `DriftResult.plot()` or `EstimatorResult.plot()`.
- Fixed an issue where data reconstruction drift calculation also used model predictions during decomposition.
- Chunking base classes and implementations
- Metadata definitions and utilities
- Drift calculator base classes and implementations
  - Univariate statistical drift calculator
  - Multivariate data reconstruction drift calculator
- Drifted feature ranking base classes and implementations
  - Alert count based ranking
- Performance estimator base classes and implementations
  - Certainty based performance estimator
- Plotting utilities with support for
  - Stacked bar plots
  - Line plots
  - Joy plots
- Documentation
  - Quick start guide
  - User guides
  - Deep dives
  - Example notebooks
  - Technical reference documentation