-
@a-kore
And here's the code that calls this function in its call stack:
Please let me know if I'm missing something.
-
Each slice should have at least an `overall` split to specify that it hasn't been sliced. Also, the method `log_quantitative_analysis` is used once per metric, so only one pass/fail threshold can be specified for it. The notebooks for the use cases are a useful demo for the function. Here's a snippet from the `heart_failure_prediction.ipynb` notebook that shows how the metrics from the evaluator are logged:

```python
results_female_flat = flatten_results_dict(
    results=results_female,
    model_name=model_name,
)

# ruff: noqa: W505
descriptions = {
    "BinaryPrecision": "The proportion of predicted positive instances that are correctly predicted.",
    "BinaryRecall": "The proportion of actual positive instances that are correctly predicted. Also known as sensitivity or true positive rate.",
    "BinaryAccuracy": "The proportion of all instances that are correctly predicted.",
    "BinaryAUROC": "The area under the receiver operating characteristic curve (AUROC) is a measure of the performance of a binary classification model.",
    "BinaryAveragePrecision": "The area under the precision-recall curve (AUPRC) is a measure of the performance of a binary classification model.",
    "BinaryF1Score": "The harmonic mean of precision and recall.",
}

for name, metric in results_female_flat.items():
    # Each key has the form "<split>/<metric name>", e.g. "overall/BinaryAccuracy".
    split, name = name.split("/")  # noqa: PLW2901
    report.log_quantitative_analysis(
        "performance",
        name=name,
        value=metric.tolist(),
        description=descriptions[name],
        metric_slice=split,
        pass_fail_thresholds=0.7,
        pass_fail_threshold_fns=lambda x, threshold: bool(x >= threshold),
    )
```
-
Another thing that I encountered during testing is that when I call
This is what
-
Yeah, it looks like logging just one metric is an edge case that would fail. I think it has to do with the
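For anyone trying to reproduce it, feeding a flattened dict with a single entry through the same loop should hit the edge case. This is a hypothetical sketch: `results_single_flat`, its key, and its value are made up, and only the `log_quantitative_analysis` call mirrors the snippet above:

```python
# Hypothetical single-metric input; the key follows the "<split>/<metric name>" format.
results_single_flat = {"overall/BinaryAccuracy": 0.83}

for name, metric in results_single_flat.items():
    split, name = name.split("/")
    report.log_quantitative_analysis(
        "performance",
        name=name,
        value=metric,  # already a plain float here, so no .tolist()
        description="The proportion of all instances that are correctly predicted.",
        metric_slice=split,
        pass_fail_thresholds=0.7,
        pass_fail_threshold_fns=lambda x, threshold: bool(x >= threshold),
    )
```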