
Releases: Lightning-AI/torchmetrics

Weekly patch release


[1.0.1] - 2023-07-13

Fixed

  • Fixed corner case when using MetricCollection together with aggregation metrics (#1896)
  • Fixed the use of max_fpr in AUROC metric when only one class is present (#1895)
  • Fixed bug related to empty predictions for IntersectionOverUnion metric (#1892)
  • Fixed bug related to MeanMetric and broadcasting of weights when Nans are present (#1898)
  • Fixed bug related to expected input format of pycoco in MeanAveragePrecision (#1913)

Contributors

@fansuregrin, @SkafteNicki

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Visualize metrics


We are happy to announce that the first major release of Torchmetrics, v1.0, is publicly available. We have worked hard on a couple of new features for this milestone release, and for v1.0.0 we have also managed to implement over 100 metrics in torchmetrics.

Plotting

The big new feature of v1.0 is built-in plotting support. As the old saying goes: "A picture is worth a thousand words", and within machine learning this certainly holds for metrics, which in many cases are better showcased in a figure than as a list of floats. The only requirement for getting started with the plotting feature is installing matplotlib: either pip install matplotlib or pip install torchmetrics[visual] (the latter option also installs Scienceplots and uses it as the default plotting style).

The basic interface is the same for any metric. Just call the new .plot method:

metric = AnyMetricYouLike()
for i in range(num_updates):
    metric.update(preds[i], target[i])
fig, ax = metric.plot()

The plot method by default does not require any arguments and will automatically call metric.compute internally on
whatever metric states have been accumulated.
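
As a minimal end-to-end sketch, here is what this looks like with BinaryAccuracy (the random predictions and targets below are just placeholders standing in for real model outputs):

import torch
from torchmetrics.classification import BinaryAccuracy

metric = BinaryAccuracy()
for _ in range(5):
    # random data standing in for real model outputs
    preds = torch.rand(10)
    target = torch.randint(2, (10,))
    metric.update(preds, target)

# plot() calls compute() on the accumulated state and
# returns a matplotlib figure and axis
fig, ax = metric.plot()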

[1.0.0] - 2023-07-04

Added

  • Added prefix and postfix arguments to ClasswiseWrapper (#1866)
  • Added speech-to-reverberation modulation energy ratio (SRMR) metric (#1792, #1872)
  • Added new global arg compute_with_cache to control caching behaviour after compute method (#1754)
  • Added ComplexScaleInvariantSignalNoiseRatio for audio package (#1785)
  • Added Running wrapper for calculating running statistics (#1752)
  • Added RelativeAverageSpectralError and RootMeanSquaredErrorUsingSlidingWindow to image package (#816)
  • Added support for SpecificityAtSensitivity Metric (#1432)
  • Added support for plotting of metrics through .plot() method (#1328, #1481, #1480, #1490, #1581, #1585, #1593, #1600, #1605, #1610, #1609, #1621, #1624, #1623, #1638, #1631, #1650, #1639, #1660, #1682, #1786)
  • Added support for plotting of audio metrics through .plot() method (#1434)
  • Added classes to output from MAP metric (#1419)
  • Added Binary group fairness metrics to classification package (#1404)
  • Added MinkowskiDistance to regression package (#1362)
  • Added pairwise_minkowski_distance to pairwise package (#1362)
  • Added new detection metric PanopticQuality (#929, #1527)
  • Added PSNRB metric (#1421)
  • Added ClassificationTask Enum and use in metrics (#1479)
  • Added ignore_index option to exact_match metric (#1540)
  • Added parameter top_k to RetrievalMAP (#1501)
  • Added support for deterministic evaluation on GPU for metrics that use the torch.cumsum operator (#1499)
  • Added support for plotting of aggregation metrics through .plot() method (#1485)
  • Added support for python 3.11 (#1612)
  • Added support for auto clamping of input for metrics that use the data_range argument (#1606)
  • Added ModifiedPanopticQuality metric to detection package (#1627)
  • Added PrecisionAtFixedRecall metric to classification package (#1683)
  • Added multiple metrics to detection package (#1284)
    • IntersectionOverUnion
    • GeneralizedIntersectionOverUnion
    • CompleteIntersectionOverUnion
    • DistanceIntersectionOverUnion
  • Added MultitaskWrapper to wrapper package (#1762)
  • Added RelativeSquaredError metric to regression package (#1765)
  • Added MemorizationInformedFrechetInceptionDistance metric to image package (#1580)

Changed

  • Changed permutation_invariant_training to allow using a 'permutation-wise' metric function (#1794)
  • Changed update_count and update_called from private to public methods (#1370)
  • Raise exception for invalid kwargs in Metric base class (#1427)
  • Extend EnumStr raising ValueError for invalid value (#1479)
  • Improved speed and memory consumption of binned PrecisionRecallCurve with a large number of samples (#1493)
  • Changed __iter__ method from raising NotImplementedError to TypeError by setting to None (#1538)
  • FID metric will now raise an error if too few samples are provided (#1655)
  • Allowed FID with torch.float64 (#1628)
  • Changed LPIPS implementation to no longer rely on a third-party package (#1575)
  • Changed FID matrix square root calculation from scipy to torch (#1708)
  • Changed calculation in PearsonCorrCoef to be more robust in certain cases (#1729)
  • Changed MeanAveragePrecision to pycocotools backend (#1832)

Removed

  • Support for python 3.7 (#1640)

Fixed

  • Fixed support in MetricTracker for MultioutputWrapper and nested structures (#1608)
  • Fixed restrictive check in PearsonCorrCoef (#1649)
  • Fixed integration with jsonargparse and LightningCLI (#1651)
  • Fixed corner case in calibration error for zero confidence input (#1648)
  • Fixed precision-recall curve based computations for float target (#1642)
  • Fixed missing kwarg squeeze in MultiOutputWrapper (#1675)
  • Fixed padding removal for 3d input in MSSSIM (#1674)
  • Fixed max_det_threshold in MAP detection (#1712)
  • Fixed states being saved in metrics that use register_buffer (#1728)
  • Fixed states not being correctly synced and device-transferred in MeanAveragePrecision for iou_type="segm" (#1763)
  • Fixed use of prefix and postfix in nested MetricCollection (#1773)
  • Fixed ax plotting logging in MetricCollection (#1783)
  • Fixed lookup for punkt sources being downloaded in RougeScore (#1789)
  • Fixed integration with lightning for CompositionalMetric (#1761)
  • Fixed several bugs in SpectralDistortionIndex metric (#1808)
  • Fixed bug for corner cases in MatthewsCorrCoef (#1812, #1863)
  • Fixed support for half precision in PearsonCorrCoef (#1819)
  • Fixed a number of bugs related to average="macro" in classification metrics (#1821)
  • Fixed off-by-one issue when ignore_index = num_classes + 1 in MulticlassJaccardIndex (#1860)


Minor patch release


[0.11.4] - 2023-03-10

Fixed

  • Fixed evaluation of R2Score with a near-constant target (#1576)
  • Fixed dtype conversion when the metric is a submodule (#1583)
  • Fixed bug related to top_k>1 and ignore_index!=None in StatScores based metrics (#1589)
  • Fixed corner case for PearsonCorrCoef when running in DDP mode but only on a single device (#1587)
  • Fixed overflow error for specific cases in MAP when big areas are calculated (#1607)

Contributors

@Borda, @FarzanT, @SkafteNicki

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v0.11.3...v0.11.4

Minor patch release


[0.11.3] - 2023-02-28

Fixed

  • Fixed classification metrics for byte input (#1521)
  • Fixed the use of ignore_index in MulticlassJaccardIndex (#1386)

Contributors

@SkafteNicki, @vincentvaroquauxads

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v0.11.2...v0.11.3

Minor patch release


[0.11.2] - 2023-02-21

Fixed

  • Fixed XLA compatibility in the _bincount function (#1471)
  • Fixed type hints in methods belonging to MetricTracker wrapper (#1472)
  • Fixed multilabel in ExactMatch (#1474)

Contributors

@7shoe, @Borda, @SkafteNicki, @ValerianRey

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v0.11.1...v0.11.2

Minor patch release


[0.11.1] - 2023-01-30

Fixed

  • Fixed type checking on the maximize parameter at the initialization of MetricTracker (#1428)
  • Fixed mixed precision auto-cast for SSIM metric (#1454)
  • Fixed checking for nltk.punkt in RougeScore if a machine is not online (#1456)
  • Fixed incorrectly implemented reset method in MultioutputWrapper (#1460)
  • Fixed dtype checking in PrecisionRecallCurve for target tensor (#1457)

Contributors

@Borda, @SkafteNicki, @stancld

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Full Changelog: v0.11.0...v0.11.1

Adding Multimodal and nominal domain


We are happy to announce that Torchmetrics v0.11 is now publicly available. In Torchmetrics v0.11 we have primarily focused on cleaning up after the large classification refactor from v0.10 and on adding new metrics. With v0.11 we are crossing 90+ metrics in Torchmetrics, nearing the milestone of 100+ metrics.

New domains

In Torchmetrics we are not only looking to expand with new metrics in already established metric domains such as classification or regression, but also into new domains. We are therefore happy to report that v0.11 includes two new domains: multimodal and nominal.

Multimodal

If there is one topic within machine learning that is hot right now, it is generative models, in particular text-to-image generative models. Just recently, Stable Diffusion v2 was released, able to create even more photorealistic images from a single text prompt than ever before.

In Torchmetrics v0.11 we are adding a new domain called multimodal to support the evaluation of such models. For now, we are starting out with a single metric, the CLIPScore from Hessel et al. (2021), which can be used to evaluate such text-to-image models. CLIPScore currently achieves the highest correlation with human judgment; a high CLIPScore for an image-text pair means that it is highly plausible that the image and its caption are related to each other.
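
As a minimal sketch of how the metric can be used (the random image tensor below is just a placeholder for a generated sample, and the metric downloads a CLIP checkpoint from Hugging Face on first use, so the transformers package must be installed):

import torch
from torchmetrics.multimodal import CLIPScore

metric = CLIPScore(model_name_or_path="openai/clip-vit-base-patch16")
# a fake 3x224x224 uint8 image standing in for a generated sample
image = torch.randint(255, (3, 224, 224), dtype=torch.uint8)
score = metric(image, "a photo of a cat")
print(score)  # higher means image and caption are more related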

Nominal

If you have ever taken a course in statistics or an introduction to machine learning, you have hopefully heard that data attributes can be of different types: nominal, ordinal, interval, and ratio. This essentially refers to how values can be compared. Nominal data can neither be ordered nor measured numerically: for example, data describing the color of your car (blue, red, or green) has values that cannot be meaningfully compared against each other. Ordinal data can be ordered, but the values carry no relative magnitude: for a car safety rating of 1, 2, or 3, we can say that 3 is better than 1, but the numerical values themselves do not mean anything beyond the ordering.

In v0.11 of TorchMetrics, we are adding support for classic metrics on nominal data. In fact, 4 new metrics have already been added to this domain:

  • CramersV
  • PearsonsContingencyCoefficient
  • TschuprowsT
  • TheilsU

All metrics are measures of association between two nominal variables, giving a value between 0 and 1, with 1 meaning that there is a perfect association between the variables.
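
As a minimal sketch, assuming two integer-encoded nominal variables with five categories each (the random data below is just a placeholder):

import torch
from torchmetrics.nominal import CramersV

# two integer-encoded nominal variables, e.g. car color vs. car brand
preds = torch.randint(0, 5, (100,))
target = torch.randint(0, 5, (100,))

cramers_v = CramersV(num_classes=5)
print(cramers_v(preds, target))  # value in [0, 1]; 1 = perfect association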

Small improvements

In addition to metrics within the two new domains, v0.11 of Torchmetrics contains other smaller changes and fixes:

  • TotalVariation metric has been added to the image package, which measures the complexity of an image with respect to its spatial variation.

  • MulticlassExactMatch metric has been added to the classification package, which for example can be used to measure sentence-level accuracy, where all tokens need to match for a sentence to be counted as correct.

  • KendallRankCorrCoef has been added to the regression package for measuring the overall correlation between two variables.

  • LogCoshError has been added to the regression package for measuring the residual error between two variables. It behaves like the mean squared error close to 0 but like the mean absolute error away from 0 (see the sketch below).
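
A small sketch illustrating that behaviour, using the functional version of the metric (the input values below are arbitrary):

import torch
from torchmetrics.functional import log_cosh_error

# log(cosh(x)) ~ x^2 / 2 for small residuals (MSE-like)
# log(cosh(x)) ~ |x| - log(2) for large residuals (MAE-like)
small = log_cosh_error(torch.tensor([0.01]), torch.tensor([0.0]))
large = log_cosh_error(torch.tensor([10.0]), torch.tensor([0.0]))
print(small)  # ~ 0.01**2 / 2 = 5e-5
print(large)  # ~ 10 - log(2) = 9.3069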


Finally, Torchmetrics now only supports v1.8 and higher of PyTorch. Raising the minimum version from v1.3 was necessary because we were running into compatibility issues with older versions of PyTorch. We strive to support as many versions of PyTorch as possible, but for the best experience, we always recommend keeping PyTorch and Torchmetrics up to date.


[0.11.0] - 2022-11-30

Added

  • Added MulticlassExactMatch to classification metrics (#1343)
  • Added TotalVariation to image package (#978)
  • Added CLIPScore to new multimodal package (#1314)
  • Added regression metrics:
    • KendallRankCorrCoef (#1271)
    • LogCoshError (#1316)
  • Added new nominal metrics:
    • CramersV
    • PearsonsContingencyCoefficient
    • TschuprowsT
    • TheilsU
  • Added option to pass distributed_available_fn to metrics to allow checks for custom communication backend for making dist_sync_fn actually useful (#1301)
  • Added normalize argument to Inception, FID, KID metrics (#1246)

Changed

  • Changed minimum Pytorch version to be 1.8 (#1263)
  • Changed interface for all functional and modular classification metrics after refactor (#1252)

Removed

  • Removed deprecated BinnedAveragePrecision, BinnedPrecisionRecallCurve, RecallAtFixedPrecision (#1251)
  • Removed deprecated LabelRankingAveragePrecision, LabelRankingLoss and CoverageError (#1251)
  • Removed deprecated KLDivergence and AUC (#1251)

Fixed

  • Fixed precision bug in pairwise_euclidean_distance (#1352)

Contributors

@Borda, @justusschock, @ragavvenkatesan, @shenoynikhil, @SkafteNicki, @stancld

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Minor patch release


[0.10.3] - 2022-11-16

Fixed

  • Fixed bug in MetricTracker.best_metric when return_step=False (#1306)
  • Fixed bug to prevent users from going into an infinite loop when trying to iterate over a single metric (#1320)

Contributors

@SkafteNicki

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Fixed Performance


[0.10.2] - 2022-10-31

Changed

  • Changed in-place operation to out-of-place operation in pairwise_cosine_similarity (#1288)

Fixed

  • Fixed high memory usage for certain classification metrics when average='micro' (#1286)
  • Fixed precision problems when structural_similarity_index_measure was used with autocast (#1291)
  • Fixed slow performance for confusion matrix-based metrics (#1302)
  • Fixed restrictive dtype checking in spearman_corrcoef when used with autocast (#1303)

Contributors

@SkafteNicki

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Minor patch release


[0.10.1] - 2022-10-21

Fixed

  • Fixed broken clone method for classification metrics (#1250)
  • Fixed unintentional downloading of nltk.punkt when lsum not in rouge_keys (#1258)
  • Fixed type casting in MAP metric between bool and float32 (#1150)

Contributors

@dreaquil, @SkafteNicki, @stancld

If we forgot someone due to not matching commit email with GitHub account, let us know :]