Releases: Lightning-AI/torchmetrics
Weekly patch release
[1.0.1] - 2023-07-13
Fixed
- Fixed corner case when using MetricCollection together with aggregation metrics (#1896)
- Fixed the use of max_fpr in AUROC metric when only one class is present (#1895)
- Fixed bug related to empty predictions for IntersectionOverUnion metric (#1892)
- Fixed bug related to MeanMetric and broadcasting of weights when NaNs are present (#1898)
- Fixed bug related to expected input format of pycoco in MeanAveragePrecision (#1913)
Contributors
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Visualize metrics
We are happy to announce that the first major release of Torchmetrics, version v1.0, is publicly available. We have worked hard on a couple of new features for this milestone release, but for v1.0.0, we have also managed to implement over 100 metrics in torchmetrics.
Plotting
The big new feature of v1.0 is a built-in plotting feature. As the old saying goes: "A picture is worth a thousand words". Within machine learning, this is definitely also true for many things.
Metrics are one area that, in some cases, is definitely better showcased in a figure than as a list of floats. The only requirement for getting started with the plotting feature is installing matplotlib. Either install with pip install matplotlib or pip install torchmetrics[visual] (the latter option also installs Scienceplots and uses that as the default plotting style).
The basic interface is the same for any metric. Just call the new .plot method:
    metric = AnyMetricYouLike()
    for i in range(num_updates):
        metric.update(preds[i], target[i])
    fig, ax = metric.plot()
The plot method by default does not require any arguments and will automatically call metric.compute internally on whatever metric states have been accumulated.
[1.0.0] - 2023-07-04
Added
- Added prefix and postfix arguments to ClasswiseWrapper (#1866)
- Added speech-to-reverberation modulation energy ratio (SRMR) metric (#1792, #1872)
- Added new global arg compute_with_cache to control caching behaviour after compute method (#1754)
- Added ComplexScaleInvariantSignalNoiseRatio for audio package (#1785)
- Added Running wrapper for calculating running statistics (#1752)
- Added RelativeAverageSpectralError and RootMeanSquaredErrorUsingSlidingWindow to image package (#816)
- Added support for SpecificityAtSensitivity metric (#1432)
- Added support for plotting of metrics through .plot() method (#1328, #1481, #1480, #1490, #1581, #1585, #1593, #1600, #1605, #1610, #1609, #1621, #1624, #1623, #1638, #1631, #1650, #1639, #1660, #1682, #1786)
- Added support for plotting of audio metrics through .plot() method (#1434)
- Added classes to output from MAP metric (#1419)
- Added Binary group fairness metrics to classification package (#1404)
- Added MinkowskiDistance to regression package (#1362)
- Added pairwise_minkowski_distance to pairwise package (#1362)
- Added new detection metric PanopticQuality (#929, #1527)
- Added PSNRB metric (#1421)
- Added ClassificationTask Enum and use in metrics (#1479)
- Added ignore_index option to exact_match metric (#1540)
- Added parameter top_k to RetrievalMAP (#1501)
- Added support for deterministic evaluation on GPU for metrics that use torch.cumsum operator (#1499)
- Added support for plotting of aggregation metrics through .plot() method (#1485)
- Added support for python 3.11 (#1612)
- Added support for auto clamping of input for metrics that use the data_range (#1606)
- Added ModifiedPanopticQuality metric to detection package (#1627)
- Added PrecisionAtFixedRecall metric to classification package (#1683)
- Added multiple metrics to detection package (#1284)
  - IntersectionOverUnion
  - GeneralizedIntersectionOverUnion
  - CompleteIntersectionOverUnion
  - DistanceIntersectionOverUnion
- Added MultitaskWrapper to wrapper package (#1762)
- Added RelativeSquaredError metric to regression package (#1765)
- Added MemorizationInformedFrechetInceptionDistance metric to image package (#1580)
Changed
- Changed permutation_invariant_training to allow using a 'permutation-wise' metric function (#1794)
- Changed update_count and update_called from private to public methods (#1370)
- Raise exception for invalid kwargs in Metric base class (#1427)
- Extend EnumStr raising ValueError for invalid value (#1479)
- Improve speed and memory consumption of binned PrecisionRecallCurve with large number of samples (#1493)
- Changed __iter__ method from raising NotImplementedError to TypeError by setting to None (#1538)
- FID metric will now raise an error if too few samples are provided (#1655)
- Allowed FID with torch.float64 (#1628)
- Changed LPIPS implementation to no longer rely on third-party package (#1575)
- Changed FID matrix square root calculation from scipy to torch (#1708)
- Changed calculation in PearsonCorrCoef to be more robust in certain cases (#1729)
- Changed MeanAveragePrecision to pycocotools backend (#1832)
Deprecated
Removed
- Support for python 3.7 (#1640)
Fixed
- Fixed support in MetricTracker for MultioutputWrapper and nested structures (#1608)
- Fixed restrictive check in PearsonCorrCoef (#1649)
- Fixed integration with jsonargparse and LightningCLI (#1651)
- Fixed corner case in calibration error for zero confidence input (#1648)
- Fixed precision-recall curve based computations for float target (#1642)
- Fixed missing kwarg squeeze in MultioutputWrapper (#1675)
- Fixed padding removal for 3d input in MSSSIM (#1674)
- Fixed max_det_threshold in MAP detection (#1712)
- Fixed states being saved in metrics that use register_buffer (#1728)
- Fixed states not being correctly synced and device transferred in MeanAveragePrecision for iou_type="segm" (#1763)
- Fixed use of prefix and postfix in nested MetricCollection (#1773)
- Fixed ax plotting logging in MetricCollection (#1783)
- Fixed lookup for punkt sources being downloaded in RougeScore (#1789)
- Fixed integration with lightning for CompositionalMetric (#1761)
- Fixed several bugs in SpectralDistortionIndex metric (#1808)
- Fixed bug for corner cases in MatthewsCorrCoef (#1812, #1863)
- Fixed support for half precision in PearsonCorrCoef (#1819)
- Fixed a number of bugs related to average="macro" in classification metrics (#1821)
- Fixed off-by-one issue when ignore_index = num_classes + 1 in Multiclass-jaccard (#1860)
New Contributors
- @theja-vanka made their first contribution in #1372
- @wilderrodrigues made their first contribution in #1391
- @Freed-Wu made their first contribution in #1402
- @reaganjlee made their first contribution in #1405
- @davidgilbertson made their first contribution in #1412
- @ValerianRey made their first contribution in #1430
- @EPronovost made their first contribution in #1427
- @felixdivo made their first contribution in #1438
- @ivnvalex made their first contribution in #1447
- @PangLuo made their first contribution in #1452
- @JustinGoheen made their first contribution in #1463
- @DavidZhang73 made their first contribution in #1476
- @7shoe made their first contribution in #1474
- @srishti-git1110 made their first contribution in #1481
- @niberger made their first contribution in #929
- @shhs29 made their first contribution in #1434
- @ihowell made their first contribution in #1525
- @venomouscyanide made their first contribution in #1480
- @ItamarChinn made their first contribution in #1540
- @vincentvaroquauxads made their first contribution in #1521
- @Bomme made their first contribution in #1501
- @alexkrz made their first contribution in #1490
- @clay-curry made their first contribution in #1547
- @clueless-skywatcher made their first contribution in #1362
- @marcocaccin made their first contribution in #1527
- @Piyush-97 made their first contribution in #816
- @FarzanT made their first contribution in #1583
- @basveeling made their first contribution in #1651
- @YeaMerci made their first contribution in #1684
- @fkroeber made their first contribution in #1712
- @soma2000-lang made their first contribution in #1421
- @maxi-w made their first contribution in #1726
- @wbeardall made their first contribution in #1765
- @RistoAle97 made their first contribution in #1778
- @cdboer made their first contribution in #1820
- @bot66 made their first contribution in #1828
- ...
Minor patch release
[0.11.4] - 2023-03-10
Fixed
- Fixed evaluation of R2Score with a near-constant target (#1576)
- Fixed dtype conversion when the metric is a submodule (#1583)
- Fixed bug related to top_k>1 and ignore_index!=None in StatScores based metrics (#1589)
- Fixed corner case for PearsonCorrCoef when running in DDP mode but only on a single device (#1587)
- Fixed overflow error for specific cases in MAP when big areas are calculated (#1607)
Contributors
@Borda, @FarzanT, @SkafteNicki
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Full Changelog: v0.11.3...v0.11.4
Minor patch release
[0.11.3] - 2023-02-28
Fixed
- Fixed classification metrics for byte input (#1521)
- Fixed the use of ignore_index in MulticlassJaccardIndex (#1386)
Contributors
@SkafteNicki, @vincentvaroquauxads
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Full Changelog: v0.11.2...v0.11.3
Minor patch release
[0.11.2] - 2023-02-21
Fixed
- Fixed compatibility with XLA in _bincount function (#1471)
- Fixed type hints in methods belonging to MetricTracker wrapper (#1472)
- Fixed multilabel in ExactMatch (#1474)
Contributors
@7shoe, @Borda, @SkafteNicki, @ValerianRey
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Full Changelog: v0.11.1...v0.11.2
Minor patch release
[0.11.1] - 2023-01-30
Fixed
- Fixed type checking on the maximize parameter at the initialization of MetricTracker (#1428)
- Fixed mixed precision auto-cast for SSIM metric (#1454)
- Fixed checking for nltk.punkt in RougeScore if a machine is not online (#1456)
- Fixed wrongly reset method in MultioutputWrapper (#1460)
- Fixed dtype checking in PrecisionRecallCurve for target tensor (#1457)
Contributors
@Borda, @SkafteNicki, @stancld
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Full Changelog: v0.11.0...v0.11.1
Adding Multimodal and nominal domain
We are happy to announce that Torchmetrics v0.11 is now publicly available. In Torchmetrics v0.11 we have primarily focused on the cleanup of the large classification refactor from v0.10 and adding new metrics. With v0.11 we are crossing 90+ metrics in Torchmetrics, nearing the milestone of having 100+ metrics.
New domains
In Torchmetrics we are not only looking to expand with new metrics in already established metric domains such as classification or regression, but also new domains. We are therefore happy to report that v0.11 includes two new domains: Multimodal and nominal.
Multimodal
If there is one topic within machine learning that is hot right now, it is generative models, and in particular text-to-image generative models. Just recently, Stable Diffusion v2 was released, able to create even more photorealistic images from a single text prompt than ever.
In Torchmetrics v0.11 we are adding a new domain called multimodal to support the evaluation of such models. For now, we are starting out with a single metric, the CLIPScore from this paper, that can be used to evaluate models that pair images with text. CLIPScore currently achieves the highest correlation with human judgment, and thus a high CLIPScore for an image-text pair means that it is highly plausible that an image caption and an image are related to each other.
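To make the idea of scoring an image-text pair concrete, here is a minimal sketch of a CLIPScore-style computation. It assumes the score is a scaled cosine similarity between an image embedding and a text embedding, clipped at zero. The embeddings below are made-up toy vectors rather than real CLIP outputs, and the function names and the scale factor of 100 are assumptions for illustration, not the library's API.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clip_style_score(image_emb, text_emb, scale=100.0):
    """Hypothetical helper: scaled cosine similarity, clipped at zero.

    The scale factor is an assumption made for this sketch.
    """
    return max(scale * cosine_similarity(image_emb, text_emb), 0.0)


# Toy embeddings (assumed values); a real pipeline would obtain these
# from the image and text encoders of a CLIP model.
image_embedding = [0.6, 0.8, 0.0]
matching_caption = [0.6, 0.8, 0.0]     # points in the same direction
opposite_caption = [-0.6, -0.8, 0.0]   # points in the opposite direction

score_matching = clip_style_score(image_embedding, matching_caption)
score_opposite = clip_style_score(image_embedding, opposite_caption)
```

A caption whose embedding aligns with the image embedding gets a score near the top of the scale, while a caption pointing the other way is clipped to zero.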
Nominal
If you have ever taken any course in statistics or introduction to machine learning, you should hopefully have heard that data can have different types of attributes: nominal, ordinal, interval, and ratio. This essentially refers to how data can be compared. For example, nominal data cannot be ordered and cannot be measured. An example would be data that describes the color of your car: blue, red, or green. It does not make sense to compare the different values. Ordinal data can be compared but does not have a relative meaning. An example would be the safety rating of a car: 1, 2, 3. We can say that 3 is better than 1, but the actual numerical value does not mean anything.
In v0.11 of TorchMetrics, we are adding support for classic metrics on nominal data. In fact, 4 new metrics have already been added to this domain:
- CramersV
- PearsonsContingencyCoefficient
- TschuprowsT
- TheilsU
All metrics are measures of association between two nominal variables, giving a value between 0 and 1, with 1 meaning that there is a perfect association between the variables.
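As an illustration of what such an association measure computes, the following is a minimal sketch of the textbook, uncorrected Cramér's V on a contingency table of counts. The function name `cramers_v` and the toy tables are assumptions made for this example. The library implementation may differ, for instance by applying a bias correction.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table (list of rows of counts).

    Assumes every row and column has a non-zero total.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by the sample size and the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Toy data: rows are categories of one variable, columns of the other.
perfect_association = [[10, 0], [0, 10]]
no_association = [[5, 5], [5, 5]]

v_perfect = cramers_v(perfect_association)
v_none = cramers_v(no_association)
```

In the first table each category of one variable determines the category of the other, so the value reaches 1. In the second table the variables are independent, so the value is 0.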
Small improvements
In addition to metrics within the two new domains, v0.11 of Torchmetrics contains other smaller changes and fixes:
- TotalVariation metric has been added to the image package, which measures the complexity of an image with respect to its spatial variation.
- MulticlassExactMatch metric has been added to the classification package, which for example can be used to measure sentence level accuracy where all tokens need to match for a sentence to be counted as correct.
- KendallRankCorrCoef has been added to the regression package for measuring the overall correlation between two variables.
- LogCoshError has been added to the regression package for measuring the residual error between two variables. It is similar to the mean squared error close to 0 but similar to the mean absolute error away from 0.
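The behaviour of the log-cosh error near and away from 0 can be checked with a small standard-library sketch. The helper name `log_cosh_error` and the sample values are assumptions for illustration and this is not the library's implementation. For a residual x, log(cosh(x)) is approximately x²/2 when x is small (half the squared error) and approximately |x| - log(2) when x is large (the absolute error up to a constant).

```python
import math


def log_cosh_error(preds, target):
    """Mean of log(cosh(pred - target)) over paired values."""
    total = 0.0
    for p, t in zip(preds, target):
        x = abs(p - t)
        # Numerically stable form: log(cosh(x)) = x + log(1 + exp(-2x)) - log(2)
        total += x + math.log1p(math.exp(-2.0 * x)) - math.log(2.0)
    return total / len(preds)


# Small residual: behaves like half the squared error.
small = log_cosh_error([0.01], [0.0])
small_reference = 0.01 ** 2 / 2

# Large residual: behaves like the absolute error minus log(2).
large = log_cosh_error([20.0], [0.0])
large_reference = 20.0 - math.log(2.0)
```

Comparing `small` with `small_reference` and `large` with `large_reference` shows the two regimes: quadratic growth for small residuals and linear growth for large ones, which makes the error less sensitive to outliers than the mean squared error.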
Finally, Torchmetrics now only supports v1.8 and higher of PyTorch. It was necessary to raise the minimum from v1.3 because we were running into compatibility issues with older versions of PyTorch. We strive to support as many versions of PyTorch as possible, but for the best experience, we always recommend keeping PyTorch and Torchmetrics up to date.
[0.11.0] - 2022-11-30
Added
- Added MulticlassExactMatch to classification metrics (#1343)
- Added TotalVariation to image package (#978)
- Added CLIPScore to new multimodal package (#1314)
- Added regression metrics:
- Added new nominal metrics:
- Added option to pass distributed_available_fn to metrics to allow checks for custom communication backend for making dist_sync_fn actually useful (#1301)
- Added normalize argument to Inception, FID, KID metrics (#1246)
Changed
- Changed minimum Pytorch version to be 1.8 (#1263)
- Changed interface for all functional and modular classification metrics after refactor (#1252)
Removed
- Removed deprecated BinnedAveragePrecision, BinnedPrecisionRecallCurve, RecallAtFixedPrecision (#1251)
- Removed deprecated LabelRankingAveragePrecision, LabelRankingLoss and CoverageError (#1251)
- Removed deprecated KLDivergence and AUC (#1251)
Fixed
- Fixed precision bug in pairwise_euclidean_distance (#1352)
Contributors
@Borda, @justusschock, @ragavvenkatesan, @shenoynikhil, @SkafteNicki, @stancld
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Minor patch release
[0.10.3] - 2022-11-16
Fixed
- Fixed bug in MetricTracker.best_metric when return_step=False (#1306)
- Fixed bug to prevent users from going into an infinite loop if trying to iterate over a single metric (#1320)
Contributors
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Fixed Performance
[0.10.2] - 2022-10-31
Changed
- Changed in-place operation to out-of-place operation in pairwise_cosine_similarity (#1288)
Fixed
- Fixed high memory usage for certain classification metrics when average='micro' (#1286)
- Fixed precision problems when structural_similarity_index_measure was used with autocast (#1291)
- Fixed slow performance for confusion matrix-based metrics (#1302)
- Fixed restrictive dtype checking in spearman_corrcoef when used with autocast (#1303)
Contributors
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Minor patch release
[0.10.1] - 2022-10-21
Fixed
- Fixed broken clone method for classification metrics (#1250)
- Fixed unintentional downloading of nltk.punkt when lsum not in rouge_keys (#1258)
- Fixed type casting in MAP metric between bool and float32 (#1150)
Contributors
@dreaquil, @SkafteNicki, @stancld
If we forgot someone due to not matching commit email with GitHub account, let us know :]