Bug Description
Pielou's Evenness returns nan when a sample vector composed of a single feature is presented. At present, this gets represented as the empty string in the resulting output, and is re-represented as a nan in QIIME.
Downstream consumers of these data (e.g., alpha-group-significance) raise a divide-by-zero error when presented with these data (not shown in the example below).
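For context on where the nan comes from: Pielou's evenness is commonly defined as J = H / ln(S), with H the Shannon entropy and S the number of observed features. With a single observed feature, H = 0 and ln(S) = 0, so the ratio is 0/0. The following is a minimal NumPy sketch of that textbook formula, not scikit-bio's actual code; the function name `pielou_evenness_sketch` is hypothetical.

```python
import numpy as np


def pielou_evenness_sketch(counts):
    """Textbook Pielou's evenness, J = H / ln(S). Illustrative only."""
    counts = np.asarray(counts, dtype=float)
    observed = counts[counts > 0]        # features with non-zero counts
    s = observed.size                    # S: number of observed features
    p = observed / observed.sum()        # relative abundances
    shannon = -np.sum(p * np.log(p))     # H, using the natural log
    # With S == 1 both H and ln(S) are zero, so the division is 0/0 -> nan.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.float64(shannon) / np.log(s)


for vector in ([0, 1, 2, 3], [0, 1, 0, 0], [1, 2, 3, 0]):
    print(vector, pielou_evenness_sketch(vector))
```

Under this definition the single-feature vector `[0, 1, 0, 0]` is undefined rather than zero, which is consistent with the nan reported here.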
Steps to reproduce the behavior
In [13]: import qiime2

In [14]: import pandas as pd

In [15]: import skbio

In [16]: adiv = pd.Series([skbio.diversity.alpha._base.pielou_e(v) for v in ([0,1,2,3], [0,1,0,0], [1,2,3,0])],
    ...:                  index=['foo', 'bar', 'baz'])
    ...:

In [17]: adiv
Out[17]:
foo    0.92062
bar        NaN
baz    0.92062
dtype: float64

In [18]: ar = qiime2.Artifact.import_data('SampleData[AlphaDiversity]', adiv)

In [19]: reloaded = ar.view(pd.Series)

In [20]: reloaded
Out[20]:
foo    0.92062
bar        NaN
baz    0.92062
Name: 0, dtype: float64
Expected behavior
It's not clear yet whether this is the correct behavior.
Comments
After internal discussion, we thought it made sense to first open an issue here with q2-diversity as it is an open question whether we should allow nan in the output, whether consumers should be expected to gracefully handle these data, or whether this represents a bug in scikit-bio's Pielou's Evenness implementation.
This is a pathological case, as it is unusual for a sample to contain only a single feature. However, this scenario arose on two independent executions of the C. diff FMT QIIME2 tutorial, so it does appear to be possible with real data.
q2-diversity-lib offers a _drop_undefined_samples utility, and this behavior is already wired up in the pielou_evenness implementation there. Undefined samples are not dropped by default, preferring to let users make that choice actively.
IIRC, our original decision not to expose this parameter in alpha and core-metrics was just about API simplicity. I suspect wiring it up would be pretty easy if we decide there's value in doing so.
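To make the opt-in behavior concrete, here is a rough pandas sketch of what dropping undefined samples could look like. It is not the q2-diversity-lib implementation of `_drop_undefined_samples`; the helper name `drop_undefined` and its `drop` flag are hypothetical, and the values are copied from the reproduction above.

```python
import numpy as np
import pandas as pd


def drop_undefined(alpha, drop=False):
    """Hypothetical helper: remove samples whose metric is NaN, if asked to."""
    # Default keeps undefined samples so the caller chooses explicitly.
    return alpha.dropna() if drop else alpha


adiv = pd.Series([0.92062, np.nan, 0.92062], index=["foo", "bar", "baz"])

kept = drop_undefined(adiv)                # 'bar' is still present as NaN
dropped = drop_undefined(adiv, drop=True)  # 'bar' is removed

print(kept)
print(dropped)
```

Keeping `drop=False` as the default mirrors the choice described above; flipping it would silently shrink the sample set for anyone not passing the flag.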
Good to know. IMO it would be better to have that flag default to True (drop), as I'm not aware of any direct utility in keeping the Nones in the results, but I might be missing something ... thoughts?