Merge release/v1.1.0 to develop #5190

Merged
merged 35 commits into from
Dec 2, 2024
Changes from 31 commits
Commits
0bc217e
treat delegated execution separately
brimoor Nov 23, 2024
5d835ef
leaky splits docs
Nov 25, 2024
46c53ba
fixed name of method
Nov 25, 2024
e345c05
added datasetview as an input option
Nov 25, 2024
40ff57c
fiftyone brain -> the fiftyone brain
Nov 25, 2024
5a823a4
of identifying -> to identify
Nov 25, 2024
d67015a
made language a little more professional
Nov 25, 2024
02d6e70
leakyness -> leakiness
Nov 25, 2024
259e14e
fix
Nov 25, 2024
cdd4a48
Fix deleted datasets on App server (#5183)
benjaminpkane Nov 25, 2024
5834afc
Merge branch 'release/v1.1.0' of https://github.com/voxel51/fiftyone …
voxel51-bot Nov 25, 2024
19ebdad
added more snippets, more thorough explainations
Nov 25, 2024
873287e
compressed image
Nov 25, 2024
cf76f7a
fix confusion matrix in model evaluation panel (#5186)
imanjra Nov 25, 2024
8a72af6
Merge branch 'release/v1.1.0' of https://github.com/voxel51/fiftyone …
voxel51-bot Nov 25, 2024
3a2e112
Updated doc for `panel.img` (#5192)
minhtuev Nov 26, 2024
87bf17c
Merge branch 'release/v1.1.0' of https://github.com/voxel51/fiftyone …
voxel51-bot Nov 26, 2024
241726a
use static execution options where possible
brimoor Nov 26, 2024
ac6799c
Merge pull request #5184 from voxel51/use-ctx-delegate
brimoor Nov 26, 2024
a992ac4
Merge branch 'release/v1.1.0' of https://github.com/voxel51/fiftyone …
voxel51-bot Nov 26, 2024
0900594
typo
Nov 26, 2024
e948e4d
cleaned up example code
Nov 26, 2024
d88e777
bugfix: y-axis autorange reversed on all evaluation chart (#5193)
lanzhenw Nov 26, 2024
a82749d
Merge branch 'release/v1.1.0' of https://github.com/voxel51/fiftyone …
voxel51-bot Nov 26, 2024
0c51930
fix pending evaluation also being listed as completed
imanjra Nov 26, 2024
5e32142
Merge branch 'release/v1.1.0' of https://github.com/voxel51/fiftyone …
voxel51-bot Nov 26, 2024
738b128
Merge pull request #5189 from voxel51/leaky-splits-docs
jacobsela Nov 26, 2024
835ccec
Merge branch 'release/v1.1.0' of https://github.com/voxel51/fiftyone …
voxel51-bot Nov 26, 2024
85e2265
add guards for missing dimensions and location in detection label
sashankaryal Nov 27, 2024
34961bb
Merge pull request #5195 from voxel51/fix/3d-bad-labels
sashankaryal Nov 27, 2024
3ae877c
Merge branch 'release/v1.1.0' of https://github.com/voxel51/fiftyone …
voxel51-bot Nov 27, 2024
9f690fe
Updated from Slack to Discord (#5196)
minhtuev Nov 27, 2024
4867bd5
Merge branch 'release/v1.1.0' of https://github.com/voxel51/fiftyone …
voxel51-bot Nov 27, 2024
7287351
make icon in panel table have circle background on hover (#5197)
CamronStaley Nov 27, 2024
a16fb12
Merge branch 'release/v1.1.0' of https://github.com/voxel51/fiftyone …
voxel51-bot Nov 27, 2024
Original file line number Diff line number Diff line change
Expand Up @@ -1155,12 +1155,28 @@ export default function Evaluation(props: EvaluationProps) {
colorscale: confusionMatrixConfig.log
? confusionMatrix?.colorscale || "viridis"
: "viridis",
hovertemplate:
[
"<b>count: %{z:d}</b>",
`${
evaluation?.info?.config?.gt_field || "truth"
}: %{y}`,
`${
evaluation?.info?.config?.pred_field ||
"predicted"
}: %{x}`,
].join(" <br>") + "<extra></extra>",
},
]}
onClick={({ points }) => {
const firstPoint = points[0];
loadView("matrix", { x: firstPoint.x, y: firstPoint.y });
}}
layout={{
yaxis: {
autorange: "reversed",
},
}}
/>
</Stack>
{compareKey && (
Expand All @@ -1183,8 +1199,24 @@ export default function Evaluation(props: EvaluationProps) {
colorscale: confusionMatrixConfig.log
? compareConfusionMatrix?.colorscale || "viridis"
: "viridis",
hovertemplate:
[
"<b>count: %{z:d}</b>",
`${
evaluation?.info?.config?.gt_field || "truth"
}: %{y}`,
`${
evaluation?.info?.config?.pred_field ||
"predicted"
}: %{x}`,
].join(" <br>") + "<extra></extra>",
},
]}
layout={{
yaxis: {
autorange: "reversed",
},
}}
/>
</Stack>
)}
Expand Down Expand Up @@ -1532,8 +1564,8 @@ function getMatrix(matrices, config) {
if (!matrices) return;
const { sortBy = "az", limit } = config;
const parsedLimit = typeof limit === "number" ? limit : undefined;
const classes = matrices[`${sortBy}_classes`].slice(parsedLimit);
const matrix = matrices[`${sortBy}_matrix`].slice(parsedLimit);
const classes = matrices[`${sortBy}_classes`].slice(0, parsedLimit);
const matrix = matrices[`${sortBy}_matrix`].slice(0, parsedLimit);
const colorscale = matrices[`${sortBy}_colorscale`];
return { labels: classes, matrix, colorscale };
}
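Note on this hunk: changing `.slice(parsedLimit)` to `.slice(0, parsedLimit)` switches from dropping the first `limit` entries to keeping them. Below is a minimal Python analogue of that difference. The list contents and the names `classes` and `limit` are illustrative assumptions, not values from the codebase.

```python
# Hypothetical class labels, already sorted; not data from the PR.
classes = ["cat", "dog", "bird", "fish", "horse"]
limit = 2

# Analogue of the old JS `classes.slice(limit)`: skips the first `limit` items.
dropped_head = classes[limit:]

# Analogue of the fixed JS `classes.slice(0, limit)`: keeps the first `limit` items.
kept_head = classes[:limit]

# With no limit (JS `undefined`, Python `None`), the fixed form returns everything.
no_limit = classes[:None]

print(dropped_head)  # ['bird', 'fish', 'horse']
print(kept_head)     # ['cat', 'dog']
```

With a limit of 2, the old form would have shown every class except the top two, which is the opposite of what a "limit" control suggests.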
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
import { useTheme } from "@fiftyone/components";
import { merge } from "lodash";
import React, { useMemo } from "react";
import Plot, { PlotParams } from "react-plotly.js";

Expand Down Expand Up @@ -31,7 +32,6 @@ export default function EvaluationPlot(props: EvaluationPlotProps) {
color: theme.text.secondary,
gridcolor: theme.primary.softBorder,
automargin: true, // Enable automatic margin adjustment
scaleanchor: "x",
},
autosize: true,
margin: { t: 20, l: 50, b: 50, r: 20, pad: 0 },
Expand All @@ -45,6 +45,11 @@ export default function EvaluationPlot(props: EvaluationPlotProps) {
},
};
}, [theme]);

const mergedLayout = useMemo(() => {
return merge({}, layoutDefaults, layout);
}, [layoutDefaults, layout]);

const configDefaults: PlotConfig = useMemo(() => {
return {
displaylogo: false,
Expand All @@ -66,7 +71,7 @@ export default function EvaluationPlot(props: EvaluationPlotProps) {
return (
<Plot
config={configDefaults}
layout={{ ...layoutDefaults, ...layout }}
layout={mergedLayout}
style={{ height: "100%", width: "100%", zIndex: 1, ...style }}
data={data}
{...otherProps}
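Note on this hunk: replacing the object spread with lodash `merge` matters once callers pass a partial `yaxis` (such as `autorange: "reversed"` in the hunks above), because a shallow spread replaces the whole nested object. The sketch below is a rough Python analogue. `deep_merge` is a hypothetical helper written for illustration, the layout values are made up, and lodash's special handling of arrays and `undefined` values is not reproduced.

```python
import copy


def deep_merge(base, override):
    """Recursively merge `override` into a copy of `base` (nested dicts only)."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative values only; the real defaults are derived from the theme.
layout_defaults = {"yaxis": {"automargin": True, "gridcolor": "#ccc"}, "autosize": True}
layout = {"yaxis": {"autorange": "reversed"}}

# Shallow merge: the caller's `yaxis` replaces the default `yaxis` entirely.
shallow = {**layout_defaults, **layout}

# Deep merge: the default `yaxis` keys survive alongside the override.
deep = deep_merge(layout_defaults, layout)

print(shallow["yaxis"])  # {'autorange': 'reversed'}
print(deep["yaxis"])     # {'automargin': True, 'gridcolor': '#ccc', 'autorange': 'reversed'}
```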
Expand Down
6 changes: 5 additions & 1 deletion app/packages/looker-3d/src/labels/index.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -135,7 +135,11 @@ export const ThreeDLabels = ({ sampleMap }: ThreeDLabelsProps) => {
const newPolylineOverlays = [];

for (const overlay of rawOverlays) {
if (overlay._cls === "Detection") {
if (
overlay._cls === "Detection" &&
overlay.dimensions &&
overlay.location
) {
newCuboidOverlays.push(
<Cuboid
key={`cuboid-${overlay.id ?? overlay._id}-${overlay.sampleId}`}
Expand Down
2 changes: 1 addition & 1 deletion app/packages/looker/src/overlays/detection.ts
Original file line number Diff line number Diff line change
Expand Up @@ -88,7 +88,7 @@ export default class DetectionOverlay<
this.label.mask && this.drawMask(ctx, state);
!state.config.thumbnail && this.drawLabelText(ctx, state);

if (this.is3D) {
if (this.is3D && this.label.dimensions && this.label.location) {
this.fillRectFor3d(ctx, state, this.getColor(state));
} else {
this.strokeRect(ctx, state, this.getColor(state));
Expand Down
6 changes: 5 additions & 1 deletion app/packages/looker/src/worker/threed-label-processor.ts
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,10 @@ const getInferredParamsForUndefinedProjection = (

if (cls === DETECTIONS) {
for (const detection of label.detections as DetectionLabel[]) {
if (!detection.location || !detection.dimensions) {
continue;
}

const [x, y] = detection.location;
const [lx, ly] = detection.dimensions;

Expand All @@ -55,7 +59,7 @@ const getInferredParamsForUndefinedProjection = (
minY = Math.min(minY, y - ly / 2);
maxY = Math.max(maxY, y + ly / 2);
}
} else if (cls === "Detection") {
} else if (cls === "Detection" && label.location && label.dimensions) {
const [x, y] = label.location as DetectionLabel["location"];
const [lx, ly] = label.dimensions as DetectionLabel["dimensions"];

Expand Down
148 changes: 148 additions & 0 deletions docs/source/brain.rst
Original file line number Diff line number Diff line change
Expand Up @@ -74,6 +74,13 @@ workflow:
examples to train on in your data and for visualizing common modes of the
data.

* :ref:`Leaky Splits <brain-image-leaky-splits>`:
Often when sourcing data en masse, duplicates and near duplicates can slip
through the cracks. The FiftyOne Brain offers a *leaky-splits analysis* that
can be used to find potential leaks between dataset splits. Such leaks can
be misleading when evaluating a model, giving an overly optimistic measure
of the quality of training.

.. note::

Check out the :ref:`tutorials page <tutorials>` for detailed examples
Expand Down Expand Up @@ -1759,6 +1766,147 @@ samples being less representative and closer samples being more representative.
:alt: representativeness
:align: center


.. _brain-image-leaky-splits:

Leaky Splits
____________

Despite our best efforts, duplicates and other forms of non-IID samples
show up in our data. When these samples end up in different splits, model
evaluation suffers: test samples that closely resemble training samples make
it easy to overestimate model capability. The FiftyOne Brain offers a way
to identify such cases in dataset splits.

The leaks of a |Dataset| or |DatasetView| can be computed directly without the need
for the predictions of a pre-trained model via the
:meth:`compute_leaky_splits() <fiftyone.brain.compute_leaky_splits>`
method. The splits of a dataset can be defined in three ways: through tags, by
tagging samples with their corresponding split; through a field, by giving each
split a unique value in that field; or through views, by having views
corresponding to each split:

.. code-block:: python
:linenos:

import fiftyone as fo
import fiftyone.brain as fob

dataset = fo.load_dataset(...)

# splits via tags
split_tags = ['train', 'test']
index, leaks = fob.compute_leaky_splits(dataset, split_tags=split_tags)

# splits via field
split_field = 'split'  # name of the field holding split values, e.g. 'train' or 'test'
index, leaks = fob.compute_leaky_splits(dataset, split_field=split_field)

# splits via views
split_views = {
'train': some_view,
'test': some_other_view,
}
index, leaks = fob.compute_leaky_splits(dataset, split_views=split_views)

Here is a sample snippet to run this on the `COCO <https://cocodataset.org/#home>`_ dataset.
Try it for yourself and see what you may find.

.. code-block:: python
:linenos:

import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.utils.random as four
from fiftyone.brain import compute_leaky_splits

# load coco
dataset = foz.load_zoo_dataset("coco-2017", split="test")

# set up splits via tags
dataset.untag_samples(dataset.distinct("tags"))
four.random_split(dataset, {"train": 0.7, "test": 0.3})

# compute leaks
index, leaks = compute_leaky_splits(dataset, split_tags=['train', 'test'])

Once you have these leaks, it is wise to look through them. You may gain some insight
into the source of the leaks.

.. code-block:: python
:linenos:

session = fo.launch_app(leaks)

Before evaluating your model on your test set, consider getting a version of it
with the leaks removed. This can be easily done with the built-in method
:meth:`no_leaks_view() <fiftyone.brain.internal.core.leaky_splits.LeakySplitsIndex.no_leaks_view>`.

.. code-block:: python
:linenos:

# if you already have it
test_set = some_view

# can also be found with the variable `split_views` from the index
# make sure to put in the right string based on the field/tag/key in view dict
# passed when building the index
test_set = index.split_views['test']

test_set_no_leaks = index.no_leaks_view(test_set) # return a view with leaks removed
session.view = test_set_no_leaks

# do evaluations on test_set_no_leaks rather than test_set

Performance on the clean test set can be closer to the performance of the
model in the wild. If you found some leaks in your dataset, consider comparing
performance on the base test set against the clean test set.

**Input**: A |Dataset| or |DatasetView|, and a definition of splits through one
of tags, a field, or views.

**Output**: An index that will allow you to look through your leaks and
provides some useful actions once they are discovered such as automatically
cleaning the dataset with
:meth:`no_leaks_view() <fiftyone.brain.internal.core.leaky_splits.LeakySplitsIndex.no_leaks_view>`
or tagging them for the future with
:meth:`tag_leaks() <fiftyone.brain.internal.core.leaky_splits.LeakySplitsIndex.tag_leaks>`.
Besides this, a view with all leaks is returned. Visualization of this view
can give you an insight into the source of the leaks in your dataset.

**What to expect**: The leaky-splits analysis finds leaks by embedding samples with a powerful
model and finding very close samples in different splits in this space. Large,
powerful models that were *not* trained on a dataset can provide insight into
visual and semantic similarity between images, without creating further leaks
in the process.
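
The idea described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the Brain's implementation: the helper names `cosine_distance` and `find_cross_split_leaks`, the use of cosine distance, and the toy 2-D embeddings are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def find_cross_split_leaks(embeddings, splits, threshold):
    """Return (id_a, id_b, distance) for close pairs that sit in different splits."""
    ids = sorted(embeddings)
    leaks = []
    for i, id_a in enumerate(ids):
        for id_b in ids[i + 1:]:
            if splits[id_a] == splits[id_b]:
                continue  # same split, so not a leak
            dist = cosine_distance(embeddings[id_a], embeddings[id_b])
            if dist <= threshold:
                leaks.append((id_a, id_b, dist))
    return leaks


# Toy 2-D "embeddings"; real ones would come from a model.
embeddings = {
    "img1": [1.0, 0.0],
    "img2": [0.99, 0.05],  # near-duplicate of img1, in another split
    "img3": [0.0, 1.0],
    "img4": [0.02, 1.0],   # near-duplicate of img3, but in the same split
}
splits = {"img1": "train", "img2": "test", "img3": "train", "img4": "train"}

leaks = find_cross_split_leaks(embeddings, splits, threshold=0.01)
print([(a, b) for a, b, _ in leaks])  # [('img1', 'img2')]
```

Only the pair that straddles two splits is reported; the near-duplicates inside `train` are ignored because they do not affect evaluation on `test`.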

**Similarity**: At its core, the leaky-splits module is a wrapper for the brain's
:class:`SimilarityIndex <fiftyone.brain.similarity.SimilarityIndex>`. Any similarity
backend (see :ref:`similarity backends <brain-similarity-backends>`) that implements
the :class:`DuplicatesMixin <fiftyone.brain.similarity.DuplicatesMixin>` can be used
to compute leaky splits. You can either pass an existing similarity index by passing
its brain key to the argument `similarity_brain_key`, or have the method create one on
the fly for you. If there is a specific configuration for `Similarity` you would like
to use, pass it in the argument `similarity_config_dict`.

**Models and Embeddings**: If you opt for the method to create a `SimilarityIndex`
for you, you can still bring your own model by passing it in the `model` argument.
Alternatively, compute embeddings and pass the field that they reside on. We will
handle the rest.

**Thresholds**: The leaky-splits module uses a threshold to decide what samples
are 'too close' and mark them as potential leaks. This threshold can be changed
either by passing a value to the `threshold` argument of the `compute_leaky_splits()`
method, or by using the
:meth:`set_threshold() <fiftyone.brain.internal.core.leaky_splits.SimilarityIndex.set_threshold>`
method. The best value for your use-case may vary depending on your dataset, as well
as the embeddings used. A threshold that's too large will produce many false positives,
while one that's too small will produce many false negatives.
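
The trade-off can be illustrated with a small, self-contained sketch. The distances and duplicate labels below are made up, `count_errors` is a hypothetical helper, and flagging pairs with `distance <= threshold` is an assumption about how the comparison works rather than a description of the library.

```python
# Hypothetical cross-split pairs: (distance, is_true_duplicate).
pairs = [
    (0.01, True),
    (0.03, True),
    (0.08, False),
    (0.12, True),
    (0.30, False),
]


def count_errors(pairs, threshold):
    """Count false positives and false negatives when flagging distance <= threshold."""
    false_pos = sum(1 for dist, dup in pairs if dist <= threshold and not dup)
    false_neg = sum(1 for dist, dup in pairs if dist > threshold and dup)
    return false_pos, false_neg


for threshold in (0.02, 0.05, 0.50):
    fp, fn = count_errors(pairs, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this toy data the smallest threshold misses two true duplicates, while the largest flags two unrelated pairs. In practice, inspecting the flagged samples in the App at a few threshold values is a reasonable way to pick one.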

.. image:: /images/brain/brain-leaky-splits.png
:alt: leaky-splits
:align: center

.. _brain-managing-runs:

Managing brain runs
Expand Down
Binary file added docs/source/images/brain/brain-leaky-splits.png
25 changes: 10 additions & 15 deletions docs/source/plugins/developing_plugins.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1142,10 +1142,10 @@ executed in the background while you continue to use the App.
There are a variety of options available for configuring whether a given
operation should be delegated or executed immediately.

.. _operator-delegation-configuration:
.. _operator-execution-options:

Delegation configuration
~~~~~~~~~~~~~~~~~~~~~~~~
Execution options
~~~~~~~~~~~~~~~~~

You can provide the optional properties described below in the
:ref:`operator's config <operator-config>` to specify the available execution
Expand Down Expand Up @@ -1183,12 +1183,12 @@ user to choose between the supported options if there are multiple:
.. image:: /images/plugins/operators/operator-execute-button.png
:align: center

.. _operator-execution-options:
.. _dynamic-execution-options:

Execution options
~~~~~~~~~~~~~~~~~
Dynamic execution options
~~~~~~~~~~~~~~~~~~~~~~~~~

Operators can implement
Operators may also implement
:meth:`resolve_execution_options() <fiftyone.operators.operator.Operator.resolve_execution_options>`
to dynamically configure the available execution options based on the current
execution context:
Expand Down Expand Up @@ -1238,14 +1238,9 @@ of the current view:
# Force delegation for large views and immediate execution for small views
return len(ctx.view) > 1000

.. note::

If :meth:`resolve_delegation() <fiftyone.operators.operator.Operator.resolve_delegation>`
is not implemented or returns `None`, then the choice of execution mode is
deferred to
:meth:`resolve_execution_options() <fiftyone.operators.operator.Operator.resolve_execution_options>`
to specify the available execution options as described in the previous
section.
If :meth:`resolve_delegation() <fiftyone.operators.operator.Operator.resolve_delegation>`
is not implemented or returns `None`, then the choice of execution mode is
deferred to the prior mechanisms described above.

.. _operator-reporting-progress:

Expand Down