Removing explicit n_jobs=1 for torch datasets
jwmueller authored Apr 13, 2023
2 parents a2cb6d0 + e77b0a0 commit f2100ea
Showing 3 changed files with 40 additions and 13 deletions.
9 changes: 8 additions & 1 deletion .gitignore
@@ -10,4 +10,11 @@ ENV/
env.bak/
venv.bak/

- .idea/
+ # Editors
+ .idea/
+
+ # Datasets
+ image_files*
+ cifar*
+
+ results/
2 changes: 1 addition & 1 deletion huggingface_dataset.ipynb
@@ -21,7 +21,7 @@
"outputs": [],
"source": [
"!pip install -U pip\n",
- "!pip install cleanvision[huggingface]"
+ "!pip install \"cleanvision[huggingface]\""
]
},
{
42 changes: 31 additions & 11 deletions torchvision_dataset.ipynb
@@ -2,13 +2,15 @@
"cells": [
{
"cell_type": "markdown",
+ "id": "7e00bf6d",
"metadata": {},
"source": [
"# Run CleanVision on Torchvision dataset"
]
},
{
"cell_type": "markdown",
+ "id": "e7e88122",
"metadata": {},
"source": [
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/cleanlab/cleanvision-examples/blob/main/torchvision_dataset.ipynb) "
@@ -17,15 +19,17 @@
{
"cell_type": "code",
"execution_count": null,
+ "id": "4f0bd4fb",
"metadata": {},
"outputs": [],
"source": [
"!pip install -U pip\n",
- "!pip install cleanvision[pytorch]"
+ "!pip install \"cleanvision[pytorch]\""
]
},
{
"cell_type": "markdown",
+ "id": "903e8838",
"metadata": {
"tags": []
},
@@ -36,6 +40,7 @@
{
"cell_type": "code",
"execution_count": 1,
+ "id": "69fb0960",
"metadata": {},
"outputs": [],
"source": [
@@ -46,6 +51,7 @@
},
{
"cell_type": "markdown",
+ "id": "19a97524",
"metadata": {},
"source": [
"### Download dataset and concatenate all splits\n",
@@ -60,6 +66,7 @@
{
"cell_type": "code",
"execution_count": 2,
+ "id": "54153464",
"metadata": {},
"outputs": [
{
@@ -78,6 +85,7 @@
},
{
"cell_type": "markdown",
+ "id": "8ba2126f",
"metadata": {
"tags": []
},
@@ -88,6 +96,7 @@
{
"cell_type": "code",
"execution_count": 3,
+ "id": "5f6c1fe5",
"metadata": {},
"outputs": [],
"source": [
@@ -96,6 +105,7 @@
},
{
"cell_type": "markdown",
+ "id": "23c65375",
"metadata": {
"tags": []
},
@@ -106,6 +116,7 @@
{
"cell_type": "code",
"execution_count": 4,
+ "id": "1167dbb3",
"metadata": {},
"outputs": [
{
@@ -125,6 +136,7 @@
},
{
"cell_type": "markdown",
+ "id": "62765c2b",
"metadata": {},
"source": [
"Let's look at the first image in this dataset"
@@ -133,6 +145,7 @@
{
"cell_type": "code",
"execution_count": 5,
+ "id": "54ce892d",
"metadata": {},
"outputs": [
{
@@ -153,6 +166,7 @@
},
{
"cell_type": "markdown",
+ "id": "a20fecd0",
"metadata": {},
"source": [
"### Run CleanVision"
@@ -161,22 +175,17 @@
{
"cell_type": "code",
"execution_count": 6,
+ "id": "67bf829d",
"metadata": {},
"outputs": [],
"source": [
"imagelab = Imagelab(torchvision_dataset=dataset)"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We set `n_jobs = 1` as CleanVision parallelization may interact with torch dataloaders in unexpected ways."
- ]
- },
{
"cell_type": "code",
"execution_count": 7,
+ "id": "abb83355",
"metadata": {},
"outputs": [
{
@@ -190,8 +199,8 @@
"name": "stderr",
"output_type": "stream",
"text": [
- "100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 60000/60000 [00:33<00:00, 1813.45it/s]\n",
- "100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 60000/60000 [00:15<00:00, 3862.80it/s]\n"
+ "100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 60000/60000 [00:34<00:00, 1740.02it/s]\n",
+ "100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 60000/60000 [00:15<00:00, 3835.02it/s]\n"
]
},
{
@@ -203,11 +212,12 @@
}
],
"source": [
- "imagelab.find_issues(n_jobs=1)"
+ "imagelab.find_issues()"
]
},
{
"cell_type": "markdown",
+ "id": "836fa63a",
"metadata": {
"tags": []
},
@@ -218,6 +228,7 @@
{
"cell_type": "code",
"execution_count": 8,
+ "id": "42ca1712",
"metadata": {},
"outputs": [
{
@@ -358,6 +369,7 @@
},
{
"cell_type": "markdown",
+ "id": "09aa8e41",
"metadata": {},
"source": [
"View more information about each image, such as what types of issues it exhibits and its quality score with respect to each type of issue."
@@ -366,6 +378,7 @@
{
"cell_type": "code",
"execution_count": 9,
+ "id": "ea24d412",
"metadata": {},
"outputs": [
{
@@ -678,6 +691,7 @@
},
{
"cell_type": "markdown",
+ "id": "92e394a2",
"metadata": {},
"source": [
"Get indices of all **dark** images in the dataset sorted by their dark score."
@@ -686,6 +700,7 @@
{
"cell_type": "code",
"execution_count": 10,
+ "id": "55b42b69",
"metadata": {},
"outputs": [],
"source": [
@@ -694,6 +709,7 @@
},
{
"cell_type": "markdown",
+ "id": "3d6d717a",
"metadata": {},
"source": [
"View the 5th darkest image in the dataset"
@@ -702,6 +718,7 @@
{
"cell_type": "code",
"execution_count": 11,
+ "id": "9f7028a9",
"metadata": {},
"outputs": [
{
@@ -722,6 +739,7 @@
},
{
"cell_type": "markdown",
+ "id": "9a24028d",
"metadata": {},
"source": [
"View global information about each issue, such as how many images in the dataset suffer from this issue."
@@ -730,6 +748,7 @@
{
"cell_type": "code",
"execution_count": 12,
+ "id": "9e84f5f8",
"metadata": {},
"outputs": [
{
@@ -825,6 +844,7 @@
},
{
"cell_type": "markdown",
+ "id": "0baaf964",
"metadata": {},
"source": [
"**For more detailed guide on how to use CleanVision, check the [tutorial notebook](https://github.com/cleanlab/cleanvision/blob/main/examples/tutorial.ipynb).**"
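
Note on the quoting change in the two install cells: the commit wraps `cleanvision[pytorch]` and `cleanvision[huggingface]` in double quotes. A likely reason (an assumption, since the commit message does not say) is that an unquoted `[...]` is a glob character class in most shells. zsh aborts with "no matches found" when nothing matches, while bash by default passes an unmatched pattern through unchanged, so the unquoted form only works by accident. The sketch below uses Python's standard `fnmatch` module, which approximates shell-style wildcards rather than running a real shell; `glob_matches` is a hypothetical helper written for this illustration.

```python
import fnmatch

# Unquoted, a shell reads [pytorch] as a character class, so the argument
# becomes a pattern: "cleanvision" followed by exactly ONE of p, y, t, o, r, c, h.
PATTERN = "cleanvision[pytorch]"


def glob_matches(filename: str) -> bool:
    """Hypothetical helper: would the unquoted argument match this file name?

    fnmatch only approximates shell globbing; it is used here for illustration.
    """
    return fnmatch.fnmatchcase(filename, PATTERN)


# A file with this name in the working directory would be matched by the glob,
# so the shell could hand pip "cleanvisionp" instead of the intended requirement.
print(glob_matches("cleanvisionp"))          # True

# The literal requirement string itself is not matched by the pattern.
print(glob_matches("cleanvision[pytorch]"))  # False
```

Quoting the argument, as the commit does, keeps the brackets literal, so pip receives the requirement with its extra regardless of which shell runs the cell.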