Showing 15 changed files with 548 additions and 1 deletion.
114 changes: 114 additions & 0 deletions
configs/vision/pathology/offline/classification/breakhis.yaml
@@ -0,0 +1,114 @@
---
trainer:
  class_path: eva.Trainer
  init_args:
    n_runs: &N_RUNS ${oc.env:N_RUNS, 5}
    default_root_dir: &OUTPUT_ROOT ${oc.env:OUTPUT_ROOT, logs/${oc.env:MODEL_NAME, dino_vits16}/offline/breakhis}
    max_steps: &MAX_STEPS ${oc.env:MAX_STEPS, 12500}
    checkpoint_type: ${oc.env:CHECKPOINT_TYPE, best}
    callbacks:
      - class_path: eva.callbacks.ConfigurationLogger
      - class_path: lightning.pytorch.callbacks.TQDMProgressBar
        init_args:
          refresh_rate: ${oc.env:TQDM_REFRESH_RATE, 1}
      - class_path: lightning.pytorch.callbacks.LearningRateMonitor
        init_args:
          logging_interval: epoch
      - class_path: lightning.pytorch.callbacks.ModelCheckpoint
        init_args:
          filename: best
          save_last: true
          save_top_k: 1
          monitor: &MONITOR_METRIC ${oc.env:MONITOR_METRIC, val/MulticlassAccuracy}
          mode: &MONITOR_METRIC_MODE ${oc.env:MONITOR_METRIC_MODE, max}
      - class_path: lightning.pytorch.callbacks.EarlyStopping
        init_args:
          min_delta: 0
          patience: ${oc.env:PATIENCE, 105}
          monitor: *MONITOR_METRIC
          mode: *MONITOR_METRIC_MODE
      - class_path: eva.callbacks.ClassificationEmbeddingsWriter
        init_args:
          output_dir: &DATASET_EMBEDDINGS_ROOT ${oc.env:EMBEDDINGS_ROOT, ./data/embeddings}/${oc.env:MODEL_NAME, dino_vits16}/breakhis
          dataloader_idx_map:
            0: train
            1: val
          backbone:
            class_path: eva.vision.models.ModelFromRegistry
            init_args:
              model_name: ${oc.env:MODEL_NAME, universal/vit_small_patch16_224_dino}
              model_extra_kwargs: ${oc.env:MODEL_EXTRA_KWARGS, null}
          overwrite: false
    logger:
      - class_path: lightning.pytorch.loggers.TensorBoardLogger
        init_args:
          save_dir: *OUTPUT_ROOT
          name: ""
model:
  class_path: eva.HeadModule
  init_args:
    head:
      class_path: torch.nn.Linear
      init_args:
        in_features: ${oc.env:IN_FEATURES, 384}
        out_features: &NUM_CLASSES 8
    criterion: torch.nn.CrossEntropyLoss
    optimizer:
      class_path: torch.optim.AdamW
      init_args:
        lr: ${oc.env:LR_VALUE, 0.0003}
    lr_scheduler:
      class_path: torch.optim.lr_scheduler.CosineAnnealingLR
      init_args:
        T_max: *MAX_STEPS
        eta_min: 0.0
    metrics:
      common:
        - class_path: eva.metrics.AverageLoss
        - class_path: eva.metrics.MulticlassClassificationMetrics
          init_args:
            num_classes: *NUM_CLASSES
data:
  class_path: eva.DataModule
  init_args:
    datasets:
      train:
        class_path: eva.datasets.EmbeddingsClassificationDataset
        init_args: &DATASET_ARGS
          root: *DATASET_EMBEDDINGS_ROOT
          manifest_file: manifest.csv
          split: train
      val:
        class_path: eva.datasets.EmbeddingsClassificationDataset
        init_args:
          <<: *DATASET_ARGS
          split: val
      predict:
        - class_path: eva.vision.datasets.BreaKHis
          init_args: &PREDICT_DATASET_ARGS
            root: ${oc.env:DATA_ROOT, ./data/breakhis}
            split: train
            download: ${oc.env:DOWNLOAD_DATA, false}
            # Set `download: true` to download the dataset from https://zenodo.org/records/1214456
            # The BreaKHis dataset is distributed under the following license: "CC BY 4.0"
            # (see: https://creativecommons.org/licenses/by/4.0/)
            transforms:
              class_path: eva.vision.data.transforms.common.ResizeAndCrop
              init_args:
                mean: ${oc.env:NORMALIZE_MEAN, [0.485, 0.456, 0.406]}
                std: ${oc.env:NORMALIZE_STD, [0.229, 0.224, 0.225]}
        - class_path: eva.vision.datasets.BreaKHis
          init_args:
            <<: *PREDICT_DATASET_ARGS
            split: val
    dataloaders:
      train:
        batch_size: &BATCH_SIZE ${oc.env:BATCH_SIZE, 256}
        num_workers: &N_DATA_WORKERS ${oc.env:N_DATA_WORKERS, 4}
        shuffle: true
      val:
        batch_size: *BATCH_SIZE
        num_workers: *N_DATA_WORKERS
      predict:
        batch_size: &PREDICT_BATCH_SIZE ${oc.env:PREDICT_BATCH_SIZE, 64}
        num_workers: *N_DATA_WORKERS
94 changes: 94 additions & 0 deletions
configs/vision/pathology/online/classification/breakhis.yaml
@@ -0,0 +1,94 @@
---
trainer:
  class_path: eva.Trainer
  init_args:
    n_runs: &N_RUNS ${oc.env:N_RUNS, 5}
    default_root_dir: &OUTPUT_ROOT ${oc.env:OUTPUT_ROOT, logs/${oc.env:MODEL_NAME, dino_vits16}/online/breakhis}
    max_steps: &MAX_STEPS ${oc.env:MAX_STEPS, 12500}
    checkpoint_type: ${oc.env:CHECKPOINT_TYPE, best}
    callbacks:
      - class_path: eva.callbacks.ConfigurationLogger
      - class_path: lightning.pytorch.callbacks.TQDMProgressBar
        init_args:
          refresh_rate: ${oc.env:TQDM_REFRESH_RATE, 1}
      - class_path: lightning.pytorch.callbacks.LearningRateMonitor
        init_args:
          logging_interval: epoch
      - class_path: lightning.pytorch.callbacks.ModelCheckpoint
        init_args:
          filename: best
          save_last: true
          save_top_k: 1
          monitor: &MONITOR_METRIC ${oc.env:MONITOR_METRIC, val/MulticlassAccuracy}
          mode: &MONITOR_METRIC_MODE ${oc.env:MONITOR_METRIC_MODE, max}
      - class_path: lightning.pytorch.callbacks.EarlyStopping
        init_args:
          min_delta: 0
          patience: ${oc.env:PATIENCE, 105}
          monitor: *MONITOR_METRIC
          mode: *MONITOR_METRIC_MODE
    logger:
      - class_path: lightning.pytorch.loggers.TensorBoardLogger
        init_args:
          save_dir: *OUTPUT_ROOT
          name: ""
model:
  class_path: eva.HeadModule
  init_args:
    backbone:
      class_path: eva.vision.models.ModelFromRegistry
      init_args:
        model_name: ${oc.env:MODEL_NAME, universal/vit_small_patch16_224_dino}
        model_extra_kwargs: ${oc.env:MODEL_EXTRA_KWARGS, null}
    head:
      class_path: torch.nn.Linear
      init_args:
        in_features: ${oc.env:IN_FEATURES, 384}
        out_features: &NUM_CLASSES 8
    criterion: torch.nn.CrossEntropyLoss
    optimizer:
      class_path: torch.optim.AdamW
      init_args:
        lr: ${oc.env:LR_VALUE, 0.0003}
    lr_scheduler:
      class_path: torch.optim.lr_scheduler.CosineAnnealingLR
      init_args:
        T_max: *MAX_STEPS
        eta_min: 0.0
    metrics:
      common:
        - class_path: eva.metrics.AverageLoss
        - class_path: eva.metrics.MulticlassClassificationMetrics
          init_args:
            num_classes: *NUM_CLASSES
data:
  class_path: eva.DataModule
  init_args:
    datasets:
      train:
        class_path: eva.vision.datasets.BreaKHis
        init_args: &DATASET_ARGS
          root: ${oc.env:DATA_ROOT, ./data/breakhis}
          split: train
          download: ${oc.env:DOWNLOAD_DATA, false}
          # Set `download: true` to download the dataset from https://zenodo.org/records/1214456
          # The BreaKHis dataset is distributed under the following license: "CC BY 4.0"
          # (see: https://creativecommons.org/licenses/by/4.0/)
          transforms:
            class_path: eva.vision.data.transforms.common.ResizeAndCrop
            init_args:
              mean: ${oc.env:NORMALIZE_MEAN, [0.485, 0.456, 0.406]}
              std: ${oc.env:NORMALIZE_STD, [0.229, 0.224, 0.225]}
      val:
        class_path: eva.vision.datasets.BreaKHis
        init_args:
          <<: *DATASET_ARGS
          split: val
    dataloaders:
      train:
        batch_size: &BATCH_SIZE ${oc.env:BATCH_SIZE, 256}
        num_workers: &N_DATA_WORKERS ${oc.env:N_DATA_WORKERS, 4}
        shuffle: true
      val:
        batch_size: *BATCH_SIZE
        num_workers: *N_DATA_WORKERS
@@ -0,0 +1,59 @@
# BreakHis

The Breast Cancer Histopathological Image Classification (BreakHis) dataset is composed of 9,109 microscopic images of breast tumor tissue collected from 82 patients using different magnifying factors (40X, 100X, 200X, and 400X). For this benchmark we use only the 40X samples, which yields a subset of 1,995 images. This database has been built in collaboration with the P&D Laboratory, Pathological Anatomy and Cytopathology, Parana, Brazil.

The dataset is divided into two main groups: benign tumors and malignant tumors. It currently contains four histologically distinct types of benign breast tumors: adenosis (A), fibroadenoma (F), phyllodes tumor (PT), and tubular adenoma (TA); and four malignant tumors (breast cancer): ductal carcinoma (DC), lobular carcinoma (LC), mucinous carcinoma (MC), and papillary carcinoma (PC).

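The eight tumor types above can be written down as a small label map. This is only a sketch: the abbreviations and names come from the description above, while the integer indices and the `is_malignant` helper are hypothetical and may not match the ordering used by the `BreaKHis` dataset class.

```python
# Abbreviation -> tumor type, grouped as in the dataset description.
BENIGN = {
    "A": "adenosis",
    "F": "fibroadenoma",
    "PT": "phyllodes tumor",
    "TA": "tubular adenoma",
}
MALIGNANT = {
    "DC": "ductal carcinoma",
    "LC": "lobular carcinoma",
    "MC": "mucinous carcinoma",
    "PC": "papillary carcinoma",
}

# Assumption: indices are assigned alphabetically by abbreviation.
# The real dataset class may use a different ordering.
CLASSES = sorted({**BENIGN, **MALIGNANT})
CLASS_TO_INDEX = {abbr: idx for idx, abbr in enumerate(CLASSES)}


def is_malignant(abbr: str) -> bool:
    """Return True if the abbreviation denotes a malignant tumor type."""
    return abbr in MALIGNANT
```

The map has eight entries, which matches `out_features: &NUM_CLASSES 8` in the classification configs.
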
## Raw data

### Key stats

| | |
|--------------------------------|-----------------------------|
| **Modality** | Vision (WSI patches) |
| **Task** | Multiclass classification (8 classes) |
| **Cancer type** | Breast |
| **Data size** | 4 GB |
| **Image dimension** | 700 x 460 |
| **Magnification (μm/px)** | 40x (0.25) |
| **File format** | `png` |
| **Number of images** | 1995 |


### Splits

The data source provides train/validation splits:

| Splits | Train | Validation |
|----------|---------------|--------------|
| #Samples | 1393 (70%) | 602 (30%) |

A test split is not provided, because dividing the dataset further would leave too few samples per class for robust evaluation. __eva__ therefore reports evaluation results for BreakHis on the validation split.

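As a quick arithmetic check of the split table, the sample counts can be turned back into shares. The counts are taken from the table; the variable names are only illustrative.

```python
# Sample counts from the splits table.
TRAIN, VAL = 1393, 602

total = TRAIN + VAL
train_share = TRAIN / total
val_share = VAL / total

# The table rounds these shares to 70% and 30%.
print(f"{total} images: train {train_share:.1%}, val {val_share:.1%}")
```

The total matches the 1995 images listed under key stats.
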

### Organization

The BreakHis data is organized as follows:

```
BreaKHis_v1
├── histology_slides
│   ├── breast
│   │   ├── benign
│   │   │   ├── SOB
│   │   │   │   ├── adenosis
│   │   │   │   ├── fibroadenoma
│   │   │   │   └── ...
```

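To sanity-check a local copy against this layout, one could count the image files per folder. The helper below is a hypothetical sketch using only the standard library; it is not part of __eva__. It groups files by the name of their immediate parent folder, and since the tree above elides the deeper levels, the grouping key may need adjusting for the real directory depth.

```python
from collections import Counter
from pathlib import Path


def count_images_by_folder(root: Path, suffix: str = ".png") -> Counter:
    """Count files ending in `suffix` under `root`, keyed by parent folder name."""
    return Counter(
        path.parent.name
        for path in Path(root).rglob(f"*{suffix}")
        if path.is_file()
    )
```

For example, `count_images_by_folder(Path("./data/breakhis"))` would scan the default `DATA_ROOT` used in the configs.
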

## Download and preprocessing
The `BreaKHis` dataset class supports downloading the data at runtime by setting the environment variable `DOWNLOAD_DATA=true`.

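In the configs, the flag is read through `${oc.env:DOWNLOAD_DATA, false}`, so the download is off unless the variable is set. The snippet below imitates that fallback with the standard library only. It is a sketch, not the resolver itself: the `env_flag` helper is hypothetical, and the set of strings it accepts as true is an assumption that may differ from how the config loader parses the value.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean flag from the environment, falling back to `default`."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Assumption: these spellings count as "true"; everything else is false.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


download = env_flag("DOWNLOAD_DATA")  # False unless DOWNLOAD_DATA is set
```
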
## Relevant links

* [Official Source](https://web.inf.ufpr.br/vri/databases/breast-cancer-histopathological-database-breakhis/)

## License

[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)