Can't run on Mac M1 - Cannot convert a MPS Tensor to float64 #1533
Comments
You could try changing the …
Hello blaz-r, and first of all thanks for the tip! That was exactly the hint I needed ;) While it made the model run for a bit, I got a follow-up crash with a reshape problem in …
Regarding MPS, it still might be interesting to investigate further, as the processing times are suboptimal on CPU.
I had the same visualization error when I tried to log image results. I ran the same code in a Linux environment and everything works just fine. If you change the config file as follows, you should be able to log some image results:

visualization:
  show_images: False # show images on the screen
  save_images: True # save images to the file system
  log_images: False # log images to the available loggers (if any)
  image_save_path: null # path to which images will be saved
  mode: simple

I am using macOS 12.3.1 with a miniconda env.
When using the MPS accelerator, we might have to modify the default datamodule to typecast all the data from the default torch.int64 to torch.int32. Not sure if it's worth it.
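As a rough illustration of what that could look like, here is a minimal sketch using Lightning's on_after_batch_transfer hook. The class name and the decision to cast every tensor key are assumptions for the sake of the example, not anomalib's actual implementation:

import torch
from lightning.pytorch import LightningDataModule


class MPSSafeDataModule(LightningDataModule):
    """Hypothetical datamodule that downcasts dtypes MPS may not support."""

    def on_after_batch_transfer(self, batch: dict, dataloader_idx: int) -> dict:
        # Downcast 64-bit tensors after the batch has been moved to the device.
        for key, value in batch.items():
            if isinstance(value, torch.Tensor):
                if value.dtype == torch.float64:
                    batch[key] = value.to(torch.float32)
                elif value.dtype == torch.int64:
                    batch[key] = value.to(torch.int32)
        return batch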
Just got my new MacBook and can confirm that this is indeed a problem :D
Hi @jahad9819jjj, just to clarify, did you get the visualisation issue or the tensor-to-float64 issue? We have fixed the tensor-to-float64 issue, but the visualisation issue is still there.
@samet-akcay float64 error:

dyld[29303]: Assertion failed: (this->magic == kMagic), function matchesPath, file Loader.cpp, line 154.
Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x12cb30ee0>
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1479, in __del__
self._shutdown_workers()
File "/opt/homebrew/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1443, in _shutdown_workers
w.join(timeout=_utils.MP_STATUS_CHECK_INTERVAL)
File "/opt/homebrew/Cellar/[email protected]/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/process.py", line 149, in join
res = self._popen.wait(timeout)
File "/opt/homebrew/Cellar/[email protected]/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/popen_fork.py", line 40, in wait
if not wait([self.sentinel], timeout):
File "/opt/homebrew/Cellar/[email protected]/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/connection.py", line 931, in wait
ready = selector.select(timeout)
File "/opt/homebrew/Cellar/[email protected]/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/selectors.py", line 416, in select
fd_event_list = self._selector.poll(timeout)
File "/opt/homebrew/lib/python3.10/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 29303) is killed by signal: Abort trap: 6.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /opt/homebrew/bin/anomalib:8 in <module> │
│ │
│ 5 from anomalib.cli.cli import main │
│ 6 if __name__ == '__main__': │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(main()) │
│ 9 │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/cli/cli.py:376 in main │
│ │
│ 373 def main() -> None: │
│ 374 │ """Trainer via Anomalib CLI.""" │
│ 375 │ configure_logger() │
│ ❱ 376 │ AnomalibCLI() │
│ 377 │
│ 378 │
│ 379 if __name__ == "__main__": │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/cli/cli.py:64 in __init__ │
│ │
│ 61 │ │ run: bool = True, │
│ 62 │ │ auto_configure_optimizers: bool = True, │
│ 63 │ ) -> None: │
│ ❱ 64 │ │ super().__init__( │
│ 65 │ │ │ AnomalyModule, │
│ 66 │ │ │ AnomalibDataModule, │
│ 67 │ │ │ save_config_callback, │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/cli.py:386 in __init__ │
│ │
│ 383 │ │ self.instantiate_classes() │
│ 384 │ │ │
│ 385 │ │ if self.subcommand is not None: │
│ ❱ 386 │ │ │ self._run_subcommand(self.subcommand) │
│ 387 │ │
│ 388 │ def _setup_parser_kwargs(self, parser_kwargs: Dict[str, Any]) -> Tuple[Dict[str, Any │
│ 389 │ │ subcommand_names = self.subcommands().keys() │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/cli/cli.py:294 in _run_subcommand │
│ │
│ 291 │ │ if self.config["subcommand"] in (*self.subcommands(), "train", "export", "predic │
│ 292 │ │ │ fn = getattr(self.engine, subcommand) │
│ 293 │ │ │ fn_kwargs = self._prepare_subcommand_kwargs(subcommand) │
│ ❱ 294 │ │ │ fn(**fn_kwargs) │
│ 295 │ │ else: │
│ 296 │ │ │ self.config_init = self.parser.instantiate_classes(self.config) │
│ 297 │ │ │ getattr(self, f"{subcommand}")() │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/engine/engine.py:478 in train │
│ │
│ 475 │ │ """ │
│ 476 │ │ self._setup_trainer(model) │
│ 477 │ │ self._setup_dataset_task(train_dataloaders, val_dataloaders, test_dataloaders, d │
│ ❱ 478 │ │ self.trainer.fit(model, train_dataloaders, val_dataloaders, datamodule, ckpt_pat │
│ 479 │ │ self.trainer.test(model, test_dataloaders, ckpt_path=ckpt_path, datamodule=datam │
│ 480 │ │
│ 481 │ def export( │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py:544 in fit │
│ │
│ 541 │ │ self.state.fn = TrainerFn.FITTING │
│ 542 │ │ self.state.status = TrainerStatus.RUNNING │
│ 543 │ │ self.training = True │
│ ❱ 544 │ │ call._call_and_handle_interrupt( │
│ 545 │ │ │ self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, │
│ 546 │ │ ) │
│ 547 │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py:44 in │
│ _call_and_handle_interrupt │
│ │
│ 41 │ try: │
│ 42 │ │ if trainer.strategy.launcher is not None: │
│ 43 │ │ │ return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, │
│ ❱ 44 │ │ return trainer_fn(*args, **kwargs) │
│ 45 │ │
│ 46 │ except _TunerExitException: │
│ 47 │ │ _call_teardown_hook(trainer) │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py:580 in _fit_impl │
│ │
│ 577 │ │ │ model_provided=True, │
│ 578 │ │ │ model_connected=self.lightning_module is not None, │
│ 579 │ │ ) │
│ ❱ 580 │ │ self._run(model, ckpt_path=ckpt_path) │
│ 581 │ │ │
│ 582 │ │ assert self.state.stopped │
│ 583 │ │ self.training = False │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py:989 in _run │
│ │
│ 986 │ │ # ---------------------------- │
│ 987 │ │ # RUN THE TRAINER │
│ 988 │ │ # ---------------------------- │
│ ❱ 989 │ │ results = self._run_stage() │
│ 990 │ │ │
│ 991 │ │ # ---------------------------- │
│ 992 │ │ # POST-Training CLEAN UP │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py:1035 in │
│ _run_stage │
│ │
│ 1032 │ │ │ with isolate_rng(): │
│ 1033 │ │ │ │ self._run_sanity_check() │
│ 1034 │ │ │ with torch.autograd.set_detect_anomaly(self._detect_anomaly): │
│ ❱ 1035 │ │ │ │ self.fit_loop.run() │
│ 1036 │ │ │ return None │
│ 1037 │ │ raise RuntimeError(f"Unexpected state {self.state}") │
│ 1038 │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py:202 in run │
│ │
│ 199 │ │ while not self.done: │
│ 200 │ │ │ try: │
│ 201 │ │ │ │ self.on_advance_start() │
│ ❱ 202 │ │ │ │ self.advance() │
│ 203 │ │ │ │ self.on_advance_end() │
│ 204 │ │ │ │ self._restarting = False │
│ 205 │ │ │ except StopIteration: │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py:359 in advance │
│ │
│ 356 │ │ │ ) │
│ 357 │ │ with self.trainer.profiler.profile("run_training_epoch"): │
│ 358 │ │ │ assert self._data_fetcher is not None │
│ ❱ 359 │ │ │ self.epoch_loop.run(self._data_fetcher) │
│ 360 │ │
│ 361 │ def on_advance_end(self) -> None: │
│ 362 │ │ trainer = self.trainer │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/loops/training_epoch_loop.py:137 in │
│ run │
│ │
│ 134 │ │ while not self.done: │
│ 135 │ │ │ try: │
│ 136 │ │ │ │ self.advance(data_fetcher) │
│ ❱ 137 │ │ │ │ self.on_advance_end(data_fetcher) │
│ 138 │ │ │ │ self._restarting = False │
│ 139 │ │ │ except StopIteration: │
│ 140 │ │ │ │ break │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/loops/training_epoch_loop.py:285 in │
│ on_advance_end │
│ │
│ 282 │ │ │ │ # clear gradients to not leave any unused memory during validation │
│ 283 │ │ │ │ call._call_lightning_module_hook(self.trainer, "on_validation_model_zero │
│ 284 │ │ │ │
│ ❱ 285 │ │ │ self.val_loop.run() │
│ 286 │ │ │ self.trainer.training = True │
│ 287 │ │ │ self.trainer._logger_connector._first_loop_iter = first_loop_iter │
│ 288 │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/loops/utilities.py:182 in │
│ _decorator │
│ │
│ 179 │ │ else: │
│ 180 │ │ │ context_manager = torch.no_grad │
│ 181 │ │ with context_manager(): │
│ ❱ 182 │ │ │ return loop_run(self, *args, **kwargs) │
│ 183 │ │
│ 184 │ return _decorator │
│ 185 │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/loops/evaluation_loop.py:113 in run │
│ │
│ 110 │ │ if self.skip: │
│ 111 │ │ │ return [] │
│ 112 │ │ self.reset() │
│ ❱ 113 │ │ self.on_run_start() │
│ 114 │ │ data_fetcher = self._data_fetcher │
│ 115 │ │ assert data_fetcher is not None │
│ 116 │ │ previous_dataloader_idx = 0 │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/loops/evaluation_loop.py:243 in │
│ on_run_start │
│ │
│ 240 │ │ hooks.""" │
│ 241 │ │ self._verify_dataloader_idx_requirement() │
│ 242 │ │ self._on_evaluation_model_eval() │
│ ❱ 243 │ │ self._on_evaluation_start() │
│ 244 │ │ self._on_evaluation_epoch_start() │
│ 245 │ │
│ 246 │ def on_run_end(self) -> List[_OUT_DICT]: │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/loops/evaluation_loop.py:289 in │
│ _on_evaluation_start │
│ │
│ 286 │ │ │
│ 287 │ │ hook_name = "on_test_start" if trainer.testing else "on_validation_start" │
│ 288 │ │ call._call_callback_hooks(trainer, hook_name, *args, **kwargs) │
│ ❱ 289 │ │ call._call_lightning_module_hook(trainer, hook_name, *args, **kwargs) │
│ 290 │ │ call._call_strategy_hook(trainer, hook_name, *args, **kwargs) │
│ 291 │ │
│ 292 │ def _on_evaluation_model_eval(self) -> None: │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py:157 in │
│ _call_lightning_module_hook │
│ │
│ 154 │ pl_module._current_fx_name = hook_name │
│ 155 │ │
│ 156 │ with trainer.profiler.profile(f"[LightningModule]{pl_module.__class__.__name__}.{hoo │
│ ❱ 157 │ │ output = fn(*args, **kwargs) │
│ 158 │ │
│ 159 │ # restore current_fx when nested context │
│ 160 │ pl_module._current_fx_name = prev_fx_name │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/models/components/base/memory_bank_module.py:37 in │
│ on_validation_start │
│ │
│ 34 │ def on_validation_start(self) -> None: │
│ 35 │ │ """Ensure that the model is fitted before validation starts.""" │
│ 36 │ │ if not self._is_fitted: │
│ ❱ 37 │ │ │ self.fit() │
│ 38 │ │ │ self._is_fitted = torch.tensor([True]) │
│ 39 │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/models/image/patchcore/lightning_model.py:95 in fit │
│ │
│ 92 │ │ embeddings = torch.vstack(self.embeddings) │
│ 93 │ │ │
│ 94 │ │ logger.info("Applying core-set subsampling to get the embedding.") │
│ ❱ 95 │ │ self.model.subsample_embedding(embeddings, self.coreset_sampling_ratio) │
│ 96 │ │
│ 97 │ def validation_step(self, batch: dict[str, str | torch.Tensor], *args, **kwargs) -> │
│ 98 │ │ """Get batch of anomaly maps from input image batch. │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/models/image/patchcore/torch_model.py:153 in │
│ subsample_embedding │
│ │
│ 150 │ │ """ │
│ 151 │ │ # Coreset Subsampling │
│ 152 │ │ sampler = KCenterGreedy(embedding=embedding, sampling_ratio=sampling_ratio) │
│ ❱ 153 │ │ coreset = sampler.sample_coreset() │
│ 154 │ │ self.memory_bank = coreset │
│ 155 │ │
│ 156 │ @staticmethod │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/models/components/sampling/k_center_greedy.py:127 in │
│ sample_coreset │
│ │
│ 124 │ │ │ >>> coreset.shape │
│ 125 │ │ │ torch.Size([219, 1536]) │
│ 126 │ │ """ │
│ ❱ 127 │ │ idxs = self.select_coreset_idxs(selected_idxs) │
│ 128 │ │ return self.embedding[idxs] │
│ 129 │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/models/components/sampling/k_center_greedy.py:90 in │
│ select_coreset_idxs │
│ │
│ 87 │ │ │ selected_idxs = [] │
│ 88 │ │ │
│ 89 │ │ if self.embedding.ndim == 2: │
│ ❱ 90 │ │ │ self.model.fit(self.embedding) │
│ 91 │ │ │ self.features = self.model.transform(self.embedding) │
│ 92 │ │ │ self.reset_distances() │
│ 93 │ │ else: │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/models/components/dimensionality_reduction/random_pr │
│ ojection.py:136 in fit │
│ │
│ 133 │ │ # (Could not run 'aten::empty_strided' with arguments from the 'SparseCsrCUDA' b │
│ 134 │ │ # hence sparse matrix is stored as a dense matrix on the device │
│ 135 │ │ # self.sparse_random_matrix = self._sparse_random_matrix(n_features=n_features). │
│ ❱ 136 │ │ self.sparse_random_matrix = self._sparse_random_matrix(n_features=n_features).to │
│ 137 │ │ │
│ 138 │ │ return self │
│ 139 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
Epoch 0: 100%|██████████| 7/7 [00:13<00:00, 0.53it/s, v_num=5]

I modified this like below, but I am not sure whether this is appropriate or not :(

if self.device.type == "mps":
    self.sparse_random_matrix = self._sparse_random_matrix(n_features=n_features).to(device, dtype=torch.float32)

Then I executed it and got the errors below:
visualization error:

ApplePersistenceIgnoreState: Existing state will not be touched. New state will be written to /var/folders/m1/qnfmlldx0ql5wt_hc_5054fw0000gn/T/org.python.python.savedState
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /opt/homebrew/bin/anomalib:8 in <module> │
│ │
│ 5 from anomalib.cli.cli import main │
│ 6 if __name__ == '__main__': │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(main()) │
│ 9 │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/cli/cli.py:376 in main │
│ │
│ 373 def main() -> None: │
│ 374 │ """Trainer via Anomalib CLI.""" │
│ 375 │ configure_logger() │
│ ❱ 376 │ AnomalibCLI() │
│ 377 │
│ 378 │
│ 379 if __name__ == "__main__": │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/cli/cli.py:64 in __init__ │
│ │
│ 61 │ │ run: bool = True, │
│ 62 │ │ auto_configure_optimizers: bool = True, │
│ 63 │ ) -> None: │
│ ❱ 64 │ │ super().__init__( │
│ 65 │ │ │ AnomalyModule, │
│ 66 │ │ │ AnomalibDataModule, │
│ 67 │ │ │ save_config_callback, │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/cli.py:386 in __init__ │
│ │
│ 383 │ │ self.instantiate_classes() │
│ 384 │ │ │
│ 385 │ │ if self.subcommand is not None: │
│ ❱ 386 │ │ │ self._run_subcommand(self.subcommand) │
│ 387 │ │
│ 388 │ def _setup_parser_kwargs(self, parser_kwargs: Dict[str, Any]) -> Tuple[Dict[str, Any │
│ 389 │ │ subcommand_names = self.subcommands().keys() │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/cli/cli.py:294 in _run_subcommand │
│ │
│ 291 │ │ if self.config["subcommand"] in (*self.subcommands(), "train", "export", "predic │
│ 292 │ │ │ fn = getattr(self.engine, subcommand) │
│ 293 │ │ │ fn_kwargs = self._prepare_subcommand_kwargs(subcommand) │
│ ❱ 294 │ │ │ fn(**fn_kwargs) │
│ 295 │ │ else: │
│ 296 │ │ │ self.config_init = self.parser.instantiate_classes(self.config) │
│ 297 │ │ │ getattr(self, f"{subcommand}")() │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/engine/engine.py:435 in predict │
│ │
│ 432 │ │ │
│ 433 │ │ self._setup_dataset_task(dataloaders, datamodule) │
│ 434 │ │ │
│ ❱ 435 │ │ return self.trainer.predict(model, dataloaders, datamodule, return_predictions, │
│ 436 │ │
│ 437 │ def train( │
│ 438 │ │ self, │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py:864 in predict │
│ │
│ 861 │ │ self.state.fn = TrainerFn.PREDICTING │
│ 862 │ │ self.state.status = TrainerStatus.RUNNING │
│ 863 │ │ self.predicting = True │
│ ❱ 864 │ │ return call._call_and_handle_interrupt( │
│ 865 │ │ │ self, self._predict_impl, model, dataloaders, datamodule, return_predictions │
│ 866 │ │ ) │
│ 867 │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py:44 in │
│ _call_and_handle_interrupt │
│ │
│ 41 │ try: │
│ 42 │ │ if trainer.strategy.launcher is not None: │
│ 43 │ │ │ return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, │
│ ❱ 44 │ │ return trainer_fn(*args, **kwargs) │
│ 45 │ │
│ 46 │ except _TunerExitException: │
│ 47 │ │ _call_teardown_hook(trainer) │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py:903 in │
│ _predict_impl │
│ │
│ 900 │ │ ckpt_path = self._checkpoint_connector._select_ckpt_path( │
│ 901 │ │ │ self.state.fn, ckpt_path, model_provided=model_provided, model_connected=sel │
│ 902 │ │ ) │
│ ❱ 903 │ │ results = self._run(model, ckpt_path=ckpt_path) │
│ 904 │ │ │
│ 905 │ │ assert self.state.stopped │
│ 906 │ │ self.predicting = False │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py:989 in _run │
│ │
│ 986 │ │ # ---------------------------- │
│ 987 │ │ # RUN THE TRAINER │
│ 988 │ │ # ---------------------------- │
│ ❱ 989 │ │ results = self._run_stage() │
│ 990 │ │ │
│ 991 │ │ # ---------------------------- │
│ 992 │ │ # POST-Training CLEAN UP │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/trainer/trainer.py:1030 in │
│ _run_stage │
│ │
│ 1027 │ │ if self.evaluating: │
│ 1028 │ │ │ return self._evaluation_loop.run() │
│ 1029 │ │ if self.predicting: │
│ ❱ 1030 │ │ │ return self.predict_loop.run() │
│ 1031 │ │ if self.training: │
│ 1032 │ │ │ with isolate_rng(): │
│ 1033 │ │ │ │ self._run_sanity_check() │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/loops/utilities.py:182 in │
│ _decorator │
│ │
│ 179 │ │ else: │
│ 180 │ │ │ context_manager = torch.no_grad │
│ 181 │ │ with context_manager(): │
│ ❱ 182 │ │ │ return loop_run(self, *args, **kwargs) │
│ 183 │ │
│ 184 │ return _decorator │
│ 185 │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/loops/prediction_loop.py:122 in run │
│ │
│ 119 │ │ │ │ │ batch, batch_idx, dataloader_idx = next(data_fetcher) │
│ 120 │ │ │ │ self.batch_progress.is_last_batch = data_fetcher.done │
│ 121 │ │ │ │ # run step hooks │
│ ❱ 122 │ │ │ │ self._predict_step(batch, batch_idx, dataloader_idx, dataloader_iter) │
│ 123 │ │ │ except StopIteration: │
│ 124 │ │ │ │ # this needs to wrap the `*_step` call too (not just `next`) for `datalo │
│ 125 │ │ │ │ break │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/loops/prediction_loop.py:263 in │
│ _predict_step │
│ │
│ 260 │ │ │ dataloader_idx = data_fetcher._dataloader_idx │
│ 261 │ │ │ hook_kwargs = self._build_kwargs(batch, batch_idx, dataloader_idx if self.nu │
│ 262 │ │ │
│ ❱ 263 │ │ call._call_callback_hooks(trainer, "on_predict_batch_end", predictions, *hook_kw │
│ 264 │ │ call._call_lightning_module_hook(trainer, "on_predict_batch_end", predictions, * │
│ 265 │ │ │
│ 266 │ │ self.batch_progress.increment_completed() │
│ │
│ /opt/homebrew/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py:208 in │
│ _call_callback_hooks │
│ │
│ 205 │ │ fn = getattr(callback, hook_name) │
│ 206 │ │ if callable(fn): │
│ 207 │ │ │ with trainer.profiler.profile(f"[Callback]{callback.state_key}.{hook_name}") │
│ ❱ 208 │ │ │ │ fn(trainer, trainer.lightning_module, *args, **kwargs) │
│ 209 │ │
│ 210 │ if pl_module: │
│ 211 │ │ # restore current_fx when nested context │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/callbacks/visualizer/visualizer_image.py:60 in │
│ on_predict_batch_end │
│ │
│ 57 │ │ del trainer, pl_module, batch, batch_idx, dataloader_idx # These variables are │
│ 58 │ │ assert outputs is not None │
│ 59 │ │ │
│ ❱ 60 │ │ for i, image in enumerate(self.visualizer.visualize_batch(outputs)): │
│ 61 │ │ │ if "image_path" in outputs: │
│ 62 │ │ │ │ filename = Path(outputs["image_path"][i]) │
│ 63 │ │ │ elif "video_path" in outputs: │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/utils/visualization.py:124 in visualize_batch │
│ │
│ 121 │ │ │ │ pred_boxes=batch["pred_boxes"][i].cpu().numpy() if "pred_boxes" in batch │
│ 122 │ │ │ │ box_labels=batch["box_labels"][i].cpu().numpy() if "box_labels" in batch │
│ 123 │ │ │ ) │
│ ❱ 124 │ │ │ yield self.visualize_image(image_result) │
│ 125 │ │
│ 126 │ def visualize_image(self, image_result: ImageResult) -> np.ndarray: │
│ 127 │ │ """Generate the visualization for an image. │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/utils/visualization.py:136 in visualize_image │
│ │
│ 133 │ │ │ The full or simple visualization for the image, depending on the specified m │
│ 134 │ │ """ │
│ 135 │ │ if self.mode == VisualizationMode.FULL: │
│ ❱ 136 │ │ │ return self._visualize_full(image_result) │
│ 137 │ │ if self.mode == VisualizationMode.SIMPLE: │
│ 138 │ │ │ return self._visualize_simple(image_result) │
│ 139 │ │ msg = f"Unknown visualization mode: {self.mode}" │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/utils/visualization.py:185 in _visualize_full │
│ │
│ 182 │ │ │ │ image_classified = add_normal_label(image_result.image, 1 - image_result │
│ 183 │ │ │ visualization.add_image(image=image_classified, title="Prediction") │
│ 184 │ │ │
│ ❱ 185 │ │ return visualization.generate() │
│ 186 │ │
│ 187 │ def _visualize_simple(self, image_result: ImageResult) -> np.ndarray: │
│ 188 │ │ """Generate a simple visualization for an image. │
│ │
│ /Volumes/SSD_USB/副業/anomalib/src/anomalib/utils/visualization.py:296 in generate │
│ │
│ 293 │ │ self.figure.canvas.draw() │
│ 294 │ │ # convert canvas to numpy array to prepare for visualization with opencv │
│ 295 │ │ img = np.frombuffer(self.figure.canvas.tostring_rgb(), dtype=np.uint8) │
│ ❱ 296 │ │ img = img.reshape(self.figure.canvas.get_width_height()[::-1] + (3,)) │
│ 297 │ │ plt.close(self.figure) │
│ 298 │ │ return img │
│ 299 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: cannot reshape array of size 15000000 into shape (500,2500,3)
@jahad9819jjj, yeah you are right, the problem was not fully resolved. I've also spotted this and created #1644. Once it is merged, it should hopefully be ok :)
I still get the float64 error unless I change the accelerator to cpu.
I also need to turn save_images off, or there are similar dimension issues.
@Yiiipu, which branch are you using? Note that all these fixes are made to the v1 branch.
I was working on the main branch. It makes sense now. Thank you!
If you want to solve the visualisation issue, you should add matplotlib.use('Agg') to the top of the visualizer.py file. There is some issue with macOS, which is why you need to specify this matplotlib backend.
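For reference, a sketch of what that change could look like at the top of visualizer.py (the imports already present in the real file may differ). The Retina-scaling remark in the comment is an assumption about why the reshape fails, not something confirmed in this thread:

import matplotlib

# Select the non-interactive Agg backend before pyplot is imported.
# On macOS the default backend can render the canvas at a different pixel
# size than the figure size (e.g. 2x Retina scaling), which would explain
# the "cannot reshape array" error in generate() shown in the traceback above.
matplotlib.use("Agg")

import matplotlib.pyplot as plt  # noqa: E402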
Thanks for sharing, @Mr-Corentin |
Describe the bug
I get a "Cannot convert a MPS Tensor to float64" when running the train.py script on a Mac M1
It seems that the Mac GPU interface can't handle 64bit tensors.. I am unsure where to cast the tensor or how to properly do it but from what I can tell data is loaded in ligthining_fabric/apply_func.py.
I tried changing stuff to "data_output= data.type(torch.float32).to(device, **kwargs)" (~line 95) but this does not work.
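For comparison, a narrower variant of that experiment would downcast only float64 tensors instead of forcing every tensor to float32. This is just a sketch of the idea as a standalone helper; the function name is made up, and whether it belongs in apply_func.py is an open question:

import torch
from lightning_utilities.core.apply_func import apply_to_collection


def move_to_mps_safe(data, device: torch.device):
    # Hypothetical helper: move a (possibly nested) batch to `device`,
    # downcasting float64 tensors because MPS has no float64 support.
    def _convert(tensor: torch.Tensor) -> torch.Tensor:
        if device.type == "mps" and tensor.dtype == torch.float64:
            tensor = tensor.to(torch.float32)
        return tensor.to(device)

    return apply_to_collection(data, dtype=torch.Tensor, function=_convert)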
Looking forward to any help :)
Regards JI
Dataset
MVTec
Model
PADiM
Steps to reproduce the behavior
On a Mac M1 / Apple Silicon:
OS information
OS information:
Expected behavior
A working training run to get going with this cool lib.
Screenshots
No response
Pip/GitHub
GitHub
What version/branch did you use?
1.0dev
Configuration YAML
Logs
Code of Conduct