PyTorch object detection - evaluate function call - object of type <class 'numpy.float64'> cannot be safely interpreted as an integer #1700
Comments
After quite a bit of troubleshooting I found there is a change in numpy 1.18.0 which causes this. The fix is:
A few things I can also mention:
- numpy 1.17.4 was released on November 10th, 2019 and therefore should still be good for quite some time
- There is now a pip package for pycocotools, so instead of the above procedure (cloning and building) you can now simply do:
Thanks for the heads up and thorough investigation! This has already been fixed in pycocotools; see cocodataset/cocoapi#354
Not sure if this is still relevant but I ran into the same issue with numpy version 1.18.1 and the most recent pycocotools.
@nkalavak this should have been fixed with the latest installation of pycocotools. If you still face those issues, please open a new issue in the pycocotools repo
For those curious, I believe they have the issue at cocodataset/cocoapi#356
@EMCP that issue has been fixed in cocodataset/cocoapi#354
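For readers wondering what the numpy 1.18.0 change looks like in practice: as far as I understand it, NumPy 1.18 stopped accepting a float as the sample count (num) of np.linspace and raises a TypeError with the message in the title. The sketch below illustrates that under stated assumptions. The threshold expression is modelled on COCO-style IoU thresholds (0.50 to 0.95 in steps of 0.05) and the function names are hypothetical; it is not quoted from pycocotools or from cocodataset/cocoapi#354.

```python
import numpy as np


def iou_thresholds():
    """Build COCO-style IoU thresholds with an explicit integer sample count."""
    # np.round returns a numpy.float64, so the count below is a float (10.0).
    count = np.round((0.95 - 0.5) / 0.05) + 1
    # Casting to int is the kind of one-line fix discussed in this thread.
    return np.linspace(0.5, 0.95, int(count), endpoint=True)


def float_count_is_rejected():
    """Return True if this NumPy version refuses a float sample count."""
    count = np.round((0.95 - 0.5) / 0.05) + 1
    try:
        np.linspace(0.5, 0.95, count, endpoint=True)
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print("numpy", np.__version__)
    print("float count rejected:", float_count_is_rejected())
    print("thresholds:", iou_thresholds())
```

On NumPy versions older than 1.18 the second function is expected to return False (possibly with a deprecation warning), which would be consistent with the error appearing only after a numpy upgrade.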
I recently made this post on SO:
https://stackoverflow.com/questions/59493606/pytorch-object-detection-evaluate-function-call-object-of-type-class-numpy
I'm not sure if I should have posted on SO or here first, so I figured I'd ask here as well.
I'm attempting to get this PyTorch person detection example running:
https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html
I'm using Ubuntu 18.04. Here is a summary of the steps I've performed:
Stock Ubuntu 18.04 install on a Lenovo ThinkPad X1 Extreme Gen 2 with a GTX 1650 GPU.
Perform a standard CUDA 10.0 / cuDNN 7.4 install. I'd rather not restate all the steps as this post is going to be more than long enough already. This is a standard procedure, pretty much any link found via googling is what I followed.
Install torch and torchvision.
From this link on the PyTorch site:
https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html
I saved the source available from the link at the bottom:
https://pytorch.org/tutorials/_static/tv-training-code.py
to a directory I made, PennFudanExample.
Install the COCO API into Python:
Open Makefile in gedit, change the two instances of "python" to "python3", then:
Get the supporting files that the code linked above needs to run:
from ~/vision/references/detection, copy coco_eval.py, coco_utils.py, engine.py, transforms.py, and utils.py to directory PennFudanExample.
Download https://www.cis.upenn.edu/~jshi/ped_html/PennFudanPed.zip, then unzip it and put it in directory PennFudanExample.
My edit to tv-training-code.py was to change the training batch size from 2 to 1 to prevent a GPU out-of-memory crash; see this other post I made: PyTorch Object Detection with GPU on Ubuntu 18.04 - RuntimeError: CUDA out of memory. Tried to allocate xx.xx MiB
Here is tv-training-code.py as I'm running it with the slight batch size edit I mentioned:
Here is the full text output, including the error I'm currently getting:
The really strange thing is that after I resolved the above-mentioned GPU error, this was working for about half a day. Now I'm getting this error, and I could swear I didn't change anything.
I've tried uninstalling and reinstalling torch, torchvision, and pycocotools. For copying the files coco_eval.py, coco_utils.py, engine.py, transforms.py, and utils.py, I've tried checking out torchvision v0.5.0, v0.4.2, and the latest commit; all produce the same error.
Also, I was working from home yesterday (Christmas) and this error does not happen on my home computer, which is also Ubuntu 18.04 with an NVIDIA GPU.
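Since the error shows up on one machine and not the other, comparing installed package versions on both is a cheap first check. The snippet below is a minimal sketch, assuming Python 3.8 or newer (for importlib.metadata); the helper name installed_versions is hypothetical, and the package list is simply the set of packages named in this post.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Run this on both machines and compare the output line by line.
    names = ["numpy", "torch", "torchvision", "pycocotools"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

A difference in the numpy line between the two machines would point at a silent upgrade pulled in by reinstalling another package.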
In Googling for this error, one relatively common suggestion is to backdate numpy to 1.11.0, but that version is really old now and would therefore likely cause problems with other packages.
Also in Googling for this error, it seems the general fix is to add a cast to int somewhere or to change a divide by / to //, but I'm really hesitant to make changes internal to pycocotools or, worse yet, inside numpy. Also, since the error was not occurring previously and is not occurring on another computer, I don't suspect this is a good idea anyway.
Fortunately I can comment out the line
evaluate(model, data_loader_test, device=device)
for now and the training will complete, although I don't get the evaluation data (Mean Average Precision, etc.).
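An alternative to commenting the line out is to guard the call so that training continues when evaluation fails, but evaluation still runs on machines where it works. This is only a sketch: run_evaluation is a hypothetical wrapper, not part of the tutorial or of torchvision, and catching TypeError this broadly can hide unrelated bugs, so it is best treated as a temporary workaround.

```python
def run_evaluation(evaluate_fn, *args, **kwargs):
    """Call evaluate_fn; on TypeError, report it and return None instead."""
    try:
        return evaluate_fn(*args, **kwargs)
    except TypeError as exc:
        # Training results are kept; only the evaluation metrics are lost.
        print(f"Evaluation skipped: {exc}")
        return None
```

In the training script this would replace the direct call, for example run_evaluation(evaluate, model, data_loader_test, device=device).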
About the only thing left I can think of at this point is to format the HD and reinstall Ubuntu 18.04 and everything else, but this will take at least a day, and if this ever happens again I'd really like to know what may be causing it.
Ideas? Suggestions? Additional stuff I should check?