Loading the model with YoloV8Translator gives wrong, offset results, while loading it through YoloV8TranslatorFactory works fine. What causes this and how can it be fixed? #3562

Open
zhu-weixin opened this issue Dec 18, 2024 · 1 comment

zhu-weixin commented Dec 18, 2024

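Is there something wrong with the way I load the model through YoloV8Translator below, or is the problem caused by how the image is processed afterwards?

Below is the code that loads the model in two different ways: getPredictor1 loads it through YoloV8TranslatorFactory, and getPredictor2 loads it through YoloV8Translator.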

@Slf4j
public class Main4 {

    public static void main(String[] args) throws IOException, ModelException, TranslateException {
        Path imgPath = Paths.get("D:/Project/aggreg/image/src/main/resources/img/bus.jpg");

        try {
            Predictor<Image, DetectedObjects> predictor1 = getPredictor1();
            Predictor<Image, DetectedObjects> predictor2 = getPredictor2();
            Path outputPath = Paths.get("D:/Project/aggreg/image/src/main/resources/img/");
            Files.createDirectories(outputPath);
            // First approach: predictor built with YoloV8TranslatorFactory
            Image img1 = ImageFactory.getInstance().fromFile(imgPath);
            DetectedObjects detection1 = predictor1.predict(img1);
            if (detection1.getNumberOfObjects() > 0) {
                img1.drawBoundingBoxes(detection1);
                Path output = outputPath.resolve("yolo_detected_1.png");
                try (OutputStream os = Files.newOutputStream(output)) {
                    img1.save(os, "png");
                }
                log.info(">>>>>>>>>>>>>>>>>>>>>>>>>>>");
                log.info("Detected object1: {}", detection1);
                log.info("Detected object saved in: {}", output);
                log.info(">>>>>>>>>>>>>>>>>>>>>>>>>>>");
            }
            // Second approach: predictor built with YoloV8Translator.builder()
            Image img2 = ImageFactory.getInstance().fromFile(imgPath);
            DetectedObjects detection2 = predictor2.predict(img2);
            if (detection2.getNumberOfObjects() > 0) {
                img2.drawBoundingBoxes(detection2);
                Path output = outputPath.resolve("yolo_detected_2.png");
                try (OutputStream os = Files.newOutputStream(output)) {
                    img2.save(os, "png");
                }
                log.info(">>>>>>>>>>>>>>>>>>>>>>>>>>>");
                log.info("Detected object2: {}", detection2);
                log.info("Detected object2 saved in: {}", output);
                log.info(">>>>>>>>>>>>>>>>>>>>>>>>>>>");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Loads the model through YoloV8TranslatorFactory.
     *
     * @return
     * @throws ModelNotFoundException
     * @throws MalformedModelException
     * @throws IOException
     */
    public static Predictor<Image, DetectedObjects> getPredictor1() throws ModelNotFoundException, MalformedModelException, IOException {
        Criteria<Image, DetectedObjects> criteria =
                Criteria.builder()
                        .setTypes(Image.class, DetectedObjects.class)
                        .optDevice(Device.gpu())
                        .optModelUrls(Main4.class.getResource("/yolo").getPath())
                        .optModelName("yolo11n.onnx")
                        .optEngine("OnnxRuntime")
                        .optArgument("width", 640)
                        .optArgument("height", 640)
                        .optArgument("resize", true)
                        .optArgument("toTensor", true)
                        .optArgument("applyRatio", true)
                        .optArgument("threshold", 0.6f)
                        .optTranslatorFactory(new YoloV8TranslatorFactory())
                        .optProgress(new ProgressBar())
                        .build();
        ZooModel<Image, DetectedObjects> model = criteria.loadModel();
        Predictor<Image, DetectedObjects> predictor = model.newPredictor();
        return predictor;
    }

    /**
     * Loads the model through YoloV8Translator.
     *
     * @return
     * @throws ModelNotFoundException
     * @throws MalformedModelException
     * @throws IOException
     */
    public static Predictor<Image, DetectedObjects> getPredictor2() throws ModelNotFoundException, MalformedModelException, IOException {
        Translator<Image, DetectedObjects> translator = YoloV8Translator.builder()
                .addTransform(new Resize(640, 640))
                .addTransform(new ToTensor())
                .optApplyRatio(true)
                .optThreshold(0.6f)
                .optSynsetArtifactName("synset.txt")
                .build();
        Criteria<Image, DetectedObjects> criteria =
                Criteria.builder()
                        .setTypes(Image.class, DetectedObjects.class)
                        .optDevice(Device.gpu())
                        .optModelUrls(Main4.class.getResource("/yolo").getPath())
                        .optModelName("yolo11n.onnx")
                        .optEngine("OnnxRuntime")
                        .optArgument("maxBox", 1000)
                        .optTranslator(translator)
                        .optProgress(new ProgressBar())
                        .build();
        ZooModel<Image, DetectedObjects> model = criteria.loadModel();
        Predictor<Image, DetectedObjects> predictor = model.newPredictor();
        return predictor;
    }
}

Below is the console output (I am aware of the memory leak, of course):

18:28:56.194 [main] INFO xyz.zwxin.image.service.yolo.Main4 - >>>>>>>>>>>>>>>>>>>>>>>>>>>
18:28:56.194 [main] INFO xyz.zwxin.image.service.yolo.Main4 - Detected object1: [
	{"className": "bus", "probability": 0.90587, "boundingBox": {"rect":[0.01778717041015625,0.2105330228805542,0.9935930252075195,0.685427975654602]}},
	{"className": "person", "probability": 0.89490, "boundingBox": {"rect":[0.06084575653076172,0.36660306453704833,0.29915738105773926,0.8391817331314086]}},
	{"className": "person", "probability": 0.89393, "boundingBox": {"rect":[0.8278806686401368,0.3668684959411621,0.9998627662658692,0.814603328704834]}},
	{"className": "person", "probability": 0.88770, "boundingBox": {"rect":[0.2755547523498535,0.37552340030670167,0.42212188243865967,0.7971038579940797]}},
	{"className": "person", "probability": 0.72946, "boundingBox": {"rect":[1.4168620109558105E-4,0.5128156661987304,0.09215204119682313,0.8079570293426513]}}
]

18:28:56.226 [main] INFO xyz.zwxin.image.service.yolo.Main4 - Detected object saved in: D:\Project\aggreg\image\src\main\resources\img\yolo_detected_1.png
18:28:56.226 [main] INFO xyz.zwxin.image.service.yolo.Main4 - >>>>>>>>>>>>>>>>>>>>>>>>>>>
18:29:01.987 [main] INFO xyz.zwxin.image.service.yolo.Main4 - >>>>>>>>>>>>>>>>>>>>>>>>>>>
18:29:01.987 [main] INFO xyz.zwxin.image.service.yolo.Main4 - Detected object2: [
	{"className": "bus", "probability": 0.90590, "boundingBox": {"rect":[0.050815309797014506,0.6015426090785435,2.8388173239571706,1.9583563123430525]}},
	{"className": "person", "probability": 0.89488, "boundingBox": {"rect":[0.17384161267961776,1.047433308192662,0.8547286987304688,2.3976612091064453]}},
	{"className": "person", "probability": 0.89395, "boundingBox": {"rect":[2.3653640747070312,1.0481888907296317,2.856751033238002,2.3274388994489397]}},
	{"className": "person", "probability": 0.88761, "boundingBox": {"rect":[0.7872960226876395,1.0729188919067383,1.2060600008283342,2.277428899492536]}},
	{"className": "person", "probability": 0.72934, "boundingBox": {"rect":[4.0481771741594585E-4,1.465179443359375,0.2632885490145002,2.308438846043178]}}
]

18:29:01.997 [main] INFO xyz.zwxin.image.service.yolo.Main4 - Detected object2 saved in: D:\Project\aggreg\image\src\main\resources\img\yolo_detected_2.png
18:29:02.008 [main] INFO xyz.zwxin.image.service.yolo.Main4 - >>>>>>>>>>>>>>>>>>>>>>>>>>>
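
The two logs can be compared numerically. The sketch below divides the four rect values logged for the bus under the second approach by the values logged under the first. The class and method names (`BoxRatioCheck`, `ratios`) are hypothetical, and only the numbers come from the logs above. The comparison with 640/224 is an assumption added for illustration: 224 is a common default input size, and the thread does not confirm that it is the value involved.

```java
public class BoxRatioCheck {

    /** Element-wise ratio b[i] / a[i] of two equally sized coordinate arrays. */
    public static double[] ratios(double[] a, double[] b) {
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = b[i] / a[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // "bus" rect as logged for detection1 (YoloV8TranslatorFactory)
        double[] busFactory = {
            0.01778717041015625, 0.2105330228805542,
            0.9935930252075195, 0.685427975654602
        };
        // "bus" rect as logged for detection2 (YoloV8Translator.builder())
        double[] busBuilder = {
            0.050815309797014506, 0.6015426090785435,
            2.8388173239571706, 1.9583563123430525
        };

        for (double r : ratios(busFactory, busBuilder)) {
            System.out.printf("ratio = %.4f%n", r);
        }
        // Assumption for comparison only: 640 is the configured input size,
        // 224 is a guessed fallback size that the thread does not mention.
        System.out.printf("640 / 224 = %.4f%n", 640.0 / 224.0);
    }
}
```

All four ratios land close to 2.857, so the second set of boxes looks like the first set multiplied by one constant factor rather than a different detection result. That is consistent with a wrong normalization size, although the thread itself only states that the bug is in the builder.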

yolo_detected_1.png looks like this:
[image: yolo_detected_1]

yolo_detected_2.png looks like this:
[image: yolo_detected_2]

@frankfliu
Contributor

@zhu-weixin
Thanks for reporting this issue. There is a bug in BaseImageTranslator.builder(). It was introduced when YoloV8TranslatorFactory was refactored. We will raise a PR shortly.

For the time being, you can use the following workaround:

        Map<String, Object> args = new ConcurrentHashMap<>();
        args.put("width", 640);
        args.put("height", 640);
        args.put("resize", true);
        args.put("toTensor", true);
        args.put("applyRatio", true);
        args.put("threshold", 0.6f);
        args.put("maxBox", 1000);

        Translator<Image, DetectedObjects> translator = YoloV8Translator.builder(args).build();
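
One way to check whether the workaround took effect is to verify that the returned boxes are normalized. Image.drawBoundingBoxes scales each rectangle by the image size, so coordinates are expected to fall in [0, 1]. The sketch below is plain Java with no DJL dependency; `NormalizedBoxCheck` and `isNormalized` are hypothetical helper names, and the sample values are the bus rects from the logs in the question.

```java
public class NormalizedBoxCheck {

    // Small tolerance for floating-point noise at the image borders.
    private static final double EPS = 1e-6;

    /** Returns true when every coordinate lies in [0, 1], within EPS. */
    public static boolean isNormalized(double[] rect) {
        for (double v : rect) {
            if (v < -EPS || v > 1.0 + EPS) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "bus" rect from detection1 (YoloV8TranslatorFactory)
        double[] fromFactory = {
            0.01778717041015625, 0.2105330228805542,
            0.9935930252075195, 0.685427975654602
        };
        // "bus" rect from detection2 (YoloV8Translator.builder() with addTransform)
        double[] fromBuilder = {
            0.050815309797014506, 0.6015426090785435,
            2.8388173239571706, 1.9583563123430525
        };

        System.out.println("factory normalized: " + isNormalized(fromFactory)); // true
        System.out.println("builder normalized: " + isNormalized(fromBuilder)); // false
    }
}
```

Running the same check on the output of a translator built with builder(args) should print true if the workaround behaves as described in the reply.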
