How do i use pretrained model for prediction? #13

monuminu · 2019-12-15T08:15:19Z

Hi All,

Thanks a lot for this awesome dataset and the pretrained weights. How can I use them to predict bounding boxes for a given page image?

zhxgj · 2019-12-15T22:13:01Z

Hi @monuminu, the model was trained with the Detectron framework, so you can load it in Detectron and use its inference examples. I will also provide a Jupyter notebook for inference soon.

Lambert-Shirzad · 2019-12-18T00:40:02Z

Hi,
Once again, thanks for releasing your great work.
I am not sure whether we should get Detectron or Detectron2. Detectron is built on Caffe2, and installing Caffe2 at this point is just a pain. Detectron2 is a rewrite of Detectron in PyTorch and seems to enjoy better support.
Looking forward to the tutorial!

zhxgj · 2019-12-18T03:14:41Z

Hi @Lambert-Shirzad Thank you. I trained the models on Detectron, which is powered by Caffe2. But I found the installation was quite easy. I just followed the instructions in this link. We also plan to retrain the models in the Detectron2 framework.

monuminu · 2019-12-18T17:22:33Z

Agree with @Lambert-Shirzad. It would be awesome if you could train on Detectron2. I am also waiting for your Jupyter notebook; that will greatly help me.

hpanwar08 · 2019-12-29T12:09:53Z

If anyone is interested, I have trained it on Detectron2. You can find training config and trained models (resnet101, resnext101) in this repo https://github.com/hpanwar08/detectron2

Note: The models were trained on ~60% of the original dataset for ~1.5 epochs, but they work well for fine-tuning on a domain-specific dataset.

nodechef · 2019-12-31T20:30:50Z

@hpanwar08 I am trying your trained models for prediction but I get this error:
Config '\configs\DLA_mask_rcnn_X_101_32x8d_FPN_3x.yaml' has no VERSION. Assuming it to be compatible with latest v2.

hpanwar08 · 2020-01-01T04:20:43Z

@nodechef This is just a warning. You should still be able to get the predictions.
I have updated the models. Please download the trimmed model "model_final_trimmed.pth" from Dropbox; it is smaller in size.

Use the command below for prediction:

python demo/demo.py --config-file configs/DLA_mask_rcnn_X_101_32x8d_FPN_3x.yaml --input "<path to image.jpg>" --output <path to save the predicted image> --confidence-threshold 0.5 --opts MODEL.WEIGHTS <path to model_final_trimmed.pth> MODEL.DEVICE cpu

nodechef · 2020-01-01T18:24:34Z

@hpanwar08 Yeah, it worked; I realized later that it was just a warning. However, I have a question.
How do we get the predicted classes, like title, paragraph, figure?

nodechef · 2020-01-02T03:03:24Z

I guess we need to add this to demo.py in order to get the predicted class, right? Correct me if I am wrong.

from detectron2.data import MetadataCatalog
MetadataCatalog.get("dataset_name").thing_classes = ["title", "text","figure", "table","list"]

Would the dataset name be the one you used for training?

hpanwar08 · 2020-01-02T03:17:56Z

@nodechef If you want just the predictions and not the visualization, use detectron2.engine.defaults.DefaultPredictor() to get the classes. You will get class indexes, which you then need to convert to actual class names.
Refer to lines 35 and 48 in predictor.py.

Or you can try this:

from detectron2.data.detection_utils import read_image
from detectron2.engine import DefaultPredictor

# Class names in training order; pred_classes holds indexes into this list
classes = ['text', 'title', 'list', 'table', 'figure']
default_predictor = DefaultPredictor(cfg)  # cfg: the config built as in demo.py
img = read_image(path_to_image, format="BGR")
predictions = default_predictor(img)
instances = predictions["instances"].to('cpu')
pred_classes = instances.pred_classes
labels = [classes[i] for i in pred_classes.tolist()]

"instances" contains the bbox, class probabilities and class indexes.

nodechef · 2020-01-02T03:53:51Z

@hpanwar08 I am looking for class labels instead of percentage for each bbox (Along with the visualization.)

hpanwar08 · 2020-01-02T04:19:45Z

@nodechef If that is the case, then what you said should work. dataset_name will be "dla_val"

nodechef · 2020-01-02T04:20:58Z

MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes = ['text', 'title', 'list', 'table', 'figure']

v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]) , scale=1.2)
v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2_imshow(v.get_image()[:, :, ::-1])

This worked.

hpanwar08 · 2020-01-07T00:19:55Z

@deepseek Done!

elnazsn1988 · 2020-02-09T23:47:04Z

Hi, is there a built-in feature that lets each class be saved as a separate segment or image? I am trying to identify tables, separate them, and then run them through a tabular data analyzer and OCR. So far I am able to get the image predictions with your code, but not the actual annotations/segmented fields for further analysis/OCR.

elnazsn1988 · 2020-02-09T23:51:40Z

@hpanwar08 Sorry, could you expand on where I would add or change the lines you posted? Do I run them as a separate .py script that references the inputs/outputs, or do I change the predict code as mentioned above?

elnazsn1988 · 2020-02-09T23:54:07Z

@nodechef Does your Visualizer snippet above give you both the classes and the visualization? Could you elaborate on whether you changed these lines in the predict.py file or ran them as a separate .py file?

elnazsn1988 · 2020-02-10T00:03:31Z

@nodechef When I add your Visualizer snippet to the bottom of demo.py, I get the following error:

Traceback (most recent call last):
  File "C:\projects\pytorch\detectron2\demo\demo.py", line 155, in <module>
    v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]) , scale=1.2)
NameError: name 'Visualizer' is not defined

When I add it at the top of the demo.py file, I get the following:

Traceback (most recent call last):
  File "C:\projects\pytorch\detectron2\demo\demo.py", line 22, in <module>
    MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes = ['text', 'title', 'list', 'table', 'figure']
NameError: name 'cfg' is not defined

Is there somewhere specific I should be adding your edit?

elnazsn1988 · 2020-02-10T00:22:17Z

@hpanwar08 Thanks, your DefaultPredictor snippet runs without errors. How can I save the instances to a location on my system? I am running it from a shell.

hpanwar08 · 2020-02-10T04:19:51Z

@elnazsn1988 In addition to the above code you can extract the bounding boxes and then crop the image based on the bounding box and save.

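import cv2
import numpy as np
from detectron2.data.detection_utils import read_image
from detectron2.engine import DefaultPredictor
from detectron2.structures import Boxes

classes = ['text', 'title', 'list', 'table', 'figure']
default_predictor = DefaultPredictor(cfg)  # cfg: the config built as in demo.py
img = read_image(path_to_image, format="BGR")  # NumPy array, H x W x C
predictions = default_predictor(img)
instances = predictions["instances"].to('cpu')

pred_classes = instances.pred_classes
labels = [classes[i] for i in pred_classes.tolist()]
boxes = instances.pred_boxes
if isinstance(boxes, Boxes):
    boxes = boxes.tensor.numpy()
else:
    boxes = np.asarray(boxes)

for i, (label, bbox) in enumerate(zip(labels, boxes)):
    if label == "table":
        # img is a NumPy array, so crop by slicing (it has no .crop() method)
        x1, y1, x2, y2 = bbox.astype(int)
        cropped_img = img[y1:y2, x1:x2]
        cv2.imwrite(f"{label}_{i}.png", cropped_img)  # cv2 expects BGR, which img already is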

Hope this helps.
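
For anyone cropping regions this way: predicted boxes are floats and can extend slightly past the image edge, so it may help to round and clamp them before slicing. Below is a minimal sketch in plain Python; the helper name crop_bounds and the sample numbers are hypothetical and not part of detectron2.

```python
import math


def crop_bounds(box, width, height):
    """Convert a float (x1, y1, x2, y2) box into integer slice bounds inside the image.

    Returns (left, top, right, bottom), or None if nothing of the box lies inside.
    """
    x1, y1, x2, y2 = box
    # Round outward so the crop never cuts into the detected region,
    # then clamp to the image so the slice stays valid.
    left = max(math.floor(x1), 0)
    top = max(math.floor(y1), 0)
    right = min(math.ceil(x2), width)
    bottom = min(math.ceil(y2), height)
    if right <= left or bottom <= top:
        return None
    return left, top, right, bottom


# Hypothetical box on a page that is 200 px wide and 100 px high.
bounds = crop_bounds((10.2, 20.7, 60.5, 80.1), width=200, height=100)
print(bounds)
```

With a NumPy image the crop is then img[top:bottom, left:right], and a None result can simply be skipped.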

pollyMath · 2020-04-24T16:35:37Z

@elnazsn1988 This may already be resolved, but I am leaving it here for future reference. Based on nodechef's code, I added the following lines to demo.py after the call to demo.run_on_image:

from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import Visualizer

            MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes = ["text",
                                                                 "title",
                                                                 "list",
                                                                 "table",
                                                                 "figure"]
            v = Visualizer(img[:, :, ::-1],
                           MetadataCatalog.get(cfg.DATASETS.TRAIN[0]),
                           scale=1.2)
            v = v.draw_instance_predictions(predictions["instances"].to("cpu"))

and then, at the point where you have the output filename:
imwrite(out_filename, v.get_image()[:, :, ::-1])
Note that you may have to reorder the labels in the classes array.
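
The label order matters because the model only returns integer class indexes, and the names come from whatever list you supply. Here is a small sketch of that mapping, assuming the order quoted earlier in this thread; label_detections is a hypothetical helper, and plain lists stand in for instances.pred_classes.tolist() and instances.scores.tolist().

```python
# Class names in the order used earlier in this thread (assumed to match training).
CLASS_NAMES = ["text", "title", "list", "table", "figure"]


def label_detections(pred_classes, scores, class_names=CLASS_NAMES, min_score=0.5):
    """Pair each detection's class name with its score, dropping low-confidence ones."""
    results = []
    for idx, score in zip(pred_classes, scores):
        if score >= min_score:
            results.append((class_names[int(idx)], float(score)))
    return results


# Three made-up detections; the last one falls below the threshold.
detections = label_detections([0, 3, 4], [0.98, 0.91, 0.42])
print(detections)

# The same index maps to a different name if the list is ordered differently.
other_order = ["title", "text", "figure", "table", "list"]
print(label_detections([0], [0.9], class_names=other_order))
```

If the names drawn on the image look swapped (for example titles labelled as text), the list order is the first thing to check.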

ChungNPH · 2020-05-11T11:40:27Z

Has anyone tried to fine-tune with other categories? For example, I have additional categories such as authors and introduction, so my categories are [title, text, figure, table, list, authors, introduction]. I know that IBM has no plans for additional categories, but how can I try to do this?

zhxgj · 2020-05-11T22:56:26Z

Hi @ChungNPH do you have annotations of the additional categories?

ChungNPH · 2020-05-13T02:42:12Z

I created a custom dataset myself; its format differs somewhat from PubLayNet's:
{ image:
    { file name: 'filename',
      height: 'height',
      width: 'width',
      id: 'id',
      annotations:
        { obj 1: [],
          obj 2: [],
        }
    }
}

I also tried to train on PubLayNet alone with detectron2, but detectron2 does not seem to understand the PubLayNet format { 'images': [...], 'annotations': [...], 'categories': [...] }. Should I change the JSON file format before training? Could you give me an overview of how to train with PubLayNet? Thank you so much!

ChungNPH · 2020-05-13T02:52:17Z

https://github.com/facebookresearch/detectron2/blob/master/docs/tutorials/datasets.md

I see that detectron2 needs the dataset to be converted to its own format before training. Is that right?

jmandivarapu1 · 2020-07-02T04:48:23Z

Hey, can anybody help me correctly load the pretrained model and test it on documents?

I am trying to use the pretrained weights on the test set images.

Case 1: Using config configs/DLA_mask_rcnn_X_101_32x8d_FPN_3x.yaml

Command: python3.7 demo/demo.py --config-file configs/DLA_mask_rcnn_X_101_32x8d_FPN_3x.yaml --input examples/PMC3654277_00006.jpg --output out/ --confidence-threshold 0.25 --opts MODEL.DEVICE cpu

But it gives pretty bad results, and if I raise --confidence-threshold above 0.25 there are no detections at all.

Case 2: Using config pre-trained-models/Mask-RCNN/e2e_mask_rcnn_X-101-64x4d-FPN_1x.yaml

Command: python3.7 demo/demo.py --config-file pre-trained-models/Mask-RCNN/e2e_mask_rcnn_X-101-64x4d-FPN_1x.yaml --input examples/PMC3576793_00004.jpg --output out/ --confidence-threshold 0.2 --opts TYPE Mask-RCNN MODEL.DEVICE cpu

This throws the error below:

'pre-trained-models/Mask-RCNN/e2e_mask_rcnn_X-101-64x4d-FPN_1x.yaml' has no VERSION. Assuming it to be compatible with latest v2.
Traceback (most recent call last):
  File "demo/demo.py", line 76, in <module>
    cfg = setup_cfg(args)
  File "demo/demo.py", line 28, in setup_cfg
    cfg.merge_from_file(args.config_file)
  File "/opt/anaconda3/lib/python3.7/site-packages/detectron2/config/config.py", line 49, in merge_from_file
    self.merge_from_other_cfg(loaded_cfg)
  File "/opt/anaconda3/lib/python3.7/site-packages/fvcore/common/config.py", line 120, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/opt/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/opt/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 464, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/opt/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 477, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.TYPE'
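
The traceback is consistent with the YAML file containing a key (MODEL.TYPE) that the Detectron2 default config does not define, since yacs-style configs reject unknown keys when merging. The sketch below imitates that strict merge in plain Python to show where the message comes from; it is an illustration only, not the yacs implementation, and the dictionary contents are made up.

```python
def strict_merge(overrides, defaults, path=()):
    """Merge `overrides` into `defaults`, rejecting keys the defaults do not define."""
    for key, value in overrides.items():
        full_key = ".".join(path + (key,))
        if key not in defaults:
            raise KeyError(f"Non-existent config key: {full_key}")
        if isinstance(value, dict) and isinstance(defaults[key], dict):
            # Recurse into nested sections such as MODEL.
            strict_merge(value, defaults[key], path + (key,))
        else:
            defaults[key] = value
    return defaults


# Hypothetical, heavily simplified stand-ins for the framework defaults and the YAML file.
defaults = {"MODEL": {"WEIGHTS": "", "DEVICE": "cuda"}}
loaded = {"MODEL": {"TYPE": "generalized_rcnn", "DEVICE": "cpu"}}

try:
    strict_merge(loaded, defaults)
except KeyError as err:
    message = err.args[0]
    print(message)
```

If that is what is happening here, a config written for the original Detectron would need to be replaced by a Detectron2-style config rather than passed to the Detectron2 demo.py as is.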

hpanwar08 · 2020-07-02T08:48:57Z

Where did you download the trained model from?

hpanwar08 · 2020-07-02T08:51:54Z

@ChungNPH You have probably solved this by now, but for reference: if your data is in COCO format, it is easy to train with detectron2. And yes, detectron2 converts the data into its own data structure.
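
For a custom dataset, one option is to write it out in the same COCO-style layout first. Below is a rough sketch of such a conversion; the input record layout, the to_coco helper, and the category ids (1 to 5 in the order used earlier in this thread) are assumptions for illustration, so adjust them to your own data.

```python
import json

# Assumed category order; ids start at 1.
CATEGORIES = ["text", "title", "list", "table", "figure"]


def to_coco(pages):
    """Build a COCO-style dict from hypothetical per-page records.

    Each record looks like:
    {"file_name": str, "width": int, "height": int,
     "objects": [(label, (x1, y1, x2, y2)), ...]}
    """
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i + 1, "name": name} for i, name in enumerate(CATEGORIES)],
    }
    ann_id = 1
    for image_id, page in enumerate(pages, start=1):
        coco["images"].append({
            "id": image_id,
            "file_name": page["file_name"],
            "width": page["width"],
            "height": page["height"],
        })
        for label, (x1, y1, x2, y2) in page["objects"]:
            w, h = x2 - x1, y2 - y1
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": CATEGORIES.index(label) + 1,
                "bbox": [x1, y1, w, h],  # COCO boxes are [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco


# One made-up page with a single table.
coco = to_coco([{"file_name": "page1.jpg", "width": 200, "height": 100,
                 "objects": [("table", (10, 20, 60, 80))]}])
print(json.dumps(coco["annotations"][0]))
```

Extra categories such as authors or introduction would be appended to CATEGORIES, provided you have annotations for them.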

https://github.com/hpanwar08/detectron2

ShabariGadewar · 2020-09-22T06:59:44Z

https://github.com/hpanwar08/detectron2

Hi, I need a little help, I'm new to this. I just want to test the pre-trained models of PubLayNet dataset on the test data given on PubLayNet. I have downloaded the pkl file and un-pickled it. But I don't know how to use it on test data. Can you help me out? Thank you

hpanwar08 · 2020-09-25T10:32:21Z

You could try this https://github.com/hpanwar08/document-layout-analysis-app

ankur-sentieo · 2021-04-28T07:25:57Z

Hey guys @ShabariGadewar @zhxgj, do you have a Jupyter notebook yet for loading and testing the PubLayNet pretrained models? I am also trying to test the pretrained PubLayNet models on the PubLayNet test data, but I get the same error that @jmandivarapu1 reported in his comment above. Also, @hpanwar08, I was able to run your trimmed models; can you please guide me on how to further train the pretrained (trimmed) models that you have included in your repo?

'pre-trained-models/Mask-RCNN/e2e_mask_rcnn_X-101-64x4d-FPN_1x.yaml' has no VERSION. Assuming it to be compatible with latest v2.
Traceback (most recent call last):
  File "demo/demo.py", line 76, in <module>
    cfg = setup_cfg(args)
  File "demo/demo.py", line 28, in setup_cfg
    cfg.merge_from_file(args.config_file)
  File "/opt/anaconda3/lib/python3.7/site-packages/detectron2/config/config.py", line 49, in merge_from_file
    self.merge_from_other_cfg(loaded_cfg)
  File "/opt/anaconda3/lib/python3.7/site-packages/fvcore/common/config.py", line 120, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/opt/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/opt/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 464, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/opt/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 477, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.TYPE'

under-score · 2021-11-07T06:35:37Z

In support of @ankur-sentieo's request for a Jupyter/Colab/Binder notebook, or at least a working Docker container.

opyate · 2023-03-30T09:21:44Z

@hpanwar08 Thanks for the Detectron2-trained models!

Note that the layout-parser team also retrained with detectron2. See:
https://github.com/Layout-Parser/layout-parser/blob/main/src/layoutparser/models/detectron2/catalog.py#L25

And
https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html#model-catalog

s-jays · 2024-04-28T14:26:08Z

Hi, just wondering how I can save the annotation output from the model into the same format as Publaynet's dataset?
