How do I use the pretrained model for prediction? #13

Open
monuminu opened this issue Dec 15, 2019 · 34 comments

@monuminu

Hi All,

Thanks a lot for this awesome dataset and the pretrained weights. How can I use them to predict bounding boxes for a given page image?

@zhxgj
Contributor

zhxgj commented Dec 15, 2019

Hi @monuminu The model was trained with the Detectron framework, so you can load it in Detectron and use its inference examples. I will also provide a Jupyter notebook for inference soon.

@Lambert-Shirzad

Hi,
Once again, thanks for releasing your great work.
I am not sure whether we should get Detectron or Detectron2. Detectron is built on Caffe2, and installing Caffe at this point is just a pain. Detectron2 is a rewrite of Detectron in PyTorch and seems to enjoy better support.
Looking forward to the tutorial!

@zhxgj
Contributor

zhxgj commented Dec 18, 2019

Hi @Lambert-Shirzad Thank you. I trained the models on Detectron, which is powered by Caffe2. But I found the installation was quite easy. I just followed the instructions in this link. We also plan to retrain the models in the Detectron2 framework.

@monuminu
Author

monuminu commented Dec 18, 2019

Agree with @Lambert-Shirzad. It would be awesome if you could train on Detectron2. I am also waiting for your Jupyter notebook; that will greatly help me.

@hpanwar08

If anyone is interested, I have trained it on Detectron2. You can find training config and trained models (resnet101, resnext101) in this repo https://github.com/hpanwar08/detectron2

Note: The models were trained on ~60% of the original dataset for ~1.5 epochs, but they work well for fine-tuning on domain-specific datasets.

@nodechef

@hpanwar08 I am trying your trained models for prediction but I get this error:
Config '\configs\DLA_mask_rcnn_X_101_32x8d_FPN_3x.yaml' has no VERSION. Assuming it to be compatible with latest v2.

@hpanwar08

hpanwar08 commented Jan 1, 2020

@nodechef This is just a warning. You should still be able to get the predictions.
I have updated the models. Please download the smaller trimmed model "model_final_trimmed.pth" from dropbox. It is smaller in size.

Use the below command for prediction

python demo/demo.py --config-file configs/DLA_mask_rcnn_X_101_32x8d_FPN_3x.yaml --input "<path to image.jpg>" --output <path to save the predicted image> --confidence-threshold 0.5 --opts MODEL.WEIGHTS <path to model_final_trimmed.pth> MODEL.DEVICE cpu

@nodechef

nodechef commented Jan 1, 2020

@hpanwar08 Yeah, it worked. I realized later that it was just a warning. However, I have a question:
How do we get the predicted classes, like title, paragraph, figure?

@nodechef

nodechef commented Jan 2, 2020

I guess we need to add this to demo.py in order to get the predicted class, right? Correct me if I am wrong.

from detectron2.data import MetadataCatalog
MetadataCatalog.get("dataset_name").thing_classes = ["title", "text","figure", "table","list"]

Would the dataset name be the one you used for training?

@hpanwar08

@nodechef If you want just the predictions and not the visualization, use detectron2.engine.defaults.DefaultPredictor() to get the classes. You will get class indexes, which you then need to convert to the actual class names.
Refer to lines 35 and 48 in predictor.py.

Or you can try this

classes = ['text', 'title', 'list', 'table', 'figure']
default_predictor = detectron2.engine.defaults.DefaultPredictor(cfg)
img = detectron2.data.detection_utils.read_image(path_to_image, format="BGR")
predictions = default_predictor(img)
instances = predictions["instances"].to('cpu')
pred_classes = instances.pred_classes
labels = [classes[i] for i in pred_classes]

"instances" contains the bbox, class probabilities and class indexes.

@nodechef

nodechef commented Jan 2, 2020

@hpanwar08 I am looking for class labels instead of percentage for each bbox (Along with the visualization.)

@hpanwar08

@nodechef If that is the case, then what you said should work. dataset_name will be "dla_val"

@nodechef

nodechef commented Jan 2, 2020

MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes = ['text', 'title', 'list', 'table', 'figure']

v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]) , scale=1.2)
v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2_imshow(v.get_image()[:, :, ::-1])

This worked.
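
If the class names are needed as data rather than drawn on the page image, the index-to-name mapping from earlier in the thread can be wrapped in a small helper. This is only a sketch: `label_predictions` and `min_score` are hypothetical names, and the class order is the one quoted in this thread, which may need reordering for a different checkpoint. With Detectron2 the inputs would presumably come from `instances.pred_classes.tolist()` and `instances.scores.tolist()`.

```python
# Hypothetical helper for illustration; it is not part of Detectron2.
# Class order as quoted in this thread (assumption: it matches the checkpoint).
CLASSES = ["text", "title", "list", "table", "figure"]


def label_predictions(class_ids, scores, class_names=CLASSES, min_score=0.5):
    """Pair each predicted class index with its name, dropping low-score items."""
    results = []
    for class_id, score in zip(class_ids, scores):
        if score < min_score:
            continue  # skip detections below the confidence threshold
        results.append({"label": class_names[class_id], "score": score})
    return results


# Plain lists stand in for the tensors converted with .tolist().
example = label_predictions([0, 3, 4], [0.98, 0.91, 0.42])
print(example)
```

Working on plain lists keeps the helper independent of the GPU/CPU placement of the prediction tensors.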

@hpanwar08

@deepseek Done!

@elnazsn1988

Hi, is there an internal feature that lets each class be saved as a separate segment or image? I am trying to identify tables, separate them, and then run them through a tabular data analyzer and OCR. So far I am able to get the image predictions with your code, but not the actual annotations/segmented fields for further analysis/OCR.

@elnazsn1988

@hpanwar08 Sorry, can you expand on where I would add or change the DefaultPredictor lines you posted above? Do I run them as an external .py script and reference the inputs/outputs, or do I change the predict code as mentioned above?

@elnazsn1988

@nodechef Does the Visualizer snippet you posted above give you both the classes and the visualization? Could you elaborate on whether you changed these lines in the predict.py file or ran an external .py file?

@elnazsn1988

@nodechef When I add the Visualizer snippet you posted to the bottom of demo.py, I get the following error:

Traceback (most recent call last):
  File "C:\projects\pytorch\detectron2\demo\demo.py", line 155, in <module>
    v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]) , scale=1.2)
NameError: name 'Visualizer' is not defined

When I add it at the top of the demo.py file instead, I get the following:

Traceback (most recent call last):
  File "C:\projects\pytorch\detectron2\demo\demo.py", line 22, in <module>
    MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes = ['text', 'title', 'list', 'table', 'figure']
NameError: name 'cfg' is not defined

Is there somewhere specific I should be adding your edit?

@elnazsn1988

@hpanwar08 Thanks, the DefaultPredictor snippet above runs without errors. How can I save the instances to a location on my system? I am running it through the shell.

@hpanwar08

hpanwar08 commented Feb 10, 2020

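@elnazsn1988 In addition to the above code, you can extract the bounding boxes, crop the image to each box, and save the crops.

import numpy as np
import detectron2.data.detection_utils
import detectron2.engine.defaults
import detectron2.structures.boxes
from PIL import Image

classes = ['text', 'title', 'list', 'table', 'figure']
default_predictor = detectron2.engine.defaults.DefaultPredictor(cfg)
img = detectron2.data.detection_utils.read_image(path_to_image, format="BGR")
predictions = default_predictor(img)
instances = predictions["instances"].to('cpu')

pred_classes = instances.pred_classes
labels = [classes[i] for i in pred_classes]
boxes = instances.pred_boxes
if isinstance(boxes, detectron2.structures.boxes.Boxes):
    boxes = boxes.tensor.numpy()
else:
    boxes = np.asarray(boxes)

# read_image returns a NumPy array in BGR order, which has no crop() or save();
# convert it to an RGB PIL image first.
pil_img = Image.fromarray(np.ascontiguousarray(img[:, :, ::-1]))

for idx, (label, bbox) in enumerate(zip(labels, boxes)):
    if label == "table":
        # bbox is (x0, y0, x1, y1) in pixels; pass it to PIL as a tuple of ints
        cropped_img = pil_img.crop(tuple(int(v) for v in bbox))
        # number the files so several tables on one page do not overwrite each other
        cropped_img.save(f"{label}_{idx}.png")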

Hope this helps.
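
One detail that may matter when cropping: predicted boxes are floating-point and can sit slightly outside the page. A minimal sketch of rounding a box outward and clipping it to the image follows; `clamp_box` is a hypothetical helper, and the (x0, y0, x1, y1) pixel layout is assumed from the snippet above.

```python
import math


def clamp_box(bbox, width, height):
    """Round an (x0, y0, x1, y1) box outward to integers and clip it to the image.

    Returns None when nothing of the box lies inside the image.
    """
    x0, y0, x1, y1 = bbox
    left = max(0, math.floor(x0))
    top = max(0, math.floor(y0))
    right = min(width, math.ceil(x1))
    bottom = min(height, math.ceil(y1))
    if right <= left or bottom <= top:
        return None  # empty or fully outside the image
    return (left, top, right, bottom)


inside = clamp_box((-3.2, 10.7, 120.4, 90.1), width=100, height=80)
outside = clamp_box((150.0, 10.0, 160.0, 20.0), width=100, height=80)
print(inside, outside)
```

The resulting tuple can be passed to PIL's `crop((left, top, right, bottom))` or used as a NumPy slice `img[top:bottom, left:right]`.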

@pollyMath

@elnazsn1988 This may already be resolved, but I will leave it here for future reference. Based on @nodechef's code, I added the following lines to demo.py after demo.run_on_image:

from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import Visualizer

            MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes = ["text",
                                                                 "title",
                                                                 "list",
                                                                 "table",
                                                                 "figure"]
            v = Visualizer(img[:, :, ::-1],
                           MetadataCatalog.get(cfg.DATASETS.TRAIN[0]),
                           scale=1.2)
            v = v.draw_instance_predictions(predictions["instances"].to("cpu"))

and then, where you have the output filename:
imwrite(out_filename, v.get_image()[:, :, ::-1])
Note that you may have to reorder the labels in the classes array.

@ChungNPH

Has anyone tried to fine-tune with other categories? For example, I have additional categories such as authors and introduction, so my categories are [title, text, figure, table, list, authors, introduction]. I know that IBM doesn't plan to add categories, but how can I try to do this?

@zhxgj
Contributor

zhxgj commented May 11, 2020

Hi @ChungNPH do you have annotations of the additional categories?

@ChungNPH

@zhxgj I created a custom dataset myself. Its format differs somewhat from PubLayNet's:
{ image:
  { file name: 'filename',
    height: 'height',
    width: 'width',
    id: 'id',
    annotations:
      { obj 1: [],
        obj 2: [],
      }
  }
}

I also tried to train on PubLayNet alone with Detectron2, but Detectron2 may not understand the PubLayNet format, { 'image': [...], 'annotations' : [...], 'categories' : [...] }. Should I change the JSON file format before training? Can you show me an overview of how to train with PubLayNet? Thank you so much!

@ChungNPH

https://github.com/facebookresearch/detectron2/blob/master/docs/tutorials/datasets.md

I see that Detectron2 needs the dataset re-formatted into its own format before training. Is that right?

@jmandivarapu1

jmandivarapu1 commented Jul 2, 2020

Hey, can anybody help me correctly load the pretrained model and test it on documents?

I am trying to use the pretrained weights on the test set images.
PMC3654277_00006

Case 1 : Using config configs/DLA_mask_rcnn_X_101_32x8d_FPN_3x.yaml

code: python3.7 demo/demo.py --config-file configs/DLA_mask_rcnn_X_101_32x8d_FPN_3x.yaml --input examples/PMC3654277_00006.jpg --output out/ --confidence-threshold 0.25 --opts MODEL.DEVICE cpu

But it's giving me pretty bad results. If I increase confidence-threshold above 0.25, there are no detections.

Case 2 : Using config pre-trained-models/Mask-RCNN/e2e_mask_rcnn_X-101-64x4d-FPN_1x.yaml

code: python3.7 demo/demo.py --config-file pre-trained-models/Mask-RCNN/e2e_mask_rcnn_X-101-64x4d-FPN_1x.yaml --input examples/PMC3576793_00004.jpg --output out/ --confidence-threshold 0.2 --opts TYPE Mask-RCNN MODEL.DEVICE cpu

This throws the error below:

'pre-trained-models/Mask-RCNN/e2e_mask_rcnn_X-101-64x4d-FPN_1x.yaml' has no VERSION. Assuming it to be compatible with latest v2.
Traceback (most recent call last):
  File "demo/demo.py", line 76, in <module>
    cfg = setup_cfg(args)
  File "demo/demo.py", line 28, in setup_cfg
    cfg.merge_from_file(args.config_file)
  File "/opt/anaconda3/lib/python3.7/site-packages/detectron2/config/config.py", line 49, in merge_from_file
    self.merge_from_other_cfg(loaded_cfg)
  File "/opt/anaconda3/lib/python3.7/site-packages/fvcore/common/config.py", line 120, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/opt/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/opt/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 464, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/opt/anaconda3/lib/python3.7/site-packages/yacs/config.py", line 477, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: MODEL.TYPE'
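
A likely reading of this traceback: the YAML in Case 2 looks like a config written for the original Caffe2-based Detectron, and Detectron2's config loader rejects any key that its own defaults do not define. The sketch below imitates that check on plain dictionaries. It is a simplified illustration, not the actual yacs code, and both dictionaries are toy stand-ins rather than the real config contents.

```python
def find_unknown_keys(loaded, defaults, prefix=""):
    """Return dotted keys that appear in `loaded` but not in `defaults`."""
    unknown = []
    for key, value in loaded.items():
        full_key = f"{prefix}{key}"
        if key not in defaults:
            unknown.append(full_key)
        elif isinstance(value, dict) and isinstance(defaults[key], dict):
            # Recurse into nested sections such as MODEL.
            unknown.extend(find_unknown_keys(value, defaults[key], full_key + "."))
    return unknown


# Toy stand-ins (assumptions for illustration only).
detectron2_style_defaults = {
    "MODEL": {"META_ARCHITECTURE": "GeneralizedRCNN", "WEIGHTS": "", "DEVICE": "cuda"}
}
detectron1_style_config = {"MODEL": {"TYPE": "generalized_rcnn", "DEVICE": "cpu"}}

unknown = find_unknown_keys(detectron1_style_config, detectron2_style_defaults)
print(unknown)
```

If that reading is right, the original Detectron config and weights would need to be run in Detectron itself, or replaced by a Detectron2 config and checkpoint such as the ones used in Case 1.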

@hpanwar08

Where did you download the trained model from?

@hpanwar08

@ChungNPH You have probably solved this by now, but just for info: if your data is in COCO format, then it is easy to train Detectron2. And yes, Detectron2 converts the data into its own data structure.

https://github.com/hpanwar08/detectron2
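
For a custom nested format like the one described earlier in the thread, a conversion to a COCO-style dictionary could look roughly like the sketch below. The input layout is an assumption, since the original description is only schematic: each record is taken to hold `file_name`, `height`, `width`, `id`, and an `annotations` mapping from category name to a list of `[x, y, w, h]` boxes. `to_coco` is a hypothetical name, and the category ids are simply assigned in list order.

```python
# Assumed category list: the five PubLayNet classes plus the two extra ones
# mentioned in the thread. Ids are assigned 1..N in this order (an assumption).
CATEGORIES = ["text", "title", "list", "table", "figure", "authors", "introduction"]


def to_coco(records, category_names=CATEGORIES):
    """Convert per-image records into a COCO-style dict (boxes only)."""
    categories = [{"id": i, "name": n} for i, n in enumerate(category_names, start=1)]
    name_to_id = {c["name"]: c["id"] for c in categories}
    images, annotations = [], []
    for rec in records:
        images.append({"id": rec["id"], "file_name": rec["file_name"],
                       "height": rec["height"], "width": rec["width"]})
        for name, boxes in rec["annotations"].items():
            for x, y, w, h in boxes:
                annotations.append({
                    "id": len(annotations) + 1,  # unique running annotation id
                    "image_id": rec["id"],
                    "category_id": name_to_id[name],
                    "bbox": [x, y, w, h],        # COCO boxes are [x, y, width, height]
                    "area": w * h,
                    "iscrowd": 0,
                })
    return {"images": images, "annotations": annotations, "categories": categories}


sample = [{"id": 1, "file_name": "page_1.jpg", "height": 800, "width": 600,
           "annotations": {"title": [[50, 40, 500, 60]], "authors": [[50, 110, 300, 30]]}}]
coco = to_coco(sample)
print(len(coco["annotations"]), coco["annotations"][0]["category_id"])
```

After writing the result with `json.dump`, the file can be registered through Detectron2's `register_coco_instances`. This sketch carries boxes only; a mask model would also need a `segmentation` field per annotation.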

@ShabariGadewar

https://github.com/hpanwar08/detectron2

Hi, I need a little help, I'm new to this. I just want to test the pre-trained models of PubLayNet dataset on the test data given on PubLayNet. I have downloaded the pkl file and un-pickled it. But I don't know how to use it on test data. Can you help me out? Thank you

@hpanwar08

@ankur-sentieo

ankur-sentieo commented Apr 28, 2021

Hey guys @ShabariGadewar @zhxgj, do you have a Jupyter notebook now for loading and testing the PubLayNet pre-trained model? I am also looking to test the pre-trained models on the test data provided with PubLayNet, but I get the same error that @jmandivarapu1 reported in his comment above. Also, @hpanwar08, I was able to run your trimmed models, but can you please explain how we can further train the pre-trained (trimmed) models that you have included in your repo?


@under-score

In support of @ankur-sentieo's request for a Jupyter/Colab/Binder notebook, or at least a working Docker container.

@opyate

opyate commented Mar 30, 2023

@hpanwar08 Thanks for the Detectron2 models!

Note that the layout-parser team also retrained with detectron2. See:
https://github.com/Layout-Parser/layout-parser/blob/main/src/layoutparser/models/detectron2/catalog.py#L25

And
https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html#model-catalog

@s-jays

s-jays commented Apr 28, 2024

Hi, just wondering how I can save the annotation output from the model in the same format as PubLayNet's dataset?
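
One possible approach, sketched under assumptions: PubLayNet annotations follow the COCO layout, so predictions can be written as COCO-style annotation entries. The sketch assumes boxes arrive as (x0, y0, x1, y1), that the model's class indexes follow the order quoted in this thread, and that category ids run 1 to 5 in that same order. The id mapping should be checked against the `categories` list in the dataset's own JSON. `predictions_to_coco` is a hypothetical name.

```python
import json

# Assumed class order; category ids are taken to be index + 1 (an assumption).
CLASSES = ["text", "title", "list", "table", "figure"]


def predictions_to_coco(image_info, boxes_xyxy, class_ids, scores):
    """Build a COCO-style dict from one image's predicted boxes."""
    annotations = []
    triples = zip(boxes_xyxy, class_ids, scores)
    for ann_id, (box, class_id, score) in enumerate(triples, start=1):
        x0, y0, x1, y1 = box
        width, height = x1 - x0, y1 - y0
        annotations.append({
            "id": ann_id,
            "image_id": image_info["id"],
            "category_id": class_id + 1,
            "bbox": [x0, y0, width, height],  # COCO uses [x, y, width, height]
            "area": width * height,
            "iscrowd": 0,
            "score": score,                   # extra field kept for predictions
        })
    categories = [{"id": i, "name": n} for i, n in enumerate(CLASSES, start=1)]
    return {"images": [image_info], "annotations": annotations, "categories": categories}


page = {"id": 7, "file_name": "page_7.jpg", "width": 600, "height": 800}
result = predictions_to_coco(page, [(10, 20, 110, 70)], [3], [0.93])
serialized = json.dumps(result, indent=2)
print(serialized)
```

With Detectron2, the inputs would presumably come from `instances.pred_boxes.tensor.tolist()`, `instances.pred_classes.tolist()`, and `instances.scores.tolist()`.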
