Add batch img inference support for ocr det with readtext_batched #458
Conversation
@rkcosmos, let me know if you need any tests or a verification pipeline as well. If you think this PR is good, we can later discuss a follow-up PR for improving the general code formatting per PEP guidelines. Thanks.
@SamSamhuns thank you
Tesla V100 DGX
Why does this method load all tensors into GPU memory? I got memory leaks of about 26 GB, and batch_size didn't work.
When benchmarking on GPU
@SamSamhuns It would be great if you could suggest a solution that supports parallel processing, as I need to process a lot of images and extract text in a short period of time. I would appreciate recommendations on leveraging multiprocessing or alternative parallelism techniques for better performance.
Batched image inference for text detection
Caveats:
- `n_width` and `n_height` parameters can be used in `readtext_batched`.
- `readtext_batched` can also take a single image as input but returns a result list with one element, i.e. a further `result[0]` access will be required.
- `cudnn.benchmark` mode set to `True` is better for batched inference, hence I pass `cudnn_benchmark=True` in `easyocr.Reader`.
- A warm-up pass, `dummy = np.zeros([batch_size, 600, 800, 3], dtype=np.uint8); reader.readtext_batched(dummy)`, is run before timing the inferences.
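A minimal usage sketch of the points above; the image paths, sizes, and batch size are placeholders, not values from the PR:

```python
import numpy as np
import easyocr

# cudnn_benchmark=True enables cudnn.benchmark mode for faster batched inference
reader = easyocr.Reader(['en'], gpu=True, cudnn_benchmark=True)

batch_size = 8
# Warm-up pass on dummy images before timing any inference
dummy = np.zeros([batch_size, 600, 800, 3], dtype=np.uint8)
reader.readtext_batched(dummy)

# Hypothetical image paths; n_width/n_height control the common resize
image_paths = ['img_%d.jpg' % i for i in range(batch_size)]
results = reader.readtext_batched(image_paths, n_width=800, n_height=600)

# A single image also works, but the result is still a one-element list
single_result = reader.readtext_batched('img_0.jpg')[0]
```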
Edited files
These changes, although major, should have no backward-compatibility issues, but I would greatly appreciate extensive testing, @rkcosmos. I am open to any suggestions or changes.
`utils.py`
- Added `reformat_input_batched` to take a list of file paths, numpy ndarrays, or byte stream objects.

`detection.py`
- Changed the `get_textbox` function to process a list of lists of bboxes and polys.
- Changed the `test_net` function to accumulate the input images and send all the inputs in a single tensor to the CRAFT torch model (see the sketch below).

`easyocr.py`
- Added `readtext_batched`, which takes a list of file paths, numpy ndarrays, or byte stream objects and processes them in batch.
- Changed the `detect` function to process a list of images.

I have a test script here to verify the functions are working as intended and added results for both CPU and GPU.
As expected, GPU batched inference is almost twice as fast as sequential GPU inference.
GPU results
CPU results
`test_batch_easyocr.py`: program to generate the outputs above.
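The test script itself is not reproduced here; a minimal sketch of such a sequential-vs-batched timing comparison might look like the following, with hypothetical image paths and counts:

```python
import time
import numpy as np
import easyocr

reader = easyocr.Reader(['en'], gpu=True, cudnn_benchmark=True)
image_paths = ['img_%d.jpg' % i for i in range(16)]  # hypothetical inputs

# Warm up the GPU before timing, as noted in the caveats
reader.readtext_batched(np.zeros([16, 600, 800, 3], dtype=np.uint8))

start = time.time()
for path in image_paths:  # sequential: one readtext call per image
    reader.readtext(path)
print('sequential: %.2fs' % (time.time() - start))

start = time.time()
# batched: one readtext_batched call for all images
reader.readtext_batched(image_paths, n_width=800, n_height=600)
print('batched:    %.2fs' % (time.time() - start))
```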