This repository has been archived by the owner on Mar 17, 2022. It is now read-only.

Add some useful functions #40

Merged 1 commit on Feb 17, 2014

ductranit

This pull request adds 5 functions to the base API: GetBoxText, GetHOCRText, SetInputName, SetOutputName, ReadConfigFile.

Here is an example of the new APIs:

     TessBaseAPI baseApi = new TessBaseAPI();
     // Use the most accurate engine mode.
     baseApi.init(appPath, "eng", TessBaseAPI.OEM_TESSERACT_CUBE_COMBINED);

     // Enable hOCR output.
     baseApi.setVariable("tessedit_create_hocr", "1");
     baseApi.setImage(new File(grayScaleFileName));

     // Set the input name that appears in the hOCR output.
     baseApi.setInputName("hocr.html");

     String hocr = baseApi.getHOCRText(0);
     String recognizedText = baseApi.getUTF8Text();
     String boxText = baseApi.getBoxText(0);
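
As a follow-up to the example above, here is a minimal sketch of how the string returned by `getBoxText` might be consumed. It assumes the text follows Tesseract's box-file layout, one line per character in the form `symbol left bottom right top page`, with coordinates measured from the bottom-left corner of the image. The `BoxTextSketch` and `CharBox` names and the sample string are hypothetical and are not part of this pull request.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: parse box-file text into per-character boxes (names are hypothetical). */
final class BoxTextSketch {

    /** One parsed line: the symbol plus its box, origin at the bottom-left of the image. */
    static final class CharBox {
        final String symbol;
        final int left, bottom, right, top, page;

        CharBox(String symbol, int left, int bottom, int right, int top, int page) {
            this.symbol = symbol;
            this.left = left;
            this.bottom = bottom;
            this.right = right;
            this.top = top;
            this.page = page;
        }
    }

    /** Parses each "symbol left bottom right top page" line; other lines are skipped. */
    static List<CharBox> parse(String boxText) {
        List<CharBox> boxes = new ArrayList<>();
        if (boxText == null) {
            return boxes;
        }
        for (String line : boxText.split("\\r?\\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 6) {
                continue; // blank or malformed line
            }
            try {
                boxes.add(new CharBox(parts[0],
                        Integer.parseInt(parts[1]), Integer.parseInt(parts[2]),
                        Integer.parseInt(parts[3]), Integer.parseInt(parts[4]),
                        Integer.parseInt(parts[5])));
            } catch (NumberFormatException e) {
                // A coordinate was not an integer; skip the line.
            }
        }
        return boxes;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for baseApi.getBoxText(0).
        String sample = "H 10 20 30 60 0\ni 32 20 40 60 0\n";
        for (CharBox b : parse(sample)) {
            System.out.println(b.symbol + " width=" + (b.right - b.left)
                    + " height=" + (b.top - b.bottom));
        }
    }
}
```

Malformed lines are skipped rather than reported, which keeps the sketch short; a real caller may prefer to log them.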

This is the input image for OCR:
[image: 20140213_145057_675_grayscale]

And the results:

For the page iterator, I created a test case based on Tesseract's documentation:

     ResultIterator iterator = baseApi.getResultIterator();
     int level = PageIteratorLevel.RIL_WORD;
     if (iterator != null) {
         do {
             String dataText = iterator.getUTF8Text(level);
             float confidence = iterator.confidence(level);
             if (!TextUtils.isEmpty(dataText)) {
                 int[] box = iterator.getBoundingBox(level);
                 // Print the array contents; println(box) would only print the array reference.
                 System.out.println(dataText + " (" + confidence + "): " + java.util.Arrays.toString(box));
             }
         } while (iterator.next(level));
         iterator = null;
     }
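
The same do/while pattern can be used to keep only the words the engine is reasonably sure about. The sketch below is an illustration under stated assumptions, not code from this pull request: `WordCursor` is a hypothetical stand-in that mirrors the three iterator calls used above so the example runs without the library, `ArrayCursor` is an in-memory fake, the confidence values are made up, and the 0 to 100 confidence scale is assumed from Tesseract's result iterator.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: collect words whose confidence meets a threshold (names are hypothetical). */
final class ConfidentWordsSketch {

    /** Hypothetical stand-in mirroring the iterator calls used in the test case above. */
    interface WordCursor {
        String getUTF8Text(int level);
        float confidence(int level);
        boolean next(int level);
    }

    /** Walks the cursor with the same do/while shape and keeps non-empty, confident words. */
    static List<String> collect(WordCursor cursor, int level, float minConfidence) {
        List<String> words = new ArrayList<>();
        if (cursor == null) {
            return words;
        }
        do {
            String text = cursor.getUTF8Text(level);
            if (text != null && !text.isEmpty() && cursor.confidence(level) >= minConfidence) {
                words.add(text);
            }
        } while (cursor.next(level));
        return words;
    }

    /** In-memory cursor over fixed arrays, for demonstration only; it ignores the level. */
    static final class ArrayCursor implements WordCursor {
        private final String[] texts;
        private final float[] confidences;
        private int index = 0;

        ArrayCursor(String[] texts, float[] confidences) {
            this.texts = texts;
            this.confidences = confidences;
        }

        public String getUTF8Text(int level) {
            return index < texts.length ? texts[index] : null;
        }

        public float confidence(int level) {
            return index < confidences.length ? confidences[index] : 0f;
        }

        public boolean next(int level) {
            if (index + 1 < texts.length) {
                index++;
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        int level = 0; // placeholder for PageIteratorLevel.RIL_WORD; unused by the fake cursor
        WordCursor cursor = new ArrayCursor(
                new String[] {"Hello", "", "wor1d", "world"},
                new float[] {95f, 0f, 40f, 88f});
        System.out.println(collect(cursor, level, 60f)); // prints [Hello, world]
    }
}
```

With the real API, the body of `collect` would call the `ResultIterator` methods directly instead of going through the stand-in interface.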

@rmtheis rmtheis merged commit 85ac493 into rmtheis:master Feb 17, 2014
Owner

rmtheis commented Feb 17, 2014

Outstanding--this is a great contribution. Thank you.
