Environment
Current Behavior:
I am using Tesseract with the chi_sim language data to recognize text such as "2017年11月06日".
When I run the following command:
tesseract XX.jpg stdout -l chi_sim --psm 7 --oem 0
it outputs: 2017年10月 12 臼
But when I run this one:
tesseract XX.jpg stdout -l chi_sim --psm 7 --oem 0 -c tessedit_char_whitelist="0123456789年月日"
it outputs:
2017 10 12
It seems that once I specify the whitelist, the characters "年" and "月" go missing, even though they are recognized when no whitelist is given. Why?
Expected Behavior:
input: tesseract XX.jpg stdout -l chi_sim --psm 7 --oem 0 -c tessedit_char_whitelist="0123456789年月日"
output: 2017年10月 12
Suggested Fix:
In my opinion, the whitelist should act as a filter on the recognition result. For example:
given the input "6758490210" and the whitelist "02468",
the output should be "684020".
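
To make the behavior I have in mind concrete, here is a minimal Python sketch of a whitelist applied as a plain post-filter on already recognized text. The function name filter_by_whitelist is hypothetical and only for illustration. This is not how Tesseract implements tessedit_char_whitelist; it only shows the semantics I expected.

```python
def filter_by_whitelist(text: str, whitelist: str) -> str:
    """Keep only the characters of `text` that appear in `whitelist`.

    Hypothetical helper: it illustrates the "result filter" semantics
    described above and is not part of Tesseract.
    """
    allowed = set(whitelist)
    return "".join(ch for ch in text if ch in allowed)


if __name__ == "__main__":
    # The numeric example from this issue.
    print(filter_by_whitelist("6758490210", "02468"))  # 684020

    # The unfiltered OCR output from this issue, passed through the same filter.
    # The whitelist contains no space, so spaces are dropped as well.
    print(filter_by_whitelist("2017年10月 12 臼", "0123456789年月日"))  # 2017年10月12
```

Under this interpretation "年" and "月" would survive, because they were recognized without the whitelist and are both listed in it.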
Thank you very much for your answer.