Improve tesstrain.sh script #92
Merged
…ided The --fontlist argument to tesstrain.sh was always ignored, even if the language had no specific fonts specified in language-specific.sh. Change this behaviour so the --fontlist argument is used if no specific fonts are selected by language-specific.sh.
Previously, the fonts specified in language-specific.sh would override any specified on the command line. This changes language-specific.sh from overriding a user request to just setting the default fonts if none are specified with --fontlist.
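The precedence described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `choose_fonts` and the font names are hypothetical, and the real scripts may store the font list differently.

```shell
# Hypothetical helper: pick the fonts to train with.
#   $1 = fonts requested by the user via --fontlist (empty if the flag was not given)
#   $2 = default fonts that language-specific.sh would choose for the language
choose_fonts() {
    if [ -n "$1" ]; then
        # The user asked for specific fonts, so the language defaults must not override them.
        printf '%s\n' "$1"
    else
        # Nothing was requested, so fall back to the per-language defaults.
        printf '%s\n' "$2"
    fi
}

# Example calls (font names are placeholders):
choose_fonts "" "DejaVu Sans"
choose_fonts "FreeSerif" "DejaVu Sans"
```

The point of the design is that the language table only supplies defaults; an explicit request on the command line always wins.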
The fontconfig initialisation hardcodes Arial. However, Arial may not be available, whereas the fonts used later will be, so use one of them for initialisation instead.
The --bin_dir option to tesstrain.sh is not useful, as $PATH does the same job much better, so switch to relying on that instead. This also makes the code a bit more readable, as it removes the need to refer to binaries as COMMAND_NAME_EXE rather than just command_name.
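As a rough sketch of what relying on $PATH can look like, a script might check up front that the commands it needs are resolvable. `require_commands` is a hypothetical helper written for this illustration; it is not taken from tesstrain.sh.

```shell
# Hypothetical helper: verify that every named command can be found via $PATH.
require_commands() {
    for cmd in "$@"; do
        # 'command -v' is the POSIX way to ask the shell how it would resolve a name.
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "ERROR: $cmd not found in PATH" >&2
            return 1
        fi
    done
}

# Example: 'sh' should be present on any system that can run this sketch.
require_commands sh
```

A user with binaries in a non-standard location would then prepend that directory to PATH when invoking the script, rather than passing --bin_dir.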
The --exposures flag can be used to specify multiple different exposure levels for a training run. There was some code already in tesstrain_utils.sh to deal with multiple exposure levels, so it looks like this functionality was always intended. The default usage does not change: exposure level 0 is the only one used if --exposures is not given.
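To illustrate the idea, the loop below expands one font into one job per exposure level, defaulting to level 0. It is a sketch under assumptions: `list_jobs` is hypothetical, and passing the levels as a space-separated string is a choice made for this example, not necessarily how --exposures is parsed.

```shell
# Hypothetical helper: print one "font exposure" pair per requested exposure level.
#   $1 = font name
#   $2 = space-separated exposure levels (defaults to "0" when empty or omitted)
list_jobs() {
    font=$1
    exposures=${2:-0}
    # Intentionally unquoted so the shell splits the list on whitespace.
    for exposure in $exposures; do
        printf '%s exp%s\n' "$font" "$exposure"
    done
}

# Default behaviour: a single job at exposure level 0.
list_jobs "SomeFont"
# Several levels: one job per level.
list_jobs "SomeFont" "-1 0 1"
```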
mktemp is a better idea for security, and it also lets users specify a different directory using the TMPDIR environment variable, which is useful if /tmp is a small tmpfs. Also fix a bug where the first few log messages were failing because the workspace directory wasn't being created early enough.
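A minimal sketch of this pattern is shown below. The names `make_workspace`, `WORKSPACE`, `LOG_FILE`, and `log`, as well as the `tesstrain.XXXXXX` template, are assumptions made for illustration and may not match the script.

```shell
# Hypothetical helper: create a private workspace directory.
make_workspace() {
    # mktemp -d creates the directory atomically with an unpredictable name,
    # and ${TMPDIR:-/tmp} honours TMPDIR when the user has set it.
    mktemp -d "${TMPDIR:-/tmp}/tesstrain.XXXXXX"
}

# Create the workspace before anything tries to write a log message into it.
WORKSPACE=$(make_workspace) || exit 1
LOG_FILE="$WORKSPACE/tesstrain.log"

# Hypothetical logger: append a message to the log file inside the workspace.
log() {
    echo "$*" >> "$LOG_FILE"
}

log "workspace created at $WORKSPACE"
```

Creating the directory first is what avoids the early-logging failure: the log file's parent directory exists by the time the first message is written.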
zvezdochiot pushed a commit to ImageProcessing-ElectronicPublications/tesseract that referenced this pull request (Mar 28, 2021): Improve tesstrain.sh script
Improvements to the tesstrain.sh script.
The only difference from default usage is that the --bin_dir option has been removed, in favour of $PATH.
See the commit log for details.