BigIskander/Handwriting-keyboard-for-Linux-tesseract

Handwriting-keyboard-for-Linux.

This is a program written for the Linux X11 desktop environment.

To recognize handwritten patterns, the program uses tesseract-ocr.

You can find compiled .deb and .AppImage packages on the releases page.

How to use

  1. Install dependencies:

sudo apt install xdotool

sudo apt install tesseract-ocr

Note: I recommend installing tesseract 4 rather than tesseract 5, because the results are most accurate with tesseract 4 (at least when recognizing handwritten Chinese text).

  2. Install the program (you can find a compiled .deb package on the releases page).

  3. Download training data for tesseract-ocr and copy the training data files to the tesseract-ocr data folder (for example, for tesseract-ocr 4.0 this is /usr/share/tesseract-ocr/4.00/tessdata/).

Alternatively, you can put these files in whatever folder you like and run the program with the --tessdata-dir CLI parameter pointing to the folder where the training data files are located.

By default the program uses the language chi_all, which you can download from https://github.com/gumblex/tessdata_chi. You can also select a different language by running the program with the --lang CLI parameter.

  4. Launch the program by running handwriting-keyboard-t, with or without CLI parameters, and just use it.
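
The setup steps above can be sketched as a pair of small shell helpers. This is only a minimal sketch and not part of the program: missing_tools and install_traineddata are hypothetical names introduced here for illustration, and the paths in the usage note are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print each named tool that is NOT found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical helper: copy one downloaded .traineddata file into a
# tessdata folder, creating the folder first if it does not exist.
install_traineddata() {
    src=$1
    dest_dir=$2
    mkdir -p "$dest_dir" && cp "$src" "$dest_dir/"
}
```

For example, missing_tools xdotool tesseract prints nothing when both dependencies are installed, and install_traineddata ./chi_all.traineddata "$HOME/tessdata" prepares a folder that can then be passed with --tessdata-dir. Copying into a system folder such as /usr/share/tesseract-ocr/4.00/tessdata/ would normally need root rights. Running tesseract --version shows which tesseract version is installed.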

CLI (command line interface) parameters

--lang or -l - the language used to recognize handwritten patterns.

--tessdata-dir - a custom folder containing the training data (for tesseract-ocr) used to recognize handwritten patterns.

--automode or -a - the program sends a request to tesseract-ocr automatically after every stroke.

Example:

handwriting-keyboard-t --tessdata-dir=/home/user/ --lang=chi_sim -a

In the example above, the program recognizes handwritten patterns using training data from the folder "/home/user/" and the language "chi_sim" (Simplified Chinese), that is, the file "/home/user/chi_sim.traineddata". Because it was launched with the "-a" parameter, the program also sends a request to tesseract-ocr automatically after every stroke.
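
The way the folder and the language combine into a single training data file can be sketched in shell. This is a minimal sketch, assuming the file is always named after the language as in the example above. traineddata_path and launch_keyboard are hypothetical helpers introduced here, not part of the program.

```shell
#!/bin/sh
# Hypothetical helper: print the training data file that a given
# --tessdata-dir / --lang pair points to.
traineddata_path() {
    dir=${1%/}   # drop one trailing slash, if present
    lang=$2
    printf '%s/%s.traineddata\n' "$dir" "$lang"
}

# Hypothetical wrapper: launch the keyboard in automode only if the
# expected training data file actually exists.
launch_keyboard() {
    file=$(traineddata_path "$1" "$2")
    if [ ! -f "$file" ]; then
        echo "training data not found: $file" >&2
        return 1
    fi
    handwriting-keyboard-t --tessdata-dir="$1" --lang="$2" -a
}
```

Calling launch_keyboard /home/user/ chi_sim would check for /home/user/chi_sim.traineddata first, then run the same command as in the example.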

Some technical details

The program is written using the Tauri framework https://tauri.app/

The script from https://github.com/ChenYuHo/handwriting.js is used to make a writing canvas.

To recognize handwritten patterns, the program uses tesseract-ocr.

To run the program from source or compile it, you need to install Node.js 18 or newer, as well as Rust.

Install Node.js dependencies: npm install

Run the program in the development environment: npm run tauri dev

Run the program in the development environment with CLI (command line) parameters: npm run tauri dev -- -- -- cli_parameters

Compile the program: npm run tauri build
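
The development workflow above can be wrapped in a short script that checks the Node.js requirement first. This is a minimal sketch, assuming it is run from the root of a checkout of the repository. node_version_ok and run_dev are hypothetical helper names introduced here.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a version string such as "v18.17.0"
# (the format printed by `node --version`) has a major version >= 18.
node_version_ok() {
    major=${1#v}         # strip the leading "v"
    major=${major%%.*}   # keep only the part before the first dot
    [ "$major" -ge 18 ] 2>/dev/null
}

# Hypothetical wrapper: install dependencies and start the program in
# the development environment, forwarding any CLI parameters to it.
run_dev() {
    if ! node_version_ok "$(node --version)"; then
        echo "Node.js 18 or newer is required" >&2
        return 1
    fi
    npm install || return 1
    if [ "$#" -gt 0 ]; then
        npm run tauri dev -- -- -- "$@"
    else
        npm run tauri dev
    fi
}
```

For example, run_dev --lang=chi_sim -a would start the development build with the same parameters as described in the CLI section.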

An older version of this program, which uses the Google API instead of tesseract-ocr, is available at https://github.com/BigIskander/Handwriting-keyboard-for-Linux.

Recommended IDE Setup