✍️ A handwriting synthesizer built by abusing the HarfBuzz WASM shaper.
🔗 Check out more stupid stuff at Harfbuzz-WASM-Fantasy.
During the hype around llama.ttf a few months ago, I speculated about the potential of the WASM shaper for an even crazier purpose, one that fits a font shaper's duty better: synthesizing a font at runtime. As a proof of concept, this project implements a synthesizer that generates and rasterizes a handwriting-style font, backed by a super-lightweight RNN model (~14 MiB).
The project must be run in an application linked against `libharfbuzz` with the experimental WASM shaper enabled, which is currently not the case for any product. Since building such a library from scratch is not easy, I prebuilt a Docker image, `hsfzxjy/harfbuzz-wasm-handwriting-synthesis`, which contains both the TTF file and a modified version of `gedit`.
Usage

You may try out this project with the following steps:

- On a Linux system with X11 (WSL is fine), run `GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/hsfzxjy/handwriter.ttf`;
- In the `handwriter.ttf` directory, run `make run`, which fetches the Docker image `hsfzxjy/harfbuzz-wasm-handwriting-synthesis` and starts the `gedit` application inside;
- Start typing in the pop-up gedit window. Each line should be prefixed by `#` to trigger the shaper, e.g., typing `#hello world`.
Some strokes might look cursed due to the limitations of the model; appending a space should make them look better.
Demo video: 2024-08-21.13-31-43.mp4
The project follows Alex Graves's paper Generating Sequences With Recurrent Neural Networks and adopts an RNN model for handwriting synthesis. In short, generation runs for multiple steps to produce a series of strokes from the input text: at each step, the model predicts the next pen position given the current one. Afterwards, Bresenham's line algorithm rasterizes the strokes into pixel locations, which are set as the offsets for an array of "black-box" glyphs.
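The rasterization step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes each model step yields a hypothetical `(dx, dy, pen_up)` triple (the offset-style representation used in Graves's paper), accumulates the offsets into absolute pen positions, and runs integer Bresenham between consecutive points of each stroke. The function names and the rounding onto an integer grid are assumptions made for the example.

```python
from typing import Iterable, List, Set, Tuple

Point = Tuple[int, int]


def bresenham(x0: int, y0: int, x1: int, y1: int) -> List[Point]:
    """Return the integer grid points on the line from (x0, y0) to (x1, y1)."""
    points: List[Point] = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy  # combined error term, valid for all octants
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:  # step along x
            err += dy
            x0 += sx
        if e2 <= dx:  # step along y
            err += dx
            y0 += sy
    return points


def offsets_to_strokes(
    steps: Iterable[Tuple[float, float, bool]]
) -> List[List[Point]]:
    """Accumulate hypothetical (dx, dy, pen_up) steps into strokes of absolute points."""
    strokes: List[List[Point]] = []
    current: List[Point] = []
    x = y = 0.0
    for dx, dy, pen_up in steps:
        x += dx
        y += dy
        current.append((round(x), round(y)))
        if pen_up:  # the pen lifts after this point, so the stroke ends here
            strokes.append(current)
            current = []
    if current:
        strokes.append(current)
    return strokes


def rasterize(strokes: List[List[Point]]) -> Set[Point]:
    """Collect every pixel touched by the line segments of each stroke."""
    pixels: Set[Point] = set()
    for stroke in strokes:
        if len(stroke) == 1:
            pixels.add(stroke[0])  # a single dot
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            pixels.update(bresenham(x0, y0, x1, y1))
    return pixels
```

In the scheme described above, each resulting pixel location would then become the offset of one box glyph in the shaper's output.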
I've tried some more recent models, but their runtime latency is too high to be practical.
The final TTF file is highly optimized, reaching a speed of 0.08 sec/character on an Intel Ultra 125H. Each text run's generation time is proportional to the text length.
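As a back-of-the-envelope illustration of that linear relationship, the sketch below estimates the generation time of a text run from the quoted per-character figure. The helper name is hypothetical, and counting every character (spaces included) is an assumption; real timings will vary with hardware.

```python
SECONDS_PER_CHAR = 0.08  # figure quoted above for one specific CPU


def estimated_latency(text: str, seconds_per_char: float = SECONDS_PER_CHAR) -> float:
    """Linear estimate: generation time grows with the number of characters."""
    return len(text) * seconds_per_char


# "hello world" has 11 characters, so the estimate is 11 * 0.08 seconds.
print(f"{estimated_latency('hello world'):.2f} s")  # prints "0.88 s"
```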
The optimization journey was an interesting one, and I will cover it in later blog posts. Some important notes:

- Use rten as the inference backend to make sure neural ops are executed with SIMD instructions.
- Pre-transpose the RHS of each MatMul to make it column-major, improving performance by ~15%.
- To run modules containing SIMD instructions, wasm-micro-runtime should be compiled with `-DWAMR_BUILD_SIMD=1`, and the WASM file must be AOT-compiled by wamrc.
- Enable specific optimizations in `wamrc` (`--opt-level=3`, `--enable-segue=i32.load,f32.load,i32.store,f32.store` and `--enable-tail-call`), improving performance by ~55%.
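To illustrate the pre-transposition note, here is a small sketch of the access-pattern change. It is not the project's code and does not use rten; it only shows, with plain Python lists, that once the right-hand matrix is stored transposed (column-major), every output element becomes a dot product over two contiguous rows. The speedup quoted above comes from the project's own setup and is not reproduced by this toy example.

```python
from typing import List

Matrix = List[List[float]]


def matmul_naive(a: Matrix, b: Matrix) -> Matrix:
    """C = A @ B with B stored row-major: the inner loop walks down a column of B."""
    n, k, m = len(a), len(b), len(b[0])
    return [
        [sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
        for i in range(n)
    ]


def transpose(b: Matrix) -> Matrix:
    """One-time layout change: each row of the result is a column of B."""
    return [list(col) for col in zip(*b)]


def matmul_pretransposed(a: Matrix, b_t: Matrix) -> Matrix:
    """C = A @ B given B already transposed: both operands are read row by row."""
    return [
        [sum(x * y for x, y in zip(row_a, col_b)) for col_b in b_t]
        for row_a in a
    ]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
b_t = transpose(b)  # done once, ahead of time, for constant weights
assert matmul_pretransposed(a, b_t) == matmul_naive(a, b)
```

Because model weights are constant, the transposition can be done once offline, so the cost of the layout change is not paid at inference time.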
This project is licensed under the Apache 2.0 LICENSE. Copyright (c) 2024 hsfzxjy.