Releases: LanguageMachines/ucto
v0.27
[Ko van der Sloot]
- removed dependency on libtar
- fixed build when HAVE_TEXTCAT was not set. Improved guards against missing textcat support
[Maarten van Gompel]
- guard against uninitialized/missing textcat (https://github.com/proycon/python-frog#22)
- require latest libfolia, ticcutils and a more recent libxml2
v0.26
v0.25
[Ko van der Sloot]
- Added a test for #87
- Adapted to latest update in tokconfig-fra (uctodata 0.9)
- Deal with unknown languages (as detected by ucto), using iso-639-3 'und' (#86)
- don't tokenize unknown languages
- configurable sentence splitter for "und" text
- added tests
- added code to set the separator (--seperators), so ucto can split on more than just spaces
- migrated test wrapper to Python 3 (was still on 2.7)
[Maarten van Gompel]
- Set up a Dockerfile
- Added build-deps.sh to automatically download, build and install dependencies
- Updated software metadata (codemeta.json) to latest requirements as proposed in CLARIAH
- deprecated options -f and -x; they still work but are no longer advertised and give a deprecation notice (#88)
- textcat.cfg is now searched for in user config dir as well as global config; also allow running without textcat if the config is missing entirely (same as if not compiled in)
- added support for user-based configuration dirs ($XDG_CONFIG_HOME/ucto), takes precedence over global data dirs
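The lookup order described in the last two items (user configuration directory first, then the global one, with textcat disabled if no config is found) can be sketched as below. This is an illustrative Python sketch, not ucto's actual C++ implementation: the function names, the `global_dir` argument, and the fallback to `~/.config` when `XDG_CONFIG_HOME` is unset (the XDG default) are all assumptions.

```python
from pathlib import Path


def candidate_config_dirs(env, global_dir):
    """Return config directories in lookup order: user dir first, then global.

    `env` is a mapping such as os.environ; `global_dir` is a hypothetical
    system-wide data directory supplied by the caller.
    """
    xdg = env.get("XDG_CONFIG_HOME")
    if xdg:
        user_base = Path(xdg)
    elif env.get("HOME"):
        # Assumption: fall back to the XDG default location.
        user_base = Path(env["HOME"]) / ".config"
    else:
        user_base = None

    dirs = []
    if user_base is not None:
        dirs.append(user_base / "ucto")  # user dir takes precedence
    dirs.append(Path(global_dir))
    return dirs


def find_config(name, env, global_dir):
    """Return the first existing file called `name`, or None if absent.

    A None result corresponds to the "run without textcat" case above.
    """
    for directory in candidate_config_dirs(env, global_dir):
        path = directory / name
        if path.is_file():
            return path
    return None
```

A caller would pass `os.environ` and its own global directory, for example `find_config("textcat.cfg", os.environ, some_global_dir)`, and treat a `None` result as "textcat unavailable".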
v0.24.1
- added UTF8 members to the API, to replace the variants that were converted to UnicodeString
This should help fix proycon/python-ucto#11
v0.24
v0.23
- added support for the new 'tag' feature in FoLiA, only for tag="token"
- fixed a problem with '-T full' option not always adding text
- use the new TextPolicy class from libfolia
- fix for #81
- fix for #82
- added code to handle several Unicode joiners
- replaced TravisCI with GitHub Actions
- %include files may have an extension now
- added tests for new features
v0.22
v0.21.1
v0.21
v0.20
Bug fix release. solving: