Releases: LanguageMachines/ucto

v0.27

23 Jan 13:49

[Ko van der Sloot]

  • removed dependency on libtar
  • fixed build when HAVE_TEXTCAT was not set. Improved guards against missing textcat support

[Maarten van Gompel]

v0.26

02 Jan 15:29

[Ko van der Sloot]

  • some code quality improvements
  • fix for #89
  • updated configure.ac
  • updated GitHub action

[Maarten van Gompel]

v0.25

22 Jul 09:25

[Ko van der Sloot]

  • Added a test for #87
  • Adapted to latest update in tokconfig-fra (uctodata 0.9)
  • Deal with unknown languages (as detected by ucto), using the ISO 639-3 code 'und' (#86)
    • don't tokenize unknown languages
    • configurable sentence splitter for "und" text
    • added tests
  • added code to set the separators (--separators), so ucto can split on more than just spaces
  • migrated test wrapper to Python 3 (was still on 2.7)
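
The separator item above can be illustrated with a minimal sketch of splitting on a configurable set of characters instead of spaces only. This is not ucto's implementation: the function name split_on_separators is hypothetical, and the choice to drop the separators and any empty pieces is an assumption, since the release notes do not describe those details.

```python
import re


def split_on_separators(text: str, separators: str = " ") -> list[str]:
    """Split text on any character in `separators`, dropping empty pieces.

    Hypothetical helper for illustration only; not part of ucto.
    """
    if not separators:
        # No separators configured: return the text as a single piece.
        return [text] if text else []
    # Build a character class from the separator characters, escaping
    # anything that has a special meaning in a regular expression.
    pattern = "[" + re.escape(separators) + "]+"
    return [piece for piece in re.split(pattern, text) if piece]


if __name__ == "__main__":
    print(split_on_separators("one two"))
    print(split_on_separators("a-b_c d", " -_"))
```

With the default argument the sketch behaves like plain space splitting; passing a longer string such as " -_" treats each of those characters as a boundary.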

[Maarten van Gompel]

  • Set up a Dockerfile
  • Added build-deps.sh to automatically download, build and install dependencies
  • Updated software metadata (codemeta.json) to latest requirements as proposed in CLARIAH
  • deprecated options -f and -x; they still work but are no longer advertised and give a deprecation notice (#88)
  • textcat.cfg is now searched for in user config dir as well as global config; also allow running without textcat if the config is missing entirely (same as if not compiled in)
  • added support for user-based configuration dirs ($XDG_CONFIG_HOME/ucto), takes precedence over global data dirs
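
The lookup order described in the last two items (user configuration directory first, then the global directory, with a missing file tolerated) could be sketched roughly as follows. This is an illustration, not ucto's code: find_config, user_config_dir and the GLOBAL_DIR path are hypothetical, and the fallback to ~/.config when XDG_CONFIG_HOME is unset follows the XDG Base Directory convention rather than anything stated in these notes.

```python
import os
from pathlib import Path
from typing import Optional

# Hypothetical global location; the real path depends on how ucto was installed.
GLOBAL_DIR = Path("/usr/share/ucto")


def user_config_dir() -> Path:
    """Return $XDG_CONFIG_HOME/ucto, falling back to ~/.config/ucto (assumed)."""
    base = os.environ.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    return Path(base) / "ucto"


def find_config(name: str, global_dir: Path = GLOBAL_DIR) -> Optional[Path]:
    """Return the first existing candidate, user dir first, else None."""
    for directory in (user_config_dir(), global_dir):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    # Nothing found: the caller can continue without this config file.
    return None


if __name__ == "__main__":
    print(find_config("textcat.cfg"))
```

Returning None rather than raising mirrors the note that ucto can run without textcat when the configuration file is missing entirely.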

v0.24.1

17 Dec 15:36
  • added UTF8 members to the API, to replace the variants that were converted to UnicodeString
    This should help fix proycon/python-ucto#11

v0.24

15 Dec 13:59
  • fix for #84
  • added a partial solution for #53
  • added some UnicodeString members to the API
  • bumped library version to 6.0, because of API changes
  • code cleanup and refactoring

v0.23

12 Jul 09:42
  • added support for the new 'tag' feature in FoLiA, only for tag="token"
  • fixed a problem with '-T full' option not always adding text
  • use the new TextPolicy class from libfolia
  • fix for #81
  • fix for #82
  • added code to handle several Unicode joiners
  • replaced TravisCI by GitHub action
  • %include files may have an extension now
  • added tests for new features

v0.22

08 Oct 16:26

[Ko vd Sloot]

  • Fix for Byte-order Marker problem #79

v0.21.1

15 Apr 10:20

v0.21

15 Apr 09:14
  • Adapted to newest libfolia 2.4
  • adapted some tests
  • added an --allow-word-corrections option
  • improved handling of odd FoLiA

v0.20

27 Nov 12:00