💫 A spaCy package for Yohei Tamura's Rust tokenizations library

explosion/spacy-alignments

spacy-alignments: Align tokenizations for spaCy + transformers

A spaCy package for Yohei Tamura's Rust tokenizations library with Python bindings.

Installation

pip install -U pip setuptools wheel
pip install spacy-alignments

If no binary wheel is available for your platform, you will need to install Rust in order to build spacy-alignments from source.

spacy-alignments vs. pytokenizations

The spacy_alignments module is a drop-in replacement for tokenizations:

import spacy_alignments as tokenizations
a2b, b2a = tokenizations.get_alignments(["å", "BC"], ["abc"])
assert a2b == [[0], [0]]
assert b2a == [[0, 1]]
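
As a rough illustration of how the returned alignments might be used, the sketch below projects per-token labels from one tokenization onto the other. The helper `project_labels` is hypothetical and not part of spacy-alignments or tokenizations. It only assumes the list-of-index-lists shape shown in the example above, and the `a2b` value is copied from that example rather than computed, so the snippet runs without the package installed.

```python
from typing import Any, List, Optional


def project_labels(
    a2b: List[List[int]],
    labels_a: List[Any],
    n_b: int,
    default: Optional[Any] = None,
) -> List[Any]:
    """Hypothetical helper: copy each label of tokenization A onto the
    tokens of tokenization B it is aligned with.

    a2b[i] is assumed to hold the indices of the B tokens aligned with
    A token i. If several A tokens map to the same B token, the first
    one wins (an arbitrary choice made for this sketch).
    """
    labels_b = [default] * n_b
    assigned = set()  # B indices that already received a label
    for i, b_indices in enumerate(a2b):
        for j in b_indices:
            if j not in assigned:
                labels_b[j] = labels_a[i]
                assigned.add(j)
    return labels_b


# Alignment copied from the example above: both A tokens map to B token 0.
a2b = [[0], [0]]
projected = project_labels(a2b, ["X", "Y"], n_b=1)
```

Letting the first aligned token win is only one possible policy. Depending on the task, a majority vote or an explicit conflict check may be more appropriate.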

The only difference between this package and the original pytokenizations is that it switches the build system to setuptools-rust to make it easier for us at Explosion to build source and binary packages for a wider range of platforms.

Bug reports and other issues

Please use spaCy's issue tracker to report a bug, or open a new thread on the discussion board for any other issue.
