Faster lookup of OPQ-quantized embeddings

@danieldk danieldk released this 09 Jun 08:06
  • Make lookups of unknown words in OPQ-quantized embedding matrices 2.6x faster (resulting in ~1.6x faster lookups overall).
  • Add the Reconstruct trait as a counterpart to Quantize. This trait can be used to reconstruct quantized embedding matrices. Reconstructing a whole matrix through this trait is also much faster than reconstructing its embeddings one at a time.
  • Add more I/O checks to ensure that the embedding matrix can actually be represented in the native usize.
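
To illustrate the kind of I/O check described in the last item, here is a minimal sketch using only the Rust standard library. It is not the crate's actual code: the function name `checked_matrix_len` and the choice of `u64` rows and `u32` columns for the on-disk shape are assumptions made for the example.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not part of the crate's API): converts matrix
/// dimensions as they might be stored on disk into a native element
/// count, returning an error instead of silently overflowing.
fn checked_matrix_len(rows: u64, cols: u32) -> Result<usize, String> {
    // Each dimension must fit in the platform's usize on its own.
    let rows = usize::try_from(rows)
        .map_err(|_| format!("row count {} does not fit in usize", rows))?;
    let cols = usize::try_from(cols)
        .map_err(|_| format!("column count {} does not fit in usize", cols))?;

    // The total number of elements must fit as well.
    rows.checked_mul(cols)
        .ok_or_else(|| format!("matrix shape ({}, {}) overflows usize", rows, cols))
}

fn main() {
    match checked_matrix_len(1_000_000, 300) {
        Ok(len) => println!("matrix holds {} elements", len),
        Err(err) => eprintln!("cannot read embeddings: {}", err),
    }
}
```

Checking both the individual dimensions and their product matters on 32-bit targets, where a file written on a 64-bit machine can describe a matrix that cannot be addressed.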