07 Jun 2024
- `O200K` encoding
14 Oct 2023
Focused on internal refactoring and API stabilization to enhance usability and facilitate future development.
- Rename `Enconding` to `EncodingConfig`.
- Rework encodings to enhance extensibility:
  - Add a new `Encoding` interface.
  - Introduce default implementations of `Encoding`: `CL100KBase`, `P50KBase`, `R50KBase`, and `P50KEdit`.
- Rename `Tokenizer.encoding` and `Tokenizer.encodingForModel` to `Tokenizer.of` with overloads.
12 Oct 2023
- JVM: use local PBE loader by default
- Encoding: enable custom encoding
08 Oct 2023
Initial release.
- Encodings: `CL100K_BASE`, `R50K_BASE`, `P50K_BASE` and `P50K_EDIT`
- Custom encoding support