I am using tiktoken in a dataset preprocessing step for a PyTorch DataLoader. The DataLoader supports multiprocessing when creating batches, which spawns worker processes. This fails with the following exception:
TypeError: cannot pickle 'builtins.CoreBPE' object
I am not familiar with Rust, but this thread seems to suggest that a few methods in the Rust implementation would enable pickling the tokenizer.
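Until something like that exists, one possible workaround is to avoid pickling the tokenizer at all and rebuild it inside each worker. The sketch below is only an illustration under assumptions: `LazyTokenizer` is a hypothetical wrapper name, and `"gpt2"` is just an example encoding name. It relies on `tiktoken.get_encoding` and `Encoding.encode`, and keeps only the encoding name in the pickled state.

```python
class LazyTokenizer:
    """Hypothetical wrapper: pickles the encoding name, not the CoreBPE handle."""

    def __init__(self, encoding_name="gpt2"):  # "gpt2" is only an example name
        self.encoding_name = encoding_name
        self._enc = None  # built on first use, separately in each process

    def _get(self):
        if self._enc is None:
            # Imported lazily so the wrapper can be pickled and unpickled
            # without touching the Rust-backed object.
            import tiktoken
            self._enc = tiktoken.get_encoding(self.encoding_name)
        return self._enc

    def encode(self, text):
        return self._get().encode(text)

    def __getstate__(self):
        # Drop the unpicklable tokenizer; keep only what is needed to rebuild it.
        return {"encoding_name": self.encoding_name}

    def __setstate__(self, state):
        self.encoding_name = state["encoding_name"]
        self._enc = None  # the receiving process recreates it on demand
```

A dataset would hold a `LazyTokenizer` instead of the raw encoding object, so each DataLoader worker builds its own tokenizer the first time `encode` is called. This does not make `CoreBPE` itself picklable; it only sidesteps the error.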