We have observed a native memory leak when using ai.djl.huggingface.tokenizers.HuggingFaceTokenizer. With the default setting of optWithOverflowingTokens (false), memory usage grows steadily over time when long strings are truncated to shorter token sequences; the effect is most pronounced when very long strings are truncated to very short sequences. When we set optWithOverflowingTokens to true, the memory growth disappears.
Testing against earlier releases, this behaviour first appears in version 0.27.0. Tracing that through the release notes, this PR looks like the likely culprit: #2957, in particular https://github.com/deepjavalibrary/djl/pull/2957/files#diff-62d10f278a5a7644ce30deff638cf6ead21457bca60b9cc7430d115dd2fa2b38R533-R537.
Calling TokenizersLibrary.LIB.getOverflowing(encoding) creates native clones of the overflowing segments. Those clones are only cleaned up when withOverflowingTokens is true, because toEncoding is then called recursively on the overflowing handles, and that recursion eventually calls TokenizersLibrary.LIB.deleteEncoding(encoding) on each copy. When withOverflowingTokens is false, that cleanup never runs and the clones are leaked.
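To make that concrete, here is a minimal sketch of the suspected shape of toEncoding; this is not the actual DJL source: the field reads are elided, buildEncoding is a hypothetical stand-in for constructing the Java-side Encoding, and the else branch shows one possible fix (freeing the unused clones with the existing deleteEncoding call):

// Simplified sketch of the suspected code path; not the actual DJL source.
private Encoding toEncoding(long encoding, boolean withOverflowingTokens) {
    // ... token ids, attention mask, etc. are read from the native handle here ...

    // getOverflowing() allocates native clones of the truncated-off segments,
    // whether or not the caller asked for them.
    long[] overflowingHandles = TokenizersLibrary.LIB.getOverflowing(encoding);

    Encoding[] overflowing = new Encoding[0];
    if (withOverflowingTokens) {
        overflowing = new Encoding[overflowingHandles.length];
        for (int i = 0; i < overflowingHandles.length; i++) {
            // The recursion ends with deleteEncoding() on each clone,
            // so on this branch the clones are freed.
            overflowing[i] = toEncoding(overflowingHandles[i], true);
        }
    } else {
        // Possible fix: delete the unused clones here. Without a loop like
        // this, nothing ever frees them, which matches the leak we observe.
        for (long handle : overflowingHandles) {
            TokenizersLibrary.LIB.deleteEncoding(handle);
        }
    }

    TokenizersLibrary.LIB.deleteEncoding(encoding);
    return buildEncoding(overflowing); // hypothetical stand-in for the real construction
}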
The following minimal example reproduces the leak; native memory grows without bound while the loop runs:

import java.nio.file.Path;

var input = "this will become a long string".repeat(256);
var tokenizer = ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.builder()
        .optTokenizerPath(Path.of("src/test/models/huggingface/bert-base-uncased.json"))
        .optMaxLength(5)                 // truncate aggressively so most of the input overflows
        .optTruncation(true)
        .optWithOverflowingTokens(false) // the default; triggers the leak
        .build();
while (true) {
    tokenizer.encode(input);             // each call leaks the overflowing clones
}
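Until this is fixed, a workaround that matches our observations is to build the tokenizer with overflowing tokens enabled, which routes every clone through the recursive toEncoding/deleteEncoding path, at the cost of materializing overflow encodings that are then ignored:

var safeTokenizer = ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.builder()
        .optTokenizerPath(Path.of("src/test/models/huggingface/bert-base-uncased.json"))
        .optMaxLength(5)
        .optTruncation(true)
        // Workaround: true forces the recursive cleanup of the clones,
        // and native memory stays flat in the loop above.
        .optWithOverflowingTokens(true)
        .build();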