CodeUnits input broken with CSV.jl 0.9 #894
Yeah, sorry for the change/hassle here. For context, this happened to work previously, but wasn't officially supported, and actually what was happening, I believe, is that a copy of the input CodeUnits was being made to convert it to a We can maybe still support this; one idea is to call |
Fixes #894. I believe this used to work because we made a copy of the input `CodeUnits` object. That was fine in the sense that we got a `Vector{UInt8}` out of it, but not ideal, since it required a copy. After some research, I found that when you call `IOBuffer(str)`, it uses this nifty little trick of calling `unsafe_wrap(Vector{UInt8}, str)`, and the result is then passed to `IOBuffer` as a way to convert a string to a `Vector{UInt8}` without copying. We can use the same trick to efficiently treat a string as a `Vector{UInt8}` while Julia's internals take care of tracking the original string as the true owner of the data for us (thus avoiding the GC issues we would hit if we naively called `unsafe_wrap(Array, pointer(str))` ourselves).
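As a rough illustration of the trick described above (a minimal sketch in plain Julia, not the actual CSV.jl change; the variable names `str`, `bytes`, and `copied` are made up for the example), the zero-copy and copying routes both produce a `Vector{UInt8}` with the same contents:

```julia
# Hypothetical sample input; any String works.
str = "a,b\n1,2\n"

# Zero-copy route: wrap the string's bytes as a Vector{UInt8}.
# The result must be treated as read-only, since Strings are immutable.
bytes = unsafe_wrap(Vector{UInt8}, str)

# Copying route: materialize the code units into a fresh Vector{UInt8}.
copied = Vector{UInt8}(codeunits(str))

# Both are plain Vector{UInt8} values holding the same bytes.
@assert bytes isa Vector{UInt8}
@assert copied isa Vector{UInt8}
@assert bytes == copied
```

The difference is only in allocation: the second form duplicates the data, while the first reuses the string's storage.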
Ok, I think I figured out a clean way for us to support this: #905
Thanks for addressing this! Note that for me it's fine to use If the intention is really to work on a byte stream, maybe it's best to leave |
The support that was added was specifically |
The following used to work in CSV.jl 0.8:
with 0.9.1 I get
(It works when replacing `codeunits` with `IOBuffer`.)

Maybe `CodeUnits` as input is not really supported? The documentation for `input` mentions support for `Vector{UInt8}` or `SubArray{UInt8, 1, Vector{UInt8}}` but not `AbstractVector{UInt8}`. On the other hand, the documentation for `CSV.Rows` mentions support for
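For context on the type distinction raised in the report, here is a small sketch in plain Julia (no CSV.jl calls, since the original snippet is not preserved here; `str`, `cu`, and `io` are illustrative names). It shows that the result of `codeunits` is an `AbstractVector{UInt8}` but not a `Vector{UInt8}`, and that `IOBuffer` exposes the same bytes as a stream:

```julia
# Hypothetical sample input.
str = "a,b\n1,2\n"

# codeunits returns a lightweight wrapper around the string, not a Vector.
cu = codeunits(str)
println(cu isa AbstractVector{UInt8})  # true
println(cu isa Vector{UInt8})          # false

# The workaround mentioned in the report: hand over an IOBuffer instead.
io = IOBuffer(str)

# Reading the buffer yields the same bytes as the code units.
@assert read(io) == cu
```

An API documented as accepting only `Vector{UInt8}` or a `SubArray` of one would therefore not be expected to match `cu` by type alone.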