ADO.NET API for reading string data directly from the provider's internal buffer #28135

Open
roji opened this issue Dec 11, 2018 · 1 comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Data enhancement Product code improvement that does NOT require public API changes/additions

@roji
Member

roji commented Dec 11, 2018

dotnet/corefxlab#2368 discusses directions and goals for future UTF-8 string support. Apart from a new Utf8String type (https://github.com/dotnet/corefx/issues/30503), which would obviate decoding when binary data is already in UTF-8, it also contains an initial discussion of a perf-oriented UTF-8 slice directly over binary data:

We find ourselves with two conflicting goals. The first goal is performance above all else: fill a buffer with inbound network data, reinterpret_cast it as UTF-8 data, and operate on it. Network protocol stacks are the big consumer here. This can be achieved by providing UTF-8 manipulation methods which operate directly on spans, which has the added benefit of allowing the consumer to remain in full control of all memory allocations.
[...]
While Utf8String is useful for representing incoming UTF-8 data without the need for transcoding, it does still incur the cost of an allocation per instance. As part of this work we may want to consider making StringSlice or Utf8StringSlice first-class types in the framework. One could imagine these types as being thin wrappers (perhaps aliases?) for ReadOnlyMemory<char> and ReadOnlyMemory<byte> along with most (but not all) of the instance methods on String and Utf8String.

Something like this could be exposed on DbDataReader, returning a string-like object that directly references the ADO.NET provider's internal buffer; this would allow users to access incoming strings from the database with zero allocations, with a potentially big perf impact.

The big issue is of course the lifetime of the returned slice object. Unless CommandBehavior.SequentialAccess is specified, providers are already expected to buffer entire rows in memory. A string slice would therefore remain valid until the next call to DbDataReader.Read(), at which point the data in the buffer may change. This is somewhat dangerous and requires understanding from users, so this would definitely be an advanced, high-perf API only.
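To make the lifetime hazard concrete, here is a minimal sketch. `FakeRowReader` and `GetUtf8Slice` are hypothetical names invented for illustration, not part of ADO.NET or any provider; the reader simply reuses one byte array for every row, roughly as a provider's internal buffer might.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for a provider's reader; NOT a real ADO.NET type.
sealed class FakeRowReader
{
    private readonly string[] _rows;
    private readonly byte[] _buffer = new byte[64]; // reused for every row; rows must fit in 64 bytes
    private int _length;
    private int _next;

    public FakeRowReader(params string[] rows) => _rows = rows;

    // Overwrites the shared buffer with the next row's UTF-8 bytes.
    public bool Read()
    {
        if (_next >= _rows.Length) return false;
        string row = _rows[_next++];
        _length = Encoding.UTF8.GetBytes(row, 0, row.Length, _buffer, 0);
        return true;
    }

    // Hypothetical zero-allocation accessor: a view over the internal buffer, no copy.
    public ReadOnlyMemory<byte> GetUtf8Slice() => new ReadOnlyMemory<byte>(_buffer, 0, _length);
}

static class Program
{
    static void Main()
    {
        var reader = new FakeRowReader("alpha", "beta");

        reader.Read();
        ReadOnlyMemory<byte> slice = reader.GetUtf8Slice();
        Console.WriteLine(Encoding.UTF8.GetString(slice.Span)); // prints "alpha"

        reader.Read();
        // The old slice still covers the first 5 bytes of the same array,
        // which now hold "beta" plus one stale byte from the previous row.
        Console.WriteLine(Encoding.UTF8.GetString(slice.Span)); // prints "betaa"
    }
}
```

Nothing fails here; the slice silently starts referring to different data, which is why this would have to be an opt-in API for users who understand the contract.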

Note the similarity with DbDataReader.GetStream() and DbDataReader.GetTextReader(), which are used to stream (large) binary and text data from the database. Although not formally specified, the returned Stream/TextReader is expected to be disposed as soon as the next row is read. It could be interesting to think of some sort of "invalidatable slice", where calling DbDataReader.Read() would invalidate any string slice returned on the previous row.

Such automatically-invalidated slices could be of interest as a safety measure wherever we're considering exposing slices to end users and the underlying data could change.
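One possible shape for an "invalidatable slice" is a version stamp: the buffer owner bumps a counter on every row change, and each slice remembers the counter value it was created under. The sketch below is only an illustration under that assumption; `VersionedRowBuffer`, `InvalidatableSlice`, `LoadRow` and `GetSlice` are all hypothetical names, and the check is not thread-safe.

```csharp
using System;
using System.Text;

// Hypothetical buffer owner standing in for a provider's DbDataReader.
sealed class VersionedRowBuffer
{
    private readonly byte[] _buffer = new byte[64]; // rows must fit in 64 bytes
    private int _length;

    // Incremented on every row change; slices compare against it.
    public int Version { get; private set; }

    // Plays the role of DbDataReader.Read(): overwrite the buffer, bump the version.
    public void LoadRow(string text)
    {
        _length = Encoding.UTF8.GetBytes(text, 0, text.Length, _buffer, 0);
        Version++;
    }

    public InvalidatableSlice GetSlice() => new InvalidatableSlice(this, _length, Version);

    internal ReadOnlySpan<byte> Bytes(int length) => new ReadOnlySpan<byte>(_buffer, 0, length);
}

// Hypothetical slice that refuses access once its owner has moved to another row.
readonly struct InvalidatableSlice
{
    private readonly VersionedRowBuffer _owner;
    private readonly int _length;
    private readonly int _version;

    public InvalidatableSlice(VersionedRowBuffer owner, int length, int version)
    {
        _owner = owner;
        _length = length;
        _version = version;
    }

    public bool IsValid => _owner.Version == _version;

    public ReadOnlySpan<byte> Span =>
        IsValid
            ? _owner.Bytes(_length)
            : throw new InvalidOperationException("Slice was invalidated by a later row read.");
}

static class Program
{
    static void Main()
    {
        var buffer = new VersionedRowBuffer();
        buffer.LoadRow("alpha");
        InvalidatableSlice slice = buffer.GetSlice();
        Console.WriteLine(Encoding.UTF8.GetString(slice.Span)); // prints "alpha"

        buffer.LoadRow("beta");
        Console.WriteLine(slice.IsValid); // prints "False"
        try
        {
            int length = slice.Span.Length; // throws: the slice is stale
            Console.WriteLine(length);
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
```

The check costs one integer comparison per access. Note that once a caller has obtained the raw `ReadOnlySpan<byte>`, that span itself is no longer guarded, so the protection only holds for code that goes back through the slice each time.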

/cc @GrabYourPitchforks @divega @ajcvickers

@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the 5.0 milestone Feb 1, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@ajcvickers ajcvickers modified the milestones: 5.0.0, Future Jun 23, 2020
@ajcvickers ajcvickers removed the untriaged New issue has not been triaged by the area owner label Jun 23, 2020
@roji
Member Author

roji commented Aug 12, 2021

See additional comments on lifecycle/async: #57262 (comment)

@roji roji closed this as completed Aug 12, 2021
@roji roji reopened this Aug 12, 2021

4 participants