csharp: Redesign the C# API to allow more asynchronous operations #1843

Open
CurtHagenlocher opened this issue May 8, 2024 · 1 comment
Labels
Type: enhancement New feature or request

CurtHagenlocher commented May 8, 2024

What feature or improvement would you like to see?

The C# API should allow more asynchronous operations. Currently, the only asynchronous methods it supports are ExecuteQueryAsync and ExecuteUpdateAsync.

My mental model for the four ADBC "object" types is as follows:

The driver is analogous to the ODBC or JDBC driver, or ADO.NET provider. In JDBC, this concept is represented by the java.sql.Driver interface. In ADO.NET, it's represented by the DbProviderFactory class. Because this object is strictly about code, it's not expected to do any IO other than what loading code itself may require, e.g. to bring in pages of a binary image from disk.

The database is analogous to the JDBC DataSource, the ODBC "DSN" or the ADO.NET connection string. It represents the information and capability required to create a database connection but does not itself do IO until it tries to create a connection. (This would imply that parameter validation which requires network access -- e.g. to validate a host name -- is deferred until the connection is opened. Perhaps that's too limiting?)

Because neither the driver nor the database is doing IO, neither of them needs async methods, async cleanup included; the one exception is Connect.

The connection represents an actual session with a database. This matches an ODBC connection, a JDBC java.sql.Connection or an ADO.NET DbConnection. Opening a connection, closing it or using it to fetch information about the data source are all likely to require IO, so each of these needs an async implementation.

The statement is a unit of bookkeeping related to certain types of database operations. In some cases, a connection can only have a single active statement running against it, but it can be useful to have multiple statements even then if, for instance, each one is a prepared statement that represents both client-side and server-side resources. The statement is analogous to an ODBC statement, a JDBC java.sql.Statement or an ADO.NET DbCommand. Due to the need to clean up an in-progress operation or to release server-side resources, the cleanup of a statement might do IO and should therefore support asynchrony. But when the statement is first created, it only represents the potential for future work and so creation is always synchronous.
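
To make the split concrete, the model above might translate into interfaces roughly like the following. This is only a sketch: every type and member name here (ISketchDriver, ISketchDatabase, ISketchConnection, ISketchStatement, GetInfoAsync and so on) is hypothetical and chosen for illustration; none of it is the existing Apache.Arrow.Adbc surface.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical names throughout; not the current Apache.Arrow.Adbc API.

// Driver: strictly code, so opening a database and cleanup are synchronous.
public interface ISketchDriver : IDisposable
{
    ISketchDatabase Open(IReadOnlyDictionary<string, string> parameters);
}

// Database: holds connection information; only Connect may do IO.
public interface ISketchDatabase : IDisposable
{
    Task<ISketchConnection> ConnectAsync(CancellationToken cancellationToken = default);
}

// Connection: a live session, so metadata access and cleanup are async.
public interface ISketchConnection : IAsyncDisposable
{
    // Creating a statement only represents potential future work: no IO.
    ISketchStatement CreateStatement();

    Task<string> GetInfoAsync(string key, CancellationToken cancellationToken = default);
}

// Statement: synchronous creation, but cleanup may cancel in-progress work
// or release server-side resources, hence IAsyncDisposable.
public interface ISketchStatement : IAsyncDisposable
{
    string SqlQuery { get; set; }

    Task<long> ExecuteUpdateAsync(CancellationToken cancellationToken = default);
}
```

Under this shape a caller would dispose connections and statements with `await using`, while drivers and databases stay on a plain `using`.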

I'd be curious to hear how well this aligns with others' points of view.

@CurtHagenlocher CurtHagenlocher added the Type: enhancement New feature or request label May 8, 2024
@CurtHagenlocher CurtHagenlocher changed the title feat(csharp): The C# API should allow more asynchronous operations csharp: The C# API should allow more asynchronous operations May 8, 2024
@CurtHagenlocher
Contributor Author

More broadly, there should also be a clear theory of sync/async including preferences for which to implement in the "pure C#" case and how. (The import/export case is going to be governed by the limitations of the C API.)

@CurtHagenlocher CurtHagenlocher changed the title csharp: The C# API should allow more asynchronous operations csharp: Redesign the C# API to allow more asynchronous operations May 10, 2024
CurtHagenlocher added a commit that referenced this issue May 19, 2024
For #1843, redefines the C# APIs to prioritize full async support. As
this is making a number of breaking changes already, it also takes the
opportunity to do some general cleanup.

Async methods that are generally expected to run locally (e.g.
GetOption/SetOption) are defined to return ValueTask and have their
default implementation be synchronous. Async methods that are generally
expected to run remotely are defined to return Task and have their
default implementations be asynchronous.
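
One possible reading of that Task/ValueTask split, as a minimal sketch: the class names SketchConnection and InMemoryConnection are hypothetical, the member signatures are assumptions rather than the ones in the commit, and Task.Delay merely stands in for a network round trip.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical base class; not the actual Apache.Arrow.Adbc type.
public abstract class SketchConnection
{
    // Expected to run locally: the synchronous method is the core.
    public virtual string GetOption(string key)
        => throw new NotSupportedException(key);

    // Default async version completes synchronously; returning ValueTask
    // avoids allocating a Task for an already-available result.
    public virtual ValueTask<string> GetOptionAsync(
        string key, CancellationToken cancellationToken = default)
        => new ValueTask<string>(GetOption(key));

    // Expected to run remotely: the Task-returning method is the core.
    public abstract Task<long> ExecuteUpdateAsync(
        string sql, CancellationToken cancellationToken = default);
}

// Toy implementation used only to exercise the base class.
public sealed class InMemoryConnection : SketchConnection
{
    public override string GetOption(string key)
        => key == "name" ? "demo" : base.GetOption(key);

    public override async Task<long> ExecuteUpdateAsync(
        string sql, CancellationToken cancellationToken = default)
    {
        await Task.Delay(10, cancellationToken); // stands in for remote IO
        return 1;
    }
}
```

With this shape, a driver whose options live in memory overrides only GetOption, and a driver that needs a round trip for some option can override GetOptionAsync instead.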

My mental model for the four ADBC "object" types is as follows:

The driver is analogous to the ODBC or JDBC driver, or ADO.NET provider.
In JDBC, this type is represented by the java.sql.Driver interface. In
ADO.NET, it's represented by the DbProviderFactory class. Because this
object is strictly about code, it's not expected to do any IO other than
what loading code itself may require, e.g. to bring in pages of a binary
image from disk.

The database is analogous to the JDBC DataSource, the ODBC "DSN" or the
ADO.NET connection string. It represents the information and capability
required to create a database connection but does not itself do IO until
it tries to create a connection. (This would imply that parameter
validation which requires network access -- e.g. to validate a host name
-- is deferred until the connection is created. Perhaps that's too
limiting?)

Because neither the driver nor the database is doing IO, neither of them
needs async methods, async cleanup included; the one exception is Connect.

The connection represents an actual session with a database. This
matches an ODBC connection, a JDBC java.sql.Connection or an ADO.NET
DbConnection. Opening a connection, closing it or using it to fetch
information about the data source are all operations likely to require
IO so these all require async implementations.

The statement is a unit of bookkeeping related to certain types of
database operations. In some cases, a connection can only have a single
active statement running against it, but it can be useful to have
multiple statements even then if, for instance, each one is a prepared
statement that represents both client-side and server-side resources.
The statement is analogous to an ODBC statement, a JDBC
java.sql.Statement or an ADO.NET DbCommand. Due to the need to clean up
an in-progress operation or to release server-side resources, the
cleanup of a statement might do IO and should therefore support
asynchrony. But when the statement is first created, it only represents
the potential for future work and so creation is always synchronous.

I'd be curious to hear how well this aligns with others' points of view.
lidavidm? davidhcoe? ("ping" removed)