Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

database/sql: add Context support #15123

Closed
bradfitz opened this issue Apr 5, 2016 · 49 comments
Closed

database/sql: add Context support #15123

bradfitz opened this issue Apr 5, 2016 · 49 comments
Milestone

Comments

@bradfitz
Copy link
Contributor

bradfitz commented Apr 5, 2016

This bug is about adding Context support the database/sql package.

Parent bug is #14660

@bradfitz
Copy link
Contributor Author

bradfitz commented Apr 5, 2016

@pbnjay wrote on the parent bug:

I'm a bit late to the discussion, but would it make more sense to add a Context to sql.Tx instead of the Query args? Automatic rollback on cancellation would be desirable, no? Alternative, maybe less desirable version would be a method like BeginWithContext(ctx)

"instead of"? Remember you can also query outside of transactions.

@cihangir
Copy link

cihangir commented Apr 5, 2016

I started implementing this a while back ago with having context as the first parameter to all the api functions. Apart from SetMaxIdleConns and SetMaxOpenConns all functionality implemented. Given the fact that my implementation breaks the whole api, implementation is purely done in a third party manner (without touching to std lib code)

For the ref, you can find the docs here https://godoc.org/github.com/cihangir/ctxdb

I would like to help on the implementation if need.

Automatic rollback on cancellation would be desirable, no?

I tried to achieve this by closing the connection to the database Some of the database vendors handle it properly, but i am not sure if it is a general solution.

@pbnjay
Copy link
Contributor

pbnjay commented Apr 5, 2016

You can query outside of transactions, but I think combining a user-space context with the database transaction (even for a single-query transaction) is troublesome. If you run an UPDATE but your context times out, user space indicates a cancellation/error, but the database may complete the request successfully. Sure this would be a user error, but ensuring that all Context usage occurs within a transaction allows for consistent results.

Surely I'm missing something?

@extemporalgenome
Copy link
Contributor

I like @pbnjay's idea of having tying contexts into Tx -- even if we have to add a way to "begin" a fake transaction, or create an interface that *sql.Tx will be made to implement, and which is additionally consumed for non-transactional database contexts.

With the "magic first argument" behavior, there are too many open questions. What happens if a user passes in a first argument that implements both driver.Value and context.Context? There is already also the open question about whether it should be the first or last value, and users will have the same question (and make mistakes which may pass silently, depending on how much validation the driver does). The only way to find out if a context with a 5 minute deadline is actually being used for query aborts is to construct a query that takes 5 minutes.

Also, users may infer that multiple contexts may be passed in (whichever is cancelled first), or have subtle bugs that didn't occur before but can occur now, such as sorting the parameter []interface{} for a generated "IN" query, say for debugging clarity, in a wrapper in front of Query/Exec.

I don't think we need a parallel API, but we can shift the proposal to a couple new methods that flow into the existing API (and don't sacrifice type-safety).

@bradfitz bradfitz added this to the Go1.7 milestone Apr 7, 2016
@ymmt2005
Copy link

ymmt2005 commented Apr 7, 2016

FWIW, I have just implemented a context-oriented query canceler for MySQL.

Not only in Go, but MySQL server should also be able to cancel running queries
to avoid wasting CPU and memory resources. In my implementation, I used two
techniques:

  1. (for MySQL < 5.7.8) KILL by another connection.
  2. (for MySQL >= 5.7.8) SET max_execution_time.

This was possible because I knew my program issues only SELECT statements.

From this experience, I'm not sure whether it is good or not that database drivers
implicitly executes this kind of queries with side effects for native support of contexts.

The code is here for your reference:
https://github.com/cybozu-go/goma/blob/ac585a21752104d5ce69601c310eed753bf8f135/probes/mysql/probe.go#L27

@pbnjay
Copy link
Contributor

pbnjay commented Apr 11, 2016

This one may need sql/driver support to function, I did some testing here with Postgres and the query keeps running even if you kill the code entirely with ctrl-C :

https://gist.github.com/pbnjay/96f829f8aa77d7ae1d88bca094e5d6ec

Since @ymmt2005 also mentioned mysql-specific commands, it leads me to think it can't be done from the current generic interface.

@j7b
Copy link

j7b commented Apr 25, 2016

Context support isn't a good fit for database/sql, there's no portable way to add deadlines or cancellation signals, I'd prefer to see a separate package like database/context that wraps *database.DB methods with checks to see if Done() is closed for a passed context and adds appropriate errors.

@bradfitz bradfitz modified the milestones: Go1.8, Go1.7 May 10, 2016
@johto
Copy link
Contributor

johto commented Jun 28, 2016

If you run an UPDATE but your context times out, user space indicates a cancellation/error, but the database may complete the request successfully. Sure this would be a user error, but ensuring that all Context usage occurs within a transaction allows for consistent results.

With explicit transactions you're merely moving this problem from the UPDATE to the COMMIT command, not eliminating it.

@derekperkins
Copy link

This one may need sql/driver support to function

Seems very likely.

@kardianos
Copy link
Contributor

@bradfitz

Context is a great fit for sql. Almost all rdbms have a query cancel method and all databases need to ensure their connections are returned to the pool when the request is finished.

One option I could see is in sql.DB add a Ctx(context.Context) *sql.DB method who'd methods use a context under the covers.

In the driver you might need new interfaces that could be tested for, such as

type ConnCtx interface {
        PrepareCtx(ctx context.Context, query string) (sql.Stmt, error)
        CloseCtx(ctx context.Context) error
        BeginCtx(ctx context.Context) (sql.Tx, error)
}
type DriverCtx interface {
        OpenCtx(ctx context.Context, name string) (Conn, error)
}
type ExecerCtx interface {
        ExecCtx(ctx context.Context, query string, args []Value) (Result, error)
}
type QueryerCtx interface {
        QueryCtx(ctx context.Context, query string, args []Value) (Rows, error)
}
type StmtCtx interface {
        CloseCtx(ctx context.Context) error
        ExecCtx(ctx context.Context, args []Value) (Result, error)
        QueryCtx(ctx context.Context, args []Value) (Rows, error)
}
type TxCtx interface {
        CommitCtx(ctx context.Context) error
        RollbackCtx(ctx context.Context) error
}

Does that sound about right?

@kardianos
Copy link
Contributor

@bradfitz

One option I could see is in sql.DB add a Ctx(context.Context) *sql.DB method who'd methods use a context under the covers.

I hope to start designing some CLs for testing. What do you think of adding a Ctx method to *sql.DB which returns a *sql.DB, but allows any queries from the returned DB to use the context. This would be done per request obviously. I want to make sure I'm not barking up the wrong tree on this.

@quentinmit
Copy link
Contributor

I think we have a convention of calling such a method WithContext. If you do this for queries, how do you propose applying a context to the process of connecting to the database? Would you just not use a context for that?

@kardianos
Copy link
Contributor

@quentinmit I think with a connection pool, you may specify a connection timeout in the DSN. You then pass a context to the context pool with a timeout for acquiring the connection.

You are correct about the method name. Agree it should be spelled *sql.DB.WithContext(context.Context) *sql.DB..

@kardianos
Copy link
Contributor

I haven't used context everywhere it needs to be used, but the API and basic plumbing is present.
https://go-review.googlesource.com/28532

Feedback on the approach welcome.

@bradfitz
Copy link
Contributor Author

bradfitz commented Sep 5, 2016

@kardianos, remove the "DO NOT REVIEW" label from your CL if you want a review.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/28532 mentions this issue.

@bradfitz
Copy link
Contributor Author

bradfitz commented Sep 8, 2016

@kardianos and I discussed this today. He's going to write up a summary of our discussion here.

@kardianos
Copy link
Contributor

kardianos commented Sep 8, 2016

Current design adds the following methods to the database/sql package.

func (*DB) BeginLevel(ctx context.Context, level IsoLevel) (*Tx, error)
func (*DB) ExecContext(ctx context.Context, query string, args ...interface{}) (Result, error)
func (*DB) QueryContext(ctx context.Context, query string, args ...interface{}) (*Rows, error)
func (*DB) QueryRowContext(ctx context.Context, query string, args ...inerface{}) *Row

func (*Stmt) ExecContext(ctx context.Context, args ...interface{}) (Result, error)
func (*Stmt) QueryContext(ctx context.Context, args ...interface{}) (*Rows, error)
func (*Stmt) QueryRow(ctx context.Context, args ...interface{}) (*Row)

func (*Tx) ExecContext(ctx context.Context, query string, args ...interface{}) (Result, error)
func (*Tx) QueryContext(ctx context.Context, query string, args ...interface{}) (*Rows, error)
func (*Tx) QueryRowContext(ctx context.Context, query string, args ...interface{}) *Row
func (*Tx) StmtContext(ctx context.Context, stmt *Stmt) *Stmt

The goal is to drive a middle ground between having a context on every Rows method, but not on what is designed to be globals such as *DB or *Stmt nor store context in what looks like a *DB. The context parameter is designed to signal the driver to close the connection (cancelling the query if the driver supports it) and return the connection to the database pool at the end of the request.

The database/sql/driver package will also get modified to add optional interfaces drivers can implement. Most database systems support cancelling in-flight queries, use context to signal cancellation. If a driver doesn't support the new interfaces the new sql Context methods can still be used, however in-flight queries may continue to process until it sees the connection is closed.

type ExecerContext interface {
    ExecContext(ctx context.Context, query string, args []Value) (Result, error)
}
type QueryerContext interface {
    QueryContext(ctx context.Context, query, args []Value) (Rows, error)
}
type RowsContext interface {
    ColumnInfo() []ColumnInfo // Probably not this exactly, but may have something like this, see #16652
    Close() error
    NextContext(ctx context.Context, dest []Value) error
}
type StmtContext interface {
    Close() error
    NumInput() int
    ExecContext(ctx context.Context, args []Value) (Result, error)
    QueryContext(ctx context.Context, args []Value) (Rows, error)
}
type TxContext interface {
    CommitContext(context.Context) error
    RollbackContext(context.Context) error
}

@ymmt2005
Copy link

ymmt2005 commented Sep 9, 2016

Off topic, but this new API catches my attention:

func (_DB) BeginLevel(ctx context.Context, level IsoLevel) (_Tx, error)

It would be better if we can also pass a flag to turn the transaction read-only.
Some databases can prohibit (and optimize) a transaction from modifying data.
For instance, MySQL can do it by START TRANSACTION READ ONLY statement.

Any chance can BeginLevel take the third parameter as readonly bool?

@kardianos
Copy link
Contributor

@ymmt2005 Thank you for pointing that out. I've done a rough survey of systems that support this. I'll need to do a more in depth survey before having an opinion, but so far I'm in favor of doing something for that.

@kardianos
Copy link
Contributor

@noblehng I'm aware of what ISO defines is only 4 levels. But you didn't give an example of a different iso level, just that a particular existing isolation level may have different meaning. It think that is where we differ in opinion. Is it okay to have different database system take the same const when they might be implemented differently have have different properties?

@derekperkins
Copy link

What about defining an interface for drivers to define supported transaction levels?

@kardianos
Copy link
Contributor

@derekparker I don't understand what you mean by that. Could you give an example?

@noblehng
Copy link

@kardianos You can provide some predefined flags in the stdlib, I have no objection to that, but you will also need to add a big warning in its documentation. Database driver specific flags can also coexist with stdlib predefined flags. Use a flags uint32 to BeginContext supports both.

@kardianos
Copy link
Contributor

@noblehng Can you give me an example of a name of a transaction level that is not represented in the list above? Your previous examples have listed different behavior between different servers on the same iso level name, but you have not given a different iso level that is on the list.

@noblehng
Copy link

noblehng commented Sep 13, 2016

@kardianos Your list for transaction isolation level is enough, except I will put IsoSnapshot before IsoSerializable. But you can have other vendor specific flags that can be passed to when starting a transaction, For example, Oracle has some flags to specify rollback segment for a transaction, and I don't think that is universal across vendor, you can see that here. And there other flags for the wire protocol here.

Oh, and ReadOnly should be a access mode, not a isolation level.

Another concern is that if you define those contants for user code in database/sql, you will also need to syncing it in database/sql/driver for driver implementation, or the opposite way.

@kardianos
Copy link
Contributor

kardianos commented Sep 13, 2016

@noblehng Thank you for the pointers to the docs. While not exactly the same, another, separate issue is to support Tx.Savepoint(name string) and Tx.RolebackTo(name string). The API at large doesn't support resuming transactions at present. I don't know what a Loosely coupled branch or tightly coupled branch is, nor do I know what a separable transaction is.

I'm open to having an API like:

type TxFlag interface{
    TxFlagValue() interface{}
    String() string
}

type Isolation byte
func (i Isolation) TxFlagValue() interface{} {
    return i
}

const (
    IsoDefault Isolation = iota * 2
    IsoReadUncommited
    IsoReadCommited
    IsoWriteCommited
    IsoRepeatableRead
    IsoSerializable
    IsoSnapshot
    IsoLinearizable
)

type TxAccessMode byte

const ReadOnly TxAccessMode

func (m TxAccessMode) TxFlagValue() interface{} {
    return m
}

func (db *DB) BeginLevel(ctx context.Context, flags ...TxFlag) (*Tx, error)

That way the driver has the option to define additional flags, and the flags could contain data for the something like the Oracle USE ROLLBACK SEGMENT <name>, where <name> would be the value in the interface.

package oracle

type TxRollback string
func (r TxRollback) TxFlagValue() interface{}
    return string(r)
}

Used like:

db.BeginLevel(ctx, sql.IsoSerializable, oracle.TxRollback("DANCING_CATS"))

I would be interested to hear what you and @bradfitz think of that.

@derekperkins
Copy link

@kardianos Your example is basically what I was suggesting.

@noblehng
Copy link

noblehng commented Sep 14, 2016

@kardianos I think using a interface is overkill. For vendor specific settings, you will need to import the driver expicitly (not just register with a '_') anyway. So for more complex transaction settings, driver could provide helper functions like: func SetTxXXX(tx *sql.Tx) error, which just call tx.Exec("SET TRANSACTION ...") under the hood. It is just that if you are adding a flag like argument to BeginContext() to support isolation level and access mode etc, we could make it a little more generic by making it a flags uint32, which could both use with stblib defined or driver specific flags without much overhead.

And I still want to ask the reason to put IsoSnapshot behind IsoSerializable. MSSQL's wire protocol MS-TDS seems to use the same ordering. But I think that is just because Snapshot for them is added after, so they can't change the ordering for backward compatibility.

Also the encoding of the Isolation constants could be more efficiently, as different isolation levels can't be used together.

@kardianos
Copy link
Contributor

kardianos commented Sep 14, 2016

@noblehng I heard you express the user requirement "as a user, I want to enable usage such as Oracle's USE ROLLBACK SEGMENT (name)". I then see you are pushing the technical solution to have transactions be a uint32 flag set. Because the use rollback needs a (name), I don't see how your proposed technical solution satisfies your previous user requirement.

@noblehng
Copy link

noblehng commented Sep 14, 2016

@kardianos I said that above, it can be done with a func SetTxXXX(tx *sql.Tx) error in driver. Because this is too driver specific, no need to specially support that in the stdlib. I want the constants to be extendable because there is already a uint32 or byte like argument in your proposal, better make it more extendable. It is more about api design than just user requirement.

For predefined constants, it could be like:

const (
    DefaultTx iota

    // isolation level, 3bits
    // Warning: The actual meanings of these levels are drive defined, use with cautious
    ReadUncommitedTx
    ReadCommitedTx
    WriteCommitedTx
    RepeatableReadTx
    SnapshotTx
    SerializableTx
    LinearizableTx

    // access mode, 2bits
    ReadOnlyTx    0x1000
    WriteOnlyTx   0x10000
    ReadWriteTx   0x11000

    // remaining bits are for custom driver defined flags
)

Edit:
WriteCommited probably shouldn't exsist, so remove it from the above list.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/29381 mentions this issue.

@andlabs
Copy link
Contributor

andlabs commented Sep 28, 2016

If you don't mind me asking: when I first read the context package docs, I assumed I would do something like this to have per-request database transactions associated with per-request contexts:

func (h *MyHTTPHandler) ServeHTTP(w http.RequestWriter, r *http.Request) {
    ctx := context.Background()
    tx, err := globalDBConn.Begin()
    if err != nil { /* report error and return */ }
    ctx = context.WithValue(ctx, "tx", tx)
    err = h.realHandle(ctx, w, r)
    if err != nil { tx.Rollback(); /* report error */ }
    err = tx.Commit(); // and so on
}

Am I wrong about this, given all these new methods? I admittedly don't quite get what a context should store, apart from the general "request-scoped values".

@derekperkins
Copy link

This doesn't change anything about where you keep your database or your transaction variables (which should not be in the context).

Where this adds value is to allow for cancelation of queries and transactions for whatever purpose, plus adding some ability for extra tooling like traces, logs, etc.

@kardianos
Copy link
Contributor

context does store values and that can be useful at times. But context is mostly about limiting the life span of requests in a consistent manner. Cancellation and timeout. This might be worthy of a blog post, but here is how you might use context here:

var db *sql.DB
func InitialProgramSetup() error {
    var err error
    db, err = sql.Open("pg", "connect-to-server")
    return err
}
func (h *MyHTTPHandler) ServeHTTP(w http.RequestWriter, r *http.Request) {
    // r.Context() will automatically cancel when this request is done.
    // TODO: use trace package to inject UUID to context used for tracing execution.
    ctx, cancel := context.WithTimeout(r.Context(), time.Second * 1)
    defer cancel() // Not needed, the r.Context() will cancel when the request is done already.
    r = r.WithContext(ctx)

    switch r.URL.Path {
    default:
        http.Error(w, "not found", 404)
    case "/d/user":
        h.dataUser(w, r)
    }
}

func (h *MyHTTPHandler) dataUser(w http.RequestWriter, r *http.Request) {
    tx, err := db.BeginContext(r.Context())
    if err != nil {...}
    // The following would timeout and and the Tx would rollback any changes.
    // The err would be "context.DeadlineExceeded".
    // Drivers can also tie into this to cancel the query on the server to prevent the
    // server from consuming additional resources.
    row, err :=  tx.QueryRow(r.Context(), "sleep 99999; select * from user where ID = ?;", 42)
    if err != nil {...}
    tx.Commit()
}

@andlabs
Copy link
Contributor

andlabs commented Sep 28, 2016

Ah, that makes more sense. Good to know; thanks!

@bradfitz
Copy link
Contributor Author

bradfitz commented Nov 3, 2016

@kardianos wrote a doc summarizing what's new for database/sql in Go 1.8:

https://docs.google.com/document/d/1F778e7ZSNiSmbju3jsEWzShcb8lIO4kDyfKDNm4PNd8/edit

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests