Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

database/sql: add support for named input parameters #12381

Closed
kardianos opened this issue Aug 28, 2015 · 41 comments
Closed

database/sql: add support for named input parameters #12381

kardianos opened this issue Aug 28, 2015 · 41 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@kardianos
Copy link
Contributor

Named input parameters are supported by a number of larger database servers and simulated in some database clients for protocol's that only support positional arguments.
TDS is MS SQL Server's wire protocol and directly supports named parameters.
Postgres' v3 wire protocol doesn't support named parameters but some clients simulate it through replacement.

In SQL this may look like:

select d.Name, d.Age
from app.Dancer as d
where d.ID = :DancerID;

or

select d.Name, d.Age
from app.Dancer as d
where d.ID = @DancerID;

In small to medium cases positional placeholders work fine. In larger tables positional indexes become hard to use, bug prone, and possibly impossible to use.

One of my clients uses JDE, their table structure can be viewed here: http://www.jdetables.com/
Yes, it is bad, but I've gotta deal with it. I have another client that uses a similarly "enterprise" solution that has huge wide tables. Now try to imagine updating or inserting into those tables, or executing proc with 30-50 parameters. Named parameters gives me a chance to not get an index off by one.

The other area where I keenly feel the need for named parameters is when I dynamically assemble SQL, but still want to pass in input parameters so I don't need to manually escape everything. Two examples:

  1. When creating a specific RPC for a search or filter I often need to filter in particular cases. There are two ways to do this:
    a.
select * from data.Account as a
where
  (len(:InputValue) <> 0 and :InputValue like a.Name)
  or (len(:inputValue = 0 and a.Name = 'BOB')

I use this method often. However sometimes in complex cases performance can drop off due when it chooses the wrong query plan. When that starts to happen I do the next option:
b.

sql := &bytes.Buffer{}
sql.WriteString(`select * from data.Account as a where 1=1 `)
if len(inputValue) != 0 {
  sql.WriteString(` and :InputValue like a.Name `)
} else {
  sql.WriteString(` and a.Name = 'BOB' `)
}

When you start to construct the SQL, it becomes harder to determine the positions all the parameters should be in and makes the code more complex, though not impossible if you push input parameter values into an array as you push sql fragments in.

  1. While truly it is a variant of 1b, when creating a more generic application framework the work-around presented in there becomes more difficult. I sometimes define both the name of the parameter I want to pass in, tie it to the client, and if present construct SQL to the desired effect. I realize that is vague, but I don't have a ready open source solution I can show.
@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Aug 29, 2015
@kostya-sh
Copy link
Contributor

It might be useful to know what other packages are doing.

  • JDBC (Java API for working with databases) doesn't support named parameters. Usually this functionality is provided by libraries or frameworks (e.g. Spring, Hibernate, etc).
  • Python DB API 2.0 supports a few param styles (including named parameters) but the style that is used selected by a driver.

@kardianos
Copy link
Contributor Author

I'm not a fan or ODBC or JDBC. I'm aware they don't support named parameters even when the underlying database protocols do, then when the higher abstractions add back in parameters through substitution we get even more inefficiencies.

A few years back I had to access a database from go through ODBC as part of a warranty registration process:

exec warr.pr_api_WarrantyRegister ?, ?, ?, ?, ?, ?, ?,      ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,   ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,    ?, ?, ?, ?, ?, ?,   ?

Over the years I have both added and removed parameters from this call, each time it is a counting game.

All that to say, I would prefer to survey database protocols rather then other abstractions.

@gwenn
Copy link

gwenn commented Aug 30, 2015

For your information,
JDBC supports named parameters (for callable statement):
http://docs.oracle.com/javase/7/docs/api/java/sql/CallableStatement.html#setObject(java.lang.String,%20java.lang.Object)

@kostya-sh
Copy link
Contributor

@kardianos, I agree that surveying database protocols is necessary. If named parameters are not supported by a database protocol than either database/sql or driver package would have to parse supplied SQL to implement placeholder replacement correctly.

For example postgresql protocol supports positional parameters only (e.g. $1, $2, etc). pq driver for go also uses this syntax. ? placeholder syntax is not supported because fair amount of work is required to implement this functionality properly. See discussion of lib/pq#233 for more details.

@gwenn, callable statements are for stored procedures only. It is much easier to add support for named parameters in this case as most databases provide a way to get information about a stored procedure including parameter names and types.

@kardianos
Copy link
Contributor Author

@kostya-sh Regarding adding named parameters, you hit the nail on the head. Named parameters for drivers needs to be optional, if it was supported. However sometimes you don't need a full SQL parser for things like this correctly. For example MS SQL Server has a "go" keyword that isn't part of T-SQL, but is rather a common client extension that separates out SQL batches. Supporting that required a simpler parser: https://bitbucket.org/kardianos/rdb/src/3240fb22ca61727b4604fae0c541737c1e1e26eb/ms/batch/batch_test.go?at=default

But you are correct, it would be non-trivial to do correctly even in the best of cases. I don't deny that.

@kardianos
Copy link
Contributor Author

I could see a few ways to make this work.

  1. We could have an `DB.QueryCtx(ctx, query, ...Param)
  2. We could just accept a new special Param type into the existing methods
  3. We could alternate DB.Query(query, "ID", 5, "Name", "Dreaming"), but I don't think that could work as that would be API breaking (different meaning, no way to differentiate).

Driver interfaces would need to be updated as well. Having a Param type would solve specifying an input type and specifying output parameters as well.

@bradfitz If methods were added to support the context param, I would support (1). Any opinion on this?

@freeformz
Copy link

freeformz commented Aug 12, 2016

How about?

db.Query(quesry,sql.NamedParam{"ID",5},sql.NamedParam{"Name","Dreaming"}).

Query's args are ...interface{}, so we can pass a type that encapsulates the named param.

@kardianos
Copy link
Contributor Author

So (2) in the above options. I'd also like to include type information as well as Out bool for output parameter.

I think having a Param or NamedParam type is the best option. My concern with re-using the existing method is how to support existing drivers, or telling them apart. I guess if the driver doesn't support the new interface and the query args contained a Param type, then it could just error out, backend not supported.

@freeformz
Copy link

@kardianos Doh! sorry. Mentally glossed over #2 somehow. :-(

Another option is to only support this on prepared statements, which the ValueConvertor interface may be able to already do for us with some extentions / docs / guidance ? Not sure as I'm not too familiar with it or how its used.

@kardianos
Copy link
Contributor Author

@freeformz That is how postgresql proto works, but other systems like TDS (ms sql server) can specify named param types for batch queries as well as prepared statements and trans. Most postgresql drivers do value substitution for non prepared statements when used with named parameters.

Hmm, I'm not sure how the ValueConverter would help in this case.

@freeformz
Copy link

freeformz commented Aug 12, 2016

@kardianos K.

Re: ValueConverter, I was reading through the Stmt code and maybe I just misunderstood what it's doing and how it works. Never mind.

Query could check args to see if any of them are sql.NamesParams and if so then assert for a NamedQueryer interface that looks like this:

type NamedQueryer interface {
        NamedQuery(query string, args []Value, namedArgs []NamedValue) (Rows, error)
}

Stmt would also need to handle named params as well right? And it handles args in a very different way (the code I was reading).

@kardianos
Copy link
Contributor Author

How would package references work out?
sql -> sql/driver
impl -> sql/driver
impl -> sql

sql/driver !-> sql

As such NamedValue or Param could be a concrete type, but it would need to satisfy an interface that was defined in sql/driver.


I'll need to look at those particulars in a bit. I'll see if I can get a complete conceptual model of how the driver code operates.

@freeformz
Copy link

I was using sql.NamedParam and driver.NamedValue to differentiate.

@kardianos
Copy link
Contributor Author

You're one step ahead of me :).

@bradfitz
Copy link
Contributor

How about?
db.Query(quesry,sql.NamedParam{"ID",5},sql.NamedParam{"Name","Dreaming"}).

The only problem with that is it wouldn't pass go vet checks, due to use of unkeyed struct literals.

What about a variation of that, where we have:

type NamedParam struct {
   Name  string
   Value interface{}
}

func Param(name string, value interface{}) NamedParam {
   return NamedParam{name, value}
}

So then callers are shorter and look like:

   db.Query(query, sql.Param("ID", 5), sql.Param("Name", "Dreaming"))

@bradfitz bradfitz modified the milestones: Go1.8Maybe, Unplanned Aug 12, 2016
@kardianos
Copy link
Contributor Author

@bradfitz
That would be alright by me. I would prefer a shorter name then NamedParam for the struct. Here is an example from one of my procs:

   ... query text ...
        rdb.Param{Name: "@Approve", Type: rdb.Bool, Value: approve},
        rdb.Param{Name: "@Comment", Type: rdb.Text, Value: comment},
        rdb.Param{Name: "@QueueType", Type: rdb.Integer, Value: NotifyReport},
        rdb.Param{Name: "@Account", Type: rdb.Integer, Value: userInfo.ID},
        rdb.Param{Name: "@Now", Type: rdb.Time, Value: time.Now().UTC()},
        rdb.Param{Name: "@Today", Type: rdb.TypeDate, Value: time.Now()},

Specifying the placeholder in the name may also be important as different systems accept different ones. (In the v1 of rdb, I just had the name, no placeholder and let the driver append it. I think that was a mistake.)

@kardianos
Copy link
Contributor Author

Type can be important, I've run into text vs varchar issues, or Date vs Timestamp issues, Decimal precision param issues.

@bradfitz
Copy link
Contributor

@kardianos, note that the type in my example had a long name, but the function to create them was short. Is that okay?

@kardianos
Copy link
Contributor Author

@bradfitz Noted.

In ADO.NET there is both an overloaded function to generate a param and you can use a struct literal (in go parlance) to specify it. Often none of the overloads do what I want so I use a combination of function + struct literal (it is an interesting C# syntax that allows that). From that experience I tend to shy away from function helpers in this context.

What I would recommend is to not include a function helper. If an application wants to it can specify a helper function in the data access that meets the particular need. I would prefer just sql.Param as a name, rather than sql.NamedParam, but the preference is small.

I've haven't referenced this directly in this issue thus far, but I also think it important to have an optional type in the NamedParam struct as well, default value TypeUnknown.

How do you feel about including type information in the NamedParam type? Possibly Length, Precision, and Scale as well.

@bradfitz
Copy link
Contributor

I would prefer just sql.Param as a name, rather than sql.NamedParam, but the preference is small.

See my comment above about go vet. It would be very wordy at use sites.

How do you feel about including type information in the NamedParam type?

I don't think we should include any type information. A Go interface value already conveys the type information. If a user wants to signify a certain type, they can use a Go type with that value.

@kardianos
Copy link
Contributor Author

@bradfitz
I understand. This could be an issue. In many instances, though not all, it makes a difference if a driver sends a parameter as a text, varchar, or nvarchar. Sometimes for it maters for compatibility and sometimes for performance.

If this isn't an option, while I can still work to put in multiple result sets in database/sql, I will need to work on a different interface for my own uses. I'm not trying to be difficult, it is just the constraints I face; it isn't worth trying to fight the rdbms. Maybe there is a middle ground with a driver defined type like:
Query("select * from dbo.Locus where Name = @Locus", sql.Param("Locus", mydriver.Text("HLA A"))) I'm not sure. I'm not sure yet how that would play with the rest of the pipeline.

@kardianos
Copy link
Contributor Author

It would be very wordy at use sites.

I'm okay with wordy. I'm not okay with "can't do". SQL just is wordy.

@bradfitz
Copy link
Contributor

I'm fine with driver-defined types for the weird cases.

@kostya-sh
Copy link
Contributor

I think it would be very useful to collect use cases where specifying parameter types is necessary.

@kardianos you mentioned about a few issues with types. Can you provide more details?

This issue about named parameters though, I think the discussion about types can be moved to #12383.

@freeformz
Copy link

FWIW: lib/pq has some driver defined types for array handling and a helper function https://godoc.org/github.com/lib/pq#Array

@noblehng
Copy link

I don't think named input parameter binding is problem, you can use named input parameter in you statement string as long as your database support it. The underlying driver doesn't need to know the name, it can just treat named binding as positional binding and let the database server handle it. For example, go-oci8 support named bind with database/sql, and it just use a positional binding under the hood. Note that placeholder can be numbered, named or not.

The only problem I can see is portability across database. For that you will also need a uniformed named placeholder format, which will inevitably need to do string substitution in the driver if the database doesn't support named binding or use a different format. This can already be done in a third-party helper library like sqlx now.

As for output parameter binding, there are three problems now as I can see:

  • We can use a pointer to value as a argument supplied to Exec() and Query() for value output parameter, and database drivers can infer the database type from value type then doing the binding and store back the result in the pointer. The problem here is that, in database/sql, all args are converted to database/sql/driver.Value then pass to the underlying database driver, and Value doesn't allow a pointer type, and will even do a auto-dereference with the default converter or use database/sql/driver.IsValue() to check that for custom Valuer. This can be work aroud in the driver by implement a custom ColumnConverter on *Stmt, but that doesn't work for all situation. A better solution will be create a OutputValue type as a pointer, and implement the Valuer interface, then modify database/sql/driver.IsValue() to accept that.
  • For output ref cursor, we can't just use a pointer for drivers to infer its type, we will need a special type here. This special type could be merge with the above OutputValue type. Note that ref cursor's result will be return as a result set and doesn't need to store in the argument passed to Query().
  • Finally, some output parameter could have no placeholder in the statement. For example, when calling a function (not stored procedure) with return values, the return values will not have placeholder in our call statement. Another example will be something like SELECT id, CURSOR(SELECT * FROM table_2) FROM table_1 where id=100;, the cursor here is a output parameter without placeholder. For this problem, we could put extra arguments after those with placeholder in Exec() and Query(). The problem is that the database may not numbered the parameters in this order, but caller can always arrange the arguments' order base on which database they use.

For database specific types, I think is better leave to database drivers and modify database/sql/driver.IsValue() to accept them. I will add some comment in #12383.

@kardianos
Copy link
Contributor Author

Wire protocols that support named parameters should send those names to the database. If for nothing else error messages from the DB will be much easier to understand. I personally don't want to use a database that forces named parameters into indexed parameters if the wire protocol supports it.

RE driver types: yes, defining DB types on the driver seems to be the way people are leaning, though I think it hurts the possibility of more generic uses. Did you see #12381 (comment) ? It looks like this is possible today. (I still need to investigate all the way myself.)

Output values and cursors are probably not something I personally would want to tackle with the database/sql API. Feel free to open a specific proposal / send a change request on thoses.

@noblehng
Copy link

@kardianos Note that the named placeholders are still in the statement text that we send to database, we just bind it by position, database will have to parse placeholders anyway, so it will have both name and index infomation. So for error messages, database could still use placeholder names if it want to support that.

Yes, I saw the pq support for PostgreSQL array type. But it just convert the array to a Go string, because as a input paramater, PostgreSQL's array could just be represented by {elem1, elem2, ...}. So that is a little hack now. For underlying problem for cunstom types now and my thoughts about common database types, you can see #12383 (comment).

@kardianos
Copy link
Contributor Author

@noblehng I'm not understanding what you are saying then. You are not suggesting modifying the SQL string sent to the server that is provided by the user.

How would you bind the following parameters by index?

PARAMS:
    IN bigint :ID = 3;
    IN decimal :Amount = 34.56;

SQL:
IF :ID is null begin
    insert into app.Invoice (Amount) values (:Amount);
    set :ID = LASTVAL();
end

update app.Invoice set
    Amount = :Amount,
    TimeUpdated = GETDATE()
where
    ID = :ID;

@noblehng
Copy link

noblehng commented Sep 13, 2016

You just provide the string as is, and provide the args, because the args are indexed, so the driver can infer its position then do a positional binding. No string substitution on the driver side, database server can handle it. You can see how go-oci8 do it. Your sample will be something like Exec(query, id, amount), database server have enough infomation to handle this, it can do the name bind and give it a index when parsing the query for placeholders.

@kardianos
Copy link
Contributor Author

@noblehng I don't understand what you are saying. Are you saying you would send :ID up three times to positionally bind it? I'm afraid we might be talking about different things. This issue is about being able to do:

db.Exec(`
IF :ID is null begin
    insert into app.Invoice (Amount) values (:Amount);
    set :ID = LASTVAL();
end

update app.Invoice set
    Amount = :Amount,
    TimeUpdated = GETDATE()
where
    ID = :ID;`, sql.Param(":ID", id), sql.Param(":Amount", amount))

Is that what you were addressing?

@noblehng
Copy link

@kardianos :ID will be sent only once as the first position, your database's sql syntax parser will understand :ID and :Amount are the only placeholders, and following :ID and :Amount are just reference with the same value.

Note that, '?',':name', ':number', '@name', $number are both just text input from a sql syntax parser's standpoint. The only different is what is allow for the first character of a placeholder to differ them with other identifiers in the same statement. While parsing placeholders, a parser will also indexing them, so it can easily map a number to a text fragment in the statement.

If your database doesn't support mapping index to name yet, you can still easily workaround it by using custom type (which not supported yet in the stdlib), and no need to add a special api in database/sql.

Your example will just be:

id, amount := 100, 100
db.Exec(`
IF :ID is null begin
    insert into app.Invoice (Amount) values (:Amount);
    set :ID = LASTVAL();
end

update app.Invoice set
    Amount = :Amount,
    TimeUpdated = GETDATE()
where
    ID = :ID;`, id, amount)

and database driver will bind the value of id to position 1 and the value of amount to position 2.

@kardianos
Copy link
Contributor Author

This issue is about mapping an explicit Name to Value in the parameter list.

So when you have 50 parameters (not an exaggeration), they stay sane. Thus key is the sql.Param(name, value) construct (or similar pairing) is key.

@noblehng
Copy link

I don't see my example is harder to parse for human eyes though. Variable id and amount maps to :ID and :Amount, and database server will use :ID and :Amount for error messages.

And as I said, you can always use custom type for more syntax surgar if you want, which will not further improve error messages from database servers, maybe it could improve driver error messages if driver also does some sql parsing.

@kardianos
Copy link
Contributor Author

The driver should not do SQL parsing.

Yes, two parameters are easy. I hate counting indexes past 10, and I have to interact with some wide tables and procs with long input params. It gets hairy past 25 input parameters and unmanageable past 40.

@noblehng
Copy link

noblehng commented Sep 13, 2016

You shold read my example above, you don't need to counting indexes this way. What needs to counting the index is the database driver, and that doesn't need to do sql parsing or string substitution. What really need to parse the sql is your database server, and it is nature for it to do the name<->index mapping.

Edit:
Technically, current driver can already support named parameter using the current stdlib api. It can use positional binding internally, but give a named binding interface to user, with all the benefit you mention of named binding and without any downside like string substitution. The go-oci8 driver shows that, and I just list the reasonings behind it.

@kardianos
Copy link
Contributor Author

@noblehng I don't understand what you are saying. I'm sorry. Your example doesn't have a Name to Value association like sql.Param("ID", 5), so it doesn't seem to relate to this issue.

@noblehng
Copy link

noblehng commented Sep 13, 2016

@kardianos In my example, :ID<->var id<->100. It even surpport syntax highlighting well, and if under a IDE, hover over id should display its value as 100.

I just look at MSSQL's wire protocol, so now I understand why one whould want special support for that. For infomation I write it down here, note that I could be wrong.

In short, MSSQL's MS-TDS doesn't really support binded by position, actually it doesn't have a bind command. Parameters are just send alongside the statement text with a RPC call, its structure is something like:

type Param struct {
    Name         string
    StatusFlags  uint8
    TypeInfos    []TypeInfo
    ValueBytes   []byte
}

So it can't bind at database server that have all the infomation from parsing the statement, it must doing the binding from the client side. Even for positional binding, it will need to parse the statement text to get the placeholder and doing the binding in driver, by assign parsed results to Param.Name, under current stdlib interface. You can see that in a mssql driver for Go here.

In my opinion, this is a corner case or a quirk of MSSQL, and it could well be solved by driver custom types. I will stop here and welcome others' opinion.

Edit:
Add a example for custom type solution for this, it is even fancier than sql.Param(name, value):

type InputValues map[string]driver.Value
...
db.Exec(`
IF :ID is null begin
    insert into app.Invoice (Amount) values (:Amount);
    set :ID = LASTVAL();
end

update app.Invoice set
    Amount = :Amount,
    TimeUpdated = GETDATE()
where
    ID = :ID;`, InputValues{
    ":ID":     100,
    ":Amount": 100,
})

Edit2:
I think leave out the Name part as empty string for all parameters in MS-TDS's RPC call could make MSSQL does positional binding by default. If this is true, drivers for MSSQL could just do what go-oci8 does to support named placeholders.

@kardianos
Copy link
Contributor Author

kardianos commented Sep 14, 2016

@noblehng This bug is not about how the driver works, whether it uses bind or directly send the named params. It is about adding user facing API to the sql package.

var (
    PersonID int64 = 42
    AccountID int64 = 42
)
// Make query more complex, add about 10 to 20 additional params, many just bigints.
// First was:
db.ExecContext(ctx, 'select Person  = :Person, Account = :Account;`, PersonID, AccountID)
// Then changed to:
db.ExecContext(ctx, 'select Account  = :Account, Person = :Person;`, PersonID, AccountID)

// But would still work if:
db.ExecContext(ctx,
    `select Account  = :Account, Person = :Person;`,
    sql.Param("Person", PersonID), 
    sql.Param("Account", AccountID)
)

I don't care how the driver works, if it uses something similar to TDS wire proto or something like postgresql that uses bind. (The TDS wire protocol will use index of the param if the name is omitted fyi.)

@noblehng
Copy link

noblehng commented Sep 14, 2016

@kardianos Thanks, now I can see your reason is to want to just change the query string without changing the the parameter order. Because from your other examples, I didn't notice this before.

Edit:
To be clarify for others, the intention for this issue, is not just support named placeholders in query string, which can already be done with the current api, but also support passing in parameters by name, so that their ordering become irrelevant.

The later part could be useful for queries that have large number of placeholders or their placeholders' position will chaning dynamically. This might need a new api in stdlib.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/30166 mentions this issue.

@quentinmit quentinmit added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Oct 10, 2016
@golang golang locked and limited conversation to collaborators Oct 17, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

9 participants