Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add escaping for SQL characters #3

Closed
twheys opened this issue Jul 20, 2016 · 13 comments
Closed

Add escaping for SQL characters #3

twheys opened this issue Jul 20, 2016 · 13 comments
Assignees
Milestone

Comments

@twheys
Copy link
Contributor

twheys commented Jul 20, 2016

Any input into pypika should escape invalid characters. This includes table names, field names, and values.

@twheys twheys added this to the 0.1.0 milestone Jul 20, 2016
@twheys twheys self-assigned this Jul 20, 2016
@spinningarrow
Copy link

Is there any workaround for this in the current version?

@twheys
Copy link
Contributor Author

twheys commented Aug 17, 2017

Not currently, except to escape the characters on the result query string itself. I've been dragging my feet on doing this but if it's a blocker for you let me know and I can take care of it.

@spinningarrow
Copy link

@twheys thanks for the suggestion! I think it would definitely help to have this supported. If it's fairly straightforward, I can also take a stab at it and make a PR if you'd like.

@twheys
Copy link
Contributor Author

twheys commented Nov 15, 2017

Yeah if you have time to do it, your contribution would be greatly appreciated 👍 There are some comments in different places in the code with TODOs where it the escaping should be applied

@spinningarrow
Copy link

I can't make any promises at the moment; I'm no longer working on the project that requires this 🙈

@twheys
Copy link
Contributor Author

twheys commented Nov 16, 2017

No worries

@reutsharabani
Copy link

@twheys How would you implement this? One way is to make sure all methods that receive inputs are validating it (escaping) - very prone to error on extension (someone may forget...)

The other method is to escape everything on get_sql (or the string creation).

Do you have a preferred way (possibly something else)?

@twheys
Copy link
Contributor Author

twheys commented Feb 14, 2018

There's a couple of places where it needs to be done. Basically, anywhere a parameter is taken that ends up rendered in the query. For example, table names and schemas, ValueWrapper values, etc. The escape function probably only needs to escape the quote characters, as long as what is being escaped is inside a string or something. Might prove a bit more complicated than that, though.

@reutsharabani
Copy link

reutsharabani commented Feb 16, 2018

@twheys I may be able to spend time on it this weekend or the next one. Do you have any guidelines on how to do it?

Honestly, I think this is crucial for this library to be usable for any interaction with input from the real world.

For now I'm thinking the same as you - any place that takes an argument from the user has to apply an escape function. This is the "naive" way but it really messes up the code's extensibility because now everyone has to do it.

I would prefer to escape when doing the get_sql so that everyone get the escape applied transparently "for free". Can this be a good implementation? Are all values stored in defined locations so that the function can escapoe everything? (values, columns, tables etc).

@twheys
Copy link
Contributor Author

twheys commented Feb 20, 2018

@reutsharabani sorry I didn't reply before the weekend. I don't have a particular solution in mind. I think the problem might be more complicated that it first appears, though, perhaps not. My thoughts were to make an escape function that replaced any special characters such as ; in the arguments that need it. I think the escaping should only need to be done on string arguments, not the whole get_sql output, otherwise you'll be escaping characters that the pypika internals are writing.

@reutsharabani
Copy link

@twheys no problem.

I think a custom escape function is meaningless. The functions are usually exposed by the drivers so just take it as an argument. I strongly suggest it to be at a single point that everything passes through, Otherwise it is hard to maintain (scattered around the code).

About escaping pypika internals - my idea was applying it at a single point before doing something with the values. If the values are mutated before that point it obviously won't work.

I will probably take a look at the weekend to get familiar with the code and see if I come up with something.

@grigi
Copy link
Contributor

grigi commented Jan 7, 2019

I think that there is two different ways to harden the query:

  1. Escaping the identifiers (e.g. table/column/alias names)
  2. Escaping the data

I feel we should definitely do 1, and it would be different for each dialect. e.g. MySQL uses ` (backtick), etc... We can get away with this as identifiers have a limited character set. We also must do it as there is no other way.

I feel we should not attempt 2, and find a way to treat data as binary instead (as in parametrized queries) as that is the only recommended way in most systems to access a database, as the data goes in a separate data structure.

@twheys
Copy link
Contributor Author

twheys commented Jan 15, 2019

Gonna close this since parametrized queries perform this role.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants