Using string array with magic block results in whitespace removal from first entry of the array #166
@amardeep The problem is related to the @plamut Could you help with this? I don't have much of an idea about the |
@HemangChothani @amardeep This appears to be yet another case where IPython's argument parsing turns out to be fragile. It's the same reason why negative numbers in Perhaps the docs on the Edit: Just tried it and unfortunately it doesn't work; the spaces confuse the parser even if the value is passed as a JSON string. Maybe we can hack around this in the _cell_magic() function, but even if we succeed, it would still only be dealing with the symptom and it might not work in all cases. @HemangChothani Do you want me to have a closer look when I find some time? |
It doesn't work with a JSON string:
params = {
'cats': ['apple orange', 'pear plum']
}
jparams = json.dumps(params)
jparams
%%bigquery df --params $jparams
SELECT * FROM UNNEST(@cats)
|
@plamut Yes, actually I am not available for a few days, so please take a look at it when you get some time, thanks. |
The following argument string:
is tokenized by IPython as follows:
The parser is not aware of dicts, lists, and similar structures, so it does not treat them as a single entity and splits them into multiple tokens. On the other hand, the parser does appear to recognize multi-word strings if they are enclosed in quotes, which is why it produces It made me wonder whether we could trick the parser into treating the first array item, i.e.
>>> params = {
... 'cats': ['apple orange', 'pear plum', 'foo bar']
... }
>>> params_json = json.dumps(params)
>>> with_space = params_json.replace('["', '[ "') # <-- Hocus-pocus, abracadabra! Magic!
>>> with_space
'{"cats": [ "apple orange", "pear plum", "foo bar"]}'
The modified JSON string gets tokenized as follows:
The space between "apple" and "orange" is no longer lost, and the correct result is produced:
%%bigquery df --params $with_space
SELECT * FROM UNNEST(@cats)
f0_
0 apple orange
1 pear plum
2 foo bar
Such manipulation of JSON strings is, of course, far from ideal, but at least it's straightforward and should work in most cases. We are somewhat limited by IPython's argument parser here. |
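The effect of the extra space can be reproduced outside a notebook with a rough stand-in for the tokenizer. This is only a sketch: it assumes the splitting behaves like Python's `shlex` in non-POSIX mode with whitespace splitting, and that the tokens are later glued back together without separators. Neither assumption is confirmed in this thread, and `approx_tokenize` and `pad_first_array_item` are hypothetical names, not part of any library.

```python
import json
import shlex


def approx_tokenize(argstring):
    """Approximate the argument splitting (assumption: shlex, non-POSIX)."""
    lex = shlex.shlex(argstring, posix=False)
    lex.whitespace_split = True   # split on whitespace only
    lex.commenters = ""           # do not treat '#' as a comment start
    return list(lex)


def pad_first_array_item(params):
    """Hypothetical helper: dump to JSON and insert a space after '['."""
    return json.dumps(params).replace('["', '[ "')


params = {"cats": ["apple orange", "pear plum", "foo bar"]}
plain = json.dumps(params)
padded = pad_first_array_item(params)

# Without the space, the first item is glued to '[' and split at its space.
print(approx_tokenize(plain))
# With the space, the first item starts a token and stays in one piece.
print(approx_tokenize(padded))

# Assumption: the tokens are re-joined with no separator before parsing.
print(json.loads("".join(approx_tokenize(plain))))
print(json.loads("".join(approx_tokenize(padded))))
```

Note that the plain `replace` call would also touch a string value that happens to end with `[`, so this stays a workaround rather than a fix.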
Another option to consider is to not use IPython's argument parser at all and to parse the line arguments with custom code. |
Considering the different edge cases reported where the built-in parser falls flat, we might actually have to do this at some point, yes. The replacement parser would of course have to be compatible with the existing one so as not to break existing notebooks. Edit: Ha! IPython's parser is based on Edit 2: Unfortunately, any |
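To make the custom-parsing idea concrete, here is a minimal sketch of pulling the `--params` value out of the raw line before any tokenizer sees it. It assumes that `--params` is the last option on the line and that its value is valid JSON; `split_params` is a hypothetical name, and a real replacement would also have to handle the other options and stay compatible with existing notebooks.

```python
import json


def split_params(line):
    """Hypothetical: return (text before --params, parsed params dict)."""
    head, flag, tail = line.partition("--params")
    if not flag:
        # No --params option on the line: nothing to parse.
        return head.strip(), {}
    # Assumption: everything after --params is one JSON document.
    return head.strip(), json.loads(tail.strip())


dest, params = split_params(
    'df --params {"cats": ["apple orange", "pear plum"]}'
)
print(dest)
print(params)
```

Because the JSON text is never split on whitespace, the space inside the first array item survives without any string manipulation.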
On passing a dict
{ 'cats': ['apple orange', 'pear plum']}
as params in a bigquery magic cell, the first value is changed to appleorange - the space character is filtered out.
Environment details
google-cloud-bigquery version: 1.21.0
Steps to reproduce
A colab notebook illustrating the error:
https://colab.research.google.com/gist/amardeep/63ec303ba8bac3db9849f4044cd19ff1/test-bigquery-array-parameter-bug.ipynb
Code example
This results in the output: