Add an option to treat empty input as null input #1628

Open
TylerRick opened this issue Mar 14, 2018 · 28 comments

@TylerRick

Currently, if you provide an empty input, jq appears to not run your filter program at all, which can be very puzzling (see #1497 and #1142, for example).

There are times when it would be more useful to treat the empty input as a null value, so that your filter program gets a chance to run and you can react to the lack of input and still produce an output, which currently isn't possible at all when the input is "empty".

Being able to produce an output value even when the input is empty means that you could even influence the exit code when using --exit-status (see #1497 (comment)) — also something that you otherwise couldn't do (currently always exits with 0 but is supposed to exit with 4 (#1497)).

I propose adding a new --empty-input-as-null option to complement the familiar existing input options like --slurp and --null-input/-n. In fact, it would work identically to --null-input when the input was empty.

jq usually treats its input as a stream of 0 or more JSON values (separated by whitespace), which is great when you do in fact have multiple values coming in. But oftentimes you'll have an input script that generates a single, well-defined JSON value, and it would be more useful and intuitive to treat jq's input as a single value as well in order to process that script's single output value.

This option basically gives you that: It allows users to treat the input as if it were a single JSON value, falling back to null if needed instead of skipping running your filter entirely and not giving you a chance to generate an output value.

Examples:

> command_that_may_produce_empty_input_or_json | jq --empty-input-as-null '.'
null

# same as: 
> jq -n '.'
null

This would be especially useful in combination with --exit-status.

> echo | jq --empty-input-as-null --exit-status '.'; echo $?
null
1

# same as: 
> jq -n --exit-status '.'; echo $?
null
1

jq <settings -e --empty-input-as-null ".some_setting == true" || do_something_else

Could use with // operator to provide your own fallback behavior/value if you want something other than null when there's empty input:

> echo | jq --empty-input-as-null --exit-status '. // true'; echo $?
true
0

# same as:
> jq -n --exit-status '. // true'; echo $?
true
0

@e1senh0rn

Just encountered this with an empty file. If the file is empty, the filter is not processed at all:

touch file.json
jq '(. // {}) * {"new_key": "value"}' file.json

Is there any workaround?

@nocive

nocive commented Apr 1, 2020

bump

@pkoppstein
Contributor

pkoppstein commented Apr 1, 2020

@e1senh0rn asked:

is there any workaround?

What is the alternative behavior you have in mind? Would the -s option be useful for you? Or are you asking for new file-handling functions? Please be specific.

@nocive

nocive commented Apr 11, 2020

@pkoppstein maybe the report wasn't the most accurate but it seems to me that it still refers to the original (and quite detailed) report here.
I've come across this counterintuitive behaviour myself while using jq to parse all sorts of output from very common tools like gcloud or aws cli, and the fact that you can't use -e reliably because of this issue is a real pain and requires additional code and workarounds to cope with.

@stewartadam

stewartadam commented Apr 27, 2020

you can't use -e reliably because of this issue is a real pain and requires additional code and workarounds to cope with.

Exactly this. jq is often used to parse output from APIs, and if the API returns no output (e.g. curl returned a 403 with no body due to a missing token, or a 503 due to a service error) then jq proceeds happily along and we're none the wiser.

I think that, by and large, the user's expected behavior for -e is "exit on any false|null output", not "exit on any false|null output only if non-empty input". In fact, the docs don't even mention the non-empty quirk, which makes it all the more confusing.

@nicpottier

Just here to say this bit me as well, specifically when using jq to parse the output of an aws command that started failing. I get the purity argument that an empty string is valid JSON, but from a practicality perspective this sure would be a useful option.

@kolyshkin

Let me rephrase the original issue; maybe it will make it more clear.

I have the following json input:

{
  "foo": true
}

and I want to use jq from a shell script, checking if foo == true and printing "fail" otherwise.

I read the docs and came up with the following code (for those unfamiliar with shell, a || b checks the exit code of a and runs b if it is non-zero):

# valid input, foo is true
$ echo '{ "foo": true }' | jq -e '.foo == true' >/dev/null || echo "fail"

# valid input, foo is false
$ echo '{ "foo": false }' | jq -e '.foo == true' >/dev/null || echo "fail"
fail

# invalid input
$ echo '{ "foo": hoot }' | jq -e '.foo == true' >/dev/null || echo "fail"
parse error: Invalid numeric literal at line 1, column 14
fail

But it does not work for the case of no input:

$ echo  | jq -e '.foo == true' >/dev/null || echo "fail"

Obviously, foo is not true, it's not even there, but my code does not print "fail". 👎

It seems that if -e is used, empty input should be treated like {}, otherwise we can't rely on exit code in case of empty input.

kolyshkin added a commit to kolyshkin/cri-o that referenced this issue Oct 9, 2020
Commits a2ec1d4 and 247d465 added a few checks
using jq in the following test cases:

 * ctr lifecycle
 * ctr execsync should not overwrite initial spec args
 * privileged ctr -- check for rw mounts

Alas, those checks do not work (and never worked); jq always succeeds.

This happened because

1. in `run cmd1 ... | cmd2` the part starting with the pipe
   character is not part of `run` statement;

2. `run` eats `cmd1 ...` output (into `$output` variable);

so `cmd2` is provided with empty input.

Now,

3. `jq` with empty input does not run any filters and thus succeeds
   (even with `-e`, see [1]).

The fix is to add a separate check that the output is not empty.

While at it, remove `run` where it's not needed from the other places
in those three tests we fix.

[1] jqlang/jq#1628

Signed-off-by: Kir Kolyshkin <[email protected]>
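
The "separate check that the output is not empty" described in the commit message can be sketched as a small shell helper placed in front of jq. This is only a minimal sketch assuming a POSIX shell; the name `require_input` is hypothetical and comes neither from jq nor from the cri-o commit.

```shell
# Hypothetical helper (not part of jq or cri-o): read all of stdin, fail if it
# is empty or whitespace-only, otherwise pass it through unchanged.
require_input() {
  data=$(cat)
  # Whitespace is stripped for the emptiness test only; the data itself is
  # forwarded as read (minus trailing newlines removed by $(...)).
  if [ -z "$(printf '%s' "$data" | tr -d '[:space:]')" ]; then
    echo "require_input: empty input" >&2
    return 1
  fi
  printf '%s\n' "$data"
}

# Usage sketch (cmd1 stands for any command that may print nothing):
#   out=$(cmd1 | require_input) \
#     && printf '%s\n' "$out" | jq -e '.foo == true' >/dev/null \
#     || echo "fail"
```

Capturing into a variable first is deliberate: in a plain `cmd1 | require_input | jq ...` pipeline the shell reports only the exit status of jq, unless an option such as bash's `pipefail` is set.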
@chrisregnier

I just came across the same problem. I'm trying to parse the results of a curl command, but when the server goes down, my curl command returns an empty string.

> curl -sS --fail localhost:3000/health | jq -re 'has("status") and .status != "RED" // false'
curl: (7) Failed to connect to localhost port 3000: Connection refused
> echo $?
0

In my jq instructions I'm explicitly trying to handle the bad state by always returning false, but when an empty string is passed through my error case isn't being run at all.

I'd love to see an option that forces the rules to be run with an empty string instead of bypassing them altogether.

@wader
Member

wader commented Mar 30, 2022

Maybe you can do something like this:

$ echo -n '{"status": "GREEN"}' | jq -esRr 'if . == "" then null else fromjson end | has("status") and .status != "RED" // false' ; echo $?
true
0
$ echo -n '{"status": "RED"}' | jq -esRr 'if . == "" then null else fromjson end | has("status") and .status != "RED" // false' ; echo $?
false
1
$ echo -n '' | jq -esRr 'if . == "" then null else fromjson end | has("status") and .status != "RED" // false' ; echo $?
false
1

@chrisregnier

That works perfectly, thank you! I didn't realize using the -sR options will cause the empty string to be passed through and then run the rules, since the rules aren't run in the other cases. So I think those two options along with the first rule you have solves this perfectly.

@wader
Member

wader commented Mar 30, 2022

👍 Yeap without -sR jq will run the filter on each input JSON it reads, ex:

$ echo '' | jq .
$ echo '123' | jq .
123
$ echo '123 123' | jq .
123
123

Which I think makes sense but might be a bit surprising.

@emanuele6
Member

emanuele6 commented Mar 30, 2022

you can also just use:

jq -n 'try input catch null, inputs | . # your code'

or, if you want to be better about dealing with parse errors for the first input:

jq -n 'try input catch if . != "break" then error else null end, inputs | . # your code'

@wader
Member

wader commented Mar 30, 2022

Yeap, maybe a bit clearer than using raw slurp :)

Can also do:

jq -n 'inputs // null | ...' 

@emanuele6
Member

emanuele6 commented Mar 30, 2022

Can also do:

 jq -n 'inputs // null | ...' 

RE: @wader

Nope, you can't do that. jq's // checks truthiness, not emptiness; it's more similar to lua/python/javascript's or/or/|| than to perl's //. Annoying, but that is just how it is sadly; it would simplify many things if it actually only worked for empty.

inputs | . // null will replace all your false inputs with null (the only non-truthy values in jq are: false, null and empty); (also, inputs//null seems a little buggy and removes anything non-truthy as if you were doing inputs | select(.))

$ jq -n 'true, {}, null, 0, false, 10' | jq -n 'try input catch if . != "break" then error else null end, inputs'
true
{}
null
0
false
10
$ jq -n 'true, {}, null, 0, false, 10' | jq -n 'inputs | . // null'
true
{}
null
0
null
10
$ jq -n 'true, {}, null, 0, false, 10' | jq -n 'inputs // null'
true
{}
0
10

@wader
Member

wader commented Mar 30, 2022

Ah yeah, good catch, I've been bitten by that before. Yes, I also wish jq had something like the // operator that only checks for emptiness. There is isempty, but I think it will be hard to use with inputs as it will consume the inputs.

What about:

jq -n '[inputs][0] | ...'

😄

@wader
Member

wader commented Mar 30, 2022

Two more:

jq -n 'reduce inputs as $i (null; $i) | ...' # null or last input
jq -n 'first(inputs, null) | ...'

Ok time to do something more useful maybe :)
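
To make the `first(inputs, null)` variant concrete, here is a small sketch wrapped in a shell function (the name `first_or_null` is hypothetical). It yields the first input, or null when the stream is empty, it keeps a `false` first input as-is, and it does not depend on the text of the error that `input` raises. Later inputs are not processed at all, so it only fits the "single value or nothing" case.

```shell
# Hypothetical wrapper: print the first JSON value on stdin, or null if there
# is none. first/1 stops after the first output, so null is only produced
# when inputs yields nothing.
first_or_null() {
  jq -cn 'first(inputs, null)'
}

printf ''        | first_or_null   # null
printf 'false 2' | first_or_null   # false (kept, unlike with `inputs // null`)
printf '{"a":1}' | first_or_null   # {"a":1}
```

Any real filter would go after a pipe inside the jq program, for example `first(inputs, null) | .foo == true`.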

@emanuele6
Member

emanuele6 commented Mar 30, 2022

RE: @wader

jq -n '[inputs][0] | ...'

I am guessing you forgot a .[1:][]:

jq -n '[ inputs ] | .[0], .[1:][]'

If the first thing you do is [ inputs ], you can just use -s:

jq -s '.[0], .[1:][]'

Yep, that's nice and short :D

I am not a big fan of it since I don't like unnecessary slurps which make jq read and load all inputs into memory before it can do anything, but that is a nice looking solution.

I think the ideal solution is:

jq -n 'try input catch if . != "break" then error else null end, inputs | . # your code'

run input/0 which reads one input and, unlike inputs/0, errors with "break" if there are no inputs to read; if you catch an error from input/0 and that error is "break", output null (otherwise reproduce the error); after reading the first input, just call inputs/0 that will read the rest of the inputs.

@wader
Member

wader commented Mar 30, 2022

I am guessing you forgot a .[1:][]:

jq -n '[ inputs ] | .[0], .[1:][]'

Ah yes i was thinking about the case when you only want the first input or null.

Yep, that's nice and short :D

Oh forgot about slurp, so i guess if you only care about first value it can be:

jq -s '.[0]'
# or
jq -s first

:)

I am not a big fan of it since I don't like unnecessary slurps which make jq read and load all inputs into memory before it can do anything, but that is a nice looking solution.

I think the ideal solution is:

jq -n 'try input catch if . != "break" then error else null end, inputs | . # your code'

input (unlike inputs) reads one input and errors with "break" if there are no inputs to read. If you catch an error and it's "break", return null (otherwise forward the error); after reading the first input, just call inputs and read the rest of the inputs, without checking anything, until there are no more or an error occurs.

Yes also like when jq uses generators over arrays to make things like that possible.

@emanuele6
Member

Yes also like when jq uses generators over arrays to make things like that possible.

input/0 is great! I wish it were possible to use it and inputs/0 to iterate any jq expression, not just inputs from the input files. Sometimes I think of splitting my jq scripts into two commands (jq '... | .[]' | jq -n '... | input | ...') just so that I can use it. (It's especially convenient when it's something like: jq '... | tostream' | jq -n '... | input | ...')


Here is a neat tool that I made that makes heavy usage of input/0: https://gist.github.com/emanuele6/b2f6055a5ac2cca4618f467d84f739fd

It "partitions" arrays read from stdin into multiple arrays of n elements (n is the argument you pass to the script):

$ ./partitioner.jq 2 <<< '[1,2,3] [4,5,6,7,8]'
[1,2]
[3,4]
[5,6]
[7,8]
$ ./partitioner.jq 5 <<< '[1,2,3] [4,5,6,7,8]'
[1,2,3,4,5]
[6,7,8]

@wader
Member

wader commented Mar 30, 2022

input/0 is great! I wish it were possible to use it and inputs/0 to iterate any jq expression, not just inputs from the input files. Sometimes I think of splitting my jq scripts into two commands (jq '... | .[]' | jq -n '... | input | ...') just so that I can use it. (It's especially convenient when it's something like: jq '... | tostream' | jq -n '... | input | ...')

Idea is to use the output of something later on in a filter pipeline? would a binding work instead? or maybe this would be more similar to coroutines #1342?

Here is a neat tool that I made that makes heavy usage of input/0: https://gist.github.com/emanuele6/b2f6055a5ac2cca4618f467d84f739fd

It "partitions" arrays read from stdin into multiple arrays of n elements (n is the argument you pass to the script):

$ ./partitioner.jq 2 <<< '[1,2,3] [4,5,6,7,8]'
[1,2]
[3,4]
[5,6]
[7,8]
$ ./partitioner.jq 5 <<< '[1,2,3] [4,5,6,7,8]'
[1,2,3,4,5]
[6,7,8]

Nice! will have a look. For fq i added a chunk($size) function to do something similar as i need it quite a lot, but it only works on one input array.

@pkoppstein
Contributor

pkoppstein commented Mar 30, 2022

In this post I would like to emphasize that to "branch" on whether
the input stream is empty or not, without losing the first item if
any, it is not necessary to use the -s command-line option or
[inputs], both of which may be undesirable if the input stream might
be very large.

The simplest, efficient, general-purpose way to distinguish between
an empty and a non-empty input stream is to use the template:

jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'

That is, you would replace "empty" by the program that you want to handle the case of an empty input
stream, and place the main program (say P), right after inputs, like so:

jq -n '
 def P: .;
 [try input catch infinite]
 | .[0]
 | if isinfinite then "empty" else ., inputs | P end'

Enjoy!

@pkoppstein
Contributor

Re: partitioner

Note that jq has an (undocumented but internally used) builtin, _nwise/1, that partitions an array input into arrays of up to the specified length. It can evidently be used with add to concatenate and then partition arrays. To process a stream, s, of arrays in this manner, one could write:

  # input and output are both streams of arrays
  def repartition(s; $n): [s] | add | _nwise($n);

@emanuele6
Member

emanuele6 commented Mar 31, 2022

RE: @pkoppstein

The simplest, efficient, general-purpose way to distinguish between
an empty and a non-empty input stream is to use the template:

jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'

That doesn't handle parse errors for the first input correctly, and I don't think I understand how it is more simple/efficient/general-purpose than the solution I mentioned:

jq -n 'try input catch if . != "break" then error else "empty" end, inputs'

Note that jq has an (undocumented but internally used) builtin, _nwise/1, that partitions an array input into arrays of up to the specified length.

I know _nwise/1 exists, but that does not do the same thing as my script; that has to slurp the arrays and can only output when it has finished reading all the inputs; mine doesn't have to slurp, it can process the array inputs as they come.

$ # i am typing [1,2,3,4,5] and [2] and then pressing ^D
$ jq -n 'def repartition(s; $n): [s] | add | _nwise($n); repartition(inputs; 2)'
[1,2,3,4,5]
[2]
^D
[1,2]
[3,4]
[5,2]
$ ./partitioner.jq 2
[1,2,3,4,5]
[1,2]
[3,4]
[2]
[5,2]
^D

RE: @wader

Idea is to use the output of something later on in a filter pipeline?

This is probably getting a little OT, but the idea is to be able to easily read an arbitrary number of values at each iteration without having to use reduce and only being able to output at the end, or having to use foreach and [[state],actual_value_to_output]:

example: split an array into multiple arrays at every null.

$ # with input/0
$ jq -n '[1,2,3,null,4,6,false,5,null,12,3,4,5,7,7,6][]' | jq -cn 'try repeat(1 | [ while(. != null; ([ input ]? // error)[]) ]) catch . | arrays[1:]'
[1,2,3]
[4,6,false,5]
[12,3,4,5,7,7,6]
$ # with foreach
$ # you lose values at the end if the array is not null terminated:
$ jq -cn '[1,2,3,null,4,6,false,5,null,12,3,4,5,7,7,6] | foreach .[] as $v ([ false, [] ]; if .[0] then [false,[]] else . end | if $v != null then .[1] += [ $v ] else .[0] = true end; select(.[0])[1] | arrays)'
[1,2,3]
[4,6,false,5]
$ jq -cn '[1,2,3,null,4,6,false,5,null,12,3,4,5,7,7,6,null] | foreach .[] as $v ([ false, [] ]; if .[0] then [false,[]] else . end | if $v != null then .[1] += [ $v ] else .[0] = true end; select(.[0])[1] | arrays)'
[1,2,3]
[4,6,false,5]
[12,3,4,5,7,7,6]
$ # with reduce; can only output at the end
$ jq -cn '[1,2,3,null,4,6,false,5,null,12,3,4,5,7,7,6] | reduce .[] as $v ([[]]; if $v == null then . + [ [] ] else .[-1] += [ $v ] end) | .[]'
[1,2,3]
[4,6,false,5]
[12,3,4,5,7,7,6]

This is a simple example, but, when you need to take an arbitrary number of values at each iteration from an array or stream (something that you often want to do when reconstructing a value from tostream/0 output or --stream) and you want to output whenever possible instead of having to wait until you have iterated the whole array or stream, an input/0 loop is usually more convenient to use compared to foreach .[] as $v/foreach inputs as $v or reduce .[] as $v/reduce inputs as $v.

CONS:

  • reduce:
    • slurps
    • must wait until the end of the stream before it can output
  • foreach:
    • can't output final incomplete value
  • input/0 loop:
    • may make the code look too procedural

@pkoppstein
Contributor

pkoppstein commented Apr 1, 2022

@emanuele6 - Please note that my two most recent posts above
were not addressed to you because they were not intended
as a comment on or critique of your contributions.

The first of these two posts (try input catch infinite) was motivated in part
by the fact that your response, while
addressing the OP's question, did not address the related question
about branching on whether the input stream is empty or not.
A case of apples and oranges if you like.

It's also a case of apples and oranges with respect to your
partitioner.jq script and my repartition function.
I did not mean to suggest that you did not know about _nwise or that
my very simple program was equivalent
to your much more elaborate one. I was just pointing out how the
functionality illustrated in your original "partitions" post could be achieved using _nwise.

By the way, for anyone interested in a simple jq program to
repartition in an incremental manner (in particular, without concatenating the
arrays), here's one solution [with correction since first posting]:

# s is assumed to be a stream of arrays
def repartition(s; $n):
  foreach (s,null) as $a (null;  # {emit, buffer}
     if $a == null then {emit: .buffer}
     elif $a == [] then .emit = null
     else .buffer += $a
     | (.buffer|length) as $len
     | if $len >= $n
       then ($len % $n) as $x
       | {emit: .buffer[: $len - $x], buffer: (if $x>0 then .buffer[-$x:] else null end)}
       else .emit = null
       end
     end;
     select(.emit).emit | _nwise($n) );

Example: repartition([1,2],[3,4,5],[6,7]; 2)
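
For anyone who wants to run that example, here is a sketch that feeds the definition above to jq from a shell. It assumes a jq version that provides the undocumented `_nwise/1` builtin, and the shell variable name `repartition_def` is just for illustration. In this copy the `[]` branch clears `.emit`, so that an empty array does not re-emit the previous chunk.

```shell
# The incremental repartition/2 from the comment above, kept in a shell
# variable so it can be prepended to any jq program.
repartition_def='
def repartition(s; $n):
  foreach (s,null) as $a (null;  # state: {emit, buffer}
     if $a == null then {emit: .buffer}
     elif $a == [] then .emit = null
     else .buffer += $a
     | (.buffer|length) as $len
     | if $len >= $n
       then ($len % $n) as $x
       | {emit: .buffer[: $len - $x], buffer: (if $x>0 then .buffer[-$x:] else null end)}
       else .emit = null
       end
     end;
     select(.emit).emit | _nwise($n) );
'

jq -cn "$repartition_def repartition([1,2],[3,4,5],[6,7]; 2)"
# [1,2]
# [3,4]
# [5,6]
# [7]
```

Replacing the stream argument with `inputs` makes it repartition arrays read from stdin as they arrive.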

@emanuele6
Member

emanuele6 commented Apr 1, 2022

RE: @pkoppstein

Please note that my two most recent posts above
were not addressed to you because they were not intended
as a comment on or critique of your contributions

I was just trying to figure out how it was different from mine; it looked just like mine, but with the extra step of using infinite and with incorrect handling of parse errors for the first input.

The first of these two posts (try input catch infinite) was motivated in part
by the fact that your response, while
addressing the OP's question, did not address the related question
about branching on whether the input stream is empty or not.
A case of apples and oranges if you like.

jq -n 'try input catch if . != "break" then error else "empty" end, inputs'

If you want to branch on the case in which there are no inputs with mine, you just have to write code where I wrote "empty", and the parse errors for the first input were handled properly in mine, so I couldn't figure out why you did the extra steps. (I thought about it for quite a bit, that is why I asked.)

$ printf '\n' | jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
"empty"
$ printf '%s\n' '"hello"' '"hi"' | jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
"hello"
"hi"
$ printf '%s\n' '"hello"' 'hi' | jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
"hello"
jq: error (at <stdin>:2): Invalid numeric literal at line 3, column 0
$ printf '%s\n' 'hello' '"hi"' | jq -n 'try input catch if . != "break" then error else "empty" end, inputs'
jq: error (at <stdin>:1): Invalid numeric literal at line 2, column 0
$
$ printf '\n' | jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
"empty"
$ printf '%s\n' '"hello"' '"hi"' | jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
"hello"
"hi"
$ printf '%s\n' '"hello"' 'hi' | jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
"hello"
jq: error (at <stdin>:2): Invalid numeric literal at line 3, column 0
$ printf '%s\n' 'hello' '"hi"' | jq -n '[try input catch infinite] | .[0] | if isinfinite then "empty" else ., inputs end'
"empty"

By the way, for anyone interested in a simple jq program to
repartition in an incremental manner (in particular, without concatenating the
arrays), here's one solution:

# s is assumed to be a stream of arrays
def repartition(s; $n):
  foreach (s,null) as $a (null;  # {emit, buffer}
     if $a == null then {emit: .buffer}
     else .buffer += $a
     | if (.buffer|length) >= $n
       then {emit: .buffer[:$n], buffer: .buffer[$n:]}
       else .emit = null
       end
     end;
     select(.emit|length>0).emit );

That could be a possible solution, but it should be allowed to emit more than one value per iteration; otherwise it will output arrays long after they were input if it gets arrays larger than $n (that should be the usual case), and, more importantly, it can build up a huge buffer (if there are not enough arrays with length < $n to balance the larger ones) that will be output entirely at the end.

for example:

$ # i am entering [1,2,3,4,5,6,7], ["a","b","c","d","e"] and ^D
$ jq -cn '# s is assumed to be a stream of arrays
def repartition(s; $n):
  foreach (s,null) as $a (null;  # {emit, buffer}
     if $a == null then {emit: .buffer}
     else .buffer += $a
     | if (.buffer|length) >= $n
       then {emit: .buffer[:$n], buffer: .buffer[$n:]}
       else .emit = null
       end
     end;
     select(.emit|length>0).emit );
repartition(inputs; 3)'
 [1,2,3,4,5,6,7]
[1,2,3]
 ["a","b","c","d","e"]
[4,5,6]
^D
[7,"a","b","c","d","e"]

possible fix:

# s is assumed to be a stream of arrays
def repartition(s; $n):
  foreach (s,null) as $a ({emit: []};
    if $a == null then {emit: [ .buffer // empty ]}
    else .buffer += $a
    | if (.buffer|length) < $n then .emit = []
      else [ .buffer | _nwise($n) ]
      | if (.[-1]|length) == $n
        then {emit: .}
        else {emit: .[:-1], buffer: .[-1]}
        end
      end
    end;
    .emit[]);

@pkoppstein
Contributor

pkoppstein commented Apr 1, 2022

@emanuele6 - To see the difference, suppose we want to branch on whether the input stream is empty. If jq had a side-effect-free version of isempty/1, we would write something like:

   if sideffect_free_isempty(inputs) then X else E end  # not currently possible

Using the 'infinite' template, we have only to write:

jq -n '[try input catch infinite] | .[0] | if isinfinite then X else ., inputs | E end'

or, taking into account your concern:

jq -n '[try input catch if . == "break" then infinite else error end] | .[0] | if isinfinite then X else ., inputs | E end'

Using your approach, we would have:

jq -n 'try input catch if . != "break" then error else X end, inputs | E'

So the difference is now obvious: with your program, an empty input stream results in X|E rather than just X.
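
The same X-versus-E branching can also be sketched without relying on the text of the error raised by `input`, by collecting at most one input with `first(inputs)`. This is a different technique from either template above, shown only as an illustration: the function name `branch_on_empty` is hypothetical, `"no input"` stands in for X and `{value: .}` stands in for E.

```shell
# Hypothetical wrapper demonstrating the branch: X runs once for an empty
# stream, E runs once per input otherwise, and X is never piped into E.
branch_on_empty() {
  jq -cn '
    [first(inputs)]            # [] for an empty stream, else an array holding the first input
    | if length == 0
      then "no input"          # X
      else (.[0], inputs) | {value: .}   # E, applied to the first and all later inputs
      end'
}

printf ''    | branch_on_empty   # "no input"
printf '1 2' | branch_on_empty   # {"value":1} and then {"value":2}
```

Like the `try input catch ...` forms, this reads only one input before deciding, so nothing is slurped; unlike the plain `catch infinite` template, a parse error in the first input still surfaces as an error.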


I've fixed the incremental version of repartition. Thanks.

@emanuele6
Member

Oh, right. I didn't think of that for some reason! Thank you.
