
Internals: jq Assignment Operators

Nico Williams edited this page Jul 11, 2023 · 10 revisions

The jq assignment operators =, //=, <op>= (e.g., +=, -=, etc.), and |= are very special. They're not like assignments in most languages: each is just another kind of jq expression that produces zero, one, or more values. Each value produced is the input to the assignment with the changes denoted by the right-hand side (RHS) applied at the paths named by the left-hand side (LHS). The input itself is never mutated.
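As a small illustration of that point, here is a shell sketch. It assumes a jq binary (1.6 or later) is on the PATH, and the shell variable names are made up for the example. Each program collects its outputs into an array so the number of outputs is visible.

```shell
# Assumes jq 1.6 or later; -n uses null as the input, -c prints compactly.

# The assignment outputs a modified copy; the original input is unchanged.
copy=$(jq -c -n '{"a": 1} | [(.a = 2), .]')
echo "$copy"    # [{"a":2},{"a":1}]

# One output per value produced by the RHS.
many=$(jq -c -n '[{"a": 1} | .a = (2, 3)]')
echo "$many"    # [{"a":2},{"a":3}]

# An RHS that produces no values makes the assignment produce no values.
none=$(jq -c -n '[{"a": 1} | .a = empty]')
echo "$none"    # []
```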

The LHS is the special part: it must be a path expression (TODO: add wiki page about path expressions), which is an expression consisting only of sub-expressions like .a, if/then/else with path expressions as the branches, and/or calls to functions whose bodies are path expressions.
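The kinds of path expression listed above can be tried from the shell. This is a sketch assuming jq 1.6 or later is installed; the function name f and the shell variable names are hypothetical, chosen only for the example.

```shell
# path(...) shows which paths an expression denotes.
paths=$(jq -c -n '{"a": {"b": 1}} | [path(.a.b)]')
echo "$paths"       # [["a","b"]]

# A function whose body is a path expression can be used as an LHS.
via_func=$(jq -c -n 'def f: .a.b; {"a": {"b": 1}} | f = 2')
echo "$via_func"    # {"a":{"b":2}}

# if/then/else whose branches are path expressions also works as an LHS.
via_if=$(jq -c -n '{"a": 1, "b": 2} | (if .a > 0 then .a else .b end) = 0')
echo "$via_if"      # {"a":0,"b":2}

# An LHS that is not a path expression (here, an arithmetic result) is
# rejected at run time, so jq exits with a non-zero status.
if jq -c -n '{"a": 1} | (.a + 1) = 2' >/dev/null 2>&1; then
  non_path=accepted
else
  non_path=rejected
fi
echo "$non_path"    # rejected
```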

The RHS is some expression which, in the case of |=, receives the current value at the LHS in ., while in the other cases the RHS receives . (the input to the whole assignment expression). The latter can be confusing: in .a += .b, the .b is looked up in the input to the whole expression, not inside .a.
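The difference in what . means on the RHS is easiest to see with an input where .b exists both at the top level and inside .a. This is a sketch assuming jq 1.6 or later; the input document and variable names are invented for the example.

```shell
input='{"a": {"b": 5}, "b": 10}'

# |= : the RHS sees the value at the LHS path, so .b means .a.b (5).
update=$(printf '%s' "$input" | jq -c '.a |= .b')
echo "$update"    # {"a":5,"b":10}

# = : the RHS sees the input to the whole assignment, so .b is the top-level 10.
assign=$(printf '%s' "$input" | jq -c '.a = .b')
echo "$assign"    # {"a":10,"b":10}

# += : same rule as =; the RHS .b is the top-level 10, added to .a.b.
arith=$(printf '%s' "$input" | jq -c '.a.b += .b')
echo "$arith"     # {"a":{"b":15},"b":10}
```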

Inspecting src/parser.y is instructive.

First we have //= and <op>=:

Exp "//=" Exp {
  $$ = gen_definedor_assign($1, $3);
} |

static block gen_definedor_assign(block object, block val) {
  block tmp = gen_op_var_fresh(STOREV, "tmp");
  return BLOCK(gen_op_simple(DUP),
               val, tmp,
               gen_call("_modify", BLOCK(gen_lambda(object),
                                         gen_lambda(gen_definedor(gen_noop(),
                                                                  gen_op_bound(LOADV, tmp))))));
}

Exp "+=" Exp {
  $$ = gen_update($1, $3, '+');
} |

static block gen_update(block object, block val, int optype) {
  block tmp = gen_op_var_fresh(STOREV, "tmp");
  return BLOCK(gen_op_simple(DUP),
               val,
               tmp,
               gen_call("_modify", BLOCK(gen_lambda(object),
                                         gen_lambda(gen_binop(gen_noop(),
                                                              gen_op_bound(LOADV, tmp),
                                                              optype)))));
}

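Having val before the gen_call("_modify", ...) is the reason that the RHS of //= (and of <op>=) receives the same . as the LHS, that is, the input to the whole assignment. It is also the reason that the RHS is evaluated every time (even when the value at the LHS is already neither null nor false, and before any path is updated), and the reason that the whole assignment is done once per value output by the RHS.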
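These consequences can be observed from the command line. The following is a sketch assuming jq 1.6 or later; the inputs and shell variable names are invented for the example.

```shell
# The assignment is done once per value output by the RHS.
definedor=$(jq -c -n '[{"a": null} | .a //= (1, 2)]')
echo "$definedor"    # [{"a":1},{"a":2}]
plus=$(jq -c -n '[{"a": 1} | .a += (10, 20)]')
echo "$plus"         # [{"a":11},{"a":21}]

# The RHS is evaluated against the original input, before any element is
# updated: .[0] is 1 for every element, even though element 0 becomes 2.
once=$(jq -c -n '[1, 2, 3] | .[] += .[0]')
echo "$once"         # [2,3,4]

# The RHS of //= is evaluated even when .a is already defined, so an RHS
# that raises an error makes the whole assignment fail...
if jq -c -n '{"a": 1} | .a //= error("boom")' >/dev/null 2>&1; then
  eager=no
else
  eager=yes
fi
echo "$eager"        # yes

# ...whereas plain // never evaluates its RHS here.
lazy=$(jq -c -n '{"a": 1} | .a // error("boom")')
echo "$lazy"         # 1
```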

Compare to |= which is coded like this:

Exp "|=" Exp {
  $$ = gen_call("_modify", BLOCK(gen_lambda($1), gen_lambda($3)));
} |

Ok, let's translate all of this to English:

  • First |=: gen_call("_modify", BLOCK(gen_lambda($1), gen_lambda($3))); means: "generate a call to _modify with the LHS ($1) as the first argument and the RHS ($3) as the second argument" (note that jq function arguments are lambdas, thus the gen_lambda()s).

  • Now gen_definedor_assign() and gen_update() (which are very similar):

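    • the DUP duplicates the input on the stack, so that one copy is consumed by val and the other is still there as the input to _modify; it can otherwise be ignored for this analysis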
    • val is the RHS, and we will invoke it immediately
    • store the val output(s) (RHS) in tmp (a gensym'ed $binding)
    • call _modify (the heart of modify-assign operators) with the LHS as the first argument and a second argument that amounts to . // $tmp for //=, or . <op> $tmp for <op>= (e.g., . + $tmp for +=), where $tmp is the gensym'ed binding mentioned above
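The steps above can be written out by hand in jq itself. The sketch below assumes jq 1.6 or later. The name $tmp is only illustrative: the binding generated by the compiler is gensym'ed and not visible to jq programs. Each pair compares an operator with its approximate hand-written expansion.

```shell
input='{"a": null, "b": 7, "n": 1}'

# .a //= .b  versus  "evaluate the RHS first, bind it, then update with . // $tmp"
sugar_definedor=$(printf '%s' "$input" | jq -c '.a //= .b')
plain_definedor=$(printf '%s' "$input" | jq -c '.b as $tmp | .a |= (. // $tmp)')
echo "$sugar_definedor"    # {"a":7,"b":7,"n":1}
echo "$plain_definedor"    # {"a":7,"b":7,"n":1}

# .n += .b  versus  the same shape with + in place of //
sugar_plus=$(printf '%s' "$input" | jq -c '.n += .b')
plain_plus=$(printf '%s' "$input" | jq -c '.b as $tmp | .n |= (. + $tmp)')
echo "$sugar_plus"         # {"a":null,"b":7,"n":8}
echo "$plain_plus"         # {"a":null,"b":7,"n":8}
```

Binding the RHS with as before the |= is what makes the RHS see the input to the whole assignment rather than the value at the LHS.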

The difference between //= and other op= assignments is that // is block-coded in gen_definedor() while the ops are builtins like _plus. // could have been jq-coded, but it's not.
