This repository has been archived by the owner on Jan 26, 2022. It is now read-only.

Hack pipe operator for JavaScript

ECMAScript Proposal. J. S. Choi, 2021.

This document has moved to tc39/proposal-pipeline-operator, after advancing to TC39 Stage 2.
The remainder of this document is out of date and for archival purposes only.

This explainer was adapted from an essay by Tab Atkins with permission.

(This document presumptively uses ^ as the placeholder token for the topic reference. This choice of token is not a final decision; ^ could instead be %, or many other tokens.)

Why a pipe operator

In the State of JS 2020 survey, the fourth top answer to “What do you feel is currently missing from JavaScript?” was a pipe operator. Why?

When we perform consecutive operations (e.g., function calls) on a value in JavaScript, there are currently two fundamental styles:

  • passing the value as an argument to the operation (nesting the operations if there are multiple operations),
  • or calling the function as a method on the value (chaining more method calls if there are multiple methods).

That is, three(two(one(value))) versus value.one().two().three(). However, these styles differ much in readability, fluency, and applicability.

Deep nesting is hard to read

The first style, nesting, is generally applicable – it works for any sequence of operations: function calls, arithmetic, array/object literals, await and yield, etc.

However, nesting is difficult to read when it becomes deep: the flow of execution moves right to left, rather than the left-to-right reading of normal code. If there are multiple arguments at some levels, reading even bounces back and forth: our eyes must jump left to find a function name, and then they must jump right to find additional arguments. Additionally, editing the code afterwards can be fraught: we must find the correct place to insert new arguments among many nested parentheses.

Real-world example

Consider this real-world code from React.

console.log(
  chalk.dim(
    `$ ${Object.keys(envars)
      .map(envar =>
        `${envar}=${envars[envar]}`)
      .join(' ')
    }`,
    'node',
    args.join(' ')));

This real-world code is made of deeply nested expressions. In order to read its flow of data, a human’s eyes must first:

  1. Find the initial data (the innermost expression, envars).

  2. And then scan back and forth repeatedly from inside out for each data transformation, each one either an easily missed prefix operator on the left or a suffix operator on the right:

    1. Object.keys() (left side),
    2. .map() (right side),
    3. .join() (right side),
    4. A template literal (both sides),
    5. chalk.dim() (left side), then
    6. console.log() (left side).

As a result of deeply nesting many expressions (some of which use prefix operators, some of which use postfix operators, and some of which use circumfix operators), we must check both left and right sides to find the head of each expression.

Method chaining is limited

The second style, method chaining, is only usable if the value has the functions designated as methods for its class. This limits its applicability. But when it applies, thanks to its postfix structure, it is generally more usable and easier to read and write. Code execution flows left to right. Deeply nested expressions are untangled. All arguments for a function call are grouped with the function’s name. And editing the code later to insert or delete more method calls is trivial, since we would just have to put our cursor in one spot, then start typing or deleting one contiguous run of characters.

Indeed, the benefits of method chaining are so attractive that some popular libraries contort their code structure specifically to allow more method chaining. The most prominent example is jQuery, which still remains the most popular JS library in the world. jQuery’s core design is a single über-object with dozens of methods on it, all of which return the same object type so that we can continue chaining. There is even a name for this style of programming: fluent interfaces.

Unfortunately, for all of its fluency, method chaining alone cannot accommodate JavaScript’s other syntaxes: function calls, arithmetic, array/object literals, await and yield, etc. In this way, method chaining remains limited in its applicability.

Pipe operators combine both worlds

The pipe operator attempts to marry the convenience and ease of method chaining with the wide applicability of expression nesting.

The general structure of all the pipe operators is value |> e1 |> e2 |> e3, where e1, e2, e3 are all expressions that take consecutive values as their parameters. The |> operator then does some degree of magic to “pipe” value from the lefthand side into the righthand side.
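The left-to-right flow of this structure can be approximated in today’s JavaScript. The following is only a sketch, assuming a hypothetical helper named pipeline and hypothetical functions one, two, and three; each arrow-function parameter plays the role that the placeholder would play in a real pipe.

```javascript
// Hypothetical helper: threads a value through the steps, left to right.
// Each step is an arrow function whose parameter stands in for the placeholder.
function pipeline(value, ...steps) {
  return steps.reduce((topic, step) => step(topic), value);
}

// Hypothetical operations, for illustration only.
const one = n => n + 1;
const two = n => n * 2;
const three = n => `result: ${n}`;

// Nested style: reads from the inside out.
const nested = three(two(one(5)));

// Pipeline style: reads left to right, like 5 |> e1 |> e2 |> e3.
const piped = pipeline(5, t => one(t), t => two(t), t => three(t));

console.log(piped); // "result: 12"
console.log(nested === piped); // true
```

A helper like this needs an arrow-function wrapper around every step, which is part of what a dedicated operator would remove.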

Real-world example, continued

Continuing this deeply nested real-world code from React:

console.log(
  chalk.dim(
    `$ ${Object.keys(envars)
      .map(envar =>
        `${envar}=${envars[envar]}`)
      .join(' ')
    }`,
    'node',
    args.join(' ')));

…we can untangle it as such using a pipe operator and a placeholder token (^) standing in for the previous operation’s value:

envars
|> Object.keys(^)
|> ^.map(envar =>
  `${envar}=${envars[envar]}`)
|> ^.join(' ')
|> `$ ${^}`
|> chalk.dim(^, 'node', args.join(' '))
|> console.log(^);

Now, the human reader can rapidly find the initial data (what had been the innermost expression, envars), then linearly read, from left to right, each transformation on the data.

Temporary variables are often tedious

One could argue that using temporary variables should be the only way to untangle deeply nested code. Explicitly naming every step’s variable causes something similar to method chaining to happen, with similar benefits to reading and writing code.

Real-world example, continued

For example, using our previous modified real-world example from React:

envars
|> Object.keys(^)
|> ^.map(envar =>
  `${envar}=${envars[envar]}`)
|> ^.join(' ')
|> `$ ${^}`
|> chalk.dim(^, 'node', args.join(' '))
|> console.log(^);

…a version using temporary variables would look like this:

const envarKeys = Object.keys(envars);
const envarPairs = envarKeys.map(envar =>
  `${envar}=${envars[envar]}`);
const envarString = envarPairs.join(' ');
const consoleText = `$ ${envarString}`;
const coloredConsoleText = chalk.dim(consoleText, 'node', args.join(' '));
console.log(coloredConsoleText);

But there are reasons why we encounter deeply nested expressions in each other’s code all the time in the real world, rather than lines of temporary variables. And there are reasons why the method-chain-based fluent interfaces of jQuery, Mocha, and so on are still popular.

It is often simply too tedious and wordy to write code with a long sequence of temporary, single-use variables. It is arguably even tedious and visually noisy for a human to read, too.

If naming is one of the most difficult tasks in programming, then programmers will inevitably avoid naming variables when they perceive their benefit to be relatively small.

Why the Hack pipe operator

There are two competing proposals for the pipe operator: Hack pipes and F# pipes. (There was a third proposal for a “smart mix” of the first two proposals, but it has been withdrawn, since its syntax is strictly a superset of one of the proposals’.)

The two pipe proposals differ only slightly in what the “magic” is, that is, in how we spell our code when using |>.

Both proposals reuse existing language concepts: Hack pipes are based on the concept of the expression, while F# pipes are based on the concept of the unary function.

Piping expressions and piping unary functions correspondingly have small and nearly symmetrical trade-offs.

This proposal: Hack pipes

In the Hack language’s pipe syntax, the righthand side of the pipe is an expression containing a special placeholder, which is evaluated with the placeholder bound to the lefthand side’s value. That is, we write value |> one(^) |> two(^) |> three(^) to pipe value through the three functions.

Pro: The righthand side can be any expression, and the placeholder can go anywhere any normal variable identifier could go, so we can pipe to any code we want without any special rules:

  • value |> foo(^) for unary function calls,
  • value |> foo(1, ^) for n-ary function calls,
  • value |> ^.foo() for method calls,
  • value |> ^ + 1 for arithmetic,
  • value |> [^, 0] for array literals,
  • value |> {foo: ^} for object literals,
  • value |> `${^}` for template literals,
  • value |> new Foo(^) for constructing objects,
  • value |> await ^ for awaiting promises,
  • value |> (yield ^) for yielding generator values,
  • value |> import(^) for calling function-like keywords,
  • etc.
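As a rough illustration of the list above, each pipe form can be read as the plain expression obtained by putting the lefthand value where the placeholder sits. The sketch below is runnable in current JavaScript; foo, Foo, and value are hypothetical stand-ins, the method toFixed is substituted for the hypothetical .foo(), and the pipe forms appear only in comments.

```javascript
// Hypothetical stand-ins for the names used in the list.
const foo = (...args) => args;
class Foo {
  constructor(v) {
    this.v = v;
  }
}
const value = 3;

// Each entry: the plain expression, with the pipe form it corresponds to in a comment.
const results = {
  unary: foo(value),           // value |> foo(^)
  nary: foo(1, value),         // value |> foo(1, ^)
  method: value.toFixed(1),    // value |> ^.toFixed(1)
  arithmetic: value + 1,       // value |> ^ + 1
  array: [value, 0],           // value |> [^, 0]
  object: { foo: value },      // value |> {foo: ^}
  template: `${value}`,        // value |> `${^}`
  constructed: new Foo(value), // value |> new Foo(^)
};

console.log(results.arithmetic); // 4
console.log(results.method); // "3.0"
```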

Con: Piping through unary functions is slightly more verbose with Hack pipes than with F# pipes. This includes unary functions that were created by function-currying libraries like Ramda, as well as unary arrow functions that perform complex destructuring on their arguments: Hack pipes would be slightly more verbose with an explicit function call suffix (^).

Alternative proposal: F# pipes

In the F# language’s pipe syntax, the righthand side of the pipe is an expression that must evaluate into a unary function, which is then tacitly called with the lefthand side’s value as its sole argument. That is, we write value |> one |> two |> three to pipe value through the three functions. left |> right becomes right(left). This is called tacit programming or point-free style.
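To make the difference concrete, here is a small sketch in current JavaScript of what each style would mean for a curried function. The functions add and double are hypothetical and hand-written (standing in for what a currying library might produce); the pipe forms appear only in comments.

```javascript
// Hand-written curried function, a stand-in for a library-produced one.
const add = a => b => a + b;
const double = n => n * 2;

// F#-style: 5 |> add(1) |> double
// Each righthand side evaluates to a unary function, which is then called tacitly.
const fsharpStyle = double(add(1)(5));

// Hack-style: 5 |> add(1)(^) |> double(^)
// Each righthand side is an ordinary expression with the call written out.
const topicAfterAdd = add(1)(5);
const hackStyle = double(topicAfterAdd);

console.log(fsharpStyle); // 12
console.log(hackStyle); // 12
```

Both spellings compute the same result; they differ only in whether the final call is implicit or explicit.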

Real-world example, continued

For example, using our previous modified real-world example from React:

envars
|> Object.keys(^)
|> ^.map(envar =>
  `${envar}=${envars[envar]}`)
|> ^.join(' ')
|> `$ ${^}`
|> chalk.dim(^, 'node', args.join(' '))
|> console.log(^);

…a version using F# pipes instead of Hack pipes would look like this:

envars
|> Object.keys
|> x=> x.map(envar =>
  `${envar}=${envars[envar]}`)
|> x=> x.join(' ')
|> x=> `$ ${x}`
|> x=> chalk.dim(x, 'node', args.join(' '))
|> console.log;

Pro: The restriction that the righthand side must resolve to a unary function lets us write very terse pipes when the operation we want to perform is a unary function call:

  • value |> foo for unary function calls.

This includes unary functions that were created by function-currying libraries like Ramda, as well as unary arrow functions that perform complex destructuring on their arguments: F# pipes would be slightly less verbose with an implicit function call (no (^)).

Con: The restriction means that any operations that are performed by other syntax must be made slightly more verbose by wrapping the operation in a unary arrow function:

  • value |> x=> x.foo() for method calls,
  • value |> x=> x + 1 for arithmetic,
  • value |> x=> [x, 0] for array literals,
  • value |> x=> ({foo: x}) for object literals,
  • value |> x=> `${x}` for template literals,
  • value |> x=> new Foo(x) for constructing objects,
  • value |> x=> import(x) for calling function-like keywords,
  • etc.

Even calling named functions requires wrapping when we need to pass more than one argument:

  • value |> x=> foo(1, x) for n-ary function calls.
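One detail of the arrow-function wrapper is easy to miss with object literals: an arrow function whose body starts with a brace is parsed as a block, not as an object literal, so the literal must be parenthesized. A minimal demonstration in current JavaScript (the names withoutParens and withParens are illustrative only):

```javascript
// The braces form a block body; `foo:` is a label and nothing is returned.
const withoutParens = x => { foo: x };

// The parentheses make the braces an object literal.
const withParens = x => ({ foo: x });

console.log(withoutParens(1)); // undefined
console.log(withParens(1).foo); // 1
```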

Con: The await and yield operations are scoped to their containing function, and thus cannot be handled by unary functions alone. If we want to integrate them into a pipe expression, await and yield must be handled as special syntax cases:

  • value |> await for awaiting promises, and
  • value |> yield for yielding generator values.
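The scoping issue can be observed in current JavaScript. The sketch below assumes a hypothetical helper fsPipe that emulates unary-function piping; it shows that wrapping await in an arrow function only awaits inside that inner function, so the next step still receives a promise.

```javascript
// Hypothetical helper emulating F#-style pipes: every step is a unary function.
const fsPipe = (value, ...fns) => fns.reduce((acc, fn) => fn(acc), value);

// The await runs inside the inner async arrow function, which itself returns a promise.
// The following step therefore receives a promise, not the resolved number.
const nextStepGotPromise = fsPipe(
  Promise.resolve(1),
  async x => await x,
  x => x instanceof Promise,
);
console.log(nextStepGotPromise); // true

// Awaiting in the enclosing function hands the resolved value to the next step.
async function enclosing() {
  const resolved = await Promise.resolve(1);
  return fsPipe(resolved, x => x + 1);
}
enclosing().then(result => console.log(result)); // 2
```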

Hack pipes favor more common expressions

Both Hack pipes and F# pipes impose a small syntax tax, each on different expressions:

  • Hack pipes slightly tax only unary function calls, and
  • F# pipes slightly tax all expressions except unary function calls.

In both proposals, the syntax tax per taxed expression is small (both (^) and x=> are only three characters). However, the tax is multiplied by the prevalence of its respectively taxed expressions. It therefore might make sense to impose a tax on whichever expressions are less common and to optimize in favor of whichever expressions are more common.

Unary function calls are in general less common than all other expressions combined. In particular, method calling and n-ary function calling will always be popular; in general frequency, unary function calling is equal to or exceeded by those two cases alone – let alone by other ubiquitous syntaxes such as array literals, object literals, and arithmetic operations. This explainer contains several real-world examples of this difference in prevalence.

Furthermore, several other proposed new syntaxes, such as extension calling, do expressions, and record/tuple literals, will also likely become pervasive in the future. Likewise, arithmetic operations would also become even more common if TC39 standardizes operator overloading. Untangling these future syntaxes’ expressions would be more fluent with Hack pipes compared to F# pipes.

Hack pipes might be simpler to use

The syntax tax of Hack pipes on unary function calls (i.e., the (^) to invoke the righthand side’s unary function) is not a special case: it simply is explicitly writing ordinary code, in the way we normally would without a pipe.

On the other hand, F# pipes require us to distinguish between “code that resolves to a unary function” versus “any other expression” – and to remember to add the arrow-function wrapper around the latter case.

For example, with Hack pipes, value |> someFunction + 1 is invalid syntax and will fail early. There is no need to recognize that someFunction + 1 will not evaluate into a unary function. But with F# pipes, value |> someFunction + 1 is still valid syntax – it’ll just fail late at runtime, because someFunction + 1 isn’t callable.
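The late failure can be reproduced in current JavaScript by spelling out what F#-style semantics would do with that body. This is only a sketch; someFunction and value are hypothetical.

```javascript
const someFunction = n => n * 2;
const value = 5;

// Under F#-style semantics, value |> someFunction + 1 would mean (someFunction + 1)(value).
// Adding 1 to a function converts the function to its source text and appends "1".
const body = someFunction + 1;
console.log(typeof body); // "string"

let failure;
try {
  body(value); // a string is not callable, so this throws at runtime
} catch (error) {
  failure = error;
}
console.log(failure instanceof TypeError); // true
```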

Description

(A formal draft specification is available.)

The topic reference ^ is a nullary operator. It acts as a placeholder for a topic value, and it is lexically scoped and immutable.

^ is not a final choice

(The precise token for the topic reference is not final. ^ could instead be %, or many other tokens. We plan to bikeshed what actual token to use later, if TC39 advances this proposal. However, ^ seems to be the least syntactically problematic. The alternative % would also resemble the placeholders of printf format strings and Clojure’s #(%) function literals.)

The pipe operator |> is an infix operator that forms a pipe expression (also called a pipeline). It evaluates its lefthand side (the pipe head or pipe input), immutably binds the resulting value (the topic value) to the topic reference, then evaluates its righthand side (the pipe body) with that binding. The resulting value of the righthand side becomes the whole pipe expression’s final value (the pipe output).

The pipe operator’s precedence is the same as:

  • the function arrow =>;
  • the assignment operators =, +=, etc.;
  • the generator operators yield and yield*.

It is tighter than only the comma operator ,.
It is looser than all other operators.

For example, v => v |> ^ == null |> foo(^, 0)
would group into v => (v |> (^ == null) |> foo(^, 0)),
which in turn is equivalent to v => foo(v == null, 0).
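The grouping above can be checked step by step in current JavaScript. In this sketch, foo is a hypothetical stand-in, and the names topic1 and topic2 are illustrative labels for the successive topic values.

```javascript
// Hypothetical stand-in for foo.
const foo = (flag, n) => [flag, n];

// Step-by-step reading of v => (v |> (^ == null) |> foo(^, 0)):
const grouped = v => {
  const topic1 = v;              // head of the first pipe
  const topic2 = topic1 == null; // body of the first pipe, head of the second
  return foo(topic2, 0);         // body of the second pipe
};

// The direct form given in the text.
const direct = v => foo(v == null, 0);

console.log(JSON.stringify(grouped(undefined))); // [true,0]
console.log(JSON.stringify(direct(42))); // [false,0]
```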

A pipe body must use its topic value at least once. For example, value |> foo + 1 is invalid syntax, because its body does not contain a topic reference. This design is because omission of the topic reference from a pipe expression’s body is almost certainly an accidental programmer error.

Likewise, a topic reference must be contained in a pipe body. Using a topic reference outside of a pipe body is also invalid syntax.

To prevent confusing grouping, it is invalid syntax to use other operators that have the same precedence (the arrow =>, the ternary conditional operator ? :, the assignment operators, and the yield operator) as a pipe head or body. When using these operators with a pipe, we must use parentheses to explicitly indicate which grouping is correct. For example, a |> b ? ^ : c |> ^.d is invalid syntax; it should be corrected to either a |> (b ? ^ : c) |> ^.d or a |> (b ? ^ : (c |> ^.d)).

Lastly, topic references inside dynamically compiled code (e.g., with eval or new Function) cannot use topic bindings from outside of that code. For example, v |> eval('^ + 1') will throw a syntax error when the eval expression is evaluated at runtime.

There are no other special rules.

A natural result of these rules is that, if we need to interpose a side effect in the middle of a chain of pipe expressions, without modifying the data being piped through, then we could use a comma expression, such as with value |> (sideEffect(), ^). As usual, the comma expression will evaluate to its righthand side ^, essentially passing through the topic value without modifying it. This is especially useful for quick debugging: value |> (console.log(^), ^).
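The pass-through behavior relies only on the ordinary comma operator, so it can be tried in current JavaScript. In this sketch, sideEffect and the log array are hypothetical, and the pipe form appears only in a comment.

```javascript
const log = [];
const sideEffect = topic => { log.push(topic); }; // hypothetical side effect

// value |> ^ * 2 |> (sideEffect(^), ^) |> ^ + 1, written as plain steps:
const value = 10;
const doubled = value * 2;
// The comma expression runs the side effect, then evaluates to its righthand operand.
const passedThrough = (sideEffect(doubled), doubled);
const result = passedThrough + 1;

console.log(log.length); // 1
console.log(result); // 21
```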

Real-world examples

The only changes to the original examples were dedentation and removal of comments.

From jquery/build/tasks/sourceMap.js:

// Status quo
var minLoc = Object.keys( grunt.config( "uglify.all.files" ) )[ 0 ];

// With pipes
var minLoc = grunt.config('uglify.all.files') |> Object.keys(^)[0];

From node/deps/npm/lib/unpublish.js:

// Status quo
const json = await npmFetch.json(npa(pkgs[0]).escapedName, opts);

// With pipes
const json = pkgs[0] |> npa(^).escapedName |> await npmFetch.json(^, opts);

From underscore.js:

// Status quo
return filter(obj, negate(cb(predicate)), context);

// With pipes
return cb(predicate) |> negate(^) |> filter(obj, ^, context);

From ramda.js:

// Status quo
return xf['@@transducer/result'](obj[methodName](bind(xf['@@transducer/step'], xf), acc));

// With pipes
return xf
|> bind(^['@@transducer/step'], ^)
|> obj[methodName](^, acc)
|> xf['@@transducer/result'](^);

From ramda.js:

// Status quo
try {
  return tryer.apply(this, arguments);
} catch (e) {
  return catcher.apply(this, _concat([e], arguments));
}

// With pipes: Note the visual parallelism between the two clauses.
try {
  return arguments
  |> tryer.apply(this, ^);
} catch (e) {
  return arguments
  |> _concat([e], ^)
  |> catcher.apply(this, ^);
}

From express/lib/response.js:

// Status quo
return this.set('Link', link + Object.keys(links).map(function(rel){
  return '<' + links[rel] + '>; rel="' + rel + '"';
}).join(', '));

// With pipes
return links
|> Object.keys(^).map(function (rel) {
  return '<' + links[rel] + '>; rel="' + rel + '"';
})
|> link + ^.join(', ')
|> this.set('Link', ^);

From react/scripts/jest/jest-cli.js:

// Status quo
console.log(
  chalk.dim(
    `$ ${Object.keys(envars)
      .map(envar => `${envar}=${envars[envar]}`)
      .join(' ')}`,
    'node',
    args.join(' ')
  )
);

// With pipes
envars
|> Object.keys(^)
|> ^.map(envar => `${envar}=${envars[envar]}`)
|> ^.join(' ')
|> `$ ${^}`
|> chalk.dim(^, 'node', args.join(' '))
|> console.log(^);

From ramda.js:

// Status quo
return _reduce(xf(typeof fn === 'function' ? _xwrap(fn) : fn), acc, list);

// With pipes
return fn
|> (typeof ^ === 'function' ? _xwrap(^) : ^)
|> xf(^)
|> _reduce(^, acc, list);

From jquery/src/core/init.js:

// Status quo
jQuery.merge( this, jQuery.parseHTML(
  match[ 1 ],
  context && context.nodeType ? context.ownerDocument || context : document,
  true
) );

// With pipes
context
|> (^ && ^.nodeType ? ^.ownerDocument || ^ : document)
|> jQuery.parseHTML(match[1], ^, true)
|> jQuery.merge(this, ^);

Possible future extensions

Hack-pipe functions

If Hack pipes are added to JavaScript, then they could also elegantly handle partial function application in the future with a syntax further inspired by Clojure’s #(%1 %2) function literals.

There is already a proposed special syntax for partial function application (PFA) with ? placeholders (abbreviated here as ?-PFA). Both ?-PFA and Hack pipes address a similar problem – binding values to placeholder tokens – but they address it in different ways.

With ?-PFA, ? placeholders are valid only directly within function-call expressions, and each consecutive ? placeholder in an expression refers to a different argument value. This is in contrast to Hack pipes, in which every ^ token in an expression refers to the same value. ?-PFA’s design integrates well with F# pipes, rather than Hack pipes, but this could be changed.

?-PFA with F# pipes        Hack pipes
x |> y=> y + 1             x |> ^ + 1
x |> f(?, 0)               x |> f(^, 0)
a.map(x=> x + 1)           a.map(x=> x + 1)
a.map(f(?, 0))             a.map(x=> f(x, 0))
a.map(x=> x + x)           a.map(x=> x + x)
a.map(x=> f(x, x))         a.map(x=> f(x, x))
a.sort((x,y)=> x - y)      a.sort((x,y)=> x - y)
a.sort(f(?, ?, 0))         a.sort((x,y)=> f(x, y, 0))

The PFA proposal could instead switch from ? placeholders to Hack-pipe topic references. It could do so by combining the Hack pipe |> with the arrow function => into a topic-function operator +>, which would use the same general rules as |>.

+> would be a prefix operator that creates a new function, which in turn binds its argument(s) to topic references. Non-unary functions would be created by including topic references with numbers (^0, ^1, ^2, etc.) or with ... (^...). ^0 (equivalent to plain ^) would be bound to the zeroth argument, ^1 would be bound to the next argument, and so on. ^... would be bound to an array of rest arguments. And just as with |>, +> would require its body to contain at least one topic reference in order to be syntactically valid.

?-PFA                      Hack pipe functions
a.map(x=> x + 1)           a.map(+> ^ + 1)
a.map(f(?, 0))             a.map(+> f(^, 0))
a.map(x=> x + x)           a.map(+> ^ + ^)
a.map(x=> f(x, x))         a.map(+> f(^, ^))
a.sort((x,y)=> x - y)      a.sort(+> ^0 - ^1)
a.sort(f(?, ?, 0))         a.sort(+> f(^0, ^1, 0))

Pipe functions would avoid the ?-PFA syntax’s garden-path problem. When we read the expression from left to right, the +> prefix operator makes it readily apparent that the expression is creating a new function from f, rather than calling f immediately. In contrast, ?-PFA would require us to check every function call for a ? placeholder in order to determine whether it is actually an immediate function call.

In addition, pipe functions wouldn’t help only partial function application. Their flexibility would allow for partial expression application, concisely creating functions from other kinds of expressions in ways that would not be possible with ?-PFA.

?-PFA                      Hack pipe functions
a.map(x=> x + 1)           a.map(+> ^ + 1)
a.map(x=> x + x)           a.map(+> ^ + ^)
a.sort((x,y)=> x - y)      a.sort(+> ^0 - ^1)

Hack-pipe syntax for if, catch, and for-of

Many if, catch, and for statements could become pithier if they gained “pipe syntax” that bound the topic reference.

if () |> would bind its condition value to ^,
catch |> would bind its caught error to ^,
and for (of) |> would consecutively bind each of its iterator’s values to ^.

Status quo                     Hack-pipe statement syntax
const c = f(); if (c) g(c);    if (f()) |> g(^);
catch (e) f(e);                catch |> f(^);
for (const v of f()) g(v);     for (f()) |> g(^);

Optional Hack pipes

A short-circuiting optional-pipe operator |?> could also be useful, much in the way ?. is useful for optional method calls.

For example, value |> (^ != null ? await foo(^) : ^) |> (^ != null ? ^ + 1 : ^)
would be equivalent to value |?> await foo(^) |?> ^ + 1.
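The short-circuiting behavior can be approximated in current JavaScript. The following is only a sketch, assuming a hypothetical synchronous helper named optionalPipeline; it does not model await, and its steps are ordinary unary functions.

```javascript
// Hypothetical helper: stops at the first null or undefined topic value.
function optionalPipeline(value, ...steps) {
  let topic = value;
  for (const step of steps) {
    if (topic == null) return topic; // short-circuit, much like ?. does
    topic = step(topic);
  }
  return topic;
}

const increment = t => t + 1;

console.log(optionalPipeline(1, increment, increment)); // 3
console.log(optionalPipeline(null, increment, increment)); // null
console.log(optionalPipeline(1, () => undefined, increment)); // undefined
```

As in the expanded ternary form, a null or undefined topic value is passed through unchanged instead of being handed to the remaining steps.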

Tacit unary function application

Tacit unary function application – that is, F# pipes – could still be added to the language with another pipe operator |>> – similarly to how Clojure has multiple pipe macros ->, ->>, and as->.

For example, value |> ^ + 1 |>> f |> g(^, 0)
would mean value |> ^ + 1 |> f(^) |> g(^, 0).

There was an informal proposal for such a split mix of two pipe operators, which was set aside in favor of single-operator proposals.