- Fixed: `/a[/]/` is now parsed as a RegularExpressionLiteral again (regression from 8.0.3). Thanks to No Two (@noootwo) for reporting!
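For context, here is a small sketch in plain JavaScript (it does not use js-tokens) of why `/a[/]/` is a single regex literal: inside a character class, `/` does not terminate the literal. The variable names are only illustrative.

```javascript
// The slash inside [...] is part of the character class, so the literal
// only ends at the final slash.
const slashInClass = /a[/]/;

// The pattern matches an "a" followed by a literal slash.
const matchesSlash = slashInClass.test("a/");
const matchesOther = slashInClass.test("ab");
```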
- Added: Support for ES2023: `#!` hashbang comments at the start of files.
- Fixed: Extremely long string literals, template literals, regex literals, identifiers, comments, and runs of whitespace are now supported where possible. We’re talking about 10 million characters long. Previously, such long tokens could make the regex engine of the runtime give up and throw a `Maximum call stack size exceeded` or similar error. There are still a few even more extreme edge cases, which I don’t think can be solved, but they are at least documented.
- Fixed: The TypeScript type definitions shipped in the npm package are now more correct. Previously they used `export default`, but apparently `export =` is the correct syntax to use for packages that export a single function, which can be used both in CJS and MJS. (Read more about Incorrect default export). It should now be possible to do `const jsTokens = require("js-tokens")` in a `@ts-check`ed JS file without TypeScript complaining. Note: This requires `"esModuleInterop": true` in your tsconfig.json, but as far as I can tell that’s not a breaking change, since importing js-tokens with `"esModuleInterop": false` didn’t work at runtime anyway.
- Fixed: `/]/` is now parsed as a RegularExpressionLiteral. That’s invalid regex syntax, unless Annex B: Additional ECMAScript Features for Web Browsers is honored, which js-tokens does. Thanks to Jared Hughes (@jared-hughes) for reporting and fixing!
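The Annex B behaviour can be observed in plain JavaScript, independent of js-tokens. In this sketch, a lone `]` is accepted as a literal character in a regex without the `u` flag, while the stricter `u` mode rejects it. The variable names are only illustrative.

```javascript
// Without the u flag, Annex B lets a lone "]" match itself.
const loneBracket = /]/;
const matchesBracket = loneBracket.test("a]b");

// With the u flag the Annex B extensions are disabled, so the same
// pattern is a syntax error. The RegExp constructor is used here so the
// error is thrown at runtime instead of when the file is parsed.
let strictModeError = null;
try {
  new RegExp("]", "u");
} catch (error) {
  strictModeError = error;
}
```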
Support for ES2022!

- Added: Support for the `d` regex flag.
- Added: A new token type, `PrivateIdentifier`, for things like `#name`. `this.#name` now tokenizes differently:
  - Before: `IdentifierName: this`, `Punctuator: .`, `Invalid: #`, `IdentifierName: name`
  - After: `IdentifierName: this`, `Punctuator: .`, `PrivateIdentifier: #name`
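As a reminder of what the two ES2022 features look like in source code, here is a short plain-JavaScript sketch (no js-tokens involved). The `Person` class and variable names are made up for illustration, and an ES2022-capable runtime is assumed.

```javascript
// #name is a private field; `this.#name` is the syntax that is now
// tokenized with a PrivateIdentifier.
class Person {
  #name;
  constructor(name) {
    this.#name = name;
  }
  greet() {
    return `Hello, ${this.#name}!`;
  }
}
const greeting = new Person("Ada").greet();

// The d flag adds an `indices` array with [start, end) offsets per match.
const match = /b+/d.exec("abbbc");
const [start, end] = match.indices[0];
```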
- Added: Support for ES2021: The `||=`, `&&=` and `??=` operators, as well as underscores in numeric literals (`1_000`).
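For reference, the ES2021 syntax in question looks like this in plain JavaScript. This is only a sketch of the language features, with illustrative variable names.

```javascript
// Logical assignment operators only assign when the left side is
// falsy (||=), truthy (&&=) or nullish (??=).
let retries = 0;
retries ||= 5; // 0 is falsy, so 5 is assigned

let label = "draft";
label &&= "final"; // "draft" is truthy, so "final" is assigned

let timeout = null;
timeout ??= 3000; // null is nullish, so 3000 is assigned

// Underscores are purely visual separators in numeric literals.
const thousand = 1_000;
```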
- Changed: The main export of this module is no longer a regex (accompanied by a small helper function). Instead, the only export is a function that tokenizes JavaScript (which was the main use case of the regex). The tokenization is still powered by basically the same regex as before, but is now wrapped up in 300–400 lines of code. This is required to tokenize regex and templates correctly, and to support JSX (see below).
- Changed: You no longer need `.default` when using CommonJS: `const jsTokens = require("js-tokens")`. (`import jsTokens from "js-tokens"` also works in module environments.)
- Changed: Node.js 10 or later is now required (because Unicode property escapes are used).
- Changed: The tokens are now named like in the ECMAScript spec.
- Changed: Whitespace and line terminator sequences are now matched as separate tokens to match the spec.
- Added: TypeScript definition.
- Added: Support for JSX: `jsTokens("<p>Hello, world!</p>", { jsx: true })`.
- Added: Support for BigInt syntax: `5n`.
- Added: Support for `?.` and `??`.
- Added: Support for legacy octal and octal-like numeric literals.
- Improved: Template literals now support infinite nesting, and separate tokens are made for interpolations.
- Improved: Regex vs division detection. js-tokens now passes all but 3 of test262-parser-tests.
- Improved: Unclosed regular expressions are now matched as such, instead of as division.
- Improved: Invalid non-ASCII characters are no longer matched as identifier names.
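The BigInt, `?.` and `??` syntax mentioned in the list above looks like this in plain JavaScript. This is a sketch of the language features only, and the `config` object is a made-up example.

```javascript
// BigInt literals end with n and use arbitrary-precision integer math.
const big = 5n * 2n;

// ?. short-circuits to undefined when the left side is null or undefined,
// and ?? then supplies a fallback value.
const config = { server: null };
const port = config.server?.port ?? 8080;
```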
- Added: Support for ES2019. The only change is that `\u2028` and `\u2029` are now allowed unescaped inside string literals.
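The ES2019 change can be checked in plain JavaScript by evaluating a string literal that contains a raw U+2028 character. This sketch assumes an ES2019-capable runtime, and builds the source text programmatically so that the invisible character is explicit.

```javascript
// U+2028 (LINE SEPARATOR) as an actual character, not an escape sequence.
const LINE_SEPARATOR = String.fromCharCode(0x2028);

// Source text of a string literal with the raw character between quotes.
const literalSource = '"' + LINE_SEPARATOR + '"';

// Before ES2019 this source was a syntax error; now it evaluates to a
// one-character string.
const value = new Function("return " + literalSource)();
```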
- Added: Support for ES2018. The only change needed was recognizing the `s` regex flag.
- Changed: All tokens returned by the `matchToToken` function now have a `closed` property. It is set to `undefined` for the tokens where “closed” doesn’t make sense. This means that all token objects have the same shape, which might improve performance.
These are the breaking changes:

- `'/a/s'.match(jsTokens)` no longer returns `['/', 'a', '/', 's']`, but `['/a/s']`. (There are of course other variations of this.)
- Code that relies on some token objects not having the `closed` property could now behave differently.
- No code changes. Just updates to the readme.
- Fixed: ES2015 unicode escapes with more than 6 hex digits are now matched correctly.
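To illustrate what such an escape looks like: ES2015 code point escapes (`\u{...}`) may carry any number of leading zeros, so they can be longer than 6 hex digits while still denoting a valid code point. A plain-JavaScript sketch with illustrative names:

```javascript
// Ten hex digits, but the value is still just U+0041.
const paddedEscape = "\u{0000000041}";

// The same escape form is also allowed in identifiers: this declares `a`.
const \u{0000000061} = "identifier";
const viaPlainName = a;
```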
This release contains one breaking change, that should improve performance in V8:

> So how can you, as a JavaScript developer, ensure that your RegExps are fast? If you are not interested in hooking into RegExp internals, make sure that neither the RegExp instance, nor its prototype is modified in order to get the best performance:
>
>     var re = /./g;
>     re.exec(""); // Fast path.
>     re.new_property = "slow";

This module used to export a single regex, with `.matchToToken` bolted on, just like in the above example. This release changes the exports of the module to avoid this issue.
Before:
import jsTokens from "js-tokens";
// or:
var jsTokens = require("js-tokens");
var matchToToken = jsTokens.matchToToken;
After:
import jsTokens, { matchToToken } from "js-tokens";
// or:
var jsTokens = require("js-tokens").default;
var matchToToken = require("js-tokens").matchToToken;
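The idea behind the new export shape can be sketched as follows. Everything here is a hypothetical stand-in, not the real js-tokens source: `tokenRegex` is a toy pattern and `matchToToken` a simplified helper. The point is only that the regex object itself stays unmodified while the helper is exposed separately.

```javascript
// Hypothetical toy regex standing in for the real one. No properties are
// added to it, so it keeps its original shape.
const tokenRegex = /\d+|\s+|./g;

// Hypothetical helper, exported next to the regex instead of being
// attached to it as `tokenRegex.matchToToken`.
function matchToToken(match) {
  return { value: match[0] };
}

// Shape of the module exports: a default export plus a named export.
const moduleExports = { default: tokenRegex, matchToToken };

const firstToken = moduleExports.matchToToken(tokenRegex.exec("42 + 1"));
```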
- Added: Support for ES2016. In other words, support for the `**` exponentiation operator.
These are the breaking changes:

- `'**'.match(jsTokens)` no longer returns `['*', '*']`, but `['**']`.
- `'**='.match(jsTokens)` no longer returns `['*', '*=']`, but `['**=']`.
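The difference can be reproduced with two toy regexes. These are hypothetical and much smaller than the real js-tokens regex. The sketch shows the general technique of listing the longer operator before the shorter one in an alternation.

```javascript
// Toy "before" pattern: only knows * and *=.
const withoutExponent = /\*=?/g;

// Toy "after" pattern: ** and **= are tried first, then * and *=.
const withExponent = /\*\*=?|\*=?/g;

const before = "**=".match(withoutExponent);
const after = "**=".match(withExponent);
```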
- Improved: Made the regex ever so slightly smaller.
- Updated: The readme.
- Improved: Limited npm package contents for a smaller download. Thanks to @zertosh!
- Fixed: Declared an undeclared variable.
- Changed: Merged the 'operator' and 'punctuation' types into 'punctuator'. That type is now equivalent to the Punctuator token in the ECMAScript specification. (Backwards-incompatible change.)
- Fixed: A `-` followed by a number is now correctly matched as a punctuator followed by a number. It used to be matched as just a number, but there is no such thing as negative number literals. (Possibly backwards-incompatible change.)
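Why this matters is easiest to see on an expression like `a-1`. The two toy regexes below are hypothetical simplifications, not the real js-tokens regex. The first allows an optional sign as part of a number, the second treats `-` as its own token.

```javascript
// Toy "before" pattern: a number may start with a minus sign.
const signedNumbers = /-?\d+|[a-z]+|-/g;

// Toy "after" pattern: the minus sign is always a separate punctuator.
const unsignedNumbers = /\d+|[a-z]+|-/g;

// With signed numbers, the subtraction in "a-1" is lost.
const before = "a-1".match(signedNumbers);
const after = "a-1".match(unsignedNumbers);
```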
- Added: Support for the regex `u` flag.
- Improved: `jsTokens.matchToToken` performance.
- Added: Support for octal and binary number literals.
- Added: Support for template strings.
- Fixed: Support for unicode spaces. They used to be allowed in names (which is very confusing), and some unicode newlines were wrongly allowed in strings and regexes.
- Changed: The `jsTokens.names` array has been replaced with the `jsTokens.matchToToken` function. The capturing groups of `jsTokens` are no longer part of the public API; instead use said function. See this gist for an example. (Backwards-incompatible change.)
- Changed: The empty string is now considered an “invalid” token, instead of an “empty” token (its own group). (Backwards-incompatible change.)
- Removed: component support. (Backwards-incompatible change.)
- Changed: Match ES6 function arrows (`=>`) as an operator, instead of its own category (“functionArrow”), for simplicity. (Backwards-incompatible change.)
- Added: ES6 splats (`...`) are now matched as an operator (instead of three punctuations). (Backwards-incompatible change.)
- Initial release.