Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallel babel transpilation #114

Merged
merged 32 commits into from
Jun 28, 2017
Merged

Parallel babel transpilation #114

merged 32 commits into from
Jun 28, 2017

Conversation

mikrostew
Copy link
Contributor

This modifies the options.resolveModuleSource and options.plugins APIs to be able to run babel transpilation in parallel using a worker pool. This API looks like:

['plugin-or-resolver-name', '/path/to/file-to-require.js', { /* options */ }]

That file that is the 2nd item in the array is required in the worker process. That file must define a function build() that returns the plugin or resolver to use. The build() function will be passed the options object that is the 3rd item from that array. Like so:

// file-to-require.js

module.exports = {
  name: 'some-plugin',
  build: function(options) {
    return 'transform-strict-mode';
    // or
    // return function() { /* blah blah */ }
  }
}

For these operations to actually run in parallel requires broccolijs/broccoli-persistent-filter#111 to be committed, so that the transpilations can happen concurrently.

Copy link
Member

@stefanpenner stefanpenner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after reviewing this and the babel-persistent-filter pr, i feel this is the right direction.

Some open questions:

  • where do we do throttling?
  • how is the workflow \w errors (is the output mangled?)

module.exports = {
name: 'amd-name-resolver',

build: function(options) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build(options) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

module.exports = {
name: 'transform-es2015-block-scoping',

build: function(options) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

module.exports = {
name: 'transform-strict-mode',

build: function(options) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

index.js Outdated
@@ -25,16 +33,44 @@ function replaceExtensions(extensionsRegex, name) {
return name;
}

function pluginCanBeParallelized(plugin) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets add a unit test for each of these helper methods

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok - I moved this stuff to a separate file as well to keep index.js a little cleaner

index.js Outdated
return transpiler.transform(string, options);
if (transformIsParallelizable(options)) {
// send the job to the worker pool
return new Promise(function(resolve, reject) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we extract this to its own file, so we can test this branch in isolation? Parallel/worker stuff is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I will extract this to a separate file

index.js Outdated
return transpiler.transform(string, options);
if (transformIsParallelizable(options)) {
// send the job to the worker pool
return new Promise(function(resolve, reject) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this promise wrapper is not required.

return pool.exec('transform', [string, options]).catch(err) {
   if (err.name === 'Error' && (err.message === 'Worker terminated unexpectedly' ||
                                        err.message === 'Worker is terminated')) {
             // retry one time if it's a worker error
    resolve(pool.exec('transform', [string, options]));
  } else {
    throw err;
  }
});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call - simplifying that

index.js Outdated
resolve(result);
},
function onRejected(err) {
if (err.name === 'Error' && (err.message === 'Worker terminated unexpectedly' ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

err may be undefined or null or a number you should likely add a guard.

typeof err === 'object' && err !== null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch



module.exports = {
pluginCanBeParallelized: function(plugin) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

concise function pluginCanBeParallelized() {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

module.exports = {
pluginCanBeParallelized: function(plugin) {
return typeof plugin === 'string' ||
(Object.prototype.toString.call(plugin) === '[object Array]' &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array.isArray(plugin)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome, I knew there had to be a better way to check that


resolveModuleIsParallelizable: function(resolveModule) {
return resolveModule === undefined ||
(Object.prototype.toString.call(resolveModule) === '[object Array]' &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array.isArray(resolveModule)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok


if (options.resolveModuleSource !== undefined) {
// convert to work with Babel, if needed
if (Object.prototype.toString.call(options.resolveModuleSource) === '[object Array]') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array.isArray(options.resolveModuleSource)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

transformOptions: function(options) {
var newOptions = options;

if (options.resolveModuleSource !== undefined) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nested if seems redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

if (options.plugins !== undefined) {
// convert plugins to work with Babel, if needed
newOptions.plugins = options.plugins.map(function(plugin) {
if (Object.prototype.toString.call(plugin) === '[object Array]') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array.isArray(plugin)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

if (typeof err === 'object' && err !== null &&
(err.message === 'Worker terminated unexpectedly' || err.message === 'Worker is terminated')) {
// retry one time if it's a worker error
return pool.exec('transform', [string, options]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe we should retry (this will just slow things down). The work we doing isn't reliant on external resources that may be a partial failure, we should just assume one attempt === fatal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, and this makes things simpler

});
}
else {
return Promise.resolve(transpiler.transform(string, this.transformOptions(options)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new Promise(resolve => resolve(transpiler.transform(string, this.transformOptions(options));

Otherwise if transpiler.transform throws synchronously, we will break the transformString signature by also throwing synchronously.

In general, if you function returns a promise it should always return a promise. Otherwise every caller must wrap in try/catch or promise constructor etc. So by wrapping at the "leaves" of a program, when we invoke external code that is not promise aware, we hide the rest of our program from these inconsistencies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point

@stefanpenner
Copy link
Member

stefanpenner commented Jun 5, 2017

based on our other discussion broccolijs/broccoli-persistent-filter#111 (comment) it seems like this PR will require throttling.

I helped @2cflor with this for broccoli-eyeglass and created a wrapper around async.queue that "just does the right thing" https://github.com/stefanpenner/async-promise-queue and will most likely also help you.

mikrostew added 5 commits June 5, 2017 18:10
due to concurrency changes in babel-persistent-filter, the
order of file processing and errors is non-deterministic

also, fix pluralization in error message :)
concise methods
Array.isArray()
remove redundant nested if
consistently return Promise
@mikrostew
Copy link
Contributor Author

I added throttling/concurrency using async-promise-queue to the babel-persistent-filter PR with this commit.

There is already some throttling with this PR, due to how the workerpool module works. Each worker in the pool only runs one job at a time, and the pool is limited to a max number of workers.

}

function transformString(string, options) {
if (transformIsParallelizable(options)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should debug log which branch we took.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, adding debug logging like in broccoli-persistent-filter

lib/worker.js Outdated
// transpile the input string, using the input options
function transform(string, options) {
return new Promise(function(resolve) {
var result = transpiler.transform(string, ParallelApi.transformOptions(options));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make more sense to name this ParallelApi.deserializeOptions ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that makes sense, to match the other renaming


function transformString(string, options) {
if (transformIsParallelizable(options)) {
return pool.exec('transform', [string, objectifyCallbacks(options)]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make more sense to name objectifyCallbacks as serializeOptions ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I think that's clearer

@stefanpenner
Copy link
Member

can we also document the JOBS env var

@mikrostew
Copy link
Contributor Author

To test a against a decent sized project, I linked this into https://github.com/TryGhost/Ghost (which is using babel 6) and ran it there with no problems, same output as before.

@stefanpenner stefanpenner merged commit a3a5a23 into babel:master Jun 28, 2017
@stefanpenner
Copy link
Member

released as v6.1.0 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants