Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add diffing logic #18

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Add diffing logic #18

wants to merge 3 commits into from

Conversation

eduardoboucas
Copy link
Contributor

No description provided.

@eduardoboucas eduardoboucas requested a review from smnh July 30, 2020 17:15
Comment on lines +12 to 16
const fileWriters = {
'frontmatter-md': writeFrontmatterMarkdown,
json: writeJSON,
yml: writeYAML
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eduardoboucas you can use this battle-tested utility stringifyDataByFilePath to write all sort of files:
https://github.com/stackbithq/utils/blob/d1222b63ede765fb895668b641cb3f44c238d6ea/src/index.js#L373-L394
It takes into account several quirks that will prevent the app from crashing and store files safely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not go through the trouble of changing the implementation unless something needs fixing. The code you linked looks good, but we have slightly different requirements (e.g. we don't need to choose the writer based on the file path, that is provided to us by the user). Also we're using a different YAML package, which a different syntax.

So if you have any concerns about a specific part of the existing implementation, can you flag those specifically?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then just look at the YAML implementation in the link I've sent you and add noRefs attribute to make sure that if an object has circular references it won't be split. Also what about supporting TOML and markdown files with TOML and YAML frontmatters? I know Hugo and 11ty make use of these.

Comment on lines +234 to +256
const { __diff, ...currentDataForPlugin } = data;
const previousData = this.dataForPlugin[index] || initialData;
const diffs = Object.keys(currentDataForPlugin).reduce((diffs, dataBucketKey) => {
return {
...diffs,
[dataBucketKey]: generateDiff(previousData[dataBucketKey], currentDataForPlugin[dataBucketKey]) || []
};
}, {});

this.dataForPlugin[index] = currentDataForPlugin;

const transformedData = await plugin.transform({
data: {
...data,
__diff: diffs
},
debug: this.getDebugMethodForPlugin(pluginName),
getPluginContext: () => contextSnapshot[pluginName] || {},
log: this.logFromPlugin.bind(this),
options: this.parsePluginOptions(plugin, pluginBlock.options)
});

return transformedData;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the data is passed to plugin transform functions by reference and also stored by reference in dataForPlugin. This means that if one plugin mutates the data, it will be changed in dataForPlugin as well, and therefore affect the next diff iteration.

Example:

data = {
  objects: [{"foo": "bar"}]
}
=> currentDataForPlugin = {objects: [{"foo": "bar"}]} // same object
=> this.dataForPlugin[index] = {objects: [{"foo": "bar"}]} // same object
plugin.transform({data: {...data}})

Plugin code

transform({data, ...}) {
  data.objects.foo.bar = "changed";
}

This changes the value inside the data passed in transform but also the value inside this.dataForPlugin[index].

Next time transform runs, the previousData for that plugin will have the value changed and not the original value.

The data passed to transform function (or the data saved into dataForPlugin) should be deeply cloned. Or better yet, deeply freeze the data before passing it to plugin transform (or use immutable.js), requiring plugin developers to always return new objects by leveraging functional development and use of pure functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true. I did consider this when I was working on the implementation, but I worry that by freezing/cloning the data we'll be adding too much effort. We already clone the entire data object on every transform, now we're talking about doing it also for every plugin, on every transform.

I mean, we can do it if you feel that it's important, but it seems that we've been on the fence about the importance of the whole diff object concept, so I'm a bit weary about making it considerably more expensive. On the other hand, there's the argument that if we're doing it, we might as well do it right! 😄

What do you think?

Copy link
Member

@smnh smnh Aug 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"On every transform" you mean every transform of every plugin? Or every global transform run?
If this is for global transform run, then the impact is low and there is no problem with that.
But if the diff is run for every plugin, then my question is do we need to compute the diff for every plugin? Or can we run it once before we begin calling transform functions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, we should check if deep-freeze has any significant performance impact. If not, then I would safely use it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants