Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First draft: explainer-new. #193

Merged
merged 3 commits into from
May 31, 2023
Merged

Conversation

otherdaniel
Copy link
Collaborator

@otherdaniel otherdaniel commented May 4, 2023

Explainer for new Sanitizer API design, based on recent sync meeting (#192).

Notes:

  • Not ready yet; would like to add examples.
  • I added several open questions not discussed in the meeting.

@otherdaniel otherdaniel requested a review from mozfreddyb May 4, 2023 16:43
@otherdaniel
Copy link
Collaborator Author

This tries to summarize the result from the recent sync meeting. I hope it's a starting point to replace the current explainer for this project, and to then base future work on it. Please let me know if this doesn't reflect the meeting results accurately.

I've added several open questions (i.e., what about defaults?) that were not explicitly discussed at the meeting.

I'd like to add examples, too, to make this more useful for people not involved in the meetings.

Copy link
Collaborator

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing this up @otherdaniel! Left my thoughts and nits inline.

explainer-new.md Outdated Show resolved Hide resolved
explainer-new.md Outdated Show resolved Hide resolved
explainer-new.md Outdated Show resolved Hide resolved
explainer-new.md Outdated Show resolved Hide resolved
explainer-new.md Outdated
## Open questions:

- Defaults: If no filter is supplied, do the safe methods have any filtering
other than the baseline? Do the unsafe methods have default filtering?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. It seems good to conclude Clarify threat model, "XSS first and foremost" #188 so "baseline" is clear.
  2. I think that by default "unsafe" shouldn't do any filtering. And if you want you get to essentially supply your own block and safelists and the browser won't perform any normalization on them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This is more than "what the baseline is", but also whether the default should be something "more usable" than the baseline (e.g., also issue baseline should be default and default should be baseline #183). I think it should suffice to enforce only the baseline. (Unless we actually do the full research on the defaults of other sanitizers and e.g., popular XSS gadget attributes to also strip from the "default".)

explainer-new.md Outdated
Comment on lines 49 to 51
- Defaults: All of these are new methods, without legacy usage. Would DSD
parsing default to `true`? Do we even decide that, or would we instead ask
the HTML WG and adopt whatever their choice?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ultimately that decision is largely up to the WHATWG. And we decided there previously that new parsing APIs won't need any kind of opt-in for declarative shadow roots. So we shouldn't add any here. I don't think we need opt-out either (that's handled through custom elements not being safelisted).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as a declarative shadow root always requires a custom element as a parent and as long as we require custom elements to be named in the allowElements list, I am very OK with this not being a toggle at all.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworded this. Agree on WHATWG getting the last word.

I'm not aware of any requirements that shadow roots - declarative or not - must be hosted by custom elements. I thought a boring old <div> could also host a shadow DOM. Am I overlooking something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to find evidence for restrictions on shadow root hosts. What I find is:

The second one lists quite a few restrictions, but it does allow a number of specific HTML elements, e.g. div.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I raised this in whatwg/dom#831 (comment).

Having said that, I'd appreciate it if @mozfreddyb could elaborate on this concern, as I'm not yet seeing the problem with shadow roots potentially being present in the output.

If there is a problem though I don't think we should address it at the parser level, but instead the sanitizer should block elements with an associated shadow root.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, glad we got some clarity here. I was totally not aware that this may live outside of custom elements. That being said, I think this is all fine as long as the sanitizer traverses the shadow root.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this on the call and agree that this is fine as long as the sanitizer is traversing

explainer-new.md Outdated
Comment on lines 52 to 57
- Should the filter config be a separate object, or should it be a plain dictionary?
- Reasons pro dictionary:
- Simpler.
- Reasons pro object:
- Allows to pre-process of the config and to amortize the cost over many calls.
- Allows adding other useful config operations, like introspection.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If actual normalization is performed on the input and that is exposed (rather than the input being reflected back) I could see folks supporting a class. The current Sanitizer class is not it though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main hope with this was that a framework would supply a Sanitizer instance for the web developer to use in framework-specific code (e.g., to prevent script gadgets attacks). That being said, a dictionary is likely as good.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also discussed this on the call. We agree that there's little value in having an object that just contains a dictionary. Nobody is against an object, as long as there is a useful feature behind it (e.g, normalization, performance).

explainer-new.md Outdated
Comment on lines 61 to 65
- Baseline for safe methods: There are some "special" behaviours (currently in
"handle funky elements"), mainly around dropping javascript:-URLs in contexts
that navigate. Should this be available as an option (which would then be
force-true for safe usage, and default-false but available for non-safe
usage), or would that be custom behaviour only for the safe methods?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be okay leaving it out until we see demand.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. (Removed this from open questions, and added a sentence after the API proposal.)

explainer-new.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@mozfreddyb mozfreddyb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. Thanks so much, Daniel!
(And my deepest apology for the the review delay here!)

I think this is almost ready to merge once we have resolved the minor sidebars

explainer-new.md Outdated Show resolved Hide resolved
explainer-new.md Outdated
Comment on lines 49 to 51
- Defaults: All of these are new methods, without legacy usage. Would DSD
parsing default to `true`? Do we even decide that, or would we instead ask
the HTML WG and adopt whatever their choice?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as a declarative shadow root always requires a custom element as a parent and as long as we require custom elements to be named in the allowElements list, I am very OK with this not being a toggle at all.

explainer-new.md Outdated Show resolved Hide resolved
explainer-new.md Outdated
## Open questions:

- Defaults: If no filter is supplied, do the safe methods have any filtering
other than the baseline? Do the unsafe methods have default filtering?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This is more than "what the baseline is", but also whether the default should be something "more usable" than the baseline (e.g., also issue baseline should be default and default should be baseline #183). I think it should suffice to enforce only the baseline. (Unless we actually do the full research on the defaults of other sanitizers and e.g., popular XSS gadget attributes to also strip from the "default".)

explainer-new.md Outdated
Comment on lines 52 to 57
- Should the filter config be a separate object, or should it be a plain dictionary?
- Reasons pro dictionary:
- Simpler.
- Reasons pro object:
- Allows to pre-process of the config and to amortize the cost over many calls.
- Allows adding other useful config operations, like introspection.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main hope with this was that a framework would supply a Sanitizer instance for the web developer to use in framework-specific code (e.g., to prevent script gadgets attacks). That being said, a dictionary is likely as good.

explainer-new.md Outdated Show resolved Hide resolved
explainer-new.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks great!

explainer.md Outdated
Comment on lines 68 to 71
- `Element.setHTML(string, {options})` - Parses `string` using `this` as
context element, like assigning to `innerHTML` would; applies a filter,
while enforcing an XSS-focused baseline; and finally replaces the children
of `this` with the results.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we want to directly compare parsing to innerHTML as there will at least be two major differences:

  1. Support for declarative shadow roots.
  2. No support for XML.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While technically correct, I think it helps to draw a comparison. Even if it's not exactly 1:1, I am expecting many developers to switch from innerHTML= to setHTML()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is summarizing the design for implementers, no? I don't mind it saying innerHTML, but it should be more accurate.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We agree on a softer wording. "almost like", "similar to innerHTML". etc. :). Obviously, there should be a section that explains the differences more closely.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also added a note at the bottom to be more explicit.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question on, "No support for XML": I think we agree that Document.parseHTML should not create XML documents (unlike DOMParser). Is this also meant to apply to setHTML when called on an element in an existing XML document?

(I don't mind either way; but would like to be clear.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant it for both.

explainer.md Outdated
Comment on lines 107 to 108
- Defaults: If no filter is supplied, do the safe methods have any filtering
other than the baseline?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reference #188 here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

explainer.md Outdated
Comment on lines 109 to 110
- Defaults: All of these are new methods without legacy usage. Would DSD
parsing default to `true`? (Probably. Decision lies with WHATWG.)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can consider this decided.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. (I added this as one example to the note about differences between the new APIs and their existing counterparts.)

explainer.md Outdated
- Simpler.
- Reasons pro object:
- Allows to pre-process the config and to amortize the cost over many calls.
- Allows adding other useful config operations, like introspection.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems correct. An object here would require one of:

  1. Compelling performance numbers.
  2. A compelling operation that would only work with a pre-processed dictionary.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Adopted this wording.

README.md Outdated Show resolved Hide resolved
explainer.md Outdated
Comment on lines 68 to 71
- `Element.setHTML(string, {options})` - Parses `string` using `this` as
context element, like assigning to `innerHTML` would; applies a filter,
while enforcing an XSS-focused baseline; and finally replaces the children
of `this` with the results.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While technically correct, I think it helps to draw a comparison. Even if it's not exactly 1:1, I am expecting many developers to switch from innerHTML= to setHTML()

Copy link
Collaborator

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\o/

Thanks!

@otherdaniel otherdaniel merged commit 6dd0623 into WICG:main May 31, 2023
@otherdaniel otherdaniel deleted the explainer-new branch April 22, 2024 12:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants