Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build-time mirroring of compat data #15083

Closed
foolip opened this issue Feb 22, 2022 · 12 comments
Closed

Build-time mirroring of compat data #15083

foolip opened this issue Feb 22, 2022 · 12 comments
Assignees
Labels
enhancement Nice to have features. infra Infrastructure issues (npm, GitHub Actions, releases) of this project scripts Issues or pull requests regarding the scripts in scripts/.

Comments

@foolip
Copy link
Contributor

foolip commented Feb 22, 2022

We currently have a mirroring script (npm run mirror) which can be used to copy the data from one browser to another using the same engine, while updating version numbers. For example, Chrome 96 can be mirrored to Chrome Android 96, Opera 82, Opera Android 67, some future release of Samsung Internet, and Android WebView.

A downside is that data for many browsers often goes stale, and needs to be updated manually, as in #15055. And it's very difficult to see which data has been mirrored, and where there are differences.

This is a proposal to simplify maintenance of mirrored data in BCD. Strawman proposal using a simplified version of the structuredClone entry:

{
  "api": {
    "structuredClone": {
      "__compat": {
        "mdn_url": "https://developer.mozilla.org/docs/Web/API/structuredClone",
        "spec_url": "https://html.spec.whatwg.org/multipage/structured-data.html#dom-structuredclone",
        "support": {
          "chrome": {
            "version_added": "98"
          },
          "chrome_android": {
            "version_added": "mirror"
          },
          "edge": {
            "version_added": "mirror"
          },
          "firefox": {
            "version_added": "94"
          },
          "firefox_android": {
            "version_added": "mirror"
          },
          "ie": {
            "version_added": false
          },
          "opera": {
            "version_added": "mirror"
          },
          "opera_android": {
            "version_added": "mirror"
          },
          "safari": {
            "version_added": "15.4"
          },
          "safari_ios": {
            "version_added": "mirror"
          },
          "samsunginternet_android": {
            "version_added": "mirror"
          },
          "webview_android": {
            "version_added": "mirror"
          }
        },
        "status": {
          "experimental": false,
          "standard_track": true,
          "deprecated": false
        }
      }
    }
  }
}

The "mirror" values would be filled in at build time, and the data that BCD consumers see would be unchanged.

One implementation of this would mirror to false if the browser (such as Samsung Internet) doesn't have a release based on Chromium 98 yet, and would automatically get the right version if it's added to browsers/samsunginternet_android.json. An alternative to this would be to disallow mirroring to false values, and instead require replacing false with "mirror" after a release.

I imagine this being rolled out in the following way:

  • Add support for "mirror" in the build script
  • Mass replace versions where it mirrors to exactly the same (maybe only entries where everything can be mirrored, to avoid a mix)
  • Over time, review and replace "manual" values with "mirror" where it's actually bad data, starting with old and implausible off-by-one differences

cc @ddbeck @Elchi3 @queengooborg

@queengooborg queengooborg added enhancement Nice to have features. infra Infrastructure issues (npm, GitHub Actions, releases) of this project scripts Issues or pull requests regarding the scripts in scripts/. labels Feb 22, 2022
@ddbeck
Copy link
Collaborator

ddbeck commented Feb 22, 2022

A few initial thoughts:

I like this idea in general. When we previously directly published the package directly, we couldn't do this sort of trickery without being quite disruptive to consumers. Now that we build a monolithic JSON file, this is totally in reach. The logic for this could happen as part of load() in index.js (perhaps even with the option to opt out of the mirrors, so we can lint both mirrors themselves and the complete data that resolves from them).

Alternate notations

The proposed notation { "version_added": "mirror" } suggests something rather low-level. My second thought was what if we had data with notes. Something like this:

{
  "chrome": {
    "version_added": "98",
    "partial_implementation": true,
    "notes": "A major limitation!"
  },
  "opera": {
    "version_added": "mirror"
  }
}

Would resolve to:

{
  "chrome": {
    "version_added": "98",
    "partial_implementation": true,
    "notes": "A major limitation!"
  },
  "opera": {
    "version_added": "98"
  }
}

Which strikes me as undesirable (and in cases where Chrome has an array of compat statements, unresolvable).

Now, if we're open to perhaps a more significant change to the (internal-only) schema, we might choose a little more abstraction:

{
  "chrome": {
    "version_added": "98",
    "partial_implementation": true,
    "notes": "A major limitation!"
  },
  "opera": {
    "$mirror$": true
  }
}

(using $ to denote internal-only schema additions, perhaps—I'm just spitballing for now)

Though it would suggest that this sort of thing is plausible or desirable:

{
  "chrome": {
    "version_added": "37"
  },
  "opera": [
    {
      "$mirror$": true
    },
    {
      "version_added": "11",
      "version_removed": "15"
    }
  ]
}

(Another route might be to disallow the $mirror$ key in arrays or to use a "mirror" string in place of the array or statement object; if you can't mechanically mirror the entire history from one browser to another, then you don't get to mirror for that feature.)

Notes

Finally, mirroring in general poses some minor headaches for notes. We would probably want to create a backlog of mirrored notes that contain a browser name, so those notes can be rewritten to avoid the browser name (e.g., Before Chrome 37… would need to become Before version 37) for cleaner mirroring.

@foolip
Copy link
Contributor Author

foolip commented Feb 22, 2022

As a starting point, I think 'use a "mirror" string in place of the array or statement object' and not supporting more granular mirroring sounds good. It would also make the source more compact, which would be welcome :)

For notes, perhaps we could create a simple template format to make the replacement explicit, like "Before {{Chrome 37}} bla bla", where the format inside {{}} can only be browser+version, and is automatically replaced with the mirrored browser+version?

@ddbeck
Copy link
Collaborator

ddbeck commented Feb 24, 2022

For notes, perhaps we could create a simple template format to make the replacement explicit, like "Before {{Chrome 37}} bla bla", where the format inside {{}} can only be browser+version, and is automatically replaced with the mirrored browser+version?

Yeah, we could look at doing something like that. Or something I've been kinda interested in, which is to structure the notes (i.e., have versioning fields for notes) and generate text from that (e.g., "Chrome 37-73: Blah blah blah."). Not as slick, but then the templates can live outside the notes themselves.

That said, I think the accuracy gains from build-time mirroring are sufficiently valuable that making the notes look good isn't a blocker.

@Elchi3
Copy link
Member

Elchi3 commented Feb 25, 2022

I like the approaches and I agree that BCD should implement an authoring convenience and make it less error-prone to indicate the same compat for browsers with the same engine.

Given there are more browser candidates that we are investigating to add to BCD (Oculus for example), we would need to add even more "mirror" entries.

So, an alternate proposal to investigate would be to change the meaning of "no entry". Currently, when BCD doesn't provide an entry for a browser, the compatibility for that browser is null. Given we got rid of all null values, I believe that there are actually entries for every browser we tracked in that effort.

If we change the meaning of "no entry" to "mirror", the source would become more compact (which would help with BCD diffs that have become more and more difficult to review). I realize this proposal makes the source files less explicit, but I often really dislike the verbose nature of BCD's JSON files.

{
  "api": {
    "structuredClone": {
      "__compat": {
        "mdn_url": "https://developer.mozilla.org/docs/Web/API/structuredClone",
        "spec_url": "https://html.spec.whatwg.org/multipage/structured-data.html#dom-structuredclone",
        "support": {
          "chrome": {
            "version_added": "98"
          },
          "firefox": {
            "version_added": "94"
          },
          "ie": {
            "version_added": false
          },
          "safari": {
            "version_added": "15.4"
          },
        },
        "status": {
          "experimental": false,
          "standard_track": true,
          "deprecated": false
        }
      }
    }
  }
}

When a browser derives and doesn't support the feature that is normally supported in its engine, we add the browser back to the feature and it overrides. See Edge below. This can also be used to override notes or other details.

{
  "api": {
    "structuredClone": {
      "__compat": {
        "mdn_url": "https://developer.mozilla.org/docs/Web/API/structuredClone",
        "spec_url": "https://html.spec.whatwg.org/multipage/structured-data.html#dom-structuredclone",
        "support": {
          "chrome": {
            "version_added": "98"
          },
          "edge": {
            "version_added": false
          },
          "firefox": {
            "version_added": "94"
          },
          "ie": {
            "version_added": false
          },
          "safari": {
            "version_added": "15.4"
          },
        },
        "status": {
          "experimental": false,
          "standard_track": true,
          "deprecated": false
        }
      }
    }
  }
}

What Philip suggests above would come into play here as well:

If a browser (such as Samsung Internet) doesn't have a release based on Chromium 98 yet, it is false, and once there is a release it would automatically get the right version based on browsers/samsunginternet_android.json.

@foolip
Copy link
Contributor Author

foolip commented Mar 1, 2022

Letting a missing entry mean that the data should be mirrored sounds promising to me. The downside is it would be harder for newcomers to figure out how to fix incorrect data, since it doesn't even appear in the source.

We should also consider what the default for IE should be, since it should be false for all future entries, and can't be mirrored from anything.

@Elchi3
Copy link
Member

Elchi3 commented Mar 1, 2022

We should also consider what the default for IE should be, since it should be false for all future entries, and can't be mirrored from anything.

I believe in mirroring we made IE false when Edge is 13 or newer.

@foolip
Copy link
Contributor Author

foolip commented Mar 1, 2022

Perhaps what we should do is treat every browser (not just IE) as false by default?

@ddbeck
Copy link
Collaborator

ddbeck commented Mar 22, 2022

Perhaps what we should do is treat every browser (not just IE) as false by default?

I don't think this works because of the way we handle Node.js (and Deno) for web APIs. I believe MDN consumes those with the assumption that if they're missing, they're not shown (as they're not expected to be supported), but false values would be. So we'd need to be explicit (though perhaps some tooling could help generate the JSON files for mirroring).

@queengooborg
Copy link
Contributor

I think that this would be a really great addition to BCD, to cut down on a big chunk of the maintenance time for when it comes to Chromium derivatives, and even mobile browser counterparts. I wonder how we'll handle it for the collector...

For the API, I think that we could do something like "browser": "mirror" as a start? From there, we can then write some sort of script that determines what's mirrored within our data now, so we can add the statements as needed.

@foolip
Copy link
Contributor Author

foolip commented Apr 23, 2022

I think that this would be a really great addition to BCD, to cut down on a big chunk of the maintenance time for when it comes to Chromium derivatives, and even mobile browser counterparts. I wonder how we'll handle it for the collector...

I've pondered this too. We will need to use the built BCD to determine if an update is needed, but update the source BCD to fix the data. I'd like to take this opportunity to more clearly separate detection of a problem from fixing the problem, as we currently do both at once and cannot report on problems that we can't autofix. But I think it's all very doable.

For the API, I think that we could do something like "browser": "mirror" as a start? From there, we can then write some sort of script that determines what's mirrored within our data now, so we can add the statements as needed.

Do you mean "mirror" as the value instead of a support statement or array of statements? That would indeed be the most compact possible.

The other option I had in mind is { "mirror": true }, which is structurally similar to the common case and might lead to diffs that are easier to review, and less merge conflicts.

@queengooborg
Copy link
Contributor

This has been resolved by #16401!

@Bmfaried

This comment has been minimized.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement Nice to have features. infra Infrastructure issues (npm, GitHub Actions, releases) of this project scripts Issues or pull requests regarding the scripts in scripts/.
Projects
None yet
Development

No branches or pull requests

6 participants
@ddbeck @Elchi3 @foolip @queengooborg @Bmfaried and others