Investigate showcase infrastructure revamp #6882

Open
2 tasks done
Josh-Cena opened this issue Mar 9, 2022 · 9 comments
Labels
  • apprentice: Issues that are good candidates to be handled by a Docusaurus apprentice / trainee
  • help wanted: Asking for outside help and/or contributions to this particular issue or PR.
  • meta: Meta-issue about the project itself. Either project maintenance or a list of other issues.

Comments

@Josh-Cena
Collaborator

Have you read the Contributing Guidelines on issues?

Description

The current showcase UI is great, and we don't plan to change it! However, the infra behind it is quite constrained and doesn't scale well. The current workflow looks like this:

  • Users click on the "Please add your site" button on the showcase page, which takes them to edit the users.tsx file in the GitHub UI.
  • They first read the code comment to understand the instructions. Then they need to scroll through the file and insert their site in the right place (usually alphabetical order). The users.tsx file, at the time of writing, is 1991 lines long!
  • When they write their site's data, they can make syntax mistakes or format the file inconsistently, and they may not know what to write in each field, so they have to use other entries as a reference.
  • After they're done editing the file, they fork, commit, and open a PR. After that, they drag & drop the screenshot into the existing PR.
  • Maintainers step in, check out the branch, and optimize the image for them (only 5% of the showcase submissions properly optimized the image beforehand, so it's simply more economical for me to check out every branch and double-check). We also have to change the PR title to "docs: add X to showcase" and add the "tag: documentation" label.

The problems in this workflow are apparent.

  • Users without knowledge of git/GitHub often struggle with all the operations and can mess up their branches. They often unnecessarily get intimidated by the CLA bot as well.
  • Maintainers need to manually apply changes to PRs because GitHub Actions aren't really smart enough to get triggered only after the user has submitted an image (which is usually not the first commit).
  • Images are not collocated with each site, and maintenance of that users.tsx is a huge problem over time. "docs: update showcase data #6862" took me longer than expected because I needed to blame each line and match usernames to GitHub handles.
  • We need to prettier-ignore the array, otherwise CI would fail 50% of the time.

So how do we solve this? Two things:

  • We need to split users.tsx into smaller, modular data files, and collocate each data entry with its image.
  • We need to use a web form to submit showcase data.
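
To make the first point concrete, here is a minimal sketch of what collocated data could look like. The directory layout, the file names (site.json, preview.png), and the loadShowcaseEntries function are all assumptions made for illustration, not something decided in this issue.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical layout (an assumption, not an agreed-upon structure):
//   showcase/<site-id>/site.json   -> {"name": ..., "url": ..., "tags": [...]}
//   showcase/<site-id>/preview.png -> screenshot for that site
function loadShowcaseEntries(rootDir) {
  return fs
    .readdirSync(rootDir, {withFileTypes: true})
    .filter((entry) => entry.isDirectory())
    .map((entry) => {
      const siteDir = path.join(rootDir, entry.name);
      const data = JSON.parse(
        fs.readFileSync(path.join(siteDir, 'site.json'), 'utf8'),
      );
      const previewPath = path.join(siteDir, 'preview.png');
      return {
        id: entry.name,
        ...data,
        // null when the screenshot has not been added yet
        preview: fs.existsSync(previewPath) ? previewPath : null,
      };
    })
    // Sorting happens in code, so nobody has to find the "right place" by hand.
    .sort((a, b) => a.name.localeCompare(b.name));
}
```

With one small file per site, a submission only ever adds a new folder, which should avoid both the insertion-order problem and formatting conflicts in a shared array.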

The first one is relatively simple: build a plugin. The second one is slightly harder, because we need to automate everything with GitHub REST APIs, including fork, create branch, commit, PR. If we want to optimize the image on-the-fly, we probably need a serverless service as well.
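
As a rough sketch of the automation described above, the function below only plans the sequence of GitHub REST calls (fork, create branch, commit, PR) without sending anything. The endpoints are the documented REST routes; the function name, the branch naming scheme, the data file path, and the shape of the site object are assumptions for illustration. The baseSha value would have to be looked up first (for example via GET /repos/{owner}/{repo}/git/ref/heads/{branch}), and forks are created asynchronously, so a real implementation would need to wait before step 2.

```javascript
// Builds the ordered list of GitHub REST calls for one showcase submission.
// Nothing is sent here; each descriptor would be handed to an authenticated
// HTTP client.
function planShowcaseSubmission({upstream, forkOwner, baseBranch, baseSha, site}) {
  const [upstreamOwner, repo] = upstream.split('/');
  const branch = `showcase/${site.id}`; // hypothetical branch naming scheme
  const filePath = `website/showcase/${site.id}/site.json`; // hypothetical path
  const title = `docs: add ${site.name} to showcase`;
  return [
    // 1. Fork the upstream repository into the submitter's account.
    {method: 'POST', url: `/repos/${upstreamOwner}/${repo}/forks`},
    // 2. Create a branch on the fork, pointing at the base commit.
    {
      method: 'POST',
      url: `/repos/${forkOwner}/${repo}/git/refs`,
      body: {ref: `refs/heads/${branch}`, sha: baseSha},
    },
    // 3. Commit the data file to that branch (content must be base64).
    {
      method: 'PUT',
      url: `/repos/${forkOwner}/${repo}/contents/${filePath}`,
      body: {
        message: title,
        content: Buffer.from(JSON.stringify(site, null, 2)).toString('base64'),
        branch,
      },
    },
    // 4. Open the pull request from the fork branch against upstream.
    {
      method: 'POST',
      url: `/repos/${upstreamOwner}/${repo}/pulls`,
      body: {title, head: `${forkOwner}:${branch}`, base: baseBranch},
    },
  ];
}
```

Keeping the plan separate from the network layer makes it easy to review what a form submission would do before wiring it to a serverless function.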

However, there's an ultimate solution™: get a server, including a DB! Submitting the form will directly insert the site into the DB, and the website can fetch data from the DB. This means:

  • We no longer track things on GitHub. No more confusion over branches and PRs.
  • Cleaner changelogs, commit history, and contributor graph. 25% of our contributors are showcase submissions—that's a lot of noise...
  • Easier data manipulation: we can query the DB in whichever way we want, allowing for richer navigation without bloating the JS bundle.

With 213 showcase entries, we should start planning for how to make it scale. This proposal is just some random ideas; more ideas welcome 👍

Self-service

  • I'd be willing to address this documentation request myself.
@Josh-Cena Josh-Cena added the documentation The issue is related to the documentation of Docusaurus label Mar 9, 2022
@slorber
Collaborator

slorber commented Mar 9, 2022

Agreed, we should do something to reduce the maintenance of the showcase, the pollution of the repo with PRs, and the friction of submitting new sites to our showcase.

@zpao we can easily add some backend processing through Netlify serverless functions.
I was wondering whether Meta is already using any SaaS DB service (Supabase, Planetscale, Fauna, Atlas...) that we could eventually use to build apps supporting Docusaurus?

@Josh-Cena Josh-Cena added the help wanted Asking for outside help and/or contributions to this particular issue or PR. label Mar 10, 2022
@Josh-Cena Josh-Cena added meta Meta-issue about the project itself. Either project maintenance or a list of other issues. and removed documentation The issue is related to the documentation of Docusaurus labels Mar 29, 2022
@zpao
Member

zpao commented Apr 14, 2022

Data in Docusaurus is definitely an issue. We have a similar problem on our open source website - https://opensource.fb.com/projects is backed by a 20k line JSON file which gets some transforms applied at build time, along with images. We use SVGs as much as possible, so the other problem of optimizing isn't too terrible (always room for compressing SVGs too), but we also make use of IdealImage to do the processing/resizing at build time. Hell, for your case you could stop having users submit the images entirely and have a build step (or just arbitrary script) which visits the site and takes a screenshot. Remove as much burden from the submitter side as possible.

For serverless/DBs, we don't currently make use of anything there. And honestly, I think that's actually thinking a bit too narrowly. I'd be more inclined to answer this question: How do we make Docusaurus work with headless CMSs? Showcase is a great example outside the typical blog post case for this. It would show how to support arbitrary dynamic data/content from an outside source.

It doesn't quite handle the "just submit a form" case, but honestly that's just asking for abuse. You'll end up with a review process and needing to correct data anyway. But you do probably need something like this anyway if you're going to do a CMS - that couldn't be open to everybody so you do need some way to ingest data.

@slorber
Collaborator

slorber commented Apr 20, 2022

> Hell, for your case you could stop having users submit the images entirely and have a build step (or just arbitrary script) which visits the site and takes a screenshot.

Yes, it's possible to do that.

We already have new.docusaurus.io using some basic Netlify serverless functions, we could create a new domain with serverless functions to handle that image processing.

The only problem is that we probably don't want to take thousands of time-consuming screenshots anytime a visitor browses the showcase page

We could store/cache screenshots somewhere. But maybe using CDNs + HTTP caching headers can do the trick here
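
A minimal sketch of the "CDN + HTTP caching headers" idea, written as a Lambda-style handler of the kind Netlify functions use. The createScreenshotHandler factory, the capture callback, the "url" query parameter, and the cache durations are all assumptions; whether a given CDN actually caches function responses based on these headers depends on the provider and would need checking.

```javascript
// `capture` is a hypothetical async function (siteUrl) => Buffer holding a PNG.
// A real one would drive a headless browser; it is injected here so the
// caching logic can be read (and tested) on its own.
function createScreenshotHandler(capture) {
  return async function handler(event) {
    const siteUrl = event.queryStringParameters?.url;
    if (!siteUrl) {
      return {statusCode: 400, body: 'Missing "url" query parameter'};
    }
    const image = await capture(siteUrl);
    return {
      statusCode: 200,
      headers: {
        'Content-Type': 'image/png',
        // Browsers may reuse the image for an hour, shared caches (a CDN)
        // for a week, so most visitors never trigger a new screenshot.
        'Cache-Control': 'public, max-age=3600, s-maxage=604800',
      },
      // Binary bodies are returned base64-encoded in this handler format.
      body: image.toString('base64'),
      isBase64Encoded: true,
    };
  };
}
```

The durations are placeholders: a showcase screenshot rarely changes, so long shared-cache lifetimes seem reasonable, but the right values are a judgment call.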

That's also an opportunity to automatically detect doc sites that are not using Docusaurus anymore (we could look for <div id="__docusaurus">). There's probably value in triggering the update of all screenshots in multiple sizes through a local CLI util or CI.
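
The detection idea could be sketched roughly like this. The regex check on the raw HTML is a simplification (a real crawler might parse the DOM instead), and both function names and the injectable fetchImpl parameter are assumptions for illustration.

```javascript
// Returns true when the HTML contains the root element mentioned above,
// <div id="__docusaurus">. Quotes are optional to tolerate minified output.
function looksLikeDocusaurus(html) {
  return /<div\b[^>]*\bid=["']?__docusaurus["']?[\s>]/i.test(html);
}

// Returns the URLs whose pages load but no longer contain that element.
// `fetchImpl` defaults to the global fetch (Node 18+) and can be swapped out.
async function findSitesNotUsingDocusaurus(urls, fetchImpl = fetch) {
  const flagged = [];
  for (const url of urls) {
    try {
      const response = await fetchImpl(url);
      if (response.ok && !looksLikeDocusaurus(await response.text())) {
        flagged.push(url);
      }
    } catch {
      // A site that is unreachable right now is not evidence of a migration,
      // so it is skipped rather than flagged.
    }
  }
  return flagged;
}
```

The output is only a list of candidates for manual review, since a site could also be temporarily serving a maintenance page.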

> For serverless/DBs, we don't currently make use of anything there. And honestly, I think that's actually thinking a bit too narrowly. I'd be more inclined to answer this question: How do we make Docusaurus work with headless CMSs? Showcase is a great example outside the typical blog post case for this. It would show how to support arbitrary dynamic data/content from an outside source.

For me, a CMS is already somehow a serverless DB: an external system where you can add and retrieve data

Users are already using Docusaurus with CMSes: they just fetch md docs into /docs as a pre-build step and things work.

Showcase is a different use case, as it's not using a pre-built Docusaurus plugin, but we have the infra to create our own custom showcase plugin and fetch JSON data from a CMS (i.e. the same data that is currently stored in users.tsx in GitHub).

https://docusaurus.io/docs/api/plugin-methods/lifecycle-apis#createData

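export default function showcasePlugin(context, options) {
  return {
    name: 'docusaurus-website-showcase-plugin',

    // Fetch the showcase entries once per build.
    // fetchShowcaseUsersFromCMS is a placeholder for whatever CMS client is used.
    async loadContent() {
      const users = await fetchShowcaseUsersFromCMS();
      return {users};
    },

    // Write the entries to a JSON data file and expose it to the
    // /showcase route as the `users` prop.
    async contentLoaded({content, actions}) {
      const usersJsonPath = await actions.createData(
        'users.json',
        JSON.stringify(content.users),
      );
      actions.addRoute({
        path: '/showcase',
        component: '@site/src/components/Showcase.js',
        modules: {
          users: usersJsonPath,
        },
      });
    },
  };
}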

This doesn't handle local IdealImage etc, but if images are just generated on the fly in serverless functions, this is not really a problem anymore.
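
For illustration, the fetchShowcaseUsersFromCMS helper referenced in the plugin above could look something like the sketch below. The endpoint argument, the assumption that the CMS returns a JSON array of site objects, and the injectable fetchImpl parameter are all hypothetical; the plugin snippet calls the helper without arguments, so a real version would presumably read the endpoint from the plugin options.

```javascript
// Hypothetical implementation: fetches the showcase entries from a CMS that
// exposes them as a JSON array over HTTP. `fetchImpl` defaults to the global
// fetch (Node 18+).
async function fetchShowcaseUsersFromCMS(endpoint, fetchImpl = fetch) {
  const response = await fetchImpl(endpoint);
  if (!response.ok) {
    // Fail the build loudly rather than publish an empty showcase.
    throw new Error(`Showcase CMS responded with HTTP ${response.status}`);
  }
  const users = await response.json();
  if (!Array.isArray(users)) {
    throw new Error('Expected the showcase CMS to return a JSON array');
  }
  return users;
}
```

Throwing on a bad response is a design choice: a failed build is easier to notice than a showcase page that silently lost its content.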


> It doesn't quite handle the "just submit a form" case, but honestly that's just asking for abuse. You'll end up with a review process and needing to correct data anyway. But you do probably need something like this anyway if you're going to do a CMS - that couldn't be open to everybody so you do need some way to ingest data.

As we have to review/fix submissions anyway, we can just create an issue/discussion where users post a comment on GitHub asking us to add their site. Alternatively, we could use Typeform, Airtable, or any no-code tool.

We mostly want site owners to confirm they want their site added to the showcase.

Asking site owners to fill out a form correctly, add the right tags, and resize images only adds friction, so fewer sites get added.


Now, where do we store the reviewed site data?

IMHO a CMS is a bit overkill if only we interact with this data. We could as well use Airtable or a simple GitHub issue table as a CMS:

| Site name | Site url | Site repo | Tags |
| --- | --- | --- | --- |
| Docusaurus | docusaurus.io | https://github.com/facebook/docusaurus | open-source,design |

Some of this data could even be inferred from the live site URL.


We could also add a site config option so that site owners can provide (and update) most of that data themselves?

{
  showcase: {
    name: "Docusaurus",
    description: "desc", 
    repo: "https://github.com/facebook/docusaurus", 
    tags: ["opensource", "design"]
  }
}

(but we'd still need to ensure this data is correct 🤷‍♂️ )
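
Part of that correctness check could be automated. The sketch below validates a showcase object shaped like the config above; which fields are required and the validateShowcaseConfig name are assumptions (here name and description are required, repo is optional but must be a valid URL when present).

```javascript
// Returns a list of human-readable problems; an empty list means the object
// passed these basic checks. The rules are assumptions for illustration.
function validateShowcaseConfig(showcase) {
  if (typeof showcase !== 'object' || showcase === null) {
    return ['showcase must be an object'];
  }
  const errors = [];
  for (const field of ['name', 'description']) {
    if (typeof showcase[field] !== 'string' || showcase[field].trim() === '') {
      errors.push(`${field} must be a non-empty string`);
    }
  }
  if (showcase.repo !== undefined && showcase.repo !== null) {
    try {
      new URL(showcase.repo); // throws on anything that is not an absolute URL
    } catch {
      errors.push('repo must be a valid URL');
    }
  }
  const tagsAreValid =
    Array.isArray(showcase.tags) &&
    showcase.tags.every((tag) => typeof tag === 'string');
  if (!tagsAreValid) {
    errors.push('tags must be an array of strings');
  }
  return errors;
}
```

This only catches malformed data. Whether the description is accurate or the tags are appropriate would still need a human review.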


This looks to me like the lowest-hanging fruit:

  • Ask site owners to grant permission to add their site to the showcase, using a simple GitHub issue comment
  • We maintainers add/update sites ourselves to Git in batch (like every week), using a simpler data format (users.json instead of users.tsx)
  • Eventually, use a serverless function to auto-generate images from site URLs
  • Eventually, move users.json to a CMS / Airtable / GitHub / external system
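
The weekly batch step could be as small as the sketch below, which merges newly approved sites into the existing users.json data. The field names (name, url), the URL-based deduplication, and the function names are assumptions for illustration.

```javascript
// Reduces a site URL to a comparison key so that "https://www.example.com/"
// and "https://example.com" count as the same site.
function siteKey(url) {
  const {hostname, pathname} = new URL(url);
  return hostname.replace(/^www\./, '') + pathname.replace(/\/+$/, '');
}

// Merges approved submissions into the existing list. A later entry for the
// same site replaces the earlier one, and the result is sorted by name so
// the data file stays stable regardless of submission order.
function mergeApprovedSites(existing, approved) {
  const byKey = new Map();
  for (const site of [...existing, ...approved]) {
    byKey.set(siteKey(site.url), site);
  }
  return [...byKey.values()].sort((a, b) => a.name.localeCompare(b.name));
}
```

Writing the result back with JSON.stringify(merged, null, 2) would also keep the formatting consistent without relying on contributors.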

Any opinion?

@Josh-Cena
Collaborator Author

I have a question. How many sites do we actually want added to the showcase? Do we actually want 500 sites in the showcase where most of them have basically no customization besides a landing page?

@slorber
Collaborator

slorber commented Apr 20, 2022

> I have a question. How many sites do we actually want added to the showcase? Do we actually want 500 sites in the showcase where most of them have basically no customization besides a landing page?

I think it's fine to have a lot of sites there, and site owners are happy to have their sites added, but some sites are definitely not so useful to showcase.

1: Should we be more strict about sites added to the showcase?
2: Or should we filter simpler sites by default in the UI?

I don't really know what's best 😅 but I like option 2 best

Having more sites is good for social proof, but the showcase should remain a useful tool both for marketing and for users looking for inspiration or code that stands out from the base Docusaurus template.

@Josh-Cena
Collaborator Author

What about having the current "showcase-style" showcase for showcase-worthy sites, and move the less designed ones to a more compact list view?

@ghost

ghost commented Jan 5, 2024

Any plans to build a showcase plugin? I'd love to build off the showcase UI.

I found this plugin docusaurus-plugin-showcase by @andremartinssw but it seems to be only partially developed, and I can't figure out how to use it.

@andremartinssw

@certainlyNotHeisenberg
"partially developed" is (sadly) accurate, as I only made it for internal use at SignalWire, with hardcoded tags for our particular use case and everything. I want to make it more extensible at some point, though.

I suggest copying the plugin code to your docusaurus install, and loading it from a local folder instead of installing from npm, as in its current state it will never work.

@slorber
Collaborator

slorber commented Sep 9, 2024

Note: Bluesky copied our showcase code: https://docs.bsky.app/showcase
