WIP ExPlat: Self-Contained Client #48475

jessie-ross · 2020-12-17T22:46:49Z

This very rough PR proposes a self-contained ExPlat Client.

~~This PR is at proof-of-concept level and code shouldn't be thoroughly reviewed.~~
Main code is /packages/explat-client, there are a fair few files but most of them are very small.

Self-Contained:

const ExPlatClient = createExPlatClient(makeRequest, getAnonId, logError, isDevelopmentMode)

Dep injects outside parts so it can be fitted to any codebase.
Doesn't assume much of its environment.
No external dependencies or libs needed.
Stores state internally and simply.
Lives in /packages or could even be moved to its own repo.

New API Surface

The Core ExPlat function: `loadExperimentAssignment`

Type signature

loadExperimentAssignment: (experimentName: string) => Promise<ExperimentAssignment>

Usage

const experimentAssignment = await loadExperimentAssignment('experiment_name')

Loads and returns an Experiment Assignment Promise, starting an assignment if necessary.
Call as many times as you like, anywhere you like - except in an SSR context like top-level AND logged-out AND en
Its result may change over time - it keeps the assignment cached and updates with respect to the TTL (currently 3600 seconds in production).
Waiting for the promise to be resolved is the loading state.

Benefits

I was messing around with subscription and external Redux integration when I realised there was a much simpler way: A function which provides a promise. There is a bit more messing around involved to make it intuitive and able to be used as much and whenever an Experimenter wants but this is moving complexity away from Experimenters hands to our hands which I think is worth it.

Able to be used in /lib code, where promises are quite common
Loading state is now just the promise! This makes it super explicit and prevents it being avoided.
Return value is now ExperimentAssignment | null where ExperimentAssignment is a central data structure used ubiquitously throughout. This extra level of boxing is important and yet another layer of protection for the loading state - having an experimentAssignment.variationName === null is different to not having an experimentAssignment.
Extends naturally into hook form: I have included a hooks implementation (as well as a component implementation based on the hooks) - they demo how loadExperimentAssignment can be used.
SSR Safe as far as not crashing Calypso, but it won't run validly if run on the server in an SSR context (see the SSR assessment below).
AFAICT SSR is just in logged-out AND EN cases, which means this function can be used everywhere else.

Synchronous escape hatch: `dangerouslyGetExperimentAssignment`

Type signature

dangerouslyGetExperimentAssignment: ( experimentName: string ) => ExperimentAssignment

Usage

// An experiment MUST be loaded beforehand:
loadExperimentAssignment('experiment_name')

// Significantly enough in the future for the loading to have occurred:
try {
    const experimentAssignment = dangerouslyGetExperimentAssignment('experiment_name')
} catch (e) {
    // This almost always shouldn't be used to deliver default experience if it hasn't loaded yet so in most
    // cases we should be throwing/erroring rather than delivering the default experience.
    throw; 
}

This is a safer alternative to the current "selector method" usage:

Gets but won't load/assign an experiment assignment.
Makes you aware of how dangerous it is.
Easy to spot in a code review.
Logs if the TTL has expired.
Can be instrumented to allow us to see stats of bad usage.

It can be useful within /lib code.

The ExPlat Hook: `useExperiment('experiment_name')`

Type signature

useExperiment: (experimentName: string) => [boolean, ExperimentAssignment | null]

Usage

const [isLoadingExperimentAssignment, experimentAssignment] = useExperiment('experiment_name')

Gets and loads an experimentAssignment if necessary.
Very safe.
Call as many times as you like in different places.
A wrapper around: loadExperimentAssignment
Hooks individually won't obey TTL - they will keep displaying the same experiment assignment as long as their parent component isn't destructed.

Added safety measures

Validation at boundaries.
Tries hard not to take down the rest of the site with an error: Everything is wrapped in a try block.
Throws in development, log-stashes in production.
Lots of invariant checking and throwing.
Adds timeout surrounding the fetch, throws / log-stashes and falls-back to default.
Small level of protection against bad clocks by using a monotonic timestamp source.
Internal code is now hidden away in an internal namespace, making it harder to misuse.
Safeguards around WPCOM SSR usage

Different behaviour surrounding TTL (Time To Live)

getExperimentAssignment obeys TTL - unlike using the getVariationForUser selector - it will refetch if past the TTL. This makes sense from the lib perspective as a user can be using a site for a long period of time, and we need to ensure experiments are turned on/off for them too. Currently we just load data once e.g. in /layout/index.js but if an experimenter is using this in /lib code it will check each time it is called. I would say this is intuitive for experimenters and will give better behaviour for experiments.

However, this is not intuitive for the react side of code, since when you render a react component using an experiment you don't expect or want it to suddenly change for a user - so long as a user has that component up they should have the same experience. Hence useExperiment(Assignment) doesn't obey TTL and same to <Experiment />.

I have added a little protection against bad clocks by using a monotonic timestamp function.
Designed around having experiment level TTL (for singular-experiment-fetching) but ended up checking TTL across experiments.

Future compatible

Simple to move to singular-experiment-fetching once the API is ready. (This will happen very soon.)
Could dep inject the state too in case we want to persist it on some platforms.
Simple to move to prefetching all experiments at page load with synchronous access.

TODO

Confirm design with ExPlat.
Confirm design with Experimenters.
Make sure it is compatible with Calypso SSR (I don't foresee any particular problem).
Set up the package properly, linting according to A8C JS standards, build script, testing.
Unit testing - pretty straight forward as almost everything is dep injected.
A/A test.
Add README
Change documentation and notify users.
Deprecate/Remove existing client code.

Fixes #536 #528

matticbot · 2020-12-17T22:46:51Z

Test live: https://calypso.live/?branch=add/explat-client-self-contained

client/components/experiment/index.tsx

client/lib/explat/index.ts

packages/explat-client/package.json

packages/explat-client/src/create-explat-client.ts

packages/explat-client/src/create-explat-react-client.ts

packages/explat-client/src/index.ts

packages/explat-client/src/internal/experiment-assignments.ts

packages/explat-client/src/internal/requests.ts

packages/explat-client/src/internal/state.ts

packages/explat-client/src/internal/timing.ts

packages/explat-client/src/internal/validations.ts

packages/explat-client/src/types.ts

yanirs · 2020-12-18T06:28:49Z

This is great, @jessie-ross!

I only skimmed the code, but I like the idea of a client with a simple interface and no dependencies.

Not being a Calypso/JS dev, I'm not in a good place to judge it from an experimenter's perspective, but your plan of getting experimenters to review it should cover that. After @withinboredom and @aaronyan have had a look, a P2 post with the contents you put in the PR description, some context, and plenty of mentions would probably get it some attention.

Lives in /packages or could even be moved to its own repo.

To simplify things, /packages seems like a good place to start. I suppose we can quickly "graduate" to a separate repo if needed. Do you see a need to move to a separate repo? It looks like we can still publish packages to npm from this repo.

If this stays in this repo, this PR should also update CODEOWNERS:

wp-calypso/.github/CODEOWNERS

Lines 90 to 91 in 1e075ab

    
           # Experimentation platform legacy library and new client 
        
           /client/lib/abtest @Automattic/experimentation-platform

The current TTL is a little short at 60 seconds so that might need to be lengthened.

This is only for a12s and proxied non-support requests. The default TTL for regular users is 3600 seconds.

packages/explat-client/src/internal/validations.ts

jessie-ross · 2020-12-22T06:56:22Z

SSR Assessment

Since the recent SSR issue (p4TIVU-9zp-p2) I have learnt a lot more about how it works, this is an assessment of this package interacting with SSR.

On Supporting SSR-time Experiments

These are experiments that have code that runs on the server on a SSR'd page.

As I have mentioned this probably won't be a short term goal for us due to two issues:

Getting an anonId at SSR time.
I don't see this as possible unless we move to a different anonId method since we use cookies.
Complexity around caching SSR pages.
This may not be so bad, we can run code in server rendering and add assignment data to the cache key and render context.

Chances are this would require an entirely different client so this client should not support it at all.

SSR Safety

We need to ensure that this client does not run at SSR-time.

We can mitigate this by by swapping this module for another at server-time. By replacing it with a mock package that log's errors to the server we will know when this happens (potential for a traffic light on this in Abacus), and it gives us a layer of protection against using browser-only APIs.

As mentioned in p4TIVU-9zp-p2 it can be hard to know when code is run server-side, but there are only a few routes that do so (and specifically logged out AND en IIRC but I can't seem to find a reference), so we would have some decent heuristics we could add to the Calypso ExPlat Code Review Checklist.

The only real way I can see to prevent it running at SSR-time is more vigilant code reviews, particularly around the proposed new API: loadExperimentAssignment. This function should only be run on the client, but there may be cases where it is ok to run on server for not-in-segment users. Maybe we should rename it to dangerouslyLoadExperimentAssignment?

The hook and Component implementations should be safe as they run in a useEffect.

We already have some level of mitigation against general crashing by wrapping everything in try blocks, but we need to ensure if we are calling any of the dep-injected functions that they are also SSR safe - particularly getAnonId.

Tasks

Replace with mock package if SSR.
Build up heuristics for the Checklist.
Ensure the dep-injected functions are fail-safe.

jessie-ross · 2021-02-08T16:12:45Z

Update

The ExPlat client has mostly been built, mainly PR approvals and /lib code unit tests to go.

The rebuilding PRs:

ExPlat: Self-Contained Client Base Package #49502 Base Package
ExPlat: Self-Contained Client Internals (except for request) #49531 Client Internals (- requests)
ExPlat: Self-Contained Client Internals (requests only) #49602 Client Internals (requests)
ExPlat: Self-Contained Client Internals - Fixes #49829 Client Internals Changes
ExPlat: Self-Contained Client API #49832 Client API

I plan to merge them together at the end and I am keeping this PR in sync with them.

Todo for this stage:

Replace the fetch function as per p1612776525225700-slack-C02DQP0FP
Replace the SSR logger as per p1612777243230000-slack-C02DQP0FP
I think I need to reduce the errors being sent or dedupe them? I have them showing up in the unit test snapshots if you want to see them.
I think I need to re-look over the case of the server sending an empty object of experiments.

yanirs · 2021-02-09T03:45:30Z

I plan to merge them together at the end and I am keeping this PR in sync with them.

@jessie-ross What's the reasoning behind not merging them separately and moving forward with what's been approved? It'd be nice to have a commit history that matches the discussions.

jessie-ross · 2021-02-09T05:20:44Z

@jessie-ross What's the reasoning behind not merging them separately and moving forward with what's been approved? It'd be nice to have a commit history that matches the discussions.

Do you mean merging into trunk or merging into a base branch? I do plan merging them separately but once they are all assembled and ready?

yanirs · 2021-02-09T05:31:25Z

@jessie-ross What's the reasoning behind not merging them separately and moving forward with what's been approved? It'd be nice to have a commit history that matches the discussions.

Do you mean merging into trunk or merging into a base branch? I do plan merging them separately but once they are all assembled and ready?

Merging into trunk. A bunch of them are ready, so it'd be good to get them in.

jessie-ross · 2021-02-09T05:33:43Z

Merging into trunk. A bunch of them are ready, so it'd be good to get them in.

My thoughts here are less deploys, less chance for something to go wrong, i.e. deploying 5 times verses 1.

yanirs · 2021-02-09T05:37:34Z

Merging into trunk. A bunch of them are ready, so it'd be good to get them in.

My thoughts here are less deploys, less chance for something to go wrong, i.e. deploying 5 times verses 1.

I strongly disagree on that. One big deploy is way scarier than five small ones. And in this case, nothing is calling the deployed code, so it should be safe.

For example, if there are any issues with deploying the "empty" package, we're better off finding out when it's empty.

jessie-ross · 2021-02-15T14:25:56Z

Reassessing the PR strategy

Yeah I am not sure this is working for me, it is getting a bit messy especially when I need to go over already committed PRs, make changes and try to sync everything up (e.g. #49990 which I had actually fixed in a later PR). @yanirs I would prefer to not commit as we go.

Stacked commits also aren't feeling great partly from having too large commits and partly from needing to backtrack. I am going to try re-split what we already have into smaller pieces - I think not having to make sure each PR is commit ready will make that easier. It would allow me to make mistakes without worrying about them and fix them later, and with more PRs, committing at the end will reduce the overhead involved.

I strongly disagree on that. One big deploy is way scarier than five small ones. And in this case, nothing is calling the deployed code, so it should be safe.

For example, if there are any issues with deploying the "empty" package, we're better off finding out when it's empty.

As far as the danger of plugging everything at the end, I think having the start of it already committed and then committing the package separately to the /lib code should give us reasonable coverage against this type of issue?

yanirs · 2021-02-16T00:10:54Z

@jessie-ross let's chat about this over Slack today. I'm not sure what alternative you're suggesting if it's not:

PRs that can be merged.
PRs that are stacked.

From the perspective of reviewing, I find it harder to review changes that aren't working/complete with context that's split across multiple PRs. I also don't fully see the benefit of re-splitting now that #49602 is ready to go and #49829 and #49832 don't need much work. However, I'm open to seeing what you have in mind. 🙂

jessie-ross · 2021-02-19T06:25:20Z

I think I need to reduce the errors being sent or dedupe them? I have them showing up in the unit test snapshots if you want to see them.

I think this is fine for now, severe duplication of errors occurs when lots of loadExperimentAssignment are used.

jessie-ross · 2021-03-19T09:12:56Z

Its time :)

yanirs · 2021-03-21T21:03:21Z

🎉

jessie-ross added 6 commits December 18, 2020 04:45

Implement ExPlat Client Package

113b58c

Minor fix

0907cd4

Restructure react client

7d12b0c

Fix package exports

60f731a

Mock implement explat client

4f1bba0

Use the hook to implement experiment component

22210fc

jessie-ross requested a review from a team as a code owner December 17, 2020 22:46

jessie-ross requested review from withinboredom and removed request for a team December 17, 2020 22:46

matticbot added the [Status] Needs Review The PR is ready for review. This also triggers e2e canary tests and wp-desktop tests automatically. label Dec 17, 2020