-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Experimentation backend #7492
Experimentation backend #7492
Conversation
Will merge this. We're now reasonably confident about the schema we want, and can address any comments if remaining in followups. We'll anyway do a bit of back and forth when it comes time to connect with frontend. The migrations aren't destroying anything, so should be safe. Adding a to-be-backfilled with False field on featureflags table, which has less than 1000 rows, so should be okay. |
# print(samples) | ||
variant_samples.append(samples) | ||
|
||
probability = sum([int(sample_a > sample_b) for (sample_a, sample_b) in zip(*variant_samples)]) / simulations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
probability = sum(sample_a > sample_b for sample_a, sample_b in zip(*variant_samples)) / simulations
It's also not obvious from the code that variant_samples has length of 2 - why is that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes a lot of sense.
And variant_samples is 2 because we're controlling for it at the top of this function. It doesn't work as is for more than 2 variants, the probability formula would need to change for that.
(Plus, since the funnel breakdown values is limited to 2 values (in init) we shouldn't see any more values.
# Default variant names: control and test | ||
return sorted(variants, key=lambda variant: variant.name, reverse=True) | ||
|
||
@staticmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't need to be in a class. Extract this and cover it with extensive tests instead which don't need any database setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do test this without any DB setup, but ok with extracting it out of the class as well!
total = sum([step["count"] for step in result]) | ||
success = result[-1]["count"] | ||
failure = total - success | ||
breakdown_value = result[0]["breakdown_value"][0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this work for multi-breakdowns? Seems really really suspect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, won't work for multi-breakdowns, but since we're creating the funnel ourselves, we know that there isn't a multi-breakdown.
Hmm, this is valuable feedback, I've been wrangling with this for a few days now, so much that I've forgotten the fresh-eyes look. Will add more documentation at the top of the class to explain what's going on.
I didn't want to move this yet to use breakdown
vs breakdowns
as there seem to be some inconsistencies / bugs around this. Will update once we get rid of that tech debt.
ee/clickhouse/views/experiments.py
Outdated
team=team, | ||
created_by=request.user, | ||
filters=filters, | ||
is_experiment=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field is unneeded. We could just look up which feature flags are referred to by experiments.
It's better to stay in the third normal form if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha, makes sense! Only other consideration is that on the FF list page, we'd be checking & joining the experiments table. But I guess that's okay, this is going to be a small table forever-ish.
|
||
# Parameters include configuration fields for the experiment: What the control & test variant are called, | ||
# and any test significance calculation parameters | ||
parameters: models.JSONField = models.JSONField(default=dict, null=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference between parameters and filters is hazy.
What are all the valid parameters? Can we, should we tie these directly into the schema?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't yet figured out what all these would be, hence the JSONfield cop out. But I do know they don't need to be referenced outside the scope of an experiment, and we'd never query for experiments based on these parameters.
And depending on what things we want to show, they can increase by a lot (ex: confidence_intervals, variant names, kind of significance test to run(?) etc. etc. )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair - as long as we don't have any experiments in production (or even then) let's feel free to replace the dict with proper values though as they're easier to communicate forwards in time and migrate :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whatever, let's goooooo
Changes
Proof of concept of running statistics on experiments result
@liyiy hopefully this gives you an idea of the structure of things. Planning to wait a bit more before we determine the migration, but things are looking good here.
We'd probably need more things to make the list view & everything you need on the FE work, but I hope that's not too complicated to build off of this.
Update
Now backend follows this UI flow: #7462 (comment) - no duplicate FFs allowed, create them from scratch in experiment.
Ready to get the models in as well, stable enough now to go for it.
How did you test this code?
More tests to come, basic tests exist.