This repository has been archived by the owner on Feb 7, 2025. It is now read-only.

Sampling ui #61

Merged
merged 8 commits into from
Jan 23, 2020


okennedy
Collaborator

This PR implements sampling (resolves VizierDB/web-ui#34)

It relies on the following Mimir PR: UBOdin/mimir#368

@okennedy okennedy requested a review from mrb24 January 22, 2020 03:09
@mrb24 mrb24 merged commit f15d0bc into master Jan 23, 2020
@julianafreire

In the design, is it possible to have multiple sampling strategies? I ask because in some of the machine learning use cases we have worked with, we must use stratified sampling. And if we are cleaning for machine learning, this will be needed.

@okennedy
Collaborator Author

@julianafreire : This PR adds cell types for the following three sampling modes:

  1. Naive sampling (a flat x% of the data)
  2. Manual stratified sampling (bin records by a column, provide an explicit % rate for each bin)
  3. Automatic stratified sampling (like 2, but pick % rates automatically to ensure equal representation from each bin)
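To illustrate the difference between the three modes, here is a minimal Python sketch. It is not the code from this PR or from Mimir. The function names, the row-as-dict representation, per-row Bernoulli sampling, and the "scale every bin down to the size of the smallest bin" rule for mode 3 are all assumptions made for the example.

```python
import random
from collections import Counter

# Hypothetical helpers; rows are plain dicts, e.g. {"label": "a", "x": 1}.

def naive_sample(rows, rate, rng):
    """Mode 1: keep each row independently with the same probability."""
    return [r for r in rows if rng.random() < rate]

def manual_stratified_sample(rows, column, rates, rng):
    """Mode 2: bin rows by `column` and use an explicit rate per bin.

    Assumption: bins that have no entry in `rates` are dropped (rate 0).
    """
    return [r for r in rows if rng.random() < rates.get(r[column], 0.0)]

def balanced_rates(rows, column):
    """Pick per-bin rates so every bin is expected to contribute as many
    rows as the smallest bin (one possible reading of "equal representation").
    """
    counts = Counter(r[column] for r in rows)
    if not counts:
        return {}
    smallest = min(counts.values())
    return {value: smallest / n for value, n in counts.items()}

def automatic_stratified_sample(rows, column, rng):
    """Mode 3: like mode 2, but with rates derived from the data."""
    return manual_stratified_sample(rows, column, balanced_rates(rows, column), rng)

if __name__ == "__main__":
    rng = random.Random(42)  # seeded so repeated runs give the same sample
    data = [{"label": "a"}] * 8 + [{"label": "b"}] * 2
    print(len(naive_sample(data, 0.5, rng)))
    print(balanced_rates(data, "label"))
    print(Counter(r["label"] for r in automatic_stratified_sample(data, "label", rng)))
```

Because each row is kept or dropped independently, the sample sizes are only correct in expectation. A fixed-size sample per bin would need something like `random.sample` on each bin instead.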

@okennedy okennedy deleted the SamplingUI branch January 23, 2020 16:31

Successfully merging this pull request may close these issues.

Sampling Operator