@Watemlifts
Owner

Many experiments operate on data with a very long tail, and the most
frequent part of the distribution can wash out notable results in
sub-groups. For example, experiment results derived from the data of
very large customers often look quite different from the much more
common results derived from small customers. Even percentile metrics
can't overcome this effect, since the relevant percentiles are often
very high (above the 99th percentile).

This adds an optional block to `Scientist::Experiment` which should
return a "cohort" when called. The block is passed the result of the
experiment so it can determine the cohort from the context data, from
whether the result is a mismatch, or from any of the observation data.
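
To make the flow above concrete, here is a minimal, self-contained sketch. It does not use the Scientist gem: `SketchResult`, `SketchExperiment`, the `customer_rows` context key, and the 1,000,000-row threshold are all hypothetical stand-ins chosen for illustration, and the real API may differ in its details.

```ruby
# Illustrative stand-in only; these classes are NOT part of the Scientist gem.
SketchResult = Struct.new(:context, :control, :candidate, :cohort) do
  # A result is a mismatch when the two observed values differ.
  def mismatched?
    control != candidate
  end
end

class SketchExperiment
  def initialize(context = {})
    @context = context
    @cohort_block = nil
  end

  # Register the optional block that determines the cohort.
  def cohort(&block)
    @cohort_block = block
  end

  # Run both code paths, build the result, then ask the block (if any)
  # for a cohort. Without a block the cohort stays nil.
  def run(control:, candidate:)
    result = SketchResult.new(@context, control.call, candidate.call, nil)
    result.cohort = @cohort_block.call(result) if @cohort_block
    result
  end
end

# Assumed context key and threshold, purely for the example.
experiment = SketchExperiment.new({ customer_rows: 2_500_000 })
experiment.cohort do |result|
  size = result.context[:customer_rows] >= 1_000_000 ? "large" : "small"
  result.mismatched? ? "#{size}-mismatch" : size
end

result = experiment.run(control: -> { 42 }, candidate: -> { 41 })
puts result.cohort
```

The block sees the finished result, so it can combine context data with the mismatch status, as it does here.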

The determined cohort value is available as `Scientist::Result#cohort`
and is intended to be used by the user-defined publication mechanism.
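
As a rough sketch of how a publication mechanism might use the cohort, the example below tallies mismatches per cohort so that a small sub-group is not averaged away by the majority. `PublishedResult` and `CohortTally` are hypothetical names, not part of the gem, and a real publisher would more likely emit tagged metrics than keep an in-memory hash.

```ruby
# Hypothetical stand-in for a result that exposes #cohort.
PublishedResult = Struct.new(:cohort, :mismatched)

# Hypothetical publisher that keeps separate counters per cohort.
class CohortTally
  attr_reader :counts

  def initialize
    @counts = Hash.new { |hash, key| hash[key] = { total: 0, mismatched: 0 } }
  end

  # Results without a cohort fall into an "uncategorized" bucket.
  def publish(result)
    bucket = @counts[result.cohort || "uncategorized"]
    bucket[:total] += 1
    bucket[:mismatched] += 1 if result.mismatched
  end
end

tally = CohortTally.new
[
  PublishedResult.new("small", false),
  PublishedResult.new("small", false),
  PublishedResult.new("large", true),
  PublishedResult.new(nil, false),
].each { |r| tally.publish(r) }

tally.counts.each do |cohort, c|
  puts "#{cohort}: #{c[:mismatched]}/#{c[:total]} mismatched"
end
```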

@Watemlifts added the `good first issue` label on Jan 21, 2022
@Watemlifts self-assigned this on Jan 21, 2022
@Watemlifts merged commit 07b490d into main on Jan 21, 2022