Evaluation
Results of an experiment evaluation.
Source code in src/epstats/toolkit/experiment.py
check_columns() classmethod
checks dataframe with columns:
timestamp - timestamp of evaluation
exp_id - experiment id
check_id - check id as in Experiment definition
variable_id - name of the variable in check evaluation; the SRM check has the following variables: p_value, test_stat, confidence_level
value - value of the variable
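Because the checks dataframe is in long format (one row per variable of a check), it is often easier to read after grouping the rows by check_id. The following is a minimal sketch in plain Python, not the library's own code: the row tuples, the integer timestamp, the experiment id and all values are made up for illustration, and pivot_checks is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical rows in the long format described above:
# (timestamp, exp_id, check_id, variable_id, value). All values are invented.
rows = [
    (1700000000, "test-srm", 1, "p_value", 0.42),
    (1700000000, "test-srm", 1, "test_stat", 0.65),
    (1700000000, "test-srm", 1, "confidence_level", 0.999),
]


def pivot_checks(rows):
    """Group long-format check rows into {check_id: {variable_id: value}}."""
    wide = defaultdict(dict)
    for _timestamp, _exp_id, check_id, variable_id, value in rows:
        wide[check_id][variable_id] = value
    return dict(wide)


checks = pivot_checks(rows)
print(checks[1]["p_value"])
```

With real data the same grouping could be done with a dataframe pivot; the dictionary form is used here only to keep the sketch dependency-free.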
Source code in src/epstats/toolkit/experiment.py
exposure_columns() classmethod
exposures dataframe with columns:
timestamp - timestamp of evaluation
exp_id - experiment id
exp_variant_id - variant id
exposures - number of exposures of this variant
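As a rough illustration of how the exposures columns might be consumed, the sketch below computes each variant's share of total exposures. It assumes rows shaped like the columns listed above; the data, the integer timestamp and the exposure_shares helper are hypothetical and not part of the library.

```python
# Hypothetical rows: (timestamp, exp_id, exp_variant_id, exposures).
# The numbers are invented for illustration only.
rows = [
    (1700000000, "test-exp", "a", 1000),
    (1700000000, "test-exp", "b", 1020),
]


def exposure_shares(rows):
    """Return {exp_variant_id: share of total exposures} for one experiment."""
    total = sum(exposures for _ts, _exp_id, _variant, exposures in rows)
    return {variant: exposures / total for _ts, _exp_id, variant, exposures in rows}


shares = exposure_shares(rows)
for variant, share in shares.items():
    print(f"{variant}: {share:.3f}")
```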
Source code in src/epstats/toolkit/experiment.py
metric_columns() classmethod
metrics dataframe with columns:
timestamp - timestamp of evaluation
exp_id - experiment id
metric_id - metric id as in Experiment definition
metric_name - metric name as in Experiment definition
exp_variant_id - variant id
count - number of exposures, value of metric denominator
mean - sum_value / count
std - sample standard deviation
sum_value - value of goals, value of metric numerator
confidence_level - current confidence level used to calculate p_value and confidence_interval
diff - relative diff between sample means of this and control variant
test_stat - value of test statistic of the relative difference in means
p_value - p-value of the test statistic under current confidence_level
confidence_interval - confidence interval of the diff under current confidence_level
standard_error - standard error of the diff
degrees_of_freedom - degrees of freedom of this variant mean
sample_size - current sample size
required_sample_size - size of the sample required to reach the required power
power - power based on the collected sample_size
false_positive_risk - false positive risk of a significant metric
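The relationship between count, sum_value, mean and diff can be sketched as follows. The mean formula comes from the column description above; reading "relative diff" as (variant mean - control mean) / control mean is an assumption, and the function names and sample numbers are hypothetical rather than taken from the library.

```python
def variant_mean(sum_value, count):
    """mean = sum_value / count, as stated in the column description."""
    return sum_value / count


def relative_diff(mean_variant, mean_control):
    """Assumed definition of the relative diff against the control variant."""
    return (mean_variant - mean_control) / mean_control


# Invented aggregates for a control and a treatment variant.
control = {"count": 1000, "sum_value": 50.0}
treatment = {"count": 1000, "sum_value": 55.0}

mean_control = variant_mean(control["sum_value"], control["count"])
mean_treatment = variant_mean(treatment["sum_value"], treatment["count"])
diff = relative_diff(mean_treatment, mean_control)

print(f"control mean={mean_control:.4f}, treatment mean={mean_treatment:.4f}, diff={diff:.2%}")
```

Under this reading, the control variant's own diff is zero, since it is compared with itself.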
Source code in src/epstats/toolkit/experiment.py