New proposal for instant experiment result data model #43

hellais · 2023-11-14T10:24:55Z

This implements: #42

codecov · 2023-11-14T10:44:27Z

Codecov Report

Attention: 171 lines in your changes are missing coverage. Please review.

Comparison is base (cb4eebd) 76.12% compared to head (6f628df) 80.92%.

❗ Current head 6f628df differs from pull request most recent head 829a17b. Consider uploading reports for the commit 829a17b to get more accurate results

Files	Patch %	Lines
oonidata/analysis/website_experiment_results.py	77.14%	109 Missing ⚠️
oonidata/dataviz/web.py	0.00%	31 Missing ⚠️
oonidata/models/observations.py	9.09%	10 Missing ⚠️
oonidata/workers/analysis.py	92.92%	7 Missing ⚠️
oonidata/analysis/web_analysis.py	80.95%	4 Missing ⚠️
oonidata/datautils.py	33.33%	2 Missing ⚠️
oonidata/models/experiment_result.py	95.23%	2 Missing ⚠️
oonidata/cli/command.py	80.00%	1 Missing ⚠️
oonidata/db/connections.py	50.00%	1 Missing ⚠️
...ata/transforms/nettests/measurement_transformer.py	80.00%	1 Missing ⚠️
... and 3 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #43      +/-   ##
==========================================
+ Coverage   76.12%   80.92%   +4.80%     
==========================================
  Files          69       69              
  Lines        5989     6067      +78     
==========================================
+ Hits         4559     4910     +351     
+ Misses       1430     1157     -273


hellais · 2023-11-21T08:40:38Z

Once #44 has landed in this branch, I should be sure to merge this without squashing, i.e. without compacting all of the above commits into one.

For the rationale behind this decision see this comment: #44 (comment)

Implements: #42

Changes:
* Some refactoring to the web_analysis to align with new experiment result model
* Refactor feature based analysis into dask
* Rename create table query to WebAnalysis
* Implement workaround for ooni/probe#2628
* Implement probe_cc based load spreading
* Fix start_analysis signature
* Kill some dead code
* Setup system for running tests on web_analysis
* Add checks for ensuring analysis and observations agree on measurement counts
* Refactor and add tests for make_cc_batches
* Add tests for ground truth generation
* A lot of bugfixing and testing improvements
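To illustrate what probe_cc based load spreading could look like, here is a minimal sketch. It is not the `make_cc_batches` implementation from this PR: the function name `sketch_cc_batches`, its signature, and the greedy strategy are assumptions made for illustration. The idea is to place the countries with the most measurements first and always add the next one to the currently lightest batch.

```python
import heapq
from typing import Dict, List


def sketch_cc_batches(cnt_by_cc: Dict[str, int], parallelism: int) -> List[List[str]]:
    """Greedily spread country codes over `parallelism` batches by measurement count.

    Hypothetical helper for illustration; not the oonidata API.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be >= 1")
    # Heap of (total_count, batch_index): the lightest batch is always on top.
    heap = [(0, idx) for idx in range(parallelism)]
    batches: List[List[str]] = [[] for _ in range(parallelism)]
    # Place the largest countries first so the small ones can fill the gaps.
    for cc, cnt in sorted(cnt_by_cc.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        batches[idx].append(cc)
        heapq.heappush(heap, (total + cnt, idx))
    # Drop empty batches (fewer countries than workers).
    return [batch for batch in batches if batch]
```

With the made-up counts `{"US": 100, "IT": 60, "DE": 50, "IR": 10}` and `parallelism=2`, both batches end up with a total of 110 measurements.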

= Overview of scope of this PR

Inside of this PR I implement an Experiment Results analysis approach based on the analysis tables.

The basic idea is to take the analysis keys that are generated by comparing an individual observation with ground truth data. Through a very large set of rules we assign individual blocking, down and ok scores based on how confident we are that a particular signal is a sign of censorship.

We then take all the scores pertaining to the observation group relevant to a measurement and generate a `MeasurementExperimentResult`, which should be backward compatible with our existing PR.

Based on this, we add support for generating the experiment results from the analysis inside of the `mkanalysis` command, plus a simple web interface for inspecting them.

In terms of performance, some cursory benchmarks were run on the dataset from 2023-09-01 to 2023-11-01: it processed data at a rate of ~7k observations per second, scaling on 34 cores.

= Summary of changes

* Implement Experiment Result generation based on the analysis tables
* Implement minimal UI for MeasurementExperimentResult
* Add support for generating MeasurementExperimentResult as part of mkanalysis cli command
* Add more tests for all of the above
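The scoring idea in the overview can be sketched roughly as follows. Everything here is hypothetical: the analysis key names, the numbers in `RULES`, the `SketchResult` class and the max-then-normalise combination are invented for illustration and are not the rules or the `MeasurementExperimentResult` model from this PR.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

# Hypothetical rule table: analysis key -> (blocked, down, ok) confidence.
# These keys and numbers are invented for illustration only.
RULES: Dict[str, Tuple[float, float, float]] = {
    "dns.inconsistent": (0.8, 0.0, 0.2),
    "tcp.timeout": (0.5, 0.5, 0.0),
    "tls.consistent": (0.0, 0.0, 1.0),
}


@dataclass
class SketchResult:
    blocked: float
    down: float
    ok: float


def summarize(analysis_keys: Iterable[str]) -> SketchResult:
    """Combine per-key scores into one result whose scores sum to 1."""
    blocked = down = ok = 0.0
    for key in analysis_keys:
        b, d, o = RULES.get(key, (0.0, 0.0, 0.0))
        # Keep the strongest signal seen for each outcome.
        blocked, down, ok = max(blocked, b), max(down, d), max(ok, o)
    total = blocked + down + ok
    if total == 0:
        # No rule matched: nothing can be said about this measurement.
        return SketchResult(0.0, 0.0, 0.0)
    return SketchResult(blocked / total, down / total, ok / total)
```

For example, `summarize(["dns.inconsistent", "tcp.timeout"])` yields a result where `blocked` is the largest of the three scores, while `summarize(["tls.consistent"])` is entirely `ok`.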

* Make utcnow() calls timezone aware
* Implement workaround for clickhouse bug mymarilyn/clickhouse-driver#388
* Implement more tests for range deletions
* Refactoring of get_prev_range functions
* Fix problem in experiment result generation
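For context on the timezone-aware change: `datetime.utcnow()` returns a naive datetime (no `tzinfo`), and Python refuses to compare naive and aware datetimes. A minimal sketch of the aware alternative follows. The helper name `utc_now` is an assumption for illustration and may not match what the PR actually uses.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def is_comparable(a: datetime, b: datetime) -> bool:
    """Return True if Python can order the two datetimes."""
    try:
        a < b
        return True
    except TypeError:
        # Raised when one datetime is naive and the other is aware.
        return False
```

A naive value such as `datetime(2023, 11, 1)` cannot be ordered against `utc_now()`, whereas `datetime(2023, 11, 1, tzinfo=timezone.utc)` can.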

hellais · 2023-11-21T08:44:20Z

The git history is now clean. This branch is ready for review.

hellais · 2024-03-06T08:17:05Z

I am going to land this into main to make progress on shipping this.

hellais added the funder/drl2022-2024 label Nov 20, 2023

hellais added 4 commits November 21, 2023 09:42

Delete dead code for previous version of experiment results

e230b25

hellais force-pushed the experiment-results branch from c9ffc63 to 6f628df Compare November 21, 2023 08:44

hellais requested a review from bassosimone November 21, 2023 09:21

Small style improvements to analysis UI

829a17b

hellais mentioned this pull request Jan 11, 2024

Generate ExperimentResults and BlockingEvents ooni/ooni.org#1282

Closed

hellais merged commit ace6e14 into main Mar 6, 2024
3 checks passed

hellais deleted the experiment-results branch March 6, 2024 08:17
