Make notebooks simpler and reusable across e-mission platform #187

Open
JGreenlee opened this issue Jan 13, 2025 · 0 comments

I am suggesting a major refactor of the repo, particularly (i) the way params are passed to notebooks and (ii) the ways notebooks can be used independently.

The primary purpose of the notebooks in this repo is to be run daily and generate outputs for the public dash.
But I also think each of the ipynb files in this repo should be "viable" as a standalone notebook. Someone should be able to grab one of the ipynb files from this repo, pull it into their e-mission-server (dockerized or otherwise), and run it without having to change a bunch of parameters in the notebook. Ideally, they would not change any parameters and would only specify a couple of environment variables.
In other words, I think the notebooks from here should be usable in the same way as the notebooks from https://github.com/e-mission/e-mission-eval-private-data.
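
As a rough sketch of what "only specify a couple of environment variables" could look like, a notebook's first cell might read its parameters as below. STUDY_CONFIG is the name already used in generate_plots.py; NOTEBOOK_YEAR, NOTEBOOK_MONTH, the helper name read_notebook_params, and the "stage-program" fallback are assumptions made up for illustration, not existing conventions in this repo.

```python
import os
from datetime import date


def read_notebook_params(env=os.environ):
    """Collect notebook parameters from environment variables.

    NOTEBOOK_YEAR and NOTEBOOK_MONTH are hypothetical variable names;
    both fall back to the current date when unset.
    """
    today = date.today()
    year = int(env.get("NOTEBOOK_YEAR", today.year))
    month = int(env.get("NOTEBOOK_MONTH", today.month))
    if not 1 <= month <= 12:
        raise ValueError(f"NOTEBOOK_MONTH must be between 1 and 12, got {month}")
    return {
        # "stage-program" is an assumed fallback for this sketch
        "study_config": env.get("STUDY_CONFIG", "stage-program"),
        "year": year,
        "month": month,
    }
```

With something like this, running a notebook locally would only need `STUDY_CONFIG=... NOTEBOOK_YEAR=... NOTEBOOK_MONTH=...` set in the shell or the IDE's run configuration, and the daily job could set the same variables.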


I think these changes would:

  1. make it simpler for future contributors to understand the codebase
    • I have now worked on all major components of the e-mission platform, and I have found the public dash to be the least intuitive, with the steepest learning curve, to start working on. Per the README, it's intended to be "simple and stupid", but I suspect it has grown more complex over time than it was originally conceived to be.
    • I also think there is a lot of good code (scaffolding, plots) here that could be transferable to other EDA/viz projects with e-mission data, but it is not organized in such a way that it can be used anywhere except this repo.
  2. make it easier to spot-check / test changes to the notebooks locally
    • we'd be able to open the notebook in VSCode / IDE of choice, set some env variables, and run notebooks locally without having to connect to the Jupyter notebook server
    • this may become even more relevant/useful if we start adding inline assert statements to all the notebooks as part of a testing strategy
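
To make the "inline assert statements" idea a bit more concrete, here is a minimal sketch of the kind of check a notebook cell could run right after computing an intermediate result. The function name check_mode_shares and the shape of its input (a dict mapping mode label to trip count) are hypothetical; the real notebooks may hold this data in a different structure.

```python
def check_mode_shares(mode_counts):
    """Sanity-check per-mode trip counts and return each mode's share.

    The asserts fail loudly inside the notebook instead of letting an
    empty or malformed result flow into a plot.
    """
    assert len(mode_counts) > 0, "no modes found; was the query range empty?"
    assert all(count >= 0 for count in mode_counts.values()), "negative trip count"
    total = sum(mode_counts.values())
    assert total > 0, "all modes have zero trips"
    shares = {mode: count / total for mode, count in mode_counts.items()}
    assert abs(sum(shares.values()) - 1.0) < 1e-9, "shares do not sum to 1"
    return shares
```

Because these are plain asserts, the same cell works whether the notebook is run by the daily job or opened locally in an IDE.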

Specific changes I suggest:

  1. Simplify the parameters that notebooks receive and/or change params to environment variables. Currently, the notebooks receive:

      year=year,
      month=month,
      program=args.program,
      study_type=dynamic_config['intro']['program_or_study'],
      mode_of_interest=mode_studied,
      include_test_users=dynamic_config.get('metrics', {}).get('include_test_users', False),
      labels=labels,
      use_imperial=dynamic_config.get('display_config', {}).get('use_imperial', True),
      sensed_algo_prefix=dynamic_config.get('metrics', {}).get('sensed_algo_prefix', "cleaned"),
      bluetooth_only=dynamic_config.get('tracking', {}).get('bluetooth_only', False),
      survey_info=dynamic_config.get('survey_info', {}),

    Besides year and month, all of these are derived from the dynamic config. So why are we not just passing the entire config? Unpacking it into a bunch of different variables, with different names, makes it less clear what is going on, and makes it diverge from other components of the e-mission platform.

    In fact, the notebooks should be able to just call eacd.get_dynamic_config themselves, rather than using this duplicated code from generate_plots.py:

      # Read and use parameters from the unified config file on the e-mission Github page
      download_url = "https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/main/configs/" + STUDY_CONFIG + ".nrel-op.json"
      print("About to download config from %s" % download_url)
      r = requests.get(download_url)
      if r.status_code != 200:  # compare by value; `is not 200` tests object identity
          print(f"Unable to download study config, status code: {r.status_code}")
          sys.exit(1)
      else:
          dynamic_config = json.loads(r.text)
          print(f"Successfully downloaded config with version {dynamic_config['version']} "\
              f"for {dynamic_config['intro']['translated_text']['en']['deployment_name']} "\
              f"and data collection URL {dynamic_config['server']['connectUrl'] if 'server' in dynamic_config else 'default'}")

    Similarly, e-mission-common should have a function that handles custom label options retrieval vs. default label options from emcommon, which would replace this bit of code:

      # dynamic_labels can be referenced from
      # https://github.com/e-mission/nrel-openpath-deploy-configs/blob/main/label_options/example-study-label-options.json
      labels = { }
      async def load_default_label_options():
          labels = await emcu.read_json_resource("label-options.default.json")
          return labels
      # Check if the dynamic config contains dynamic labels 'label_options'
      # Parse through the dynamic_labels_url:
      if 'label_options' in dynamic_config:
          dynamic_labels_url = dynamic_config['label_options']
          req = requests.get(dynamic_labels_url)
          if req.status_code != 200:
              print(f"Unable to download dynamic_labels_url, status code: {req.status_code} for {STUDY_CONFIG}")
          else:
              labels = json.loads(req.text)
              print(f"Dynamic labels download was successful for nrel-openpath-deploy-configs: {STUDY_CONFIG}" )
      else:
          # load default labels from e-mission-common
          # https://raw.githubusercontent.com/JGreenlee/e-mission-common/refs/heads/master/src/emcommon/resources/label-options.default.json
          labels = asyncio.run(load_default_label_options())
          if not labels:
              print(f"Unable to load labels for : {STUDY_CONFIG}")
          else:
              print(f"Labels loading was successful for nrel-openpath-deploy-configs: {STUDY_CONFIG}")

  2. Break scaffolding and plots into smaller, reusable pieces and relocate them to e-mission-server or e-mission-common.
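
To illustrate the first suggestion, here is a minimal sketch of how a notebook could take only the dynamic config and derive everything else itself. All function names here (fetch_json, load_dynamic_config, resolve_label_options, notebook_params) are hypothetical. load_dynamic_config is only a stand-in for calling eacd.get_dynamic_config, whose exact signature I am not assuming, and resolve_label_options stands in for a helper that does not exist in e-mission-common yet. The config keys and default values are copied from the parameter list above; mode_of_interest is left out because the snippet above does not show how mode_studied is derived.

```python
import json
from urllib.request import urlopen

CONFIG_URL_TEMPLATE = (
    "https://raw.githubusercontent.com/e-mission/nrel-openpath-deploy-configs/"
    "main/configs/{study_config}.nrel-op.json"
)


def fetch_json(url, timeout=30):
    """Download and parse a JSON document; urlopen raises on HTTP errors."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def load_dynamic_config(study_config, fetch=fetch_json):
    """Hypothetical stand-in for eacd.get_dynamic_config."""
    return fetch(CONFIG_URL_TEMPLATE.format(study_config=study_config))


def resolve_label_options(dynamic_config, load_default, fetch=fetch_json):
    """Return custom label options if the config points to them, else defaults.

    load_default is a caller-supplied function, e.g. a wrapper around the
    emcommon default label options resource.
    """
    if "label_options" in dynamic_config:
        return fetch(dynamic_config["label_options"])
    return load_default()


def notebook_params(dynamic_config):
    """Derive the per-notebook settings from the whole dynamic config."""
    metrics = dynamic_config.get("metrics", {})
    return {
        "study_type": dynamic_config["intro"]["program_or_study"],
        "include_test_users": metrics.get("include_test_users", False),
        "use_imperial": dynamic_config.get("display_config", {}).get("use_imperial", True),
        "sensed_algo_prefix": metrics.get("sensed_algo_prefix", "cleaned"),
        "bluetooth_only": dynamic_config.get("tracking", {}).get("bluetooth_only", False),
        "survey_info": dynamic_config.get("survey_info", {}),
    }
```

One difference from the current code: a failed download raises an exception here instead of printing a message and continuing with empty labels. That is a design choice for the sketch, not something the existing script does. Passing fetch and load_default as arguments also means the notebooks can be exercised offline with canned configs.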
