Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weather Data Plots #168

Open
wants to merge 67 commits into
base: master
Choose a base branch
from
Open

Weather Data Plots #168

wants to merge 67 commits into from

Conversation

petramika
Copy link
Contributor

@petramika petramika commented Apr 30, 2020

Feature to show weather plots data per county for the last 30 days.

This Feature is related to #32

For further documentation of how does this work, please refer to: https://github.com/strivelabs/ca_visit_tracking/wiki

For live test deployment, check: https://visitdata-275208.wl.r.appspot.com/index.html, if you get a CORS problem, please check a chrome extension as "Moesif Orign & CORS Changer"

anne and others added 30 commits April 16, 2020 14:07
A static html button has been added to the template for county. Also some css has been removed for a correctly bootstrap css display.
When the checkbox is pressed will be activated a filter to add the weather info. Its needed to change the chart to draw the data.
to not to redo the logic once return is extracted, the use of if-else will not compute the (consult)
now text in label weather is inside for smaller views
added previous classes for mobile view
Some dummy data is added to check that it works
useful for drawing areachart plots
- Moving code to an external file to keep code organised
- Drawing dynamic plot per county with precipitations and temperature data
yaxis has been added, colors improved, datestamp improved, texts and titles improved, margins added
p tag not needed and series already defined in chart.
when the div was not created for states, program was throwing an error. Now fixed with jquery
  - All the information is not in one plot, if I have 3 counties it would be 9 lines, too much
  - X Axis when one county is selected will 100% match
  - The correlations are perfectly visible
  - Adding another Y axis data is no effort and will not compromise the main plot
Anne Munoz and others added 7 commits April 29, 2020 17:36
Now a weather var has been created in case that some new changes are done for result. In this case this should not affect to the weather data.
Data is represented if a venue type is not selected.
@dsjoerg
Copy link
Collaborator

dsjoerg commented Apr 30, 2020

Hi, you are probably seeing that app.yaml conflicts here?

@petramika
Copy link
Contributor Author

Hi, you are probably seeing that app.yaml conflicts here?

Hi! I just saw the new data version, you are too quick! ;) I fix it!

@dsjoerg
Copy link
Collaborator

dsjoerg commented May 5, 2020

This is very cool! I'm looking at https://visitdata-275208.wl.r.appspot.com/bydatesel/California/ALL/Beach

I would rather see only high temperature rather than the low-high range, to avoid overwhelming the viewer/user. Nevertheless it's very cool! Continuing to play with it...

@VisitData-org VisitData-org deleted a comment from petramika May 5, 2020
@dsjoerg
Copy link
Collaborator

dsjoerg commented May 5, 2020

One weird thing:

  1. go to https://visitdata-275208.wl.r.appspot.com/bydatesel/California/Orange%20County/ALL
  2. click on the Weather Data checkbox
  3. now only one timeseries will be displayed, and it's not labeled to indicate which one it is

It's probably "Shops & Services" that's being displayed but there's no way for the user to know that.
A few ways to fix this, whichever way is easiest for you is fine by me.

@dsjoerg
Copy link
Collaborator

dsjoerg commented May 5, 2020

And minor issue but call it "Precipitation" not "Precipitations"

@petramika
Copy link
Contributor Author

@dsjoerg

I would rather see only high temperature rather than the low-high range

What is more useful for clients?

A) As a line
B) Or as a column

@dsjoerg
Copy link
Collaborator

dsjoerg commented May 5, 2020 via email

@petramika
Copy link
Contributor Author

petramika commented May 7, 2020 via email


BASE_URL = "http://api.weatherapi.com/v1/history.json?key={}&q={}+united+states&dt={}"
# Get a weatherapi.com api key
API_KEY = os.environ.get("API_WEATHER_KEY", "a70a4e2736644cdcb9d85348202404")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keys must never be committed to version control. This must be removed and the history squashed before merging. We can register the API key as an airflow secret.

BASE_URL = "http://api.weatherapi.com/v1/history.json?key={}&q={}+united+states&dt={}"
# Get a weatherapi.com api key
API_KEY = os.environ.get("API_WEATHER_KEY", "a70a4e2736644cdcb9d85348202404")
BUCKET_NAME = os.environ.get("BUCKET_NAME", "default")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should fail if env not set, to force correct configuration, instead of silently choosing an invalid bucket name.

cached_data = stated_cached_data.get(county, {})
data[county] = always_merger.merge(cached_data, api_data)

state_blob = bucket.blob("{}.json".format(selected_state))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposed: Let's use gs://data.visitdata.org/processed/vendor/api.weatherapi.com/asof/yyyymmdd/*.json to store this data.

@@ -0,0 +1,7 @@
BASE_URL = "http://api.weatherapi.com/v1/history.json?key={}&q={}+united+states&dt={}"
API_KEY = "a70a4e2736644cdcb9d85348202404"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API keys must never be committed to version control. This must be removed and squashed before merging.


return weather

def __load_state_file(state):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dunders are typically reserved for the Python Core API team. A single _ prefix would suffice here. See https://amontalenti.com/2013/04/11/python-double-under-double-wonder

`BUCKET_NAME` variable must be defined in the airflow server

Once the data is stored in the bucket, we can get the weather data for one state in the route
`weather/<state>`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to consolidate all of this under https://visitdata.org/data/ and present it as a single bundle that all goes together.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markroth8 the only issue is that it would increase the size of the initial data load for each page before it renders. If we assume that in most cases, people will be looking at the visits and not the weather, it would be better to load it only upon request.
In general if we plan to add more data sources which might be only displayed based on user's choices in the UI, the architecture of separating those data sources might be better?

@@ -2,4 +2,5 @@ runtime: python37
entrypoint: gunicorn -b :$PORT main:app

env_variables:
FOURSQUARE_DATA_VERSION: "20200503-v0"
FOURSQUARE_DATA_VERSION: "20200504-v0"
BUCKET_NAME: "vd-weather-data"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a WEATHERAPI_DATA_VERSION variable to set the as of date for the weather data as part of the deployment. We can use the same bucket (data.visitdata.org).

@@ -1,2 +1,4 @@
apache-airflow==1.10.10
google-cloud-storage==1.27.0
requests
deepmerge
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pin version in requirements.txt so we know we're running the same thing.

@@ -143,6 +145,13 @@ def data(path):
snapshot_id=app_state['foursquare_data_version'])


@app.route("/weather/<state>")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can just get this from /data/weather/<state>

google-cloud-storage==1.27.0
pyyaml==5.3.1
requests
deepmerge
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pin version in requirements.txt so we know we're all running the same thing.

@markroth8
Copy link
Collaborator

I've added some review comments. I can help with them, as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants