April / May 2020
Author: Markus Konrad
This repository contains code to obtain and analyse "popular times" data from Google Places. It also contains data fetched between March 22nd and April 15th 2020 for different places world-wide. See this blog post for background information.
All Python code in the root directory is used to fetch and manage popularity data.
The main scripts are:
- places.py: search for places of interest (POI) using the Google Maps places search API; try to fetch popularity data from each found place to assess whether it is a POI (i.e. whether popularity data can potentially be obtained for this place); store results in data/pois/; you may run this script several times and at different times of the day to get good results
- generate_pois_full.py: generate a complete POI dataset from all previously found POI; identify the timezone for each place (according to its city's geo-coordinates); store the result to data/places_of_interest_tz.csv
- popularity.py: for the POI listed in data/places_of_interest_tz.csv, fetch popularity data and store it to data/popularity/; can (and should) be used for periodic data collection (e.g. with a cronjob) for each hour; a schedule can be set that specifies at which local time data should be collected at the given POI
- generate_popularity_full.py: generate a complete popularity dataset from previously collected popularity data in data/popularity/ and data/pois/; store the result to data/popularity.csv
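The local-time schedule mentioned for popularity.py can be illustrated with a minimal sketch. This is not the code from popularity.py: the function names `local_hour` and `is_due` and the schedule sets are hypothetical, and the sketch assumes that tz_rawoffset and tz_dstoffset are given in seconds and add up to the total offset from UTC.

```python
from datetime import datetime, timedelta, timezone


def local_hour(utc_now, raw_offset_s, dst_offset_s):
    """Return the local hour (0-23) at a POI for a UTC timestamp.

    Assumption: both offsets are in seconds and their sum is the
    total offset from UTC.
    """
    local = utc_now + timedelta(seconds=raw_offset_s + dst_offset_s)
    return local.hour


def is_due(utc_now, raw_offset_s, dst_offset_s, schedule_hours):
    """Check whether the POI's current local hour is in the schedule."""
    return local_hour(utc_now, raw_offset_s, dst_offset_s) in schedule_hours


# Hypothetical POI at UTC+1 with one hour of DST: 22:30 UTC is 00:30 local time.
utc_now = datetime(2020, 4, 1, 22, 30, tzinfo=timezone.utc)
print(is_due(utc_now, 3600, 3600, {8, 12, 18}))
print(is_due(utc_now, 3600, 3600, {0}))
```

An hourly cronjob could run such a check for every POI and fetch data only for those whose local hour is part of the schedule.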
R code to reproduce the plots etc. is available in analysis/.
All datasets are located in data/. The main datasets are:
- places_of_interest_tz.csv: all POI with their metadata
  - city: queried city
  - country: country name
  - iso2: 2-letter ISO code for the country
  - lat: city geo-coordinates latitude
  - lng: city geo-coordinates longitude
  - query: query used to find the place
  - place_id: Google Place ID
  - name: name of the place
  - addr: address of the place
  - place_lat: place geo-coordinates latitude
  - place_lng: place geo-coordinates longitude
  - tz_id: timezone ID
  - tz_rawoffset: timezone offset
  - tz_dstoffset: timezone DST offset
- popularity.csv: popularity values for POI; you should only use the local date/time columns (local_date, local_weekday, local_hour) for temporal analysis
  - place_id: Google Place ID
  - utc_date: UTC date when popularity was fetched
  - utc_weekday: UTC weekday when popularity was fetched
  - utc_hour: UTC hour when popularity was fetched
  - local_date: local date (according to place timezone) when popularity was fetched
  - local_weekday: local weekday (according to place timezone) when popularity was fetched
  - local_hour: local hour (according to place timezone) when popularity was fetched
  - current_pop: current popularity at this local date and time
  - usual_pop: usual popularity at local weekday and hour
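As a rough illustration of how the two datasets fit together, the following sketch joins them on place_id and computes the mean ratio of current to usual popularity per country and local date. It is only an example, not the analysis in analysis/ (which is written in R): the sample rows are made up, reduced to the columns used here, and the function name `mean_relative_popularity` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows, reduced to the columns used below.
POIS = """place_id,iso2,name
id-1,DE,Example Market
id-2,DE,Example Park
id-3,FR,Example Garden
"""

POPULARITY = """place_id,local_date,local_hour,current_pop,usual_pop
id-1,2020-04-01,12,20,40
id-2,2020-04-01,12,30,60
id-3,2020-04-01,12,10,40
id-1,2020-04-02,12,10,40
"""


def mean_relative_popularity(poi_rows, pop_rows):
    """Mean of current_pop / usual_pop per (iso2, local_date)."""
    country_of = {row["place_id"]: row["iso2"] for row in poi_rows}
    ratios = defaultdict(list)
    for row in pop_rows:
        country = country_of.get(row["place_id"])
        try:
            current = float(row["current_pop"])
            usual = float(row["usual_pop"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
        if country is None or usual <= 0:
            continue  # unknown place, or no baseline to compare against
        ratios[(country, row["local_date"])].append(current / usual)
    return {key: sum(vals) / len(vals) for key, vals in sorted(ratios.items())}


result = mean_relative_popularity(
    csv.DictReader(io.StringIO(POIS)),
    csv.DictReader(io.StringIO(POPULARITY)),
)
for (country, day), ratio in result.items():
    print(country, day, round(ratio, 2))
```

For the real files, the two `io.StringIO` objects would be replaced by open file handles for data/places_of_interest_tz.csv and data/popularity.csv. Grouping by local_date rather than utc_date follows the recommendation above to use the local date/time columns for temporal analysis.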
Apache License 2.0. See LICENSE file.