Pulled here, selecting all census tracts in the 5 boroughs.
http://web.mta.info/developers/data/nyct/subway/Stations.csv
https://data.cityofnewyork.us/City-Government/2010-Census-Tracts/fxpq-c8ku
https://data.ny.gov/Transportation/Turnstile-Usage-Data-2020/py8k-a8wg
Crowdsourced from here: https://groups.google.com/d/msg/mtadeveloperresources/VYReLOiV5Jg/QDbrYlG_AgAJ
Partially hand-linked in this doc: https://docs.google.com/spreadsheets/d/1yLIF85YHxMLt-aUuPjY3Cn0TlWEUXQz7E4xm8du-LZE/edit#gid=0
https://data.cityofnewyork.us/Transportation/Subway-Lines/3qz8-muuu
Pulled from the NYC PLUTO dataset. Kept only hospitals and clinics by filtering for building class codes matching I*, informed by the data dictionary.
http://web.mta.info/developers/developer-data-terms.html#data --> 'GTFS'-->'New York City Transit Subway' (Updated April 29, 2020)
# filter for hospitals and clinics based on building code
# remember to re-project it to WGS84, the projection GeoJSON expects
ogr2ogr -where "BldgClass LIKE 'I%'" -t_srs WGS84 ../output/hospitals.geojson ./MapPLUTO.shp MapPLUTO
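If GDAL is not available, the attribute half of that step can be sketched in plain Python against a GeoJSON export of the same layer. This is only an illustration under assumptions: `filter_by_building_class` and the sample features are hypothetical, the `BldgClass` field name is taken from the command above, and no reprojection is performed.

```python
import json


def filter_by_building_class(feature_collection, prefix="I"):
    """Return a FeatureCollection with only features whose BldgClass starts with prefix."""
    kept = []
    for feature in feature_collection.get("features", []):
        properties = feature.get("properties") or {}
        building_class = properties.get("BldgClass") or ""
        if str(building_class).startswith(prefix):
            kept.append(feature)
    return {"type": "FeatureCollection", "features": kept}


# Hypothetical sample data standing in for a GeoJSON export of MapPLUTO.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"BldgClass": "I1"}, "geometry": None},
        {"type": "Feature", "properties": {"BldgClass": "A5"}, "geometry": None},
        {"type": "Feature", "properties": {"BldgClass": None}, "geometry": None},
    ],
}

hospitals = filter_by_building_class(sample)
print(json.dumps([f["properties"]["BldgClass"] for f in hospitals["features"]]))
```

Features with a missing or null `BldgClass` are dropped rather than raising, which seems to match how the `LIKE 'I%'` filter treats nulls.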
https://data.cityofnewyork.us/City-Government/Neighborhood-Tabulation-Areas-NTA-/cpf4-rkhq
geo2topo output/nta.json > output/nta_topo.json
http://web.mta.info/developers/fare.html
American Community Survey data rolled up to the Neighborhood Tabulation Areas: https://www1.nyc.gov/site/planning/data-maps/open-data/dwn-acs-nta.page
Ran processingScripts/acsByNTA.py to create a cleaned-up GeoJSON, then ran
geo2topo data/output/acs_nta.geojson > data/output/acs_nta_topo.json
to create a TopoJSON.
Alternatively, to create a topojson with all of the required geographic data, you can run:
geo2topo data/output/acs_nta.geojson data/output/mapOutline/mapOutline.geojson data/output/subway-lines.geojson > mapData_topo.json
Each input file becomes an object in the output topology, named after the input file's basename (directory and extension removed).
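A quick way to confirm the combined file contains what the map code expects is to compare its `objects` keys against the input file names. The sketch below assumes geo2topo's default naming (basename without directory or extension); `default_object_name`, `missing_objects`, and the trimmed `topology` dict are hypothetical stand-ins, not part of this repo.

```python
import os


def default_object_name(path):
    """Basename of the path with directory and extension removed."""
    return os.path.splitext(os.path.basename(path))[0]


def missing_objects(topology, input_paths):
    """List expected object names that are absent from topology['objects']."""
    present = set(topology.get("objects", {}))
    return [name for name in map(default_object_name, input_paths) if name not in present]


inputs = [
    "data/output/acs_nta.geojson",
    "data/output/mapOutline/mapOutline.geojson",
    "data/output/subway-lines.geojson",
]
expected = [default_object_name(p) for p in inputs]

# Hypothetical, heavily trimmed stand-in for the contents of mapData_topo.json.
topology = {"type": "Topology", "objects": {"acs_nta": {}, "mapOutline": {}}, "arcs": []}
missing = missing_objects(topology, inputs)
print(expected, missing)
```

In practice you would load the real file with `json.load` and pass it in place of the stand-in dict.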
Uses [ArchieML](http://archieml.org/#resources) via Quartz's [aml-gdoc-server](https://github.com/Quartz/aml-gdoc-server) to pull unstructured data from this Google Doc into JSON. To pull a new version of the data, run:
aml-gdoc-server
(It may prompt you for your Google API credentials; see the [documentation](https://github.com/Quartz/aml-gdoc-server).)
That will start a server at http://127.0.0.1:6006/. To get a JSON-formatted dataset, go to http://127.0.0.1:6006/GOOGLE_DOC_ID and save the resulting file.
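The save step can also be scripted instead of done in a browser. The following is a minimal sketch, assuming the server is already running locally on port 6006 and returns a JSON body at `/GOOGLE_DOC_ID` as described above; `doc_url`, `save_doc_json`, and the `opener` parameter are hypothetical names introduced for illustration.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:6006/"


def doc_url(doc_id, base_url=BASE_URL):
    """Build the local server URL for a given Google Doc ID."""
    return base_url.rstrip("/") + "/" + doc_id


def save_doc_json(doc_id, out_path, opener=urlopen):
    """Fetch the parsed document as JSON and write it to out_path.

    `opener` is injectable so the function can be exercised without a
    running server; by default it is urllib's urlopen.
    """
    with opener(doc_url(doc_id)) as response:
        data = json.load(response)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)
    return data


# Example call (requires aml-gdoc-server to be running; the output path is hypothetical):
# save_doc_json("GOOGLE_DOC_ID", "data/output/doc.json")
```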
Downloaded borough outlines from NYC Open Data, then filtered out Staten Island (boro_code 5) on the command line with:
jq '{type: .type , features: [ .features[]| select( .properties.boro_code != "5") ] }' data/output/borough-boundaries.geojson > data/output/mapOutline.geojson
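For anyone without jq installed, the same filter can be sketched in Python. This assumes, as the jq expression does, that `boro_code` is stored as a string; `drop_borough` and the sample features are hypothetical.

```python
def drop_borough(feature_collection, boro_code="5"):
    """Return a copy of the collection without features matching boro_code.

    Like the jq filter, this compares against a string, so a numeric
    boro_code of 5 in the data would NOT be removed.
    """
    return {
        "type": feature_collection.get("type"),
        "features": [
            feature
            for feature in feature_collection["features"]
            if feature["properties"]["boro_code"] != boro_code
        ],
    }


# Hypothetical sample standing in for borough-boundaries.geojson.
boroughs = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"boro_code": "1"}, "geometry": None},
        {"type": "Feature", "properties": {"boro_code": "5"}, "geometry": None},
    ],
}

outline = drop_borough(boroughs)
print([f["properties"]["boro_code"] for f in outline["features"]])
```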
Downloaded New Jersey County Boundaries and NY Civil Boundaries (both shapefiles) and loaded them into a folder called mapOutlines. Then, from there:
# merge into single file
ogrmerge.py -o output/mapOutline -overwrite_ds mapOutlines/County_Boundaries_of_NJ-shp/County_Boundaries_of_NJ.shp mapOutlines/NYS_Civil_Boundaries_SHP/Counties_Shoreline.shp -single
# re-project
ogr2ogr output/mapOutline/reproj.shp -t_srs "WGS84" output/mapOutline/merged.shp
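# filter to bounding box (note: -spat keeps whole features that intersect the box;
# it does not cut geometries at the box edge, which is what -clipsrc does)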
ogr2ogr output/mapOutline/clipped.shp output/mapOutline/reproj.shp -spat -74.178 40.5320 -73.7309 40.946
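To make the behaviour of `-spat` concrete: it selects features that intersect the given extent. The sketch below approximates that test in Python by comparing each geometry's envelope with the box. It is an envelope-level approximation only, assumes non-empty GeoJSON geometries with a `coordinates` member, and uses hypothetical helper names and sample geometries.

```python
# xmin, ymin, xmax, ymax, copied from the -spat arguments above.
BBOX = (-74.178, 40.5320, -73.7309, 40.946)


def iter_positions(coords):
    """Yield (x, y) pairs from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def envelope(geometry):
    """Return (xmin, ymin, xmax, ymax) for a non-empty GeoJSON geometry."""
    xs, ys = zip(*iter_positions(geometry["coordinates"]))
    return min(xs), min(ys), max(xs), max(ys)


def intersects_bbox(geometry, bbox=BBOX):
    """True if the geometry's envelope overlaps the bounding box."""
    gxmin, gymin, gxmax, gymax = envelope(geometry)
    xmin, ymin, xmax, ymax = bbox
    return gxmin <= xmax and gxmax >= xmin and gymin <= ymax and gymax >= ymin


# Hypothetical sample geometries: one inside the box, one well north of it.
midtown = {"type": "Point", "coordinates": [-73.98, 40.75]}
upstate = {
    "type": "Polygon",
    "coordinates": [[[-73.8, 42.6], [-73.7, 42.6], [-73.7, 42.7], [-73.8, 42.6]]],
}

print(intersects_bbox(midtown), intersects_bbox(upstate))
```

A county polygon that only partly overlaps the box passes this test in full, which is why the filtered output can still extend beyond the box.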
Shapefile to GeoJSON using GDAL
Just to be safe, I first created a virtual environment for the GIS dependencies.
For reference: you can list all environments using conda env list.
Use ogr2ogr to convert from shapefile to GeoJSON.