id title date author layout guid permalink image categories tags
41771
Now Available: RSocrata 1.6.0
2015-04-24 08:00:56 -0500
Tom Schenk
post
/index.php/now-available-rsocrata-1-6-0/
Default
Feature
data.json
Open Data
open data portal
Project Open Data Metadata Schema
R
RSocrata
rstats

Last year the City’s data science team released the first version of RSocrata, which gives R programmers an easy way to access and download data from Socrata data portals. This week, we’ve released RSocrata 1.6.0, which introduces some new features for users.

Users can now quickly download a list of all datasets from a Socrata open data portal using ls.socrata (short for “list space Socrata”). Use the domain of an open data portal, such as data.cityofchicago.org, data.hawaii.gov, iranhumanrights.socrata.com, or any other portal hosted by Socrata:

allSitesDataFrame <- ls.socrata("https://data.cityofchicago.org") # Full portal URL, including the scheme
nrow(allSitesDataFrame) # Number of datasets
allSitesDataFrame$title # Names of each dataset

The returned data frame lists every dataset on the portal along with a brief description, download URLs, and more. It’s a quick and easy way to view the available data on the portal and to write scripts that download each dataset (using the read.socrata function).
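A download script along those lines could be sketched as follows. This is only an illustration: download_datasets is a hypothetical helper, not part of RSocrata, and it takes a plain character vector of URLs because this post does not say which column of the ls.socrata result holds the download links (names(allSitesDataFrame) will show what is available). The reader argument defaults to read.socrata and can be swapped for a stub when testing offline.

```r
# Sketch: fetch several datasets, skipping any that fail to download.
# `download_datasets` is a hypothetical helper, not an RSocrata function.
# `reader` defaults to RSocrata::read.socrata; pass a stub to test offline.
download_datasets <- function(urls, reader = RSocrata::read.socrata) {
  results <- lapply(urls, function(u) {
    tryCatch(
      reader(u),
      error = function(e) {
        # Report the failure and keep going with the remaining URLs
        message("Skipping ", u, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  names(results) <- urls
  # Keep only the datasets that were read successfully
  Filter(Negate(is.null), results)
}
```

Calling download_datasets("http://soda.demo.socrata.com/resource/4334-bgaj.csv") should return a named list with one data frame, provided the demo portal is reachable.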

This feature uses the Project Open Data Metadata Schema standard, otherwise known as data.json (currently compatible with v1.1). The schema is becoming the _de facto_ standard for transmitting metadata, and it serves as the basis for the ls.socrata function. This new feature was conceived, written and submitted by Peter Schmiedeskamp from the University of Washington.
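To give a feel for the kind of metadata involved, here is a minimal sketch that parses a tiny catalog in the data.json v1.1 layout with the jsonlite package. The two dataset entries are made up for illustration, and this is not necessarily how ls.socrata is implemented internally; it only shows how a catalog's dataset array maps onto a data frame with one row per dataset.

```r
library(jsonlite)

# A tiny, made-up catalog in the data.json v1.1 layout (illustrative values only)
catalog_json <- '{
  "conformsTo": "https://project-open-data.cio.gov/v1.1/schema",
  "dataset": [
    {"title": "Example Dataset A", "description": "First made-up entry", "identifier": "aaaa-1111"},
    {"title": "Example Dataset B", "description": "Second made-up entry", "identifier": "bbbb-2222"}
  ]
}'

catalog <- fromJSON(catalog_json)
datasets <- catalog$dataset # The array of dataset objects becomes a data frame

nrow(datasets)  # Number of datasets in the catalog
datasets$title  # Names of each dataset
```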

API Tokens

Heavy users of Socrata should use API tokens to allow for more API requests without being throttled. RSocrata now supports a separate API token field. Users can simply use that optional field to pass along their token and reduce download throttling.

token <- "ew2rEMuESuzWPqMkyPfOSGJgE"
earthquakesDataFrame <- read.socrata("http://soda.demo.socrata.com/resource/4334-bgaj.csv",
                                     app_token = token)

Want to hide your API token in a public project, such as one hosted on GitHub? Because app_token is a separate argument, you can keep your private token out of the code you share. Create a new file in your project called token.txt with the following content:

ew2rEMuESuzWPqMkyPfOSGJgE

You can read in the token using readLines:

token <- readLines("path/to/token.txt", n = 1)
read.socrata("http://soda.demo.socrata.com/resource/4334-bgaj.csv",
             app_token = token)

To mask your token, add token.txt to your .gitignore file and, voila, you’ve hidden the token.
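Since token.txt will be missing for anyone who clones the repository, a script may want to cope with that case. Below is a minimal sketch, assuming the file sits at a path you choose; read_app_token is a hypothetical helper, not part of RSocrata. It returns NULL when the file is absent or empty and strips stray whitespace otherwise.

```r
# Sketch: read an app token from a file, tolerating a missing or empty file.
# `read_app_token` is a hypothetical helper, not an RSocrata function.
read_app_token <- function(path = "token.txt") {
  if (!file.exists(path)) {
    return(NULL) # No token file, e.g. in a fresh clone of the repository
  }
  token <- trimws(readLines(path, n = 1, warn = FALSE))
  if (length(token) == 0 || !nzchar(token)) {
    return(NULL) # Empty file or blank first line
  }
  token
}
```

The result can then be passed along as app_token = read_app_token("path/to/token.txt"). This assumes read.socrata treats a NULL app_token as "no token supplied", which is worth confirming against the package documentation.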

RSocrata 1.6.0 is available on CRAN for R 3.2.0 or greater and can be downloaded using:

install.packages("RSocrata")
library("RSocrata")

Further development will be conducted on the project’s GitHub site. You can install the beta using the devtools package:

install.packages("devtools")
library(devtools)
install_github("Chicago/RSocrata")

Use GitHub to submit new features or issues. You can also reach us by contacting @ChicagoCDO or emailing [email protected].