These R packages import sports, weather, stock data and more

2016-08-27 - By 

There are lots of good reasons you might want to analyze public data, from detecting salary trends in government data to uncovering insights about a potential investment (or your favorite sports team).

But before you can run analyses and visualize trends, you need to have the data. The packages listed below make it easy to find economic, sports, weather, political and other publicly available data and import it directly into R — in a format that’s ready for you to work your analytics magic.

Packages that are on CRAN can be installed on your system with the R command install.packages("packageName"); you only need to run this once. GitHub packages are best installed with the devtools package: install that once with install.packages("devtools") and then use it to install packages from GitHub using the format devtools::install_github("repositoryName/packageName"). Once a package is installed, load it into your working session, once per session, using the format library("packageName").
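As a minimal sketch of the install-once, load-each-session pattern described above, the helper below installs a CRAN package only when it is missing and then loads it. The name ensure_package is hypothetical (it is not part of base R or devtools), and the demo call uses the stats package only because it ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a CRAN package only if it is not already
# available on this system, then attach it for the current session.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)                 # needed once per system
  }
  library(pkg, character.only = TRUE)     # needed once per session
  invisible(pkg)
}

# Demo with a package that ships with R, so no download is triggered.
ensure_package("stats")

# For a GitHub-only package, the install step would instead be
# devtools::install_github("repositoryName/packageName")
```

Checking with requireNamespace() first avoids re-downloading a package every time a script runs.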

Some of the sample code below comes from package documentation or blog posts by package authors. For more information about a package, you can run help(package="packageName") in R to get info on functions included in the package and, if available, links to package vignettes (R-speak for additional documentation). To see sample code for a particular function, try example(topic="functionName", package="packageName") or simply ?functionName for all available help about a function, including any sample code (not all documentation includes samples).
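To make the help commands above concrete, here is a short sketch that runs them against stats, a package that ships with R. The choice of stats and of the median function is only for illustration; substitute any installed package and function.

```r
# Package-level help: an index of the functions and documentation in "stats".
pkg_info <- help(package = "stats")
print(pkg_info)

# Run the sample code from the documentation of one function.
example(topic = "median", package = "stats")

# List any vignettes the package provides (many packages have none).
vignette(package = "stats")
```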

For more useful R packages, see Great R Packages for data import, wrangling and visualization.

R packages to import public data

blscrapeR
Category: Economics, Government
Description: For specific information about U.S. salaries and employment info, the Bureau of Labor Statistics offers a wealth of data available via this new package. The blsAPI package is another option. CRAN.
Sample code:
    bls_api(c("LEU0254530800", "LEU0254530600"),
        startyear = 2000, endyear = 2015)
More info: Blog post by package author

FredR
Category: Finance, Government
Description: If you’re interested just in Fed data, FredR can access data from the Federal Reserve Economic Data API, including 240,000 US and international data sets from 77 sources. Free API key needed. GitHub.
Sample code:
    fred <- FredR(api.key)
    fred$series.search("GDP")
    gdp <- fred$series.observations(series_id = 'GDPC1')
More info: Project’s GitHub page

quantmod
Category: Finance, Government
Description: This package is designed for financial modelling but also has functions to easily pull data from Google Finance, Yahoo Finance and the St. Louis Federal Reserve (FRED). CRAN.
Sample code:
    getSymbols("DEXJPUS", src="FRED")
More info: Intro on getting data

censusapi
Category: Government
Description: There are several other R packages that work with data from the U.S. Census, but this aims to be complete and offer data from all the bureau’s APIs, not just from one or two surveys. API key required. GitHub.
Sample code:
    mydata <- getCensus(name="acs5", vintage=2014,
        key=mycensuskey,
        vars=c("NAME", "B01001_001E", "B19013_001E"),
        region="congressional district:*", regionin="state:36")
More info: This Urban Institute presentation has more details; the project GitHub page offers some basics.

RSocrata
Category: Government
Description: Pull data from any municipality that uses the Socrata data platform. Created by the City of Chicago data team. CRAN.
Sample code:
    mydata <- read.socrata(
        "https://data.cityofchicago.org/Transportation/Towed-Vehicles/ygr5-vcbg")
More info: RSocrata blog post

forbesListR
Category: Misc
Description: A bit of a niche offering, this taps into lists maintained by Forbes including largest private companies, top business schools and top venture capitalists. GitHub.
Sample code:
    # top venture capitalists 2012-2016
    mydata <- get_years_forbes_list_data(years = 2012:2016,
        list_name = "Top VCs")
More info: See the project GitHub page. You may need to manually load the tidyr package for code to work.

pollstR
Category: Politics
Description: This package pulls political polling data from the Huffington Post Pollster API. CRAN.
Sample code:
    elec_2016_polls <- pollster_chart_data(
        "2016-general-election-trump-vs-clinton")
More info: See the Intro vignette

Lahman
Category: Sports
Description: R interface for the famed Lahman baseball database. CRAN.
Sample code:
    batavg <- battingStats()
More info: Blog post "Hacking the new Lahman Package 4.0-1 with RStudio"

stattleshipR
Category: Sports
Description: Stattleship offers NFL, NBA, NHL and MLB game data via a partnership with Gracenote. API key (currently still free) needed. GitHub.
Sample code:
    set_token("your-API-token")
    sport <- 'baseball'
    league <- 'mlb'
    ep <- 'game_logs'
    q_body <- list(team_id='mlb-bos', status='ended',
        interval_type='regularseason')
    gls <- ss_get_result(sport=sport, league=league,
        ep=ep, query=q_body, walk=TRUE)
    game_logs <- do.call('rbind',
        lapply(gls, function(x) x$game_logs))
More info: See the Stattleship blog post

weatherData
Category: Weather
Description: Pull historical weather data from cities/airports around the world. CRAN. If you have trouble pulling data, especially on a Mac, try uninstalling and re-installing a different version with the code install_github("ozagordi/weatherData").
Sample code:
    mydata <- getWeatherForDate("BOS", "2016-08-01",
        end_date="2016-08-15")
More info: See this post by the package author.
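Several of the packages listed here (FredR, censusapi and stattleshipR) need an API key. One common way to keep a key out of your scripts is to store it in an environment variable and read it at run time. The sketch below assumes that approach; the helper name get_api_key and the variable name CENSUS_API_KEY are hypothetical, and "demo-key" is a placeholder rather than a working key.

```r
# Hypothetical helper: read an API key from an environment variable and
# stop with a clear message if the variable has not been set.
get_api_key <- function(var) {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Set the environment variable ", var,
         " before requesting data.", call. = FALSE)
  }
  key
}

# For illustration only: normally the variable is set outside the script,
# for example in your .Renviron file.
Sys.setenv(CENSUS_API_KEY = "demo-key")

mycensuskey <- get_api_key("CENSUS_API_KEY")
```

The resulting mycensuskey value could then be passed as the key argument in the censusapi sample code.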

 

Original article here.

