stplanr 0.1.1
[This article was first published on Robin Lovelace - R, and kindly contributed to R-bloggers].
Version 0.1.1 of the package stplanr has been released on CRAN. This is a major update with many new functions and a new class definition, SpatialLinesNetwork, for route planning and network analysis using igraph.
This short post, by myself and package co-author Richard Ellison, describes how stplanr can be used for transport research with a few simple examples from the package documentation. We hope that stplanr is of use to transport researchers and practitioners worldwide and encourage contributions to the development version hosted on GitHub.
Working with origin-destination data
Origin-destination (OD) data is one of the basic data sources for understanding travel behaviour. Usually OD data in R is represented by a table containing at least the following columns:
- Origin ID: a text string identifying the zone in which journeys originate
- Destination ID: a text string identifying the destination zone
- Number of trips: the rate of travel between the unique OD pair
The example dataset flow, provided with the package, has this structure, as illustrated in the table below.
library(stplanr)
## Loading required package: sp
library(tmap)
data("flow")
knitr::kable(flow[1:3, c(1, 2, 3, 13)])
| | Area.of.residence | Area.of.workplace | All | On.foot |
|---|---|---|---|---|
| 920573 | E02002361 | E02002361 | 109 | 59 |
| 920575 | E02002361 | E02002363 | 38 | 4 |
| 920578 | E02002361 | E02002367 | 10 | 1 |
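As a rough illustration of this structure, the sketch below rebuilds the three rows shown in the table as a plain data frame in base R and derives the walking share for each OD pair. The object names od_example and trips_by_origin and the column walk_share are made up for this illustration and are not part of stplanr.

```r
# Toy OD table: the three rows printed above, typed in by hand
# (od_example and walk_share are illustrative names, not part of stplanr)
od_example <- data.frame(
  Area.of.residence = c("E02002361", "E02002361", "E02002361"),
  Area.of.workplace = c("E02002361", "E02002363", "E02002367"),
  All = c(109, 38, 10),
  On.foot = c(59, 4, 1),
  stringsAsFactors = FALSE
)

# Proportion of trips made on foot for each OD pair
od_example$walk_share <- od_example$On.foot / od_example$All

# Total number of trips originating in each zone
trips_by_origin <- aggregate(All ~ Area.of.residence, data = od_example, FUN = sum)

print(od_example)
print(trips_by_origin)
```

The first row is an intra-zone flow (origin and destination are the same zone), which is where the walking share is highest in this small sample.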
The zone IDs in flow refer to geographic zones, whose centroids are stored as a SpatialPointsDataFrame from the sp package in cents:
data(cents)
plot(cents)
To link the flow data to these points we can use the command od2line() to create a SpatialLinesDataFrame:
odlines <- od2line(flow = flow, zones = cents)
plot(cents)
plot(odlines, add = TRUE)
Note that the function also accepts a SpatialPolygonsDataFrame as an input by setting the line start and end point to the zone’s geographic centroid:
odlines <- od2line(flow = flow, zones = zones)
plot(zones)
plot(odlines, add = TRUE)
To gain a basic understanding of the rate of travel in this simple travel system, we can plot the odlines with width proportional to the number of people travelling:
plot(odlines, lwd = odlines$All / mean(odlines$All) * 3, col = "red")
plot(odlines, lwd = odlines$On.foot / mean(odlines$All) * 3, col = "green", add = TRUE)
In the resulting plot the total rate of travel is represented by the width of the red lines. The proportion of people who walk is illustrated by the relationship between the width of the green and red lines. We can use this data to explore the relationship between walking and distance:
# reproject to the British National Grid (EPSG:27700) so that lengths are in metres
odlines <- spTransform(odlines, CRS("+init=epsg:27700"))
odlines$dist <- rgeos::gLength(odlines, byid = TRUE)
plot(odlines$dist, odlines$On.foot / odlines$All)
# fit a model to the curve
m <- lm(On.foot / All ~ dist, odlines@data)
lines(odlines$dist, m$fitted.values)
summary(m)
##
## Call:
## lm(formula = On.foot/All ~ dist, data = odlines@data)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.26915 -0.06987 -0.00694  0.06190  0.63195
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  5.365e-01  4.503e-02  11.915 8.36e-16 ***
## dist        -1.409e-04  2.501e-05  -5.633 9.64e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1585 on 47 degrees of freedom
## Multiple R-squared:  0.403,  Adjusted R-squared:  0.3903
## F-statistic: 31.73 on 1 and 47 DF,  p-value: 9.638e-07
This is useful information: we can see a clear negative relationship between the distance of the trip (in metres) and the proportion who are willing to make the journey on foot.
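To make the fitted coefficients concrete, the following sketch plugs them into the linear equation by hand. It uses only the rounded estimates printed in the summary above, so the results are approximate. The helper name predict_walk_share is hypothetical and not part of stplanr.

```r
# Rounded estimates copied from the summary(m) output above
intercept <- 5.365e-01
slope <- -1.409e-04  # change in walking share per metre of distance

# Hypothetical helper: predicted proportion walking at a given distance (metres)
predict_walk_share <- function(dist_m) {
  intercept + slope * dist_m
}

# Predicted walking share at 500 m, 1 km and 2 km
predict_walk_share(c(500, 1000, 2000))

# Distance (in metres) at which the fitted line reaches zero
zero_dist <- -intercept / slope
zero_dist
```

A straight line is a crude description of a proportion: beyond roughly 3.8 km the fitted values become negative, so the model should not be extrapolated far outside the observed distances.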
Working with route-allocated ‘flow’ data
stplanr includes functions for allocating OD pairs to the transport network, including route_cyclestreet(), route_graphhopper() and, most recently, viaroute(), which provides an R interface to the superfast OSRM routing API. This is useful because roads rarely take you directly from origin to destination, as illustrated below for the trip from Leeds to London that one could take to attend the upcoming GISRUK conference:
route <- route_cyclestreet("Leeds", "Greenwich")
library(tmap)
tiles <- read_osm(bb(route, ext = 2))
tm_shape(tiles) +
  tm_raster() +
  tm_shape(route) +
  tm_lines()
We can allocate all of the OD pairs in odlines to the transport network using these functions. The routes_fast dataset, for example, was created using line2route() and represents the fastest route that a cyclist may take, according to the CycleStreets.net API. A sample of this dataset is illustrated below:
routes_fast$weight <- c(5, 10)
plot(routes_fast[1:2,], lwd = routes_fast$weight)
Note that there is some overlap between the two lines above. It is sometimes useful to take aggregate statistics for the attributes of overlapping lines, for example to estimate the number of people using any particular part of the transport network. This can be achieved using Barry Rowlingson’s function overline():
rnet <- overline(routes_fast[1:2,], attrib = "weight", fun = sum)
Note that in the above plot the final segment to the east has a weight value that is the sum of the two overlapping lines in routes_fast[1:2,]: 5 + 10 = 15. We can verify this with Barry’s neat function lineLabels():
plot(rnet, lwd = rnet$weight, col = "red")
lineLabels(rnet, "weight")
Other functions
There are many other functions designed to help transport researchers in stplanr. These include:
- read_stats19* functions which import and format UK ‘Stats19’ road traffic casualty data
- calc_catchment* functions for calculating transport ‘catchment areas’ using buffers around transport facilities
- gtfs2sldf() for reading-in Google’s GTFS format into R
- toptail* functions for removing the beginning and ends of SpatialLines objects
The calc_catchment* functions can be illustrated using some simple data from Sydney showing the potential catchment of possible separated cycle paths. First we import the data that we want to use:
library(rgdal)
## rgdal: version: 1.1-3, (SVN revision 594)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 1.11.2, released 2015/02/10
## Path to GDAL shared files: /usr/share/gdal/1.11
## Loaded PROJ.4 runtime: Rel. 4.8.0, 6 March 2012, [PJ_VERSION: 480]
## Path to PROJ.4 shared files: (autodetected)
## Linking to sp version: 1.2-2
data_dir <- system.file("extdata", package = "stplanr")
unzip(file.path(data_dir, 'smallsa1.zip'))
unzip(file.path(data_dir, 'testcycleway.zip'))
sa1income <- readOGR(".","smallsa1") # Import some population data
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "smallsa1"
## with 638 features
## It has 19 fields
testcycleway <- readOGR(".","testcycleway") # Import the path of the cycleways to test
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "testcycleway"
## with 2 features
## It has 2 fields
We can then use our population data and the path of the cycleways to estimate the population catchment for a given distance. If our population layer contains fields with multiple subsets of data for which we want to calculate the catchment area (e.g., men, women and children), we can calculate the individual catchments. For this example, we will simply use the ‘Total’ field containing the total population:
cycle_catchment <- calc_catchment(
  polygonlayer = sa1income,   # The SpatialPolygonsDataFrame containing the population data
  targetlayer = testcycleway, # The Spatial* object containing the transport infrastructure of interest
  calccols = c('Total'),      # The columns to summarise
  distance = 500,             # The desired distance
  projection = 'austalbers',  # The projection to use for calculating the area
  dissolve = TRUE             # Collapse all the population zones into a single polygon for the catchment
)
cycle_catchment$Total # Print the total catchment population
## [1] 23944.32
We can also plot the catchment area and the cycle paths. You will notice that in this example there are gaps in the buffers. These arise from gaps in the population layer where Sydney harbour passes through the area. To take into account the road network and not simply straight-line distance, we can use the calc_network_catchment function.
plot(cycle_catchment)
plot(testcycleway, col="red", add=TRUE, lwd=2)
The toptail functionality is useful for removing the beginning and ends of SpatialLines, both for improving the aesthetics of plots and for ensuring that lines do not overlap. This functionality is illustrated below using the routes_fast data.
proj4string(routes_fast) <- CRS("+init=epsg:4326")
rf_toptailed <- toptail(routes_fast, toptail_dist = 300)
plot(routes_fast, col = "red", lwd = 5)
plot(rf_toptailed, add = TRUE)
The package vignette contains some further illustrations of stplanr’s functions, which we plan to improve on over time. While it has become almost ‘industry standard’ in fields as diverse as genetics, astronomy and epidemiology, R has received limited attention in transport planning. We believe that there is great potential for R, via new packages such as stplanr, to help solve real world transport problems such as estimating the geographical distribution of cycling potential.
The ‘sustainable’ in the package name relates to the emphasis on low-carbon modes in the package such as cycling and public transport. There is a huge amount of work to be done to plan for a transition away from fossil fuels in the sector, for health and environmental reasons. In this context we hope that software such as stplanr contributes to the evidence base needed to design better transport systems.