A spatial data framework for ggplot2
Dewey Dunnington March 30, 2017
Spatial data plus the power of the ggplot2
framework means easier mapping when input data are already in the form of Spatial*
objects (most spatial data in R).
The package is available on CRAN, and can be installed using install.packages("ggspatial")
. The development version can be installed via devtools
.
install.packages("devtools") # if devtools isn't installed
devtools::install_github("paleolimbot/ggspatial")
The ggspatial
package provides several functions to convert spatial objects to ggplot2
layers. There are three cases: vector data (geom_spatial()
or ggspatial()
), raster data (geom_spatial()
or ggraster()
), and tiled basemap (geom_osm()
or ggosm()
). Their usage is almost identical to normal ggplot geom_*
functions, except that the mapping and the data arguments are switched (usually mapping
comes before data
, but in this context, where the type of the object determines the method that gets called, it makes more sense for data
to come first). Almost every call you make within this package will be:
library(ggspatial)
ggplot(...) + [ geom_osm() + ] # geom_osm is optional
geom_spatial(spatial_object, ...) +
... +
coord_map()
To save on typing, there are a few shortcuts. The first is where a geom_spatial
is the first layer: for this, you can use ggspatial(spatial_object, ...)
, which is short for ggplot() + geom_spatial(spatial_object, ...) + coord_map()
.
library(ggspatial)
## Warning: package 'ggplot2' was built under R version 3.2.5
## Warning: package 'sp' was built under R version 3.2.5
ggspatial(longlake_waterdf, fill = "lightblue")
Generally, a geom_osm()
is your first layer. For this, you can use ggosm()
, which is short for ggplot() + geom_osm() + coord_map()
. In fact, geom_osm()
only works using coord_map()
, since tiles are projected and not in lat/lon coordinates.
ggosm() +
geom_spatial(longlake_waterdf, fill = "red", alpha = 0.25)
Notice that geom_osm()
doesn’t need any arguments. Cool, right!? For more information, see ?geom_spatial
and ?geom_osm
. Now on to more mundane details…
Many (but not all) objects of type Spatial*
can be used with ggplot
using the fortify()
family, but syntax is inconsistent and results vary. This package introduces a single geom_
for use with Spatial*
objects (e.g. SpatialPointsDataFrame
, SpatialLinesDataFrame
, SpatialPolygonsDataFrame
…essentially what you get when you use rgdal::readOGR()
or maptools::readShapeSpatial()
to read any kind of spatial data). A few spatial objects are included in the package as examples, namely the longlake_*
series of layers that I used to create my honours thesis figures (a distressingly long time ago, I may add).
library(ggspatial)
data(longlake_waterdf)
ggplot() +
geom_spatial(longlake_waterdf, fill="lightblue") +
coord_map()
If we examine longlake_waterdf
, we can use the columns as aesthetics just as we would for a normal data.frame
.
ggplot() +
geom_spatial(longlake_waterdf, aes(fill=label)) +
coord_map()
A more useful use of this may be to examine a depth survey from Long Lake I took on for my honours thesis.
data(longlake_depthdf)
ggplot() + geom_spatial(longlake_waterdf[2,], fill="lightblue") +
geom_spatial(longlake_depthdf, aes(col=DEPTH.M), size=2) +
scale_color_gradient(low = "red", high = "blue") +
coord_map()
If you’ve been trying this at home, you may have noticed that you get a little message for every non-lat/lon dataset you try to plot. geom_spatial()
is spatially aware, and if you don’t tell it otherwise, it will convert all of your input data to lat/lon. I made this the default because ggplot
has the nice coord_map()
function that helpfully projects things that happen to be in lat/lon format (and communicates your intent clearly). If you’re working in the polar regions or near the international date line, it’s unlikely that this what you want. To get around the default projection, you can specify your own using crsto
(see also ?as.CRS
). For example, the previous plot could be rendered to the Google Mercator projection by passing crsto=3857
(note that you’ll have to do this for all the layers).
data(longlake_depthdf)
ggplot() +
geom_spatial(longlake_waterdf[2,], fill="lightblue",
crsto=3857) +
geom_spatial(longlake_depthdf, aes(col=DEPTH.M), size=2,
crsto=3857) +
scale_color_gradient(low = "red", high = "blue")
Using the rosm package, ggspatial
can load Open Street Map tiles (among other tile sources) automatically as a backdrop for your other data (provided your data are in lat/lon, or are converted to such using geom_spatial()
). Note that here we use ggosm(...)
instead of ggplot(...) + geom_osm() + coord_map()
, since geom_osm()
only works when using coord_map()
anyway.
ggosm(type = "stamenwatercolor") +
geom_spatial(longlake_waterdf, fill = "red", alpha = 0.25) +
geom_spatial(longlake_roadsdf, lty = 2) +
coord_map()
So far we’ve only used ggosm()
with no arguments, where it provides an auotmatic backdrop for all of the other layers. The first argument of ggosm()
(and geom_osm()
) is actually a bounding box (or an object from which one can be extracted, such as a Spatial*
object, a Raster*
object, a set of lon/lat coordinates, or a string of the location name, geocoded using prettymapr::geocode
). When used with ggosm()
, the most useful usage is a location name. It’s worth noting that passing a vector of location names results in a bounding box containing all of them, so custom bounding boxes can be constructed a little more easily.
ggosm("nova scotia")
A number of packages exist (‘raster’ being one of them) to efficiently deal with raster data, but if you are really keen on using ggplot, there are a couple of tools in ggspatial
to help yout out. The first is a fortify()
implementation of the Raster
class, so that you can pass a Raster
wherever you would pass a data.frame in ggplot and expect a data.frame with columns x, y, band1, band2, and so on. You can call fortify()
with format = "long"
to return a melted form of the Raster, which as far as I can tell is only useful for facetting.
raster_df <- fortify(longlake_osm)
## Loading required package: raster
## Warning: no function found corresponding to methods exports from 'raster'
## for: 'overlay'
head(raster_df)
## x y band1 band2 band3
## 1 409891.4 5084849 180 217 163
## 2 409894.8 5084849 184 219 164
## 3 409898.1 5084849 189 223 167
## 4 409901.4 5084849 186 223 166
## 5 409904.8 5084849 166 210 157
## 6 409908.1 5084849 163 208 156
ggplot(longlake_osm, aes(x, y, fill = band1)) +
geom_raster() +
coord_fixed()
raster_df_long <- fortify(longlake_osm, format = "long")
ggplot(raster_df_long, aes(x, y, fill = value)) +
geom_raster() + facet_wrap(~band) +
coord_fixed()
There is also an implementation of geom_spatial()
for rasters, which essentially fills in aes(x, y)
for you, and if you’re looking for an adventure, it will project and unproject the x and y coordinates. The only reason I can see that this might be useful is when using something like stat = "contour"
, which I haven’t tried yet. Unlike other implementations of geom_spatial()
, the raster version leaves coordinates as is by default. This is largely because geom_raster()
will not work in non-cartesian coordinate systems. It may be possible in the future to work in on-the-fly projection, but this is unlikely to be fast or useful.
ggplot() +
geom_spatial(longlake_osm, aes(fill = band1)) +
coord_fixed()
Is the same as…
ggraster(longlake_osm, aes(fill = band1))
Much faster is skipping the data frame and punching the raster image straight into the grob, as it were. Basically, if you already have an RGB raster (longlake_osm
is one, or you could read one using raster::brick("my_file.tif")
), you can use annotation_spraster()
or geom_spraster_rgb()
to plot it much more efficiently. Similar to ggplot’s annotation_raster()
, annotation_spraster()
does not modify the plot bounds, whereas geom_spraster_rgb()
will display the entire raster.
ggplot() + geom_spraster_rgb(longlake_osm, interpolate = TRUE) +
coord_fixed()
There is a good chance I am missing a lot of common spatial ggplot use cases. Not being somebody who makes maps in R often, there’s a good chance the community has better solutions, which I’m happy to incorporate (especially if you write the code for me…). Keep updated at the github page or at my website.