When you download geographical data the polygons are often highly detailed, leading to large file sizes and slow processing times. This tutorials shows you how to simplify detailed polygons in R and QGIS without creating slivers (gaps) between the resultant shapes.

HUGOMORE42When you download boundary data, for example from the census boundary data page, the polygons are usually highly detailed. Often this detail is unnecessary if you’re not intending to produce small–scale maps.

Most thematic maps, for example, tend to compare large geographies such as nations or regions, so the detail is unnecessary. Likewise, if you’re producing your map for use on the web, for example as an interactive visualisation, too much detail can slow the rendering and responsiveness of your app.

Downloading pre–simplified versions or trying to simplify the polygons yourself in QGIS or ArcGIS can be unsatisfactory because some simplification functions introduce ‘slivers’ or gaps between the simplified geometries.

There are at least two approaches to obtaining simplified geometries that are topologically–aware (i.e., you don’t end up with any gaps).

Update 30 August 2018 - it looks like as of QGIS v3 the default simplify function is now topologically aware.

## QGIS -`v.generalize`

One approach is to use the `v.generalize`

function in GRASS, and the easiest way to do this is through QGIS.

- Open QGIS and load your shapefile that you’d like to generalise/simplify.
- Open the ‘ Toolbox’ from the Processing menu
- Open the GRASS > Vector (v.*) menu
- Select
`v.generalize`

By default this will save a temporary layer which you can then export, or you can specify a filename. With the default values I was able to simplify a shapefile from 35MB to 11MB, or about \(\frac{1}{3}\) of the original size without introducing any gaps.

## R - `rmapshaper`

Traditionally simplifcation was performed with the `rgeos`

package in R, but this was not topologically–aware. The package `rmapshaper`

simplifies polygons without introducing slivers and gaps, and is compatible with `sf`

and `sp`

objects.

Install and load some required packages (I’m assuming you already have `tidyverse`

and `sf`

installed):

```
# install.packages("rmapshaper") # install if necessary
library("tidyverse")
library("sf")
library("rmapshaper")
```

Download and unzip the detailed shapefile for simplification:

```
tmp_dir = tempdir()
tmp = tempfile(pattern = "", tmpdir = tmp_dir, fileext = ".zip")
download.file(
paste0(
"https://borders.ukdataservice.ac.uk/ukborders/easy_download/",
"prebuilt/shape/infuse_rgn_2011.zip"
),
destfile = tmp
)
unzip(tmp, exdir = tmp_dir)
```

Read the shapefile:

`rgn = read_sf(tmp_dir, "infuse_rgn_2011")`

And simplify:

`rgn_simp = ms_simplify(rgn, keep = 0.01)`

And compare the sizes (for this I’ve used the `pryr`

package but you don’t need to do this step):

`pryr::object_size(rgn)`

`## 13.4 MB`

`pryr::object_size(rgn_simp)`

`## 162 kB`

So the simplification has definitely worked in terms of file size! Next lets compare the plots:

```
rgn_simp$simplified = "Simplified"
rgn$simplified = "Original"
rgn = reduce(
list(rgn, rgn_simp),
rbind
)
ggplot(rgn) + geom_sf() + facet_wrap( ~ simplified)
```

At this scale I can’t tell the difference, which would be perfect for web visualisation and even for printing. Even if your plots are being printed, you’ll save yourself a lot of rendering time over the unsimplified files.