Rstats

Spatial packages and Travis

A number of R spatial packages have been updated in the last couple of weeks, and this has played havoc with my Travis-CI build. I had still been using Ubuntu Trusty on Travis, which ships old versions of libraries like rgdal and rgeos, so I needed to move to updated versions of these. In addition, sf has now become a dependency for a number of spatial packages such as tmap, and it requires libudunits2-dev, which isn't installed by default.
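The fix boiled down to a few lines of Travis configuration. A minimal sketch, assuming the apt addon route; this is illustrative rather than a copy of my actual .travis.yml:

```yaml
# Illustrative .travis.yml snippet (an assumption, not the exact config used):
# install the system library that sf/units needs, which the default
# Travis images don't provide
addons:
  apt:
    packages:
      - libudunits2-dev
```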

Spatial microsimulation 101

I recently gave a presentation for analysts and data modellers at the Department for Work and Pensions (DWP) introducing the spatial microsimulation technique (specifically the IPF flavour), and below are the slides I used (use the spacebar to navigate through them). Alternatively you can download the presentation as a standard HTML file to open in your browser. Much of the content is based on material from Spatial Microsimulation with R by Robin Lovelace and Morgane Dumont (online content | physical book) and my own rakeR package for R.

Simplify polygons without creating slivers

When you download geographical data the polygons are often highly detailed, leading to large file sizes and slow processing times. This detail is often unnecessary unless you're producing large-scale maps of small areas. Most thematic maps, for example, tend to compare large geographies such as nations or regions, where the extra detail adds nothing. Likewise, if you're producing your map for use on the web, for example as an interactive visualisation, too much detail can slow the rendering and responsiveness of your app.
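As a rough sketch of the idea, and an assumption on my part rather than necessarily the method used in the post, rmapshaper::ms_simplify() simplifies shared boundaries together, which is what avoids slivers between neighbouring polygons:

```r
# Sketch only (assumed approach, not necessarily the one in the post):
# ms_simplify() simplifies shared boundaries together, so neighbouring
# polygons don't develop slivers or gaps
library(sf)
library(rmapshaper)

# 'lads.shp' is a hypothetical shapefile of detailed polygons
lads <- st_read("lads.shp")

# keep roughly 5% of the original vertices; keep_shapes = TRUE ensures
# small polygons aren't dropped entirely
lads_simple <- ms_simplify(lads, keep = 0.05, keep_shapes = TRUE)

# compare object sizes before and after simplifying
object.size(lads)
object.size(lads_simple)
```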

rakeR v0.1.1 released on CRAN

I’m proud to announce that the initial release of rakeR, v0.1.1, has been published on CRAN! It’s licensed under the GPLv3, so you can use it for any projects you wish. Purpose: the goal behind rakeR is to make performing spatial microsimulation in R as easy as possible. R is a succinct and expressive language, but previously performing spatial microsimulation required multiple stages, including weighting, integerising, expanding, and subsetting. That doesn’t even include testing inputs and outputs, or validating the results.
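With rakeR those stages collapse into a couple of function calls. A minimal sketch of the intended workflow, with cons (zone-level constraint tables) and inds (individual-level survey data) as hypothetical inputs; the function names are taken from the package documentation as I recall it and may differ between versions, so check the help pages before copying this:

```r
# Sketch of the rakeR workflow (function names as I recall them from the
# package documentation; check ?weight etc. as the API may have changed).
# 'cons' and 'inds' are hypothetical constraint and individual data frames.
library(rakeR)

vars <- c("age", "sex")                      # shared constraint variables

# 1. weighting: iterative proportional fitting of individuals to zones
wts <- weight(cons = cons, inds = inds, vars = vars)

# 2. integerising: convert fractional weights into whole individuals
wts_int <- integerise(wts, inds = inds)
```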

townsendr interactive map

This week I updated my Townsend Material Deprivation Score project. The update makes townsendr an interactive online map of deprivation that users can simply view in their browser, rather than having to download and run the R code or settle for static maps. I think the result is much more intuitive and useful. The interactivity is provided by Shiny, an R framework for building interactive web applications.
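The townsendr app itself is more involved, but the basic pattern is small. A minimal sketch of a Shiny map app, assuming leaflet for the map layer (my choice for illustration, not necessarily what townsendr uses) and a hypothetical lsoa.gpkg file with a townsend score column:

```r
# Minimal sketch of an interactive map in Shiny; leaflet and the input
# file are assumptions for illustration, not townsendr's actual code
library(shiny)
library(leaflet)
library(sf)

areas <- st_read("lsoa.gpkg")                       # hypothetical example data
pal   <- colorNumeric("viridis", areas$townsend)    # colour scale for the score

ui <- fluidPage(leafletOutput("map", height = 600))

server <- function(input, output, session) {
  output$map <- renderLeaflet({
    leaflet(areas) %>%
      addPolygons(fillColor = ~pal(townsend), fillOpacity = 0.7,
                  weight = 1,
                  label = ~as.character(round(townsend, 2)))
  })
}

shinyApp(ui, server)
```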

Formal informal testing of research code

When writing research code I do test my code and results, but until recently I’d only been doing this informally, not in any systematic way. I decided it was time to change my testing habits when I noticed I had recoded a variable incorrectly despite my informal tests suggesting nothing was wrong. Correcting this error made a small, but noticeable, difference to my model.
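The fix is to make those informal checks explicit. A minimal sketch using testthat (my choice of tool here), with a made-up recode purely for illustration:

```r
# Minimal sketch of formally testing a recode with testthat
# (the data and variable names are made up for illustration)
library(testthat)

df <- data.frame(tenure = c(1, 2, 3, 4, 5, 2, 4))

# hypothetical recode: collapse detailed tenure codes into three groups
df$tenure3 <- ifelse(df$tenure %in% c(1, 2), "owned",
              ifelse(df$tenure %in% c(3, 4), "social rent",
                     "private rent/other"))

test_that("tenure recode is complete and uses only expected levels", {
  expect_false(any(is.na(df$tenure3)))
  expect_setequal(unique(df$tenure3),
                  c("owned", "social rent", "private rent/other"))
  # the number of owned cases should match the original codes 1 and 2
  expect_equal(sum(df$tenure3 == "owned"), sum(df$tenure %in% c(1, 2)))
})
```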

Dissolve polygons in R

Dissolving polygons is another fairly elementary GIS task that I need to perform regularly. In R this can be a bit involved, but once done it is fully reproducible and the code can be re-used. This post is essentially a companion piece to Clipping polygons in R; I wrote both because I often forget how to complete these tasks in R. Getting started: let’s gather together everything we need to complete this example.
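As a rough sketch of the idea using sf and dplyr, which may differ from the approach taken in the post itself; lads.shp with a region column is a hypothetical input:

```r
# Sketch of dissolving polygons with sf/dplyr (this may differ from the
# approach in the post); 'lads.shp' is a hypothetical input
library(sf)
library(dplyr)

lads <- st_read("lads.shp")   # local authority polygons with a 'region' column

# dissolve: group by region and union the geometries within each group
regions <- lads %>%
  group_by(region) %>%
  summarise()
```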

Clipping polygons in R

Clipping polygons is a basic GIS task. It involves removing unneeded polygons outside an area of interest. For example, you might want to study all the local authority districts (LADs) in the Yorkshire and the Humber region but can only obtain shapefiles containing every LAD in the UK. Removing the LADs outside Yorkshire and the Humber can be achieved by ‘clipping’ the shapefile of LADs, using the extent of the larger region as a template.
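A quick sketch of two usual approaches using sf (an assumption on my part; the post may do this differently), with lads.shp and yorkshire.shp as hypothetical inputs:

```r
# Sketch of clipping with sf (assumed approach; the post may differ).
# 'lads.shp' and 'yorkshire.shp' are hypothetical files.
library(sf)

lads   <- st_read("lads.shp")        # all LADs in the UK
region <- st_read("yorkshire.shp")   # Yorkshire and the Humber boundary

# clip: keep only the parts of LADs that fall inside the region
lads_clipped <- st_intersection(lads, region)

# or keep whole LADs that intersect the region rather than cutting them
lads_subset <- lads[region, ]
```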

Regression Diagnostics with R

The R statistical software is my preferred statistical package for many reasons. It’s mature, well supported by communities such as Stack Overflow, has programming abilities built right in and, most importantly, is completely free (in both senses), so anyone can reproduce and check your analyses. R is extremely comprehensive in terms of available statistical analyses, and it can easily be expanded with additional packages that are free and simple to install. When there isn’t a readily available built-in function (for example, I don’t know of one to calculate the standard error), R’s support for writing functions means it’s a doddle to roll your own.
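For example, a minimal standard error (of the mean) function, written here for illustration rather than copied from the post:

```r
# Illustrative standard error of the mean, written for this example
std_error <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  sd(x) / sqrt(length(x))
}

std_error(mtcars$mpg)
```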