Test your data

I’ve written and spoken before about how important it is to test your functions and data analysis scripts. I decided to revisit these ideas and write this tutorial based on my recent experience of calculating the number of units of alcohol that panel members in the NCDS and BCS70 birth cohorts drank at different time points. I initially thought this would be a straightforward calculation, but it turned out to be vastly more complicated than I expected (it always is!). My tests of the data identified the problem (something I would likely have missed without them) and confirmed when I had solved it. I use testthat in R, although the ideas are language-agnostic.

[Read More]
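As a rough illustration of what testing data (rather than functions) can look like with testthat, here is a minimal sketch. The data frame, the column names and the conversion factor are all made up for the example and are not taken from the NCDS or BCS70 data or from the tutorial itself.

```r
library(testthat)

# Hypothetical survey extract: pints of beer drunk in the last week
drinks <- data.frame(
  id    = 1:4,
  pints = c(0, 2, 5, NA)
)

# Assumed conversion factor, for illustration only
units_per_pint <- 2
drinks$units <- drinks$pints * units_per_pint

test_that("derived alcohol units are plausible", {
  # Units can never be negative
  expect_true(all(drinks$units >= 0, na.rm = TRUE))
  # Missing answers stay missing and are not silently turned into zeros
  expect_equal(is.na(drinks$units), is.na(drinks$pints))
  # An implausibly high weekly total usually signals a coding error
  expect_lt(max(drinks$units, na.rm = TRUE), 200)
})
```

Checks like these are cheap to write and fail loudly when a recode or a merge goes wrong, which is the situation described above.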

Trigpoints data set released

I’ve packaged up the Ordnance Survey’s archive of trig points into an R package for immediate download and use with R. Install it with:

    install.packages("trigpoints")

Load it as you would a normal package (I also load a few other useful packages here):

    library("trigpoints")
    library("dplyr")
    ##
    ## Attaching package: 'dplyr'
    ## The following objects are masked from 'package:stats':
    ##
    ##     filter, lag
    ## The following objects are masked from 'package:base':
    ##
    ##     intersect, setdiff, setequal, union
    library("sf")
    ## Linking to GEOS 3. [Read More]

Unit testing in R

Yesterday I gave a short talk for the Sheffield R Users’ Group about unit testing in R. I shared some examples of tests I’ve used with rakeR and my thesis. In short, there are two things you should really be testing when working in R:

- Data analysis scripts (usually in data-raw/)
- Functions (usually in R/)

Head over to the repo and follow the instructions there to get started with your own tests: https://philmikejones. [Read More]
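For the second kind of test (functions), a unit test might look something like the sketch below. The function alcohol_units() and the file names in the comments are hypothetical and are not taken from the talk or from rakeR; the function uses the standard UK formula of volume in millilitres times ABV percentage, divided by 1000.

```r
library(testthat)

# Hypothetical function that would live in R/alcohol_units.R
alcohol_units <- function(volume_ml, abv) {
  stopifnot(is.numeric(volume_ml), is.numeric(abv))
  if (any(volume_ml < 0 | abv < 0, na.rm = TRUE)) {
    stop("volume_ml and abv must be non-negative")
  }
  # UK units: volume (ml) x ABV (%) / 1000
  volume_ml * abv / 1000
}

# Tests that would live in tests/testthat/test-alcohol_units.R
test_that("alcohol_units() returns known values", {
  expect_equal(alcohol_units(1000, 10), 10)
  expect_equal(alcohol_units(c(500, 250), c(4, 12)), c(2, 3))
})

test_that("alcohol_units() rejects impossible input", {
  expect_error(alcohol_units(-1, 5), "non-negative")
})
```

In a package these tests would normally be run with devtools::test() rather than by sourcing the file directly.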

rakeR v0.2.1 patched

I’ve patched rakeR on CRAN to v0.2.1 to fix a couple of problems in the examples and tests, which were using old labels and failing on some machines (thanks to Derek Atherton for the feedback). I’ve also updated the documentation website to use the new pkgdown template to be consistent with other R packages, most notably the tidyverse. And that’s about it. If you’re using v0.2.0 and are happy with it, there are no changes to the API to worry about. [Read More]

rakeR v0.2.0 on CRAN

I am absolutely delighted to announce that the latest version of rakeR, version 0.2.0, is on CRAN. You can install it in R or RStudio with:

    install.packages("rakeR")

DOI

rakeR now has a DOI. This is probably more useful for me than it is for you but nevertheless, if you use rakeR please be sure to cite it and use the DOI: https://doi.org/10.5281/zenodo.821506

Changes and improvements

Speed improvements in integerise()

The most noticeable change is that the integerise() step, which previously took hours on a reasonable-sized data set, now takes minutes. [Read More]

Spatial packages and Travis

A number of R spatial libraries have been updated in the last couple of weeks, and this has played havoc with my Travis CI build. I had still been using Ubuntu Trusty with Travis, which ships old versions of the system libraries behind rgdal and rgeos, so I needed to move to updated versions of these. In addition, sf has now become a dependency for a number of spatial packages like tmap, and it needs libudunits2-dev, which isn’t installed by default. [Read More]
sf  travis  rstats  gis 

Spatial microsimulation 101

I recently gave a presentation for analysts and data modellers at the Department for Work and Pensions (DWP) introducing the spatial microsimulation technique (specifically the IPF flavour). Below are the slides I used (use the spacebar to navigate through them). Alternatively, you can download the presentation as a standard HTML file to open in your browser. Much of the content is based on material from Spatial Microsimulation with R by Robin Lovelace and Morgane Dumont (online content | physical book) and my own rakeR package for R. [Read More]
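To give a flavour of what IPF (iterative proportional fitting) does, here is a minimal base R sketch. It is not the code from the slides and not how rakeR implements it; the individuals, the constraint totals and all variable names are invented for the example.

```r
# Hypothetical survey of four individuals
inds <- data.frame(
  age = c("young", "young", "old", "old"),
  sex = c("f", "m", "f", "m"),
  stringsAsFactors = FALSE
)

# Hypothetical known totals for one zone (e.g. from a census)
age_target <- c(young = 6, old = 4)
sex_target <- c(f = 6, m = 4)

# Start with equal weights, then repeatedly rescale them so the
# weighted totals match each constraint in turn
weights <- rep(1, nrow(inds))
for (i in seq_len(10)) {
  # ave() returns the current weighted total of each row's own group
  weights <- weights * unname(age_target[inds$age]) /
    ave(weights, inds$age, FUN = sum)
  weights <- weights * unname(sex_target[inds$sex]) /
    ave(weights, inds$sex, FUN = sum)
}

# Weighted totals by age and by sex should now match the targets
tapply(weights, inds$age, sum)
tapply(weights, inds$sex, sum)
```

The resulting weights say how many people in the zone each survey individual represents; producing whole individuals from these fractional weights is a separate integerisation step.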