Thin Plate Splines


Thin plate splines are used to interpolate and smooth data. They generalise cubic splines to two or more dimensions, so they can fit a smooth surface z = f(x, y) through scattered points in three dimensions.

Thin plate splines can be used to develop contour maps, as shown above. The original data set contained more than 55,000 data points (see below), and it was difficult to discern a trend or pattern among them. The image above shows 153 years of temperature observations in Sydney, Australia. The winter temperatures (blue bands) appear to become less intense over time.


Adding a Dimension

The original data set was a list of daily maximum temperatures over a 153-year period. To visualise the effect of the seasons, the data was transformed by inserting breakpoints at every 365 observations and constructing a matrix of 365 rows (days) by 153 columns (years). The aspect ratio of this matrix was then refined by binning (i.e. aggregating) roughly every four observations and taking the mean of each bin. The resulting matrix contained 84 x 153 (12,852) values, and the result of this transformation is shown in the blue image below.
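The reshaping step can be sketched in base R as follows. This is a minimal illustration on synthetic data, not the Sydney observations. The exact binning rule used originally is not given, so splitting the 365 days into 84 roughly equal bins with cut() is an assumption, and the names temps, m, bin and binned are made up for the example.

```r
set.seed(1)
n_days  <- 365
n_years <- 153

# Synthetic daily maxima: a seasonal cycle plus noise (stand-in data)
day   <- rep(seq_len(n_days), times = n_years)
temps <- 22 + 5 * cos(2 * pi * day / n_days) + rnorm(n_days * n_years)

# matrix() fills column by column, so each column holds one year
m <- matrix(temps, nrow = n_days, ncol = n_years)

# Assign each day of the year to one of 84 roughly equal bins (assumed rule)
bin <- cut(seq_len(n_days), breaks = 84, labels = FALSE)

# Sum each bin within each year, then divide by the bin sizes to get means
binned <- rowsum(m, group = bin) / as.vector(table(bin))

dim(binned)
```

Because matrix() fills columns first, breaking the series every 365 values puts one year in each column without any explicit loop.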


The image represents more than 55,000 daily temperature observations over a 153-year period. The original data was binned into about 13,000 buckets, and each tile above represents one of these buckets. The tiles were separated into deciles and 10 different shades of blue were assigned: the darker the colour, the higher the temperature. It is difficult to determine any trend by looking at the raw data. Interpolation techniques such as thin plate splines can be used to fit trend lines to a data set, which may help to reveal patterns in the data.

Fitting Curves

To fit curves to the data, the raw data was summarised by reducing the 55,000 data points to a smaller set of 700 points. This is shown in the black and white image below. The size of each circle indicates the magnitude of the average temperature: the larger the circle, the higher the temperature. Curves were then fitted to this reduced data set. After the curves were fitted, a raster was created by projecting the curves onto a lattice of about 6.5 million pixels. The raster contained a continuous set of values; for analysis and visualisation, these values were discretised into 20 bins and a colour was assigned to each of the 20 bands.

The image on the left shows how the original data, a 365 x 153 matrix, was aggregated into a smaller 20 x 35 matrix by binning the data and then taking the mean of each bin. This resulted in a regular tessellation, as shown by the image on the right. The size of each white dot indicates the magnitude of the temperature: the higher the temperature, the larger the dot.
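A rough sketch of this aggregation in base R is given below, again on synthetic data. Which axis was reduced to 20 and which to 35 is not stated, so binning the 365 rows into 20 and the 153 columns into 35 is an assumption, as are the variable names. The last step turns the small matrix into a table of 700 (x, y, temp) points, the form needed for fitting a surface.

```r
set.seed(3)
# Stand-in for the 365 x 153 matrix of daily maxima
m <- matrix(rnorm(365 * 153, mean = 22, sd = 4), nrow = 365, ncol = 153)

# Bin index for every row and every column (assumed split: 20 x 35)
row_bin <- cut(seq_len(nrow(m)), breaks = 20, labels = FALSE)
col_bin <- cut(seq_len(ncol(m)), breaks = 35, labels = FALSE)

# Mean of every (row bin, column bin) block
small <- tapply(
  as.vector(m),
  list(row_bin[as.vector(row(m))], col_bin[as.vector(col(m))]),
  mean
)

# Long form: one row per block, i.e. 700 three-dimensional points
pts <- data.frame(
  x    = as.vector(col(small)),
  y    = as.vector(row(small)),
  temp = as.vector(small)
)

dim(small)
nrow(pts)
```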


Implementation in R

The thin plate spline function, Tps(), from the fields package was used to create the curves that interpolate between the 700 3D points. Once the model had been calculated, it was applied to a grid of 6.5 million cells. Several spatial classes from the "sp" package were used as containers for the 700 3D points and for the grid of 6.5 million cells. The following code snippet shows how the model was created and then applied to the 6.5 million cells. The SPDF object is an instance of the SpatialPixelsDataFrame class.

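# 1) Construct the model.
# coordinates() returns the x, y values; spDf$temp is the z value.
# The right-hand sides are assumed from the description above.
tps <- Tps(coordinates(spDf), spDf$temp)
# 2) Apply the model to SPDF, which has 6.5 million cells.
SPDF$spl_pred <- predict(tps, coordinates(SPDF))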
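To show what a thin plate spline fit involves, here is a small self-contained sketch in base R. It is not the fields implementation: Tps() also chooses a smoothing parameter, whereas this version interpolates exactly when lambda = 0. The function names (tps_basis, cross_dist, fit_tps, predict_tps) and the toy data are invented for the illustration.

```r
# Radial basis for 2-D thin plate splines: phi(r) = r^2 * log(r), phi(0) = 0
tps_basis <- function(r) ifelse(r == 0, 0, r^2 * log(r))

# Pairwise Euclidean distances between the rows of a and the rows of b
cross_dist <- function(a, b) {
  sqrt(outer(a[, 1], b[, 1], "-")^2 + outer(a[, 2], b[, 2], "-")^2)
}

# Solve the standard TPS linear system for the weights and the affine part
fit_tps <- function(xy, z, lambda = 0) {
  n <- nrow(xy)
  K <- tps_basis(cross_dist(xy, xy)) + lambda * diag(n)
  P <- cbind(1, xy)                                  # columns: 1, x, y
  A <- rbind(cbind(K, P), cbind(t(P), matrix(0, 3, 3)))
  sol <- solve(A, c(z, rep(0, 3)))
  list(xy = xy, w = sol[1:n], a = sol[n + 1:3])
}

# Evaluate the fitted surface at new (x, y) locations
predict_tps <- function(fit, new_xy) {
  drop(tps_basis(cross_dist(new_xy, fit$xy)) %*% fit$w +
         cbind(1, new_xy) %*% fit$a)
}

# Toy data: 36 points on a 6 x 6 grid (not the Sydney data)
g   <- seq(0, 1, length.out = 6)
xy  <- as.matrix(expand.grid(x = g, y = g))
z   <- sin(3 * xy[, 1]) + xy[, 2]^2
fit <- fit_tps(xy, z)

# Project the fitted surface onto a finer 50 x 50 lattice
fine    <- seq(0, 1, length.out = 50)
grid_xy <- as.matrix(expand.grid(x = fine, y = fine))
surface <- matrix(predict_tps(fit, grid_xy), nrow = 50)
```

The same two steps, fitting on a few points and then predicting on a dense lattice, are what Tps() and predict() do in the snippet above, at a much larger scale.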

Creating Contour Bands

The following code snippet shows how the 6.5 million pixels were binned into 20 bands using the classIntervals() function. Ten colours were originally taken directly from the ColorBrewer website, and these were interpolated into 20 colours using the colorRampPalette() function.

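# Number of contour bands
intDivNum <- 20
# Set of 10 colours from colorbrewer2.org
# (the first five are missing from this listing)
colVect <- c("#e0f3f8","#abd9e9","#74add1","#4575b4","#313695")
# Turn 10 colours into 20: first get a function from colorRampPalette()
cr <- colorRampPalette(colVect)
# Next apply the function. cGrad contains 20 colours
cGrad <- cr(intDivNum)
# Slice the temperatures into 20 bins (arguments assumed)
ci <- classIntervals(SPDF$spl_pred, n = intDivNum)
image(SPDF, "spl_pred", breaks = ci$brks, col = cGrad)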
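The banding step can be tried without any spatial objects. The sketch below uses only base R: quantile() and findInterval() stand in for classIntervals(), and since the style of breaks used originally is not stated, quantile breaks are an assumption. The five colours are the ones that appear in the listing above; pred is synthetic.

```r
# Five of the ColorBrewer colours from the listing above
base_cols <- c("#e0f3f8", "#abd9e9", "#74add1", "#4575b4", "#313695")
n_bands   <- 20

# colorRampPalette() returns a function; calling it interpolates the palette
ramp  <- colorRampPalette(base_cols)
cGrad <- ramp(n_bands)

# Synthetic stand-in for the predicted temperatures
set.seed(4)
pred <- rnorm(5000, mean = 22, sd = 4)

# 21 break points define 20 bands (quantile breaks are an assumption)
brks <- quantile(pred, probs = seq(0, 1, length.out = n_bands + 1))

# Band index 1..20 for every value, then the matching colour
band     <- findInterval(pred, brks, all.inside = TRUE)
band_col <- cGrad[band]

table(band)
```

With quantile breaks every band holds about the same number of pixels; equal-width breaks would instead make the band widths equal in degrees.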

Github Code and Data

The data and code used to create the images above can be obtained here.
