About the Data

The geovisualisations produced during the project are not primarily about the data, but good-quality data is essential to any GIS work. The spatial interaction data chosen for this project was the ward-level migration dataset from the UK Census. There are a number of different ward-level tables; in this case I chose table MG201, which provides a total of all movers.

More information on this kind of data can be found here. For a more in-depth, academic perspective, see this paper (link may load slowly) by John Stillwell and Oliver Duke-Williams.

A quick summary for now: the UK Census was last taken on Sunday 29 April 2001. One of the questions asked people where they lived one year before that day (29 April 2000). With this information, we can map the residential mobility of millions of people and households. Comparing those previous addresses with where people lived on Census day generates the spatial interaction data. The MG201 dataset includes 10,608 ward origins and destinations. This produces a matrix with over 112 million cells, although in fact only just over 850,000 of them contain an interaction between UK wards. (Commuting data is also collected from the Census, by asking for the postcode of where people work.)

The spatial units for which the MG201 data are available are known as 'interaction wards' since they comprise 7,969 census area statistics (CAS) wards for England, 881 for Wales, 582 for Northern Ireland, and 1,176 standard table wards for Scotland. There are subtle differences between these ward types but in the vast majority of cases CAS wards and standard table wards are comparable.
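The ward counts and the matrix size quoted above can be checked with a few lines of arithmetic. This is only a minimal sketch: the variable names are illustrative, and the figure of 850,000 is the rounded value from the text ("just over 850,000"), not an exact count.

```python
# Interaction wards by country, as quoted in the text.
interaction_wards = {
    "England (CAS wards)": 7969,
    "Wales (CAS wards)": 881,
    "Northern Ireland (CAS wards)": 582,
    "Scotland (standard table wards)": 1176,
}

# Every ward is both a possible origin and a possible destination.
total_wards = sum(interaction_wards.values())
matrix_cells = total_wards ** 2

# Assumption: rounded figure taken from the text, not an exact count.
observed_interactions = 850_000
density = observed_interactions / matrix_cells

print(f"{total_wards:,} wards give {matrix_cells:,} origin-destination cells")
print(f"Roughly {density:.2%} of cells contain an interaction")
```

In other words, fewer than one cell in a hundred holds a flow, so the matrix is very sparse.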

Academic users at UK institutions can download this data from the WICID interface on the CIDER website.

A final issue to note here, and one that is unique to UK spatial interaction data, is that of disclosure control, or small cell adjustment (see Stillwell and Duke-Williams, 2007). The so-called ‘small cell adjustment method’ (SCAM) was applied to all Special Migration Statistics (SMS) tables, with the exception of migration flows with destinations in Scotland. In this process, small cell counts presumed to be either 1 or 2 were adjusted to either 0 or 3 (see Duke-Williams and Stillwell, 2007). From a user's point of view, the preponderance of threes and multiples thereof is the most obvious manifestation of the SCAM process, and it leads to greater uncertainty in the data as the spatial units get smaller and individual flows shrink with them. The approach adopted here cannot overcome the SCAM restrictions, but it can give visual expression to a disclosure control mechanism that deliberately obfuscates large portions of a dataset.
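To make the effect of the adjustment concrete, here is a small illustrative sketch. It is not the official procedure, whose details are not given here. It only reproduces the behaviour described above, where counts of 1 or 2 become 0 or 3. The function name, the example flows and the rounding probabilities (chosen so that the expected value is preserved) are all assumptions made for illustration.

```python
import random


def small_cell_adjust(count, rng):
    """Hypothetical adjustment: counts of 1 or 2 become 0 or 3; others are unchanged."""
    if count in (1, 2):
        # Assumption: round up to 3 with probability count/3, otherwise down to 0,
        # so that the expected value of the adjusted cell equals the true count.
        return 3 if rng.random() < count / 3 else 0
    return count


rng = random.Random(2001)  # fixed seed so the sketch is repeatable

# Invented example flows between pairs of wards (not real census values).
true_flows = [0, 1, 2, 3, 4, 1, 2, 7, 1, 12]
adjusted_flows = [small_cell_adjust(c, rng) for c in true_flows]

for true, adjusted in zip(true_flows, adjusted_flows):
    print(f"true flow {true:>2} -> published flow {adjusted:>2}")
```

After such an adjustment no published cell can equal 1 or 2, which is why threes are so prominent in small flows.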

  • Duke-Williams, O. and Stillwell, J. (2007) ‘Investigating the potential effects of small cell adjustment on interaction data from the 2001 Census’, Environment and Planning A, 39(5), 1079-1100.
  • Stillwell, J. and Duke-Williams, O. (2007) ‘Understanding the 2001 UK census migration and commuting data: the effect of small cell adjustment and problems of comparison with 1991’, Journal of the Royal Statistical Society Series A (Statistics in Society), 170(2), 1-21.
  • Census output is Crown copyright and is reproduced with the permission of the Controller of HMSO and the Queen's Printer for Scotland
  • Source: 2001 Census: Special Migration Statistics (Level 2)