Internal Migration Estimation in England and Wales
During my MSc I worked on methods for combining internal migration data in England and Wales. Migration data are often represented in square tables of origin-destination flows. These tables are of particular interest for analysing migration patterns when they are disaggregated by age, sex and some other variable such as illness, ethnicity or economic status. In England and Wales the data within these detailed flow tables are typically missing in non-census years. However, row and column (origin and destination) totals are regularly provided from the NHS patient registers (see the first two columns of the hypothetical data situation below). I worked on a method to estimate the missing detailed flows so that they sum to the provided totals in non-census years (see the third column of the hypothetical data situation below). This method is particularly useful for estimating migration flow tables disaggregated by detailed characteristics of migrants (such as illness, ethnicity or economic status) that the ONS provides only for census years.
Hypothetical Example of Data Set Situation (where migrant origins are labelled on the vertical axis and destinations on the horizontal axis).
Column 1: Auxiliary Data (e.g. 2001 Census)

Without Limiting Long Term Illness

| Origin \ Destination | N | M | S | Total |
| --- | --- | --- | --- | --- |
| N | 80 | 20 | 50 | 150 |
| M | 50 | 100 | 50 | 200 |
| S | 10 | 30 | 110 | 150 |
| Total | 140 | 150 | 210 | 500 |

With Limiting Long Term Illness

| Origin \ Destination | N | M | S | Total |
| --- | --- | --- | --- | --- |
| N | 30 | 10 | 20 | 60 |
| M | 40 | 50 | 70 | 160 |
| S | 30 | 10 | 40 | 80 |
| Total | 100 | 70 | 130 | 300 |

Column 2: Primary Data (e.g. 2004 NHSCR Data)

Illness details unavailable

| Origin \ Destination | N | M | S | Total |
| --- | --- | --- | --- | --- |
| N | | | | 260 |
| M | | | | 320 |
| S | | | | 170 |
| Total | 200 | 370 | 180 | 750 |

Column 3: Detailed Estimates for 2004 Based on Methodology

Without Limiting Long Term Illness

| Origin \ Destination | N | M | S | Total |
| --- | --- | --- | --- | --- |
| N | 88 | 56 | 40 | 183 |
| M | 29 | 145 | 21 | 195 |
| S | 7 | 52 | 54 | 113 |
| Total | 124 | 252 | 115 | 491 |

With Limiting Long Term Illness

| Origin \ Destination | N | M | S | Total |
| --- | --- | --- | --- | --- |
| N | 33 | 28 | 16 | 77 |
| M | 23 | 73 | 29 | 125 |
| S | 20 | 17 | 20 | 57 |
| Total | 76 | 118 | 65 | 259 |
The estimated values maintain some properties (various cross product ratios) of the Census data whilst updating the marginal totals to more current data. For more details see my MSc dissertation (which I have put online here). It contains the R/S-Plus code to conduct the estimation in the Appendix. Note that there is also a published paper based on my MSc (abstract and links here) that uses slightly modified R code.
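The post does not name the estimation technique, so the following is only a sketch under an assumption: iterative proportional fitting (IPF) is one standard way to update a table to new marginal totals while keeping its cross product ratios. It is written in plain Python for illustration and is not the R/S-Plus code from the dissertation. The function names `ipf` and `cross_product_ratio` are hypothetical, and the numbers are the hypothetical Census flows and NHSCR totals from the example above.

```python
# Illustrative sketch only. Iterative proportional fitting (IPF) is assumed here;
# ipf() and cross_product_ratio() are hypothetical names, not the author's R/S-Plus code.

# 2001 Census flows from the example, indexed census[k][i][j]:
# k = illness group, i = origin (N, M, S), j = destination (N, M, S).
census = [
    [[80, 20, 50], [50, 100, 50], [10, 30, 110]],  # without limiting long term illness
    [[30, 10, 20], [40, 50, 70], [30, 10, 40]],    # with limiting long term illness
]
origin_totals = [260, 320, 170]       # 2004 NHSCR row totals
destination_totals = [200, 370, 180]  # 2004 NHSCR column totals


def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Rescale seed flows until origin and destination totals match the targets."""
    est = [[[float(v) for v in row] for row in layer] for layer in seed]
    groups, n = range(len(est)), range(len(row_targets))
    for _ in range(max_iter):
        # Step 1: scale each origin so it sums, over groups and destinations, to its target.
        for i in n:
            factor = row_targets[i] / sum(est[k][i][j] for k in groups for j in n)
            for k in groups:
                for j in n:
                    est[k][i][j] *= factor
        # Step 2: scale each destination in the same way.
        for j in n:
            factor = col_targets[j] / sum(est[k][i][j] for k in groups for i in n)
            for k in groups:
                for i in n:
                    est[k][i][j] *= factor
        # Destination totals now match; stop once the origin totals still match too.
        gap = max(
            abs(sum(est[k][i][j] for k in groups for j in n) - row_targets[i])
            for i in n
        )
        if gap < tol:
            break
    return est


def cross_product_ratio(layer, i, j, i2, j2):
    """Cross product (odds) ratio of a 2x2 sub-table of one origin-destination layer."""
    return (layer[i][j] * layer[i2][j2]) / (layer[i][j2] * layer[i2][j])


estimates = ipf(census, origin_totals, destination_totals)
for label, layer in zip(["Without illness", "With illness"], estimates):
    print(label)
    for row in layer:
        print("  ", [round(v) for v in row])
```

Because each step only multiplies whole rows or whole columns by a constant, a call such as `cross_product_ratio(estimates[0], 0, 0, 1, 1)` returns the same value as it does for `census[0]`, while the origin and destination totals move to the 2004 figures. The printed, rounded flows can be compared with the detailed estimates in the example.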