Combining Data from Multiple Admin and Survey Sources for Statistical Purposes

Course Information

Combining data from multiple administrative and survey sources for statistical purposes

Course Summary

Day one provides a general introduction to combining multiple administrative and survey datasets for statistical purposes. A total-error framework is presented for integrated statistical data, which provides a systematic overview of the origin and nature of the various potential errors. The most typical data configurations are illustrated and the relevant statistical methods reviewed.

Day two covers a selection of statistical methods. Training will be given in data fusion, also known as statistical matching, by which joint statistical data is created from separate marginal observations. Participants will also be introduced to several imputation and adjustment techniques that apply when overlapping data sources impose constraints.
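As a rough illustration of the uncertainty involved in categorical data fusion, the sketch below is not taken from the course materials. It assumes two hypothetical sources that each observe only one categorical variable (Y in one, Z in the other), with no common variables, and the function names and marginal proportions are invented for the example. Because the two variables are never observed together, each joint cell proportion is only known to lie between its Fréchet bounds; assuming independence picks one point inside that range.

```python
def frechet_bounds(p_y, p_z):
    """Return (lower, upper) Frechet bounds for every joint cell P(Y=j, Z=k).

    p_y and p_z are lists of marginal proportions, each summing to 1.
    """
    lower = [[max(0.0, py + pz - 1.0) for pz in p_z] for py in p_y]
    upper = [[min(py, pz) for pz in p_z] for py in p_y]
    return lower, upper


def independence_joint(p_y, p_z):
    """Joint table obtained by assuming Y and Z are independent."""
    return [[py * pz for pz in p_z] for py in p_y]


# Hypothetical marginals: Y observed only in source A, Z only in source B.
p_y = [0.7, 0.3]
p_z = [0.6, 0.4]

lower, upper = frechet_bounds(p_y, p_z)
joint = independence_joint(p_y, p_z)

for j in range(len(p_y)):
    for k in range(len(p_z)):
        print(f"cell ({j},{k}): lower={lower[j][k]:.2f}, "
              f"independence={joint[j][k]:.2f}, upper={upper[j][k]:.2f}")
```

The width of each interval shows how much of the joint distribution is left undetermined by the marginals alone. In practice, variables common to both sources would usually be conditioned on, which narrows the bounds.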

Target Audience:

This course is ideal for social and medical researchers interested in combining or analysing data from multiple sources, and for staff at National Statistical Institutes (or similar organisations) who are involved in the design, management and quality assurance of statistical processes based on multiple data sources, including censuses, administrative data and sample surveys.


The following are required: an understanding of the central concepts of statistical uncertainty (such as bias, variance and confidence intervals) and of distributions; basic knowledge of data cleaning and imputation; and basic experience of R for statistical computing. Methodological training, knowledge and experience will be helpful.

Course Code

ADRCE-training Zhang 040 - 2017

Course Dates

7th November 2017 – 8th November 2017

Course Leader

Prof Li-Chun Zhang

Course Description

Course Contents:


  • Life-cycle of integrated statistical data and transformation processes
  • A framework of error sources associated with data integration
  • Population coverage and unit errors
  • Uncertainty and techniques of categorical data fusion, or statistical matching
  • Imputation and adjustment methods subjected to micro- and macro-level constraints


Learning Outcomes:

By the end of the course participants will have gained:

  • Understanding of potential errors and statistical uncertainty involved in data integration
  • Ability to apply relevant concepts and methods in practice
  • Appreciation of opportunities and challenges of inference based on data integration
