Introduction to multiple imputation and its application in Stata

Course offered by the National Centre for School Research and the DPU Research Unit "Social Inequality in Education and Beyond". To be held in Aarhus on 10-11 January 2019.

12.09.2018 | Helle Klareskov

Date: Thu 10 Jan to Fri 11 Jan
Time: 10:00 (Thu) to 12:30 (Fri)
Location: Nobelparken, Jens Chr. Skous Vej 4, 8000 Aarhus C, Building 1481, Room 264

Course description

Missing data are a pervasive problem in the social sciences.

Data for a given unit of observation may be missing entirely, for example because a sampled respondent refused to participate in a survey (unit nonresponse). Alternatively, information may be missing only for a subset of variables (item nonresponse), for example because a respondent refused to answer some of the questions in a survey.

The traditional way of dealing with item nonresponse, referred to as “complete case analysis” (CCA) or “listwise deletion”, excludes any observation with missing information from the analysis. While easy to implement, complete case analysis is wasteful and can lead to biased estimates.

Multiple imputation (MI) seeks to address these issues and provides more efficient and unbiased estimates if certain conditions are met, most importantly that the data are missing at random, meaning that missingness depends only on observed information. Therefore, it is increasingly replacing CCA as the method of choice for dealing with item nonresponse in applied quantitative work in the social sciences.

This two-day course introduces the method of multiple imputation, with an emphasis on implementing the approach in the statistics package Stata.

The course will be hands-on, with frequent opportunities to practice the covered material. Stata examples will be given throughout the course, and exercises will give participants plenty of opportunity to improve their MI-related skills.

The course will cover the following themes:

  • The problem of missing data
  • Fundamentals of Multiple Imputation
  • Imputation diagnostics
  • Dealing with complex data structures
  • Application of Multiple Imputation in Stata using chained equations

The course is limited to 30 participants.

Prerequisites

None. However, knowledge of and experience with quantitative data analysis in general, and with handling missing data in particular, is an advantage. The course will not cover regression techniques.

Software and hardware requirements

Participants will need to bring their own laptop to the course, preferably with a version of Stata installed. Participants who do not have a Stata license can team up for the exercises with participants who do.

Preparatory readings (optional, but advisable):

Allison, Paul D. 2001. Missing Data. Thousand Oaks: Sage.

Azur, Melissa J., Elizabeth A. Stuart, Constantine Frangakis, and Philip J. Leaf. 2011. “Multiple Imputation by Chained Equations: What Is It and How Does It Work?” International Journal of Methods in Psychiatric Research 20(1):40–49.

Enders, Craig K. 2010. Applied Missing Data Analysis. New York: Guilford.

StataCorp. 2015. Stata Multiple-Imputation Reference Manual. Release 14. College Station: Stata Press.

Van Buuren, Stef. 2012. Flexible Imputation of Missing Data. Boca Raton: Chapman & Hall/CRC.

Language: English

Dates: 10 January 2019, 10.00-17.00 and 11 January 2019, 9.00-12.30

Max number of participants: 30 (waiting list available if all seats are taken)

Location: Nobelparken, Jens Chr. Skous Vej 4, 8000 Aarhus C, Building 1481, Room 264

Lecturer: Dr Jan Heisig, Berlin Social Science Center

Course fee: Participation is free, but there is a no-show fee of 300 DKK. The fee will be invoiced if you fail to show up for the course or cancel your participation after the cancellation deadline.

Lunch: Lunch is included on 10 January 2019

Deadline for registration: 6 January 2019

Deadline for cancellation: 7 January 2019

Register here