Data Science: Multiple Imputation in Practice


This 4-day course teaches you the basics of solving your own missing data problems appropriately. Participants will learn how to form imputation models, how to combine data sets, how to model non-response, how to use diagnostics to inspect the imputed values, how to obtain valid inferences from incomplete data, and how to avoid many of the pitfalls associated with real-life missing data problems.

Most researchers in the social and behavioural sciences have encountered the problem of missing data: it seriously complicates the statistical analysis of data, and simply ignoring it is not a good strategy. A general and statistically valid technique for analysing incomplete data is multiple imputation, which is rapidly becoming the standard in social and behavioural science research.

This course will explain a modern and flexible imputation technique that is able to preserve important features in the data. The aim of this course is to enhance participants’ knowledge of imputation methodology and to provide a flexible solution to their incomplete data problems using R. The course will explain the principles of missing data theory, outline a step-by-step approach toward creating high-quality imputations, and provide guidelines on how the results can be reported. The course will use the authors' MICE package in R.

The lectures will follow the book “Flexible Imputation of Missing Data” by Stef van Buuren (2nd edition, Chapman & Hall, 2018). The book can also be read online for free.
Participants should have a basic knowledge of scripting and programming in R. Participants who have limited experience with R are advised to follow the Summer School course S24: Data Science: Statistical Programming in R, or a course of a similar level, beforehand.
The theory and practice discussed in this course require that participants are familiar with basic statistical concepts and techniques, such as linear modeling, least squares estimation and hypothesis testing.

This course is part of a series of 5 courses in the Summer School Data Science specialisation taught by UU’s department of Methodology & Statistics. Please see here for more information about the full specialisation. This course can also be taken separately.

Summer School Data Science specialisation:

Upon completing 3 out of 5 courses in the specialisation (no more than one text mining course), students can obtain a certificate. Each course may also be taken separately.

Please note that there is always the possibility that we have to change the course pending COVID19-related developments. The exact details, including a day-to-day program, will be communicated 6 weeks prior to the start of the course.

Course director

Dr. Gerko Vink

Lecturers


Prof. dr. Stef van Buuren (Netherlands Organization for Applied Scientific Research (TNO) and Utrecht University), dr. Gerko Vink (Utrecht University)

Target audience

This course is relevant for applied researchers or statistical researchers who would like to get acquainted with the theory and practice of multiple imputation. Participants should have a basic understanding of statistical techniques (such as analysis of variance and (non)linear regression) and the concept of statistical inference. This course is suitable for students at Master level, Advanced master level and PhD level.

A maximum of 50 participants will be allowed in this course. Please note that places in this course are allocated on a first-come, first-served basis.

Course aim

The aim of this course is to enhance participants’ knowledge of imputation methodology, and to provide a flexible solution to their incomplete data problems using R.

For an overview of all our summer school courses offered by the Department of Methodology and Statistics please click here.

Study load

Four days (09.00 – 17.00 hrs.).

Please note that there are no graded activities included in this course. Therefore, we are not able to provide students with a transcript of grades. You will obtain a certificate upon completion of this course.


Course fee:
Course + course materials
Housing fee:

Housing through: Utrecht Summer School.

You can choose between two options for participating in this course, but please note that there is always the possibility that we have to change the course pending COVID19-related developments: 

  1. If you choose the livestream option, you will get a discount on the course fee, since we will not provide lunch. The lectures will be broadcast via a livestream (not recorded) in Central European Summer Time. Participants can ask questions via the chat, which will be moderated by a second lecturer who will either answer your questions directly in the chat or pass them on to the first lecturer during class. You will also receive online support from our team during the group computer labs. Additionally, Q&A sessions will be organised, so you will benefit from our usual high level of expertise while following the class from the comfort of your own chair.
  2. If you choose the campus option, you will be able to attend the lectures and computer labs on our campus. Of course, we will follow all COVID19 guidelines that apply at the start of your course, and we will keep you updated about the newest developments. Note that, at the moment, it is unclear how many participants will be allowed in our lecture rooms. Therefore, if you register for the campus option, we will also register you for the livestream option, so that you are guaranteed a spot via the livestream (and we will initially send an invoice for this option only). We will put you ‘on hold’ for the campus option until we have more information about how many participants are allowed in our lecture rooms. As soon as we hear from the university, we will contact you and send you a second invoice for the part of the fee related to catering and campus registration.

If you are interested in the campus option, let us know via a message in the application form under ‘Student Comment’.

The on-campus course costs €615; if you participate via the livestream, you will get an €80 discount. Note that if you choose the campus option, you will first be asked to pay the livestream fee (€535) and, once we have permission from the university to organise classes on location, we will send a second invoice for the remainder of the fee. This way, you are guaranteed at least a spot in the livestream.

The tuition fee for PhD students of the Faculty of Social and Behavioural Sciences at Utrecht University will be funded by the Graduate School of Social and Behavioural Sciences.


There are no scholarships available for this course.


Please include a short description of your (scientific) background and of what you expect, or would like, to learn from this course.

More information

Irma Reyersen | E:

Recommended combinations
Data Science: Data Analysis
Data Science: Applied Text Mining
Data Science: Statistical Programming with R


Application deadline: 28 June 2021
