
This four-day hybrid course by the MICE developers teaches you the basics of solving your own missing data problems appropriately. Participants will learn how to form imputation models, how to combine data sets, how to model non-response, how to use diagnostics to inspect the imputed values, how to obtain valid inference on incomplete data and how to avoid many of the pitfalls associated with real-life missing data problems. There will be plenty of opportunity to ask the experts for help and advice throughout the course, and we end the course with the opportunity to consult us on your own specific missing data problem.
Most researchers need to deal with incomplete data. Missing data complicate the statistical analysis of data. Simply removing the incomplete cases is not a good strategy and can bias the results. Multiple imputation is a general and statistically valid technique to analyze incomplete data, and it has rapidly become the standard in social and behavioural science research.
This hybrid course will explain modern and flexible imputation techniques that are able to preserve salient data features. The course enhances participants’ knowledge of imputation principles and provides flexible hands-on solutions to incomplete data problems using R. The course discusses principles of missing data theory, outlines a step-by-step approach toward creating high-quality imputations, and provides guidelines on how to report the results. The course will use the authors’ MICE package in R.
The lectures will follow the book “Flexible Imputation of Missing Data” by Stef van Buuren (2nd edition, Chapman & Hall, 2018). The book can be read online for free at https://stefvanbuuren.name/fimd/.
Format of the course
We alternate short lectures with hands-on practical sessions and plenary discussion of the practicals. This ensures that we form an interactive group of participants who learn the theory and practice of multiple imputation in bite-size blocks. Each block builds up to the next one. We invite participants to share their own experience and challenges during these blocks so that we can foster a collaborative learning environment.
Prerequisites:
Participants should have a basic knowledge of scripting and programming in R. Participants who have limited experience with R need to have followed a relevant R course beforehand, such as:
- Winter School S002 Introduction to R followed by Winter School S004 Regression in R
- Summer School S24: Data Science: Statistical Programming in R
or any similar level course elsewhere.
The theory and practice discussed in this course require that participants are familiar with basic statistical concepts and techniques, such as linear modeling, least squares estimation and hypothesis testing.
Participants are requested to bring their own laptop computer. Software will be available online.
This course is hybrid, meaning that it can be attended both on-site and online. Please indicate your preferred mode of attendance when registering.
This course is part of a series of courses in the Summer School Data Science specialization taught by the Department of Methodology and Statistics of Utrecht University. Please see here for more information about the full specialization. This course can also be taken separately.
Lecturers
Prof. dr. Stef van Buuren (Netherlands Organization for Applied Scientific Research (TNO) and Utrecht University), dr. Gerko Vink (Utrecht University)
Target audience
This course is relevant for applied researchers or statistical researchers who would like to get acquainted with incomplete data theory and the practice of multiple imputation. Participants should have a basic understanding of statistical techniques (such as analysis of variance and (non)linear regression) and the concept of statistical inference.
This course is suitable for students at master, advanced master, and PhD level.
For an overview of all our summer courses offered by the Department of Methodology and Statistics please click here.
Aim of the course
The aims of this course are:
- To enhance participants’ knowledge of imputation methodology;
- To make participants comfortable with flexible solutions for dealing with incomplete data using R.
Learning goals:
- Participants will learn to make informed decisions on how to handle incomplete data in a scientifically valid way;
- Participants will be able to implement the approach taken using state-of-the-art R technology.
Study load
Four days (09.00 to 17.00 hrs).
You will receive a certificate upon course completion. Please be aware that this course does not include graded activities, and therefore we cannot provide a transcript of grades.
Costs
The tuition fee for PhD students from the Faculty of Social and Behavioural Sciences at Utrecht University will be funded by the Graduate School of Social and Behavioural Sciences.
There are no scholarships available for this course.
Additional information
The housing costs cover housing plus a Utrecht Summer School sleeping bag, which is yours to keep. The sleeping bag comes with an inflatable pillow and a mattress cover. If you wish to bring your own bedding, please contact us, so we can give you a € 50 discount on the housing fee. Please note that you cannot buy individual bedding items.
Application
Please include a short description of your (scientific) background and of what you expect, or would like, to learn from this course.
This course can be attended both on-site and online. Please indicate your preferred mode of attendance when registering.
Contact details
Irma Reyersen | E: ms.summerschool@uu.nl