Data Science: Multiple Imputation in Practice (Hybrid)

€730

Specifications

8 Jul. - 11 Jul. 2024

Master

1.5 ECTS

Utrecht, The Netherlands

Online course

Description

Participants will learn how to form imputation models, how to combine data sets, how to model non-response, how to use diagnostics to inspect the imputed values, how to obtain valid inference on incomplete data and how to avoid many of the pitfalls associated with real-life missing data problems. While there will be plenty of opportunity to ask the experts for help and advice throughout the course, we end the course with the opportunity to consult us on your own specific missing data problem.

Most researchers need to deal with incomplete data. Missing data complicate the statistical analysis of data. Simply removing the missing data is not a good strategy and can bias the results. Multiple imputation is a general and statistically valid technique to analyze incomplete data. Multiple imputation has rapidly become the standard in social and behavioural science research.

This hybrid course will explain modern and flexible imputation techniques that are able to preserve salient data features. The course enhances participants’ knowledge of imputation principles and provides flexible hands-on solutions to incomplete data problems using R. The course discusses principles of missing data theory, outlines a step-by-step approach toward creating high-quality imputations, and provides guidelines on how to report the results. The course will use the authors’ mice package in R.

The lectures will follow the book “Flexible Imputation of Missing Data” by Stef van Buuren (2nd edition, Chapman & Hall, 2018). The book can be read online for free at https://stefvanbuuren.name/fimd/.

Format of the course

We alternate short lectures with hands-on practical sessions and plenary discussion of the practicals. This ensures that we form an interactive group of participants who learn the theory and practice of multiple imputation in bite-size blocks. Each block builds up to the next one. We invite participants to share their own experience and challenges during these blocks so that we can foster a collaborative learning environment.

Prerequisites:

Participants should have a basic knowledge of scripting and programming in R. Participants who have limited experience with R need to have followed a relevant R course beforehand, such as:

Winter School S002 Introduction to R followed by Winter School S004 Regression in R
Summer School S24: Data Science: Statistical Programming in R

or any similar level course elsewhere.

The theory and practice discussed in this course require that participants are familiar with basic statistical concepts and techniques, such as linear modeling, least squares estimation and hypothesis testing.

Participants are requested to bring their own laptop computer. Software will be available online.

This course is hybrid, meaning that it can be attended either on-site or online. Please indicate your preferred mode of attendance when registering.

This course can be taken separately, but is also part of a series of 8 courses in the Summer School Data Science specialisation taught by UU’s department of Methodology & Statistics:

Data Science: Programming with Python (Course code S17, 8-12 July 2024)
Data Science: Statistical Programming with R (Course code S24, 8-12 July 2024)
Data Science: Multiple Imputation in Practice (Course code S28, 8-11 July 2024)
Data Science: Data Analysis (Course code S31, 15-19 July 2024)
Data Science: Network Science (Course code S37, 15-19 July 2024)
Data Science: Applied Text Mining (Course code S42, 15-19 July 2024)
Data Science: Machine Learning with Python (Course code S70, 22-26 July 2024)
Data Science: Text Mining with R (Course code S41, 19-22 August 2024)

Students who complete 3 of the 8 courses in the Summer School Data Science specialisation within 5 years (including no more than one text mining course) can obtain a certificate.

Please see here for more information about the full specialisation.

S28 Day-to-day 2024.pdf

Target audience

This course is relevant for applied researchers or statistical researchers who would like to get acquainted with incomplete data theory and the practice of multiple imputation. Participants should have a basic understanding of statistical techniques (such as analysis of variance and (non)linear regression) and the concept of statistical inference.

This course is suitable for students at master, advanced master, and PhD level.

For an overview of all our summer courses offered by the Department of Methodology and Statistics please click here.

Aim of the course

The aims of this course are:

To enhance participants’ knowledge of imputation methodology;
To help participants become comfortable with flexible solutions for dealing with incomplete data using R.

Learning goals:

Participants will learn to make informed decisions on how to handle incomplete data in a scientifically valid way;
Participants will be able to implement the approach taken using state-of-the-art R technology.

Study load

Four days (09.00 to 17.00 hrs).

You will receive a certificate upon course completion. Please be aware that this course does not include graded activities, and therefore we cannot provide a transcript of grades.

Costs

Course fee: €730.00
Included: Course + course materials + lunch
Housing fee: €200
Housing provider: Utrecht Summer School

PhD students from the Faculty of Social and Behavioural Sciences at Utrecht University have the opportunity to attend three Winter/Summer School courses funded by the Graduate School of Social and Behavioural Sciences. Additionally, they may choose to take as many courses as they wish at their own expense from their personal budget.

There are no scholarships available for this course.

Additional information

The housing costs do not include a Utrecht Summer School sleeping bag. This is a separate product on the invoice. If you wish to bring your own bedding, please deselect or remove the sleeping bag from your order.

Application

Please include a short description of your (scientific) background and of what you expect or would like to learn from this course.

This course can be attended both on-site and online. Please indicate your preferred mode of attendance when registering.
