Data Science: Statistical Programming with R

Course code
Course fee (excl. housing)
Advanced Master

R is rapidly becoming the standard platform for data analysis. This course offers an elaborate introduction to statistical programming in R. Students learn to operate R, build pipelines for data analysis, make high-quality graphics, fit, assess and interpret a variety of statistical models, and do advanced statistical programming. The statistical theory in this course covers t-testing, regression models for linear, dichotomous, ordinal and multivariate data, statistical inference, statistical learning, bootstrapping and Monte Carlo simulation techniques.

R is rapidly becoming the standard platform for data manipulation, visualisation and analysis and has a number of advantages over other statistical software packages. A wide community of users contributes to R, resulting in an enormous coverage of statistical procedures, including many that are not available in any other statistical program. Furthermore, it is highly flexible for programming and scripting purposes, for example when manipulating data or creating professional plots. However, R lacks the standard GUI menus found in, for example, SPSS, from which users choose which statistical test to perform or which graph to create. As a consequence, R is more challenging to master. Therefore, this course offers an elaborate introduction to statistical programming in R.

Students learn to operate R, make plots, fit, assess and interpret a variety of basic statistical models and conduct advanced statistical programming and data manipulation. The topics in this course include regression models for linear, dichotomous, ordinal and multivariate data, statistical inference, statistical learning, bootstrapping and Monte Carlo simulation techniques. 

The course deals with the following topics:
  • An introduction to the R environment;
  • Basic to advanced programming skills: data generation, manipulation, pipelines, summaries and plotting;
  • Fitting statistical models: estimation, prediction and testing;
  • Drawing statistical inference from data;
  • Basic statistical learning techniques;
  • Bootstrapping and Monte Carlo simulation.
The course starts at a very basic level and builds up gradually. At the end of the week, participants will master advanced programming skills with R. No previous experience with R is required.
This course is part of a series of 5 courses in the Summer School Data Science specialisation taught by UU’s department of Methodology & Statistics. Please see here for more information about the full specialisation. This course can also be taken separately.

Summer School Data Science specialisation:

Upon completing 3 out of 5 courses in the specialisation (no more than one text mining course), students can obtain a certificate. Each course may also be taken separately.

Please note that there is always the possibility that we have to change the course pending COVID19-related developments. The exact details, including a day-to-day program, will be communicated 6 weeks prior to the start of the course.

Course director

Dr. Gerko Vink



Target audience

Applied researchers and (master) students who already use statistical software and would like to learn to use, or improve their use of, the flexible R environment. An understanding of basic statistical theory, such as t-tests, hypothesis testing and regression, is required.

Participants from a variety of fields, including sociology, psychology, education, human development, marketing, business, biology, medicine, political science, and communication sciences, will benefit from the course.

A maximum of 80 participants will be allowed in this course. Please note that places in this course are allocated on a first-come, first-served basis.

Course aim

The course teaches students the skills needed to understand how R works and how to use R for a variety of statistical analyses of data in many domains of science. The skills addressed in this practical are:

  • Working with the R environment;
  • Using R-functions for data generation, manipulation and summaries;
  • Making high-quality plots;
  • Forming pipelines;
  • Reproducible programming;
  • Statistical inference;
  • Basic statistical learning;
  • Fitting and interpreting a variety of statistical models;
  • Programming of bootstraps and Monte Carlo simulations.
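
To give a flavour of how several of these skills fit together, the following is a minimal sketch and is not taken from the course materials. It assumes R 4.1 or later (for the native pipe `|>`) and uses the built-in `mtcars` data set; the object names `cars4_6`, `fit` and `boot_slope` are made up for this illustration.

```r
# Illustrative sketch only; not part of the official course materials.
set.seed(123)  # fix the random seed so the run is reproducible

# Pipeline: pass the built-in mtcars data through subset(),
# keeping only the cars that do not have 8 cylinders.
cars4_6 <- mtcars |> subset(cyl != 8)

# Fit a simple linear regression of fuel efficiency on weight.
fit <- lm(mpg ~ wt, data = cars4_6)

# Nonparametric bootstrap of the slope: resample rows with
# replacement, refit the model, and store the slope each time.
boot_slope <- replicate(1000, {
  idx <- sample(nrow(cars4_6), replace = TRUE)
  coef(lm(mpg ~ wt, data = cars4_6[idx, ]))[["wt"]]
})

# Percentile interval for the slope based on the bootstrap draws.
quantile(boot_slope, c(0.025, 0.975))
```

The percentile interval is only one of several ways to summarise bootstrap draws; it is used here because it needs nothing beyond base R.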

For an overview of all our summer school courses offered by the Department of Methodology and Statistics please click here.

Study load

Five full days. A typical course day starts at 9.00 and ends at 17.00 with breaks for coffee, lunch and tea.

Please note that there are no graded activities included in this course. Therefore, we are not able to provide students with a transcript of grades. You will obtain a certificate upon completion of this course.


Course fee:
Course + course materials
Housing fee:

Housing through: Utrecht Summer School.

You can choose between two options for participating in this course, but please note that there is always the possibility that we have to change the course pending COVID19-related developments: 

  1. If you choose the livestream option, you will get a discount on the course fee, since lunch is not provided. The lectures will be broadcast in Central European Summer Time via a livestream (not recorded). Participants can ask questions via the chat, which will be moderated by a second lecturer who will either answer your questions directly in the chat or put them to the first lecturer during class. You will also receive online support from our team during the group computer labs. Additionally, Q&A sessions will be organised, so you will benefit from our usual high level of expertise while enjoying the class from the comfort of your own chair.
  2. If you choose the campus option, you will be able to attend the lectures and computer labs at our campus. Of course, we will follow all COVID19 guidelines that hold at the time of the start of your course. We will keep you updated about the newest developments. Note that, at the moment, it is unclear how many participants will be allowed in our lecture rooms. Therefore, if you register for the campus option, we will also register you for the livestream option, so that you are guaranteed a spot via the livestream, and we will at first send an invoice for this option only. We will put you ‘on hold’ for the campus option until we have more information about how many participants are allowed in our lecture rooms. As soon as we hear from the university, we will contact you and send you a second invoice for the part of the fee related to catering and campus registration.

If you are interested in the campus option, let us know via a message in the application form under ‘Student Comment’.

The on-campus course costs €720; if you participate via the livestream, you will get a €100 discount. Note that if you choose the campus option, you will be asked to first pay the livestream fee (€620) and, once we have permission from the university to organise classes on location, we will send a second invoice for the remainder of the fee. This way, you are guaranteed at least a spot in the livestream.

The tuition fee for PhD students of the Faculty of Social and Behavioural Sciences at Utrecht University will be funded by the Graduate School of Social and Behavioural Sciences.


There are no scholarships available for this course.


Please include a short description of your (scientific) background and of what you expect, or would like, to learn from this course.

More information

Irma Reyersen | E:

Recommended combinations
Data Science: Data Analysis
Data Science: Applied Text Mining
Data Science: Introduction to Text Mining with R
Data Science: Multiple Imputation in Practice


Application deadline: 21 June 2021
