Python has become the dominant programming language used in data science. This course offers an introduction to computational thinking about data-related problems and the implementation of data analysis programs with Python. It starts at the very basics and is explicitly intended for students who have little or no programming experience.
Programming is the process of designing and building an executable computer program for accomplishing a specific computational task. The course will introduce you to programming with Python, which is currently one of the most popular programming languages in (data) science. After familiarization with the basics (input and output, variables, data types, data structures, conditional branching, loops, functions, etc.) the course will address specific data science topics, such as statistical analyses with the pandas package and data visualization with matplotlib.
The course will take 5 full days. A typical course day starts at 9.00 and ends at 17.00 with breaks for coffee, lunch and tea (provided on location). The course is offered on location.
Every day, short lectures will be combined with practicals, where students can practice with example datasets that will vary over the course of the week. In the afternoon, students will work in small project groups on applying the lessons of the day to a real-life dataset.
More details on the day-by-day programme can be found in a separate file. Broadly, the following topics are discussed:
Day 1: getting started, the programming environment, editing and running Python programs, input and output, variables, arithmetic expressions, conditional branching, loops
Day 2: functions, the standard library, data structures
Day 3: basics of object-oriented programming, data frames, statistical analyses with the pandas package
Day 4: data visualization with matplotlib, matrix computations with numpy
Day 5: group presentations, best practices for software project management
Course credits of 1.5 EC are offered to students who attend meetings every day, actively participate in the exercises and participate in the presentations of the group assignments on the final day of the course.
The course will use freely available literature that will be made available to participants during the course. The literature serves both as a practical guide to the course materials and as more in-depth reading that can be done during or after the course.
Participants are requested to bring their own laptop. Software will be available online.
This course can be taken separately, but is also part of a series of 7 courses in the Summer School Data Science specialisation taught by UU’s department of Methodology & Statistics:
- Data Science: Programming with Python (Course code S17, 3-7 July 2023)
- Data Science: Statistical Programming with R (Course code S24, 3-7 July 2023)
- Data Science: Multiple Imputation in Practice (Course code S28, will not be available in 2023)
- Data Science: Data Analysis (Course code S31, 10-14 July 2023)
- Data Science: Network Science (Course code S37, 10-14 July 2023)
- Data Science: Introduction to Text Mining with R (Course code S41, 10-13 July 2023)
- Data Science: Applied Text Mining (Course code S42, 17-21 July 2023)
Students who complete 3 of the 7 courses in the Summer School Data Science specialisation within 5 years (including no more than one text mining course) can obtain a certificate.
Please see here for more information about the full specialisation.
dr. Anastasia Giachanou
The course requires no specific prior knowledge and, in particular, no programming skills. You will need to bring your own laptop to do the exercises. Any operating system (Windows, macOS, Linux) is fine, as long as new software can be installed on the machine. We assume that you have basic computer skills such as using a browser, storing files and installing programs.
For an overview of all our summer school courses offered by the Department of Methodology and Statistics please click here.
Aim of the course
After finishing the course successfully, you will be able to:
- think computationally about data-related problems
- design programs for specific computational tasks
- write Python programs for specific computational tasks, such as prompting for and reading user input, loading data from files, preprocessing and analyzing data, performing calculations, simulating processes, visualizing data and results, and storing data and results in files
- validate Python programs for correct functioning
- document and describe Python programs
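To give a rough impression of the kind of program these aims refer to, the following is a minimal sketch, not taken from the course materials. The dataset, the column name `height_cm` and the function names `load_heights` and `summarize` are hypothetical, and the sketch uses only the Python standard library, whereas the course itself works with pandas for statistical analyses.

```python
import csv
import io
import statistics

# Hypothetical example data; a real program would typically read this from a file.
RAW_CSV = """name,height_cm
Ada,165
Grace,170
Alan,178
Linus,
"""


def load_heights(text):
    """Read CSV text and return the heights as floats, skipping missing values."""
    reader = csv.DictReader(io.StringIO(text))
    heights = []
    for row in reader:
        value = row["height_cm"].strip()
        if value:  # preprocessing step: ignore rows without a measurement
            heights.append(float(value))
    return heights


def summarize(values):
    """Return a few simple descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


heights = load_heights(RAW_CSV)
summary = summarize(heights)
print(summary)
```

The sketch touches several of the aims in a small space: loading data, a preprocessing step (skipping the missing value), a calculation, and short docstrings that document each function.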
Please note that there are no graded activities included in this course. Therefore, we are not able to provide students with a transcript of grades. You will obtain a certificate upon completion of this course.
The tuition fee for PhD students of the Faculty of Social and Behavioural Sciences at Utrecht University is funded by the Graduate School of Social and Behavioural Sciences.
For students who are taking this course as a prerequisite for entering the Master's programme ‘Applied Data Science’, we offer a course discount of 200 euros. If you are an ADS student, please mention this in your application so we can charge you the reduced fee.
Please include some details on your programming experience in your motivation.
Team M&S Summer School | E: email@example.com