Python has become the dominant programming language in data science. This course offers an introduction to computational thinking about data-related problems and to implementing data analysis programs in Python. It starts from the very basics and is explicitly intended for students with little or no programming experience.
Programming is the process of designing and building an executable computer program that accomplishes a specific computational task. The course introduces you to programming with Python, currently one of the most popular programming languages in (data) science. After covering the basics (input and output, variables, data types, data structures, conditional branching, loops, functions, etc.), the course addresses specific data science topics, such as statistical analyses with the pandas package and data visualization with matplotlib.
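To give a flavour of the basics listed above, here is a minimal sketch that combines variables, a list, a loop, conditional branching, functions and output in one short program. The temperature values and the function names `mean` and `classify` are made up for illustration and are not taken from the course materials; the sketch uses only plain Python, whereas the course itself moves on to pandas for this kind of analysis.

```python
# Hypothetical example data: five daily temperatures in degrees Celsius.
temperatures = [18.5, 21.0, 19.5, 24.0, 22.0]


def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = 0.0
    for value in values:  # loop over every element of the list
        total += value
    return total / len(values)


def classify(temperature, threshold):
    """Label a temperature relative to a threshold (conditional branching)."""
    if temperature > threshold:
        return "above average"
    elif temperature < threshold:
        return "below average"
    return "average"


average = mean(temperatures)
labels = [classify(t, average) for t in temperatures]

# Output: the mean, followed by one line per measurement.
print(f"Mean temperature: {average:.1f}")
for t, label in zip(temperatures, labels):
    print(t, label)
```

Running the program prints the mean followed by one labelled line per measurement. Splitting the work into two small functions keeps each step easy to test and reuse.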
Every day, short lectures will be combined with practicals in which students practise on example datasets that vary over the course of the week. In the afternoon, students will work in small project groups on applying the lessons of the day to a real-life dataset.
More details on the day-by-day programme can be found in a separate file. Broadly, the following topics are discussed:
Day 1: getting started, the programming environment, editing and running Python programs, input and output, variables, arithmetic expressions, conditional branching
Day 2: loops, functions, the standard library, data structures
Day 3: basics of object-oriented programming, file I/O, data frames, statistical analyses with the pandas package
Day 4: data visualization with matplotlib, matrix computations with numpy
Day 5: group presentations, best practices for software project management
Course credits of 1.5 EC are offered to students who attend the meetings every day, actively participate in the exercises, and take part in the presentations of the group assignments on the final day of the course.
Application deadline: 4 July 2022