Course code TUD18
Course title Data Science for Software Engineering
Institution Delft University of Technology
Course address Van Mourik Broekmanweg 6, 2628 XE Delft
City Delft
Minimum year of study 4th year
Minimum level of English Fluent
Minimum level of French None
Key words


data science, mining software repositories, software engineering



Language English
Professor responsible Dr. Alberto Bacchelli
Participating professors


Dr. Alberto Bacchelli

Number of places Minimum: 20, Maximum: 25, Reserved for local students:


This course explores techniques and leading research in applying data science to software engineering (SE) data, discusses the challenges of mining SE data, highlights SE data mining success stories, and outlines future research directions.

Software repositories archive valuable software engineering data, such as source code, execution traces, historical code changes, mailing lists, and bug reports. This data contains a wealth of information about a project's status and history. By applying data science to software repositories, researchers can gain an empirically grounded understanding of software development practices, and practitioners can better manage, maintain, and evolve complex software projects.



Programme to be followed


1. Introduction to Data Science for Software Engineering. Mining source code changes

2. Mining source code changes to support evolution

3. Learning from software bugs to support software quality

4. Analyzing code reviews to discover insights



A background in programming (for example, Python or Kotlin) is required: students will build a full analysis during the course using the programming language of their choice. Knowledge of statistical methods is a plus. A good to advanced knowledge of English is indispensable for this course.


Course exam


The exam will consist of a 30-minute multiple-choice test and a presentation of the analysis carried out during the week, together with its results.