Modern digital applications increasingly rely on machine learning to derive predictive power from high-dimensional data sets. Compared with traditional statistics, the absence of a focus on scientific hypotheses and the need to easily exploit detailed signals in the data call for a different set of models, tools, and analytical reflexes.
This course aims to bring participants to the level where they can independently tackle the analytical part of data mining projects. It addresses the most common types of projects: regression with continuous outcomes, classification with categorical outcomes, and clustering. For each of these, the practical use of a set of standard methods is shown, such as Random Forests, Gradient Boosting Machines, Support Vector Machines, k-Nearest-Neighbors, and k-means. Throughout the course, concepts that are of concern in every statistical learning application are highlighted, such as the curse of dimensionality, model capacity, overfitting, and regularization. Practical strategies are offered to deal with them, introducing techniques such as the Lasso and ridge regression, cross-validation, bagging, and boosting. Instruction is also given on a selection of specific techniques that are often of interest, such as modern visualization of high-dimensional data, model calibration, outlier detection using isolation forests, and explanation of black-box models. Finally, the last lecture introduces the idea of deep learning as a powerful tool for data analysis, discussing when and how to use it in practice, and when to shy away from it.
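As a taste of how regularization and cross-validation work together, here is a minimal sketch in plain Python. It is not taken from the course material. The one-feature model without intercept, the synthetic data, the candidate penalty values, and the helper names `ridge_fit`, `k_fold_indices`, and `cv_mse` are all assumptions made for illustration. In practice a library such as scikit-learn would normally be used instead.

```python
import random


def ridge_fit(xs, ys, lam):
    """Closed-form ridge estimate for a one-feature model y = w * x (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # The penalty lam inflates the denominator and so shrinks w towards zero.
    return sxy / (sxx + lam)


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds = []
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cv_mse(xs, ys, lam, k=5):
    """Mean squared error on held-out folds, averaged over the k folds."""
    n = len(xs)
    fold_errors = []
    for test_idx in k_fold_indices(n, k):
        held_out = set(test_idx)
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        w = ridge_fit(train_x, train_y, lam)
        errors = [(ys[i] - w * xs[i]) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Synthetic data (an assumption for this sketch): y = 2x plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(40)]
ys = [2.0 * x + rng.gauss(0, 0.5) for x in xs]

# Hypothetical grid of penalty strengths; pick the one with the lowest CV error.
candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
scores = {lam: cv_mse(xs, ys, lam) for lam in candidates}
best_lam = min(scores, key=scores.get)
print("selected penalty:", best_lam, "CV error:", round(scores[best_lam], 3))
```

The contiguous folds are adequate here only because the synthetic observations are generated in random order. With real data, the rows would typically be shuffled before splitting.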
Fees and registration form are available on the website of the Academy for Lifelong Learning of the Faculty of Sciences (UGent).
This course targets professionals and investigators from all areas who are involved in predictive modeling based on large and/or high-dimensional databases.
Participants are expected to be familiar with basic statistical modeling (as for instance taught in Module 3 of this program), and to have had a first experience programming in Python (as for instance taught in Module 4 of this program).
If you take part in all 7 sessions, you will receive a certificate of attendance via e-mail after the course ends.
Additionally, you can take part in an exam. If you pass, a certificate from Ghent University is issued.
The exam consists of a take-home project assignment. You are required to write a report by a set deadline.
This is an on-campus course. We offer blended learning options if, exceptionally, you can't attend a class on campus.
Seven Monday evenings in April, May and June 2023: April 17 and 24, May 8, 15 and 22, and June 5 and 12, from 5.30 pm to 9 pm.
Faculty of Sciences, Campus Sterre, Krijgslaan 281, Building S9, Ghent
Access to the slides and Python code notebooks