From Language to Information: Natural Language Processing

AI and Data Science


DA2122-M16 EN


In many data sources, the relevant information is conveyed as free text, for instance in patient records, scientific publications, and social media. Unlike programming languages, human language is not formally defined, so computer-based extraction of structured information from natural language text is challenged by the high variation in expression and by the importance of context for correct interpretation. Natural Language Processing aims to design methods that address these challenges, using human knowledge or data-driven methods. This course aims to bring participants to the level where they can independently perform text classification and extract data from text for further processing and analysis.

The course provides an introduction to Natural Language Processing, including how to handle language units such as words, phrases, and sentences, together with additional information such as part-of-speech and syntactic structure. It introduces the most common applications of supervised machine learning to text analytics: text classification, sequence labelling for information extraction (with a focus on entity recognition and classification), and the creation and use of word embeddings and neural classifiers. Biomedical text serves as the illustration throughout, supported by a short introduction to the representation and processing of biomedical terminology.
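To give a flavour of what supervised text classification involves, here is a minimal sketch of a bag-of-words Naive Bayes classifier in plain Python. It is an illustration only, not the course's own material: the function names, the whitespace tokenizer, the add-one smoothing, and the four toy sentences with their "symptom" and "treatment" labels are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Very rough tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def train_naive_bayes(examples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return {
        "label_counts": label_counts,
        "word_counts": word_counts,
        "vocabulary": vocabulary,
    }


def classify(model, text):
    """Return the label with the highest log-probability under the model."""
    total_docs = sum(model["label_counts"].values())
    vocab_size = len(model["vocabulary"])
    best_label, best_score = None, float("-inf")
    for label, doc_count in model["label_counts"].items():
        # Log prior of the label.
        score = math.log(doc_count / total_docs)
        total_words = sum(model["word_counts"][label].values())
        for token in tokenize(text):
            if token not in model["vocabulary"]:
                continue  # ignore words never seen in training
            count = model["word_counts"][label][token]
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data, for illustration only.
TOY_DATA = [
    ("patient reports chest pain and fever", "symptom"),
    ("severe headache and nausea since monday", "symptom"),
    ("prescribed aspirin 100 mg daily", "treatment"),
    ("started antibiotics and bed rest", "treatment"),
]

model = train_naive_bayes(TOY_DATA)
prediction = classify(model, "fever and headache")
print(prediction)
```

In practice a library implementation and a much larger annotated corpus would be used; the sketch only shows the counting and scoring steps that such a classifier relies on.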

This course is part of a larger course series in Data Analysis consisting of 19 individual modules. Find more information and enroll for this module via


Content structure:

  • Introduction to Natural Language Processing
  • Basic Natural Language Processing tools
  • Machine learning for text classification
  • Sequence labelling for information extraction
  • Biomedical terminology for entity recognition
  • Word embeddings and neural classifiers for entity recognition
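As an illustration of the sequence labelling topic listed above, entity recognition is commonly framed as assigning one tag per token, often in a B/I/O scheme (Begin, Inside, Outside). The sketch below only shows how such tags can be turned back into entity spans; it is not taken from the course material. The function name `bio_to_spans`, the tag names `DISEASE` and `DRUG`, and the example sentence are hypothetical, and the tags are written by hand rather than predicted by a model.

```python
def bio_to_spans(tokens, tags):
    """Group B-/I- tagged tokens into (entity text, entity type) pairs."""
    spans = []
    current_tokens, current_type = [], None

    def close_entity():
        # Store the entity collected so far, if any.
        if current_tokens:
            spans.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            close_entity()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag with no matching open entity (treated as outside).
            close_entity()
            current_tokens, current_type = [], None
    close_entity()
    return spans


# Hand-written example (hypothetical sentence and tags).
tokens = ["The", "patient", "has", "type", "2", "diabetes",
          "and", "takes", "metformin", "."]
tags = ["O", "O", "O", "B-DISEASE", "I-DISEASE", "I-DISEASE",
        "O", "O", "B-DRUG", "O"]

entities = bio_to_spans(tokens, tags)
print(entities)
```

A trained sequence labeller would produce the `tags` list automatically; the decoding step that recovers entity spans stays the same.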

Course number:
Short- and long-term programmes
Area of interest:
AI and Data Science, Sciences
Academic year:
2021 - 2022
Pierre Zweigenbaum
Contact person: