MC-RDA 2024-2025 English
Tags: Micro-credentials, Data Analysis

Description

The scientific method has historically depended on other researchers being able to replicate and verify results. As scientific analysis becomes more complex and interdisciplinary, ensuring reproducibility becomes more challenging, especially in fields that combine different areas of expertise. To promote transparency, consistency and robustness in science, journals, funders and institutions are encouraging the use of tools and practices that enhance reproducibility. Lifelong learning helps professionals keep up with fast-paced scientific developments and fosters creativity and innovation. By learning about version control, containers, pipelines and data reproducibility, scientists at all levels can improve the reproducibility of their research, as well as the impact and reliability of their findings and methods. Moreover, using the methods introduced here, they can collaborate and experiment in ways that allow reproducibility and creativity to coexist and thrive.

Detailed content

Session 1

- Successful examples of the use of these tools for reproducibility

- Aspects of reproducibility of data

Session 2

- Introduction to Git and GitHub concepts

- Routine usage of Git

- Inspect and compare different versions of a Git project

- Connect to and integrate with GitHub

- Collaborate and experiment with Git and GitHub

Session 3

- How to access VSC facilities and use the HPC scheduling system

Session 4

- Introduction to basic container concepts and Docker syntax

- Find, obtain, and run a Docker image

- Adapt and build Docker recipes

- Find and run Apptainer images

- Adapt and build Apptainer images

- Use Apptainer in the VSC with the scheduling system

Session 5

- Introduction to NextFlow concepts and syntax

- Execute NextFlow pipelines with different executors and environments

- Write and run a NextFlow pipeline

- Write and modify modules and config files as best practice for pipeline development

- Use NextFlow in the VSC with the scheduling system

Session 6

- Projects:

* 2 small projects

o The Git & GitHub project consists of collaboratively creating your documentation under version control. The project is started during the lesson and finished asynchronously before delivery (estimated asynchronous time: 3h).

o The Docker and Apptainer project consists of adapting, writing, and building one Docker image based on a Docker recipe. The project must be delivered on GitHub, with the version history available. The project is to be developed collaboratively after the lesson (estimated asynchronous time: 6h).

* 1 medium project

o The NextFlow project consists of using Docker or Apptainer images to create and run a NextFlow pipeline that uses config files and modules. The project must be delivered on GitHub, with the version history available.

o In addition, an oral presentation (defence) of the final project must include a summary of the topics learned and examples that demonstrate the use of the tools, with a focus on reproducibility.

o The project is to be developed collaboratively after the lesson (estimated asynchronous time: 8h).

Course prerequisites

* Being able to use simple shell commands (on Linux, for example); you can use this e-learning material to prepare.

* Experience with scripting is preferred (point to resource of catch-up before the course)

* Creating a VSC account

Final competences

  1. To use Git and GitHub for version control of reproducible, easy-to-share code, text documents, and other data appropriate to this context;
  2. To use Git and GitHub for individual projects and for collaboration;
  3. To use, adapt and write containers locally and on a supercomputer, understanding the different systems and their particularities;
  4. To use and differentiate between Docker containers and Apptainer images and their particular usage;
  5. To run, adapt and write NextFlow pipelines locally and on a supercomputer;
  6. To understand and make use of best practices for NextFlow pipelines, such as config files and modules.

Exam

Project evaluation and an oral evaluation.

Type of course

This is an on-campus course.

Course material

Syllabus, overheads, exercises handout

Location

Technology park, 75 – CMB building (FSVM II), 9052 Ghent

Day 1: room L5, 5th floor

All other days: room L4, 4th floor

Course number:
MC-RDA 2024-2025
Type of programme:
Short and long-term programmes, Micro-credentials
Area of interest:
(Bio)engineering Sciences, AI and Data Science, Biomedical Sciences, Medicine, Sciences
Language:
English
Academic year:
2024 - 2025
Start date:
28.02.2025
Lecturers:
Bart Mesuere
Contact person:
science-academy@ugent.be