
Machine Learning on HPC - Introduction

by Prof. Alexander Binder (Leipzig University / ScaDS.AI Dresden/Leipzig), Dr Paramita Mirza (TU Dresden / ScaDS.AI Dresden/Leipzig), Dr Peter Winkler (TU Dresden / ScaDS.AI Dresden/Leipzig)

Europe/Berlin
Online

Description

High Performance Computing (HPC) systems can offer substantial advantages for Machine Learning (ML) methods. Because ML applications are heterogeneous, the reasons for moving to an HPC system vary, e.g. large memory requirements, GPU usage, or faster computation. This course shows how a typical ML workflow can be realized in an HPC environment. Depending on the requirements, the switch to the HPC system can happen at different points in the workflow. Machine Learning applications are often developed collaboratively in groups, and the course takes this into account.

Agenda

  • Access to the HPC system (e.g. ssh, JupyterHub)
  • Data transfer and storage of training data, models, source code, etc. (e.g. scp, dtcp, user space, workspaces)
  • Setup of the required software environment (e.g. using the module system, virtual environments, containers)
  • Execution, testing, and debugging of applications (e.g. batch jobs, interactive jobs)
  • Evaluation and storage of results
  • Simple monitoring to optimize applications (Pika)
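
As a rough illustration of the environment setup and batch job steps above, the following Python sketch generates the text of a batch script. It is not taken from the course material. It assumes a Slurm-style scheduler (the `#SBATCH` directives and `srun`), which this announcement does not name. The module name `Python`, the virtual environment path, and the training command are hypothetical placeholders.

```python
def render_batch_script(job_name, command, gpus=1, mem_gb=16, hours=2,
                        venv_dir="$HOME/venvs/ml"):
    """Return the text of a batch script for a Slurm-style scheduler.

    Assumptions (not from the course description): the cluster uses Slurm,
    provides a module named "Python", and the virtual environment lives in
    `venv_dir`. Adjust all of these to the actual system.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --gres=gpu:{gpus}",            # number of GPUs requested
        f"#SBATCH --mem={mem_gb}G",              # main memory for the job
        f"#SBATCH --time={hours:02d}:00:00",     # wall-clock limit (hh:mm:ss)
        f"#SBATCH --output={job_name}-%j.out",   # %j expands to the job ID
        "",
        "module load Python",                    # hypothetical module name
        f"source {venv_dir}/bin/activate",       # activate the virtual environment
        "",
        f"srun {command}",                       # run the actual application
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical training command; replace with the real entry point.
    print(render_batch_script("train-demo", "python train.py --epochs 10"))
```

The generated text could be saved to a file and submitted with the scheduler's submit command (`sbatch` on Slurm). Keeping the resource requests as function parameters makes it easy to test the same application with a small interactive-sized allocation before scaling up.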

Handouts

The course material (slides, sample application) will be available.

Prerequisites

Participants should be familiar with Python, with TensorFlow or PyTorch, and with the Linux shell.

Learning Outcomes

Using specific examples, participants will learn how to implement Machine Learning workflows on an HPC system while taking individual requirements into account.

Course language

English

Target group

HPC Basics / HPC User

Organized by

Trainings ScaDS.AI

Registration
Participants: 60 / 60