Summer School 2023

Europe/Berlin
Dorint Hotel Dresden

Grunaer Str. 14 01069 Dresden Germany
René Jäkel (ScaDS.AI Dresden/Leipzig), Gina Valentin (ScaDS.AI Dresden/Leipzig)
Description

ScaDS.AI Dresden/Leipzig is happy to invite you to the 9th International Summer School on AI and Big Data. Our yearly summer school is aimed at graduate students, Ph.D. students, researchers, and practitioners who are starting out or already active in the areas of Machine Learning, Artificial Intelligence and/or Big Data. The program offers inspiring insights into various research areas from internationally recognized and well-known speakers.

Date and Location

The 9th International Summer School on AI and Big Data will take place from 03.07.2023 to 07.07.2023 in Dresden at:

Dorint Hotel Dresden
Grunaer Str. 14
01069 Dresden
Germany

Topics

This year, we have invited national and international experts to Dresden to give you in-depth insights into the latest developments in the research field of Large AI Models. Specifically, the ScaDS.AI summer school 2023 targets three main topic areas:

  • Large AI Models
  • Natural Language Processing
  • Computing Architectures for AI

Costs

  • Standard price: 420 euros
  • Special Offer Leipzig University members: 320 euros
  • Special Offer TU Dresden members: 320 euros (280 euros will appear on the bill; the associated tax portion is offset internally in accordance with TUD regulations)

The standard ticket for the 9th International Summer School on AI and Big Data includes daily catering and the social program. Participants cover their own travel and accommodation costs. Please note that the participation fees for employees of TU Dresden will be charged via internal cost allocation. Participants not affiliated with TU Dresden pay by invoice.

Please note that registration is binding and therefore incurs costs.

Please note that the event will be accompanied by a photographer. The resulting photos may be used for reporting purposes in press releases, on our website (www.scads.ai) and on social media. If you do not wish to be photographed, please contact the photographer directly on site.

 

More information: 

www.scads.ai/summer-school-2023

 

Regular registration deadline: 18.06.2023 

Final registration is possible until 28.06.2023 with a surcharge of 10% of the participation fee. Registration after this date is not possible.

    • 1
      Registration and Welcome Coffee
    • 2
      Introduction
      Speaker: Prof. Wolfgang E. Nagel (TU Dresden / ScaDS.AI Dresden/Leipzig)
    • 2:45 PM
      Break
    • 3
      Big Data and AI for Multimodal Communication Research

      Much linguistic research has long focused on words and sentences alone, ignoring co-speech gesture and facial expressions. This was to some extent caused by the availability of large corpora of written text, which are comparatively easy to collect and process. With the advances we have seen in data processing and machine learning over the past decade, the use of large audiovisual datasets has come within reach.

      This presentation will report on the full lifecycle of large audiovisual datasets, spanning collection, cleaning, processing, access infrastructure, and analysis. We will see how NLP, audio processing and Computer Vision technology support the research process or enable us to work on entirely new research questions. The datasets used are taken from American TV and from Russian international media on YouTube.

      In the second part of the presentation, we will discuss the gap between the research in Computer Science and related disciplines and its applications to research carried out in the Humanities and Social Sciences.

      Speaker: Dr Peter Uhrig (TU Dresden / ScaDS.AI Dresden/Leipzig)
    • 4:45 PM
      Poster Session + Get Together
    • 4
      Fire Talk: AI and Start Up Scene
      Speaker: Mr Richard Socher (you.com)
    • 5
      Developing Multilingual, Open and European Large Language Models

      Since the introduction of ChatGPT in November 2022, Large Language Models (LLMs) have become ubiquitous in everyday life for a large portion of the global population, and have also simplified and facilitated a wide range of tasks for experts and non-professional users alike.

      However, the underlying technologies of conversational models such as ChatGPT remain closed-source and in the hands of probably fewer than a dozen private organisations worldwide. In our presentation we will report on our efforts in the project OpenGPT-X, funded by the German Federal Ministry for Economic Affairs and Climate Action (BMWK), to develop large generative language models for the German language, while making them open-source and respectful of European values. In addition to providing an overview of the project, we will present our efforts, in collaboration with the EU project European Language Equality (ELE), to develop multilingual language models, including the curation of a large multilingual data set as well as data filtering and preparation. We will also give details about the training of our European models and show the first results of the evaluation.

      Furthermore, we will give an overview of the current state of digital language inequality in Europe, which we aim to transform into full digital language equality by 2030. Two platforms and initiatives that are of crucial importance in that regard are the European Language Grid (ELG) and the recently started Common European Language Data Space (LDS), which will also be briefly highlighted.

      Speakers: Dr Georg Rehm (DFKI), Mr Pedro Ortiz Suarez (DFKI)
    • 10:30 AM
      Break
    • 6
      Distributed AI and Multi-agent Systems for Space Systems Autonomy

      AI-based autonomy has been recognised as a key enabler of the next-generation space systems that aim at increasing responsiveness and continuity of space-based observations, covering large areas with higher resolutions, minimizing communication and data access latencies, and reducing costs of both the space and ground segments. Space systems autonomy encompasses onboard autonomous decision-making capabilities that enable the space segment to continue mission operations and to survive critical situations without relying on ground segment intervention. It relates to all aspects of spacecraft operations, including continuous mission planning and execution on board, real-time spacecraft control outside ground contact, maximisation of mission objectives in relation to the available onboard resources and capabilities of other spacecraft, and system robustness in the presence of onboard failures and external uncertainties. This talk focuses on distributed AI and multiagent technology enabling distributed space systems autonomy capabilities, including onboard AI processing and actionable intelligence, multiagent-based small spacecraft and constellation resilience, distributed AI for dynamic optimisation of spacecraft operations, and AI-based real-time tasking and resource allocation in distributed space systems.

      Speaker: Prof. Ryszard Kowalczyk (University of South Australia)
    • 12:30 PM
      Lunch
    • 7
      NLP for Minor Languages Support
      Speaker: Dr Sunna Torge (TU Dresden / ScaDS.AI Dresden/Leipzig)
    • 3:00 PM
      Break
    • 8
      Architectures and systems for AI inference
      Speaker: Prof. Diana Göhringer (TU Dresden / ScaDS.AI Dresden/Leipzig)
    • 9
      Next-Generation Data Management via Large Language Models

      The past years have been marked by several breakthrough results in the domain of generative AI, culminating in the rise of tools like ChatGPT, able to solve a variety of language-related tasks without specialized training. In this talk, I outline novel opportunities in the context of data management, enabled by these advances. I discuss several recent research projects at Cornell, aimed at exploiting advanced language processing for tasks such as parsing a database manual to support automated tuning, or mining data for patterns, described in natural language. Finally, I discuss our recent and ongoing research, aimed at synthesizing code for SQL processing in general-purpose programming languages, while enabling customization via natural language commands.

      Speaker: Prof. Immanuel Trummer (Cornell University)
    • 10:30 AM
      Break
    • 10
      Machine Learning in Aviation – from Traceable Machine Learning to Flows Around Airfoils

      Artificial intelligence is one of the key drivers for digitization in aviation. Its application ranges from automation of flight control in certain scenarios up to engineering & design tools used in virtual certification. At the DLR Institute of Software Methods for Product Virtualization, we are researching AI models and their application in the aviation domain. We focus mostly on engineering & design processes and try to augment & combine existing tools with AI methods. In this talk we will present and discuss some of our use cases, from traceable ML up to graph kernel networks for flow prediction.

      Speaker: Dr Betty Calpas (Deutsches Zentrum für Luft- und Raumfahrt e.V.)
    • 12:00 PM
      Lunch
    • 11
      The “Genes” of Materials Properties and Functions Identified by Symbolic Regression

      The identification of correlations describing materials properties and functions is crucial for guiding materials discovery, since the number of possible materials is practically infinite and only a few compounds are useful for a given application. However, a material's behaviour might result from an intricate interplay of several underlying physical processes, which makes it challenging to model materials explicitly by simulation and to derive these correlations.

      In this talk, the combination of consistent experimental and theoretical data with symbolic regression is presented as an approach to model materials and to determine the key physicochemical descriptive parameters ("materials genes") reflecting the processes that trigger, facilitate, or hinder the materials performance. The symbolic regression AI approach leverages the small number of materials that can be accessed experimentally and identifies nonlinear correlations that can be exploited for enhancing physical understanding and designing new materials. The data-centric approach is illustrated in the context of heterogeneous catalysis and mechanical properties.

      Speaker: Dr Lucas Foppa (Fritz-Haber-Institut, Max-Planck-Gesellschaft)
    • 2:30 PM
      Social Event: Hike and Dinner at Louisenhof Dresden

      Hike through Dresdner Heide followed by a dinner at Louisenhof

      Louisenhof Dresden
      Bergbahnstraße 8
      01324 Dresden
      https://goo.gl/maps/qo8ekz6Mnbdhtqog7

    • 12
      Computing Architectures for AI

      Artificial intelligence has long since arrived in our everyday lives and is present in many ways, whether visible or invisible. Microelectronics provide the basis for almost all applications, and in combination with powerful software, AI models and connectivity, increasingly attractive solutions are emerging that are changing our lives. In the process, AI is increasingly distributed and computation takes place at the most efficient layer. Semiconductor companies like Infineon are both enablers and users of AI. With their products, they enable AI, and in their manufacturing fabs, they have long used AI for highly automated and efficient production at the highest level. The presentation covers an arc from the role of microelectronics for AI to the Saxon perspective, Silicon Saxony, to concrete applications with AI and their challenges, with a focus on Infineon.

      Speaker: Mr Uwe Gäbler (Infineon)
    • 10:30 AM
      Break
    • 13
      Information field theory: concepts, astronomical applications, & relation to AI
      Speaker: Dr Torsten Enßlin (MPI Garching)
    • 12:30 PM
      Lunch
    • 14
      Bayesian Statistics and Machine Learning

      Bayesian statistics is a powerful framework for thinking about statistical distributions and inference with associated models. In this talk we will give an overview of the main differences between the frequentist and Bayesian views on statistics. Further, we introduce the main concepts needed to understand and apply Bayesian statistics to real-world applications. The talk finishes with an overview of Markov chain Monte Carlo methods and their application to (simple) ML models.

      Speaker: Dr Gunar Ernis (Fraunhofer IAIS)
    • 3:00 PM
      Break
    • 15
      An Introduction to Gaussian Processes

      Since the early 1990s, Gaussian Processes have evolved from an unusual machine learning method to a standard tool of data scientists. Especially in regimes with small amounts of data and abstract expert knowledge, their strength becomes apparent: the combination of assumptions and data leads to particularly efficient models. Furthermore, as Bayesian models, they offer the possibility of uncertainty estimation, which can be exploited for special applications, such as Active Learning and Bayesian Optimization.

      This talk introduces Gaussian Processes: first they are derived starting from simple random variables, followed by a brief introduction of the basics of kernel design. After an application example showing how the fusion of data and prior knowledge can work, I give practical tips and tricks for application.

      Speaker: Mrs Dorina Weichert (Fraunhofer IAIS)
    • 16
      Experimental Design using Active Learning and Bayesian Optimization

      Real-world data can be very expensive: laboratory experiments, numerical simulations, and the training of large neural networks take not only a lot of time but also a lot of money. Nevertheless, the collection of these data is mandatory if progress is to be made. Statistical design of experiments is a traditional way to select the necessary data from among those that could be generated. However, there have also been relevant advances from the AI field in recent years: active learning and Bayesian optimization offer the possibility to create particularly efficient sequential experimental designs.

      In this talk, relevant methods from Bayesian Optimization and Active Learning will be presented, along with their similarities, differences, and limitations. In addition to standard extensions for practical use, we will show excerpts from the state of the art and, finally, their use in a highly relevant area: the United Nations Sustainable Development Goals.

      Speakers: Mrs Dorina Weichert (Fraunhofer IAIS), Dr Gunar Ernis (Fraunhofer IAIS)
    • 17
      Fundamentals of Representation Learning

      Deep learning approaches have been used very successfully to automatically find appropriate representations of input data in order to solve machine learning tasks. One particularly relevant, but also challenging, type of input data is knowledge graphs (KGs), which encode human knowledge. Currently, most deep learning approaches for representation learning in knowledge graphs are empirically driven. There is a lack of a clear mathematical understanding of how deep learning approaches can capture the complexity of human knowledge in specific application domains. In this talk, you will be introduced to the basic concepts of representation learning, linear algebra, knowledge graphs, and embedding models.

      Speaker: Dr Sahar Vahdati (InfAI)
    • 10:30 AM
      Break
    • 18
      Knowledge in Perception Systems

      Semantic memory and episodic memory play a critical role in human perception. Semantic memory refers to our brain’s repository of general world knowledge, while episodic memory refers to the system that encodes, stores, and allows access to “episodic memories”, e.g. recollections of personally experienced events situated within a unique spatial and temporal context. This inspired us to introduce the semantic stream, a dynamic knowledge graph wherein semantic and episodic memories are represented as interconnected graphs. This representation allows integration and fusion of various kinds of sensory observations, e.g. images, videos and point clouds, into interlinked sub-symbolic and symbolic data streams at different levels of semantic abstraction.

      My talk will delve into the fundamental elements of perception systems, from sensory inputs to high-level cognition, providing a comprehensive overview of how different knowledge types contribute to the whole process of building these systems. Special attention will be given to dynamic knowledge representation, semantic-driven learning, the fusion of sensory data, and the integration of contextual knowledge. Furthermore, I will share my experiences in building perception pipelines for autonomous vehicles and robots via a declarative programming model based on semantic streams. This programming model enables developers to write semantic stream fusion programs, composed of if-then rules associated with stream data fusion operations for both reasoning and learning tasks.

      Speaker: Danh Le Phuoc (BIFOLD)
    • 12:30 PM
      Lunch