Summer School 2023
from Monday, July 3, 2023 (1:00 PM) to Friday, July 7, 2023 (2:00 PM)
Monday, July 3, 2023
1:00 PM
Registration and Welcome Coffee
1:00 PM - 1:30 PM
1:30 PM
Introduction
Wolfgang E. Nagel (TU Dresden / ScaDS.AI Dresden/Leipzig)
1:30 PM - 2:45 PM
2:45 PM
Break
2:45 PM - 3:15 PM
3:15 PM
Big Data and AI for Multimodal Communication Research
Peter Uhrig (TU Dresden / ScaDS.AI Dresden/Leipzig)
3:15 PM - 4:45 PM
Much linguistic research has long focused on words and sentences alone, ignoring co-speech gesture and facial expressions. This was to some extent due to the availability of large corpora of written text, which is comparatively easy to collect and process. With the advances in data processing and machine learning over the past decade, the use of large audiovisual datasets has come within reach. This presentation will report on the full lifecycle of large audiovisual datasets, spanning collection, cleaning, processing, access infrastructure, and analysis. We will see how NLP, audio processing, and computer vision technology support the research process or enable us to work on entirely new research questions. The datasets used are taken from American TV and from Russian international media on YouTube. In the second part of the presentation, we will discuss the gap between research in Computer Science and related disciplines and its application to research carried out in the Humanities and Social Sciences.
4:45 PM
Poster Session + Get Together
4:45 PM - 6:30 PM
Tuesday, July 4, 2023
8:00 AM
Fire Talk: AI and Start Up Scene
Richard Socher (you.com)
8:00 AM - 9:00 AM
9:00 AM
Developing Multilingual, Open and European Large Language Models
Georg Rehm (DFKI)
Pedro Ortiz Suarez (DFKI)
9:00 AM - 10:30 AM
Since the introduction of ChatGPT in November 2022, Large Language Models (LLMs) have become ubiquitous in everyday life for a large portion of the global population, and have simplified and facilitated a wide range of tasks for experts and non-professional users alike. However, the underlying technologies of conversational models such as ChatGPT remain closed-source and in the hands of probably fewer than a dozen private organisations worldwide. In our presentation we will report on our efforts in the project OpenGPT-X, funded by the German Federal Ministry for Economic Affairs and Climate Action (BMWK), to develop large generative language models for the German language, while making them open-source and respectful of European values. In addition to providing an overview of the project, we will present our efforts, in collaboration with the EU project European Language Equality (ELE), towards developing multilingual language models, covering the curation of a large multilingual data set as well as data filtering and preparation; we will give details about the training of our European models and show the first results of the evaluation. Furthermore, we will give an overview of the current state of digital language inequality in Europe, which we aim to transform into full digital language equality by 2030. Two platforms and initiatives that are of crucial importance in that regard are the European Language Grid (ELG) and the recently started Common European Language Data Space (LDS), which will also be briefly highlighted.
10:30 AM
Break
10:30 AM - 11:00 AM
11:00 AM
Distributed AI and Multi-agent Systems for Space Systems Autonomy
Ryszard Kowalczyk (University of South Australia)
11:00 AM - 12:30 PM
AI-based autonomy has been recognised as a key enabler of next-generation space systems that aim at increasing the responsiveness and continuity of space-based observations, covering large areas at higher resolutions, minimising communication and data access latencies, and reducing the costs of both the space and ground segments. Space systems autonomy encompasses onboard autonomous decision-making capabilities that enable the space segment to continue mission operations and to survive critical situations without relying on ground segment intervention. It relates to all aspects of spacecraft operations, including continuous mission planning and execution on board, real-time spacecraft control outside ground contact, maximisation of mission objectives in relation to the available onboard resources and the capabilities of other spacecraft, and system robustness in the presence of onboard failures and external uncertainties. This talk focuses on distributed AI and multiagent technology enabling distributed space systems autonomy capabilities, including onboard AI processing and actionable intelligence, multiagent-based small spacecraft and constellation resilience, distributed AI for dynamic optimisation of spacecraft operations, and AI-based real-time tasking and resource allocation in distributed space systems.
12:30 PM
Lunch
12:30 PM - 2:00 PM
2:00 PM
NLP for Minor Languages Support
Sunna Torge (TU Dresden / ScaDS.AI Dresden/Leipzig)
2:00 PM - 3:00 PM
3:00 PM
Break
3:00 PM - 3:45 PM
3:45 PM
Architectures and systems for AI inference
Diana Göhringer (TU Dresden / ScaDS.AI Dresden/Leipzig)
3:45 PM - 5:15 PM
Wednesday, July 5, 2023
9:00 AM
Next-Generation Data Management via Large Language Models
Immanuel Trummer (Cornell University)
9:00 AM - 10:30 AM
The past years have been marked by several breakthrough results in the domain of generative AI, culminating in the rise of tools like ChatGPT, which can solve a variety of language-related tasks without specialized training. In this talk, I outline novel opportunities in the context of data management that these advances enable. I discuss several recent research projects at Cornell aimed at exploiting advanced language processing for tasks such as parsing a database manual to support automated tuning, or mining data for patterns described in natural language. Finally, I discuss our recent and ongoing research aimed at synthesizing code for SQL processing in general-purpose programming languages, while enabling customization via natural language commands.
10:30 AM
Break
10:30 AM - 11:00 AM
11:00 AM
Machine Learning in Aviation – from Traceable Machine Learning to Flows Around Airfoils
Betty Calpas (Deutsches Zentrum für Luft- und Raumfahrt e.V.)
11:00 AM - 12:00 PM
Artificial intelligence is one of the key drivers of digitization in aviation. Its applications range from the automation of flight control in certain scenarios to engineering & design tools used in virtual certification. At the DLR Institute of Software Methods for Product Virtualization, we are researching AI models and their application in the aviation domain. We focus mostly on engineering & design processes and try to augment & combine existing tools with AI methods. In this talk we will present and discuss some of our use cases, from traceable ML to graph kernel networks for flow prediction.
12:00 PM
Lunch
12:00 PM - 1:30 PM
1:30 PM
The “Genes” of Materials Properties and Functions Identified by Symbolic Regression
Lucas Foppa (Fritz-Haber-Institut, Max-Planck-Gesellschaft)
1:30 PM - 2:30 PM
The identification of correlations describing materials properties and functions is crucial for guiding materials discovery, since the number of possible materials is practically infinite and only a few compounds are useful for a given application. However, a material's behaviour might result from an intricate interplay of several underlying physical processes, which challenges both the explicit modelling of materials by simulation and the derivation of these correlations. In this talk, the combination of consistent experimental and theoretical data with symbolic regression is presented as an approach to model materials and to determine the key physicochemical descriptive parameters ("materials genes") reflecting the processes that trigger, facilitate, or hinder materials performance. The symbolic regression AI approach leverages the small number of materials that can be accessed experimentally and identifies nonlinear correlations that can be exploited for enhancing physical understanding and designing new materials. The data-centric approach is illustrated in the context of heterogeneous catalysis and mechanical properties.
2:30 PM
Social Event: Hike and Dinner at Louisenhof Dresden
2:30 PM - 9:00 PM
Thursday, July 6, 2023
9:00 AM
Computing Architectures for AI
Uwe Gäbler (Infineon)
9:00 AM - 10:30 AM
Artificial intelligence has long since arrived in our everyday lives and is present in many ways, whether visible or invisible. Microelectronics provide the basis for almost all applications, and in combination with powerful software, AI models and connectivity, increasingly attractive solutions are emerging that are changing our lives. In the process, AI is increasingly distributed and computation takes place at the most efficient layer. Semiconductor companies like Infineon are both enablers and users of AI. With their products, they enable AI, and in their manufacturing fabs, they have long used AI for highly automated and efficient production at the highest level. The presentation covers an arc from the role of microelectronics for AI to the Saxon perspective, Silicon Saxony, to concrete applications with AI and their challenges, with a focus on Infineon.
10:30 AM
Break
10:30 AM - 11:00 AM
11:00 AM
Information field theory: concepts, astronomical applications, & relation to AI
Torsten Enßlin (MPI Garchingen)
11:00 AM - 12:30 PM
12:30 PM
Lunch
12:30 PM - 2:00 PM
2:00 PM
Bayesian Statistics and Machine Learning
Gunar Ernis (Fraunhofer IAIS)
2:00 PM - 3:00 PM
Bayesian statistics is a powerful framework for thinking about statistical distributions and inference with associated models. In this talk we will give an overview of the main differences between the frequentist and Bayesian views on statistics. We then introduce the main concepts needed to understand Bayesian statistics and apply it to real-world problems. The talk finishes with an overview of Markov chain Monte Carlo methods and their application to (simple) ML models.
3:00 PM
Break
3:00 PM - 3:30 PM
3:30 PM
An Introduction to Gaussian Processes
Dorina Weichert (Fraunhofer IAIS)
3:30 PM - 4:30 PM
Since the early 1990s, Gaussian Processes have evolved from an unusual machine learning method to a standard tool of data scientists. Especially in regimes with small amounts of data and abstract expert knowledge, their strength becomes apparent: the combination of assumptions and data leads to particularly efficient models. Furthermore, as Bayesian models, they offer the possibility of uncertainty estimation, which can be exploited for special applications, such as Active Learning and Bayesian Optimization. This talk introduces Gaussian Processes: first they are derived starting from simple random variables, followed by a brief introduction of the basics of kernel design. After an application example showing how the fusion of data and prior knowledge can work, I give practical tips and tricks for application.
4:30 PM
Experimental Design using Active Learning and Bayesian Optimization
Dorina Weichert (Fraunhofer IAIS)
Gunar Ernis (Fraunhofer IAIS)
4:30 PM - 5:30 PM
Real-world data can be very expensive: laboratory experiments, numerical simulations, and the training of large neural networks take not only a lot of time but also a lot of money. Nevertheless, collecting these data is mandatory if progress is to be made. Statistical design of experiments is a traditional way to select the necessary data from all the data that could be generated. However, there have also been relevant advances from the AI field in recent years: active learning and Bayesian optimization make it possible to create particularly efficient sequential experimental designs. In this talk, relevant methods from Bayesian Optimization and Active Learning, along with their similarities, differences, and limitations, will be presented. In addition to standard extensions for practical use, we will show excerpts from the state of the art and, finally, applications to highly relevant problems: the United Nations Sustainable Development Goals.
Friday, July 7, 2023
9:00 AM
Fundamentals of Representation Learning
Sahar Vahdati (InfAI)
9:00 AM - 10:30 AM
Deep learning approaches have been used very successfully to automatically find appropriate representations of input data in order to solve machine learning tasks. One particularly relevant, but also challenging, type of input data is knowledge graphs (KGs), which encode human knowledge. Currently, most deep learning approaches for representation learning in knowledge graphs are empirically driven. There is a lack of a clear mathematical understanding of how deep learning approaches can capture the complexity of human knowledge in specific application domains. In this talk, you will be introduced to the basic concepts of representation learning, linear algebra, knowledge graphs, and embedding models.
10:30 AM
Break
10:30 AM - 11:00 AM
11:00 AM
Knowledge in Perception Systems
Danh Le Phuoc (BIFOLD)
11:00 AM - 12:30 PM
Semantic memory and episodic memory play a critical role in human perception. Semantic memory refers to our brain’s repository of general world knowledge, and episodic memory refers to our “episodic memory system”, which encodes, stores, and allows access to “episodic memories”, e.g. recollections of personally experienced events situated within unique spatial and temporal contexts. This inspired us to introduce the semantic stream, a dynamic knowledge graph wherein semantic and episodic memories are represented as interconnected graphs. This representation allows the integration and fusion of various kinds of sensory observations, e.g., images, videos, and point clouds, into interlinked sub-symbolic and symbolic data streams at different levels of semantic abstraction. My talk will delve into the fundamental elements of perception systems, from sensory inputs to high-level cognition, providing a comprehensive overview of how different knowledge types contribute to the whole process of building these systems. Special attention will be given to dynamic knowledge representation, semantic-driven learning, the fusion of sensory data, and the integration of contextual knowledge. Furthermore, I will share my experiences in building perception pipelines for autonomous vehicles and robots via a declarative programming model based on semantic streams. This programming model enables developers to write semantic stream fusion programs, composed of if-then rules associated with stream data fusion operations for both reasoning and learning tasks.
12:30 PM
Lunch
12:30 PM - 2:00 PM