Keynote Speakers
Razvan Pascanu
Razvan Pascanu grew up in Romania and studied computer science and electrical engineering as an undergraduate in Germany. He received his MSc from Jacobs University, Bremen in 2009 under the supervision of Prof. Herbert Jaeger, and holds a PhD from the University of Montreal (2014), completed under the supervision of Prof. Yoshua Bengio. His PhD thesis can be found here.
He was involved in developing Theano and helped write some of the deep learning tutorials for Theano. Razvan has published several papers on topics surrounding deep learning and deep reinforcement learning (see his scholar page). He is one of the organizers of EEML (www.eeml.eu) and of AIRomania. As part of the AIRomania community, he has organized RomanianAIDays since 2020 and helped build a course on AI aimed at high school students.
Title of talk: State Space Models, a new form of Recurrent Neural Network
Abstract:
In this talk I will focus on State Space Models (SSMs), a subclass of Recurrent Neural Networks (RNNs) that has recently gained attention through works like Mamba, obtaining strong performance against transformer baselines in language modelling. I will start by explaining how SSMs can be viewed as just a particular parametrization of RNNs, and what the crucial differences are compared to previous recurrent architectures that led to these results. My goal is to demystify the relatively complex parametrization of the architecture and identify which elements are needed for the model to perform well. In this process I will introduce the Linear Recurrent Unit (LRU), a simplified linear layer inspired by existing SSM layers. In the second part of the talk I will focus on language modelling and the block structure in which such layers tend to be embedded. I will argue that beyond the recurrent layer itself, the block structure borrowed from transformers plays a crucial role in the recent successes of this architecture, and present results at scale for well-performing hybrid recurrent architectures compared to strong transformer baselines.
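To make the kind of parametrization discussed above concrete, the following is a minimal, illustrative sketch of a linear diagonal recurrence in the spirit of the LRU. It is our own simplified reading of the publicly described layer, not the speaker's code; the initialization ranges and shapes are assumptions.

    import numpy as np

    def lru_forward(u, nu_log, theta_log, B, C, D):
        """Minimal LRU-style sketch: x_t = lam * x_{t-1} + gamma * (B @ u_t),
        with lam a diagonal complex state matrix constrained to |lam| < 1."""
        lam = np.exp(-np.exp(nu_log) + 1j * np.exp(theta_log))  # stable eigenvalues
        gamma = np.sqrt(1.0 - np.abs(lam) ** 2)                 # input normalization
        x = np.zeros(lam.shape[0], dtype=np.complex128)
        ys = []
        for t in range(u.shape[0]):
            x = lam * x + gamma * (B @ u[t])    # element-wise recurrence, O(N) per step
            ys.append((C @ x).real + D * u[t])  # project back to real outputs
        return np.stack(ys)

    # toy usage: H=4 input/output channels, N=8 complex state dimensions
    rng = np.random.default_rng(0)
    H, N, T = 4, 8, 16
    nu_log = np.log(-np.log(rng.uniform(0.9, 0.999, N)))     # log-magnitude param.
    theta_log = np.log(rng.uniform(1e-3, np.pi / 10, N))     # phase param.
    B = (rng.normal(size=(N, H)) + 1j * rng.normal(size=(N, H))) / np.sqrt(2 * H)
    C = (rng.normal(size=(H, N)) + 1j * rng.normal(size=(H, N))) / np.sqrt(N)
    D = rng.normal(size=H)
    y = lru_forward(rng.normal(size=(T, H)), nu_log, theta_log, B, C, D)

Because the recurrence is linear and diagonal, each state dimension evolves independently, which is what makes such layers cheap per step and amenable to parallel scans in practice.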
Shika Surana
Shika Surana is a Research Engineer at InstaDeep. Her current project leverages machine learning for the in silico design of de novo biological sequences. Prior to this, Shika worked on using Reinforcement Learning for Combinatorial Optimization, where her team developed a novel method that leverages a single set of network parameters to solve NP-hard problems by learning infinitely many policies. Before joining InstaDeep, Shika completed an MSc in Computing (AI and Machine Learning) at Imperial College, where she wrote her dissertation in the field of robotics; specifically, she used Quality-Diversity methods to develop a robust framework that allowed simulated quadruped robots to traverse a wide range of environments.
Talk title:
"COMPASS - Combinatorial Optimization with Policy Adaptation using Latent Space Search"
Abstract:
Combinatorial optimization is crucial for many real-world applications, but creating effective algorithms for these NP-hard problems is challenging. While Reinforcement Learning (RL) offers a flexible framework, it hasn't yet outperformed industrial solvers. Current RL methods often use limited search procedures. We introduce COMPASS, an RL approach that generates diverse, specialized policies through a continuous latent space. COMPASS outperforms state-of-the-art methods in benchmark tasks and shows better generalization across various problem instances.
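As a rough illustration of the idea of searching a latent space of specialized policies at inference time, the sketch below runs a simple cross-entropy-style search over latent vectors. This is a hedged sketch of the general principle, not COMPASS itself: policy_score is a hypothetical stand-in for rolling out a latent-conditioned policy on a problem instance, and COMPASS trains its policies with RL and uses its own search procedure.

    import numpy as np

    rng = np.random.default_rng(0)
    LATENT_DIM = 16

    def policy_score(instance, z):
        # hypothetical stand-in: roll out the policy pi(a | s, z) conditioned
        # on latent z over the given problem instance, return objective value
        return -np.linalg.norm(instance - z[: instance.shape[0]])

    def latent_space_search(instance, n_iters=50, pop=32, elite_frac=0.25):
        """Adapt to one instance by searching latents, not network weights."""
        mu, sigma = np.zeros(LATENT_DIM), np.ones(LATENT_DIM)
        n_elite = max(1, int(pop * elite_frac))
        for _ in range(n_iters):
            zs = mu + sigma * rng.normal(size=(pop, LATENT_DIM))  # candidates
            scores = np.array([policy_score(instance, z) for z in zs])
            elites = zs[np.argsort(scores)[-n_elite:]]            # best latents
            mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-3
        return mu  # latent whose induced policy best fits this instance

    best_z = latent_space_search(rng.normal(size=8))

The appeal of this setup is that the network weights stay fixed after training: adapting to a new instance only requires searching the low-dimensional latent space, which is far cheaper than fine-tuning the policy itself.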
Marta Wolinska
Marta Wolinska is a Research Engineer in the Genomics team at InstaDeep, where her work focuses on investigating novel applications of foundation models for genomics.
Marta graduated with an MSc in Artificial Intelligence from Imperial College in October 2023. Her research thesis was interdisciplinary: she investigated whether Quality-Diversity algorithms, which originate in the field of robotics, could be applied to materials discovery. Prior to this she was on a different career path, working as a consultant, primarily with large banks, on their digital transformation.
Talk title:
"The Nucleotide Transformer initiative: building and evaluating robust foundation models for genomics"
Abstract:
The human genome sequence provides the underlying code for human biology. Since the sequencing of the human genome 20 years ago, a main challenge in genomics has been the prediction of molecular phenotypes from DNA sequences alone. Models that can “read” the genome of each individual and predict the different regulatory layers and cellular processes hold the promise of better understanding, preventing and treating diseases. Here, we introduce the Nucleotide Transformer (NT), our initiative to build robust and general DNA foundation models that learn the languages of genomic sequences and molecular phenotypes. We will first present our initial collection of DNA foundation models, with up to 2.5B parameters, pre-trained on 850 genomes from various species. The Nucleotide Transformer models v1 and v2, and agroNT, a version specific to agricultural applications, have learned transferable, context-specific representations of nucleotide sequences, and can be fine-tuned at low cost to solve a variety of genomics applications. We will then discuss avenues for improving these models to tackle modern challenges in the field. Notably, we will use this discussion as an opportunity to present our progress on several fronts towards more general genomics AI agents that integrate different modalities and have improved transfer capabilities. The training and application of such foundation models in genomics provide a widely applicable stepping stone toward accurate predictions from DNA sequence alone.
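As a concrete example of how a DNA language model sees its input, the sketch below tokenizes a sequence into non-overlapping 6-mers, the basic scheme the NT models build on. This is a simplified illustration, not the actual NT tokenizer, which also handles single-nucleotide and special tokens; the exact vocabulary layout here is an assumption.

    # minimal sketch of non-overlapping k-mer tokenization for DNA sequences
    from itertools import product

    K = 6
    VOCAB = {"".join(kmer): i for i, kmer in enumerate(product("ACGT", repeat=K))}
    UNK = len(VOCAB)  # bucket for k-mers containing N or other ambiguity codes

    def tokenize(seq, k=K):
        seq = seq.upper()
        chunks = [seq[i : i + k] for i in range(0, len(seq) - k + 1, k)]
        return [VOCAB.get(c, UNK) for c in chunks]

    print(tokenize("ACGTACGTACGTACGTAC"))  # 18 bp -> 3 six-mer token ids

Tokenizing in 6-mers gives a 4,096-word core vocabulary and shortens sequences six-fold relative to single-nucleotide tokens, which is what lets transformer-style models attend over longer genomic contexts at a given compute budget.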