Miguel Vasco has been a PhD student in Computer Science at Tech Lisbon since 2018 and a researcher at the Group on Artificial Intelligence for People and Society (GAIPS) since 2016. Miguel aims to merge the topics of Artificial Intelligence and Human Cognition with Deep Learning. He will certainly be successful in such a task.
Humans (and some robots) perceive their environment through multiple sensory channels, obtaining multimodal data from the world. From that data, they build rich internal representations of their environment in order to act upon the world. Miguel likes to build biologically-inspired generative models to learn multimodal representations. 45% of the time, they work every time.
Besides working hard, Miguel likes to make people laugh, pretend he is a Shaolin warrior, compose music, and write descriptions of himself in the third person.
PhD Student in Computer Science, 2018-2022
Tech Lisbon, University of Lisbon
MSc in Engineering Physics, 2016
Tech Lisbon, University of Lisbon
In this work we explore the use of latent representations obtained from multiple input sensory modalities (such as images or sounds) to allow an agent to learn and exploit policies over different subsets of input modalities. We propose a three-stage architecture that allows a reinforcement learning agent trained over a given sensory modality to execute its task on a different sensory modality: for example, learning a visual policy over image inputs and then executing that policy when only sound inputs are available. We show that the generalized policies achieve better out-of-the-box performance than different baselines. Moreover, we show that this holds in different OpenAI Gym and video game environments, even when using different multimodal generative models and reinforcement learning algorithms.
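The key idea is that the policy never sees raw observations, only points in a latent space shared by all modalities, so swapping the encoder swaps the input modality without retraining the policy. Below is a minimal PyTorch sketch of that idea; the network sizes, dimensions, and training stages marked "assumed" are illustrative and not the paper's actual architecture or code.

```python
import torch
import torch.nn as nn

LATENT_DIM = 16  # assumed size of the shared multimodal latent space

class Encoder(nn.Module):
    """Maps a single-modality observation into the shared latent space."""
    def __init__(self, obs_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, LATENT_DIM))

    def forward(self, obs):
        return self.net(obs)

class Policy(nn.Module):
    """Acts on the shared latent, so it is agnostic to the input modality."""
    def __init__(self, n_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, z):
        return self.net(z)

# Stage 1 (assumed): train both encoders jointly (e.g. as a multimodal
# generative model) so images and sounds map to the same latent space.
image_enc = Encoder(obs_dim=3 * 32 * 32)   # flattened image observation
sound_enc = Encoder(obs_dim=128)           # e.g. a spectrogram frame
policy = Policy(n_actions=4)

# Stage 2 (assumed): train the policy with any RL algorithm on image latents.
image_obs = torch.randn(1, 3 * 32 * 32)
action_logits = policy(image_enc(image_obs))

# Stage 3 (assumed): at execution time only sound is available;
# the same policy runs unchanged on the sound encoder's latents.
sound_obs = torch.randn(1, 128)
action = policy(sound_enc(sound_obs)).argmax(dim=-1)
```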
Humans interact in rich and diverse ways with the environment. However, the representation of such behavior by artificial agents is often limited. In this work we present motion concepts, a novel multimodal representation of human actions in a household environment. A motion concept encompasses a probabilistic description of the kinematics of the action along with its contextual background, namely the location and the objects held during the performance. Furthermore, we present Online Motion Concept Learning (OMCL), a new algorithm that learns novel motion concepts from action demonstrations and recognizes previously learned motion concepts. The algorithm is evaluated in a virtual-reality household environment with a human avatar. OMCL outperforms standard motion recognition algorithms on a one-shot recognition task, attesting to its potential for sample-efficient recognition of human actions.
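To make the idea of a motion concept concrete, here is a small Python sketch of one possible data structure: a Gaussian over kinematic features paired with its context (location and held objects), with recognition by the highest-scoring stored concept. The scoring rule, context weighting, and one-shot initialization are assumptions for illustration, not the OMCL implementation.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class MotionConcept:
    name: str
    mean: np.ndarray          # mean of the kinematic feature vector
    var: np.ndarray           # per-dimension variance of the kinematics
    location: str             # contextual background: where the action happens
    held_objects: frozenset   # contextual background: objects held during it

    def log_score(self, features, location, held_objects):
        # Diagonal-Gaussian log-likelihood of the observed kinematics ...
        ll = -0.5 * np.sum((features - self.mean) ** 2 / self.var
                           + np.log(2 * np.pi * self.var))
        # ... plus a simple bonus when the context matches (assumed weighting).
        ll += 1.0 if location == self.location else 0.0
        ll += 1.0 if held_objects == self.held_objects else 0.0
        return ll

def recognize(concepts, features, location, held_objects):
    """Return the best-matching stored concept for a new demonstration."""
    return max(concepts,
               key=lambda c: c.log_score(features, location, held_objects))

# One-shot learning: a single demonstration initializes a new concept
# (unit variance as an assumed prior until more demonstrations arrive).
demo = np.array([0.2, 1.1, -0.3])
pour = MotionConcept("pour", demo, np.ones_like(demo),
                     "kitchen", frozenset({"cup"}))

best = recognize([pour], np.array([0.25, 1.0, -0.2]),
                 "kitchen", frozenset({"cup"}))
print(best.name)
```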