Webinars series

2025-2026
Registration	Marco Valentino (University of Sheffield) TBA (Thursday, July 2, 2026 - 15:00 CET) Summary: . Bio: .
Registration	Beyza Ermiş (Cohere Labs) TBA (Thursday, June 4, 2026 - 15:00 CET) Summary: . Bio: .
Registration	Barbara Plank (Ludwig-Maximilians-Universität München) The Emergence of Multilingual Representations: Tracing Linguistic Capabilities During Language Model Pretraining (Thursday, May 21, 2026 - 15:00 CET) Summary: There is increasing interest in understanding multilingual training dynamics and shared representations, instead of analysing final model checkpoints. Tracing training dynamics allows us to analyse when linguistic information and shared concept spaces emerge during pre-training and understand model mechanisms, e.g. where alignment breaks down. In this talk, I will discuss why studying training dynamics is useful, particularly from a multilingual lens, and present recent findings on studying model behaviour and representations during pre-training. Bio: Barbara Plank is Professor and co-director of the Center for Information and Language Processing at LMU Munich. She holds the Chair for AI and Computational Linguistics at LMU Munich and is a visiting Professor at the Computer Science department at the IT University of Copenhagen. Her MaiNLP research lab (Munich AI and NLP lab) focuses on robust machine learning for Natural Language Processing with an emphasis on human-inspired and data-centric approaches. Her research has been funded by distinguished grants and awards, including an ERC Consolidator Grant, DFF Sapere Aude Research Leader grant, ELLIS Fellow, and several best paper awards. She regularly serves on international committees, including the Association for Computational Linguistics (ACL), the European Chapter of the ACL, the Northern European Association for Language Technology (NEALT) and Scientific Advisory Boards of Research Centers across Europe.
Registration	Ranjay Krishna (University of Washington) Visual Reasoning will be bigger than language reasoning (Thursday, April 16, 2026 - 15:00 CET) Summary: I will argue that visual reasoning is a fundamental capability and one that has tremendous potential in multimodal language models. I will start by outlining the types of tasks that multimodal models still fall short on, drawing on decades of computer vision research. Next, I will introduce the concept of sketching, which operationalizes visual reasoning using external computer vision models as tools. I will demonstrate the potential of visual reasoning with sketching, and outline the limitations. We will then overcome these limitations by incorporating visual reasoning directly into the language model using perception tokens. Finally, I will describe how visual reasoning can enable robots to reason in space, allowing them to surpass non-reasoning proprietary robotics foundation models. Bio: Ranjay Krishna is an Assistant Professor at the Allen School of Computer Science & Engineering. He co-directs the RAIVN lab at UW and directs the PRIOR team at the Allen Institute. His research lies at the intersection of computer vision, natural language processing, robotics, and human-computer interaction. This research has received best paper honorable mentions at CVPR'25 and CSCW'23, outstanding paper at NeurIPS'21 and ACL'21, and dozens of orals at CVPR, ACL, CSCW, NeurIPS, UIST, and ECCV, and has been reported by Science, Forbes, the Wall Street Journal, and PBS NOVA. He is also recognized as one of MIT Technology Review's 35 under 35 Asia Pacific '25. His research has been supported by Google, Apple, Ai2, Amazon, Cisco, Toyota Motor Inc, Toyota Research Institute, NSF, ONR, and Yahoo. He holds a bachelor's degree in Electrical & Computer Engineering and in Computer Science from Cornell University, a master's degree in Computer Science from Stanford University and a Ph.D. in Computer Science from Stanford University.
Registration	José Andrés González-López (Universidad de Granada) From Neural Signals to Fluent Speech: Recent Advances in Neural Speech Interfaces (Thursday, March 5, 2026 - 15:00 CET) Summary: Neural speech interfaces aim to restore natural communication in individuals who have lost the ability to speak while preserving cognitive function. Over the past decade, this field has undergone a remarkable transformation, moving from slow and cognitively demanding spelling-based brain–computer interfaces to systems capable of decoding continuous speech directly from neural activity. These advances have been driven by the convergence of high-resolution invasive neural recording technologies, improved experimental paradigms for speech production and perception, and powerful deep learning models inspired by modern automatic speech recognition systems. In this talk, I will review the state of the art in neural speech prostheses, with a particular focus on next-generation BCIs that translate cortical activity into text or synthetic speech. I will discuss key design choices, including neural recording techniques (such as ECoG, sEEG, and intracortical microelectrodes), target brain areas, decoding architectures, and evaluation metrics. I will also highlight recent clinical results demonstrating unprecedented levels of accuracy, fluency, and long-term stability in continuous speech decoding. Finally, I will outline current challenges and future directions, including scalability across users, real-time bidirectional feedback, and the path towards clinical and real-world deployment, illustrated with ongoing work from our research group. Bio: Jose A. Gonzalez-Lopez is an Associate Professor at the University of Granada whose research sits at the frontier of artificial intelligence, computational neuroscience, and neural speech prostheses. His work addresses the core challenge of how to translate high-dimensional neural activity into fluent, natural speech, bridging invasive neural recordings with modern deep learning and speech–language models. He leads multiple competitive R&D projects on AI-driven speech restoration for individuals with severe neurological and phonatory impairments, with a strong emphasis on long-term robustness, scalability across users, and real-world clinical deployment. He has published over 100 papers in leading international journals and conferences. His contributions have been recognized with several awards for scientific excellence and technological innovation, and his research is embedded in a strong international collaboration network built through extended research visits to institutions such as the University of Sheffield, the University of Bremen, and Maastricht University.
Registration	Henning Wachsmuth (Leibniz University Hannover) Toward Argumentative Large Language Models (Thursday, February 5, 2026 - 15:00 CET) Summary: Today's large language models (LLMs) are optimized toward giving helpful answers in response to prompts. In many situations, however, it may be preferable for an LLM to foster critical thinking rather than just following an instruction. While recent LLMs are said to 'reason', they barely build on established reasoning concepts known from argumentation theory. In this talk, I will give insights into recent efforts of my group in making LLMs more argumentative. Starting from basics of LLM training processes, I will present how to specialize LLMs for argumentation tasks via instruction fine-tuning as well as how to align the arguments they generate using reinforcement learning. From there, I will give an outlook on how to improve the actual reasoning capabilities of LLMs. Bio: Henning Wachsmuth leads the Natural Language Processing Group at the Institute of Artificial Intelligence of Leibniz University Hannover. After receiving his PhD from Paderborn University in 2015, he worked as a PostDoc at Bauhaus-Universität Weimar and as a junior professor in Paderborn, before he became a full professor in Hannover in 2022. His group does basic research on large language models for computational argumentation, social bias detection and mitigation, as well as explainable and educational NLP. Henning's main research interests include the generation of audience-aware text, the assessment of pragmatic text quality, and the modeling of bias and framing.
Registration	Thamar Solorio (Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)) A Research Agenda for Low Resource NLP (Thursday, January 15, 2026 - 15:00 CET) Summary: Low resource NLP is nowadays an umbrella keyword that covers a wide set of research directions. In this talk I will argue that it is important to carefully present the low resource scenario and distinguish when the languages are truly lacking resources, as opposed to simulating lack of labeled data. I propose to follow a more systematic way to represent work in this space and to question how we approach technology development for these languages. I will also present recent work in my group that contributes to improving language representation by exploring efficient approaches to diverse languages. Bio: Thamar Solorio is a professor of NLP at MBZUAI where she also serves as Vice Provost for Faculty Excellence and Advancement. Her research interests include NLP for low-resource settings and multilingual data, including code-switching and information extraction. More recently, she has been exploring language and vision problems, focusing on developing inclusive NLP. She served two terms as an elected board member of the North American Chapter of the Association for Computational Linguistics (NAACL) and was PC co-chair for NAACL 2019, and recently stepped down from being co-Editor in Chief of the ACL Rolling Review Initiative (ARR). She was the general chair of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Registration	Goran Glavaš (Universität Würzburg) Improving Multilingual Abilities of (Different Types of) Language Models (Thursday, December 4, 2025 - 15:00 CET) Summary: Language models tend to excel in languages they see the most during (pre)training—leaving low-resource languages at a stark disadvantage. But what if we could boost performance without throwing (much) more data or compute at the problem? In this talk, I’ll present a set of resource-lean (read: “cheap”) strategies that enhance multilingual language understanding and generation in low-resource settings. I’ll show how conceptually effective knowledge transfer techniques—not just bigger models—can improve multilingual capabilities across three major fronts: (1) standard text-based LLMs, (2) vision-language models, and (3) code language models. The takeaway? Scaling isn’t the only answer: for truly inclusive multilingual language technology, we need stronger inductive biases and more conceptual innovation. Bio: Goran Glavaš is a Full Professor for Natural Language Processing at the University of Würzburg (Germany), Center for AI and Data Science (CAIDAS). His research focuses on multilingual language understanding and cross-lingual transfer, vision-and-language models, and trustworthiness of (multilingual) language models. He has (co-)authored over 120 publications in NLP and IR, regularly publishing at top-tier venues (ACL, EMNLP, NAACL, EACL, TACL, SIGIR, ECIR). He received the best long paper award at EACL 2021 and outstanding paper awards at EACL 2024 and ACL 2024. He served as an Editor-in-Chief of the ACL Rolling Review (ARR) and regularly serves as (Senior) Area Chair for top-tier NLP conferences.
Registration	Ivan Vulić (University of Cambridge / Google DeepMind) On Merging and MoErging Models and Modules (Thursday, November 6, 2025 - 15:00 CET) Summary: Despite recent tendencies towards building large "monolithic" neural models, fine-tuned expert models and parameter-efficient specialised modules still offer gains over large monoliths in specific tasks and for specific data distributions (e.g., low-resource languages or specialised domains). Moreover, such modularisation of skills and expertise into dedicated models or modules allows for asynchronous, decentralised, and more efficient continuous model development, as well as module reusability. However, a central question remains: how to combine and compose these modules to enable positive transfer, sample-efficient learning, and improved out-of-domain generalisation. In this talk, after discussing the key advantages of modularisation and modular specialisation, I will provide an overview of prominent module and model composition strategies. I will focus on composition at the parameter level (model merging) and functional level (model MoErging), and then illustrate the usefulness of these techniques across several applications. Bio: Ivan Vulić is currently a Research Scientist at Google DeepMind in Zurich after spending a year there as a Visiting Researcher. Before that he was a Research Professor and a Royal Society University Research Fellow in the Language Technology Lab, University of Cambridge, where he spent 10 years across different research roles. From January 2018 until November 2024 he was also a Senior Scientist at PolyAI in London. Ivan holds a PhD in Computer Science from KU Leuven awarded summa cum laude. In 2021 he was awarded the annual Karen Spärck Jones Award from the British Computer Society for his research contributions to Natural Language Processing and Information Retrieval. His core expertise and research interests span, among others, cross-lingual, multilingual and multi-modal representation learning, modularity and composability of ML models, sample-efficient, parameter-efficient and few-shot ML, conversational AI, and data-centric ML.
Registration	Jose Camacho-Collados (Cardiff University) How Language Models Navigate Culture in a Multilingual World (Thursday, October 16, 2025 - 15:00 CET) Summary: Language models have become ubiquitous in NLP and beyond. In particular, the new wave of large language models (LLMs) is increasingly used to communicate and solve practical problems in many languages and countries, and by an increasingly diverse set of users. However, even though there is no doubt that these models open up plenty of opportunities, there are important issues and research questions that arise when it comes to LLMs and their application in different languages and cultures. For instance, the language coverage in language models drastically decreases for less-resourced languages and, as a result, so does their performance. Moreover, not only is general performance affected, but general-purpose LLMs may also be implicitly biased towards specific cultures and languages depending on their underlying training data. In this talk, I will discuss how language models reflect on cultural diversity, including potential shortcomings and how language coverage and cultural awareness may be intrinsically intertwined. I will also share some lessons learned based on our recent research in this area, including a large effort to develop a cultural benchmark of everyday knowledge for dozens of languages and countries. Bio: Jose Camacho-Collados is a UKRI Future Leaders Fellow and Professor at the School of Computer Science of Cardiff University, where he co-founded the Cardiff Natural Language Processing group (Cardiff NLP). Before joining Cardiff University, he completed his PhD at Sapienza University of Rome and was a Google AI PhD Fellow. Jose has worked in multiple NLP areas with a particular focus on semantics, multilinguality and computational social science with an interdisciplinary perspective. In this area, he has been developing specialised and efficient NLP models for social media applications, such as TweetNLP and related efforts. His work has received several recognitions, including awards at top NLP conferences, and the 2023 AIJ Prominent Paper Award. He is also the co-author of the “Embeddings in Natural Language Processing” book.

2024-2025
Registration	Mirella Lapata (The University of Edinburgh) Prompting is not all you need! Or why Multi-LLM Collaboration Matters (Thursday, June 5, 2025 - 15:00 CET) Summary: Recent years have witnessed the rise of increasingly large and more sophisticated language models (LMs) capable of performing every task imaginable, sometimes at (super)human level. In this talk, I will argue that in many realistic scenarios solely relying on a single general-purpose LLM is suboptimal. A single LLM is likely to under-represent real-world data distributions, heterogeneous skills, and task-specific requirements. Instead, I will discuss Multi-LLM collaboration as an alternative for compositional generative modeling. This approach leads to more effective problem-solving while being more inclusive and explainable. I will focus on narrative story generation tasks and demonstrate how these can be tackled by orchestrating a society of agents --- each pursuing individual goals while collectively working toward the overall task objective. Additionally, I will explore how these agent societies leverage reasoning to improve performance. Bio: Mirella Lapata is professor of natural language processing in the School of Informatics at the University of Edinburgh. Her research focuses on getting computers to understand, reason with, and generate natural language. She is the first recipient (2009) of the British Computer Society and Information Retrieval Specialist Group (BCS/IRSG) Karen Sparck Jones award and a Fellow of the Royal Society of Edinburgh, the ACL, and Academia Europaea. Mirella has also received best paper awards in leading NLP conferences and has served on the editorial boards of the Journal of Artificial Intelligence Research, the Transactions of the ACL, and Computational Linguistics. She was president of SIGDAT (the group that organizes EMNLP) in 2018. She has been awarded an ERC consolidator grant, a Royal Society Wolfson Research Merit Award, and a UKRI Turing AI World-Leading Researcher Fellowship.
Registration	Sampo Pyysalo (University of Turku) Towards Open Foundation Models for Europe (Tuesday, May 13, 2025 - 10:00 CET) Summary: Large Language Models (LLMs) are a breakthrough technology with broad social and economic effects. However, the development of leading models is currently concentrated in a few technology hubs primarily in the U.S. and China, leaving smaller languages behind and making Europe dependent on external technologies. To secure its digital sovereignty and ensure that its languages are fully represented, it is essential for Europe to have the capacity to build its own foundation models. In this talk, I will present a line of LLM work ranging from early monolingual models for Finnish to current efforts to create fully open foundation models for all European languages in the OpenEuroLLM project. Bio: Sampo Pyysalo is one of the leads of the TurkuNLP group (https://turkunlp.org/) at the University of Turku, Finland. His work focuses on machine learning for natural language processing, with particular emphasis on scientific text mining, Finnish language technology, and large language models. He received his PhD from the University of Turku and held researcher positions at the University of Tokyo, University of Manchester and University of Cambridge before returning to the University of Turku in 2019. He is currently PI in the HPLT (https://hplt-project.org/) and OpenEuroLLM (https://openeurollm.eu/) projects, where he leads efforts to train multilingual language models.
Registration	André F. T. Martins (Universidade de Lisboa) xCOMET, Tower, EuroLLM: Open & Multilingual LLMs for Europe (Thursday, May 8, 2025 - 15:00 CET) Summary: Today, LLMs are Swiss Army knives and MT is one of their tools. Is this the end of MT research? In this talk, I argue that the connection between LLM and MT research is two-way. I present some of our recent work advancing multilingual LLMs, tools to estimate their quality, and how the two can be combined for test-time scaling. First, I present xCOMET, an open-source learned metric which integrates sentence-level evaluation and error span detection, exhibiting state-of-the-art performance across all types of meta-evaluation (sentence-level, system-level, and error span detection). Moreover, it does so while highlighting and categorizing error spans, thus enriching the quality assessment. Then, I present Tower, a suite of open multilingual LLMs for translation-related tasks. Tower models are created through continued pretraining on a carefully curated multilingual mixture of monolingual and parallel data. The combination of Tower with COMET reranking obtained the best results in 8 out of 11 language pairs in the WMT General Translation shared task, according to human evaluation. Finally, I describe EuroLLM, an ongoing EU-made project whose goal is to train an open multilingual LLM from scratch using the European HPC infrastructure (EuroHPC). The latest release (EuroLLM-9B) supports 35 languages, including all 24 official EU languages, and it achieves strong results in various benchmarks, comparable to or better than the best existing models of similar size. xCOMET: https://huggingface.co/collections/Unbabel/xcomet-659eca973b3be2ae4ac023bb Tower: https://huggingface.co/collections/Unbabel/tower-659eaedfe36e6dd29eb1805c EuroLLM: https://huggingface.co/blog/eurollm-team/eurollm-9b Bio: André F. T. Martins (PhD 2012, Carnegie Mellon University and Instituto Superior Técnico; https://andre-martins.github.io/) is an Associate Professor at Instituto Superior Técnico, University of Lisbon, researcher at Instituto de Telecomunicações, and the VP of AI Research at Unbabel. His research, funded by an ERC Starting Grant (DeepSPIN) and Consolidator Grant (DECOLLAGE), among other grants, includes machine translation, quality estimation, and structure and interpretability in deep learning systems for NLP. His work has received several paper awards at ACL conferences. He co-founded and co-organizes the Lisbon Machine Learning School (LxMLS), and he is a Fellow of the ELLIS society and co-director of the ELLIS Program in Natural Language Processing. He is a member of the R&I advisory group of EuroHPC, the European infrastructure for supercomputing.
Registration	Emanuele Bugliarello (Google DeepMind) Towards Inclusive Multimodal AI (Thursday, April 3, 2025 - 15:00 CET) Summary: Visual assistants are becoming ubiquitous, yet their effectiveness varies drastically across languages and cultures. This talk presents an overview of the critical issue of multicultural disparity in image–text models. We'll explore this gap through three lenses: evaluation, training, and generation. First, I'll introduce benchmarks like MaRVL designed to quantify multilingual and multicultural competence. Next, we'll delve into techniques for mitigating these disparities in model training. Finally, we'll examine the emerging challenges and opportunities in multicultural visual generation. Bio: Emanuele Bugliarello is a research scientist at Google DeepMind based in Grenoble, France where he works on improving evaluation and capabilities of multimodal generative models. He completed his PhD in the NLP Section at the University of Copenhagen, while spending time at DeepMind, Google, Mila and Spotify. Previously, he studied computer and communication sciences at EPFL, Tongji University and Politecnico di Torino.
Registration	Christian Herff (Maastricht University) Speech neuroprostheses based on intracranial EEG (Thursday, March 6, 2025 - 15:00 CET) Summary: Speech is our most natural way of communication and the loss of the ability to speak is therefore devastating to patients. A speech neuroprosthesis that directly reconstructs speech processes from neural activity could provide a new means of communication to these severely affected patients. In this presentation, I will present some approaches to reconstruct different representations of speech from intracranial recordings and highlight how they can be used to build a speech neuroprosthesis. The decoding of speech processes is particularly challenging, as not only the neural, but also the target signal has complex, nonlinear dynamics. I will stress the use of interpretable machine learning models for this task to ensure that meaningful activity is decoded and scientific insights might be generated as a side product. Bio: Dr. Christian Herff is an assistant professor in the School for Mental Health and Neuroscience at Maastricht University where he leads the invasive BCI research line. His research interest lies in the application of machine learning technology to neurophysiological data for Brain-Computer Interfaces and neuroscience research. With a particular focus on the decoding of speech processes from intracranial data, he tries to improve the lives of severely paralyzed patients while simultaneously improving our understanding of complex higher order cognition. He emphasizes the ability to achieve interpretable results based on computational models. In particular, visualization of complex dynamic models, such as deep neural networks, is of interest to him.
Registration	Sebastian Ruder (Meta) Multilingual LLM Evaluation in Practical Settings (Thursday, February 6, 2025 - 15:00 CET) Summary: Large language models (LLMs) are increasingly used in a variety of applications across the globe but do not provide equal utility across languages. In this talk, I will discuss multilingual evaluation of LLMs in two practical settings: conversational instruction-following and usage of quantized models. For the first part, I will focus on a specific aspect of multilingual conversational ability where errors result in a jarring user experience: generating text in the user’s desired language. I will describe a new benchmark and evaluation of a range of LLMs. We find that even the strongest models exhibit language confusion, i.e., they fail to consistently respond in the correct language. I will discuss what affects language confusion, how to mitigate it, and potential extensions. In the second part, I will discuss the first evaluation study of quantized multilingual LLMs across languages. We find that automatic metrics severely underestimate the negative impact of quantization and that human evaluation, which has been neglected by prior studies, is key to revealing harmful effects. Overall, I highlight limitations of multilingual LLMs and challenges of real-world multilingual evaluation. Bio: Sebastian Ruder is a research scientist at Meta based in Berlin, Germany where he works on improving evaluation and benchmarking of large language models (LLMs). He previously led the Multilinguality team at Cohere with the objective to improve the multilingual capabilities of Cohere's LLMs. Before that he was a research scientist at Google DeepMind. He completed his PhD in Natural Language Processing (NLP) at the Insight Research Centre for Data Analytics, while working as a research scientist at Dublin-based text analytics startup AYLIEN. Previously, he studied Computational Linguistics at the University of Heidelberg, Germany and at Trinity College, Dublin.
Registration	Ekaterina Shutova (University of Amsterdam) Canceled. Cross-lingual information sharing in multilingual language models (Thursday, January 30, 2025 - 15:00 CET) Summary: Multilingual language models (MLMs), such as XLM-R or BLOOM, are pretrained on data covering many languages and share their parameters across all languages. This modeling approach has several powerful advantages, such as allowing similar languages to exert positive influence on each other, and enabling cross-lingual task transfer (i.e., fine-tuning on some source language(s), then using the model on different target languages). The success of such transfer, however, depends on the model's ability to effectively share information between different languages in its parameter space. Yet, the cross-lingual information sharing mechanisms within MLMs are still not fully understood. In this talk, I will present our recent research that investigates this question from three different perspectives: encoding of typological relationships between languages within MLMs, language-wise modularity of MLMs and the influence of training examples in specific languages on predictions made in others. Bio: Ekaterina Shutova is an Associate Professor at the ILLC, University of Amsterdam, where she leads the Amsterdam Natural Language Understanding Lab and the Natural Language Processing & Digital Humanities research unit. She received her PhD from the University of Cambridge, and then worked as a research scientist at the University of California, Berkeley. Ekaterina’s current research focuses on few-shot learning for language interpretation tasks, multilingual NLP, generalisability and robustness of NLP models and interpretability in deep learning. Her prominent service roles include Program Chair of ACL 2025, Senior Action Editor of ACL Rolling Review, Action Editor of Computational Linguistics and Demonstrations chair at EMNLP 2022. She is also an ELLIS scholar.
Registration	Javier de la Rosa (Artificial Intelligence Lab (National Library of Norway)) The Mímir Project: Impact of copyrighted materials in LLMs (Thursday, December 12, 2024 - 15:00 CET) Summary: The Mímir Project is an initiative by the Norwegian government that aims to assess the significance and influence of copyrighted materials in the development and performance of generative large language models (LLMs) tailored to the Norwegian languages. This collaborative effort involves three leading institutions from different regions of the country: the National Library of Norway (NB), the University of Oslo (UiO), and the Norwegian University of Science and Technology (NTNU); each contributing unique expertise in language technology, corpus curation, model training, copyright law, and computational linguistics. The ultimate goal of the project was to gather empirical evidence that informed the formulation of a compensation scheme for authors whose works are utilized by these advanced artificial intelligence (AI) systems, ensuring that intellectual property rights are respected and adequately compensated. Bio: Javier de la Rosa is a Research Scientist at the Artificial Intelligence Lab at the National Library of Norway. A former Postdoctoral Fellow in Natural Language Processing at UNED, he holds a PhD in Hispanic Studies with a specialization in Digital Humanities from the University of Western Ontario, and a Master's in Artificial Intelligence from the University of Seville. Javier has previously worked as a Research Engineer at Stanford University, and as the Technical Lead at the University of Western Ontario CulturePlex Lab. He is interested in Natural Language Processing applied to historical and literary text, with a special focus on large language models.
Registration	Elena Sokolova (Amazon Text-to-Speech Group) No recording available for this webinar. How we do research in Speech at Amazon (Thursday, November 7, 2024 - 15:00 CET) Summary: In this talk we will present how Speech technology has developed in the past 20 years. We will take a deep dive into the research that we do at Amazon in our Text to Speech lab, describe the challenges that we face and how we solve them at scale. We will also give an overview of the internship opportunities we have in our department for those of you who want to join our team in 2025. Bio: Elena is a Machine Learning team manager at Amazon, where she leads novel research in the field of speech technology. Over the past five years, she has overseen the deployment of machine learning projects into production and collaborated with her team to publish cutting-edge research on text-to-speech technology. Before joining Amazon, Elena completed her PhD at Radboud University Nijmegen in the Netherlands and gained industry experience as a Senior Machine Learning Scientist at Booking.com.

2023-2024
Registration	Marco Baroni (Universitat Pompeu Fabra) Unnatural Language Processing: On the Puzzling Out-of-Distribution Behavior of Language Models (Thursday, June 6, 2024 - 15:00 CET) Summary: Modern language models (LMs) respond with uncanny fluency when prompted using a natural language, such as English. However, they can also produce predictable, semantically meaningful output when prompted with low-likelihood "gibberish" strings, a phenomenon exploited for developing effective information extraction prompts (Shin et al. 2020) and bypassing security checks in adversarial attacks (Zou et al. 2023). Moreover, the same "unnatural" prompts often trigger the same behavior across LMs (Rakotonirina et al. 2023, Zou et al. 2023), hinting at a shared "universal" but unnatural LM code. In my talk, I will use unnatural prompts as a tool to gain insights into how LMs process language-like input. I will in particular discuss recent and ongoing work on three fronts: transferable unnatural prompts, as a window into LM invariances (Rakotonirina et al. 2023); mechanistic interpretability exploration of the activation pathways triggered by natural and unnatural prompts (Kervadec et al. 2023); and first insights into the lexical nature of unnatural prompts. Although a comprehensive understanding of how and why LMs respond to unnatural language remains elusive, I aim to present a set of intriguing facts that I hope will inspire others to explore this phenomenon. Bio: Marco Baroni received a PhD in Linguistics from the University of California, Los Angeles. After various experiences in research and industry, in 2019 he became an ICREA research professor, affiliated with the Linguistics Department of Pompeu Fabra University in Barcelona. Marco's work in the areas of multimodal and compositional distributed semantics has received widespread recognition, including a Google Research Award, an ERC Grant, the ICAI-JAIR best paper prize and the ACL test-of-time award. Marco was recently awarded another ERC grant to conduct research on improving communication between artificial neural networks, taking inspiration from human language and other animal communication systems.
Registration	Smaranda Muresan (Columbia University) Human-centric NLP: From Argumentation to Creativity (Thursday, March 7, 2024 - 15:00 CET) Summary: Large language models (LLMs) constitute a paradigm shift in Natural Language Processing (NLP) and its applications across all domains. Models such as ChatGPT seem to possess human-like abilities --- reasoning about problems, passing bar exams, writing stories. But do they? In trying to answer this question, I will discuss three main desiderata for building human-centric NLP systems: knowledge-aware models, human-AI collaboration frameworks, and theoretically-grounded evaluation protocols. In this talk, I will use argumentation and creativity as two case studies. I will cover knowledge-aware models for implicit premise generation, human-AI collaboration frameworks for high-quality dataset creation (e.g., visual metaphors) and for helping humans solve tasks (e.g., writing short stories), and last but not least a novel evaluation protocol for assessing the creative capabilities of LLMs in both producing and assessing creative text. Bio: Smaranda Muresan is a Research Scientist at the Data Science Institute at Columbia University, a Visiting Associate Professor at Barnard College and an Amazon Scholar. Her research focuses on human-centric Natural Language Processing for social good and responsible computing. She develops theory-guided and knowledge-aware computational models for understanding and generating language in context (e.g., visual, social, multilingual, multicultural) with applications to computational social science, education, and public health. Research topics that she has worked on over the years include: argument mining and generation, fact-checking and misinformation detection, figurative language understanding and generation (e.g., sarcasm, metaphor, idioms), and multilingual language processing for low-resource and endangered languages.
Recently, her research interests include explainable models and human-AI collaboration frameworks for high-quality dataset creation. She received best paper awards at SIGDIAL 2017 and ACL 2018 (short paper). She served as a board member for the North American Chapter of the Association for Computational Linguistics.
Registration	Heng Ji (University of Illinois) SmartBook: an AI Prophetess for Disaster Reporting and Forecasting (Friday, February 16, 2024 - 15:00 CET) Summary: History repeats itself, sometimes in a bad way. If we don’t learn lessons from history, we might suffer similar tragedies, which are often preventable. For example, many experts now agree that some schools were closed for too long during COVID-19 and that abruptly removing millions of children from American classrooms has had harmful effects on their emotional and intellectual health. Also many wish we had invested in vaccines earlier, prepared more personal protective equipment and medical facilities, provided online consultation services for people who suffered from anxiety and depression, and created better online education platforms for students. Similarly, genocides throughout history (from those in World War II to the recent one in Rwanda in 1994) have also all shared early warning signs (e.g., organization of hate groups, militias, and armies and polarization of the population) forming patterns that follow discernible progressions. Preventing natural or man-made disasters requires being aware of these patterns and taking pre-emptive action to address and reduce them, or ideally, eliminate them. Emerging events, such as the COVID pandemic and the Ukraine Crisis, require a time-sensitive comprehensive understanding of the situation to allow for appropriate decision-making and effective action response. Automated generation of situation reports can significantly reduce the time, effort, and cost for domain experts when preparing their official human-curated reports. However, AI research toward this goal has been very limited, and no successful trials have yet been conducted to automate such report generation and “what-if” disaster forecasting. 
Pre-existing natural language processing and information retrieval techniques are insufficient to identify, locate, and summarize important information, and lack detailed, structured, and strategic awareness. We propose SmartBook, a novel framework targeting situation report generation, a task that ChatGPT cannot solve. SmartBook consumes large volumes of news data to produce a structured situation report with multiple hypotheses (claims) summarized and grounded with rich links to factual evidence by claim detection, fact checking, misinformation detection and factual error correction. Furthermore, SmartBook can also serve as a novel news event simulator, or an intelligent prophetess. Given “What-if” conditions and dimensions elicited from a domain expert user concerning a disaster scenario, SmartBook will induce schemas from historical events, and automatically generate a complex event graph along with a timeline of news articles that describe new simulated events based on a new Λ-shaped attention mask that can generate text with infinite length. By effectively simulating disaster scenarios in both event graph and natural language format, we expect SmartBook will greatly assist humanitarian workers and policymakers to exercise reality checks (what would the next disaster look like under these given conditions?), and thus better prevent and respond to future disasters. Bio: Heng Ji is a professor in the Computer Science Department, and an affiliated faculty member of the Electrical and Computer Engineering Department and Coordinated Science Laboratory at the University of Illinois Urbana-Champaign. She is an Amazon Scholar. She is the Founding Director of the Amazon-Illinois Center on AI for Interactive Conversational Experiences (AICE). She received her B.A. and M.A. in Computational Linguistics from Tsinghua University, and her M.S. and Ph.D. in Computer Science from New York University.
Her research interests focus on Natural Language Processing, especially on Multimedia Multilingual Information Extraction, Knowledge-enhanced Large Language Models, Knowledge-driven Generation and Conversational AI. She was selected as a Young Scientist to attend the 6th World Laureates Association Forum, and selected to participate in DARPA AI Forward in 2023. She was selected as "Young Scientist" and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017. She was named as part of Women Leaders of Conversational AI (Class of 2023) by Project Voice. The awards she received include "AI's 10 to Watch" Award by IEEE Intelligent Systems in 2013, NSF CAREER award in 2009, PACLIC2012 Best paper runner-up, "Best of ICDM2013" paper award, "Best of SDM2013" paper award, ACL2018 Best Demo paper nomination, ACL2020 Best Demo Paper Award, NAACL2021 Best Demo Paper Award, Google Research Award in 2009 and 2014, IBM Watson Faculty Award in 2012 and 2014 and Bosch Research Award in 2014-2018. She was invited by the Secretary of the U.S. Air Force and AFRL to join the Air Force Data Analytics Expert Panel to inform the Air Force Strategy 2030, and invited to speak at the Federal Information Integrity R&D Interagency Working Group (IIRD IWG) briefing in 2023. She is the lead of many multi-institution projects and tasks, including the U.S. ARL projects on information fusion and knowledge networks construction, DARPA ECOLE MIRACLE team, DARPA KAIROS RESIN team and DARPA DEFT Tinker Bell team. She coordinated the NIST TAC Knowledge Base Population task from 2010 to 2021. She was the associate editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing, and served as the Program Committee Co-Chair of many conferences including NAACL-HLT2018 and AACL-IJCNLP2022. She was elected secretary of the North American Chapter of the Association for Computational Linguistics (NAACL) for 2020-2023.
Her research has been widely supported by the U.S. government agencies (DARPA, NSF, DoE, ARL, IARPA, AFRL, DHS) and industry (Amazon, Google, Facebook, Bosch, IBM, Disney).
Registration	Emily M. Bender (University of Washington) Meaning making with artificial interlocutors and risks of language technology (Thursday, November 2, 2023 - 16:00 CET) Summary: Humans make sense of language in context, bringing to bear their own understanding of the world including their model of their interlocutor's understanding of the world. In this talk, I will explore various potential risks that arise when we as humans bring this sense-making capacity to interactions with artificial interlocutors. That is, I will ask what happens in conversations where one party has no (or extremely limited) access to meaning and all of the interpretative work rests with the other, and briefly explore what this entails for the design of language technology. Bio: Emily M. Bender is a Professor of Linguistics and an Adjunct Professor in the School of Computer Science and the Information School at the University of Washington, where she has been on the faculty since 2003. Her research interests include multilingual grammar engineering, computational semantics, and the societal impacts of language technology. In 2022 she was elected as a Fellow of the American Association for the Advancement of Science (AAAS).

2022-2023
	Pascale Fung (The Hong Kong University of Science and Technology) Safer Generative ConvAI (Thursday, June 1, 2023 - 15:00 CET) Summary: Generative models for Conversational AI are less than a decade old, but they hold great promise for human-machine interactions. Machine responses based on generative models can seem quite fluent and human-like, empathetic and funny, knowledgeable and professional. However, behind the confident voice of generative ConvAI systems, they can also be hallucinating misinformation, giving biased and harmful views, and are still not "safe" enough for many real life applications. The expressive power of generative ConvAI models and their undesirable behavior are two sides of the same coin. How can we harness the fluency, diversity, engagingness of generative ConvAI models while mitigating the downside? In this talk, I will present some of our team’s recent work in making generative ConvAI safer via mitigating hallucinations, misinformation, and toxicity. Bio: Pascale Fung is a Chair Professor at the Department of Electronic & Computer Engineering at The Hong Kong University of Science & Technology (HKUST), and a visiting professor at the Central Academy of Fine Arts in Beijing. She is an elected Fellow of the Association for the Advancement of Artificial Intelligence (AAAI) for her "significant contributions to the field of conversational AI and to the development of ethical AI principles and algorithms", an elected Fellow of the Association for Computational Linguistics (ACL) for her “significant contributions towards statistical NLP, comparable corpora, and building intelligent systems that can understand and empathize with humans”. 
She is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) for her “contributions to human-machine interactions” and an elected Fellow of the International Speech Communication Association for “fundamental contributions to the interdisciplinary area of spoken language human-machine interactions”. She is the Director of HKUST Centre for AI Research (CAiRE). She was the founding chair of the Women Faculty Association at HKUST. She is an expert on the Global Future Council, a think tank for the World Economic Forum. She represents HKUST on the Partnership on AI to Benefit People and Society. She is on the Board of Governors of the IEEE Signal Processing Society. She is a member of the IEEE Working Group to develop an IEEE standard - Recommended Practice for Organizational Governance of Artificial Intelligence. Her research team has won several best and outstanding paper awards at ACL and at ACL and NeurIPS workshops.
	Martin Cooke (Ikerbasque – Basque Foundation for Science) Who needs big data? Listeners' adaptation to extreme forms of variability in speech (Thursday, May 4, 2023 - 15:00 CET) Summary: No theory of speech perception can be considered complete without an explanation of how listeners are able to extract meaning from severely degraded forms of speech. Starting with a brief overview of a century of research which has seen the development of many types of distorted speech, followed by some anecdotal evidence that automatic speech recognisers still have some way to go to match listeners' performance in this area, I will describe the outcome of one recent [1] and several ongoing studies into the detailed time course of a listener's response to distorted speech. These studies variously consider the rapidity of adaptation, whether adaptation can only proceed if words are recognised, the degree to which the response to one form of distortion is conditioned on prior experience with other forms, and the nature of adaptation in a language other than one's own native tongue. Taken together, findings from these experiments suggest that listeners are capable of continuous and extremely rapid adaptation to novel forms of speech that differ greatly from the type of input that makes up the vast bulk of their listening experience. It is an open question as to whether big-data-based automatic speech recognition can offer a similar degree of flexibility. [1] Cooke, M, Scharenborg, O and Meyer, B (2022). The time course of adaptation to distorted speech. J. Acoust. Soc. Am. 151, 2636-2646. 10.1121/10.0010235 Bio: Martin Cooke is Ikerbasque Research Professor. After starting his career in the UK National Physical Laboratory, he worked at the University of Sheffield for 26 years before taking up his current position. 
His research has focused on analysing the computational auditory scene, devising algorithms for robust automatic speech recognition and investigating human speech perception. His interests also include the effects of noise on talkers as well as listeners, and second language listening in noise.
	Isabelle Augenstein (University of Copenhagen) Beyond Fact Checking — Modelling Information Change in Scientific Communication (Thursday, March 2, 2023 - 15:00 CET) Summary: Most work on scholarly document processing assumes that the information processed is trustworthy and factually correct. However, this is not always the case. There are two core challenges, which should be addressed: 1) ensuring that scientific publications are credible -- e.g. that claims are not made without supporting evidence, and that all relevant supporting evidence is provided; and 2) ensuring that scientific findings are not misrepresented, distorted or outright misreported when communicated by journalists or the general public. In this talk, I will present some first steps towards addressing these problems, discussing our research on exaggeration detection, scientific fact checking, and on modelling information change in scientific communication more broadly. Bio: Isabelle Augenstein is a Professor at the University of Copenhagen, Department of Computer Science, where she heads the Copenhagen Natural Language Understanding research group as well as the Natural Language Processing section. Her main research interests are fact checking, low-resource learning, and explainability. Prior to starting a faculty position, she was a postdoctoral researcher at University College London, and before that a PhD student at the University of Sheffield. In October 2022, Isabelle Augenstein became Denmark’s youngest ever female full professor. She currently holds a prestigious ERC Starting Grant on 'Explainable and Robust Automatic Fact Checking', as well as the Danish equivalent of that, a DFF Sapere Aude Research Leader fellowship on 'Learning to Explain Attitudes on Social Media'. She is a member of the Young Royal Danish Academy of Sciences and Letters, and Vice President-Elect of SIGDAT, which organises the EMNLP conference series.
	Thomas Hueber (CNRS/GIPSA-lab) Computational model of speech learning, a focus on the acoustic-articulatory mapping (Thursday, February 2, 2023 - 15:00 CET) Summary: Speech production is a complex motor process involving several physiological phenomena, such as the neural, nervous and muscular activities that drive our respiratory, laryngeal and articulatory movements. Modeling speech production, in particular the relationship between articulatory gestures (tongue, lips, jaw, velum) and acoustic realizations of speech, is a challenging, and still evolving, research question. From an applicative point of view, such models could be embedded into assistive devices able to restore oral communication when part of the speech production chain is damaged (articulatory synthesis, silent speech interface). They could also help rehabilitate speech sound disorders using a therapy based on biofeedback (and articulatory inversion). From a more fundamental research perspective, such models can also be used to question the cognitive mechanisms underlying speech learning, perception and motor control. In this talk, I will present three recent studies conducted in our group to address some of these fundamental questions. In the first one, we quantified the benefit of relying on lip movement when learning speech representations in a self-supervised manner using predictive coding techniques. In the second one, we integrated articulatory priors into the latent space of a variational auto-encoder, with potential application to speech enhancement. In the third one, I will describe a first attempt toward a computational model of speech learning, based on deep learning, which can be used to understand how a child learns the acoustic-to-articulatory inverse mapping in a self-supervised manner. Bio: Thomas Hueber is a senior research scientist at CNRS (« Directeur de recherche ») working at GIPSA-lab in Grenoble, France. 
He is head of the CRISSP research team (cognitive robotics, interactive systems and speech processing). He received a Ph.D. in Computer Science from Pierre and Marie Curie University (Paris) in 2009. His research activities focus on automatic speech processing, with a particular interest in (1) the capture, analysis and modeling of articulatory gestures and electrophysiological signals involved in its production, (2) the development of speech technologies that exploit these different signals, for speech recognition and synthesis, for people with a spoken communication disorder, and (3) the study, through modeling and simulation, of the cognitive mechanisms underlying speech perception and production. In 2011 he received the 6th Christian Benoit award (ISCA/AFCP/ACB) and in 2015 the ISCA Award for the best paper published in Speech Communication. In 2017, he co-edited a special issue on biosignal-based speech processing in IEEE/ACM Trans. Audio Speech and Language Processing. He is also associate editor of EURASIP Journal on Audio, Speech, and Music Processing.
	Maarit Koponen (University of Eastern Finland) Machine translation as a tool for multilingual information: different users and use scenarios (Thursday, December 1, 2022 - 15:00 CET) Summary: Recent advances in machine translation quality have improved its usefulness as a tool to satisfy the demand for multilingual information and communication. Machine translation is nowadays a common part of professional translation workflows, but it is not a tool exclusive to translators. Users of machine translation can be found, for example, in public service institutions and newsrooms looking to produce and disseminate information in multiple languages. At the same time, machine translation can also offer a way for people to access information that may not otherwise be available in their language. Effective and responsible use of machine translation, however, requires a clear understanding of the potential risks as well as potential benefits. In this talk, I discuss how machine translation is used for producing and accessing information and how various situational factors affect its use in different scenarios. Bio: Dr Maarit Koponen currently works as Professor of Translation Studies at the University of Eastern Finland. She has previously worked as a post-doctoral researcher at the University of Helsinki and as a lecturer at the University of Turku after receiving her PhD in Language Technology at the University of Helsinki in 2016. Her research focuses on translation technology, particularly machine translation, and the effect of technology on translation both in professional and non-professional settings. Starting in October 2022, Koponen leads a work package focusing on linguistic barriers to information accessibility and technological solutions as part of the research project DECA (Democratic epistemic capacities in the age of algorithms), funded by the Academy of Finland Strategic Research Council. 
She chairs Working Group 7 “Language work, language professionals” of the EU COST Action “Language in the Human-Machine Era” (LITHME). She has also worked as a professional translator for several years.
	Vered Shwartz (The University of British Columbia-Vancouver) Incorporating Commonsense Reasoning into NLP Models (Thursday, November 3, 2022 - 15:30 CET) Summary: NLP models are primarily supervised, and are by design trained on a sample of the situations they may encounter in practice. The ability of models to generalize to and address unknown situations reasonably is limited, but may be improved by endowing models with commonsense knowledge and reasoning skills. In this talk, I will present several lines of work in which commonsense is used for improving the performance of NLP tasks: for completing missing knowledge in underspecified language, interpreting figurative language, and resolving context-sensitive event coreference. Finally, I will discuss open problems and future directions in building NLP models with commonsense reasoning abilities. Bio: Vered Shwartz is an Assistant Professor of Computer Science at the University of British Columbia and a faculty member at the Vector Institute for Artificial Intelligence. Her research interests include commonsense reasoning, computational semantics and pragmatics, and multiword expressions. Previously, Vered was a postdoctoral researcher at the Allen Institute for AI (AI2) and the University of Washington, and received her PhD in Computer Science from Bar-Ilan University.
	Xiang Ren (University of Southern California - USC) Commonsense Reasoning in the Wild (Thursday, October 6, 2022 - 17:00 CET) Summary: Current NLP systems impress us by achieving close-to-human performance on benchmarks of answering commonsense questions or writing interesting stories. However, most of the progress is evaluated using static, closed-ended datasets created for individual tasks. To deploy commonsense reasoning services in the wild, we look to develop and evaluate systems that can generate answers in an open-ended way, perform robust logical reasoning, and generalize across diverse task formats, domains, and datasets. In this talk I will share our effort on introducing new formulations of commonsense reasoning challenges and novel evaluation protocols, towards broadening the scope in approaching machine common sense. We hope that such a shift of evaluation paradigm would encourage more research on externalizing the model reasoning process and improving model robustness and cross-task generalization. Bio: Xiang Ren is an assistant professor and Viterbi Early Career Chair at the USC Computer Science Department, a Research Team Leader at USC ISI, and the director of the Intelligence and Knowledge Discovery (INK) Lab at USC. Previously, he spent time as a research scholar at Stanford University and received his Ph.D. in Computer Science from the University of Illinois Urbana-Champaign. Ren's research seeks to build generalizable natural language processing (NLP) systems which can handle a wide variety of language tasks and situations. He works on new algorithms and datasets to make NLP systems cheaper to develop and maintain, arm machine models with common sense, and improve models’ transparency and reliability to build user trust. His research work has received several best paper awards in top NLP and AI conference venues.
Ren has been awarded an NSF CAREER Award, multiple faculty research awards from Google, Facebook, Amazon, JP Morgan and Sony, and the 2018 ACM SIGKDD Doctoral Dissertation Award. He was named Forbes' Asia 30 Under 30 in 2019.

2021-2022
	Mikel Artetxe (FAIR (Meta AI)) Is scale all you need? (Friday, June 24, 2022 - 10:00 CET) Summary: Every once in a while, a new language model with gazillion parameters makes a big splash on Twitter, smashing the previous SOTA in some benchmarks or showing some impressive emerging capabilities. While some may argue that scaling will eventually solve NLP, others are skeptical about the scientific value of this trend. In this talk, I will argue that scaling is not just engineering, but also comes with exciting research questions. I will present some of our recent work on the topic, and discuss our efforts to make large language models more accessible for the community. Bio: Mikel Artetxe is a Research Scientist at FAIR (Meta AI). His primary area of research is multilingual NLP. Mikel was one of the pioneers of unsupervised machine translation, and has done extensive work on cross-lingual representation learning. More recently, he has also been working on natural language generation, few-shot learning, and large-scale language models. Prior to joining FAIR, Mikel did his PhD at the IXA group at the University of the Basque Country, and interned at DeepMind, FAIR and Google.
	Sakriani Sakti (Japan Advanced Institute of Science and Technology) Semi-supervised Learning for Low-resource Multilingual and Multimodal Speech Processing with Machine Speech Chain (Thursday, May 5, 2022 - 15:00 CET) Summary: The development of advanced spoken language technologies based on automatic speech recognition (ASR) and text-to-speech synthesis (TTS) has enabled computers to either learn how to listen or speak. Many applications and services are now available but still support fewer than 100 languages. Nearly 7000 living languages that are spoken by 350 million people remain uncovered. This is because the construction is commonly done based on machine learning trained in a supervised fashion where a large amount of paired speech and corresponding transcription is required. In this talk, we will introduce a semi-supervised learning mechanism based on a machine speech chain framework. First, we describe the primary machine speech chain architecture that learns not only to listen or speak but also to listen while speaking. The framework enables ASR and TTS to teach each other given unpaired data. After that, we describe the use of machine speech chain for code-switching and cross-lingual ASR and TTS of several languages, including low-resourced ethnic languages. Finally, we describe the recent multimodal machine chain that mimics overall human communication to listen while speaking and visualizing. With the support of image captioning and production models, the framework enables ASR and TTS to improve their performance using an image-only dataset. Bio: Sakriani Sakti is currently an associate professor at Japan Advanced Institute of Science and Technology (JAIST) Japan, adjunct associate professor at Nara Institute of Science and Technology (NAIST) Japan, visiting research scientist at RIKEN Center for Advanced Intelligent Project (RIKEN AIP) Japan, and adjunct professor at the University of Indonesia. 
She received DAAD-Siemens Program Asia 21st Century Award in 2000 to study in Communication Technology, University of Ulm, Germany, and received her MSc degree in 2002. During her thesis work, she worked with the Speech Understanding Department, DaimlerChrysler Research Center, Ulm, Germany. She then worked as a researcher at ATR Spoken Language Communication (SLC) Laboratories Japan in 2003-2009, and NICT SLC Groups Japan in 2006-2011, which established multilingual speech recognition for speech-to-speech translation. While working with ATR and NICT, Japan, she continued her study (2005-2008) with Dialog Systems Group University of Ulm, Germany, and received her Ph.D. degree in 2008. She was actively involved in international collaboration activities such as Asian Pacific Telecommunity Project (2003-2007) and various speech-to-speech translation research projects, including A-STAR and U-STAR (2006-2011). In 2011-2017, she was an assistant professor at the Augmented Human Communication Laboratory, NAIST, Japan. She also served as a visiting scientific researcher of INRIA Paris-Rocquencourt, France, in 2015-2016, under JSPS Strategic Young Researcher Overseas Visits Program for Accelerating Brain Circulation. In 2018–2021, she was a research associate professor at NAIST and a research scientist at RIKEN, Center for Advanced Intelligent Project AIP, Japan. Currently, she is an associate professor at JAIST, adjunct associate professor at NAIST, visiting research scientist at RIKEN AIP, and adjunct professor at the University of Indonesia. She is a member of JNS, SFN, ASJ, ISCA, IEICE, and IEEE. Furthermore, she is currently a committee member of IEEE SLTC (2021-2023) and an associate editor of the IEEE/ACM Transactions on Audio, Speech, and Language Processing (2020-2023). She was a board member of Spoken Language Technologies for Under-resourced languages (SLTU) and the general chair of SLTU2016. 
She was also the general chair of the "Digital Revolution for Under-resourced Languages (DigRevURL)" Workshop as the Interspeech Special Session in 2017 and DigRevURL Asia in 2019. She was also on the organizing committee of the Zero Resource Speech Challenge 2019 and 2020. She was also involved in creating the joint ELRA and ISCA Special Interest Group on Under-resourced Languages (SIGUL) and has served on the SIGUL Board since 2018. Last year, in collaboration with UNESCO and ELRA, she was also on the organizing committee of the International Conference of "Language Technologies for All (LT4All): Enabling Linguistic Diversity and Multilingualism Worldwide". Her research interests lie in deep learning & graphical model framework, statistical pattern recognition, zero-resourced speech technology, multilingual speech recognition and synthesis, spoken language translation, social-affective dialog system, and cognitive-communication.
	Dan Roth (University of Pennsylvania) It’s Time to Reason (Thursday, April 7, 2022 - 15:00 CET) Summary: The fundamental issue underlying natural language understanding is that of semantics – there is a need to move toward understanding natural language at an appropriate level of abstraction in order to support natural language understanding and communication with computers. Machine Learning has become ubiquitous in our attempt to induce semantic representations of natural language and support decisions that depend on it; however, while we have made significant progress over the last few years, it has focused on classification tasks for which we have large amounts of annotated data. Supporting high level decisions that depend on natural language understanding is still beyond our capabilities, partly since most of these tasks are very sparse and generating supervision signals for it does not scale. I will discuss some of the challenges underlying reasoning – making natural language understanding decisions that depend on multiple, interdependent, models, and exemplify it mostly using the domain of Reasoning about Time, as it is expressed in natural language. Bio: Dan Roth is the Eduardo D. Glandt Distinguished Professor at the Department of Computer and Information Science, University of Pennsylvania, lead of NLP Science at Amazon AWS AI, and a Fellow of the AAAS, the ACM, AAAI, and the ACL. In 2017, Roth was awarded the John McCarthy Award, the highest award the AI community gives to mid-career AI researchers. Roth was recognized “for major conceptual and theoretical advances in the modeling of natural language understanding, machine learning, and reasoning.” Roth has published broadly in machine learning, natural language processing, knowledge representation and reasoning, and learning theory, and has developed advanced machine learning based tools for natural language applications that are being used widely. 
Roth was the Editor-in-Chief of the Journal of Artificial Intelligence Research (JAIR) and a program chair of AAAI, ACL, and CoNLL. Roth has been involved in several startups; most recently he was a co-founder and chief scientist of NexLP, a startup that leverages the latest advances in Natural Language Processing (NLP), Cognitive Analytics, and Machine Learning in the legal and compliance domains. NexLP was acquired by Reveal in 2020. Prof. Roth received his B.A. summa cum laude in Mathematics from the Technion, Israel, and his Ph.D. in Computer Science from Harvard University in 1995.
	Desmond Elliott (University of Copenhagen) Visually Grounded Reasoning across Languages and Cultures (Thursday, March 3, 2022 - 15:00 CET) Summary: The design of widespread vision-and-language datasets and pre-trained encoders directly adopts, or draws inspiration from, the concepts and images of ImageNet. While one can hardly overestimate how much this benchmark contributed to progress in computer vision, it is mostly derived from lexical databases and image queries in English, resulting in source material with a North American or Western European bias. Therefore, we devise a new protocol to construct an ImageNet-style hierarchy representative of more languages and cultures. In particular, we let the selection of both concepts and images be entirely driven by native speakers, rather than scraping them automatically. Specifically, we focus on a typologically diverse set of languages, namely, Indonesian, Mandarin Chinese, Swahili, Tamil, and Turkish. On top of the concepts and images obtained through this new protocol, we create a multilingual dataset for Multicultural Reasoning over Vision and Language (MaRVL) by eliciting statements from native speaker annotators about pairs of images. The task consists of discriminating whether each grounded statement is true or false. We establish a series of baselines using state-of-the-art models and find that their cross-lingual transfer performance lags dramatically behind supervised performance in English. These results invite us to reassess the robustness and accuracy of current state-of-the-art models beyond a narrow domain, but also open up new exciting challenges for the development of truly multilingual and multicultural systems. Bio: Desmond is an Assistant Professor at the University of Copenhagen. His primary research interests are multimodal and multilingual machine learning and he was involved in the creation of the Multi30K, How2, and MaRVL datasets. 
His work received an Area Chair Favourite paper at COLING 2018 and the Best Long Paper Award at EMNLP 2021. He co-organised the Multimodal Machine Translation Shared Task from 2016–2018, the 2018 Frederick Jelinek Memorial Workshop on Grounded Sequence-to-Sequence Learning, the How2 Challenge Workshop at ICML 2019, and the Workshop on Multilingual Multimodal Learning at ACL 2022.
	Roger Moore (The University of Sheffield) Talking with Robots: Are We Nearly There Yet? (Thursday, February 3, 2022 - 15:00 CET) Summary: Recent years have seen considerable progress in the deployment of 'intelligent' communicative agents such as Apple's Siri and Amazon’s Alexa. However, effective speech-based human-robot dialogue is less well developed; not only do the fields of robotics and spoken language technology present their own special problems, but their combination raises an additional set of issues. In particular, there appears to be a large gap between the formulaic behaviour that typifies contemporary spoken language dialogue systems and the rich and flexible nature of human-human conversation. As a consequence, we still seem to be some distance away from creating Autonomous Social Agents such as robots that are truly capable of conversing effectively with their human counterparts in real world situations. This talk will address these issues and will argue that we need to go far beyond our current capabilities and understanding if we are to move from developing robots that simply talk and listen to evolving intelligent communicative machines that are capable of entering into effective cooperative relationships with human beings. Bio: Prof. Moore has over 40 years’ experience in Speech Technology R&D and, although an engineer by training, much of his research has been based on insights from human speech perception and production. As Head of the UK Government's Speech Research Unit from 1985 to 1999, he was responsible for the development of the Aurix range of speech technology products and the subsequent formation of 20/20 Speech Ltd. Since 2004 he has been Professor of Spoken Language Processing at the University of Sheffield, and also holds Visiting Chairs at Bristol Robotics Laboratory and University College London Psychology & Language Sciences. 
He was President of the European/International Speech Communication Association from 1997 to 2001, General Chair for INTERSPEECH-2009 and ISCA Distinguished Lecturer during 2014-15. In 2017 he organised the first international workshop on ‘Vocal Interactivity in-and-between Humans, Animals and Robots (VIHAR)’. Prof. Moore is the current Editor-in-Chief of Computer Speech & Language. In 2016 he was awarded the LREC Antonio Zampoli Prize for "Outstanding Contributions to the Advancement of Language Resources & Language Technology Evaluation within Human Language Technologies", and in 2020 he was given the International Speech Communication Association Special Service Medal for "service in the establishment, leadership and international growth of ISCA".
	Odette Scharenborg (Delft University of Technology) Speech Representations and Processing in Deep Neural Networks (Thursday, January 13, 2022 - 15:00 CET) Summary: Speech recognition is the mapping of a continuous, highly variable speech signal onto discrete, abstract representations. The question of how speech is represented and processed in the human brain and in automatic speech recognition (ASR) systems, although crucial in both the field of human speech processing and the field of automatic speech processing, has historically been investigated in the two fields separately. This webinar will discuss how comparisons between humans and deep neural network (DNN)-based ASRs, and cross-fertilization of the two research fields, can provide valuable insights into the way humans process speech and improve ASR technology. Specifically, it will present results of several experiments carried out on both human listeners and DNN-based ASR systems on the representation of speech in human listeners and DNNs and on lexically-guided perceptual learning, i.e., the ability to adapt a sound category on the basis of new incoming information, resulting in improved processing of subsequent information. It will explain how listeners adapt to the speech of new speakers, and will present the results of a lexically-guided perceptual study carried out on a DNN-based ASR system, similar to the human experiments. In order to investigate the speech representations and adaptation processes in the DNN-based ASR systems, activations in the hidden layers of the DNN were visualized. These visualizations revealed that DNNs use speech representations that are similar to those used by human listeners, without being explicitly taught to do so, and showed an adaptation of the phoneme categories similar to what is assumed to happen in the human brain. 
Bio: Odette Scharenborg is an Associate Professor and Delft Technology Fellow at Delft University of Technology working on automatic speech processing. She has an interdisciplinary background in automatic speech recognition and psycholinguistics, and uses knowledge of how humans process speech to develop inclusive automatic speech recognition systems that are able to recognise speech from everyone, irrespective of how they speak or the language they speak. Since 2017, she has been on the Board of the International Speech Communication Association, and currently serves as Vice-President. Since 2018, she has been on the IEEE Speech and Language Processing Technical Committee, and she is a Senior Associate Editor of IEEE Signal Processing Letters.
	Sam Bowman (New York University) When Combating Hype, Proceed with Caution (Thursday, December 2, 2021 - 15:00 CET) Summary: Researchers in NLP increasingly frame and discuss research results in ways that serve to deemphasize the field's successes, at least in part in an effort to combat the field's widespread hype. Though well-meaning, this often yields misleading or even false claims about the limits of our best technology. This is a problem, and it may be more serious than it looks: It harms our credibility in ways that can make it harder to mitigate present-day harms from NLP deployments, like those involving discriminatory systems for content moderation or resume screening. It also limits our ability to prepare for the potentially enormous impacts of more distant future advances. This talk urges researchers to be careful about these claims and suggests some research directions and communication strategies that will make it easier to avoid or rebut them. Bio: Sam Bowman has been on the faculty at NYU since 2016, when he completed his PhD with Chris Manning and Chris Potts at Stanford. At NYU, he is a member of the Center for Data Science, the Department of Linguistics, and Courant Institute's Department of Computer Science. His research focuses on data, evaluation techniques, and modeling techniques for sentence and paragraph understanding in natural language processing, and on applications of machine learning to scientific questions in linguistic syntax and semantics. He is the senior organizer behind the GLUE and SuperGLUE benchmark competitions and he has received a 2015 EMNLP Best Resource Paper Award, a 2019 *SEM Best Paper Award, a 2017 Google Faculty Research Award, and a 2021 NSF CAREER award.
	Hinrich Schuetze (University of Munich) Humans Learn From Task Descriptions and So Should Our Models (Thursday, November 4, 2021 - 15:00 CET) Summary: Task descriptions are ubiquitous in human learning. They are usually accompanied by a few examples, but there is little human learning that is based on examples only. In contrast, the typical learning setup for NLP tasks lacks task descriptions and is supervised with 100s or 1000s and often many more examples. This webinar will introduce Pattern-Exploiting Training (PET), an approach to learning that mimics human learning in that it leverages task descriptions in few-shot settings. PET is built on top of a pretrained language model that "understands" the task description, especially after fine-tuning, resulting in excellent performance compared to other few-shot methods. In particular, a model trained with PET outperforms GPT-3 even though it has 99.9% fewer parameters. The idea of task descriptions can also be applied to reducing bias in text generated by language models. Instructing a model to reveal and reduce its biases is remarkably effective as will be demonstrated in an evaluation on several benchmarks. This may contribute in the future to a fairer and more inclusive NLP. Bio: .

2020-2021
	Heidi Christensen (University of Sheffield, UK) Automated processing of pathological speech (Thursday, June 3, 2021 - 15:00 CET) Summary: As speech technologies mature and become ever more pervasive, the opportunities for real impact for people increase. This talk will outline the major challenges faced by researchers in porting mainstream speech technology to the domain of healthcare applications; in particular, the need for personalised systems and the challenge of working in an inherently sparse data domain. Three areas in automatic processing of pathological speech will be covered: i) detection, ii) therapy/treatment and iii) facilitating communication. The talk will give an overview of recent state-of-the-art results and specific experiences from current projects at the University of Sheffield (UK)'s Speech and Hearing (SPandH) & Healthcare lab. Bio: .
	Jose Luis Alba Castro - Carmen García Mateo (University of Vigo) Automatic Spanish Sign-Language Recognition: On-going Work & Challenges Ahead (Thursday, May 6, 2021 - 15:00 CET) Summary: In this talk we will quickly review the general approaches followed by the research community to solve the Sign Language Recognition (SLR) problem in the pre-deep learning era and then review, also briefly, the latest architectures using DNNs. These data-hungry models pose a very important problem in this specific task due to the scarcity of labeled data. In the last 5 years there has been a great deal of effort on compiling labeled datasets for Word-Level SLR and Continuous-SLR, but we are still very far from the amount of data readily available for other speech-based tasks. Acquiring SLR data poses the double challenge of needing donors, who are scarce, and needing SLR interpreters who help with the logistics, curation and labeling of the dataset. The GTM group at the atlanTTic Center in the University of Vigo started this research line three years ago. We will show the current state of the project and of the dataset we are acquiring with the help of Galician deaf associations and SL interpreters. We will also show the different approaches we are following for understanding both the manual and the facial components of the sign language, and the latest results on Word-Level SLR. Bio: .
	Iryna Gurevych (Technische Universität Darmstadt) Let's Argue - Understanding and Generating Natural Language Arguments (Thursday, March 4, 2021 - 15:00 CET) Summary: People love to argue. In recent years, Artificial Intelligence has achieved great advances in modelling natural language argumentation. While analysing and creating arguments is a highly complex (and enjoyable!) task at which even humans are not good, let alone perfect, we describe our natural language processing (NLP) research to identify arguments, their stance and aspects, aggregate arguments into topically coherent clusters, and finally, even to generate new arguments, given their desired topic, aspect and stance. The talk will tell the story of how the ArgumenText project has been conceptualized into a set of novel NLP tasks and highlight their main research outcomes. Argument mining has a tremendous number of possible applications, of which the talk discusses a few selected ones. Bio: .
	Ricardo Baeza-Yates (Northeastern University) Biases on Social Media (Thursday, February 11, 2021 - 15:00 CET) Summary: Is social media data representative? If not, what are its biases? Can we mitigate those biases and make the data representative? Does all this depend on the language? Can word embeddings help? We will partially answer all these questions with concrete use cases. Bio: .
PDF	Kyunghyun Cho (NYU) Unreasonably Shallow Deep Learning (Friday, January 29, 2021 - 17:30 CET) Summary: The talk will be about some gotchas in Deep Learning. Bio: .
	Eduard Hovy (CMU) The Birth of a New NLP Centre: Making the Most of a Newborn Technology (Sunday, November 29, 2020 - 17:30 CET) Summary: Natural Language Processing (NLP) is at a very exciting time in its history. In the last 5 years a new technology has revolutionized the way we do our work. Even without special adaptation it tends to work better than almost every prior method, and yet we still don't really know how it works! So this is also a dangerous time: how can you trust a system that might (and sometimes does) do very strange things for which you can find no explanation or correction? In such a situation it is not a bad idea to look at the history of NLP, what NLP is at its core, and how the new technology fits into the NLP landscape. And, most importantly, where NLP is going (with or without this new technology) and how we can best prepare for it. The HiTZ Centre has a wonderful opportunity to help shape a future in which NLP will be as ubiquitous and as useful as the cellphone. Bio: .

Copyright © 2026 hitz@ehu.eus