DeepReading: Mining, Understanding, and Reasoning with Multilingual Content.
(2019 - 2021)
This subproject aims to develop transfer learning and deep learning approaches to address the lack of training data and knowledge resources for many NLP tasks and domains, focusing mostly on Spanish, English, Basque, Catalan and Galician. Even though deep learning and monolingual embeddings have improved the state of the art in NLP across tasks and languages, higher-level semantic tasks still require sufficient annotated data for supervised machine learning. For many languages and domains, such corpora are scarce or simply non-existent, leading to much lower results than those obtained for English. This subproject also explores how to apply deep learning techniques to automatically build large-scale lexical knowledge bases from scratch for any language and domain.
Organization: Ministerio de Ciencia, Innovación y Universidades.
Main researchers: Rodrigo Agerri, German Rigau
Rodrigo Agerri, Itziar Aldabe, Izaskun Aldezabal, Itziar Gonzalez-Dios, Mikel Iruskieta, Montserrat Maritxalar, German Rigau, Aitor Soroa