publications

Agustín Alonso, Victor García, Inma Hernáez, Eva Navas, Jon Sanchez 

Automatic Classification of Synthetic Voices for Voice Banking Using Objective Measures (2022)

Inma Hernaez, Jose Andres Gonzalez Lopez, Eva Navas, Jose Luis Pérez Córdoba, Ibon Saratxaga, Gonzalo Olivares, Jon Sanchez de la Fuente, Alberto Galdón, Victor Garcia, Jesús del Castillo, Inge Salomons, Eder del Blanco Sierra 

ReSSInt project: voice restoration using Silent Speech Interfaces (2022)

Eder Del Blanco, Inge Salomons, Eva Navas, Inma Hernáez 

Phone classification using electromyographic signals (2022)

Eneko Agirre

Few-shot Information Extraction is Here: Pre-train, Prompt and Entail (2022)

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

Nayla Escribano, Jon Ander González, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Pérez-de-Viñaspre, Rodrigo Agerri

Euskararen erabilera Eusko Legebiltzarreko debateetan (2012-2020) (2022)

Nayla Escribano, Jon Ander González, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Pérez-de-Viñaspre, Rodrigo Agerri (2022). Euskararen erabilera Eusko Legebiltzarreko debateetan (2012-2020). In Mediatika, 19, 163-178.

Aitor Ormazabal, Mikel Artetxe, Manex Agirrezabal, Aitor Soroa, Eneko Agirre

PoeLM: A Meter- and Rhyme-Controllable Language Model for Unsupervised Poetry Generation (2022)

Findings of the Association for Computational Linguistics: EMNLP 2022

Itziar Glez Dios, Aitor Soroa, Hugo Laurençon, Lucile Saulnier, Thomas Wang, Christopher Akiki, Albert Villanova del Moral, Teven Le Scao, Leandro Von Werra, Chenghao Mou, Eduardo González Ponferrada, Huu Nguyen, Jörg Frohberg, Mario Šaško, Quentin Lhoest, Angelina McMillan-Major, Gérard Dupont, Stella Biderman, Anna Rogers, Loubna Ben Allal, Francesco de Toni, Giada Pistilli, Olivier Nguyen, Somaieh Nikpoor, Maraim Masoud, Pierre Colombo, Javier de la Rosa, Paulo Villegas, Tristan Thrush, et al.

The BigScience ROOTS Corpus: A 1.6 TB Composite Multilingual Dataset (2022)

2022. Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track

Gonzalez-Dios, Itziar and Altuna, Begoña

Natural Language Processing and Language Technologies for the Basque Language (2022)

Gonzalez-Dios, Itziar and Altuna, Begoña (2022). Natural Language Processing and Language Technologies for the Basque Language. In Cuadernos Europeos de Deusto. NÚMERO ESPECIAL. Linguas minoritarias e futuro de Europa. Minority Languages and the Future of Europe 26, 203-230. https://doi.org/10.18543/ced.2477 https://ced.revistas.deusto.es/issue/view/285

Mikel Iruskieta, Ainara Estarrona, Aritz Farwell, German Rigau

INTELE: promoviendo la participación en las infraestructuras: CLARIN y DARIAH (2022)

The International Congress on Libraries & Digital Humanities: projects and challenges

Itziar Gonzalez-Dios, Iker Gutiérrez-Fandiño, Oscar M. Cumbicus-Pineda, Aitor Soroa

IrekiaLF_es: a new open benchmark and baseline systems for Spanish Automatic Text Simplification (2022)

Gonzalez-Dios, I., Gutiérrez-Fandiño, I., Cumbicus-Pineda, O. M., & Soroa, A. (2022, December). IrekiaLF_es: a new open benchmark and baseline systems for Spanish automatic text simplification. In Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022) (pp. 86-97).

Xabier Soto, Olatz Perez-De-Viñaspre, Gorka Labaka, Maite Oronoz

Comparing and combining tagging with different decoding algorithms for back-translation in NMT: learnings from a low resource scenario (2022)

In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pages 31–40, Ghent, Belgium. European Association for Machine Translation.

Ona de Gibert Bonet, Iakes Goenaga, Olatz Perez-de-Viñaspre, Jordi Armengol-Estapé, Carla Parra Escartín, Marina Sanchez, Mārcis Pinnis, Gorka Labaka and Maite Melero

Unsupervised Machine Translation in Real-World Scenarios (2022)

Proceedings of the 13th Edition of the Language Resources and Evaluation Conference (LREC 2022)

Nayla Escribano, Jon Ander González, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Perez-de-Viñaspre, Rodrigo Agerri

BasqueParl: A Bilingual Corpus of Basque Parliamentary Transcriptions (2022)

Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3382–3390, Marseille, France. European Language Resources Association.

Mikel Iruskieta, Ainara Estarrona, Aritz Farwell, German Rigau

INTELE: promoviendo la participación en las infraestructuras ERIC CLARIN y DARIAH (2022)

Boletín ANABAD. LXXII (2022), NÚM. 2, ABRIL-JUNIO. MADRID. ISSN: 2794-0519 (USB) - 2444-7293 (Internet)

Mikel Artetxe, Itziar Aldabe, Rodrigo Agerri, Olatz Perez-de-Viñaspre, Aitor Soroa

Does Corpus Quality Really Matter for Low-Resource Languages? (2022)

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 7383–7390.

Izaskun Aldezabal, Jose Mari Arriola, Arantxa Otegi

TZOS: an Online Terminology Database Aimed at Working on Basque Academic Terminology Collaboratively (2022)

Proceedings of the 13th Language Resources and Evaluation Conference. Editors: Nicoletta Calzolari (Conference chair), Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis

I. Aduriz, I. Alegria, I. Aldezabal, X. Artola, A. Díaz de Ilarraza, N. Ezeiza, K. Sarasola, R. Urizar

Euskara (batua) ingurune digitalean: bidean ikasiXa eta etorkizuneko erronkak (2022)

Arantzazutik mundu zabalera: Euskararen normatibizazioa: 1968-2018. “Euskara (batua) ingurune digitalean: bidean ikasiXa eta etorkizuneko erronkak”. Andres M. Urrutia (Arg.). Iker bilduma. 455-470. Euskaltzaindia – Iberoamericana-Vervuert. 2022

Aitor Almeida, Unai Bermejo, Aritz Bilbao, Gorka Azkune, Unai Aguilera, Mikel Emaldi, Fadi Dornaika, Ignacio Arganda-Carreras

A Comparative Analysis of Human Behavior Prediction Approaches in Intelligent Environments (2022)

Sensors, vol 22, Issue 3, pp 701

Marta Gianzo, Itziar Urizar-Arenaza, Iraia Muñoa-Hoyos, Gorka Labaka, Zaloa Larreategui, Nicolás Garrido, Jon Irazusta, Nerea Subirán

Sperm aminopeptidase N identifies the potential for high-quality blastocysts and viable embryos in oocyte-donation cycles (2022)

Human Reproduction, Volume 37, Issue 10, October 2022, Pages 2246–2254

Cristina Aceta, Izaskun Fernandez, Aitor Soroa

KIDE4I: A Generic Semantics-Based Task-Oriented Dialogue System for Human-Machine Interaction in Industry 5.0 (2022)

Applied Sciences 12, no. 3: 1192

Blanca Calvo Figueras, Montse Cuadros, Rodrigo Agerri

A Semantics-Aware Approach to Automated Claim Verification (2022)

In Proceedings of the Fifth Fact Extraction and VERification Workshop (FEVER), pages 37–48, Dublin, Ireland. Association for Computational Linguistics

Jeremy Barnes, Laura Oberlaender, Enrica Troiano, Andrey Kutuzov, Jan Buchmann, Rodrigo Agerri, Lilja Øvrelid, Erik Velldal

SemEval 2022 Task 10: Structured Sentiment Analysis (2022)

In SemEval 2022

Gorka Urbizu, Iñaki San Vicente, Xabier Saralegi, Rodrigo Agerri and Aitor Soroa

BasqueGLUE: A Natural Language Understanding Benchmark for Basque (2022)

LREC 2022

Maxime Masson, Christian Sallaberry, Rodrigo Agerri, Marie-Noelle Bessagnet, Philippe Roose, Annig Le Parc Lacayrelle

A Domain-Independent Method for Thematic Dataset Building from Social Media: The Case of Tourism on Twitter (2022)

In: Chbeir, R., Huang, H., Silvestri, F., Manolopoulos, Y., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2022. WISE 2022. Lecture Notes in Computer Science, vol 13724. Springer, Cham.

Iker Garcia-Ferrero, Rodrigo Agerri, German Rigau

Model and Data Transfer for Cross-Lingual Sequence Labelling in Zero-Resource Settings (2022)

Findings of the Association for Computational Linguistics: EMNLP 2022

María Jesús Aranzabe, Izaskun Aldezabal, Igone Zabala

Recursos y Herramientas de Lingüística de Corpus y PLN para la Monitorización e Investigación de los Usos Académicos del Euskera (2022)

3rd INTELE Workshop (Infraestructura de Tecnologías del Lenguaje), Madrid, 13-14 September (poster presented at the workshop)

María Jesús Aranzabe, Antton Gurrutxaga, Igone Zabala

Compilación del corpus académico de noveles en euskera HARTAeus y su explotación para el estudio de la fraseología académica (2022)

Procesamiento del Lenguaje Natural, Revista no 69, septiembre de 2022, pp. 95-103

Margarita Alonso Ramos, Igone Zabala

HARTAes-vas: Lexical combinations for an academic writing aid tool in Spanish and Basque (2022)

SEPLN-PD 2022. Annual Conference of the Spanish Association for Natural Language Processing 2022: Projects and Demonstrations, September 21-23, 2022, A Coruña, Spain.

Elisa Sanchez-Bayona, Rodrigo Agerri

From Automatic Metaphor Processing in Spanish to a Multilingual Perspective: Annotation, Systems, and Evaluation (2022)

Doctoral Symposium on Natural Language Processing from the PLN.net network 2022 (RED2018-102418-T), 21-23 September 2022, A Coruña, Spain.

Elisa Sanchez-Bayona, Rodrigo Agerri

Leveraging a New Spanish Corpus for Multilingual and Crosslingual Metaphor Detection (2022)

Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL), pages 228–240, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.

Itziar Aldabe, Jane Dunne, Aritz Farwell, Owen Gallagher, Federico Gaspari, Maria Giagkou, Jan Hajic, Jens Peter Kückens, Teresa Lynn, Georg Rehm, German Rigau, Katrin Marheinecke, Stelios Piperidis, Natalia Resende, Tea Vojtěchová, Andy Way

Overview of the ELE Project (2022)

Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pp. 353–354

Itziar Aldabe, Aritz Farwell, Eva Navas, Inma Hernaez, German Rigau

ELE Project: an overview of the desk research (2022)

Proc. IberSPEECH 2022, 231-234

A Garcia Olea, I Valdelvira Vazquez, I Diez Gonzalez, A Atutxa Salazar, K Gojenola Galletebeitia, J M Ormaetxe Merodio

Prediction of new onset atrial fibrillation recurrence or persistence with artificial intelligence: first insights of the PRAFAI study (2022)

European Heart Journal - Digital Health, Volume 3, Issue 4, December 2022

Jose Mari Arriola

MACHINE TRANSLATION AS AN AID FOR WRITING BY COMPUTER SCIENCE UNIVERSITY STUDENTS (2022)

15th annual International Conference of Education, Research and Innovation, 7-9 November, 2022 Seville, Spain

Oscar Cumbicus-Pineda, Iker Gutiérrez-Fandiño, Itziar Gonzalez-Dios, Aitor Soroa

Noisy Channel for Automatic Text Simplification (2022)

Cumbicus-Pineda, O. M., Gutiérrez-Fandiño, I., Gonzalez-Dios, I., & Soroa, A. (2022). Noisy Channel for Automatic Text Simplification. arXiv preprint arXiv:2211.03152.

Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., Soroa, A., Gonzalez-Dios, I., ... & Manica, M.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (2022)

Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., ... & Manica, M. (2022). BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. arXiv preprint arXiv:2211.05100.

Nora Hollenstein, Itziar Gonzalez-Dios, Lisa Beinborn, and Lena Jäger

Patterns of text readability in human and predicted eye movements (2022)

Nora Hollenstein, Itziar Gonzalez-Dios, Lisa Beinborn, and Lena Jäger. 2022. Patterns of Text Readability in Human and Predicted Eye Movements. In Proceedings of the Workshop on Cognitive Aspects of the Lexicon, pages 1–15, Taipei, Taiwan. Association for Computational Linguistics.

Petter Mæhlum, Andre Kåsen, Samia Touileb, and Jeremy Barnes

Annotating Norwegian language varieties on Twitter for Part-of-speech (2022)

Proceedings of the Ninth Workshop on NLP for Similar Languages, Varieties and Dialects

David Samuel, Jeremy Barnes, Robin Kurtz, Stephan Oepen, Lilja Øvrelid, and Erik Velldal

Direct Parsing to Sentiment Graphs (2022)

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 470–478

Mikel Iruskieta, Mari Mar Boillos

Aproximación al género Trabajo de Fin de Grado en euskera: hacia una identificación de las características lingüístico-discursivas (2022)

In Elena Alarcón, José Sanchez-Santamaria, Purificación Cruz (Coord.) Nuevos contenidos para una nueva docencia, 283-296

Xabier Soto, Olatz Pérez-de-Viñaspre, Maite Oronoz, Gorka Labaka

Development of a Machine Translation system for promoting the use of a low resource language in the clinical domain: the case of Basque (2022)

Chapter 7 in Natural Language Processing in Healthcare: A Special Focus on Low Resource Languages. Routledge, Taylor & Francis Group.

Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Anne-Lyse Minard, Manuela Speranza, and Roberto Zanoli

European Clinical Case Corpus (2022)

Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Anne-Lyse Minard, Manuela Speranza, and Roberto Zanoli (2022). European Clinical Case Corpus. In Georg Rehm (ed.), European Language Grid: A Language Technology Platform for Multilingual Europe. Springer, Cham, Switzerland. https://doi.org/10.1007/978-3-031-17258-8

Gildo Fabregat, Ander Cejudo, Juan Martinez-Romo, Alicia Pérez, Lourdes Araujo, Nuria Lebeña, Maite Oronoz, Arantza Casillas

Approximate Nearest Neighbour Extraction Techniques and Neural Networks for Suicide Risk Prediction in the CLPsych 2022 Shared Task (2022)

Accepted in the CLPsych 2022 Shared Task, July 15th 2022

E Agirre, M Apidianaki, I Vulić

Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (2022)

Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures. Association for Computational Linguistics, Dublin, Ireland

Oscar Sainz, Haoling Qiu, Oier Lopez de Lacalle, Eneko Agirre, Bonan Min

ZS4IE: A toolkit for Zero-Shot Information Extraction with simple Verbalizations (2022)

In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations, Seattle, Washington. Association for Computational Linguistics.

Oscar Sainz, Itziar Gonzalez-Dios, Oier Lopez de Lacalle, Bonan Min, Eneko Agirre

Textual Entailment for Event Argument Extraction: Zero- and Few-Shot with Multi-Source Learning (2022)

In Findings of the Association for Computational Linguistics: NAACL 2022, Seattle, Washington. Association for Computational Linguistics.

Jon Alkorta, Mikel Iruskieta

Adding the Basque Parliament Corpus to ParlaMint Project (2022)

ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora: 107–110

Ibarra, I. and Iruskieta, M.

Corpus lingüísticos, smartpen y whatsapp: Intervención en escritura de una madre con sus hijos (2022)

IV Congreso internacional en Inclusión Social y Educativa: CIISE

Irune Ibarra, Mikel Iruskieta

Disgrafia hobetzeko esku-hartzea idazkailu digitala erabiliz (2022)

UZTARO 121, 155-178

Mikel Iruskieta

Herramientas Digitales para las Humanidades Digitales en la e-infraestructura CLARIN (2022)

Creación de un proyecto en humanidades digitales basado en el análisis de textos: modelado y procesamiento

Harritxu Gete, Thierry Etchegoyhen, David Ponce, Gorka Labaka, Nora Aranberri, Ander Corral, Xabier Saralegi, Igor Ellakuria and Maite Martin

TANDO: A Corpus for Document-level Machine Translation (2022)

Proceedings of the 13th Edition of the Language Resources and Evaluation Conference (LREC 2022)

Aitor Ormazabal, Mikel Artetxe, Aitor Soroa, Gorka Labaka, Eneko Agirre

Principled Paraphrase Generation with Parallel Corpora (2022)

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1621-1638

Owen Trigueros, Alberto Blanco, Nuria Lebeña, Arantza Casillas, Alicia Pérez

Explainable ICD multi-label classification of EHRs in Spanish with convolutional attention (2022)

International Journal of Medical Informatics

Alberto Blanco, Sonja Remmer, Alicia Pérez, Hercules Dalianis, Arantza Casillas

Implementation of specialised attention mechanisms: ICD-10 classification of Gastrointestinal discharge summaries in English, Spanish and Swedish (2022)

Journal of Biomedical Informatics

Itxaso Alayo, Ander Merketegi, Maite Oronoz, Arantza Casillas, Alicia Pérez, Olatz Garin, Isabel Moreira, Montse Ferrer, Jordi Alonso, Yolanda Pardo

A baseline model for the automation of the systematic review of Patient-Reported Outcomes measures: the case of the BiblioPRO virtual library (2022)

Jornada científica CIBERESP 2022 (https://jornadacientifica.ciberesp.es/). Centro de Investigación Biomédica en Red, Epidemiología y Salud Pública.

Alberto Blanco, Alicia Pérez, Arantza Casillas

Exploiting ICD Hierarchy for Classification of EHRs in Spanish Through Multi-Task Transformers (2022)

IEEE Journal of Biomedical and Health Informatics

Arantxa Otegi, Iñaki San Vicente, Xabier Saralegi, Anselmo Peñas, Borja Lozano, Eneko Agirre

Information retrieval and question answering: A case study on COVID-19 scientific literature (2022)

Knowledge-Based Systems, Volume 240.