publications

Thierry Etchegoyhen, Haritz Arzelus, Harritxu Gete, Aitor Álvarez, Inma Hernaez, Eva Navas, Ander González-Docasal, Jaime Osácar, Edson Benites, Igor Ellakuria, Eusebi Calonge, Maite Martin 

MINTZAI: Sistemas de Aprendizaje Profundo E2E para Traducción Automática del Habla (2020)

Itziar Gonzalez-Dios

Data statement of the Corpus of Basque Simplified Texts (2020)

Data Statements workshop

María Espinosa, Rodrigo Agerri, Roberto Centeno, Alvaro Rodrigo

DeepReading@SardiStance: Combining Textual, Social and Emotional Features. (2020)

Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020). Winners of the SardiStance@Evalita 2020 shared task.

Rodrigo Agerri, German Rigau

Projecting Heterogeneous Annotations for Named Entity Recognition (2020)

In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020). Winner of the CAPITEL@IberLEF task on Spanish NER.

Xabier Soto, Olatz Perez-de-Viñaspre, Gorka Labaka, Maite Oronoz

Ixamed's submission description for WMT20 Biomedical shared task: benefits and limitations of using terminologies for domain adaptation (2020)

Proceedings of the Fifth Conference on Machine Translation, pp: 873--878.

Iker de la Iglesia, Mikel Martinez-Puente, Alexander Platas, Iria San Miguel, Aitziber Atutxa, Koldo Gojenola

MEDIA team at the CLEF-2020 Multilingual Information Extraction Task (2020)

Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum Thessaloniki, Greece, September 22-25, 2020.

Kepa Sarasola, Itziar Aldabe, Arantza Diaz de Ilarraza, Ainara Estarrona, Aritz Farwell, Inma Hernaez, Eva Navas; Reviewers: Annika Grützner-Zahn, Maria Giagkou; Editors: Maria Giagkou, Stelios Piperidis, Georg Rehm, Jane Dunne

Report on the Basque Language. European Language Equality (2020)

Deliverables of the Project ELE (European Language Equality). D1.4 Report on the Basque Language, https://european-language-equality.eu/deliverables/

Begoña Altuna

Análisis de estructuras temporales en euskera y creación de un corpus (2020)

Procesamiento del Lenguaje Natural, Revista no 64, marzo de 2020, pp. 131-134 URL: http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6206 ISSN: 1989-7553

Uxoa Inurrieta, Itziar Aduriz, Arantza Díaz de Ilarraza, Gorka Labaka, Kepa Sarasola

Learning about phraseology from corpora: A linguistically motivated approach for Multiword Expression identification. (2020)

Inurrieta U, Aduriz I, Díaz de Ilarraza A, Labaka G, Sarasola K (2020) Learning about phraseology from corpora: A linguistically motivated approach for Multiword Expression identification. PLoS ONE 15(8): e0237767. https://doi.org/10.1371/journal.pone.0237767

Ainara Estarrona, Izaskun Aldezabal, Arantza Díaz de Ilarraza

How the corpus-based Basque Verb Index lexicon was built (2020)

Language Resources and Evaluation. First Online 05 December 2018. DOI: https://doi.org/10.1007/s10579-018-9440-0. Springer Netherlands

Itziar Aduriz, Jose Mari Arriola, Xabier Artola, Zuhaitz Beloki, Nerea Ezeiza, Koldo Gojenola

Morfeus+: Word Parsing in Basque beyond Morphological Segmentation (2020)

Word Structure 13.3, 283-315

Nora Aranberri

Can translationese features help users select an MT system for post-editing? (2020)

Revista Procesamiento del Lenguaje Natural, 64, 93-100.

Sara Santiso, Alicia Pérez, Arantza Casillas, Maite Oronoz

Neural negated entity recognition in Spanish electronic health records (2020)

Journal of Biomedical Informatics (JBI) https://doi.org/10.1016/j.jbi.2020.103419

Alberto Blanco, Alicia Pérez, Arantza Casillas

Extreme multi-label ICD classification: sensitivity to hospital service and time (2020)

IEEE Access, Volume 8, 183534-183545

Itziar Gonzalez-Dios, Kepa Bengoetxea, Amaia Aguirregoitia

LagunTest: A NLP Based Application to Enhance Reading Comprehension (2020)

1st Workshop on Tools and Resources to Empower People with REAding DIfficulties (READI2020), pages 63–69. ISBN: 979-10-95546-44-3 https://www.aclweb.org/anthology/2020.readi-1.10/ https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/READI2020book.pdf

Kepa Bengoetxea, Itziar Gonzalez-Dios, Amaia Aguirregoitia

AzterTest: Open source linguistic and stylistic analysis tool (2020)

Procesamiento del Lenguaje Natural, 64, 61-68. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6196

Alberto Blanco, Alicia Pérez, Arantza Casillas, Daniel Cobos

Extracting Cause of Death from Verbal Autopsy with Deep Learning interpretable methods (2020)

IEEE Journal of Biomedical and Health Informatics

Javier Álvez, Itziar Gonzalez-Dios, German Rigau

Applying the Closed World Assumption to SUMO-based FOL Ontologies for Effective Commonsense Reasoning (2020)

Frontiers in Artificial Intelligence and Applications. Giuseppe De Giacomo, Alejandro Catala, Bistra Dilkina, Michela Milano, Senén Barro, Alberto Bugarín, Jérôme Lang (eds.). Volume 325: ECAI 2020. Pages 585 - 592. IOS Press Ebooks

Kepa Sarasola, Iñaki Alegria, Olatz Perez de Viñaspre

Language Technology for Language Communities: An Overview based on Basque Experience 2020 (2020)

Symposiwm Academaidd Technolegau Iaith Cymru 2020-11-04 // Wales Academic Symposium on Language Technologies 2020-11-04

Itziar Aldabe, Josu Aztiria, Francho Beltrán, Myriam Bras, Klara Ceberio, Itziar Cortes, Jean-Baptiste Coyos, Benaset Dazeas, Louise Esher, Gorka Labaka, Igor Leturia, Kepa Sarasola, Aure Séguier, Jean Sibille

LINGUATEC: Development of cross-border cooperation and knowledge transfer in language technologies (2020)

Workshop "INTELE : INfraestructura de TEcnologías del LEnguaje" CLARIN DARIAH-EU. http://ixa2.si.ehu.eus/intele/?q=node/71

Camacho A., Iruskieta M., Latatu A., Lonbide P.

UEUren Online ikaskuntzarako eredu pedagogikoaren sorrera eta garapena (2020)

UZTARO 118, 5-38

Perez, N; Accuosto, P; Bravo, A; Quadres, M; Martinez-Garcia, E; Saggion, H; Rigau, G.

Cross-lingual semantic annotation of Biomedical literature: experiments in Spanish and English (2020)

Bioinformatics, 36, 6, 1872-1880. ISSN 1367-1880

Itziar Gonzalez-Dios, Javier Álvez, German Rigau

Towards modeling SUMO attributes through WordNet adjectives: a Case Study on Qualities. (2020)

Proceedings of the Workshop on Multimodal Wordnets (MMWN-2020), pages 1–6. ISBN: 979-10-95546-41-2 https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf

Lima S., Pérez-Miguel N., Cuadros M. and Rigau G.

NUBes: A Corpus of Negation and Uncertainty in Spanish Clinical Texts. (2020)

Proceedings of the 12th Language Resources and Evaluation Conference (LREC'20). Marseille, France. 2020.

Ainara Estarrona, Izaskun Etxeberria, Ricardo Etxepare, Manuel Padilla-Moyano, Ander Soraluze

Dealing with dialectal variation in the construction of the Basque historical corpus (2020)

Proceedings of the 7th Workshop on NLP for similar languages, varieties and dialects (VarDial2020 at COLING 2020).

Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Manuela Speranza, Roberto Zanoli

The E3C Project: Collection and Annotation of a Multilingual Corpus of Clinical Cases (2020)

In Johanna Monti, Felice Dell'Orletta and Fabio Tamburini (eds.), Proceedings of the Seventh Italian Conference on Computational Linguistics. Associazione Italiana di Linguistica Computazionale. Bologna, Italy, 2020.

Unai Atutxa, Mikel Iruskieta, Olatz Ansa

Laburpena eskolan: estrakzioaren eta abstrakzioaren arteko zubia eskolan (2020)

Hizkuntzaren eta Literaturaren Didaktika testuinguru eleaniztunetan: Hizkuntzaren eta Literaturaren Didaktikaren Nazioarteko XX. Kongresuko aktak. 57-66.

Eneko Agirre, Marianna Apidianaki, Ivan Vulić (Editors)

Proceedings of Deep Learning Inside Out (DeeLIO): The First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (2020)

In conjunction with EMNLP. Association for Computational Linguistics

Ainara Estarrona, Izaskun Etxeberria, Ricardo Etxepare, Manuel Padilla-Moyano, Ander Soraluze

Sintaktikoki etiketatutako euskarazko corpus historikoa eraikitzen (2020)

Fontes Linguae Vasconum 50 urte. Ekarpen berriak euskararen ikerketari. Nuevas aportaciones al estudio de la lengua vasca

Rodrigo Agerri, Iñaki San Vicente, Jon Ander Campos, Ander Barrena, Xabier Saralegi, Aitor Soroa, Eneko Agirre

Give your Text Representation Models some Love: the Case for Basque (2020)

Proceedings of LREC. Also available on arXiv: https://arxiv.org/pdf/2004.00033.pdf

C. Pradel, D. Sileo, A. Rodrigo, A. Peñas, E. Agirre.

Question Answering when Knowledge Bases are Incomplete? (2020)

Proceedings of Conference and Labs of the Evaluation Forum.

Piroska Lendvai, Sándor Darányi, Christian Geng, Moniek Kuijpers, Oier Lopez de Lacalle, Jean-Christophe Mensonides, Simone Rebora and Uwe Reichel

Detection of Reading Absorption in User-Generated Book Reviews: Resources Creation and Evaluation (2020)

Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). Marseille, France

Oscar Sainz, Oier Lopez de Lacalle, Itziar Aldabe, Montse Maritxalar

Domain Adapted Distant Supervision for Pedagogically Motivated Relation Extraction (2020)

Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). Marseille, France

Andrea Horbach, Itziar Aldabe, Marie Bexte, Oier Lopez de Lacalle and Montse Maritxalar

Linguistic Appropriateness and Pedagogic Usefulness of Reading Comprehension Questions (2020)

Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). Marseille, France

Javier Álvez, Itziar Gonzalez-Dios, German Rigau

Towards Word Sense Disambiguation by Reasoning (2020)

Vampire 2018 and Vampire 2019. The 5th and 6th Vampire Workshops. EPiC Series in Computing. Pages 19-29. ISSN: 2398-7340

Jon Ander Campos, Kyunghyun Cho, Arantxa Otegi, Aitor Soroa, Eneko Agirre, Gorka Azkune

Improving Conversational Question Answering Systems after Deployment using Feedback-Weighted Learning (2020)

Proceedings of the 28th International Conference on Computational Linguistics (COLING), pages 2561–2571. Outstanding Paper (Top 1%).

Arantxa Otegi, Jon Ander Campos, Gorka Azkune, Aitor Soroa, Eneko Agirre

Automatic Evaluation vs. User Preference in Neural Textual Question Answering over COVID-19 Scientific Literature (2020)

Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark Cieliebak, Eneko Agirre

DoQA - Accessing Domain-Specific FAQs via Conversational QA (2020)

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7302–7314

Arantxa Otegi, Aitor Agirre, Jon Ander Campos, Aitor Soroa, Eneko Agirre

Conversational Question Answering in Low Resource Scenarios: A Dataset and Case Study for Basque (2020)

Proceedings of The 12th Language Resources and Evaluation Conference, pp. 429–435. European Language Resources Association. ISBN: 979-10-95546-34-4

Oier Lopez de Lacalle, Ander Salaberria, Aitor Soroa, Gorka Azkune and Eneko Agirre

Evaluating Multimodal Representations on Visual Semantic Textual Similarity (2020)

Proceedings of the Twenty-third European Conference on Artificial Intelligence, ECAI 2020, June 8-12, 2020, Santiago de Compostela, Spain

Gorka Urbizu, Ander Soraluze, Olatz Arregi

Sequence to Sequence Coreference Resolution (2020)

Proceedings of the 3rd Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2020), pages 39–46, Barcelona, Spain (online), December 12, 2020.

Rachel Bawden, Giorgio Maria Di Nunzio, Cristian Grozea, Inigo Jauregi Unanue, Antonio Jimeno Yepes, Nancy Mah, David Martinez, Aurélie Névéol, Mariana Neves, Maite Oronoz, Olatz Perez-de-Viñaspre, Massimo Piccardi, Roland Roller, Amy Siu, Philippe Thomas, Federica Vezzani, Maika Vicente Navarro, Dina Wiemann and Lana Yeganova

Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages (2020)

Fifth Conference on Machine Translation (WMT20). Shared Task: Biomedical Translation Task

Xabier Soto, Dimitar Shterionov, Alberto Poncelas, Andy Way

Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation (2020)

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp: 3898–3908.

Jan Deriu, Don Tuggener, Pius von Däniken, Jon Ander Campos, Alvaro Rodrigo, Thiziri Belkacem, Aitor Soroa, Eneko Agirre, Mark Cieliebak

Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems (2020)

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). (Pages 3971–3984). Honorable Mention Paper (Top 1%).

Juan J. Lastra-Díaz, Josu Goikoetxea, Mohamed Ali Hadj Taieb, Ana Garcia-Serrano, Mohamed Ben Aouicha, Eneko Agirre, David Sánchez

A large reproducible benchmark of ontology-based methods and word embeddings for word similarity (2020)

Information Systems. Online first.

Mikel Artetxe, Gorka Labaka, Eneko Agirre

Translation Artifacts in Cross-lingual Transfer Learning (2020)

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). (Pages 7674–7684).

Uxoa Iñurrieta

Identification and translation of verb+noun multiword expressions: a Spanish-Basque study (2020)

Procesamiento del Lenguaje Natural, 64, pp. 123-126.

Itziar Aduriz, Jose Mari Arriola

Testu-corpusen informazio morfosintaktikoaren etiketatze automatikoa hizkuntz ezagutzan oinarrituz: zenbait arazo, hainbat erronka (2020)

Fontes Linguae Vasconum 50 urte. Ekarpen berriak euskararen ikerketari / Nuevas aportaciones al estudio de la lengua vasca.

Alberto Blanco, Alicia Pérez, Arantza Casillas

Automatic Classification of Medical Records with Multi-label Classifiers and Similarity Match Coders (2020)

CEUR Workshop Proceedings, Vol 2696 - Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum

Alberto Blanco, Olatz Perez de Viñaspre, Alicia Pérez, Arantza Casillas

Boosting ICD multi-label classification of health records with contextual embeddings and label-granularity (2020)

Computer Methods and Programs in Biomedicine, Volume 188, 105264

Santana, S and Pérez, A and Casillas, A

HapLap at eHealth-KD Challenge 2020 (2020)

Proceedings of the Iberian Languages Evaluation Forum co-located with 36th Conference of the Spanish Society for Natural Language Processing, IberLEF@ SEPLN

Hormaetxe G., Iruskieta M.

Parekoen behaketarekin komunikazio-gaitasuna ebaluatzen: zer dute nahiago ikasleek, errubrika tradizionala ala bideo-behaketa (2020)

e-Hizpide 96

Ibarra, I., Ortube, M., Iruskieta, M.

Loturak landuz: idazketa errazeko programa (2020)

Booktegi.

Mikel Artetxe, Gorka Labaka, Noe Casas, Eneko Agirre

Do all roads lead to Rome? Understanding the role of initialization in iterative back-translation (2020)

Knowledge-Based Systems, Volume 206 (online first). Pre-print https://arxiv.org/abs/2002.12867

Eneko Agirre

Cross-Lingual Word Embeddings (Book Review) (2020)

Computational Linguistics 46 (1), 245-248. (https://doi.org/10.1162/COLI_r_00372)

Jan Deriu, Katsiaryna Mlynchyk, Philippe Schläpfer, Alvaro Rodrigo, Dirk von Grünigen, Nicolas Kaiser, Kurt Stockinger, Eneko Agirre, Mark Cieliebak

A Methodology for Creating Question Answering Corpora Using Inverse Data Annotation (2020)

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 897-911.

Ivana Kvapilíková, Mikel Artetxe, Gorka Labaka, Eneko Agirre, Ondřej Bojar

Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining (2020)

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop. Pages 255-262

Gorka Azkune, Aitor Almeida, Eneko Agirre

Cross-environment activity recognition using word embeddings for sensor and activity representation (2020)

Neurocomputing (available online 1 September 2020)

José Ramom Pichel, Pablo Gamallo, Marco Neves & Iñaki Alegria

Distância diacrónica automática entre variantes diatópicas do português e do espanhol (2020)

Linguamática, Vol. 12 N. 1, 117–126 ISSN: 1647–0818

Elena Zotova, Rodrigo Agerri, Manuel Nuñez and German Rigau

Multilingual Stance Detection in Tweets: The Catalonia Independence Corpus (2020)

Language Resources and Evaluation Conference (LREC 2020)

Rodrigo Agerri, German Rigau

Language independent sequence labelling for Opinion Target Extraction (2020)

International Joint Conference on Artificial Intelligence (IJCAI 2020)

Nora Aranberri

With or without you? Effects of using machine translation to write flash fiction in the foreign language (2020)

Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, p. 165–174, Lisboa, Portugal, November 2020.

Adrián Nuñez-Marcos, Gorka Azkune, Eneko Agirre, Diego López-de-Ipiña, Ignacio Arganda-Carreras

Using External Knowledge to Improve Zero-shot Action Recognition in Egocentric Videos (2020)

International Conference on Image Analysis and Recognition (ICIAR)

Arantxa Otegi, Aitor Soroa, Eneko Agirre, Jon Ander Campos

Cómo gestionar la sobrecarga de información científica sobre COVID-19 (2020)

The Conversation. ISSN 2201-5639. https://theconversation.com/como-gestionar-la-sobrecarga-de-informacion-cientifica-sobre-covid-19-138651

Jose Mari Arriola, Josu Goikoetxea, Mikel Iruskieta

Hizkuntza-teknologiak hizkuntzen ikas-irakaskuntzan: zenbat aukera, hainbat erronka (2020)

e-Hizpide 95: 1-21

Thierry Declerck, Itziar Gonzalez-Dios, German Rigau (editors)

Proceedings of the LREC 2020 Workshop on Multimodal Wordnets (MMWN-2020) (2020)

European Language Resources Association (ELRA), Paris. https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf ISBN: 979-10-95546-41-2 EAN: 9791095546412

Jon Alkorta, Itziar Gonzalez-Dios

Exploring the Enrichment of Basque WordNet with a Sentiment Lexicon (2020)

Proceedings of the Workshop on Multimodal Wordnets (MMWN-2020), pages 20–24. ISBN: 979-10-95546-41-2 https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf

Begoña Altuna, María Jesús Aranzabe, Arantza Díaz de Ilarraza

EusTimeML: A mark-up language for temporal information in Basque (2020)

Research in Corpus Linguistics 8: 86-104. ISSN 2243-4712. Asociación Española de Lingüística de Corpus (AELINCO) DOI 10.32714/ricl.08.01.06

Mikel Artetxe, Sebastian Ruder, Dani Yogatama

On the cross-lingual transferability of monolingual representations (2020)

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Mikel Artetxe, Sebastian Ruder, Dani Yogatama, Gorka Labaka, Eneko Agirre

A Call for More Rigor in Unsupervised Cross-lingual Learning (2020)

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Pablo Gamallo, José Ramom Pichel and Iñaki Alegria

Measuring Language Distance of Isolated European Languages (2020)

MDPI Information 2020, 11(4), 181 doi:10.3390/info11040181

Sara Santiso

Adverse Drug Reaction extraction on Electronic Health Records written in Spanish (2020)

Procesamiento del Lenguaje Natural http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6203

Mikel Iruskieta, Amaia Arroyo-Sagasta, Abel Camacho, Montse Maritxalar

Teknologia, testuinguru digitala eta konpetentzia digitalak hezkuntzan (2020)

Euskonews 748. ISSN: 1139-3629. URL: http://www.euskonews.eus/zbk/748/teknologia-testuinguru-digitala-eta-konpetentzia-digitalak-hezkuntzan/ar-0748001002E/

Jose R. Pichel, Pablo Gamallo, Iñaki Alegria, Marco Neves

A Methodology to Measure the Diachronic Language Distance between Three Languages Based on Perplexity (2020)

Journal of Quantitative Linguistics. DOI 10.1080/09296174.2020.1732177

Rebecka Weegar, Alicia Pérez, Arantza Casillas, Maite Oronoz

Recent advances in Swedish and Spanish medical entity recognition in clinical texts using deep neural approaches (2020)

BMC Medical Informatics and Decision Making