Speech and Language Resources
Developing language technology products and applications requires basic linguistic resources (text and speech corpora, lexicons and knowledge bases) and development tools (morphological and syntactic analyzers, disambiguators, corpus processing tools, lemmatizers, integrated tool environments, etc.).
We have more than 25 years of experience in creating this type of basic linguistic resource and ...
Demos
Konbitzul
Database of translations of noun+verb combinations
e-ROLda
A tool for looking up verb entries in the BVI lexicon and examples in the EPEC-RolSem corpus
Universal Dependencies treebank for Basque
This treebank has 121 K words annotated following the guidelines proposed in the Universal Dependencies project.
Contracts
(2020 - 2021)
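(2019 - 2019) - Hizkuntza Teknologia: Egoeraren diagnostikoa eta AMIA egitea.
Language Technology: diagnosis of the current situation and SWOT analysis.
(2019 - 2019) - Euskara HTen arloan sustatzeko proposamenak.
Proposals for promoting Basque in the field of language technologies.
(2019 - 2019) - Hizkuntza-teknologiak sustatzeko proiektu transbertsalak
Cross-cutting projects for promoting language technologies.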
(2019 - 2019) - Orotariko Euskal Hiztegia corpus bihurtzea: bigarren urratsa, B fasea.
Phase B, second stage in the conversion to corpus of the dictionary Orotariko Euskal Hiztegia.
(2017 - 2017) - Orotariko Euskal Hiztegia corpus bihurtzea: bigarren urratsa.
Second stage in the conversion to corpus of the dictionary Orotariko Euskal Hiztegia.
(2016 - 2016)
Projects
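- IKER-GAITU: hizkuntza ereduak ikertzea Adimen Artifizialean erabiltzeko
IKER-GAITU: researching language models for use in Artificial Intelligence.
(2023 - 2025) - CLARIAH-EUS EJ: Europako ikerketa-azpiegituretan Giza eta Gizarte Zientzietan euskara eta euskaraz ikertzeko aukera bultzatzeko egitasmoa.
CLARIAH-EUS EJ: initiative to promote opportunities for research on Basque, and in Basque, in the Humanities and Social Sciences within European research infrastructures.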
(2023 - 2025)
Language In The Human-Machine Era (LITHME). COST Action number: CA19102.
(2020 - 2024)
(2022 - 2024) - LUTEST: LANGUAGE UNDERSTANDING TEST SETS
(2020 - 2023)
Study of lexical combinations in Basque based on a novice academic corpus for an Academic Texts Writing Aid
(2020 - 2023)
Trustworthy AI - Integrating Learning, Optimisation and Reasoning
(2020 - 2023)
European Language Equality
(2021 - 2022)
enetCollect: A New European Network for combining Language Learning with Crowdsourcing Techniques
(2017 - 2021)
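Red estratégica para la promoción de las infraestructuras de tecnologías del lenguaje en ehumanidades y ciencias sociales
Strategic network for the promotion of language technology infrastructures in e-humanities and social sciences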
(2020 - 2021)
New generation of neural artificial intelligence models to transform language technologies in the Basque Country's industry.
(2020 - 2021) - CROSSTEXT: Automatic Generation of Multilingual Semantic Processors
Automatic generation of multilingual semantic taggers
(2017 - 2019) - DL4NLP: Deep Learning aplicado al Procesamiento del Lenguaje Natural como apoyo a los ámbitos del RIS3
Deep Learning applied to Natural Language Processing in support of the RIS3 areas
(2019 - 2019)
(2011 - 2011)
Patents
Resources
Publications
Ainara Estarrona, Izaskun Etxeberria, Manuel Padilla-Moyano, Ander Soraluze
Measuring language distance for historical texts in Basque (2023)
Procesamiento del Lenguaje Natural, no. 70, March 2023, pp. 53-61
Igone Zabala
Euskararen erregistro akademikoen garapenaz: hiztegia eta fraseologia (2023)
Lindemann David (ed.) Miren Azkarateri esker onez. Bilbo: UPV/EHUko Argitalpen Zerbitzua: 313-332
Itziar Aduriz, Manex Agirrezabal, Eneko Agirre, Iñaki Alegria, Xabier Arregi, Jose Mari Arriola, Xabier Artola, Arantza Díaz de Ilarraza, Ainara Estarrona, Izaskun Etxeberria, Nerea Ezeiza, Kepa Sarasola
Morfologia Konputazionala Euskaraz, 35 urte (2023)
Lindemann, D. (ed.). Miren Azkarateri esker onez, 15-30. UPV/EHU Argitalpen zerbitzua. Bilbo.
Izaskun Aldezabal, María Jesús Aranzabe
Euskararen eredutik hizkuntza-ereduen euskarara (2023)
David Lindemann (ed.), Miren Azkarateri esker onez, 57-75. Bilbo: UPV/EHUko Argitalpen Zerbitzua
Izaskun Aldezabal, Jose Mari Arriola, Arantxa Otegi
TZOS: an Online Terminology Database Aimed at Working on Basque Academic Terminology Collaboratively (2022)
Proceedings of the 13th Language Resources and Evaluation Conference. Editors: Nicoletta Calzolari (Conference chair), Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Gonzalez-Dios, Itziar and Altuna, Begoña
Natural Language Processing and Language Technologies for the Basque Language (2022)
Gonzalez-Dios, Itziar and Altuna, Begoña (2022). Natural Language Processing and Language Technologies for the Basque Language. In Cuadernos Europeos de Deusto. NÚMERO ESPECIAL. Linguas minoritarias e futuro de Europa. Minority Languages and the Future of Europe 26, 203-230. https://doi.org/10.18543/ced.2477 https://ced.revistas.deusto.es/issue/view/285
María Jesús Aranzabe, Antton Gurrutxaga, Igone Zabala
Compilación del corpus académico de noveles en euskera HARTAeus y su explotación para el estudio de la fraseología académica (2022)
Procesamiento del Lenguaje Natural, no. 69, September 2022, pp. 95-103
María Jesús Aranzabe, Izaskun Aldezabal, Igone Zabala
Recursos y Herramientas de Lingüística de Corpus y PLN para la Monitorización e Investigación de los Usos Académicos del Euskera (2022)
3rd INTELE workshop (Infraestructura de Tecnologías del Lenguaje). Madrid, 13-14 September (poster presented at the workshop)
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Anne-Lyse Minard, Manuela Speranza, and Roberto Zanoli
European Clinical Case Corpus (2022)
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Anne-Lyse Minard, Manuela Speranza, and Roberto Zanoli (2022). European Clinical Case Corpus. Georg Rehm ed. European Language Grid, A Language Technology Platform for Multilingual Europe. Springer, Cham, Switzerland. https://doi.org/10.1007/978-3-031-17258-8
Petter Mæhlum, Andre Kåsen, Samia Touileb, and Jeremy Barnes.
Annotating Norwegian language varieties on Twitter for Part-of-speech. (2022)
Proceedings of the Ninth Workshop on NLP for Similar Languages, Varieties and Dialects
Itziar Glez Dios, Aitor Soroa, Hugo Laurençon, Lucile Saulnier, Thomas Wang, Christopher Akiki, Albert Villanova del Moral, Teven Le Scao, Leandro Von Werra, Chenghao Mou, Eduardo González Ponferrada, Huu Nguyen, Jörg Frohberg, Mario Šaško, Quentin Lhoest, Angelina McMillan-Major, Gérard Dupont, Stella Biderman, Anna Rogers, Loubna Ben Allal, Francesco de Toni, Giada Pistilli, Olivier Nguyen, Somaieh Nikpoor, Maraim Masoud, Pierre Colombo, Javier de la Rosa, Paulo Villegas, Tristan Thrush, et al.
The BigScience ROOTS Corpus: A 1.6 TB Composite Multilingual Dataset (2022)
2022. Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track
Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., Soroa, A., Gonzalez-Dios, I., ... & Manica, M.
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (2022)
Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., ... & Manica, M. (2022). BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. arXiv preprint arXiv:2211.05100.
Nayla Escribano, Jon Ander González, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Perez-de-Viñaspre, Rodrigo Agerri
BasqueParl: A Bilingual Corpus of Basque Parliamentary Transcriptions (2022)
Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3382–3390, Marseille, France. European Language Resources Association.
Margarita Alonso Ramos, Igone Zabala
HARTAes-vas: Lexical combinations for an academic writing aid tool in Spanish and Basque (2022)
SEPLN-PD 2022. Annual Conference of the Spanish Association for Natural Language Processing 2022: Projects and Demonstrations, September 21-23, 2022, A Coruña, Spain.
Mikel Artetxe, Itziar Aldabe, Rodrigo Agerri, Olatz Perez-de-Viñaspre, Aitor Soroa
Does Corpus Quality Really Matter for Low-Resource Languages? (2022)
Proceedings of EMNLP 2022.
Elisa Sanchez-Bayona, Rodrigo Agerri
Leveraging a New Spanish Corpus for Multilingual and Crosslingual Metaphor Detection (2022)
Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL), pages 228--240, Abu Dhabi, United Arab Emirates, Association for Computational Linguistics.
Elisa Sanchez-Bayona, Rodrigo Agerri
From Automatic Metaphor Processing in Spanish to a Multilingual Perspective: Annotation, Systems, and Evaluation (2022)
Doctoral Symposium on Natural Language Processing from the PLN.net network 2022 (RED2018-102418-T), 21-23 September 2022, A Coruña, Spain.
Cecilia Domingo, Tatiana Gonzalez-Ferrero, Itziar Gonzalez-Dios
What is on Social Media that is not in WordNet? A Preliminary Analysis on the TwitterAAE Corpus (2021)
Domingo, C., Gonzalez-Ferrero, T., & Gonzalez-Dios, I. (2021, January). What is on Social Media that is not in WordNet? A Preliminary Analysis on the TwitterAAE Corpus. In Proceedings of the 11th Global Wordnet Conference (pp. 234-242).
Itziar Gonzalez-Dios, Uxoa Iñurrieta, Igone Zabala
General and Specialised Corpora to Raise Linguistic Awareness in a Language Undergoing the Normalisation Process: Academic Writing in Basque (2021)
Gonzalez Dios, I.; Iñurrieta, U.; Zabala, I. General and specialised corpora to raise linguistic awareness in a language undergoing the normalisation process: academic writing in Basque. In: AELFE-TAPP 2021 (19th AELFE Conference, 2nd TAPP Conference). "Multilingual academic and professional communication in a networked world. Proceedings of AELFE-TAPP 2021 (19th AELFE Conference, 2nd TAPP Conference). Vilanova i la Geltrú (Barcelona), 7-9 July 2021". Vilanova i la Geltrú: Universitat Politècnica de Catalunya, 2021, ISBN 978-84-9880-943-5.
Igone Zabala
Euskararen lantze funtzionala esparru akademiko eta profesionaletan (2021)
In Grenoble, Lenore / Lane, Pia / Røyneland, Unn (eds.) Ivan Igartua & Lourdes Oñederra (Basque eds.) Linguistic Minorities in Europe Online. A Born-Digital, Multimodal, Peer-Reviewed Online Reference Resource. De Gruyter Mouton
Igone Zabala
Euskaltzaindiaren Hiztegiaren ekarpena lexiko espezializatuaren eta ez-espezializatuaren harmonizazioan (2021)
In Andres Urrutia (ed.) Arantzazutik mundu zabalera. Euskararen normatibizazioa: 1968-2018. IKER 40. Euskaltzaindia-Iberoamericana Vervuert: 285-299
Igone Zabala, Izaskun Aldezabal, Maria Jesus Aranzabe
Academic Research Works and Domain Dynamics: Resources and Tools for Basque Academic Writing (2021)
18th International Conference on Minority Languages (Bilbao, 2021/03/24-26)
Jon Alkorta
Hacia el análisis de sentimientos en euskera (2021)
J. Alkorta. (2021). Hacia el análisis de sentimientos en euskera. Procesamiento del Lenguaje Natural, 66, 201-204.
Jon Alkorta, Koldo Gojenola, Mikel Iruskieta
Ezeztapena identifikatzeko Murriztapen Gramatikako erregelak sentimenduen analisiaren testuinguruan (2021)
Alkorta, J., Gojenola, K. and Iruskieta, M. 2021. Ezeztapena identifikatzeko Murriztapen Gramatikako erregelak sentimenduen analisiaren testuinguruan. IV. IKERGAZTE NAZIOARTEKO IKERKETA EUSKARAZ Kongresuko artikulu-bilduma, Editors: Olatz Arbelaitz, Ainhoa Latatu, Miren Josu Omaetxebarria, Blanca Urgell. Bilbo: UEU, pp. 169-176.
Prys Delyth, Sarasola Kepa, Alegria Iñaki, Perez-de-Viñaspre Olatz, Palmer Geraint, Corcoran Padraig, Arman Laura, Knight Dawn, Spasic Irena, Bryn Jones Dewi, Cooper Sarah, Prys Myfyr, Muralidaran Vigneshwaran, O’Hare Keeziah, Prys Gruffudd, Watkins Gareth, Roberts Jonathan C, Butcher Peter W. S., Lew Robert, Rees Geraint, Sharma Nirwan, Frankenberg-Garcia Ana, Farhat Leena Sarah, Teahan William John.
Language and Technology in Wales: Volume I (2021)
Language and Technology in Wales: Volume I. University of Bangor. ISBN: 978-1-84220-189-3
Prys Delyth, Sarasola Kepa, Alegria Iñaki, Perez-de-Viñaspre Olatz, Palmer Geraint, Corcoran Padraig, Arman Laura, Knight Dawn, Spasic Irena, Bryn Jones Dewi, Cooper Sarah, Prys Myfyr, Muralidaran Vigneshwaran, O’Hare Keeziah, Prys Gruffudd, Watkins Gareth, Roberts Jonathan C, Butcher Peter W. S., Lew Robert, Rees Geraint, Sharma Nirwan, Frankenberg-Garcia Ana, Farhat Leena Sarah, Teahan William John.
Iaith a Thechnoleg yng Nghymru: Cyfrol 1 (2021)
Iaith a Thechnoleg yng Nghymru: Cyfrol 1. University of Bangor. ISBN: 978-1-84220-189-6
Xavier Gómez Guinovart, Itziar Gonzalez-Dios, Antoni Oliver, German Rigau
Multilingual Central Repository: a Cross-lingual Framework for Developing Wordnets (2021)
Xavier Gómez Guinovart, Itziar Gonzalez-Dios, Antoni Oliver, German Rigau (2021) Multilingual Central Repository: a Cross-lingual Framework for Developing Wordnets. arXiv:2107.00333
Ainara Estarrona, Izaskun Etxeberria, Ricardo Etxepare, Manuel Padilla-Moyano, Ander Soraluze
The First Annotated Corpus of Historical Basque (2021)
Digital Scholarship in the Humanities, vol. 37(2), pp. 391-404
Igone Zabala, María Jesús Aranzabe, Izaskun Aldezabal
Retos actuales del desarrollo y aprendizaje de los registros académicos orales y escritos del euskera (2021)
Círculo de Lingüística Aplicada a la Comunicación 88, pp. 31-50
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Manuela Speranza, Roberto Zanoli
The E3C Project: European Clinical Case Corpus (2021)
Proceedings of the Annual Conference of the Spanish Association for Natural Language Processing: Projects and Demonstrations (SEPLN-PD 2021). Pages 17-20. ISSN: 1613-0073. URL: http://ceur-ws.org/Vol-2968/paper5.pdf
Ainara Estarrona, Izaskun Aldezabal, Arantza Díaz de Ilarraza
How the corpus-based Basque Verb Index lexicon was built (2020)
Language Resources and Evaluation. First Online 05 December 2018. DOI: https://doi.org/10.1007/s10579-018-9440-0. Springer Netherlands
Piroska Lendvai, Sándor Darányi, Christian Geng, Moniek Kuijpers, Oier Lopez de Lacalle, Jean-Christophe Mensonides, Simone Rebora and Uwe Reichel
Detection of Reading Absorption in User-Generated Book Reviews: Resources Creation and Evaluation (2020)
Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). Marseille, France
Arantxa Otegi, Aitor Agirre, Jon Ander Campos, Aitor Soroa, Eneko Agirre
Conversational Question Answering in Low Resource Scenarios: A Dataset and Case Study for Basque (2020)
Proceedings of The 12th Language Resources and Evaluation Conference, pp. 429–435. European Language Resources Association. ISBN: 979-10-95546-34-4
Javier Álvez, Itziar Gonzalez-Dios, German Rigau
Towards Word Sense Disambiguation by Reasoning (2020)
Vampire 2018 and Vampire 2019. The 5th and 6th Vampire Workshops. EPiC Series in Computing. Pages 19-29. ISSN: 2398-7340
Uxoa Iñurrieta
Identification and translation of verb+noun multiword expressions: a Spanish-Basque study (2020)
Procesamiento del Lenguaje Natural, 64, pp. 123-126.
Kepa Bengoetxea, Itziar Gonzalez-Dios, Amaia Aguirregoitia
AzterTest: Open source linguistic and stylistic analysis tool (2020)
Procesamiento del Lenguaje Natural, 64, 61-68. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6196
Rodrigo Agerri, Iñaki San Vicente, Jon Ander Campos, Ander Barrena, Xabier Saralegi, Aitor Soroa, Eneko Agirre
Give your Text Representation Models some Love: the Case for Basque (2020)
Proceedings of LREC. Also available on arXiv: https://arxiv.org/pdf/2004.00033.pdf
Itziar Gonzalez-Dios, Javier Álvez, German Rigau
Towards modeling SUMO attributes through WordNet adjectives: a Case Study on Qualities. (2020)
Proceedings of the Workshop on Multimodal Wordnets (MMWN-2020), pages 1–6. ISBN: 979-10-95546-41-2 https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf
Jon Alkorta, Itziar Gonzalez-Dios
Exploring the Enrichment of Basque WordNet with a Sentiment Lexicon (2020)
Proceedings of the Workshop on Multimodal Wordnets (MMWN-2020), pages 20–24. ISBN: 979-10-95546-41-2 https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf
Thierry Declerck, Itziar Gonzalez-Dios, German Rigau (editors)
Proceedings of the LREC 2020 Workshop on Multimodal Wordnets (MMWN-2020) (2020)
European Language Resources Association (ELRA), Paris. https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf ISBN: 979-10-95546-41-2 EAN: 9791095546412
Begoña Altuna, María Jesús Aranzabe, Arantza Díaz de Ilarraza
EusTimeML: A mark-up language for temporal information in Basque (2020)
Research in Corpus Linguistics 8: 86-104. ISSN 2243-4712. Asociación Española de Lingüística de Corpus (AELINCO) DOI 10.32714/ricl.08.01.06
Begoña Altuna
Análisis de estructuras temporales en euskera y creación de un corpus (2020)
Procesamiento del Lenguaje Natural, Revista no 64, marzo de 2020, pp. 131-134 URL: http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6206 ISSN: 1989-7553
Elena Zotova, Rodrigo Agerri, Manuel Nuñez and German Rigau
Multilingual Stance Detection in Tweets: The Catalonia Independence Corpus (2020)
Language Resources and Evaluation Conference (LREC 2020)
Uxoa Inurrieta, Itziar Aduriz, Arantza Díaz de Ilarraza, Gorka Labaka, Kepa Sarasola
Learning about phraseology from corpora: A linguistically motivated approach for Multiword Expression identification. (2020)
Inurrieta U, Aduriz I, Díaz de Ilarraza A, Labaka G, Sarasola K (2020) Learning about phraseology from corpora: A linguistically motivated approach for Multiword Expression identification. PLoS ONE 15(8): e0237767. https://doi.org/10.1371/journal.pone.0237767
Ainara Estarrona, Izaskun Etxeberria, Ricardo Etxepare, Manuel Padilla-Moyano, Ander Soraluze
Sintaktikoki etiketatutako euskarazko corpus historikoa eraikitzen (2020)
Fontes Linguae Vasconum 50 urte. Ekarpen berriak euskararen ikerketari. Nuevas aportaciones al estudio de la lengua vasca
Ainara Estarrona, Izaskun Etxeberria, Ricardo Etxepare, Manuel Padilla-Moyano, Ander Soraluze
Dealing with dialectal variation in the construction of the Basque historical corpus (2020)
Proceedings of the 7th Workshop on NLP for similar languages, varieties and dialects (VarDial2020 at COLING 2020).
Gorka Urbizu, Ander Soraluze, Olatz Arregi
Sequence to Sequence Coreference Resolution (2020)
Proceedings of the 3rd Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2020), pages 39–46, Barcelona, Spain (online), December 12, 2020.
Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark Cieliebak, Eneko Agirre
DoQA - Accessing Domain-Specific FAQs via Conversational QA (2020)
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7302–7314
Itziar Gonzalez-Dios
Data statement of the Corpus of Basque Simplified Texts (2020)
Data Statements workshop
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Manuela Speranza, Roberto Zanoli
The E3C Project: Collection and Annotation of a Multilingual Corpus of Clinical Cases (2020)
In Johanna Monti, Felice Dell'Orletta and Fabio Tamburini (eds.), Proceedings of the Seventh Italian Conference on Computational Linguistics. Associazione Italiana di Linguistica Computazionale. Bologna, Italy, 2020.
Itziar Aldabe, Josu Aztiria, Francho Beltrán, Myriam Bras, Klara Ceberio, Itziar Cortes, Jean-Baptiste Coyos, Benaset Dazeas, Louise Esher, Gorka Labaka, Igor Leturia, Kepa Sarasola, Aure Séguier, Jean Sibille
LINGUATEC: Development of cross-border cooperation and knowledge transfer in language technologies (2020)
Workshop "INTELE : INfraestructura de TEcnologías del LEnguaje" CLARIN DARIAH-EU. http://ixa2.si.ehu.eus/intele/?q=node/71
Kepa Sarasola, Itziar Aldabe, Arantza Diaz de Ilarraza, Ainara Estarrona, Aritz Farwell, Inma Hernaez, Eva Navas; Reviewers: Annika Grützner-Zahn, Maria Giagkou; Editors: Maria Giagkou, Stelios Piperidis, Georg Rehm, Jane Dunne
Report on the Basque Language. European Language Equality (2020)
Deliverables of the Project ELE (European Language Equality). D1.4 Report on the Basque Language, https://european-language-equality.eu/deliverables/
Jon Alkorta, Koldo Gojenola, Mikel Iruskieta
SentiTegi: building a semantic oriented Basque lexicon (2019)
Computación y Sistemas, 22 (4)
Igone Zabala
The elaboration of Basque in academic and professional domains. (2019)
In Grenoble, Lenore; Lane, Pia & Røyneland, Unn (eds.) Linguistic Minorities in Europe Online. De Gruyter Mouton. ISSN 2510-5361
Aitziber Atutxa, Kepa Bengoetxea, Arantza Diaz de Ilarraza, Mikel Iruskieta
Towards a top-down approach for an automatic discourse analysis for Basque: Segmentation and Central Unit detection tool (2019)
PLoS ONE 14(9): e0221639
Ander Soraluze, Olatz Arregi, Xabier Arregi, Arantza Diaz de Ilarraza
EUSKOR: End-to-end coreference resolution system for Basque (2019)
PLoS ONE 14(9): e0221801. https://doi.org/10.1371/journal.pone.0221801
Ainara Estarrona, Izaskun Etxeberria, Ander Soraluze, Manuel Padilla-Moyano
Spelling Normalisation of Basque Historical Texts (2019)
Procesamiento del Lenguaje Natural, vol. 63, pp. 59-66
Javier Álvez, Itziar Gonzalez-Dios, German Rigau
Commonsense Reasoning Using WordNet and SUMO: a Detailed Analysis (2019)
Proceedings of the Tenth Global Wordnet Conference, pp 197--205. ISBN 978-83-7493-108-3
Itziar Gonzalez-Dios, German Rigau
Textual genre based approach to use wordnets in language-for-specific-purpose classroom as dictionary (2019)
Proceedings of the Tenth Global Wordnet Conference, pp 222--227. ISBN 978-83-7493-108-3
Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark Cieliebak, Eneko Agirre
Conversational QA for FAQs (2019)
NeurIPS 3rd Conversational AI Workshop: “Today's Practice and Tomorrow's Potential”
Meghan Dowling, Kepa Sarasola, Ana Zelaia, Aitzol Astigarraga
Looking for possible new articles. What Wikipedia pages are often consulted in English... but there are not defined in Gaelic? (2019)
Meghan Dowling, Kepa Sarasola, Ana Zelaia, Aitzol Astigarraga (2019) 'Looking for possible new articles. What Wikipedia pages are often consulted in English... but there are not defined in Gaelic?' Wikimedia+Education Conference, Donostia 2019
Begoña Altuna, Maria Jesús Aranzabe, Arantza Diaz de Ilarraza
Adapting TimeML to Basque: Event Annotation (2018)
In Gelbukh A. (eds.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science (LNCS, vol 9624), 565-577. Springer, Cham. DOI https://doi.org/10.1007/978-3-319-75487-1_43 ; Print ISBN 978-3-319-75486-4; Online ISBN 978-3-319-75487-1
Uxoa Iñurrieta, Itziar Aduriz, Arantza Díaz de Ilarraza, Gorka Labaka, Kepa Sarasola
Konbitzul: an MWE-specific Database for Spanish-Basque (2018)
Proceedings of the 11th Language Resources and Evaluation Conference, Miyazaki, Japan. Pages 2500-2504.
Uxoa Iñurrieta, Itziar Aduriz, Ainara Estarrona, Itziar Gonzalez-Dios, Antton Gurrutxaga, Ruben Urizar, Iñaki Alegria
Verbal Multiword Expressions in Basque corpora (2018)
In the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (at COLING 2018)
Igone Zabala
Euskararen lantze funtzionala eta profesionalen komunikazio-gaitasunen garapena osasun-alorrean (2018)
BAT Soziolinguistika Aldizkaria 108, 2018 (3): 11-34
Igone Zabala
Euskararen terminologiaren garapena Terminologiaren Teoria Komunikatiboaren argitan (2018)
In Ruben Urizar and Itziar Aduriz (eds.) Hizkuntzalari Euskaldunen III Topaketa. Zer berri?. 349-358.
Klara Ceberio, Itziar Aduriz, Arantza Díaz de Ilarraza and Ines Garzia-Azkoaga
Coreferential Relations in Basque: The Annotation Process (2018)
J Psycholinguist Res (2018) 47, Issue 2. Pages 325-342. https://doi.org/10.1007/s10936-018-9559-6. ISSN 0090-6905. Online ISSN 1573-6555.
Izaskun Aldezabal, Xabier Artola, Arantza Diaz De Ilarraza, Itziar Gonzalez-Dios, Gorka Labaka, German Rigau and Ruben Urizar
Basque e-lexicographic resources: linguistic basis, development, and future perspectives (2018)
Workshop on eLexicography: Between Digital Humanities and Artificial Intelligence. https://lexdhai.insight-centre.org/Lex_DH__AI_2018_paper_5.pdf
Itziar Aduriz, María Jesús Aranzabe, José María Arriola, Arantza Díaz de Ilarraza, Itziar Gonzalez-Dios, Ruben Urizar
Building the Gold Standard for the Surface Syntax of Basque (2017)
Procesamiento del Lenguaje Natural, 58, 125-132. Retrieved from http://ixa.si.ehu.es/sites/default/files/dokumentuak/8825/5421-4766-1-PB.pdf (print ISSN: 1135-5948) (online ISSN: 1989-7553)
Itziar Aduriz, Iñaki Alegria, Olatz Arregi, Arantza Diaz de Ilarraza, Kepa Sarasola
Hizkuntza-teknologia “Datu Handien” garaian: programa bilatzaileak, itzultzaileak… (2017)
Senez, 48, pp. 191-200. ISSN: 1132-2152. 2017 https://eizie.eus/eu/argitalpenak/senez/20171102/aurkezpena/datuhandiak
Zabala I., San Martin I., Lersundi M.
Learning terminology in order to become an active agent in the development of Basque biomedical registers (2016)
Language Learning in Higher Education. Journal of CercleS (European Confederation of Language Centres in Higher Education). De Gruyter Mouton. Volume 6, Issue 1 (May 2016). Special issue: Teaching Medical Discourse in Higher Education. ISSN (Online) 2191-6128, ISSN (Print) 2191-611X, DOI: 10.1515/cercles-2016-0007 URL: http://www.degruyter.com/view/j/cercles.2016.6.issue-1/cercles-2016-0007/cercles-2016-0007.xml
Arantxa Otegi, Nora Aranberri, António Branco, Jan Hajic, Steven Neale, Petya Osenova, Rita Pereira, Martin Popel, Joao Silva, Kiril Simov, Eneko Agirre
QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages (2016)
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), European Language Resources Association (ELRA). ISBN 978-2-9517408-9-1
Estarrona A., Aldezabal I., Díaz de Ilarraza A. and Aranzabe M.J.
A Methodology for the Semiautomatic Annotation of EPEC-RolSem, a Basque Corpus Labeled at Predicate Level following the PropBank/Verbnet Model (2016)
Edward Vanhoutte (ed.) Digital Scholarship in the Humanities (2016) 31 (3): 470-492. DOI: http://dx.doi.org/10.1093/llc/fqv001 First published online: 17 June 2015 (23 pages). Published by Oxford University Press on behalf of EADH: The European Association for Digital Humanities (Online ISSN 2055-768X - Print ISSN 2055-7671)
A. Minard, M. Speranza, R. Urizar, B. Altuna, M. van Erp, A. Schoen, and C. van Son
MEANTIME, the NewsReader Multilingual Event and Time Corpus (2016)
Proceedings of LREC 2016.Pages: 4417-4422. ISBN: 978-2-9517408-9-1
Zabala I., Lersundi M., Martínez M., Requero M.A., Omaetxebarria M.J.
Biokimikaren terminologiaren deskripzioa: erabilera errealetik hiztegietara. (2015)
Beatriz Fernandez and Pello Salaburu (eds.) Ibon Sarasola, Gorazarre, Homenatge, Homenaje. UPV/EHUko Argitalpen Zerbitzua. Bilbo: 679-692
Maria Jesús Aranzabe, Aitziber Atutxa, Kepa Bengoetxea, Arantza Diaz de Ilarraza, Iakes Goenaga, Koldo Gojenola, Larraitz Uria
Automatic Conversion of the Basque Dependency Treebank to Universal Dependencies (2015)
Markus Dickinson, Erhard Hinrichs, Agnieszka Patejuk, Adam Przepiórkowski (eds), Proceedings of the Fourteenth International Workshop on Treebanks and Linguistic Theories (TLT14), 233-241. Institute of Computer Science of the Polish Academy of Sciences, Warszawa, Poland. ISBN: 978-83-63159-18-4
Zabala I., San Martin I., Lersundi M.
Linguistic and sociolinguistic factors that influence the detection, implantation and circulation of natural terminology in academic uses of Basque (2014)
Pascaline Dury; José Carlos de Hoyos; Julie Makri-Morel; François Maniez; Vincent Renner; María Belén Villar Díaz (eds.) La néologie en langue de spécialité. Détection, implantation et circulation des nouveaux termes. Travaux du CRTT (Centre de Recherche en Terminologie et Traduction. Université Lumière Lyon 2): 141-164. ISBN: 978-2-9533061-0-1
Larraitz Uria, Montserrat Maritxalar, Igone Zabala
An Environment for Learner Corpus Research and Error Analysis: The Study of Determiner Errors in Basque (2014)
International Journal of Computer-Assisted Language Learning and Teaching (IJCALLT), 4 (3), 34-51, July-September 2014, edited by Bin Zou (ISSN 2155-7098, eISSN 2155-7101).
Igor Leturia, Kepa Sarasola, Xabier Arregi, Arantza Diaz de Ilarraza, Eva Navas, Iñaki Sainz, Arantza del Pozo, David Baranda, Urtza Iturraspe
BerbaTek: euskararako hizkuntza teknologien garapena itzulpengintza, edukien kudeaketa eta irakaskuntza arloetan (2013)
Euskalingua aldizkari digitala, 23, 66-76. http://mendebalde.eus/euskalinguak/Euskalingua%2023/Berbatek:%20euskararako%20hizkuntza%20teknologien%20garapena%20itzulpengintza,%20edukien%20kudeaketa%20eta%20irakaskuntza%20arloetan.pdf
Iruskieta M., Aranzabe M., Diaz de Ilarraza A., Gonzalez I., Lersundi I., Lopez de Lacalle O.
The RST Basque TreeBank: an online search interface to check rhetorical relations (2013)
4th Workshop RST and Discourse Studies, 40-49, Sociedade Brasileira de Computação, Fortaleza, CE, Brasil. October 20-24 (http://encontrorst2013.wix.com/encontro-rst-2013)
Zabala I., San Martin I., Lersundi M., Azkue J. J., Mendizabal J.L.
The Elaboration of Human Anatomy Terminology for the Basque Language: the Contribution of Translators, Linguists and Experts (2012)
Terminàlia Vol. 6: 15-25
Zabala I., San Martin I.
Basque and Romance Languages: Languages with Different Structures (2012)
Pello Salaburu & Xabier Alberdi (ed.) The Basque Country, a Bilingual Society. Center for Basque Studies University of Nevada: 51-72
Pociello E., Agirre E. and Aldezabal I.
Methodology and construction of the Basque WordNet (2011)
Language Resources and Evaluation. Springer. Volume 45, Issue 2, pp 121-142. ISSN 1574-020X. DOI 10.1007/s10579-010-9131-y.
Iñaki Alegria, Maria Jesús Aranzabe, Xabier Arregi, Xabier Artola, Arantza Diaz de Ilarraza, Aingeru Mayor, Kepa Sarasola
Valuable Language Resources and Applications Supporting the Use of Basque (2011)
Z. Vetulani (Ed.): LTC 2009, Lecture Notes in Artificial Intelligence LNAI 6562, pp. 327--338. Springer, Heidelberg. ISBN: 978-3-642-20094-6, DOI: 10.1007/978-3-642-20095-3, https://link.springer.com/content/pdf/10.1007%2F978-3-642-20095-3_30.pdf
Zabala I., San Martin I., Lersundi M., Elordui A.
Graduate Teaching of Specialized Registers in a Language in the Normalization Process: Towards a Comprehensive and Interdisciplinary Treatment of Academic Basque (2011)
Sergio Maruenda-Bataller and Begoña Clavel-Arroitia (eds.) Multiple Voices in Academic and Professional Discourse: Current Issues in Specialised Language Research, Teaching and New Technologies. Cambridge Scholars Publishing: 208-229 ISBN (10) 1-4438-2971-4; ISBN (13): 978-1-4438-2971-7
Izaskun Aldezabal, Maria Jesús Aranzabe, Arantza Diaz de Ilarraza, Ainara Estarrona, Larraitz Uria
EusPropBank: Integrating Semantic Information in the Basque Dependency Treebank (2010)
Lecture Notes in Computer Science (LNCS) nº 6008, Alexander Gelbukh (Ed.), Computational Linguistics and Intelligent Text Processing. pp.60-73, Springer. ISSN: 0302-9743, ISBN-10: 3-642-12115-2 Springer Berlin Heidelberg New York, ISBN-13: 978-3-642-12115-9 Springer Berlin Heidelberg New York. 11th International Conference, CICLing 2010, Iasi, Romania, March 21-27, 2010
Izaskun Aldezabal, Maria Jesús Aranzabe, Arantza Diaz de Ilarraza, Ainara Estarrona
Building the Basque PropBank (2010)
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner and Daniel Tapias (eds.), Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC 2010), pp. 1414-1417, European Language Resources Association (ELRA), ISBN: 2-9517408-6-7. LREC 2010, Valletta (Malta), May 19-21, 2010
Uria L., Estarrona A., Aldezabal I., Aranzabe M., Díaz de Ilarraza A., Iruskieta M.
Evaluation of the Syntactic Annotation in EPEC, the Reference Corpus for the Processing of Basque (2009)
Lecture Notes in Computer Science (LNCS) nº 5449, Alexander Gelbukh (Ed.), Computational Linguistics and Intelligent Text Processing. pp 72-85. Springer. ISSN: 0302-9743, ISBN-10: 3-642-00381-8, ISBN-13: 978-3-642-00381-3. 10th International Conference, CICLing 2009, Mexico City, Mexico, March 1-7, 2009
Izaskun Aldezabal, Maria Jesús Aranzabe, Jose Maria Arriola, Arantza Diaz de Ilarraza
Syntactic annotation in the Reference Corpus for the Processing of Basque (EPEC): Theoretical and practical issues (2009)
Corpus Linguistics and Linguistic Theory 5-2 (2009), 241-269. Mouton de Gruyter. Berlin-New York. Print ISSN: 1613-7027 Online ISSN: 1613-7035
Elosegi A., Aldezabal I., Garcia J.
Euskarazko produktu linguistikoen azterketa (2009)
In I. Zabala (ed.) GARATERM: diskurtso akademiko-profesionalaren didaktika eta garapena uztartzeko tresna informatikoen diseinua eta integrazioa. Electronic publication of the UPV/EHU Publishing Service (in press). http://garaterm.ehu.es/garaterm_ataria/publications/
Elosegi A., Aldezabal I., Garcia J.
Euskararen garapen lexiko-diskurtsiboan eragina duten produktu linguistiko nagusiei buruzko informazioa biltzen duen datu-basea (2009)
GARATERM: diskurtso akademiko-profesionalaren didaktika eta garapena uztartzeko tresna informatikoen diseinua eta integrazioa. Electronic publication of the UPV/EHU Publishing Service (in press). http://garaterm.ehu.es/garaterm_ataria/publications/
Elordui A., Zabala I.
Euskara Batuaren garapen lexiko-diskurtsiboa: batasunetik aniztasun funtzionalera (2009)
Ricardo Etxepare, Ricardo Gomez & Joseba Lakarra (eds.) A Festschrift for Bernard Oyharçabal. Bilbao: University of the Basque Country. Supplements of the International Journal of Basque Language and Linguistics: 231-246. ISSN: 0582-6152
Zabala I., Aierbe A., Aldezabal I., Aranzabe M., Arregi X., Arriola J.M., Elordui A., Elosegi A., Elosegi K., Ezeiza J., Garcia I., Garcia J., Lersundi M., San Martin I. and Ugarteburu I.
GARATERM: Diskurtso akademiko-profesionalaren didaktika eta garapena uztartzeko tresna informatikoen diseinua eta integrazioa helburu duen proiektua (2008)
In Iñaki Ugarteburu and Pello Salaburu (eds.), Espezialitate hizkerak eta terminologia III: espezialitate hizkeren didaktika eta komunikazioa, 211-219, UPV/EHUko argitalpen zerbitzua. Bilbo (Bizkaia). ISBN: 978-84-691-6424-2
Izaskun Aldezabal, Klara Ceberio, Itsaso Esparza, Ainara Estarrona, Jone Etxeberria, Elixabete Izagirre, Mikel Iruskieta, Larraitz Uria
EPEC (Euskararen Prozesamendurako Erreferentzia Corpusa) segmentazio-mailan etiketatzeko eskuliburua (2007)
UPV/EHU / LSI / TR 11-2007
Itziar Aduriz, Maria Jesús Aranzabe, Jose Maria Arriola, Aitziber Atutxa, Arantza Diaz de Ilarraza, Nerea Ezeiza, Koldo Gojenola, Maite Oronoz, Aitor Soroa, Ruben Urizar
Methodology and steps towards the construction of EPEC, a corpus of written Basque tagged at morphological and syntactic levels for the automatic processing (2006)
Corpus Linguistics Around the World. Book series: Language and Computers. Vol. 56 (pp. 1-15). ISBN 90-420-1836-4. Ed. Andrew Wilson, Paul Rayson, and Dawn Archer. Rodopi. Netherlands.
Eneko Agirre, Izaskun Aldezabal, Jone Etxeberria, Mikel Iruskieta, Elixabete Izagirre, Karmele Mendizabal, Eli Pociello
Improving the Basque WordNet by corpus annotation. (2006)
Proceedings of the Third International WordNet Conference. pp. 287-290. ISBN 80-210-3915-9. Jeju Island (Korea).
Izaskun Aldezabal
Euskal Filologia Saila. Zientzia Fakultatea. Leioa. UPV/EHU. April 2004.
Aldezabal I., Goenaga P.
Analyzing Verbal Subcategorization Aimed at its Computational Application (2003)
Inquiries into the lexicon-syntax relations in Basque. Special volume of the Anuario del Seminario de Filología Vasca Julio de Urquijo (ASJU), XLVI: 95-126. Bernard Oiharçabal (ed.), University of the Basque Country.
Trustworthy AI - Integrating Learning, Optimisation and Reasoning
(2020 - 2023)
European Language Equality
(2021 - 2022)
enetCollect: A New European Network for combining Language Learning with Crowdsourcing Techniques
(2017 - 2021)
red estratégica para la promoción de las infraestructuras de tecnologías del lenguaje en ehumanidades y ciencias sociales
(2020 - 2021)
New generation of neural artificial intelligence models to transform language technologies in the Basque Country's industry.
(2020 - 2021)
CROSSTEXT: Automatic Generation of Multilingual Semantic Processors
Automatic generation of multilingual semantic taggers
(2017 - 2019)
DL4NLP: Deep Learning aplicado al Procesamiento del Lenguaje Natural como apoyo a los ámbitos del RIS3
(2019 - 2019)
(2011 - 2011)
Ainara Estarrona, Izaskun Etxeberria, Manuel Padilla-Moyano, Ander Soraluze
Measuring language distance for historical texts in Basque (2023)
Procesamiento del Lenguaje Natural, no. 70, March 2023, pp. 53-61
Igone Zabala
Euskararen erregistro akademikoen garapenaz: hiztegia eta fraseologia (2023)
Lindemann David (ed.) Miren Azkarateri esker onez. Bilbo: UPV/EHUko Argitalpen Zerbitzua: 313-332
Itziar Aduriz, Manex Agirrezabal, Eneko Agirre, Iñaki Alegria, Xabier Arregi, Jose Mari Arriola, Xabier Artola, Arantza Díaz de Ilarraza, Ainara Estarrona, Izaskun Etxeberria, Nerea Ezeiza, Kepa Sarasola
Morfologia Konputazionala Euskaraz, 35 urte (2023)
Lindemann, D. (ed.). Miren Azkarateri esker onez, 15-30. UPV/EHU Argitalpen zerbitzua. Bilbo.
Izaskun Aldezabal, María Jesús Aranzabe
Euskararen eredutik hizkuntza-ereduen euskarara (2023)
David Lindemann (ed.), Miren Azkarateri esker onez, 57-75. Bilbo: UPV/EHUko Argitalpen Zerbitzua
Izaskun Aldezabal, Jose Mari Arriola, Arantxa Otegi
TZOS: an Online Terminology Database Aimed at Working on Basque Academic Terminology Collaboratively (2022)
Proceedings of the 13th Language Resources and Evaluation Conference. Editors: Nicoletta Calzolari (Conference chair), Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Gonzalez-Dios, Itziar and Altuna, Begoña
Natural Language Processing and Language Technologies for the Basque Language (2022)
Gonzalez-Dios, Itziar and Altuna, Begoña (2022). Natural Language Processing and Language Technologies for the Basque Language. In Cuadernos Europeos de Deusto. NÚMERO ESPECIAL. Linguas minoritarias e futuro de Europa. Minority Languages and the Future of Europe 26, 203-230. https://doi.org/10.18543/ced.2477 https://ced.revistas.deusto.es/issue/view/285
María Jesús Aranzabe, Antton Gurrutxaga, Igone Zabala
Compilación del corpus académico de noveles en euskera HARTAeus y su explotación para el estudio de la fraseología académica (2022)
Procesamiento del Lenguaje Natural, no. 69, September 2022, pp. 95-103
María Jesús Aranzabe, Izaskun Aldezabal, Igone Zabala
Recursos y Herramientas de Lingüística de Corpus y PLN para la Monitorización e Investigación de los Usos Académicos del Euskera (2022)
3rd INTELE workshop (Infraestructura de Tecnologías del Lenguaje). Madrid, September 13-14 (poster presented at the workshop)
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Anne-Lyse Minard, Manuela Speranza, and Roberto Zanoli
European Clinical Case Corpus (2022)
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Anne-Lyse Minard, Manuela Speranza, and Roberto Zanoli (2022). European Clinical Case Corpus. Georg Rehm ed. European Language Grid, A Language Technology Platform for Multilingual Europe. Springer, Cham, Switzerland. https://doi.org/10.1007/978-3-031-17258-8
Petter Mæhlum, Andre Kåsen, Samia Touileb, and Jeremy Barnes.
Annotating Norwegian language varieties on Twitter for Part-of-speech. (2022)
Proceedings of the Ninth Workshop on NLP for Similar Languages, Varieties and Dialects
Itziar Glez Dios, Aitor Soroa, Hugo Laurençon, Lucile Saulnier, Thomas Wang, Christopher Akiki, Albert Villanova del Moral, Teven Le Scao, Leandro Von Werra, Chenghao Mou, Eduardo González Ponferrada, Huu Nguyen, Jörg Frohberg, Mario Šaško, Quentin Lhoest, Angelina McMillan-Major, Gérard Dupont, Stella Biderman, Anna Rogers, Loubna Ben Allal, Francesco de Toni, Giada Pistilli, Olivier Nguyen, Somaieh Nikpoor, Maraim Masoud, Pierre Colombo, Javier de la Rosa, Paulo Villegas, Tristan Thrush, et al.
The BigScience ROOTS Corpus: A 1.6 TB Composite Multilingual Dataset (2022)
2022. Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track
Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., Soroa, A., Gonzalez-Dios, I., ... & Manica, M.
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (2022)
Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., ... & Manica, M. (2022). BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. arXiv preprint arXiv:2211.05100.
Nayla Escribano, Jon Ander González, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Perez-de-Viñaspre, Rodrigo Agerri
BasqueParl: A Bilingual Corpus of Basque Parliamentary Transcriptions (2022)
Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3382–3390, Marseille, France. European Language Resources Association.
Margarita Alonso Ramos, Igone Zabala
HARTAes-vas: Lexical combinations for an academic writing aid tool in Spanish and Basque (2022)
SEPLN-PD 2022. Annual Conference of the Spanish Association for Natural Language Processing 2022: Projects and Demonstrations, September 21-23, 2022, A Coruña, España.
Mikel Artetxe, Itziar Aldabe, Rodrigo Agerri, Olatz Perez-de-Viñaspre, Aitor Soroa
Does Corpus Quality Really Matter for Low-Resource Languages? (2022)
Proceedings of EMNLP 2022.
Elisa Sanchez-Bayona, Rodrigo Agerri
Leveraging a New Spanish Corpus for Multilingual and Crosslingual Metaphor Detection (2022)
Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL), pages 228--240, Abu Dhabi, United Arab Emirates, Association for Computational Linguistics.
Elisa Sanchez-Bayona, Rodrigo Agerri
From Automatic Metaphor Processing in Spanish to a Multilingual Perspective: Annotation, Systems, and Evaluation (2022)
Doctoral Symposium on Natural Language Processing from the PLN.net network 2022 (RED2018-102418-T), 21-23 September 2022, A Coruña, Spain.
Cecilia Domingo, Tatiana Gonzalez-Ferrero, Itziar Gonzalez-Dios
What is on Social Media that is not in WordNet? A Preliminary Analysis on the TwitterAAE Corpus (2021)
Domingo, C., Gonzalez-Ferrero, T., & Gonzalez-Dios, I. (2021, January). What is on Social Media that is not in WordNet? A Preliminary Analysis on the TwitterAAE Corpus. In Proceedings of the 11th Global Wordnet Conference (pp. 234-242).
Itziar Gonzalez-Dios, Uxoa Iñurrieta, Igone Zabala
General and Specialised Corpora to Raise Linguistic Awareness in a Language Undergoing the Normalisation Process: Academic Writing in Basque (2021)
Gonzalez Dios, I.; Iñurrieta, U.; Zabala, I. General and specialised corpora to raise linguistic awareness in a language undergoing the normalisation process: academic writing in Basque. In: AELFE-TAPP 2021 (19th AELFE Conference, 2nd TAPP Conference). "Multilingual academic and professional communication in a networked world. Proceedings of AELFE-TAPP 2021 (19th AELFE Conference, 2nd TAPP Conference). Vilanova i la Geltrú (Barcelona), 7-9 July 2021". Vilanova i la Geltrú: Universitat Politècnica de Catalunya, 2021, ISBN 978-84-9880-943-5.
Igone Zabala
Euskararen lantze funtzionala esparru akademiko eta profesionaletan (2021)
In Grenoble, Lenore / Lane, Pia / Røyneland, Unn (eds.) Ivan Igartua & Lourdes Oñederra (Basque eds.) Linguistic Minorities in Europe Online. A Born-Digital, Multimodal, Peer-Reviewed Online Reference Resource. De Gruyter Mouton
Igone Zabala
Euskaltzaindiaren Hiztegiaren ekarpena lexiko espezializatuaren eta ez-espezializatuaren harmonizazioan (2021)
In Andres Urrutia (ed.) Arantzazutik mundu zabalera. Euskararen normatibizazioa: 1968-2018. IKER 40. Euskaltzaindia-Iberoamericana Vervuert: 285-299
Igone Zabala, Izaskun Aldezabal, Maria Jesus Aranzabe
Academic Research Works and Domain Dynamics: Resources and Tools for Basque Academic Writing (2021)
18th International Conference on Minority Languages (Bilbao, 2021/03/24-26)
Jon Alkorta
Hacia el análisis de sentimientos en euskera (2021)
J. Alkorta. (2021). Hacia el análisis de sentimientos en euskera. Procesamiento del Lenguaje Natural, 66, 201-204.
Jon Alkorta, Koldo Gojenola, Mikel Iruskieta
Ezeztapena identifikatzeko Murriztapen Gramatikako erregelak sentimenduen analisiaren testuinguruan (2021)
Alkorta, J., Gojenola, K. and Iruskieta, M. 2021. Ezeztapena identifikatzeko Murriztapen Gramatikako erregelak sentimenduen analisiaren testuinguruan. IV. IKERGAZTE NAZIOARTEKO IKERKETA EUSKARAZ Kongresuko artikulu-bilduma, Editors: Olatz Arbelaitz, Ainhoa Latatu, Miren Josu Omaetxebarria, Blanca Urgell. Bilbo: UEU, pp. 169-176.
Prys Delyth, Sarasola Kepa, Alegria Iñaki, Perez-de-Viñaspre Olatz, Palmer Geraint, Corcoran Padraig, Arman Laura, Knight Dawn, Spasic Irena, Bryn Jones Dewi, Cooper Sarah, Prys Myfyr, Muralidaran Vigneshwaran, O’Hare Keeziah, Prys Gruffudd, Watkins Gareth, Roberts Jonathan C, Butcher Peter W. S., Lew Robert, Rees Geraint, Sharma Nirwan, Frankenberg-Garcia Ana, Farhat Leena Sarah, Teahan William John.
Language and Technology in Wales: Volume I (2021)
Language and Technology in Wales: Volume I. University of Bangor. ISBN: 978-1-84220-189-3
Prys Delyth, Sarasola Kepa, Alegria Iñaki, Perez-de-Viñaspre Olatz, Palmer Geraint, Corcoran Padraig, Arman Laura, Knight Dawn, Spasic Irena, Bryn Jones Dewi, Cooper Sarah, Prys Myfyr, Muralidaran Vigneshwaran, O’Hare Keeziah, Prys Gruffudd, Watkins Gareth, Roberts Jonathan C, Butcher Peter W. S., Lew Robert, Rees Geraint, Sharma Nirwan, Frankenberg-Garcia Ana, Farhat Leena Sarah, Teahan William John.
Iaith a Thechnoleg yng Nghymru: Cyfrol 1 (2021)
Iaith a Thechnoleg yng Nghymru: Cyfrol 1. University of Bangor. ISBN: 978-1-84220-189-6
Xavier Gómez Guinovart, Itziar Gonzalez-Dios, Antoni Oliver, German Rigau
Multilingual Central Repository: a Cross-lingual Framework for Developing Wordnets (2021)
Xavier Gómez Guinovart, Itziar Gonzalez-Dios, Antoni Oliver, German Rigau (2021) Multilingual Central Repository: a Cross-lingual Framework for Developing Wordnets. arXiv:2107.00333
Ainara Estarrona, Izaskun Etxeberria, Ricardo Etxepare, Manuel Padilla-Moyano, Ander Soraluze
The First Annotated Corpus of Historical Basque (2021)
Digital Scholarship in the Humanities, vol. 37(2), pp. 391-404
Igone Zabala, María Jesús Aranzabe, Izaskun Aldezabal
Retos actuales del desarrollo y aprendizaje de los registros académicos orales y escritos del euskera (2021)
Círculo de Lingüística Aplicada a la Comunicación 88, pp. 31-50
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Manuela Speranza, Roberto Zanoli
The E3C Project: European Clinical Case Corpus (2021)
Proceedings of the Annual Conference of the Spanish Association for Natural Language Processing: Projects and Demonstrations (SEPLN-PD 2021). Pages 17-20. ISSN: 1613-0073. URL: http://ceur-ws.org/Vol-2968/paper5.pdf
Ainara Estarrona, Izaskun Aldezabal, Arantza Díaz de Ilarraza
How the corpus-based Basque Verb Index lexicon was built (2020)
Language Resources and Evaluation. First Online 05 December 2018. DOI: https://doi.org/10.1007/s10579-018-9440-0. Springer Netherlands
Piroska Lendvai, Sándor Darányi, Christian Geng, Moniek Kuijpers, Oier Lopez de Lacalle, Jean-Christophe Mensonides, Simone Rebora and Uwe Reichel
Detection of Reading Absorption in User-Generated Book Reviews: Resources Creation and Evaluation (2020)
Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). Marseille, France
Arantxa Otegi, Aitor Agirre, Jon Ander Campos, Aitor Soroa, Eneko Agirre
Conversational Question Answering in Low Resource Scenarios: A Dataset and Case Study for Basque (2020)
Proceedings of The 12th Language Resources and Evaluation Conference, pp. 429–435. European Language Resources Association. ISBN: 979-10-95546-34-4
Javier Álvez, Itziar Gonzalez-Dios, German Rigau
Towards Word Sense Disambiguation by Reasoning (2020)
Vampire 2018 and Vampire 2019. The 5th and 6th Vampire Workshops. EPiC Series in Computing. Pages 19-29. ISSN: 2398-7340
Uxoa Iñurrieta
Identification and translation of verb+noun multiword expressions: a Spanish-Basque study (2020)
Procesamiento del Lenguaje Natural, 64, pp. 123-126.
Kepa Bengoetxea, Itziar Gonzalez-Dios, Amaia Aguirregoitia
AzterTest: Open source linguistic and stylistic analysis tool (2020)
Procesamiento del Lenguaje Natural, 64, 61-68. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6196
Rodrigo Agerri, Iñaki San Vicente, Jon Ander Campos, Ander Barrena, Xabier Saralegi, Aitor Soroa, Eneko Agirre
Give your Text Representation Models some Love: the Case for Basque (2020)
Proceedings of LREC. Also available at arxiv https://arxiv.org/pdf/2004.00033.pdf
Itziar Gonzalez-Dios, Javier Álvez, German Rigau
Towards modeling SUMO attributes through WordNet adjectives: a Case Study on Qualities. (2020)
Proceedings of the Workshop on Multimodal Wordnets (MMWN-2020), pages 1–6. ISBN: 979-10-95546-41-2 https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf
Jon Alkorta, Itziar Gonzalez-Dios
Exploring the Enrichment of Basque WordNet with a Sentiment Lexicon (2020)
Proceedings of the Workshop on Multimodal Wordnets (MMWN-2020), pages 20–24. ISBN: 979-10-95546-41-2 https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf
Thierry Declerck, Itziar Gonzalez-Dios, German Rigau (editors)
Proceedings of the LREC 2020 Workshop on Multimodal Wordnets (MMWN-2020) (2020)
European Language Resources Association (ELRA), Paris. https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf ISBN: 979-10-95546-41-2 EAN: 9791095546412
Begoña Altuna, María Jesús Aranzabe, Arantza Díaz de Ilarraza
EusTimeML: A mark-up language for temporal information in Basque (2020)
Research in Corpus Linguistics 8: 86-104. ISSN 2243-4712. Asociación Española de Lingüística de Corpus (AELINCO) DOI 10.32714/ricl.08.01.06
Begoña Altuna
Análisis de estructuras temporales en euskera y creación de un corpus (2020)
Procesamiento del Lenguaje Natural, no. 64, March 2020, pp. 131-134 URL: http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6206 ISSN: 1989-7553
Elena Zotova, Rodrigo Agerri, Manuel Nuñez and German Rigau
Multilingual Stance Detection in Tweets: The Catalonia Independence Corpus (2020)
Language Resources and Evaluation Conference (LREC 2020)
Uxoa Inurrieta, Itziar Aduriz, Arantza Díaz de Ilarraza, Gorka Labaka, Kepa Sarasola
Learning about phraseology from corpora: A linguistically motivated approach for Multiword Expression identification. (2020)
Inurrieta U, Aduriz I, Díaz de Ilarraza A, Labaka G, Sarasola K (2020) Learning about phraseology from corpora: A linguistically motivated approach for Multiword Expression identification. PLoS ONE 15(8): e0237767. https://doi.org/10.1371/journal.pone.0237767
Ainara Estarrona, Izaskun Etxeberria, Ricardo Etxepare, Manuel Padilla-Moyano, Ander Soraluze
Sintaktikoki etiketatutako euskarazko corpus historikoa eraikitzen (2020)
Fontes Linguae Vasconum 50 urte. Ekarpen berriak euskararen ikerketari. Nuevas aportaciones al estudio de la lengua vasca
Ainara Estarrona, Izaskun Etxeberria, Ricardo Etxepare, Manuel Padilla-Moyano, Ander Soraluze
Dealing with dialectal variation in the construction of the Basque historical corpus (2020)
Proceedings of the 7th Workshop on NLP for similar languages, varieties and dialects (VarDial2020 at COLING 2020).
Gorka Urbizu, Ander Soraluze, Olatz Arregi
Sequence to Sequence Coreference Resolution (2020)
Proceedings of the 3rd Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2020), pages 39–46, Barcelona, Spain (online), December 12, 2020.
Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark Cieliebak, Eneko Agirre
DoQA - Accessing Domain-Specific FAQs via Conversational QA (2020)
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7302–7314
Itziar Gonzalez-Dios
Data statement of the Corpus of Basque Simplified Texts (2020)
Data Statements workshop
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Manuela Speranza, Roberto Zanoli
The E3C Project: Collection and Annotation of a Multilingual Corpus of Clinical Cases (2020)
In Johanna Monti, Felice Dell'Orletta and Fabio Tamburini (eds.), Proceedings of the Seventh Italian Conference on Computational Linguistics. Associazione Italiana di Linguistica Computazionale. Bologna, Italy, 2020.
Itziar Aldabe, Josu Aztiria, Francho Beltrán, Myriam Bras, Klara Ceberio, Itziar Cortes, Jean-Baptiste Coyos, Benaset Dazeas, Louise Esher, Gorka Labaka, Igor Leturia, Kepa Sarasola, Aure Séguier, Jean Sibille
LINGUATEC: Development of cross-border cooperation and knowledge transfer in language technologies (2020)
Workshop "INTELE : INfraestructura de TEcnologías del LEnguaje" CLARIN DARIAH-EU. http://ixa2.si.ehu.eus/intele/?q=node/71
Kepa Sarasola, Itziar Aldabe, Arantza Diaz de Ilarraza, Ainara Estarrona, Aritz Farwell, Inma Hernaez, Eva Navas; Reviewers: Annika Grützner-Zahn, Maria Giagkou; Editors: Maria Giagkou, Stelios Piperidis, Georg Rehm, Jane Dunne
Report on the Basque Language. European Language Equality (2020)
Deliverables of the Project ELE (European Language Equality). D1.4 Report on the Basque Language, https://european-language-equality.eu/deliverables/
Jon Alkorta, Koldo Gojenola, Mikel Iruskieta
SentiTegi: building a semantic oriented Basque lexicon (2019)
Computación y Sistemas, 22 (4)
Igone Zabala
The elaboration of Basque in academic and professional domains. (2019)
In Grenoble, Lenore; Lane, Pia & Røyneland, Unn (eds.) Linguistic Minorities in Europe Online. De Gruyter Mouton. ISSN 2510-5361
Aitziber Atutxa, Kepa Bengoetxea, Arantza Diaz de Ilarraza, Mikel Iruskieta
Towards a top-down approach for an automatic discourse analysis for Basque: Segmentation and Central Unit detection tool (2019)
PLoS ONE 14(9): e0221639
Ander Soraluze, Olatz Arregi, Xabier Arregi, Arantza Diaz de Ilarraza
EUSKOR: End-to-end coreference resolution system for Basque (2019)
PLoS ONE 14(9): e0221801. https://doi.org/10.1371/journal.pone.0221801
Ainara Estarrona, Izaskun Etxeberria, Ander Soraluze, Manuel Padilla-Moyano
Spelling Normalisation of Basque Historical Texts (2019)
Procesamiento del Lenguaje Natural, vol. 63, pp. 59-66
Javier Álvez, Itziar Gonzalez-Dios, German Rigau
Commonsense Reasoning Using WordNet and SUMO: a Detailed Analysis (2019)
Proceedings of the Tenth Global Wordnet Conference, pp 197--205. ISBN 978-83-7493-108-3
Itziar Gonzalez-Dios, German Rigau
Textual genre based approach to use wordnets in language-for-specific-purpose classroom as dictionary (2019)
Proceedings of the Tenth Global Wordnet Conference, pp 222--227. ISBN 978-83-7493-108-3
Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark Cieliebak, Eneko Agirre
Conversational QA for FAQs (2019)
NeurIPS 3rd Conversational AI Workshop: “Today's Practice and Tomorrow's Potential”
Meghan Dowling, Kepa Sarasola, Ana Zelaia, Aitzol Astigarraga
Looking for possible new articles. What Wikipedia pages are often consulted in English... but there are not defined in Gaelic? (2019)
Meghan Dowling, Kepa Sarasola, Ana Zelaia, Aitzol Astigarraga (2019) 'Looking for possible new articles. What Wikipedia pages are often consulted in English... but there are not defined in Gaelic?' Wikimedia+Education Conference, Donostia 2019
Begoña Altuna, Maria Jesús Aranzabe, Arantza Diaz de Ilarraza
Adapting TimeML to Basque: Event Annotation (2018)
In Gelbukh A. (eds.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science (LNCS, vol 9624), 565-577. Springer, Cham. DOI https://doi.org/10.1007/978-3-319-75487-1_43 ; Print ISBN 978-3-319-75486-4; Online ISBN 978-3-319-75487-1
Uxoa Iñurrieta, Itziar Aduriz, Arantza Díaz de Ilarraza, Gorka Labaka, Kepa Sarasola
Konbitzul: an MWE-specific Database for Spanish-Basque (2018)
Proceedings of the 11th Language Resources and Evaluation Conference, Miyazaki, Japan, pages 2500-2504.
Uxoa Iñurrieta, Itziar Aduriz, Ainara Estarrona, Itziar Gonzalez-Dios, Antton Gurrutxaga, Ruben Urizar, Iñaki Alegria
Verbal Multiword Expressions in Basque corpora (2018)
In the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (at COLING 2018)
Igone Zabala
Euskararen lantze funtzionala eta profesionalen komunikazio-gaitasunen garapena osasun-alorrean (2018)
BAT Soziolinguistika Aldizkaria 108, 2018 (3): 11-34
Igone Zabala
Euskararen terminologiaren garapena Terminologiaren Teoria Komunikatiboaren argitan (2018)
In Ruben Urizar and Itziar Aduriz (eds.) Hizkuntzalari Euskaldunen III Topaketa. Zer berri?. 349-358.
Klara Ceberio, Itziar Aduriz, Arantza Díaz de Ilarraza and Ines Garzia-Azkoaga
Coreferential Relations in Basque: The Annotation Process (2018)
J Psycholinguist Res (2018) 47, Issue 2. Pages 325-342. https://doi.org/10.1007/s10936-018-9559-6. ISSN 0090-6905. Online ISSN 1573-6555.
Izaskun Aldezabal, Xabier Artola, Arantza Diaz De Ilarraza, Itziar Gonzalez-Dios, Gorka Labaka, German Rigau and Ruben Urizar
Basque e-lexicographic resources: linguistic basis, development, and future perspectives (2018)
Workshop on eLexicography: Between Digital Humanities and Artificial Intelligence. https://lexdhai.insight-centre.org/Lex_DH__AI_2018_paper_5.pdf
Itziar Aduriz, María Jesús Aranzabe, José María Arriola, Arantza Díaz de Ilarraza, Itziar Gonzalez-Dios, Ruben Urizar
Building the Gold Standard for the Surface Syntax of Basque (2017)
Procesamiento del Lenguaje Natural, 58, 125-132. Retrieved from http://ixa.si.ehu.es/sites/default/files/dokumentuak/8825/5421-4766-1-PB.pdf (print ISSN: 1135-5948) (online ISSN: 1989-7553)
Itziar Aduriz, Iñaki Alegria, Olatz Arregi, Arantza Diaz de Ilarraza, Kepa Sarasola
Hizkuntza-teknologia “Datu Handien” garaian: programa bilatzaileak, itzultzaileak… (2017)
Senez, 48, pp. 191-200. ISSN: 1132-2152. 2017 https://eizie.eus/eu/argitalpenak/senez/20171102/aurkezpena/datuhandiak
Zabala I., San Martin I., Lersundi M.
Learning terminology in order to become an active agent in the development of Basque biomedical registers (2016)
Language Learning in Higher Education. Journal of CercleS (European Confederation of Language Centres in Higher Education). De Gruyter Mouton. Volume 6, Issue 1 (May 2016). Special issue: Teaching Medical Discourse in Higher Education. ISSN (Online) 2191-6128, ISSN (Print) 2191-611X, DOI: 10.1515/cercles-2016-0007 URL: http://www.degruyter.com/view/j/cercles.2016.6.issue-1/cercles-2016-0007/cercles-2016-0007.xml
Arantxa Otegi, Nora Aranberri, António Branco, Jan Hajic, Steven Neale, Petya Osenova, Rita Pereira, Martin Popel, Joao Silva, Kiril Simov, Eneko Agirre
QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages (2016)
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), European Language Resources Association (ELRA). ISBN 978-2-9517408-9-1
Estarrona A., Aldezabal I., Díaz de Ilarraza A. and Aranzabe M.J.
A Methodology for the Semiautomatic Annotation of EPEC-RolSem, a Basque Corpus Labeled at Predicate Level following the PropBank/Verbnet Model (2016)
Edward Vanhoutte (ed.) Digital Scholarship in the Humanities (2016) 31 (3): 470-492. DOI: http://dx.doi.org/10.1093/llc/fqv001 First published online: 17 June 2015 (23 pages). Published by Oxford University Press on behalf of EADH: The European Association for Digital Humanities (Online ISSN 2055-768X - Print ISSN 2055-7671)
A. Minard, M. Speranza, R. Urizar, B. Altuna, M. van Erp, A. Schoen, and C. van Son
MEANTIME, the NewsReader Multilingual Event and Time Corpus (2016)
Proceedings of LREC 2016. Pages: 4417-4422. ISBN: 978-2-9517408-9-1
Zabala I., Lersundi M., Martínez M., Requero M.A., Omaetxebarria M.J.
Biokimikaren terminologiaren deskripzioa: erabilera errealetik hiztegietara. (2015)
Beatriz Fernandez and Pello Salaburu (eds.) Ibon Sarasola, Gorazarre, Homenatge, Homenaje. UPV/EHUko Argitalpen Zerbitzua. Bilbo: 679-692
Maria Jesús Aranzabe, Aitziber Atutxa, Kepa Bengoetxea, Arantza Diaz de Ilarraza, Iakes Goenaga, Koldo Gojenola, Larraitz Uria
Automatic Conversion of the Basque Dependency Treebank to Universal Dependencies (2015)
Markus Dickinson, Erhard Hinrichs, Agnieszka Patejuk, Adam Przepiórkowski (eds), Proceedings of the Fourteenth International Workshop on Treebanks and Linguistic Theories (TLT14), 233-241. Institute of Computer Science of the Polish Academy of Sciences, Warszawa, Poland. ISBN: 978-83-63159-18-4
Zabala I., San Martin I., Lersundi M.
Linguistic and sociolinguistic factors that influence the detection, implantation and circulation of natural terminology in academic uses of Basque (2014)
Pascaline Dury; José Carlos de Hoyos; Julie Makri-Morel; François Maniez; Vincent Renner; María Belén Villar Díaz (eds.) La néologie en langue de spécialité. Détection, implantation et circulation des nouveaux termes. Travaux du CRTT (Centre de Recherche en Terminologie et Traduction, Université Lumière Lyon 2): 141-164. ISBN: 978-2-9533061-0-1
Larraitz Uria, Montserrat Maritxalar, Igone Zabala
An Environment for Learner Corpus Research and Error Analysis: The Study of Determiner Errors in Basque (2014)
International Journal of Computer-Assisted Language Learning and Teaching (IJCALLT), 4 (3), 34-51, July-September 2014, edited by Bin Zou (ISSN 2155-7098, eISSN 2155-7101).
Igor Leturia, Kepa Sarasola, Xabier Arregi, Arantza Diaz de Ilarraza, Eva Navas, Iñaki Sainz, Arantza del Pozo, David Baranda, Urtza Iturraspe
BerbaTek: euskararako hizkuntza teknologien garapena itzulpengintza, edukien kudeaketa eta irakaskuntza arloetan (2013)
Euskalingua aldizkari digitala, 23, 66-76. http://mendebalde.eus/euskalinguak/Euskalingua%2023/Berbatek:%20euskararako%20hizkuntza%20teknologien%20garapena%20itzulpengintza,%20edukien%20kudeaketa%20eta%20irakaskuntza%20arloetan.pdf
Iruskieta M., Aranzabe M., Diaz de Ilarraza A., Gonzalez I., Lersundi I., Lopez de Lacalle O.
The RST Basque TreeBank: an online search interface to check rhetorical relations (2013)
4th Workshop RST and Discourse Studies, 40-49, Sociedade Brasileira de Computação, Fortaleza, CE, Brazil. October 20-24 (http://encontrorst2013.wix.com/encontro-rst-2013)
Zabala I., San Martin I., Lersundi M., Azkue J. J., Mendizabal J.L.
The Elaboration of Human Anatomy Terminology for the Basque Language: the Contribution of Translators, Linguists and Experts (2012)
Terminàlia Vol. 6: 15-25
Zabala I., San Martin I.
Basque and Romance Languages: Languages with Different Structures (2012)
Pello Salaburu & Xabier Alberdi (eds.) The Basque Country, a Bilingual Society. Center for Basque Studies, University of Nevada: 51-72
Pociello E., Agirre E. and Aldezabal I.
Methodology and construction of the Basque WordNet (2011)
Language Resources and Evaluation. Springer. Volume 45, Issue 2, pp 121-142. ISSN 1574-020X. DOI 10.1007/s10579-010-9131-y.
Iñaki Alegria, Maria Jesús Aranzabe, Xabier Arregi, Xabier Artola, Arantza Diaz de Ilarraza, Aingeru Mayor, Kepa Sarasola
Valuable Language Resources and Applications Supporting the Use of Basque (2011)
Z. Vetulani (Ed.): LTC 2009, Lecture Notes in Artificial Intelligence LNAI 6562, pp. 327-338. Springer, Heidelberg. ISBN: 978-3-642-20094-6, DOI: 10.1007/978-3-642-20095-3, https://link.springer.com/content/pdf/10.1007%2F978-3-642-20095-3_30.pdf
Zabala I., San Martin I., Lersundi M., Elordui A.
Graduate Teaching of Specialized Registers in a Language in the Normalization Process: Towards a Comprehensive and Interdisciplinary Treatment of Academic Basque (2011)
Sergio Maruenda-Bataller and Begoña Clavel-Arroitia (eds.) Multiple Voices in Academic and Professional Discourse: Current Issues in Specialised Language Research, Teaching and New Technologies. Cambridge Scholars Publishing: 208-229. ISBN (10): 1-4438-2971-4; ISBN (13): 978-1-4438-2971-7
Izaskun Aldezabal, Maria Jesús Aranzabe, Arantza Diaz de Ilarraza, Ainara Estarrona, Larraitz Uria
EusPropBank: Integrating Semantic Information in the Basque Dependency Treebank (2010)
Lecture Notes in Computer Science (LNCS) nº 6008, Alexander Gelbukh (Ed.), Computational Linguistics and Intelligent Text Processing. pp. 60-73, Springer Berlin Heidelberg New York. ISSN: 0302-9743, ISBN-10: 3-642-12115-2, ISBN-13: 978-3-642-12115-9. 11th International Conference, CICLing 2010, Iasi, Romania, March 21-27, 2010
Izaskun Aldezabal, Maria Jesús Aranzabe, Arantza Diaz de Ilarraza, Ainara Estarrona
Building the Basque PropBank (2010)
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner and Daniel Tapias (eds.), Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC 2010), pp. 1414-1417, European Language Resources Association (ELRA), ISBN: 2-9517408-6-7. LREC 2010, Valletta (Malta), May 19-21, 2010
Uria L., Estarrona A., Aldezabal I., Aranzabe M., Díaz de Ilarraza A., Iruskieta M.
Evaluation of the Syntactic Annotation in EPEC, the Reference Corpus for the Processing of Basque (2009)
Lecture Notes in Computer Science (LNCS) nº 5449, Alexander Gelbukh (Ed.), Computational Linguistics and Intelligent Text Processing. pp. 72-85. Springer. ISSN: 0302-9743, ISBN-10: 3-642-00381-8, ISBN-13: 978-3-642-00381-3. 10th International Conference, CICLing 2009, Mexico City, Mexico, March 1-7, 2009
Izaskun Aldezabal, Maria Jesús Aranzabe, Jose Maria Arriola, Arantza Diaz de Ilarraza
Syntactic annotation in the Reference Corpus for the Processing of Basque (EPEC): Theoretical and practical issues (2009)
Corpus Linguistics and Linguistic Theory 5-2 (2009), 241-269. Mouton de Gruyter. Berlin-New York. Print ISSN: 1613-7027 Online ISSN: 1613-7035
Elosegi A., Aldezabal I., Garcia J.
Euskarazko produktu linguistikoen azterketa (2009)
In I. Zabala (ed.) GARATERM: diskurtso akademiko-profesionalaren didaktika eta garapena uztartzeko tresna informatikoen diseinua eta integrazioa. Electronic publication of the UPV/EHU Publishing Service (in press). http://garaterm.ehu.es/garaterm_ataria/publications/
Elosegi A., Aldezabal I., Garcia J.
Euskararen garapen lexiko-diskurtsiboan eragina duten produktu linguistiko nagusiei buruzko informazioa biltzen duen datu-basea (2009)
GARATERM: diskurtso akademiko-profesionalaren didaktika eta garapena uztartzeko tresna informatikoen diseinua eta integrazioa. Electronic publication of the UPV/EHU Publishing Service (in press). http://garaterm.ehu.es/garaterm_ataria/publications/
Elordui A., Zabala I.
Euskara Batuaren garapen lexiko-diskurtsiboa: batasunetik aniztasun funtzionalera (2009)
Ricardo Etxepare, Ricardo Gomez & Joseba Lakarra (eds.) A Festschrift for Bernard Oyharçabal. Bilbao: Université du Pays Basque. Supplements of the International Journal of Basque Language and Linguistics: 231-246. ISSN: 0582-6152
Zabala I., Aierbe A., Aldezabal I., Aranzabe M., Arregi X., Arriola J.M., Elordui A., Elosegi A., Elosegi K., Ezeiza J., Garcia I., Garcia J., Lersundi M., San Martin I. and Ugarteburu I.
GARATERM: Diskurtso akademiko-profesionalaren didaktika eta garapena uztartzeko tresna informatikoen diseinua eta integrazioa helburu duen proiektua (2008)
In Iñaki Ugarteburu and Pello Salaburu (eds.), Espezialitate hizkerak eta terminologia III: espezialitate hizkeren didaktika eta komunikazioa, 211-219, UPV/EHU Publishing Service. Bilbo (Bizkaia). ISBN: 978-84-691-6424-2
Izaskun Aldezabal, Klara Ceberio, Itsaso Esparza, Ainara Estarrona, Jone Etxeberria, Elixabete Izagirre, Mikel Iruskieta, Larraitz Uria
EPEC (Euskararen Prozesamendurako Erreferentzia Corpusa) segmentazio-mailan etiketatzeko eskuliburua (2007)
UPV/EHU / LSI / TR 11-2007
Itziar Aduriz, Maria Jesús Aranzabe, Jose Maria Arriola, Aitziber Atutxa, Arantza Diaz de Ilarraza, Nerea Ezeiza, Koldo Gojenola, Maite Oronoz, Aitor Soroa, Ruben Urizar
Methodology and steps towards the construction of EPEC, a corpus of written Basque tagged at morphological and syntactic levels for the automatic processing (2006)
Corpus Linguistics Around the World. Book series: Language and Computers. Vol. 56 (pp. 1-15). ISBN 90-420-1836-4. Ed. Andrew Wilson, Paul Rayson, and Dawn Archer. Rodopi. Netherlands.
Eneko Agirre, Izaskun Aldezabal, Jone Etxeberria, Mikel Iruskieta, Elixabete Izagirre, Karmele Mendizabal, Eli Pociello
Improving the Basque WordNet by corpus annotation (2006)
Proceedings of the Third International WordNet Conference. pp. 287-290. ISBN 80-210-3915-9. Jeju Island (Korea).
Izaskun Aldezabal
Department of Basque Philology (Euskal Filologia Saila). Faculty of Science. Leioa. UPV/EHU. April 2004.
Aldezabal I., Goenaga P.
Analyzing Verbal Subcategorization Aimed at its Computational Application (2003)
Inquiries into the lexicon-syntax relations in Basque. Special volume of the Anuario del Seminario de Filología Vasca Julio de Urquijo (ASJU), XLVI: 95-126. Bernard Oiharçabal (ed.), University of the Basque Country.