Information Extraction and Information Retrieval

The ever-increasing availability of unstructured textual resources on the Web, and their potential use in applications for the automatic acquisition of knowledge, have caused a dramatic rise in research related to Information Extraction (IE) and Information Retrieval (IR). Traditionally, the required textual content was produced through manual annotation by human experts on the task at hand, which is too costly in terms of both economic and human resources. In the last decade, new t...

Demos

Demo of the NewsReader NLP pipeline

Paste in any English text to see which entities, events, and other annotations are added automatically. The result is represented in the NAF format.

Demo of the NewsReader NLP pipeline

Paste in any Spanish text to see which entities and other annotations are added automatically. The result is represented in the NAF format.

Eihera

Basque named entity recognizer/classifier

Eustagger

Basque lemmatizer and morphosyntactic analyzer

Contracts

Projects

Patents

EUSLEM

EUSLEM: lemmatizer for Basque

UKB

Word sense disambiguation and similarity.

KYBOT

Knowledge Yielding Robot

Resources

  • EIEC
    Basque Named Entity Recognition corpus.
  • EDIEC
    Basque corpus annotated for Named Entity Disambiguation.
  • MCR: Multilingual Central Repository
    Multilingual lexical database with wordnets for several European languages, including Basque.
  • EPEC-EuSemcor
    Corpus tagged with Basque WordNet senses.

Publications

Cristina Aceta, Johan Kildal, Izaskun Fernández, Aitor Soroa

Towards an optimal design of natural human interaction mechanisms for a service robot with ancillary way-finding capabilities in industrial environments (2021)

Production & Manufacturing Research, 9:1, 1-32

Ainhoa Serna, Aitor Soroa, Rodrigo Agerri

Applying Deep Learning Techniques for Sentiment Analysis to Assess Sustainable Transport (2021)

Sustainability 13, no. 4: 2397.

Aitzol Elu, Gorka Azkune, Oier Lopez de Lacalle, Ignacio Arganda-Carreras, Aitor Soroa, Eneko Agirre

Inferring spatial relations from textual descriptions of images (2021)

Pattern Recognition, Volume 113, 107847. Pre-print: https://arxiv.org/abs/2102.00997

Eneko Agirre, Marianna Apidianaki, Ivan Vulić (Editors)

Proceedings of Deep Learning Inside Out (DeeLIO): The 2nd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (2021)

In conjunction with NAACL. Association for Computational Linguistics

Elena Zotova, Rodrigo Agerri, German Rigau

Semi-automatic generation of multilingual datasets for stance detection in Twitter (2021)

Expert Systems with Applications, 170 (2021).

Jon Alkorta

Hacia el análisis de sentimientos en euskera (2021)

Procesamiento del Lenguaje Natural, 66, 201-204.

Joseba Fernandez de Landa, Iker García, Ander Salaberria, Jon Ander Campos

Twitterreko Euskal Komunitatearen Eduki Azterketa Pandemia Garaian (2021)

IV. Ikergazte. Nazioarteko ikerketa euskaraz. Kongresuko artikulu bilduma. Ingeniaritza eta Arkitektura

Ander Barrena, Aitor Soroa, Eneko Agirre

Towards Zero-Shot Cross-Lingual Named Entity Disambiguation (2021)

Expert Systems With Applications ESWA 2021

Eneko Agirre

Cross-Lingual Word Embeddings (Book Review) (2020)

Computational Linguistics 46 (1), 245-248. (https://doi.org/10.1162/COLI_r_00372)

Oier Lopez de Lacalle, Ander Salaberria, Aitor Soroa, Gorka Azkune and Eneko Agirre

Evaluating Multimodal Representations on Visual Semantic Textual Similarity (2020)

Proceedings of the Twenty-third European Conference on Artificial Intelligence, ECAI 2020, June 8-12, 2020, Santiago de Compostela, Spain

Oscar Sainz, Oier Lopez de Lacalle, Itziar Aldabe, Montse Maritxalar

Domain Adapted Distant Supervision for Pedagogically Motivated Relation Extraction (2020)

Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). Marseille, France

Javier Álvez, Itziar Gonzalez-Dios, German Rigau

Towards Word Sense Disambiguation by Reasoning (2020)

Vampire 2018 and Vampire 2019. The 5th and 6th Vampire Workshops. EPiC Series in Computing. Pages 19-29. ISSN: 2398-7340

Rodrigo Agerri, Iñaki San Vicente, Jon Ander Campos, Ander Barrena, Xabier Saralegi, Aitor Soroa, Eneko Agirre

Give your Text Representation Models some Love: the Case for Basque (2020)

Proceedings of LREC. Also available at arxiv https://arxiv.org/pdf/2004.00033.pdf

Begoña Altuna, María Jesús Aranzabe, Arantza Díaz de Ilarraza

EusTimeML: A mark-up language for temporal information in Basque (2020)

Research in Corpus Linguistics 8: 86-104. ISSN 2243-4712. Asociación Española de Lingüística de Corpus (AELINCO) DOI 10.32714/ricl.08.01.06

Rodrigo Agerri, German Rigau

Language independent sequence labelling for Opinion Target Extraction (2020)

International Joint Conference on Artificial Intelligence (IJCAI 2020)

Elena Zotova, Rodrigo Agerri, Manuel Nuñez and German Rigau

Multilingual Stance Detection in Tweets: The Catalonia Independence Corpus (2020)

Language Resources and Evaluation Conference (LREC 2020)

Javier Álvez, Itziar Gonzalez-Dios, German Rigau

Applying the Closed World Assumption to SUMO-based FOL Ontologies for Effective Commonsense Reasoning (2020)

Frontiers in Artificial Intelligence and Applications. Giuseppe De Giacomo, Alejandro Catala, Bistra Dilkina, Michela Milano, Senén Barro, Alberto Bugarín, Jérôme Lang (eds.). Volume 325: ECAI 2020. Pages 585 - 592. IOS Press Ebooks

Juan J. Lastra-Díaz, Josu Goikoetxea, Mohamed Ali Hadj Taieb, Ana Garcia-Serrano, Mohamed Ben Aouicha, Eneko Agirre, David Sánchez

A large reproducible benchmark of ontology-based methods and word embeddings for word similarity (2020)

Information Systems. Online first.

Iker de la Iglesia, Mikel Martinez-Puente, Alexander Platas, Iria San Miguel, Aitziber Atutxa, Koldo Gojenola

MEDIA team at the CLEF-2020 Multilingual Information Extraction Task (2020)

Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum Thessaloniki, Greece, September 22-25, 2020.

Eneko Agirre, Marianna Apidianaki, Ivan Vulić (Editors)

Proceedings of Deep Learning Inside Out (DeeLIO): The First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (2020)

In conjunction with EMNLP. Association for Computational Linguistics

Rodrigo Agerri, German Rigau

Projecting Heterogeneous Annotations for Named Entity Recognition (2020)

In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020). Winner of the CAPITEL@IberLEF task on Spanish NER.

María Espinosa, Rodrigo Agerri, Roberto Centeno, Alvaro Rodrigo

DeepReading@SardiStance: Combining Textual, Social and Emotional Features (2020)

Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020). Winners of the SardiStance@Evalita 2020 shared task

Rodrigo Agerri, German Rigau

Language independent sequence labelling for Opinion Target Extraction (2019)

Artificial Intelligence, 268 (2019) 85-95

Iñigo Lopez-Gazpio, Montse Maritxalar, Mirella Lapata, Eneko Agirre

Word n-gram attention models for sentence similarity and inference (2019)

Expert Systems with Applications. Volume 132, 15 October 2019, Pages 1-11. https://doi.org/10.1016/j.eswa.2019.04.054.

Aitor Ormazabal, Mikel Artetxe, Gorka Labaka, Aitor Soroa and Eneko Agirre

Analyzing the Limitations of Cross-lingual Word Embedding Mappings (2019)

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4990-4995.

Juan J. Lastra-Díaz, Josu Goikoetxea, Mohamed Ali Hadj Taieb, Ana García-Serrano, Mohamed Ben Aouicha, Eneko Agirre

Reproducibility dataset for a large experimental survey on word embeddings and ontology-based methods for word similarity (2019)

Data in Brief, Volume 26.

Juan J. Lastra-Díaz, Josu Goikoetxea, Mohamed Ali Hadj Taieb, Ana García-Serrano, Mohamed Ben Aouicha, Eneko Agirre

A reproducible survey on word embeddings and ontology-based methods for word similarity: linear combinations outperform the state of the art (2019)

Engineering Applications of Artificial Intelligence. Volume 85, October 2019, Pages 645-665.

Andrea Amelio Ravelli, Oier Lopez de Lacalle, Eneko Agirre

A comparison of representation models in a non-conventional semantic similarity scenario (2019)

Proceedings of the Sixth Italian Conference on Computational Linguistics, Bari, Italy.

Rodrigo Agerri

Doris Martin at SemEval-2019 Task 4: Hyperpartisan News Detection with Generic Semi-supervised Features (2019)

SemEval@NAACL-HLT 2019: 944-948 https://www.aclweb.org/anthology/S19-2161.pdf

Joseba Fernandez de Landa, Rodrigo Agerri, Iñaki Alegria

Euskaldun gazte eta helduen harremanak Twitterren (2019)

III. Ikergazte. Nazioarteko ikerketa euskaraz. Kongresuko artikulu bilduma. Gizarte Zientziak eta Zuzenbidea. 2, pp. 83 - 90

Itziar Gonzalez-Dios, Javier Alvez, and German Rigau

Exploiting Metonymy from Available Knowledge Resources (2019)

20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2019). (TO APPEAR in LNCS)

Mark Stevenson, Eneko Agirre

Word Sense Disambiguation (2018)

The Oxford Handbook of Computational Linguistics, 2nd edition. Edited by Ruslan Mitkov. Oxford. ISBN: 9780199573691. DOI of the chapter: 10.1093/oxfordhb/9780199573691.013.28

Josu Goikoetxea, Aitor Soroa and Eneko Agirre

Knowledge-Based Systems (KNOSYS). Volume 150, 15 June 2018, Pages 218-230. ISSN: 0950-7051. DOI https://doi.org/10.1016/j.knosys.2018.03.017 Preprint at https://arxiv.org/pdf/1804.08316.pdf

Rodrigo Agerri, Yiling Chung, Itziar Aldabe, Nora Aranberri, Gorka Labaka, German Rigau

Building Named Entity Recognition Taggers via Parallel Corpora (2018)

In Proceedings of the 11th Language Resources and Evaluation Conference (LREC 2018), 7-12 May, 2018, Miyazaki, Japan.

Ander Barrena, Aitor Soroa, Eneko Agirre

Learning text representations for 500K classification tasks on Named Entity Disambiguation (2018)

The SIGNLL Conference on Computational Natural Language Learning CONLL 2018

Rodrigo Agerri, German Rigau

Simple Language Independent Sequence Labelling for the Annotation of Disabilities in Medical Texts (2018)

Proceedings of the Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018), Diann Track, Sevilla, Spain.

Egoitz Laparra, Rodrigo Agerri, Itziar Aldabe, German Rigau

Multi-lingual and Cross-lingual timeline extraction (2017)

Knowledge-Based Systems, 133, 77-89

Itziar Aduriz, Iñaki Alegria, Olatz Arregi, Arantza Diaz de Ilarraza, Kepa Sarasola

Hizkuntza-teknologia “Datu Handien” garaian: programa bilatzaileak, itzultzaileak… (2017)

Senez, 48, pp. 191-200. ISSN: 1132-2152. 2017 https://eizie.eus/eu/argitalpenak/senez/20171102/aurkezpena/datuhandiak

Goikoetxea J., Agirre E., Soroa A.

Single or Multiple. Combining Word Representations Independently Learned from Text and WordNet (2016)

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. pp. 2608-2614. ISBN: 978-1-57735-760-5. Phoenix (USA).

Rodrigo Agerri, German Rigau

Robust Multilingual Named Entity Recognition with Shallow Semi-supervised Features (2016)

Artificial Intelligence, 238 (2016) pages 63-82. http://dx.doi.org/10.1016/j.artint.2016.05.003

Ander Intxaurrondo, Eneko Agirre, Oier Lopez de Lacalle, Mihai Surdeanu

Diamonds in the Rough: Event Extraction from Imperfect Microblog Data (2015)

Proceedings of the North American chapter of the Association for Computational Linguistics (NAACL HLT), pages: 641-650. ISBN: 978-1-941643-49-5.

Goikoetxea J., Agirre E., Soroa A.

Random Walks and Neural Network Language Models on Knowledge Bases (2015)

Proceedings of the Annual Meeting of the North American chapter of the Association of Computational Linguistics (NAACL HLT 2015), pages 1434-1439. ISBN: 978-1-937284-73-2. Denver (USA).

Igor Leturia, Kepa Sarasola, Xabier Arregi, Arantza Diaz de Ilarraza, Eva Navas, Iñaki Sainz, Arantza del Pozo, David Baranda, Urtza Iturraspe

BerbaTek: euskararako hizkuntza teknologien garapena itzulpengintza, edukien kudeaketa eta irakaskuntza arloetan (2013)

Euskalingua aldizkari digitala, 23, 66-76. http://mendebalde.eus/euskalinguak/Euskalingua%2023/Berbatek:%20euskararako%20hizkuntza%20teknologien%20garapena%20itzulpengintza,%20edukien%20kudeaketa%20eta%20irakaskuntza%20arloetan.pdf

Arantxa Otegi

Hedapena informazioaren berreskurapenean: hitzen adiera-desanbiguazioaren eta antzekotasun semantikoaren ekarpenak (2012)

Lengoaia eta Sistema Informatikoak Saila, EHU/UPV. Informatika Fakultatea. 2012/03/16

Iñaki Alegria, Bertol Arrieta, Arantza Diaz de Ilarraza, Elixabete Izagirre, Montse Maritxalar

Using Machine Learning Techniques to Build a Comma Checker for Basque (2006)

Proceedings of Coling-ACL 2006. Sydney, Australia. ISBN: 1-932432-69-8. pp. 1-8. https://aclanthology.org/P06-4000/

A. Casillas, V. Fresno, R. Martínez, S. Montalvo

Evaluación del clustering de páginas web mediante funciones de peso y combinación heurística de criterios (2005)

Revista Española para el Procesamiento del Lenguaje Natural, 35, 417-424. https://1library.co/document/yn4mkjpz-evaluacion-clustering-paginas-mediante-funciones-combinacion-heuristica-criterios.html

All HiTZ publications
