research projects

Protágoras: Desarrollo de algoritmos de procesamiento de lenguaje natural para el desarrollo de un motor cognitivo.


Protágoras: Development of natural language processing algorithms for the development of a cognitive engine.

(2017 - 2018)

Indra-BPO provides services to other companies by managing their document back office, and it stores thousands of documents whose information has been extracted manually. This information extraction process is laborious and expensive. The main objective of Protágoras has been to partially automate this manual work.

The use case is the extraction of relevant information from notarial documents and from mortgage-related certificates issued by real estate registries: information about buyers, sellers, properties, the notary and so on. At present, Indra’s employees read the documents, identify the information by hand and enter it into an application.

In our opinion, the system developed under this contract can be used in real applications. Recall and precision of 80% were obtained for the critical fields, with an F-score of 75% in the extraction. For fields containing names and dates, the system achieves a precision of 90%.
Organization:  Indra Sistemas S.A.
Main researcher: Maite Oronoz
Participants: Eneko Agirre, Arantza Díaz de Ilarraza, Maite Oronoz, German Rigau, Aitor Soroa

