Results

All the models, code, and data that we have developed for Basque at the HiTZ Center are publicly available. These include the best open chatbots, speech recognition models, and speech synthesis models for Basque that may be used for commercial purposes.

They are organized into three sections by user profile.

🏠 For use at home and work

Latxa

A general-purpose chatbot in Basque is available for testing at:

Speech recognition and synthesis

Demos of speech recognition and speech synthesis systems:


🏢 For building innovative products in industry and public administration

Speech-related demos and APIs

Product demos that combine our technologies

ILENIA demonstrators

ILENIA project demonstrators


Open public models:

Latxa family

Models in several sizes: the 70B model performs best, while the 8B model is faster.

ASR - Speech recognition

The best open systems for automatic speech recognition in Basque

TTS - Speech synthesis

The only open speech synthesis model available for Basque, with multiple voices

Interested in API access? Contact: transfer.hitz@ehu.eus


🔬 For research and advanced development

In addition to the above, all our specialized models, code, and data can be found at the following public repositories:

📦 GitHub


Significant open datasets:

Latxa training data

Data used to build Latxa

ASR data

Data for building speech recognition systems

TTS data

Data for building speech synthesis systems

Scientific publications: www.hitz.eus/publications