Here you will find a list of resources that I have created or to which I have made substantial contributions. I hope you find them useful for your work. If you use any of them, please cite them as indicated in their documentation and comply with their licenses.
📚 Data
Corpora
🔗 Project GitHub → CCNET-Galician
Evaluation and fine-tuning datasets
Machine translation gold standards:
Galician-English (test suite, gold1, gold2);
Galician-Spanish (test suite, gold1, gold2).
Fine-tuning suite containing reasoning, truthfulness, general knowledge, and mathematical tasks.
🕸️ Models
LLMs
Machine translation models
📲 Apps/demos