About Naki

A list of research and engineering work on NLP for the native and indigenous languages of the Americas.

This page tries to assemble all the research on Natural Language Processing (NLP) for the native and indigenous languages of the American continent. Our languages are in danger, especially if they are left out of the new digital boom, which is reaching even the most remote communities. Although scientific and engineering work has been done in the field, much more is needed to achieve usable tools that can compete with products from the big companies (such as Google Translate, Alexa, etc.). To push this effort forward, this page aims to provide a list that is as complete as possible.

Our main aim is to encourage native speakers, researchers, and engineers to participate in this effort. We hope this survey helps to do so.

If you want more information, please read our paper: “Challenges of language technologies for the indigenous languages of the Americas”. We also invite you to have a look at our presentation.

Last Update: 22/November/2020

Table of Contents

  1. Machine Translation
  2. Automatic Lexical Extraction
  3. Morphological analysis and segmentation
  4. Corpus and digital resources
  5. Speech Recognition
  6. POS Tagging
  7. Parsing
  8. OCR
  9. Spell checking
  10. WordNet
  11. Language ID
  12. Code-Switching and Multilingual NLP
  13. Tools, documentation and education
  14. Computational Linguistic Analysis and Surveys
  15. Contact

Corpus and digital resources

Online Corpus Resources

Scientific papers

Machine Translation

Online demos and software

Scientific papers

Automatic Lexical Extraction

Scientific papers and dictionaries

Morphological analysis and segmentation


Scientific papers


POS Tagging



Spell checking


Language ID

Code-Switching and Multilingual NLP

Tools, documentation and education

Online available software


Computational Linguistic Analysis and Surveys


Contact

This list can only be completed with the cooperation of all visitors. If you know of any work in this field, please let me know: push to this repository, send an email to mmager [at] turing.iimas.unam.mx, or visit my personal web page.

How to cite

If you find this information useful for your academic research, please acknowledge its use with a citation:

Mager, M., Gutierrez-Vasques, X., Sierra, G., and Meza-Ruiz, I. (2018, August). Challenges of language technologies for the indigenous languages of the Americas. In Proceedings of the 27th International Conference on Computational Linguistics (pp. 55-69). Association for Computational Linguistics.

@InProceedings{C18-1006,
  author = 	"Mager, Manuel
		and Gutierrez-Vasques, Ximena
		and Sierra, Gerardo
		and Meza-Ruiz, Ivan",
  title = 	"Challenges of language technologies for the indigenous languages of the Americas",
  booktitle = 	"Proceedings of the 27th International Conference on Computational Linguistics",
  year = 	"2018",
  publisher = 	"Association for Computational Linguistics",
  pages = 	"55--69",
  location = 	"Santa Fe, New Mexico, USA",
  url = 	"http://aclweb.org/anthology/C18-1006"
}