Ship-lemmatagger: Building an NLP toolkit for a Peruvian native language

Publisher

Springer Verlag

Full-text access is restricted to the PUCP community

Abstract

Natural Language Processing (NLP) deals with understanding and generating text through computer programs. The area comprises many functionalities, but a few of them underpin the rest: core processing of the language's morphology (such as lemmatization) and automatic part-of-speech tagging. This paper describes the implementation of a basic NLP toolkit for a new language, focusing on these two features and testing them on a corpus built for the purpose. The results exceeded expectations and could support more complex tasks such as machine translation.

Keywords

Lemmatisation, Computer science, Machine translation, Natural language processing, Artificial intelligence, Identification (biology)
