Representation of Yine (Arawak) Morphology by Finite State Transducer Formalism

dc.contributor.affiliationPontificia Universidad Católica del Perú
dc.contributor.authorIngunza, A.M.
dc.contributor.authorMiller, J.E.
dc.contributor.authorOncevay, A.
dc.contributor.authorZariquiey, R.
dc.date.accessioned2026-03-13T17:00:18Z
dc.date.issued2021
dc.description.abstractWe represent the complexity of Yine (Arawak) morphology with a finite state transducer (FST) based morphological analyzer. Yine is a low-resource indigenous polysynthetic Peruvian language spoken by approximately 3,000 people and is classified as 'definitely endangered' by UNESCO. We review Yine morphology focusing on morphophonology, possessive constructions and verbal predicates. Then we develop FSTs to model these components proposing techniques to solve challenging problems such as complex patterns of incorporating open and closed category arguments. This is a work in progress and we still have more to do in the development and verification of our analyzer. Our analyzer will serve both as a tool to better document the Yine language and as a component of natural language processing (NLP) applications such as spell checking and correction.
dc.description.sponsorshipFunding: We are grateful to the bilingual teachers from NOPOKI Nimia Acho and Remigio Zapata. Similarly, we acknowledge the research grant of the “Consejo Nacional de Ciencia, Tecnología e Innovación Tecnológica” (CONCYTEC, Peru) under the contract 183-2018-FONDECYT-BM-IADT-MU.
dc.identifier.doihttps://doi.org/10.18653/v1/2021.americasnlp-1.11
dc.identifier.urihttp://hdl.handle.net/20.500.14657/206586
dc.language.isoeng
dc.publisherAssociation for Computational Linguistics (ACL)
dc.relation.conferencenameProceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas (2021)
dc.rightsinfo:eu-repo/semantics/closedAccess
dc.subjectComputer science
dc.subjectLinguistics
dc.subjectRepresentation (politics)
dc.subjectNatural language processing
dc.subjectAlgorithm
dc.subjectArtificial intelligence
dc.subject.ocdehttps://purl.org/pe-repo/ocde/ford#1.02.01
dc.titleRepresentation of Yine (Arawak) Morphology by Finite State Transducer Formalism
dc.typehttp://purl.org/coar/resource_type/c_5794
dc.type.otherComunicación de congreso
dc.type.versionhttps://vocabularies.coar-repositories.org/version_types/c_970fb48d4fbd8a85/
