Recent Submissions

  • Revisiting Syllables in Language Modelling and their Application on Low-Resource Machine Translation
    (Association for Computational Linguistics (ACL), 2022) Oncevay, A.; Rojas, K.D.R.; Sanchez, L.K.C.; Zariquiey, R.; Pontificia Universidad Católica del Perú
    Language modelling and machine translation tasks mostly use subword or character inputs, but syllables are seldom used. Syllables provide shorter sequences than characters, require less specialised extraction rules than morphemes, and their segmentation is not affected by corpus size. In this study, we first explore the potential of syllables for open-vocabulary language modelling in 21 languages. We use rule-based syllabification methods for six languages and address the rest with hyphenation, which works as a syllabification proxy. With a comparable perplexity, we show that syllables outperform characters and other subwords. Moreover, we study the importance of syllables in neural machine translation for an unrelated and low-resource language pair (Spanish–Shipibo-Konibo). In pairwise and multilingual systems, syllables outperform unsupervised subwords, as well as further morphological segmentation methods, when translating into a highly synthetic language with a transparent orthography (Shipibo-Konibo). Finally, we perform some human evaluation, and discuss limitations and opportunities.
  • Structural damage detection on a full-scale masonry cross-vault subjected to quasi-static cyclic loading tests
    (International Operational Modal Analysis Conference (IOMAC), 2025) Bendezu, A.; Pellegrini, D.; Chácara, C.; Pontificia Universidad Católica del Perú. Departamento de Ingeniería
    This paper presents the results of an experimental campaign for structural damage detection on a full-scale alternative-masonry cross-vault subjected to quasi-static cyclic tests. The masonry was composed of stabilised compressed earth blocks and soil-cement mortar. The cross-vault specimen had a square plan with a span of approximately 3.20 m. Its boundary conditions consisted of two fixed corners that restrained displacements and rotations in all directions, and two corners that were placed over four-wheeled steel masses that enabled horizontal displacements. The masonry cross-vault was subjected to incremental horizontal cyclic loading following a displacement-controlled approach to capture its in-plane shear failure. A total of 14 cyclic load sequences were applied to the specimen, reaching a maximum displacement of about 40 mm. Damage detection was evaluated in terms of frequency decrease and mode shape variations, estimated through operational modal analysis (OMA) carried out every two cyclic load sequences, with ambient vibrations as the excitation source. The Enhanced Frequency Domain Decomposition (EFDD) method implemented in the ARTeMIS Modal software was used to estimate the specimen's frequencies. The results show an important decay in natural frequencies and variations of the first three mode shapes, especially when the vault experienced severe damage.
  • WordNet-SHP: Towards the building of a lexical database for a Peruvian minority language
    (European Language Resources Association (ELRA), 2018) Maguiño-Valencia, D.; Oncevay, A.; Sobrevilla Cabezudo, M.A.; Pontificia Universidad Católica del Perú. Departamento de Ingeniería
    WordNet-like resources are lexical databases with highly relevant information and data that could be exploited in more complex computational linguistics research and applications. The building process requires manual and automatic tasks, which can be more arduous if the language is a minority one with fewer digital resources. This study focuses on the construction of an initial WordNet database for a low-resource indigenous language in Peru: Shipibo-Konibo (shp). First, the stages of development from a scarce scenario (a bilingual dictionary shp-es) are described. Then, a synset alignment method is proposed that compares the definition glosses in the dictionary (written in Spanish) with the content of a Spanish WordNet. Word2vec similarity was the chosen metric for the proximity measure. Finally, an evaluation process is performed for the synsets, using a manually annotated Gold Standard in Shipibo-Konibo. The obtained results are promising, and this resource is expected to serve well in further applications, such as word sense disambiguation and even machine translation in the shp-es language pair.
  • Building an Endangered Language Resource in the Classroom: Universal Dependencies for Kakataibo
    (European Language Resources Association (ELRA), 2022) Zariquiey, R.; Alvarado, C.; Echevarria, X.; Saltachin, L.; Gonzales, R.; Illescas, M.; Oporto, S.; Blum, F.; Oncevay, A.; Vera, J.; Pontificia Universidad Católica del Perú
    In this paper, we launch a new Universal Dependencies treebank for an endangered language from Amazonia: Kakataibo, a Panoan language spoken in Peru. We first discuss the collaborative methodology implemented, which proved effective for creating a treebank in the context of a Computational Linguistics course for undergraduates. Then, we describe the general details of the treebank and the language-specific considerations implemented for the proposed annotation. We finally conduct some experiments on part-of-speech tagging and syntactic dependency parsing. We focus on monolingual and transfer learning settings, where we study the impact of a treebank for Shipibo-Konibo, another Panoan language.
  • Innovation and entrepreneurship: Successful experiences in Brazil and Peru
    (Proceedings of the European Conference on Innovation and Entrepreneurship, ECIE, 2018) Barcellos-Paula, Luciano; Alvares, Daniela Fantoni; De Castro Rezende, Aline; Pontificia Universidad Católica del Perú
    Entrepreneurship plays an important role in the development of economies. In Latin America, specifically, public and private incentives for innovation are fundamental to enhance competitiveness and to strengthen new businesses. In this context, the paper aims to show the scenario of innovation and entrepreneurship in Brazil and Peru from the perspective of development programs, as well as to highlight the policies aimed at improving the quality and impact of the enterprises. To that end, we analyse the government, private and non-profit sector initiatives that deal with programs, projects, and actions to stimulate entrepreneurship and innovation, along with public policies that have an impact on the creation of positive environments for business development, and innovation networks. The focus of the study will be on the policies that are developed throughout the national territory in the studied countries, especially the exchange of experiences in Latin America. Preliminary results point to the importance of public power in the process of encouraging and consolidating innovation and entrepreneurship in these territories. It should be noted that institutions with business activities are essential in the ecosystem of innovation and corporate structure, especially in the smaller ones. Therefore, research in this field is fundamental for guiding development policies and stimulating business competitiveness. In summary, this paper contributes research on innovation and entrepreneurship that is useful to the public and private sectors and to the academic field, since it aims to encourage the development of more effective research on the subject.