Master's Dissertation
DOI
https://doi.org/10.11606/D.8.2020.tde-20022020-164808
Document
Author
Full name
Beatriz Albiero
E-mail
Institute/School/College
Knowledge Area
Date of Defense
Published
São Paulo, 2019
Supervisor
Committee
Ferreira, Marcelo Barra (President)
Coelho, Livy Maria Real
Faria, Pablo Picasso Feliciano de
Lopes, Marcos Fernando
Title in Portuguese
O modelo Encoder-Decoder aplicado em irregularidades verbais do Português Brasileiro
Keywords in Portuguese
Aprendizagem de máquina
Conexionismo
Morfologia verbal
Abstract in Portuguese
Inspirada na controversa questão da aquisição de verbos irregulares na língua inglesa (Chomsky & Halle (1968/1991), Pinker & Prince (1988), Albright & Hayes (2003), Kirov & Cotterell (2018)), esta pesquisa tem como objetivo estudar a questão da flexão de verbos irregulares do Português Brasileiro sob a ótica do modelo computacional Encoder-Decoder. Para tanto, a tarefa proposta ao modelo era a de predizer uma forma verbal flexionada dada uma forma primária (Radical + Vogal Temática). O escopo da pesquisa restringiu-se ao estudo do paradigma de 1ª Pessoa do Singular no Modo Indicativo e Tempo Presente. O modelo utilizado, por sua vez, é um modelo de caráter associativo que pertence ao grupo dos modelos de Redes Neurais Artificiais. Também, fez-se necessária a construção de um corpus linguístico composto pelo paradigma selecionado e em seguida transcrito em notação fonética específica para viabilizar a utilização do modelo escolhido. O corpus produzido é composto por 423 verbos que foram marcados como pertencendo às famílias de verbos regulares (51%) ou irregulares (49%). Ainda, dentro do escopo da família de verbos irregulares, foi possível identificar 15 subgrupos conforme a identificação de diferentes padrões de flexão. A partir da notação fonética utilizada, os verbos puderam ser associados a novas representações que englobavam informações relativas aos traços fonéticos presentes. Assim, o modelo proposto tenta predizer as formas flexionadas a partir da identificação das relações fonéticas envolvidas durante o processo de flexão. O modelo apresentado foi submetido a múltiplos treinamentos e testes e apresentou uma acurácia média de 13.55%, mas chegou a acertar 17% em um dos experimentos. Considerando a segmentação entre verbos regulares e irregulares, o modelo performou melhor na classe dos regulares. Entretanto, considerando-se todas as 16 classes individualmente (15 irregulares + 1 regular), pôde-se observar que as duas primeiras classes em que o modelo performou melhor eram classes irregulares, deixando a classe regular como a terceira com os melhores resultados.
Title in English
The Encoder-Decoder Model Applied to Brazilian-Portuguese Verbal Irregularities
Keywords in English
Connectionism
Machine learning
Verbal morphology
Abstract in English
Inspired by the controversial debate about the acquisition of irregular verbs in the English language (Chomsky & Halle (1968/1991), Pinker & Prince (1988), Albright & Hayes (2003), Kirov & Cotterell (2018)), this research studies the inflection of irregular verbs in Brazilian Portuguese from the perspective of the Encoder-Decoder computational model. To this end, the model was given the task of predicting an inflected verb form from a primary form (Stem + Thematic Vowel). The scope of the research was restricted to the first-person singular paradigm in the indicative mood and present tense. The model, in turn, is an associative model belonging to the family of Artificial Neural Network models. It was also necessary to build a linguistic corpus covering the chosen paradigm and to transcribe it into a specific phonetic notation so that the chosen model could be used. The resulting corpus consists of 423 verbs, each marked as belonging to either the regular (51%) or the irregular (49%) verb family. Within the irregular family, 15 subgroups were identified according to their distinct inflection patterns. Using the phonetic notation, the verbs could be associated with new representations that include information about the phonetic features present. The proposed model thus attempts to predict inflected forms by identifying the phonetic relationships involved in the inflection process. The model went through multiple training and test runs and reached an average accuracy of 13.55%, with a best result of 17% in one of the experiments. When verbs are split into regular and irregular, the model performed better on the regular class. However, when all 16 classes are considered individually (15 irregular + 1 regular), the two classes on which the model performed best were irregular classes, leaving the regular class in third place.
 
WARNING - Viewing this document is conditioned on your acceptance of the following terms of use:
This document is for private use only, in research and teaching activities. Reproduction for commercial use is forbidden. These rights cover all data about this document as well as its contents. Any use or copy of this document, in whole or in part, must include the author's name.
Publishing Date
2020-02-20
 
All rights to the thesis/dissertation belong to the authors
CeTI-SC/STI
Digital Library of Theses and Dissertations of USP. Copyright © 2001-2020. All rights reserved.