Master's Dissertation
DOI
https://doi.org/10.11606/D.45.2020.tde-12052020-005232
Document
Author
Full name
Guilherme Jun Yoshimura
E-mail
Institute/School/College
Knowledge Area
Date of Defense
Published
São Paulo, 2020
Supervisor
Committee
Queiroz, Marcelo Gomes de (President)
Tavares, Tiago Fernandes
Wertzner, Haydée Fiszbein
Title in Portuguese
Processamento de fala para triagem de distúrbios fonológicos
Keywords in Portuguese
Classificação
Coeficientes Mel-Cepstrais
Distúrbio do som da fala
Dynamic Time Warping
Processamento de fala
Abstract in Portuguese
Este trabalho apresenta dois classificadores originais para sinais de voz que objetivam auxiliar profissionais da fonoaudiologia no diagnóstico de pessoas com alterações de fala. Comparamos os classificadores propostos com três técnicas conhecidas: Modelos de Markov Escondidos (HMM), bag-of-words e classificador baseado em Earth Mover's Distance (EMD). Utilizamos três bases de dados, sendo duas disponibilizadas pelo Departamento de Fisioterapia, Fonoaudiologia e Terapia Ocupacional (FOFITO) da Faculdade de Medicina da Universidade de São Paulo (FMUSP) que contêm gravações de crianças que têm alterações de fala que ocorrem durante o desenvolvimento da fala, e a terceira é a base pública UA-Speech que contém gravações de indivíduos adultos com disartria. O intuito deste trabalho é criar classificadores de fala capazes de distinguir um áudio sem alteração de fala de um áudio com alteração de fala. Além de estudar as técnicas conhecidas citadas anteriormente, propusemos dois classificadores baseados em Coeficientes Mel-Cepstrais (MFCC). O primeiro utiliza uma reformulação da distância DTW entre registros de fala e conjuntos de gravações sem alteração de fala, enquanto o outro combina a informação de curvas de dissimilaridades construídas a partir da comparação do registro de fala a ser classificado com as gravações de referência (sem alterações de fala).
Title in English
Speech processing for screening of phonological disorders
Keywords in English
Classification
Dynamic Time Warping
Mel Frequency Cepstral Coefficients
Speech processing
Speech sound disorder
Abstract in English
This work presents two novel speech classifiers intended to help speech therapy professionals diagnose individuals with speech disorders. We compare the proposed classifiers with three well-known techniques: Hidden Markov Models (HMM), Bag-of-Words (BoW), and a classifier based on the Earth Mover's Distance (EMD). We use three databases. Two were provided by the Department of Physical Therapy, Speech-Language Pathology and Audiology, and Occupational Therapy (FOFITO) of the School of Medicine at the University of São Paulo (FMUSP) and contain recordings of children with speech disorders that arise during speech development. The third is the public UA-Speech database, which contains recordings of adults with dysarthria. The goal of this project is to develop speech classifiers able to distinguish recordings with speech disorders from recordings without them. Besides studying the well-known techniques mentioned above, we propose two classifiers based on Mel Frequency Cepstral Coefficients (MFCC). The first defines the classification problem over relative embeddings based on point-to-set distances, using a reformulation of the Dynamic Time Warping (DTW) distance between a speech recording and sets of recordings without speech disorders. The second combines information from dissimilarity curves built by comparing the speech recording to be classified with the reference recordings (without speech disorders).
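As a rough illustration of the point-to-set idea mentioned in the abstract, the sketch below computes a textbook DTW cost between two feature sequences and takes the minimum over a set of reference recordings. It is not the dissertation's reformulation, which this record does not specify: the function names `dtw_distance`, `point_to_set_distance`, and `flag_for_review`, the Euclidean frame distance, the choice of the minimum as the set distance, and the `threshold` parameter are all assumptions made for illustration. Inputs are assumed to be arrays of shape (frames, coefficients), for example MFCC matrices computed elsewhere.

```python
import numpy as np


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of shape (frames, coefficients)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Euclidean distance between every frame of a and every frame of b.
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    n, m = local.shape
    # acc[i, j] holds the cheapest alignment cost of a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_previous = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = local[i - 1, j - 1] + best_previous
    return float(acc[n, m])


def point_to_set_distance(x, reference_set):
    """Assumed set distance: smallest DTW cost from x to any reference recording."""
    return min(dtw_distance(x, ref) for ref in reference_set)


def flag_for_review(x, reference_set, threshold):
    """Hypothetical decision rule: flag x when it is far from every reference."""
    return point_to_set_distance(x, reference_set) > threshold
```

In this sketch a recording is flagged only when it is far from all reference recordings, so adding more varied references without speech disorders makes the rule more permissive. Any usable threshold would have to be tuned on labeled data.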
 
WARNING - Viewing this document is conditioned on your acceptance of the following terms of use:
This document is for private use in research and teaching activities only. Reproduction for commercial use is forbidden. These rights cover all data about this document as well as its contents. Any use or copy of this document, in whole or in part, must include the author's name.
Publishing Date
2020-05-27
 
All rights to the thesis/dissertation belong to the authors.
CeTI-SC/STI
Digital Library of Theses and Dissertations of USP. Copyright © 2001-2024. All rights reserved.