Aprendizagem de conceitos não-estacionários por meio de redes neurais artificiais

Oliveira Filho, Evaldo Araújo de

doi:10.11606/T.43.2005.tde-06032014-162559

DOI

https://doi.org/10.11606/T.43.2005.tde-06032014-162559

Document

Doctoral Thesis

Author

Oliveira Filho, Evaldo Araújo de

Full name

Evaldo Araújo de Oliveira Filho

Institute/School/College

Instituto de Física

Knowledge Area

Physics

Date of Defense

2005-08-04

Published

São Paulo, 2005

Supervisor

Alfonso, Nestor Felipe Caticha

Committee

Alfonso, Nestor Felipe Caticha (President)
Carneiro, Carlos Eugenio Imbassahy
Kinouchi Filho, Osame
Piqueira, Jose Roberto Castilho
Salinas, Silvio Roberto de Azevedo

Title in Portuguese

Aprendizagem de conceitos não-estacionários por meio de redes neurais artificiais

Keywords in Portuguese

Mecânica estatística
Redes neurais

Abstract in Portuguese

Num sentido geral, qualquer sistema (natural ou artificial) que incorpore informação contida numa amostragem de dados realiza aprendizagem. Dado um conjunto D de amostras que carrega informação sobre sua fonte geradora, existem diferentes medidas para quantificar a aprendizagem sobre ela e, portanto, uma boa representação de tal fonte. Contudo, não estamos interessados numa aprendizagem que apenas torne possível a reprodução de D por um sistema aprendiz, mas principalmente numa que torne possível a geração de novos dados condizentes com a fonte geradora. Portanto, uma vez fixado um sistema (máquina ou algoritmo), aprender significa encontrar um estado do sistema aprendiz que generalize a fonte geradora de D. Em Mecânica Estatística as informações relevantes sobre os estados de qualquer sistema estão contidas em sua função de partição Z. Logo, a inferência de qualquer variável é obtida tratando-se Z, de forma que o seu conhecimento (cálculo) representa o conhecimento dos estados do sistema, ou seja, do próprio sistema. Num problema de aprendizagem bayesiana a função de partição é representada pela distribuição posterior a D (que já tenha incorporado as informações dos exemplos), P(·|D), obtida por meio da regra de Bayes, que decorre da regra do produto P(A, B) = P(A|B)P(B). Embora a abordagem bayesiana se enquadre originalmente em modelos da Mecânica Estatística em equilíbrio, sua utilização tem sido promissora também em cenários que podem ser interpretados como modelos de mecânica estatística fora do equilíbrio termodinâmico, sendo a aprendizagem de conceitos que mudam no decorrer do processo de aprendizagem um desses problemas que têm atraído bastante atenção. O principal objetivo desta tese foi o estudo da aprendizagem bayesiana quando além do acesso ao conjunto D temos também a informação de que a fonte geradora de D é não-estacionária, introduzindo assim tempo num problema que de outra forma seria classificado como em equilíbrio.
Em particular, estudamos a aprendizagem de conceitos com várias formas de dependência temporal por redes neurais (mais especificamente, perceptrons), para a qual não é necessário modificar a verossimilhança do modelo. Assim nos concentramos na modificação do conhecimento a priori de forma a refletir a possibilidade de envelhecimento dos dados, numa escala de tempo desconhecida. Ao introduzirmos uma distribuição de probabilidades a priori para essa escala de tempo, nós encontramos uma distribuição posterior efetiva com uma cauda de decaimento algébrico que resultou num novo algoritmo com uma capacidade de adaptação satisfatória. Também aplicamos esse novo algoritmo na aprendizagem com ruído e discutimos algumas novas possibilidades sobre algoritmos para perceptrons.

Title in English

Nonstationary learning concepts using artificial neural networks

Keywords in English

Neural networks
Statistical mechanics

Abstract in English

In a general sense, any system (natural or artificial) that incorporates knowledge from sample data can be called a learning machine. Given a set D of samples that carries information about a rule, there are different measures of how much a system has learnt about the rule and therefore of how well it represents it. We are interested not only in learning that can reproduce D, but mainly in learning that can generate new data consistent with the rule. Therefore, once a system (a machine or an algorithm) is fixed, to learn means to find a state of the system that generalizes the source rule of D. We looked at Bayesian formulations of the learning problem, whose formalism is identical to that of Statistical Mechanics. Relevant knowledge about a given system is encoded in its partition function Z. Any inference can then be made by treating Z, so knowing Z amounts to knowing the system's properties. In the Bayesian approach, the role of Z is played by the posterior distribution P(·|D), which has already incorporated the information in the examples and is calculated by Bayes' rule, a consequence of the product rule P(A, B) = P(A|B)P(B). Although Bayesian theory is naturally paralleled in equilibrium Statistical Mechanics, it also holds promise for problems that can be classified as non-equilibrium. One such problem that has attracted increasing attention is that of learning non-stationary concepts. The aim of this thesis was to study Bayesian learning when, in addition to the data set D, we have the information that the rule which gave rise to the samples is non-stationary, thereby introducing time into what would otherwise have been an equilibrium problem. In particular, we studied the learning of concepts with several forms of time dependence by neural networks (more specifically, perceptrons), for which there is no need to change the likelihood. We concentrated on changing the prior knowledge in a way that reflects the possibility that the data age on an unknown time scale.
By introducing a prior probability distribution for the time scale, we found an effective posterior distribution with an algebraically decaying tail, which resulted in a new algorithm that was able to adapt satisfactorily. We also applied the new algorithm to learning with noisy data and discussed some new possibilities for perceptron algorithms.

WARNING - Viewing this document is conditioned on your acceptance of the following terms of use:
This document is only for private use in research and teaching activities. Reproduction for commercial use is forbidden. These rights cover all data about this document as well as its contents. Any use or copy of this document, in whole or in part, must include the author's name.

47015AraujoEvaldo.pdf (5.04 Mbytes)

Publishing Date

2014-03-12
