APRENDIZADO DE MÁQUINA POR EXEMPLOS USANDO ÁRVORES DE DECISÃO

Castiñeira, Maria Inés

doi:10.11606/D.55.2019.tde-21022019-100634

Master's Dissertation

DOI

https://doi.org/10.11606/D.55.2019.tde-21022019-100634

Document

Master's Dissertation

Author

Castiñeira, Maria Inés (Catálogo USP)

Full name

Maria Inés Castiñeira

Institute/School/College

Instituto de Ciências Matemáticas e de Computação

Knowledge Area

Computer Science and Computational Mathematics

Date of Defense

1990-11-09

Published

São Carlos, 1990

Supervisor

Monard, Maria Carolina (Catálogo USP)

Committee

Monard, Maria Carolina (President)
Carvalho, Ariadne Maria Brito Rizzoni
Eizirik, Leila Maria Rippol

Title in Portuguese

APRENDIZADO DE MÁQUINA POR EXEMPLOS USANDO ÁRVORES DE DECISÃO

Keywords in Portuguese

Não disponível

Abstract in Portuguese

O Aprendizado de Máquina é uma importante área de pesquisa em Inteligência Artificial pois a capacidade de aprender é essencial para um comportamento inteligente. Em particular, um dos objetivos da pesquisa em Aprendizado de Máquina é o de auxiliar o processo de aquisição de conhecimento facilitando a construção de Sistemas Baseados em Conhecimento. Uma das formas de aprendizagem é por generalizações, isto é, através de processos indutivos. São várias as estratégias desenvolvidas para Aprendizado de Máquina por Indução. Uma delas está baseada na construção de árvores de decisão. Esta estratégia abrange uma determinada família de sistemas de aprendizado por indução: a família TDIDT - Top Down Decision Trees. Neste trabalho são apresentadas algumas estratégias de Aprendizado de Máquina, dando ênfase aos sistemas da família TDIDT, bem como detalhes da implementação realizada. Mostra-se que é possível realizar uma implementação geral dos algoritmos desta família. Mostra-se também a importância dos diversos mecanismos de poda em árvores de decisão. Um método de poda específico é usado para podar árvores geradas em diversos domínios. Os resultados obtidos evidenciam que este método reduz a complexidade da árvore e produz ganhos significativos na classificação por ela realizada.

Title in English

Not available

Keywords in English

Not available

Abstract in English

Machine Learning is an important research area of Artificial Intelligence, since the ability to learn is central to intelligent behavior. Making generalizations - induction - is the means by which humans learn most of their knowledge. In this work we describe several approaches to Machine Learning and concentrate our attention on a family of learning systems called TDIDT - Top Down Induction of Decision Trees. The task of these systems is to induce general descriptions of concepts from examples of these concepts, using decision trees as a knowledge formalism. Because decision trees are a simple formalism, the learning methodologies used by the TDIDT family are less complex than the methodologies used by other systems that employ a more powerful language to express the results of the learning process. Nevertheless, decision trees are capable of capturing knowledge which is useful to solve difficult problems. In general, the algorithms of the TDIDT family develop a decision tree from a set of examples in three main stages: construction of the tree to classify the examples, pruning of the tree to give statistical reliability, and processing of the pruned tree to improve understandability. In this work the first two stages are considered. For the first stage, we propose an efficient Prolog implementation for the construction of decision trees. The decision tree is grown by choosing, at each node, the attribute which "best" divides the set of examples considered. In this particular implementation the attribute is chosen by an entropy measure, although it is simple to define and implement another kind of measure in the system. For the second stage, we propose a pruning method which estimates the classification errors at the nodes of the previously constructed decision tree and then, considering these errors, decides whether to prune certain subtrees. This method was applied to several domains and data sets to measure the size of the pruned tree and its accuracy.
Results show that the complexity of the pruned decision tree decreases while its accuracy increases; both measures are heavily dependent on the domain.
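
Illustrative note (not part of the dissertation): the abstract states that, at each node, the attribute chosen is the one that best divides the examples according to an entropy measure. The sketch below shows one common way to do this, by information gain. It is written in Python rather than the Prolog used in the dissertation, and the function names (entropy, information_gain, best_attribute) and the toy data are assumptions made for illustration; the exact measure used by the author is not given in this record.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(examples, attribute, target):
    """Entropy reduction obtained by splitting `examples` on `attribute`."""
    base = entropy([ex[target] for ex in examples])
    # Group the class labels by the value each example has for the attribute.
    partitions = {}
    for ex in examples:
        partitions.setdefault(ex[attribute], []).append(ex[target])
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(
        len(part) / len(examples) * entropy(part) for part in partitions.values()
    )
    return base - remainder


def best_attribute(examples, attributes, target):
    """Attribute with the highest information gain (ties: first listed)."""
    return max(attributes, key=lambda a: information_gain(examples, a, target))


# Hypothetical toy data, invented for illustration only.
EXAMPLES = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

if __name__ == "__main__":
    print(best_attribute(EXAMPLES, ["outlook", "windy"], "play"))
```

Because the selection criterion is isolated in information_gain, swapping in another measure only requires replacing that one function, which mirrors the abstract's remark that a different measure is simple to define.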

WARNING - Viewing this document is conditioned on your acceptance of the following terms of use:
This document is for private use only, in research and teaching activities. Reproduction for commercial use is forbidden. These rights cover all data about this document as well as its contents. Any use or copy of this document, in whole or in part, must include the author's name.

MariaInesCastineira.pdf (4.64 Mbytes)

Publishing Date

2019-02-21
