Master's Dissertation
DOI
https://doi.org/10.11606/D.45.2021.tde-03022022-234955
Document
Author
Full name
Felipe Maia Polo
E-mail
Institute/School/College
Knowledge Area
Date of Defense
Published
São Paulo, 2021
Supervisor
Committee
Vicente, Renato (President)
Cozman, Fabio Gagliardi
Prates, Marcos Oliveira
Title in English
Covariate shift adaptation and dataset shift decomposition in machine learning
Keywords in English
Concept drift
Covariate shift
Dataset shift
Dataset shift decomposition
Dimensionality
Domain adaptation
Effective sample size
Machine learning
Statistics
Abstract in English
In supervised learning, we often have access to only a limited sample, in size or quality (e.g., lack of labels), of the population/distribution of interest, for which we want to create predictive models. However, we may have less restricted access to data sampled from another population that is more or less similar to the one of interest. Training models using only data from the population of interest may be impossible or result in sub-optimal models, so it is worthwhile to use data from the other population to get better results or to make training possible. In these situations, because the distribution of interest differs from the one we can sample with few restrictions, we say that there is dataset shift. Under dataset shift, employing domain adaptation techniques when training supervised models is essential for theoretical guarantees of good results in the population of interest. The two kinds of dataset shift discussed in this work are covariate shift and concept drift/shift. The main objectives of this work are: (i) to review the main concepts and methods related to covariate shift and covariate shift adaptation; (ii) to propose contributions to the covariate shift adaptation literature, connecting concepts present in the modern literature; (iii) to propose the decomposition of dataset shift into covariate shift and expected concept drift/shift as a new approach to better understand situations involving dataset shift.
Title in Portuguese
Adaptação para covariate shift e decomposição do dataset shift no aprendizado de máquina
Keywords in Portuguese
Adaptação de domínio
Concept drift
Covariate shift
Dataset shift
Decomposição do dataset shift
Dimensionalidade
Effective sample size
Estatística
Machine learning
Abstract in Portuguese
No aprendizado supervisionado, muitas vezes temos acesso a uma amostra limitada, em tamanho ou qualidade (e.g., falta de rótulos), de dados da população/distribuição de interesse, para a qual queremos criar modelos preditivos. No entanto, é possível que tenhamos acesso pouco limitado a dados amostrados de outra população, mais ou menos parecida com a de interesse. Treinar modelos utilizando somente dados da população de interesse pode ser impossível ou resultar em modelos sub-ótimos, então seria interessante utilizar os dados provenientes da outra população a fim de obter melhores resultados ou tornar o treinamento possível. Nessas situações, como as distribuições de interesse e aquela que podemos amostrar com poucas restrições são diferentes, dizemos que há dataset shift. Em situações de dataset shift, empregar técnicas de adaptação de domínio ao treinar modelos supervisionados é essencial para garantias teóricas de bons resultados na população de interesse. Os dois tipos de dataset shift que discutiremos neste trabalho são covariate shift e concept drift/shift. Os objetivos principais deste trabalho são: (i) revisar principais conceitos e métodos relacionados ao covariate shift e covariate shift adaptation; (ii) propor contribuições para a literatura de covariate shift adaptation, conectando conceitos presentes em discussões atuais; (iii) propor a decomposição do dataset shift em covariate shift e concept drift/shift esperado como uma nova abordagem para melhor entendimento de situações em que lidamos com dataset shift.
 
WARNING - Viewing this document is conditioned on your acceptance of the following terms of use:
This document is for private use only, in research and teaching activities. Reproduction for commercial use is forbidden. These rights cover all data about this document as well as its contents. Any use or copy of this document, in whole or in part, must include the author's name.
MasterThesis_FMP.pdf (4.60 Mbytes)
Publishing Date
2022-02-04
 
All rights to the thesis/dissertation belong to the authors.
CeTI-SC/STI
Digital Library of Theses and Dissertations of USP. Copyright © 2001-2024. All rights reserved.