👷 Readings in Digital Typography, Scientific Visualization, Information Retrieval and Machine Learning

[Michal Štefánik] Unsupervised Data Augmentation: Thinking Outside the Single-Objective Box 10. 12. 2020



Join us via Zoom on December 10 at 10 AM (CET).

A hunger for huge amounts of training data is one of the major issues of SOTA estimators, preventing them from reaching any useful level of quality on tasks where data is scarce or too expensive to obtain.

Currently, this problem is addressed by data augmentation strategies in supervised settings, or mainly by auto-regressive objectives in unsupervised settings. Conventional data augmentation strategies, however, can only introduce a limited amount of noise into the original data if the samples are to remain valid, and hence can hardly substitute for orders of magnitude of missing samples.

Unsupervised Data Augmentation (UDA) addresses this situation in an original way, with a surprisingly simple presumption:

Given in-domain samples with no labels, each nonetheless belonging to some category, we can augment those samples and expect the system to predict the same output for an augmented sample as for the original one (see the sketch below).
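To make the idea concrete, here is a minimal PyTorch-style sketch of the combined objective as we read it from the paper: an ordinary supervised loss on the few labeled samples, plus a KL-divergence consistency term between predictions on unlabeled samples and their augmented copies. The `augment` function and the weighting factor `lam` are placeholder names for illustration, not identifiers from the paper, and details such as confidence masking and prediction sharpening are omitted.

```python
import torch
import torch.nn.functional as F

def uda_loss(model, x_labeled, y_labeled, x_unlabeled, augment, lam=1.0):
    """Sketch of one UDA training objective (placeholder helper names)."""
    # Supervised part: ordinary cross-entropy on the labeled batch.
    sup_loss = F.cross_entropy(model(x_labeled), y_labeled)

    # Prediction on the original unlabeled sample, treated as a fixed target.
    with torch.no_grad():
        p_orig = F.softmax(model(x_unlabeled), dim=-1)

    # Prediction on an augmented copy of the same unlabeled sample.
    log_p_aug = F.log_softmax(model(augment(x_unlabeled)), dim=-1)

    # Consistency term: KL(p_orig || p_aug), pushing the two predictions together.
    consistency = F.kl_div(log_p_aug, p_orig, reduction="batchmean")

    return sup_loss + lam * consistency
```

The unsupervised term never looks at labels; it only asks the model to be consistent under augmentation, which is exactly where the extra, cheap unlabeled data enters the training signal.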

In this session, we'll describe the maths behind UDA and analyze the conditions and circumstances under which this semi-supervised approach can be used. We'll also give some thought to the implications this work has for research in data-scarce fields, and for industry, where data acquisition is a bottleneck of countless applications.

Unsupervised Data Augmentation: Thinking Outside the Single-Objective Box
video recording of the 2020-12-10 presentation by Michal Štefánik

Readings

  1. Unsupervised Data Augmentation for Consistency Training: https://arxiv.org/abs/1904.12848