Master's Thesis MSTR-2022-60

Bibliographic data
Ott, Stefan: Long-tailed visual object detection using grouped staged training.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering and Information Technology, Master's Thesis No. 60 (2022).
103 pages, English.
Abstract

State-of-the-art object detection is dominated by data-driven methods, which rely on data quantity and quality for training. However, data collected from the real world can exhibit a long-tailed distribution, on which object detectors tend to be overconfident on head classes while generalizing poorly to tail classes. In this thesis, techniques from few-shot learning are transferred to long-tailed learning by applying and extending fine-tuning for object detection. We show the effectiveness of two-stage fine-tuning on a challenging long-tailed, large-vocabulary dataset and extend the first stage of training by including tail-class data. We propose grouped staged training as an extension of the first, representation-learning stage to reduce the overall class imbalance and exploit similarities between classes. Our experiments show that grouped staged training can increase overall performance by improving detection and classification of rare classes while maintaining performance on frequent and common objects. Additionally, we apply self-supervised pre-training to long-tailed object detection and show the positive effect of combining it with two-stage training. Finally, we critically evaluate cross-category rankings and the limitations of our methods regarding detection confidence scores, where we observe that the performance improvements are redistributed towards tail classes, with a decreasing number of predictions for head classes.

Department(s): University of Stuttgart, Institute for Natural Language Processing
Supervisor(s): Vu, Prof. Ngoc Thang; Schweitzer, Dr. Antje; Zhang, Dr. Dan; Friedrich, Dr. Annemarie
Entry date: 29 November 2022