Bibliographic Data | Khosravifard, Mina: Novel Deep Learning Architecture for Semantic Segmentation of Road Scenes from Fisheye Images. Universität Stuttgart, Fakultät Informatik, Elektrotechnik und Informationstechnik, Master's Thesis No. 118 (2019). 83 pages, English.
|
| Abstract | An autonomous vehicle requires a perception modality, which remains one of the challenging problems for the automotive industry. A functional perception system demands 360-degree coverage of the vehicle's surroundings. Fisheye lenses offer a greater field of view (FOV) than conventional lenses, which makes them an excellent choice for the camera sensors of autonomous vehicles. However, fisheye lenses introduce a large amount of distortion; most networks designed for perspective images cannot handle this degree of deformation in fisheye images. Moreover, fisheye lenses have only recently been used for autonomous vehicles, so the available fisheye datasets are typically small. The recent revolutionary results of convolutional neural networks open the door to solving challenging computer vision problems such as road-scene semantic segmentation. This thesis presents two deep learning architectures, EShareNet and EPShareNet, which can efficiently handle both the significant distortion and the lack of a large fisheye dataset. For this purpose, the (E/EP)ShareNets share weights between encoder-decoder architectures based on ERFNet and ERFPSPNet, trained on the fisheye dataset and on transformed versions of non-fisheye datasets such as Cityscapes and SYNTHIA (Sequence/Random sets). Extended versions of (E/EP)ShareNet, named EShareFocusNet and EPShareFocusNet, were created to reduce the effect of dominant, uncritical classes such as road, sky, and vegetation in driving scenes. EShareFocusNet models some of the critical classes of driving scenes, such as pedestrian and rider, better; however, its overall test performance decreases because the network focuses mostly on critical classes while uncritical ones are strongly downsampled. Finally, EShareNet achieves the best overall test performance of our models when trained on the fisheye dataset together with the transformed Cityscapes and SYNTHIA-Rand sets. Additionally, it outperforms the other models in recognizing the person and rider classes.
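The weight-sharing idea summarized above can be sketched in plain Python. This is an illustrative toy, not the thesis's actual ERFNet-based layers: the class names, the scalar "weights", and the two branch names are hypothetical stand-ins. It shows the core mechanism, a single encoder object referenced (not copied) by one branch per dataset, so that a parameter update made while training either branch is immediately visible to the other, while each branch keeps its own decoder:

```python
# Toy sketch of encoder weight sharing across dataset branches.
# All names and values here are hypothetical, for illustration only.

class Encoder:
    """Stand-in encoder: a single scalar weight applied to the input."""
    def __init__(self, w):
        self.w = w

    def forward(self, x):
        return self.w * x


class Decoder:
    """Stand-in decoder: adds a branch-specific bias."""
    def __init__(self, b):
        self.b = b

    def forward(self, h):
        return h + self.b


class Branch:
    """One dataset branch: references a shared encoder, owns its decoder."""
    def __init__(self, encoder, decoder):
        self.encoder = encoder  # shared object, not a copy
        self.decoder = decoder

    def forward(self, x):
        return self.decoder.forward(self.encoder.forward(x))


shared_encoder = Encoder(w=2.0)
fisheye_branch = Branch(shared_encoder, Decoder(b=1.0))
transformed_branch = Branch(shared_encoder, Decoder(b=-1.0))

# A parameter update to the shared encoder (e.g. from a training step
# on either dataset) affects both branches at once:
shared_encoder.w = 3.0
print(fisheye_branch.forward(10.0))      # 31.0
print(transformed_branch.forward(10.0))  # 29.0
```

Only the decoders remain dataset-specific, so the shared encoder effectively sees training signal from both the small fisheye dataset and the larger transformed datasets.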
|
| Department(s) | Universität Stuttgart, Institut für Visualisierung und Interaktive Systeme, Visualisierung und Interaktive Systeme
|
| Supervisors | Bruhn, Prof. Andrés; Coors, Benjamin; Condurache, Dr. Alexandru Paul |
| Entry Date | 13 May 2025 |