Bachelor Thesis BCLR-2021-37

BibliographyMarianov, Valentin: Graph generation with preserved properties.
University of Stuttgart, Faculty of Computer Science, Electrical Engineering, and Information Technology, Bachelor Thesis No. 37 (2021).
68 pages, english.
Abstract

The increasing size of real-world networks and the lack of publicly available datasets due to security and user privacy concerns has increased the demand for graph generators. Developing graph generators could facilitate researchers in analysing network's properties and underlying structure. More accurately, a request towards designing synthetic generator applicable to arbitrary graphs, which also scales up to massive datasets, is made. Such model could eventually be further extended to produce graphs of multiple size aiming to predict how those will conceivably evolve in the future. Having acquired synthetic networks, we could then measure multiple metrics in order to understand a graph's structure and properties. However, current graph generators are only capable of matching a fraction of real-world graph properties, are not suitable for different types of networks, do not scale up for large datasets or a mixture of the preceding. The main goal of this thesis is to take a deeper look in how well state-of-the-art graph generators preserve real-world graphs and their properties. Answers to the following questions are to be investigated: How well can a generation model capture a certain property? Is the model reliable in generating real-world networks from different domains? Can any similarities be found between the properties of synthetically generated and real-world graphs? Can a generator potentially produce graphs with adaptable properties, e.g. multiple times larger graph? Detailed analyses of existing graph generators and approach to try to combine or extend them to be able to generate graphs with similar to real-world networks properties are the head focus of this work. After investigating several generation models, we focused our attention on Darwini. Because to the best of our knowledge, it is only the second approach making an effort to concomitantly match the explicitly provided degree and clustering properties. In addition, no work aiming at questioning the model's ability has been found motivating us to thoroughly analyse the algorithm. Besides assessments we contributed to this topic by extending the algorithm with two methods aiming to predict how the structure of a certain network would look like in the future. Darwini along with our proposed extensions have been implemented in the GAME framework to analyse and compare the properties of arbitrary graphs. As we have shown in the results, Darwini lacks the ability of reproducing graphs with notable degree difference of neighbour elements in the distribution series and missing vital links connecting multiple components with one another. However, distributing vertices across several groups could address the issue and particularly improve the overall results.

Full text and
other links
Volltext
Department(s)University of Stuttgart, Institute of Parallel and Distributed Systems, Distributed Systems
Superviser(s)Rothermel, Prof. Kurt; Schramm, Michael
Entry dateAugust 16, 2021
New Report   New Article   New Monograph   Computer Science