Relaxing restrictive interdependence assumptions in networks

cyvy Research Project

In various fields of research, such as the social sciences, biology, and computer science, network models are often used to describe complex systems made up of many interacting elements. In recent years, these models have increasingly been used to draw new conclusions from observed data, a development promoted by the availability of large amounts of data.

Generative models are a popular class of network model. They introduce latent variables that incorporate the scientific findings of the relevant field (the "domain knowledge") and capture complex interactions. Because interactions among individuals are usually too complex to model directly, the network edges are assumed to be independent once these latent variables are conditioned upon, which simplifies the probability distribution over the network. The disadvantage of these models is that in some real-world scenarios they capture the interactions within the network poorly: the model's mathematical description does not correspond well to what is observed in real data. The main problem is that the coupling between variables is too limited. Network ensemble models, by contrast, do not use latent variables but network-specific variables (e.g. the degree distribution or the clustering coefficient). However, these models also suffer from various problems that limit their practical application.
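To make the conditional independence assumption concrete, the following is a minimal sketch of one common generative model of this kind, a stochastic block model. It is an illustration only, not the project's method: the function names `sample_sbm` and `log_likelihood`, the group labels, and the `affinity` matrix are all hypothetical. Each node has a latent group label, and given those labels every edge is drawn independently, so the log-likelihood of a network is simply a sum over node pairs.

```python
import math
import random


def sample_sbm(groups, affinity, seed=0):
    """Sample an undirected network in which edges are independent
    once the latent group labels in `groups` are given.

    groups   -- list with the latent group index of each node (assumed known here)
    affinity -- affinity[a][b] is the edge probability between groups a and b
    """
    rng = random.Random(seed)
    n = len(groups)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = affinity[groups[i]][groups[j]]
            # Each pair is an independent Bernoulli draw given the latent labels.
            if rng.random() < p:
                edges.append((i, j))
    return edges


def log_likelihood(edges, groups, affinity):
    """Log-probability of the network given the latent labels.

    Because of conditional independence it factorizes into one term per pair.
    Assumes every affinity value lies strictly between 0 and 1.
    """
    edge_set = set(edges)
    n = len(groups)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            p = affinity[groups[i]][groups[j]]
            total += math.log(p) if (i, j) in edge_set else math.log(1.0 - p)
    return total


if __name__ == "__main__":
    groups = [0, 0, 0, 1, 1, 1]           # two hypothetical communities
    affinity = [[0.8, 0.1], [0.1, 0.8]]   # dense within, sparse between
    edges = sample_sbm(groups, affinity, seed=42)
    print(edges)
    print(log_likelihood(edges, groups, affinity))
```

In this sketch, the presence of one edge carries no information about any other edge once the group labels are fixed. That is exactly the kind of limited coupling described above: patterns that depend on several edges at once, such as clustering, can only arise indirectly through the latent variables.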

 

This project will combine certain features of generative models and network ensemble models with methods from statistical physics. The aim is to develop better principle-based models. In addition, the project aims to ensure that these models can be applied efficiently to concrete problems (e.g. repeatability or the simultaneous occurrence of different forms of relationships between two nodes).