Algorithms, Dynamics,
and Information Flow
in Networks
DFG Research Unit
Multi-Agent Contracts (Duetting)
We study a natural combinatorial single-principal multi-agent contract design problem, in which a principal motivates a team of agents to exert effort toward a given task. At the heart of our model is a reward function, which maps the agent efforts to an expected reward of the principal. We seek to design computationally efficient algorithms for finding optimal (or near-optimal) linear contracts for reward functions that belong to the complement-free hierarchy.
Our first main result gives constant-factor approximation algorithms for submodular and XOS reward functions, with value and demand oracles, respectively. It relies on an unconventional use of ``prices'' and (approximate) demand queries for selecting the set of agents that the principal should contract with, and exploits a novel scaling property of XOS functions and their marginals, which may be of independent interest.
Our second main result is an $\Omega(\sqrt{n})$ impossibility for settings with n agents and subadditive reward functions, even with demand oracle access. A striking feature of this impossibility is that it applies to subadditive functions that are constant-factor close to submodular. This presents a surprising departure from previous literature, e.g., on combinatorial auctions.
Preprint: https://arxiv.org/abs/2211.05434 (to appear in STOC 2023)
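As a concrete (and deliberately naive) illustration of the optimization problem, the following Python sketch brute-forces the best linear contract in a standard instantiation of the model, assumed here for illustration: binary efforts, success probability f(S) for the set S of exerting agents, unit reward, and cheapest incentive-compatible payments alpha_i = c_i / (f(S) - f(S \ {i})). The paper's algorithms avoid exactly this exponential enumeration by using value and demand oracles:

from itertools import combinations

def best_linear_contract(agents, f, cost):
    # Exhaustive search over sets S of contracted agents: agent i in S is
    # paid alpha_i on success, success happens with probability f(S), and
    # the cheapest incentive-compatible payment is
    #   alpha_i = cost[i] / (f(S) - f(S - {i})).
    # Runs in time 2^n -- a baseline for tiny instances only.
    best = (f(frozenset()), frozenset(), {})
    for k in range(1, len(agents) + 1):
        for S in map(frozenset, combinations(agents, k)):
            alpha, feasible = {}, True
            for i in S:
                marginal = f(S) - f(S - {i})
                if marginal <= 0:        # agent i cannot be incentivized
                    feasible = False
                    break
                alpha[i] = cost[i] / marginal
            if feasible:
                utility = f(S) * (1 - sum(alpha.values()))
                if utility > best[0]:
                    best = (utility, S, alpha)
    return best

# Toy submodular reward: each agent adds 0.4 success probability, capped at 1
f = lambda S: min(1.0, 0.4 * len(S))
print(best_linear_contract(range(4), f, {i: 0.05 for i in range(4)}))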
Information Design for Congestion Games with Unknown Demand (Koglin)
We consider non-atomic congestion games with affine costs and a single commodity whose demand is unknown to the players of the game, but can be observed by a principal. Prior to observing the demand, the principal commits to a public signaling scheme that governs to which extent (if at all) the information about the realized demand is communicated to the players. Upon receiving the signal, the players update their beliefs about the demand and respond by playing a Wardrop equilibrium for the resulting cost functions. We study the question of how to compute the signaling scheme that minimizes the total cost of the induced Wardrop equilibrium.
First, we devise a fully polynomial-time approximation scheme (FPTAS) for the case that the demand can only take two values. The FPTAS relies on several structural properties of the cost of the induced Wardrop equilibrium as a function of the updated belief of the distribution of demands. We show that this function is piecewise linear for any number of demands, and monotonic for two demands. Second, we give a complete characterization of the graph structures where it is optimal for the principal to fully reveal the information about the realized demand to the players. Specifically, we show that this signaling scheme is optimal for all cost functions and probability distributions over demands if and only if the graph is series-parallel. Third, we propose an algorithm that computes the optimal signaling scheme for any number of demands whose time complexity is polynomial in the number of supports that occur in a Wardrop equilibrium for some demand. Finally, we conduct a computational study that tests this algorithm on real-world instances.
Joint work with Svenja M. Griesbach, Martin Hoefer and Max Klimm.
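The building block underneath the signaling problem is computing a Wardrop equilibrium once the players hold some belief about the demand. As a minimal illustration (restricted to parallel links with affine costs a*x + b, an assumption of this sketch rather than the paper's general network setting), the equilibrium equalizes the latencies of all used links and can be found by bisection on the common latency:

def wardrop_parallel(links, demand, iters=60):
    # links: list of (a, b) with cost a*x + b and a > 0.
    # At equilibrium all used links share a common latency L, and a link
    # carries flow max(0, (L - b) / a); bisect on L until flows sum to demand.
    lo = min(b for a, b in links)
    hi = max(b for a, b in links) + demand * max(a for a, b in links)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(max(0.0, (mid - b) / a) for a, b in links) < demand:
            lo = mid
        else:
            hi = mid
    L = (lo + hi) / 2
    return L, [max(0.0, (L - b) / a) for a, b in links]

# Pigou-like example: one congestible link (cost x) vs one slow link (cost ~1)
print(wardrop_parallel([(1.0, 0.0), (0.001, 1.0)], demand=1.5))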
Computational Study of Collective Behavior: The Case of Association Football (Soccer) (Brandes)
I will present team sports analytics as an opportunity to teach and develop computational methods for the study of collective behavior and applied data science more generally.
Football is the most popular sport in the world, in part because its fluidity and randomness allow for many alternative interpretations of the same events or, in other words, accommodate millions of pundits relying on experience and anecdotal evidence. The discussions around data-informed or even data-driven decision-making illustrate nicely the conflict between naive eagerness to apply, and ignorant dismissal of, soccer analytics.
Using examples from our ongoing research I will discuss opportunities and pitfalls in the (un)critical application of computational methods.
Given that so many seem to know so much about the subject, I expect a great deal of willingness to question any choice of representation or data-suggested conclusion. Why, then, should similar methods be more applicable in unregulated social environments?
Certifying Induced Subgraphs in Large Graphs (Tran)
We introduce I/O-efficient certifying algorithms for bipartite graphs, as well as for the classes of split, threshold, bipartite chain, and trivially perfect graphs. When the input graph is a member of the respective class, the certifying algorithm returns a certificate that characterizes this class. Otherwise, it returns a forbidden induced subgraph as a certificate for non-membership. On a graph with n vertices and m edges, our algorithms take O(Sort(n + m)) I/Os in the worst case for split, threshold and trivially perfect graphs. Within the same I/O complexity, bipartite and bipartite chain graphs can be certified with high probability. We provide implementations for split and threshold graphs together with a preliminary experimental evaluation.
Algorithmic Programmable Matter: From Local Markov Chains to “Dumb” Robots (Richa)
Many programmable matter systems have been developed, including modular and swarm robotics, synthetic biology, DNA tiling, and smart materials. We describe programmable matter as an abstract collection of simple computational elements (particles) with limited memory that each execute distributed, local algorithms to self-organize and solve system-wide problems, such as movement, reconfiguration, and coordination. Self-organizing particle systems (SOPS) have many interesting potential applications like coating objects for monitoring and repair purposes, and forming nano-scale devices for surgery and molecular-scale electronic structures. We describe some of our work on the algorithmic foundations of programmable matter, investigating how macro-scale system behaviors can naturally emerge from local micro-behaviors by individual particles: We utilize tools from statistical physics and Markov chain analysis to translate Markov chains defined at a system level into distributed, local algorithms for SOPS that drive the desired emergent collective behavior for the problems of compression, separation, and foraging, among others. We further establish the notion of algorithmic matter, where we leverage standard binary computation, as well as physical characteristics of the robots and interactions with the environment in order to implement our micro-level algorithms in actual testbeds composed of robots that are not capable of any standard computation. We conclude by addressing full concurrency and asynchrony in SOPS.
Dynamic Averaging Load Balancing on Arbitrary Graphs (Hintze)
Triangle Switches, Reconstruction and Mixing (Greenhill)
The Hitting Time of Clique Factors (Müller)
In 2022, Jeff Kahn gave the strongest possible affirmative answer to Shamir's problem, which had been open since the late 1970s: Let r be an integer larger than two and n divisible by r. Then, in the r-uniform random hypergraph process on n vertices, as soon as the last isolated vertex disappears, a perfect matching emerges. We transfer this hitting time result to the setting of clique factors in the random graph process: at the time that the last vertex joins a copy of the complete graph K_r on r vertices, the random graph process contains a K_r-factor. Our proof draws on a novel sequence of couplings, extending techniques of Riordan and Heckel. Moreover, we prove an analogous result for clique factors in the s-uniform hypergraph process (where s is at least 3).
This talk is based on joint work with Annika Heckel, Marc Kaufmann and Matija Pasch.
The Cut Metric for Probability Distributions (Hahn-Klimroth)
Guided by the theory of graph limits, we investigate a variant of the cut metric for limit objects of sequences of discrete probability measures.
Hedonic Games with Fixed-Size Coalitions (Bilò)
In hedonic games, a set of n agents, having preferences over all possible coalition structures, needs to agree on a stable outcome. In this work, we initiate the study of hedonic games with fixed-size coalitions, where the set of possible coalition structures is restricted as follows: there are k coalitions, each coalition has a fixed size, and the sum of the sizes of all coalitions equals n. We focus on the basic model of additively separable hedonic games with symmetric valuations, where an agent's preference is captured by a utility function that sums up the contributions of all other agents in the same coalition. In this setting, an outcome is stable if no pair of agents can exchange coalitions and improve their utilities. Depending on the definition of improvement, three stability notions arise: swap stability under transferable utilities, in which no swap can improve the sum of the utilities of both agents; strict swap stability, in which no swap can improve the utility of one agent without decreasing the utility of the other; and swap stability, in which no swap can improve the utilities of both agents simultaneously. We analyse the fundamental questions of existence, complexity and efficiency of stable outcomes, as well as the complexity of computing a social optimum.
Margin of Victory for Weighted Tournament Solutions (Döring)
Determining how close a winner of an election is to becoming a loser, or distinguishing between different possible winners of an election, are major problems in Computational Social Choice. We tackle these problems for so-called weighted tournaments by generalizing the notion of Margin of Victory (MoV) for unweighted tournament solutions by Brill et al. to weighted tournament solutions. For these, the MoV of a winner (resp. loser) is the total weight that needs to be changed in the tournament to make them a loser (resp. winner). We study three weighted tournament solutions: Borda's rule, the weighted Uncovered Set, and Split Cycle. For all three rules, we analyse the complexity of computing the MoV, and provide structural insight into the MoV by investigating three properties: monotonicity, consistency regarding the weighted covering relation, and consistency regarding the degree of alternatives. Lastly, we provide upper and lower bounds on the MoV and analyse the expressiveness of the MoV values for each of the three tournament solutions using experimental results on tournaments of varying sizes.
Online Stochastic Optimization (Gravin)
There is a growing number of interesting stochastic models complementing the traditional worst-case analysis in the area of online algorithms. Two popular stochastic frameworks are the random arrival model and the Bayesian optimization framework. I will introduce these two frameworks using the examples of the secretary problem and the prophet inequality and discuss a few of their combinatorial extensions. Special attention will be given to various online matching models, a central setting for online algorithms due to its many applications in online advertisement, ridesharing, and online marketplaces. I will give an overview of different models and open research directions. No background knowledge is required, although some familiarity with online algorithms and stochastic optimization might be useful.
Based on a series of papers with Tomer Ezra, Michal Feldman, Enze Sun, Zhihao Tang, Hongao Wang, and Kangning Wang.
New Fairness Notions for Resource Allocation (Varricchio)
We consider the fundamental problem in resource allocation that entails fairly dividing a set of indivisible goods among agents. The notion of envy-freeness up to any good (EFX) is the most compelling notion of fairness in this line of work. We say an allocation is EFX if every agent (weakly) prefers her own bundle over any other agent's bundle after removing her least preferred good from the latter. Despite significant efforts over the past few years, the existence of EFX allocations is not known, even for additive valuations. Therefore, there has been a lot of work focusing on relaxations and approximations of EFX.
In our work, we propose two natural fairness notions that are inspired by EFX. The first is epistemic EFX (EEFX), which ensures every agent is EFX-satisfied upon some redistribution of the other agents' goods. The second is the minimum EFX value (MXS), a threshold-based criterion defined using EFX allocations. Interestingly, we show that both EEFX and MXS allocations are always guaranteed to exist for additive valuations, and can be computed in polynomial time.
Joint work with Ioannis Caragiannis, Jugal Garg, Nidhi Rathi, and Eklavya Sharma.
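For additive valuations, the EFX condition is easy to check directly; the following minimal sketch (the dictionary-based data layout is an illustrative choice) makes the definition concrete by testing whether envy survives removal of the least-valued good of the envied bundle:

def is_efx(alloc, value):
    # alloc: dict agent -> set of goods; value[i][g]: agent i's value for g.
    # EFX: for all i, j:  v_i(A_i) >= v_i(A_j) - min_{g in A_j} v_i(g).
    for i in alloc:
        own = sum(value[i][g] for g in alloc[i])
        for j in alloc:
            if i == j or not alloc[j]:
                continue
            other = sum(value[i][g] for g in alloc[j])
            least = min(value[i][g] for g in alloc[j])
            if own < other - least:
                return False
    return True

# Two agents with identical additive values over three goods
value = {0: {'a': 3, 'b': 2, 'c': 2}, 1: {'a': 3, 'b': 2, 'c': 2}}
print(is_efx({0: {'a'}, 1: {'b', 'c'}}, value))  # True: 3 >= 4 - 2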
Towards Rapidly Transforming Programmable Matter (Scheideler)
The Amoebot model is a model for programmable matter that has found increased interest in recent years. However, many important issues are not captured in the basic form of this model, including rapid shape transformation. I propose a reconfigurable circuit extension that allows many central problems like leader election, compass alignment, and symmetry checks to be solved significantly faster than in the basic model. This provides the base for synchronization and energy distribution that can then be used for rapid shape transformation.
On the Hierarchy of Distributed Majority Protocols (Schallmoser)
We study the Consensus problem among n agents, defined as follows. Initially, each agent holds one of two possible opinions. The goal is to reach a consensus configuration in which every agent shares the same opinion. To this end, agents randomly sample other agents and update their opinion according to a simple update function depending on the sampled opinions. We consider two communication models: the gossip model and a variant of the population model. In the gossip model, agents are activated in parallel, synchronous rounds. In the population model, one agent is activated after the other in a sequence of discrete time steps. For both models we analyze the following natural family of majority processes called j-Majority: when activated, every agent samples j other agents uniformly at random (with replacement) and adopts the majority opinion among the sample (breaking ties uniformly at random). As our main result we show a hierarchy among majority protocols: (j + 1)-Majority (for j > 1) converges stochastically faster than j-Majority for any initial opinion configuration. In our analysis we use Strassen’s Theorem to prove the existence of a coupling. This gives an affirmative answer for the case of two opinions to an open question asked by Berenbrink et al. [2017].
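To make the j-Majority process concrete, here is a small simulation sketch of the gossip-model variant with two opinions (the sampling and tie-breaking rules follow the description above; everything else is an illustrative choice):

import random

def j_majority_gossip(opinions, j, rounds, seed=1):
    # One synchronous round: every agent samples j agents uniformly at
    # random with replacement and adopts the majority opinion among the
    # sample, breaking ties uniformly at random (opinions are 0/1).
    rng = random.Random(seed)
    n = len(opinions)
    history = [sum(opinions)]
    for _ in range(rounds):
        new = []
        for _ in range(n):
            ones = sum(opinions[rng.randrange(n)] for _ in range(j))
            if 2 * ones > j:
                new.append(1)
            elif 2 * ones < j:
                new.append(0)
            else:
                new.append(rng.randint(0, 1))
        opinions = new
        history.append(sum(opinions))
        if history[-1] in (0, n):
            break   # consensus reached
    return history

# Slight initial majority for opinion 1; larger j converges faster
print(j_majority_gossip([1] * 520 + [0] * 480, j=3, rounds=50))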
Online Routing and Network Design with Predictions (Megow)
Online optimization refers to solving problems where an initially unknown input is revealed incrementally, and irrevocable decisions must be made without knowing future requests. The assumption of not having any prior knowledge about future requests seems overly pessimistic. Given the success of machine-learning methods and data-driven applications, one may expect to have access to predictions about future requests. However, simply trusting them might lead to very poor solutions, as these predictions come with no quality guarantee. In this talk we present recent developments in the young line of research that integrates such error-prone predictions into algorithm design to break through worst-case barriers. We discuss algorithmic challenges with a focus on online routing and network design and present algorithms with performance guarantees depending on a novel error metric.
Joint work with Giulia Bernardini, Alexander Lindermayr, Alberto Marchetti-Spaccamela, Leen Stougie, and Michelle Sweering.
Algorithms for Non-Linear Preferential Attachment (Allendorf)
Preferential attachment lies at the heart of many network models aiming to replicate features of real world networks. To simulate the attachment process, conduct statistical tests, or obtain input data for benchmarks, efficient algorithms are required that are capable of generating large graphs according to these models.
Existing graph generators are optimized for the simplest model, where new nodes that arrive in the network are connected to earlier nodes with probability $P(h) \propto d$, depending linearly on the degree $d$ of the earlier node $h$. Yet, some networks are better explained by a more general attachment probability $P(h) \propto f(d)$ for some function $f \colon \mathbb N~\to~\mathbb R$.
Here, the polynomial case $f(d) = d^\alpha$ where $\alpha \in \mathbb R_{>0}$ is of the greatest interest.
In this paper, we present efficient algorithms that generate graphs according to the more general models.
To this end, we design a new method for maintaining a discrete distribution which leads to a simple yet optimal algorithm for the polynomial model. We parallelize the algorithm by identifying batches of samples that can be drawn independently and obtain a near-optimal speedup when adding many nodes.
In addition, we present an I/O-efficient algorithm for use in external memory that can even be used for the fully general model.
To showcase the efficiency and scalability of our algorithms, we conduct an experimental study and compare their performance to existing solutions for the linear case.
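As a point of reference, the following sketch generates a graph under the polynomial model f(d) = d^alpha with naive linear-time sampling per inserted node; the algorithms in the paper replace exactly this sampling step by an efficient dynamic discrete distribution:

import random

def nonlinear_pa(n, alpha, seed=0):
    # Grow a graph node by node; a new node u attaches to an existing
    # node h with probability proportional to f(deg(h)) = deg(h)**alpha.
    # Naive O(n) sampling per step -- a correctness baseline only.
    rng = random.Random(seed)
    degree = {0: 1, 1: 1}            # seed graph: single edge (0, 1)
    edges = [(0, 1)]
    for u in range(2, n):
        nodes = list(degree)
        weights = [degree[h] ** alpha for h in nodes]
        h = rng.choices(nodes, weights=weights)[0]
        edges.append((u, h))
        degree[h] += 1
        degree[u] = 1
    return edges

print(len(nonlinear_pa(1000, alpha=1.5)))  # 999 edges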
Strong Approximations and Irrationality in Financial Networks with Derivatives (Ioannidis)
Financial networks model a set of financial institutions (firms) interconnected by obligations. Recent work has introduced to this model a class of obligations called credit default swaps, a certain kind of financial derivative. The main computational challenge for such systems is known as the clearing problem, which is to determine which firms are in default and to compute their exposure to systemic risk, technically known as their recovery rates. It is known that the recovery rates form the set of fixed points of a simple function, and that these fixed points can be irrational. We further study the clearing problem from the point of view of irrationality and approximation strength. Firstly, we observe that weakly approximate solutions may contain misleading information regarding the actual financial state of an institution, thus making them hard to justify. On this basis, we study the complexity of finding a strongly (or "near") approximate solution, and show FIXP-completeness. We then study the structural properties required for irrationality, and we give necessary conditions for irrational solutions to emerge: the presence of certain types of cycles in a financial network forces the recovery rates to take the form of roots of non-linear polynomials of degree two or higher. In the absence of a large subclass of such cycles, we study the complexity of finding an exact fixed point, which we show to be a problem close to, albeit outside of, PPAD.
The Full Rank Condition for Sparse Random Matrices (Rolvien)
We derive a sufficient condition for a sparse random matrix with given numbers of non-zero entries in the rows and columns having full row rank. The result covers both matrices over finite fields with independent non-zero entries and {0, 1}-matrices over the rationals. The sufficient condition is generally necessary as well.
Introducing JUNE - an open-source epidemiological simulation (Krauss)
We constructed a detailed digital twin of the UK population, with very fine social and geographical granularity, representing 55 million residents in England and Wales and tracing their daily movements and activities. The infection is propagated through the virtual population via simulated social contacts, and the progress of the disease in infected individuals and their trajectory through the healthcare system is then described, based on public health data. Various concurrent strains as well as pharmaceutical (vaccines) and non-pharmaceutical interventions (e.g. social distancing) are implemented. JUNE resolves the spatio-temporal development of the disease spread through the population in great detail and is able to find non-trivial correlations of infection and fatality rates with a variety of societal factors. Our model has proven itself in the midst of the current crisis and is currently being used by NHS-England’s COVID-19 response team to inform their strategic and operational planning. As a side product, we collaborate with WHO and UN Global Pulse to roll out the model to one of the world's largest refugee settlements.
Initial alignment in neural networks and its necessity for learning by gradient descent (Hązła)
We study the question of which Boolean functions can be efficiently learned by neural networks using the gradient descent (GD) algorithm. We introduce the notion of "initial alignment" (INAL) between a randomly initialized neural network and its target function. We then show that if the INAL is not noticeably large, then a fully connected neural network with i.i.d. initialization will not learn the function with noisy GD in polynomial time. The result is based on analyzing the relation of INAL to the Boolean Fourier expansion and the notion of cross-predictability of a class of functions.
Joint work with Emmanuel Abbe, Elisabetta Cornacchia and Christopher Marquis.
Single-source Shortest p-disjoint Paths: Fast Computation and Sparse Preservers (Bilò)
On the External Validity of Average-Case Analyses of Graph Algorithms (Fischbeck)
The number one criticism of average-case analysis is that we do not actually know the probability distribution of real-world inputs. Thus, analyzing an algorithm on some random model has no implications for practical performance. At its core, this criticism doubts the existence of external validity, i.e., it assumes that algorithmic behavior on the somewhat simple and clean models does not translate beyond the models to practical performance on real-world inputs.
With this paper, we provide a first step towards studying the question of external validity systematically. To this end, we evaluate the performance of six graph algorithms on a collection of 2751 sparse real-world networks depending on two properties: the heterogeneity (variance in the degree distribution) and locality (tendency of edges to connect vertices that are already close). We compare this with the performance on generated networks with varying locality and heterogeneity. We find that the performance in the idealized setting of network models translates surprisingly well to real-world networks. Moreover, heterogeneity and locality appear to be the core properties impacting the performance of many graph algorithms.
Competitive Division of Goods, Bads, and Mixed (Mehta)
Fair division is the problem of allocating a set of items among agents in a fair and efficient manner. This age-old problem, mentioned even in the Bible, arises naturally in a wide range of real-life settings, for example, school seat assignments, partnership dissolution, sharing of satellites, and dividing costs for climate resilience. Division based on competitive equilibrium (CE) has emerged as one of the best mechanisms for this problem. The existence and computability of CE have been extensively studied when all items are disposable goods, while the problem is less explored when some of them are non-disposable chores (bads). In this talk, I will discuss recent algorithmic advances on the computation of CE when each item may be a good, a chore, or both (mixed).
I will first consider the case of additive valuations, where when all items are goods, the CE set is well-known to be captured by convex programming formulations and thereby forms a convex set. In sharp contrast, with chores, the CE set may be nonconvex and disconnected. I will discuss how to handle this non-convexity through a new exterior-point method to find an approximate CE in polynomial time (FPTAS). This method seems general enough to work with any mathematical formulation that optimizes a coordinate-wise monotone function over linear constraints. Finally, I will discuss recent developments on the exchange setting (barter system).
Based on joint works with Shant Boodaghians, Bhaskar Ray Chaudhury, Jugal Garg, and Peter McGlaughlin.
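For reference, the convex program alluded to above for the all-goods case is the classical Eisenberg-Gale program (notation assumed here: budgets $b_i$, additive valuations $v_{ij}$, fractional allocation $x_{ij}$ of one unit of each good $j$):

$$\max \sum_i b_i \log u_i \quad \text{s.t.} \quad u_i = \sum_j v_{ij} x_{ij} \ \forall i, \qquad \sum_i x_{ij} \le 1 \ \forall j, \qquad x \ge 0.$$

Its optimal solutions are exactly the competitive equilibrium allocations, with equilibrium prices arising as the dual variables of the supply constraints; with chores, such a single convex characterization is precisely what breaks down.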
Public Signals in Network Congestion Games (Koglin)
We consider a largely untapped potential for the improvement of traffic networks that is rooted in the inherent uncertainty of travel times. Travel times are subject to stochastic uncertainty resulting from various parameters such as weather conditions, occurrences of road works, or traffic accidents. Large mobility services have an informational advantage over single network users as they are able to learn traffic conditions from data. A benevolent mobility service may use this informational advantage in order to steer the traffic equilibrium into a favorable direction. The resulting optimization problem is a task commonly referred to as signaling or Bayesian persuasion. Previous work has shown that the underlying signaling problem can be NP-hard to approximate within any non-trivial bounds, even for affine cost functions with stochastic offsets. In contrast, we show that in this case, the signaling problem is easy for many networks. We tightly characterize the class of single-commodity networks in which full information revelation is always an optimal signaling strategy. Moreover, we construct a reduction from optimal signaling to computing an optimal collection of support vectors for the Wardrop equilibrium. For two states, this insight can be used to compute an optimal signaling scheme. The algorithm runs in polynomial time whenever the number of different supports resulting from any signal distribution is bounded by a polynomial in the input size. Using a cell decomposition technique, we extend the approach to a polynomial-time algorithm for multi-commodity parallel-link networks with a constant number of commodities, even when we have a constant number of different states of nature.
This is joint work with Svenja Marie Griesbach, Martin Hoefer and Max Klimm.
Similarity-Based Hierarchical Clustering in Tree-Like Networks (Mnich)
Hierarchical clustering of networks is a fundamental task with important applications in phylogenetics, network security analysis, and software engineering, which groups together data points that are similar to each other and separates data points which are different. Compared to flat clustering, which requires guessing or searching the "optimal" number of groups or clusters in advance, hierarchical clustering does not have such a requirement. Moreover, hierarchical clustering can detect the hidden cluster structure at all levels of granularity simultaneously.
This concept of hierarchical clustering was formalized by Dasgupta (STOC 2016), who proposed algorithms based on sparsest cut computations to identify the similarities and dissimilarities in large networks. For tree-like networks of treewidth k and size n, Chlamtáč et al. (APPROX 2010) presented an algorithm that yields an exp(exp(k))-approximation to the optimal sparsest cut in time exp(k) * poly(n). Gupta et al. (STOC 2013) showed how to obtain an O(1)-approximation with a blown-up run time of n^{O(k)}. An intriguing open question is whether one can simultaneously achieve the best of the aforementioned results, that is, an O(1)-approximation in time exp(k) * poly(n). We achieve this goal via the following results: (i) An O(k^2)-approximation that runs in time exp(k) * poly(n), simultaneously improving both the approximation factor and the run time of the result by Chlamtáč et al. (ii) For any \varepsilon>0, an O(1/\varepsilon^2)-approximation in time 2^{O(k^{1+\varepsilon}/\varepsilon)} * poly(n), implying an O(1)-approximation whose run time is almost exp(k) * poly(n) and an O(\log^2 k)-approximation in time k^{O(k)} * poly(n). Key to our results is a novel measure of "combinatorial diameter" for tree-like networks, which we introduce here and which may be of independent interest.
(Joint work with Parinya Chalermsook, Matthias Kaul, Joachim Spoerhase and Daniel Váz).
Engineering Uniform Sampling of Graphs with a Prescribed Power-law Degree-Sequence (Allendorf)
We consider the following common network analysis problem: given a degree sequence d = (d_1, ..., d_n) in N^n, return a uniform sample from the ensemble of all simple graphs with matching degrees. In practice, the problem is typically solved using Markov Chain Monte Carlo approaches, such as Edge-Switching or Curveball, even though no practically useful rigorous bounds on their mixing times are known. In contrast, Arman et al. sketch Inc-Powerlaw, a novel and much more involved algorithm capable of generating graphs for power-law bounded degree sequences with gamma > 2.88 in expected linear time. For the first time, we give a complete description of the algorithm and add novel switchings. To the best of our knowledge, our open-source implementation of Inc-Powerlaw is the first practical generator with rigorous uniformity guarantees for the aforementioned degree sequences. In an empirical investigation, we find that for small average degrees Inc-Powerlaw is very efficient and generates graphs with one million nodes in less than a second. For larger average degrees, parallelism can partially mitigate the increased running time.
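For contrast with Inc-Powerlaw, here is a minimal sketch of the Edge-Switching chain mentioned above, which preserves the degree sequence and has the uniform distribution as its stationary distribution, but comes without useful mixing-time bounds (the data layout is an illustrative choice):

import random

def edge_switch(edges, steps, seed=42):
    # Pick two edges {a,b}, {c,d} u.a.r., flip a coin for orientation, and
    # try to rewire them to {a,c}, {b,d}; reject any switch that would
    # create a self-loop or a multi-edge. Degrees stay invariant.
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    m = len(edges)
    for _ in range(steps):
        i, j = rng.randrange(m), rng.randrange(m)
        if i == j:
            continue
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c
        if len({a, b, c, d}) < 4:
            continue                    # would create a self-loop
        e1, e2 = frozenset((a, c)), frozenset((b, d))
        if e1 in present or e2 in present:
            continue                    # would create a multi-edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {e1, e2}
        edges[i], edges[j] = (a, c), (b, d)
    return edges

print(edge_switch([(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)], steps=1000))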
Dynamics for Multi-Agent System Coordination in Noisy and Stochastic Environments (Natale & D'Amore)
In recent years, there has been a surge of interest in the study of abstract systems composed of simple interacting agents. In particular, some computer scientists have focused their attention on algorithms that attempt to capture aspects of biological systems, i.e., natural algorithms. Here, we consider two distributed computing tasks that affect many biological systems: the consensus problem and the ANTS problem. When trying to reach consensus, real entities may be affected by some form of communication noise rather than adversarial failures. In this framework, we study two well-known opinion dynamics (the Undecided-State dynamic and the 3-Majority dynamic), which are known to converge quickly to agreement in noise-free environments, to which we add a uniform communication noise feature: we show that both dynamics exhibit similar phase transitions. In the ANTS (Ants Nearby Treasure Search) problem, on the other hand, a system of independent agents in the two-dimensional lattice must find a given (unknown) target as fast as possible. We consider Lévy walks, special random walks that have been shown to model many animal movement patterns, and show that they can provide a near-optimal solution to the ANTS problem. These results are based on the works [d'Amore et al., SIROCCO 2020], [Clementi et al., PODC 2021], and [d'Amore and Ziccardi, SIROCCO 2022], which are joint works with A. Clementi, G. Giakkoupis, E. Natale, and I. Ziccardi.
Population Protocols for Exact Plurality Consensus: How a small chance of failure helps to eliminate insignificant opinions (Schallmoser)
Joint work with Petra Berenbrink, Felix Biermeier, Robert Elsässer, Hamed Hosseinpour, and Peter Kling
New Perspectives on Classic Questions in the Theory of Graph Algorithms (Nanongkai)
The theory of graph algorithms is a research field that develops mathematical techniques with provable guarantees to eventually make computing devices process graph (network) data with fewer resources such as time, energy, and hardware. Despite its pivotal role in computer science over many decades, research in the theory of graph algorithms still lacks techniques for designing efficient algorithms for some of the most basic tasks, and for exploiting technological advances from the last few decades: On the one hand, the ever-increasing data volumes call for nearly-linear time algorithms, which are still elusive for many basic problems. On the other hand, to exploit advances in computer architecture and large-scale systems, we need algorithms that are efficient in models of computation beyond the classic sequential setting.
In this talk, I will discuss some recent progress towards efficient algorithms across many computational models such as dynamic graphs, distributed networks, and streaming algorithms. Basic graph problems that will be discussed include cut, matching, and shortest paths. No background is required, although familiarity with basic graph terminologies and algorithms will be useful.
Semi-Definite Programming meets Stance Classification - how to turn theory into good practice (Vilenchik)
Stance detection is an important task, supporting many downstream tasks such as discourse parsing and modeling the propagation of fake news, rumors, and science denial. In this talk we describe a novel framework for stance detection. Our framework is unsupervised and domain-independent. Given a claim and a multi-participant discussion, we construct the interaction network from which we derive topological embeddings for each speaker. These speaker embeddings enjoy the following property: speakers with the same stance tend to be represented by similar vectors, while antipodal vectors represent speakers with opposing stances. These embeddings are then used to divide the speakers into stance-partitions. Our embedding is derived from the Semi-Definite Programming (SDP) solution to the max-cut problem on the interaction network. In this talk we shall explain how the success of our method is rooted in theoretical results on SDP integrality for random k-colorable graphs. We evaluated our method on three different datasets from different platforms. Our method outperforms or is comparable with supervised models, including neural-network-based approaches. Joint work with Ron Korenblum Pick, Vladyslav Kozhukhov, and Oren Tsur. The paper was accepted to AAAI 2022: https://arxiv.org/abs/2112.00712
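A minimal sketch of the general recipe (not the authors' exact pipeline; the solver choice, sign convention, and rounding are illustrative) using cvxpy: solve the max-cut SDP on a signed interaction matrix W, factor the PSD solution into one vector per speaker, and split by a random hyperplane:

import cvxpy as cp
import numpy as np

def sdp_stance_embedding(W, seed=0):
    # Goemans-Williamson SDP relaxation of max-cut:
    #   maximize sum_{u,v} W[u,v] * (1 - X[u,v]) / 4  s.t.  X PSD, diag(X) = 1
    n = W.shape[0]
    X = cp.Variable((n, n), PSD=True)
    problem = cp.Problem(cp.Maximize(cp.sum(cp.multiply(W, 1 - X)) / 4),
                         [cp.diag(X) == 1])
    problem.solve()
    # Factor X = V V^T; the rows of V are the speaker embeddings
    w, U = np.linalg.eigh(X.value)
    V = U @ np.diag(np.sqrt(np.maximum(w, 0)))
    r = np.random.default_rng(seed).standard_normal(n)
    return V, np.sign(V @ r)            # embeddings + stance partition

# Positive entries mark disagreement (edges we want cut), negative agreement:
# speakers 0 and 1 agree with each other, both clash with speaker 2.
W = np.array([[0, -1, 1], [-1, 0, 1], [1, 1, 0]], dtype=float)
print(sdp_stance_embedding(W)[1])       # separates {0, 1} from {2}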
A sparse parity matrix (Lee)
We study a random square {0,1}-matrix in which every entry is 1 with probability p = d/n independently. We show that this matrix defies the usual zero-one law behavior when d > e: the fraction of the variables that take the same value in the kernel of the matrix vacillates between two values, each attained with probability 1/2.
On The Unreasonable Effectiveness of SAT Solvers (Ganesh)
Over the last two decades, software engineering (broadly construed to include testing, analysis, synthesis, verification, and security) has witnessed a silent revolution in the form of Boolean SAT and SMT solvers. These solvers are now integral to many testing, analysis, synthesis, and verification approaches. This is largely due to a dramatic improvement in the scalability of these solvers vis-a-vis large real-world formulas. What is surprising is that the Boolean satisfiability problem is NP-complete, believed to be intractable, and yet these solvers easily solve industrial instances containing millions of variables and clauses. How can that be? In my talk, I will address this question of why SAT solvers are so efficient through the lens of machine learning (ML) as well as ideas from (parameterized) proof complexity. While the focus of my talk is almost entirely empirical, I will show how we can leverage theoretical ideas to not only deepen our understanding but also to build better SAT solvers. I will argue that SAT solvers are best viewed as proof systems, composed of two kinds of sub-routines: ones that implement proof rules, and others that are prediction engines optimizing some metric correlated with solver running time. These prediction engines can be built using ML techniques, whose aim is to structure solver proofs in an optimal way. Thus, two major paradigms of AI, namely machine learning and logical deduction, are brought together in a principled way in order to design efficient SAT solvers. A result of my research is the MapleSAT solver, which has won several recent international SAT competitions and is widely used in industry and academia.
The Impact of Heterogeneity and Geometry on the Proof Complexity of Random Satisfiability (Rothenberger)
Satisfiability is considered the canonical NP-complete problem and is used as a starting point for hardness reductions in theory, while in practice heuristic SAT solving algorithms can solve large-scale industrial SAT instances very efficiently. This disparity between theory and practice is believed to be a result of inherent properties of industrial SAT instances that make them tractable. Two characteristic properties seem to be prevalent in the majority of real-world SAT instances: a heterogeneous degree distribution and locality. To understand the impact of these two properties on SAT, we study the proof complexity of random k-SAT models that allow us to control heterogeneity and locality. Our findings show that heterogeneity alone does not make SAT easy, as heterogeneous random k-SAT instances have superpolynomial resolution size. This implies intractability of these instances for modern SAT solvers. On the other hand, modeling locality with an underlying geometry leads to small unsatisfiable subformulas, which can be found within polynomial time. A key ingredient for the result on geometric random k-SAT can be found in the complexity of higher-order Voronoi diagrams. As an additional technical contribution, we show an upper bound on the number of non-empty Voronoi regions, which holds for points with random positions in a very general setting. In particular, it covers arbitrary p-norms, higher dimensions, and weights affecting the area of influence of each point multiplicatively. Our bound is linear in the total weight. This is in stark contrast to quadratic lower bounds for the worst case.
Incomplete Information VCG Contracts for Common Agency (Talgam-Cohen)
We study contract design for welfare maximization in the well-known “common agency” model of Bernheim and Whinston [1986]. This model combines the challenges of coordinating multiple principals with the fundamental challenge of contract design: that principals have incomplete information about the agent’s choice of action. Motivated by the significant social inefficiency of standard contracts for such settings (which we formally quantify using a price of anarchy/stability analysis), we investigate whether and how a recent toolbox developed for the first set of challenges under a complete-information assumption - VCG contracts [Lavi and Shamash, 2019] - can be extended to incomplete information. This is joint work with Tal Alon, Ron Lavi and Elisheva Shamash.
Delegated Online Search (Hahn)
In a delegation problem, a principal with commitment power tries to pick one out of n options. Each option is drawn independently from a known distribution. Instead of inspecting the options herself, the principal delegates the information acquisition to a rational and self-interested agent. After inspection, the agent proposes one of the options, and the principal can accept or reject.
We study a natural online variant of delegation, in which the agent searches through the options in an online fashion. For each option, he has to irrevocably decide if he wants to propose the current option or discard it, before seeing information on the next option(s). How can we design algorithms for the principal that approximate the utility of her best option in hindsight?
Temporal Graph Exploration (Erlebach)
A temporal graph is a sequence of static graphs that all have the same vertex set, but the edge set in each step can be different. An agent can in each step either stay at its current vertex or move over an incident edge that is present in that step. The goal is to let the agent visit all vertices of the temporal graph as quickly as possible. While it is easy to visit all vertices of a static graph with n vertices in O(n) steps (e.g., by using depth-first search), the exploration of temporal graphs may require a much larger number of steps: We show that even if the temporal graph is a connected graph in each step and the dynamics are fully known in advance, exploration may require a quadratic number of steps. We also present upper and lower bounds on the worst-case exploration time of temporal graphs for various restricted cases (e.g., restrictions on the underlying graph or on the maximum degree of the graph in each step), outlining the main techniques that have been used to obtain these bounds and highlighting the many cases with large gaps between the known upper and lower bounds. (The talk is based on joint work with Michael Hoffmann, Frank Kammer, Kelin Luo, Andrej Sajenko, and Jakob Spooner.)
A loosely self-stabilizing leaderless phase clock (Hahn)
We present a self-stabilizing phase clock for population protocols. In the population model we are given a system of n identical agents which interact in a sequence of randomly chosen pairs. Our phase clock is leaderless and requires O(log n) states. It runs forever and is, at any point of time, in a synchronous state w.h.p. When started in an arbitrary configuration, it recovers rapidly and enters a synchronous configuration within O(log n) parallel time w.h.p. Once the clock is synchronized, it stays in a synchronous configuration for at least poly(n) parallel time w.h.p.
Learning to prune instances of combinatorial optimisation problems (Ajwani)
In a large number of industrial applications, combinatorial optimisation problems are repeatedly solved with datasets from similar distributions. In recent years, machine learning techniques have been shown to be quite effective in speeding up such computations. However, black-box end-to-end machine learning approaches suffer from poor interpretability and the requirement for a large amount of labelled data. In this talk, I will present a simple and highly effective machine learning framework that incorporates insights from the algorithmic and optimization literature to speed up the solution of these problems. We look at the kinds of quantities the approximation algorithms employ and use these to derive useful features for training a classifier that helps quickly reduce the problem size by identifying the difficult core of the problem and pruning the remainder. The difficult core is then solved using an ILP solver. A prime advantage of using such features is that we do not require much data to train the classifier.
I will present results on a range of combinatorial optimisation problems: maximum clique enumeration, k-median, facility location, set cover, maximum coverage, minimum Steiner tree, the travelling salesman problem, the capacitated vehicle routing problem and the nurse rostering problem. I will show that in all of these problems, the learning-to-prune framework is able to reduce the computation time drastically while still obtaining a close-to-optimal solution.
An Experimental Study of External Memory Algorithms for Connected Components (Tran)
We empirically investigate algorithms for solving Connected Components in the external memory model. In particular, we study whether the randomized O(Sort(E)) algorithm by Karger, Klein, and Tarjan can be implemented to compete with practically promising and simpler algorithms having only slightly worse theoretical cost, namely Borůvka’s algorithm and the algorithm by Sibeyn and collaborators. For all algorithms, we develop and test a number of tuning options. Our experiments are executed on a large set of different graph classes including random graphs, grids, geometric graphs, and hyperbolic graphs. Among our findings are: The Sibeyn algorithm is a very strong contender due to its simplicity and due to an added degree of freedom in its internal workings when used in the Connected Components setting. With the right tunings, the Karger-Klein-Tarjan algorithm can be implemented to be competitive in many cases. Higher graph density seems to benefit Karger-Klein-Tarjan relative to Sibeyn. Borůvka’s algorithm is not competitive with the two others.
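For orientation, here is an in-memory sketch of the hooking-and-contraction pattern behind Borůvka-style connected components (the external-memory variants implement each round with sorting and scanning passes instead of random access; this sketch is an illustration, not the benchmarked code):

def boruvka_cc(n, edges):
    # Each round, every component root that sees a cross-edge hooks onto a
    # smaller root; components shrink geometrically, so O(log n) rounds.
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    while True:
        hook = {}
        for u, v in edges:
            ru, rv = find(u), find(v)
            if ru != rv:
                hook.setdefault(max(ru, rv), min(ru, rv))
        if not hook:                       # no cross-edges left: done
            return [find(v) for v in range(n)]
        for child, par in hook.items():    # hooks go to strictly smaller
            parent[child] = par            # labels, so no cycles arise

print(boruvka_cc(6, [(0, 1), (1, 2), (3, 4)]))  # [0, 0, 0, 3, 3, 5]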
Spreading Processes on Spatial Contact Networks (Lapinskas)
We study the spread of information (or an epidemic) through a random graph intended to model a real-world social network. A decade ago, it was a long-standing open problem to find any such random graph model which maintained enough independence between vertices and edges to be pleasant to work with. Now, a closely-related family of such models has been developed. We study two models in this family, geometric inhomogeneous random graphs (GIRGs) and scale-free percolation (SFP). GIRGs include as a sub-family arguably the most famous such model: hyperbolic random graphs. The talk will not assume any prior background in the area, and it will begin with an introduction to GIRGs and SFP and to the properties of social networks in general.
From prior work by Komjathy and Lodewijks, we know that if we consider a standard SI spreading model on any GIRG or SFP graph whose parameters give rise to a giant component, then for most reasonable choices of transmission parameters the process will “explode” with non-zero probability (covering a constant proportion of vertices in constant time). This is unrealistic in many situations, which suggests it may be the wrong model for those situations. We show that by adding a small transmission penalty to high-degree vertices, we can recover the expected behaviour of a phase transition to explosion, and we locate this phase transition. We will also discuss unpublished future work in which we determine many of the transitions between explosion, exponential spread, and even polynomial spread. The idea of polynomial spread is surprising, but true to life; some epidemics do spread at polynomial rates, such as Ebola in West Africa, and understanding the cause of this phenomenon is an open problem.
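To give a feel for the model, here is a quadratic-time reference sampler for a 1-dimensional GIRG in one common parameterization (Pareto weights with tail controlled by beta, torus distances, and connection probability min(1, (w_u w_v / (W * dist))^alpha); the constants and exponent placement vary across papers and are assumptions of this sketch):

import random

def sample_girg(n, alpha=1.5, beta=2.5, seed=0):
    # Positions uniform on the circle, heavy-tailed Pareto weights, and
    # edge probability min(1, (w_u * w_v / (W * dist(u, v))) ** alpha).
    rng = random.Random(seed)
    pos = [rng.random() for _ in range(n)]
    w = [(1 - rng.random()) ** (-1 / (beta - 1)) for _ in range(n)]
    W = sum(w)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            dist = min(abs(pos[u] - pos[v]), 1 - abs(pos[u] - pos[v]))
            if rng.random() < min(1.0, (w[u] * w[v] / (W * dist)) ** alpha):
                edges.append((u, v))
    return edges

print(len(sample_girg(500)))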
On codes decoding error on the BEC and BSC (Hazla)
I will talk about how to use a recent inequality on Boolean functions by Samorodnitsky to obtain a result in coding theory. Namely, assume that a linear code successfully corrects errors on the binary erasure channel (BEC) under optimal (maximum a posteriori, or MAP) decoding. Then this code is also successful on a range of other channels with somewhat smaller capacities, including the binary symmetric channel (BSC). One previously unknown consequence is that constant-rate Reed-Muller codes correct errors on BSC(p) for some constant p > 0.
Warning Propagation on Random Graphs (Ravelomanana)
Warning Propagation is a combinatorial message passing algorithm that unifies and generalises a wide variety of recursive combinatorial procedures. Special cases include the Unit Clause Propagation and Pure Literal algorithms for satisfiability as well as the peeling process for identifying the k-core of a random graph. Here we analyse Warning Propagation in full generality on the binomial random graph. We prove that under a mild stability assumption Warning Propagation converges rapidly. In effect, the analysis of the fixed point of the message passing process on a random graph reduces to analysing the process on a Galton-Watson tree. This result corroborates and generalises a heuristic first put forward by Pittel, Spencer and Wormald in their seminal k-core paper (JCTB 1996).
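As a concrete special case of the recursions analysed here, the peeling process for the k-core can be written in a few lines (a standard sketch; the adjacency-dictionary layout is an illustrative choice):

from collections import deque

def k_core(adj, k):
    # Repeatedly delete vertices of degree < k; the surviving vertices
    # form the (possibly empty) k-core. adj: dict vertex -> set of neighbors.
    deg = {v: len(nb) for v, nb in adj.items()}
    queue = deque(v for v, d in deg.items() if d < k)
    removed = set()
    while queue:
        v = queue.popleft()
        if v in removed:
            continue
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                deg[u] -= 1
                if deg[u] < k:
                    queue.append(u)
    return set(adj) - removed

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(k_core(adj, 2))  # {0, 1, 2}: vertex 3 gets peeled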
Networked Contingent Convertible Obligations and Financial Stability (Feinstein)
In this talk we investigate whether a financial system can be made more stable if financial institutions share risk by exchanging contingent convertible (CoCo) debt obligations. The question is framed in a financial network with debt and equity interlinkages with the addition of CoCos that convert continuously when a bank's debt-equity ratio drops to a trigger level. We characterize the clearing problem for the interbank debt and equity at the maturity of the obligations. We then introduce stylized networks to study when introducing CoCo bonds improves financial stability, as well as specific networks for which CoCo bonds do not uniformly provide improved system performance. To return to the main question, we examine the EU financial network at the time of the 2011 EBA stress test to do comparative statics to study the implications of CoCo debt on financial stability. It is found that by replacing all unsecured interbank debt by standardized CoCo interbank debt securities, systemic risk in the EU will decrease and bank shareholder value will increase. This is joint work with T.R. Hurd (McMaster University).
Stochastic Models for Bipartite Random Graph Models (Gerstorfer)
We study the embedding of bipartite random graph models using Bayesian stochastic models. Given a list of bipartite edges, we use Variational Inference algorithms to estimate the optimal parameters for a generative network model, so that the input network is a typical outcome of the generative model. In this talk, I will present the methods we use for our models and explain the main concepts of our approach.
Theoretical Algorithm Analysis meets Practical Data (Bläsius)
A theoretical running time analysis is a great tool for predicting the performance of algorithms, saving programmers from implementing and evaluating alternatives that are "obviously" worse. Though this works very well in many cases, there are situations where the predictions do not match practical observations. This often comes from the fact that practical problem instances have certain properties that make them well-behaved. In this talk, I formalize these properties using probabilistic models and discuss theoretical results based on these formalizations. In the course of this, I will discuss advantages and limitations of this model-based approach.
Eventually Exponential Contagion via Local and Global Contacts (Fischbeck)
Bootstrap percolation is a classical model for the spread of information in a network. In the round-based version, nodes of an undirected graph become active once at least r neighbors were active in the previous round. We propose the perturbed percolation process, a superposition of two percolation processes on the same node set. One process acts on a base graph with activation threshold 1, while the other acts on a random graph with threshold r (representing local and global contacts, respectively). We consider a class of grid-like base graphs, in which the bootstrap percolation process only spreads polynomially fast. As random graph, we consider Erdős–Rényi graphs. For r=1, the perturbed percolation process percolates exponentially fast, as the diameter is known to be low. For r>=2 however, we prove that the process initially spreads polynomially fast, with no activations via the random graph, and eventually the activation rate becomes exponential, driven only by the random graph. We complement our theoretical results by empirical analyses of this process. In addition, we discuss future work on the inclusion of recovery into this process.
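The process itself is easy to simulate; the following sketch reproduces the round-based superposition described above (the base graph here is a path rather than a grid, and the overlay constants are illustrative):

import random

def perturbed_percolation(base, overlay, r, seeds, rounds):
    # A node activates once it has >= 1 active neighbor in the base (local)
    # graph or >= r active neighbors in the random (global) overlay.
    active = set(seeds)
    sizes = [len(active)]
    for _ in range(rounds):
        new = set()
        for v in base:
            if v in active:
                continue
            local = any(u in active for u in base[v])
            if local or sum(u in active for u in overlay[v]) >= r:
                new.add(v)
        if not new:
            break
        active |= new
        sizes.append(len(active))
    return sizes

n, rng = 200, random.Random(0)
base = {v: {u for u in (v - 1, v + 1) if 0 <= u < n} for v in range(n)}
overlay = {v: set() for v in range(n)}      # sparse Erdos-Renyi overlay
for u in range(n):
    for v in range(u + 1, n):
        if rng.random() < 4 / n:
            overlay[u].add(v)
            overlay[v].add(u)
print(perturbed_percolation(base, overlay, r=2, seeds={0}, rounds=100))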
The Next Economic Crisis? It's the Network! (Wattenhofer)
The world's financial system is a network, where financial institutions such as banks are connected via various kinds of financial contracts. If some financial institutions go bankrupt, then others might suffer as well; the financial network might experience a ripple effect. In other words, the next big financial crisis should be understood with the network in mind. In this talk I will present some analytical results on financial networks. For instance, can we build a financial network that is more robust to failures? How can we clean up after a crisis? Who should be bailed out?
Fire Sale Games (Wilhelmi)
We study a financial network consisting of n players, each of whom writes liabilities and holds unmarketable assets as well as a set of marketable assets called securities. Every player is obligated to fulfill a given leverage constraint and is forced to sell a share of its securities at discount prices whenever it violates the constraint. We study fire sales from a game-theoretic perspective, where players sell securities strategically to minimize losses due to decreasing prices. We consider different combinations of price functions and their impact on sales revenue. For a wide range of games, we show that the players' sales are monotone. Furthermore, we prove the existence of equilibria and convergence of the best-response dynamics to the best equilibrium.
Self-Stabilizing Clock Synchronization with 1-Bit Messages (Giakkoupis)
I will talk about some recent results we showed for the problem of distributed clock synchronization. The model we consider is a synchronous fully-connected network of n agents, where each agent has a local clock, that is, a counter increasing by one modulo T in each round. The clocks have arbitrary values initially, and they must all indicate the same time eventually. We assume a gossip-based pull communication model, where in every round each agent receives an ℓ-bit message from a random agent. We devised several fast synchronization algorithms that use small messages and are self-stabilizing, that is, the complete initial state of each agent (not just its clock value) can be arbitrary. Our main contribution is an algorithm that synchronizes a general modulo-T clock in polylogarithmic time (in n and T), using just 1-bit messages. This is joint work with Paul Bastide and Hayk Saribekyan.
Using an Oracle for Scheduling Unknown Independent Tasks (Rau)
We study a non-clairvoyant scheduling problem for independent tasks where the processing times of the tasks are unknown. However, the scheduler can use an external resource (an oracle) that answers queries for the processing time of a task. Such a query takes some time, and the schedule has to wait until it is answered. Aiming to minimize the total completion time of the given tasks, we study different strategies for deciding which tasks the oracle should be queried about. If this strategy has to be fixed before the first task is queried or scheduled, we present the strategy that provably generates a schedule with the smallest expected total completion time. However, we prove that this strategy is not the best when aiming to minimize the worst case.
Approximation of the Diagonal of a Laplacian's Pseudoinverse for Complex Network Analysis (Meyerhenke)
The ubiquity of massive graph data sets in numerous applications requires fast algorithms for extracting knowledge from these data. We are motivated here by three electrical measures for the analysis of large small-world graphs G = (V, E), i.e., graphs with diameter in O(log |V|), which are abundant in complex network analysis. From a computational point of view, the three measures have in common that their crucial component is the diagonal of the graph Laplacian's pseudoinverse, L^+. Computing diag(L^+) exactly by pseudoinversion, however, is as expensive as dense matrix multiplication - and the standard tools in practice even require cubic time. Moreover, the pseudoinverse requires quadratic space - hardly feasible for large graphs. Resorting to approximation by, e.g., the Johnson-Lindenstrauss transform requires the solution of O(log |V| / ε²) Laplacian linear systems to guarantee a relative error, which is still very expensive for large inputs. In this paper, we present a novel approximation algorithm that requires the solution of only one Laplacian linear system. The remaining parts are purely combinatorial - mainly sampling uniform spanning trees, which we relate to diag(L^+) via effective resistances. For small-world networks, our algorithm obtains a ±ε-approximation with high probability, in a time that is nearly linear in |E| and quadratic in 1/ε. Another positive aspect of our algorithm is its parallel nature due to independent sampling. We thus provide two parallel implementations of our algorithm: one using OpenMP, one MPI + OpenMP. In our experiments against the state of the art, our algorithm (i) yields more accurate approximation results for diag(L^+), (ii) is much faster and more memory-efficient, and (iii) obtains good parallel speedups, in particular in the distributed setting. Finally, we sketch some extensions of our techniques to related problems, in particular to (group) forest closeness centrality.
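To make the target quantity concrete, this is the expensive exact baseline that the sampling algorithm avoids (dense pseudoinversion via numpy), together with the electrical-closeness normalization n / (n * L^+[v,v] + tr(L^+)), which is one common way to use diag(L^+); treat that normalization as an assumption of this sketch:

import numpy as np

def diag_lpinv_exact(A):
    # Exact diag(L^+) by dense pseudoinversion: cubic time, quadratic
    # space -- feasible only for small graphs, unlike the sampling approach.
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    Lp = np.linalg.pinv(L)
    n = len(A)
    d = np.diag(Lp)
    closeness = n / (n * d + np.trace(Lp))   # electrical closeness per node
    return d, closeness

# Path graph on 4 nodes: the two middle nodes are electrically most central
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]])
print(diag_lpinv_exact(A))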
Simulating Population Protocols in Sub-Constant Time per Interaction (Penschuck)
We consider the efficient simulation of population protocols. In the population model, we are given a system of n agents modeled as identical finite-state machines. In each step, two agents are selected uniformly at random to interact by updating their states according to a common transition function. We empirically and analytically analyze two classes of simulators for this model. First, we consider sequential simulators executing one interaction after the other. Key to the performance of these simulators is the data structure storing the agents' states. For our analysis, we consider plain arrays, binary search trees, and a novel Dynamic Alias Table data structure. Second, we consider batch processing to efficiently update the states of multiple independent agents in one step. For many protocols considered in the literature, our simulator requires amortized sub-constant time per interaction and is fast in practice: given a fixed time budget, the implementation of our batched simulator is able to simulate population protocols several orders of magnitude larger compared to the sequential competitors, and can carry out 2^50 interactions among the same number of agents in less than 400s.
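A count-based sequential simulator already illustrates why agents never need to be stored individually (the protocol in the usage example is a simple undecided-state opinion dynamic, chosen for illustration; the paper's batched simulator additionally processes many independent interactions at once):

import random

def simulate_counts(counts, delta, interactions, seed=0):
    # Store only the number of agents per state; each step draws an ordered
    # pair of distinct agents weighted by the counts and applies the
    # transition delta(q1, q2) -> (q1', q2').
    rng = random.Random(seed)
    counts = dict(counts)
    for _ in range(interactions):
        states = list(counts)
        q1 = rng.choices(states, weights=[counts[q] for q in states])[0]
        counts[q1] -= 1              # sample second agent w/o replacement
        q2 = rng.choices(states, weights=[counts[q] for q in states])[0]
        counts[q2] -= 1
        a, b = delta(q1, q2)
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
        counts = {q: c for q, c in counts.items() if c > 0}
    return counts

def delta(q1, q2):                   # toy undecided-state opinion dynamic
    if q1 == 'U' and q2 != 'U':
        return q2, q2                # undecided initiator adopts opinion
    if 'U' not in (q1, q2) and q1 != q2:
        return q1, 'U'               # clash: responder becomes undecided
    return q1, q2

print(simulate_counts({'A': 600, 'B': 400}, delta, interactions=20000))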
Beyond Sparsity: Compressive Sensing with (Deep) Generative Modeling Assumptions (Scarlett)
The problem of estimating an unknown vector from linear measurements has a long history in statistics, machine learning, and signal processing. Classical studies focus on the "n >> p" regime (#measurements >> #parameters), and more recent studies handle the "n << p" regime by exploiting low-dimensional structure such as sparsity or low-rankness. Such variants are commonly known as compressive sensing.
In this talk, I will overview recent methods that move beyond these simple notions of structure, and instead assume that the underlying vector is well-modeled by a generative model (e.g., produced by deep learning methods such as GANs). I will highlight algorithmic works that demonstrated up to 5-10x savings in the number of measurements over sparsity-based methods, and then move on to our theoretical work characterizing the order-optimal sample complexity in terms of quantities such as (i) the Lipschitz constant of the model, or (ii) the depth/width in a neural network model. I will also briefly overview some recent results on non-linear observation models.
Inference under restrictions - sparsity constrained group testing (Hahn-Klimroth)
While non-adaptive group testing was recently fully understood to be easy, in the sense that there is a polynomial-time algorithm achieving inference at a universal information-theoretic lower bound, this comes with the flaw that in optimal designs each individual gets tested log(n) times and each test contains polynomially many individuals. Such testing schemes may not be applicable due to restrictions in laboratories.
In this talk, we will tackle the problem by analysing the sparsity-constrained group testing problem. Building on first results by Gandikota et al. (2016), we provide a full understanding of the group testing problem if the test size needs to be constant, and achieve almost optimal results if the number of tests per individual is restricted to o(log n).