A team of UC Santa Cruz researchers led by computer science Professor Seshadhri Comandur have completely characterized the kind of six-person plus complex patterns that can be efficiently counted on a network. This has important applications beyond the realm of mathematics, including epidemic spread and tracking the spread of disinformation on a social network.
“Near-Linear Time Homomorphism Counting in Bounded Degeneracy Graphs: The Barrier of Long Induced Cycles” will be published in the Symposium for Discrete Algorithms. In addition to Seshadhri, the authors are Suman K. Bera, a postdoctoral researcher at UC Santa Cruz and Noujan Pashanasangi, a Ph.D. student in Seshadhri’s lab.
“The most common pattern we search for on a network is a triangle,” Seshadhri said. “For example three people who all know each other. How many of these triangles exist, and how they’re distributed is useful for a lot of analysis.”
More complex patterns such as the relationship between four or more people are also useful, particularly for tracking relationships between people who are more loosely connected. However, as patterns become more complex, it becomes much more challenging to look for them.
“After years of work we were able to design algorithms for patterns up to size five,” Seshadhri said. “But we became stuck. There were always larger patterns that couldn’t be counted. We discovered a mathematical barrier between patterns of five and six, but we couldn’t completely characterize it until now.”
Specifically the investigators were looking for homomorphisms of a constant sized pattern H in an input graph G. They looked for when near-time algorithms were possible, focusing on the case when the input graph has bounded degeneracy. They knew that for certain classes of H, H-homomorphisms could be counted exactly in near-linear time in bounded degeneracy graphs and precisely characterized the patterns H for which near-linear time algorithms were possible.
They discovered a clean dichotomy using fine-grained complexity. Let m denote the number of edges in G. We prove the following: if the largest induced cycle in H has length at most 5, then there is an O(mlogm) algorithm for counting H-homomorphisms in bounded degeneracy graphs. If the largest induced cycle in H has length at least 6, then (assuming standard fine-grained complexity conjectures) there is a constant γ>0, such that there is no o(m1+γ) time algorithm for counting H-homomorphisms.
In other words, the team’s solution characterizes exactly the kinds of six-person patterns that can be found efficiently on a network and which cannot. The breakthrough has exciting applications.
The research was funded by the Army Research Office and the National Science Foundation.
“We are already surrounded by a deluge of data which will only increase in the future,” said Dr. Purush Iyer, program manager, Army Research Office, an element of U.S. Army Combat Capabilities Development Commands Army Research Laboratory. “Identifying patterns of suspicious behavior will be critical for the success of future military missions, making the UCSC research important. For example, this mathematical solution could be used to identify deceitful behavior in financial transactions, patterns of disinformation and disease spread in societies, suspicious activities in urban settings and patterns of communication among small adversarial groups.”
To read the paper, please visit: https://arxiv.org/abs/2010.08083
For more information about Professor Seshadhri and his work please visit: https://users.soe.ucsc.edu/~sesh/