KaHyPar is a fast, high-quality, and scalable algorithm for partitioning graphs and hypergraphs with billions of edges. It finds applications in minimizing communication costs for distributed (hyper)graph computations, quantum circuit simulations, storage sharding in databases, and VLSI design.
Hypergraphs are a generalization of graphs, where each (hyper)edge (also called net) can connect more than two nodes. The k-way hypergraph partitioning problem is the generalization of the well-known graph partitioning problem: partition the node set into k disjoint blocks of bounded size (at most 1 + ε times the average block size), while minimizing an objective function defined on the nets. Hypergraph partitioning finds application in minimizing communication costs and balancing the computational work in distributed scientific simulations (e.g., for parallel sparse matrix-vector multiplication), simulations of distributed quantum circuits, storage sharding in distributed databases, or as a placement strategy in VLSI design.
The Karlsruhe Hypergraph Partitioning Suite (KaHyPar) is a fast, high-quality, and scalable algorithm for partitioning graphs and hypergraphs. It consists of the sequential solver KaHyPar and the shared-memory partitioner Mt-KaHyPar. Mt-KaHyPar can be seen as a feature complete parallelization of KaHyPar. (Mt-)KaHyPar runs on Linux, Windows, and MacOS (only KaHyPar), and besides its standalone application, it provides a C, Python, Julia (only KaHyPar), and Java interface (only KaHyPar).
The highest-quality configuration of Mt-KaHyPar computes partitions that are on par with those produced by the best sequential partitioning algorithms, while being almost an order of magnitude faster with only ten threads (e.g., when compared to KaFFPa or KaHyPar). Besides our high quality configuration, we provide several other faster configurations that are already able to outperform most of the existing partitioning algorithms with regard to solution quality and running time.
The figure below summarizes the time-quality trade-off of different hypergraph (left, connectivity metric) and graph partitioning algorithms (right, cut-net metric). The plot is based on an experiment with over 800 graphs and hypergraphs and relates the average solution quality and running time of each algorithm to the best achievable results. Points on the lower-left are considered better. Partially transparent markers indicate solvers producing more than 15% infeasible partitions (either imbalanced or timeout). For more details, we refer the reader to our publications.
Besides its fast and high-quality partitioning algorithm, (Mt-)KaHyPar provides many other useful features:
Mt-KaHyPar can optimize several objective functions which we explain in the following in more detail. An objective function takes a partition Π = {V_1, ..., V_k} into k blocks of a weighted hypergraph H = (V,E,c,ω) with node weights c: V -> R and net weights ω: E -> R as input. The number of blocks spanned by a net e is denoted with λ(e) and the set of all cut nets with E_Cut(Π) (λ(e) > 1).
The cut-net metric is defined as total weight of all nets spanning more than one block of the partition Π (also called cut nets).
The connectivity metric additionally multiplies the weight of each cut net with the number of blocks λ(e) spanned by that net minus one. Thus, the connectivity metric tries to minimize the number of blocks connected by each net.
The sum-of-external-degree metric is similar to the connectivity metric, but does not subtract one from the number of blocks λ(e) spanned by a net. A peculiarity of this objective function is that removing a net from the cut reduces the metric by 2ω(e), while reducing the connectivity by one reduces the metric only by ω(e). Thus, the objective function prefers removing nets from the cut, while as secondary criteria it tries to reduce the connectivity of the nets.
The Steiner tree metric is the most versatile metric that we provide at the moment. A Steiner tree is a tree with minimal weight that connects a subset of the nodes on a graph (a more detailed definition can be found here). For a subset with exactly two nodes, finding a Steiner tree reverts to computing the shortest path between the two nodes. When optimizing the Steiner tree metric, we map the node set of a hypergraph H onto the nodes of a target graph G. The objective is to minimize the total weight of all Steiner trees induced by the nets of H on G.
For a net e, dist(Λ(e)) is the weight of the minimal Steiner tree connecting the blocks Λ(e) spanned by net e on G. The Steiner tree metric can be used to accurately model wire-lengths in VLSI design or communication costs in distributed systems when some processors do not communicate with each other directly or with different speeds.
We have implemented a common interface for all gain computation techniques that we use in our refinement algorithms. This enables us to extend Mt-KaHyPar with new objective functions without having to modify the internal implementation of the refinement algorithms. A step-by-step guide how you can implement your own objective function can be found here.
A Helmholtz Pilot Program