Explainable clustering provides human-understandable reasons for decisions in black-box learning models. In a previous work, a decision tree built on the set of dimensions was used to define ranges ...of values for k-means clusters. For explainable graph clustering, we use expander graphs instead of dense subgraphs since powering an expander graph is guaranteed to result in a clique after at most a logarithmic number of steps.
Consider a set of multi-dimensional points labeled with k labels. We introduce the heat map sorting problem as reordering the rows and columns of an input matrix (each point is a column and each row is a dimension) such that the labels of the entries of the matrix form connected components (clusters). A cluster is preserved if it remains connected, i.e., if it is not split into several clusters and no two clusters are merged. In the massively parallel computation model (MPC), each machine has a sublinear memory and the total memory of the machines is linear.
We prove the problem is NP-hard. We give a fixed-parameter algorithm in MPC and an approximation algorithm based on expander decomposition. We empirically compare our algorithm with explainable k-means on several graphs of email and computer networks.
•A general method for explainable clustering of high-dimensional data.•A fixed-parameter algorithms for explainable graph clustering.•A Massively Parallel Computation (MPC) algorithm for explainable clustering.•An approximation algorithm for graph clustering on expander graphs.
Gradient coding is a technique for straggler mitigation in distributed learning. In this paper we design novel gradient codes using tools from classical coding theory, namely, cyclic MDS codes, which ...compare favorably with existing solutions, both in the applicable range of parameters and in the complexity of the involved algorithms. Second, we introduce an approximate variant of the gradient coding problem, in which we settle for approximate gradient computation instead of the exact one. This approach enables graceful degradation, i.e., the <inline-formula> <tex-math notation="LaTeX">\ell _{2} </tex-math></inline-formula> error of the approximate gradient is a decreasing function of the number of stragglers. Our main result is that normalized adjacency matrices of expander graphs yield excellent approximate gradient codes, which enable significantly less computation compared to exact gradient coding, and guarantee faster convergence than trivial solutions under standard assumptions. We experimentally test our approach on Amazon EC2, and show that the generalization error of approximate gradient coding is very close to the full gradient while requiring significantly less computation from the workers.
Disjoint isomorphic balanced clique subdivisions Fernández, Irene Gil; Hyde, Joseph; Liu, Hong ...
Journal of combinatorial theory. Series B,
July 2023, 2023-07-00, Volume:
161
Journal Article
Peer reviewed
Open access
A classical result, due to Bollobás and Thomason, and independently Komlós and Szemerédi, states that there is a constant C such that every graph with average degree at least Ck2 has a subdivision of ...Kk, the complete graph on k vertices. We study two directions extending this result.•Verstraëte conjectured that a quadratic bound guarantees in fact two vertex-disjoint isomorphic copies of a Kk-subdivision.•Thomassen conjectured that for each k∈N there is some d=d(k) such that every graph with average degree at least d contains a balanced subdivision of Kk. Recently, Liu and Montgomery confirmed Thomassen's conjecture, but the optimal bound on d(k) remains open. In this paper, we show that a quadratic lower bound on average degree suffices to force a balanced Kk-subdivision. This gives the right order of magnitude of the optimal d(k) needed in Thomassen's conjecture. Since a balanced Kmk-subdivision trivially contains m vertex-disjoint isomorphic Kk-subdivisions, this also confirms Verstraëte's conjecture in a strong sense.
Carsten Thomassen in 1989 conjectured that if a graph has minimum degree much more than the number of atoms in the universe (δ(G)≥101010), then it contains a pillar, which is a graph that consists of ...two vertex-disjoint cycles of the same length, s say, along with s vertex-disjoint paths of the same length3 which connect matching vertices in order around the cycles. Despite the simplicity of the structure of pillars and various developments of powerful embedding methods for paths and cycles in the past three decades, this innocent looking conjecture has seen no progress to date. In this paper, we give a proof of this conjecture by building a pillar (algorithmically) in sublinear expanders.
In this paper, we present a new algorithm for compressive sensing that makes use of binary measurement matrices and achieves exact recovery of ultrasparse vectors in a single pass and without any ...iterations. Due to its noniterative nature, our algorithm is hundreds of times faster than ℓ 1 -norm minimization and methods based on expander graphs, both of which require multiple iterations. Our algorithm can accommodate nearly sparse vectors, in which case, it recovers an index set of the largest components, and can also accommodate burst noise measurements. Compared to compressive sensing methods that are guaranteed to achieve exact recovery of all sparse vectors, our method requires fewer measurements. However, methods that achieve statistical recovery, that is, recovery of almost all but not all sparse vectors, can require fewer measurements than our method.
Random walks in expander graphs and their various derandomizations ( e.g ., replacement=zig-zag product) are invaluable tools from pseudorandomness. Recently, Ta-Shma used s -wide replacement walks ...in his breakthrough construction of a binary linear code almost matching the Gilbert-Varshamov bound (STOC 2017). Ta-Shma's original analysis was entirely linear algebraic, and subsequent developments have inherited this viewpoint. In this work, we rederive Ta-Shma's analysis from a combinatorial point of view using repeated application of the expander mixing lemma . We hope that this alternate perspective will yield a better understanding of Ta-Shma's construction. As an additional application of our techniques, we give an alternate proof of the expander hitting set lemma .