We use a recently developed synchronous Spiking Neural Network (SNN) model to study the problem of learning hierarchically structured concepts. We introduce an abstract data model that describes simple hierarchical concepts. We define a feed-forward layered SNN model, with learning modeled using Oja’s local learning rule, a well-known biologically plausible rule for adjusting synapse weights. We define what it means for such a network to recognize hierarchical concepts; our notion of recognition is robust, in that it tolerates a bounded amount of noise.
Then, we present a learning algorithm by which a layered network may learn to recognize hierarchical concepts according to our robust definition. We analyze correctness and performance rigorously; the amount of time required to learn each concept, after learning all of the sub-concepts, is approximately O((1/η)(k ℓ_max log(k) + 1/ε) + b log(k)), where k is the number of sub-concepts per concept, ℓ_max is the maximum hierarchical depth, η is the learning rate, ε describes the amount of uncertainty allowed in robust recognition, and b describes the amount of weight decrease for “irrelevant” edges. An interesting feature of this algorithm is that it allows the network to learn sub-concepts in a highly interleaved manner. This algorithm assumes that the concepts are presented in a noise-free way; we also extend these results to accommodate noise in the learning process. Finally, we give a simple lower bound saying that, in order to recognize concepts of hierarchical depth two in a noise-tolerant way, a neural network should have at least two layers.
The results in this paper represent first steps in the theoretical study of hierarchical concepts using SNNs. The cases studied here are basic, but they suggest many directions for extensions to more elaborate and realistic cases.
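As a concrete illustration of the learning rule named above, here is a minimal sketch of Oja's local rule; the architecture, input distribution, and all parameter values are our own illustrative choices, not the paper's.

```python
import numpy as np

# A minimal sketch of Oja's local learning rule (the update rule named in the
# abstract); the input distribution and all parameter values here are our
# illustrative choices, not the paper's.
def oja_step(w, x, eta):
    y = w @ x                          # postsynaptic activation
    return w + eta * y * (x - y * w)   # Hebbian growth with built-in normalization

rng = np.random.default_rng(0)
w = rng.normal(size=3)
# Presenting inputs whose first coordinate has the largest variance drives w
# toward that principal direction, with ||w|| settling near 1.
for _ in range(5000):
    x = rng.normal(size=3) * np.sqrt(np.array([3.0, 1.0, 0.5]))
    w = oja_step(w, x, eta=0.01)
print(round(float(np.linalg.norm(w)), 2))
```

The self-normalizing term −η y² w is what keeps the synapse weights bounded, which is one reason Oja's rule is considered biologically plausible.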
Hierarchical Clustering Cohen-Addad, Vincent; Kanade, Varun; Mallmann-Trenn, Frederik ...
Journal of the ACM, 08/2019, Volume 66, Issue 4
Journal Article
Peer-reviewed
Open access
Hierarchical clustering is a recursive partitioning of a dataset into clusters at an increasingly finer granularity. Motivated by the fact that most work on hierarchical clustering was based on providing algorithms, rather than optimizing a specific objective, Dasgupta framed similarity-based hierarchical clustering as a combinatorial optimization problem, where a “good” hierarchical clustering is one that minimizes a particular cost function [23]. He showed that this cost function has certain desirable properties: To achieve optimal cost, disconnected components (namely, dissimilar elements) must be separated at higher levels of the hierarchy, and when the similarity between data elements is identical, all clusterings achieve the same cost.
We take an axiomatic approach to defining “good” objective functions for both similarity- and dissimilarity-based hierarchical clustering. We characterize a set of admissible objective functions having the property that when the input admits a “natural” ground-truth hierarchical clustering, the ground-truth clustering has an optimal value. We show that this set includes the objective function introduced by Dasgupta.
Equipped with a suitable objective function, we analyze the performance of practical algorithms, as well as develop better and faster algorithms for hierarchical clustering. We also initiate a beyond worst-case analysis of the complexity of the problem and design algorithms for this scenario.
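The cost function discussed above can be made concrete with a small sketch (our illustration, not the article's code): each pair of points pays its similarity times the number of leaves under the pair's lowest common ancestor, so highly similar pairs should be separated low in the tree.

```python
# A toy sketch of Dasgupta's cost function for similarity-based hierarchical
# clustering. Trees are nested pairs; leaves are integer point ids. The
# instance below is our own toy example.
def leaves(tree):
    return [tree] if not isinstance(tree, tuple) else leaves(tree[0]) + leaves(tree[1])

def dasgupta_cost(tree, sim):
    if not isinstance(tree, tuple):
        return 0.0
    left, right = tree
    # Pairs split at this node have their lowest common ancestor here and each
    # pay sim(i, j) * (number of leaves below this node).
    n = len(leaves(tree))
    cross = sum(sim.get((min(i, j), max(i, j)), 0.0)
                for i in leaves(left) for j in leaves(right))
    return cross * n + dasgupta_cost(left, sim) + dasgupta_cost(right, sim)

# Toy instance: 0,1 are similar and 2,3 are similar; the "natural" tree that
# separates the dissimilar groups at the top achieves the lower cost.
sim = {(0, 1): 1.0, (2, 3): 1.0, (0, 2): 0.1, (0, 3): 0.1, (1, 2): 0.1, (1, 3): 0.1}
natural = ((0, 1), (2, 3))
unnatural = ((0, 2), (1, 3))
print(dasgupta_cost(natural, sim), dasgupta_cost(unnatural, sim))
```

On this instance the natural tree pays the small cross-similarities at the expensive top level and the large ones cheaply near the leaves, which is exactly the behavior the admissibility axioms are meant to capture.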
In this article, we characterize the advantage of using a robot's neighborhood to find and eliminate adversarial robots in the presence of a Sybil attack. We show that by leveraging the opinions of their neighbors on the trustworthiness of transmitted data, robots can detect adversaries with high probability. We characterize the number of communication rounds required as a function of the communication quality and of the proportion of legitimate to malicious robots. This result enables increased resiliency of many multirobot algorithms. Because our results are finite-time and not asymptotic, they are particularly well suited for problems of a time-critical nature. We develop two algorithms: FindSpoofedRobots, which determines trusted neighbors with high probability, and FindResilientAdjacencyMatrix, which enables distributed computation of graph properties in an adversarial setting. We apply our methods to a flocking problem where a team of robots must track a moving target in the presence of adversarial robots. We show that by using our algorithms, the team of robots is able to maintain tracking of the dynamic target.
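The core idea of aggregating neighbor opinions over rounds can be sketched as follows; the function name, the observation model, and the parameter alpha (probability that a single trust observation is correct) are our assumptions for illustration, not the article's API.

```python
import random

# Illustrative sketch: a robot classifies each neighbor as legitimate or
# spoofed by averaging noisy binary trust observations over several
# communication rounds and thresholding at 1/2. All names and parameters
# here are ours, not the article's.
def classify_neighbors(truth, alpha=0.75, rounds=60, seed=1):
    rng = random.Random(seed)
    verdicts = {}
    for robot, legit in truth.items():
        # Each observation is correct with probability alpha, wrong otherwise.
        correct = 1 if legit else 0
        obs = [correct if rng.random() < alpha else 1 - correct for _ in range(rounds)]
        verdicts[robot] = sum(obs) / rounds > 0.5
    return verdicts

truth = {"r1": True, "r2": True, "r3": False, "r4": True, "r5": False}
print(classify_neighbors(truth))
```

More rounds or better observation quality drive the per-neighbor error probability down exponentially, which echoes the finite-time, high-probability flavor of the result.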
A fundamental problem in distributed computing is the distribution of requests to a set of uniform servers without a centralized controller. Classically, such problems are modeled as static balls-into-bins processes, where m balls (tasks) are to be distributed among n bins (servers). In a seminal work, Azar et al. (SIAM J Comput 29(1):180–200, 1999. https://doi.org/10.1137/S0097539795288490) proposed the sequential strategy GREEDY[d] for n = m: each ball queries the load of d random bins and is allocated to a least loaded of them. Azar et al. (1999) showed that d = 2 yields an exponential improvement compared to d = 1. Berenbrink et al. (SIAM J Comput 35(6):1350–1385, 2006. https://doi.org/10.1137/S009753970444435X) extended this to m ≫ n, showing that for d = 2 the maximal load difference is independent of m (in contrast to the d = 1 case).
We propose a new variant of an infinite balls-into-bins process. In each round an expected number of λn new balls arrive and are distributed (in parallel) to the bins. Subsequently, each non-empty bin deletes one of its balls. This setting models a set of servers processing incoming requests, where clients can query a server’s current load but receive no information about parallel requests. We study the GREEDY[d] distribution scheme in this setting and show a strong self-stabilizing property: for any arrival rate λ = λ(n) < 1, the system load is time-invariant. Moreover, for any (even super-exponential) round t, the maximum system load is (w.h.p.) O((1/(1−λ)) log(n/(1−λ))) for d = 1 and O(log(n/(1−λ))) for d = 2. In particular, GREEDY[2] has an exponentially smaller system load for high arrival rates.
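The batched process described above can be simulated directly; this is a toy sketch of ours, not the papers' code, and the parameter values are arbitrary.

```python
import random

# Toy simulation of the batched infinite process: each round roughly lam*n
# balls arrive, each samples d bins using the load snapshot from the start of
# the round (parallel arrivals receive no information about each other) and
# joins a least loaded sampled bin; then every non-empty bin deletes one ball.
def simulate(n=100, lam=0.9, d=2, rounds=2000, seed=0):
    rng = random.Random(seed)
    load = [0] * n
    for _ in range(rounds):
        arrivals = sum(1 for _ in range(n) if rng.random() < lam)  # ~ Bin(n, lam)
        snapshot = list(load)
        for _ in range(arrivals):
            # GREEDY[d]: sample d bins, allocate to a least loaded of them.
            choice = min(rng.sample(range(n), d), key=lambda b: snapshot[b])
            load[choice] += 1
        load = [max(0, l - 1) for l in load]  # each non-empty bin serves one ball
    return max(load)

# Two choices keep the maximum load dramatically smaller than one choice.
print(simulate(d=1), simulate(d=2))
```

Even this crude experiment shows the power-of-two-choices gap at a high arrival rate, consistent with the exponential separation stated in the abstract.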
We present an algorithm that estimates the number of connected components of a graph over n vertices within an additive error of εn in (sublinear) time O((1/ε²) log(1/ε)). A connected component C is a maximal subset of nodes of an undirected graph G such that for any two nodes in C there is a path connecting them. We consider graphs without multiple edges. The Connected Component Problem is well known and amongst the first topics taught in graph theory. The number of connected components can be used to calculate the weight of the minimum spanning tree [1]. Moreover, the study of connected components finds application in connected-component labelling, which is used in computer vision. An algorithm runs in sublinear time if its running time is o(n) for an input of size n. So far, the best known algorithm was provided by [1]; its running time is O((d/ε²) log(d/ε)), where d is the average degree.
•We present an algorithm that estimates the number of connected components of a graph.
•Our running time depends only on the additive approximation.
•So far, the best known running time depends on the average degree.
•The number of connected components can be used to calculate the weight of the MST.
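The estimator rests on the identity that the number of connected components equals the sum over vertices v of 1/|C(v)|; the sketch below (our illustration, with our own cutoff and sample counts) samples vertices and explores each one's component only up to a small cutoff, so it reads only a tiny part of the graph.

```python
import random

# Sketch of the sublinear estimator idea: sample vertices, explore each one's
# component up to a cutoff of about 1/eps vertices, and average 1/|C(v)|.
# Vertices in components larger than the cutoff contribute at most eps/2 each,
# so truncating them introduces only a bounded bias.
def estimate_components(adj, eps=0.2, samples=3000, seed=0):
    rng = random.Random(seed)
    n = len(adj)
    cutoff = int(2 / eps)
    total = 0.0
    for _ in range(samples):
        v = rng.randrange(n)
        seen, stack = {v}, [v]
        while stack and len(seen) <= cutoff:   # truncated exploration from v
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        total += 1 / len(seen) if len(seen) <= cutoff else 0.0
    return total / samples * n

# Toy graph: 30 vertices forming 10 triangles, hence 10 components.
adj = [[j for j in range((i // 3) * 3, (i // 3) * 3 + 3) if j != i] for i in range(30)]
print(round(estimate_components(adj)))  # → 10
```

Because every component here is smaller than the cutoff, each sampled vertex contributes exactly 1/3 and the estimate is exact; on graphs with large components, the cutoff trades a small additive bias for sublinear running time.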
Threshold load balancing with weighted tasks Berenbrink, Petra; Friedetzky, Tom; Mallmann-Trenn, Frederik ...
Journal of parallel and distributed computing, March 2018, Volume 113
Journal Article
Peer-reviewed
Open access
We study threshold-based load balancing protocols for weighted tasks. We are given an arbitrary graph G with n nodes (resources, bins) and m≥n tasks (balls). Initially the tasks are distributed arbitrarily over the n nodes. The resources have a threshold and we are interested in the balancing time, i.e., the time it takes until the load of all resources is below or at the threshold. We distinguish between resource-based and user-based protocols. In the case of resource-based protocols, resources with a load larger than the threshold are allowed to send tasks to neighboring resources. In the case of user-based protocols, the tasks make the migration decisions, and here we restrict ourselves to the complete graph: any task allocated to a resource with a load above the threshold decides whether to migrate to a neighboring resource independently of the other tasks.
For resource-based protocols, we present results for arbitrary graphs. For user-based migration, we consider complete graphs and derive bounds for both above-average and tight thresholds.
•We study threshold-based load balancing protocols for weighted tasks in graphs.
•We consider two models: (i) the resource-based model, where the nodes decide which tasks are sent to neighbors; (ii) the user-based model, where the tasks decide whether and where to move.
•We show that the required time to balance depends on the mixing time and hitting time.
•Our bounds for the resource-based case are tight and, surprisingly, they are independent of the weights of the tasks.
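The user-based model above can be sketched as follows; this is a toy, unit-weight illustration of ours on the complete graph, not the paper's protocol, and the migration probability p is an arbitrary choice.

```python
import random

# Toy sketch of a user-based threshold protocol on the complete graph: every
# task on a resource whose load exceeds the threshold independently decides,
# with probability p, to migrate to a uniformly random resource.
def balance(loads_init, threshold, p=0.5, seed=0, max_rounds=10_000):
    rng = random.Random(seed)
    loads = list(loads_init)
    n = len(loads)
    rounds = 0
    while max(loads) > threshold and rounds < max_rounds:
        moves = []
        for i, load in enumerate(loads):
            if load > threshold:
                # Each task on an overloaded resource flips its own coin,
                # independently of all other tasks.
                moves.append((i, sum(1 for _ in range(load) if rng.random() < p)))
        for i, movers in moves:
            loads[i] -= movers
            for _ in range(movers):
                loads[rng.randrange(n)] += 1
        rounds += 1
    return loads, rounds

# One overloaded resource and an above-average threshold: balancing is fast.
final, t = balance([20] + [0] * 9, threshold=4)
print(max(final), sum(final), t)
```

With an above-average threshold there is slack in the system, so such a process settles quickly; tight thresholds, as the abstract notes, are the harder regime.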