Indoor localization has recently witnessed an increase in interest due to the wide range of services it can provide by leveraging the Internet of Things (IoT) and ubiquitous connectivity. Different techniques, wireless technologies, and mechanisms have been proposed in the literature to provide indoor localization services and improve the services offered to users. However, there is a lack of an up-to-date survey that incorporates the recently proposed accurate and reliable localization systems. In this paper, we provide a detailed survey of indoor localization techniques, such as angle of arrival (AoA), time of flight (ToF), return time of flight (RTOF), and received signal strength (RSS), of the underlying technologies, such as WiFi, radio frequency identification (RFID), ultra-wideband (UWB), and Bluetooth, and of the systems that have been proposed in the literature. This paper primarily discusses the localization and positioning of human users and their devices. We highlight the strengths of the existing systems proposed in the literature. In contrast with existing surveys, we also evaluate different systems from the perspective of energy efficiency, availability, cost, reception range, latency, scalability, and tracking accuracy. Rather than comparing the technologies or techniques themselves, we compare the localization systems and summarize their working principles. We also discuss the remaining challenges to accurate indoor localization.
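As a concrete illustration of the RSS technique mentioned in the abstract, the sketch below converts received signal strength to distance with the standard log-distance path loss model and then trilaterates a 2-D position from three anchors. The reference power, path-loss exponent, anchor layout, and readings are illustrative assumptions, not values from the survey.

```python
import numpy as np

# Log-distance path loss model: RSS(d) = RSS(d0) - 10*n*log10(d/d0).
# rss0 (RSS at reference distance d0) and the path-loss exponent n are
# illustrative; real deployments calibrate them per environment.
def rss_to_distance(rss, rss0=-40.0, d0=1.0, n=2.5):
    return d0 * 10 ** ((rss0 - rss) / (10.0 * n))

def trilaterate(anchors, dists):
    """Linearized least-squares position fix from >= 3 anchors."""
    x1, y1 = anchors[0]
    d1 = dists[0]
    A, b = [], []
    # Subtracting the first circle equation from the others gives a
    # linear system in (x, y).
    for (xi, yi), di in zip(anchors[1:], dists[1:]):
        A.append([2 * (xi - x1), 2 * (yi - y1)])
        b.append(d1**2 - di**2 + xi**2 - x1**2 + yi**2 - y1**2)
    pos, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return pos

anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
rss_readings = [-58.0, -63.0, -61.0]          # hypothetical measurements
dists = [rss_to_distance(r) for r in rss_readings]
print(trilaterate(anchors, dists))            # estimated (x, y)
```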
When training deep neural networks (DNNs), expensive floating point arithmetic units are used in GPUs or custom neural processing units (NPUs). To reduce the burden of floating point arithmetic, the community has started exploring more efficient data representations, e.g., block floating point (BFP). The BFP format allows a group of values to share an exponent, which effectively reduces the memory footprint and enables cheaper fixed point arithmetic for multiply-accumulate (MAC) operations. However, existing BFP-based DNN accelerators target a specific precision, making them less versatile. In this paper, we present FlexBlock, a DNN training accelerator that supports three BFP modes, which can differ among activation, weight, and gradient tensors. By configuring FlexBlock to a lower BFP precision, the number of MACs handled by the core increases by up to 4× in 8-bit mode or 16× in 4-bit mode compared to 16-bit mode. To approach this theoretical upper bound, FlexBlock maximizes core utilization across precision levels and layer types, and allows dynamic precision control to keep throughput at its peak without sacrificing training accuracy. We evaluate the effectiveness of FlexBlock using representative DNNs on the CIFAR, ImageNet, and WMT14 datasets. Training with FlexBlock improves training speed by 1.5–5.3× and energy efficiency by 2.4–7.0× compared to other training accelerators.
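To make the shared-exponent idea concrete, here is a minimal NumPy sketch of BFP quantization for one sharing group: the block keeps a single exponent and per-value fixed-point mantissas. This is a generic illustration of the format, not FlexBlock's hardware datapath; the mantissa width and rounding scheme are assumptions.

```python
import numpy as np

def to_bfp(x, mant_bits=8):
    """Quantize a block to block floating point: one shared exponent
    plus signed fixed-point mantissas of `mant_bits` bits."""
    # Shared exponent: smallest power of two covering the block's max magnitude.
    shared_exp = int(np.ceil(np.log2(np.max(np.abs(x)) + 1e-30)))
    scale = 2.0 ** (shared_exp - (mant_bits - 1))   # 1 bit reserved for sign
    mantissas = np.clip(np.round(x / scale),
                        -(2 ** (mant_bits - 1)), 2 ** (mant_bits - 1) - 1)
    return mantissas.astype(np.int32), shared_exp

def from_bfp(mantissas, shared_exp, mant_bits=8):
    return mantissas * 2.0 ** (shared_exp - (mant_bits - 1))

block = np.random.randn(16).astype(np.float32)    # one sharing group
m, e = to_bfp(block, mant_bits=8)
print(np.max(np.abs(block - from_bfp(m, e))))     # quantization error
```

Because every value in a group shares the exponent, a dot product between two BFP groups reduces to integer multiply-accumulates on the mantissas plus a single exponent addition per group, which is what enables the cheaper fixed point MAC units the abstract refers to.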
ASP: Learn a Universal Neural Solver
Wang, Chenguang; Yu, Zhouliang; McAleer, Stephen ...
IEEE Transactions on Pattern Analysis and Machine Intelligence, 06/2024, Volume 46, Issue 6
Journal Article
Peer-reviewed
Open access
Applying machine learning to combinatorial optimization problems has the potential to improve both efficiency and accuracy. However, existing learning-based solvers often struggle to generalize when faced with changes in problem distributions and scales. In this paper, we propose ASP, an Adaptive Staircase Policy Space Response Oracle, to address these generalization issues and learn a universal neural solver. ASP consists of two components: Distributional Exploration, which enhances the solver's ability to handle unknown distributions using Policy Space Response Oracles, and Persistent Scale Adaption, which improves scalability through curriculum learning. We have tested ASP on several challenging COPs, including the traveling salesman problem (TSP), the vehicle routing problem, and the prize-collecting TSP, as well as real-world instances from TSPLib and CVRPLib. Our results show that even with the same model size and a weak training signal, ASP helps neural solvers explore and adapt to unseen distributions and varying scales, achieving superior performance. In particular, compared with the same neural solvers under a standard training pipeline, ASP reduces the optimality gap by 90.9% and 47.43% on generated and real-world instances of TSP, respectively, and by 19% and 45.57% for CVRP.
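The abstract describes Persistent Scale Adaption only at a high level; the toy sketch below shows one plausible reading of a "staircase" curriculum over instance scales, advancing only once the optimality gap at the current scale is small. The callbacks, scales, thresholds, and mixing policy are all hypothetical, not the paper's algorithm.

```python
import random

def staircase_curriculum(train_step, eval_gap, scales=(20, 50, 100, 200),
                         gap_threshold=0.05, steps_per_stage=100,
                         max_rounds=50):
    """Toy 'staircase' curriculum over instance scales: climb to the next
    scale only once the solver's optimality gap at the current scale is
    small enough. train_step(scale) and eval_gap(scale) are assumed
    callbacks into your solver; all numbers here are hypothetical."""
    stage, rounds = 0, 0
    while stage < len(scales) and rounds < max_rounds:
        for _ in range(steps_per_stage):
            # Keep sampling already-mastered scales so earlier ones are
            # not forgotten; uniform mixing is an assumption.
            train_step(random.choice(scales[:stage + 1]))
        if eval_gap(scales[stage]) < gap_threshold:
            stage += 1                    # one step up the staircase
        rounds += 1
    return scales[min(stage, len(scales) - 1)]  # largest scale reached

# Stub usage: gaps shrink with training steps on the matching scale.
gaps = {s: 1.0 for s in (20, 50, 100, 200)}
def train_step(s): gaps[s] = max(0.0, gaps[s] - 0.01)
def eval_gap(s): return gaps[s]
print(staircase_curriculum(train_step, eval_gap))
```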
Wireless Network Intelligence at the Edge
Park, Jihong; Samarakoon, Sumudu; Bennis, Mehdi ...
Proceedings of the IEEE, 11/2019, Volume 107, Issue 11
Journal Article
Peer-reviewed
Open access
Fueled by the availability of more data and computing power, recent breakthroughs in cloud-based machine learning (ML) have transformed every aspect of our lives, from face recognition and medical diagnosis to natural language processing. However, classical ML exerts severe demands on energy, memory, and computing resources, limiting its adoption by resource-constrained edge devices. The new breed of intelligent devices and high-stake applications (drones, augmented/virtual reality, autonomous systems, and so on) requires a paradigm shift toward distributed, low-latency, and reliable ML at the wireless network edge (referred to as edge ML). In edge ML, training data are unevenly distributed over a large number of edge nodes, each of which has access to only a tiny fraction of the data. Moreover, training and inference are carried out collectively over wireless links, where edge devices communicate and exchange their learned models (not their private data). In a first exploration of its kind, this article examines the key building blocks of edge ML, different neural network architectural splits and their inherent tradeoffs, and theoretical and technical enablers stemming from a wide range of mathematical disciplines. Finally, several case studies pertaining to various high-stake applications are presented to demonstrate the effectiveness of edge ML in unlocking the full potential of 5G and beyond.
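Federated averaging is one well-known instantiation of the "exchange learned models, not private data" pattern described above (the article covers a much broader design space). The sketch below runs a few local SGD steps on each device's private data and averages only the weights at a server; the linear model and sample-count weighting are illustrative choices, not the article's method.

```python
import numpy as np

def local_sgd(w, X, y, lr=0.1, epochs=5):
    """A few local least-squares SGD steps on one device's private data."""
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def federated_round(w_global, devices):
    """One round: each device trains locally, only weights are uploaded,
    and the server averages them, weighted by local sample count."""
    updates, sizes = [], []
    for X, y in devices:
        updates.append(local_sgd(w_global.copy(), X, y))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
devices = []
for _ in range(10):                      # unevenly sized local datasets
    n = int(rng.integers(5, 50))
    X = rng.normal(size=(n, 2))
    devices.append((X, X @ w_true + 0.01 * rng.normal(size=n)))

w = np.zeros(2)
for _ in range(20):
    w = federated_round(w, devices)
print(w)                                 # approaches w_true
```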
Born digitals
Monaghan, Sinéad; Tippmann, Esther; Coviello, Nicole
Journal of International Business Studies, 02/2020, Volume 51, Issue 1
Journal Article
Peer-reviewed
Johanson and Vahlne (J Int Bus Stud 40(9):1411–1431, 2009) articulate various theoretical mechanisms underpinning the internationalization process, mechanisms they suggest are pertinent across firm types. Their argument builds on their earlier publications and, in this spirit, we consider Johanson and Vahlne (2009) in the contemporary context of digital firms. In particular, we revisit their theorizing as it relates to firms that had only begun to emerge when Johanson and Vahlne published their award-winning paper: born digitals. We address how technological affordances, especially direct engagement with stakeholders, automation, network effects, flexibility, and scalability, affect the internationalization of born digitals. We also develop a future agenda for international business research on born-digital firms.
Recent advances in cryptocurrencies have sparked significant interest in blockchain technology. However, scalability issues remain a major challenge for the wide adoption of blockchains. Sharding is a promising approach to scaling blockchains, but existing sharding-based blockchains fail to achieve the expected performance gains due to limitations in cross-shard transaction processing. In this paper, we propose X-shard, a blockchain system that optimizes cross-shard transaction processing, achieving high effective throughput and low processing latency. First, we allocate transactions to shards based on historical transaction patterns to minimize cross-shard transactions. Second, we take an optimistic strategy that processes cross-shard transactions in parallel as sub-transactions within their input shards, thereby accelerating transaction processing. Finally, we employ a cross-shard commit protocol with threshold signatures to reduce communication overhead. We implement and deploy X-shard on Amazon EC2 clusters. Experimental results validate our theoretical analysis and show that, as the number of shards increases, X-shard achieves nearly linear scaling in effective throughput and a decrease in transaction processing latency.
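The sketch below illustrates the flavor of the first idea in hedged form: place each transaction on the shard that already holds most of its accounts, so that only the remaining accounts need cross-shard sub-transactions. The dict-based account map and hash fallback are stand-ins; X-shard's actual allocation policy mines historical transaction patterns and is more sophisticated.

```python
from collections import Counter

def allocate_tx(tx_accounts, account_shard, num_shards):
    """Place a transaction on the shard holding most of its accounts,
    minimizing the accounts that need cross-shard coordination."""
    votes = Counter(account_shard.get(a, hash(a) % num_shards)
                    for a in tx_accounts)
    home, _ = votes.most_common(1)[0]
    cross = [a for a in tx_accounts
             if account_shard.get(a, hash(a) % num_shards) != home]
    return home, cross   # non-empty cross means sub-transactions elsewhere

account_shard = {"alice": 0, "bob": 0, "carol": 2}
print(allocate_tx(["alice", "bob", "carol"], account_shard, num_shards=4))
# -> (0, ['carol']): shard 0 is home; carol's shard runs a sub-transaction
```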
The current literature on evolutionary many-objective optimization focuses mainly on scalability to the number of objectives, while little work has considered scalability to the number of decision variables. Nevertheless, many real-world problems involve both many objectives and large-scale decision variables. To tackle such large-scale many-objective optimization problems (MaOPs), this paper proposes a specially tailored evolutionary algorithm based on a decision variable clustering method. To begin with, the decision variable clustering method divides the decision variables into two types: 1) convergence-related variables and 2) diversity-related variables. Afterward, to optimize the two types of decision variables, a convergence optimization strategy and a diversity optimization strategy are adopted. In addition, a fast nondominated sorting approach is developed to further improve the computational efficiency of the proposed algorithm. To assess the performance of the proposed algorithm, empirical experiments have been conducted on a variety of large-scale MaOPs with up to ten objectives and 5000 decision variables. Our experimental results demonstrate that the proposed algorithm has significant advantages over several state-of-the-art evolutionary algorithms in terms of scalability to decision variables on MaOPs.
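The paper's fast nondominated sorting approach is its own contribution; for reference, here is the classic NSGA-II style procedure it improves upon, which peels the population into fronts using dominance counts. This is the textbook algorithm, not the paper's variant.

```python
def dominates(a, b):
    """a dominates b (minimization): no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fast_nondominated_sort(pop):
    """Classic fast nondominated sort into fronts; O(M*N^2) comparisons."""
    n = len(pop)
    S = [[] for _ in range(n)]      # solutions each i dominates
    cnt = [0] * n                   # how many solutions dominate i
    fronts = [[]]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(pop[i], pop[j]):
                S[i].append(j)
            elif dominates(pop[j], pop[i]):
                cnt[i] += 1
        if cnt[i] == 0:
            fronts[0].append(i)
    while fronts[-1]:
        nxt = []
        for i in fronts[-1]:
            for j in S[i]:
                cnt[j] -= 1
                if cnt[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
    return fronts[:-1]

pop = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
print(fast_nondominated_sort(pop))   # [[0, 1, 2], [3], [4]]
```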
Sharding has shown great potential to scale out blockchains. It divides nodes into smaller groups, which allows for partial transaction processing, relaying, and storage. Hence, instead of running one blockchain, we run multiple blockchains in parallel and call each one a shard. Sharding can address shortcomings caused by the compulsory duplication of three resources in blockchains: computation, communication, and storage. The most pressing issue in blockchains today is throughput. In this paper, we propose new queueing-theoretic models to derive the maximum throughput of sharded blockchains. We consider two cases: a fully sharded blockchain and computation sharding. We model each with a queueing network that exploits signals to account for block production as well as multi-destination cross-shard transactions. We ensure that quasi-reversibility is satisfied for every queue in our models, so that they fall into the category of product-form queueing networks. We then obtain a closed-form solution for the maximum stable throughput of these systems with respect to block size, block rate, the number of destinations in transactions, and the number of shards. Comparing the results obtained from the two sharding systems, we conclude that the extent of sharding in different domains plays a significant role in scalability.
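As a deliberately simplified back-of-envelope companion (not the paper's product-form derivation), the snippet below bounds the stable throughput of a sharded chain when each of k shards seals blocks of B transactions at rate mu and a transaction touches d shards on average; the numbers are illustrative assumptions.

```python
# Each cross-shard transaction consumes one "slot" in each of the
# d shards it touches, so aggregate capacity is shared across them.
def max_throughput(k, mu, B, d):
    return k * mu * B / d   # transactions per second, stability bound

# Illustrative parameters, not values from the paper:
print(max_throughput(k=16, mu=0.1, B=1000, d=2.5))   # 640.0 tx/s
```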
Secure state estimation is the problem of estimating the state of a dynamical system from a set of noisy and adversarially corrupted measurements. Intrinsically a combinatorial problem, secure state estimation has traditionally been addressed either by brute-force search, which suffers from scalability issues, or via convex relaxations, using algorithms that terminate in polynomial time but are not necessarily sound. In this paper, we present a novel algorithm that uses a satisfiability modulo theories approach to harness the complexity of secure state estimation. We leverage results from formal methods over the reals to provide guarantees on the soundness and completeness of our algorithm. Moreover, we discuss its scalability properties by providing upper bounds on its runtime performance. Numerical simulations support our arguments by showing an order-of-magnitude decrease in execution time with respect to alternative techniques. Finally, the effectiveness of the proposed algorithm is demonstrated by applying it to the problem of controlling an unmanned ground vehicle.
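To see why the problem is intrinsically combinatorial, the sketch below implements the brute-force baseline the abstract contrasts against: enumerate candidate attacked-sensor sets of size at most s, fit the state to the remaining measurements, and accept a subset whose residual is consistent with the (here noiseless) model. The paper's SMT approach avoids this exhaustive enumeration; the static setup and tolerance are illustrative assumptions.

```python
import itertools
import numpy as np

def brute_force_sse(C, y, s, tol=1e-3):
    """Brute-force secure state estimation over y = C x with at most
    s adversarially corrupted entries of y (noiseless toy setting)."""
    p = len(y)
    for k in range(s + 1):
        for attacked in itertools.combinations(range(p), k):
            keep = [i for i in range(p) if i not in attacked]
            x, *_ = np.linalg.lstsq(C[keep], y[keep], rcond=None)
            if np.max(np.abs(C[keep] @ x - y[keep])) < tol:
                return x, set(attacked)
    return None, None

rng = np.random.default_rng(1)
C = rng.normal(size=(6, 2))              # 6 sensors, 2 states
x_true = np.array([1.0, -2.0])
y = C @ x_true
y[3] += 5.0                              # one adversarially corrupted sensor
x_hat, bad = brute_force_sse(C, y, s=1)
print(x_hat, bad)                        # recovers x_true, flags sensor 3
```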
The class of random features is one of the most popular techniques for speeding up kernel methods in large-scale problems. Related works have been recognized by the NeurIPS Test-of-Time Award in 2017 and as an ICML Best Paper Finalist in 2019. The body of work on random features has grown rapidly, so a comprehensive overview of this topic, explaining the connections among various algorithms and theoretical results, is desirable. In this survey, we systematically review the work on random features from the past ten years. First, the motivations, characteristics, and contributions of representative random-features-based algorithms are summarized according to their sampling schemes, learning procedures, variance-reduction properties, and how they exploit training data. Second, we review theoretical results centered on the following key question: how many random features are needed to ensure high approximation quality, or no loss in the empirical/expected risk of the learned estimator? Third, we provide a comprehensive evaluation of popular random-features-based algorithms on several large-scale benchmark datasets and discuss their approximation quality and prediction performance for classification. Last, we discuss the relationship between random features and modern over-parameterized deep neural networks (DNNs), including the use of high-dimensional random features in the analysis of DNNs and the gaps between current theoretical and empirical results. This survey may serve as a gentle introduction to the topic and as a users' guide for practitioners interested in applying representative algorithms and understanding the theoretical results under various technical assumptions. We hope that this survey will facilitate discussion of the open problems in this area and, more importantly, shed light on future research directions. Due to the page limit, we suggest that readers refer to the full version of this survey at https://arxiv.org/abs/2004.11154.
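For readers new to the area, the canonical random Fourier features construction for the RBF kernel (in the style of Rahimi and Recht, which the surveyed literature builds on) fits in a few lines of NumPy; the feature dimension and bandwidth below are illustrative.

```python
import numpy as np

def rff(X, D=200, gamma=1.0, seed=0):
    """Random Fourier features approximating the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2), so that z(x) @ z(y) ~ k(x, y)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))  # Bochner sampling
    b = rng.uniform(0, 2 * np.pi, size=D)                  # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = np.random.default_rng(1).normal(size=(5, 3))
Z = rff(X, D=5000)
K_exact = np.exp(-np.sum((X[:, None] - X[None]) ** 2, axis=-1))
print(np.max(np.abs(Z @ Z.T - K_exact)))   # small approximation error
```

Training a linear model on Z then approximates kernel ridge regression or kernel SVM at a fraction of the cost, which is the speedup the survey's opening sentence refers to.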