The vast quantity of information brought by big data, together with evolving computer hardware, has fueled success stories in the machine learning community. At the same time, it poses challenges for Gaussian process regression (GPR), a well-known nonparametric and interpretable Bayesian model that suffers from cubic complexity in the data size. To improve scalability while retaining desirable prediction quality, a variety of scalable GPs have been presented. However, they have not yet been comprehensively reviewed and analyzed so as to be well understood by both academia and industry. Given the explosion of data size, a review of scalable GPs is timely and important for the GP community. To this end, this article reviews state-of-the-art scalable GPs in two main categories: global approximations, which distill the entire data, and local approximations, which divide the data for subspace learning. For global approximations, we mainly focus on sparse approximations, comprising prior approximations that modify the prior but perform exact inference, posterior approximations that retain the exact prior but perform approximate inference, and structured sparse approximations that exploit specific structures in the kernel matrix; for local approximations, we highlight the mixture/product of experts, which conducts model averaging over multiple local experts to boost predictions. To present a complete review, recent advances for improving the scalability and capability of scalable GPs are also reviewed. Finally, the extensions and open issues of scalable GPs in various scenarios are reviewed and discussed to inspire novel ideas for future research avenues.
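As a concrete illustration of the cubic-complexity bottleneck and the sparse-approximation remedy surveyed in this abstract, the following numpy sketch compares the exact GPR posterior mean, which requires an O(n³) solve, with a subset-of-regressors (Nyström-style) prior approximation that costs O(nm²) for m ≪ n inducing points. The kernel, lengthscale, and inducing-point placement are illustrative choices, not the reviewed methods themselves:

```python
import numpy as np

def rbf(X1, X2, ls=0.5):
    # Squared-exponential (RBF) kernel
    d = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d / ls**2)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (200, 1))
y = np.sin(6 * X[:, 0]) + 0.05 * rng.standard_normal(200)
Xs = np.linspace(0, 1, 50)[:, None]   # test inputs
noise = 0.05**2

# Exact GPR posterior mean: O(n^3) solve against the full n x n kernel matrix
Knn = rbf(X, X) + noise * np.eye(len(X))
mu_exact = rbf(Xs, X) @ np.linalg.solve(Knn, y)

# Subset-of-regressors (prior) approximation with m << n inducing inputs Z:
# only an m x m system is solved, dropping the cost to O(n m^2)
Z = np.linspace(0, 1, 15)[:, None]
Kmm = rbf(Z, Z)
Knm = rbf(X, Z)
A = Knm.T @ Knm + noise * Kmm + 1e-8 * np.eye(len(Z))  # jitter for stability
mu_sor = rbf(Xs, Z) @ np.linalg.solve(A, Knm.T @ y)

print(np.max(np.abs(mu_exact - mu_sor)))  # small for a smooth kernel
```

With a smooth kernel the full kernel matrix has low effective rank, so 15 inducing points reproduce the exact posterior mean closely; the same idea underlies the prior approximations discussed in the review.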
Next-generation (NG) optical technologies will unveil unique features, namely ultra-high data rates, broadband multiple services, scalable bandwidth, and flexible communications for manifold end-users. Among these optical technologies, free space optical (FSO) technology is a key element for achieving free-space data transmission that meets the requirements of future technologies, owing to its cost-effectiveness, easy deployment, high bandwidth, and high security. In this article, we give an overview of recent progress in FSO technology and the factors that will lead the technology toward ubiquitous application. As part of the review, we provide fundamental concepts across all types of FSO systems, including system architectures comprising single-beam and multiple-beam designs. The review is further expanded into an investigation of rain and haze effects on FSO signal propagation. The final topic we cover is the scalability of an FSO network via the implementation of a hybrid multi-beam FSO system with wavelength division multiplexing (WDM) technology.
•A novel heat supply system for a small district heating reactor is proposed.
•Reactor modules are scalable between 2 and 120 MW.
•All reactor modules are transportable on standard roads, without special permits.
•Reactor modules up to 36 MWth fit inside a standard sea container.
•The concept offers an efficient and dispatchable means of decarbonizing the district heating sector.
Small Modular Reactors (SMRs) are currently considered a potential solution for decarbonizing the district heating sector. The LUT Heat Experimental Reactor (LUTHER) is a concept for a small modular nuclear heating plant that is being designed to meet the demands of Nordic district heating networks while also incorporating high safety standards. This paper presents an extension of the work pursued by LUT University by proposing a reactor module that allows for easy scaling of unit sizes ranging from 2 MW to 120 MW. The pressure tube assembly geometry, which has been developed specifically for the LUTHER reactor module, was analyzed by modeling two significantly different-sized variants that utilize this unique structure. The modular design of LUTHER enables complete factory assembly and the use of standard road transport for unit sizes up to 120 MW. The design prioritizes high inherent safety, with the aim of siting near population centers. The proposed heating reactor concept offers a viable means of decarbonizing the district heating sector by replacing existing combustion-based production with emissions-free nuclear heat.
Indoor localization has recently witnessed a surge in interest due to the wide range of services it can provide by leveraging the Internet of Things (IoT) and ubiquitous connectivity. Different techniques, wireless technologies, and mechanisms have been proposed in the literature to provide indoor localization services and improve the services offered to users. However, there is a lack of an up-to-date survey that incorporates recently proposed accurate and reliable localization systems. In this paper, we provide a detailed survey of different indoor localization techniques, such as angle of arrival (AoA), time of flight (ToF), return time of flight (RTOF), and received signal strength (RSS), based on technologies such as WiFi, radio frequency identification (RFID), ultra-wideband (UWB), and Bluetooth, along with the systems proposed in the literature. The paper primarily discusses localization and positioning of human users and their devices. We highlight the strengths of the existing systems, and, in contrast with existing surveys, we also evaluate different systems from the perspective of energy efficiency, availability, cost, reception range, latency, scalability, and tracking accuracy. Rather than comparing the technologies or techniques themselves, we compare the localization systems and summarize their working principles. We also discuss remaining challenges to accurate indoor localization.
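The RSS-based technique mentioned in this abstract typically converts received power into distance via a log-distance path-loss model and then trilaterates against known anchor positions. The sketch below illustrates that pipeline under hypothetical, noiseless assumptions (reference power at 1 m, path-loss exponent, and anchor layout are all made up for illustration):

```python
import numpy as np

def rss_to_distance(rss_dbm, rss_at_1m=-40.0, path_loss_exp=2.0):
    # Log-distance path-loss model: RSS(d) = RSS(1m) - 10*n*log10(d)
    return 10 ** ((rss_at_1m - rss_dbm) / (10 * path_loss_exp))

def trilaterate(anchors, dists):
    # Linearize |p - a_i|^2 = d_i^2 against the first anchor, then least-squares
    a0, d0 = anchors[0], dists[0]
    A = 2 * (anchors[1:] - a0)
    b = d0**2 - dists[1:]**2 + (anchors[1:]**2).sum(1) - (a0**2).sum()
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p

anchors = np.array([[0., 0.], [10., 0.], [0., 10.], [10., 10.]])  # known beacons
true_p = np.array([3., 4.])
d = np.linalg.norm(anchors - true_p, axis=1)
rss = -40.0 - 10 * 2.0 * np.log10(d)          # simulated noiseless RSS readings
est = trilaterate(anchors, rss_to_distance(rss))
print(est)  # ≈ [3, 4]
```

In practice RSS is noisy and the path-loss exponent is environment-dependent, which is precisely why the surveyed systems differ so much in tracking accuracy.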
ASP: Learn a Universal Neural Solver Wang, Chenguang; Yu, Zhouliang; McAleer, Stephen et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence,
06/2024, Volume:
46, Issue:
6
Journal Article
Peer reviewed
Open access
Applying machine learning to combinatorial optimization problems (COPs) has the potential to improve both efficiency and accuracy. However, existing learning-based solvers often struggle to generalize when faced with changes in problem distributions and scales. In this paper, we propose a new approach called ASP: Adaptive Staircase Policy Space Response Oracle, to address these generalization issues and learn a universal neural solver. ASP consists of two components: Distributional Exploration, which enhances the solver's ability to handle unknown distributions using Policy Space Response Oracles, and Persistent Scale Adaption, which improves scalability through curriculum learning. We have tested ASP on several challenging COPs, including the traveling salesman problem (TSP), the vehicle routing problem, and the prize-collecting TSP, as well as real-world instances from TSPLib and CVRPLib. Our results show that, even with the same model size and a weak training signal, ASP can help neural solvers explore and adapt to unseen distributions and varying scales, achieving superior performance. In particular, compared with the same neural solvers under a standard training pipeline, ASP reduces the optimality gap by 90.9% and 47.43% on generated and real-world instances for TSP, respectively, and by 19% and 45.57% for CVRP.
To address image compressed sensing (CS) problems more efficiently, we present a novel content-aware scalable network, dubbed CASNet, which collectively achieves adaptive sampling-rate allocation, fine granular scalability, and high-quality reconstruction. We first adopt a data-driven saliency detector to evaluate the importance of different image regions and propose a saliency-based block ratio aggregation (BRA) strategy for sampling-rate allocation. A unified learnable generating matrix is then developed to produce a sampling matrix for any CS ratio with an ordered structure. Equipped with an optimization-inspired recovery subnet guided by saliency information and a multi-block training scheme that prevents blocking artifacts, CASNet jointly reconstructs image blocks sampled at various rates with a single model. To accelerate training convergence and improve network robustness, we propose an SVD-based initialization scheme and a random transformation enhancement (RTE) strategy, both of which are extensible without introducing extra parameters. All CASNet components can be combined and learned end-to-end. We further provide a four-stage implementation for evaluation and practical deployment. Experiments demonstrate that CASNet outperforms other CS networks by a large margin, validating the collaboration and mutual support among its components and strategies. Code is available at https://github.com/Guaishou74851/CASNet.
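The core idea of block-based CS sampling with per-block rate allocation can be illustrated with a toy numpy sketch. The saliency score (local variance here) and the random Gaussian measurement matrices are hypothetical stand-ins for CASNet's learned saliency detector and learnable generating matrix, used only to show the mechanics of allocating more measurements to more important blocks:

```python
import numpy as np

rng = np.random.default_rng(1)
B = 8                        # block side length
img = rng.random((32, 32))   # toy "image"

# Split into non-overlapping BxB blocks, flattened to vectors of length B*B
blocks = [img[i:i + B, j:j + B].reshape(-1)
          for i in range(0, 32, B) for j in range(0, 32, B)]

# Hypothetical saliency score per block (stand-in for a learned detector):
# local variance. Allocate measurement counts proportionally to saliency.
sal = np.array([b.var() for b in blocks])
total_meas = int(0.25 * 32 * 32)   # overall 25% sampling rate budget
alloc = np.maximum(1, (total_meas * sal / sal.sum()).astype(int))

# Sample each block with a random Gaussian matrix of its allocated rows:
# y_k = Phi_k x_k, with more rows for more salient blocks
measurements = [rng.standard_normal((m, B * B)) @ b
                for m, b in zip(alloc, blocks)]
print([m.shape for m in measurements[:3]])
```

Reconstruction from these variable-rate measurements is where the learned recovery subnet does its work; this sketch only covers the sampling side.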
Wireless Network Intelligence at the Edge Park, Jihong; Samarakoon, Sumudu; Bennis, Mehdi et al.
Proceedings of the IEEE,
11/2019, Volume:
107, Issue:
11
Journal Article
Peer reviewed
Open access
Fueled by the availability of more data and computing power, recent breakthroughs in cloud-based machine learning (ML) have transformed every aspect of our lives, from face recognition and medical diagnosis to natural language processing. However, classical ML exerts severe demands in terms of energy, memory, and computing resources, limiting its adoption on resource-constrained edge devices. The new breed of intelligent devices and high-stake applications (drones, augmented/virtual reality, autonomous systems, and so on) requires a novel paradigm change calling for distributed, low-latency, and reliable ML at the wireless network edge (referred to as edge ML). In edge ML, training data are unevenly distributed over a large number of edge nodes, each of which has access to only a tiny fraction of the data. Moreover, training and inference are carried out collectively over wireless links, where edge devices communicate and exchange their learned models (not their private data). In a first exploration of its kind, this article examines the key building blocks of edge ML, different neural network architectural splits and their inherent tradeoffs, as well as theoretical and technical enablers stemming from a wide range of mathematical disciplines. Finally, several case studies pertaining to various high-stake applications are presented to demonstrate the effectiveness of edge ML in unlocking the full potential of 5G and beyond.
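The "exchange learned models, not private data" pattern described in this abstract is commonly realized as federated averaging. The following minimal numpy sketch shows the mechanic on a toy linear-regression task with unevenly sized shards; the shard sizes, learning rate, and number of rounds are illustrative assumptions, not parameters from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

def local_sgd(w, X, y, lr=0.1, steps=20):
    # Each edge node trains on its private shard only
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

# Unevenly distributed data across 3 edge nodes, same underlying model
w_true = np.array([2.0, -1.0])
shards = []
for n in (100, 30, 10):   # uneven shard sizes
    X = rng.standard_normal((n, 2))
    shards.append((X, X @ w_true + 0.01 * rng.standard_normal(n)))

# Communication rounds: nodes send model weights (not data);
# the server forms a data-size-weighted average
w_global = np.zeros(2)
for _ in range(10):
    local_models = [local_sgd(w_global, X, y) for X, y in shards]
    sizes = np.array([len(y) for _, y in shards])
    w_global = np.average(local_models, axis=0, weights=sizes)

print(w_global)  # ≈ [2, -1]
```

Only the 2-parameter weight vectors cross the (simulated) wireless link each round, which is the communication-efficiency argument edge ML builds on.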
When training deep neural networks (DNNs), expensive floating-point arithmetic units are used in GPUs or custom neural processing units (NPUs). To reduce the burden of floating-point arithmetic, the community has started exploring more efficient data representations, e.g., block floating point (BFP). The BFP format allows a group of values to share an exponent, which effectively reduces the memory footprint and enables cheaper fixed-point arithmetic for multiply-accumulate (MAC) operations. However, existing BFP-based DNN accelerators target a specific precision, making them less versatile. In this paper, we present FlexBlock, a DNN training accelerator with three BFP modes, possibly differing among activation, weight, and gradient tensors. By configuring FlexBlock to a lower BFP precision, the number of MACs handled by the core increases by up to 4× in 8-bit mode or 16× in 4-bit mode compared to 16-bit mode. To reach this theoretical upper bound, FlexBlock maximizes core utilization across precision levels and layer types, and allows dynamic precision control to keep throughput at its peak without sacrificing training accuracy. We evaluate the effectiveness of FlexBlock using representative DNNs on the CIFAR, ImageNet, and WMT14 datasets. As a result, training with FlexBlock improves training speed by 1.5–5.3× and energy efficiency by 2.4–7.0× compared to other training accelerators.
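The BFP encoding itself (one shared exponent per group, signed fixed-point mantissas) can be sketched in a few lines. This is a generic software illustration of the format, not FlexBlock's hardware datapath; the 8-bit mantissa width is an illustrative choice:

```python
import numpy as np

def to_bfp(x, mant_bits=8):
    # Block floating point: the whole group shares one exponent,
    # chosen so the largest magnitude fits; values become integers
    shared_exp = int(np.ceil(np.log2(np.max(np.abs(x)) + 1e-30)))
    scale = 2.0 ** (shared_exp - (mant_bits - 1))   # weight of one LSB
    lo, hi = -(2 ** (mant_bits - 1)), 2 ** (mant_bits - 1) - 1
    mant = np.clip(np.round(x / scale), lo, hi)
    return mant.astype(np.int32), shared_exp

def from_bfp(mant, shared_exp, mant_bits=8):
    # Decode: integer mantissas times the shared scale
    return mant * 2.0 ** (shared_exp - (mant_bits - 1))

x = np.array([0.75, -0.1, 0.003, 0.5])
mant, e = to_bfp(x)
x_hat = from_bfp(mant, e)
print(np.max(np.abs(x - x_hat)))  # at most half an LSB of quantization error
```

Because all mantissas in the group are plain integers, a dot product within the group reduces to integer MACs plus a single exponent adjustment, which is where the fixed-point hardware savings come from.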
Sharding has shown great potential to scale out blockchains. It divides nodes into smaller groups, which allows for partial transaction processing, relaying, and storage. Hence, instead of running one blockchain, we run multiple blockchains in parallel and call each one a shard. Sharding can address shortcomings due to the compulsory duplication of three resources in blockchains: computation, communication, and storage. The most pressing issue in blockchains today is throughput. In this paper, we propose new queueing-theoretic models to derive the maximum throughput of sharded blockchains. We consider two cases: a fully sharded blockchain and computation sharding. We model each with a queueing network that exploits signals to account for block production as well as multi-destination cross-shard transactions. We ensure that quasi-reversibility is satisfied for every queue in our models, so that they fall into the category of product-form queueing networks. We then obtain a closed-form solution for the maximum stable throughput of these systems with respect to block size, block rate, the number of destinations in transactions, and the number of shards. Comparing the results obtained from the two sharding systems, we conclude that the extent of sharding in different domains plays a significant role in scalability.
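The qualitative tradeoff in this abstract, in which throughput grows with the number of shards but is discounted by cross-shard transaction fan-out, can be shown with a deliberately simplified capacity calculation. This is a toy stand-in for the paper's product-form queueing analysis, not its closed-form result; the per-shard capacity and destination count are illustrative:

```python
def max_throughput(num_shards, shard_capacity, num_destinations):
    # Toy capacity model: each transaction consumes processing work in
    # every one of its destination shards, so the aggregate capacity
    # k * c is shared among d-destination transactions.
    d = min(num_destinations, num_shards)
    return num_shards * shard_capacity / d

# Throughput grows with shards but is discounted by cross-shard fan-out
for k in (1, 4, 16, 64):
    print(k, max_throughput(k, shard_capacity=100.0, num_destinations=3))
```

Even this crude model reproduces the abstract's conclusion: the benefit of adding shards is real but sublinear in regimes where most transactions cross shard boundaries.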