This paper investigates the problem of distributed storage of electronic documents (both metadata and files) in decentralized blockchain-based b2b systems (DApps). We consider the need to reduce the cost of implementing such systems and the insufficient elaboration of big-data storage in DLT. We propose an approach for building such systems that optimizes the size of the required storage (by using erasure coding) while providing secure data storage in the geographically distributed systems of a company or a consortium of companies. The novelty of this solution is that we are the first to combine enterprise DLT with distributed file storage in which the availability of files is controlled. Our experimental results demonstrate that the speed of the described DApp is comparable to known b2c torrent projects and justify the choice of Hyperledger Fabric and Ethereum Enterprise for this purpose. The test results also show that public blockchain networks are not suitable for building such a b2b system. The proposed system addresses the main challenges of distributed data storage by grouping data into clusters managed by a load balancer, while preventing data tampering through a blockchain network. The described DApp storage methodology scales horizontally in terms of distributed file storage and can be deployed on cloud computing technologies while minimizing the required storage space. We compare this approach with known methods of file storage in distributed systems, including central storage, torrents, IPFS, and Storj. We calculate the reliability of the approach and compare the result with traditional solutions based on full backup.
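The storage savings of erasure coding over full backup can be illustrated with a minimal sketch: split a file into k data chunks plus one XOR parity chunk (a toy single-parity code, not the production-grade codes such a system would use), so losing any one chunk is survivable at (k+1)/k storage cost instead of the 2x cost of a full replica. The chunk sizes and payload here are assumptions for illustration only.

```python
# Toy erasure-coding sketch: k data chunks + 1 XOR parity chunk.
# Stands in for real erasure codes; names and sizes are illustrative.

def encode(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-sized chunks and append one XOR parity chunk."""
    size = -(-len(data) // k)  # ceiling division
    chunks = [data[i * size:(i + 1) * size].ljust(size, b"\0")
              for i in range(k)]
    parity = bytes(size)
    for c in chunks:
        parity = bytes(a ^ b for a, b in zip(parity, c))
    return chunks + [parity]

def recover(chunks: list) -> list:
    """Rebuild the single missing chunk (marked None) by XOR-ing survivors."""
    missing = chunks.index(None)
    size = len(next(c for c in chunks if c is not None))
    rebuilt = bytes(size)
    for i, c in enumerate(chunks):
        if i != missing:
            rebuilt = bytes(a ^ b for a, b in zip(rebuilt, c))
    out = list(chunks)
    out[missing] = rebuilt
    return out

data = b"electronic document payload"
stored = encode(data, k=4)                  # 5 chunks: 1.25x storage overhead
damaged = stored[:2] + [None] + stored[3:]  # one storage node lost
restored = recover(damaged)
assert restored[2] == stored[2]
```

For k=4 the stored volume is 1.25x the original, versus 2x for a single full backup copy, while any one lost chunk remains recoverable; larger k or multi-parity codes trade overhead against fault tolerance.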
In this paper we present the design of Julunga, a large distributed file storage system for cloud computing environments. Julunga is designed for federated data center environments in which multiple data centers across the globe interconnect and present a coherent single-system view to end users. In Julunga, the metadata, the namespace, and the data blocks of files are completely distributed, with no hard limit on the number of files that a single directory can accommodate and no physical limit on the size of a file: Julunga readily supports exabyte-sized files as well as large numbers of concurrent users updating, reading, and writing the same file or directory. The locations of a file's data blocks are determined by functions, eliminating the need for file allocation tables. Several new data structures and algorithms were designed in which the locality and preferences of users are taken into account to provide optimal storage locations for files and metadata. In this paper we elaborate on the design of the fundamental building blocks of the distributed storage system and compare it with the designs of earlier file storage systems.
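Determining block locations "by using functions" rather than allocation tables can be sketched with rendezvous (highest-random-weight) hashing, a standard table-free placement technique; the node names, hash choice, and replica count below are assumptions of this sketch, not details of Julunga's actual functions.

```python
import hashlib

def place_block(file_id: str, block_idx: int,
                nodes: list, replicas: int = 3) -> list:
    """Pick storage nodes for a block by hashing every (node, block) pair
    and keeping the highest scorers -- no allocation table required."""
    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}/{file_id}/{block_idx}".encode()).digest()
        return int.from_bytes(digest, "big")
    return sorted(nodes, key=score, reverse=True)[:replicas]

# Hypothetical federated deployment: 3 data centers, 4 nodes each.
nodes = [f"dc{dc}-node{n}" for dc in range(3) for n in range(4)]
primary = place_block("reports/q3.dat", 0, nodes)

# Every client evaluating the same function finds the same nodes,
# and removing a node only remaps the blocks that node held.
assert place_block("reports/q3.dat", 0, nodes) == primary
```

Because placement is a pure function of the block identity and the node list, clients locate blocks without consulting any central metadata table, which is the property the abstract attributes to Julunga's design.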
We explore the feasibility of implementing a reliable, high-performance distributed storage system on a commodity computing cluster. Files are distributed across storage nodes using erasure coding with small low-density parity-check (LDPC) codes, which provide high reliability at small storage and performance overhead. We present performance measurements taken on a prototype system of 50 nodes that self-organise using a peer-to-peer overlay.
In this paper, the issue of financial credential asset tokens is studied in the context of blockchain technology. To address the problems of large-file storage and poor algorithm performance in current research on financial credential asset tokens, blockchain technology is applied to protect the data of the business system. By packaging the contents of financial credentials into blocks, malicious tampering with financial credentials by operators is effectively prevented. The work contains two innovations. First, an electronic storage technology for financial credentials that combines a distributed file system with the blockchain solves the problem that individual transaction messages on current blockchains are unsuited to carrying large file attachments. Second, a new algorithm, CT-DPCFT, is proposed to improve the performance of the consensus algorithm. As a result, financial credentials can be packaged onto the chain in a lightweight manner, and consensus can be reached efficiently on the business side. Experimental results show that CT-DPCFT mitigates, to some extent, the difficulty of storing large files and the poor algorithm performance of current financial credential asset tokens.
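Combining a distributed file system with a blockchain typically rests on one invariant: the bulky credential file lives off-chain, and only its digest is anchored on-chain. A minimal sketch of that general pattern follows; the function names and payload are illustrative assumptions, not the paper's CT-DPCFT protocol.

```python
import hashlib

def anchor(credential_file: bytes) -> str:
    """Record only the SHA-256 digest of the large attachment on-chain,
    keeping the transaction itself lightweight."""
    return hashlib.sha256(credential_file).hexdigest()

def verify(credential_file: bytes, on_chain_digest: str) -> bool:
    """Any party can re-hash the off-chain file and compare digests
    to detect tampering with the stored credential."""
    return hashlib.sha256(credential_file).hexdigest() == on_chain_digest

original = b"financial credential payload"
digest = anchor(original)            # 64 hex chars, regardless of file size
tampered = original + b"!"           # operator modifies the off-chain copy
assert verify(original, digest)
assert not verify(tampered, digest)
```

The on-chain record stays constant-size no matter how large the attachment is, which is why this split resolves the large-file limitation the abstract describes.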
Cloud collaboration, the sharing, storing, and co-authoring of files, is a billion-dollar industry. In the current age of information technology, cloud collaboration is expected to see significant growth as more organizations look to leverage its benefits, specifically in the areas of flexibility, cost efficiency, and security. However, existing systems generally operate in centralized clusters to achieve high performance; although they deliver indisputable benefits, they have several inherent weaknesses, such as high server costs for service providers, illegal data mining in trust-based architectures, security loopholes, and unethical government surveillance. A large-scale, decentralized resource-sharing system can mitigate these traditional server expenses, data failures, and outages while enhancing the security and privacy of data. This dissertation presents the background to the problem, its impact on adoption, the existing research, and a proposed design for storing, sharing, and co-authoring files. The design describes a decentralized resource (storage and network) sharing system with real-time collaborative editing, peer (node) management, and redundancy schemes to manage the fault tolerance of the distributed storage.
Ceph is a distributed file system that provides high performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable object storage devices (OSDs). In this paper, we investigate the performance of Ceph on an OpenStack cloud using well-known benchmarks. Our results demonstrate its good performance and scalability.
The volume of data processed today in various applications is phenomenally large. While the computational speed of processors has increased manyfold over the last few decades, storage system designs have not kept pace with the demands of processing these data. A crucial entity supporting the processing of such enormous volumes of data is the file system. Until a few years ago, most file system designs relied on a single metadata and namespace server; distributed metadata and namespace designs have emerged only recently. However, most of these distributed namespace designs are based on cluster systems in which the underlying environment is predictable and can be controlled. In this paper, we present Bristrita, a distributed file system metadata and namespace design layered on top of a peer-to-peer network. Bristrita supports a wide variety of file system operations and files of unlimited size, places no limit on the depth of the namespace tree or the number of files in each directory, and is resilient to failures. The paper includes the design decisions, the algorithms, and their analysis, and shows that Bristrita can support users across multiple geographical regions over a wide-area network of computing nodes and storage clusters, with ad-hoc nodes joining and leaving the network.