Emerging persistent memory technologies, such as PCM and MRAM, provide opportunities for preserving files in memory, so traditional file system structures may need to be re-examined. Although several file systems have been proposed for memory, most of them deliver limited performance because they do not fully utilize the hardware on the processor side. This paper presents a framework based on a new concept, the "File Virtual Address Space". A file system, the Sustainable In-Memory File System (SIMFS), is designed and implemented, which fully utilizes the memory mapping hardware in the file access path. First, SIMFS embeds the address space of an open file into the process's address space. Then, file accesses are handled by the memory mapping hardware. Several optimization approaches are also presented for the proposed SIMFS. Extensive experiments are conducted, and the results show that SIMFS achieves significant throughput improvements over state-of-the-art in-memory file systems.
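The core idea, letting the memory mapping hardware service file accesses, can be illustrated with a minimal sketch (this is a generic `mmap` example for illustration, not SIMFS's actual implementation):

```python
import mmap
import os
import tempfile

# Create a small backing file (hypothetical example file).
fd, path = tempfile.mkstemp()
os.write(fd, b"\x00" * 4096)

# Map the file into the process's address space; after this,
# file accesses become plain loads/stores handled by the MMU
# rather than read()/write() system calls.
with mmap.mmap(fd, 4096) as m:
    m[0:5] = b"hello"      # a file "write" is a memory store
    data = bytes(m[0:5])   # a file "read" is a memory load

os.close(fd)
os.unlink(path)
print(data)
```

SIMFS goes further by embedding a whole file's address space into the process's page tables, but the access path after mapping is analogous.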
We present H2DP, a holistic heterogeneity-aware data placement scheme for hybrid parallel I/O systems, which consist of HDD servers and SSD servers. Most existing approaches consider only server performance or application I/O pattern heterogeneity in data placement. H2DP considers three axes of heterogeneity: server performance, server space, and application I/O pattern. More specifically, H2DP determines optimized stripe sizes on servers based on server performance, keeps only critical data on all hybrid servers and the remaining data on HDD servers, and dynamically migrates data among different types of servers at run-time. This holistic heterogeneity-awareness enables H2DP to achieve high performance by alleviating server load imbalance, efficiently utilizing SSD space, and accommodating application pattern variation. We have implemented a prototype of H2DP under MPICH2 atop OrangeFS. Extensive experimental results demonstrate that H2DP significantly improves I/O system performance compared to existing data placement schemes.
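The performance-proportional striping idea can be sketched as follows (an illustrative heuristic with made-up bandwidth numbers, not H2DP's actual algorithm):

```python
def stripe_sizes(bandwidth_mbps, total_stripe_kb, align_kb=4):
    """Split a stripe across servers in proportion to each server's
    measured bandwidth, aligned down to align_kb granularity.
    Faster (SSD) servers receive larger stripe units so all servers
    finish their share at roughly the same time."""
    total_bw = sum(bandwidth_mbps)
    sizes = []
    for bw in bandwidth_mbps:
        raw = total_stripe_kb * bw / total_bw
        sizes.append(max(align_kb, int(raw // align_kb) * align_kb))
    return sizes

# Two fast SSD servers and two slow HDD servers (assumed numbers):
print(stripe_sizes([500, 500, 100, 100], 1024))
```

Balancing stripe unit sizes this way addresses the server-performance axis; H2DP additionally handles server space and I/O pattern heterogeneity.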
DMA-assisted I/O for Persistent Memory Li, Dingding; Zhang, Weijie; Dong, Mianxiong ...
IEEE Transactions on Parallel and Distributed Systems, 05/2024, Volume 35, Issue 5
Journal Article
Peer reviewed
Modern local persistent memory (PM) file systems often rely on CPU-based memory copying for data transfer between DRAM and PM, resulting in significant CPU resource consumption. While some nascent systems explore DMA (direct memory access) as an alternative for improved efficiency, the intricacies and trade-offs remain obscure. This paper investigates the feasibility of DMA for PM I/O and argues that it is not a straightforward replacement for CPU-based methods. Two key limitations hinder direct adoption: poor performance for small data and limited bandwidth. To relieve these issues, we propose PM-DMA, a novel I/O mechanism that leverages the strengths of both CPU and DMA. It incorporates three key components: (1) L-Switch, which seamlessly switches between CPU and DMA modes based on workload characteristics, maximizing performance; (2) D-Pool, which reduces DMA setup overhead, improving responsiveness; (3) P-Mode, which allows servicing requests through multiple channels, even hybrid CPU-DMA ones, for enhanced throughput. We implemented PM-DMA on two well-known PM file systems, NOVA and WineFS, utilizing Intel I/OAT technology. Our experimental results demonstrate substantial CPU consumption reductions across diverse workloads. Notably, under heavy load, PM-DMA delivers up to a 10.4× performance improvement.
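An L-Switch-style policy can be sketched in a few lines (the threshold and queue-depth limit below are assumptions for illustration, not PM-DMA's actual parameters):

```python
DMA_THRESHOLD = 16 * 1024  # assumed crossover size in bytes

def choose_engine(request_size, dma_queue_depth, max_depth=32):
    """Pick the CPU for small requests, where DMA setup overhead
    dominates, and DMA for large ones, falling back to the CPU when
    the DMA channel is saturated. A sketch of the switching idea."""
    if request_size < DMA_THRESHOLD:
        return "cpu"           # small copy: setup cost not amortized
    if dma_queue_depth >= max_depth:
        return "cpu"           # DMA channel busy: avoid queueing delay
    return "dma"               # large copy on an idle channel

print(choose_engine(4096, 0))
print(choose_engine(65536, 0))
print(choose_engine(65536, 32))
```

D-Pool and P-Mode then reduce the setup cost and allow hybrid CPU+DMA servicing, respectively, which this sketch does not model.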
Parallel file systems (PFS) are used to distribute data processing and establish shared access to large-scale data. Despite being able to provide high I/O bandwidth on each node, PFS has difficulty utilizing that bandwidth due to the single connection between client and server nodes. To mitigate this bottleneck, users increase the number of connections between the nodes by modifying PFS or their applications; however, it is difficult to modify PFS itself due to its complicated internal structure, so users resort to manual workarounds. In this paper, we propose a user-transparent I/O subsystem, MulConn, that lets users exploit the full I/O bandwidth between nodes. To avoid modifications to PFS and user applications, we have developed a horizontal mount procedure and two I/O scheduling policies, TtoS and TtoM, in the virtual file system (VFS) layer. We expose a single mount point backed by multiple connections by changing the VFS mount structure from a vertical hierarchy to a horizontal one, and the two scheduling policies distribute I/O requests evenly across the connections. The experimental results show that MulConn improves write and read performance by up to 2.6x and 2.8x, respectively, compared with PFS on the existing kernel. In addition, MulConn achieves the best I/O performance that PFS can provide in the given experimental environments.
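Distributing requests evenly across multiple connections can be sketched with a simple round-robin dispatcher (illustrative only; MulConn's TtoS and TtoM policies are implemented inside the kernel's VFS layer):

```python
from itertools import cycle

class RoundRobinScheduler:
    """Spread I/O requests evenly over several client-server
    connections, in the spirit of MulConn's scheduling policies."""
    def __init__(self, connections):
        self._next = cycle(connections)

    def dispatch(self, request):
        # Each call hands the next request to the next connection.
        conn = next(self._next)
        return conn, request

sched = RoundRobinScheduler(["conn0", "conn1", "conn2"])
assignments = [sched.dispatch(f"req{i}")[0] for i in range(6)]
print(assignments)
```

With six requests and three connections, each connection receives exactly two requests, which is the even distribution the paper's policies aim for.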
Linux is a widely used multi-user operating system with applications ranging from personal desktops to commercial heavy-duty web servers. It has built-in security features based on discretionary access control enforced in the form of access control lists, which can be enhanced using the Linux Security Module (LSM) framework. LSM allows inserting security verification hooks for supporting custom security policies. However, there is no support yet for Attribute-Based Access Control (ABAC), an access control model gaining popularity due to its dynamic nature and flexibility. In ABAC, access is granted or denied based on attributes of the subject, object, and environment. In this work, we propose a method for enhancing Linux's security features by integrating ABAC for file system objects using the LSM framework. We examine various kernel and user space components and how they can be made to work together to enforce ABAC policies. Different algorithms and data structures for efficient access request resolution are also investigated. Finally, we carry out an extensive performance evaluation of the ABAC-enabled Linux system and discuss its results.
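The ABAC decision rule, grant access only if subject, object, and environment attributes all satisfy the policy, can be sketched as a toy resolver (this is a minimal illustration, not the paper's kernel implementation):

```python
def abac_allow(policy, subject, obj, env):
    """Grant access iff every attribute condition in the policy
    matches the corresponding subject, object, or environment
    attribute. Hypothetical attribute names below."""
    for scope, attrs in (("subject", subject), ("object", obj), ("env", env)):
        for key, required in policy.get(scope, {}).items():
            if attrs.get(key) != required:
                return False
    return True

# Example policy: HR staff may access payroll files during business hours.
policy = {"subject": {"dept": "hr"},
          "object": {"class": "payroll"},
          "env": {"time": "business_hours"}}

print(abac_allow(policy, {"dept": "hr"}, {"class": "payroll"},
                 {"time": "business_hours"}))
print(abac_allow(policy, {"dept": "eng"}, {"class": "payroll"},
                 {"time": "business_hours"}))
```

In the LSM setting, a check like this would run inside a file-access hook, with attributes fetched from kernel or user space stores; efficient lookup structures for that resolution are what the paper investigates.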
An open-source software framework called the Storage Benchmark Kit (SBK) is used for storage system performance benchmarking. SBK is designed to benchmark any storage client or device using any data type as a payload. SBK supports many concurrent readers and writers issuing large amounts of data to the storage system, and allows end-to-end latency benchmarking across multiple writers and readers. SBK uses standardized performance measures for comparing and evaluating various storage systems and their combinations. The storage solutions supported by SBK include distributed file systems, distributed database systems, single-node or local databases, object storage systems, distributed streaming and messaging platforms, and key-value stores; in particular, SBK supports performance benchmarking of XFS, Kafka streaming storage, and the Hadoop Distributed File System (HDFS). The experimental results show that the proposed method achieves execution times of 65.530 s, 40.826 s, and 30.351 s for 100k, 500k, and 1000k files respectively, an improvement over existing methods such as the simple data interface and the distributed data protection system.
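The end-to-end latency measurement idea, tag each record at write time and compute the delta at read time, can be sketched as follows (a simplified single-process illustration, not SBK's actual harness):

```python
import time
from queue import Queue

def writer(q, n):
    # Tag every record with its send timestamp.
    for i in range(n):
        q.put((time.perf_counter(), f"record{i}"))

def reader(q, n):
    # End-to-end latency = receive time - send time.
    latencies = []
    for _ in range(n):
        sent, _payload = q.get()
        latencies.append(time.perf_counter() - sent)
    return latencies

q = Queue()
writer(q, 1000)
lat = reader(q, 1000)
print(f"mean latency: {sum(lat) / len(lat) * 1e6:.1f} us")
```

A real benchmark would run writers and readers concurrently against the storage system under test and report percentile as well as mean latencies.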
MapReduce is a popular computing model for parallel data processing on large-scale datasets, which can vary from gigabytes to terabytes and petabytes. Though Hadoop MapReduce normally uses the Hadoop Distributed File System (HDFS) as its local file system, it can be configured to use a remote file system. An interesting question then arises: for a given application, which is the best running platform among the different combinations of scale-up and scale-out Hadoop with remote and local file systems? However, there has been no previous research on how different types of applications (e.g., CPU-intensive, data-intensive) with different characteristics (e.g., input data size) can benefit from the different platforms. Thus, in this paper, we conduct a comprehensive performance measurement of different applications on scale-up and scale-out clusters configured with HDFS and a remote file system (i.e., OFS), respectively. We identify and study how different job characteristics (e.g., input data size, the number of file reads/writes, and the amount of computation) affect the performance of different applications on the different platforms. Based on the measurement results, we also propose a performance prediction model to help users select the platform that leads to the minimum latency. Our evaluation using a Facebook workload trace demonstrates the effectiveness of our prediction model. This study is expected to provide guidance for users choosing the best platform to run different applications with different characteristics in environments that provide both remote and local storage, such as HPC clusters and clouds.
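The platform-selection step reduces to "pick the platform with the minimum predicted latency", which can be sketched as follows (the linear predictors and their coefficients below are made-up stand-ins for the paper's actual model):

```python
def pick_platform(models, job):
    """Return the platform whose predicted latency for `job` is
    minimal. `models` maps platform name -> latency predictor."""
    return min(models, key=lambda p: models[p](job))

# Hypothetical latency models (minutes) per platform:
models = {
    "scale-up+HDFS":  lambda j: 10 + 0.5 * j["input_gb"] + 2.0 * j["cpu_score"],
    "scale-out+HDFS": lambda j: 30 + 0.1 * j["input_gb"] + 1.0 * j["cpu_score"],
    "scale-out+OFS":  lambda j: 25 + 0.2 * j["input_gb"] + 1.0 * j["cpu_score"],
}

small_job = {"input_gb": 5, "cpu_score": 1}
big_job = {"input_gb": 500, "cpu_score": 1}
print(pick_platform(models, small_job))  # small input favors scale-up
print(pick_platform(models, big_job))    # large input favors scale-out
```

The interesting part of the paper is fitting such predictors from measured job characteristics (input size, file reads/writes, computation); the selection itself is this simple argmin.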
Context: The purpose of this study was to evaluate and compare the centering ability and canal transportation of TruNatomy, OneCurve, and Jizai file systems to assess their performance in oval-shaped canals using cone-beam computed tomography imaging.
Materials and Methods: Forty-two fully formed single-rooted mandibular premolars were selected with a buccolingual canal size 2-2.5 times the mesiodistal size at 5 mm from the apex, and with 0°-10° canal curvature with a 5-6 mm radius at 5 mm from the apex. The teeth were divided into three groups (n = 14) and prepared with TruNatomy, OneCurve, and Jizai files according to the manufacturers' instructions. Cone-beam computed tomographic images were taken before and after instrumentation. Canal transportation and centering ability were calculated at 3, 6, and 9 mm from the apex in both mesiodistal and buccolingual directions.
Statistical Analysis: Intergroup comparison was done using the Kolmogorov-Smirnov test. Intragroup comparison was done using the Friedman test. Categorical variables were compared using the Chi-square test.
Results: The results did not show any statistically significant difference between the three groups, with TruNatomy and OneCurve showing relatively less canal transportation and a better centering ratio when compared to the Jizai file system.
Conclusions: It can, therefore, be concluded that all three systems used in the study are capable of safely preparing root canals with minimal errors.
Natural disasters can be defined as a blend of natural hazards and vulnerabilities. Each year, natural as well as human-instigated disasters bring about infrastructural damage, distress, revenue losses, and injuries, in addition to a huge death toll. Researchers around the globe are trying to find a unique solution to gather, store, and analyse Big Data (BD) in order to predict results for flood-based prediction systems. This paper proposes ideas and methods for the detection of flood disasters based on IoT, BD, and a convolutional deep neural network (CDNN) to overcome such difficulties. First, the input data is taken from the flood BD. Next, repeated data are removed using HDFS map-reduce. After removal of repeated data, the data are pre-processed using missing value imputation and a normalization function. Then, based on the pre-processed data, rules are generated using a combination-of-attributes method. In the last stage, the generated rules are provided as input to the CDNN classifier, which classifies them as (a) chances for the occurrence of a flood and (b) no chances for the occurrence of a flood. The outcomes obtained from the proposed CDNN method are evaluated on parameters such as Sensitivity, Specificity, Accuracy, Precision, Recall, and F-score. Moreover, when the outcomes are compared with those of other existing algorithms such as Artificial Neural Network (ANN) and Deep Learning Neural Network (DNN), the proposed system gives more accurate results than the other methods.
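The pre-processing stage, mean imputation of missing values followed by min-max normalization, can be sketched as follows (a generic illustration of these two standard steps, not the paper's exact pipeline):

```python
def impute_missing(rows):
    """Replace None entries with the column mean (mean imputation)."""
    cols = list(zip(*rows))
    means = [sum(v for v in c if v is not None) /
             max(1, sum(v is not None for v in c)) for c in cols]
    return [[means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]

def min_max_normalize(rows):
    """Scale every column linearly into [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - lo[j]) / (hi[j] - lo[j]) if hi[j] > lo[j] else 0.0
             for j, v in enumerate(r)] for r in rows]

# Toy flood records with one missing rainfall reading:
data = [[1.0, None], [3.0, 4.0], [5.0, 8.0]]
print(min_max_normalize(impute_missing(data)))
```

After these steps, the cleaned, scaled records feed the rule-generation stage and ultimately the CDNN classifier.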
I/O-Aware Flushing for HPC Caching Filesystem Tatebe, Osamu; Hiraga, Kohei; Ohtsuji, Hiroki
2023 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops),
2023-10-31
Conference Proceeding
The increasing difference in performance between computing and storage has caused significant problems for HPC systems. To reduce this gap, intermediate storage layers have been introduced; however, the flushing strategy is a significant issue in these layers. Dirty data must be flushed to utilize the intermediate storage layers efficiently, although flushing may degrade I/O performance and cause instability because data access and flushing interfere with each other. In this study, an I/O-aware mechanism that does not interfere with I/O activity is proposed for data flushing, targeting HPC-specific workloads. Evaluations based on HPC application benchmarks in MPI-IO and NetCDF demonstrated the performance advantages of the proposed I/O-aware flushing, with the I/O performance of the intermediate storage layers displaying minimal degradation and remaining stable during flushing.
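The "flush only when it will not interfere" idea can be sketched with an idle-window heuristic (the window parameter and the deferral rule are assumptions for illustration, not the paper's actual mechanism):

```python
import time

class IOAwareFlusher:
    """Flush dirty data to backing storage only when no foreground
    I/O has been observed within an idle window, so flushing does
    not compete with application I/O."""
    def __init__(self, idle_window=0.5):
        self.idle_window = idle_window   # seconds of required quiet
        self.last_io = 0.0
        self.dirty = []

    def on_foreground_io(self):
        # Called on every application read/write to the cache layer.
        self.last_io = time.monotonic()

    def maybe_flush(self):
        if time.monotonic() - self.last_io >= self.idle_window:
            flushed, self.dirty = self.dirty, []
            return flushed   # would be written to the backing store
        return []            # defer: foreground I/O is still active

f = IOAwareFlusher(idle_window=0.0)   # zero window: always idle
f.dirty = ["block0", "block1"]
print(f.maybe_flush())
```

A production flusher would also bound how long dirty data may linger, so that deferral under sustained I/O cannot postpone flushing indefinitely.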