Coded caching is a technique that generalizes conventional caching and promises significant reductions in traffic over caching networks. However, the basic coded caching scheme requires that each file hosted in the server be partitioned into a large number (i.e., the subpacketization level) of non-overlapping subfiles. From a practical perspective, this is problematic as it means that prior schemes are only applicable when the size of the files is extremely large. In this paper, we propose coded caching schemes based on combinatorial structures called resolvable designs. These structures can be obtained in a natural manner from linear block codes whose generator matrices possess certain rank properties. We obtain several schemes with subpacketization levels substantially lower than the basic scheme at the cost of an increased rate. Depending on the system parameters, our approach allows us to operate at various points on the subpacketization level vs. rate tradeoff.
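As a concrete illustration (a minimal sketch, assuming the standard parameters of the basic Maddah-Ali-Niesen scheme: $K$ users, cache fraction $M/N$, and integer $t = KM/N$), the subpacketization level $\binom{K}{t}$ grows exponentially in $K$ while the delivery rate stays modest:

```python
from math import comb

def mn_scheme(K, t):
    """Subpacketization level and delivery rate of the basic coded
    caching scheme for K users with integer parameter t = K*M/N."""
    subpacketization = comb(K, t)        # non-overlapping subfiles per file
    rate = comb(K, t + 1) / comb(K, t)   # equals (K - t) / (t + 1)
    return subpacketization, rate

for K in (8, 16, 32, 64):
    F, R = mn_scheme(K, t=K // 2)        # cache fraction M/N = 1/2
    print(f"K={K:3d}: subfiles per file = {F:>22,d}, rate = {R:.3f}")
```

This is precisely the growth that the resolvable-design constructions trade against a higher rate.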
Improved Lower Bounds for Coded Caching. Ghasemi, Hooshang; Ramamoorthy, Aditya. IEEE Transactions on Information Theory, vol. 63, no. 7, July 2017 (journal article, peer reviewed, open access).
Caching is often used in content delivery networks as a mechanism for reducing network traffic. Recently, the technique of coded caching was introduced, whereby both coding in the caches and coded transmission signals from the central server were considered. Prior results in this area demonstrate that carefully designing the placement of content in the caches and the coded delivery signals from the server allows for a system where the delivery rates can be significantly smaller than those of conventional schemes. However, matching upper and lower bounds on the transmission rate have not yet been obtained. In this paper, we derive tighter lower bounds on the coded caching rate than were previously known. We demonstrate that this problem can equivalently be posed as a combinatorial problem of optimally labeling the leaves of a directed tree. Our proposed labeling algorithm allows for significantly improved lower bounds on the coded caching rate. Furthermore, we study certain structural properties of our algorithm that allow us to analytically quantify improvements on the rate lower bound for general values of the problem parameters. This allows us to obtain a multiplicative gap of at most four between the achievable rate and our lower bound.
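The tree-labeling bound itself is combinatorial and not reproduced here; as a baseline for comparison, here is a minimal sketch of the classical cut-set lower bound (which bounds of this kind tighten), for $N$ files, $K$ users, and cache size $M$:

```python
def cutset_lower_bound(N, K, M):
    """Classical cut-set lower bound on the coded caching rate R(M)
    (Maddah-Ali & Niesen); the tree-labeling bound derived in the
    paper is tighter for many parameter choices."""
    return max(s - s * M / (N // s) for s in range(1, min(N, K) + 1))

print(cutset_lower_bound(N=8, K=8, M=2))  # example instance -> 1.0
```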
Polynomial based methods have recently been used in several works for mitigating the effect of stragglers (slow or failed nodes) in distributed matrix computations. For a system with $n$ worker nodes where $s$ can be stragglers, these approaches allow for an optimal recovery threshold, whereby the intended result can be decoded as long as any $(n-s)$ worker nodes complete their tasks. However, they suffer from serious numerical issues owing to the condition number of the corresponding real Vandermonde-structured recovery matrices; this condition number grows exponentially in $n$. We present a novel approach that leverages the properties of circulant permutation matrices and rotation matrices for coded matrix computation. In addition to having an optimal recovery threshold, we demonstrate an upper bound on the worst-case condition number of our recovery matrices which grows as $\approx O(n^{s+5.5})$; in the practical scenario where $s$ is a constant, this grows polynomially in $n$. Our schemes leverage the well-behaved conditioning of complex Vandermonde matrices with parameters on the complex unit circle, while still working with computation over the reals. Exhaustive experimental results demonstrate that our proposed method has condition numbers that are orders of magnitude lower than those of prior work.
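The conditioning contrast the abstract refers to is easy to reproduce; a minimal numpy sketch (the size $n = 24$ is an arbitrary choice) comparing a real-noded Vandermonde matrix with one whose nodes are the $n$-th roots of unity:

```python
import numpy as np

n = 24
real_nodes = np.linspace(0.1, 1.0, n)                  # real parameters
circle_nodes = np.exp(2j * np.pi * np.arange(n) / n)   # n-th roots of unity

V_real = np.vander(real_nodes, increasing=True)
V_circle = np.vander(circle_nodes, increasing=True)    # scaled DFT matrix

print(f"cond(real Vandermonde):        {np.linalg.cond(V_real):.3e}")
print(f"cond(unit-circle Vandermonde): {np.linalg.cond(V_circle):.3e}")  # ~1.0
```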
Distributed computing frameworks such as MapReduce are often used to process large computational jobs. They operate by partitioning each job into smaller tasks executed on different servers. The servers also need to exchange intermediate values to complete the computation. Experimental evidence suggests that this so-called Shuffle phase can be a significant part of the overall execution time for several classes of jobs. Prior work has demonstrated a natural tradeoff between computation and communication whereby running redundant copies of jobs can reduce the Shuffle traffic load, thereby leading to reduced overall execution times. For a single job, the main drawback of this approach is that it requires the original job to be split into a number of files that grows exponentially in the system parameters. When extended to multiple jobs (with specific function types), these techniques suffer from a limitation of a similar flavor, i.e., they require an exponentially large number of jobs to be executed. In practical scenarios, these requirements can significantly reduce the promised gains of the method. In this work, we show that a class of combinatorial structures called resolvable designs can be used to develop efficient coded distributed computing schemes for both the single and multiple job scenarios considered in prior work. We present both theoretical analysis and exhaustive experimental results (on Amazon EC2 clusters) that demonstrate the performance advantages of our method. For the single-job case we obtain a speed-up of 4.69x over the baseline approach (and 2.6x over prior work), and for the multiple-job case a speed-up of 4.31x over the baseline.
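For context, a minimal sketch of the computation-communication tradeoff established in the prior work referenced above ($K$ servers, computation load $r$, Shuffle load $L(r) = \frac{1}{r}(1 - \frac{r}{K})$); here $\binom{K}{r}$ is shown only as the order of growth of the file-splitting requirement that resolvable designs are used to reduce:

```python
from math import comb

def cdc_tradeoff(K, r):
    """Shuffle load and (order of the) required number of file groups
    for coded distributed computing with computation load r."""
    shuffle_load = (1 / r) * (1 - r / K)
    file_groups = comb(K, r)             # grows exponentially in K
    return shuffle_load, file_groups

for r in (1, 2, 4, 8):
    L, F = cdc_tradeoff(K=16, r=r)
    print(f"r={r}: shuffle load = {L:.4f}, file groups ~ {F:,d}")
```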
This paper introduces a network coding-based protection scheme against single- and multiple-link failures. The proposed strategy ensures that in a connection, each node receives two copies of the same data unit: one copy on the working circuit and a second copy that can be extracted from linear combinations of data units transmitted on a shared protection path. This guarantees instantaneous recovery of data units upon the failure of a working circuit. The strategy can be implemented at an overlay layer, which makes its deployment simple and scalable. While the proposed strategy is similar in spirit to the work of Kamal (2007, 2010), there are significant differences. In particular, it provides protection against multiple-link failures. The new scheme is simpler, less expensive, and does not require the synchronization required by the original scheme. The sharing of the protection circuit by a number of connections is the key to reducing the cost of protection. This paper also compares the cost of the proposed scheme to the 1+1 and shared backup path protection (SBPP) strategies and establishes the benefits of our strategy.
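A schematic sketch of the coding idea (hypothetical toy values; real deployments operate on data units over a finite field, here simply GF(2) via XOR): the protection path carries a linear combination of the connections' data units, and a node whose working circuit fails recovers its data unit by subtracting out the rest:

```python
from functools import reduce
from operator import xor

# Data units of three connections, each also sent on its working circuit.
data = {"c1": 0b1011, "c2": 0b0110, "c3": 0b1101}

# The shared protection path carries their XOR (a GF(2) linear combination).
protection = reduce(xor, data.values())

# Working circuit of c2 fails: XOR out the other data units to recover it.
recovered = reduce(xor, (v for k, v in data.items() if k != "c2"), protection)
assert recovered == data["c2"]
```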
Distributed matrix computations over large clusters can suffer from the problem of slow or failed worker nodes (called stragglers) which can dominate the overall job execution time. Coded computation utilizes concepts from erasure coding to mitigate the effect of stragglers by running "coded" copies of tasks comprising a job; stragglers are typically treated as erasures. While this is useful, there are issues with applying, e.g., MDS codes in a straightforward manner. Several practical matrix computation scenarios involve sparse matrices. MDS codes typically require dense linear combinations of submatrices of the original matrices which destroy their inherent sparsity. This is problematic as it results in significantly higher worker computation times. Moreover, treating slow nodes as erasures ignores the potentially useful partial computations performed by them. Furthermore, some MDS techniques also suffer from significant numerical stability issues. In this work we present schemes that allow us to leverage partial computation by stragglers while imposing constraints on the level of coding that is required in generating the encoded submatrices. This significantly reduces the worker computation time as compared to previous approaches and results in improved numerical stability in the decoding process. Exhaustive numerical experiments on Amazon Web Services (AWS) clusters support our findings.
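The sparsity-destruction effect is easy to see in a small scipy sketch (the sizes and the 1% density are arbitrary illustrative choices): a dense random combination of sparse submatrices, as an MDS-style code would form, has roughly the union of their supports:

```python
import numpy as np
import scipy.sparse as sp
from functools import reduce

rng = np.random.default_rng(0)
blocks = [sp.random(1000, 1000, density=0.01, random_state=i)
          for i in range(4)]            # four sparse submatrices of A

# MDS-style encoded submatrix: dense linear combination of all blocks.
coded = reduce(lambda X, Y: X + Y,
               (float(rng.standard_normal()) * B for B in blocks))

print(f"density of one uncoded block: {blocks[0].nnz / 1e6:.4f}")   # 0.0100
print(f"density of the coded block:   {coded.nnz / 1e6:.4f}")       # ~0.0394
```

The roughly 4x denser encoded block translates directly into longer worker computation times, which is what the constrained-coding schemes above avoid.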
We present a new class of irregular low-density parity-check (LDPC) codes for moderate block lengths (up to a few thousand bits) that are well-suited for rate-compatible puncturing. The proposed codes show good performance under puncturing over a wide range of rates and are suitable for use in incremental redundancy hybrid automatic repeat request (ARQ) systems. In addition, these codes are linear-time encodable with simple shift-register circuits. For a block length of 1200 bits the codes outperform optimized irregular LDPC codes and extended irregular repeat-accumulate (eIRA) codes for all puncturing rates from 0.6 to 0.9 (base code performance is almost the same) and are particularly good at high puncturing rates, where good puncturing performance has previously been difficult to achieve.
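The rate bookkeeping behind such a rate-compatible family is simple; a minimal sketch, assuming (for illustration only, since the abstract does not state it) a rate-1/2 mother code of block length 1200:

```python
def punctured_rate(k, n, p):
    """Effective code rate after puncturing p of the n coded bits."""
    return k / (n - p)

k, n = 600, 1200                         # assumed rate-1/2 mother code
for target in (0.6, 0.7, 0.8, 0.9):
    p = n - round(k / target)            # bits to puncture for the target
    print(f"target {target}: puncture {p} bits -> rate "
          f"{punctured_rate(k, n, p):.3f}")
```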
Coded caching is a technique that promises huge reductions in network traffic in content-delivery networks. However, the original formulation, and several subsequent contributions in the area, assume that the file requests from the users are synchronized, i.e., that they arrive at the server at the same time. In this work, we formulate and study the coded caching problem when the file requests from the users arrive at different times. We assume that each user also has a prescribed deadline by which he/she wants the request to be completed. In the offline case, the server knows the arrival times before starting transmission; in the online case, the user requests are revealed to the server over time. We present a linear programming formulation for the offline case that minimizes the overall transmission rate from the server subject to the constraint that each user meets his/her deadline. While the online case is much harder, we introduce a novel heuristic for it and show that, under certain conditions, the request of each user can be satisfied within his/her deadline with high probability. Our simulation results indicate that in the presence of mild asynchronism, much of the benefit of coded caching can still be leveraged.
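As a toy abstraction only (it captures the arrival/deadline window structure but none of the coded-multicasting opportunities in the paper's actual formulation), an offline LP of this flavor can be set up with scipy; all instance values here are hypothetical:

```python
import numpy as np
from scipy.optimize import linprog

T = 6                                   # time slots
windows = [(0, 2), (1, 4), (3, 5)]      # (arrival, deadline) per user

# Minimize the total transmitted rate sum_t x[t], subject to each user
# receiving at least 1 unit within its window: -sum_{t in win} x[t] <= -1.
A_ub = np.zeros((len(windows), T))
for i, (a, d) in enumerate(windows):
    A_ub[i, a:d + 1] = -1.0

res = linprog(c=np.ones(T), A_ub=A_ub, b_ub=-np.ones(len(windows)),
              bounds=[(0, None)] * T)
print("per-slot rates:", res.x, "| total rate:", res.fun)
```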
Distributed matrix computations (matrix-matrix or matrix-vector multiplications) are well-recognized to suffer from the problem of stragglers (slow or failed worker nodes). Much of the prior work in this area either (i) is sub-optimal in terms of its straggler resilience, or (ii) suffers from numerical problems, i.e., a blow-up of round-off errors in the decoded result owing to the high condition numbers of the corresponding decoding matrices. Our work presents a convolutional coding approach to this problem that removes these limitations. It is optimal in terms of its straggler resilience, and has excellent numerical robustness as long as the workers' storage capacity is slightly higher than the fundamental lower bound. Moreover, it can be decoded using a fast peeling decoder that only involves add/subtract operations. Our second approach has marginally higher decoding complexity than the first one, but allows us to operate arbitrarily close to the storage capacity lower bound. Its numerical robustness can be theoretically quantified by deriving a computable upper bound on the worst-case condition number over all possible decoding matrices, by drawing connections with the properties of large block Toeplitz matrices. All of the above claims are backed up by extensive experiments on the AWS cloud platform.
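A single-parity toy sketch of the peeling idea (the paper's convolutional construction is far more general; this only shows why decoding can reduce to add/subtract operations):

```python
import numpy as np

rng = np.random.default_rng(1)
A1, A2 = rng.standard_normal((2, 50, 20))   # two submatrices of A
x = rng.standard_normal(20)

# Three workers, any two of which suffice (one straggler tolerated).
y1, y2, y3 = A1 @ x, A2 @ x, (A1 + A2) @ x

# Worker 1 straggles: peel its block off the parity with one subtraction.
assert np.allclose(y3 - y2, y1)
```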
Practical random network coding based schemes for multicast include a header in each packet that records the transformation between the source packets and the packet received at the terminal. The header introduces an overhead that can be significant in certain scenarios. In previous work, parity-check matrices of error control codes, along with error decoding, were used to reduce this overhead. In this work we propose novel packet formats that allow us to use erasure decoding and list decoding. Both schemes have a smaller overhead compared to the error decoding based scheme when the number of sources combined in a packet is not too small.
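For reference, a sketch of the baseline random-linear-network-coding packet format whose header this line of work compresses (GF(2) for simplicity; the erasure- and list-decoding packet formats themselves are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)
n_sources, payload_len = 4, 16
sources = rng.integers(0, 2, size=(n_sources, payload_len))  # GF(2) payloads

# Each packet = (coding vector, coded payload); the coding vector is the
# per-packet header overhead, here n_sources bits.
C = rng.integers(0, 2, size=(n_sources, n_sources))
packets = [(C[i], C[i] @ sources % 2) for i in range(n_sources)]

print("header overhead per packet:", n_sources, "bits")
print("decodable over GF(2):", round(np.linalg.det(C)) % 2 == 1)
```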