[month] [year]

IEEE Information Theory Workshop 2020

Prof. Prasad Krishnan and his students presented the following papers at IEEE Information Theory Workshop 2020, Riva del Garda, Italy held virtually from 11 – 15 April:

Coded Data Rebalancing for Decentralized Distributed Databases –  Sushena Sree K V, Prasad Krishnan.

Research work as explained by the authors: The performance of replication-based distributed databases is affected due to non-uniform storage across storage nodes (also called \textit{data skew}) and reduction in the replication factor during operation, particularly due to node additions or removals. Data rebalancing refers to the communication involved between the nodes in correcting this data skew, while maintaining the replication factor. For carefully designed distributed databases, transmitting coded symbols during the rebalancing phase has been recently shown to reduce the communication load of rebalancing. In this work, we look at balanced distributed databases with \textit{random placement}, in which each data segment is stored in a random subset of r nodes in the system, where r refers to the replication factor of the distributed database. We call these as decentralized databases. For a natural class of such decentralized databases, we propose rebalancing schemes for correcting data skew and the reduction in the replication factor arising due to a single node addition or removal. We give converse arguments which show that our proposed rebalancing schemes are optimal asymptotically in the size of the file.

 

An Umbrella Converse for Data Exchange: Applied to Caching, Computing, Shuffling & Rebalancing – Prasad Krishnan, Lakshmi Natarajan, V Lalitha

Research work as explained by the authors: The problem of data exchange between multiple nodes with (not necessarily uniform) storage and communication capabilities models several current multi-user communication problems like Coded Caching, Data shuffling, Coded Computing, etc. The goal in such problems is to design communication schemes which accomplish the desired data exchange between the nodes with the optimal (minimum) amount of communication load. In this work, we present a converse to such a general data exchange problem between multiple nodes. The expression of the converse depends only on the number of bits to be moved between different subsets of nodes, and does not assume anything further specific about the parameters in the problem. Specific problem formulations, such as those in Coded Caching, Coded Data Shuffling, Coded Distributed Computing, and some of their variants, naturally can be seen as instances of this generic data exchange problem. Applying our generic converse to such problems, we recover known important converses for these settings and some of their variants in a simpler way. Further, for a generic coded caching problem with multiple transmitters, receivers and cache sizes, we show a new general converse which subsumes many existing results. We also employ our bound to obtain a new tight converse bound for the multi-node removal case in the Coded Data Rebalancing problem, in which nodes must exchange information to `rebalance’ a storage cluster after some node failures occur. Finally we relate a `centralized’ version of our bound to the known generalized independence number bound in index coding, and discuss our bound’s tightness in this context.

 

Blind Updates in Coded Caching – Suman Ghosh (IIT Hyderabad), Lakshmi Natarajan, Prasad Krishnan

Research work as explained by the authors: We consider the centralized coded caching system where a library of files is available at the server and their subfiles are cached at the clients as prescribed by a placement delivery array (PDA). We are interested in the problem where a specific file in the library is replaced with a new file at the server, the contents of which are correlated with the file being replaced, and this replacement needs to be communicated to the caches. The server loses the original file when the replacement is done and is unaware of the differences between the two files, whereas each cache has access to specific subfiles of the original file as dictated by the PDA. We model the correlation between the two files by assuming that they differ in at the most ϵ subfiles, and aim to reduce the number of bits broadcast by the server to update the caches. We design a new elegant coded transmission strategy for the server to update the caches blindly, and also identify another simple scheme that is based on MDS codes. We then derive converse bounds on the minimum cost ℓ∗ among all linear strategies. Equipped with these results, we identify ℓ∗ when the caching system uses a PDA of Yan et al. For two other families of PDAs — the Maddah-Ali-Niesen scheme and a scheme by Tang \& Ramamoorthy and Yan et al. — we show that our new scheme has cost ℓ∗(1+o(1)) when the updates are sufficiently sparse, while the scheme using MDS codes has order-optimal cost when the updates are dense.

IEEE Information Theory Workshop, a flagship conference of the IEEE Information Theory Society, is a forum for technical exchange among scientists and engineers working on the several aspects of Information Theory. Every edition emphasizes a number of related topics that Information Theory impacts. 

  •  
  •  
  •  
  •  
  •  
  •  
  •  
  •