Information Theory 

in Singapore (ITIS)

Is it ISIT? It is ITIS.

A Workshop Series on Coding and Information Theory

Information Theory in Singapore (ITIS) began in 2020 as an online seminar series jointly organized by Nanyang Technological University, National University of Singapore and Singapore University of Technology and Design. Through this series, we aim to promote, advocate, and spread the joy of coding theory and information theory within Singapore and throughout the world.

In 2023, the workshop is in-person!! 

12 Oct 2023 (Thu), 8:50am to 5:30pm

Nanyang Technological University, Singapore, School of Physical and Mathematical Sciences

Executive Classroom 2 (SPMS-MAS-03-07) (detailed directions)

The workshop is open to all. Registration is free. You can register HERE.

Speakers and Tentative Schedule

8:50 - 9:00 - Opening Remarks

9:00 - 9:25 - Manabu Hagiwara, Chiba University

Title: Quantum Deletion Codes vs. Quantum Insertion Codes
Abstract: In classical coding theory, a code can correct t deletion errors if and only if it can correct t insertion errors. In quantum coding theory, this equivalence is an open problem. This presentation discusses some of the difficulties surrounding it.
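
The classical side of this equivalence can be made concrete with binary Varshamov-Tenengolts (VT) codes, the standard single-deletion-correcting codes. The sketch below is illustrative only (the brute-force decoder is for exposition, not efficiency): it checks that every single deletion from a VT codeword is corrected uniquely.

```python
from itertools import product

def vt_code(n, a=0):
    """All binary words of length n with VT syndrome sum(i*x_i) = a (mod n+1)."""
    return {x for x in product((0, 1), repeat=n)
            if sum(i * b for i, b in enumerate(x, start=1)) % (n + 1) == a}

def correct_deletion(y, code):
    """Recover the unique codeword that could have produced y by one deletion."""
    n = len(y) + 1
    candidates = {y[:i] + (b,) + y[i:] for i in range(n) for b in (0, 1)}
    matches = candidates & code
    assert len(matches) == 1  # deletion balls of distinct VT codewords are disjoint
    return matches.pop()

code = vt_code(8)
for c in code:
    for i in range(8):  # delete each position in turn
        assert correct_deletion(c[:i] + c[i + 1:], code) == c
print(f"{len(code)} codewords, every single deletion corrected uniquely")
```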

9:25 - 10:05 - Keynote Speaker: Zohar Yakhini, Reichman University School of Computer Science

Title: Information aspects of synthetic biology, from cancer to data storage in DNA
Abstract: DNA is the basic molecule of all life, big and small. This year we are marking 70 years since the discovery, in 1953, of the double-helix structure of DNA. This discovery marked a milestone in the history of science and gave rise to modern molecular biology. In recent decades, technologies have been developed that enable the manufacture of synthetic DNA and its applications. Furthermore, molecular biology techniques have been developed that allow for manipulating DNA and using it to perform experiments and to control biological processes.

High throughput applications of modern synthetic DNA reagents require design optimization and complex data analysis. 

In this talk I will describe work that addresses such information aspects of studies and applications that use synthetic DNA.

I will start with data analysis in cancer research, move to describing CRISPR off target activity and how it is measured and conclude by discussing DNA as a potential data storage medium.

No prior biology knowledge is assumed.


10:05 - 10:30 - Coffee break

10:30 - 10:55 - Cai Kui, Singapore University of Technology and Design 

Title: Coding and Detection for Emerging Data Storage Channels: Deletion Error Correction and Sneak Path Interference Detection
Abstract: In recent years, DNA-based data storage technology, which stores digital data using synthetic or live DNA, has emerged as a very promising candidate for long-term storage of Big Data. This is attributed to its extremely high data storage density, long-lasting stability of hundreds to thousands of years, and ultra-low power consumption for operation and maintenance. Meanwhile, solid-state non-volatile memory (NVM) technologies have developed rapidly over the past decade. Among various emerging NVMs, resistive random access memory (ReRAM) is widely considered to have high potential for future data storage and computing. Each of these two technologies has its unique reliability challenges due to their fundamentally different data storage mechanisms. In this talk, we will introduce the specific problems of these two emerging data storage channels: the insertion/deletion/substitution errors in DNA data storage systems and the sneak path interference in ReRAM. We will review the major works in the literature on addressing these challenges, then present in detail our recent works on DNA data storage and ReRAM, respectively. Lastly, we will highlight some future research topics.

10:55 - 11:20 - Martianus Frederic Ezerman, Nanyang Technological University, Singapore

Title: Quantum Coding with Entanglement: Some Recent Results
Abstract: In the setting of entanglement-assisted quantum error-correcting codes (EAQECCs), the sender and the receiver have access to noiseless pre-shared entanglement. Such codes promise better information rates or improved error handling properties. Entanglement incurs costs and must be judiciously calibrated in designing quantum codes, relative to their deployment parameters. Revisiting known constructions, we devise tools from classical coding theory to better understand how the amount of entanglement can be varied. We present three new propagation rules for the parameters of these codes. With insights into what classical ingredients are required, we construct codes that lead to qubit and qutrit EAQECCs with better parameters than before. Online tables and some parameters not yet included there will be mentioned for future reference and comparison. 

11:20 - 11:45 - Tuvi Etzion, Technion

Title: On Pseudo-Random Arrays and Codes
Abstract: Perfect maps and pseudo-random arrays are the two-dimensional analogs of de Bruijn sequences and M-sequences. We modify the definitions to be applied to codes which are the analogs of factors in the de Bruijn graph. In particular, these factors are perfect factors and state diagrams obtained from irreducible polynomials. We review the main constructions for perfect maps and pseudo-random arrays and define their analogs for codes. We examine the minimum distance of the constructed codes.
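
As a one-dimensional point of reference, an M-sequence is the output of a linear feedback shift register (LFSR) whose feedback polynomial is primitive, attaining the maximal period 2^deg - 1. A minimal sketch (the tap/stage convention below is an assumption of this illustration):

```python
def lfsr_sequence(taps, state, length):
    """Fibonacci LFSR over GF(2).  With a primitive feedback polynomial the
    output stream is an M-sequence of maximal period 2**deg - 1."""
    out = []
    for _ in range(length):
        out.append(state[-1])          # emit the last stage
        fb = 0
        for t in taps:                 # XOR of the tapped stages
            fb ^= state[t - 1]
        state = [fb] + state[:-1]      # shift in the feedback bit
    return out

# x^4 + x + 1 is primitive over GF(2); taps at stages 4 and 1
seq = lfsr_sequence(taps=[4, 1], state=[0, 0, 0, 1], length=30)
period = next(p for p in range(1, 16) if seq[:30 - p] == seq[p:])
print(period)  # 15 = 2**4 - 1
```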

11:45 - 12:05 - Van Khu Vu, National University of Singapore

Title: Deletion-Correcting-Codes: Something old, something new, and something followed
Abstract: In this talk, we will introduce various families of codes correcting deletion errors, including (non)binary codes, permutation codes, high-dimensional codes, and symbol-pair codes, together with their applications in data storage systems. In particular, we present new results on permutation codes correcting t deletions with 3t log n bits of redundancy. We note that these codes are closely related to permutation codes in the Ulam metric and in the generalized Kendall-tau metric proposed by Chee and Vu. Finally, we discuss several open questions for future work.

12:05 - 13:30 - Lunch break

13:30 - 13:55 - Vincent Y. F. Tan, National University of Singapore

Title: On Non-Interactive Simulation of Binary Random Variables
Abstract: We leverage proof techniques from discrete Fourier analysis and an existing result in coding theory to derive new bounds for the problem of non-interactive simulation or non-interactive correlation distillation (NICD) of binary random variables. Previous bounds in the literature were derived by applying data processing inequalities concerning maximal correlation or hypercontractivity. We show that our bounds are sharp in some regimes. Indeed, for a specific instance of the problem parameters, our main result resolves an open problem posed by E. Mossel in 2017.   

13:55 - 14:20 - Mehul Motani, National University of Singapore

Title: Sliced Mutual Information Measures for Studying Generalization and Explainability in Deep Neural Networks
Abstract: Generalization is the ability of a learning algorithm to perform well on unseen data. Explainability deals with providing clear and understandable reasons for the predictions of a learning algorithm. In this talk, we explore generalization and explainability from an information theoretic perspective. We use a derivative of mutual information called sliced mutual information (SMI), which is scalable and efficient to compute. We show that SMI tracks generalization and memorization, and that an instance based version of SMI allows for explainability.
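
As an illustration of the idea (not the speaker's implementation), sliced mutual information averages the mutual information between random one-dimensional projections of the two variables. The sketch below uses a simple histogram plug-in MI estimator; all function names and parameter choices are hypothetical.

```python
import numpy as np

def mi_hist(x, y, bins=16):
    """Plug-in mutual information estimate (in nats) from a 2-D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)        # marginal of x
    py = pxy.sum(axis=0, keepdims=True)        # marginal of y
    mask = pxy > 0
    return float((pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])).sum())

def sliced_mi(X, Y, n_slices=50, rng=None):
    """Sliced MI: average MI between random 1-D projections of X and Y."""
    rng = np.random.default_rng(rng)
    vals = []
    for _ in range(n_slices):
        theta = rng.standard_normal(X.shape[1]); theta /= np.linalg.norm(theta)
        phi = rng.standard_normal(Y.shape[1]); phi /= np.linalg.norm(phi)
        vals.append(mi_hist(X @ theta, Y @ phi))
    return float(np.mean(vals))

rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 4))
Y_dep = X + 0.3 * rng.standard_normal((5000, 4))   # strongly dependent pair
Y_ind = rng.standard_normal((5000, 4))             # independent pair
# the dependent pair should score higher than the independent one
print(sliced_mi(X, Y_dep, rng=1), sliced_mi(X, Y_ind, rng=1))
```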

14:20 - 14:45 - Jonathan Scarlett, National University of Singapore

Title: Many-Hop 1-bit Information Propagation
Abstract: In this talk, I will present various existing and new findings on a many-hop relaying problem, also known as the information velocity problem: An encoder seeks to send a single bit over a long chain of nodes, each connected by an independent binary symmetric channel.  It has long been known that this can be done reliably over m hops with total transmission time n=O(m) (Rajagopalan and Schulman, 1994), but the existing solution is based on a result for much more general graphs that uses fairly complex tree codes and has exponential-time decoding.  We give a very simple and efficient solution to attaining n=O(m) transmission time (i.e., a positive limit of m/n, which is the information velocity).  In addition, we provide a converse bound based on the strong data processing inequality, establish the low-noise and high-noise asymptotics, and generalise our result to the k-bit setting with O(k+m) transmission time.

14:45 - 15:10 - Marco Tomamichel, National University of Singapore

Title: Finite Blocklength Channel Simulation
Abstract: We study channel simulation under common-randomness assistance in the finite-blocklength regime and identify the smooth channel max-information as a linear-programming one-shot converse bound on the minimal simulation cost for a fixed error tolerance. We show that this one-shot converse can be achieved exactly using no-signalling-assisted codes, and approximately achieved using common-randomness-assisted codes. Our one-shot converse thus plays a role analogous to that of the celebrated meta-converse in the complementary problem of channel coding, and we find tight relations between these two bounds.

15:10 - 15:40 - Coffee break

15:40 - 16:00 - Nguyen Tuan Thanh, Singapore University of Technology and Design

Title: Every Bit Counts: A New Version of q-ary Varshamov-Tenengolts Codes with More Efficient Encoders
Abstract: The problem of correcting deletions and insertions has recently received significantly increased attention due to DNA-based data storage technology, which suffers from deletions and insertions with extremely high probability. Non-binary codes correcting a single deletion or insertion were introduced by Tenengolts [1984], and recently, Wang et al. [2021] presented constructions of codes of length n, correcting a burst of length at most t (t >= 2) over q-ary alphabets with redundancy log n + O(log q log log n) bits, for arbitrary even q. The common idea in those constructions is to convert non-binary sequences into binary sequences, and the decoding algorithms for the q-ary sequences rely mainly on successfully recovering the corresponding binary sequences.

In this work, we look at a natural solution in which the error detection and correction algorithms are performed directly on q-ary sequences, and for certain cases, our codes provide a more efficient encoder with lower redundancy than the best-known encoder in the literature. In particular, for the quaternary alphabet, our designed codes are suited for correcting deletions/insertions in DNA storage. For the single-error case, we present a linear-time algorithm that encodes user messages into these codes of length n with at most ⌈log_q n⌉ + 1 redundant symbols, while the optimal redundancy required is at least log_q n + log_q(q-1) symbols. Our designed encoder reduces the redundancy of the best-known encoder of Tenengolts [1984] by at least 2 redundant symbols, or equivalently 2 log_2 q bits.

16:00 - 16:20 - Johan Chrisnata, Nanyang Technological University, Singapore

Title: Coding for DNA Synthesis
Abstract: The synthesis of DNA strands remains the most costly part of a DNA storage system. Thus, to make DNA storage systems more practical, the time and materials used in the synthesis process have to be optimized. We consider the most common type of synthesis process, in which multiple DNA strands are synthesized in parallel from a common alternating supersequence, one nucleotide at a time. The synthesis time, or the number of synthesis cycles, is then determined by the length of this common supersequence. In this model, we design quaternary codes that minimize synthesis time and can correct deletions or insertions, which are the most prevalent types of error in array-based synthesis. We also propose polynomial-time algorithms that encode binary strings into these codes and show that the rate is close to capacity.
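
In the synthesis model described above, the machine cycles through the nucleotides in a fixed periodic order and each strand picks up its next base whenever that base comes around, so a strand's synthesis time is the position of its last base in the periodic supersequence. A small sketch of this cost model (the fixed order ACGT and the function name are assumptions of this illustration):

```python
def synthesis_cycles(strand, order="ACGT"):
    """Cycles needed to synthesize `strand` when the machine offers one
    nucleotide per cycle in the fixed periodic order ACGTACGT..."""
    cycles = 0
    for base in strand:
        while order[cycles % len(order)] != base:
            cycles += 1        # wait for the needed base to come around
        cycles += 1            # this cycle appends `base`
    return cycles

print(synthesis_cycles("ACGT"))  # 4: perfectly aligned with the machine order
print(synthesis_cycles("AAAA"))  # 13: each extra A waits a full period
# parallel synthesis finishes when the slowest strand does
print(max(synthesis_cycles(s) for s in ["ACGT", "TGCA", "AAAA"]))  # 13
```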

16:20 - 16:40 - Stanislav Kruglik, Nanyang Technological University, Singapore

Title: Polynomial PIR and How to Easily Make Almost All PIR Protocols Information-Theoretically Verifiable
Abstract: Private Information Retrieval (PIR) protocols allow a client to retrieve any file of interest while keeping the identity of the file hidden from the database servers. While most existing PIR protocols assume servers to be honest but curious, we investigate the scenario of dishonest servers that provide incorrect answers to mislead clients into obtaining wrong results. We propose a unified framework for polynomial PIR protocols, encompassing various existing protocols that optimize the download rate or the total communication cost. We introduce a way to transform a polynomial PIR protocol into an information-theoretically verifiable one without increasing the number of involved servers, by doubling the queries.
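
For readers new to PIR, the classical two-server XOR scheme shows how information-theoretic privacy is possible at all: each server alone sees a uniformly random subset of indices, yet XORing the two answers recovers the desired bit. A toy sketch (unrelated to the polynomial protocols in the talk):

```python
import secrets

def xor_bits(db, subset):
    """Server answer: XOR of the database bits indexed by `subset`."""
    acc = 0
    for i in subset:
        acc ^= db[i]
    return acc

def pir_query(db, i):
    """Toy 2-server PIR: each server alone sees a uniformly random subset,
    so neither learns which index the client wants (assuming no collusion)."""
    n = len(db)
    s1 = {j for j in range(n) if secrets.randbits(1)}
    s2 = s1 ^ {i}             # symmetric difference flips membership of index i
    a1 = xor_bits(db, s1)     # answer from server 1
    a2 = xor_bits(db, s2)     # answer from server 2
    return a1 ^ a2            # the subsets differ only at i, so this is db[i]

db = [1, 0, 1, 1, 0, 0, 1, 0]
assert all(pir_query(db, i) == db[i] for i in range(len(db)))
print("retrieved all bits correctly")
```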

17:00 - 17:10 - Closing Remarks