The Mystery Channel

EE376A (Winter 2019)

Imagine you are given a communication channel C, which takes in a messages x = x_1, \ldots, x_n of length n, with each x_i taking values in an alphabet of size k. C is noiseless, except that it has a malfunction, which is that it is deleting a set of indices D \subset \{1, \ldots, n\} of size d, so the output is C(x) = (x_{I_1}, \ldots, x_{i_{n-d}}), where I = \{1, \ldots, n\}\setminus D are the indices which are not deleted by C. If you can just figure out what the set D is, then you can again use C as a noiseless channel by sending the actual message only in the indices I and putting arbitrary symbols in D. The situation is diagrammed below in Figure 1 with n = 8, k = 2, and D = \{4, 7\}.

Figure 1: An example of two messages sent through the same channel. The message length is n = 8, and the alphabet size is 2. The deletion indices are D = \{4, 7\}. Note that the channel is deleting the same indices on both inputs.

This is essentially the question which we posed to Nixon Elementary School students on Information Theory Night. The implementation had a few kid-friendly changes. First, the messages, instead of symbols, were sequences of colored cards, which the kids got to choose from. The channel itself was a cardboard box with holes on both sides for the messages to go through. To increase the mysteriousness, we draped a big black blanket over the apparatus, so that the whole thing had the appearance of a Houdini-esque magic trick.

A picture of us with the setup, complete with candy and an eye-catching poster.

To send messages, children would insert their flashcards one by one into the box, and one of us would sit behind it and choose which cards would get through to the other side. Another one of us would then lay out the cards that did get through cleanly on the other side, so that the kids could compare it to what they put in.

A glimpse behind the scenes.

Our project had a few key strengths. The first was that it was interactive. We wanted the kids to really get a sense for the problem by playing with it with their bare hands (literally). Instead of lecturing them, we tried to give them a chance to learn what worked and what didn’t through trial and error. This not only kept them engaged, but also gave them a rare opportunity to engage with a problem that was completely new, and which they didn’t know had a ‘right’ or ‘wrong’ answer. Instead of being told what to do, they got to experiment, which is always the first step when approaching any problem.

The next strength was that we could scale the difficult of the problem, either by making them use fewer colors (a smaller alphabet), or by increasing the number of deletions. After solving the one deletion case, many of them found that two deletions was quite a bit more difficult. This introduced them to the idea of generalizability, and showed how a problem which is trivial in one parameter regime could be much more nuanced in another.

Overall, the outreach project was a great experience. It was exciting to see children interact with a problem we had been working on. I also feel that the challenge of communicating the problem to them also gave us a better understanding of what the important aspect of the problem really were.

Moving on from the outreach project, we will now formalize the problem at hand. We will mainly focusing on the binary alphabet case, k=2. Our first observation is that that given a set of codewords to send through the channel, we can model the problem of identifying the deletion indices as solving linear system.

Let x_1, \ldots, x_m \in \mathbb{F}_2^n be the m codewords we will send through the channel. We then let X be the the matrix whose columns are these codewords, that is,

We then formulate the “deletion matrix” P, which, when multiplied by a codeword x\in\mathbb{F}_2^n, yields the result of passing x through the channel, y\in\mathbb{F}_2^{n-d}. We do this by letting $P$ be a submatrix of the n\times n identity matrix. In particular, if D is the set of indices being deleted and we let e_i be the one-hot vector in the i-th coordinate, let P be the matrix of columns \{e_i\}_{i\not\in D} in ascending order. For example, if n=4 and D=\{3\}, then

Note that knowing P tells us exactly which indices are being deleted.

We now observe now that if we write PX=Y, with our newly defined matrices P and X, then Y is the matrix whose columns are the results of passing the columns of X, x_1, \ldots, x_m \in \mathbb{F}_2^n, through the channel. Thus, we now see that given a set of codewords, i.e. given X, identifying the deletion indices is equivalent to solving the linear system PX=Y for P.

The question is then of designing the matrix X such that PX=Y is solvable for any P, while minimizing the number m of columns of X. In particular, we are interested in the minimal m, and the corresponding X.

We now note an easy upper bound on the minimal such m, \log_2(n). In particular, we let X be the n\times \log_2(n) matrix whose rows are all the binary of length \log_2(n). Upon observing the matrix Y, it suffices to check which of the length \log_2(n) binary strings are missing from the rows of Y.

Again, this is easiest to understand by way of example. Suppose we have a channel that deletes the indices D=\{2, 4, 5\} from binary strings of length eight. Then, the matrix P is given by

We then let X be the matrix

We then have that Y=PX is

We can see that Y is missing the rows (0~0~1),(0~1~1),(1~0~0), which tells us precisely that D=\{2, 4, 5\}

Though this scheme works well when thinking of d growing as a fraction of n, it is overkill when d is a constant. For those who wish to think about this problem further, it might be interesting to think about this parameter regime, and also think about how the algorithm can better take advantage of the fact that you can observe the output from the previous messages before choosing the next one to put through. Our solution right now ignores this fact, choosing all the messages at once, beforehand (the distinction is referred to as “adaptive” vs. “non-adaptive” algorithms).

The are interesting generalizations of deletion identification, to which this particular problem could be reduced, such as recovering arbitrary projection matrices (over finite fields). We are interested in an closed form for how many columns are needed. By reduction arguments, the case of deletions may give interesting bounds on the above generalizations.

Finally, we would like to thank Prof. Weissman and the teaching staff of EE376A for their advice on this project as well as their hard work in organizing the outreach event.

Leave a Reply