# Compression of Quantum Information

Information Theory (Winter 2020)

Quantum computers take advantage of quantum phenomena that do not exist in classical computers, namely superposition, entanglement and interference, to speed up certain computational tasks. The most common architecture for quantum computers is based on two-level systems. The unit of information is called a qubit, and its two levels are usually denoted by the quantum states $\left|0\right>$ and $\left|1\right>$. With $N$ qubits, there are $2^N$ such basis states, e.g. $\left|00101011\right>$ for $N=8$, and a general state is a superposition described by $2^N$ complex amplitudes. Quantum gates in a quantum computer transform and interfere these states in a way that is hard to simulate on a classical computer.
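As a concrete illustration (a minimal NumPy sketch, not anything specific to the text), multi-qubit basis states can be represented as vectors built from Kronecker products:

```python
import numpy as np

# Single-qubit computational basis states |0> and |1>.
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# A two-qubit basis state such as |01> is the Kronecker product
# |0> (x) |1>, a vector with 2^2 = 4 entries.
ket01 = np.kron(ket0, ket1)
print(ket01)  # [0. 1. 0. 0.]

# A general N-qubit state is a superposition of the 2^N basis states,
# i.e. a normalized vector of 2^N complex amplitudes.
```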

When isolated from its environment, a quantum state stays coherent and we call it a pure state. Once the quantum state interacts with its environment, decoherence causes it to decay probabilistically into other states. In this case, the quantum state becomes a mixed state. A mixed state cannot be described by a single pure state; instead, it must be described by an ensemble of pure states $\{\left|\psi_i\right>\}$ with a probability distribution $\{p_i\}$. This ensemble is usually represented as a density matrix $\rho=\sum_i p_i\left|\psi_i\right>\left<\psi_i\right|$ with $2^N\times2^N$ entries.
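As a sketch (with a hypothetical two-element ensemble chosen purely for illustration), the density matrix of a mixed state can be built directly from its ensemble:

```python
import numpy as np

# Hypothetical single-qubit ensemble: |0> with probability 0.75 and
# |+> = (|0> + |1>)/sqrt(2) with probability 0.25.
kets = [np.array([1.0, 0.0]),
        np.array([1.0, 1.0]) / np.sqrt(2)]
probs = [0.75, 0.25]

# rho = sum_i p_i |psi_i><psi_i|
rho = sum(p * np.outer(k, k.conj()) for p, k in zip(probs, kets))

# A density matrix is Hermitian, positive semidefinite and has unit trace.
print(np.trace(rho))            # close to 1
print(np.linalg.eigvalsh(rho))  # nonnegative eigenvalues
```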

From the perspective of error correction and quantum communication, the process by which a quantum state interacts with its environment and becomes mixed is treated as a noisy channel. Research has focused on finding the minimum output von Neumann entropy of noisy channels, because this quantity is related to the maximum mutual information between the input and output of a channel. Once the maximum mutual information of a noisy channel is known, we know how much error correction is needed to send a piece of quantum information without loss.

Nevertheless, a mixed state is not necessarily a pure state corrupted by noise. For example, a thermalized quantum state must be described as a mixed state, and we want to track all components of its density matrix rather than correct it. Moreover, in the Noisy Intermediate-Scale Quantum (NISQ) era, quantum computers are operated without error correction. To understand how noise affects quantum computers, it is important to track how a density matrix evolves through noisy channels. These considerations motivate compressing a mixed state for more efficient characterization.

From the perspective of compression, one should instead focus on the maximum output von Neumann entropy of a channel, because it is related to the minimum mutual information, which defines the optimal compression rate. In the following, we give two examples of noisy channels and study the growth of the maximum output entropy as a function of the number of qubits. Several assumptions are made. First, the noise level in the channels is assumed small ($\simeq0.1\%$). Second, an $n$-qubit noisy channel is a composition of single-qubit noisy channels; in other words, noise on one qubit is assumed independent of the other qubits.
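The output entropies discussed below can be evaluated numerically; a minimal sketch of the von Neumann entropy $S(\rho)=-\mathrm{Tr}(\rho\log_2\rho)$ computed from the eigenvalues of $\rho$:

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), evaluated from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # drop numerical zeros (0 log 0 = 0)
    return float(-np.sum(evals * np.log2(evals)))

# A pure state has zero entropy, while the maximally mixed state of one
# qubit attains the single-qubit maximum of 1 bit.
print(von_neumann_entropy(np.eye(2) / 2))  # 1.0
```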

In the bit-flip channel, defined by $\rho\rightarrow(1-p)\rho+p\sigma_x\rho\sigma_x$, where $p$ is the noise level and $\sigma_x$ is the Pauli X matrix, the maximum output entropy is found to grow approximately logarithmically. This means that, while it takes $\mathcal{O}(2^N\times2^N)$ complex numbers to track a full density matrix, it is possible to track a compressed representation with only $\mathcal{O}(N\times2^N)$ numbers; the compression rate is approximately $\mathcal{O}(\frac{N}{2^N})$. In the depolarizing channel, defined by $\rho\rightarrow(1-p)\rho+\frac p3\sigma_x\rho\sigma_x+\frac p3\sigma_y\rho\sigma_y+\frac p3\sigma_z\rho\sigma_z$, where $\sigma_y$ and $\sigma_z$ are the Pauli Y and Z matrices, the exact maximum output entropy is harder to find, but it is upper bounded by a quantity that also grows approximately logarithmically. Thus the compression rate is again approximately $\mathcal{O}(\frac{N}{2^N})$.
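As a sketch of the independent-noise model above (the Kraus-operator bookkeeping and the choice of input state are our own illustrative assumptions), the single-qubit bit-flip channel can be applied to every qubit of a small register and the output entropy evaluated numerically:

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])  # Pauli X

def apply_kraus_to_qubit(kraus_ops, rho, k, n):
    """Apply a single-qubit channel (given by its Kraus operators) to
    qubit k of an n-qubit density matrix: rho -> sum_j K_j rho K_j^dag."""
    t = rho.reshape([2] * (2 * n))  # axes 0..n-1: ket indices, n..2n-1: bra
    out = np.zeros_like(t)
    for K in kraus_ops:
        term = np.moveaxis(np.tensordot(K, t, axes=([1], [k])), 0, k)
        term = np.moveaxis(
            np.tensordot(term, K.conj(), axes=([n + k], [1])), -1, n + k)
        out = out + term
    return out.reshape(2**n, 2**n)

def bit_flip_kraus(p):
    """Kraus operators of rho -> (1-p) rho + p X rho X."""
    return [np.sqrt(1.0 - p) * np.eye(2), np.sqrt(p) * X]

n, p = 3, 0.001
rho = np.zeros((2**n, 2**n))
rho[0, 0] = 1.0  # the pure state |000><000|
for k in range(n):  # independent bit-flip noise on every qubit
    rho = apply_kraus_to_qubit(bit_flip_kraus(p), rho, k, n)

evals = np.linalg.eigvalsh(rho)
evals = evals[evals > 1e-15]
entropy = -np.sum(evals * np.log2(evals))
print(entropy)  # small output entropy at 0.1% noise
```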

In summary, when the noise is small, it is possible to use only a small fraction of the computational resources required to simulate a full density matrix. This is because the entropy grows much more slowly than the Hilbert space: while the dimensionality of the Hilbert space grows like $2^N$, the effective dimensionality of the density matrix (its effective rank) grows approximately polynomially. This opens the possibility of compressing and efficiently simulating a density matrix with small noise.
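A sketch of what such compression could look like in practice (the truncation rule and thresholds are illustrative assumptions, not a method from the text): keep only the dominant eigenpairs of $\rho$, which costs $\mathcal{O}(r\times2^N)$ numbers for effective rank $r$ instead of $\mathcal{O}(2^N\times2^N)$:

```python
import numpy as np

def compress(rho, weight=1.0 - 1e-9):
    """Keep the fewest eigenpairs whose eigenvalues sum to at least
    `weight`, giving a low-rank approximation of the density matrix."""
    evals, evecs = np.linalg.eigh(rho)
    order = np.argsort(evals)[::-1]        # sort eigenvalues, largest first
    evals, evecs = evals[order], evecs[:, order]
    r = int(np.searchsorted(np.cumsum(evals), weight)) + 1
    r = min(r, len(evals))                 # guard against float round-off
    return evals[:r], evecs[:, :r]

# Example: a single qubit that is almost pure. With a loose threshold the
# compressed representation keeps only the dominant eigenpair.
rho = np.diag([0.999, 0.001])
evals, evecs = compress(rho, weight=0.99)
approx = (evecs * evals) @ evecs.conj().T  # reconstruct from kept eigenpairs
print(len(evals))  # effective rank kept
```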