Elizabeth Chen, Jingxiao Liu, Kailas Vodrahalli
Introduction
Let’s talk about pictograms: what are they and how well do they convey information?
Pictograms are symbols that transmit information through visual forms [1]. You might have seen, on the containers of certain chemicals, an icon of a flame. The icon indicates that the chemical is flammable, and is an example of a pictogram. Several writing systems, such as Egyptian hieroglyphs and Chinese, use pictograms as characters to convey ideas. In this study, we aim to probe how well pictograms transmit information. More specifically, we will look at pictograms from oracle bone scripts, which are ancient Chinese writings inscribed on animal shells and bones [2], as well as pictograms from modern Chinese characters, and compare the two on their ability to convey meaning.

Consider the oracle bone script character for “mountain” shown in Figure 1, and then the modern Chinese character for “mountain” in Figure 2. How well can an individual with no prior knowledge of Chinese associate these characters with actual mountains? Which represents a mountain better: the oracle bone script or the modern character?

Our team set out to measure the “information distance” between the pictographic characters and photographs of the objects they represent. We wanted to know which form of writing, oracle bone scripts or modern Chinese, was the more effective medium for transmitting visual information.
Methods
We measured information distance using two methods:
Method 1: We asked people to match either oracle bone scripts or modern Chinese characters to photographs of the objects they represent. We hope that studying how often people correctly match characters to images will give us insight into whether oracle bone scripts or modern Chinese characters more effectively represent visual forms.
Method 2: We trained a neural network to classify photographs of objects. In other words, after training, the neural network can take in, for instance, a photograph of a dog and output “dog”. We then give the network oracle bone script characters and modern Chinese characters and study its outputs. Here, the network’s performance on the pictographic characters serves as an indicator of how well each type of pictographic character transmits the information found in a visual presentation of the object it represents.
Results
Method 1 Results
We used Amazon MTurk to obtain input from several hundred people. Each person was shown (1) a single grayscale image of 1 of 10 objects (bird, field, fish, moon, mountain, sheep, sun, tree, turtle, or water) along with (2) ten images of the ancient or modern Chinese characters corresponding to these 10 objects. The MTurk worker was asked to “select the character image that most closely resembles [the natural image].” We additionally asked a survey question to determine whether the worker understood written Chinese, and filtered out those responses during analysis.
You can access the matching surveys and their answer keys at the following links:
- Image-to-Ancient Character Survey: bit.ly/info-theory-pictogram1
- Image-to-Modern Character Survey: bit.ly/info-theory-pictogram2
- Image-to-Ancient Character Solution: bit.ly/info-theory-pictogram-answers1
- Image-to-Modern Character Solution: bit.ly/info-theory-pictogram-answers2
Here, we are essentially asking the workers to select the “best encoding” for a given natural image. We could have done the reverse: shown them a character and had them select the “best” natural image match, with one image sampled randomly from each class (something like decoding). We chose our method so that we get responses per natural image; in the “decoding” method, results would be relative to 10-tuples of natural images (the character images are fixed, so this is not an issue in our “encoding” method).
In total, we obtained 2,000 responses (2 character sets × 10 classes × 10 images per class × 10 responses per image/character set pair). After filtering out responses from workers who understood written Chinese, we had 1,816 responses remaining. All further analysis is done after this filtering step.
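For concreteness, here is a minimal sketch of this filtering-and-tallying step in Python with pandas. The CSV file name and column names are illustrative assumptions, not our actual MTurk export schema:

```python
import pandas as pd

# Hypothetical CSV export of the raw MTurk responses; the file name and
# column names below are illustrative assumptions, not the real schema.
responses = pd.read_csv("mturk_responses.csv")
# Assumed columns: worker_id, character_set ("ancient" or "modern"),
# true_class, selected_class, knows_chinese (boolean survey answer)

# Drop responses from workers who reported understanding written Chinese.
responses = responses[~responses["knows_chinese"]]

# Mark each response correct if the selected character matches the image's class.
responses["correct"] = responses["selected_class"] == responses["true_class"]

# Overall and by-class accuracy for each character set (cf. Figure 5).
overall = responses.groupby("character_set")["correct"].mean()
by_class = responses.groupby(["character_set", "true_class"])["correct"].mean()

# Observed marginal distribution of selected characters (cf. Figures 3 and 4).
marginals = (responses.groupby("character_set")["selected_class"]
                      .value_counts(normalize=True))

print(overall, by_class, marginals, sep="\n\n")
```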
Our analysis is shown below. In Figure 3, we plot the observed distribution of selected symbols (solid blue) vs the true distribution (dotted black line) over both ancient and modern character sets. We note that some characters are selected less often than their true frequency (field), while others are selected more often, compensating for this (fish).

In Figure 4, we plot the marginal distribution of selected symbols over the ancient character set and the modern character set. As one might expect, the characters are weighted differently between the two sets. Consider, for example, fish in the modern set and water in the ancient set. In both cases, the observed frequency is higher than the true frequency of the class, suggesting that fish (modern) and water (ancient) are characters “close” to (in other words, often selected for) other classes of natural images.

We now calculate the by-class accuracy of both character sets and plot the results in Figure 5. We first compute the maximum accuracy possible given the distributions from Figure 4, i.e. $1 - \sum_{c \in \text{classes}} |\text{observed}_c - \text{expected}_c|$, and obtain 81.6% for the ancient set and 82.5% for the modern one.
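In code, this bound is a one-liner. The sketch below plugs in made-up observed frequencies purely for illustration; the true distribution is uniform over the 10 classes:

```python
import numpy as np

def max_accuracy(observed, expected):
    # The bound as defined above: 1 - sum_c |observed_c - expected_c|.
    observed, expected = np.asarray(observed), np.asarray(expected)
    return 1.0 - np.abs(observed - expected).sum()

# The true distribution is uniform over the 10 classes.
expected = np.full(10, 0.1)

# Made-up observed selection frequencies, purely for illustration.
observed = np.array([0.12, 0.08, 0.11, 0.10, 0.09,
                     0.10, 0.11, 0.09, 0.10, 0.10])

print(max_accuracy(observed, expected))  # 0.92 for these made-up numbers
```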

The true accuracy is 65.7% for the ancient characters and 18.5% for the modern ones. For every class, the ancient character has higher accuracy than the modern character. For classes such as “bird”, the accuracy difference between the two character sets is large, suggesting that the ancient bird character conveys more information in its shape than the modern version does. In contrast, the “field” class has similar accuracy for both sets, suggesting that a similar amount of information is conveyed by the ancient and modern characters. Both observations can be confirmed visually in Table 1, where the ancient character for “bird” looks like an actual bird and the modern and ancient characters for “field” are rather similar in appearance.

Method 2 Results
We trained a convolutional neural network (CNN) to classify images of mountains, moons, trees, fields, and suns. By the end of training, the network can take in a photo similar to those shown in Figure 6 and correctly identify it as a “mountain”, “moon”, “tree”, “field”, or “sun” based on the contents of the input image, with more than 90 percent accuracy.
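We omit the training details in the main text, but a minimal sketch of this kind of classifier is shown below, in PyTorch. The input size, architecture, and photos/ directory layout are all illustrative assumptions rather than our exact setup:

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms

# Assumed preprocessing: the post does not state the input size or color
# mode; 64x64 grayscale is a guess for illustration.
transform = transforms.Compose([
    transforms.Grayscale(),
    transforms.Resize((64, 64)),
    transforms.ToTensor(),
])

# Hypothetical directory layout: photos/<class_name>/*.jpg for the 5 classes.
train_data = datasets.ImageFolder("photos/", transform=transform)
loader = torch.utils.data.DataLoader(train_data, batch_size=32, shuffle=True)

# A small CNN; the actual architecture used in the project is not specified.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 64), nn.ReLU(),
    nn.Linear(64, 5),  # mountain, moon, tree, field, sun
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Standard supervised training loop over the photo dataset.
for epoch in range(10):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```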

We treated the trained CNN as a proxy for an individual who knows what the 5 classes (mountain, moon, tree, field, and sun) look like in real life, but has no knowledge of how they are written as Chinese characters. We then showed both the oracle bone script representations and the modern Chinese characters of the 5 classes to the trained network. The performance of the CNN is summarized in Tables 2 and 3.
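Continuing the sketch above, feeding the character images through the trained network might look like the following; the characters/ directory and file names are hypothetical:

```python
from PIL import Image

# Continues the training sketch above (reuses model, transform, train_data).
class_names = ["mountain", "moon", "tree", "field", "sun"]

model.eval()
with torch.no_grad():
    for script in ["ancient", "modern"]:
        for name in class_names:
            img = Image.open(f"characters/{script}_{name}.png").convert("RGB")
            x = transform(img).unsqueeze(0)  # same preprocessing as training
            pred = model(x).argmax(dim=1).item()
            # ImageFolder sorts class folders alphabetically, so map back:
            print(f"{script} {name} -> predicted {train_data.classes[pred]}")
```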


We see that the CNN guessed 2 out of 5 correctly when shown the oracle bone scripts, and likewise 2 out of 5 correctly when shown the modern Chinese characters. This suggests that, when measured using the classification accuracy of a trained CNN, the information distance between oracle bone scripts and actual images is roughly similar to that between modern Chinese characters and actual images.
Conclusion
In this project, we tried to measure how well Chinese characters from two different time periods convey meaning. We did this by putting out surveys asking people to match natural images to their corresponding character representations, and by studying whether a trained image-classifying CNN can correctly identify these characters.
From the survey results, we saw that people had an easier time associating natural images with ancient characters than with modern characters. The trained image-classifying CNN performed similarly on the oracle bone scripts and the modern Chinese characters.
What are your thoughts? Which one spoke to you more: the oracle bone scripts or the modern Chinese characters?