By Yue Li and Jihyeon Lee
Technology has enabled the rapid production and dissemination of both information and misinformation. With the means to sway people’s opinions about social, economic, and political issues, sources such as news publishers, public figures on Twitter, and social media platforms are all part of an interconnected graph through which we receive and share information. In this “post-truth” era, we wanted to understand and model how rumors spread through this network using information theory, inspired by Wang et al.’s “A rumor spreading model based on information entropy.”
The Model and A Pretty Curve
First, let’s break down the different components in Wang et al.’s work.
We model individuals as nodes in a network and represent information as a binary string of length s, e.g. `11011`. Since each bit can take two values, there are 2^s possible types of information. For this project, we use binary strings of length 5 (2^5 = 32 information types). You can consider each information type to be an article, opinion, or fact with varying levels of distortion.
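As a quick sketch, the full set of information types for s = 5 can be enumerated as 5-bit strings (variable names are ours, not the paper's):

```python
# With strings of length s = 5, there are 2**s = 32 possible information types.
s = 5
info_types = [format(i, f"0{s}b") for i in range(2 ** s)]

print(len(info_types))                 # 32
print(info_types[0], info_types[-1])   # 00000 11111
```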
Each individual has a memory that keeps track of the frequency of occurrence of each type of information. If the frequency of a certain information type is high, the individual has seen it many times, making it salient. Individuals repeat the information that is most salient in their memory.
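A minimal way to model this is a fixed-size queue plus a frequency count (a sketch with hypothetical memory contents, not the paper's code):

```python
from collections import Counter, deque

# A finite memory: the five most recently accepted pieces of information.
memory = deque(["11011", "11011", "00000", "11011", "00001"], maxlen=5)

# Salience = frequency of occurrence; the individual repeats the most salient type.
most_salient, freq = Counter(memory).most_common(1)[0]
print(most_salient, freq)  # 11011 3
```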
The measure we’re most interested in by far is information entropy. As the distribution of different types of information becomes more uniform, entropy increases. As one type of information dominates, entropy decreases. The more entropy, the more chaos and noise an individual encounters while remembering information.
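Concretely, this is the Shannon entropy of the frequency distribution over information types in a memory. A small sketch:

```python
import math
from collections import Counter

def memory_entropy(memory):
    """Shannon entropy (in bits) of the information-type distribution in a memory."""
    counts = Counter(memory)
    total = len(memory)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# One dominant type -> zero entropy; a uniform mix -> maximal entropy.
print(memory_entropy(["11011"] * 5) == 0.0)                                     # True
print(round(memory_entropy(["00000", "00001", "00010", "00011", "00100"]), 2))  # 2.32
```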
Individuals also spread information, but there's a chance they distort it, governed by the probability of distortion. One assumption of the model is that the greater the entropy, the more uncertain a person's memory is, and the more likely they are to make errors when recalling and reproducing information.
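The distortion step can be sketched as a single random bit flip. We treat `p_distort` as a given parameter here; in the model it grows with the individual's entropy:

```python
import random

def maybe_distort(bits, p_distort):
    """With probability p_distort, flip one randomly chosen bit before spreading."""
    if random.random() < p_distort:
        i = random.randrange(len(bits))
        flipped = "1" if bits[i] == "0" else "0"
        bits = bits[:i] + flipped + bits[i + 1:]
    return bits

random.seed(0)
print(maybe_distort("11011", p_distort=1.0))  # one bit differs from "11011"
print(maybe_distort("11011", p_distort=0.0))  # 11011 (unchanged)
```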
There is also a probability of acceptance. Let’s say Yue receives information from Jihyeon. But Yue doesn’t always believe information from Jihyeon, so we model that level of (un)trustworthiness based on how many connections Jihyeon has among Yue’s neighbors. The more trustworthy Jihyeon is, the higher the probability that Yue accepts the information.
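One way to encode this trust measure (our reading of the idea, not the paper's exact formula) is to count the sender's connections among the receiver's neighbors:

```python
# Hypothetical trust measure: the fraction of the receiver's neighborhood the
# sender is also connected to, with plus-one smoothing so strangers still have
# a nonzero chance of being believed.
def acceptance_probability(graph, sender, receiver):
    shared = graph[sender] & graph[receiver]
    return (1 + len(shared)) / (1 + len(graph[receiver]))

# Toy network: Jihyeon shares two of Yue's four neighbors (A and B).
graph = {
    "Jihyeon": {"Yue", "A", "B"},
    "Yue": {"Jihyeon", "A", "B", "C"},
    "A": {"Jihyeon", "Yue"},
    "B": {"Jihyeon", "Yue"},
    "C": {"Yue"},
}
print(acceptance_probability(graph, "Jihyeon", "Yue"))  # 0.6
```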
Now for how the information actually gets spread. It happens in three phases:
- Spreading. Each individual disseminates by recalling the most salient information type in their memory. Depending on the probability of distortion, a random bit may or may not be flipped, and this potentially distorted piece of information is then spread to all of the individual's neighbors.
- Acceptance. When each neighbor receives the information, it may accept or reject it, given the probability of acceptance. If accepted, the information is added to that person’s memory bank.
- Updating. Since each person has finite memory, newly added information kicks out the oldest pieces. The update is synchronous: everyone decides the most salient information to pass at the same time, then everyone accepts or rejects at the same time.
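Putting the three phases together, one synchronous round might look like the following. This is a simplified sketch: the fixed `p_distort` and the shared-neighbor acceptance rule are assumptions for illustration, not the paper's exact formulas.

```python
import random
from collections import Counter, deque

def step(graph, memories, p_distort):
    """One synchronous round: spreading, acceptance, then memory update."""
    # Phase 1 (spreading): everyone picks their most salient type and
    # maybe flips one random bit.
    outgoing = {}
    for node, mem in memories.items():
        msg = Counter(mem).most_common(1)[0][0]
        if random.random() < p_distort:
            i = random.randrange(len(msg))
            msg = msg[:i] + ("1" if msg[i] == "0" else "0") + msg[i + 1:]
        outgoing[node] = msg
    # Phases 2-3 (acceptance + updating): each node accepts or rejects from
    # every neighbor; the bounded deque evicts the oldest entries.
    for node, mem in memories.items():
        for sender in graph[node]:
            p_accept = (1 + len(graph[sender] & graph[node])) / (1 + len(graph[node]))
            if random.random() < p_accept:
                mem.append(outgoing[sender])

# Tiny triangle network, everyone starting from the same string.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
memories = {n: deque(["00000"] * 5, maxlen=5) for n in graph}
random.seed(1)
step(graph, memories, p_distort=0.5)
print(all(len(m) == 5 for m in memories.values()))  # True: memory stays bounded
```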
As information is passed through and all around the network, we plot information entropy. Given enough time, we notice a beautiful thing: a phase shift. A trend found in many disciplines, from biology to chemistry, the phase shift here represents an explosion in entropy at a certain time step. After that burst, entropy persists at a steady level.
We recreated this graph, as shown here:
Where the colored lines represent different probabilities of distortion.
Experiments: Bad Actors and Good Ones
We tried two different experiments–the injection of malicious nodes and the addition of reliable nodes.
Experiment 1. Malicious nodes
Malicious nodes represent bad actors in a network who always spread the bit string “11111” no matter what, but with the same probability of distortion as other nodes. We found that information entropy does not climb as high as in the vanilla experiment and instead declines over time, more quickly as the proportion of bad nodes increases (see plots below).
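The only change this experiment makes to the model is the message-selection rule; the distortion step still applies afterwards. A sketch (the helper name and memory contents are ours):

```python
from collections import Counter

def outgoing_message(node, memories, malicious_nodes):
    """Malicious nodes ignore their memory and always push '11111';
    ordinary nodes repeat their most salient memory as usual."""
    if node in malicious_nodes:
        return "11111"
    return Counter(memories[node]).most_common(1)[0][0]

memories = {"m": ["00000"] * 5, "n": ["01010", "01010", "00000"]}
print(outgoing_message("m", memories, malicious_nodes={"m"}))  # 11111
print(outgoing_message("n", memories, malicious_nodes={"m"}))  # 01010
```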
We believe this simulation is a fitting analogy for patterns in the political news sphere. When a piece of news is worthy of a sensationalist headline (e.g. a climate skeptic being placed on a White House panel evaluating whether or not climate change is a national security concern), many different publishers will all begin reporting that one story because it garners so much attention. In doing so, average information entropy across all the publishers decreases while the headline is relevant because many of them are focused on that one story.
Experiment 2. Reliable nodes
Definition 1. Reliable nodes are sources that spread the information they are given without distortion (i.e. probability of distortion is 0). Their real-world equivalent would be social media platforms like Facebook that do not produce their own content but create a space to share and view articles from publishers, which may or may not be reliable. We found that even with a high proportion of reliable nodes, the explosion in information entropy still occurs (Figure 2), and it is only when 99.7% of the nodes are reliable that information entropy remains at 0 (Figure 3).
Figure 3. Information entropy over time with 99.5% and 99.7% of all nodes being reliable.
Definition 2. Reliable nodes are sources that spread the truth (i.e. the original, undistorted bit string “00000”) no matter what. We would hope that this would be an equivalent of an objective, third-party arbiter of truth (e.g. fact-checked sources).
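The two definitions differ only in what a reliable node emits. A sketch contrasting them (the helper name is ours):

```python
from collections import Counter

TRUTH = "00000"  # the original, undistorted string

def reliable_message(node_memory, definition):
    """Def. 1: repeat the most salient memory, never distorting (p_distort = 0).
    Def. 2: always spread the original truth, regardless of memory."""
    if definition == 2:
        return TRUTH
    return Counter(node_memory).most_common(1)[0][0]

memory = ["11011", "11011", "00001"]
print(reliable_message(memory, definition=1))  # 11011
print(reliable_message(memory, definition=2))  # 00000
```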
This weekend, we presented rumor theory and entropy at the Lucille M. Nixon Elementary School. For the project, we wanted to convey three key points to these kids:
- People with the most connections will influence the most people.
- People remember what they hear the most, even if there are contradictory statements.
- It is easy for the truth to get distorted, especially the further away you get from the source.
We presented this in two ways. The first was a PowerPoint slide showing how a single rumor (“Jihyeon didn’t brush her hair”) can turn into something completely different because a person might not remember something correctly, or because the room was too noisy to correctly spread the gossip. As a result, a single mistake in a rumor can ripple in a lot of different ways.
Once the students understood how rumors spread through these nodes, we wanted to portray what that looked like in real life. We came up with a game called the Metamorphic Drawing, in which a person is asked to copy the image that came before theirs exactly as they see it, without changing anything. They cannot see any other drawings besides the one immediately before them. It’s similar to the classic game of telephone, but with images. Of course, because we interpret images differently and people have varying degrees of drawing ability, each image doesn’t look exactly like the previous one. Once a person finished drawing, we would show them all the images that came before theirs.
This game was exciting because it made it easy to see how information was lost and corrupted over time. The further we got from the original image, the less each drawing resembled it, echoing our takeaway point 3: the further we get from the source, the less the information resembles the actual truth. It was also striking how quickly detailed information, such as the patterns on the wings and flowers, disappeared. As in our findings from the original paper, we see a huge jump early on in the amount of information changed and lost; this peters out over time because the image becomes so simple that only small details are left to change. It also speaks volumes about what information we choose to spread as humans. This game was a great way to convey how information (like rumors) gets misrepresented over time!