Biofeedback in Performance

Journal for High Schoolers 2022

Jai Bhatia, Yui Hasegawa, Gabriele Muratori, Stasia Vaituulala, Farangiz Akhadova, Nikko Boling


The purpose of our research is to investigate whether there are physical, quantifiable differences between an actor’s portrayal of emotions and the real-life sensation of those emotions. There is little existing research on the physiological changes actors undergo as a result of their performances. To address this gap, we ran preliminary trials collecting electrocardiography (ECG), heart rate, galvanic skin response (GSR), and electromyography (EMG) signals as indicators of subjects’ physical state while experiencing emotions. This paper serves as a starting point for integrating real-time biometric data into a theatrical performance and explores the potential of providing biofeedback for actors.


There exist some universal standards of expression in theatrical performance. One of the most notable examples is the Delsarte System of Bodily Expression [1], which serves as a dictionary of outward expressions (such as gestures, facial cues, and movements) used to convey inner emotions. However, this model centers only on the external, visible presentation of actors and neglects the internal changes they experience while performing. One of the biggest challenges actors face is portraying the emotions of their characters in a “genuine” manner to create a realistic performance. And yet, without a way to quantify the internal changes of an actor, it remains extremely difficult to define what constitutes a “genuine” performance and how actors can better achieve it.

We wanted to test the hypothesis that our emotions are correlated with bodily changes, also known as the physiological theory of emotion [2]. In order to categorize emotions based on physiological data, we used the circumplex model of affect – a two-dimensional framework of emotion – which plots arousal, the intensity of an emotion, against valence, the extent to which an emotion is positive or negative. Prior research indicates that changes in the visceral motor system of the body are the most notable signs of emotional arousal. Hence, the actor’s heart rate and sweat gland activity are important signals to measure for our research. Though we conducted our experiment on ourselves, we outlined directions for applying this research in theatrical performance.

While biofeedback has been explored in the performing arts [3] [4], integrating real-time biometric data into a theatrical performance and making it visible to the audience is a new approach to performance making in theater.



We utilized SparkFun’s RP2040 mikroBUS Development Board [5]. The Mikrobus Shuttle [6] and Shuttle Click [7] enabled us to hook up multiple sensors at a time to the Development Board. We used four Mikroe Click Boards™ [8] discussed in 1.3 to collect biometrics.


We began experimenting with MicroPython and CircuitPython through the Mu Editor, the Thonny IDE, VS Code, and the macOS terminal. We then moved to C++ and the Arduino IDE, extracting data from the Serial Monitor and Serial Plotter, and used Python for data visualization and analysis.

We utilized the Arduino Mbed OS RP2040 Board library [9] and the EmotiBit MAX30101 [10] library for the Heart Rate Click Board.


Data Collection

In order to mirror the emotional changes of an actor, we measured physiological data on student test subjects while they simulated different emotions.

ECG Data

The ECG (Electrocardiography) Click measures heart rate variability by picking up the heart’s rhythm and electrical activity. We designed an experiment to find changes in ECG signals as we experience emotions. First, we measured the signals for a duration of ~4 minutes, which served as the control. Then, a series of short clips was played for a test subject, who was asked to identify how they felt while watching each video. Simultaneously, we measured the ECG signals of the test subject. We then analyzed the data to look for correlations between the ECG signals and the self-reported emotions of the test subjects. The data reported compares the “scary” and control videos. The first electrode was placed under the subject’s ribcage, below their heart. The second and third were placed near their upper shoulder and calibrated until a QRS complex was represented in the output graph.

GSR Data

The GSR (Galvanic Skin Response) Click measures the electrodermal activity of the body, or changes in sweat gland activity. We conducted the same experiment as in 1.3.1, obtaining a control measurement with the GSR Click before taking data over a series of videos. The electrodes were fastened to the subject’s finger with velcro.

EMG Data

The EMG (Electromyography) Click measures the electrical activity of muscles; we used the same methodology as in 1.3.1. We also recorded a control and a “distress” graph, comparing a time when a subject passionately expressed distress to a baseline. The electrodes were placed on the subject’s eyebrows and cheek, with the DRL electrode on one wrist.

Heart Rate Data

The heart rate sensor measures the test subject’s heartbeats per minute. The experiment consisted of a user watching a selected horror scene from three different movies as they placed their finger on the Heart Rate Click.


ECG Data

Normally the heart beats in a regular, rhythmic fashion, producing a P wave, QRS complex, and T wave. The QRS complex comprises three waves representing ventricular depolarization [12].

“The R wave reflects depolarization of the main mass of the ventricles—hence it is the largest wave” [11]. However, “exercise-induced left ventricular hypertrophy is considered a normal physiologic adaptation to the particularly rigorous training of athletes” [12]. We addressed this confounding variable by having subjects sit still while recording data. Note that the R-wave amplitudes of the “scary” data are higher relative to the Q wave than those of the control data.

After collecting ECG data, we confirmed the QRS complex was represented in our data by zooming into certain parts of the graph (Figure 1). We then plotted the data outlined in 1.3.1 (Figure 2) and the extracted R-wave amplitudes (Figure 3).

We illustrate an example of the data analysis below. During one of the video clips the test subject reported feeling “scared” and “fearful” for the entire duration of the video, as well as “shocked” at 3 specific points due to jump scares in the video. We then looked to the subject’s physiological data (Figure 3) to identify a correlation. In contrast to the control data, where the R wave amplitudes remained relatively similar, there were three distinct peaks in the data collected while the test subject watched the video. Those three peaks occurred concurrently with the self-reported “shock” of the test subject.

For future analysis, we collected data on the R-R intervals, or the distances between the R-waves. This helps us plot heart rate variability (HRV). Note that heart rate is the average number of heartbeats in a time interval, while HRV is the difference in time between each heartbeat. “Research and theory support the utility of HRV as a noninvasive, objective index of the brain’s ability to organize regulated emotional responses” [13] which is why “the current neurobiological evidence suggests that HRV is impacted by stress and supports its use for the objective assessment of psychological health and stress” [14].
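As a sketch of this analysis, a minimal R-peak detector and RMSSD calculation (one common HRV metric, not necessarily the exact pipeline we used) might look like the following. The threshold, sampling rate, and synthetic trace are illustrative assumptions:

```python
import math

def detect_r_peaks(ecg, threshold):
    """Indices of local maxima above threshold -- candidate R waves."""
    return [i for i in range(1, len(ecg) - 1)
            if ecg[i] > threshold and ecg[i] >= ecg[i - 1] and ecg[i] > ecg[i + 1]]

def rr_intervals_ms(peaks, fs):
    """Gaps between consecutive R peaks, converted to milliseconds."""
    return [(b - a) * 1000.0 / fs for a, b in zip(peaks, peaks[1:])]

def rmssd(rr):
    """Root mean square of successive R-R differences, a common HRV metric."""
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Synthetic trace: R peaks at known sample positions, 250 Hz sampling rate.
ecg = [0.0] * 1000
for i in (200, 400, 610, 800):
    ecg[i] = 1.0
peaks = detect_r_peaks(ecg, threshold=0.5)
print(rr_intervals_ms(peaks, fs=250))  # [800.0, 840.0, 760.0]
```

Real ECG traces are noisier than this, so production R-peak detectors (e.g., Pan–Tompkins-style approaches) add band-pass filtering before thresholding.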

Figure 1: Short interval of ECG data depicting the QRS complex
Figure 2: ECG data snippet of control and scary data
Figure 3: R waves extrapolated from respective Figure 2 data

GSR Data

When plotted, GSR data consist of two major components: the slow-moving tonic component, measured as skin conductance level (SCL), and rapid phasic changes, measured as event-related (ER-SCR) and non-specific (NS-SCR) skin conductance responses [15]. Higher frequencies of both ER-SCRs and NS-SCRs are correlated with higher emotional arousal.
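A rough illustration of separating these two components: below, the tonic level is approximated with a centered moving average and phasic responses are counted as distinct excursions above it. The window size, amplitude threshold, and synthetic trace are simplifying assumptions, not the standard EDA pipeline:

```python
def tonic_component(gsr, window):
    """Estimate the slow tonic level (SCL) with a centered moving average."""
    half = window // 2
    out = []
    for i in range(len(gsr)):
        lo, hi = max(0, i - half), min(len(gsr), i + half + 1)
        out.append(sum(gsr[lo:hi]) / (hi - lo))
    return out

def count_scrs(gsr, tonic, min_amplitude):
    """Count phasic responses: distinct excursions above tonic + min_amplitude."""
    count, above = 0, False
    for g, t in zip(gsr, tonic):
        if g - t > min_amplitude:
            if not above:
                count += 1
            above = True
        else:
            above = False
    return count

# Synthetic trace: flat baseline with two brief conductance bumps.
gsr = [0.0] * 50
for i in (10, 11, 12, 30, 31, 32):
    gsr[i] = 5.0
tonic = tonic_component(gsr, window=21)
print(count_scrs(gsr, tonic, min_amplitude=2.0))  # 2
```

Dedicated EDA toolkits instead use deconvolution or curve-fitting to split tonic and phasic activity, but the counting idea is the same.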

We sampled the data from the same video clip mentioned in 2.1, during which the test subject reported feeling “scared” and “fearful” throughout.

There was no substantial difference between the GSR data recorded in Figures 5 and 6, which came from two separate video clips. The data in Figure 5 were taken while the test subject reported feeling “scared” and “fearful” watching the video; the data in Figure 6 were taken during a separate clip that evoked “euphoria” and “excitement” in the test subject. The two clips both resulted in much higher frequencies of GSR activity than the control data in Figure 4.

This suggests that GSR activity indicates emotional arousal rather than emotional valence.

Figure 4: GSR Control
Figure 5: GSR while watching video (scary)
Figure 6: GSR while watching happiness inducing video

Heart Rate Data

In the heart rate data, we saw a direct correlation between the suspense and feelings of anxiety reported by the test subject and their heart rate.

In Figure 7, the test subject reported the feeling of shock due to a loud sound effect that corresponded with the jumpscare. The user’s heart rate spiked correspondingly, with values reaching a max of 85 beats per minute at the peak of the jumpscare.

In Figure 8, the test subject reported feeling continuously apprehensive and on the edge of their seat. Instead of a singular tall peak, the data illustrate more frequent but shorter peaks that correlate with the self-reported anxiety of the test subject. The first reported “jumpscare” corresponded with a peak of higher values, up to a max of 106 beats per minute; however, after the first scare, the values never climbed as high.

In Figure 9, the test subject reported feeling peaceful and not being too caught off guard by the jump scares, thus the low average bpm.

This suggests that the levels of anxiety and uncertainty the subject encountered throughout the experiment are demonstrated in the heart rate signals.

Figure 7: Heart rate (visual and sound scare with little buildup)
Figure 8: Heart rate (visual and sound scare with buildup)
Figure 9: Heart rate (sound scare with buildup)

EMG Data

Figure 10 shows a control and “distress” graph, comparing a time when a subject passionately expressed distress verbally to when they sat still. The control data had consistent y values between 241 and 383, as well as consistent frequency. The distress data fluctuated much more, corresponding to times when the subject was raising their eyebrows. The max y value for the distress graph is 557.

Another experiment, shown in Figures 11 and 12, compares control data taken while the subject reported feeling peaceful (Figure 11) with data taken while the subject was happy and smiling (Figure 12). Figure 12 shows more variation and higher values, likely from the subject’s cheek muscles.

In Figures 10 and 11, the subject noted that the low peaks in the control data occurred when they blinked.

Figure 10: EMG control and “distress” data
Figure 11: EMG Peace (control)
Figure 12: EMG Happy

Experimental Errors

It is important to note that data interpretation in the context of emotional arousal is not yet standardized in all aspects, so while statistical functions (e.g., standard deviation, kurtosis of skin conductance, local maxima peak counts) can be used to estimate arousal [16], they depend on the goals of a project and will vary accordingly. Given our limited range of datasets, the aforementioned measures serve as a starting point for further data collection and analysis.
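As an illustration, the statistics named above can be computed directly from a signal. The one-standard-deviation peak criterion below is our own illustrative choice, not a standard from the literature:

```python
import math

def describe_arousal(signal):
    """Std deviation, excess kurtosis, and supra-threshold peak count."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((x - mean) ** 2 for x in signal) / n
    std = math.sqrt(var)
    # Excess kurtosis: spiky, heavy-tailed signals (sharp SCRs) score above 0.
    kurt = (sum((x - mean) ** 4 for x in signal) / (n * var ** 2) - 3) if var else 0.0
    # Local maxima more than one standard deviation above the mean.
    peaks = sum(1 for i in range(1, n - 1)
                if signal[i] > signal[i - 1] and signal[i] > signal[i + 1]
                and signal[i] > mean + std)
    return {"std": std, "kurtosis": kurt, "n_peaks": peaks}

stats = describe_arousal([0, 0, 0, 10, 0, 0, 0, 10, 0, 0])
print(stats)  # {'std': 4.0, 'kurtosis': 0.25, 'n_peaks': 2}
```

Which of these features best tracks arousal would have to be checked against each project’s own labeled data, as the paragraph above notes.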

User-related data inaccuracies could be due to various electrocardiographic artifacts along with user health conditions. Other factors that can affect data between individuals include obesity, pregnancy, location of the heart within the chest, and exercise habits [17] [18].

Non-user-related factors that could have affected the collection of ECG and GSR data include high-frequency noises, high humidity, extreme temperature variations, and the vicinity of other machines.

Future Directions

Theatrical Implementation

We suggest the same experiment be conducted with professional actors without video stimuli, instead taking measurements while they perform the various emotions of their characters. This would increase our dataset and could establish correlations using methods rooted in theater rather than ones that attempt to mirror it.

We plan on implementing our devices in theater at the University of Brasilia. Implementations could include overlaying actors’ or the audience’s heart rates to create a soundtrack; using actors’ GSR data to drive lighting color and intensity; and “limited heart rate” performances, in which actors have a certain number of heartbeats before their microphone is cut off and they have to speak louder to be heard, symbolizing the aging process.

Additionally, the visualization of the data will be formatted to be captivating for audience members rather than those with technical backgrounds, enhancing the artistic aspect of this performance. We plan on using configurations of geometric shapes and colors to represent each actor’s data.

This look into the biometrics of an actor creates an original type of performance: one in which the fourth wall is broken and the data collection process likely reflexively juxtaposes the subjective nature of theater, pushing the audience to reconsider their notions of emotional and empirical truth.

Outside of the performance itself, we can analyze the change in a performer’s technique and its correlation to the biometric data as an objective metric of feedback for actors.

Sensor Hookup to Actor

Utilizing the Mikroe phone jack ECG cable [19] and adhesive electrode sensors [20], we are able to record data from all four Click Boards at once (Figure 13). We are currently working on getting the BLE Tiny Click [21] to collect this data wirelessly. The device is to be powered using Mikroe 3.7V secondary batteries [22] (Figure 14), chosen for efficiency in terms of weight, size, and capacity.

We propose keeping the sensors in the pocket of an actor, with their clothes covering the EMG and ECG electrodes. The GSR electrode will be wrapped around the actor’s fingers and secured with velcro. The device would need to be placed on the arm to also take measurements from the Heart Rate Click. If this is not possible, we can use the QRS complex produced by the ECG to represent a heartbeat.

Figure 13: Sensor setup
Figure 14: Battery Calculations

Conclusion

Through our research, we observed correlations between biometric signals such as heart rate, GSR, ECG, and EMG and self-reported emotions; further work can test whether the differences between our control and test data are statistically significant. This data will provide ways for the audience to receive sensory information about an actor’s state, opening up many possibilities for theatrical implementation.

Acknowledgments

We would like to thank Professor Michael Rau, Sreela Kodali, Deniz Yagmur Urey, Ashley Jun, and Rinni Bhansali for their technical guidance and support this summer. We would also like to express our appreciation for Professor Tsachy Weissman, Cindy Nguyen, Sylvia Chin, and the other mentors at the Stanford Compression Forum who made this opportunity possible.

References

  1. Kirby, E. T. “The Delsarte Method: 3 Frontiers of Actor Training.” The Drama Review: TDR, vol. 16, no. 1, 1972, pp. 55–69. JSTOR, Accessed 5 Aug. 2022.
  2. Cornelius, Randolph R. “Theoretical Approaches to Emotion.” ISCA Archive, 5 Sept. 2000.
  3. Gruzelier, John. Enhancing Creativity with Neurofeedback in the Performing Arts: Actors, Musicians, Dancers: Theory and Action in Theatre/Drama Education. Sept. 2018.
  4. Gruzelier J, Inoue A, Smart R, Steed A, Steffert T. Acting performance and flow state enhanced with sensory-motor rhythm neurofeedback comparing ecologically valid immersive VR and training screen scenarios. Neurosci Lett. 2010 Aug 16;480(2):112-6. doi: 10.1016/j.neulet.2010.06.019. Epub 2010 Jun 11. PMID: 20542087.
  5. “Sparkfun RP2040 MikroBUS Development Board.” DEV-18721 – SparkFun Electronics.
  6. “Mikrobus Shuttle: Mikroelektronika.” MIKROE,
  7. “Shuttle Click: Mikroelektronika.” MIKROE,
  8. “Click Boards.” MIKROE,
  9. “Arduino/ArduinoCore-Mbed.” ArduinoCore-Mbed,
  10. “EmotiBit_MAX30101.” GitHub,
  11. Ashley, Euan A, and Josef Niebauer. “Conquering the ECG.” National Center for Biotechnology Information, U.S. National Library of Medicine, 2004,
  12. Is Hypertrophy a Short Term Effect of Exercise?, 11 Oct. 2020,
  13. Wei, Chuguang, et al. “Affective Emotion Increases Heart Rate Variability and Activates Left Dorsolateral Prefrontal Cortex in Post-Traumatic Growth.” Nature News, Nature Publishing Group, 30 Nov. 2017,
  14. Kim, Hye-Geum, et al. “Stress and Heart Rate Variability: A Meta-Analysis and Review of the Literature.” Psychiatry Investigation, Korean Neuropsychiatric Association, Mar. 2018,
  15. Braithwaite, Jason J, et al. “A Guide for Analysing Electrodermal Activity (EDA) & Skin Conductance Responses (SCRs) for Psychological Experiments.” 2015.
  16. Kolodziej, M., et al. “Electrodermal activity measurements for detection of emotional arousal.” Warsaw University of Technology, Institute of Theory of Electrical Engineering, Measurement and Information Systems,


  17. García-Niebla, Javier, et al. “Technical Mistakes during the Acquisition of the Electrocardiogram.” Annals of Noninvasive Electrocardiology, vol. 14, no. 4, 2009, pp. 389–403. doi:10.1111/j.1542-474X.2009.00328.x.
  18. Rashid, Muhammad Shihab, et al. “Emotion Recognition with Forearm-Based Electromyography.” 13 Nov. 2019.

Designing for Authenticity

Journal for High Schoolers 2022

Adanna Taylor, Chloe Zhu, Jared Rosales, Lilah Durney, Ryan Brunswick, Samip Phuyal, Selina Song


We have entered an age of information disorder; with the current design of the internet, it has become increasingly difficult for users to access, identify, and trust authentic information. Editing tools have made the alteration or fabrication of image and video content dangerously easy, contributing to the vast amount of misleading and false information available online. Misinformation online has jeopardized the public’s trust in news media and the free press. Furthermore, disinformation is being used now more than ever for information warfare, which has had measurable effects on the political, social, and economic climate of nations worldwide.

The imminent transition to Web 3.0 opens up an opportunity to redesign the internet using new technologies. Looking at the Starling Lab’s work on image verification, for instance, we see cryptography’s ability to secure and verify the history of a piece of visual content as a powerful tool to ensure authenticity. In this paper, we seek to explore and demonstrate how Web 3.0’s technology can be applied to solve the information disorder; we ask the question “How might we design for authenticity? And how might we visualize that design?”

All of our results can be found in this folder.


To understand Starling Lab’s work, it’s important to consider how we arrived at this time of so much information disorder. Since its invention in 1989, the internet has gone through three distinct phases: Web 1, Web 2, and Web 3 [1]. Web 1 gained popularity in 1991, and its structure was decentralized, with no single authority or group of authorities dictating the content that could be published. Next, around the year 2004, Web 2 emerged with the arrival of the social-mobile web and companies like Facebook. Based on user interaction and the spread of information via posting, commenting, and messaging, this phase of the web saw a centralization of power in the hands of a few large companies, which resulted in many companies selling users’ data and information [2]. These characteristics of Web 2 have led to the spread of misinformation and disinformation, two terms that we will explore later in this paper. We are now on the cusp of Web 3, a new phase of the internet that strives to fix the unintended consequences of Web 2, like centralization and monopoly power. The goal of Web 3 is to create a system that removes power from bad actors, decentralizes control, and creates transparency [3]. Beyond Web 3 technologies that have gained popularity in the past few years, like cryptocurrencies, we believe that Web 3 technologies can reduce “information uncertainty” and help bolster trust in digital content.

Blockchain & Tech

We can specifically use cryptographic technologies to help us verify and authenticate information. The roadmap of verification starts when the creator, or Person A, takes a photo and saves it as a data file. They upload the file through a cryptographic hashing program, software that returns a hash. Next, they sign the hash with their private key, verifying their identity and ownership [4]. Next, Person A registers their signature onto a distributed blockchain ledger, a decentralized database that can be seen across different sites and is not controlled by a single, centralized company like Google or Facebook [4]. Person A then stores it onto a distributed storage network, which splits the data across multiple storage places, or servers. If one server is hacked, the others will not be compromised, a unique security benefit to decentralized technology. Person B will verify the signature with a public key and get the hash code [5]. Person B will compare the hash from the digital signature with the hash from the original data file [5]. If the hashes match, then the data is verified and authentic. If the hashes are different, someone has changed the data file and it is not authentic.
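The roadmap above can be sketched in code. The toy RSA key below (the textbook 3233 = 61 × 53 example) stands in for a real private/public key pair, which in practice would come from a vetted cryptography library with 2048+ bit keys:

```python
import hashlib

# Toy RSA key pair: modulus N, public exponent E, private exponent D.
# These tiny textbook values are only to make the roadmap concrete.
N, E, D = 3233, 17, 2753

def digest(data: bytes) -> int:
    """Hash the file, then reduce the hash into the toy key's range."""
    return int(hashlib.sha256(data).hexdigest(), 16) % N

def sign(data: bytes) -> int:
    """Person A signs the hash with the private exponent D."""
    return pow(digest(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    """Person B recovers the hash with the public exponent E and compares."""
    return pow(signature, E, N) == digest(data)

photo = b"original image bytes"
sig = sign(photo)
print(verify(photo, sig))  # True: the hashes match, so the file is authentic
```

Any edit to the file changes its hash, so verification fails; with this toy modulus a spurious match is possible, but real 256-bit hashes and full-size keys make that chance negligible.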

Figure 1: Roadmap of the Cryptographic Process [6]

The history of changes made to a data file can be found on the ledger and is stored as metadata. Information on the ledger is immutable, so it cannot be changed or removed once saved [7]. Data explaining what changes were made, at what time, and by whom can all be accessed.

Figure 2: Adobe CAI tool showing metadata and edit history of an image [8]

Misinformation & Disinformation

Misinformation has been one of the main issues accompanying the evolution of the web. Misinformation is false information that is spread by someone unaware of its origin, though not necessarily with the intent to harm. There are seven categorized types of misinformation, ranging from satire, which involves low manipulation, to manipulated, false, misleading, imposter, and fabricated content [9]. For example, some websites will post false content under headings or logos similar to those of credible and trustworthy news organizations.

In addition to these categories, there are others in between that only serve to confuse the public. Disinformation is false information that is deliberately spread to cause harm. Some politicians, for example, throw around and overuse the term “fake news,” undercutting all news, including reliable journalism and information.

How might we better verify digital content and make it clear to the public that what they read and see has been authenticated, thereby bolstering trust in journalism and the media? That’s a question on the minds of many journalists and photographers who worry about how and where readers and viewers get their information and whether they can trust it. These days, some say, we’re all suffering from “information disorder.”

Players & Analysis of Problem

Many developers have been working together to re-architect the web by looking for upstream solutions using provenance and cryptographic tools. There is a world of open source players working in this ecosystem, which allows various groups to collaborate using publicly available tools to combat misinformation and disinformation. We have spoken to representatives of many of the organizations spearheading this movement, including the Starling Lab for Data Integrity, led by Stanford and the University of Southern California; the Content Authenticity Initiative (CAI), spearheaded by Adobe; and the News Provenance Project, an experiment led by the New York Times Research and Development team.

The CAI collaborates with groups across several industries to fight misinformation by using technology and tools that bolster digital content provenance and put data on a decentralized, unalterable, and transparent distributed network [8].

Similarly, the Starling Lab looks to establish trust in records of digital media by using provenance, various cryptographic methods and widely accessible collaborative tools. The Lab follows a three-step framework to capture, store and verify digital content that makes sure the information of the content’s origins are authentic and viewable [10].

The New York Times Research and Development team has also experimented with provenance, working with IBM on the “News Provenance Project,” an experiment with technical solutions that combat the spread of misinformation by allowing readers to verify the validity of news online [11].

However, there is still a severe lack of awareness about the host of issues that misinformation and disinformation have brought about, as well as the steps that are being taken to address these problems. The work our team has done sheds light on the consequences of misinformation and disinformation, as well as solutions for them.

Methods and Materials

When discussing the most effective way to inform others about the utility and journey of Web 3.0, we advocated for the use of visual aids. Our primary criteria were comprehension, precision, and engagement. Creating these visuals involved various editing software, combining informative text with striking images. Photoshop and Canva were the primary tools, allowing creative flexibility in layering text and images, while iMovie played a significant role in expanding our reach to a media-centric audience through video.


Throughout the summer, we have focused on multimedia forms of creation. We created visuals and infographics to explain the various concepts of Web3. These concepts span from the history of the Internet to the technologies behind data authentication, such as hashing, signatures, and public/private keys. In addition, with the supervision of media professional Aaron Huey, we have created a video to present our methods and findings throughout the program.

Our video is viewable through this link:

Some examples of our work:

Figure 3: Visual explaining the concept of fabricated content. As one of the many ways in which information can be manipulated online, fabricated content is recognizable through its malicious use of familiar logos to communicate incorrect information.
Figure 4: Infographic depicting the steps involved in securing data online with the concept of provenance, beginning with the creation of the data and ending with its secured but public viewing.
Figure 5: Infographic explaining the concept of metadata. This includes 3 steps, ranging from the collection of metadata, its analysis, finally to the securing of the metadata.
Figure 6: A visualization of the web’s history depicting how it evolved and adapted over time.

The following is a compilation of all visuals and infographics we created on the various topics, technical and conceptual, of Web3:

Future Directions

We can further increase the scope and quality of our work to make knowledge about cryptography’s value in content verification more accessible. To measure the effectiveness of our work, we can gather feedback through surveys in our local communities: after presenting the infographics, we can track how much information was retained through a follow-up quiz. In addition, to create a more professional video with greater human interaction, we can produce a video interview series to gauge strangers’ starting familiarity with Web3 and how it changes after a brief discussion and video playback. From a more technical angle, we can learn more about the product development process for Web3 applications and potentially create our own dApp, or decentralized app. Overall, there is an abundance of opportunity to explore the variety of uses of Web3, utilize it to create change, and continue to design for authenticity.


[1] Wikipedia contributors. (2022, July 21). World Wide Web. Wikipedia.

[2] Wikipedia contributors. (2022a, July 3). Web 2.0. Wikipedia.

[3] Web3: in a nutshell. (2021, September 9). Mirror.

[4] Johnson, S. (2021, September 3). Beyond the Bitcoin Bubble. The New York Times.

[5] IBM. (2021, March 5). Digital signatures.

[6] Blockgeeks. (2019, November 8). BLOCKCHAIN INFOGRAPHICS: The Most Comprehensive Collection.

[7] Koptyra, K., & Ogiela, M. R. (2020). Imagechain-Application of Blockchain Technology for Images. Sensors (Basel, Switzerland), 21(1), 82.

[8] CAI. (2022). Secure Mode Enabled. Content Authenticity Initiative.

[9] Wardle, C. (2021, August 3). Understanding Information disorder. First Draft.

[10] Starling Lab. (2022). Starling Lab.

[11] NYT R&D. (2022). The New York Times R&D.

How Do Cell Phones Work?

Blog


Cell phones are all around us. We use them every day to communicate nearly instantly with friends, family, and random internet strangers across the globe. The fact that we are able to send messages so quickly is truly a technological marvel and brings together innovations in physics, electrical engineering, information theory, computer science, and more. This blog post hopes to illuminate how these magical devices work.

There is no way that this post will be able to cover every aspect of wireless communications (as you can devote your entire life to studying the subject), but I hope to distill it into manageable components that allow for further study if desired.

Each section will be broken up into two subsections: one accessible to those with a non-technical background and one for those with more physics and math knowledge. The more general sections will be denoted by a 😎, while the more advanced sections will be denoted by a 🤓. They are as follows:

  1. A Brief History of Communication
  2. The Communication Problem
  3. The Layered Model of Communication
  4. A Primer On Electromagnetic Waves (Radio Waves & Light)
  5. How Information is Encoded in Light (Modulation)
  6. Antennas
  7. The Network of Cell Towers
  8. Putting It All Together

Hopefully, this post will give you a taste of how the theory of information works closely with other fields of science and engineering to produce massively effective and efficient systems.

A Brief History of Communication

Just to give a sense of where we are and where we have come from, here is a by-no-means-complete timeline of the history of communication:

  • ~1.75 Million Years Ago: First humans use spoken language to yell at kids for playing with rocks.
  • ~3000 BCE: Ancient Mesopotamians develop Sumerian, the first written language, to write down their laws, stories, and more.
  • ~540 BCE: Cyrus of Persia establishes one of the first postal services, where messages traveled along a network of roads.
  • ~1200 CE: People begin to use carrier pigeons to transport messages in a one-way fashion.
  • ~1450 CE: Johannes Gutenberg invents the printing press, allowing for mass production of books and other writing.
  • ~1650 CE: British navy pioneers using flags as signals for different messages at sea.
  • May 24, 1844: Samuel Morse sends first telegraph message: “What hath God wrought?”
  • March 10, 1876: Alexander Graham Bell makes first telephone call.
  • 1887: Heinrich Hertz conducts experiments sending and receiving radio waves.
  • Early 1920s: Bell Labs tests car-based telephone systems.
  • 1973: Motorola’s Martin Cooper invents the first handheld cellular phone.
  • 2007: Apple releases the first iPhone.
  • 2019: 5G cellular networks begin rolling out.

The Communication Problem

The basic idea of communication is to move information from point A to point B as accurately and efficiently as possible. Let’s take a closer look at what this actually means.


According to Merriam-Webster, information is defined as “knowledge obtained from investigation, study, or instruction.” This is a nice colloquial definition, but not exactly what we are going for in an information theoretic sense.

To get a better sense of what information really is, consider the following situation. Imagine that you live in a town called Bagelville without any cell phones and you and a friend are trying to meet up to grab some of your town’s famous bagels. Assume that Bagelville only has four bagel shops, numbered 1, 2, 3, and 4, respectively. You know that he is waiting for you at one of the four, but you don’t know which one. In your mind, he has an equal chance of being at any of the four. At this point, you have minimal information about your friend. As you walk out the front door, you notice that there is a note with the number 2 written on it signed by your friend. Now, you have reduced your uncertainty about where he is. In some sense, you have gained information about your friend. In addition, your friend has encoded the information about which shop he is in by writing it down for you.

Another way to look at it is that information is anything that can be described. An encoding is, therefore, the description you choose that uniquely determines the thing you have in mind. It’s important to note that not all encodings are created equal. Instead of writing the number “2”, your friend could have also written: “go to the bagel shop next to the hardware store on the corner of Main Street.” These two descriptions may refer to the same place, and after reading either one you may have the same shop in mind, but one of them is far more concise than the other.

Looking at information in this general sense, information theorists try to find efficient ways to encode information (usually in 0’s and 1’s), since it is easier to send and receive short descriptions than long ones. There are even some cool mathematical theorems stating that the most efficient way to encode information is with bits!
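To make the bagel-shop example concrete, here is a tiny Python sketch (my own illustration) of a fixed-length binary encoding: with four equally likely shops, two bits are enough to name any one of them.

```python
from math import ceil, log2

shops = [1, 2, 3, 4]  # the four bagel shops in Bagelville
# With M equally likely options, you need ceil(log2(M)) bits to name one.
bits_needed = ceil(log2(len(shops)))
code = {shop: format(i, f"0{bits_needed}b") for i, shop in enumerate(shops)}
print(bits_needed)  # 2
print(code)         # {1: '00', 2: '01', 3: '10', 4: '11'}
```

Your friend’s note with “2” on it is doing exactly this: picking out one of four possibilities, which is worth two bits of information.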

The communication problem can be summarized by the following diagram:

[Diagram: a message passes through a noisy channel before reaching its recipient]

You have some message (in this case, “hey, want to hang out?”) that you want to send to your friend. You want him to be able to reconstruct the information in your message, so you send it across a channel that has noise. In other words, when you send your message, there is a chance that there is some distortion, e.g. that your message becomes “hey, want to work out?”.

Another way to think of noise is to imagine you and your friend at the bagel shop in Bagelville. You’ve been dying to tell her about your newfound obsession with cat memes, so you try to communicate using your words. Since you chose the most popular shop in town, there are tons of other people talking and music playing in the background, which makes it harder for your friend to understand you. This is also an example of noise. You can think of noise as anything that makes it harder to discern the actual information of a message. Information theorists call the actual content of the message the signal, and one of the goals of engineering communication systems is to maximize the ratio between the signal and the noise, called the signal-to-noise ratio, or SNR for short.
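Since SNR is just a ratio of powers, it is easy to compute; engineers usually quote it on the decibel scale. A minimal sketch (the helper name `snr_db` is my own):

```python
from math import log10

def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio expressed in decibels: 10 * log10(P_signal / P_noise)."""
    return 10 * log10(signal_power / noise_power)

# A signal 100x stronger than the background chatter: 20 dB.
print(snr_db(100.0, 1.0))  # 20.0
```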


Information turns out to be something that you can quantitatively measure. Given the link between information and uncertainty, we can define a quantity that describes how much uncertainty we have about a given source of information called entropy. To make things more concrete, we need to utilize the mechanics of probability theory.

Consider a discrete random variable U that can take values from an alphabet \mathcal{U} = \{ u_1, u_2, \dots, u_M\}. For each value u_i, there is a probability p(U=u_i) that denotes how likely U is to take on the value u_i. We can now define the “surprise” function: S(u) = \log \frac{1}{p(u)} (here \log = \log_2). To gain some intuition about this object, consider the extremes of the input. If p(u) = 1, then the surprise function will evaluate to zero, and if p(u) \ll 1, then the surprise function will be very large. We are now ready to define the entropy of the random variable U, H(U).

H(U) = \mathbb{E}[S(U)] = -\sum_u p(u) \log p(u)

You can think of the entropy as the expected surprise of the random variable. In other words, on average, how surprised are you when you see a manifestation of the random variable?

Entropy is linked to information in the following sense. The more entropy a random variable has, the more information is required to describe it. Think about the extreme situations. If a random variable is essentially deterministic (say always takes the value “1”), then you can just convey that it always takes on “1”. But if a random variable is uniformly distributed, you need to describe that it is sometimes “1”, sometimes “2”, sometimes “3”, etc. This definition of entropy can be expanded to more complex distributions, e.g. joint and conditional ones, by replacing the probability in the log with the desired distributions.
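The two extreme situations above are easy to check in a few lines of Python:

```python
from math import log2

def entropy(probs: list[float]) -> float:
    """H(U) = -sum p log2 p, in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([1.0]))                     # 0.0 -- deterministic, no surprise
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 -- uniform over 4 outcomes
```

The uniform distribution over four outcomes has exactly 2 bits of entropy, matching the two-bit encoding of the four bagel shops earlier.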

Now that we have entropy, we can also define a measure of information called mutual information. It essentially describes how easy it is to reconstruct one random variable from another. It is defined as:

I(X;Y) = H(X) - H(X|Y)

Note that if X and Y are independent, then the mutual information will be 0.

But how does this all relate to communication? Consider the diagram below…

Screen Shot 2019-03-16 at 3.46.47 PM

Taken from EE376A Lecture 7 Notes 2018

Essentially, we are trying to convey a message X^n to our friend through a noisy channel which produces a (potentially) altered message Y^n based on the statistics of the channel P_{Y^n|X^n}. Note that given this probability distribution for the channel (and the distribution of the input X), we can calculate the mutual information between X and Y. The higher the mutual information, the better the channel and the easier it is to reconstruct the original message.
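As a concrete example (not from the lecture notes, but standard), consider the binary symmetric channel, which flips each transmitted bit with probability p. For a uniform input, H(X) = 1 bit and H(X|Y) equals the binary entropy of p, so I(X;Y) = 1 - H(p):

```python
from math import log2

def h2(p: float) -> float:
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_mutual_information(p: float) -> float:
    """I(X;Y) for a binary symmetric channel with crossover probability p
    and a uniform input: I(X;Y) = H(X) - H(X|Y) = 1 - h2(p)."""
    return 1.0 - h2(p)

print(bsc_mutual_information(0.0))  # 1.0 -- perfect channel, 1 bit per use
print(bsc_mutual_information(0.5))  # 0.0 -- pure noise, X and Y independent
```

Note how the p = 0.5 case matches the statement above: when the output is independent of the input, the mutual information is zero and no communication is possible.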

The Layered Model of Communication

Telecommunications is a complex problem to tackle. To make it manageable, people much smarter than me have developed the Open Systems Interconnection (OSI) Model, which has seven layers of communication that incorporate both software and hardware components. The layers are described below:

The general idea is that when a user interfaces with layer 7 (an application, e.g. email) and hits “send”, the information is converted into simpler and simpler forms as it moves down the layers and then gets sent over the physical link in its simplest form. The intended recipient of the message then receives the physical signal and reconstructs it into something legible to a human. Each layer has protocols (read: languages and customs) that everyone agrees on so that data processing can happen seamlessly.

Since in this post I will mainly focus on layer 1, I will give a quick general overview of what the different layers do and examples of their manifestations.

  1. The Physical Layer: This layer is responsible for the barebones transport of bits (0s and 1s) from point A to point B. This can be done through voltages, light, or other media. The protocols specify all the basics about the communication: the rate of data transfer, what a 0 looks like, what a 1 looks like, etc. Examples of the physical layer include Bluetooth, USB, and Ethernet.
  2. The Data Link Layer: The data link layer connects two directly connected devices (e.g. a computer and a router on the same wifi) in a network and helps establish when they can talk to each other. This layer also performs some rudimentary error correction on the physical layer when it messes up. The most well known of these protocols is the Media Access Control (MAC) which gives permission to devices on a network to talk to each other.
  3. The Network Layer: This layer connects different local networks together. You may have your wifi router in your home in California, but you want to send an email to someone in Florida. The Internet Protocol (IP) provides a way to find efficient routes from one node in a network to another.
  4. The Transport Layer: This layer ensures that all of the data that you are trying to send accurately makes it to the intended recipient. An example of this is the Transmission Control Protocol (TCP).
  5. The Session Layer: When two devices need to talk to each other, a session is created that stays open as long as the devices are communicating. This layer handles the mechanics of setting up, coordinating, and terminating a session.
  6. The Presentation Layer: When a raw stream of bits is received, it is not very useful unless you know what they are for. If you took the bits for a picture and put them in a text editor, you would get something really weird, but when you open them with picture viewing software, e.g. Preview, you can clearly see what the image is. The presentation layer can be thought of as the different file types that people use. Examples include JPEG, GIF, and more.
  7. The Application Layer: This is where you, the user, come in. The application is the thing that presents the data to the end user, whether it be a web browser or email.

Don’t worry if you didn’t understand all of that. For the rest of the post, we will focus mainly on the physical link.

A Primer on Electromagnetic Waves

Electromagnetic (EM) waves are essential to nearly everything we do. They often act as a carrier of information that travels INCREDIBLY fast. Take light as an example. Light is essential for our visual faculties, and photons (little particles of light) that bounce off of the objects around us reach our eyes almost instantly, giving us a real-time view of what is happening. Because of this, it is often said that “light is the carrier of information.” This section aims to give you a basic background on what EM waves are and why they are important to wireless communications.


You may be familiar with electricity and magnetism, but what you may not know is that they are two sides of the same coin. When the strength of something electric changes, magnetism pops up. When the strength of something magnetic changes, electricity pops up. EM waves are bundles of electricity and magnetism that change in such a way that creates a self-reinforcing cycle.


A Diagram of an Electromagnetic Wave (Source)

As you can see in the diagram above, the strength of the electricity and magnetism “waves” from high to low, and each creates the “waving” pattern in the other. James Clerk Maxwell, known as the father of electricity and magnetism, discovered that these waves can propagate in free space at the speed of light (c ≈ 186,000 mi/s). To put that speed into perspective, if you were traveling at the speed of light, you could travel from New York to London in about 18 milliseconds.

How quickly the strength of the electricity and magnetism changes is called the frequency of the wave. Frequency is measured in a unit called Hertz, which is basically the number of peaks that you see in a second. The more peaks per second, i.e. the higher the frequency, the more energy that wave has. Frequency is closely related to the wavelength of the wave, which is the spatial distance between consecutive peaks. The two are related by the equation v = \lambda f, where \lambda is the wavelength (measured in meters), f is the frequency (measured in Hertz = 1/s), and v is the speed of the wave (measured in meters per second). The larger the wavelength at a given speed, the smaller the frequency. Electromagnetic waves span a wide range of frequencies, and the different “bands” are used for different purposes. The spectrum is shown below.
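The relation v = \lambda f is easy to play with in code. For waves traveling at the speed of light:

```python
C = 299_792_458  # speed of light in meters per second

def wavelength(frequency_hz: float) -> float:
    """lambda = v / f for an electromagnetic wave in free space."""
    return C / frequency_hz

print(wavelength(100e6))   # a 100 MHz FM radio wave: about 3 meters
print(wavelength(5.4e14))  # green visible light: about 555 nanometers
```

Same equation, wildly different scales: radio waves are meters long, while visible light has wavelengths thousands of times smaller than a human hair.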


The Electromagnetic Spectrum (Source)

You may be familiar with X-rays and microwaves from your everyday experience in the doctor’s office or the kitchen, but you might not know that visible light, i.e. the light that you see with your eyes, is made of the same “stuff”! For the purpose of cell phones, we are going to focus on the low-frequency end of the spectrum: radio waves.

Radio waves are great for telecommunications because they travel at the speed of light and are good at traveling long distances without getting disturbed. We will get into how cell phones turn your text messages into radio waves later.


(This section will be a little shorter, as I will assume more knowledge of EM waves.) Electromagnetic waves are a consequence of Maxwell’s Equations, a set of four differential equations unified by James Clerk Maxwell in the 1860s.


Maxwell’s Equations (Source)

Here, \mathbf{E} is the electric field, \mathbf{H} is the magnetic field, and \mathbf{J} is the current density, which isn’t really important for waves traveling in free space. If you combine the last two equations to eliminate \mathbf{H}, you will find a form of the wave equation:

\nabla^2 \mathbf{E} = \mu_0 \epsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2}

The Wave Equation for the Electric Field (Source)

This gives rise to sinusoidal solutions that look like the ones in the pictures above. In free space, the waves are allowed to have any frequency and propagate at the speed of light c.

Encoding Information Into Light

The process of encoding information into a wave is called modulation. It’s used all the time in radio communications and in cellular communications. In this section, I’ll try and give you a sense of how it works.


Let’s consider what happens when you type a text message into your phone. First, the letters that you type are turned into a series of bits (0s and 1s). There is a standard for doing this with individual letters called the ASCII standard. Each letter is assigned a number, and that number is converted into a series of bits. You can think of the bits as answering a yes or no question about the letter. The first bit could answer the question: “Is the letter in the first half of the alphabet?” If it is “1”, then we know it is in the first half and we can discard the other letters as a possibility. The second bit can then represent us asking, “Is the letter in the first half of the first half of the alphabet?” (or alternatively, “Is the letter in the first half of the second half of the alphabet?” if the first bit is a “0”). We continue this process until we know precisely what letter we have.
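This lookup is easy to demonstrate with Python’s built-in `ord`, which returns a character’s ASCII number; standard ASCII uses 7 bits per character, padded here to 8:

```python
def letter_to_bits(ch: str) -> str:
    """Look up the character's ASCII number and write it as 8 bits."""
    return format(ord(ch), "08b")

def text_to_bits(text: str) -> str:
    """Concatenate the bit patterns of every character in the message."""
    return "".join(letter_to_bits(ch) for ch in text)

print(letter_to_bits("A"))  # ord('A') == 65 -> '01000001'
print(text_to_bits("hi"))   # '0110100001101001' -- 16 bits for two letters
```

Each successive bit answers one yes/no question about the character, exactly the halving process described above.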

Once we have the series of bits that we are going to send, we can turn the bits into a wave. There are several ways to do this, so I’ll explain the simplest one, known as on-off keying: the signal “waves” for a little bit to represent a “1” and doesn’t “wave” for a little bit to represent a “0”. The following diagram should make this a little more clear for the sequence 1011:


Courtesy of Prof. Andrea Goldsmith

As you can see from the picture, once we receive the signal we can simply measure the amplitude of the wave for a given period to determine whether the current bit is a zero or a one. It’s important to note that the sender and the receiver must agree on how long these 0’s and 1’s last and at what frequency they are being sent, or else the message cannot be recovered. We will explain how these pulses are physically sent out in the next section.
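The on-off scheme above can be sketched end to end: turn the carrier on for each 1, leave it off for each 0, and recover the bits by measuring the energy in each agreed-upon bit period. The sample counts and carrier cycles below are arbitrary choices of mine, not any real standard:

```python
from math import sin, pi

SAMPLES_PER_BIT = 100  # sender and receiver must agree on the bit duration...
CYCLES_PER_BIT = 4     # ...and on the carrier frequency

def ook_modulate(bits: str) -> list[float]:
    """On-off keying: the carrier 'waves' during a 1 and is silent during a 0."""
    samples = []
    for bit in bits:
        amp = 1.0 if bit == "1" else 0.0
        samples += [amp * sin(2 * pi * CYCLES_PER_BIT * n / SAMPLES_PER_BIT)
                    for n in range(SAMPLES_PER_BIT)]
    return samples

def ook_demodulate(samples: list[float]) -> str:
    """Measure the energy in each bit period to decide 1 (loud) vs 0 (quiet)."""
    bits = ""
    for i in range(0, len(samples), SAMPLES_PER_BIT):
        chunk = samples[i:i + SAMPLES_PER_BIT]
        energy = sum(s * s for s in chunk)
        bits += "1" if energy > SAMPLES_PER_BIT / 4 else "0"
    return bits

signal = ook_modulate("1011")
print(ook_demodulate(signal))  # '1011'
```

Notice that both constants appear on both sides: change either one on only one end and the decoding falls apart, which is exactly why the protocol must be agreed on in advance.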


The goal of modulation is to take a signal x(t) (analog or digital) and turn it into an analog signal y(t) that has some carrier frequency \omega_c, which can then be transmitted as an electromagnetic wave. There are two stages: modulation and demodulation.

  1. Modulation: We take the signal x(t) and multiply it by the carrier c(t) = \cos (\omega_c t) to get y(t). We can then use an antenna (explained in the next section) to send this signal out.


The Process of Modulation (Courtesy of Prof. Joseph Kahn)

  2. Demodulation: Once we receive the signal y(t), we multiply it by the carrier again to get

v(t) = c(t) y(t)= x(t) \cos^2(\omega_c t) = \frac{1}{2}x(t) + \frac{1}{2} x(t) \cos(2\omega_c t)

If we then apply a filter to the signal that gets rid of the high frequencies, we are left with half of the original signal, which we can simply amplify by two. A schematic is shown below.


Demodulation (Courtesy of Prof. Joseph Kahn)
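You can check the demodulation math numerically: multiply by the carrier, then average over many carrier cycles (a crude stand-in for a real low-pass filter), and what survives is x(t)/2. The carrier frequency and sample counts below are illustrative values of my own:

```python
from math import cos, pi

W_C = 2 * pi * 1000.0  # carrier: 1 kHz, in rad/s
DT = 1e-6              # sample spacing in seconds
N = 100_000            # 0.1 s of samples -> exactly 100 carrier periods

x = 0.8  # a slowly varying message, held constant over this short window

# v(t) = x(t) cos^2(w_c t) = x/2 + (x/2) cos(2 w_c t)
v = [x * cos(W_C * n * DT) ** 2 for n in range(N)]

# Averaging over whole carrier periods kills the cos(2 w_c t) term...
recovered = sum(v) / N
print(recovered)  # ~0.4, i.e. x/2 -- double it to get x back
```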

There is a litany of extra steps (compression and error correction) that need to happen if you are transmitting digital signals, but they are outside the scope of this post. Here are some resources on compression and error correction.


Antennas

So now that we can turn our information into something that can be sent as an electromagnetic wave, how do cell phones actually send out signals that reach other phones? If this description leaves you unsatisfied, check out this website for a comprehensive guide to how antennas work.


An antenna is basically a piece of metal connected to a special kind of battery that is able to send and receive electromagnetic waves. It does so by applying electricity in such a way that it creates the particular wave (as described in the EM primer section). The size of the antenna matters: it needs to be roughly half as long as the wavelength you are trying to send.

Each antenna also has a specific directivity, which is a measure of how concentrated in space its radiation is. Directivity is closely related to antenna size: the larger the antenna, the more you can focus your beam in one direction. Since your phone has a relatively small antenna, it generally radiates waves isotropically, i.e. in all directions. Cell phone towers are basically huge antennas, and they are able to beam radiation in the general direction of your phone rather than everywhere. This allows them to avoid wasting power sending signals in every direction.

One important concept for antennas is the bandwidth. Since a single antenna can radiate a range of frequencies, we can define bandwidth as the difference between the highest and lowest frequencies it radiates. This concept will become important when we discuss the cellular grid. Most cell phone systems operate at frequencies from hundreds of megahertz up to a few gigahertz (billions of Hertz), with newer 5G systems reaching into the tens of gigahertz.


To make an antenna, we need a piece of wire hooked up to an alternating current source that can accelerate the electrons in the wire at the frequency we want to radiate. When the electrons move, they cause a change in the electric field that consequently causes a shift in the surrounding magnetic field. This process continues, and the result is electromagnetic waves. The antenna must also be impedance-matched to the incoming transmission line; otherwise, part of the signal is reflected back toward the source and power is lost. This mostly matters at higher frequencies, which is important for cellular communications.

In addition to affecting directivity (defined above), the size of an antenna dictates how many different frequencies it can radiate. A general rule of thumb is that an antenna of size l can radiate wavelengths of length \lambda = 2l.
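That rule of thumb translates directly into code. Assuming free-space propagation at the speed of light, l = \lambda / 2 = c / (2f):

```python
C = 299_792_458  # speed of light in meters per second

def half_wave_antenna_length(frequency_hz: float) -> float:
    """Rule of thumb from above: an antenna of length l radiates lambda = 2l,
    so the length for a target frequency is l = c / (2 f)."""
    return C / (2 * frequency_hz)

print(half_wave_antenna_length(2.4e9))  # ~6.2 cm, pocket-sized (2.4 GHz WiFi band)
print(half_wave_antenna_length(100e6))  # ~1.5 m, car-antenna-sized (FM radio)
```

This is one reason higher-frequency systems can use smaller antennas: the wavelength, and hence the required metal, shrinks as frequency grows.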

One cool thing you can do with antennas is to put many of them together in an array. This allows you to interfere the waves with each other in a way that increases your directivity. Arrays are also generally more efficient, as they can radiate more power and increase the signal-to-noise ratio of incoming messages.

If you haven’t, make sure you read the definition of bandwidth above, as it will be important later.


An Example of an Array of Cellular Antennae on a Cell Phone Tower (Source)

The Network Of Cell Towers

There are nearly 100,000 cell phone towers in the United States alone. In this section, I’ll try to explain how your phone talks to the tower and then how your message gets to its intended recipient. The general setup is depicted below:


A Schematic of Cellular Communication


Cell phones are called “cell” phones because of the cellular nature of the arrangement of cell towers. Each one is given a region that it governs, and when your phone has service, it means that it is in communication with the nearest tower. A toy example of the grid is shown below:

[Diagram: a hexagonal grid of cell-tower coverage cells]

The towers are arranged in a hexagonal lattice because it provides maximal coverage for the fewest number of towers. If the coverage areas were circles, there would be blackout spots with no coverage at all, and if they were squares, there would be higher variability of signal strength within each cell.

Individual towers are connected by a series of fiber optic cables that use light (yes, visible light) to transmit information from tower to tower. The exact nature of how a message gets from point A to point B is outside the scope of this post, but if you are interested, you can read more on the Internet Protocol. For messages to go overseas, e.g. to Europe or Asia, the messages travel through cables that have been laid down under the sea.


A Map of the Submarine Cables in the World (Source)


This grid of cellular towers would not work if all the towers were on the same frequency range. You can think of a frequency “band” as a channel over which communication can occur. If you and I try to use the same channel at the same time, our messages could interfere with each other and disrupt service. Towers that are next to each other therefore cannot be on the same frequency band. The hexagonal organization of the towers also allows for the same frequency to be reused with some spatial separation.
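One way to picture the reuse constraint is as a graph-coloring problem: no tower may share a band with a neighbor. Here is a toy greedy assignment over a hypothetical 7-cell cluster (the layout and the greedy strategy are my own illustration, not how real networks plan frequencies):

```python
# A center cell ringed by six neighbors, as in a hexagonal cluster.
neighbors = {
    "center": ["n", "ne", "se", "s", "sw", "nw"],
    "n": ["center", "ne", "nw"], "ne": ["center", "n", "se"],
    "se": ["center", "ne", "s"], "s": ["center", "se", "sw"],
    "sw": ["center", "s", "nw"], "nw": ["center", "sw", "n"],
}

def assign_bands(neighbors: dict[str, list[str]]) -> dict[str, int]:
    """Give each tower the lowest-numbered band unused by any assigned neighbor."""
    bands: dict[str, int] = {}
    for tower in neighbors:
        used = {bands[n] for n in neighbors[tower] if n in bands}
        bands[tower] = min(b for b in range(len(neighbors)) if b not in used)
    return bands

bands = assign_bands(neighbors)
# Sanity check: no two adjacent towers ended up on the same band.
assert all(bands[t] != bands[n] for t in neighbors for n in neighbors[t])
print(bands)
```

The same few bands get reused over and over across the grid, just never in adjacent cells, which is what lets a whole country run on a limited slice of spectrum.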

For each cell tower, there are hundreds or even thousands of phones trying to use cellular services at the same time. So how do cellular communication systems solve this problem? The answer lies in a technique called multiplexing. The basic idea of multiplexing is dividing the channel into different buckets by time or frequency and putting different messages in different buckets. Below is a depiction of what time-domain multiplexing looks like (where the different colors represent different users of the channel). Since cell phones operate at frequencies in the gigahertz range, they are able to fit in many time “buckets” per unit time.


Time Domain Multiplexing (Courtesy of Prof. Joseph Kahn)
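The time-slot idea can be sketched in a few lines: interleave the users’ bit streams round-robin into slots on one shared channel, then split them back apart on the other end. This toy version uses one bit per slot and assumes equal-length streams:

```python
def tdm_combine(streams: list[str]) -> str:
    """Round-robin interleave: slot from user 0, then user 1, then user 2, ..."""
    return "".join("".join(slot) for slot in zip(*streams))

def tdm_split(channel: str, num_users: int) -> list[str]:
    """Undo the interleaving: user i owns every num_users-th slot, offset by i."""
    return [channel[i::num_users] for i in range(num_users)]

users = ["0000", "1111", "0101"]      # three users sharing one channel
channel = tdm_combine(users)
print(channel)                         # '010011010011'
print(tdm_split(channel, 3))           # back to the three original streams
```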

Similarly, you can do the same thing in the frequency domain. Below is what frequency-domain multiplexing looks like (where again the different colors represent different users of the channel):


Frequency-Domain Multiplexing (Courtesy of Prof. Joseph Kahn)

You can combine the two schemes to maximize the amount of information per unit time. This is where the concept of bandwidth comes into play. If we have high bandwidth, we can fit many more “buckets” in the channel and therefore transmit information at a higher rate.
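The bandwidth-rate connection is made precise by the classic Shannon-Hartley theorem, C = B \log_2(1 + \mathrm{SNR}), which ties together the bandwidth and SNR concepts from earlier. A quick calculation with made-up but plausible numbers:

```python
from math import log2

def channel_capacity(bandwidth_hz: float, snr: float) -> float:
    """Shannon-Hartley: C = B log2(1 + SNR), the maximum reliable bit rate.
    Note: snr here is a plain power ratio, not decibels."""
    return bandwidth_hz * log2(1 + snr)

# 20 MHz of bandwidth at an SNR of 1000 (30 dB): roughly a 199 Mbit/s ceiling.
print(channel_capacity(20e6, 1000) / 1e6, "Mbit/s")
```

Doubling the bandwidth doubles the ceiling, while doubling the SNR only nudges it, which is why spectrum is such valuable real estate.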

Putting it all together

In summary, this is what happens when you press send on your message.

  1. Your message gets encoded into a series of bits that represent the information you are trying to convey in a concise and efficient manner.
  2. Your phone’s internal computer figures out how to modulate the signal so that it can be sent out as electromagnetic radiation.
  3. The cell phone’s antenna radiates the message with some meta-information (e.g. who the recipient is, what kind of data it is) to the nearest cell tower.
  4. The cell tower receives the message and decides which node in the network is the best to send the message to.
  5. Step 4 repeats as the message arrives at the cell tower closest to your friend.
  6. The final cell tower radiates the same signal to your friend’s phone.
  7. Your friend’s phone demodulates, decrypts, and decompresses the signal and displays it on the screen.

If you have any questions or want more resources, feel free to email me, and I’d be happy to help.


I’d like to thank Prof. Tsachy Weissman for advising me on this project and providing me with guidance and enthusiasm at every step of the way. I’d also like to thank Professors Jon Fan, Andrea Goldsmith, and Joseph Kahn for taking the time to meet with me and sharing the resources that made this post possible.

Outreach Event @ Nixon Elementary

On March 17, 2019, as part of the EE376A course, I presented my work to a group of students from Nixon Elementary School in Stanford, CA. They ranged from K-5th grade and found the topic pretty fascinating. I didn’t realize that most of them didn’t own cell phones, but they were all familiar with the ones that their parents use. It certainly was difficult explaining these topics at a 1st-grade level, but it made writing this post a lot easier, as I had to really think deeply about these topics and how they could be simplified. Below is a picture of the poster that was presented. I also had a deconstructed cell phone and was able to show them the various components on the phone’s board like the various antennas, microphones, speakers, etc.

[Photo: the poster presented at the outreach event]