By: Alejandro Macias, Krish Dev, Nhi Huynh, Corina Chen, Michael Sun
Disinformation and misinformation about climate change have circulated almost since scientists first linked human activity to rising carbon dioxide emissions and, in turn, global warming. As early as 1980, oil industry organizations such as the American Petroleum Institute spread disinformation about climate change to steer public opinion in favor of the fossil fuel industry.
Building on existing research, our work analyzes the connection between climate change misinformation and disinformation and the actors that originate or spread it.
Over the seven-year period from 2013 to 2020, use of the hashtag #climatechange on Twitter increased by an average of 50%. According to Twitter, the discourse around climate change has risen significantly throughout the years. Twitter debates about the climate problem are increasingly being influenced by automated bots, which are responsible for a quarter of all tweets related to the climate crisis. Some topics have a higher percentage: bots account for 38% of tweets about “fake science” and 28% of all tweets about ExxonMobil, an oil company with a history of climate denial. In an effort to counter incorrect information about climate change, Twitter has prohibited advertisements that are misleading and contrary to the scientific consensus on climate change. Twitter’s announcement is part of a broader effort across social media to stop misinformation about climate change.
In terms of consequences, inaccurate information about climate change has misled the public, discouraged politicians from taking action, and prevented or slowed support for mitigation programs. It has already contributed substantially to slow or absent progress in fighting climate change. First, a persistent, decades-long campaign of disinformation is a significant factor in the public’s misconceptions about climate change. It reduces acceptance of climate change and trust in people’s knowledge of it. Second, conservatives are disproportionately affected by climate disinformation, which has deepened political divisions in recent years.
The aim of this research is to discuss online misinformation as it pertains to climate change and to address the following key questions: (1) How has social media, specifically Twitter, impacted climate change by changing the narrative of climate activism and spreading correct information on the topic? (2) Does Twitter play a significant role in improving climate literacy, especially in this modern, technological world? (3) How does media underreporting of climate change affect climate action?
We first created Twitter API developer accounts, which gave us programmatic access to the data and actions available to a user on Twitter. Opening a developer account provides an API key, an API token, and a bearer token, which authenticate requests to the API. With these credentials, we could download tweets onto our computers for analysis. Next, we identified tweets, public figures, and climate change activists who actively engage in climate change and climate awareness conversations on Twitter and recorded them in Docs. We then used Tweetdeck to look for tweets and hashtags that are frequently repeated or generated by automated bots. Tweetdeck offers Twitter's search function across multiple columns, so many searches can be run at the same time. After that, we used Tweepy in Python to collect tweets related to climate change and to count how many times different terms were repeated in 100 tweets related to a specific hashtag. Tweepy is a Python library for accessing the Twitter API, which let us retrieve and work with Twitter data.
Using the library's functions, we retrieved the data from the Twitter API. As the final step, we used Python to create visualizations of the data that we collected.
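The term-counting step described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in the study: the sample tweets, the stop-word list, and the function name `term_counts` are all hypothetical, and the tweets are assumed to have already been retrieved through the Twitter API.

```python
import re
from collections import Counter

# Hypothetical sample; in the study, tweets were retrieved via the Twitter API.
SAMPLE_TWEETS = [
    "Climate change is a scam #climate",
    "The climate crisis is real and urgent #climate",
    "Another climate joke from the media #climate",
]

# Small illustrative stop-word list (an assumption, not the authors' list).
STOP_WORDS = {"a", "an", "and", "from", "is", "the"}


def term_counts(tweets):
    """Count how often each lowercase word appears across all tweets."""
    counts = Counter()
    for text in tweets:
        # Keep alphabetic words only, so a hashtag loses its leading '#'.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


if __name__ == "__main__":
    for term, n in term_counts(SAMPLE_TWEETS).most_common(5):
        print(term, n)
```

A `Counter` keeps the tally in one pass over the tweets and exposes `most_common`, which is convenient for feeding the most frequent terms into a bar chart.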
As shown in the first image, we filtered through millions of tweets and collected 100 that use the keyword “#climate”. We then printed the tweets to the terminal so that we could work with them more efficiently, as seen in the second image.
We were then able to analyze the tweets containing the key words “no climate change” and “lie” to narrow down our findings.
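One way to narrow a set of tweets to those containing such key words is sketched below. The whole-word matching rule and the names `matches_any` and `filter_tweets` are assumptions made for illustration; the original analysis may have matched text differently.

```python
import re

# Key words taken from the text; the matching rule itself is an assumption.
DENIAL_PHRASES = ("no climate change", "lie")


def matches_any(text, phrases=DENIAL_PHRASES):
    """Return True if any phrase occurs in the text as whole words."""
    lowered = text.lower()
    # Word boundaries stop "lie" from matching inside words like "believe".
    return any(
        re.search(r"\b" + re.escape(p) + r"\b", lowered) for p in phrases
    )


def filter_tweets(tweets, phrases=DENIAL_PHRASES):
    """Keep only the tweets that contain at least one of the phrases."""
    return [t for t in tweets if matches_any(t, phrases)]
```

Plain substring matching would be simpler, but it would also flag unrelated words that merely contain a key word, which adds noise to the filtered set.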
This paper has presented how social media, specifically Twitter, shapes the way we think about information on climate change. Using the Twitter API and Tweepy, we were able to examine the effects of climate activism and how bots interact with it. The spread of false information on climate change has misled the general public, stopped governments from acting, and slowed down or stopped funding for mitigation efforts. In addition, it decreases belief in people’s knowledge and acceptance of climate change. We ultimately concluded that social media platforms such as Twitter play a vital role in climate literacy in today's technological world and that the underreporting of climate change deserves wider attention.
As a team, we experimented with new resources such as Tweepy and PyTorch to scrape tweets into our data set, create visualizations, and analyze our tweets to narrow down our findings. With hashtags such as ‘#climate_emergency’ and ‘#climate_crisis’, we were able to draw conclusions about misinformation and where it stems from. In the tweets that we pulled, the comments included denial terms such as “joke” and “scam”. With this in mind, we are able to form a more informed judgement on this issue and advise others to do the same.
Current Results
Based on these results, we infer that this method of sentiment analysis could achieve similar results in other fields, such as politics (measuring how many times a candidate mentions a specific term) and advertising (measuring the effectiveness of a campaign's messaging). Such analysis can provide better insights for formulating strategies and optimizing specific operations.
That said, to achieve more accurate results with less error and noise, we need to make our calculation and retrieval methods more rigorous. This may include, but is not limited to, collecting more data points, applying stricter filters during word collection to raise the bar for key words and thereby eliminate further noise in the data, and incorporating other relevant information that gives more context to words we may not have included in the original data retrieval.
These improvements may also yield new visualizations of our data that offer further insights. By building new models, such as linear regression or classification models, we may be able to discover relationships through the visuals that were not evident in theory.
Arjun Barrett, Arz Bshara, Laura Gomezjurado González, Shuvam Mukherjee
In the wake of the SARS-CoV-2 pandemic, video conferencing has become a critical part of daily life around the world and now represents a major proportion of all internet traffic. Popular commercial video conferencing services like Zoom, Google Meet, and Microsoft Teams enable virtual company meetings, interactive online education, and a plethora of other applications. However, these existing platforms typically require consistent internet connections with several megabits per second of bandwidth per user, despite the use of state-of-the-art video compression techniques like H.264 and VP9. These requirements pose a major challenge to global multimedia access, particularly in underserved regions with limited, inconsistent internet connectivity. We implement an optimized version of the Txt2Vid video compression pipeline, which synthesizes video by lip-syncing existing footage to text-to-speech results from deep-fake voice clones. Our implementation is available as an accessible web application for omnidirectional, ultra-low bandwidth (~100 bits per second) video conferencing. We employ a novel architecture utilizing ONNX Runtime Web, an efficient neural network inference engine, along with WebGL-based GPU acceleration and fine-tuned preprocessing logic to enable real-time video decoding on typical consumer devices without specialized CUDA acceleration hardware. We evaluate the perceived quality-of-experience (QoE) of the platform versus traditional video compression techniques and alternative video conferencing programs via a subjective study involving real-world contexts and a wide range of subject demographics, including participants from potential markets such as Colombia. The promising QoE results and low hardware requirements of our platform indicate its real-world applicability as a means of bringing high-quality video conferencing to developing regions with poor internet connectivity.
In the last decade, the growing success of the internet and of the mobile electronics industry has revolutionized global communications. Alternatives to the telephone are skyrocketing in popularity, and brand new solutions are introducing video as a means of improving communication (Fernández et al, 2014). In fact, mobile video traffic now accounts for more than half of all mobile data traffic, and it was predicted that nearly four-fifths of the world’s mobile data traffic would be video by 2022 (Cisco, 2017).
Indeed, video conferencing solutions have opened the door to new applications within a wide variety of fields, from remote expertise applications such as medicine or law to corporate, work, and home environments (Biello, 2009). With the advent of global lockdowns caused by the COVID-19 pandemic, video conferencing traffic increased approximately two- to threefold (Silva et al, 2021). The daily use of traditional video conferencing services and applications such as Zoom, Microsoft Teams, and Google Meet experienced dramatic growth, as shown in Figure 1.
In the midst of the lockdown, the implementation of video conferencing tools became more critical than ever by directly influencing basic human needs such as access to health, work and education. For millions of people around the world, video conferencing tools became the only way to attend classes.
While a switch to online life has worked well for developed nations, it has not been so successful in the developing nations of the world. Two thirds of the world’s school-age children experience poor connectivity and lack of internet access, which has impeded their education especially during the pandemic (UNICEF, 2020). Even before the COVID-19 lockdown, the lack of infrastructure and transportation in low-income regions made it difficult for millions of people to access health and education services in person, as doing so would require them to walk many miles and cross dangerous geographical barriers (Portafolio, 2022). Even beyond the pandemic, accessibility to video conferencing solutions would make it possible to bring remote education, health and work opportunities to historically disconnected areas around the world.
However, traditional video conferencing services such as Zoom and Google Meet generally require stable internet connections, often with a consistent bandwidth of several hundred kilobits per second per user (on the order of 100,000 bps or more), despite the use of state-of-the-art video compression techniques such as H.264, VP9, and AV1. The lack of global access to high-quality internet connectivity is exacerbating pre-existing social and economic inequalities worldwide: without a low-bandwidth alternative, video conferencing will remain inaccessible to many underserved regions.
These factors indicate that the implementation and real-world evaluation of an ultra-low bandwidth video conferencing platform would represent a new world of possibilities for video conferencing communications and could help reduce inequity in health, education, and work opportunities worldwide.
Background and Literature Review
Video compression is a core technology that enables delivery and consumption of video. The term “bandwidth” characterizes the amount of data that a network can transfer per unit of time; in other words, the maximum amount of data an internet connection can handle (R. Prasad et al, 2003). Typically, the maximum data rate is measured in bits per second (bps). The bandwidth required for raw, uncompressed video is so high that, without compression, delivery and consumption would be impractical even over a high-end network connection. For example, a full HD (1920×1080) video stream at 30 frames per second, with 8 bits per sample and 4:2:0 chroma subsampling (12 bits per pixel on average), needs about 746 Mbps uncompressed. Most broadband connections to homes in the United States and other developed regions have lower bandwidth than that, and the number is substantially lower for developing and under-developed regions. Video streamed today is often compressed at ratios of at least 100:1 using sophisticated compression technologies.
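The 746 Mbps figure can be reproduced with a short calculation. The sketch below assumes a 1920×1080 frame and interprets "8 bits" as 8 bits per sample, so that 4:2:0 subsampling averages 1.5 samples (12 bits) per pixel; the function name is illustrative.

```python
# Average number of samples stored per pixel for common chroma formats.
SAMPLES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5}


def raw_video_bitrate_bps(width, height, fps, bits_per_sample=8, chroma="4:2:0"):
    """Bitrate of uncompressed video in bits per second."""
    bits_per_pixel = bits_per_sample * SAMPLES_PER_PIXEL[chroma]
    return width * height * fps * bits_per_pixel


if __name__ == "__main__":
    raw = raw_video_bitrate_bps(1920, 1080, 30)
    print(f"raw: {raw / 1e6:.1f} Mbps")
    # A 100:1 compression ratio, as mentioned in the text.
    print(f"at 100:1: {raw / 100 / 1e6:.2f} Mbps")
```

Even at a 100:1 ratio the stream still needs several megabits per second, which is why conventional codecs lower resolution and frame rate considerably on weak connections.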
Codecs are compression technologies with two primary components: an encoder that compresses a file or a video stream, and a decoder that decompresses it. Codecs are used extensively in commercial video conferencing platforms and are therefore the main technology behind video conferencing communications. A variety of state-of-the-art video codecs exist, including VP8, VP9, and AV1; they are used when streaming, sending, or uploading videos. The purpose of these codecs is to compress videos to reduce the required bitrate while keeping the quality high (THEOplayer). VP8 is an open-source, royalty-free video codec that was released in 2010 as part of the WebM project. VP9, the next-generation codec from the WebM project, was released in 2013 and achieves about 40% to 50% higher compression than VP8 at the same quality. Today, the vast majority of YouTube videos are compressed with VP9. Further, video conferencing platforms such as Google Meet use both VP8 and VP9 extensively. Building on the success of VP8 and VP9, an industry consortium known as the Alliance for Open Media was established in 2015 to further advance royalty-free video codec development and deployment. The first codec developed by this consortium, released in 2018, was AV1 (Y. Chen et al, 2018). AV1 achieves about 30% to 35% more compression than VP9 at similar quality, and today many videos on popular streaming platforms are likely to be using AV1. However, in 2020 MPEG released its latest codec, Versatile Video Coding (VVC) (J. Han et al, 2021), which surpasses AV1 by about 15% in compactness according to most estimates. While VVC is extremely sophisticated, its higher encoding complexity makes its usage in videoconferencing applications somewhat limited.
Even with the latest advances in video codec technology over the last decade, video and audio streamed for videoconferencing applications often require hundreds of kilobits per second (kbps) of bandwidth for acceptable quality of experience. This is of course much higher than what can be supported over the internet infrastructure in developing and under-developed regions in the world. Therefore, we created a platform that would not even need to send a live video feed while still providing a high quality of experience.
Recent advances in artificial intelligence (AI) and video compression have opened the door to new video conferencing techniques such as Txt2Vid, which utilizes a custom compression pipeline that decreases the conventionally required bandwidth for video conferencing.
Txt2Vid, originally developed by Tandon et al. (2021), is a novel video compression pipeline that dramatically reduces data transmission rates by compressing webcam videos to a text transcript. Essentially, Txt2Vid synthesizes video by lip-syncing existing footage to text-to-speech results from deep-fake voice clones, as shown in Figure 2.
At the encoder, the pipeline extracts a driving video and audio, assigns them to a user identifier (ID), and transcribes the audio into text that is sent to the decoder. The decoder uses a voice cloning model to convert text to speech (a “TTS” engine)¹ and applies a lip-syncing model called Wav2Lip (Junho Park et al, 2008) to the synthesized speech and the driving video to generate the reconstructed video, as seen more clearly in Figure 3. Effectively, the text is transmitted and decoded into a realistic recreation of the original video using a variety of deep learning models. While conventional approaches generally require over 100 kilobits per second (kbps) of bandwidth, Txt2Vid requires only about 100 bits per second (bps) beyond the initial driving video for similar audiovisual quality. Txt2Vid therefore achieves roughly 1000 times better compression than state-of-the-art audio-video codecs.
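A rough back-of-the-envelope estimate shows why a text transcript needs only on the order of 100 bps. The speaking rate (150 words per minute), average word length (6 characters including the space), and uncompressed 8-bit characters used below are illustrative assumptions, not figures from the Txt2Vid work.

```python
def transcript_bitrate_bps(words_per_minute=150, chars_per_word=6, bits_per_char=8):
    """Approximate bitrate of a live text transcript (all defaults are assumptions)."""
    chars_per_second = words_per_minute * chars_per_word / 60
    return chars_per_second * bits_per_char


def compression_ratio(conventional_bps, txt2vid_bps):
    """How many times less bandwidth the text-based stream needs."""
    return conventional_bps / txt2vid_bps


if __name__ == "__main__":
    print(transcript_bitrate_bps())         # estimated bps for plain text
    print(compression_ratio(100_000, 100))  # 100 kbps versus 100 bps
```

Under these assumptions the transcript comes to about 120 bps before any text compression, which is consistent with the ~100 bps figure, and 100 kbps against 100 bps gives the factor of 1000 quoted above.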
Although this technique has proven to be highly effective for ultra-low bandwidth contexts, it required a software development toolkit to run, and was not integrated into an easy-to-use interface. Moreover, it also required an expensive NVIDIA graphics processing unit (GPU) with the CUDA toolkit. There was a need for implementing a more accessible solution with lower computational complexity to bring the final product to the consumer market.
We therefore investigated a means of integrating the Txt2Vid pipeline into an optimized user-facing application.
Our research focuses on both integrating Txt2Vid into a program with an accessible user interface and evaluating the result. For our implementation, we engineered a custom WebRTC-based web application. Our evaluation consisted of a subjective study that measures the real-world applicability of our web application by qualitatively comparing our implementation to a traditional video conferencing platform based on the AV1 video codec.
Although traditional video codecs such as VP8, VP9, and AV1 offer high quality video with a high-bandwidth connection, they have proven to perform poorly when bandwidth is limited. In order to enable anyone to take advantage of the quality and bandwidth improvements that the Txt2Vid methodology offers, we implemented Txt2Vid into a web application and utilized a variety of technologies to make it accessible even in low-income regions.
First, we utilize WebRTC to establish our peer-to-peer data channels and share driving videos. When sufficient bandwidth is available, WebRTC lets us use a high-quality compressed video stream that dynamically responds to changes in network conditions. Alternatively, if the website is opened on a device with a poor network connection, our application detects that audiovisual quality will suffer and seamlessly switches from WebRTC’s RTP-based video exchange to a Txt2Vid scheme carried over efficient WebRTC data channels. The new Txt2Vid session uses a driving video recorded from the last 5 seconds of the RTP session, practically eliminating the initial driving video overhead. With a WebRTC connection established, the application can share text transcripts of speech instead of the full video with all the other peers in a call, reducing required bandwidth from over 100 kbps to about 100 bps.
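The fallback behavior described above can be modeled as a small state machine. The sketch below is written in Python purely for illustration (the actual application runs in the browser), and the class name, the 100 kbps threshold, and the frame representation are hypothetical.

```python
from collections import deque


class CallSession:
    """Toy model of switching from live video to Txt2Vid on low bandwidth."""

    def __init__(self, fps=30, driving_seconds=5, min_video_bps=100_000):
        # Ring buffer holding only the most recent `driving_seconds` of frames.
        self.recent_frames = deque(maxlen=fps * driving_seconds)
        self.min_video_bps = min_video_bps  # hypothetical switching threshold
        self.mode = "video"
        self.driving_video = None

    def on_frame(self, frame):
        """Record a frame received while live video is active."""
        if self.mode == "video":
            self.recent_frames.append(frame)

    def on_bandwidth_estimate(self, bps):
        """Switch to Txt2Vid when the estimate drops below the threshold."""
        if self.mode == "video" and bps < self.min_video_bps:
            # Reuse the buffered frames so no new driving video must be sent.
            self.driving_video = list(self.recent_frames)
            self.mode = "txt2vid"
        return self.mode
```

The bounded `deque` discards old frames automatically, so the buffer always holds the last few seconds of video. This sketch only models the switch into Txt2Vid mode; switching back when bandwidth recovers is left out.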
Although WebRTC helped create peer-to-peer connections, we implemented the full Txt2Vid scheme from scratch on the next layer of the application stack. After receiving a text transcript from a peer, our application uses the Resemble.ai speech synthesis engine to create a text-to-speech result mimicking the phonetic qualities of the original speaker. It then lip-syncs the driving video to the text-to-speech generation via Wav2Lip. Since the lip-syncing model typically requires a high-performance computer to run in real time, we enabled GPU acceleration via WebGL, a graphics pipeline for web applications based on OpenGL. We created a novel WebGL shader for ConvTranspose, a component of the Wav2Lip neural network that had not previously been implemented in GLSL, and contributed it back to the ONNX Runtime Web neural network inference engine as open-source software. By utilizing WebGL instead of CUDA (a GPU compute framework exclusive to NVIDIA graphics cards), we dramatically improve Wav2Lip’s performance with GPU acceleration on low-end devices and thereby reduce the hardware requirements to use the application. We also created a face tracking algorithm based on a performance-optimized version of the Pico algorithm (A. Koskela et al, 2021) to pass high-quality face crops to the Wav2Lip model and optimize the resulting lip-sync quality while minimizing CPU load during inference.
¹ Resemble, RESEMBLE.AI: Create AI Voices that sound real., accessed 2021. [Online]. Available: https://www.resemble.ai
Resemble.ai is intended to be replaced by a free, open-source voice cloning service in the future; however, we have nonetheless integrated support for custom Resemble voices securely and efficiently to maximize usability. Resemble is integrated via a suite of REST API wrappers in both the backend and frontend code, resulting in efficient network bandwidth usage and easy future replacement with an alternative service or platform. Since each member in a video call may create a custom voice on a different Resemble.ai plan or account, we securely exchange credentials to enable realistic speech synthesis for each peer in the call. We protect user security by asymmetrically encrypting the Resemble.ai API key on the frontend via an RSA-4096 public key (with the corresponding private key stored securely on our custom-built backend). To generate speech, the application uses the peer’s encrypted API key to make a call on our backend, which decrypts the credentials and forwards the request to Resemble.ai. As only the encrypted API key is saved in users’ browsers and sent to untrusted peers, peers and potential attackers who intercept it cannot recover the plaintext key needed to access the users’ Resemble.ai accounts.
Finally, our app employs Progressive Web App technologies to minimize bandwidth concerns in developing regions. After the website code and pre-trained Wav2Lip model are downloaded, the site uses a custom script called a service worker to automatically save the files to the user’s computer such that opening the site in the future will not require re-downloading any data unless the site is updated.
Quality-of-Experience (QoE) is the overall subjective acceptability perceived by the end-user of a service or application (Kuipers et al, 2010), so evaluating QoE requires metrics that can objectively assess user satisfaction. There are multiple QoE evaluation techniques available to test the applicability and acceptability of our web platform implementation. Nonetheless, QoE evaluation by definition involves subjective complexity, given the presence of factors not necessarily related to the service’s performance, such as the user’s mood (Serral-Gracià et al, 2010). The available literature can be largely categorized into two main techniques to evaluate QoE: the measurement of numerical metrics that tend to be associated with human visual and auditory perception (Kuipers et al, 2010), or the deployment of traditional subjective studies involving voluntary human subjects and statistical analysis of their responses. For the purpose of our research, both techniques were explored and it was decided that a subjective study better suits the QoE evaluation of our web platform implementation.
The video conferencing experience involves multiple aspects, so QoE must account for several qualities of the service, including video quality, audio and speech quality, and audio-video synchronization. Given the technology behind our web platform, the objective measures available for evaluating QoE would be the quality of the video artificially generated from the driving video and the latency introduced by the Wav2Lip model, which affects audio-video synchronization.
Out of the existing objective techniques, Video Multi-method Assessment Fusion (VMAF) is the most promising full-reference objective video quality assessment model. VMAF was developed by Netflix and Professor C.C. Jay Kuo from the University of Southern California. The VMAF method seemed to be especially convenient since it uses already existing image quality metrics (e.g. visual information fidelity or detail loss metric) in order to predict video quality, emphasizing metrics that are attuned to human visual preferences (García et al, 2019). VMAF uses a supervised learning regression model that provides a single VMAF score per video frame.
Alongside the VMAF score, the VMAF software can report several established full-reference metrics. One example is Peak Signal-to-Noise Ratio (PSNR), which expresses the ratio between the maximum possible power of a signal and the power of the distorting noise that affects the quality of its representation (NI, 2020). Similarly, PSNR-Human Vision System modified (PSNR-HVS) is a PSNR variant that additionally considers contrast sensitivity (NI, 2020). Structural Similarity Index Measure (SSIM) predicts the perceived quality of digital television and cinematic pictures by measuring the similarity between two images, using one of them as the reference (MathWorks, 2021). Multi-scale SSIM (MS-SSIM) and the CIEDE2000 color image quality assessment are the other common metrics reported with VMAF.
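For concreteness, PSNR is conventionally computed as 10·log10(MAX² / MSE), where MAX is the largest possible sample value and MSE is the mean squared error between the reference and the distorted signal. The sketch below applies this to flat lists of 8-bit pixel values; it is a plain illustration of the formula and not the implementation used by the VMAF software.

```python
import math


def mse(reference, distorted):
    """Mean squared error between two equally sized sequences of samples."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)


def psnr_db(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in decibels (infinite for identical inputs)."""
    error = mse(reference, distorted)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


if __name__ == "__main__":
    # Two tiny "images" given as flat lists of 8-bit pixel values.
    print(psnr_db([100, 100], [110, 90]))
```

Because PSNR depends only on per-pixel differences, it rewards sharpness and fidelity to the source frame, which is the property questioned in the discussion of video conferencing QoE.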
Although we explored and tested VMAF using the publicly available software (GitHub – Netflix/vmaf, 2018), the metrics that VMAF uses focus primarily on video quality for streaming services rather than quality-of-experience for end-users in a video call. The PSNR metric, for example, mostly reflects characteristics like clarity and sharpness, which matter for the user’s experience in television and streaming settings but do not accurately measure user satisfaction in video conferencing. Even traditional high-bandwidth platforms experience low clarity and momentary image distortions, since image acceptability has a lower threshold when other factors like communication play a larger role (Serral-Gracià et al, 2010).
Another potential point of objective evaluation was audio-video synchronization, which refers to the relative timing of the sound and image portions of a television program or movie. In Txt2Vid, it would refer to the synchronization between the artificially generated voice and the image of the user in the video call. Audio-video synchronization can be measured using artificially generated video test samples by separating the video component from the audio component and assigning them markers that allow any desynchronization to be detected and analyzed numerically (Serral-Gracià et al, 2010). Standard recommendations state that the viewer detection thresholds of audio/video lag are about +45 ms to −125 ms, and that acceptance thresholds are about +90 ms to −185 ms for video broadcasting (Lu, Y. et al, 2010). However, as with the VMAF metrics, desynchronization plays a larger role for streaming services than for video conferences, where tolerances are significantly more flexible. Therefore, desynchronization measurements are not an accurate measure of real QoE for our implementation.
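The broadcast thresholds quoted above can be expressed as a simple classifier. In the sketch below the sign convention (positive offsets meaning audio ahead of video) is an assumption, since the text does not state it, and the labels are illustrative.

```python
# Thresholds in milliseconds, taken from the figures quoted in the text.
# Assumed sign convention: positive = audio ahead of video.
DETECTION_RANGE_MS = (-125, 45)
ACCEPTANCE_RANGE_MS = (-185, 90)


def classify_av_offset(offset_ms):
    """Classify an audio/video offset against the broadcast thresholds."""
    low, high = DETECTION_RANGE_MS
    if low <= offset_ms <= high:
        return "undetectable"
    low, high = ACCEPTANCE_RANGE_MS
    if low <= offset_ms <= high:
        return "detectable but acceptable"
    return "unacceptable"


if __name__ == "__main__":
    for offset in (0, 60, -150, 120):
        print(offset, classify_av_offset(offset))
```

The asymmetric ranges reflect that the thresholds tolerate a larger offset in the negative direction than in the positive one.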
Based on this analysis, we concluded that video conferencing is a complete experience that involves dynamic interactions beyond the scope of any convenient audiovisual metrics or numeric parameters. Objective evaluation techniques alone are therefore not well suited to a real-world QoE study of our platform.
Once we had established a usable web video conferencing platform, we were able to use a demo version in the browser. Rather than evaluating with an objective study that focuses on metrics that humans do not normally perceive in the video conferencing context, we opted to conduct a subjective study to prioritize user experience and obtain more representative results.
In general terms, prior literature presents multiple techniques for conducting subjective evaluations. In the field of QoE, the final goal is to analyze the acceptability of a service based on a person’s real preferences, so it requires a sample of human subjects from the target demographic. Surveys are a well-suited means of collecting data that is later analyzed statistically to determine the level of acceptability of the service being tested. To keep the study valid and generalizable from sample to population, the survey must be designed so that it primarily addresses general human perceptions without any prompting. While it is true that subjective evaluations can be influenced by variables unrelated to the quality per se (e.g., personal preferences, external lighting, use of headphones, mood, etc.), we concluded that those are still factors that play a role in the user's experience of similar applications. Hence, they still add value to our research purpose: implementing our web solution in real-world contexts.
Accordingly, we conducted a subjective study with the objective of comparing our web implementation with traditional codecs, particularly AV1, the most recent of the royalty-free codecs discussed above.
As shown in Figures 4 and 5, the demo browser version of our implementation was designed so that we could either use Txt2Vid or disable it and use the AV1 codec. We also had a slider to control the precise bandwidth at which the codec runs.
Since our goal was to evaluate whether our implementation offers better QoE than traditional platforms in low-bandwidth contexts, we used our demo version to simulate academic and educational lectures. Those simulations within our platform were recorded, yielding a total of 6 videos grouped into 3 pairs. Each pair consists of two videos with the same content: one contains a lecture given using AV1 at the minimum possible bandwidth (a ~10 kbps, or ~10,000 bps, stream), and the other contains the same lecture given using Txt2Vid with only ~100 bps. In this way, we could simulate low-bandwidth conditions within a single platform.
As a result, each pair of videos consisted of Video 1, which was recorded without using Txt2Vid, and Video 2, which was recorded using Txt2Vid. The 3 pairs were integrated into an online survey that was designed² to determine the subject’s preference between the two videos. Questions were kept general and provided space for each subject to explain their reason for preferring one video over the other, as well as to rate each video independently. We manually verified all responses against our quality thresholds, ultimately removing ~31% of the total number of submissions because of incomplete answers or surveys filled in less than 5 minutes. It is worth noting that our subjects spanned a wide range of backgrounds and demographics, with ages ranging from 13 to 60+ and participants from both the United States and Colombia, two of our potential target markets.
With health and education being the principal fields we considered as potential use cases for our implementation, a significant number of our results were obtained by partnering with one of the largest Colombian diagnostic institutions, Medical Diagnostic Institute (IDIME). IDIME deployed our survey within their organization, which helped ensure reliable and trustworthy results. Another important proportion of our subjects were high school students from a Colombian official educational institution. The rest of the results were obtained from crowdsourcing and public contributions, yielding a total of 188 submissions.
In total, 125 complete survey responses were considered. Respondents were asked to compare recordings from two video calls with the same content: one using Txt2Vid (~100 bps stream) and one using AV1 (~10 kbps stream). They were then asked to rate each individually with a score of 0 to 5.
As shown in Figures 6 and 7, in all three pairs over two thirds of respondents preferred the Txt2Vid generation to the AV1-compressed video call, despite the Txt2Vid implementation using about 100x less bandwidth. It is worth mentioning that each pair was recorded under different lighting, with different background noise and by different people, in order to reflect a real-world context. The more pronounced preference for Txt2Vid in the third pair was likely due to the use of a higher-quality driving video, compared to the more static driving videos used in pairs 1 and 2.
Respondents also rated each video in every pair on a scale from 0 (terrible) to 5 (excellent). The arithmetic means and standard deviations of their ratings are shown in Table 1.
Table 1. Arithmetic means of respondents’ ratings per video (columns: Video Pair 1, Video Pair 2, Video Pair 3; row: rating arithmetic mean).
In general terms, ratings for our Txt2Vid implementation are visibly higher in all three video pairs, with Video Pair 3 having the highest mean rating, 3.83.
Respondents showed a clear preference for the videos where Txt2Vid was used over those using AV1. This trend was expected, since conventional codecs usually require high data transmission rates to achieve good video quality, and AV1 was tested under poor simulated bandwidth conditions because that is when Txt2Vid would be most useful. However, as seen in Figure 6, there is a notable difference between video pairs 1, 2, and 3. While the proportion of respondents that preferred AV1 in the first pair is ~33.3%, it is ~22.8% for the second pair and only ~6.5% for the third. Although participants preferred Txt2Vid in general, Txt2Vid was more favored relative to AV1 in the third video pair than in the first. This was likely because the driving video in the first pair appeared more static and emotionless than the one in the third pair, which lowered the overall Txt2Vid quality. This result indicates the importance of a good driving video for improving QoE in our implementation.
Additionally, the QoE ratings for Txt2Vid are concentrated mainly between 3 and 4 (on a scale from 0 to 5), with fewer scores of 0 and 1 than AV1. In fact, as shown in Table 1, Txt2Vid’s mean rating ranges from 3.44 to 3.83, while AV1’s mean rating ranges from 2.5 to 2.7. In addition, the highest Txt2Vid mean rating was also given in video pair 3, where the driving video was considerably better than in the other pairs.
The standard deviation of the Txt2Vid video from the third pair is substantially lower than the standard deviation of all the other videos. Respondents’ ratings for it were therefore more homogeneous and closer to the mean (3.83), which is also the highest mean rating of all videos; in other words, not only did more people prefer Txt2Vid in the third pair, but the majority of them also rated it favorably. This favorable rating is a positive indicator of our implementation’s applicability.
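As a reminder of how these two statistics relate, the sketch below computes the arithmetic mean and standard deviation for two sets of 0 to 5 ratings. The ratings are made up purely for illustration and are not the survey data; the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether the population or sample formula was used.

```python
from statistics import mean, pstdev

# Hypothetical ratings on the 0-5 scale (NOT the study's responses).
ratings = {
    "Txt2Vid": [4, 4, 3, 4, 4],  # tightly clustered around the mean
    "AV1": [1, 4, 2, 5, 1],      # widely spread
}

# Map each video to (arithmetic mean, population standard deviation).
summary = {name: (mean(scores), pstdev(scores)) for name, scores in ratings.items()}

for name, (avg, spread) in summary.items():
    print(f"{name}: mean = {avg:.2f}, standard deviation = {spread:.2f}")
```

A lower standard deviation means the individual ratings sit closer to the mean, which is why a high mean combined with a low standard deviation indicates broad agreement among respondents.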
Respondents were also asked to explain their preference. Those choosing AV1 over Txt2Vid in the first pair mainly attributed their choice to the video quality. It is also notable that the open responses agreed that AV1 seemed more natural and realistic than Txt2Vid, but only in the first pair. This is, once again, likely due to the driving video used for Txt2Vid in the first pair. In contrast, the overwhelming majority of the respondents that chose Txt2Vid over AV1 in the third pair attributed their choice to the audio quality.
These insights allowed us to identify not only that Txt2Vid is strongly preferred over existing commercial codecs in low-bandwidth conditions, but also the weaknesses and strengths of our platform, as well as key factors to consider in order to guarantee a higher QoE.
One of the main strengths Txt2Vid showed over AV1 is the speech generation of our implementation. Not only did the Resemble.ai voice cloning prove realistic enough for users, but respondents also showed a preference for the audio quality in Txt2Vid. Since Txt2Vid generates the speech in the decoder, it avoids background noise and distortions, a clear advantage over commercial codecs, where speech suffers quality loss. Conversely, although Txt2Vid video quality was still strongly preferred by respondents, those who preferred AV1 attributed their choice to the unnaturalness that can occur when the driving video is not ideal. Improving video realism and guaranteeing good driving videos are therefore key factors in improving Txt2Vid’s QoE.
Overall, the subjective study was favorable for our Txt2Vid web implementation and its applicability.
As demonstrated in this paper, it was possible to use WebGL to enable GPU acceleration and to create a novel WebGL shader, which removed the usual need for a high-performance computer to run the Txt2Vid lip-syncing model. In this way, we could implement an in-browser Txt2Vid-based platform that shows promise as an ultra-low-bandwidth alternative to traditional video conferencing platforms. The videos from our Txt2Vid platform received higher QoE scores than those encoded with a state-of-the-art video codec while using 100 times lower bandwidth. The several performance and bandwidth optimizations implemented within the web application we developed meant that standard consumer devices could run the lip-syncing inference nearly in real time on a poor internet connection, making the platform suitable for older devices in low-income regions.
The present paper focuses mainly on the applicability of our platform as a tool to bring connectivity and multimedia access to underserved regions, which generally have a poor internet connection. Even so, the favorable results from the subjective study and the promising advances in web acceleration in our implementation open the possibility for Txt2Vid to be deployed as an AI-based solution that changes the way video conferencing takes place. Applied in accessible in-browser alternatives like our platform, the same technology overcomes what used to be the greatest limitation of video conferencing: the internet connection. The applicability of this research therefore goes beyond the original scope, and it can be considered as a tool for other low-bandwidth communication contexts. For example, it could accelerate development in areas such as marine or space exploration, where poor internet connectivity hinders scientific progress, or serve commercial purposes as an add-on alternative for existing high-bandwidth platforms.
The primary point of potential future research is an alternative realistic speech synthesis solution to Resemble.ai. The use of Resemble requires all users of the platform to have previously created an account and trained a custom voice on Resemble’s limited free plan. It also dramatically increases the real-world bandwidth requirements from the theoretical 100 bps (though even counting Resemble.ai, data usage is still substantially lower than a traditional video codec for similar quality). Moreover, Resemble adds several seconds of latency to the video call as the browser session waits for the API call to resolve, while the platform would otherwise have had comparable latency to standard video codecs.
An open-source voice cloning tool that can operate on the few seconds of audio recorded during the initial driving video transfer over WebRTC would eliminate both the bandwidth and latency overheads caused by Resemble.ai.
Another area for continued research is computational complexity. Although we were able to address this point by utilizing OpenGL instead of CUDA to improve Wav2Lip’s performance with GPU acceleration, these deep-learning models still account for the majority of the execution time of our application. Although we considered other web acceleration techniques, some merit further research, in particular MIL WebDNN, an open-source software framework for fast execution of pre-trained deep neural network (DNN) models in the web browser (MIL, 2022), which presents novel approaches. Optimizing our models to make them more lightweight could also be investigated. As AI and related technologies continue to be developed over time, we expect performance to improve overall.
In terms of QoE, the subjective study could be expanded to assess more precisely the usability of our platform among a larger population, compared to more codecs, under different bandwidth conditions, and even by allowing respondents to experience the whole video-conferencing process instead of showing them pre-recorded calls. One of the most promising resources for expanding the study is Amazon Mechanical Turk, a crowdsourcing marketplace that outsources processes to a distributed workforce who can perform these tasks virtually and that is widely used for survey participation (AMTurk, 2022). These resources could bring new insights from global workforces, augment data collection and analysis, and accelerate machine learning development (AMTurk, 2022). In that way, we could identify more accurately how to improve our web implementation and bring it to the real market.
We thank our mentor, Sahasrajit Sarmasarkar, for his continued guidance throughout the project and for his help in testing and planning the design for our web application and QoE evaluation. We would also like to thank Pulkit Tandon and the other authors of the original Txt2Vid paper for their groundbreaking research in the field of low-bandwidth video conferencing, which served as the basis for our project.
Fernández, C., Saldana, J., Fernández-Navajas, J., Sequeira, L., & Casadesus, L. (2014). Video Conferences through the Internet: How to Survive in a Hostile Environment. The Scientific World Journal, 2014, 1–13. https://doi.org/10.1155/2014/860170
Gladović, P., Deretić, N., & Drašković, D. (2020). Video Conferencing and its Application in Education. JTTTP – JOURNAL OF TRAFFIC AND TRANSPORT THEORY AND PRACTICE, 5(1). https://doi.org/10.7251/jtttp2001045g
Serral-Gracià, R., Cerqueira, E., Curado, M., Yannuzzi, M., Monteiro, E., & Masip-Bruin, X. (2010). An Overview of Quality of Experience Measurement Challenges for Video Applications in IP Networks. Lecture Notes in Computer Science, 252–263. https://doi.org/10.1007/978-3-642-13315-2_21
García, B., López-Fernández, L., Gortázar, F., & Gallego, M. (2019). Practical Evaluation of VMAF Perceptual Video Quality for WebRTC Applications. Electronics, 8(8), 854. https://doi.org/10.3390/electronics8080854
Junho Park, & Hanseok Ko. (2008). Real-Time Continuous Phoneme Recognition System Using Class-Dependent Tied-Mixture HMM With HBT Structure for Speech-Driven Lip-Sync. IEEE Transactions on Multimedia, 10(7), 1299–1306. https://doi.org/10.1109/tmm.2008.2004908
Resemble, RESEMBLE.AI: Create AI Voices that sound real., accessed 2021. [Online]. Available: https://www.resemble.ai
R. Prasad, C. Dovrolis, M. Murray and K. Claffy, “Bandwidth estimation: metrics, measurement techniques, and tools,” in IEEE Network, vol. 17, no. 6, pp. 27-35, Nov.-Dec. 2003, doi: 10.1109/MNET.2003.1248658.
Chen, Y., Mukherjee, D., Han, J., Grange, A., Xu, Y., Parker, S., . . . Liu, Z. (2020). An Overview of Coding Tools in AV1: The First Video Codec from the Alliance for Open Media. APSIPA Transactions on Signal and Information Processing, 9, E6. doi:10.1017/ATSIP.2020.2
Y. Chen et al., “An Overview of Core Coding Tools in the AV1 Video Codec,” 2018 Picture Coding Symposium (PCS), 2018, pp. 41-45, doi: 10.1109/PCS.2018.8456249.
J. Han et al., “A Technical Overview of AV1,” in Proceedings of the IEEE, vol. 109, no. 9, pp. 1435-1462, Sept. 2021, doi: 10.1109/JPROC.2021.3058584.
Despite decades of effort to improve the gender disparity in engineering, women represent only 13% of engineers today, as stereotypes, unsupportive environments, and neurological differences turn them away from the progressing field. By interweaving and building upon existing projects that have worked to close the gender gap in engineering, we plan to create a curriculum-based website to educate women and heighten their interest in engineering. Through games, career quizzes, research, communication forums, and professional accounts in this field, our project seeks to expand the number of female engineers in the workforce.
In the 21st century, the adoption of new technologies is common and the progression of technology is rapid. Amid these swift changes, engineering is at the forefront. According to data collected by the Bureau of Labor Statistics in 2019, employment in engineering occupations is expected to grow 6% from 2020 to 2030. According to US workforce data collected in 2018, there are approximately 2 million engineers. However, out of the 2 million, only 13% are women. Despite rapid technological innovation, gender inequality in engineering persists after years of efforts to combat it. While women are paid a median annual salary of $75,000 as engineers, their male counterparts are making $89,000. This large gap deters young women from considering becoming engineers.
The shortage of young women pursuing engineering can be clearly seen in recent statistics collected from high school environments. For example, statistics collected by the CollegeBoard show that only 19% of AP Computer Science test takers were women. This small percentage suggests that resources for exploring engineering are less accessible to young women. To corroborate this with first-hand experience, we conducted an informal survey among rising high school seniors in San Francisco. One high schooler stated that in her AP Computer Science A class, there are only five women out of 35 total students. This small number puts the issue into context. It is reasonable to infer that if young women take little interest in engineering in high school, they are less likely to pursue it in college and as a career.
According to the 2009 study “Ambient Belonging: How Stereotypical Cues Impact Gender Participation in Computer Science,” stereotypical environments can lower women’s interest in computer science from 0.5 to -0.5 (on a scale where -1 is no interest and 1 is complete interest) compared to non-stereotypical environments. This supports the idea that a positive environment must be fostered for women in order to build their sense of belonging and their confidence and to encourage them toward an engineering career. When female software engineers, mechanical engineers, and electrical engineers make up only a small part of the field (18%, 14%, and 10% respectively), young women are discouraged from taking an interest in engineering. A sense of belonging and a supportive community are important in providing young women with a successful path to an engineering career.
There have been countless efforts to close the gender gap by providing free coding boot camps, but many fall short in the way their information is presented. For example, the youth-led organization GurlsWired leads a free virtual coding boot camp in the summer. While this is a great resource for young women to start exploring computer science, it lacks a compelling presentation. The flyer does not specify which age group or experience level the camp is best suited for, and it lacks credibility-building information such as who will be leading the camps and what coding experience they have. This can potentially lead to a smaller number of attendees and turn away potentially interested audiences. The resource could be improved by specifying this information. With our website, herplusengineer.org, we seek to serve as a bridge for young women by offering accessible engineering resources that are tailored to beginners exploring the field.
After considering many forms of outreach, we decided on creating a website called herplusengineer.org to share engineering resources catered toward young women. As digital media has become more accessible and understandable thanks to the ubiquity of consumer electronics, an eye-catching online platform would be the most effective tool to spark a passion for engineering in as many girls as possible.
The website begins by grabbing the attention of visitors through a picture-based career quiz. The results of the quiz forward the visitor to the matching career page, which explains what that job involves, what skills and education are needed, and examples of popular companies that hire those kinds of engineers. This method personalizes the learning experience for the user, keeping them engaged and motivated to learn about what kind of engineering is best suited for them.
After learning a bit about what different engineering fields are like, the user can move on to the personal experiences of female role models in engineering. We believe that connecting with women who share a similar background will further motivate girls to follow in their footsteps and become engineers themselves. The women we chose to interview are not particularly famous, because we decided it was most important for these stories to come from people who are relatable to the average visitor.
Finally, the resources pages offer plentiful videos and articles relevant to the engineering field of interest. These include vlogs with an insider view of what it is like to be an engineer for a day, introductory lessons on critical technical skills, fascinating novel inventions, and more. These videos and articles are all media that somebody in the field would find interesting, so we hope they can induce the same kind of curiosity in a girl who sees them with fresh eyes.
Ultimately, each section of the website has a unique way of reaching out to the visitor about the opportunities available as a woman in the STEM field. Whether with a personalized quiz, a woman’s true story, or a plethora of eye-catching content, our website will surely assist in the movement for closing the gender gap in engineering.
The Career Quiz
The career quiz is a multi-step quiz that women can use to try and find a specific field in engineering that might work well with their skills and interests.
The quiz first asks the user to choose, from ten different options, which object they would most like to build.
After the user chooses an option, the quiz asks them a specific question about what part of the object they would be most interested in working on.
From their answers, the user is recommended a career that best suits them along with links to other parts of our website and external resources that introduce that career.
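The two-step flow described above (pick an object, answer a follow-up question about it, receive a recommendation) can be sketched as a nested lookup table. The objects, questions, options, and career mappings below are hypothetical placeholders, not the actual content of the quiz on herplusengineer.org.

```python
# Hypothetical quiz content: object -> follow-up question and answer-to-career map.
QUIZ = {
    "robot": {
        "question": "Which part of the robot would you most like to work on?",
        "options": {
            "the code that controls it": "Computer Science",
            "the moving parts": "Mechanical Engineering",
            "the circuits and sensors": "Electrical Engineering",
        },
    },
    "phone": {
        "question": "Which part of the phone would you most like to work on?",
        "options": {
            "the apps": "Computer Science",
            "the case and buttons": "Mechanical Engineering",
            "the battery and chips": "Electrical Engineering",
        },
    },
}


def follow_up_question(chosen_object: str) -> str:
    """Step 2: return the question tied to the object chosen in step 1."""
    return QUIZ[chosen_object]["question"]


def recommend_career(chosen_object: str, chosen_part: str) -> str:
    """Step 3: map the pair of answers to a recommended career."""
    return QUIZ[chosen_object]["options"][chosen_part]


print(follow_up_question("robot"))
print(recommend_career("robot", "the moving parts"))
```

Keeping the quiz content in a single table like this would make it straightforward to add more objects or careers without changing the lookup logic.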
The Career Resources Pages
The career resources pages are a collection of resources for the fields of Computer Science, Mechanical Engineering, and Electrical Engineering.
First, each page features numerous day-in-the-life videos of career professionals working in that field and students in university majoring in engineering. The goal of these videos is to show aspiring female engineers what life would be like if they continued to pursue a career in engineering and motivate them.
In addition, we provide several introductory videos for the engineering fields and tutorials on basic skills and topics. These videos provide women with a more comprehensive understanding of what these fields entail.
Furthermore, these pages also feature articles that cover the real-world applications of their respective engineering fields. For example, the articles for the computer science resource page include projects that identify mutations in DNA that cause cancer and a robotic dog that uses machine learning.
The Meet the Professionals Page
The Meet the Professionals pages contain biographies of women who work in engineering.
Our hope is that these biographies will help relay advice from women working in engineering fields that may help motivate and encourage students.
The biographies detail their educational pathway, gender-fueled barriers they have faced and how they have overcome them, and advice they have for aspiring female engineers. These individuals will serve as role models for young women who are curious and/or anxious about their engineering pursuits.
The Careers Pages
Last but not least is the careers page. Users of the website can access information on ten different careers in engineering. They are given general information about each career including a general description, education requirements, necessary skills, and companies that hire these kinds of engineers.
The purpose of this page is to help inform users of the website of the different careers in engineering. This may help them narrow down a specific career path they want to follow.
Take a look at the “Careers resources page” above for an example. This photo shows our description of biomedical engineering.
Above are the skills, education requirements, and specialties for biomedical engineers.
Lastly we have examples of major employers of biomedical engineers.
After extensive research on the gender gap in engineering, our group was able to create a website that provides resources, support, encouragement, and information to young women interested in pursuing a career in engineering. Some of our research, namely conversations with women in the field, showed that sustaining interest in engineering requires both female role models in the field and information that allows young girls to picture themselves in an engineering career. This research greatly contributed to our website. Because of these findings, we created the Meet the Professionals and Careers pages. Many more pages were added to further address these needs.
We worked to provide a variety of resources including words from women across multiple fields of engineering, a long list of career examples and information, articles, vlogs, introductory videos, and research on the gender gap in engineering. We were able to create a quiz so that women may narrow their interests in the field and pursue something they find fascinating or familiar.
We believe that the large variety of resources available through our website will resonate with young girls interested in engineering. By fostering this interest, they will be encouraged to pursue engineering. Each time a person is able to further explore her interest on our website and choose a career path related to this interest, we will be reducing the gender disparity in the engineering field.
Our website has the potential to reach many young women, and gradually close the gender gap in engineering.
Looking forward, we hope to improve the website by refining its current features, developing new features, and increasing its reach, in order to encourage interest in engineering in young women even more effectively.
One feature that we hope to expand is our resource pages. We currently have resource pages for mechanical engineering, electrical engineering, and computer science that include vlogs of individuals working in the industry, introductory/educational videos, and articles that are relevant to each engineering career.
In the future, we hope to expand on this feature and provide resources for a wider range of engineering fields, including aerospace, environmental, chemical, biomedical, civil, materials, and industrial engineering. Of course, there is a possibility for more expansion by providing resources for more obscure engineering careers, but we believe that these more mainstream careers will be a great start to allowing young women to discover the numerous engineering disciplines that they may choose to pursue.
One feature that our group has yet to develop is a communication forum on our website. Throughout our interviews with professionals, a common piece of advice that they shared was the necessity of a support system involving other women and role models. It would be amazing to provide a platform for women to establish connections with those who share similar interests and gain advice from others further along in their careers. This forum would be open to young girls interested in engineering, as well as professionals, college students, and other women in engineering who would like to share their own experiences and knowledge with others. This feature would be monitored so that women may pursue their interests freely and safely. We hope that a chat forum will allow women to connect with each other, share achievements/innovations, and ask questions that will allow them to further explore this field.
In addition to improving features, efforts such as community engagement and partnerships would be beneficial in reaching aspiring female engineers. One feasible option would be to form partnerships with middle and high schools. Our research has shown that the gender gap in engineering can be traced back as early as high school, so sharing our website with students would be beneficial. We can encourage access by sharing our platform with school districts and partnering with them to offer our website as a resource for those exploring a passion for engineering. With this partnership, teachers will be able to encourage students to explore the website and further their interests. Because the resource will have come from a teacher or faculty member who hopes to nurture the student’s interest in engineering, students will already be exposed to the support our group hopes to reflect on this website.
Outside of educational expansion, there are many organizations that hope to raise the number of women in STEM fields. Some of these organizations include The Association for Women in Science, The Global Alliance for Diversifying the Science and Engineering Workforce, The Society of Women Engineers, and Women in Engineering Program & Advocates Network (WEPAN). These organizations all hope to make an impact similar to the vision we had for our website. They provide extensive resources, information, and support for women in engineering fields. Because of our shared visions, reaching out to these organizations and working with them may allow our website to reach a larger audience.
There are endless future directions for our website, and we hope that our continued efforts will inspire and foster countless women to become accomplished engineers.
: “Electrical Engineer Demographics and Statistics : Number of Electrical Engineers in the US.” Electrical Engineer Demographics and Statistics : Number Of Electrical Engineers In The US, 18 Apr. 2022,
Social and emotional recognition are fundamental aspects of children’s development, namely their ability to regulate their own emotions and properly understand those of others. However, while children’s literature can aid in developing their emotional competence, many children struggle to recognize emotions expressed in writing: unlike verbal communication, where emotions are articulated through tone of voice, facial expressions, and physical gestures, text offers no such cues, and children often find it difficult to comprehend intended emotions as they read. We aim to improve their literary emotion detection using Natural Language Processing (NLP). Our NLP-based application works by taking text as input and utilizing the EmoRoBERTa natural language processing tool to output the text’s main conveyed emotion.
Social and emotional learning (SEL) has been defined in numerous ways. Broadly, SEL consists of a set of social, emotional, behavioral, and character skills necessary in effectively navigating everyday tasks. The Collaborative for Academic Social Emotional Learning (CASEL), a leading research center and influential advocate for SEL inclusion within schools, recognizes SEL as a set of five competencies: self-awareness, social awareness, relationship skills, self-management, and responsible decision making. More recently, the Wallace Foundation model identifies the three domains of SEL as cognitive regulation (attention control, inhibitory control, cognitive flexibility), emotional processes (emotion knowledge, expression, and regulation, and empathy or perspective taking), and social/interpersonal skills (social cues, conflict resolution, etc.).
Together, these skills allow people to develop healthy identities, manage emotions, and understand the perspectives of others. This plays a vital role in establishing and maintaining positive, supportive relationships, and guiding responsible short- and long-term decision-making skills. In short, SEL develops the necessary interpersonal, intrapersonal, and cognitive skills to succeed in school, the workplace, and relationships.
While SEL in schools may seemingly detract from time spent on academics, considerable research suggests that SEL is indeed a necessary foundation for both academic and career success, and can even facilitate learning. Children who are able to effectively manage their thinking, attention, and behavior are also more likely to have better grades and higher test scores; studies have shown as much as an 11-percentile-point improvement on standardized test scores. In addition, children with higher teacher-rated social and emotional competence in early childhood are more likely to graduate, attend college, and have a job 20 years later. They are also less likely to face mental health challenges, have criminal justice involvement, or receive public assistance in young adulthood. Thus, SEL is crucially beneficial to children’s development.
TEACHING SEL THROUGH LITERATURE
An individual’s ability to recognize emotions is part of their emotional competence. Emotional competence comprises three key components: (1) the recognition of emotions, (2) the expression of emotions, and (3) the experience of emotions. The recognition of emotions involves one’s ability to perceive the emotional state(s) of oneself and others. This also entails identifying differences between inner emotional states and outer expression, and at more mature levels, understanding that emotional-expressive behaviors can greatly affect others. Emotional expression encompasses one’s ability to evaluate, regulate, adapt, and respond to these emotions, as well as the ability to use normatively accepted vocabulary to express emotions. The appropriate experience of emotions involves one’s ability to recognize and regulate emotions of varying intensity. It also includes the capacity for emotional self-efficacy where one can confidently accept and embrace their emotional experience.
The development of these skills begins in early childhood and is largely shaped by one’s social context. Typically, this means one’s family experience and relationships with parents, teachers, and peers. However, it also includes other forms of media that instruct children’s recognition, expression, and experience of emotions. Fairy tales, for example, introduce concepts of morality, imagination, danger, decision-making, proper etiquette, and social norms. Fairy tales’ didactic nature will often teach children the interrelations between social and emotional behaviors. The story’s ultimate outcome raises the awareness that “the structure or nature of relationships is in large part defined by how emotions are communicated within the relationship, such as by the degree of emotional immediacy or genuineness of expressive display and by the degree of emotional reciprocity or symmetry within the relationship” (Saarni 5). Hence, the stories of the characters will teach children the impacts of their own emotional experiences.
In addition, as children are provided the perspectives of multiple characters, they are guided into recognizing emotions within others and forming judgements about the characters and situations themselves. They are furthermore led to develop more advanced competency skills, as they familiarize themselves with conventional emotional vocabulary and recognize differences between characters’ internal thoughts and external emotional representation. At an early age, children typically find it easier to express their emotions through physical gestures (a smile, laugh, etc.) than through words. Consequently, teaching appropriate oral language is crucial for children to learn to cope with, express, and regulate their emotions. Children’s literature thus supports social and emotional maturation.
NLP EMOTION DETECTION
Natural language processing (NLP) has a history of more than 50 years as a scientific discipline, with applications to education appearing as early as the 1960s. Initial NLP work dates back to the late 1940s, with a focus on text-to-text translation across multiple languages. Systems used dictionary lookup of appropriate words and manual word reordering after translation. Chomsky later introduced the idea of generative grammar, revolutionizing syntactically accurate translation. In recent years, efforts have shifted to improving human-computer conversation with speech recognition, statistical analysis and prediction, conversational agents/chatbots, facial recognition, auto-complete recommendation analysis, emotion detection, and natural language generation.
Typically, emotion detection works by training a model on a data set and then applying the trained model to new text, a technique known as opinion mining. Opinion mining, also known as sentiment analysis, is crucial for a computer to understand emotions and feelings in text. It uses NLP and computational linguistics to identify and decode the sentiment behind a text.
This paper applies NLP emotion detection to research on SEL. While children’s literature plays an instrumental role in developing emotional competence, many children struggle with reading comprehension and with identifying the intended emotion(s) within literary texts. As manual annotation of texts is time-intensive and costly, we use literary emotion detection to support children’s educational and emotional needs, creating an app for children to run texts through. We then conduct sentiment analysis, examining emotion-related patterns within children’s books and comparing them to those in other literary texts.
MATERIALS AND METHODS
The application uses the EmoRoBERTa NLP model, which is derived from BERT (Bidirectional Encoder Representations from Transformers), a deep learning model published by Devlin and his colleagues at Google in 2018. BERT is based on the Transformer architecture, which learns the context of words bidirectionally, and was trained on Wikipedia (2,500M words) and BooksCorpus (800M words).
Our application uses EmoRoBERTa to classify sequences of text as one of 28 emotions. The data set included 60+ Grimms’ Fairy Tales, 50 standard children’s books, 50 TIME for Kids articles, 5 TIME articles, 4 novels (Metamorphosis, Frankenstein, Strange Case of Dr. Jekyll and Mr. Hyde, and Moby Dick), and 3 Shakespeare plays (Hamlet, The Tempest, and Twelfth Night). Many of these were downloaded from the Project Gutenberg website, an open library offering over 60,000 free eBooks.
As noted earlier, children’s stories, especially fairy tales, help develop children’s emotional lives and introduce them to many moral concepts that remain with them throughout their lives. The Brothers Grimm’s fairy tales include some of the most popular fairy tales read today, such as Little Red Riding Hood, Rumpelstiltskin, Hansel and Gretel, and The Golden Goose. They were also chosen for analysis because they are freely available. Other texts were selected as points of comparison. For example, Hamlet (a tragedy), Twelfth Night (a comedy), and The Tempest (which contains elements of both tragedy and comedy) were selected to compare emotion frequencies across different genres.
Data pre-processing consisted of data cleaning and text normalization. This involved removing stop words using the Natural Language Toolkit (NLTK) library in Python before extracting bigrams. Regular expressions were also used to normalize irregular spacing and remove punctuation in the top bigrams. NLTK was also used to tokenize the data, splitting the text into sentences. These sentences were then classified using the EmoRoBERTa model, which labeled their main emotion and provided a score indicating the model’s confidence in its evaluation of that emotion. The score ranged from 0 to 100, with higher numbers representing more confidence.
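The cleaning steps described above can be illustrated with a minimal, standard-library sketch. It is not the project’s actual code: the regular-expression sentence splitter stands in for NLTK’s sentence tokenizer, the tiny STOP_WORDS set stands in for NLTK’s much longer stop-word list, and lowercasing every word is our own assumption.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); NLTK ships a much longer one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "was", "it", "that"}

def normalize(text):
    """Collapse irregular whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", normalize(text)) if s]

def content_words(sentence):
    """Lowercase, drop punctuation, and remove stop words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]

def top_bigrams(text, n=3):
    """Count adjacent word pairs within each sentence and return the n most common."""
    counts = Counter()
    for sentence in split_sentences(text):
        words = content_words(sentence)
        counts.update(zip(words, words[1:]))
    return counts.most_common(n)

sample = "The old woman sat   down. The old woman was tired!  It was a long time."
print(split_sentences(sample))
print(top_bigrams(sample, n=1))
```

Each sentence returned by split_sentences would then be passed to the emotion classifier; that step is omitted here because it depends on the model itself.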
Overall, four main variables were analyzed from and across the texts: (1) emotion frequency, (2) emotion distribution, (3) sentence word count, and (4) word complexity.
Emotion frequency was calculated by counting the number of sentences assigned a specific emotion by EmoRoBERTa.
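As a rough illustration of this count, the sketch below tallies one label per sentence and reports each emotion’s share of the text. The function name and the sample labels are hypothetical; the real labels would come from the classifier.

```python
from collections import Counter

def emotion_frequency(labels):
    """Map each emotion to (sentence count, percentage of all sentences)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        emotion: (count, round(100 * count / total, 1))
        for emotion, count in counts.most_common()
    }

# Hypothetical labels, one per sentence, in reading order.
labels = ["joy", "neutral", "fear", "joy", "neutral", "neutral"]
print(emotion_frequency(labels))
```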
The “emotion distribution” was approximated to view the “timeline” of emotions; while the emotion frequency displays the total count of emotions, it does not provide any insight into the concentrations of these emotions throughout the text. Emotion distribution, on the other hand, illustrates the varying amounts of each emotion at different parts of the story. It was calculated by categorizing each of the 28 emotions as positive, negative, or neutral, and counting the number of positive, negative, and neutral emotions in a given segment. The 28 emotions were classified as follows:
The text was then segmented into 5-20 pieces (depending on the length of text) and the sum was taken of the positive, negative, and neutral emotions in each segment.
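The segmentation step can be sketched as follows. The POLARITY mapping is a hypothetical stand-in that covers only a handful of labels, and splitting into near-equal contiguous pieces is one plausible reading of the procedure, not a confirmed detail.

```python
from collections import Counter

# Hypothetical polarity mapping for a few labels only (assumption).
POLARITY = {
    "joy": "positive",
    "admiration": "positive",
    "fear": "negative",
    "sadness": "negative",
    "neutral": "neutral",
}

def segment(items, k):
    """Split items into k contiguous pieces whose sizes differ by at most one."""
    size, extra = divmod(len(items), k)
    pieces, start = [], 0
    for i in range(k):
        end = start + size + (1 if i < extra else 0)
        pieces.append(items[start:end])
        start = end
    return pieces

def emotion_timeline(labels, k):
    """Count positive, negative, and neutral labels in each of k segments."""
    timeline = []
    for piece in segment(labels, k):
        counts = Counter(POLARITY[label] for label in piece)
        timeline.append({p: counts.get(p, 0) for p in ("positive", "negative", "neutral")})
    return timeline

# Hypothetical per-sentence labels in reading order.
story = ["neutral", "fear", "fear", "sadness", "joy", "joy", "admiration"]
for i, counts in enumerate(emotion_timeline(story, 3), start=1):
    print(i, counts)
```

Plotting the three counts against the segment index gives a timeline of the kind described in the text.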
Sentence word count was found by counting the number of tokens in a given sentence.
“Word complexity” was determined using the wordfreq tool. Wordfreq estimates the vocabulary difficulty of words from how commonly they are used. Based on their frequency, words are assigned Zipf values to determine their vocabulary level and are classified as follows:
Zipf 5 − 8: Beginner
Zipf 3 − 5: Intermediate
Zipf 0 − 3: Advanced
The Zipf values of the words in each sentence were averaged to give the sentence’s vocabulary level.
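A minimal sketch of the word-count and word-complexity calculation is shown below. The TOY_ZIPF table holds made-up values for illustration only; in practice the lookup could be replaced by wordfreq’s zipf_frequency(word, "en"). Because the published bands share their boundaries at 5 and 3, this sketch assigns a boundary value to the easier band, which is our assumption.

```python
# Hypothetical Zipf values, for illustration only (assumption).
TOY_ZIPF = {"the": 7.7, "cat": 4.9, "sat": 4.6, "photosynthesis": 3.1, "quixotic": 2.2}

def zipf_level(value):
    """Map a Zipf value to the vocabulary bands used in the text."""
    if value >= 5:
        return "Beginner"
    if value >= 3:
        return "Intermediate"
    return "Advanced"

def sentence_stats(sentence, zipf=TOY_ZIPF.get):
    """Return (word count, mean Zipf value, vocabulary level) for a sentence."""
    tokens = sentence.lower().split()
    # Words missing from the table count as 0.0, the rarest possible value.
    values = [zipf(token, 0.0) for token in tokens]
    mean = sum(values) / len(values) if values else 0.0
    return len(tokens), mean, zipf_level(mean)

print(sentence_stats("The cat sat"))
print(sentence_stats("quixotic photosynthesis"))
```

Note that any token the lookup does not recognize pulls the sentence average toward the Advanced band.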
EMOTION LABEL COUNTS
Figure 3 displays the percentages of each of the 27 non-neutral emotions in various texts.
The most common emotions in the Grimms’ Fairy Tales were admiration, fear, joy, and sadness, with admiration being the highest, while the least common emotions were embarrassment and pride.
Figure 1 and Figure 2 depict the relative amounts of each emotion label across the different texts. Each chart is color-coded to display the amount of positive (pink/yellow), negative (blue/purple), and neutral (gray) emotion types.
As can be seen, the fairy tales contained the greatest amount of the “neutral” emotion while the novels contained the least. As will later be seen in “Emotions at a Lexical Level,” most text within the fairy tales is simple dialogue or basic character and setting description. The fairy tales consist almost entirely of character dialogue or omniscient narration, which often leaves out the emotions that a character experiencing the actual events would normally express. Hence, many of these sentences read as neutral because they do not contain a specific emotion but merely describe events.
Conversely, the novels contained the least amount of the “neutral” category of emotions. The greater length and complexity of the novels allowed for more character and plot development, and thus emotional depth. The progression of these novels is often driven by emotional development. Their success can also be attributed to their ability to engage the audience with strong and descriptive words.
In terms of non-neutral emotions, children’s books contained the least amount of negative emotions, with fairy tales and novels having roughly the same. However, fairy tales showed a much greater amount of positive emotions, while novels had nearly equal amounts of positive and negative. Fairy tales often instill a sense of fear or danger that the character must overcome in order to teach some sort of moral lesson, while children’s stories remain fairly upbeat from beginning to end, with no sense of moral struggle. These fairy tales typically end on a note of happiness, teaching children that if they abide by certain social codes they will prevail.
Figures 4 and 5 display the emotion counts in TIME and TIME for Kids articles. The TIME articles are mostly neutral, with most positive emotions stemming from “approval,” while the TIME for Kids articles are mostly positive, with the predominant emotions being “admiration” and “joy.”
ORDERING OF EMOTIONS
Figure 7 represents the “emotion timelines,” or distributions of positive (shown in blue), negative (green), and neutral (orange) emotions across the text. In each of the timelines, the “neutral” emotion label was omitted because its frequency dominated the others.
As seen, the fairy tales (shown across the top row) typically begin with a neutral emotion (often a simple line describing the setting) and end on notes of happiness (“happily ever after”).
The distribution and varying levels of positive, negative, and neutral emotions stay fairly consistent with genre and plot development.
The last row displays three Shakespearean plays: one tragedy, one comedy, and one that contains elements of both. As Hamlet is a tragedy, it starts with somewhat even levels of positive and negative emotions, though negative is slightly more prevalent. The text briefly rises in positive emotion (likely during the play-within-a-play scene) and, from about one-fourth of the way in, progressively declines in positive emotion while increasing in negative. In the comedy Twelfth Night, the positive emotions remain consistently more prevalent than the negative and neutral ones. The Tempest’s chart does not exhibit the clear distinction between positive and negative emotions seen in Twelfth Night, though it also does not show the steady increase in negative emotion seen in Hamlet, even ending with less negative emotion than it began with; it is a mix of both.
EMOTIONS AT A LEXICAL LEVEL
The following table shows the leading bigrams for the selection of Grimms’ fairy tales analyzed. Many of these bigrams, such as ('said', 'I') and ('I', 'shall'), indicate dialogue. This is further supported by the fact that the Grimms’ stories are typically written in the third person, so the use of personal pronouns such as “I” is evidence of dialogue or thoughts. The bigrams ('old', 'woman') and ('long', 'time') suggest the simpler descriptions used in the Grimms’ fairy tales, which makes sense as they are typically seen as friendly for children to read.
Emotions were then analyzed with respect to sentence structure. Relationships between sentence word count and sentence-level complexity were analyzed using simple linear regression and product-moment correlation coefficient tests for the neutral, positive, and negative emotions. Figures 8, 9, and 10 are scatter plots of the relationship being studied. Note that lower sentence-complexity Zipf values indicate a more advanced vocabulary level.
For the “neutral” emotion, shown in Figure 8, the line of best fit was calculated as y = 0.023x + 3.33. The correlation coefficient, .183, is greater than its respective critical value at both the 95% and 99% confidence levels, meaning we can reject the null hypothesis that there is no correlation between word count and word complexity. Figures 9 and 10 yielded similar results, with r values of .185 and .176, respectively. Figure 11 shows scatter plots of the neutral, positive, and negative emotions in the other four text types.
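For readers who want to reproduce this kind of test, the sketch below computes a least-squares line, the Pearson product-moment correlation coefficient r, and the t statistic commonly used to test r against zero. The sample data are invented, and the function names are ours; the sample size behind the reported r values is not stated in the text, so no critical value is reproduced here.

```python
import math

def linear_fit_and_r(xs, ys):
    """Return (slope, intercept, Pearson r) for paired observations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Invented data: sentence word counts (x) and mean Zipf values (y).
word_counts = [1, 2, 3, 4, 5]
complexity = [2, 4, 5, 4, 5]
slope, intercept, r = linear_fit_and_r(word_counts, complexity)
print(f"y = {slope:.3f}x + {intercept:.3f}, r = {r:.3f}")
print(f"t = {t_statistic(r, len(word_counts)):.3f}")
```

The null hypothesis of no correlation is rejected when the absolute value of t exceeds the critical value of the t distribution with n - 2 degrees of freedom at the chosen level.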
Our findings suggest a relationship between word count and word complexity. However, while the results aim to shed light on the emotions within the texts, they can also be influenced by the NLP model and how it detects specific emotions within texts. For example, the model often marks sentences containing symbols as exceptionally complex. In a TIME news article discussing COVID, it identified a relatively simple sentence (“so far, APDC members have contributed to identifying three major SARS-CoV-2 variants”) as “advanced,” likely due to the sentence’s inclusion of “SARS-CoV-2 variants.” The model is also not entirely reliable at determining whether a particular part of a sentence expresses a certain emotion. The sentence “then everybody laughed and jeered at her; and she was so abashed, that she wished herself a thousand feet deep in the earth” from the Grimms’ text was marked as amusement; however, while the first half of this sentence expresses amusement, the second conveys feelings of embarrassment. As a whole, though, the model seems generally reliable for larger pieces of text. The emotion timelines for the Shakespeare plays, for example, seem to follow what would be expected of their respective genres.
CONCLUSION AND FUTURE DIRECTIONS
We hope our results and app will help refine natural language processing, provide more insight into literary emotion analysis, and advance SEL-based development.
For future plans, we hope to use our application to help children improve their social and emotional literacy. To do this, we aim to create a fully functioning app with an easy and accessible UI. The app will introduce a robot called EmoBot. Children will then have an opportunity to engage with EmoBot by inputting text for the bot to analyze for emotion. Goals for this app include designing an icon for the robot, a summarizing tool, a synonym and word recommendation tool, and a feature that allows children to guess the main emotion displayed.
We would like to express our immense appreciation for our mentors, Raymond Zhang and Sukhamrit Singh, for introducing us to EmoRoBERTa, helping us develop our research, and giving us the opportunity to participate in the program. We also want to thank the Stanford Compression Forum for giving us this opportunity to develop our research skills.
Collaborative for Academic, Social, and Emotional Learning (CASEL). What is the CASEL framework?, Oct 2021.
Wallace foundation. Wallace Foundation, 2016.
Roger P Weissberg, Joseph A Durlak, Celene E Domitrovich, and Thomas P Gullotta. Social and emotional learning: Past, present, and future. 2015.
Joseph A Durlak, Roger P Weissberg, Allison B Dymnicki, Rebecca D Taylor, and Kriston B Schellinger. The impact of enhancing students’ social and emotional learning: A meta-analysis of school-based universal interventions. Child development, 82(1):405–432, 2011.
Damon E Jones, Mark Greenberg, and Max Crowley. Early social-emotional functioning and public health: The relationship between kindergarten social competence and future wellness. American journal of public health, 105(11):2283–2290, 2015.
Carolyn Saarni. The development of emotional competence. Guilford press, 1999.
Leilani VisikoKnox-Johnson. The positive impacts of fairy tales for children. University of Hawaii at Hilo Hohonu, 14:77–81, 2016.
Elefteria Beazidou, Kafenia Botsoglou, and Maria Vlachou. Promoting emotional knowledge: strategies that Greek preschool teachers employ during book reading. Early Child Development and Care, 183(5):613–626, 2013.
Marion Dowling. Young Children’s Personal, Social and Emotional Development. Sage, 2014.
Antoine Louis. A brief history of natural language processing — part 1, Jul 2020.
What is opinion mining and why is it essential?, 2020.
Rohan Kamath, Arpan Ghoshal, Sivaraman Eswaran, and Prasad B Honnavalli. EmoRoBERTa: An enhanced emotion detection model using RoBERTa. In IEEE International Conference on Electronics, Computing and Communication Technologies, 2022.
Project gutenberg. Project Gutenberg, 2021.
Time for kids, 2022.
Robyn Speer, Joshua Chin, Andrew Lin, Sara Jewett, and Lance Nathan. Luminosoinsight/wordfreq: v2.2, October 2018.
Maya Moseley, Alexander Herman, Casey Chang, Afra Ashraf, Laia Balasubramanian, Carrie Lei
Science, Technology, Engineering, and Math make up the well-known acronym STEM. However, the acronym has room to expand. SHTEAM better encompasses the different disciplines that can come together for a common goal. The added “H” and “A” stand for humanities and art, respectively. By incorporating humanities and art, we look at STEM fields with a more open mind, allowing for more exploration and understanding of often challenging topics.
The humanities and artistic elements allow the communication of STEM research, and the way data is analyzed, to develop and advance as we look at information differently. Our group focused on the intersection between STEM topics, specifically astronomy, the arts, and the humanities. We explored the extensive impact of marginalized groups on the advancement of space exploration. Additionally, we looked at a variety of representations of outer space. Through the use of creative drawings, such as our Apollo 11 Code Art, we looked at things that are irrelevant to the average person, but by creating something visual out of code, we invite conversation and public interest into the field of space exploration. We employed found text, a subjective process in which we look at a research paper full of jargon and pick out the words and phrases that mean the most to us, in order to better understand and form a personal relationship with the topic being researched.
Our report will focus on the intersection between the fields of astronomy, art, and humanities and their complementary nature. This paper is a culmination of six weeks of experiencing and studying the impact art and humanities have on STEM fields.
When referencing ontological concepts, it should be acknowledged that there is so much that we don’t know. In order to effectively explore the unknown, we need to productively break with conventional ways of thinking and creating. So, how do innovation and discovery happen? Well, usually through abstract thoughts and nontraditional processes. If people just decided to accept how things were and not be open to new creations and ideas over the vicissitudes of centuries, life as we know it and many of the luxuries provided to us would not exist. What our group has been creating and researching throughout our time together could also be considered unconventional. To synthesize what we have been doing, one can think of it as “breaking the mold” of traditional thought and creative processes.
The found text process is “breaking a mold” of how we read and interpret something.
Utilizing found text methodologies allows readers to absorb information from a paper that, at first glance, might seem esoteric. There is not only one way to research things effectively. The art pieces we created, our “celestial representations,” broke the mold of how one would traditionally represent ideas. We looked at how trailblazers in astronomy broke the mold of what people thought they could do. They drew on aspects of their being that some would use to marginalize them and persevered over obstacles that were made to limit them. Some would think that being blind, like Wanda Diaz-Merced, or being a woman in a male-dominated field like astronomy, would make it impossible to significantly impact the field. But Diaz-Merced was able to sonify data, and many female leaders in astronomy were able to discover things that their male counterparts could not. Even astronomy itself is like breaking a mold, because it helps people break away from the normative belief that Earth is the only and most important place in existence when, in reality, we exist in a whole universe. To clarify, although many of the activities we did were unconventional, we still referenced and built upon traditional research methods.
What are the motivations for looking into outer space, and for the media frenzy around it, when it is so expensive? In all honesty, the grounds may change over time. It may be to gain public attention. It typically takes a government-funded telescope to look into the sky at a magnification where one can discover more than one could from a backyard. Those telescopes are costly, so public interest is critical if these and other initiatives taken by NASA are to be better funded. Sherelyn Alejandro spoke to us at the start of our time together and explained how, in the 60s, there was a worldwide space race, especially to see who could land on the moon first. Nowadays, a space race has reemerged among billionaires as the privatization of space grows astronomically.
In addition to Wanda Diaz-Merced’s sonification of outer space and our use of the found text process, there are other distinct research methods that we have utilized that were more methodical. We coded a graph using sound bites we created. We looked at variation in flux across nebulae, using galactic latitude and light wavelengths between H-alpha (Hα) and H-beta (Hβ), to look for patterns, as in our interstellar reddening practice. Pattern recognition is vital for efficiency, simplification, and making new ideas easier to understand. We utilized peer-to-peer learning as well, because some people may be more receptive to learning from someone like them. Many of our assignments were kept open to interpretation, which helped facilitate the formation of new ideas and encouraged learning for both mentors and interns.
The Found Text Process
*In our studies, we use the term “found text” as a verb meaning that this is something that is done (an action).*
The found text process allows a reader to think about and interpret information differently than they typically would. To connect to our earlier projects, found text was quite a new concept to us. Below we will explore some articles on found text as well as explain the “what”, “why”, and “how” of the found text process.
Found Text Example A.
Found Text Example B.
What is Found Text?
Found text is a processing method where we make a bridge between being a reader and being an artist. This is achieved by dissecting or examining a piece of text. Found text is a way to better understand a text’s main focus by narrowing down what the reader finds to be the most important information. While creating a found text piece, the artist keeps the integrity of the text intact while inferring their own interpretation. Found text is a unique way to get people to interact with dense or scientific text in an engaging way.
Found text is a creative technique in which a reader dissects or examines a text. We concluded that there are four key components to found text, which are:
Scanning for words/phrases important to you (the reader)
Streamlining your focus and blocking out information you are not specifically looking for
Looking for definitions and explanations
Highlighting things that you do not understand
So, what is so crucial about found text? Well, it is a very effective way to better understand a text’s main focus by narrowing down the most important information. Creating a found text (Examples A and B) allows a reader to narrow down broad topics in order to focus on key information that relates specifically to what they are researching. Additionally, it emphasizes reader engagement through the use of creative elements in the process. There are many different processes one can use in creating a found text, but one way of breaking down the general elements of found text is as follows:
Highlight information you find to be especially relevant to the main idea you are trying to grasp from your readings or research
Black out information you don’t find especially relevant to your personal curiosities. What you choose to black out or highlight is completely up to you.
Add a decorative element (example a+b)
Create a decorative element, specifically images that connect to the text. Doing so will help a reader remember their readings with an interesting, engaging visual representation.
Creating a found text allows one to break away from traditional (think scientific method) experimental design and scientific research processes. Creative techniques and strategies are not typically used to aid our understanding of scientific exchange and processing. Doing so, though, allows information to be interpreted more distinctively than with other widely used methods. Western science is crucial to modern-day society, but it does not need to be the be-all and end-all as a way of thinking. Allowing for new ideas and methods of research may lead to more effective ways to find and interpret information. A found text is a very effective way for a general audience to process research and scholarly texts. Found text can be used to focus research by emphasizing the components of a paper you are reviewing that are specifically applicable to your research. Typically, scholarly papers are written with a field-specific vernacular that may be difficult for an uninformed audience to understand. Trying to read such a paper in the typical head-to-toe style can be burdensome and overwhelming. The found text process allows a reader to dissect a paper one page at a time and better understand and connect to what they are reading.
Rather than passively reading something, the found text process allows one to actively engage with it and therefore grasp more information from it. A traditional way of reviewing scientific articles is to search for an article online, read its introduction, body, and conclusion in order, and perhaps highlight components that seem interesting. The found text process may include those elements but also removes unnecessary information and adds design elements to the annotations that can keep the reader engaged.
Why Create a Found Text?
Found text is a great way to analyze, investigate, explore, and introduce more challenging topics or works of scholarly or investigative text. When presented with a large article or research paper, the reader is often overwhelmed with the density of the information presented. By identifying and selecting ideas and topics that might be confusing or challenging to the reader, they are able to compartmentalize what they are reading. This will help with initially understanding smaller segments of information, allowing for more absorption and retention rather than blankly reading a challenging text. By breaking a long piece of text into smaller sections, focusing on what is most important to that reader, the reader gains a much better grasp and understanding of the main ideas and is more engaged with the text. Additionally, making a found text is fun and visually appealing; it creates a work of art in the process. This changes the reader’s mindset to make the process of reading a long text a more engaging and motivating process.
Why do we use found text in an education/research context (what’s the utility)?
Peer-reviewed articles can be quite dense and hard to interpret by the general population
Simplification of difficult topics
It takes less time because less time is spent trying to understand something that is not crucial to your research
Shows how you interpreted the text and highlighted the major/minor themes.
How to Create a Found Text?
Within our research, we have used Zoom annotations as a prototype for collaboratively creating found texts, but our mentors are designing a more user-friendly way to create found text. Doing found text activities as both solo and group projects, we have discovered many techniques that may work for other artists, students, and researchers. Our process is as follows. We begin by reading the page for about 3-5 minutes, making our initial scan for important words and phrases. We then work out what we do not understand and which material is relevant to that piece of the text. Once the group understands the page, we make our initial markings on the piece. Markings might include circling and highlighting any parts that stand out to us. This is equivalent to an artist making their first sketch on their canvas. After this step is complete, we examine the parts that are not highlighted and block them out. This does not have to be done in any particular color, and it is completely up to the artist(s) how they would like to redact the excess information.
Synopsis of “Boosting the Public Engagement With Astronomy Through Arts”
This paper titled “Boosting the public engagement with astronomy through arts” focuses on immersing the general public in astronomy. It aims to introduce topics that may be confusing or challenging to first-time learners and present those topics in a novel and enticing way.
As stated in the paper, art appeals to human emotion, which is advantageous, especially when used in an educational setting, because “the emotions are a key to long-term memory”. Three books are introduced to assist educators in their introduction and approach to astronomy for first-time students. The first novel, The Photon Starship, explores the life of Aster, a kid born on a spaceship who has never seen Earth. While describing his life, it simultaneously gives a great amount of background detail on astronomical concepts. The book can be used as a way to start a discussion “on exoplanetary systems’ architecture.” Another novel described in the paper is Infra Draconis, which follows a space flight to an object called infra. Infra is later found to be an object that is too small to sustain fusion in its core. This infra is equivalent to what are known as brown dwarfs today. As a result, this novel is a good introduction to brown dwarfs and gives some additional information about what they are composed of. The last book that is described is Stars and Waves.
In contrast to the previous two novels, this book is the work of a professional astronomer who knows and understands the astronomical environment. This book details the various facets of exoplanetary research and portrays a realistic view of life in the astronomical community. It dives into astronomy’s history while providing information about relevant present-day technologies. The main idea behind the novel is finding life in the Universe other than on Earth. The novel stimulates discussion and engages interest in the search for life elsewhere, including how work at observatories is organized and used to further this goal.
These three literary works can provide a plethora of information for astronomical education. They are an engaging way to introduce topics such as binary stars, planet formation, brown dwarfs, and the search for life in the Universe. They can be used to introduce astronomical topics, as a theme for discussion, or as a way for students to troubleshoot some of the originally false claims that were presented in the works.
Our Creations and Projects
The following projects are just a few of the meaningful projects we did during our time together. Additionally, these were all relatively new concepts to us. Below we have described these projects and activities and described our personal interpretations.
Demonstration of Astronomical Concepts Interpreted Through Graphs and Charts
So, what is dark adaptation? Dark adaptation is a process in which a part of the eye, the retina, changes from a photopic to a scotopic state: in other words, from a light-adapted state, where visual acuity is greatest, to a dark-adapted state, where light sensitivity is maximal (after 30 or more minutes) (Ofri, 2008). Increased light sensitivity of the retina in dark environments is caused by “dilatation of the pupil, synaptic adaptation of retinal neurons, and increase in the concentration of rhodopsin available in the outer segments” (Ofri, 2008, Ch.15). Cones in the eye are used for color vision. Rods, which are photoreceptive cells in the retina, are used for black and white vision at low light levels. Additionally, when one’s eyes sense only a minimal level of light, after about 20 to 40 minutes in that condition they start to produce a key chemical called rhodopsin. Also known as “visual purple,” rhodopsin allows night vision to start setting in (O’Connor, 2021).
How does this relate to astronomy? Astronomers realized that “Deep red lights do not trigger the neutralization of the rhodopsin” (O’Connor, 2021). Because of this, astronomers and safety officials use red lights for lighting at night to preserve night vision, which aids our ability to perceive outer space.
Sonification is a way to use audio to observe data. We observed its uses in an article in Nature called “How One Astronomer Hears the Universe” by Elizabeth Gibney. We learned about a computer scientist and astronomer named Wanda Diaz Merced. Diaz Merced is a blind scientist bringing innovations to the field of astronomy with respect to how we ‘see’ celestial phenomena. She designed code that takes photos of space and creates sound from them, as if the night sky were a symphonic orchestra with all the stars and other data points creating a sound. Her code has allowed scientists to perceive data in a completely different manner. Her sonification algorithm makes beautiful music from the data, while also shedding light on things we cannot see with our eyes.
Our project for the theme of sonification was to create graphs of sound bites we made through coding in Python. In addition, we took sounds from the text we were reading and any other clips we chose to practice how sonification graphing worked.
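As a rough illustration of the idea, the sketch below maps a list of data values onto pitches and builds one short sine tone per value. It is a minimal sketch and not the code used in our project or Diaz Merced's algorithm: the function names, the 220 to 880 Hz pitch range, and the brightness readings are all assumptions made up for the example. It uses only the Python standard library.

```python
import math
import struct
import wave


def values_to_frequencies(values, low_hz=220.0, high_hz=880.0):
    """Linearly map data values onto a pitch range (larger value -> higher pitch).

    The 220-880 Hz range is an arbitrary, assumed choice.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values are equal, so use the middle of the pitch range.
        return [(low_hz + high_hz) / 2.0 for _ in values]
    return [low_hz + (v - lo) / span * (high_hz - low_hz) for v in values]


def synthesize(frequencies, note_seconds=0.25, sample_rate=44100, amplitude=0.5):
    """Return 16-bit integer samples: one sine tone per frequency, in sequence."""
    samples = []
    samples_per_note = int(note_seconds * sample_rate)
    for freq in frequencies:
        for i in range(samples_per_note):
            value = amplitude * math.sin(2 * math.pi * freq * i / sample_rate)
            samples.append(int(value * 32767))
    return samples


def write_wav(path, samples, sample_rate=44100):
    """Write mono 16-bit samples to a WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack("<%dh" % len(samples), *samples))


# Hypothetical brightness readings, invented for this example.
brightness = [0.2, 0.5, 0.9, 0.4, 0.7]
tones = values_to_frequencies(brightness)
samples = synthesize(tones)
```

Calling write_wav("stars.wav", samples) would save the tones to a file (the file name is only a placeholder). A linear mapping is the simplest choice; a logarithmic mapping would follow musical pitch more closely.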
To understand what a brown dwarf is, it is important to know the difference between a star and a planet. When seen through a telescope, a star shines by producing its own light, while a planet only reflects light. A star forms from contracting hydrogen, which reaches such high temperatures that it begins to fuse into helium. This process releases a huge amount of energy, causing the star to start shining. A planet, on the other hand, never reaches the size or heat necessary to produce its own light.
A brown dwarf is an object with a mass between that of a giant planet, such as Jupiter, and that of a small star. Because of its smaller mass, it cannot sustain the fusion of ordinary hydrogen as a regular star can. As a result, many scientists call brown dwarfs “failed stars”.
One of our mentors, Afra, wrote a research paper based on the signatures of a specific phase of a brown dwarf’s life. As expected, the paper contains a great deal of information; however, there is a huge distinction between reading and understanding. Our goal was to understand what was written, or at least part of it. We decided to create a found text on a page of our choosing (see found text below). Breaking the text into many smaller bits and pieces makes it easier to read, as it isolates the important parts from the information that a reader finds to be less pertinent. Additionally, it creates a visually appealing picture that serves as a starting point for constructive conversation.
A nebula is a giant cloud of dust and gas in space. Some nebulae are created from the explosions of dying stars. These explosions are called supernovae. Nebulae are essential because they can act as nurseries for forming stars. With this scientific knowledge, we were able to research the cultural and historical significance of nebulae. Within the Book of the Images of the Fixed Stars, we viewed forty-eight constellations. These depictions draw on the constellations catalogued by the Greek astronomer Ptolemy. In addition, we read through the Extra-Galactic Nebulae paper by Edwin Hubble. Hubble, the namesake of the Hubble Space Telescope, is known for his role as a leading astronomer in discovering galaxies beyond the Milky Way. Additionally, we read Observing by Hand by Omar W. Nasim. This brought to light the way that astronomy was studied in the past compared to how it is studied now, and it helped us connect the other two pieces. The three pieces intersect so strongly because, although they come from different times, they all essentially aim to do the same thing: explain astronomy.
Interstellar reddening is a phenomenon that occurs when micron-sized dust particles dim shorter wavelength light (such as blue light) more than longer wavelength light. The dust particles are made of various elements, such as carbon, oxygen, iron, and other atoms that are essential for star formation. These particles are just the right size to interfere with blue light, whose short wavelength (about 450 nm) is about the same size as the particles. Red light, however, has a longer wavelength (about 700 nm) that is larger than the size of the particles. This allows longer wavelengths of light, such as red light, to pass through. Because of the dust in the Universe, the light that we see coming from distant objects is blocked or dimmed in blue wavelengths. The resulting effect is that everything looks redder than it actually is. A similar scattering effect is responsible for red sunsets.
The Balmer series concerns hydrogen electron transitions from energy levels n > 2 to n = 2. Energy levels in an atom (also known as shells) are the different “levels” or regions in which electrons can orbit the nucleus. Each level corresponds to a specific amount of energy, meaning that every electron in a particular shell has the same amount of energy. When hydrogen electrons move from a higher energy level to a lower one, they emit photons (particles of light). The Balmer lines differ in intensity: Hα is the strongest line, and Hβ is weaker. The ratio between the two is known as the Balmer decrement. Under typical conditions in planetary nebulae, the ratio of Hα to Hβ is 2.86. A planetary nebula lying behind a cloud of interstellar dust will be observed as having an intensity ratio of Hα to Hβ higher than 2.86, because the dust dims the bluer Hβ line more than the redder Hα line. Galactic latitude is measured in degrees north or south of the Galaxy’s fundamental plane of symmetry. This plane is defined by the galactic equator, the great circle in the sky best fitting the plane of the Milky Way, as determined by a combination of optical and radio measurements.
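The wavelengths of the Balmer transitions described above can be estimated with the Rydberg formula, 1/λ = R(1/2² − 1/n²). The sketch below is only an illustration: the formula and the value of the Rydberg constant for hydrogen are standard physics that we are assuming here rather than results from our activity, and the function name is made up for the example. The formula gives vacuum wavelengths, which come out slightly longer than the commonly quoted values measured in air.

```python
# Rydberg constant for hydrogen in 1/m (assumed standard value, not from the text).
RYDBERG_HYDROGEN = 1.0967758e7


def balmer_wavelength_nm(n_upper):
    """Vacuum wavelength in nm of the hydrogen transition n_upper -> 2."""
    if n_upper <= 2:
        raise ValueError("Balmer transitions start from levels above n = 2")
    inverse_wavelength = RYDBERG_HYDROGEN * (1 / 2**2 - 1 / n_upper**2)
    return 1e9 / inverse_wavelength  # convert metres to nanometres


h_alpha_nm = balmer_wavelength_nm(3)  # red line, roughly 656 nm
h_beta_nm = balmer_wavelength_nm(4)   # blue-green line, roughly 486 nm
print(f"H-alpha: {h_alpha_nm:.1f} nm, H-beta: {h_beta_nm:.1f} nm")
```

Because Hβ sits at a shorter wavelength than Hα, dust dims it more, which is why the observed ratio of the two lines rises when dust lies along the line of sight.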
Because interstellar dust is concentrated near the plane of the Milky Way, the higher the galactic latitude, the lower the ratio between Hα and Hβ tends to be. When the observed ratio is higher than 2.86, the planetary nebula is likely lying behind a cloud of interstellar dust. Astronomers can infer the amount of interstellar reddening (and dust) between us and a planetary nebula by looking at the discrepancy between the observed and the theoretical Balmer decrements.
Our group looked at eight different nebulae and their corresponding flux (brightness levels) at two different wavelengths. Hα was located at 6563 Angstroms, while Hβ was located at 4861 Angstroms. We gathered the data for the maximum and the continuum (base) of each line and calculated their net heights as well. We then calculated the ratio between Hα and Hβ and listed the nebulae from highest to lowest ratio. Comparing each ratio with the galactic latitudes provided, we can see that there is a correlation between the two: the higher the ratio, the lower the galactic latitude. As discussed earlier, a ratio that is far above the accepted value of 2.86 means that there is a large amount of dust (and therefore reddening) between the observer and the nebula itself.
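The calculation described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the nebula names, flux values, and latitudes below are invented for illustration and are not our measurements, and the function names are hypothetical. Only the theoretical value of 2.86 and the steps (net height, ratio, ranking) come from the activity.

```python
THEORETICAL_DECREMENT = 2.86  # expected H-alpha / H-beta ratio without dust


def net_height(maximum, continuum):
    """Height of an emission line above its continuum (base) level."""
    return maximum - continuum


def balmer_decrement(h_alpha, h_beta):
    """Ratio of net line heights; each argument is a (maximum, continuum) pair."""
    return net_height(*h_alpha) / net_height(*h_beta)


# Hypothetical example data, NOT the values measured in our activity.
nebulae = {
    "Nebula A": {"h_alpha": (310.0, 10.0), "h_beta": (105.0, 5.0), "latitude_deg": 40.0},
    "Nebula B": {"h_alpha": (460.0, 10.0), "h_beta": (105.0, 5.0), "latitude_deg": 12.0},
    "Nebula C": {"h_alpha": (610.0, 10.0), "h_beta": (105.0, 5.0), "latitude_deg": 3.0},
}

# Rank the nebulae from highest to lowest ratio, as in the activity.
ranked = sorted(
    (
        (name, balmer_decrement(data["h_alpha"], data["h_beta"]), data["latitude_deg"])
        for name, data in nebulae.items()
    ),
    key=lambda row: row[1],
    reverse=True,
)

for name, ratio, latitude in ranked:
    # A ratio above 2.86 suggests dust between the observer and the nebula.
    excess = ratio / THEORETICAL_DECREMENT
    print(f"{name}: ratio {ratio:.2f} ({excess:.2f}x theoretical), latitude {latitude} deg")
```

With these made-up numbers, the nebula closest to the galactic plane has the highest ratio, which mirrors the trend we saw in the real data.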
Creating Something Sublime from Apollo Code
The Apollo 11 Mission was the first crewed mission to land on the Moon. During the Cold War, one of the goals of the global powers was to have control of space. For the United States, the main goal of the space race was to prove its intellectual and technological prowess to other countries (specifically the Soviet Union). Many scientists came together to make the Apollo 11 Mission successful. Large computers the size of whole basements were used to develop the software necessary for the mission. Today, however, many fail to realize that some of the most important people to the mission are now the most overlooked.
The build-up to the Apollo 11 Mission required a lot of precise calculations. One person who worked on projects leading up to and including Apollo 11 was an African American woman by the name of Katherine Johnson. She was responsible for many of the complex calculations that the computer would end up performing, and she was so talented that many called her a “human computer”. Another trailblazing woman, Margaret Hamilton, was vital as the lead software engineer for the Apollo mission, overseeing the programming that took place for the mission software.
Our goal was to look at the Apollo 11 code that was used to program the whole mission and turn it into something sublime. By taking bits and pieces of the code, we aimed to present it in a way that appeals to a general audience in an effort to start conversations about space exploration.
Additionally, we are able to portray different people in our artwork to bring to light those that may have been left in the dark.
The image shown is a reference to the American flag planted on the Moon by the astronauts of the Apollo 11 Mission. Being the first to the Moon is a testament to the American strength, ingenuity, and greatness that come forward when people from all backgrounds come together for a common goal. While simple, my drawing embodies the power and impact that the Apollo 11 Mission had on global politics as well as space exploration today. When the Apollo 11 code is presented in this visual way, many people are much more willing to ask questions about the mission to the Moon because this is a format that is easy to talk about with others. With public interest and opinion vital to our continuation of space exploration, I find it important to create conversation pieces to further our research.
The image I created shows the Apollo code layered with a photo of an African-American female trailblazer in astronomy, American mathematician Katherine Johnson. Johnson worked on the Apollo code, among many crucial NASA projects. The Apollo code can be confusing and mind-boggling, which can deter readers from exploring it. A representation like the one I created may encourage viewers to look further into the importance of the Apollo code.
I was inspired by the historical impact of the Apollo code and was interested in exploring the societal issues that influenced the mission. Women such as Margaret Hamilton, Frances “Poppy” Northcutt, and Katherine Johnson, made significant contributions to the Apollo code, yet were not acknowledged for their impact. So within my piece, I used photographs to acknowledge the women who worked on the Apollo 11 mission.
Conclusion and Future Directions
Throughout the past six weeks, our group focused on making traditionally challenging disciplines (science, technology, engineering, and math) more enticing to the general public. While those disciplines can be challenging to understand, there are ways to introduce the concepts that they encompass to get more people involved. We have incorporated Arts and Humanities into STEM, changing the well-known acronym into SHTEAM. While this acronym might seem unusual, it is important to note that our group has focused on changing the norms and “breaking the mold” of traditional learning and teaching of STEM disciplines.
Found text was an area of great focus, as we used it from the first week until the last, looking at different parts of investigative and research papers. It is important to emphasize that this is a subjective process, meaning that what is important in the text is personal to each reader. We practiced found text both individually and as a group, and the two approaches offer different experiences. When completing a found text individually, the reader might pick out the words that matter most to their own reading. In a group, the result might include words that one individual would not choose but another does, leading to a broader perspective and understanding of what is being said in the paper. Additionally, the visual appeal and enjoyment add to the engagement with, and therefore the understanding of, the text.
We explored how we see and interpret light, depending on how much light is present. Our eyes adjusted from the brightness of the sun to the absence of light in a dark room. Different wavelengths of light impact human vision in different ways and allow us to see and interpret visual information differently. As a result, when astronomers look at the night sky, they use red light to illuminate their work area as red light does not affect our ability to see space as much as other wavelengths of light.
Graphs are a great visual resource! We used graphs to look at light emitted from different celestial bodies, which we graphed and made sound from. By using the same technique as blind scientist Wanda Diaz Merced, we were able to code a symphony of the stars. As a result, we also created a visually appealing work of art that brings awareness to the light that is present in our universe but that we do not see on a daily basis.
Brown dwarfs and nebulae were another source of interest during our exploration of the intersection between astronomy and art. Nebulae can contain brown dwarfs (also known as failed stars). We performed a found text on a brown dwarf paper that our mentor Afra wrote, and used it to better understand the context of that paper. Our studies of nebulae continued with the interstellar reddening activity we later completed. By looking at the ratio of two different wavelengths of light, we were able to determine which nebulae have the most space dust between the observer and the nebula itself.
We studied different pioneers in the Apollo 11 mission, and how their roles contributed to the mission’s success. While there are people who do not believe that the Apollo 11 mission occurred, we looked at the code for the software that was used to land on the Moon. Additionally, we created a piece of art using that code.
All of the facets of astronomy that we incorporated in our studies are strung together by a single thread. That thread is the arts and humanities. While in today’s world that thread is virtually nonexistent, we are weaving it through oftentimes challenging concepts in an attempt to reach more people. Our goal has been to portray traditional STEM concepts through an artistic lens in order to reach a wider audience and engage the general public in astronomy. After all, all space research depends on public interest, which in turn generates funds for research.
Incorporating non-traditional research methods in traditional disciplines, we break the mold of conventional learning, choosing to introduce creativity and imagination in an attempt to expand knowledge to more people.
Throughout our research, we have implemented different techniques to incorporate more artistic values in astronomy. We believe that we can use our studies and applications of art in our future education in a variety of ways. Firstly, we can apply found text to different subjects. The found text process could be used to interpret many different things, from classical literature to psychology to physics concepts. We could advocate for found text use in classrooms. Not every student learns best in the traditional way of reading and memorization. The found text process offers an effective, interactive, creative, and engaging alternative. It allows us to portray concepts in a visual way to entice more people in different disciplines. Lastly, we believe it is vital that we acknowledge and explore contributors who have been left in the dark, in an effort to inspire future generations to persevere despite obstacles that may slow them down.
Recommendations to Educators
Along with our future directions, in order to create change, we believe it is vital that professors and educators try to include some of our processes and exercises in their teaching. Teaching the process of found text can be done through group learning and group projects in any class subject. Here are some things to keep in mind about the found text process:
If something is repeated in a text, keep the most pertinent description or information and get rid of the one that the reader finds to be less relevant.
Look for things that stand out and are unique to the reader (does the text mention or reference something else?).
Along with other “nontraditional” research and note-taking styles, like Cornell notes, found text could be added to secondary level English and Science curricula to give another effective option for students to learn how to research.
Our goal is to get people excited about topics that they may feel they have no place in when quite the opposite is true. In every part of science, math, technology, and engineering, the portrayal of topics is vital, which is often done in a visual (artistic) form. Consequently, nowhere else are art and STEM more important than in the engagement of individuals to come together to create and inform.