Ritali Jain, Alexander Nguyen, Houda Miftah, Julian Reed, Kepler Boyce, Nina Franz, Cecilia Colberg, Audrey Edwards, Ayushman Chakraborty
Abstract
Algorithms powered by artificial intelligence on online platforms have come to dominate our lives and our time, shaping our behavior and opinions without our awareness. The clearer the picture these algorithms can paint of you, the more they can manipulate you to act and think in ways that are profitable to the technology monopolies of our current world. To emphasize the massive scale of data collected by these companies, our team designed an immersive shopping website, Sahara Prime: The Simulated Experience, to demonstrate to users how much information can be collected on them in just twenty to thirty minutes. Our website can be accessed through the following link: shtem.herokuapp.com. This project and paper were crafted to explain and raise awareness of the infringement of users’ privacy that occurs through these highly advanced algorithms. It is our belief that this platform will lead others to do their own research on how they can protect their privacy and resist surveillance capitalism, so that each individual has more control over their personal information. With this, we can break the veil of ignorance that has allowed us to be manipulated by our screens for so long.
Background
The role of algorithms in our daily lives has been expanding with the rise of digital platforms, including social media, e-commerce, and online services. The Covid-19 pandemic has contributed to the digital surge, with Internet service usage rising by 40% to 100% compared with pre-lockdown levels. Digital platforms offer large technology companies, often monopolies like Amazon, Facebook, and Apple, more mechanisms to collect data about our tendencies and demographics through our digital trace. Our online presence generates a massive amount of data, including search history, previous purchases, watch history, which posts are liked on social media, and more. This raw data is fed into advanced machine learning algorithms that use analytics to determine our behavior, interests, and persona. These insights undergo pattern discrimination, the addition of identifiers on input data to filter information, in order to divide users into segments for purposes such as advertising or political campaigning.
Algorithms
We begin by defining key terms related to algorithmic targeting: privacy, surveillance capitalism, persuasive technologies, and the attention economy. Privacy is the right to keep personal information to yourself and/or trusted individuals rather than accessible in the public domain. Through terms of service, cookies, and other mechanisms, users often exchange their privacy in return for online services. Companies benefit from access to this data under surveillance capitalism, the economic system that aims to collect as much of users’ personal data as possible for financial gain, since data insights are sold to businesses and advertisers. In order to increase the amount of data gathered, technology companies aim to maximize the amount of time that people spend on their platforms. This is the concept of the attention economy, in which Internet users pay with their attention for the free services available to them. These services tend to be persuasive technologies, designed to change people’s thought patterns. Essentially, when users are not paying for a product, they tend to ‘be’ the product, since tech monopolies make a solid profit from selling data to third parties. With this data, companies can also modify their products to be more enticing, such as by using algorithms to pinpoint your interests and thereby determine which product recommendations suit you most.
Surveillance capitalism has become innate to our economy, and as a result, consumers do not realize the adverse effects of algorithms and the attention economy on different facets of society, such as screen time. Building on this extensive foundation on algorithms, we aimed to produce a captivating storyline that raises awareness of these concerning realities.
Data Collection
It is impossible to discuss surveillance capitalism without elaborating on data collection. The amount and nature of data collected tells a story: the story of you. Depending on the nature of the technology offered, companies may be collecting any or even all of the following protected private information: first and last name, gender, date of birth, age, location, IP address, mailing address, email address, phone number, payment details, and credit card information.
Companies also accumulate data surplus, or data that is not necessary to the functionality of the service or experience. Generally, a company may collect data such as social media accounts (if you log in using them), clicks, time spent on a website, reviews, browsing data, language, employment history, education history, weight, height, and body measurements (e.g., estimated stride and shoe/foot size). Google Analytics is a tool that acquires data from each website visitor through JavaScript page tags inserted into the code of each webpage, so that data from each visitor’s browser can be sent to Google’s servers.
Companies also rely on cookies for data collection, as they comprise a unique user ID and a site name. Cookies enable websites to retrieve this information when users revisit them, so that they can remember users and their preferences and tailor page content accordingly. When users accept the use of cookies, they make the collection of their personal information easier. Often, major corporations require that users accept cookies so that they can improve their services and products, improve customer experience, and curate their products. However, data collection benefits the corporation more than the user, as personal data is sold to third parties in order to send targeted ads to users.
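As a rough illustration of the mechanism described above, the sketch below shows how a site might issue a cookie carrying a unique user ID and later read that ID back from a returning visitor's request. The cookie name `uid`, the domain `example.com`, and the helper names are hypothetical and are not taken from any particular company's implementation.

```typescript
// Hypothetical sketch: a cookie that pairs a unique user ID with a site name.
interface TrackingCookie {
  userId: string; // unique ID assigned on the first visit
  site: string;   // domain the cookie is scoped to
}

// Build the value of a Set-Cookie response header for a first-time visitor.
function serializeCookie(cookie: TrackingCookie, maxAgeSeconds: number): string {
  return `uid=${encodeURIComponent(cookie.userId)}; Domain=${cookie.site}; Max-Age=${maxAgeSeconds}; Path=/`;
}

// Read the user ID back from the Cookie request header on a later visit.
function readUserId(cookieHeader: string): string | undefined {
  for (const pair of cookieHeader.split(";")) {
    const [name, ...rest] = pair.trim().split("=");
    if (name === "uid") {
      return decodeURIComponent(rest.join("="));
    }
  }
  return undefined;
}

// First visit: the site assigns an ID that lasts one year.
const cookieHeaderValue = serializeCookie(
  { userId: "a1b2c3", site: "example.com" },
  60 * 60 * 24 * 365
);

// Later visit: the browser sends the cookie back and the site recognizes the user.
const returningVisitorId = readUserId("theme=dark; uid=a1b2c3");
```

Because the browser returns the same ID on every visit, anything recorded against that ID accumulates into a single profile over time.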

Why Social Media is Free
Although platforms like YouTube offer users an engaging and personalized experience, it is key to recognize that social media corporations benefit from the attention economy and surveillance capitalism. Dissecting their business model is fundamental to understanding why social media is free. Although their revenue is mostly based on personalized ads from firms, social media companies go beyond just allowing the promotion of products on their platforms. They serve as a cage for the guinea pig: the user. By analyzing user data from different inputs, they can sell inferences to corporations that are looking for an easy way to target their ideal customers.
To drive the attention economy, social media platforms use features that research has shown are addictive, such as push notifications, typing awareness indicators, and banner ads. Platforms also surface media that is predicted to match your tastes and preferences, which can have devastating impacts. For instance, Instagram and Facebook have contributed to the rise of political polarization.
Algorithms identify users’ political beliefs and update their feeds to match those views in order to tempt them to click and engage with the media. This results in a rabbit hole where the algorithm suggests articles with increasingly radical ideas, and these become the primary source of information that the user consumes. Another example of an unhealthy user feed arises when an algorithm learns that a user is depressed and updates the feed with ads about vape pens and other dangerous influences.

Insight from Workshops
Throughout our journey, we participated in a multitude of workshops, not only to gain a deeper understanding of the dangers of surveillance capitalism, but also to learn about the different ways we could create an interactive piece of theater that conveys this information to our audience. Theater is not limited to a production on stage; it can also encompass an interactive phone call or a science-fiction role-playing game. We learned that having the audience become a large part of the performance, rather than merely watching it, immerses them further in the story, as they essentially become the story. This allows the message of the story to resonate more strongly with the audience, which we wanted to take advantage of to convey our message in the most impactful way. These workshops helped us unlock the full potential of our creativity and expand our understanding of theater, inspiring the experience we created.
The Social Dilemma
During our preliminary research, we watched two documentaries. The first, The Social Dilemma, introduces the ethics of algorithms and the tactics that platforms use not only to collect our data but also to keep us engaged, so that they can expose us to as many ads as possible and increase their revenue. The documentary personifies the algorithms as three men. These men are not portrayed as caring for the user, but rather as manipulating and taking advantage of the user for profit. Another key detail is that as the user spends more time on their phone, the algorithms obtain a better idea of who the user is, which is symbolized by a holographic image of the user that becomes more detailed over time.
We also watched a documentary about surveillance capitalism, featuring Harvard Professor Shoshana Zuboff. The documentary introduced us to multiple case studies where data was used and sold to companies that delivered extremely effective ads to their targeted audience. One case study that stood out was the case of a pregnant woman. Using the data collected from the woman, the algorithms of Target were able to figure out that she was pregnant before the woman’s family found out. Such case studies showed us the true power of data collection and inspired multiple aspects of the project.
Objectives & Rationale
Following our literature review on algorithms, surveillance capitalism, attention economy, and data collection, we found it pertinent to convey our research to a general audience. In particular, our goal was to illustrate the inherent monopolies that companies utilizing data and algorithms have established in our society, not just controlling economic markets, but dominating our daily lives. Algorithms dictate consumer spending, marketing, politics, and screen time among other facets, and our constant provision of data online no matter how innocuous fuels these algorithms and improves their ability to predict future behavior.
Our primary objective in this project was to illustrate the unwavering influence of surveillance capitalism via a story that is executed by harnessing online platforms. We chose to convey our research through an immersive experience so that users could realize for themselves the nature of the data collected from them within a short span of time as they walk through the story alone.
We also wanted to show how platforms that appear to be free to use, like social media applications, rely on a much more ephemeral currency – our time and our attention. Using a shopping website as our driving mechanism served this need and also fulfilled the setting where much of surveillance capitalism occurs in reality. Through this vehicle, we were able to collect personally identifiable information (PII) that is necessary for the functionality of a shopping website, but more importantly micro-data that we were able to piece together to make inferences about our user. The purpose was to show how predictive algorithms can use micro-data to make inferences and suggestions for you, oftentimes determining aspects of a person unknown to that person’s conscious mind, and extending the time spent on those platforms.
Methods
To accomplish our aims, we created Sahara Prime: The Simulated Experience, an immersive shopping narrative. Our goal was to incorporate several platforms throughout the immersive experience in order to guide the user through each layer of our story. We wanted to start with a website that would mimic a high-end fashion store and replicate some of the features that users would be accustomed to. This initial layer of our story aimed not only to collect and compile a user profile, but also to display how valuable data collection is. The next layer of our experience leads the user to a phone call, where an automated voice directs them to a data leak that lets them view all the data Sahara Prime has collected on them. Phase alpha is also initiated, a process aimed at detaching the body from the algorithm. The user is then led through the research we collected to reveal the main purpose of Sahara Prime: to inform users of the importance of resistance in the age of the attention economy. The last platform we incorporated was email, which was used to wrap up the experience and send more information about resistance. Through this multi-layered experience, we were able to construct an immersive shopping narrative that is both entertaining and informative.
Tools
We used many tools and frameworks to create our website; as such, we need to define several terms. Our website was built upon the Node.js runtime and package ecosystem, meaning it was written in the JavaScript programming language. In our case, however, we used TypeScript, a statically typed superset of JavaScript. Put simply, this makes our code more robust by catching many errors at compile time, before the code ever runs.
Since our website requires dynamic functionalities, we used Next.js, a full-stack application framework that leverages React for frontend development. React is a JavaScript library that allows one to write reusable “components” in JSX (JavaScript XML) or TSX (its TypeScript counterpart), which render particular HTML content. This removes the need for independent HTML files and script files, allowing for a much cleaner codebase: all of the content that is rendered on a page and the logic for that page’s functionality are self-contained in a single JSX or TSX file. In addition, content that needs to be reused in many places on the site can be stored in a React component, following the “DRY” principle of programming: Don’t Repeat Yourself. With traditional HTML and JavaScript, our website source code would have been much larger and more difficult to manage. React is currently one of the most popular frontend libraries and is required for many frontend web development positions, which reflects how effective it is at shortening development time and improving code maintainability.
To further streamline development, we used a tool called Tailwind CSS, which almost entirely eliminates the need to write custom CSS by providing premade utility classes. Rather than creating verbose CSS files, one can instead do the vast majority of styling by applying Tailwind’s class names to HTML elements. For example, styles such as
div#example {
  display: flex;
  flex-direction: column;
  gap: 0.5rem;
  justify-content: flex-end;
}
that would appear in an external CSS file become simplified to inline class names:
<div className="flex flex-col gap-2 justify-end" />
Further benefits of the Next.js framework include its server-side rendering and built-in API routes. Our website required an email server for a few elements, and because email servers can only run on the backend, we needed some form of backend. Next.js does not offer a complete replacement for a traditional backend, but it allows one to create web API endpoints very easily, which was sufficient for our needs. These API endpoints receive HTTP requests from the frontend containing information such as the email address and contents of the email, and the email server sends an email in accordance with the provided information.
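To make the request flow above more concrete, here is a minimal sketch of the logic such an endpoint could contain. It is not our production code: to stay self-contained it does not import Next.js, and the names `EmailRequest`, `Mailer`, `validateEmailRequest`, and `handleSendEmail` are hypothetical. In a Next.js project, logic along these lines would sit inside an API route handler, with the mailer backed by a real email server.

```typescript
// Hypothetical sketch of the logic behind an email-sending API endpoint.
// All names are illustrative assumptions, not the site's actual code.
interface EmailRequest {
  to: string;
  subject: string;
  body: string;
}

// Stand-in for whatever email server or library actually delivers the message.
interface Mailer {
  send(message: EmailRequest): Promise<void>;
}

interface ApiResult {
  status: number;
  message: string;
}

// Check that the payload sent by the frontend has the fields we need.
function validateEmailRequest(payload: unknown): EmailRequest | null {
  if (typeof payload !== "object" || payload === null) return null;
  const { to, subject, body } = payload as Partial<EmailRequest>;
  if (typeof to !== "string" || !to.includes("@")) return null;
  if (typeof subject !== "string" || typeof body !== "string") return null;
  return { to, subject, body };
}

// Validate the request, then hand the message to the mailer.
async function handleSendEmail(payload: unknown, mailer: Mailer): Promise<ApiResult> {
  const message = validateEmailRequest(payload);
  if (message === null) {
    return { status: 400, message: "Invalid email request" };
  }
  await mailer.send(message);
  return { status: 200, message: "Email sent" };
}

// In-memory mailer used only to demonstrate the flow without sending anything.
const sentMessages: EmailRequest[] = [];
const fakeMailer: Mailer = {
  async send(message: EmailRequest): Promise<void> {
    sentMessages.push(message);
  },
};
```

Passing the mailer in as a parameter keeps the validation logic testable without a live email server, which is one reasonable way to structure a small endpoint like this.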
Because browsers can only run JavaScript (with the exception of WebAssembly, though this still requires JavaScript to access and edit the DOM), the React frontend source gets compiled to optimized HTML and JavaScript for the final production build.
Currency
Instead of using a traditional coin or token system, we established heartbeats as the currency for Sahara Prime. We chose this to make users grasp just how much time they allow screens and websites to take. By focusing on heartbeats, the user can realize how many precious seconds are spent for the gain of the attention economy. We believe this will have a lasting effect on our audience and will make them ponder the quality of life we are all missing out on because of our inability to turn off our electronic devices.
Collecting Personal Data
On our account setup page, we collected PII such as email address, full name, phone number, and birthdate, as well as whether the user has a preference for visual guides to the audio components of the website. Our website also includes a preliminary survey to gather users’ interests and demographics, such as preferred brands, ethnicity, gender, and level of introversion vs. extroversion. Throughout the website, we collect data on the amount of time spent and the number of clicks on different pages and products to infer which components pique the user’s interest.
For the user to earn enough currency to buy their desired product, we offered several microtasks that they had to complete, including surveys, captchas, video ads, image ads, and signing up for our newsletter. For the surveys and captchas, we embedded questions that seemed playful but would give algorithms a great deal of important predictive information. One such disguised question was ‘Which president would be the worst at video games?’ with options of ‘Abraham Lincoln’, ‘Millard Fillmore’, ‘Barack Obama’, ‘Donald Trump’, and ‘Ulysses S Grant’. The question could be used to reveal any potential political biases of the user.
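A minimal sketch of how per-page clicks and time spent could be tallied is shown below. The class name `InteractionTracker`, its methods, and the example page paths are hypothetical and only illustrate the general technique; they are not taken from the site's source.

```typescript
// Hypothetical sketch: per-page counters for clicks and time spent.
interface PageStats {
  clicks: number;
  millisecondsSpent: number;
}

class InteractionTracker {
  private stats = new Map<string, PageStats>();
  private currentPage: string | null = null;
  private enteredAt = 0;

  // Fetch the counters for a page, creating them on first use.
  private statsFor(page: string): PageStats {
    let entry = this.stats.get(page);
    if (!entry) {
      entry = { clicks: 0, millisecondsSpent: 0 };
      this.stats.set(page, entry);
    }
    return entry;
  }

  // Call when the user navigates to a page; closes out the previous page's timer.
  enterPage(page: string, now: number): void {
    this.leavePage(now);
    this.currentPage = page;
    this.enteredAt = now;
  }

  // Call when the user leaves the current page.
  leavePage(now: number): void {
    if (this.currentPage !== null) {
      this.statsFor(this.currentPage).millisecondsSpent += now - this.enteredAt;
      this.currentPage = null;
    }
  }

  // Call from a click handler while a page is active.
  recordClick(): void {
    if (this.currentPage !== null) {
      this.statsFor(this.currentPage).clicks += 1;
    }
  }

  // Page with the most accumulated time, as a crude proxy for interest.
  mostViewedPage(): string | undefined {
    let best: string | undefined;
    let bestTime = -1;
    this.stats.forEach((entry, page) => {
      if (entry.millisecondsSpent > bestTime) {
        best = page;
        bestTime = entry.millisecondsSpent;
      }
    });
    return best;
  }

  getStats(page: string): PageStats | undefined {
    return this.stats.get(page);
  }
}

// Timestamps are passed in explicitly so the example is deterministic;
// in a browser they would typically come from Date.now().
const tracker = new InteractionTracker();
tracker.enterPage("/shoes", 0);
tracker.recordClick();
tracker.recordClick();
tracker.enterPage("/hats", 5000);
tracker.recordClick();
tracker.leavePage(6000);
```

In this example the user spends five seconds and two clicks on the shoes page and one second and one click on the hats page, so `mostViewedPage()` points to the shoes page.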
Design Choices
On top of applying commonly used data collection tactics in our own site, we also focused on incorporating satire. Satire allowed us to exaggerate certain components of websites that users often take for granted. We applied satire in the terms and conditions page, where we made the user scroll for at least ten seconds in order to reach the bottom of the page. Our terms and conditions also included evidently nonsensical clauses. By doing so, we hoped that the user would become conscious of the terms and conditions methods that other companies use to get the user’s permission to collect data without them even knowing it. Furthermore, in our captcha task, we included some questions that were impossible to answer correctly. Again, this use of satire illustrated how human verification captchas are persistent forms of data collection on websites.
At the end of checkout, our website is ‘hacked’ by a fictional youth resistance group, The New Generation (the TNGers). The rest of the experience is predominantly outsourced to a phone call, with the exception of showing how much data was collected about the user and what inferences were made. The phone call discusses methods of resistance by walking the user through a meditation-like experience and illustrating the effects devices have on our lives. We chose this multi-platform approach to increase the engagement and awareness that the experience provides by helping the user gradually transition from our site to joining the resistance against surveillance capitalism.
Limitations
Given the nature of our project, certain limitations arise. First, the inferences we made about our users were created using conditionals rather than an actual trained algorithm. Developing a machine learning model that could make inferences based on the particular data we were collecting would require a very large set of training data with our custom features, which we did not have the time and resources to generate. Second, although users can view their data on Sahara Prime, we built the site with localStorage, which stores all information only in the user’s browser, instead of a database. We opted not to retain personally identifiable information from our users ourselves in order to stay true to our principles. Furthermore, we are currently testing our experience with a sample audience, and the feedback had not been returned at the time of writing. We hope to publish our results and make them available upon request once we obtain a large, diverse pool of feedback from our audience.
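The two design choices discussed here, conditional-based inferences and browser-only storage, can be sketched as follows. This is an illustration under stated assumptions rather than our actual rules: the profile fields, thresholds, and inference labels are hypothetical, and the `KeyValueStore` interface is a minimal subset of the Web Storage API (which `window.localStorage` satisfies) so that the sketch also runs outside a browser.

```typescript
// Hypothetical sketch: rule-based inferences over a profile kept in browser storage.
// Field names, thresholds, and labels are illustrative assumptions.
interface UserProfile {
  birthYear: number;
  preferredBrands: string[];
  secondsOnSalePages: number;
}

// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in so the sketch also runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

const PROFILE_KEY = "profile"; // assumed storage key

// Storage only holds strings, so the profile is serialized as JSON.
function saveProfile(store: KeyValueStore, profile: UserProfile): void {
  store.setItem(PROFILE_KEY, JSON.stringify(profile));
}

function loadProfile(store: KeyValueStore): UserProfile | null {
  const raw = store.getItem(PROFILE_KEY);
  return raw === null ? null : (JSON.parse(raw) as UserProfile);
}

// Plain conditionals stand in for a trained model.
function inferTraits(profile: UserProfile, currentYear: number): string[] {
  const traits: string[] = [];
  const age = currentYear - profile.birthYear;
  if (age < 25) {
    traits.push("likely student or early-career");
  }
  if (profile.preferredBrands.length >= 3) {
    traits.push("brand-conscious shopper");
  }
  if (profile.secondsOnSalePages > 120) {
    traits.push("price-sensitive");
  }
  return traits;
}

const profileStore = new MemoryStore();
saveProfile(profileStore, {
  birthYear: 2004,
  preferredBrands: ["A", "B", "C"],
  secondsOnSalePages: 30,
});
const loadedProfile = loadProfile(profileStore);
const inferredTraits = loadedProfile ? inferTraits(loadedProfile, 2022) : [];
```

Since the profile never leaves the store it was written to, swapping `MemoryStore` for `window.localStorage` keeps every value on the user's own device.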
Conclusion
The significance of our piece lies in the context behind it. Through our use of the heartbeats currency and microtasks, we illustrated the famous saying: if you are not paying for the product, then you are the product. Through our data, we are being sold to companies that want to know more about us so that they can pitch a very personalized ad. This process is how users eerily get ads for products that capture exactly what they were looking for. By simulating the website crash and the data leak, we also conveyed how personal data collected by companies is susceptible to cyber attacks. Even if a security threat of such a magnitude never occurs, the business model of surveillance capitalism requires corporations to sell data in order to earn revenue.
Throughout this process, we also came to learn about how theater is evolving into an experience that does not clearly define the role of the performer and the audience. With new digital mediums, theater is adapting to more immersive pieces, where audiences play a role in the story. This feedback loop between the audience and the performers is called autopoiesis and is easier to embed in performances taking place in the new digital mediums that are at society’s disposal today.
Interestingly, we noticed that the techniques used to create theatrical pieces are very similar to algorithms used by companies. Both collect user input and use that input to shape the user experience. Thus, they both use autopoiesis. Is it possible to conclude that the algorithms share similar aspects with theater?
The answer depends on the definition of theater. Is theater centered around interaction? Or is it the story lines that make up theater? We believe theater is the performance of creating a strong user experience that is personalized based on the audience. The data-collecting and inference-making algorithms make technology companies a form of theatrical display, as AI algorithms make these platforms a perfect medium for personalized user interaction. As deep learning models progress, AI becomes more capable, and soon enough we will have to look more closely at the Turing test, which asks whether a machine’s behavior can be distinguished from a human’s. With AI becoming more human, each experience on the Internet becomes more interactive and more focused on developing an impactful human experience.
In our research, we focused heavily on the impact of surveillance capitalism and the attention economy on our lives. Thus, it was crucial for us to highlight the importance of resisting surveillance capitalism. Resistance is important because surveillance capitalism reduces personal privacy and narrows choices through the creation of algorithmic echo chambers. We distinguished two methods of resistance to surveillance capitalism: radical and analog. Radical resistance focuses on changing device settings, such as turning on private browsing mode, rejecting certain cookies in browsers, switching to search engines that don’t collect users’ private data (e.g., DuckDuckGo, Startpage, Ecosia), turning off push notifications to prevent distractions, and setting the screen to grayscale to reduce screen time. Analog methods of resistance focus on improving users’ mental health and overall well-being. They include practicing mindfulness and gratitude, exploring new hobbies, making bedrooms screen-free, and developing positive habits such as choosing one day per week to set your phone aside or putting a hairband around your phone (the hairband allows users to answer phone calls easily, but makes other uses of the phone more difficult).
Future Directions
Currently, most of the algorithms that make inferences on Sahara Prime use hard-coded conditions. We hope to eventually implement deep learning algorithms that train themselves to output more information about the user based on the input data. We also want to find more ways to show the amount and impact of screen time. Our way of illustrating the value of the data was to show the massive amount that we collected in the twenty minutes of the experience. One feature that we hope to put into effect is to quantify the value of a given user’s data using the number of clicks or the amount of time spent on a page, as these indicate how much a user interacts with the site and also how much potential data is collected during that time span. By doing so, we hope to keep the audience aware of the data collection methods embedded in our website.
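One simple way such a value estimate might be computed is a weighted sum over clicks and time spent, sketched below. This feature does not exist yet; the weights `VALUE_PER_CLICK` and `VALUE_PER_SECOND`, the function names, and the sample session are arbitrary placeholders chosen for illustration, not empirically derived values.

```typescript
// Hypothetical sketch: score a session's data value from clicks and time spent.
// The weights are arbitrary placeholders, not measured quantities.
interface PageActivity {
  page: string;
  clicks: number;
  secondsSpent: number;
}

const VALUE_PER_CLICK = 5;  // assumed points per click
const VALUE_PER_SECOND = 1; // assumed points per second of attention

// Value contributed by a single page visit.
function pageValue(activity: PageActivity): number {
  return activity.clicks * VALUE_PER_CLICK + activity.secondsSpent * VALUE_PER_SECOND;
}

// Total value across every page visited in the session.
function sessionValue(activities: PageActivity[]): number {
  return activities.reduce((total, activity) => total + pageValue(activity), 0);
}

const sampleSession: PageActivity[] = [
  { page: "/shoes", clicks: 4, secondsSpent: 90 },
  { page: "/checkout", clicks: 2, secondsSpent: 30 },
];
const sampleScore = sessionValue(sampleSession);
```

A linear score like this is easy to display as a running total during the experience, though a realistic valuation would likely weight pages and actions differently.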
Acknowledgements
This project would not have been possible without the support of our mentors and the STEM to SHTEM team. Our team would like to acknowledge and thank Devon Baur and Marieke Gaboury for their guidance and inspiration. We would also like to express our gratitude to Professor Tsachy Weissman for creating the Stanford Compression Forum Internship and to Sylvia Chin for running the program. We are grateful for this unique opportunity to explore novel topics, conduct research, and gain insightful feedback.
References
Berrios, Giovanni, et al. Cyber Bullying Detection System. 6 May 2020, https://engineering.ucdenver.edu/docs/librariesprovider29/college-of-engineering-and-applied-science/sp2020-capstone/csci14-report.pdf?sfvrsn=d3731fb9_2.
Cohn, N. (2014, June 12). Polarization is dividing American society, not just politics. The New York Times. Retrieved August 5, 2022, from https://www.nytimes.com/2014/06/12/upshot/polarization-is-dividing-american-society-not-just-politics.html
De’, R., Pandey, N., & Pal, A. (2020). Impact of digital surge during Covid-19 pandemic: A viewpoint on research and practice. International journal of information management, 55, 102171. https://doi.org/10.1016/j.ijinfomgt.2020.102171
Ioanăs, Elisabeta, and Ivona Stoica. Social Media and its Impact on Consumers Behavior. Report no. 1, 2014. International Journal of Economic Practices and Theories, https://training.unmuhkupang.ac.id/index.php/JAK/article/view/2
Kant, T. (2021). Identity, Advertising, and Algorithmic Targeting: Or How (Not) to Target Your “Ideal User.” MIT Case Studies in Social and Ethical Responsibilities of Computing, (Summer 2021). https://doi.org/10.21428/2c646de5.929a7db6
McDavid, J. (2020). The Social Dilemma. Journal of Religion and Film, 24(1), COV41+. https://link.gale.com/apps/doc/A616580373/AONE?u=googlescholar&sid=bookmark-AONE&xid=2ffcc915
Smith, Ben. “How Tiktok Reads Your Mind.” The New York Times, The New York Times, 6 Dec. 2021, https://www.nytimes.com/2021/12/05/business/media/tiktok-algorithm.html
“Tips for Reducing Screen Time, Reduce Screen Time.” National Heart Lung and Blood Institute, U.S. Department of Health and Human Services, https://www.nhlbi.nih.gov/health/educational/wecan/reduce-screen-time/tips-to-reduce-screen-time.htm.