Understanding COVID-19 Through Sentiment Analysis on Twitter and Economic Data

Journal for High Schoolers


Cecilia Quan*, Noemi Chulo*, and Parth Amin (* indicates equal contribution)


We trained a sentiment analysis bot using machine learning and Twitter data to classify tweets as expressing positive, negative, or neutral sentiments toward COVID-19 safety measures such as mask wearing and social distancing. We then compared data obtained from this bot to both economic data and COVID-19 case count data to better understand the interplay between social mindsets, consumer spending, and disease spread.



The COVID-19 pandemic has caused mass social unrest across the United States. The safety procedures (e.g., mask wearing and social distancing) that have been strongly recommended by the Centers for Disease Control and Prevention (CDC) to constrain the spread of the virus have proven to be controversial. By analyzing publicly available Twitter data via the Twitter API, we sought to better understand 1) the usage of and attitudes towards these procedures over time in the US, and 2) their effect on case count growth. Using tweets containing the keywords “mask”, “masks”, or “social distancing”, we trained a machine learning-based sentiment analysis bot to determine whether a tweet expresses positive, negative, or neutral sentiments regarding COVID-19 related safety measures. Specifically, we used a combination of the Naive Bayes and Logistic Regression algorithms to train the bot. We then used our bot to automatically classify thousands of tweets into each of these categories and observe how public sentiments have changed over time since the beginning of the pandemic. Finally, we analyzed government economic data related to consumer spending in brick-and-mortar locations (such as restaurants and retail stores) to see if this data had any correlation with the sentiment data and case counts from the pandemic. The case count and death count datasets were obtained from the World Health Organization (WHO).



Sentiment analysis is a branch of computer science that attempts to identify sentiment and emotion from natural language input. There are a multitude of ways to accomplish this, but most methods fall into two main branches: machine-learning-based sentiment analysis, and dictionary approaches to sentiment analysis. 

Machine-learning-based sentiment analysis refers to a sub-field of artificial intelligence that aims to understand human emotion in natural language expression in an automated fashion. By training a bot with a classification algorithm and training data consisting of text strings and their corresponding labels, the bot can learn to classify text on its own with some level of accuracy. Although using sentiment analysis on tweets comes with many limitations, we aim to show that it gives us a better perspective on both public opinion about COVID safety procedures and how these opinions may influence the actions of others.

Dictionary approaches are much more simplistic, and essentially work by maintaining a large database of words with certain associations (similar to any dictionary for natural language, but encoded for computer purposes), and then classifying a string of text by referring to the dictionary’s classification of the words within it. Compared to machine-learning based sentiment analysis, this system does not learn as it reads more data, cannot be trained, and is unable to understand any word that is not within its dictionary. 
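To make the dictionary approach concrete, here is a minimal sketch. The word list and its scores are made up for illustration and are far smaller than any real sentiment dictionary; the label names reuse the POS/NEG/DIS categories we define later.

```python
# Hypothetical mini-dictionary; real ones contain thousands of scored words.
LEXICON = {"love": 1, "safe": 1, "protect": 1, "hate": -1, "useless": -1, "hoax": -1}

def dictionary_sentiment(text):
    """Sum the dictionary scores of known words; unknown words contribute nothing."""
    words = [w.strip(".,!?\"'#@").lower() for w in text.split()]
    score = sum(LEXICON.get(w, 0) for w in words)
    if score > 0:
        return "POS"
    if score < 0:
        return "NEG"
    return "DIS"

print(dictionary_sentiment("Masks protect people, stay safe!"))
print(dictionary_sentiment("Masks are useless."))
print(dictionary_sentiment("Bought a mask today"))
```

Any word missing from the dictionary is silently ignored, which is exactly the limitation described above.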

Data and Methods


This section describes the data collection and labeling process, as well as the details of our machine learning system. 

  1. Data Collection and Categorization from Twitter   


To create training data for our bot, we had to obtain a large number of tweets with our desired keywords (“mask”, “masks”, “social distancing”). To do this, we used the free tier of Twitter Developers, known as Sandbox. This allows us to fetch up to 5k historical tweets (tweets from longer than 7 days ago) per month, but unfortunately historical tweets are truncated (after 140 characters, the tweet is cut off). Instead of collecting historical tweets for our training data, we used the “Stream” function, which allowed us to obtain real-time tweets with no monthly limit. These were not truncated and were collected during July, when we were doing most of our classifications. We collected 1297 of these and hand labeled each of them into neutral (DIS), positive (POS), or negative (NEG) categories. An additional 87 were collected but couldn’t be used because they were in the wrong language or did not contain a keyword. We hypothesize that some of the returned tweets are actually retweets of tweets containing a keyword, but do not contain the keywords themselves. This is a potential issue we face in our final results.
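The keyword check that led us to discard some tweets can be sketched as follows. This is an illustration rather than the exact procedure we used: the function names are hypothetical, whole-word matching is an assumption, and the language check is left out.

```python
import re

# Whole-word, case-insensitive match for the three search terms (an assumption).
KEYWORD_PATTERN = re.compile(r"\b(?:masks?|social distancing)\b", re.IGNORECASE)

def contains_keyword(text):
    """True if the tweet text itself contains one of the search keywords."""
    return KEYWORD_PATTERN.search(text) is not None

def split_usable(tweets):
    """Separate tweets that contain a keyword from those that do not."""
    usable = [t for t in tweets if contains_keyword(t)]
    discarded = [t for t in tweets if not contains_keyword(t)]
    return usable, discarded

# Made-up example texts, not tweets from our dataset.
sample = ["Wear your MASK please", "RT: this is so important", "Social distancing works"]
usable, discarded = split_usable(sample)
print(len(usable), "usable,", len(discarded), "discarded")
```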

Although sorting tweets into negative, positive, and neutral categories seems like a simple thing to do, it is extremely subjective and difficult to judge. Many of the tweets analyzed were entirely incoherent and self-contradictory. At the beginning we believed that if a tweet was in favor of social distancing, it would also be in favor of masks, and vice versa. This was not always the case, so deciding where a tweet would go was often a judgment call. Tweets also tended to follow political orientation, but not as frequently as we expected before we began. You can find a full list of the tweets we used for training and testing along with their classifications below the Appendix. If you notice places where you think we misjudged, we’d love to discuss this with you so we can further increase the objectivity of our sentiment analysis bot.

NEG (Negative, as in against masks or social distancing)

  1. General negative language/phrases/feelings surrounding protective masks and/or social distancing to prevent the spread of COVID 19
  2. Indication that masks/social distancing are part of a conspiracy theory/government control
  3. Blatantly state they do not wear a mask (for any reason)
  4. Indication that a group of people does not have to wear a mask (even if the person themself is not within that group)*
  5. Proposing that masks lower oxygen intake 
  6. Indication that masks “don’t work”
  7. Proposing rebellions against mask wearing, or “cheating” the system
  8. Indicating “it makes no sense to do X because Y disease was much worse.”
  9. Wanting to open schools AND does not specifically suggest protective measures
  10. Apathy regarding masks or social distancing
  11. Against general mask wearing/social distancing mandates*
  12. Arguing against/ making a negative statement towards something or someone that promotes social distancing or masks
  13. Stating that coronavirus does not exist/ is exaggerated to justify not following safety guidelines
  14. Proposing that the best way to beat coronavirus is to “build up immunity” by defying public health guidelines. 
  15. Makes fun of/ is clearly against people or things that support mask wearing/ social distancing

POS (Positive, as in in favor of masks or social distancing)

  1. General positive language/phrases/feelings surrounding protective masks and/or social distancing to prevent the spread of COVID 19
  2. Implying that masks/social distancing work
  3. Mentioning that they wear a mask/social distance
  4. Arguing against/ making a negative statement towards something or someone that does not promote social distancing or masks
  5. Advertising masks (handmade or otherwise)
  6. Stating that coronavirus is not something trivial and/or should be feared. 
  7. Arguing that reasonable mandates are justified
  8. Encouraging others to follow guidelines

DIS (Discounted/Neutral, as in not clearly expressing a meaningful opinion)

  1. Neutral language/phrases/feelings surrounding protective masks for COVID 19
  2. Contradicting statements to the point the writer’s opinion is somewhat indiscernible, and the tweet does not blatantly conform to a specific rule in POS or NEG that would overwrite that issue. *
  3. People saying they want to understand another side (unclear where their actual opinions are)
  4. Calling people hypocrites for not wearing a mask*
  5. Not referring to masks in the context of COVID 19*
  6. Claiming that they are against a law/mandate that is ridiculous or unimaginable without further indication of their position.*

The full list of clarifications for rules can be found in the Appendix at the end of the paper. Rules that require clarifications are denoted by *.

Prior to the data labeling process, we assumed the issue of categorization would be more black and white than it really is. Although our philosophy towards categorization has evolved over our research, at the moment we base categorization on how the user’s words will affect others who read their tweet instead of just analyzing the writer’s opinion on their own. This makes sense for two reasons:

First is the volume of data that can be collected on this premise, and the accuracy of analysis. Most of our tweets would have to be thrown out as neutral under guidelines that only concern the users’ own assumed opinions, and in the end our data probably wouldn’t be a good reflection of the actual negative vs. positive bias.

Second is the fact that basing classification on how the tweet affects the mindsets of others who see it addresses our thesis better. It shows better how social media opinions on coronavirus affect real life circumstances, not the portrayal of real life circumstances in social media. The difference is that a user’s opinions are a symptom of a situation, whereas the effect of their opinions on others may collectively cause a situation. We are looking for how social media may predict circumstances rather than how circumstances predict the state of social media. 

An example of this is the following commercial tweet: 

“Pssst! I got a secret. Get at ADDITIONAL 20% OFF face masks that are already on sale!!! That’s around $6 a mask. Only if you buy 4 or more! Sale won’t last long. BUY NOW!!!


RT plz @XMenFilms @xmentas @WolverSteve @DailyXMenFacts #XMen #facemasksforsale https://t.co/BIBtjqb9YW

Although it would make sense to assume that someone selling masks is pro-mask, we have no evidence of this whatsoever. If we were to adopt a philosophy of sorting tweets based on an individual’s opinion, we would run the risk of being forced to classify a majority of our tweets into neutral categories and therefore have a skewed dataset. Instead, we consider this as a positive tweet by following our more holistic philosophy of connecting this tweet to its likely effect on those who read it. The specific tweet above promotes a societal acceptance and usage of masks, and people who read it will be affected by this philosophy. 

  2. Collecting Economic Data


In order to analyze consumer spending patterns during the pandemic, we sought out data made available by the Bureau of Economic Analysis under the United States Department of Commerce. On their website are monthly reports of Personal Income and Outlays, which illustrate consumer earning, spending, and saving. Personal outlay is the sum of Personal Consumption Expenditure, personal interest payments, and personal current transfer payments. 

Personal outlay can also be calculated as Personal Income minus Personal Savings and Personal Current Taxes. This tracks how much consumers spent overall within a month. Using Table 1 of the June 2020 Personal Income and Outlays report [1], we graphed Personal Outlays in billions of United States dollars against months in Google Spreadsheets (Figure 1). Since this is a monthly report with overall changes, there is only one data point for each month. Additionally, the dollar amounts are seasonally adjusted at annual rates, which helps remove seasonal patterns that may affect the data. All the dollar amounts in the following figures, as well as the index numbers, are seasonally adjusted at annual rates. We chose to include 4 months of data because the coronavirus pandemic started to impact the United States in mid-March, so the March data reflects both the pre-pandemic period and the pandemic itself.
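The two ways of computing personal outlays described in this section can be written side by side. The wording of the terms is our own shorthand, not official BEA notation:

```latex
\begin{align*}
\text{Personal outlays} &= \text{PCE} + \text{personal interest payments} + \text{personal current transfer payments} \\
\text{Personal outlays} &= \text{personal income} - \text{personal savings} - \text{personal current taxes}
\end{align*}
```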

Figure 1: Personal Outlays (in billions of US dollars)

Within the Personal Outlays, the subtopic of Personal Consumption Expenditure (PCE), also known as Consumer Spending, is a more specific measure of spending on consumer goods and services. PCE is the value of the goods and services purchased by, or on behalf of, “persons” who reside in the United States. Using the same June 2020 report from the BEA [1], we gathered the PCE in billions of dollars for the months February 2020 to June 2020. For the total amount, we used Table 1: Personal Income and Its Disposition (Months), and for the changes between months, we used Table 3: Personal Income and Its Disposition, Change from Preceding Period (Months). PCE is divided into two sections, goods and services. Within goods, there are two further subtopics: durable and non-durable goods. The first graph is the total amount (Figure 2).

Figure 2: Personal Consumption Expenditure (in billions of US dollars)

For more detail on PCE, we looked at a variety of different products and collected their PCE in billions of USD. We found the data in Excel spreadsheets linked under the Underlying Details section of Interactive Data on the Consumer Spending [3] page of the Bureau of Economic Analysis site, listed as SECTION 02: PERSONAL CONSUMPTION EXPENDITURE [2]. Through the Excel spreadsheet, we accessed Table 2.4.4U. Price Indexes for Personal Consumption Expenditures by Type of Product, which is under the spreadsheet code U20404. From the spreadsheet, we chose a range of goods and services that showed a variety of changes over the four months.

  1. Computer software and accessories
  2. Food and beverages purchased for off-premises consumption
  3. Food and nonalcoholic beverages purchased for off-premises consumption
  4. Food purchased for off-premises consumption
  5. Personal care products
  6. Electricity and gas
  7. Public transportation
  8. Air transportation
  9. Live entertainment, excluding sports
  10. Food services and accommodations
  11. Food services
  12. Personal care and clothing services
  13. Personal care services
  14. Hairdressing salons and personal grooming establishments

Using the data in the spreadsheet, we collected the PCE of each good or service in billions of USD, and graphed it using Google Spreadsheets (Figure 3).

Figure 3: Detailed Personal Consumption Expenditure (in billions of US dollars)

To go even more in depth on retail sales, we gathered data from the US Census Bureau [4] in their Advance Monthly Trade Report released in July. Using their customizable time series, we found the sales in millions of US dollars for the following: 

  1. Retail Trade and Food Services: U.S. Total
  2. Retail Trade: U.S. Total
  3. Grocery Stores: U.S. Total
  4. Health and Personal Care Stores: U.S. Total
  5. Clothing and Clothing Access. Stores: U.S. Total
  6. General Merchandise Stores: U.S. Total
  7. Department Stores: U.S. Total
  8. Nonstore Retailers: U.S. Total
  9. Food Services and Drinking Places: U.S. Total

Figure 4: Sales of Food and Retail Services (in millions of US dollars)

Another determinant associated with consumer spending is the PCE Price Index, which is a measure of the prices that people living in the United States, or those buying on their behalf, pay for goods and services, and reflects changes in consumer behavior. Utilizing the same June 2020 Personal Income and Outlays BEA report [1], we gathered the data from Table 9: Price Indexes for Personal Consumption Expenditures: Level and Percent Change from Preceding Period (Months). Using the percent change in index, we calculated the change based on the index in February, and graphed it across four months. 
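One plausible way to turn the reported month-over-month percent changes into a change relative to February is to compound them, as in the sketch below. The function name is hypothetical, compounding (rather than simple addition) is an assumption, and the input numbers are placeholders rather than BEA figures.

```python
def change_from_baseline(monthly_pct_changes):
    """Compound month-over-month percent changes into the cumulative percent
    change relative to the baseline month (February, which sits at 0)."""
    level = 1.0
    cumulative = []
    for pct in monthly_pct_changes:
        level *= 1.0 + pct / 100.0
        cumulative.append((level - 1.0) * 100.0)
    return cumulative

# Placeholder month-over-month changes in percent, NOT the published values.
example = [-0.5, -1.0, 0.5, 1.0]
print([round(c, 3) for c in change_from_baseline(example)])
```

For changes this small, compounding and simply adding the percentages give nearly the same curve, so the choice mostly matters for larger swings.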

Figure 5: Change in Personal Consumption Expenditure Price Index

As an additional correlation, we gathered the CPI, or Consumer Price Index and compared it with the PCE Price Index. The difference between the two indexes is that the CPI gathers data from consumers while the PCE Price Index is based on information from businesses. Moreover, CPI only tracks expenditures from all urban consumers while the PCE Price Index tracks spending from all households that purchase goods and services. See this resource by the BLS for more details on the differences [5].

For the CPI, we collected data from the Bureau of Labor Statistics, which is a bureau under the Department of Labor. On the site page CPI Databases, we accessed Tables of the series All Urban Consumers, which led us to the page Archived Consumer Price Index Supplemental Files, where we accessed News Release Table 3 [6], which is Consumer Price Index for All Urban Consumers (CPI-U): U.S. city average, special aggregate indexes, June 2020. We chose the exact same expenditures as we had in the PCE Price Index: Services, Durables, and Non-Durables, and collected the seasonally adjusted percent change within the months March 2020 to June 2020. Durable goods are not for immediate consumption, and thus are purchased infrequently, while non-durables are purchased on a frequent basis. Since each of the three percentages describes the change between two consecutive months, we assigned it to the later of the two. Using the percent change in index, we calculated the change based on the index in February, and graphed it across three months.

Figure 6: Change in Consumer Price Index

3) Machine Learning Methods to Classify Twitter Data


Once we had all our data prepared, we turned to Wolfram Alpha to start creating our sorting bot. We decided to include two separate machine learning algorithms as a part of our bot to increase accuracy. Our first bot sorted neutral tweets out from binary (negative or positive) tweets, while our second sorted the binary tweets into negative or positive categories. The first bot was trained and tested on all the data we sorted, while the second was only trained and tested on the non-neutral sorted training and test data. This didn’t make a significant impact on the amount of data the second bot was trained on, since neutral data only represented 21.14% of the sorted data.
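The two-stage arrangement described above can be sketched as a simple cascade. We built our bot in Wolfram, so the Python below is only an illustration; the function names are hypothetical and the two toy stages stand in for the trained classifiers.

```python
def classify_tweet(text, is_neutral, polarity):
    """Stage 1 filters out neutral tweets; stage 2 labels the rest POS or NEG."""
    if is_neutral(text):
        return "DIS"
    return polarity(text)

# Toy stand-ins so the sketch runs; the real stages were trained classifiers.
def toy_is_neutral(text):
    lowered = text.lower()
    return "mask" not in lowered and "distanc" not in lowered

def toy_polarity(text):
    return "NEG" if "never" in text.lower() else "POS"

for tweet in ["Nice weather today", "I always wear my mask", "I will never wear a mask"]:
    print(tweet, "->", classify_tweet(tweet, toy_is_neutral, toy_polarity))
```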

Wolfram has many classifier algorithms available to take advantage of, but since most are designed to be trained on numerical data as opposed to language data, they can be flawed when used for NLP (Natural Language Processing). Here are most of the options available in Alpha.

Percentage accuracy was one of our top priorities, but another important consideration was bias. We wanted to make sure that when our bot made a mistake, it wasn’t significantly more likely to make one sort of mistake than the other. This ended up ruling out some methods, because 100% of their errors were predicting “positive” when the label was “negative”. Note that we had anywhere from 302 to 374 pieces of test data depending on the type of test being run, so this was unlikely to be due to pure chance. We assume that these methods were not created for processing strings, and simply guessed the most probable option from the training data in every instance when the tested data were strings. This is an important reason to run different kinds of tests and analysis on bots besides just accuracy, because although these methods had high accuracy for our particular data, they were very unreliable.
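The kind of bias check described above can be sketched by counting mistakes by direction. The function names are hypothetical and the labels below are placeholders, not our test set.

```python
from collections import Counter

def error_directions(true_labels, predicted_labels):
    """Count each kind of mistake as a (true, predicted) pair."""
    return Counter(
        (t, p) for t, p in zip(true_labels, predicted_labels) if t != p
    )

def always_same_guess(predicted_labels):
    """True if the classifier produced one label for every input."""
    return len(set(predicted_labels)) == 1

# Placeholder labels: a classifier that answers POS no matter what.
truth = ["POS", "NEG", "POS", "NEG", "POS"]
preds = ["POS", "POS", "POS", "POS", "POS"]
print(error_directions(truth, preds))
print(always_same_guess(preds))
```

A classifier like this one scores 60% accuracy on the placeholder data while learning nothing, which is why accuracy alone was not enough for us to pick a method.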

Another important consideration that we kept in mind was how neutral tweets tended to be sorted when they were mis-sorted. In this scenario, it’s much harder to assume an “ideal” rate. Neutral tweets didn’t only include tweets entirely unrelated to COVID-19, so what the ideal sorting ratio for them truly is is much more difficult to hypothesize. Given time constraints, in our experiment we made the assumption that ideally neutral tweets should be sorted equally into “positive” and “negative” categories if they weren’t sorted as neutral. Although this is a metric that is much more difficult to control for, we took this final predicted ratio into consideration after processing our data to be able to better predict a confidence ratio for each datapoint. 

We ended up using Naive Bayes for the neutral vs. binary sorter, and a combination of Naive Bayes and Logistic Regression for the positive vs. negative sorter. This was done by collecting the probabilities for each outcome within each algorithm and multiplying each probability by its respective ratio (.65 for Logistic Regression, .35 for Naive Bayes), adding them across algorithms, and choosing the largest. 
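The weighted combination described above can be sketched as follows. The weights (.65 and .35) come from our setup; the function name and the example probabilities are placeholders, and the Python is an illustration since our bot itself was built in Wolfram.

```python
WEIGHTS = {"logistic_regression": 0.65, "naive_bayes": 0.35}

def combine(probabilities_by_model, weights=WEIGHTS):
    """Weighted sum of per-class probabilities; returns (label, combined scores)."""
    combined = {}
    for model, class_probs in probabilities_by_model.items():
        for label, p in class_probs.items():
            combined[label] = combined.get(label, 0.0) + weights[model] * p
    best = max(combined, key=combined.get)
    return best, combined

# Placeholder probabilities for a single tweet.
example = {
    "logistic_regression": {"POS": 0.40, "NEG": 0.60},
    "naive_bayes": {"POS": 0.70, "NEG": 0.30},
}
label, scores = combine(example)
print(label, scores)
```

In this made-up case Logistic Regression leans negative, but Naive Bayes is confident enough in the positive label to tip the combined score the other way.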

On their own, the respective accuracy ratios of the neutral sorter and binary sorter are estimated at P1 = 79.841% and P2 = 72.093%. Total accuracy is more difficult to calculate, because our main interest isn’t sorting every tweet into the correct group, it’s obtaining the correct ratio of tweets that are in certain groups. We don’t currently have an estimate of the former, but we do know for our test group what the assumed vs. true proportions are, and the proportion of correctly sorted tweets. For the proportion of correctly sorted tweets, we can assume that this number is roughly equivalent to P1 * P2 = 0.5755977. In the future, we hope to improve this overall accuracy by refining the algorithms we use and implementing synonym-based data augmentation strategies.
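Written out, the rough combined estimate is the product of the two stage accuracies. It treats every tweet as needing both stages to be right, which is a simplification, since a tweet correctly sorted as neutral never reaches the second stage; the product is therefore best read as a rough and probably conservative figure.

```latex
\[
P_{\text{total}} \approx P_1 \times P_2 = 0.79841 \times 0.72093 \approx 0.5756
\]
```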

One more issue we’d like to mention is that although using Wolfram allowed us to complete this project in a timely manner with a wide variety of options available to us, Wolfram is a closed source software and therefore the information about how their algorithms work is somewhat obscure. When using Naive Bayes in Wolfram, we didn’t know whether the function would automatically reformat or clean data for us, and if it did what issues it might’ve encountered when analyzing unknown strings like hashtags. Although we may or may not continue using Wolfram for future extensions of this project, keeping these issues in mind might help guide logistical and practical decisions in the future.

In getting our final data we were somewhat limited by time constraints and available computing power. Since only one of us had access to Wolfram Alpha, and because we did our computations locally, we were only able to process 100 pieces of data per month, save for July which also used our pre-labeled data. 

Figure 7: Proportion of Tweets by Sentiment (keywords: mask, masks, social distancing)

We graphed the data above by month to be able to match the economic data better, and because we did not have enough data to show a day-by-day graph. We would take the data from every day in the month, classify it, and then take the proportion of the data that fit that classification out of all of the data in that month. Unfortunately, historical tweets are truncated, so our algorithm guessed tweet sentiment solely based on the first 140 characters of each tweet for every month before July. In the future, we hope to find a work-around for this issue.
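The monthly proportion calculation described above can be sketched like this. The function name and the sample classifications are placeholders, not our results.

```python
from collections import Counter, defaultdict

def monthly_proportions(classified):
    """classified: iterable of (month, label) pairs.
    Returns {month: {label: share of that month's tweets}}."""
    counts = defaultdict(Counter)
    for month, label in classified:
        counts[month][label] += 1
    return {
        month: {label: n / sum(c.values()) for label, n in c.items()}
        for month, c in counts.items()
    }

# Placeholder classifications, not our results.
sample = [
    ("2020-04", "POS"), ("2020-04", "NEG"), ("2020-04", "POS"), ("2020-04", "DIS"),
    ("2020-05", "POS"), ("2020-05", "POS"),
]
print(monthly_proportions(sample))
```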

A final issue to mention is that our training data was not equally distributed. About 55.3241% of our human-labeled data was positive, 23.071% negative, and 21.2963% neutral. This means that our machine learning algorithms may hover around these ratios regardless of true values, and means that our data may be more conservative in percent changes than it should be. We decided to keep these ratios because changing them to be equal would significantly decrease the amount of labeled data we could use to train the bot, but we hope that as we work on this in the future, we will have enough data such that we could have equal ratios without having to sacrifice much of our labeled data. 
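To illustrate the trade-off we decided against, the sketch below balances a labeled set by randomly discarding examples from the larger classes. The function name is hypothetical and the data is a placeholder with roughly the skew described above. With the smallest class near 21% of the data, equal ratios would keep only about 64% of the labeled tweets.

```python
import random
from collections import defaultdict

def downsample_to_balance(examples, seed=0):
    """examples: list of (text, label). Keeps an equal number per label by
    randomly discarding examples from the larger classes."""
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    smallest = min(len(items) for items in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for items in by_label.values():
        balanced.extend(rng.sample(items, smallest))
    return balanced

# Placeholder data with roughly the skew described above.
data = [("t", "POS")] * 55 + [("t", "NEG")] * 23 + [("t", "DIS")] * 21
balanced = downsample_to_balance(data)
print(len(data), "->", len(balanced))
```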

Correlation Analysis Between Differing Data Forms


Figure 8: WHO Data on Confirmed Cases & Deaths in US 


Figures 1-6: As with all the economic data, we use publicly available data from the government through monthly reports; therefore we only have one datapoint for each month. This leaves only an estimate of the data between months. For example, the lowest dip shows it happened in April in Figure 6, but it does not show when in April. 

Figures 5 & 6: For the index data, we calculated the change of the index from the “original” index given in February (pre-pandemic), which means the y=0 line represents the February index. 

Figure 6: This graph, using data from the Bureau of Labor Statistics, only has three months, which is inconsistent with the rest of the economic data graphs. 

Figure 7: There are a number of reservations to be had with this graph and it should not be considered as fact. We hope to collect a higher volume of data, alongside more accurate evaluations for our future pursuits, so the data we have at the moment could be considered a ‘teaser’ for what is to come. Our machine learning software only has about a 75% accuracy rate and only 1796 pieces of data were used to create the graph due to data processing issues, alongside limits in usage for historical Twitter data. Not to mention, the algorithms we used do have considerable bias which could affect our data. Also, data is exhibited month-by-month because of a lack of data that would make day-by-day analysis jerky and confusing.

Figure 8: For the first part of Figure 8, since WHO data for the US comes from US government reports, there is some discrepancy in the numbers and the curve due to collection errors, lack of testing, underreporting, misdiagnosis, and other issues. Death counts, unlike death counts for seasonal diseases like the common flu, are not representative of real time. It often takes days to weeks to test the deceased, get the results, and send the reports to the National Center for Health Statistics. This means that the data is often based on past weeks and is not entirely current. In addition, even if deaths were reported on time, they would better represent the situation of the nation two weeks earlier than on that day, because it takes a while after contracting the virus for any patient to be at risk of death. The following article goes more in depth about the issue: https://fivethirtyeight.com/features/coronavirus-case-counts-are-meaningless/

Initial Observations


Figure 7: Throughout the data collection, positive tweets always seemed to be the majority, indicating that the majority of the US population is in support of mask usage and social distancing. This is supported by surveys of the general US population, which might indicate that opinions on social media about this particular issue are somewhat representative of the general population.

Interestingly, the data seems to go through significant changes between April and May, and also between June and July. Between April and May, a rising number of Twitter users seem to have positive opinions about mask wearing and social distancing, with fewer neutral comments, which could indicate that more people are talking about COVID 19, that more people are stating explicit opinions about masks and social distancing, or a combination of both during that interval of time. Between June and July, neutral rates stay somewhat constant but there’s a sharp increase in negative tweets and a decrease in positives. From this, we might hypothesize that given the length of time safety mandates have been in place, people are gradually getting more and more upset about the mandates and are changing their opinions on them.

Figures 1-4: These four figures represent the overall Consumer Spending, with Personal Outlays as the overall curve, and then PCE, detailed PCE, and sales of food and services. They all follow a similar dip in USD spent in April, which is succeeded by a slow growth back up. However, as of June, the numbers have not returned to pre-COVID levels, especially considering the rate at which our consumer spending was growing before that. Of the possible ways the economic impact of a pandemic like this can play out, our consumer spending may slowly yet surely return to a normal growth rate, since there is no indication in our graphs of a quick growth spurt to catch up to what could have been our consumer spending levels. A possible reason that Consumer Spending is recovering while COVID-19 is spreading is the prevalence of e-commerce, especially when it comes to retail. Even though unemployment is still an issue during these times, the stimulus check combined with online shopping and delivery may have helped spur spending in the economy. Another possible explanation is that many counties and states started reopening during May and June, allowing more in-person spending.

Figures 5 & 6: Unlike the other graphs, the price indexes see the dip in May instead of April. Since the price indexes are a measure of inflation, this could be showing a delayed effect of money spent on the prices of goods and services.

Fig 8: Both case counts and death counts rise significantly between April and May, and have somewhat bell-curve-like shapes during the interval. Case counts spike between July and August, while death counts begin to slowly increase day by day in the same interval, lagging behind case counts as expected. 

Between Figures

April-May: Between April and May most graphs seem to change dramatically. Figure 7 sees an increase in positive tweets and a decrease in neutral tweets. Figures 1-6 see a dip in most forms of in-person sales like restaurants or air travel. Figure 8 sees a local maximum in case counts and deaths. Because Figures 1-7 are monthly, it is more difficult to make general assumptions about associations between the data types. For example, we cannot say whether Twitter sentiments imply case counts or vice versa; we can simply say that they may be correlated. Regardless of implications, this does give us a hint of how social changes might be able to affect case counts, or how the reverse might be true.

June-August: Although little government economic data about this timeframe has been posted yet, we can make some assumptions and inferences about the correlations between Twitter sentiment data and case count data. Case counts rise dramatically during this time and reach a peak that is both local and global. Meanwhile, Twitter data indicates that positive sentiment for precautionary measures is dramatically falling (between late June and late July), whilst negative sentiment rises to its peak, about 380.67% higher than the highest previous point, which takes place in April. Upon gathering further Twitter data, and showing data on a day-by-day basis instead of month by month, we might find that the dramatic maximum that takes place mid-July for case counts can be explained by a rapidly diminishing concern for COVID safety precautions. For now, we can only speculate that this may be the case, but it would be very interesting and telling if it were to be true.

Possible Meaning


We’re intrigued by the idea that economic data and/or Twitter sentiment data may be able to be used to predict upcoming case rates. Although the idea that economic data can be used to predict spread of disease is not new, the idea that sentiment from social media can be used to predict (and possibly help prevent) diseases like this from spreading in the future is a new and revolutionary idea that gives reason to both hold social media companies more accountable, and to take these trends much more seriously than we have in the past. Although we’re far from making those conclusions, simply showing the possibility is important to us, and hopefully with more data we will be able to better understand these connections.

Conclusions and Future Directions


One way to strengthen our understanding of the multifaceted impact of the coronavirus is to combine qualitative and quantitative data. In this case, public opinion on social distancing measures is difficult to visualize; however, by using sentiment analysis on Twitter, we can better understand public sentiment. Since the interaction between the pandemic and society is so complex, we further explored the connection between more quantitative impacts such as consumer spending and COVID-19 data. For some of the economic data, we found similar dips between the two, especially in April. We also found that positive public sentiment decreased while consumer spending continued to rise. This type of analysis is applicable to real world problems, and, through deeper understanding, can lead to policy change and better preparation for future pandemics. 

Mentions for the Future

Economic Pathway:

A future direction would be to analyze the change in growth, which is the derivative of the economic data curves, and possibly correlate that with the derivative of the sentiment analysis data. Although consumer spending may be rising in dollar amounts, there could be indication that the rise is slowing, and such rate-of-change information could offer a deeper layer of analysis. 
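The rate-of-change comparison described above can be sketched in a few lines of Python. This is only a minimal illustration, not an analysis from this project: the helper names `first_differences` and `pearson` are our own, the "derivative" is approximated by month-to-month differences, and the values below are made-up placeholders rather than figures from the economic or Twitter datasets.

```python
from math import sqrt


def first_differences(values):
    """Discrete derivative: the change between consecutive observations."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Placeholder monthly values (hypothetical, NOT real data).
spending = [100.0, 90.0, 95.0, 102.0, 106.0, 108.0]
positive_share = [0.50, 0.40, 0.44, 0.47, 0.45, 0.41]

# Month-to-month change of each curve.
spending_growth = first_differences(spending)
sentiment_change = first_differences(positive_share)

# Correlate the two rate-of-change series.
r = pearson(spending_growth, sentiment_change)
print("spending growth per month:", spending_growth)
print("correlation of growth rates:", round(r, 3))
```

In the placeholder series, spending keeps rising in the last three months while its month-to-month growth shrinks, which is the kind of slowdown that only the differenced series makes visible.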

Twitter Pathway:

We plan to increase the accuracy of our sentiment analysis bot by increasing data intake. This may include synonym-based data augmentation and using computer-generated label lists with human corrections to speed up sorting. We also plan to collect more data so that results can be shown on a day-by-day basis, and to label the data alongside a timeline of political events that might have led to drastic changes in sentiment. We also plan to restructure our algorithms, and possibly migrate to a new programming language to gain finer and broader control over them. This, along with stronger data processing that might include word-to-word associations and grammatical constructs, will give us a more holistic bot that can tell us more about the data itself. We plan to obtain tweet volume data for our keywords and use it to estimate the number of tweets in each sentiment category per day. Finally, if we accomplish these goals, we plan to analyze more word-based analytics to better understand what people are talking about in the context of COVID-19 safety procedures, and what this can tell us about the spread of misinformation regarding COVID-19.
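One simple way the volume-based estimate mentioned above could work is to scale the sentiment proportions of a classified sample up to the total tweet volume for that day. The sketch below is in Python and is only illustrative: the function name `estimate_daily_counts`, the label strings, and the numbers are hypothetical, and it assumes that the classified sample is representative of all tweets posted that day.

```python
from collections import Counter


def estimate_daily_counts(sample_labels, daily_volume):
    """Scale a classified sample's sentiment proportions to the day's volume.

    sample_labels: sentiment labels the bot assigned to a sample of one
                   day's tweets (for example "POS", "NEG", "NEU").
    daily_volume:  total number of tweets matching the keywords that day.
    Assumes the sample is representative of the full day's tweets.
    """
    if not sample_labels:
        raise ValueError("need at least one classified tweet")
    counts = Counter(sample_labels)
    sample_size = len(sample_labels)
    return {
        label: daily_volume * count / sample_size
        for label, count in counts.items()
    }


# Hypothetical day: 10 classified tweets out of 2000 posted (NOT real data).
sample_labels = ["POS"] * 5 + ["NEG"] * 3 + ["NEU"] * 2
estimates = estimate_daily_counts(sample_labels, daily_volume=2000)
print(estimates)
```

With a sample this small the estimate would be very noisy; in practice the sample for each day would need to be large enough for the proportions to be stable.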

Time Delay: 

Another way to analyze the relationship between the economic and sentiment analysis data would be to look at the time delay between their curves. For example, we might observe that consumer spending increases a certain amount of time following a rise in positive public sentiment. 

Our results show this very relationship between the peaks of the respective graphs, but with a closer quantitative look, we could determine whether the time delay is consistent. This would be very interesting and may lead us to more ways to help prevent diseases from spreading as rapidly in the future.
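One common way to quantify such a delay is to shift one series against the other and see which shift gives the strongest correlation. The Python sketch below illustrates that idea under stated assumptions: `pearson` and `best_lag` are our own hypothetical helpers, only delays where spending follows sentiment are tested, and the two series are synthetic, built so that the second repeats the first two time steps later. It is not an analysis of the project's data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def best_lag(leader, follower, max_lag):
    """Find the delay at which follower best tracks leader.

    For each lag k from 0 to max_lag, leader[t] is paired with
    follower[t + k]. Returns (lag, correlation) for the lag with the
    highest correlation among those tested.
    """
    best = None
    for lag in range(max_lag + 1):
        xs = leader[: len(leader) - lag]
        ys = follower[lag:]
        n = min(len(xs), len(ys))
        if n < 3:
            break  # too few overlapping points to compare
        r = pearson(xs[:n], ys[:n])
        if best is None or r > best[1]:
            best = (lag, r)
    if best is None:
        raise ValueError("series are too short for the requested lags")
    return best


# Synthetic series (NOT real data): spending repeats sentiment 2 steps later.
positive_sentiment = [0, 1, 0, 2, 0, 3, 0, 1, 0, 2]
consumer_spending = [5, 5, 0, 1, 0, 2, 0, 3, 0, 1]

lag, r = best_lag(positive_sentiment, consumer_spending, max_lag=4)
print("best lag:", lag, "correlation:", round(r, 3))
```

Running the same search over different stretches of the timeline and comparing the best lag found in each would be one way to check whether the delay is consistent.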



A huge thank you to our graduate student mentors Surin and Leighton in this project for always being there for us to help answer any questions we had, putting us in contact with people to give us advice, helping format the paper, and generally always being there for us to give valuable advice and support us along the way. 

Another thank you to our overseeing professor, Ayfer Ozgur. Her support helped guide us through this project and helped us feel motivated through every step. 

Thank you to Huseyin Inan for giving us great advice on our project and supporting us.

I (Noemi) also wanted to thank Megan Davis for helping me through learning Wolfram so I was able to do this project. 

Appendix: Clarifications/Mentions for Rules for Sorting Tweets


Curse words and retweets in tweets are censored like so: kjnwef

General Mention 1: Sarcasm negates all rules. For example, if someone was advertising masks (POS 5) but doing it in a sarcastic or joking way with fake/non-existent masks, this would be considered a negative. Negating a rule also would likely categorize a tweet in the category opposite of the original rule that was negated. This doesn’t always apply for rules in DIS.

General Mention 2: If a person clearly is attempting to express one opinion or another, but is not using the right words, they will still be sorted under that category assuming that they misunderstood the meaning of their words. An example tweet to illustrate our point:

That shit be pissing me off. People putting other people in danger because of their negligence.

I don’t even know you but GIRL FUCK YOU!

And fuck anyone who refuses to practice herd immunity and wear mask to protect their loved ones and others.

This shit is NOT A GAME.

This tweet follows POS 3, POS 4 and POS 6. The only issue is that the statement “practice herd immunity” could imply that the user was attempting to convey NEG 16. In this case, we can assume that by “herd immunity” the user meant “social distancing” based on everything else she said. Of course, making assumptions always leaves room for error, but human languages require these assumptions of us if we are to semi-accurately predict collective intended meaning. 

General Mention 3: If a person’s tweet fits into one of the neutral rules, but also has qualities that would fit into negative or positive, it will almost always be sorted into whichever binary (non neutral) category it shares a rule with. There are a few exceptions, but in general negative/positive rules will overwrite neutral rules unless that neutral rule encompasses the binary rule. See the following example of an exception where neutral overwrites binary:

Yep. #MaskUp America. Unless of course you’re the exalted Dr. Fauci and the Mrs. I’m guessing just like protesting/looting, there’s an invisible shield that protects you when at a baseball game. #Hypocrite

This tweet contains both (POS 8) and sarcasm, which might normally put it in the NEG category because sarcasm negates the rules it touches. However, since the entire tweet is framed around showing that a certain person is a hypocrite, we chose to put it in neutral.
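The precedence described in General Mentions 1 and 3 can be summarized as a small decision procedure. The Python sketch below is a deliberate simplification for illustration only: our sorting was done by hand with judgment calls ("almost always", "likely"), the function name `resolve_label` and its flags are hypothetical, the category strings "POS", "NEG" and "NEU" are shorthand, and the DIS rules (for which sarcasm does not always negate) are not modeled.

```python
def resolve_label(matched_categories, sarcastic=False, neutral_encompasses=False):
    """Combine the rule categories a tweet matched into one label.

    matched_categories:  set of categories whose rules the tweet matched,
                         drawn from "POS", "NEG" and "NEU".
    sarcastic:           True if a human judged the tweet to be sarcastic.
    neutral_encompasses: True if a neutral rule frames the whole tweet
                         (e.g. the tweet is entirely about hypocrisy).
    Returns "POS", "NEG", "NEU", or None when this sketch cannot decide.
    """
    # Exception from General Mention 3: an encompassing neutral rule wins.
    if neutral_encompasses:
        return "NEU"

    binary = {c for c in matched_categories if c in ("POS", "NEG")}

    # The rules give no general tie-break when both POS and NEG match.
    if len(binary) == 2:
        return None

    if binary:
        label = next(iter(binary))
        # General Mention 1: sarcasm flips the rule to the opposite category.
        if sarcastic:
            label = "NEG" if label == "POS" else "POS"
        return label

    # No binary rule matched: neutral only if a neutral rule actually matched.
    return "NEU" if "NEU" in matched_categories else None


# The sarcastic #MaskUp tweet: POS rule plus sarcasm, but framed as hypocrisy.
print(resolve_label({"POS"}, sarcastic=True, neutral_encompasses=True))
```

Returning `None` for the undecided cases keeps the sketch from inventing a tie-break that the rules above do not state.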

NEG 4: This includes people who state “people with asthma should not have to wear masks”, “children should not wear masks”, “STRONG, NOT-SICK PEOPLE SHOULDN’T WEAR MASKS”, or otherwise indicate that certain groups are exempt from mask wearing. We as researchers acknowledge that there are medical conditions that would warrant not wearing a mask, but these mainly include severe skin conditions and severe particle allergies that would likely bar the patient from going outside at all during this time. For this reason, stating that certain people should not have to wear a mask would likely put a tweet in the NEG category and overwrite most positive statements. 

NEG 11: This mainly refers to people who do not necessarily state explicitly that they are against masks, but who are against reasonable mask mandates (e.g., being required to wear a mask to enter a store, or receiving a fine for not wearing a mask). This is often characterized by a user stating that a mandate would violate civil liberties, or otherwise violate a human right of theirs. This does not include people who state that they are against mandates to wear a mask/social distance in a domestic setting, or who claim to be against other ridiculous or unrealistic mandates. People who state the above will probably instead be sorted into a neutral category.

DIS 2: If there is little to no way for us to discern a user’s opinion with even a small degree of accuracy, we will likely choose to discard the tweet. This is because even if the user clearly has some opinion, by labeling it we risk misrepresenting them, and because other people who have read the tweet may not have understood it either, making it less important to us in the first place.

  1. “@smogmonster89 @savesnine1 @sainsburys Yet you ID people who abuse you? People in their late 20s who say “are you taking the piss?” Becuase it’s the law; if wearing a mask is also meant to be the law surely it’s same as ID’ing people, just my thoughts”
  2. “More credible then u or trump, but u pitch only to his gullible base. So w/out context, u bring up mask issue I know trump did everything perfect,on face value thats moronic Ur blathering nonsense isnt helping reelect. Thinking repub @JesseBWatters @BillOReilly @newtgingrich”

Above are two examples of tweets that were labeled as neutral, despite being very clearly opinionated. They appear to be self-contradictory, could be argued to represent either side, and have inconsistent and confusing grammar. To some extent, it’s unclear whether they’re even talking about masks. For tweets that are entirely incomprehensible, it’s better to leave them out of the main dataset.

DIS 4: This particular issue came about when there was controversy around Anthony Fauci for not wearing a mask or social distancing during certain parts of a baseball game. Many users criticized him for not wearing a mask without making their own opinions clear. One would think that these tweets would follow POS 4, but some users were criticizing Fauci solely because he was not wearing a mask, whilst others were only criticizing him for being a hypocrite, and not necessarily for going without a mask. I noticed that this held in general: when someone was criticized for being hypocritical, the criticism didn’t necessarily convey the user’s opinion of the action itself; sometimes it conveyed only their opinion of the person. For this reason, tweets under this category that do not explicitly follow some other rule that would make them positive or negative will often be labeled as neutral.

Anthony Fauci: It’s ‘Mischievous’ to Criticize Me Taking Off Mask in Baseball Stands:We all need to understand when DEM/Socialist and theGODS of the Media make the laws they are making those laws for you and me not for them they are above the law https://t.co/2txBqx5nU7

The above tweet is very obviously anti-Fauci, and we might assume that the user is anti-mask, but because the article they linked was only explicitly “anti-Fauci” and didn’t mention its take on masks, and because the user themself did not mention their take on masks, we have to put the tweet in the neutral category. I considered putting this tweet in the negative category because the last part of the tweet seems to indicate NEG 6 by implying that “mask rules” only apply to people who aren’t Democrats. However, because we can’t differentiate between anger at a belief that the rules are only applied to them and anger that necessary rules don’t apply to others, we cannot categorize it.

DIS 5: This category includes tweets that aren’t talking about masks in the context of COVID-19. For example, they may be referencing Batman’s mask, masking emotions, or other non-medical uses of the word. Below is my favorite example of this rule.

@REMills2 I’m an abusive pageant mom. Every day I shake him by his tiny little shoulders and say “ONLY WINNERS IN MY HOUSE” and he sheds a single glistening anime tear, knowing the mask of fame must continue to hide his deep dissatisfaction and emptiness. Alexa play Lucky by Britney Spears

DIS 6: We came across a few tweets whose authors said they were completely against “mandates requiring you to wear a mask in your own home”. It makes sense to be completely against such a rule if it existed, regardless of one’s position on masks, so if the tweet gave no other indication of the user’s position on the issue, it was sorted into this category.

Twitter Data





[1] Personal Income and Outlay, June 2020, and Annual Update, Bureau of Economic Analysis, July 31, 2020

[2] SECTION 02: PERSONAL CONSUMPTION EXPENDITURE, Underlying Details, Consumer Spending, Bureau of Economic Analysis

[3] Consumer Spending, Bureau of Economic Analysis, July 31, 2020

[4] Advance Monthly Retail Trade Report, US Census Bureau, July 16, 2020

[5] Differences between Consumer Price Index and Personal Consumption Price Index, Bureau of Economic Analysis, May 2011

[6] News Release Table 3 June 2020, Archived Consumer Price Index Supplemental Files, Bureau of Labor Statistics, June 2020
