US Twitter Election

Data Analytics Case Study – 2020

The Project

In 2020, we experienced one of the most infamous United States elections in history, with Donald Trump and Joe Biden competing for the presidency. In the run-up to Election Day, my aim was to build a machine learning algorithm in Python that would scan tweets about the election and its candidates in an attempt to forecast the outcome.

The Plan

  1. Create an algorithm that parses tweets and categorizes them as “Mentions Trump”, “Mentions Biden”, “Mentions Neither”, or “Mentions Both”.
  2. Conceptualize how humans understand natural language and emotion in plain text, so that this understanding of language and tone can be translated into a Python algorithm.
  3. Create a scoring method that analyses a tweet’s negative, positive, or neutral impact in order to estimate a Twitter account’s likelihood of voting for a certain candidate.
  4. Run 150,000 tweets through the developed machine learning algorithm and examine the results to develop a hypothesis.
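
The first step of the plan can be illustrated with a minimal sketch. The original matching rules are not described here, so the keyword patterns and the function name `categorize_tweet` below are assumptions made for illustration only, not the project's actual implementation.

```python
import re

# Hypothetical keyword patterns (assumption): match the candidate's surname or
# a likely handle, but not when it is embedded in a longer word ("trumpet").
TRUMP_PATTERN = re.compile(r"(?<![a-z])(?:realdonald)?trump(?![a-z])", re.IGNORECASE)
BIDEN_PATTERN = re.compile(r"(?<![a-z])(?:joe)?biden(?![a-z])", re.IGNORECASE)


def categorize_tweet(text: str) -> str:
    """Return one of the four mention categories for a tweet."""
    mentions_trump = bool(TRUMP_PATTERN.search(text))
    mentions_biden = bool(BIDEN_PATTERN.search(text))
    if mentions_trump and mentions_biden:
        return "Mentions Both"
    if mentions_trump:
        return "Mentions Trump"
    if mentions_biden:
        return "Mentions Biden"
    return "Mentions Neither"


if __name__ == "__main__":
    for tweet in ["Trump and Biden debate tonight", "Nice weather today"]:
        print(tweet, "->", categorize_tweet(tweet))
```

Plain keyword matching is the simplest possible choice; it will miss nicknames and misspellings, which a real pipeline would need to handle.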

My Role

In this project, I was the primary programmer and designer of the algorithm, and I also built the human language interpretation guide. I reasoned that people do not use a single tone when constructing a phrase; rather, tone can vary throughout, and those variations are lost in text. I developed a scoring system that takes each word and evaluates its weight in relation to its neighbouring words, producing a compound score for the entire tweet.
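
One way to picture a neighbour-aware scoring scheme is the sketch below. It is not the project's actual scoring system: the word lists, the weights, the two-word look-back window, and the names `score_tweet` and `label_score` are all hypothetical, chosen only to show how a word's score can be modified by the words before it and then summed into a compound score.

```python
# Hypothetical lexicon and modifiers (assumptions for illustration only).
WORD_SCORES = {"great": 2.0, "good": 1.0, "love": 2.0,
               "bad": -1.0, "terrible": -2.0, "hate": -2.0}
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "really": 1.5, "slightly": 0.5}


def score_tweet(text: str) -> float:
    """Sum word scores, adjusting each by up to two preceding words."""
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    total = 0.0
    for i, word in enumerate(words):
        if word not in WORD_SCORES:
            continue
        score = WORD_SCORES[word]
        # Neighbouring words change the meaning: negators flip the sign,
        # intensifiers scale the magnitude.
        for prev in words[max(0, i - 2):i]:
            if prev in NEGATORS:
                score = -score
            elif prev in INTENSIFIERS:
                score *= INTENSIFIERS[prev]
        total += score
    return total


def label_score(score: float) -> str:
    """Map a compound score to a positive, negative, or neutral label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["This is really great", "not very good", "Nice day"]:
        s = score_tweet(tweet)
        print(tweet, "->", s, label_score(s))
```

Under these assumed weights, "not very good" scores lower than "not good", which reflects the idea that a word's contribution depends on its neighbours rather than on the word alone.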

Discoveries

After running all the tweets through our algorithm, our team found a difference of less than 3% in the popular vote between the candidates, which was insufficient to make a firm prediction of the election winner. What we did uncover, however, was a projection of a large number of prospective non-voters, since the algorithm detected widespread dissatisfaction in the “Mentions Both” group. Based on this category and the high degree of animosity toward Donald Trump, our system predicted:

– The US election will be a close fight, with a difference of less than 3% in the popular vote between the contenders.

– Donald Trump will lose the election due to his slightly negative reputation and low voter turnout.

Final Takeaways

While the algorithm did give insight into the probable election outcome, our team concluded that tweets alone were insufficient to produce a definitive forecast of the winner of the 2020 US Election. We did find that Twitter has become a popular venue for political campaigns, and that Twitter data may be used to anticipate real-world events such as elections and stock prices.

Platforms Used

  • Google Colaboratory
  • Slack