To Impossible and Beyond: Social Media Sentiment Analysis of Plant-Based Patties
Plant-based alternatives to meat have become a staple on supermarket shelves and restaurant menus. This trend has been bolstered by the increasing popularity of veganism and sustainable diets, as well as by significant product innovations. Gone are the days of veggie burgers that bore no resemblance to burgers made with beef. The plant-based patties of today, primarily produced by Impossible Foods and Beyond Meat, claim to be nearly identical to the real deal in appearance, texture, and taste. As these alternatives steadily gain popularity and eat into the meat industry’s market share, I became curious about public opinion of these products.
Social media offers a window into public opinion, and I chose Twitter as my source. Focusing on tweets containing the terms “impossible burger” or “beyond burger”, I collected 40,000 recent tweets for each term.
Goals:
- Get tweets: twitter_scraper.pdf
- Create training sets: make_training_set.pdf
- Impossible Burger: impossible_analyze_tweets.pdf
- Beyond Burger: beyond_analyze_tweets.pdf
I first labeled each tweet as positive or negative using VADER, a rule-based sentiment analysis method, and then visualized the positive and negative sets of tweets separately as word clouds to gain familiarity with each set. These positive and negative sets of tweets are then used to train a machine learning model to perform sentiment prediction.
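The labeling step could look roughly like the sketch below. It assumes the `vaderSentiment` package and the compound-score cutoffs of ±0.05 that the VADER documentation suggests; the cutoffs actually used for this project are not stated, and the helper names `label_from_compound` and `split_by_sentiment` are hypothetical.

```python
from typing import Dict, List

# Assumed cutoffs: the VADER documentation suggests +/-0.05 on the compound
# score. The thresholds used in this project are not stated.
POSITIVE_CUTOFF = 0.05
NEGATIVE_CUTOFF = -0.05


def label_from_compound(compound: float) -> str:
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= POSITIVE_CUTOFF:
        return "positive"
    if compound <= NEGATIVE_CUTOFF:
        return "negative"
    return "neutral"


def split_by_sentiment(tweets: List[str]) -> Dict[str, List[str]]:
    """Score each tweet with VADER and group the tweets by label."""
    # Imported here so the labeling helper above works without the package.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    groups: Dict[str, List[str]] = {"positive": [], "negative": [], "neutral": []}
    for tweet in tweets:
        compound = analyzer.polarity_scores(tweet)["compound"]
        groups[label_from_compound(compound)].append(tweet)
    return groups
```

Calling `split_by_sentiment(tweets)` on a list of tweet texts returns three lists; the positive and negative ones correspond to the sets visualized in the word clouds.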
Impossible Burger
Positive Sentiment
[Image: Wordcloud for Impossible tweets with positive sentiment]
Negative Sentiment
[Image: Wordcloud for Impossible tweets with negative sentiment]
Beyond Burger
Positive Sentiment
[Image: Wordcloud for Beyond tweets with positive sentiment]
Negative Sentiment
[Image: Wordcloud for Beyond tweets with negative sentiment]
I extracted the hashtags from the positive and negative sets of tweets and calculated the frequency of usage for each.
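A minimal sketch of this step, using only the Python standard library, might look like the following. The function name `hashtag_frequencies` is hypothetical, and lowercasing hashtags before counting is an assumption; the original analysis may have kept case.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# A hashtag is "#" followed by one or more word characters.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def hashtag_frequencies(tweets: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (hashtag, count) pairs sorted from most to least frequent."""
    counts: Counter = Counter()
    for tweet in tweets:
        # Assumption: treat "#Vegan" and "#vegan" as the same hashtag.
        counts.update(tag.lower() for tag in HASHTAG_PATTERN.findall(tweet))
    return counts.most_common()
```

Running it once on the positive set and once on the negative set gives one frequency table per sentiment.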
Impossible Burger
Positive Sentiment
[Image: Dataframe for Impossible hashtags with positive sentiment]
Negative Sentiment
[Image: Dataframe for Impossible hashtags with negative sentiment]
Beyond Burger
Positive Sentiment
[Image: Dataframe for Beyond hashtags with positive sentiment]
Negative Sentiment
[Image: Dataframe for Beyond hashtags with negative sentiment]
Feature extraction is the process of converting text into vectors of numbers that can be used as input to train machine learning models. Bag-of-Words and TF-IDF (term frequency-inverse document frequency) are two methods of feature extraction. Bag-of-Words represents how often each word occurs in a document (here, a tweet). TF-IDF represents how important each word is to a document within a group of documents (here, a set of tweets): a word scores high when it is frequent in one tweet but rare across the set.
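To make the difference concrete, here is a small pure-Python sketch of both representations. It uses a whitespace tokenizer and the textbook weighting tf × log(N / df), both of which are assumptions for illustration; libraries such as scikit-learn (`CountVectorizer`, `TfidfVectorizer`) use a smoothed variant of idf, so their numbers will differ.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bag_of_words(docs: List[str]) -> List[Dict[str, int]]:
    """Count how often each word occurs in each document."""
    return [dict(Counter(tokenize(doc))) for doc in docs]


def tf_idf(docs: List[str]) -> List[Dict[str, float]]:
    """Weight each word by term frequency times inverse document frequency."""
    counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each word.
    doc_freq = Counter(word for doc in counts for word in doc)
    weights = []
    for doc in counts:
        total = sum(doc.values())
        weights.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in doc.items()
        })
    return weights
```

For the two toy tweets "tasty burger" and "bland burger", Bag-of-Words gives "burger" a count of 1 in each, while TF-IDF gives it a weight of 0 because it appears in every document and so does not help tell them apart.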
Machine learning models compared: logistic regression, XGBoost, and decision trees.
The F1 score, the harmonic mean of precision and recall, is used to measure the effectiveness of each combination of machine learning model and feature extraction method. For Impossible Burger, the combination with the highest F1 score (0.665513) is logistic regression using TF-IDF features. For Beyond Burger, the combination with the highest F1 score (0.623679) is logistic regression using Bag-of-Words features.
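For reference, the F1 score can be computed from true and predicted labels as in the sketch below. The function is a plain-Python illustration, not the code used for the reported numbers. It treats label 1 as the class of interest by default, which is an assumption: the write-up does not state which class the reported scores refer to.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Compute F1 = 2 * precision * recall / (precision + recall)."""
    pairs = list(zip(y_true, y_pred))
    # True positives, false positives, and false negatives for the chosen class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct predictions of the chosen class: define F1 as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because F1 combines precision and recall, a model cannot score well by predicting only the majority class, which makes it a more informative metric than accuracy when the positive and negative sets differ in size.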
Impossible Burger
Bag-of-Words Features
[Image: Dataframe for Impossible Burger Bag-of-Words comparison among machine learning models]
TF-IDF Features
[Image: Dataframe for Impossible Burger TF-IDF comparison among machine learning models]
Bag-of-Words vs TF-IDF
[Image: Dataframe for Impossible Burger comparison of Bag-of-Words and TF-IDF for logistic regression]
Beyond Burger
Bag-of-Words Features
[Image: Dataframe for Beyond Burger Bag-of-Words comparison among machine learning models]
TF-IDF Features
[Image: Dataframe for Beyond Burger TF-IDF comparison among machine learning models]
Bag-of-Words vs TF-IDF
[Image: Dataframe for Beyond Burger comparison of Bag-of-Words and TF-IDF for logistic regression]
Finally, I used the logistic regression model to predict the sentiment of tweets (0 for positive, 1 for negative). The Impossible Burger set used TF-IDF features, while the Beyond Burger set used Bag-of-Words features.
Impossible Burger
[Image: Dataframe for Impossible Burger Sentiment Predictions]
Beyond Burger
[Image: Dataframe for Beyond Burger Sentiment Predictions]