To Impossible and Beyond: Social Media Sentiment Analysis of Plant-Based Patties

Plant-based alternatives to meat have become a staple on supermarket shelves and restaurant menus. This trend has been bolstered by the increasing popularity of veganism and sustainable diets, as well as by significant product innovations. Gone are the days of veggie burgers that bore no resemblance to burgers made with beef. Today's plant-based patties, primarily produced by Impossible Foods and Beyond Meat, claim to be nearly identical to the real deal in appearance, texture, and taste. As these alternatives steadily gain popularity and eat into the meat industry’s market share, I wanted to find out what the public thinks of them.

Social media is a useful gauge of public opinion, and I chose Twitter as my source. Focusing on tweets containing the terms “impossible burger” or “beyond burger”, I collected 40,000 recent tweets for each term.

Goals:

Get 40,000 tweets for each of the "impossible burger" and "beyond burger" search terms (30,000 to train the machine learning model and 10,000 to test it)

Create training sets by labeling each tweet as positive (0) or negative (1) in sentiment using the rule-based method VADER

Visualize positive and negative training sets of tweets in word clouds to gain familiarity with each set

Compare combinations of feature extraction methods (Bag-of-Words and TF-IDF) and machine learning models (Logistic Regression, XGBoost, Decision Trees) to find the combination with the highest F1 score

Train the best-scoring machine learning model using features extracted from the training sets of tweets

Apply the trained machine learning model to predict the sentiments of tweets in the test sets

Scripts

Get tweets: twitter_scraper.pdf
Create training sets: make_training_set.pdf
Impossible Burger: impossible_analyze_tweets.pdf
Beyond Burger: beyond_analyze_tweets.pdf

Word Cloud Visualizations

After labeling the tweets as positive or negative with the rule-based method VADER, I visualized the positive and negative sets separately as word clouds to gain familiarity with each set. These labeled sets are then used to train a machine learning model for sentiment prediction.
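As a rough illustration of the labeling step (not the code from make_training_set.pdf), the sketch below maps VADER's compound score to the 0/1 labels used here. The cutoff at zero and the helper names `label_from_compound` and `label_tweets` are my assumptions; the actual threshold used in the project may differ.

```python
def label_from_compound(compound: float) -> int:
    """Map a VADER compound score (-1 to 1) to this project's labels.

    0 = positive, 1 = negative. Treating every score below zero as
    negative is an assumption made for illustration.
    """
    return 1 if compound < 0 else 0


def label_tweets(tweets):
    """Return (tweet, label) pairs using VADER's compound score."""
    # Imported lazily so the helper above works without the package installed.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return [
        (tweet, label_from_compound(analyzer.polarity_scores(tweet)["compound"]))
        for tweet in tweets
    ]
```

Calling `label_tweets` requires the `vaderSentiment` package. Because only two labels are used, tweets that VADER scores as neutral have to fall on one side of the cutoff; here they are counted as positive.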

Impossible Burger

Positive Sentiment

Negative Sentiment

Beyond Burger

Positive Sentiment

Negative Sentiment

Hashtags

I extracted the hashtags from the positive and negative sets of tweets and counted how often each one was used.
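A minimal sketch of hashtag extraction and counting, using only the standard library. The regular expression, the lowercasing, and the function name `hashtag_frequencies` are assumptions for illustration rather than the project's actual code.

```python
import re
from collections import Counter

# A hashtag is taken to be "#" followed by word characters (an assumption).
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def hashtag_frequencies(tweets):
    """Count how often each hashtag appears across a list of tweets."""
    counts = Counter()
    for tweet in tweets:
        # Lowercase so that #Vegan and #vegan are counted together.
        counts.update(tag.lower() for tag in HASHTAG_PATTERN.findall(tweet))
    return counts


sample_tweets = [
    "Tried the #ImpossibleBurger today #vegan",
    "Best #vegan lunch ever",
]
frequencies = hashtag_frequencies(sample_tweets)
```

`frequencies.most_common(10)` then gives the ten most used hashtags, which is a convenient input for a bar chart.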

Impossible Burger

Positive Sentiment

Negative Sentiment

Beyond Burger

Positive Sentiment

Negative Sentiment

Machine Learning Model Comparisons

Feature extraction converts text into vectors of numbers that can be used as input to train machine learning models. Bag-of-Words and TF-IDF (term frequency-inverse document frequency) are two methods of feature extraction. Bag-of-Words represents each document (tweet) by how often each word occurs in it. TF-IDF weights each word by its importance to a document relative to a group of documents (the set of tweets), so words that appear in nearly every tweet count for less.
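To make the difference concrete, here is a small pure-Python sketch of both representations. It uses whitespace tokenization and the textbook weighting tf × log(N / df), both of which are simplifying assumptions; library implementations typically add smoothing and normalization, so their numbers will differ.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return the sorted vocabulary and one word-count vector per document."""
    vocab = sorted({word for doc in docs for word in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return the vocabulary and one TF-IDF vector per document."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Number of documents that contain each vocabulary word.
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    vectors = []
    for row in counts:
        total = sum(row) or 1  # avoid dividing by zero for empty documents
        vectors.append(
            [(c / total) * math.log(n_docs / doc_freq[j]) for j, c in enumerate(row)]
        )
    return vocab, vectors


docs = ["great burger", "bad burger"]
vocab, bow = bag_of_words(docs)
_, weights = tf_idf(docs)
```

In this toy example "burger" appears in both tweets, so its TF-IDF weight is zero, while "great" and "bad" keep positive weights because each distinguishes one tweet from the other.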

Machine learning models: Logistic Regression, XGBoost, Decision Trees

The F1 score, the harmonic mean of precision and recall, measures the effectiveness of each combination of machine learning model and feature extraction method. For Impossible Burger, the combination with the highest F1 score (0.665513) is logistic regression using TF-IDF features. For Beyond Burger, the combination with the highest F1 score (0.623679) is logistic regression using Bag-of-Words features.
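For reference, the F1 score can be computed from true and predicted labels as sketched below. The function name and the choice of label 1 (negative sentiment in this project's labeling) as the scored class are assumptions; the source does not say which class the reported scores refer to.

```python
def f1_score(y_true, y_pred, target=1):
    """F1 = 2 * precision * recall / (precision + recall) for the target label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    if tp == 0:
        return 0.0  # no correct predictions of the target class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)
```

Here precision and recall are both 2/3, so the F1 score is also 2/3.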

Impossible Burger

Bag-of-Words Features

TF-IDF Features

Dataframe for Impossible Burger TF-IDF comparison among machine learning models

Bag-of-Words vs TF-IDF

Dataframe for Impossible Burger comparison of Bag-of-Words and TF-IDF for logistic regression

Beyond Burger

Bag-of-Words Features

TF-IDF Features

Dataframe for Beyond Burger TF-IDF comparison among machine learning models

Bag-of-Words vs TF-IDF

Dataframe for Beyond Burger comparison of Bag-of-Words and TF-IDF for logistic regression

Sentiment Predictions

The trained logistic regression model predicted the sentiment (0 for positive, 1 for negative) of the tweets in each test set. The Impossible Burger set used TF-IDF features, while the Beyond Burger set used Bag-of-Words features.
