For the last few weeks I have been working on a small project: building a sentiment analysis engine in the cloud using TensorFlow and BigQuery ML. In December, I wrote an article for Search Engine Land called Brand reputation and the impact of Google SERP selections. In that article I attempted to show how often, for company brand terms, Google displays what I consider leading questions in search results. For example, “why is apple so bad?”. Sure, Google can say that it has data in its black box that points to this being a helpful question, but I am sure Apple would like to know the details of why this question is, or was, displayed so prominently across their 4.7M brand searches per month.
In the article, I used several sentiment analysis projects as well as Google’s own API to classify the sentiment of People Also Ask (PAA) results for the 500 brands in the popular Fortune 500 dataset. I also built a Data Studio dashboard so readers could explore the data I had collected. The one thing that bugged me was that the sentiment data was a fixed dataset and couldn’t capture the sentiment of new PAA results appearing over time. I set out to change that, and to become more familiar with TensorFlow and BigQuery ML along the way.
In this article, you will find the following:
- We will introduce BigQuery ML.
- We will go over our project to host a sentiment model on BigQuery ML.
- We will share code for fine-tuning your own BERT sentiment model.
- We will share a Data Studio dashboard that shows how to use the data for your brand.
BigQuery ML is a wonderful offering from Google. As digital marketers, many of us have tons of data stored or accessible in BigQuery datasets. Google makes it easy to leverage this data with its Data Studio visualization suite, which connects directly to BigQuery as well as many other data sources. BigQuery ML gives us access to many machine learning models, including clustering, recommender systems, forecasting, and classification, as well as our own custom-developed TensorFlow models.
If you have some data in BigQuery and a working knowledge of SQL, this opens up strong opportunities to apply forecasting models and language models to a myriad of tasks on existing data.
A few notes on BigQuery ML. The maximum custom model size is 250MB (in our testing, closer to 260MB) and there are limitations on batch size due to memory limitations of the hosted models.
Our goal was to produce a language model available in BigQuery that would allow us to run SQL queries to classify the sentiment of text data in other tables.
We started with the awesome Hugging Face Transformers library, a Python package that makes most state-of-the-art transformer-based machine learning models available through a very intuitive API.
The pipeline portion of the Transformers library allows you to quickly build models for specific tasks from an array of pretrained models hosted in their hub. The default pretrained model for sentiment analysis is `distilbert-base-uncased-finetuned-sst-2-english`, a smaller DistilBERT model fine-tuned on data from the Stanford Sentiment Treebank v2 (SST-2).
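As a quick illustration (our own sketch, not code from the project), the pipeline API reduces sentiment classification to a couple of lines; the example text is the leading question mentioned earlier:

```python
from transformers import pipeline

# With no model specified, this downloads the default sentiment checkpoint,
# distilbert-base-uncased-finetuned-sst-2-english, from the Hugging Face hub.
classifier = pipeline("sentiment-analysis")

result = classifier("why is apple so bad?")
print(result)  # a list like [{'label': 'NEGATIVE', 'score': ...}]
```

Each result is a dict with a `label` (POSITIVE/NEGATIVE for this checkpoint) and a confidence `score`.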
The overall workflow for getting a custom Tensorflow model into BigQuery ML is:
- Saving the model into TensorFlow’s SavedModel format.
- Uploading the model files to a Google Cloud Storage bucket.
- Running an SQL query that tells BigQuery the model type and where it lives.
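The final step looks something like this (the project, dataset, and bucket names below are placeholders, not the ones we used):

```sql
CREATE OR REPLACE MODEL `my_project.my_dataset.bert_sentiment`
OPTIONS (
  MODEL_TYPE = 'TENSORFLOW',
  MODEL_PATH = 'gs://my-bucket/bert_sentiment/*'
);
```

BigQuery reads the SavedModel files directly from the Cloud Storage path and registers the model in the named dataset.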
I won’t go into all the gory details, but originally we thought it would be straightforward to take a pre-trained Hugging Face model, upload it to BigQuery ML, and start classifying our text. It was not so easy. The DistilBERT model ended up being nearly 280 MB, and despite trying several strategies, including model freezing, model pruning, and layer dropping, we were never able to get a working Hugging Face model hosted on BigQuery ML. Special thanks to Colt Sliva of iPullRank for helping me through this period. 🙂
What we did discover is that TensorFlow Hub has a rich array of models, as well as preprocessing layers, which are very simple to incorporate into TensorFlow models.
Building a simple BERT model involves only a few lines of code.
The preprocessing layer takes raw text input and converts it into the dictionary of token IDs needed by the BERT transformer model. The default maximum sequence length is 128 tokens.
Since there are 24 Small BERT models available on TensorFlow Hub in an array of sizes, it was easy to test a few and find the largest number of transformer layers and hidden units that would fit within our 250MB allowance. For example, picking a Small BERT model with 4 transformer layers and a hidden size of 256 produced a saved model of 48.7 MB, well within the BigQuery ML limit.
We are providing the full code to train a simple BigQuery ML sentiment classification model on Google Colab.
Note: You will need to be familiar with running Jupyter notebooks as well as using projects, buckets, and datasets in Google Cloud Console to fully understand the code.
In this Colab notebook, you will find the following:
- Installation of all libraries needed to train the model.
- Code to download and prepare the SST-2 (Stanford Sentiment Treebank) dataset for fine-tuning the model. This creates a training and validation dataset.
- Code which builds the classifier model and provides the training function which will save the best model and stop training (early stopping) if consecutive training epochs fail to produce a more accurate model.
- Code that saves the model into SavedModel format, tests TensorFlow serving signatures, and tests on example text.
- Finally, code which will upload the model to Google Cloud Storage, make the model available to BigQuery ML, and then test the model using an SQL query.
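The checkpoint-and-early-stopping behavior described above can be sketched with standard Keras callbacks (the filenames and thresholds here are illustrative, not necessarily the notebook's exact settings):

```python
import tensorflow as tf

callbacks = [
    # Keep only the epoch with the best validation accuracy on disk.
    tf.keras.callbacks.ModelCheckpoint(
        "best_model.keras", monitor="val_accuracy", save_best_only=True),
    # Stop training if validation accuracy fails to improve for 2 straight epochs.
    tf.keras.callbacks.EarlyStopping(
        monitor="val_accuracy", patience=2, restore_best_weights=True),
]
# model.fit(train_ds, validation_data=val_ds, epochs=20, callbacks=callbacks)
```

With `restore_best_weights=True`, the model ends training holding the best weights seen, not the last ones.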
A Few Notes from the Project
- TensorFlow model saving is a bit hard to understand. There are several APIs available in TensorFlow for saving models, and it can be difficult to find code and examples that are not outdated. In our example, we used the tf.saved_model.save function.
- Serving signatures are also difficult to understand. Essentially, serving signatures tell third-party services what input and output datatypes to expect from your model and, for class-based models, which function initializes it. Using tf.keras.Sequential([...]) or tf.keras.Model(inputs, outputs) makes this easier, because the Keras API defines the inputs, outputs, and serving signatures for you.
- Naming your inputs, layers, and outputs is really useful in TensorFlow, because it makes the model much easier to understand.
- We uncovered several bugs in TensorFlow and TensorFlow Hub while completing this project. One was an outdated URL to the SST-2 dataset, and the other was an issue using version 1 of the hosted TensorFlow Hub BERT models.
- In our own internal model, we fine-tuned on several datasets; the two most popular, SST-2 and IMDB, are both easily accessible. We found this did a better job of classifying new types of data. Our best accuracy using the Small BERT models was 91.6%, with a model that was 230MB in size.
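To make the save-and-signature flow from the notes above concrete, here is a minimal, self-contained sketch using a stub module in place of the real BERT classifier (the class and names are ours, purely for illustration):

```python
import tensorflow as tf

class SentimentStub(tf.Module):
    """Tiny stand-in for the real classifier; returns a constant score."""

    @tf.function(input_signature=[tf.TensorSpec([None], tf.string, name="text")])
    def __call__(self, text):
        # The real model would run BERT here; we just emit 0.5 per input string.
        return {"sentiment": tf.fill([tf.shape(text)[0]], 0.5)}

module = SentimentStub()
# Explicitly name the serving signature BigQuery ML (and TF Serving) will call.
tf.saved_model.save(
    module, "/tmp/sentiment_savedmodel",
    signatures={"serving_default": module.__call__})

reloaded = tf.saved_model.load("/tmp/sentiment_savedmodel")
print(list(reloaded.signatures.keys()))  # ['serving_default']
```

The input tensor name ("text" here) matters: it becomes the column name BigQuery ML expects in prediction queries.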
Using the BigQuery ML Model
Now that we have a model hosted in BigQuery, we can use it anytime we need to classify text data from other BigQuery tables. The code below classifies newly found People Also Ask (PAA) questions from daily crawls of keywords in Nozzle. Nozzle saves its crawled SERP data to BigQuery, so this makes it really easy for us to access. You will note that I am limiting the query to 50 results: hosted BigQuery ML models have memory limitations, and sending more will produce errors. Using a smaller model helps alleviate this issue at the cost of accuracy.
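A prediction query along these lines (the table and column names below are hypothetical; the input column alias must match the model's serving-signature input name) looks like:

```sql
SELECT *
FROM ML.PREDICT(
  MODEL `my_project.my_dataset.bert_sentiment`,
  (
    SELECT question AS text
    FROM `my_project.my_dataset.paa_results`
    ORDER BY first_seen DESC
    LIMIT 50  -- hosted BigQuery ML models hit memory limits on larger batches
  )
);
```

ML.PREDICT returns the input columns alongside the model's output, so the result can be written straight back to a sentiment table.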
We have a project in Nozzle that has been tracking the main brand terms for 2019 Fortune 500 companies since December of 2020. Since December, Nozzle has found 13,547 distinct PAAs for these 500 companies. Of these 13.5k PAAs, these are the worst, according to our sentiment model:
The marketing managers and CMOs we work with are often strongly attuned to their company’s brand perception in Google search results. We put together a quick Data Studio dashboard for Apple which shows how the sentiment data can be used to label certain features of search results.
Just for fun, we also trended the average sentiment found in Google People Also Ask (PAA) questions with Apple’s stock prices since we first started collecting the sentiment data in December of 2020.
In this article, we introduced BigQuery ML, shared code and tips for training your own sentiment classification model using BERT, and shared SQL code and an example dashboard showing how you can make the data actionable. In our Fortune 500 dataset, we are only tracking the main brand. For example, for Apple, we are tracking “apple”. For many companies, there are numerous important brand terms for which the company would like to understand user perception. We normally track upwards of a hundred distinct terms for clients. Having a way to surface strongly positive and strongly negative terms helps us cut the noise and monitor signals that may indicate success or problems in their users’ psyche.