This tutorial will show you how to build a content-based recommender system in TensorFlow from scratch. For this example, we will work with ads, and our KPI will be "clicks". In other words, we would like to build a content-based recommender system for serving ads, using the users' attributes and the content of the ads as features. For the content of the ads, we will use BERT embeddings.
The architecture is a two-tower model: a user model and an item model whose output vectors are combined with a dot product.
Load the Data and the Libraries
The data are from a web ad campaign. The available features are:
- The user attributes, like age, gender, and so on. These columns start with the prefix att_
- The KPI, which in our case is clicked, taking the values 0 or 1
- The content of the ad, which is a text column
- The ad ID (message_id)
Load the Libraries
#!pip install --upgrade tensorflow_hub
#!pip install --upgrade tensorflow_text

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Model
from sklearn.feature_extraction.text import CountVectorizer
import tensorflow_hub as hub
import tensorflow_text as text  # Imports TF ops for preprocessing.

pd.set_option("max_colwidth", 300)
Create a Function to Convert Text to BERT Embeddings
# Define the model
BERT_MODEL = "https://tfhub.dev/google/experts/bert/wiki_books/2"
# Choose the preprocessing that must match the model
PREPROCESS_MODEL = "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3"

preprocess = hub.load(PREPROCESS_MODEL)
bert = hub.load(BERT_MODEL)

def text_to_emb(input_text):
    input_text_lst = [input_text]
    inputs = preprocess(input_text_lst)
    outputs = bert(inputs)
    return np.array(outputs['pooled_output']).reshape(-1,)
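As a quick check, we can embed a sample ad text (the text below is made up for illustration); the pooled output of this BERT model is a 768-dimensional vector:

sample_emb = text_to_emb("Get 50% off your first order today!")
print(sample_emb.shape)  # (768,)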
Load the Data
# load the data
df = pd.read_csv("my_campaign.csv")

# define the KPI
kpi = 'clicked'

users_features = [col for col in df if col.startswith('att_')]
extra = ['text', 'message_id', kpi]

# convert the user attributes to dummy variables
df = pd.concat([pd.get_dummies(df[users_features]), df[extra]], axis=1)
df.head()
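For clarity, this is what pd.get_dummies does to the categorical user attributes (a toy frame with hypothetical attribute values):

toy = pd.DataFrame({'att_gender': ['m', 'f', 'm'],
                    'att_age': ['18-24', '25-34', '18-24']})
pd.get_dummies(toy)
# one binary column per category:
#   att_gender_f  att_gender_m  att_age_18-24  att_age_25-34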
To be more efficient, we will compute the BERT embeddings only once per unique ad instead of once per row.
# keep the unique messages that will be used for the predictions
unique_messages_with_ids = (df.drop_duplicates(subset=['message_id'])
                              .sort_values(by='message_id')
                              .filter(regex='^text|message_id', axis=1))
unique_messages_with_ids.reset_index(drop=True, inplace=True)

# compute the BERT embedding of every unique ad
unique_messages_with_ids['embeddings'] = unique_messages_with_ids['text'].apply(lambda x: text_to_emb(x))
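Each unique ad is now paired with a single 768-dimensional BERT vector; a quick sanity check:

print(unique_messages_with_ids.shape)                         # one row per unique ad
print(unique_messages_with_ids['embeddings'].iloc[0].shape)   # (768,)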
Train and Test Dataset
We will go with an 80% train and 20% test split.
# create the train and test datasets
train = df.sample(frac=0.8, random_state=5)
test = df.drop(train.index)
train.reset_index(drop=True, inplace=True)
test.reset_index(drop=True, inplace=True)

# align the ad embeddings with the train and test rows
items_train = np.array(train.merge(unique_messages_with_ids, how='inner', on='message_id')['embeddings'].values.tolist())
items_test = np.array(test.merge(unique_messages_with_ids, how='inner', on='message_id')['embeddings'].values.tolist())
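Because the inner merge preserves the order of the left keys, the item matrices line up row-for-row with the train and test users; it is worth verifying this before building the model:

assert items_train.shape[0] == train.shape[0]
assert items_test.shape[0] == test.shape[0]
print(items_train.shape)  # (number of train rows, 768)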
Build the Model
We will build three models: the user model, the item model, and the combined model. The user and item models are multi-layer neural networks. They can have different architectures, but their final layers must have the same dimension so that the two output vectors can be combined with a dot product. In our case, the final layer of each model consists of 32 units.
num_user_features = train.filter(regex='^att_').shape[1]
num_item_features = items_train.shape[1]

# the model
num_outputs = 32
tf.random.set_seed(1)

user_NN = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    #tf.keras.layers.Dropout(0.5),
    #tf.keras.layers.Dense(64, activation='relu'),
    #tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(num_outputs)
])

item_NN = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(num_outputs)
])

# create the user input and point to the base network
input_user = tf.keras.layers.Input(shape=(num_user_features,))
vu = user_NN(input_user)
vu = tf.linalg.l2_normalize(vu, axis=1)

# create the item input and point to the base network
input_item = tf.keras.layers.Input(shape=(num_item_features,))
vm = item_NN(input_item)
vm = tf.linalg.l2_normalize(vm, axis=1)

# compute the dot product of the two vectors vu and vm
output_dot = tf.keras.layers.Dot(axes=1)([vu, vm])
output = tf.keras.layers.Dense(1, activation='sigmoid')(output_dot)

# specify the inputs and output of the model
model = Model([input_user, input_item], output)
model.summary()
Model: "model"
__________________________________________________________________________________________________
 Layer (type)                         Output Shape        Param #    Connected to
==================================================================================================
 input_1 (InputLayer)                 [(None, 63)]        0          []
 input_2 (InputLayer)                 [(None, 768)]       0          []
 sequential (Sequential)              (None, 32)          12320      ['input_1[0][0]']
 sequential_1 (Sequential)            (None, 32)          108768     ['input_2[0][0]']
 tf.math.l2_normalize (TFOpLambda)    (None, 32)          0          ['sequential[0][0]']
 tf.math.l2_normalize_1 (TFOpLambda)  (None, 32)          0          ['sequential_1[0][0]']
 dot (Dot)                            (None, 1)           0          ['tf.math.l2_normalize[0][0]',
                                                                      'tf.math.l2_normalize_1[0][0]']
 dense_5 (Dense)                      (None, 1)           2          ['dot[0][0]']
==================================================================================================
Total params: 121,090
Trainable params: 121,090
Non-trainable params: 0
__________________________________________________________________________________________________
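As a sanity check, the parameter counts follow directly from the layer sizes: the user tower has 63×128 + 128 = 8,192 and 128×32 + 32 = 4,128 parameters, totalling 12,320; the item tower has 768×128 + 128 = 98,432, 128×64 + 64 = 8,256, and 64×32 + 32 = 2,080, totalling 108,768; the final dense layer adds one weight and one bias.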
Train the Model
tf.random.set_seed(1)
cost_fn = tf.keras.losses.BinaryCrossentropy()
opt = keras.optimizers.Adam(learning_rate=0.01)
model.compile(optimizer=opt, loss=cost_fn)
callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)

tf.random.set_seed(1)
model.fit([train.filter(regex='^att_').values, items_train],
          train[kpi].values,
          epochs=20,
          batch_size=16,
          validation_split=0.1,
          callbacks=[callback])
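The model was compiled with the binary cross-entropy loss only, so as a rough sanity check we can score the held-out test set ourselves, for example with the ROC AUC from scikit-learn:

from sklearn.metrics import roc_auc_score

test_probs = model.predict([test.filter(regex='^att_').values, items_test]).reshape(-1,)
print("Test ROC AUC:", roc_auc_score(test[kpi].values, test_probs))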
Make Predictions
# keep the unique messages and their corresponding embeddings
sorted_msg_ids = sorted(unique_messages_with_ids['message_id'].values)
unique_messages_vectors = np.array(unique_messages_with_ids['embeddings'].values.tolist())

# for every test user, score all ads and keep the one with the highest predicted click probability
preds = []
for i in range(test.shape[0]):
    temp_pred = model.predict([np.tile(test.filter(regex='^att_').values[i], (unique_messages_vectors.shape[0], 1)),
                               unique_messages_vectors]).argmax()
    preds.append(sorted_msg_ids[temp_pred])
Make Predictions with Matrix Multiplication
Note that we have already built the user and the item models as sub-networks. By multiplying the user matrix with the transpose of the item matrix and then applying the weight and the bias of the final sigmoid layer, we can compute every user-item click probability at once: p = sigmoid(w * (vu · vm) + b).
# build the user and item sub-models from the tensors defined in the model section
model_u = Model(input_user, vu)
model_m = Model(input_item, vm)

message_matrix = model_m.predict(unique_messages_vectors)
user_matrix = model_u.predict(test.filter(regex='^att_').values)

user_item_matrix = pd.DataFrame(np.matmul(user_matrix, np.transpose(message_matrix)))

# apply the sigmoid function with the weight and the bias from the last layer
w, b = model.layers[-1].get_weights()
tmp = user_item_matrix.values * w + b
user_item_matrix = pd.DataFrame(1 / (1 + np.exp(-tmp)))
user_item_matrix.columns = sorted(train.message_id.unique())
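To convince ourselves that the matrix formulation matches the full model, we can compare a single user's row against model.predict (a sketch; it relies on the sorted message_id order used above):

i = 0  # first test user
direct = model.predict([np.tile(test.filter(regex='^att_').values[i],
                                (unique_messages_vectors.shape[0], 1)),
                        unique_messages_vectors]).reshape(-1,)
print(np.allclose(direct, user_item_matrix.values[i], atol=1e-5))  # should print True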
More tutorials related to recommendations?
- Spelling Recommender with NLTK
- How to run Recommender Systems in Python
- Topic Modelling with NMF in Python
- How to Build an Autocorrect in Python
- A Tutorial about Market Basket Analysis in Python
- Item-Based Collaborative Filtering in Python
- Market Basket Analysis and Association Rules from Scratch
- Get Started with TensorFlow Recommenders and Matrix Factorization
- Content-Based Recommender Systems with TensorFlow Recommender