Predictive Hacks

How to Save and Load Scikit-Learn Models

Let’s get straight to how to save and load scikit-learn models. We will start with a random forest model.

from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = RandomForestClassifier(max_depth=2, random_state=0)
clf.fit(X, y)

Our model is clf, and we want to save it to disk and then load it back.

Pickle Models

We can save models as .pkl files using the pickle library from the Python standard library.

import pickle

# Save the model under the cwd
pkl_filename = "clf.pkl"
with open(pkl_filename, 'wb') as file:
    pickle.dump(clf, file)

# Load the saved model
with open("clf.pkl", 'rb') as file:
    clf = pickle.load(file)


# Now you can use the model
print(clf.predict([[0, 0, 0, 0]]))
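A quick way to check that the save/load round trip worked is to compare the predictions of the original and the restored model. The snippet below is a minimal sketch of that check; the temporary directory and the names loaded_clf and predictions_match are illustrative choices, not part of the original example.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Same toy data and model as in the example above
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = RandomForestClassifier(max_depth=2, random_state=0)
clf.fit(X, y)

# Use a temporary directory so the check leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "clf.pkl")
    with open(path, "wb") as file:
        pickle.dump(clf, file)
    with open(path, "rb") as file:
        loaded_clf = pickle.load(file)

# The restored model should reproduce the original predictions
predictions_match = np.array_equal(clf.predict(X), loaded_clf.predict(X))
print(predictions_match)
```

Keeping the restored model under a separate name (loaded_clf) makes it possible to compare it against the original before replacing it.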

Joblib Models

Alternatively, we can save the clf model using the joblib library.

# sklearn.externals.joblib was removed in scikit-learn 0.23;
# import the standalone joblib package instead
import joblib

# Save the model under the cwd
joblib_filename = "clf.pkl"
joblib.dump(clf, joblib_filename)

# Load the saved model
clf = joblib.load('clf.pkl')

# Now you can use the model
print(clf.predict([[0, 0, 0, 0]]))
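joblib.dump also accepts a compress argument, which can be handy for larger models. The sketch below writes the same model with and without compression and prints both file sizes; the level 3, the .joblib file names, and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Same toy data and model as in the example above
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = RandomForestClassifier(max_depth=2, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "clf.joblib")
    compressed_path = os.path.join(tmp_dir, "clf_compressed.joblib")

    # Default: no compression
    joblib.dump(clf, plain_path)
    # compress takes an integer level from 0 (off) to 9 (highest)
    joblib.dump(clf, compressed_path, compress=3)

    plain_size = os.path.getsize(plain_path)
    compressed_size = os.path.getsize(compressed_path)

    # joblib.load detects the compression on its own
    loaded_clf = joblib.load(compressed_path)

print(plain_size, compressed_size)
same_predictions = np.array_equal(clf.predict(X), loaded_clf.predict(X))
print(same_predictions)
```

Higher compression levels usually give smaller files at the cost of slower saving and loading, so the right level depends on the model size.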

How to Save the Model and the Tokenizer in a Single File

In many NLP tasks, the machine learning model is paired with a tokenizer, and it makes sense to save both in a single file. Let’s see how to do that.

import pickle

# Assumes tokenizer (a fitted text transformer), clf (a model trained on
# its output) and the raw test texts X_test already exist

# Save the Tokenizer and the Model in the same file
with open('model_and_tokenizer.pkl', 'wb') as file:
    pickle.dump((tokenizer, clf), file)

# Load the Tokenizer and the Model
with open('model_and_tokenizer.pkl', 'rb') as file:
    tokenizer, clf = pickle.load(file)

# Apply it to your data
X_test_tokenized = tokenizer.transform(X_test)

print(clf.predict(X_test_tokenized))
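For an end-to-end version of the same idea, here is a minimal sketch that builds both objects first. It assumes a TfidfVectorizer as the tokenizer and a LogisticRegression as the model; the tiny corpus and its labels are made up for illustration only.

```python
import os
import pickle
import tempfile

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy corpus: 1 = positive, 0 = negative
train_texts = ["great movie", "loved the film", "terrible plot", "awful acting"]
train_labels = [1, 1, 0, 0]

# Fit the tokenizer, then train the model on the tokenized texts
tokenizer = TfidfVectorizer()
X_train_tokenized = tokenizer.fit_transform(train_texts)
clf = LogisticRegression().fit(X_train_tokenized, train_labels)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_and_tokenizer.pkl")

    # Save both objects as one tuple in a single file
    with open(path, "wb") as file:
        pickle.dump((tokenizer, clf), file)

    # Load them back in the same order they were saved
    with open(path, "rb") as file:
        loaded_tokenizer, loaded_clf = pickle.load(file)

# Apply the restored pair to new texts
X_test = ["great film", "awful plot"]
X_test_tokenized = loaded_tokenizer.transform(X_test)
predictions = loaded_clf.predict(X_test_tokenized)
print(predictions)
```

Saving the pair together keeps the vocabulary and the model weights in sync, since a model trained on one vocabulary will not give meaningful results with a tokenizer fitted on another.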
