Predictive Hacks

How to Save and Load Scikit Learn Models

Let’s get straight to how to save and load scikit-learn models. We will start with a random forest model.

from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = RandomForestClassifier(max_depth=2, random_state=0)
# Fit the model so that it is ready to make predictions
clf.fit(X, y)

Our model is clf, and we want to save it to disk and then load it back.

Pickle Models

We can save models as .pkl files using the pickle library from the Python standard library.

import pickle

# Save the model under the cwd
pkl_filename = "clf.pkl"
with open(pkl_filename, 'wb') as file:
    pickle.dump(clf, file)

# Load the saved model
with open("clf.pkl", 'rb') as file:
    clf = pickle.load(file)

# Now you can use the model
print(clf.predict([[0, 0, 0, 0]]))
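
A quick way to check that nothing was lost in the round trip is to compare the predictions of the original and the reloaded model. The snippet below is a minimal sketch of that check; the temporary directory and the names model_path, restored_clf and predictions_match are assumptions made for illustration and are not part of the original example.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Same toy data and model as in the example above
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = RandomForestClassifier(max_depth=2, random_state=0)
clf.fit(X, y)

# Write to a temporary directory so the check leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "clf.pkl"
    with open(model_path, "wb") as file:
        pickle.dump(clf, file)
    with open(model_path, "rb") as file:
        restored_clf = pickle.load(file)

# The reloaded model should predict exactly what the original predicts
predictions_match = np.array_equal(clf.predict(X), restored_clf.predict(X))
print(predictions_match)
```

Keep in mind that a pickled model is generally expected to be loaded with the same scikit-learn version that saved it.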

Joblib Models

Now we will save the same clf model, this time using the joblib library.

# sklearn.externals.joblib was removed from recent scikit-learn releases;
# import the standalone joblib package instead
import joblib

# Save the model under the cwd
joblib_filename = "clf.pkl"
joblib.dump(clf, joblib_filename)

# Load the saved model
clf = joblib.load('clf.pkl')

# Now you can use the model
print(clf.predict([[0, 0, 0, 0]]))
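
joblib.dump also accepts a compress argument, which can be handy for larger models. The snippet below is a small sketch of that option, assuming the standalone joblib package is installed; the compression level 3, the clf.joblib file name and the temporary directory are illustrative choices, not recommendations from the original post.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Same toy data and model as in the example above
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=0, shuffle=False)
clf = RandomForestClassifier(max_depth=2, random_state=0)
clf.fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    compressed_path = os.path.join(tmp_dir, "clf.joblib")

    # compress takes an integer from 0 (no compression) to 9 (strongest)
    joblib.dump(clf, compressed_path, compress=3)

    # Loading works the same way whether or not the file is compressed
    restored_clf = joblib.load(compressed_path)

same_predictions = bool((restored_clf.predict(X) == clf.predict(X)).all())
print(same_predictions)
```

Higher compression levels usually give smaller files at the cost of slower saving and loading.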

How to Save the Model and the Tokenizer in a Single File

In many NLP tasks, the machine learning model comes with a tokenizer, and it makes sense to save both of them in a single file. Since pickle can serialize a tuple, we can store the two objects together and unpack them when loading.

import pickle

# Save the Tokenizer and the Model in the same file
with open('model_and_tokenizer.pkl', 'wb') as file:
    pickle.dump((tokenizer, clf), file)

# Load the Tokenizer and the Model
with open('model_and_tokenizer.pkl', 'rb') as file:
    tokenizer, clf = pickle.load(file)

# Apply it to your data 
X_test_tokenized = tokenizer.transform(X_test)
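
The snippet above assumes that tokenizer, clf and X_test already exist. For a version you can run end to end, here is a minimal sketch that uses a TfidfVectorizer as the tokenizer. The vectorizer choice, the toy sentences and labels, and the names train_texts, train_labels and bundle_path are all assumptions made for illustration.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy data, only for illustration
train_texts = ["great movie", "loved this film", "terrible plot", "awful acting"]
train_labels = [1, 1, 0, 0]

# Fit the tokenizer first, then fit the model on the tokenized text
tokenizer = TfidfVectorizer()
X_train_tokenized = tokenizer.fit_transform(train_texts)
clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(X_train_tokenized, train_labels)

with tempfile.TemporaryDirectory() as tmp_dir:
    bundle_path = Path(tmp_dir) / "model_and_tokenizer.pkl"

    # Save the tokenizer and the model as one tuple in one file
    with open(bundle_path, "wb") as file:
        pickle.dump((tokenizer, clf), file)

    # Load them back in the same order they were saved
    with open(bundle_path, "rb") as file:
        restored_tokenizer, restored_clf = pickle.load(file)

# Tokenize new text with the restored tokenizer, then predict
X_test = ["great film", "awful plot"]
X_test_tokenized = restored_tokenizer.transform(X_test)
predictions = restored_clf.predict(X_test_tokenized)
print(predictions)
```

The order of the objects in the tuple matters: they must be unpacked in the same order in which they were saved.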

