Predictive Hacks

How to Share Jupyter Notebooks with Docker


As data scientists, we want our work to be reproducible: when we share an analysis, everyone should be able to re-run it and get the same results. This is not always easy, since we are dealing with different operating systems (macOS, Windows, Linux) and different programming-language versions and packages. That is why we encourage you to work with virtual environments such as conda environments. An even more robust solution than conda environments is to work with Docker.

Scenario: We have run an analysis using a Python Jupyter notebook on our own data, and we want to share this analysis with the Predictive Hacks community, ensuring that everyone will be able to reproduce the results.

Run the Analysis Locally

For simplicity, let’s assume that I have run a sentiment analysis on my_data.csv using the pandas, numpy and vaderSentiment libraries, in a notebook called my_jupyter.ipynb. I want to share this Jupyter notebook so that it is plug and play. Let’s see how I can create a Docker image containing the Jupyter notebook, as well as my data and the required libraries.

Jupyter Docker Stacks

Jupyter Docker Stacks are a set of ready-to-run Docker images containing Jupyter applications and interactive computing tools. You can use a stack image to do any of the following (and more):

  • Start a personal Jupyter Notebook server in a local Docker container
  • Run JupyterLab servers for a team using JupyterHub
  • Write your own project Dockerfile

We will build our custom image on top of jupyter/scipy-notebook.

Create the requirements.txt File

The Jupyter Docker core images contain the most common libraries, but sometimes you need to install extra ones. In our case, we want to install the vaderSentiment==3.3.2 library, which means that we have to create a requirements.txt file.

The requirements.txt file is:

vaderSentiment==3.3.2

Create the Dockerfile

Now we need to create the Dockerfile as follows:

FROM jupyter/scipy-notebook


COPY requirements.txt ./requirements.txt
COPY my_data.csv ./my_data.csv
COPY my_jupyter.ipynb ./my_jupyter.ipynb

RUN pip install -r requirements.txt

So, we start from the jupyter/scipy-notebook base image, then we copy the required files from our local computer into the image. Note that we could have used paths and directories. Finally, we install the libraries listed in the requirements.txt file.
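As a hypothetical illustration of using directories, we could instead copy everything into a dedicated folder inside the image (the analysis/ directory name here is an assumption, not part of the original setup):

```dockerfile
FROM jupyter/scipy-notebook

# Copy the project files into a dedicated directory inside the image
COPY requirements.txt my_data.csv my_jupyter.ipynb ./analysis/

RUN pip install -r ./analysis/requirements.txt
```

Keeping the files in one directory can make larger projects easier to navigate once the notebook server is running.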

Build the Dockerfile

Now that we have created the Dockerfile, we are ready to build it with the following command. Note that you can give the image any name; I chose to call it mysharednotebook. Tip: do not forget the period at the end!

$ docker build -t mysharednotebook .

If you want to make sure that your image has been created, you can type:

$ docker images

to list your Docker images.

Run the Image

If we want to make sure that the image runs as expected, we run:

$ docker run -it -p 8888:8888 mysharednotebook

And we will get a link (including an access token) for our Jupyter notebook!

If you want to see your containers, including stopped ones, you can type:

$ docker ps -a

Push your Image to Docker Hub

Once you have made sure that the image works as expected, you can push it to Docker Hub so that everyone will be able to pull it. The first thing you need to do is tag your image.

$ docker tag 9811503b3d3a  gpipis/mysharednotebook:first

Here, 9811503b3d3a is the image ID, obtained from the docker images command. gpipis is my Docker Hub username and mysharednotebook is the image name that I created above. Finally, :first is an optional tag.
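Note that pushing to Docker Hub requires you to be authenticated. If you have not already done so, log in with your own Docker Hub credentials first (and use your own username in place of gpipis when tagging and pushing):

```shell
$ docker login
```

You only need to do this once per machine; Docker caches the credentials for subsequent pushes.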

Now we are ready to push the image by typing:

$ docker push gpipis/mysharednotebook:first

Pull the Image from Docker Hub

The work above is done by the person who wants to share their work. Now, let’s see how we can get this image and work with the reproducible Jupyter notebook.

What we have to do is to pull the image by typing:

$ docker pull gpipis/mysharednotebook:first

And now we are ready to run it by typing:

$ docker run -it -p 8888:8888 gpipis/mysharednotebook:first

If we copy-paste the URL into our browser, we get the Jupyter environment with our notebook and data already in place.

Notice that you can change the port. For example, if you want to run on port 8889, you map it to the container’s port 8888:

$ docker run -it -p 8889:8888 gpipis/mysharednotebook:first

and you also have to change the port in your URL accordingly:

http://127.0.0.1:8889/?token=7e767d9a8dbb92e9d93ce7a5f52ba3c524a3cfcc65401714

The Takeaway

When you want to share your work with many people and ensure that they can reproduce your analysis, the best approach is to work with Docker.
