Jupyter Notebook dockerized

I like Python and I like Jupyter Notebook very much. I’m not a typical programmer, but I use this framework frequently as a replacement for Excel. I got to know some basics of Pandas, Matplotlib and Plotly; they are great tools for data processing and visualization. But they are developed at great speed, which causes trouble when you want to keep your working environment up to date. It is quite easy to break something by installing some fancy-but-not-well-tested plugin or package.

Again, Docker is our ally. It is not only a way to avoid trouble with dependencies but also a good platform for presenting examples on a blog, because they become easy to reproduce in other environments. So, here is a short description of how I run Jupyter Notebook inside Docker:

Create docker image

I use the official Anaconda3 image, which is available in the Docker registry. Anaconda is a company that maintains and supports an entire stack of Python and R packages used in data science, on most modern operating systems. Here is the simple Dockerfile I use for now:

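# Official Anaconda image with Python 3 and the conda package manager
FROM continuumio/anaconda3
# Port on which Jupyter Notebook listens inside the container
EXPOSE 8888/tcp
# Jupyter and pandas-datareader come from conda, plotly and cufflinks from pip
RUN /opt/conda/bin/conda install jupyter pandas-datareader -y
RUN pip install plotly cufflinks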

Building the image is very simple. Let's create an image called ppp (run the command in the directory that contains the Dockerfile):

michal@sunman:~$ docker build -t ppp:latest .

Prepare directories

To be able to easily exchange files between my host and the container, I create some directories on my host:

michal@sunman:~$ mkdir -p /export/docker/in
michal@sunman:~$ mkdir -p /export/docker/out
michal@sunman:~$ mkdir -p /export/docker/notebooks

Start container

michal@sunman:~$ docker run -it --rm \
   --name "ppp" \
   -p 127.0.0.1:8888:8888 \
   -v /export/docker/notebooks:/notebooks \
   -v /export/docker/in:/in \
   -v /export/docker/out:/out \
   ppp \
   /bin/bash -c "jupyter notebook --notebook-dir=/notebooks --NotebookApp.token='' --ip='0.0.0.0' --allow-root --no-browser"

The container is started interactively (-it) and is removed completely when it exits (--rm). I gave it the simple name ppp, the same as the image used to create it. I also mount the directories from my host and start Jupyter Notebook inside the container. The notebook is started with an empty token (normally a unique one is generated). Because I publish the port only on my localhost address (127.0.0.1), I disable this security feature. Without the token I can simply open http://127.0.0.1:8888 from the host instead of copying a generated token into my browser.
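If you prefer to start the container from a script, the command above can be assembled as an argument list. This is only a minimal sketch: the helper name build_run_command and its parameters are my own invention for illustration, and it simply mirrors the flags, paths and port used in this post.

```python
import shlex


def build_run_command(image="ppp", name="ppp",
                      host_root="/export/docker", port=8888):
    """Return the `docker run` argument list used in this post.

    Hypothetical helper: the defaults mirror the image name, container
    name, host directories and port from the command shown above.
    """
    jupyter_cmd = (
        "jupyter notebook --notebook-dir=/notebooks "
        "--NotebookApp.token='' --ip='0.0.0.0' --allow-root --no-browser"
    )
    cmd = [
        "docker", "run", "-it", "--rm",
        "--name", name,
        # Publish the port on localhost only, since the token is disabled
        "-p", f"127.0.0.1:{port}:8888",
    ]
    # Mount each host directory under the same name inside the container
    for directory in ("notebooks", "in", "out"):
        cmd += ["-v", f"{host_root}/{directory}:/{directory}"]
    cmd += [image, "/bin/bash", "-c", jupyter_cmd]
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable shell command; pass the list to
    # subprocess.run() instead if you want to launch the container.
    print(shlex.join(build_run_command()))
```

Keeping the arguments in a list avoids shell quoting problems around the Jupyter command, which itself contains quotes.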

Verify the container by opening a browser at http://127.0.0.1:8888
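The same check can be done without a browser. The sketch below uses only the Python standard library and assumes the address and port from this post; the function name notebook_is_up is hypothetical. It only tells you whether something answers HTTP on that address, not that it is really Jupyter.

```python
import urllib.error
import urllib.request


def notebook_is_up(url="http://127.0.0.1:8888", timeout=3.0):
    """Return True if an HTTP server answers at `url`, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except urllib.error.HTTPError:
        # The server answered, although with an error status
        # (this can happen when a token or password is required)
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or no route: nothing is listening
        return False


if __name__ == "__main__":
    print("Notebook reachable:", notebook_is_up())
```

Run it on the host after starting the container; give Jupyter a few seconds to start before trusting a False result.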

Now you can start working on a new notebook. Below is a simple example of such work, showing a plot of Oracle stock closing prices:

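In [1]:
import pandas_datareader.data as pddr
import cufflinks as cf
cf.go_offline()  # use Plotly in offline mode
%matplotlib inline
In [2]:
# Download daily ORCL quotes from Stooq into a DataFrame
orcl = pddr.DataReader('ORCL', 'stooq')
# Plot the closing prices as an interactive Plotly chart
orcl.Close.iplot()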

Summary

This post is not related to Oracle (OK, the stock plot is related ;> ). But soon, hopefully, I’m going to prepare some examples of using Jupyter Notebook to analyze and visualize performance data, so this post is a kind of preparation for them. You can find all files necessary to run examples here.
