By far the most popular post on this blog is a review of several Python integrated development environments (IDEs) geared toward science. Coming from a Matlab background, it’s natural to search for something Matlab-like to replace it – an IDE with integrated editor, code execution, plotting, benchmarking, file management, etc.
An increasingly attractive alternative is the IPython Notebook. The ipython Notebook interface, which runs in the browser, allows one to write and run interactive notebooks which combine code, documentation – including Markdown and LaTeX equations – and interaction seemlessly. It’s not unlike Mathematica, Maple, or RMarkdown.
You can try out an interactive ipython notebook session on nature.com.
I covered ipython notebook a couple of years ago, back when it was a relatively new tool, but it’s become a lot more powerful as it has matured and its ecosystem has grown. It has become an excellent tool for running and documenting exploratory analyses, while the rest of the Python ecosystem – IDEs, IPython console, debugging tools, etc. – can be leveraged for batch processing and standardized analyses.
Here I highlight some of the more advanced features of ipython notebook with particular focus on recently added features.
Ipython notebook particularly shines for creating narrative reports – a form of literate programming which is an excellent workflow for data analysis.
A narrative report mixes code, plots, and a text narrative that highlight results, non-results, thoughts and concerns. A first draft of a narrative report might sound like stream-of-consciousness beat poetry meets data analysis. A bit of editing tightens the narrative and serves to aggregate and summarize one’s thoughts – map-reduce for the brain. The report can then be used for self-archival and sharing insights with other team members. It can also be used to support open science.
Matlab offers this possibility with cell-mode publishing, but the ipython notebook is leaps and bounds above Matlab’s report generation. The notebook interface is particularly well-adapted for narrative reports, as it transparently mixes code, plots, printouts, text, Markdown, and LaTeX.
An IPython notebook – which includes code, text, and the results of computations – is essentially a JSON-formatted file with an .ipynb extension. An .ipynb file can be shared, stored, viewed, and converted in a number of ways:
- it can be exported to static formats, like PDF, via
- It can be shared as a self-contained file and run on another computer that has the ipython notebook server installed – e.g. via Dropbox
- It can be viewed statically via the ipython Notebook viewer
- It can be run interactively in the cloud
- It can be run interactively via colaboratory – more on this later.
A recent addition to the notebook interface is interactivity with widgets. At its most basic, one can link a slider widget with a callback to a function to create an interactive plot, e.g.:
%matplotlib inline import pylab as plt import numpy as np from IPython.html.widgets import interact import math def plot_sine(period=10): x = np.linspace(0,10,100) plt.plot(x,np.sin(x*2*math.pi/period)) interact(plot_sine,period=(2,20,.5))
IPython and the notebook interface have proven so popular that a recent effort has been made towards supporting multiple languages, both at the command line and in the notebook interface. The Jupyter project, which, as I understand it, is a fork/superset/new version of IPython, supports interactive notebooks for Python, R, Julia, and many others.
Hopefully this will help bridge the gap between these languages. As much as I love Python, R has a much larger number of statistical models in it. Julia, on the other hand, offers the compiled-like speed that’s necessary for some types of numeric problem that can’t be vectorized easily.
coLaboratory allows multiple people to collaborate on Jupyter notebooks . The notebooks are hosted on Google Drive – the execution is handled either by a Jupyter kernel running in Chrome or a kernel on the host computer. It’s not as streamlined as it could be right now, but it has a ton of potential, as it does not require the user to have ipython or jupyter installed on their computer – only the chrome extension is required.
There are other features of the ipython notebook that I haven’t covered in detail here. For instance, the notebook interface can be used to manage local or remote kernels for parallel computing.
The notebook interface has grown a lot in the last 2 years, and it’s quite useful for day-to-day work. Jupyter, when it matures, will be bring seamless support for multiple languages. Furthermore, ipython, matplotlib, and others will benefit greatly from the exponential growth in data science, which means that they’re now backed by corporate sponsors that mean that contributors can be devoted to these projects full time.
Maybe not today, maybe not tomorrow, we can finally leave Matlab, and say once and for all: I will never code another GUI in GUIDE again!
- A nice video presentation about IPython notebook – and some accompanying notebooks showcasing Pandas – from my former classmate Julia Evans
- A short presentation on Jupyter
- A presentation on Julia using an interactive Jupyter notebook
- An article on Nature on using notebooks for reproducible open science