Dynamic scientific visualizations in the browser for Python users

As a scientist, interacting with data allows you to gain new insight into the phenomena you’re studying. If you read the New York Times, the D3 docs or you browse distill, you’ll see impressive browser-based visualizations – interactive storytelling that not only accurately represent data but bring your attention to surprising aspects of it. Making good visualizations of your work can increase the accessibility of your research to a wide audience. My favorite example here (already several years old) is the semantic map of the brain from James Gao, Alex Huth, Jack Gallant and crew – at the time, visualizing brain map data in the browser was unheard of, and here was a visualization of how semantics map to different brain areas, in the browser, no downloads necessary.

Snapshot of brain data visualization from James Gao, Jack Gallant, Alex Huth and crew. Built on three.js and tornado. You can make these kind of visualizations yourself with the pycortex library.

Coming from the Python world, most visualizations you’ll make on a day-to-day basis will be very different from these: static plots, because that’s what’s convenient to make in matplotlib. How can you go from the Python world of static plots to the wizardry of Javascript?

A big wall you’ll run into is that modern web development is big. Getting to proficiency is hard: it’s very easy to get discouraged and fall off the wagon before you get to build something interesting. What I’ve assembled here is a kind of roadmap so that you can start building interesting dynamic visualizations with the lowest barrier to entry as possible, while building a self-reinforcing skillset.

TechnologyApplication areasProsConsWhat you learn
Python-based generatorsDashboards, lightweight interactionsEasy distribution, low barrier to entryLimited interactivityGrammar of interactive graphics
Jupyter notebookDaily data explorationYou’re already using it!Distribution not straightforwardSnippets of HTML, JS, SVG
Walled gardenAll-in-one visualizations (widgets)Limited things to learnWhat you learn is non-transferableThe walled garden
JS notebooksLiterate programsREPL environment, good docsLearning curve a bit steepCore JS plus visualization libraries
The webWeb datavizYou control everythingFloodgates of tech stack, learning curve quite steepPackaging, components, deployment

Finally, at the end of this article, I cover JS libraries you might want to learn to do for plotting or numeric work. I’ve built this from experience over the last year experimenting with about a half-dozen interactive visualization methods; of course, if you know better curricula, books or tutorials, please let me know in the comments.

Step 0: Python-based libraries

Your first option is to not code for the web at all by leveraging pure Python-based libraries. Many libraries aim to fill a niche for everyday interactive graphics, which include minimal interactions – hover interactions, tooltips, interactive legends, etc. This includes Plotly, Bokeh w/Holoviews and Altair w/vegalite. On the mapping side, ipyleaflet is worth mentioning.

Dashboard tools

There has been an explosive growth in the past few years. These solutions fill the same niche as the ubiquitous Shiny fills in the R ecosystem: a way for data science teams to create dashboards. You can nevertheless use them for interactive scientific visualizations. Getting familiar with this kind of visualization will help you build a grammar of interactive visualizations and rapidly prototype more fully interactive visualizations.

Here’s what to expect: a Python script defines a visualization (plots, videos, images, maps, etc.) along with widgets, such as sliders and combo boxes. The whole thing gets served via a server which caches requests intelligently – changes in sliders trigger remote fetches which refresh the visualization. Some custom visualizations may be written in JS. Oftentimes, dashboard solutions are associated with a specific plotting library. so all the docs will be written in terms of that library – but generally you have some flexibility to use other frameworks.

One clarification: if you want to share your visualizations, you’ll need a server. All solutions have a basic open source offering allowing you to DIY deploy such a server yourself via e.g. a cloud provider. If that sounds messy, you can instead opt for a managed cloud solution – there’s often a free tier and paid tiers with more capabilities availables. Here they are:

ProductDeploymentPeculiaritiesAssociated plotting library
PanelDIYWorks inside of jupyter and standaloneBokeh, with support other charting libraries
StreamlitDIY & commercialVery easy deploysAgnostic
DashDIY & commercialSupports lots of customization, aimed at enterprise usersPlotly, with support for other libraries

Of these, Streamlit seems to have the most momentum according to Github stars.

See this article for a detailed breakdown.

Step 1: build for jupyter notebook

Perhaps the lowest friction way of doing dynamic HTML-based visualization in Python is in the jupyter notebook environment (or jupyter lab or colab). After all, you’re writing Python inside the browser right now, why not use the same environment to whip up a quick interactive visualization? A big advantage of this type of solution is that it minimizes the contortions you have to go through to translate Python data types to javascript. And because you’re writing HTML, JS and Python side-by-side, your iteration cycles are very fast. The biggest problem is that most widgets you will create here will not display correctly when you publish them to the web, e.g. on GitHub: people will have to download the notebook to get the interactive effect. They will work inside of Binder, however. They are a great way to dip your toe into this world – at this point, you can probably get by by copying and pasting snippets of HTML, CSS, SVG and Javascript rather than formally learning them.

_repr_html_ and IPython.core.display

Image
Basic demo of _repr_html_

When you print a variable at the end of a Jupyter cell, how does Jupyter know how to display it? The answer is that it looks for a _repr_html_ method for the associated class and calls it. This function can return any arbitrary snippet of HTML, including an iframe which does interactive things, or a snippet which embeds javascript. See this colab for examples on how to do this. You can do fairly sophisticated things with this simple method, for instance building an interactive widget for pandas.

You may also trigger a rich display from anywhere inside a cell, not just at the end, using the ipython API. Again, you can include any number of web technologies, including embedding scripts inside the embedded HTML.

from IPython.core.display import display, HTML
display(HTML('<h1>Hello, world!</h1>'))

ipywidgets and interact

ipywidgets, also known as jupyter-widgets or simply widgets, are interactive HTML widgets for Jupyter notebooks and the IPython kernel. Notebooks come alive when interactive widgets are used. Users gain control of their data and can visualize changes in the data. Learning becomes an immersive, fun experience. Researchers can easily see how changing inputs to a model impact the results.

The ipywidget manual

Next in line in complexity are ipywidgets. Like IPython.core.display, ipywidgets can display rich HTML; however, widgets can also have events which cause them to call Python. See this notebook for an example. You can use them, for instance, to navigate through a plot with widgets. In fact, this pattern is so common that there’s a convenience function called interact which allows you to do just slider-based interactions.

voila

Voila example, from their docs.

Building on jupyter widgets, you can release fully fledged dashboards based in jupyter notebooks. This is called voila. The idea is that you write a regular jupyter notebook, and then you can serve it through a tornado server on the web. The end user sees a dashboard which is the concatenation of the output cells of the dashboard, without the code. You can deploy this on Heroku or Google App Engine to make your application world accessible.

Step 2: Walled gardens

At this point, you might feel stifled by the limited interactivity afforded by the solutions offered so far. And you might feel intimidated by full-fledged modern web development. Many products exist to fill the gap. However, I argue here that it’s better to bite the bullet and skip straight ahead to writing real javascript in a notebook environment, because you will continue building on core skill sets around web technology. Nevertheless, I mention these options here for completeness and because they may satisfy your particular use case.

Unity

Unity is a game development engine and IDE. It’s a great environment to build offline games and VR visualizations. It transpiles exported scenes and C#-based scripting to WebAssembly (WASM), so it also runs in the browser. C# is a modern language with operator overloading. 2d and 3d interaction are easily accommodated. Cons: widgets (sliders, input boxes, etc.) must be created in the Unity interface and are non-native. The Unity applet acts as a walled garden, making it difficult to interact with the native DOM. The C# code cannot be compiled with a command line tool – it must be combined with the project and compiled in the Unity editor, and the project is several GBs, hence can’t be tracked in Github. Furthermore, the math libraries in C# are pretty limited. Overall, a pretty limited option for the web, though it has considerable appeal if you’re doing 3d viz.

p5.js and openprocessing



OpenProcessing sketch, Self-organized critical model

p5.js and openprocessing bring processing – a creative coding platform – to javascript. You write sketches, with an init function and a loop function, which define how the sketch works. You draw into the sketch with functions that allow you to draw lines, rectangles, sprites, polygons, etc. The appeal of processing is that you’re learning creative coding and there are very little barriers to entry. The biggest con is that while you learn a little bit of javascript in the environments, you really don’t learn modern web technologies. At the end of the day, you’ll know how to draw into your sketch, but you won’t know about canvas, svg, css, etc. – and you’ll be limited in the libraries that you can import into that environment. That being said, they can be part of your toolkit, especially p5.js which can interop with plain JS.

Step 3: Javascript notebooks

The next step in complexity is Javascript-based notebooks. Just as there are Python-based notebook environments that use remote kernels (jupyter, colab), there are local notebook-based environments that use Javascript that runs locally in the browser. That means you get the full power of the web – html, canvas, css, svg, javascript with remote libraries, etc. – but you’re doing it in an environment on wheels which supports the familiar REPL workflow of jupyter notebooks. These are environments which are ideal for literate programming which mixes text, math, code and visualization.

For many, I suspect this will be the right level of complexity. In practice, using notebooks mean you will need to export your data out of Python in order to visualize it – for many use cases, json will do the trick.

Observable

Text, code, equations, dynamic visualizations can all be mixed in observable. Code here.

Observable is a notebook environment that runs in the browser, created by Mike Bostock, creator of d3. Observable is a little like jupyter notebooks, but it uses a kind of functional style where each cell may only have one output, and it resolves the DAG of dependencies to figure out which cells to recompute at what time, thus avoiding the pitfalls of ipython out-of-order execution issues. It has an excellent set of starter tutorials. The observable runtime is open-source, but the notebook environment is closed-source. It continues to be feverishly developed – they just added multi-user collaboration. Strictly speaking, it does not use Javascript, and it doesn’t make it very easy to export strict JS out of it – however, you can embed single cells on a webpage. Overall, a great place to get your REPL workflow going.

Starboard

An interactive visualization of coupled oscillators in Starboard, mixed Python and JS. Source.

Starboard is an open-source interactive notebook environment. It uses a different metaphor than Observable, sticking much closer to the Jupyter metaphor with its advantages and disadvantages. The biggest thing it has going for is that it supports both javascript and python via Pyodide. What’s that you say? Python in the browser? Why yes, in fact:

“Pyodide brings the Python 3.8 runtime to the browser via WebAssembly, along with the Python scientific stack including NumPy, Pandas, Matplotlib, SciPy, and scikit-learn. The packages directory lists over 75 packages which are currently available. In addition it’s possible to install pure Python wheels from PyPi. Pyodide provides transparent conversion of objects between Javascript and Python. When used inside a browser, Python has full access to the Web APIs.”

Pyodide docs

That means you can keep some of the numerical computation in Python and mix and match JS and Python as necessary – though you’ll probably still want to use Javascript to manipulate the DOM. Pyodide was originally spun out by Mozilla from another (now defunct) notebook project called iodide, so starboard, along with a few others like jupyterlite are carrying the torch. I really like starboard – the editor feels great, it’s snappy, it has surprisingly little confusing magic going on. However, the documentation and examples are minimal, and it would benefit from a series of tutorials. I think observable is the more viable and mature option in this space for now but starboard is one to watch.

Step 4: the web

Now you’ve had a chance to learn javascript, html, svg, enough css to do the job in the context of dynamic visualizations, and now you’re ready to tackle the world – the world wide web! It’s tempting at this point to learn modern web development, with its complex toolchains (npm + react + webpack + …) . I had an interesting conversation with Chris Olah recently who mentioned that when it comes to doing visualizations, the core skillsets – html, the dom, css, svg, etc. – are more important than learning any particular framework, and it helped organize this article. If you do need a component framework to work within a larger page, a more lightweight framework, like Svelte or vue.js, might satisfy your needs.

One very annoying gotcha you’ll run into is packaging differences between node and JS built for the web. A lot of packages you’ll use for data analysis in JS – like mathjs – were originally built for the node server environment. You need to build them to use them, which means learning about build tools, and pretty soon you’ve wasted a couple of days figuring out your transpiler. There are now cloud transpilers, in particular skypack and JSPM, that will compile node JS libraries for you so you can use them immediately without using a complex toolchain. That means you can import node packages in JS much in the same way you would import pip installable packages in Python:

import * as mathjs from 'https://cdn.skypack.dev/mathjs@8.0.1';
console.log(mathjs.add(1,2));

Thanks to Guido Zuidhof for patiently explaining this to me.

Some templates exist to facilitate the creation of interactive blog posts or articles. Most notably, the distill template and the Idyll language facilitate the creation of explanatory articles (think those cool articles on the nytimes website which highlight different parts of graphs when you scroll down). Some of the best examples of this type of article can be seen on the VISxAI website.

For deployment, provided you don’t rely on external services, you could use github pages, or use a CDN service like netlify or Vercel. If you need to interact with data, then you may need to both create and deploy a data source via a REST API. Something lightweight like flask deployed on heroku would do the trick.

JS libraries

As you’re going through this process, it will often feel like the garden of forking paths: seemingly infinite decisions to make. However, there’s a small core set of JS libraries you might interact with:

I think this is enough of a curriculum to carry you from 0 to 1 over a 6 month period. Practice, learn, and in no time you’ll be ready to make solid visualizations!

Further resources

3 comments

  1. You can use Plotly with Anvil to deploy web apps using all Python, and you can also pull in other JS libraries.

Leave a Reply to Dynamic scientific visualizations in the browser for Python users – Full-Stack Feed Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s