A glimpse into the future of scientific publishing

You have certainly heard about the discovery of gravitational waves. There is of course the press coverage, which offers the big picture to a wide audience. Maybe you were curious enough to dig a bit deeper into the details of the measurement. The natural way is to read the official scientific publication, an article in Physical Review Letters. Such articles are usually four pages long (this one is a bit longer, probably due to the importance of the discovery), rigorously written, based on numerous references, in a very concise but accurate style. In short, this is the pinnacle of publication in physics.

But this time, the authors have also released their results in another form: a Jupyter notebook. You can find the result here. Do you see the difference? Yes, it is interactive. You can simply follow what the authors wrote, but you can also modify the code, run tests, torture the data. Compare both publications and see which one helps you understand better. What is interesting is that you see not only a finished product but also a large part of the process that led to it. It is a big step forward on the road to reproducibility and a great instrument for learning a topic (did you notice that the notebook includes sounds?).

Of course, this is a first step: several aspects can be improved, but Jupyter already offers vast potential to enrich the publication process. Consider one possibility in particular. Jupyter is based on a three-tier architecture: a kernel to do the calculations, a frontend in HTML/JavaScript (with a backend.js to ensure the communication with the kernel) and a server based on Tornado (with ZeroMQ for the communication with the kernel). In the present example, the LIGO team released only a notebook, which is self-contained: it needs only standard libraries and can run with the Python kernels available with the Jupyter distribution. The data are downloaded independently from a server. Now imagine that the team gives access, in addition to the notebook, to a kernel equipped with their specialized libraries and all the tools they use for refined analysis; a kernel which has direct access to the data, all the data, not only the good ones. The notebook, instead of connecting to the local kernel, then connects to the remote kernel. In this case, and if you have the required experience, you are able to work in almost the same conditions as the discovery team.
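To make the kernel side of this architecture a bit more concrete, here is a minimal sketch of what a frontend sends to a kernel, local or remote, when you run a cell: a signed "execute_request" message in the Jupyter messaging protocol. It uses only the Python standard library and stops short of the actual ZeroMQ transport. The function name, the "reader" username and the protocol version string are assumptions made for the illustration; in a real setup the signing key comes from the kernel's connection file, and a library such as jupyter_client takes care of all of this for you.

```python
import hashlib
import hmac
import json
import uuid
from datetime import datetime, timezone

# Marker that separates routing identities from the message itself
DELIMITER = b"<IDS|MSG>"


def build_execute_request(code, session_id, key):
    """Return the frames of a signed execute_request (hypothetical helper)."""
    header = {
        "msg_id": uuid.uuid4().hex,
        "session": session_id,
        "username": "reader",  # assumption: any user name will do here
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": "execute_request",
        "version": "5.3",  # assumption: protocol version spoken by the kernel
    }
    content = {
        "code": code,
        "silent": False,
        "store_history": True,
        "user_expressions": {},
        "allow_stdin": False,
        "stop_on_error": True,
    }
    # Four JSON dictionaries: header, parent_header, metadata, content
    parts = [json.dumps(p).encode("utf-8") for p in (header, {}, {}, content)]

    # The signature is an HMAC-SHA256 digest over the four serialized parts
    signature = hmac.new(key, digestmod=hashlib.sha256)
    for part in parts:
        signature.update(part)

    return [DELIMITER, signature.hexdigest().encode("ascii"), *parts]


frames = build_execute_request("print('hello, LIGO')", "demo-session", b"secret")
```

The point of the sketch is that nothing in the message depends on where the kernel runs: the same frames could be sent to a kernel on your laptop or to one hosted by the experiment, provided you are given the connection details and the key.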

One step further: imagine that you can comment below each cell and compare your modifications with the ones other people made to their own version of the notebook. You will increase and (if the noise is kept under control) improve the level of discussion on each paper. These features could be built as an extension of JupyterHub, which manages access to notebooks for a group of users from one single point.
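Comparing two versions of a notebook is less exotic than it sounds, because a notebook is just a JSON document containing a list of cells. The following is a rough sketch, assuming both versions are already loaded as dictionaries (for instance with json.load) and follow the usual layout with a "cells" list; the function names are made up for the illustration, and cells are simply paired by position, so inserted or deleted cells are not handled.

```python
import difflib


def code_cells(notebook):
    """Return the source of each code cell as a single string."""
    cells = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # The source may be stored as a list of lines or as one string
            if isinstance(source, list):
                source = "".join(source)
            cells.append(source)
    return cells


def diff_notebooks(original, modified):
    """Return one unified diff per code cell that differs (paired by position)."""
    report = []
    pairs = zip(code_cells(original), code_cells(modified))
    for index, (old, new) in enumerate(pairs):
        if old != new:
            diff = difflib.unified_diff(
                old.splitlines(),
                new.splitlines(),
                fromfile=f"cell {index} (original)",
                tofile=f"cell {index} (modified)",
                lineterm="",
            )
            report.append("\n".join(diff))
    return report
```

A commenting system could attach a discussion thread to each such diff, so that readers argue about a specific change to a specific cell rather than about the paper as a whole.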

One last step into the future: imagine that the kernel, instead of running a programming language, is a machine: a Raspberry Pi, an Arduino, a digitizer, the controller of an experiment. Instead of programming code, you program the machine remotely from your notebook. It is a bit far-fetched for the moment, but imagine that you can enter your own observation program for LIGO directly from the notebook. The kernel takes charge of queuing the requests from all the notebooks, the experimental team prioritizes these requests, and you get notified when the results you asked for are available. Science from your bed!
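The queuing part of this scenario can be sketched in a few lines. What follows is a toy model, not a description of any existing system: the class names, the priority scale (lower number runs first) and the execute and notify callbacks are all hypothetical. Requests from several notebooks go into a priority queue, the team sets the priorities, and each requester is notified once their program has run.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Request:
    priority: int   # set by the experimental team; lower runs first
    sequence: int   # submission order, breaks ties between equal priorities
    notebook_id: str = field(compare=False)
    program: str = field(compare=False)


class ObservationQueue:
    """Hypothetical queue a machine-backed kernel could keep."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def submit(self, notebook_id, program, priority=10):
        request = Request(priority, next(self._counter), notebook_id, program)
        heapq.heappush(self._heap, request)
        return request

    def run_next(self, execute, notify):
        """Run the most urgent request and notify its notebook."""
        if not self._heap:
            return None
        request = heapq.heappop(self._heap)
        result = execute(request.program)
        notify(request.notebook_id, result)
        return request


queue = ObservationQueue()
queue.submit("alice-notebook", "scan band A", priority=5)
queue.submit("bob-notebook", "scan band B", priority=1)

# Stand-ins for the real instrument and the real notification channel
queue.run_next(
    execute=lambda program: f"done: {program}",
    notify=lambda notebook, result: print(notebook, "->", result),
)
```

In this run the request from "bob-notebook" is served first because it has the more urgent priority, even though it was submitted second.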

This is the kind of stuff that Jupyter is starting to make possible. And don’t be surprised: the future comes very fast.

