I have explained in the first part the reason why I chose a Jupyter-based system; in few words: maintenance, human/data interface, python. I will now give some details on the design of the application. A prototype can be find in my github but be careful: this is still a proof-of-concept, yet a working one, that I and my teammates are using (and debugging) but still in an early stage without the polished completeness of a production-graded application. Therefore, my purpose is not here to “sell” a product that can be downloaded for immediate use but to explain the method and, maybe, encourage others to develop their own application.
The application, which is officially called Gilgamesh, is made of three components:
\citep[xxxx2016] in a hyperlink to the content of the corresponding document (a la Latex).
It is the Python Library that provides access to the data and to the physics models. This part is deeply dependent on the structure of the diagnostics that we have, which makes it not easily exportable for other projects in the present configuration. Yet there are several patterns that can easily be generalized. My present work is to separate this general logic from the details of the implementation of our diagnostics. The objective of the library is to give the user a high-level access to the data, without thinking of how the data are hard-wired to the captor and to give him the power of data processing libraries like pandas, sk-learn and friends. One difficulty with the high level access is to provide a seamless interface to data which are permanently changing from experiment to experiment: diagnostics can be changed, recalibrated, disconnected, reconnected, new components can be added to the testbed, and so on. It is painful for the user to keep track of all changes, especially if you are not on location. So, the idea is that the library take cares of all the details: if the user wants the current signal from the Langmuir probe, he just has to type ‘Langmuir_I’ and he will get it: the library would have found for the request experiment on which port it was connected and which calibration was applied to the raw signal. This is one step to the high level approach and it is related to the ‘Signal’ approach: you call a signal by its name and then you plot it, you check its quality, your process it. Another approach, which is complementary, is to make the signals aware of their environment; it is the ‘Machine’ approach. The testbed and its components, especially the diagnostics are modelled in Python by classes (in a tree-like hierarchy). A given diagnostic has its own class with its name, its properties (position, surface,…), its collection of signals and its methods which represent its internal physics model. Let’s take an example with again a Langmuir probe: instead of calling the signal ‘Langmuir_I’ and the signal ‘Langmuir_V’ and process them to extract the density, you just call the method Langmuir.density() and the object will do all the hard work for you. So the library makes it possible for the user to choose between the ‘signal’ approach for basic processing of data and the ‘machine’ approach to activate the heavy physics machinery to interpret at a higher level these data.
This is the more classical part: a standalone, GUI-based application to manage the data. I added it as a safety net: I was not sure at the beginning how easy it would be to use the notebooks to manage the data. So I used Qt-Designer to develop this graphical layer to the Gilgamesh Library. I am not sure that I will keep this component in the future. The development of the ipywidgets is fast and makes it possible to develop some advanced interactive tools directly in the notebook. If you combine that with the Dashboards extension, you practically get the equivalent of a native application in the browser. OK, I exaggerate a bit, because it is not yet as fast and the interactive manipulation of data (like with pyqtgraph that I use in the Manager) is not as efficient but these tools are progressing quickly and I can see a total replacement in the near future. But even now, I have a notebook “Dashboard” that displays the overview of the results of the last discharge on the big screen of the control room and it is, I must say, convincing.
This is it: the tour of the design choices for this Jupyter-based data processing system comes to the end. Next time, I will give some return on experience on the development and operation of it. After that, we will have a look at some examples of each component.