Visualizer - aegisbigdata/documentation GitHub Wiki

Deployment

Visualizer is implemented in Python as a Jupyter notebook. This page describes how the tool can be used from the AEGIS platform.

STEP 1 - Installation

The latest version of Visualizer is available by default in every newly created project. Older projects might not have the latest version. In that case, first download it and then upload the Visualizer.ipynb file into Jupyter from the corresponding menu, as shown in the figure below.

Upload Notebook

STEP 2 - Open the notebook

There are three different ways to open the Visualizer:

a. Via the AEGIS Tools page

The notebook can be opened by selecting the AEGIS Tools section and then clicking on the Visualizer tile.

Visualizer Startup

b. Via the Jupyter notebook file context menu

Visit the Datasets page and open the "Jupyter" dataset, which is available by default in all projects. Browse for the Visualizer notebook, right-click on it, and select the "Open Jupyter Notebook" context menu option.

Visualizer Startup

c. Via a CSV file directly

This approach also uses the context menu, but in a different way. The user should navigate to the Datasets page and open a dataset of their choice. Right-clicking on a specific CSV file then offers the option to open that file directly in the Visualizer, as shown in the screenshot below.

Visualizer Startup

Once the notebook has been loaded, the user sees the following:

Visualizer Initial State

This is the initial state. All code cells are hidden by the hide code plugin, and all output has been cleared.

STEP 3 - Initialize the notebook

To initialize the notebook, select the first cell by clicking on it, then either click the "Run" button in the top Jupyter menu or press Ctrl + Enter.

STEP 4 - Working with the visualizer

Once the notebook is initialized, the first widget that appears is a file picker, where the user can navigate the files stored in the current project and open the desired one. Note that only CSV files are supported. If you opened the Visualizer using the third option described earlier (via a CSV file directly), this field will be pre-filled.

File Picker
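The CSV-only restriction can be pictured as a simple filter over a file listing. The following is a minimal sketch, not the Visualizer's actual code: the function name `list_csv_files` and the example paths are hypothetical, and it assumes files are recognized by their `.csv` extension.

```python
from pathlib import PurePosixPath


def list_csv_files(paths):
    """Return only the paths whose extension is .csv (case-insensitive)."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() == ".csv"]


# Hypothetical project listing; these paths are made up for illustration.
project_files = [
    "Datasets/sales/2018.csv",
    "Datasets/sales/README.md",
    "Datasets/weather/stations.CSV",
    "Jupyter/Visualizer.ipynb",
]

csv_files = list_csv_files(project_files)
print(csv_files)
```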

Once a file is open, a small preview in tabular format is shown.

File Preview
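For readers who want to reproduce a similar preview outside the widget, the sketch below reads the header and the first few rows of a CSV file using only the Python standard library. It is an illustration under assumptions: the helper `preview_csv`, the row count, and the sample data are hypothetical and not taken from the Visualizer notebook.

```python
import csv
import io
from itertools import islice


def preview_csv(text_stream, n_rows=5):
    """Return the header row and at most n_rows data rows from a CSV stream."""
    reader = csv.reader(text_stream)
    header = next(reader, [])          # empty list if the file has no rows
    rows = list(islice(reader, n_rows))
    return header, rows


# In-memory sample standing in for a CSV file stored in the project.
sample = io.StringIO(
    "city,temperature,humidity\n"
    "Athens,31,40\n"
    "Berlin,24,55\n"
    "Vienna,27,48\n"
)

header, rows = preview_csv(sample, n_rows=2)
print(header)
for row in rows:
    print(row)
```

To preview a real file, pass an open file handle (for example `open(path, newline="")`) instead of the in-memory sample.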

If the selected file has been associated with extended metadata, a small section with a few recommended chart types is displayed.

Afterwards, a list of the available visualization types is displayed. Pick the desired type and click the Apply button.

Visualization Types

For every visualization type, a widget opens with options that depend on the type. Fill in the mandatory fields with the desired values and then click the Visualize button.

Parameters

Pay attention to the parameter called Position. It refers to the position on the dashboard where the chart will be placed. As the previous picture shows, there are five empty cells at the end of the notebook. These act as placeholders for the dashboard. Currently, up to five charts can be displayed at the same time, but this can be easily extended. The dashboard can contain several charts from the same dataset or from different ones. When a new dataset is opened, previous charts are not deleted.
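The dashboard behaviour described above can be modelled as a fixed set of numbered slots. This is only a conceptual sketch: the `Dashboard` class, the 1-to-5 numbering, and the rule that placing a chart in an occupied position replaces the previous one are assumptions made for illustration, not details confirmed by the notebook.

```python
MAX_CHARTS = 5  # matches the five placeholder cells; numbering 1..5 is assumed


class Dashboard:
    """A fixed number of chart slots addressed by position."""

    def __init__(self, size=MAX_CHARTS):
        self.slots = {position: None for position in range(1, size + 1)}

    def place(self, position, chart):
        """Put a chart into a slot; an occupied slot is overwritten (assumed)."""
        if position not in self.slots:
            raise ValueError(
                f"Position must be between 1 and {len(self.slots)}, got {position}"
            )
        self.slots[position] = chart

    def occupied(self):
        """Return only the slots that currently hold a chart."""
        return {p: c for p, c in self.slots.items() if c is not None}


dashboard = Dashboard()
# Charts from two different (hypothetical) datasets share the same dashboard.
dashboard.place(1, "scatter plot of dataset_a.csv")
dashboard.place(2, "bar chart of dataset_b.csv")
print(dashboard.occupied())
```

Opening a second dataset does not touch the slots that are already filled, which mirrors the statement that previous charts are kept.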

A sample chart that appears after filling the necessary options can be seen below.

Sample Scatter Plot

At the bottom of every chart, the user can choose to save the displayed chart. Simply provide a name and then press the "Save Chart as HTML" button. This stores the chart as a dynamic HTML file inside HopsFS.
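Conceptually, saving a chart means writing its HTML representation to a file with the name the user provided. The sketch below shows that idea on the local filesystem only; it does not use HopsFS, and the function `save_chart_as_html`, the automatic `.html` suffix, and the placeholder markup are all assumptions for illustration.

```python
import tempfile
from pathlib import Path


def save_chart_as_html(chart_html, name, directory):
    """Write chart_html to <directory>/<name>.html and return the path."""
    # Append the extension only if the user did not already type it (assumed).
    file_name = name if name.lower().endswith(".html") else name + ".html"
    target = Path(directory) / file_name
    target.write_text(chart_html, encoding="utf-8")
    return target


# Placeholder markup standing in for the HTML a charting library would emit.
chart_html = "<html><body><div id='chart'>scatter plot placeholder</div></body></html>"

with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = save_chart_as_html(chart_html, "my_scatter", tmp_dir)
    saved_name = saved_path.name
    saved_content = saved_path.read_text(encoding="utf-8")

print(saved_name)
```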

STEP 5 - Cleaning up

When you have finished working with the Visualizer, it is highly recommended to clear all output, so that you can start fresh the next time without any leftovers. This can be done from Jupyter's top menu: choose Cell -> All Output -> Clear.

Clear Output
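The menu action above is the recommended route. For reference, the same effect can be sketched programmatically: in the version 4 notebook format, an .ipynb file is JSON in which each code cell carries an `outputs` list and an `execution_count`. The helper `clear_outputs` and the tiny in-memory notebook below are hypothetical examples, not part of the Visualizer.

```python
import json


def clear_outputs(notebook):
    """Empty the outputs and reset the execution count of every code cell."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


# Minimal stand-in for the parsed JSON of a version 4 notebook file.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["print('hi')"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
        },
    ],
}

cleared = clear_outputs(notebook)
print(json.dumps(cleared["cells"][1], indent=2))
```

To apply this to a file on disk, load it with `json.load`, call `clear_outputs`, and write the result back with `json.dump`. Keep a backup copy first.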

API

No API is available, since the Visualizer is a Jupyter notebook.