ELT New Experiment - materials-commons/materialscommons.org GitHub Wiki

To upload and process a large spreadsheet, or any spreadsheet with attached data, you will generally use Globus together with the Materials Commons Excel-file ETL.

This section describes how to build an experiment, with attached files, using ETL once those files have been uploaded to a project in Materials Commons. To upload files into Materials Commons, see the upload instructions in the Globus Command help page.

We assume that you uploaded a folder that contains both the Excel file and the data folders and files for the new Experiment. In our example, we have a folder "allInOne" that contains a spreadsheet "small_input.xlsx" and a folder "data":

Project folder with excel file and data directory

Then, in the spreadsheet, as shown in this screenshot, the references to the data for each file entry start from the folder "data". For example, the file path in the FILES column of the HeatTreatment process for Sample 1 is "data/file1.txt"; the data folder contains the file "file1.txt" referenced by this entry:

Spreadsheet showing files in data directory
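Before uploading, it can help to confirm on your local copy that every path listed in the FILES column resolves to a real file next to the spreadsheet. The following is a minimal sketch of such a check, not part of Materials Commons: the helper name `missing_references` is hypothetical, and it assumes you have already collected the FILES entries into a list (by hand or with a spreadsheet library of your choice).

```python
from pathlib import Path


def missing_references(excel_path, file_entries):
    """Return the FILES entries that do not exist relative to the spreadsheet's folder.

    Hypothetical helper for a local pre-upload check; it does not talk to
    Materials Commons and does not read the spreadsheet itself.
    """
    base = Path(excel_path).parent  # e.g. the local "allInOne" folder
    return [entry for entry in file_entries if not (base / entry).is_file()]


if __name__ == "__main__":
    # Entries copied from the FILES column of the example spreadsheet.
    entries = ["data/file1.txt"]
    for entry in missing_references("allInOne/small_input.xlsx", entries):
        print("Referenced file not found:", entry)
```

An empty result means every referenced file sits where the spreadsheet expects it, relative to the folder that holds the Excel file.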

With that spreadsheet uploaded (as "small_input.xlsx") with its data folder, all in one directory, we can now run the ETL to build a new Experiment. Start from the Project's home page, which you get to by clicking on the project name at the top of the left navigation bar ("Demo for ELT", in this case):

Project home page showing buttons

Then, to start the ETL using this Excel file, click on the button "NEW EXPERIMENT USING ETL":

ETL Panel

And you will see this ETL dialog to create an experiment from a spreadsheet (make sure you are seeing the panel for the "USING PROJECT FILES" tab):

Detail of Excel File pulldown

To select the spreadsheet, open the pulldown menu (small triangle on the right), which shows a list of all the Excel files in your project. Pick the Excel file for this ETL, making sure it is the one in the folder containing the data, especially when there are multiple files with the same name:

ETL parameters filled in
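When a project holds several spreadsheets with the same name, the one to pick is the one whose folder also contains the data folder. As an illustration only, the sketch below applies that rule to a plain list of project file paths; the function name `excel_files_with_data` and the input list are assumptions for the example, and nothing here calls a Materials Commons API.

```python
from pathlib import PurePosixPath


def excel_files_with_data(project_paths, data_folder="data"):
    """Return the .xlsx paths whose own folder also contains `data_folder`.

    Hypothetical helper: `project_paths` is a list of slash-separated file
    paths, for example copied from a project file listing.
    """
    paths = [PurePosixPath(p) for p in project_paths]

    # Collect every folder that directly contains a folder named `data_folder`,
    # inferred from the files found underneath it.
    parents_with_data = set()
    for path in paths:
        folders = path.parts[:-1]  # drop the file name
        for i, part in enumerate(folders):
            if part == data_folder:
                parents_with_data.add(PurePosixPath(*folders[:i]))

    return [
        str(path)
        for path in paths
        if path.suffix.lower() == ".xlsx" and path.parent in parents_with_data
    ]


if __name__ == "__main__":
    listing = [
        "allInOne/small_input.xlsx",
        "allInOne/data/file1.txt",
        "old/small_input.xlsx",
    ]
    print(excel_files_with_data(listing))
```

In this example listing, only the spreadsheet in "allInOne" qualifies, because "old" has no data folder beside its copy of the file.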

Then fill in the Experiment name and (optionally) the description, and click the "RUN ETL" button. The button is disabled until both an Excel file and an Experiment name (the required parameters) have been provided. Once the button is clicked you will see a "Loading" message; this is because, currently, the ETL process runs synchronously and will not return until the command is finished or the browser times out. If the command finishes without timing out, you will see the new experiment on your project's home page.

Results of Excel ETL

If the browser times out before the ETL finishes, just wait for a while and the process will finish. The data for the experiment is loaded incrementally during the process, so if you only see partial results, it is likely that the browser timed out and the experiment is still loading. Wait for 20 minutes or so and then refresh the page.