Excel Spreadsheet Preparation - materials-commons/materialscommons.org GitHub Wiki
In order to use your spreadsheet to create an Experiment description in Materials Commons, there are a few requirements. First off, you will note, in what follows that the spreadsheet is organized, generally, into "one process-run per row" - while there are exceptions to this, described below in the example, we are assuming that data you are collecting describe a setup of the sample, followed by one or more sample "treatment" steps that express the variable conditions being studied, followed a measurement step, the "results" for the experiment. Variations on this basic pattern are easily expressed, for example measurements between additional treatment steps, or multiple (nondestructive) measurements of a sample. In general, however, the "flow" from left to right reflects the temporal flow of a sample through multiple workflow steps. An example workflow might be something like this: take a sample of a particular alloy, perform tempering, stretch it, and examine it's surface microstructure in an scanning electron microscope.
Secondly, you will have to add annotations, described herein, as a small number of header lines at the top of the spread sheet. Three of these lines are required and additional lines my be used to add detail to the annotations. Also required is an additional left-side column to annotate the nature and order of the optional header lines.
This image shows a contrived example, to illustrate some of the features of an annotated spreadsheet. While not necessary, in this case, color has been added to assist in delineating the header rows, and the process annotations there in. This screenshot image does not show the entire spreadsheet, but enough of it to illustrate the description of details, below. In most cases, your browser will have an option (left click, for example) that will let you view the image, full size, or magnified, to enable viewing of detail.
Spreadsheet data portion
In this example, the recorded data starts in cell 6B and occupies all the cells from there below and to the right. Generally the first empty row, moving downward, and the first empty column, moving right indicate the end of the data. Thus, this could have easily been data from an original spreadsheet that was copied and annotated as described. In the data portion, in general, each row contains the data for the processing of an individual sample following the process description from the header rows. We will see that some steps in the general workflow, as described in the header rows, can be skipped in the individual workflow for a sample, for example, 15 through 21 show workflows in which these samples were not subjected to the Heat Treatment labeled "Preparation 1" - remember this is a contrived example.
Header Rows - Row one
The first of the header rows contains the model process name (know as the 'template name') proceeded by the keyword "PROC: " - in our example, theses are "PROC: Create Samples", "PROC: Heat Treatment" (twice) and "PROC: EBSD SEM Data Collection". Each position in this row that contains such a label, defined a process-block, the column that the label is in and the columns to it's right. In our example, columns B-E are the headers for a 'Create Sample' Process, columns F-G for the first 'Heat Treatment' process, columns H-I for the second 'Heat Treatment' process and columns K+ for a 'EBSD SEM Data Collection' process.
In the first row, there can be an optional label "END" at the head of the first column beyond the data, to maek the end of the process headers and data, in the case that this column is not blank.
Header Rows - Rows two and three
The next two header rows annotate the column types within the process description and for MEAS and PARAM types the attributes for those type.
For row two, the acceptable column types are SAMPLES, MEAS, PARAM, IGNORE, NOTE, and FILES.
- SAMPLES - sample name - indicates name of the specific that is used in the process instances in this row (in 'Create Samples' process)
- MEAS indicates a measurement
- PARAM indicates either a template-defined setup parameter or a user-defined process parameter
- IGNORE and NOTE columns are not processed, they are for user notes and/or simply columns that should be ignored (like intermediate results)
For Measurements and parameters, row three indicates the measurement or parameter attribute, by name and, optionally, with unit; in this form - name (unit). In our example: 'Temperature (C)' and 'Time (h)' are examples of setup parameters; for these the particulars of spelling and format are important. The other parameters, like Catalog number in the 'Create Samples' process are user-defined process parameters. Attribute names can also be hierarchal, for example: the attribute 'composition.Al (wt%)' indicates that the attribute 'composition' as parts, like 'Al' for each element. Note: (unit) is required for setup parameters that have a unit selection, otherwise optional.
If a PARAM name does not exactly match a known setup parameter, it is treated as a user-defined process parameter - see the process templates for setup parameter names and unit selections.
Other header rows.
The additional header rows are indicated by special tags in the left hand column (column A). Acceptable tags are: NAME, DESCRIPTION, LABEL, BEGIN_DATA
- NAME - (optional) this row is for user-specified Process names; the process name it in the first column of the process description block (in our example, cell F4 indicates that the instances of the 'Heat Treatment' process defined by columns F-G, should be named "Preparation 1"). Where there is no name in this row (cells B4 and K4) the process instances will be given the template name as their name.
- DESCRIPTION - (optional) in the first column of the process, a description for process instances. Processes with no data are “skipped” - not in experiment workflow Spreadsheet is "master", experiment data is regenerated each time the spreadsheet is uploaded
- LABEL - (optional, multiple) indicates a user-defined (and hopefully descriptive) name or description for the data column. And characters can be used here.
- BEGIN_DATA - (required) must the last tag in the first column and most be on the row where the data starts.
Templates
When creating a spreadsheet, in the first header row, in the PROC: tag, you will need to know the name of the Process Template that represents the process you wish to describe. For parameters and measurement given in the template, in the case that you wish to specify the template parameters (which are, for example, displayed in the UI) you will need to know the detailed spelling of the parameter name. Other details that you will need to know are the expected spelling of unit names and of choices on selection-type values for those Setup Parameters and Measurements that are given in the template. These details are available in the detailed index of the Materials Commons templates. This index is available at this site: https://test.materialscommons.org/mcapp/#/public/templates
Note, a user can specify parameter names and measurement names that are not given in the template, and these will be loaded from the spreadsheet into the experiment description and be available for download, for example, when the experiment is written out to a spreadsheet, but are not be displayed in the UI at this time.
The template index looks like this:
The columns of this table can be sorted, on all of the columns except the description, and you can filter on the text or numbers in any field in the table. To sort click on the tables's column header - in the screenshot shown above, the table was sorted A-Z on the template name (the left most column). Click again to change sort direction. To filter table items, start typing the word (or words) or number that you would like to filter on in the filter input at the top of the page, and table items will be limited (as you type) to those items matching what you have typed.
The first column shows the template names. These are the names used in the "PROC:" tag (described above) for the identification of a process type. The entire name need not be typed (or copied) into the spreadsheet, a unique substring is sufficient. For example, for the template Crystal Plasticity Finite Element an tag of PROC: Crystal Plasticity would suffice, as there is no other process template name that matches that partial-name string.
In setting up the process headers in the Excel file, one need to have a process template name, and (optionally) the template-supported parameters and/or measurements. For example, this screenshot snippet (extracted from the image at the head of this section) shows the excel-spreadsheet header for a Heat Treatment process:
Note that the process type tag (in line one), PROC: Heat Treatment, contains the process template name: Heat Treatment.
If we go the the template index and look up the details for this template (that is, click on the name), we get a pop-up that shows the template details. It looks like this:
Therein we see that there are four Setup Parameters: Temperature, Time, Cooling Type, and Cooling Rate. In building the spreadsheet, we can use these parameters to describe the setup of the heat treatment process. Other parameters and measurements may also be used. The advantage in using the predefined Setup Parameters is that they will also appear in the various process User Interfaces on the web site. Currently, parameters and measurement that do not match those in the template are not displayed in the UI. Accordingly, in this case, we have chosen to use the predefined Temperature and Time setup parameters.
Note, also, from the details panel, that the setup parameters, Temperature and Time, have predefined unit specifications. The form and spelling of the predefined setup parameters are defined by the template, and usage must match that definition. Thus, in the spread sheet, for the Temperature setup parameter, we used on line two PARAM and on line three Temperature (C). Likewise, for Time, we see PARAM in line two and Time (h) in line three. In general, line two will contain PARAM or MEAS for parameter or measurement, respectively and line three will contain the parameter or measurement name entry. In general, the form of the parameter and measurement names entry will be Name (unit), where name can match a template name, or be arbitrary, and for the case that the name is a predefined template name then the unit type must also match one of the template unit types for that name, otherwise unit can be any arbitrary string. Note that the unit type is contained in parentheses.
As described previously, cell H4 is the (user supplied) name for the process; and line 5 contains (under the LABEL tag on the left) a user supplied label, reminder or note for the column in question. See the full-page screenshot at the beginning of this section.