Transitions - Genometric/GeMSE GitHub Wiki

Transitions and how to apply them

GeMSE supports the following transitions: Extract, Sort, Rewrite, Discretize, and Clustering.

Follow these steps to apply a transition:

[Figure: steps to apply a transition]

  1. From the start-transition tree, choose the node to which the transition should be applied.
  2. Choose a transition from the Operations dropdown.
  3. (Optional) Assign a label to the transition in the Operation Label textbox.
  4. Set up the transition-specific parameters (described in the following for each transition).
  5. Click the Apply Operation button.

The transitions and their parameters are described in the following sections.

Extract

Extracts a sub-genometric space from a genometric space.

Extract Options

This transition has the following arguments:

  1. Rows-From (inclusive): sets the row number from which the extraction starts.
  2. Rows-To (exclusive): sets the row number at which the extraction ends.
  3. Columns-From (inclusive): sets the column number from which the extraction starts.
  4. Columns-To (exclusive): sets the column number at which the extraction ends.
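
The inclusive/exclusive bounds above behave like half-open ranges. The following is a minimal Python sketch of that behavior, not GeMSE's own code: the function name `extract`, the list-of-lists representation, and zero-based indices are all assumptions made for illustration.

```python
def extract(space, rows_from, rows_to, cols_from, cols_to):
    """Return rows [rows_from, rows_to) and columns [cols_from, cols_to).

    Hypothetical helper: 'space' is a list of rows, indices are zero-based
    (an assumption; the source does not state the index base).
    """
    return [row[cols_from:cols_to] for row in space[rows_from:rows_to]]


space = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

# Rows 0 and 1 (row 2 is excluded), columns 1 and 2 (column 3 is excluded).
print(extract(space, 0, 2, 1, 3))  # [[2, 3], [6, 7]]
```

Because the To bounds are exclusive, the number of extracted rows is simply Rows-To minus Rows-From, and likewise for columns.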

Sort

Sorts the selected genometric space in Ascending or Descending order based on various attributes.

Sorting Area is the domain of the attributes for a sort criterion; hence, the Metadata and Sorting domain options vary depending on the selected Sorting Area. The Sorting Areas are as follows:

  • Columns Metadata: sorts columns based on the metadata of the loaded samples. When this option is selected, GeMSE populates the Metadata dropdown with all the loaded metadata attributes. The user chooses an attribute, then clicks the Add button to add the criterion. Multiple attributes can be defined, and columns are sorted based on the attributes in the order in which they were entered.
  • Rows Metadata: sorts rows based on a string attribute of the loaded samples (e.g., gene name).
  • Columns Contents: sorts columns based on the values in the cells. The user defines the range of columns whose cells should be used for sorting, using the Range From and To arguments. If the To argument is greater than the number of columns, GeMSE adds only the available columns when the Add button is clicked.
  • Rows Contents: sorts rows based on the values in the cells. Parameter setting is similar to Columns Contents.
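
Sorting on several metadata attributes "in their entered order" amounts to a multi-key sort. The sketch below illustrates the idea in Python under stated assumptions: the function `sort_columns_by_metadata`, the dictionary layout, and the sample metadata values are hypothetical and do not come from GeMSE.

```python
def sort_columns_by_metadata(columns, criteria, descending=False):
    """Sort columns (samples) by metadata attributes in priority order.

    'columns' is a list of dicts holding each sample's metadata and
    'criteria' lists attribute names in the order they were added.
    Both are illustrative assumptions, not GeMSE data structures.
    """
    return sorted(
        columns,
        key=lambda col: tuple(col[attr] for attr in criteria),
        reverse=descending,
    )


samples = [
    {"id": "S1", "cell": "K562", "antibody": "CTCF"},
    {"id": "S2", "cell": "HeLa", "antibody": "POLR2A"},
    {"id": "S3", "cell": "HeLa", "antibody": "CTCF"},
]

# 'cell' is compared first; 'antibody' only breaks ties within a cell.
ordered = sort_columns_by_metadata(samples, ["cell", "antibody"])
print([s["id"] for s in ordered])  # ['S3', 'S2', 'S1']
```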

Sort Options

Rewrite

The Rewrite transition can be applied to the whole selected genometric space or to a sub-genometric space of it. By default, the Rows-From/To and Columns-From/To arguments are set to target the entire selected genometric space.

The Source Range parameter, with its From and To arguments, sets a range of values to be rewritten with the value given for the New Value parameter.

After setting the parameters, click the Add button to register the mapping (it is not applied yet). The user can define an arbitrary number of mappings. All the registered mappings are displayed in the table below the Add button, and they can be deleted by clicking the Reset button. The mappings are applied when the Apply Operation button is clicked.

Note that the defined mappings are discrete (see Discretize for contiguous rewrite). In other words:

  • the Rewrite operation does not necessarily rewrite all the values in the selected sub-genometric space (values that are not covered by any mapping do not change);
  • the defined mappings can overlap, and a value covered by several overlapping mappings is changed according to the last of them.
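
The two properties above (uncovered values stay as they are, and the last overlapping mapping wins) can be sketched in Python as follows. This is an illustration only: the function `rewrite`, the tuple layout of a mapping, and the choice to treat both range ends as inclusive are assumptions, since the source does not state how range boundaries are handled.

```python
def rewrite(space, mappings):
    """Apply range-to-value mappings to every cell of 'space'.

    'mappings' is a list of (low, high, new_value) tuples in registration
    order. Ranges are treated as closed [low, high], which is an
    assumption made for this sketch.
    """
    result = []
    for row in space:
        new_row = []
        for value in row:
            replacement = value  # stays unchanged if no mapping covers it
            for low, high, new_value in mappings:
                if low <= value <= high:
                    replacement = new_value  # a later mapping overrides an earlier one
            new_row.append(replacement)
        result.append(new_row)
    return result


space = [[1, 5, 12], [20, 7, 3]]
mappings = [(0, 10, 0), (5, 15, 1)]  # the two ranges overlap on [5, 10]

print(rewrite(space, mappings))  # [[0, 1, 1], [20, 1, 0]]
```

In the output, 20 is untouched because no mapping covers it, while 5 and 7 fall in both ranges and take the value of the mapping registered last.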

Rewrite Options

Discretize

Unlike Rewrite, Discretize is a contiguous rewrite. Similar to Rewrite, Discretize can be applied to a sub-genometric space of a selected genometric space. The operation is set up as follows.

Choose a break point. This point divides the values of the selected genometric space into two groups:

  1. the values less than or equal to the break and greater than the nearest previously defined smaller break, if any. All the values in this range are replaced with the value entered in the textbox to the left of the break textbox, or left unchanged if Change is unchecked.
  2. the values greater than the break and less than or equal to the nearest previously defined greater break, if any. All the values in this range are replaced with the value entered in the textbox to the right of the break textbox, or left unchanged if Change is unchecked.

For instance:

  • Set the break to 0, and enter -10 and 10 in the left and right value textboxes respectively. This defines a mapping as:

    (-∞, 0] ← -10

    (0, ∞) ← 10

  • Then set the break to 50 and enter 10 and 100 in the left and right value textboxes respectively. This updates the mapping as follows:

    (-∞, 0] ← -10

    (0, 50] ← 10

    (50, ∞) ← 100

GeMSE automatically updates the From and To labels based on the entered break and previously defined mappings.

The mapping is registered by clicking the Add button. All the defined mappings are listed in the mappings table.
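
The two-break example above can be reproduced with a short Python sketch. It is not GeMSE's implementation: the function `discretize` and the convention of using `None` to stand for an interval whose Change box is unchecked are assumptions made for illustration.

```python
from bisect import bisect_left


def discretize(values, breaks, labels):
    """Map each value to the label of the interval it falls into.

    'breaks' must be sorted ascending and 'labels' must have
    len(breaks) + 1 entries, one per interval:
    (-inf, b0], (b0, b1], ..., (b_last, +inf).
    A label of None leaves the value unchanged (hypothetical stand-in
    for an unchecked Change box).
    """
    out = []
    for v in values:
        # bisect_left counts the breaks strictly smaller than v, so a value
        # equal to a break lands in the interval that ends at that break.
        label = labels[bisect_left(breaks, v)]
        out.append(v if label is None else label)
    return out


breaks = [0, 50]
labels = [-10, 10, 100]

print(discretize([-3, 0, 0.5, 50, 51], breaks, labels))
# [-10, -10, 10, 10, 100]
```

Unlike the Rewrite mappings, the intervals here never overlap and together cover every possible value, which is what makes the rewrite contiguous.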

Discretize Options

Clustering

This operation clusters data using Agglomerative Hierarchical Clustering. The linkage criteria options are:

Clustering can be applied to rows, columns, or both (also known as bi-clustering, which requires a connection to R to be set up a priori; see the setup page). The clustering metrics are:

Additionally, GeMSE suggests a number of clusters using the Elbow method. If Plot Elbow method data is checked, it plots the percentage of variation against the number of clusters, and suggests a number of clusters based on the largest change in the slope of the percentage of variation.
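
One common way to read "largest slope change" is to pick the cluster count at which the curve bends most sharply, that is, where the slope before the point exceeds the slope after it by the widest margin. The Python sketch below shows that interpretation only; GeMSE's exact rule is not documented here, and the function `suggest_cluster_count` and the percentages in the example are made up for illustration.

```python
def suggest_cluster_count(variation):
    """Suggest a number of clusters from an elbow curve.

    variation[i] is the percentage of variation for i + 1 clusters.
    Returns the cluster count with the largest drop in slope, or None
    if fewer than three points are given. This is one interpretation
    of the Elbow method, not necessarily the rule GeMSE applies.
    """
    best_k, best_change = None, float("-inf")
    for i in range(1, len(variation) - 1):
        slope_before = variation[i] - variation[i - 1]
        slope_after = variation[i + 1] - variation[i]
        change = slope_before - slope_after
        if change > best_change:
            best_k, best_change = i + 1, change
    return best_k


# Illustrative percentages for 1 to 6 clusters (not real GeMSE output).
variation = [0.0, 40.0, 75.0, 82.0, 86.0, 88.0]

print(suggest_cluster_count(variation))  # 3
```

In this made-up curve the gain drops from 35 points to 7 points after the third cluster, so 3 is suggested.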

Clustering Options
