Glossary for Developers - pm4knime/pm4knime-document GitHub Wiki

Glossary for Developers

This folder contains tutorials for the pm4knime project, which integrates ProM into KNIME. This file describes the KNIME concepts needed to understand the project better.

Java General Concepts

  1. Difference between the JDK and the JRE, and when to use each. The summary below is based on this link.

JDK > JRE > JVM
JDK = JRE + development/debugging tools
JRE = JVM + Java package classes (such as util, math, lang, awt, swing) + runtime libraries
JVM = class loader system + runtime data area + execution engine

In other words, if you are a Java programmer you need the JDK on your system, and this package includes the JRE and the JVM as well. If you are a normal user who only runs Java programs, for example online games, you only need the JRE, and this package does not contain the JDK.
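One way to see the difference in practice is to ask the running Java installation whether it ships a compiler. The sketch below uses the standard `javax.tools.ToolProvider`, whose `getSystemJavaCompiler()` returns `null` when no compiler is provided (typically a JRE-only installation). The class name `JdkCheck` and the message wording are made up for illustration.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class JdkCheck {

    /** Returns true when the running Java installation provides a compiler. */
    static boolean hasCompiler() {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        return compiler != null;
    }

    /** Builds a short summary; the wording is illustrative only. */
    static String describe(boolean compilerAvailable) {
        return compilerAvailable
            ? "Compiler found: this looks like a JDK (JRE + development tools)."
            : "No compiler found: this looks like a JRE-only installation.";
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println(describe(hasCompiler()));
    }
}
```

For developing KNIME nodes the first message is the one to expect, since compiling the node code needs the JDK tools.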

KNIME General Concepts

This section introduces the common KNIME concepts, mainly by linking to the online documentation. It also highlights the features of these concepts that are useful when programming in KNIME.

  1. DataCell:
  • A DataCell is the smallest unit for storing data, such as a double, another numeric value, or a string.
    It is initialized through the corresponding cell class,
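// Start from a missing cell, then replace it with a typed cell.
DataCell cell = DataType.getMissingCell();
double dValue = peekFlowVariableDouble(c.getFirst());
cell = new DoubleCell(dValue);

String sValue = peekFlowVariableString(c.getFirst());
// Fall back to an empty string when no value is available.
sValue = sValue == null ? "" : sValue;
cell = new StringCell(sValue);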
  • A DataCell behaves like a final variable. Once it has been created with a value, the value cannot be changed, because there is no setValue() method.
  • A missing cell is compatible with all DataTypes, so before working with the exact DataType of a cell we need to test whether it is missing:
DataCell cell = row.getCell(i);
if (!cell.isMissing()) {
    // do something...
}
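The two properties above (no setter, and a missing-check before reading) can be illustrated without any KNIME dependency. The following is a minimal plain-Java sketch; `CellSketch` and `CellDemo` are hypothetical stand-ins written for this page and are not KNIME classes.

```java
import java.util.List;

/** Hypothetical stand-in for a cell: a value that may be missing and has no setter. */
final class CellSketch {
    private final Double value; // null models a missing cell

    private CellSketch(Double value) { this.value = value; }

    static CellSketch of(double v) { return new CellSketch(v); }

    static CellSketch missing() { return new CellSketch(null); }

    boolean isMissing() { return value == null; }

    double getDoubleValue() {
        if (isMissing()) {
            throw new IllegalStateException("a missing cell has no value");
        }
        return value;
    }
}

class CellDemo {
    /** Sums all non-missing cells, testing isMissing() before reading a value. */
    static double sum(List<CellSketch> row) {
        double total = 0.0;
        for (CellSketch cell : row) {
            if (!cell.isMissing()) {
                total += cell.getDoubleValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        CellSketch cell = CellSketch.of(1.5);
        // "Changing" a cell means creating a new one and re-pointing the variable.
        cell = CellSketch.of(2.5);
        System.out.println(sum(List.of(cell, CellSketch.missing(), CellSketch.of(4.0))));
    }
}
```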
  1. DataColumn:
  2. DataColumnSpec:
// One cell per variable; the array holds the cells of a single row.
DataCell[] specs = new DataCell[vars.size()];
...
specs[i] = cell;
DataRow newRow = new DefaultRow(rowName, specs);

An array of DataCell can be used to create a DefaultRow, and the elements of the array may have different types. Later, when we create a table from such rows, every column must keep the same type as in the first row.

  1. DataTable:
  • Immutable, which means the data kept in it is read-only.
  • There is no random access for the predefined DataTable in KNIME. To read data from a DataTable, we use its iterator. The reason is storage: a BufferedDataTable keeps only a chunk of the data in memory rather than the whole table, which matters especially when the data is big.
  • A DataTable must be created with an ExecutionContext exe, and there are two ways to create a DataTable. More information under this link.
  1. DataTableSpec: Consider the following code, taken from knime-base on GitHub, which shows how to convert variables to a data table.
DataColumnSpec[] specs = new DataColumnSpec[vars.size()];
...
specs[i] = new DataColumnSpecCreator(c.getFirst(), type).createSpec();
DataTableSpec tableSpec = new DataTableSpec(specs);

The code above shows that a DataTableSpec can be created from an array of DataColumnSpec.

  • BufferedDataContainer: It is mutable; rows can be added to it.
  • ExecutionContext:
DataTableSpec spec = createOutSpec();
BufferedDataContainer cont = exec.createDataContainer(spec);
...
// one row is added to the container
cont.addRowToTable(newRow);
// the container must be closed before the table can be retrieved
cont.close();
// for returning data:
// a BufferedDataTable is created from the container by getTable()
return new PortObject[]{cont.getTable()};

A BufferedDataContainer is created from the ExecutionContext according to a table spec.
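The relationship between a mutable container and the read-only, iterator-only table it produces can be sketched in plain Java. This is only a model of the pattern under simplifying assumptions: `ContainerSketch`, `TableSketch`, and `ContainerDemo` are hypothetical classes written for this page, a column count stands in for the table spec, and none of this is the KNIME implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical read-only table: rows can only be visited through an iterator. */
final class TableSketch implements Iterable<Object[]> {
    private final List<Object[]> rows;

    TableSketch(List<Object[]> rows) {
        this.rows = Collections.unmodifiableList(new ArrayList<>(rows));
    }

    @Override
    public Iterator<Object[]> iterator() {
        return rows.iterator();
    }
}

/** Hypothetical mutable container: add rows, close it, then ask for the table. */
final class ContainerSketch {
    private final int numColumns; // stands in for the table spec
    private final List<Object[]> rows = new ArrayList<>();
    private boolean closed = false;

    ContainerSketch(int numColumns) { this.numColumns = numColumns; }

    void addRowToTable(Object... cells) {
        if (closed) {
            throw new IllegalStateException("container already closed");
        }
        if (cells.length != numColumns) {
            throw new IllegalArgumentException("row does not match the spec");
        }
        rows.add(cells.clone());
    }

    void close() { closed = true; }

    TableSketch getTable() {
        if (!closed) {
            throw new IllegalStateException("close the container first");
        }
        return new TableSketch(rows);
    }
}

class ContainerDemo {
    /** Counts rows the only way the table allows: by iterating. */
    static int countRows(TableSketch table) {
        int n = 0;
        for (Object[] row : table) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        ContainerSketch cont = new ContainerSketch(2);
        cont.addRowToTable("a", 1);
        cont.addRowToTable("b", 2);
        cont.close();
        System.out.println(countRows(cont.getTable()));
    }
}
```

Separating the two roles keeps all writes in one short-lived object, so the resulting table can be handed to later nodes without any risk of it changing.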

  1. FlowVariable [https://www.knime.com/wiki/flow-variables]
  • To allow dialog parameters to be set by flow variables, those dialog parameters need to implement `saveSettingsTo()` and `loadSettingsFrom()`.
new PortType[]{BufferedDataTable.TYPE}, new PortType[]{FlowVariablePortObject.TYPE}
  • Flow variables are control parameters for the workflow, unlike normal data. The process usually works like this:
  • get the table spec from the input
  • iterate over the columns of this table spec to get each column spec
  • create a new data cell according to the type of the column spec
  • push the data cell into the flow variables
private DataCell[] createDefaultCells(final DataTableSpec spec) {
    final DataCell[] cells = new DataCell[spec.getNumColumns()];
    for (int i = cells.length; --i >= 0;) {
        final DataColumnSpec c = spec.getColumnSpec(i);
        // create a default cell that matches the column type
        if (c.getType().isCompatible(IntValue.class)) {
            cells[i] = new IntCell(m_int.getIntValue());
        }
    }
    return cells;
}

...
protected void pushVariables(final DataTableSpec variablesSpec, final DataRow currentVariables) throws Exception {
    ...
    DataColumnSpec spec = variablesSpec.getColumnSpec(i);
    DataType type = spec.getType();
    final DataCell cell;
    cell = currentVariables.getCell(i);
    if (cell != null) {
        // push the cell value as a flow variable of the matching type
        if (type.isCompatible(IntValue.class)) {
            pushFlowVariableInt(name, ((IntValue) cell).getIntValue());
        }
        ...
    }
    ...
}
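The steps listed above (read the spec, walk over the columns, pick a value by type, push it as a variable) can be followed end to end in a small plain-Java sketch. Everything in it is an assumption made for illustration: `FlowVariableSketch` is a hypothetical class, a map stands in for the flow-variable stack, and parallel arrays of names and types stand in for the table spec. It does not use the KNIME API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FlowVariableSketch {

    /** Hypothetical stand-in for the flow-variable stack of a node. */
    private final Map<String, Object> variables = new LinkedHashMap<>();

    /**
     * Walks over the columns described by names/types and pushes the matching
     * value of the row as a variable. Empty cells and unsupported types are skipped.
     */
    void pushVariables(String[] names, Class<?>[] types, Object[] row) {
        for (int i = 0; i < names.length; i++) {
            Object cell = row[i];
            if (cell == null) {
                continue; // nothing to push for an empty cell
            }
            if (types[i] == Integer.class) {
                variables.put(names[i], (Integer) cell);
            } else if (types[i] == Double.class) {
                variables.put(names[i], (Double) cell);
            } else if (types[i] == String.class) {
                variables.put(names[i], (String) cell);
            }
        }
    }

    Map<String, Object> getVariables() {
        return variables;
    }

    public static void main(String[] args) {
        FlowVariableSketch node = new FlowVariableSketch();
        node.pushVariables(
            new String[] {"count", "ratio", "label"},
            new Class<?>[] {Integer.class, Double.class, String.class},
            new Object[] {3, 0.5, "demo"});
        System.out.println(node.getVariables());
    }
}
```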