CI Properties and Dynamic Fields - bschmalhofer/otobo-ideas GitHub Wiki

Introduction

In OTOBO, entities like tickets, articles, customers, and customer users can hold a ordered flat list of properties. A property has a name, a type, a value, and maybe additional info like a display name. An ordered list is either a special case of an ordered tree or of a JSON-like data structure. In this context a single property is referred to as a dynamic field. Let's call the list of properties a ticketing property tree.

Configuration items can have nested data structure that contains properties. The differences to the afore mentioned property lists are:

  • different type declaration
  • the ordered tree can have nesting depth greater one
  • different representation in the database

Let's call that data structure a CI property tree.

It would be nice to merge the technical base for ticketing property trees and CI property trees.

Implementation as of OTOBO 10.1.x

Storage in the table xml_storage.

The table xml_storage and the Perl module Kernel::System::XML are used for storing the stats configuration and the CI property trees. The stored data structure is a XMLHash. A XMLHash is a list of undefs and hashrefs. The values of the hashrefs are either strings or other XMLHashes. undefs are used primarily as the first element of a XMLHash in order to allow for indexes starting at 1. Thus XMLHashes are like ordered trees. They would be real ordered trees if the keys holding an arrayref were really sub elements.

Note: The data stored in xml_storage is more JSON-like than XML-like.

The table attribute xml_storage.xml_content_key holds a path to the value in a nested Perl data structure. Searching is done with LIKE searches. There is no indexing of specific content_keys.

CI Property tree in XML storage

A subset of XMLHashes where only the key Content has a string value. For storing CI annotation we need a reference to the root and for each property:

  • name
  • type
  • reference to the parent or materialized path or an interval
  • maybe payload as key value pairs
  • sequence id per name an parent

TODO: Does it make sense to switch to DAGs? Might be useful for versioning config items, where there are usually only small changes.

Schema of CI property tree

Currently the description for CI types is homegrown. It contains everything needed for consistency checking, creation, editing, search. Notably, is declared the order in which properties are displayed.

Storage of dynamic fields

The tables dynamic_field and dynamic_field_value are used for that.

Proposal for changing the underlying data storage

This chapter is about changing the underlying data storage that supports CI property trees and ticketing property trees. These are the goals:

  • be 100 % feature compliant (no new features yet)
  • provide a migration path
  • use standards, specifically JSON-related standards
  • unify CI property trees and Ticketing property trees
  • allow to be performant
  • allow easy addition of new features

The general idea is to keep the basic aproach of xml_storage. Property sets are declared with JSON Schema. Storage of specific data types will be supported.

New Database Table json_storage

There will be a new table called json_storage. A replacement for xml_storage and dynamic_field_value. The table attributes are:

  • object_type: like xml_storage.xml_type with new possible values for the ticketing properties tree
  • object_key: like xml_storage.xml_key foreign key to the table indicated by object_type
  • pointer: like xml_storage.xml_content_key but a JSON Pointer instead of Dumper-like path
  • value_text, value_data, value_int: like xml_storage.xml_content_value but more specific, just like in dynamic_field_value.
  • version: replacement for the first level in the old xml_storage ???
  • is_latest: mark what the latest version is ???

For CI property trees all content will be stored in value_text so there will be no migration hindrances. Ticketing property trees will use the specific types. The TagKey entries from xml_storage are not needed and can be discarded. Thus there no need to extra Content and we do away with the dummy undef element as the first element of a list.

Searching in json_storage is initially done with the same strategy as was used for xml_storage.

New Database Table json_schema

Contains JSON Schema. Replacement for configitem_definition and dynamic_field.

Actually this is not plain JSON Schema definitions, but schemas with additional vocabulary. This includes the required information from the old tables and the old CI field declaration. Among that are the indicators whether properties are searchable and whether changes lead to new versions. It would be nice to base the vocabulary on an existing approach.

eliminate the dynamic field tables

This is done by switching to the new tables

enhance the test suite

create benchmarks

Proposal for Enhancements

allow nesting in ticketing property trees

support the dynamic field types in CI property trees

Using libraries for generating the UI

Something like:

JSON-LD

  • add mapping to fully qualified names to the JSON schema
  • emit machine readable JSON-LD in the emitted content
  • activate features like providing a map for addresses

avoid LIKE in searches

This will likely involve the addition of generated fields based on json_storage.pointer

Remove the table xml_storage altogether, by storing the stats configuration as a simple JSON field

Optimize the searching only after doing benchmarks.

Support for includes in JSON schemas

Seems to be supported: https://json-schema.org/understanding-json-schema/structuring.html

Excursion: How to store an ordered tree in the database

Note: This discusses the storage of ordered trees, not JSON-like data structures.

Storage in database tables

The three standard approaches are:

  • adjacency list
  • materialized path
  • nested set

These are nicely explained in https://railsware.com/blog/storing-tree-structures-in-the-rdbms/.

Other options are:

There are also:

  • closure tables
  • PostgreSQL has ltree for supporting materialized paths

The best approach for general trees might recursive common table expressions. See https://mariadb.com/kb/en/recursive-common-table-expressions-overview/. Is there a CPAN tree module that uses recursive CTEs ? https://metacpan.org/pod/SQL::Abstract::More does provide with_recursive().

External Storage in NoSQL-databases

Yet another option: Using MariaDB CONNECT for connecting to MongoDB.

Storage as XML

Oracle has XMLType, MySQL and PostgreSQL might have analogues.

Storage as JSON or JSON-B

The data structure for ordered trees can be dumped as JSON and stored in the database. MySQL and PostgreSQL have good support for JSON, but currently no good way of indexing.

Generated fields

Values from the stored JSON can be extracted into a generated fields which can be indexed.

JSON-LD

Linked data, https://json-ld.org/, is the preferred approach.

Reading List

Relevant CPAN modules