Design - nokia/minifold GitHub Wiki

Design

The design of minifold is based on three main entities:

  • Queries characterize a set of objects of interest.
  • Entries gather the set of objects matching a given query. In minifold, entries are represented by a list of dictionaries.
  • Nodes are in charge of performing the processing related to queries and entries. A node may have children in charge of performing some sub-tasks. This induces a directed acyclic graph structure (usually a tree).

The resulting graph corresponds to the query plan. There are two kinds of nodes:

  • Connectors correspond to its leaves. They wrap sources of data and translate minifold queries according to the paradigm of the remote platform (e.g. an LDAP query, a HAL query, etc.).
  • Operators correspond to the other nodes. They usually implement simple operations (like SQL operators) on the data.

Once the query plan is built, it can be executed:

  • The arcs of the query plan graph are traversed forward by queries and backward by entries.
  • The user sends a query via the root node. The root returns the requested entries to the user.
  • The query is recursively forwarded by each node to its children. The query may be modified when it is forwarded to a child node.
  • When a connector handles a query, it translates it into the formalism of the data source, collects the matching results, and returns the entries to its parent node.
  • A parent node processes the entries returned by its children, according to the primitive it implements.
  • The task assigned to a given node ends once all the entries returned by its children have been processed.
  • The query plan is fully executed once the root has finished its job.
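
The execution steps above can be sketched in plain Python, without minifold. The classes `ListLeaf` and `UnionNode` and the dictionary used as a query are hypothetical stand-ins invented for this illustration; they are not part of the minifold API. The sketch only shows a query travelling down to the leaves and the entries travelling back up to the root.

```python
# Hypothetical stand-ins illustrating the traversal; not the minifold API.

class ListLeaf:
    """Leaf of the plan: wraps an in-memory list of dictionaries."""
    def __init__(self, entries):
        self.entries = entries

    def query(self, q):
        # Keep only the requested keys of each wrapped entry.
        return [{k: e[k] for k in q["attributes"]} for e in self.entries]


class UnionNode:
    """Inner node: forwards the query to every child and concatenates the results."""
    def __init__(self, children):
        self.children = children

    def query(self, q):
        entries = []
        for child in self.children:
            # The query goes down to the child, its entries come back up.
            entries.extend(child.query(q))
        return entries


# A two-level plan: one root with two leaves.
plan = UnionNode([
    ListLeaf([{"title": "A", "year": 2017}]),
    ListLeaf([{"title": "B", "year": 2018}]),
])

# The user only talks to the root.
result = plan.query({"attributes": ["title"]})
print(result)
```

The root finishes only after both leaves have returned their entries, which mirrors the termination rule stated in the last two bullets.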

Queries

A Query is an object which characterizes the data of interest in a unified format. This object is inspired by SQL and typically embeds:

  • an action: ACTION_CREATE, ACTION_READ, ACTION_UPDATE, ACTION_DELETE
  • the queried object/table: for instance "researcher", "publication" or "conference"
  • the requested fields: for instance "year", "title", "authors"
  • optionally some filters (akin to the SQL WHERE clause)
  • and some other options (offset, limit, etc.)

Example: Institution query

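from minifold.binary_predicate import BinaryPredicate
from minifold.query import Query, ACTION_READ

# Select the institution whose identifier is 3.
q_institution = Query(
    action = ACTION_READ,
    object = "institutions",
    attributes = [],
    filters = BinaryPredicate("institution_id", "==", 3)
)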

Example: LDAP query

from minifold.binary_predicate import BinaryPredicate
from minifold.query import Query, ACTION_READ

# Select four attributes of the users whose surname is "Mathieu".
q_ldap = Query(
    action = ACTION_READ,
    object = "ou=users,dc=lincs,dc=fr",
    attributes = ["uid", "sn", "givenName", "departmentNumber"],
    filters = BinaryPredicate("sn", "==", "Mathieu")
)
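
How a filter such as `BinaryPredicate("institution_id", "==", 3)` selects entries can be illustrated with a small stand-alone helper. The function `matches` and the `OPERATORS` table below are hypothetical and written only for this page; they do not describe how minifold implements `BinaryPredicate`, and every operator string other than `"=="` is an assumption.

```python
import operator

# Hypothetical table mapping operator strings to Python functions.
# Only "==" appears on this page; the other entries are assumptions.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def matches(entry, key, op, value):
    """Return True if entry[key] satisfies (op, value); a missing key never matches."""
    if key not in entry:
        return False
    return OPERATORS[op](entry[key], value)


entries = [
    {"institution_id": 1, "institution": "TPT"},
    {"institution_id": 3, "institution": "INRIA"},
]

# Keep the entries accepted by the filter.
selected = [e for e in entries if matches(e, "institution_id", "==", 3)]
print(selected)
```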

Nodes

Connectors

A Connector is in charge of wrapping the sources of data involved in the query plan.

The following examples show how to build a Connector that can be queried afterwards, as shown in the "Queries" section.

Example: from a list of Python dictionaries.

from minifold.binary_predicate import BinaryPredicate
from minifold.entries import EntriesConnector
from minifold.query import Query, ACTION_READ

q_institution = Query(
    action = ACTION_READ,
    object = "institutions",  # Not needed
    attributes = [],
    filters = BinaryPredicate("institution_id", "==", 3)
)

# Wrap a list of Python dictionaries as a data source.
institution_connector = EntriesConnector([
    {"institution_id" : 1, "institution" : "TPT"},
    {"institution_id" : 2, "institution" : "UPMC"},
    {"institution_id" : 3, "institution" : "INRIA"},
    {"institution_id" : 4, "institution" : "SystemX"},
    {"institution_id" : 5, "institution" : "Nokia"},
])

# Run the query against the wrapped entries.
entries = institution_connector.query(q_institution)

Example: from an LDAP server:

from lincs_config import LDAP_HOST, LDAP_USERNAME, LDAP_PASSWORD
from minifold.binary_predicate import BinaryPredicate
from minifold.ldap import LdapConnector
from minifold.query import Query, ACTION_READ

q_ldap = Query(
    action = ACTION_READ,
    object = "ou=users,dc=lincs,dc=fr",
    attributes = ["uid", "sn", "givenName", "departmentNumber"],
    filters = BinaryPredicate("sn", "==", "Mathieu")
)

with LdapConnector(LDAP_HOST, LDAP_USERNAME, LDAP_PASSWORD) as ldap_connector:
    # Inside this block, the connector can be queried.
    entries = ldap_connector.query(q_ldap)

Operators

An Operator is in charge of processing the Query issued by its parent(s) and the entries returned by its children. Operators are usually based on an underlying function. The developer is free to interconnect nodes or to rely directly on these functions to build their workflow.

The following example shows the difference between these two approaches.

Example: querying HAL with the original ontology and renaming the results afterwards:

from pprint import pprint
from minifold.binary_predicate import BinaryPredicate
from minifold.hal import HAL_ALIASES, HalConnector
from minifold.query import Query, ACTION_READ
from minifold.rename import rename

hal_connector = HalConnector()

# The query uses the native HAL field names.
entries = hal_connector.query(Query(
    action     = ACTION_READ,
    object     = "publication",
    attributes = [
        "title_s", "producedDateY_i",
        "authFullName_s", "conferenceTitle_s"
    ],
    filters    = BinaryPredicate("authFullName_s", "==", "Fabien Mathieu")
))

# Rename the keys of the fetched entries with the underlying function.
publications = rename(HAL_ALIASES, entries)
pprint(publications)

Example: querying HAL with a renamed ontology:

from pprint import pprint
from minifold.binary_predicate import BinaryPredicate
from minifold.hal import HAL_ALIASES, HalConnector
from minifold.query import Query, ACTION_READ
from minifold.rename import RenameConnector

# Wrap the HAL connector so that queries and entries use the renamed fields.
hal_connector = RenameConnector(HAL_ALIASES, HalConnector())

publications = hal_connector.query(Query(
    action     = ACTION_READ,
    object     = "publication",
    attributes = ["title", "year", "authors", "conference"],
    filters    = BinaryPredicate("authors", "==", "Fabien Mathieu")
))
pprint(publications)
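
The two approaches above can be compared with a stand-alone sketch that does not contact HAL. Everything below is hypothetical: `ALIASES` is an illustrative mapping (not the actual content of `HAL_ALIASES`), `FakeSource` stands in for a remote source, and `rename_keys` and `RenamingNode` only mimic the idea of a function and of a node wrapping that function.

```python
# Hypothetical sketch; names and the alias mapping are illustrative only.

ALIASES = {"title_s": "title", "producedDateY_i": "year"}


def rename_keys(aliases, entries):
    """Function approach: rename the keys of already-fetched entries."""
    return [{aliases.get(k, k): v for k, v in e.items()} for e in entries]


class FakeSource:
    """Stand-in for a remote source that only understands its native field names."""
    DATA = [{"title_s": "Some paper", "producedDateY_i": 2018}]

    def query(self, attributes):
        return [{k: e[k] for k in attributes} for e in self.DATA]


class RenamingNode:
    """Node approach: translate the query on the way down, the entries on the way up."""
    def __init__(self, aliases, child):
        self.aliases = aliases
        self.reverse = {v: k for k, v in aliases.items()}
        self.child = child

    def query(self, attributes):
        # Map the renamed attributes back to the native names before forwarding.
        native = [self.reverse.get(a, a) for a in attributes]
        return rename_keys(self.aliases, self.child.query(native))


via_function = rename_keys(ALIASES, FakeSource().query(["title_s", "producedDateY_i"]))
via_node = RenamingNode(ALIASES, FakeSource()).query(["title", "year"])
print(via_function == via_node)
```

With the node approach the caller never sees the native field names, whereas the function approach leaves the translation of the query to the caller.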

Nodes design

In terms of implementation, Nodes rely on two primitives:

  • query() handles an incoming Query and forwards it to its child(ren). The Query may be altered during this step, depending on the nature of the Node. For example, the RenameConnector changes the attribute names mentioned in the query.
  • answer() processes the entries (resulting from a past query) returned by its child(ren) and returns them to its own parent (if any).
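
The split between the two primitives can be sketched as follows. This is a minimal illustration under stated assumptions: the classes `Leaf` and `UppercaseKeys`, the dictionary used as a query, and the `answer(q, entries)` signature are all hypothetical and are not taken from the minifold sources.

```python
# Hypothetical sketch of the two primitives; not minifold's classes or signatures.

class Leaf:
    """Leaf node: has no child, so query() directly produces entries."""
    def __init__(self, entries):
        self.entries = entries

    def query(self, q):
        return self.answer(q, list(self.entries))

    def answer(self, q, entries):
        # Nothing to post-process at the leaf.
        return entries


class UppercaseKeys:
    """Inner node: forwards the query unchanged and rewrites entries in answer()."""
    def __init__(self, child):
        self.child = child

    def query(self, q):
        # Step 1: forward the (possibly altered) query to the child.
        entries = self.child.query(q)
        # Step 2: hand the child's entries to answer() before returning them.
        return self.answer(q, entries)

    def answer(self, q, entries):
        return [{k.upper(): v for k, v in e.items()} for e in entries]


root = UppercaseKeys(Leaf([{"uid": "jdoe"}]))
out = root.query({"object": "users"})
print(out)
```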

If a Node requires a connection state, we rely on the Python "with" statement (e.g. LdapConnector).
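
As an illustration of this pattern, the sketch below shows a node that supports the "with" statement through Python's context-manager protocol (`__enter__` and `__exit__`). `StatefulConnector` and the host name are hypothetical; the sketch does not reproduce the internals of `LdapConnector`.

```python
# Hypothetical connector showing the context-manager protocol; not LdapConnector itself.

class StatefulConnector:
    def __init__(self, host):
        self.host = host
        self.connected = False

    def __enter__(self):
        # A real connector would open its network connection here.
        self.connected = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always release the connection, even if the block raised.
        self.connected = False
        return False  # do not swallow exceptions

    def query(self, q):
        if not self.connected:
            raise RuntimeError("connector used outside of a 'with' block")
        return []  # placeholder: a real connector would return matching entries


with StatefulConnector("ldap.example.org") as connector:
    inside = connector.connected
    entries = connector.query({"object": "users"})

after = connector.connected
print(inside, after)
```

The connection is released as soon as the block exits, so the caller cannot forget to close it.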
