Home - mageshwaranr/dmc GitHub Wiki
DMC
Data movement controller (DMC) is an event based dependency management system. It let's user define dependency / relationship among events and notifies when the dependency is met. It consists of
- a set of REST APIs to let user define event dependencies as a rule and configure callbacks/notifications.
- a http interface to listen to events
- a rule builder to express dependencies between events
- an async notifier.
Though, this can be viewed as a CEP system (listening to events and correlating them), DMC has purpose build intelligence to automically scope events in context of a rule and expire old events, hence very relevent in managing data-set dependencies in a data warehouse or similar.
In context of DMC, events are often Job triggers like START_DATASET_1 processing and DMC notifices the actual job once their dependencies (or) pre-conditions are met. Notifications are sent via web-hook easily extendable to something like Kafka. Events has headers (key, value pairs) using which one can be correlated with other event.
How to use it ?
All of the below are avaialble as an API and as an Java Library driven so you get the benefits of syntax check.
Define a Rule
Rule consists of
- spec (captures dependencies between events using their headers)
- callbacks that needs to be notified.
Spec will look like below
// Create a spec capturing dependencies between events using their headers
// More examples at com.thoongatechies.dmc.spec.service.SpecServiceImplTest
Spec spec = SpecBuilder.newBuilder()
.event("DATASET_1_COMPLETED")
.set("srcVersion").fromHeader("assemblyId") // rename & choose required headers from event
.set("customerId").fromHeader("customerId")
.and() // express and criteria between two events
.event("DATASET_2_COMPLETED")
.set("srcVersion").fromHeader("assemblyId")
.set("customerId").fromHeader("customerId")
.groupBy("srcVersion","customerId") // short cut for specifying e1.metadata1
// in the notification message, you configure the required data
.addToResponse("version").fromHeader("srcVersion") // ability to pick selective meta-data
.addToResponse("customerId").fromHeader("customerId")
.build();
A callback (web-hook) will look like below
// Create a callback URL to notify DATASET_3
CallbackDefinition callback = CallbackDefinition.newBuilder()
.withName("DataSet3_Processor")
.withUrl("http://somedomain.com/post/jobs")
.build();
Rule is a binding of both, which will look like below.
// create a rule binding the spec and callback information together
RuleDefinition rule = RuleDefinition.newBuilder()
.withName("Trigger DataSet_3 on completion of 1 and 2")
.withExpression(spec.toString())
.withCallbacks(Lists.fixedSize.of(callback))
.build();
RuleDefinition is a REST resource. A POST Operation to /v1/dependency/definition/rule
will create this.
Publish events to DMC
A Event has following key information.
- name e.g., DATASET_1_COMPLETED
- qualifier ( a.k.a headers) , a set of key values
- data, a set of key-values but never understood by the system. Will be simply passed back in notifications. Typically used to pass some context information
Note that, a event could match n
no. of rules based on the name. Evaluation will happen using headers, group by and other criteria's specified.
It looks like below
Event.newBuilder()
.withName("DATASET_1_COMPLETED")
.withOwner("System1")
.withOccuredAt(new Date())
.withQualifier(UnifiedMap.newWithKeysValues("assemblyId","uuid","customerId",25))
.build();
A POST operation to /v1/dependency/instance/event
will create this.
Receive notifications
When the dependency condition specified in a rule mets, the corresponding callbacks will be notified with following payload structure
{
"rule" : { ... }, //rule_definition
"events" : [ ], //all events matched against this rule
"qualifier" : { } // key-values of response headers. specified as the response metadata in the rule.
}