Project Write up and Analysis - federatedbookkeeping/timesheets GitHub Wiki

Abstract

As the project progressed, a number of insights emerged. This page summarises what we set out to do, what we learned, and what we propose as next steps.

Project Objectives

Our primary hypothesis was that it was possible for multiple Open Source time-tracking tools to act in concert as part of a decentralised federation of systems, "connected, but sovereign". As Michiel de Jong, originator of the concept of Federated Bookkeeping, put it, "if you enter timesheet data in one system, it shows up in the others".

Learnings - Conceptual

In the early stages of the project, it was unclear what the best approach to federation between sovereign systems should be, or what criteria would indicate that it had been achieved. We did not even have a clear common notion of federation. This section captures the shared understanding we eventually reached and documents the journey towards it.

Characteristics of Federated Systems

Our collaboration distilled the following characteristics of a federation of systems:

Each system is sovereign - i.e. the integrity of the data originating in it is sacrosanct, and those data are not subject to change by other systems;
Those data are propagated across the federation, subject to privacy and confidentiality constraints;
To accomplish data sharing in a decentralised network, each system may be directly connected with others in the federation, or act as a proxy for them, to minimise the number of point-to-point connections;
Each federated system may be integrated with one or more non-federated systems, from which it may receive data, which will then propagate across the federation to other member systems;
The federation intentionally tolerates divergent views of the truth for business reasons, in keeping with the principle of sovereignty, enabling such divergence to be captured as part of the data describing the federation, rather than out-of-band -- contrasting with integrated systems, which are expected to converge to a consistent shared state; and
The member systems have suitable technical capabilities to implement the security controls for data confidentiality and integrity.

Measuring success

The project initially identified and analysed a substantial number of FOSS time-tracking tools (155 in all) and assessed whether and how they could export and/or import timesheet data. Integrations were built between 19 of these and two of the federated systems (16 for PreJournal, maintained by Ponder Source, and 3 for Tiki, maintained by EvoluData). As the project progressed, it became apparent that we needed clear ways of showing whether and how federation had been accomplished, and that we could have taken a better approach to it earlier on.

Understanding Metadata

For data to flow between those systems, we recognised that a common understanding of the data they needed to exchange was required as well. Some important thinking contributing to this may be found in Michiel's "Let your data wear CLOGS" blog post. Our efforts initially focussed on the low-level structure (such as CSV or JSON), proposing a "union" data format to which all participating federated systems would conform. This seemed the right approach at the time, since time-tracking is ostensibly a simple domain, with relatively few concepts and associated data fields. We then began to map metadata between systems, and invested considerable investigative effort into identifying the most appropriate single public ontology to describe all the relevant concepts and all metadata, settling on Valueflows for this. Flaws arose with this approach, however.

When m-ld sought to map each of its timeld concepts to Valueflows, we found that ontology's concept set insufficient to support a domain even as simple as timesheet data. Furthermore, we found less concept commonality between time-tracking systems than expected - for example, EvoluData's Tiki tool has two fields, 'Task' and 'Description', which only partially match the equivalent 'activity' concept in timeld. A greater challenge was that there are differences in the syntactic approach between federated systems - for example, timeld uses RDF for its knowledge representation (with URIs for every entity it stores), as does Tiki (as identifiers for timesheet entries). This is a syntactic, rather than a semantic use, though.

It also became apparent that this approach would not scale to larger and more complex domains such as Bookkeeping, with many more concepts. Hence, it was adapted to identify additional public ontologies with concepts having the best match to those in the tools' data model, beginning with timeld's.

The learning from this: when documenting a system's data model and API, be very clear about the semantics of individual fields to ease the mapping task between federated systems - the more information available about their meaning and definition, the easier it is to determine which properties to map other systems' metadata to.

Follow-on hypothesis: there is net benefit in tackling the integration aspect of federating sovereign systems by identifying the common semantic concepts between them, using the approach of Linked Data to anchor the associated metadata in each system to public ontologies - the additional up-front effort is repaid by reduced longer-term semantic drift between federated systems.
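To illustrate the follow-on hypothesis, here is a minimal sketch of how anchoring each system's fields to shared concept URIs could let a field mapping be derived mechanically. The namespace `https://example.org/ontology#`, the two anchor tables, and the field names are all hypothetical placeholders, loosely inspired by the 'Task'/'Description' versus 'activity' example above; they are not the actual schemas of any tool in the project.

```python
# Placeholder namespace (an assumption); a real federation would use public ontologies.
EX = "https://example.org/ontology#"

# Each system declares the shared concept that each local field is anchored to.
# Both tables are illustrative assumptions, not real data models.
SYSTEM_A_ANCHORS = {
    "activity": EX + "Activity",
    "duration": EX + "Duration",
}
SYSTEM_B_ANCHORS = {
    "Task": EX + "Activity",
    "Description": EX + "Note",
    "hours": EX + "Duration",
}


def field_mapping(source_anchors, target_anchors):
    """Map each source field to the target field anchored to the same concept."""
    by_concept = {concept: name for name, concept in target_anchors.items()}
    return {
        name: by_concept[concept]
        for name, concept in source_anchors.items()
        if concept in by_concept
    }


def unmapped_fields(source_anchors, target_anchors):
    """List source fields whose concept has no counterpart in the target."""
    target_concepts = set(target_anchors.values())
    return sorted(
        name for name, concept in source_anchors.items()
        if concept not in target_concepts
    )


mapping = field_mapping(SYSTEM_B_ANCHORS, SYSTEM_A_ANCHORS)
leftover = unmapped_fields(SYSTEM_B_ANCHORS, SYSTEM_A_ANCHORS)
```

Fields that end up in `leftover` are the ones needing human attention, which is where clear documentation of field semantics would pay off.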

Impacts of data retransmission by intermediate systems

The fact that each receiving system is expected to retransmit data received from sending systems has two beneficial consequences:

It reduces the number of point-to-point interactions required between systems, by the sending system treating each receiving system it posts to as a proxy for the other systems in the federation, which makes that federation more scalable; and
In the event of defects in sending systems, the data in the overall federation may be more correct, by virtue of there being multiple intermediate sources having furnished them.

There are some accompanying drawbacks, however:

Potential for conflict - because the federated systems communicate using a 'gossip' approach (i.e. they forward new or changed data to all other systems they are connected to), it is possible that any given system will receive data relating to the same timesheets, projects, and resources from more than one other system. If these data conflict (such as when changes made to the same timesheet are received out of sequence), a means of resolution is needed, preferably described as an ancillary part of the data being shared, rather than out of band.
Privacy - this initial approach to federation assumes that all data are shared with all participating systems, with little in the way of access control stipulations. In practice, this is not desirable, since any given individual will likely prefer their timesheet data to be visible only to those parties whom they trust. This requires access control, for both the time-tracking tools in which data originate, and those in the federation to which they propagate. Furthermore, those controls must be consistent across those federated systems. Even if the access controls themselves were included as metadata shared between federated systems, the originating system would need to trust all receiving systems in the federation to enforce them.
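The retransmission behaviour and the resulting duplicate deliveries can be sketched as follows. This is a toy in-memory model, not the project's implementation: the `Node` class, the integer `version` field, and the record identifiers are assumptions made purely for illustration.

```python
class Node:
    """Toy federated system that forwards new data to every connected peer."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.store = {}      # record id -> (version, payload)
        self.deliveries = 0  # messages received, including duplicates

    def connect(self, other):
        """Create a two-way link between this node and another."""
        self.peers.append(other)
        other.peers.append(self)

    def receive(self, record_id, version, payload, sender=None):
        self.deliveries += 1
        known = self.store.get(record_id)
        if known is not None and known[0] >= version:
            return  # already holds this version or a newer one: stop forwarding
        self.store[record_id] = (version, payload)
        for peer in self.peers:
            if peer is not sender:
                peer.receive(record_id, version, payload, sender=self)


# Chain a - b - c: b acts as a proxy, so a never talks to c directly.
a, b, c = Node("a"), Node("b"), Node("c")
a.connect(b)
b.connect(c)
a.receive("ts-1", 1, {"hours": 2})

# Triangle x - y - z - x: the same record reaches z by two routes.
x, y, z = Node("x"), Node("y"), Node("z")
x.connect(y)
y.connect(z)
x.connect(z)
x.receive("ts-1", 1, {"hours": 2})
```

In the chain, `c` obtains the record only through `b`. In the triangle, `z` is delivered the same record twice, which is harmless here because the versions match but becomes a conflict once the two routes carry different changes.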

Possible solutions to the first drawback (potential for conflict) include:

Adopting a default 'last write wins' policy - this assumes conflicting changes to the same records are always received in the correct order;
Attaching an event timestamp to each change to a record, to give the receiving system an indication of which was the most recent - this assumes a sufficiently accurate shared clock, whose skew is always smaller than the interval between conflicting changes;
Capturing the divergence formally in metadata visible to the federation as a whole; or
Preventing conflict altogether, by assigning each external (i.e. non-federated) time-tracking tool a 'home system' in the federation with which it is integrated, and through which all data changes flow - this enables receiving systems in the federation to avoid conflicting changes by trusting the data from only one other system. In consequence, however, it becomes necessary to implement a means of binding individual users, projects and/or organisations to a specific source time-tracking tool (or the system to which it is federated), which detracts from federation's aim of flexibility in loose coupling. This has further implications for access control, both within and between federated systems.
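As a rough sketch of how the timestamp and divergence-capture options might combine, the function below accepts the newer change when the timestamps are far enough apart to be trusted, and otherwise records the divergence alongside the data instead of silently picking a winner. The `Change` and `Record` types, the `max_skew` parameter, and its default of one second are assumptions for illustration, not part of any of the federated systems.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    record_id: str
    value: str
    timestamp: float  # event time at the originating system, in seconds
    origin: str       # name of the system where the change was made


@dataclass
class Record:
    current: Change
    divergences: list = field(default_factory=list)


def apply_change(record, incoming, max_skew=1.0):
    """Apply an incoming change, or capture it as a divergence if unorderable."""
    current = record.current
    if incoming == current:
        return record  # duplicate delivery of the same change: nothing to do
    if abs(incoming.timestamp - current.timestamp) <= max_skew:
        # Too close to order reliably given clock skew: keep both on record.
        record.divergences.append((current, incoming))
    elif incoming.timestamp > current.timestamp:
        record.current = incoming  # clearly newer: accept it
    # Otherwise the incoming change is clearly older and is ignored.
    return record


record = Record(Change("ts-1", "2h", 100.0, "A"))
apply_change(record, Change("ts-1", "3h", 110.0, "B"))  # newer: accepted
apply_change(record, Change("ts-1", "4h", 105.0, "C"))  # older: ignored
apply_change(record, Change("ts-1", "5h", 110.5, "A"))  # within skew: divergence
```

Keeping the divergence list on the record itself reflects the preference stated above for resolution to travel with the shared data, not out of band.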

A follow-on project could explore these two challenges and their possible solutions in greater depth.

Scaling Federation

This project connected just three systems - timeld, PreJournal and Tiki - in a federation. Limiting it to a small number was an appropriate choice, since it freed us to focus on the fundamental questions of federation, such as what it actually represents in contrast to conventional system integration, the immediate challenges arising, and the foreseeable consequences, both positive and negative.

While this bounded scope proved helpful in enabling us to focus on these higher-priority considerations, it also limited the extent to which federation could be shown to be scalable. A much larger number of federated systems is needed in order to identify and explore the constraints that such an arrangement might encounter. For instance, although federated systems are intended to retransmit data on behalf of others in the federation, to yield the benefit of reducing the number of point-to-point connections required, this can only be tested in practice with more systems participating. The number of directed connections (counting 'push' and 'pull' separately) required between n systems in a pure mesh is n(n-1) - in a federation of 3, this is just 6, but in one of 12, it is 132, representing disproportionately greater complexity. Thus the prospect of reducing that complexity becomes increasingly appealing at scale.

An idea raised during the project was to separate a larger federation into 'zones' of a smaller number of systems communicating among themselves, with a designated node responsible for communicating with the other zones in the federation on behalf of the other members of its zone, via counterpart designated nodes for those zones - in other words, acting as a proxy not only for non-federated systems with which they are integrated, but also for other federation members. In a federation of n nodes divided into z zones of m nodes in each, this would reduce the number of connections required to z * m(m-1) + z(z-1). In our example of 12 nodes overall, divided into 4 zones of 3 nodes each, this turns out at just 36 - a significant simplification.
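The connection counts quoted in this section can be checked with a few lines of arithmetic. The function names are illustrative only; the formulas are the ones given in the text.

```python
def mesh_connections(n):
    """Connections in a pure mesh of n systems, counting each direction."""
    return n * (n - 1)


def zoned_connections(z, m):
    """Connections for z zones of m systems each.

    Each zone is a full mesh internally (m(m-1) connections), and the
    z designated nodes form a full mesh between zones (z(z-1)).
    """
    return z * m * (m - 1) + z * (z - 1)


small_mesh = mesh_connections(3)    # 3 * 2 = 6
large_mesh = mesh_connections(12)   # 12 * 11 = 132
zoned = zoned_connections(4, 3)     # 4 * 6 + 4 * 3 = 36
```

For 12 systems, zoning therefore cuts the count from 132 to 36, at the cost of routing all inter-zone traffic through the designated nodes.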

Learnings - Governance

As a loose collaboration between ~11 individuals in four separate organisations, the project took longer than expected to make meaningful progress. Although there was reasonably good philosophical alignment among the parties (founded in a desire to improve the experience of exchanging standard commercial documents between organisations), the project suffered from a lack of overall governance. The person charged with coordinating the project left early on and was not formally replaced, so the endeavour lost its direction for a time, which slowed it down. Latterly, the joint realisation that outcomes needed to be demonstrable to qualify for funding payments led to a more proactive stance on the part of the consortium members, which got things moving again, but this could have been achieved a lot earlier with a more joined-up approach.

The learning from this: proposals for collaborative projects by consortia of two or more organisations should explicitly request funding for technical project management resources. This helps keep the project on track, and gives a single party both the responsibility and the incentive to keep it there.