The problem with Apache Kafka (and how SpecMesh addresses it) - specmesh/docs GitHub Wiki

The problem

Apache Kafka is not opinionated, which means you can:

  • Create topics with any name, using any structure or capitalization (camel case, kebab case, lower/upper case). This lack of constraint often results in flat structures of topic names that have unclear ownership.
  • Experience confusing or non-existent governance; associating and managing ACLs is complicated by the fact that consumer IDs, topic names, and ACLs are all managed separately.
  • Encounter inconsistent tooling in many areas. Some apps use the Java admin client, while DevOps tools may include Ansible, Terraform, or JulieOps (GitOps) for provisioning resources.

Adopting Kafka is hard; adopting Kafka at scale is even harder. It's difficult to make the right decisions early on, and often things become hidden, lost, or misunderstood.

The solution

SpecMesh provides an opinionated way of designing and modeling your Kafka resources together. It adds an opinionated layer over Apache Kafka resources, modeling and conflating resources such as topics, schemas, and ACLs using AsyncAPI. Although AsyncAPI already supports Kafka for modeling resources like topics, SpecMesh extends this model to include:

  • GitOps semantics: Provisioning of topics, schemas, and ACLs (with git maintaining the current and next state, as well as API spec history)
  • Hierarchical structural modeling of topic names: i.e., a.b.c
  • DDD-Aggregate Identification introduction to enforce resource ownership over all resources
  • Conflation of topic governance using well-understood concepts by demarcating them as public, private, or protected
  • Self-governance enablement for protected topics by using the grant:a.b.c. tag metadata
  • Inclusion of a rich metadata modeling overlay to document attributes, making them suitable for governance, data cataloging, or building extended functionality

A sample specification

asyncapi: '2.5.0'
id: 'urn:acme.simple_range.life_enhancer'
info:
  title: ACME Life Enhancer
  version: '1.0.0'
  description: |
    ACMEs Life enhancer records and predicts how ones life will change due to many events that are experienced - see http://acme.org/life_range for more info
  license:
    name: Apache 2.0
    url: 'https://www.apache.org/licenses/LICENSE-2.0'
servers:
  test:
    url: test.mykafkacluster.org:8092
    protocol: kafka-secure
    description: Test broker

channels:
  _public.user_signed_up:
    bindings:
      kafka:
        envs:
          - staging
          - prod
        partitions: 3
        replicas: 1
        configs:
          cleanup.policy: delete
          retention.ms: 999000

    publish:
      summary: Inform about signup
      operationId: onSignup
      message:
        bindings:
          kafka:
            schemaIdLocation: "payload"
        schemaFormat: "application/vnd.apache.avro+json;version=1.9.0"
        contentType: "application/octet-stream"
        payload:
          $ref: "/schema/acme.simple_range.life_enhancer._public.user_signed_up.avsc"

  _private.user_checkout:
    bindings:
      kafka:
        envs:
          - staging
          - prod
        partitions: 3
        replicas: 1
        configs:
          cleanup.policy: delete
          retention.ms: 999000

    publish:
      summary: User purchase confirmation
      operationId: onUserCheckout
      message:
        bindings:
          kafka:
            schemaIdLocation: "payload"
        schemaFormat: "application/json;version=1.9.0"
        contentType: "application/json"
        payload:
          $ref: "/schema/acme.simple_range.life_enhancer._private.user_checkout.yml"

  acme.simple_range.transport._public.tube:
    subscribe:
      operationId: onUserArriving
      summary: Humans arriving in the borough
      bindings:
        kafka:
          groupId: 'acme.simple_range.life_enhancer.new_users'
      message:
        schemaFormat: "application/vnd.apache.avro+json;version=1.9.0"
        contentType: "application/octet-stream"
        bindings:
          kafka:
            schemaIdLocation: "header"
            key:
              type: string
        payload:
          $ref: "acme.simple_range.transport._public.tube.passenger.avsc"

Noteworthy features from the above spec

  • hierarchy: The 'app-name, aggregate' is called 'simple:spec_demo' - it prefixes topics owned by this app. ACLs will also be provisioned to prevent other principles for accessing 'private' data
  • absolute and relative paths: topics owned by this app (channels section) start with 'private, public or protected' - the convention is used to combine topic naming structure with permissions like those of a file system.
  • Absolute paths indicate consumption from other apps. i.e. /london/hammersmith/transport/