Kafka to Kafka Gateway to SMQ to SQL - seaweedfs/seaweedfs GitHub Wiki
# Kafka client → Kafka gateway → SMQ → SQL
Bring your existing Kafka clients. Point them at the SeaweedFS Kafka gateway. Messages flow into Seaweed Message Queue (SMQ) for streaming, while SeaweedFS persists them into Parquet for SQL analytics.
See the end-to-end picture: Structured Data Lake with SMQ and SQL.
## Why use the Kafka gateway
- Keep your Kafka tooling and clients
- Scale stateless brokers and storage independently
- Get streaming + Parquet-based analytics without changing producers
## Architecture

    Kafka Clients <=> SeaweedFS Kafka Gateway <=> SMQ Brokers => Subscribers
                                                       \
                                                        +--> SeaweedFS (Parquet) => SQL Engines
The gateway speaks the Kafka protocol to clients and maps topics/partitions, offsets, and consumer groups to SMQ semantics.
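To make that mapping concrete, here is a purely illustrative sketch of the bookkeeping such a gateway needs: namespacing Kafka topics into SMQ, and tracking committed offsets per consumer group. The names (`KafkaCoord`, `to_smq_topic`, `OffsetLedger`) and the `kafka.` namespace scheme are hypothetical, not SeaweedFS's actual implementation.

```python
from dataclasses import dataclass

# Illustrative sketch only -- names and namespacing are assumptions,
# not SeaweedFS's actual gateway code.

@dataclass(frozen=True)
class KafkaCoord:
    """A Kafka client's view of a record's position."""
    topic: str
    partition: int
    offset: int

def to_smq_topic(kafka_topic: str, namespace: str = "kafka") -> str:
    # One plausible scheme: keep gateway topics in their own SMQ
    # namespace so they cannot collide with native SMQ topics.
    return f"{namespace}.{kafka_topic}"

class OffsetLedger:
    """Committed offsets per (group, topic, partition) -- the state a
    gateway must keep to honor Kafka's offset commit/fetch APIs."""
    def __init__(self) -> None:
        self._committed: dict[tuple[str, str, int], int] = {}

    def commit(self, group: str, coord: KafkaCoord) -> None:
        self._committed[(group, coord.topic, coord.partition)] = coord.offset

    def fetch(self, group: str, topic: str, partition: int) -> int:
        # -1 stands in for "no commit yet"; a real consumer would then
        # fall back to its auto.offset.reset policy.
        return self._committed.get((group, topic, partition), -1)

ledger = OffsetLedger()
ledger.commit("analytics", KafkaCoord("events", 0, 41))
print(to_smq_topic("events"))                  # kafka.events
print(ledger.fetch("analytics", "events", 0))  # 41
print(ledger.fetch("analytics", "events", 1))  # -1
```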
## What stays the same
- Kafka client libraries and tooling (producers/consumers)
- Topic/partition concepts
- Consumer groups and offsets
## What you gain
- Durable Parquet storage for batch analytics
- One pipeline for both streaming and SQL
- Simple, scalable operations (stateless brokers, disaggregated storage)
## Getting started
- Start SMQ and the Kafka gateway (example ports):

      weed mq.broker -port=17777 -master=localhost:9333
      weed mq.agent -port=16777 -broker=localhost:17777
      weed mq.kafka -port=19092 -broker=localhost:17777
- Point your Kafka producer/consumer at `localhost:19092`.
- Query the resulting Parquet data with your SQL engine of choice (Trino, Spark, DuckDB, etc.).
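As one way to exercise the gateway, the sketch below uses the third-party kafka-python client pointed at the example gateway port; the topic name `clicks` and the JSON payload shape are made up for illustration, and the send/consume calls of course require a gateway actually running at that address.

```python
import json

# Gateway address from the example ports above; adjust to your deployment.
GATEWAY_BOOTSTRAP = "localhost:19092"

def encode_event(event: dict) -> bytes:
    # Plain JSON values keep the downstream Parquet/SQL side simple.
    return json.dumps(event, sort_keys=True).encode("utf-8")

def produce_events(topic: str = "clicks") -> None:
    # Requires kafka-python (`pip install kafka-python`) and a running
    # gateway; imported lazily so the pure helpers above work without it.
    from kafka import KafkaProducer
    producer = KafkaProducer(bootstrap_servers=GATEWAY_BOOTSTRAP)
    producer.send(topic, value=encode_event({"user": "u1", "page": "/home"}))
    producer.flush()

def consume_events(topic: str = "clicks") -> None:
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=GATEWAY_BOOTSTRAP,
        group_id="demo-group",
        auto_offset_reset="earliest",
    )
    for record in consumer:
        print(record.partition, record.offset, json.loads(record.value))

# With the gateway up, call produce_events() and then consume_events().
```

Because the gateway speaks the Kafka wire protocol, nothing here is gateway-specific: the same client code works against a plain Kafka broker by changing only the bootstrap address.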
## Next steps
- Central concepts: Structured Data Lake with SMQ and SQL
- SMQ overview: Seaweed Message Queue