Apache Kafka - sgml/signature GitHub Wiki

Examples

Data Contracts

Schema Registry

CLI

Deduplication

Bindings

Glossary

Flink

References

JSONPointer Integration
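The dependencies and job below wire Jackson's JsonPointer into a Flink streaming job to check, per record, whether a given pointer path exists in the JSON document.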

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-java</artifactId>
    <version>1.15.0</version>
</dependency>
<!-- As of Flink 1.15 the Java streaming artifact no longer carries a Scala suffix -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java</artifactId>
    <version>1.15.0</version>
</dependency>
<!-- jackson-databind provides ObjectMapper/JsonNode and pulls in jackson-core, which provides JsonPointer -->
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.13.0</version>
</dependency>
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import com.fasterxml.jackson.core.JsonPointer;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonPathFinder {
    public static void main(String[] args) throws Exception {
        // Set up the streaming execution environment
        var env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Create a sample JSON input stream
        var jsonInputStream = env.fromElements(
                """
                {"foo": "bar"}
                """,
                """
                {"baz": "qux"}
                """
        );

        // Check each record for the JSON Pointer path "/foo"
        var results = jsonInputStream.flatMap((FlatMapFunction<String, Tuple2<String, Boolean>>) (value, out) -> {
            var mapper = new ObjectMapper();
            var rootNode = mapper.readTree(value);
            var pointer = JsonPointer.compile("/foo");

            // at() returns a "missing node" placeholder when the path does not resolve
            var pathExists = !rootNode.at(pointer).isMissingNode();
            out.collect(new Tuple2<>(value, pathExists));
        }).returns(Types.TUPLE(Types.STRING, Types.BOOLEAN)); // explicit type hint: the lambda's generics are erased at runtime

        // Print the results
        results.print();

        // Execute the Flink job
        env.execute("Json Path Finder");
    }
}
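Each input document is emitted as a (document, pathExists) tuple, so with the sample input the first element reports true and the second false. Creating an ObjectMapper per record is fine for a sketch; in a long-running job it would typically be created once per operator instance, for example in the open() method of a RichFlatMapFunction.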

Migration

Kafka/Flink for Spring MVC/WebFlux Developers

Similar Methods in Apache Kafka

  • Flux.fromIterable() - Similar to Kafka's KafkaConsumer.poll(), which retrieves a batch of records from the subscribed topics (see the consumer sketch after this list).
  • Flux.subscribe() - Similar to Kafka's KafkaConsumer.subscribe(), which subscribes the consumer to one or more topics.
  • Flux.map() - Similar to KStream.map() in the Kafka Streams API, which transforms each record in a stream.
  • Flux.flatMap() - Similar to KStream.flatMap() in the Kafka Streams API, which transforms each record into zero or more records.
  • Flux.delayElements() - Kafka has no direct per-element delay; the closest analogs are the producer's linger.ms batching delay and consumer-side pause()/resume() flow control.
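A minimal consumer sketch of the poll()/subscribe() analogs, assuming the kafka-clients library is on the classpath; the broker address localhost:9092, the topic events, and the group id demo-group are hypothetical placeholders:

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PollExample {
    public static void main(String[] args) {
        var props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("group.id", "demo-group");              // hypothetical consumer group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (var consumer = new KafkaConsumer<String, String>(props)) {
            consumer.subscribe(List.of("events"));        // analog of Flux.subscribe()
            while (true) {
                // analog of Flux.fromIterable(): poll() returns an iterable batch of records
                var records = consumer.poll(Duration.ofMillis(500));
                records.forEach(r -> System.out.printf("%s -> %s%n", r.key(), r.value()));
            }
        }
    }
}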

Similar Methods in Apache Flink

  • Flux.fromIterable() - Similar to Flink's StreamExecutionEnvironment.fromCollection(), which creates a DataStream from a collection (see the sketch after this list).
  • Flux.subscribe() - Similar to Flink's StreamExecutionEnvironment.addSource(), which attaches a source to the job; nothing is consumed until env.execute() is called.
  • Flux.map() - Similar to Flink's DataStream.map(), which applies a function to each element in the stream.
  • Flux.flatMap() - Similar to Flink's DataStream.flatMap(), which transforms each element into zero or more elements.
  • Flux.delayElements() - Flink has no per-element delay operator; time-based windowing on a keyed stream (window() with a time WindowAssigner) is the closest way to introduce time-based grouping; the older timeWindow() shortcut was removed in Flink 1.14.
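And a minimal sketch of the Flink counterparts (fromCollection, map, flatMap), using the same Flink 1.15 dependencies listed above:

import java.util.List;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FluxAnalogs {
    public static void main(String[] args) throws Exception {
        var env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Flux.fromIterable() analog: build a DataStream from a collection
        var words = env.fromCollection(List.of("kafka", "flink"));

        // Flux.map() analog: exactly one output element per input element
        var upper = words.map(s -> s.toUpperCase()).returns(Types.STRING);

        // Flux.flatMap() analog: zero or more output elements per input element
        var chars = upper.flatMap((FlatMapFunction<String, String>) (word, out) -> {
            for (var c : word.toCharArray()) {
                out.collect(String.valueOf(c));
            }
        }).returns(Types.STRING);

        chars.print();

        // Flux.subscribe() analog: nothing runs until the job is executed
        env.execute("Flux Analogs");
    }
}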

Kafka Protocol

Flink Protocol

Kinesis Protocol

Storm Protocol

Integration

Use Cases

Concepts

Internals

Refactoring

Security

Videos

VS

One of Kafka's core features is the partitioning of data by a partition key: records that share a key are written to the same partition and therefore keep their order, while records with different keys can be processed in parallel.
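For illustration, a minimal producer sketch, assuming the kafka-clients library and a hypothetical topic orders on a local broker; Kafka's default partitioner hashes the key, so both records for customer-42 land on the same partition in send order:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KeyedProducerExample {
    public static void main(String[] args) {
        var props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (var producer = new KafkaProducer<String, String>(props)) {
            // Same key -> same partition -> ordering preserved for that customer;
            // records with other keys can be handled in parallel on other partitions.
            producer.send(new ProducerRecord<>("orders", "customer-42", "order-created"));
            producer.send(new ProducerRecord<>("orders", "customer-42", "order-paid"));
            producer.send(new ProducerRecord<>("orders", "customer-7", "order-created"));
        }
    }
}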

A Kafka cluster consists of brokers that coordinate the writing and reading of data to and from permanent storage. Every message is persisted, and communicating via permanent storage decouples the send and receive operations from each other.

The key benefits of Kafka are its scalability, its ordering guarantees, its wide adoption, and the wealth of commercial service offerings built around it.

All messages are brokered, which means a message can still be delivered even if the recipient was disconnected for a moment.

Communication is also decoupled in time, so the sender no longer gets direct, immediate feedback from the recipient of a message.
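The same decoupling seen from the consumer side, a sketch with hypothetical names (group billing, topic orders): because the broker persists every record, a consumer that was offline can reconnect and read whatever was produced in the meantime, without the sender ever being aware of it.

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class CatchUpConsumer {
    public static void main(String[] args) {
        var props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("group.id", "billing");                 // hypothetical consumer group
        props.put("auto.offset.reset", "earliest");       // new group: start from the oldest retained record
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (var consumer = new KafkaConsumer<String, String>(props)) {
            consumer.subscribe(List.of("orders"));
            // Records produced while this consumer was offline are still on the broker
            // and are delivered here; there is no direct sender-to-receiver handshake.
            var records = consumer.poll(Duration.ofSeconds(1));
            records.forEach(r -> System.out.printf("%s -> %s%n", r.key(), r.value()));
        }
    }
}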

References

Kafka Protocol

Kafka Security Vulnerabilities

Kafka Presentations

Cluster Linking Failover

Confluent Client Reference

Data Contracts

Schema Management

Stream Governance

Consumer Configs

Confluent Security Advisories
