Apache Kafka
- https://github.com/scholzj/kafka-test-apps/blob/main/kafka-producer.yaml
- https://www.stardog.com/labs/blog/stream-reasoning-with-stardog/
- https://towardsdatascience.com/kafka-python-explained-in-10-lines-of-code-800e3e07dad1
- https://github.com/confluentinc/librdkafka/tree/master/examples
- https://medium.com/@ali.mrd318/simplifying-kafka-testing-in-python-a-mockafka-py-tutorial-3a0dbbfe9866
- https://docs.confluent.io/platform/current/schema-registry/fundamentals/data-contracts.html
- https://www.confluent.io/blog/error-handling-patterns-in-kafka/
- https://docs.confluent.io/platform/current/schema-registry/connect.html
- https://developer.confluent.io/courses/schema-registry/evolve-schemas-hands-on/
- https://developer.confluent.io/learn-more/kafka-on-the-go/schemas/
- https://developer.confluent.io/courses/schema-registry/key-concepts/
- https://developer.confluent.io/courses/schema-registry/schema-subjects/
- https://docs.confluent.io/platform/current/schema-registry/index.html
- https://docs.confluent.io/platform/current/schema-registry/fundamentals/index.html
- https://docs.confluent.io/platform/current/schema-registry/fundamentals/schema-evolution.html
- https://docs.confluent.io/platform/current/schema-registry/develop/api.html
- https://docs.confluent.io/platform/current/schema-registry/installation/migrate.html
- https://docs.confluent.io/platform/current/schema-registry/fundamentals/serdes-develop/index.html
- https://docs.confluent.io/operator/current/co-manage-schemas.html
- https://docs.confluent.io/operator/2.2/co-manage-schemas.html
- https://www.confluent.io/blog/best-practices-for-confluent-schema-registry/
- https://docs.confluent.io/platform/current/schema-registry/develop/using.html
- https://www.confluent.io/blog/how-schema-registry-clients-work/
- https://developer.confluent.io/patterns/event/schema-on-read/
- https://www.confluent.io/blog/schema-registry-for-beginners/
- https://www.confluent.io/blog/using-apache-kafka-command-line-tools-confluent-cloud/
- https://greenplum.docs.pivotal.io/streaming-server/1-3-6/kafka/load-from-kafka-example.html
- https://play.vidyard.com/e869cfd0-76d8-4859-a90f-2471c52a7e22
- https://www.slideshare.net/slideshow/stream-data-deduplication-powered-by-kafka-streams-philipp-schirmer-bakdata/249203406
- https://docs.confluent.io/cloud/current/flink/reference/functions/datetime-functions.html
- https://docs.confluent.io/cloud/current/flink/reference/timezone.html
- https://cwiki.apache.org/confluence/display/Flink/FLIP-188%3A+Introduce+Built-in+Dynamic+Table+Storage#FLIP188:IntroduceBuiltinDynamicTableStorage-Retention
- https://www.alibabacloud.com/blog/introduction-to-unified-batch-and-stream-processing-of-apache-flink_601407
Maven dependencies for the Flink example below. Note that Flink 1.15 dropped the Scala suffix from the Java streaming artifact, and jackson-databind is needed for the ObjectMapper used in the example:

```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-java</artifactId>
    <version>1.15.0</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java</artifactId>
    <version>1.15.0</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-core</artifactId>
    <version>2.13.0</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.13.0</version>
</dependency>
<dependency>
    <groupId>com.github.fge</groupId>
    <artifactId>jackson-coreutils</artifactId>
    <version>1.9</version>
</dependency>
```
A Flink job that checks each JSON record for the JSON Pointer path `/foo`:

```java
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import com.fasterxml.jackson.core.JsonPointer;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonPathFinder {
    public static void main(String[] args) throws Exception {
        // Set up the streaming execution environment
        var env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Create a sample JSON input stream
        var jsonInputStream = env.fromElements(
                """
                {"foo": "bar"}
                """,
                """
                {"baz": "qux"}
                """
        );

        // For each record, test whether the JSON Pointer "/foo" resolves.
        // The explicit returns(...) hint is required because type erasure
        // hides the Tuple2 generics from Flink when a lambda is used.
        var results = jsonInputStream
                .flatMap((FlatMapFunction<String, Tuple2<String, Boolean>>) (value, out) -> {
                    var mapper = new ObjectMapper();
                    var rootNode = mapper.readTree(value);
                    var pointer = JsonPointer.compile("/foo");
                    var pathExists = !rootNode.at(pointer).isMissingNode();
                    out.collect(new Tuple2<>(value, pathExists));
                })
                .returns(Types.TUPLE(Types.STRING, Types.BOOLEAN));

        // Print each (record, pathExists) pair
        results.print();

        // Execute the Flink job
        env.execute("Json Path Finder");
    }
}
```
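Running the job prints each input record paired with a boolean flag indicating whether the `/foo` pointer resolved, so the first sample element yields `true` and the second `false`.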
- Flux.fromIterable() - Roughly analogous to Kafka's KafkaConsumer.poll(), which pulls a batch of records from a topic into the application.
- Flux.subscribe() - Loosely analogous to KafkaConsumer.subscribe(), with an important difference: KafkaConsumer.subscribe() only registers the topics to poll, while subscribing to a Flux is what actually starts the flow of data.
- Flux.map() - Analogous to KStream.map() in Kafka Streams, which transforms each record one-to-one.
- Flux.flatMap() - Analogous to KStream.flatMap() in Kafka Streams, which turns each record into zero or more records.
- Flux.delayElements() - Has no direct Kafka counterpart; the closest producer-side analogue is the batching delay a KafkaProducer applies via linger.ms.
- Flux.fromIterable() - Analogous to Flink's StreamExecutionEnvironment.fromCollection(), which creates a DataStream from an in-memory collection.
- Flux.subscribe() - Loosely analogous to StreamExecutionEnvironment.execute(): a Flux does nothing until it is subscribed, just as a Flink job does nothing until execute() is called (sources themselves are attached with env.addSource() or env.fromSource()).
- Flux.map() - Analogous to Flink's DataStream.map(), which applies a function to each element in the stream.
- Flux.flatMap() - Analogous to Flink's DataStream.flatMap(), which transforms each element into zero or more elements.
- Flux.delayElements() - Loosely analogous to time-based windowing on a DataStream (window(...) with a time-based assigner; the old timeWindow() shortcut has been removed in newer Flink versions), though windowing groups elements by time rather than spacing them out. A short Reactor sketch tying these operators together follows this list.
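To make the comparison concrete, here is a minimal Reactor sketch, assuming reactor-core is on the classpath; the data and pipeline are purely illustrative:

```java
import java.time.Duration;
import java.util.List;
import reactor.core.publisher.Flux;

public class FluxDemo {
    public static void main(String[] args) throws InterruptedException {
        Flux.fromIterable(List.of("{\"foo\": \"bar\"}", "{\"baz\": \"qux\"}")) // like poll()/fromCollection()
            .map(String::trim)                                                 // one-to-one transform, like map()
            .flatMap(json -> Flux.just(json, json.length() + " chars"))       // one-to-many, like flatMap()
            .delayElements(Duration.ofMillis(100))                             // pace the stream over time
            .subscribe(System.out::println);                                   // nothing flows until subscribe()
        Thread.sleep(1000); // delayElements emits on a timer thread; keep the JVM alive briefly
    }
}
```

Note that, as with a Flink job before execute(), the pipeline above is inert until subscribe() is called.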
- https://www.confluent.io/blog/how-to-share-kafka-connectors-on-confluent-hub/
- https://docs.confluent.io/kafka-connectors/github/current/configuration_options.html
- https://docs.confluent.io/kafka-connectors/aws-lambda/current/lambda_sink_connector_config.html
- https://medium.com/geekculture/heroku-integration-capabilities-the-mini-guide-b8ce745faad1
- https://www.confluent.io/hub/castorm/kafka-connect-http
- https://docs.confluent.io/kafka-connect-aws-cloudwatch-logs/current/overview.html
- https://docs.confluent.io/kafka-connect-sftp/current/source-connector/csv_source_connector.html
- https://rmoff.net/2021/01/11/running-a-self-managed-kafka-connect-worker-for-confluent-cloud/
- https://developer.salesforce.com/blogs/2016/05/streaming-salesforce-events-heroku-kafka
- https://dzone.com/articles/kafka-for-xml-message-integration-and-processing
- https://mozilla-version-control-tools.readthedocs.io/en/latest/hgmo/replication.html
- http://www.liferaysavvy.com/2021/07/liferay-tomcat-access-logs-to-kafka.html
- https://www.oreilly.com/library/view/mastering-kafka-streams/9781492062486/ch01.html
- https://www.confluent.io/kafka-summit-sf18/kafka-as-an-eventing-system-to-replatform-a-monolith-into-microservices/
- https://towardsdatascience.com/getting-started-with-apache-kafka-in-python-604b3250aa05
- https://blog.bosch-si.com/developer/eclipse-hono-supporting-apache-kafka-for-messaging/
- https://github.com/eclipse/hono/issues/8
- https://www.confluent.io/de-de/blog/enabling-exactly-once-kafka-streams/
- https://dev.to/heroku/what-is-a-commit-log-and-why-should-you-care-pib
- https://preparingforcodinginterview.wordpress.com/2019/10/04/kafka-3-why-is-kafka-so-fast/
- https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
- https://www.oreilly.com/library/view/streaming-architecture/9781491953914/ch04.html
- https://docs.datastax.com/en/kafka/doc/kafka/kafkaHowMessages.html
- https://kafka.apache.org/cve-list
- https://jaceklaskowski.gitbooks.io/apache-kafka/content/kafka-tools-DumpLogSegments.html
- https://logging.apache.org/log4j/2.x/log4j-users-guide.pdf
- http://events17.linuxfoundation.org/sites/events/files/slides/developing.realtime.data_.pipelines.with_.apache.kafka_.pdf
- https://www.moengage.com/blog/kafka-at-moengage/
- https://www.confluent.io/es-es/blog/kafka-without-zookeeper-a-sneak-peek/
- https://www.confluent.io/blog/apache-flink-apache-kafka-streams-comparison-guideline-users/
- https://stackoverflow.com/questions/60625612/how-does-one-use-kafka-with-openid-connect
- https://developer.ibm.com/tutorials/kafka-authn-authz/
One of Kafka's core features is the partitioning of data by a partition key: records that share a key land in the same partition and keep their order, while records in different partitions can be processed in parallel (see the producer sketch below).
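A minimal producer sketch illustrating keyed partitioning, assuming the kafka-clients library, a broker at localhost:9092, and a hypothetical topic named orders:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KeyedProducerDemo {
    public static void main(String[] args) {
        var props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (var producer = new KafkaProducer<String, String>(props)) {
            // Records with the same key hash to the same partition, so
            // per-customer ordering is preserved; records with different
            // keys may land in different partitions and be consumed in parallel.
            producer.send(new ProducerRecord<>("orders", "customer-42", "order created"));
            producer.send(new ProducerRecord<>("orders", "customer-42", "order shipped"));
            producer.send(new ProducerRecord<>("orders", "customer-7", "order created"));
        }
    }
}
```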
A Kafka cluster consists of brokers that coordinate the writing (and reading) of data to permanent storage. Every message is persisted, and communicating via permanent storage decouples the send and receive operations from each other.
The key benefits of Kafka are its scalability, its ordering guarantees, its wide-scale adoption, and the wealth of commercial service offerings around it.
All messages are brokered, which means a message can still be delivered even if the recipient was disconnected for a moment.
Communication is also decoupled in time, so the sender no longer gets direct, immediate feedback from the recipient; the consumer sketch below shows messages being picked up after the fact.
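A minimal consumer sketch of that time decoupling, under the same assumptions as the producer example above; with auto.offset.reset=earliest, a consumer that connects only after the messages were produced still receives them from the broker's log:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReplayConsumerDemo {
    public static void main(String[] args) {
        var props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("group.id", "replay-demo");             // hypothetical consumer group
        props.put("auto.offset.reset", "earliest");       // read messages produced before we connected
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (var consumer = new KafkaConsumer<String, String>(props)) {
            consumer.subscribe(List.of("orders")); // same hypothetical topic as above
            // The messages were written to the broker's log while this
            // consumer was offline; polling now replays them from storage.
            var records = consumer.poll(Duration.ofSeconds(5));
            records.forEach(r -> System.out.printf("%s -> %s%n", r.key(), r.value()));
        }
    }
}
```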