Kafka & Iceberg

Objective:

To write Kafka data in the Apache Iceberg table format, using a Kafka Connect Iceberg sink.

Steps:

  1. Get Kafka Running:
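# Window 1:
# Download and extract Kafka:
rm -rf ~/kafka
mkdir ~/kafka
cd ~/kafka
# If this URL returns 404, older releases are kept at https://archive.apache.org/dist/kafka/3.3.1/
wget https://dlcdn.apache.org/kafka/3.3.1/kafka_2.13-3.3.1.tgz
tar -xzf kafka_2.13-3.3.1.tgz
cd kafka_2.13-3.3.1
pwd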

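# Window 1 (still in ~/kafka/kafka_2.13-3.3.1):
# Start ZooKeeper first; the broker connects to it on startup:
bin/zookeeper-server-start.sh config/zookeeper.properties
# Window 2:
# Start the Kafka broker service from the same directory:
cd ~/kafka/kafka_2.13-3.3.1
bin/kafka-server-start.sh config/server.properties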
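Before moving on, it can help to confirm that the broker actually answers requests. The following is a minimal sketch, not part of the original steps: `broker_ready` and the `KAFKA_HOME` variable are hypothetical names, and the address `localhost:9092` assumes the default listener from `config/server.properties`.

```shell
# Assumption: KAFKA_HOME points at the directory extracted in step 1.
KAFKA_HOME="${KAFKA_HOME:-$HOME/kafka/kafka_2.13-3.3.1}"

# Hypothetical helper: returns 0 when the broker answers a topic-list request,
# non-zero otherwise. The first argument is the bootstrap address.
broker_ready() {
  "$KAFKA_HOME/bin/kafka-topics.sh" \
    --bootstrap-server "${1:-localhost:9092}" \
    --list >/dev/null 2>&1
}
```

In a third window, `broker_ready && echo "broker is up"` should print the message once both ZooKeeper and the broker have finished starting.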
  2. Get kafka-connect-iceberg-sink from getindata, which is based on memiiso/debezium-server-iceberg:
cd ~
rm -rf ~/kafka-connect-iceberg-sink
mkdir ~/kafka-connect-iceberg-sink
cd ~/kafka-connect-iceberg-sink
wget https://github.com/getindata/kafka-connect-iceberg-sink/releases/download/0.2.0/kafka-connect-iceberg-sink-0.2.0-plugin.zip
unzip kafka-connect-iceberg-sink-0.2.0-plugin.zip
rm kafka-connect-iceberg-sink-0.2.0-plugin.zip
cd kafka-connect-iceberg-sink
  • The getindata archive contains the connector jar:
$ ls kafka-connect-iceberg-sink-0.2.0.jar
kafka-connect-iceberg-sink-0.2.0.jar
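  3. Run the Debezium Connect container with Docker. The BOOTSTRAP_SERVERS address and the /Users/cniackz path below are specific to the author's machine; substitute your own host IP and home directory: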
docker run -it \
--name connect \
-p 8083:8083 \
-e GROUP_ID=1 \
-e CONFIG_STORAGE_TOPIC=my-connect-configs \
-e OFFSET_STORAGE_TOPIC=my-connect-offsets \
-e BOOTSTRAP_SERVERS=10.206.196.160:9092 \
-e CONNECT_TOPIC_CREATION_ENABLE=true \
-v ~/.aws/credentials:/kafka/.aws/credentials \
-v /Users/cniackz/kafka-connect-iceberg-sink/kafka-connect-iceberg-sink/kafka-connect-iceberg-sink-0.2.0.jar:/kafka/connect/kafka-connect-iceberg-sink-0.2.0.jar \
debezium/connect
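Once the container is up, the Kafka Connect REST API on port 8083 can be queried to check whether the mounted jar was picked up as a plugin. This is a rough sketch under assumptions: `iceberg_plugin_loaded` is a hypothetical helper, and it only checks that some plugin name contains "iceberg" rather than matching an exact class name.

```shell
# Hypothetical helper: succeeds if the Connect worker's plugin list
# mentions an Iceberg plugin. The first argument is the Connect base URL.
iceberg_plugin_loaded() {
  # GET /connector-plugins returns a JSON array of installed plugins.
  curl -s "${1:-http://localhost:8083}/connector-plugins" | grep -qi 'iceberg'
}
```

Running `iceberg_plugin_loaded && echo "Iceberg sink plugin found"` from the host should print the message if the volume mount in the `docker run` command worked.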
  4. Run PostgreSQL with Docker:
docker run -d --name postgres -e POSTGRES_PASSWORD=postgres \
  -p 5432:5432 postgres -c wal_level=logical
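The `-c wal_level=logical` flag matters because Debezium reads changes through PostgreSQL logical decoding. A small sketch for confirming the setting took effect is shown below; `pg_wal_level` is a hypothetical helper, and it assumes the container name `postgres` and the default `postgres` user from the command above.

```shell
# Hypothetical helper: prints the server's current wal_level setting.
# The first argument is the container name (defaults to "postgres").
pg_wal_level() {
  # -t and -A strip headers and alignment so only the value is printed.
  docker exec "${1:-postgres}" psql -U postgres -tAc 'SHOW wal_level;'
}
```

Calling `pg_wal_level` should print `logical` when the container was started with the flag shown in this step.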