Kafka introduce - swsuh93/study GitHub Wiki

Apache Kafka (A distributed streaming platform)

  • Key capabilities of a streaming platform

    • Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system
    • Store streams of records in a fault-tolerant way (the system keeps working even if one or more of its N nodes are down)
    • Process streams of records

  • Main uses

    • Building real-time streaming data pipelines that reliably move data between systems or applications
    • Building real-time streaming applications that transform or react to streams of data

  • Concepts : Kafka runs as a cluster on one or more servers that can span multiple data centers. The Kafka cluster stores streams of records in categories called topics. Each record consists of a key, a value, and a timestamp.

  • Four core APIs

    • Producer API : allows an application to publish a stream of records to one or more Kafka topics
    • Consumer API : allows an application to subscribe to one or more topics and process the stream of records produced to them
    • Streams API : allows an application to act as a stream processor, consuming an input stream from one or more topics and producing an output stream to one or more output topics, effectively transforming input streams into output streams
    • Connector API : allows building and running reusable producers or consumers that connect Kafka topics to existing applications or data systems. For example, a connector to a relational database might capture every change to a table
  • In Kafka, communication between clients and servers uses a simple, high-performance, language-agnostic TCP protocol. The protocol is versioned and maintains backward compatibility with older versions. A Java client is provided for Kafka, but clients are available in many languages.

  • Each partition is an ordered, immutable sequence of records that is continually appended to: a structured commit log. Each record in a partition is assigned a sequential ID number called the offset, which uniquely identifies the record within the partition.
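As a minimal sketch of the idea (a toy in-memory model, not the Kafka client API; `Record` and `PartitionLog` are hypothetical names), a partition can be pictured as an append-only list whose indices are the offsets:

```python
import time
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)  # frozen: a record cannot be modified once written
class Record:
    key: Optional[str]
    value: Any
    timestamp: float


class PartitionLog:
    """Toy append-only log; the offset of a record is its list index."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def append(self, key: Optional[str], value: Any,
               timestamp: Optional[float] = None) -> int:
        """Append a record to the end of the log and return its offset."""
        ts = time.time() if timestamp is None else timestamp
        self._records.append(Record(key, value, ts))
        return len(self._records) - 1

    def read(self, offset: int, max_records: int = 10) -> List[Record]:
        """Return up to max_records records starting at the given offset."""
        return self._records[offset:offset + max_records]


log = PartitionLog()
first = log.append("user-1", "login")   # offset 0
second = log.append("user-2", "click")  # offset 1
```

Records are only ever added at the end, so an offset handed out once keeps pointing at the same record.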

  • The Kafka cluster durably persists all published records, whether or not they have been consumed, using a configurable retention period. For example, if the retention policy is set to two days, a record is available for consumption for two days after it is published, after which it is discarded to free up space. Kafka's performance is effectively constant with respect to data size, so storing data for a long time is not a problem.
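The two-day example can be sketched as follows. This is a simplified model with hypothetical names (`apply_retention`, `RETENTION_SECONDS`); it filters individual records, whereas a real broker removes data in larger log segments:

```python
from typing import Any, List, Tuple

DAY = 24 * 60 * 60
RETENTION_SECONDS = 2 * DAY  # the two-day policy from the example


def apply_retention(records: List[Tuple[int, Any]], now: int,
                    retention_seconds: int = RETENTION_SECONDS
                    ) -> List[Tuple[int, Any]]:
    """Keep (timestamp, value) pairs published within the retention window.

    Whether a record has been consumed plays no role in the decision.
    """
    cutoff = now - retention_seconds
    return [(ts, value) for ts, value in records if ts >= cutoff]


records = [(0, "a"), (DAY + DAY // 2, "b"), (3 * DAY - 60, "c")]
kept = apply_retention(records, now=3 * DAY)  # "a" is three days old
```

Here `"a"` is dropped because it is older than two days at `now`, while `"b"` and `"c"` remain readable.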

  • Normally a consumer advances its offset linearly as it reads records, but because the position is controlled by the consumer, it can consume records in any order it likes. For example, a consumer can reset to an older offset to reprocess data from the past, or skip ahead to the most recent record and start consuming from "now".

  • Because each consumer keeps track of its own position, consumers are cheap: they can come and go without much impact on the cluster or on other consumers.
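A small sketch of consumer-controlled position, assuming a plain Python list stands in for a partition (`ToyConsumer` and its methods are hypothetical and are not the Kafka client API):

```python
from typing import Any, List


class ToyConsumer:
    """Reads from a shared list and owns its own read position."""

    def __init__(self, log: List[Any]) -> None:
        self._log = log
        self.position = 0  # offset of the next record to read

    def poll(self, max_records: int = 2) -> List[Any]:
        """Read the next batch and advance the position linearly."""
        batch = self._log[self.position:self.position + max_records]
        self.position += len(batch)
        return batch

    def seek(self, offset: int) -> None:
        """Move to any valid offset, e.g. back to reprocess old data."""
        if not 0 <= offset <= len(self._log):
            raise ValueError("offset out of range")
        self.position = offset

    def seek_to_end(self) -> None:
        """Skip everything already in the log and start from 'now'."""
        self.position = len(self._log)


log = ["r0", "r1", "r2", "r3"]
consumer = ToyConsumer(log)
first_batch = consumer.poll()   # ["r0", "r1"]
consumer.seek(0)                # rewind to reprocess
replayed = consumer.poll()      # ["r0", "r1"] again
consumer.seek_to_end()          # jump to "now"
log.append("r4")                # a new record arrives
latest = consumer.poll()        # ["r4"]
```

The log itself never changes when the consumer moves; only the consumer's own `position` does, which is why adding or removing consumers has little effect on anyone else.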

  • The partitions in the log serve several purposes. First, they allow the log to scale beyond a size that fits on a single server. Each individual partition must fit on the servers that host it, but a topic may have many partitions, so it can handle an arbitrary amount of data. Second, they act as the unit of parallelism.

  • Distribution : The partitions of the log are distributed over the servers in the Kafka cluster, with each server handling data and requests for a share of the partitions. Each partition is replicated across a configurable number of servers for fault tolerance. Each server acts as a leader for some of its partitions and a follower for others, so load is well balanced within the cluster.
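To make the leader/follower balance concrete, here is a toy placement scheme. It is an illustrative assumption, not Kafka's actual replica assignment algorithm; `place_replicas` and the broker names are hypothetical:

```python
from typing import Dict, List


def place_replicas(num_partitions: int, brokers: List[str],
                   replication_factor: int) -> Dict[int, List[str]]:
    """Assign each partition to consecutive brokers, rotating the start.

    The first broker in each list is treated as the leader and the
    remaining brokers as followers.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor exceeds number of brokers")
    layout = {}
    for p in range(num_partitions):
        layout[p] = [brokers[(p + i) % len(brokers)]
                     for i in range(replication_factor)]
    return layout


layout = place_replicas(3, ["b1", "b2", "b3"], replication_factor=2)
leaders = [replicas[0] for replicas in layout.values()]
```

With three brokers, three partitions, and a replication factor of 2, each broker leads exactly one partition and follows exactly one other, so no single server carries all of the load.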

  • Geo-Replication : Kafka MirrorMaker provides geo-replication support for clusters. With MirrorMaker, messages are replicated across multiple data centers or cloud regions. This can be used in active/passive scenarios for backup and recovery, and to support data locality requirements.

  • Producers : Producers publish data to the topics of their choice. The producer is responsible for choosing which partition within the topic each record is assigned to. This can be done round-robin simply to balance load, or according to some semantic partition function (for example, one based on a key in the record). How partitioning is used is covered further below.
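Both strategies can be sketched in a few lines. `ToyPartitioner` is a hypothetical name, and `zlib.crc32` is only a deterministic stand-in for whatever hash a real client uses (the Java client, for instance, hashes keys with murmur2):

```python
import itertools
import zlib
from typing import Optional


class ToyPartitioner:
    """Round-robin for keyless records, hash-based for keyed records."""

    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition(self, key: Optional[str]) -> int:
        if key is None:
            # No key: spread records evenly to balance load.
            return next(self._round_robin)
        # Keyed: the same key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = ToyPartitioner(num_partitions=3)
keyless = [partitioner.partition(None) for _ in range(4)]
keyed_a = partitioner.partition("user-1")
keyed_b = partitioner.partition("user-1")
```

Because a given key always lands in the same partition, all records for that key stay in one ordered log.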

  • Consumers : Consumers label themselves with a consumer group name, and each record published to a topic is delivered to one consumer instance within each subscribing consumer group. Consumer instances can be in separate processes or on separate machines. If all the consumer instances have the same consumer group, the records are effectively load balanced over the consumer instances. If all the consumer instances have different consumer groups, each record is broadcast to all the consumer processes.
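The two extremes (one shared group versus one group per instance) can be simulated with a toy delivery function. `deliver` is hypothetical, and choosing the owner by `partition % len(members)` is an assumption made for illustration only:

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def deliver(records: List[Tuple[int, Any]],
            groups: Dict[str, List[str]]) -> Dict[str, List[Any]]:
    """Deliver each (partition, value) to one instance in every group.

    Instance names are assumed to be unique across groups.
    """
    received = defaultdict(list)
    for partition, value in records:
        for members in groups.values():
            owner = members[partition % len(members)]
            received[owner].append(value)
    return dict(received)


records = [(0, "a"), (1, "b"), (2, "c"), (3, "d")]

# Same group: records are split between the two instances.
queue_like = deliver(records, {"g1": ["c1", "c2"]})

# Different groups: every instance sees every record.
broadcast = deliver(records, {"g1": ["c1"], "g2": ["c2"]})
```

In the first call `c1` and `c2` each receive half of the records; in the second, both receive all four.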

  • Multi-tenancy : Kafka can be deployed as a multi-tenant solution. Multi-tenancy is enabled by configuring which topics can produce or consume data. There is also operations support for quotas: administrators can define and enforce quotas on requests to control the broker resources that clients use.

  • Guarantees

    • Order is preserved: messages sent by a producer to a given topic partition are appended in the order they are sent, so a message sent earlier gets a lower offset and appears earlier in the log
    • A consumer instance sees records in the order they are stored in the log
    • For a topic with replication factor N, up to N-1 server failures are tolerated without losing any records committed to the log
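The N-1 figure follows from simple counting. The sketch below assumes that a committed record is present on all N replicas of its partition; `surviving_replicas` and the broker names are hypothetical:

```python
from typing import Iterable, Set


def surviving_replicas(replicas: Iterable[str],
                       failed: Iterable[str]) -> Set[str]:
    """Return the replicas that still hold the committed record."""
    return set(replicas) - set(failed)


replicas = {"broker-1", "broker-2", "broker-3"}  # replication factor N = 3

# N-1 = 2 failures: one copy is left, so the record is not lost.
after_two = surviving_replicas(replicas, {"broker-1", "broker-2"})

# N = 3 failures: no copy is left.
after_three = surviving_replicas(replicas, replicas)
```

As long as at least one replica survives, the committed record is still available, which is exactly the case for up to N-1 failures.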

  • Kafka as a Messaging System

    • As with a queue, the consumer group allows you to divide up processing over a collection of processes (the members of the consumer group). As with publish-subscribe, Kafka allows you to broadcast messages to multiple consumer groups. [Both queuing and publish-subscribe are possible.]
    • The advantage of Kafka's model is that every topic has both of these properties: it can scale processing and is also multi-subscriber, so there is no need to choose one or the other.
    • Kafka also has stronger ordering guarantees than a traditional messaging system.
    • By having a notion of parallelism (the partition) within the topics, Kafka is able to provide both ordering guarantees and load balancing over a pool of consumer processes.
    • Note, however, that there cannot be more active consumer instances in a consumer group than partitions. [Instances beyond the partition count are assigned no partition and sit idle.]
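A toy round-robin assignment shows why the partition count caps the useful size of a group. `assign_partitions` is a hypothetical helper; real clients offer several assignment strategies that differ in detail:

```python
from typing import Dict, List


def assign_partitions(partitions: List[int],
                      consumers: List[str]) -> Dict[str, List[int]]:
    """Give each partition to exactly one consumer in the group."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


partitions = [0, 1, 2, 3]

# Two consumers share four partitions: each reads two, in order.
two = assign_partitions(partitions, ["c1", "c2"])

# Five consumers, four partitions: the fifth gets nothing to read.
five = assign_partitions(partitions, ["c1", "c2", "c3", "c4", "c5"])
```

Since each partition has exactly one reader within the group, records in that partition are consumed in order, while the group as a whole still spreads the load across its members.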

  • Kafka as a Storage System

    • Data written to Kafka is written to disk and replicated for fault tolerance.
    • The disk structures Kafka uses scale well: Kafka will perform the same whether you have 50 KB or 50 TB of persistent data on the server.
    • As a result of taking storage seriously and allowing the clients to control their read position, you can think of Kafka as a kind of special-purpose distributed filesystem dedicated to high-performance, low-latency commit log storage, replication, and propagation.

  • Kafka for Stream Processing [Streams API]

    • It isn't enough to just read, write, and store streams of data; the purpose is to enable real-time processing of streams.
    • This facility helps solve the hard problems this type of application faces: handling out-of-order data, reprocessing input as code changes, performing stateful computations, etc.
    • The Streams API builds on the core primitives Kafka provides: it uses the producer and consumer APIs for input, uses Kafka for stateful storage, and uses the same group mechanism for fault tolerance among the stream processor instances.
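As a rough illustration of a stateful computation over a stream (plain Python generators, not the Kafka Streams API; `running_counts` is a hypothetical name), the processor below turns an input stream of keys into an output stream of running counts:

```python
from collections import defaultdict
from typing import Iterable, Iterator, Tuple


def running_counts(input_stream: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit (key, count so far) for every record as it arrives."""
    counts = defaultdict(int)  # state kept by the processor between records
    for key in input_stream:
        counts[key] += 1
        yield key, counts[key]


output = list(running_counts(["a", "b", "a"]))
```

The processor never needs the whole input up front: each output record depends only on the current record and the state accumulated so far.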

  • Putting the Pieces Together : The combination of messaging, storage, and stream processing is unusual, but it is essential to Kafka's role as a streaming platform.

    • Kafka combines both of these capabilities, storing and processing historical data and processing future data as it arrives, and the combination is critical both for Kafka's use as a platform for streaming applications and for streaming data pipelines.
    • By combining storage and low-latency subscriptions, streaming applications can treat both past and future data the same way. That is, a single application can process historical, stored data, but rather than ending when it reaches the last record, it can keep processing as future data arrives. This is a generalized notion of stream processing that subsumes batch processing as well as message-driven applications.
    • Likewise, for streaming data pipelines, subscription to real-time events makes it possible to use Kafka for very low-latency pipelines, while the ability to store data reliably makes it possible to use it for critical data where delivery must be guaranteed, or for integration with offline systems that load data only periodically or may go down for extended periods for maintenance. The stream processing facilities make it possible to transform data as it arrives.