cbridge - aragorn/home GitHub Wiki

NAME

cbridge - a bridge tool that implements the ConvBridge interface

SOURCE

https://github.com/aragorn/home/blob/master/bin/cbridge

SYNOPSIS

cbridge seed seed_url destination_pattern ...

cbridge session source_url file

cbridge pull source_url file [count]

cbridge pull source_url destination_url [count]

cbridge pull source_url [count]

cbridge list [pattern ...]

cbridge resume session [count]

cbridge resume pattern ... [count]

cbridge status session

cbridge stop pattern ...

cbridge remove pattern ...

DESCRIPTION

cbridge is a program that implements a bridge as defined by the ConvBridge interface. It transfers data from a source to a destination. In the ConvBridge interface, the converter is responsible for processing the data (conversion, merging, and so on), while the bridge is responsible for delivering the data produced by the converter to the destination. ConvBridge is an interface for large-volume data transfer; the detailed specification, documentation, and examples can be found on the ConvBridge page.

A bridge is session-based. A session is the state of one continuous flow of data from a source to a destination. The bridge can save session information and use the saved information to resume an interrupted transfer. Of the items that make up a session, source, destination, and next_urls are required.

cbridge seed fetches converter_url values from seed_url and creates sessions from them. seed_url is an API whose result lists one or more converter_url values.

destination_pattern is a format string as used by "sprintf" in perlfunc and may contain one integer conversion specification. That specification receives a number that follows the order of the converter_url values returned by seed_url. For example, with 10 converter_url values and a destination_pattern of path/file-%02d.txt, the conversion specification receives the values 0..9, which yields the 10 destinations path/file-00.txt, path/file-01.txt, path/file-02.txt, ... , path/file-09.txt.

The converter_url values are matched in order against the one or more destination_pattern arguments to form the sessions. One session is created per converter_url; its source is the converter_url and its destination is derived from the assigned destination_pattern. When several destination_pattern arguments are given, the patterns are assigned to the sessions in rotation. For example, with the 10 converter_url values url1, url2, url3, ... url10 and the three patterns path1/file-%02d.txt, path2/file-%02d.txt, path3/file-%02d.txt, 10 sessions are created with the following source - destination pairs: url1 - path1/file-00.txt, url2 - path2/file-01.txt, url3 - path3/file-02.txt, url4 - path1/file-03.txt, ... , url10 - path1/file-09.txt.
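The rotation described above can be sketched in Python. This is only an illustration under stated assumptions: plan_sessions and format_destination are hypothetical helpers, not part of cbridge, and Python's % operator stands in for Perl's sprintf.

```python
def format_destination(pattern, index):
    """Fill the optional integer conversion specification in pattern."""
    try:
        return pattern % (index,)
    except TypeError:
        # The pattern has no conversion specification: nothing to fill in.
        return pattern % ()


def plan_sessions(converter_urls, destination_patterns):
    """Pair each converter_url with a destination, one session per URL."""
    sessions = []
    for index, url in enumerate(converter_urls):
        # Patterns rotate, while the sequence number keeps counting up.
        pattern = destination_patterns[index % len(destination_patterns)]
        sessions.append((url, format_destination(pattern, index)))
    return sessions


if __name__ == "__main__":
    urls = ["url%d" % n for n in range(1, 11)]
    patterns = [
        "path1/file-%02d.txt",
        "path2/file-%02d.txt",
        "path3/file-%02d.txt",
    ]
    for source, destination in plan_sessions(urls, patterns):
        print(source, "-", destination)
```

Note that the sequence number is not reset per pattern, which is why url4 maps to path1/file-03.txt rather than path1/file-01.txt.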

When using bash, destination_pattern can take advantage of the shell's brace expansion, tilde expansion, and so on. In the example above, the three patterns can be entered as {path1,path2,path3}/file-%02d.txt.

Both cbridge pull and cbridge session create a new session. cbridge session creates the session and exits without transferring any data. cbridge pull creates the session and then runs the data transfer.

A transfer runs until the end of the data has been reached or, when the optional run parameter count is given, until cbridge has made count conversion requests to the converter and transferred their results. A count of 0 means that data is transferred until the end of the data. If the transfer completes before count conversion requests have been made, cbridge stops as well.

A count parameter is also defined in the converter's URL query, but the count parameter of cbridge pull is unrelated to that query parameter. The count parameter in the converter's URL query is the maximum number of documents, or the number of change-list events, returned by a single conversion request. The bridge's run parameter count is the maximum number of HTTP requests the bridge sends to the converter.
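As a rough illustration of the bridge-side count, the loop below stops either when nothing is left to request or when count requests have been made. It is a sketch, not cbridge's code: pull, fetch, send, and make_fake_converter are hypothetical names, and the assumption that each converter response names the URLs to request next is inferred from the next_urls session item.

```python
def pull(fetch, send, next_urls, count=0):
    """Request pending URLs until none remain or count requests were made.

    fetch(url) is assumed to return (data, following_urls).
    send(data) delivers the data to the destination.
    count == 0 means "no limit".
    """
    pending = list(next_urls)
    requests_made = 0
    while pending and (count == 0 or requests_made < count):
        url = pending.pop(0)
        data, following_urls = fetch(url)
        requests_made += 1
        send(data)
        pending.extend(following_urls)
    # Whatever is still pending is what a later resume would start from.
    return requests_made, pending


def make_fake_converter(pages):
    """Hypothetical converter: page n points to page n + 1 until the last."""
    def fetch(url):
        n = int(url.rsplit("=", 1)[1])
        following = ["page=%d" % (n + 1)] if n + 1 < pages else []
        return "data-%d" % n, following
    return fetch


if __name__ == "__main__":
    sent = []
    made, pending = pull(make_fake_converter(3), sent.append, ["page=0"], count=2)
    print(made, pending, sent)
```

With a three-page fake converter, count=2 stops after two requests with one URL still pending, while count=0 and count=5 both stop after the third request because the data is exhausted.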

cbridge list lists the sessions that have been created. When a pattern is given, session names are matched in the same way the shell matches file names. Pattern matching is implemented with Perl's built-in "glob" in perlfunc.

cbridge status prints detailed information about the specified session.

cbridge resume resumes the data transfer of the specified session. cbridge pull is equivalent to running cbridge session followed by cbridge resume.

cbridge stop stops the sessions whose names match pattern. Here a session means one whose data transfer is currently being carried out by another cbridge process; cbridge stop sends that process a SIGINT signal to stop the transfer.

cbridge remove deletes the specified sessions. Data already transferred by those sessions is not deleted. When the data was saved to a local file, the local data file is left untouched.

Development Plan

TODO

  • show progress in cbridge pull

  • estimate the finish time in cbridge pull from the values of the from and to parameters

  • support for search engines - elastic search

  • support for search engines - breeze

  • support for mysql destination

DONE

  • cbridge seed

DROP

  • treating a dump file as a script file and executing it : does not make logical sense

Known Bugs

When a new process resumes a saved session, it creates the session object and immediately saves it to the file. A momentary race condition is possible, although updating the PID value does not take seconds. While the race condition lasts, a stop issued against the session from another process can fail.

It is not possible to create a session that appends data to a local file that already exists.

The current implementation aborts and cancels session creation if the local file already exists. Open question: should the destination get a URL scheme or option that allows append by default, or should an append mode option be added?

Implementation

session

A session object opens the session file with open(2) and then acquires a shared lock. To check whether a session is active, it tries to acquire an exclusive lock; if that fails, the session is considered active. In other words, a session is considered active when another session object for the same session exists besides the current one.

When a session is not active, it is in the stopped state if next_urls exists and in the completed state if next_urls does not exist.
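The lock-based check and the resulting states can be sketched as follows. This is a minimal Python illustration for Unix-like systems, assuming flock(2)-style locks; the Session class and its methods are hypothetical and do not reproduce cbridge's Perl implementation.

```python
import fcntl
import os
import tempfile


class Session:
    """Hypothetical session wrapper holding a shared lock on its file."""

    def __init__(self, path, next_urls):
        self.next_urls = list(next_urls)
        self.fh = open(path, "a+")
        fcntl.flock(self.fh, fcntl.LOCK_SH)

    def is_active(self):
        """True if some other holder also has a lock on the session file."""
        try:
            fcntl.flock(self.fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return False
        except BlockingIOError:
            return True
        finally:
            # Lock conversion is not guaranteed to be atomic, so take the
            # shared lock again whether or not the upgrade succeeded.
            fcntl.flock(self.fh, fcntl.LOCK_SH)

    def state(self):
        if self.is_active():
            return "active"
        return "stopped" if self.next_urls else "completed"

    def close(self):
        # Closing the file releases the lock.
        self.fh.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.session")
    first = Session(path, ["http://example.invalid/next"])
    print(first.state())
    second = Session(path, ["http://example.invalid/next"])
    print(first.state())
    second.close()
    print(first.state())
```

One consequence of this design is that the check briefly holds an exclusive lock when it succeeds, so two processes probing at the same moment can each see the other as active for an instant.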

Test
