RegLexer - wkpark/reglexer GitHub Wiki

RegLexer

RegLexer is based on DokuWiki's lexer.php, which is itself an improved version of the lexer.php included in SimpleTest.

DokuWiki improved lexer.php as follows:

  • support for lookbehind and lookahead patterns
  • support for changing the pattern modifiers from within the pattern
  • notification to the Handler of the starting byte index in the raw text where a token was matched

RegLexer adds the following fixes and improvements on top of that to make regular expressions more useful:

  • Capture group support.
  • Instead of passing only the matched token to the handler, the entire match array is passed, so handlers can make full use of the regex.
  • The match offset argument is used, so the ^ anchor can be used.
  • PREG_OFFSET_CAPTURE is used, so patterns that match the empty string (for example "(?=\n)") can also be used.
  • Events can be changed, which adds flexibility. For example, a handler that receives a LEXER_EXIT event and determines that it is not actually an exit can change it to another event such as LEXER_UNMATCHED.
  • The consumed token length can be overridden. For example, a handler that receives a ''''' token but consumes only the three characters ''' can return 3 to set the consumed length.
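
The offset-based matching and the consumed-length return value described above can be sketched roughly as follows. This is a minimal illustration built only on PHP's `preg_match`, not RegLexer's actual code: the helper names `matchAt` and `scan`, the handler signature, and the convention that a handler returns an int to override the consumed length are all assumptions made for this example.

```php
<?php
// Hypothetical helper (not part of RegLexer): find the next match at or
// after $offset without cutting the subject string.
function matchAt(string $pattern, string $subject, int $offset): ?array
{
    $ok = preg_match($pattern, $subject, $m, PREG_OFFSET_CAPTURE, $offset);
    if ($ok !== 1) {
        return null; // 0 means no match, false means an error
    }
    // With PREG_OFFSET_CAPTURE every entry of $m is [text, byteOffset].
    return ['text' => $m[0][0], 'pos' => $m[0][1], 'groups' => $m];
}

// Hypothetical lexer loop: the handler receives the whole match array and
// may return an int saying how many bytes of the match it really consumed.
function scan(string $pattern, string $subject, callable $handler): void
{
    $offset = 0;
    $length = strlen($subject);
    while ($offset <= $length) {
        $hit = matchAt($pattern, $subject, $offset);
        if ($hit === null) {
            break;
        }
        $used = $handler($hit['groups'], $hit['pos']);
        $step = is_int($used) ? $used : strlen($hit['text']);
        // Advance at least one byte so a zero-width match cannot loop forever.
        $offset = $hit['pos'] + max(1, $step);
    }
}

// 1) Offset argument and the ^ anchor: the "==" at byte 2 is mid-line.
$text = "a == b\n== c";
$cut  = preg_match('/^==/m', substr($text, 2));  // 1: ^ wrongly matches the cut point
$kept = matchAt('/^==/m', $text, 2);             // finds the real line start at byte 7

// 2) Zero-width pattern: the matched text is '', but its position is known.
$empty = matchAt('/(?=\n)/', "ab\ncd", 0);       // text '' at byte 2

// 3) Consumed-length override: take only ''' out of a ''''' token.
$seen = [];
scan("/'{2,}/", "'''''bold italic", function (array $m, int $pos) use (&$seen) {
    $seen[] = [$m[0][0], $pos];
    return strlen($m[0][0]) === 5 ? 3 : null; // null: consume the whole match
});
// $seen now holds ["'''''", 0] and ["''", 3]
```

Because the subject is never truncated, assertions such as `^` and lookbehind see the real surrounding text, and a zero-width match can still be located by its offset even though its text is empty.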