auto space by FST - beyondnlp/nlp GitHub Wiki
autospace by FST
- With a reasonable working knowledge of Rouzeta, you can develop a word-spacing (autospacing) module using a WFST.
- This page gives a brief description of how to do that.
[example] Below is the grammar (written in lexc) used to draw the transition graph.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! HPS by FST !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!noun : /n
!josa : /j
Multichar_Symbols /j /n
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
LEXICON Root
noun ; ! common noun
LEXICON noun ! common noun
이성친구/n nNext;
이성동료/n nNext;
LEXICON josa ! postposition (josa)
์/j jNext;
이/j jNext;
๋/j jNext;
로/j jNext;
만/j jNext;
์/j jNext;
가/j jNext;
과/j jNext;
์/j jNext;
๋/j jNext;
๋ค/j jNext;
LEXICON nNext
josa;
finLexicon;
LEXICON jNext
finLexicon;
LEXICON finLexicon
# ;
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!! End of Document !!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[example] The grammar above produces the graph below.

[test] A grammar rewritten on the basis of the example above.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! HPS by FST !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!noun : /nc
!josa : /js
Multichar_Symbols /js /nc /b
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
LEXICON Root
noun ; ! common noun
LEXICON noun ! common noun
죽음/nc nNext;
의도/nc nNext;
도로/nc nNext;
LEXICON josa ! postposition (josa)
의/js jNext;
로/js jNext;
LEXICON nNext
noun;
josa;
finLexicon;
LEXICON jNext
noun;
finLexicon;
LEXICON finLexicon
# ;
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!! End of Document !!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
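To make the continuation classes concrete, the following Python sketch mirrors the test lexicon (죽음, 의도, 도로 as nouns; 의, 로 as josa) as a plain dictionary and enumerates every tagged segmentation of an unspaced string. It is only an illustration of the idea, not part of Rouzeta or foma; the names `LEXICON`, `ROOT` and `analyses` are hypothetical.

```python
# Continuation-class lexicon mirroring the lexc grammar.
# "#" plays the role of finLexicon (end of input allowed here).
LEXICON = {
    "noun": {
        "entries": {"죽음": "/nc", "의도": "/nc", "도로": "/nc"},
        "next": ["noun", "josa", "#"],   # nNext
    },
    "josa": {
        "entries": {"의": "/js", "로": "/js"},
        "next": ["noun", "#"],           # jNext
    },
}
ROOT = ["noun"]  # LEXICON Root


def analyses(text, classes=ROOT):
    """Yield every tagged segmentation of `text` accepted by the lexicon."""
    for cls in classes:
        if cls == "#":
            if text == "":
                yield []
            continue
        for form, tag in LEXICON[cls]["entries"].items():
            if text.startswith(form):
                rest_text = text[len(form):]
                for rest in analyses(rest_text, LEXICON[cls]["next"]):
                    yield [form + tag] + rest


if __name__ == "__main__":
    for a in analyses("죽음의도로"):
        print(" ".join(a))
```

For the input 죽음의도로 the sketch yields two analyses, one through the noun 의도 and one through the josa 의, which is the same ambiguity the compiled graph contains.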
[test] Replace the tags with epsilon
!!!!!!!!!!!!!!!!!!!!!!!
! Read hps Lexicon !
!!!!!!!!!!!!!!!!!!!!!!!
read lexc hps.lexc
define Lexicon ;
define josa_lex [ 이 | 가 ];
define josa_tag [ %/nc | %/js ];
define Filter1 %/nc -> %/w || _ [ 이 | 가 ];
define Filter2 %/nc -> 0;
define Filter3 %/js -> 0;
define Filter4 %/w -> 0;
define test Lexicon
.o. Filter1
.o. Filter2
.o. Filter3
.o. Filter4
;
regex test ;
invert net
att > hps.att;
[test] The graph obtained by applying the grammar above

How to add weights in OpenFst

- In the automaton above, both paths exist: '죽음/의/도로' and '죽음/의도/로'.
- From node 7 there are two choices: node 4 (the noun path) or node 8 (the josa path).
- At present all weights are equal, so the result follows '죽음/의도/로'.
- Raising the weight of 의/js makes it possible to select '죽음/의/도로' instead.
0 1 의 의
0 2 도 도
0 3 죽 죽
3 4 음 음
4 5 @0@ /nc
5 6 로 로
5 7 의 의
5 2 도 도
5 3 죽 죽
7 4 도 도
7 8 @0@ /js
8 1 의 의
8 2 도 도
8 3 죽 죽
6 8 @0@ /js
2 4 로 로
1 4 도 도
5
8
- Looking at the AT&T format above, it is not obvious how a weight should be assigned.
7 4 도 도
7 8 @0@ /js
- From node 7, two moves are possible: to node 4 or to node 8.
- Raising the weight of one of the two makes that path more likely to be taken.
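One way to experiment with this is sketched below in Python. In the OpenFst text format an arc line may carry an optional fifth field holding the weight, and a final-state line an optional second field; a missing weight counts as zero cost. The sketch assumes OpenFst's default tropical semiring, where weights are costs and the path with the lowest total wins, so preferring '죽음/의/도로' is expressed here by putting a cost of 1.0 on the competing arc from node 7 to node 4 rather than by raising anything on the 의/js arc. The parser and search are a hand-written illustration, not OpenFst calls, and the names `parse_att` and `best_path` are hypothetical.

```python
import heapq

EPS = "@0@"  # epsilon symbol as written by foma's AT&T export

# Same arcs as the listing, with a cost of 1.0 added to the arc 7 -> 4.
ATT = """\
0 1 의 의
0 2 도 도
0 3 죽 죽
3 4 음 음
4 5 @0@ /nc
5 6 로 로
5 7 의 의
5 2 도 도
5 3 죽 죽
7 4 도 도 1.0
7 8 @0@ /js
8 1 의 의
8 2 도 도
8 3 죽 죽
6 8 @0@ /js
2 4 로 로
1 4 도 도
5
8
"""


def parse_att(text):
    """Parse AT&T lines: 'src dst in out [w]' for arcs, 'state [w]' for finals."""
    arcs, finals = {}, {}
    for line in text.splitlines():
        f = line.split()
        if not f:
            continue
        if len(f) <= 2:
            finals[f[0]] = float(f[1]) if len(f) == 2 else 0.0
        else:
            w = float(f[4]) if len(f) == 5 else 0.0
            arcs.setdefault(f[0], []).append((f[1], f[2], f[3], w))
    return arcs, finals


def best_path(text, arcs, finals, start="0"):
    """Dijkstra over (state, position); returns (cost, output) or None."""
    accept = "<accept>"  # pseudo-state reached after paying the final weight
    heap = [(0.0, 0, start, "")]
    seen = set()
    while heap:
        cost, pos, state, out = heapq.heappop(heap)
        if state == accept:
            return cost, out
        if (state, pos) in seen:
            continue
        seen.add((state, pos))
        if pos == len(text) and state in finals:
            heapq.heappush(heap, (cost + finals[state], pos, accept, out))
        for dst, ilab, olab, w in arcs.get(state, []):
            if ilab == EPS:
                npos = pos
            elif text.startswith(ilab, pos):
                npos = pos + len(ilab)
            else:
                continue
            nout = out + ("" if olab == EPS else olab)
            heapq.heappush(heap, (cost + w, npos, dst, nout))
    return None


if __name__ == "__main__":
    arcs, finals = parse_att(ATT)
    print(best_path("죽음의도로", arcs, finals))
```

Moving the 1.0 from the `7 4` line to the `7 8` line flips the preference to the 의도 reading. If the file is compiled with OpenFst itself, `@0@` would also need to be mapped to the epsilon label in the symbol table, which this sketch does not cover.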
update weight
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!! HPS by FST !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Multichar_Symbols /js /nc /vv /ncp /xsp /ma
Definitions
NUM = %0|1|2|3|4|5|6|7|8|9 ;
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
LEXICON Root
Hnc ; ! common noun
Hvv ; ! predicates
Hncp ; ! hada-type nouns (nouns that combine with 하다)
Hma ; ! adverb
LEXICON Hnc ! common noun
죽음/nc nNext;
의도/nc nNext;
도로/nc nNext;
LEXICON Hncp ! hada-type nouns
공부/ncp ncpNext;
노력/ncp ncpNext;
사랑/ncp ncpNext;
LEXICON Hxsp ! 하다
하다/xsp xspNext;
하고/xsp xspNext;
하면/xsp xspNext;
해서/xsp xspNext;
LEXICON Hma ! adverb
์/ma maNext;
엄청/ma maNext;
아주/ma maNext;
매우/ma maNext;
LEXICON maNext
Hma;
Hvv;
final;
LEXICON Hvv ! predicates
행복한/vv vNext;
즐거운/vv vNext;
LEXICON Hjs ! postposition (josa)
의/js jNext;
로/js jNext;
LEXICON vNext
Hnc;
final;
LEXICON nNext
Hnc;
Hjs;
final;
LEXICON xspNext
Hnc;
Hjs;
final;
LEXICON ncpNext
Hxsp;
Hma;
final;
LEXICON jNext
Hnc;
final;
LEXICON final
# ;
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!! End of Document !!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!
! Read hps Lexicon !
!!!!!!!!!!!!!!!!!!!!!!!
read lexc hps.lexc
define Lexicon ;
define Filter1 %/nc -> 0;
define Filter2 %/js -> 0;
define Filter3 %/vv -> 0;
define Filter4 %/nn -> 0;
define Filter5 %/ncp -> 0;
define Filter6 %/xsp -> 0;
define Filter7 %/ma -> 0;
define test Lexicon
.o. Filter1
.o. Filter2
.o. Filter3
.o. Filter4
.o. Filter5
.o. Filter6
.o. Filter7
;
regex test ;
invert net
att > hps.att