konlpy X m1 silicon - hexists/konlpy GitHub Wiki

konlpy X m1 silicon

Background

This page explains how to use konlpy on an M1 (Apple silicon) Mac.
The M1 is a CPU released in November 2020, and konlpy does not run properly on computers that use it (M1 Macs).
The problem is not in konlpy itself: konlpy loads Java packages, and the JDK has a compatibility problem at that point.
This page describes how to use konlpy with a JDK in which that compatibility problem is resolved.
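
To check whether this situation applies to a given machine, a minimal sketch like the one below can report whether the Python interpreter is running natively on Apple silicon. The helper name `is_apple_silicon` is only an illustration and is not part of konlpy. A Python started under Rosetta 2 reports `x86_64` rather than `arm64`, so the result describes the interpreter, not the hardware.

```python
import platform


def is_apple_silicon(system=None, machine=None):
    """Return True when the interpreter runs natively on an arm64 Mac.

    Both arguments default to the values reported by the platform module;
    they are parameters only so the check can be exercised on any machine.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


# Report what this interpreter sees.
print(platform.system(), platform.machine(), is_apple_silicon())
```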

Installation

Install konlpy the same way as usual.
konlpy_install_mac.png

These are the commands from the original link.

# update pip
$ python3 -m pip install --upgrade pip

# install konlpy
$ python3 -m pip install konlpy        # Python 3.x

# install mecab
$ bash <(curl -s https://raw.githubusercontent.com/konlpy/konlpy/master/scripts/mecab.sh)

The following sets up a virtual environment with virtualenv and installs konlpy inside it.

$ python3 -V
Python 3.8.2

$ mkdir test_konlpy_m1

$ cd test_konlpy_m1

$ virtualenv -p python3 .venv

$ source .venv/bin/activate

$ python3 -m pip install --upgrade pip

$ python3 -m pip install konlpy        # Python 3.x

$ bash <(curl -s https://raw.githubusercontent.com/konlpy/konlpy/master/scripts/mecab.sh)

If nothing goes wrong, the installation completes successfully.
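
Before testing, it can help to confirm that the interpreter in use really is the one inside `.venv`. The sketch below is one way to do that with the standard library only; `in_virtualenv` is a hypothetical helper name, and the check assumes the usual behavior of virtualenv and venv (older virtualenv releases set `sys.real_prefix`, newer ones make `sys.base_prefix` differ from `sys.prefix`).

```python
import sys


def in_virtualenv():
    """Return True when this interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases expose sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases change sys.prefix but keep sys.base_prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
```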

Test

Run Kkma to check that konlpy was installed correctly.

>>> from konlpy.tag import Kkma
>>> from konlpy.utils import pprint
>>> kkma = Kkma()
>>> pprint(kkma.pos(u'오류보고는 실행환경, 에러메세지와함께 설명을 최대한상세히!^^'))

Ideally this would just run, but it fails with the errors below.

tweepy problem
Type "help", "copyright", "credits" or "license" for more information.
>>> from konlpy.tag import Kkma
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/daniellee/Develop/test_konlpy_m1/.venv/lib/python3.8/site-packages/konlpy/__init__.py", line 12, in <module>
    from konlpy import (
  File "/Users/daniellee/Develop/test_konlpy_m1/.venv/lib/python3.8/site-packages/konlpy/stream/__init__.py", line 8, in <module>
    from konlpy.stream.twitter import TwitterStreamer
  File "/Users/daniellee/Develop/test_konlpy_m1/.venv/lib/python3.8/site-packages/konlpy/stream/twitter.py", line 17, in <module>
    class CorpusListener(tweepy.StreamListener):
AttributeError: module 'tweepy' has no attribute 'StreamListener'
>>>

This is a tweepy version problem. If 1.4.0 is installed, reinstall a version that konlpy requires (3.7.0 <= tweepy < 3.10.0).

$ pip install "tweepy>=3.7.0,<3.10.0"

Note that the tweepy-related code has already been removed from the konlpy master branch, so this problem appears only when using the PyPI package. https://github.com/konlpy/konlpy/issues/368
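
To see whether the installed tweepy falls inside the range quoted above, a small sketch along these lines may be useful. `parse_version` and `tweepy_version_ok` are hypothetical helpers written for this page, and the parsing is deliberately simple (it only reads the leading digits of each dotted part). The lookup uses `importlib.metadata`, which is in the standard library from Python 3.8.

```python
def parse_version(text):
    """Turn '3.9.0' into (3, 9, 0); anything after the leading digits of a part is ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '3.7' compares equal to '3.7.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def tweepy_version_ok(text, low=(3, 7, 0), high=(3, 10, 0)):
    """Return True when low <= version < high."""
    return low <= parse_version(text) < high


try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    version = None

if version is None:
    print("importlib.metadata is not available on this Python")
else:
    try:
        installed = version("tweepy")
    except PackageNotFoundError:
        print("tweepy is not installed in this environment")
    else:
        verdict = "inside" if tweepy_version_ok(installed) else "outside"
        print("tweepy", installed, "is", verdict, "the required range")
```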

jdk

This error appears once the tweepy problem is fixed.
It occurs when using a JDK that has not resolved M1 compatibility, and it reports that the JVM DLL could not be found at libjli.dylib.

python
Python 3.8.2 (default, Dec 21 2020, 15:06:03)
[Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from konlpy.tag import Kkma
>>> kkma = Kkma()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/daniellee/Develop/test_konlpy_m1/.venv/lib/python3.8/site-packages/konlpy/tag/_kkma.py", line 95, in __init__
    jvm.init_jvm(jvmpath, max_heap_size)
  File "/Users/daniellee/Develop/test_konlpy_m1/.venv/lib/python3.8/site-packages/konlpy/jvm.py", line 64, in init_jvm
    jpype.startJVM(jvmpath, '-Djava.class.path=%s' % classpath,
  File "/Users/daniellee/Develop/test_konlpy_m1/.venv/lib/python3.8/site-packages/jpype/_core.py", line 226, in startJVM
    _jpype.startup(jvmpath, tuple(args),
OSError: [Errno 0] JVM DLL not found: /Library/Java/JavaVirtualMachines/jdk-16.0.1.jdk/Contents/Home/lib/libjli.dylib
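
As a first diagnostic for the error above, the sketch below lists every `libjli.dylib` under the directory that `JAVA_HOME` points to, so the path in the message can be compared with what is actually on disk. `find_libjli` is a hypothetical helper, not part of konlpy or JPype. The check is about file presence only: it is an assumption on my part that the same message may also appear when the file exists but cannot be loaded, so a hit here does not prove the JDK is usable.

```python
import os


def find_libjli(java_home):
    """Return the sorted paths of every libjli.dylib found under java_home."""
    hits = []
    for root, _dirs, files in os.walk(java_home):
        if "libjli.dylib" in files:
            hits.append(os.path.join(root, "libjli.dylib"))
    return sorted(hits)


java_home = os.environ.get("JAVA_HOME", "")
if java_home and os.path.isdir(java_home):
    found = find_libjli(java_home)
    if found:
        for path in found:
            print(path)
    else:
        print("no libjli.dylib under", java_home)
else:
    print("JAVA_HOME is not set or is not a directory")
```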

To fix this, find and install a JDK that is compatible with M1.
After several attempts, I confirmed that konlpy runs without problems on Oracle JDK 17.

  1. Download the JDK
    From the Oracle JDK download site, download the macOS JDK for arm64.
    I downloaded the Arm 64 Compressed Archive.
    konlpy_jdk.png

  2. Extract the archive
    Go to the download directory and extract the downloaded file.
    konlpy_jdk_tgz.png

  3. Move it (to a convenient path)
    Move the jdk-17.jdk directory to a suitable location.
    I moved it to /Library/Java/JavaVirtualMachines/.

  4. Configure .zshrc
    Add the environment settings to .zshrc.

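# single quotes keep $JAVA_HOME_17, $PATH and $JAVA_HOME literal, so they are expanded when .zshrc is loaded, not when echo runs
$ echo 'export JAVA_HOME_17=/Library/Java/JavaVirtualMachines/jdk-17.jdk/Contents/Home' >> ~/.zshrc
$ echo 'export JAVA_HOME=$JAVA_HOME_17' >> ~/.zshrc
$ echo 'export PATH=$PATH:$JAVA_HOME/bin' >> ~/.zshrc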

Run the following command to apply .zshrc to the current shell.

$ source ~/.zshrc
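
After reloading the shell, it may be worth checking from Python that the settings are actually visible to the process that will start the JVM. The sketch below is one possible check; `check_java_home` is a hypothetical helper that only looks at `JAVA_HOME` and `PATH` and does not start Java.

```python
import os


def check_java_home(env):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    java_home = env.get("JAVA_HOME")
    if not java_home:
        problems.append("JAVA_HOME is not set")
        return problems
    if not os.path.isdir(java_home):
        problems.append("JAVA_HOME does not exist: " + java_home)
    # The settings above also append $JAVA_HOME/bin to PATH.
    java_bin = os.path.join(java_home, "bin")
    if java_bin not in env.get("PATH", "").split(os.pathsep):
        problems.append("PATH does not contain " + java_bin)
    return problems


for line in check_java_home(os.environ) or ["JAVA_HOME and PATH look consistent"]:
    print(line)
```

An empty list means both variables agree with each other. It does not say anything about whether that JDK works on M1.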

Running after fixing all errors

Once all the errors are fixed, konlpy runs without any error messages.

>>> from konlpy.tag import Kkma
>>> from konlpy.utils import pprint
>>> kkma = Kkma()

>>> pprint(kkma.pos(u'오류보고는 실행환경, 에러메세지와함께 설명을 최대한상세히!^^'))
[('오류', 'NNG'),
 ('보고', 'NNG'),
 ('는', 'JX'),
 ('실행', 'NNG'),
 ('환경', 'NNG'),
 (',', 'SP'),
 ('에러', 'NNG'),
 ('메세지', 'NNG'),
 ('와', 'JKM'),
 ('함께', 'MAG'),
 ('설명', 'NNG'),
 ('을', 'JKO'),
 ('최대한', 'NNG'),
 ('상세히', 'MAG'),
 ('!', 'SF'),
 ('^^', 'EMO')]
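
The result is a plain list of (word, tag) tuples, so it can be processed without konlpy once it has been produced. As a small illustration, the sketch below copies the output above into a list and keeps only the words whose tag starts with a given prefix. `select_tags` is a hypothetical helper written for this page.

```python
# The tagged output shown above, copied as a literal so that this runs without a JVM.
tagged = [
    ('오류', 'NNG'), ('보고', 'NNG'), ('는', 'JX'), ('실행', 'NNG'),
    ('환경', 'NNG'), (',', 'SP'), ('에러', 'NNG'), ('메세지', 'NNG'),
    ('와', 'JKM'), ('함께', 'MAG'), ('설명', 'NNG'), ('을', 'JKO'),
    ('최대한', 'NNG'), ('상세히', 'MAG'), ('!', 'SF'), ('^^', 'EMO'),
]


def select_tags(pairs, prefix):
    """Return the words whose tag starts with prefix, in their original order."""
    return [word for word, tag in pairs if tag.startswith(prefix)]


print(select_tags(tagged, 'NN'))
print(select_tags(tagged, 'MA'))
```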

