llama - deptno/deptno.github.io GitHub Wiki

llama

  1. Request a download URL from Meta's llama page
  2. Run download.sh from github.com
  • Enter the link that was delivered by email
  3. Clone llama.cpp
  • Running pip install -r requirements.txt fails with a PyTorch CUDA version error
    • Resolved by installing PyTorch separately
  • python convert.py [folder of the downloaded llama model]
    • A gguf file is generated
  • (optional) ./quantize file.gguf 2
    • This is called quantization: it converts the model weights from f16 to a lower-precision integer type such as int8, which reduces resource usage
  • ./main -m gguf_location.gguf -p 'query'
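
The quantization step above can be illustrated with a minimal Python sketch. This is not how llama.cpp implements it (its quantized formats work block by block and differ in detail); it only shows the general idea of mapping floating-point weights to int8 with a single scale factor. The function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, chosen for illustration.

```python
def quantize_int8(values):
    """Symmetric quantization: map floats to integers in [-127, 127].

    Hypothetical helper for illustration only, not part of llama.cpp.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # One scale factor for the whole list; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]


# Made-up sample weights standing in for a slice of f16 model weights.
weights = [0.12, -0.5, 0.33, 0.0, 0.49]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 codes:", q)
print("max round-trip error:", max_error)
```

Each weight is stored as one small integer plus a shared scale, which is where the memory saving comes from; the price is a rounding error of at most half a scale step per value.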
