shisa-7b-v1 Release Tracking
Public on Wed 2023-12-06 21:55 JST
Searches:
- https://twitter.com/search?q=shisa%207b&src=typed_query
- https://twitter.com/search?q=shisa%207b&src=typed_query&f=live
Articles
- 2024-03-21 https://sakana.ai/evolutionary-model-merge/
  - Sakana AI announces Evolutionary Model Merge, a new way of merging models. They use shisa-gamma-7b-v1 as the Japanese base model for their EvoLLM-JP and EvoVLM-JP foundation models. AFAIK https://arxiv.org/abs/2403.13187 is the first time Shisa has shown up on arXiv, so that's fun. GitHub repo here: https://github.com/SakanaAI/evolutionary-model-merge
- 2023-12-29 https://note.com/oshizo/n/n3d7954400a00
  - A review that includes shisa-gamma-7b-v1, showing how it holds its own against larger models
- 2023-12-20 https://note.com/peter_lightblue/n/ne08a7c8cc47a
  - Karasu 7B and Qarasu 14B released, using our shisa datasets as part of their training, as well as shisa-7b-v1 as the base model for Karasu 7B
- 2023-12-20 https://qiita.com/wayama_ryousuke/items/105a164e5c80c150caf1#appendix-3-%E3%81%95%E3%82%89%E3%81%AB%E4%BB%96%E3%81%AE%E3%83%A2%E3%83%87%E3%83%AB%E3%82%82%E8%A9%95%E4%BE%A1%E3%81%97%E3%81%A6%E3%81%BF%E3%81%9F
  - A great writeup of ELYZA-tasks-100 testing, including how Shisa compares
- 2023-12-11 https://note.com/alexweberk/n/naa9c266ae690
  - Review: EN+JA comparison testing
- 2023-12-10 https://qiita.com/isanakamishiro2/items/f3b09b5f738adf6572cc
  - Review: JA testing on Databricks
- 2023-12-09 https://note.com/npaka/n/n24c44bc4bfd6
  - Detailed tutorial on running shisa-7b-v1 in a Google Colab (a minimal transformers sketch follows this list)
Discussion
- 12-06 @jondurbin announcement: https://twitter.com/jon_durbin/status/1732390203115929691
- 12-07 r/LocalLLaMA announcement: https://www.reddit.com/r/LocalLLaMA/comments/18cwh4n/shisa_7b_a_new_jaen_bilingual_model_based_on/
- 12-08 https://twitter.com/webbigdata/status/1733044645687595382
- 12-09 https://twitter.com/kam0shika/status/1733301241080643928
  - kam0shika is https://huggingface.co/kunishou (see also: https://twitter.com/kun1em0n)
- 12-09 https://twitter.com/hmtd223/status/1733353070329987477
- 12-09 https://twitter.com/npaka123/status/1733441824948404488
- 12-09 https://twitter.com/WMjjRpISUEt2QZZ/status/1733500734354952401
  - Japanese GGUF quant https://huggingface.co/mmnga/shisa-7b-v1-gguf (works?! see the llama-cpp-python sketch at the end of this page)
- 12-11 https://twitter.com/alexweberk/status/1734023203130159264
  - Tried out the Colab; useful tests in EN + JA
- 12-11 https://twitter.com/icoxfog417/status/1734001536915698102
  - 66 retweets and 246 likes pointing to our analysis, really?
- 12-11 https://twitter.com/mutaguchi/status/1734211633147433061
  - does he know base models are just pre-trained, w/ no instruction tuning?