OpenHermes 2.5
📌 Important Facts
- Based on Mistral 7B; quantized builds run well on most laptops (~8–16 GB RAM).
- Trained for general-purpose instruction following, including reasoning and conversational use.
- Licensed under the Apache 2.0 License.
- Personal Evaluation: One of the best open-source general chat models in the 7B class. Solid at following instructions and decent at simple code tasks, but not optimized for code-heavy outputs.
🔍 Overview
OpenHermes 2.5 is a fine-tuned variant of Mistral 7B, developed to improve instruction-following, natural language reasoning, and conversation fluency. It combines the strong base capabilities of Mistral with curated fine-tuning datasets aimed at creating a helpful, safer assistant with broad general-purpose utility.
This version reflects updates in training techniques and dataset quality over previous OpenHermes iterations, resulting in improved output coherence and alignment with user intent.
🔧 Key Features
- Instruction-Tuned: Trained for single- and multi-turn instructions, designed to follow natural-language queries and prompts with minimal context.
- General-Purpose Performance: Performs well across a variety of tasks, including writing, summarization, reasoning, creative writing, and basic programming.
- Dialogue-Ready: Tuned to handle multi-turn conversations, including memory of prior context (within its token window).
- Safety-Conscious: Includes preference optimization and filtering techniques to reduce harmful, biased, or off-topic completions.
- Language Support: Primarily English, but can handle prompts in other European languages with moderate reliability.
- Open Weight Availability: Can be pulled via Ollama and other platforms that support Mistral-format models (see the sketch below).
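Because the weights ship in an Ollama-compatible form, a local Ollama server can serve the model over its REST API. The following is a minimal sketch under assumptions: the model is presumed to have been pulled under the `openhermes` tag (e.g., via `ollama pull openhermes`), the server is presumed to be listening on its default address `http://localhost:11434`, and the prompt is illustrative only.

```python
# Minimal sketch: query a locally served OpenHermes 2.5 model via Ollama's REST API.
# Assumes the model tag "openhermes" (may differ in your setup) and an Ollama
# server running on its default address.
import requests

def ask_openhermes(prompt: str, model: str = "openhermes") -> str:
    """Send a single-turn prompt and return the generated text."""
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["response"]

if __name__ == "__main__":
    print(ask_openhermes("Summarize the Apache 2.0 License in two sentences."))
```

Setting `stream` to `False` returns the whole completion as a single JSON object, which keeps the example simple; the same endpoint also supports streamed responses.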
💼 Use Cases
OpenHermes 2.5 is ideal for:
- Chatbots and personal AI assistants
- Summarization and rewriting tools
- Creative writing support
- Educational or tutoring-style applications
- Code-adjacent tasks, e.g., code explanation or basic generation (see the chat sketch below)
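For the dialogue and code-explanation use cases, Ollama's chat endpoint keeps the conversation history explicit, which matches the model's multi-turn tuning. The sketch below is illustrative only: the `openhermes` model tag, the system prompt, and the example code snippet are assumptions, and a local Ollama server on the default port is presumed to be running.

```python
# Minimal sketch: a multi-turn, code-explanation style exchange via Ollama's
# chat endpoint. Model tag and prompts are illustrative assumptions.
import requests

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # default Ollama address

messages = [
    {"role": "system", "content": "You are a concise programming tutor."},
    {"role": "user", "content": "Explain what this Python line does: x = [i * i for i in range(10)]"},
]

reply = requests.post(
    OLLAMA_CHAT_URL,
    json={"model": "openhermes", "messages": messages, "stream": False},
    timeout=120,
).json()["message"]

print(reply["content"])

# Keep the assistant turn in the history so a follow-up question stays in context
# (within the model's token window).
messages.append(reply)
messages.append({"role": "user", "content": "Rewrite it using a plain for loop."})
```

Appending each assistant reply to the message list is what gives the model its "memory" of prior turns; context only persists as long as the history fits in the token window.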
🧠 Architecture
- Built on Mistral 7B, a dense transformer using grouped-query attention and sliding-window attention.
- Context length: 8K tokens.
- Quantization: commonly run as Q4_K_M or similar for efficient local inference.
- Model checkpoints are available on Hugging Face and in Ollama-compatible (GGUF) formats.
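As an alternative to Ollama, a quantized GGUF checkpoint can be loaded directly with llama-cpp-python. This is a minimal sketch under assumptions: the file name is a placeholder for whichever Q4_K_M (or similar) GGUF build is downloaded from Hugging Face, and the thread count should be tuned to the local machine.

```python
# Minimal sketch: run a quantized OpenHermes 2.5 checkpoint locally with
# llama-cpp-python. The model_path below is a placeholder; download a
# Q4_K_M (or similar) GGUF file from Hugging Face first.
from llama_cpp import Llama

llm = Llama(
    model_path="./openhermes-2.5-mistral-7b.Q4_K_M.gguf",  # placeholder path
    n_ctx=8192,   # matches the model's 8K-token context window
    n_threads=8,  # adjust to the local CPU
)

# create_chat_completion applies a chat template (read from the GGUF metadata in
# recent llama-cpp-python versions), so manual prompt formatting is usually unnecessary.
result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain grouped-query attention in one paragraph."}],
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])
```

A Q4_K_M quantization of a 7B model is roughly 4–5 GB on disk, which is what makes the ~8–16 GB laptop figure under Important Facts realistic.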
📄 License
Apache License 2.0: https://www.apache.org/licenses/LICENSE-2.0