OpenHermes 2.5

📌 Important Facts

Based on Mistral 7B; quantized builds run well on most laptops (~8–16 GB RAM).

Trained for general-purpose instruction following, including reasoning and conversational use.

Licensed under the Apache 2.0 License.

Personal Evaluation: One of the best open-source general chat models in the 7B class. Solid at following instructions, decent at simple code tasks but not optimized for code-heavy outputs.

🔍 Overview

OpenHermes 2.5 is a fine-tuned variant of Mistral 7B, developed to improve instruction-following, natural language reasoning, and conversation fluency. It combines the strong base capabilities of Mistral with curated fine-tuning datasets aimed at creating a helpful, safer assistant with broad general-purpose utility.

This version reflects updates in training techniques and dataset quality over previous OpenHermes iterations, resulting in improved output coherence and alignment with user intent.

🔧 Key Features

  • Instruction-Tuned: Trained for single- and multi-turn instructions, designed to follow natural language queries and prompts with minimal context.

  • General-Purpose Performance: Performs well across a variety of tasks — writing, summarization, reasoning, creative writing, and basic programming.

  • Dialogue-Ready: Tuned to handle multi-turn conversations, including memory of prior context (within its token window).

  • Safety-Conscious: Includes preference optimization and filtering techniques to reduce harmful, biased, or off-topic completions.

  • Language Support: Primarily English, but can handle prompts in other European languages with moderate reliability.

  • Open Weight Availability: Can be pulled via Ollama and other platforms that support Mistral-format models (see the sketch after this list).
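
As a rough illustration, the following Python snippet queries a locally pulled OpenHermes 2.5 model through Ollama's REST API. It assumes Ollama is running on its default port and that the model has already been pulled under the `openhermes` tag (the exact tag may differ on your system); treat it as a minimal sketch, not the project's integration code.

```python
# Minimal sketch: querying a locally pulled OpenHermes 2.5 model through
# Ollama's REST API. Assumes Ollama is running on its default port (11434)
# and that the model has been fetched first, e.g. with `ollama pull openhermes`
# (check the Ollama library page for the exact tag).
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "openhermes",  # assumed tag; adjust to your local tag
        "prompt": "Summarize the difference between unit and integration tests.",
        "stream": False,        # return one complete JSON object
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["response"])  # generated completion text
```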

💼 Use Cases

OpenHermes 2.5 is ideal for:

  • Chatbots and personal AI assistants

  • Summarization and rewriting tools

  • Creative writing support

  • Educational or tutoring-style applications

  • Code-adjacent tasks (e.g., code explanation or basic generation; see the chat sketch after this list)
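
To illustrate the chatbot and code-explanation use cases, here is a hedged sketch of a two-turn conversation over Ollama's chat endpoint. The model tag, prompts, and timeout are assumptions made for the example, not project code.

```python
# Minimal multi-turn chat sketch for code explanation via Ollama's REST API
# (/api/chat). Assumes a running Ollama instance with the model pulled under
# the assumed tag "openhermes"; the prompts are illustrative only.
import requests

OLLAMA_CHAT = "http://localhost:11434/api/chat"
MODEL = "openhermes"  # assumed tag

messages = [
    {"role": "user",
     "content": "Explain what this Python line does: x = [i * i for i in range(10)]"},
]

reply = requests.post(
    OLLAMA_CHAT,
    json={"model": MODEL, "messages": messages, "stream": False},
    timeout=120,
).json()["message"]
messages.append(reply)  # keep the assistant turn so the next question has context

# Follow-up question in the same conversation (multi-turn context).
messages.append({"role": "user", "content": "Rewrite it as an explicit for loop."})
reply = requests.post(
    OLLAMA_CHAT,
    json={"model": MODEL, "messages": messages, "stream": False},
    timeout=120,
).json()["message"]
print(reply["content"])
```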

🧠 Architecture

Built on Mistral 7B — a dense transformer with grouped-query attention and sliding window attention.

Context length: 8K tokens

Quantization: commonly run as Q4_K_M or similar quantized builds for efficient local inference (see the sketch below)

Model checkpoints are available on Hugging Face and in Ollama-compatible formats.
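
As a local-inference sketch tying the points above together, a quantized (Q4_K_M) checkpoint can be loaded with its full 8K context via the llama-cpp-python bindings. The file name below is a placeholder; download an actual OpenHermes 2.5 GGUF checkpoint and point `model_path` at it.

```python
# Sketch of loading a quantized (Q4_K_M) build locally with llama-cpp-python
# and the full 8K context window. The file name is a placeholder; point
# model_path at a real OpenHermes 2.5 GGUF checkpoint.
from llama_cpp import Llama

llm = Llama(
    model_path="openhermes-2.5-mistral-7b.Q4_K_M.gguf",  # placeholder path
    n_ctx=8192,  # matches the model's 8K-token context length
)

out = llm(
    "Explain grouped-query attention in two sentences.",
    max_tokens=128,
)
print(out["choices"][0]["text"])
```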

📄 License

Apache 2.0 License: https://www.apache.org/licenses/LICENSE-2.0