Groundedness Evaluator - Azure/azureml-assets GitHub Wiki

Groundedness-Evaluator

Overview

Score range: Integer [1-5], where 1 is the lowest quality and 5 is the highest quality.
What is this metric? Groundedness measures how well the generated response aligns with the given context in a retrieval-augmented generation (RAG) scenario, focusing on its relevance and accuracy with respect to that context. If a query is present in the input, the recommended scenario is question answering; otherwise, the recommended scenario is summarization.
How does it work? The groundedness metric is calculated by instructing a language model to follow a definition and a set of grading rubrics, evaluate the user inputs, and output a score on a 5-point scale (higher means better quality). Learn more about the definition and grading rubrics.
When to use it? Use the groundedness metric in retrieval-augmented generation (RAG) scenarios, including question answering and summarization, when you need to verify that AI-generated responses align with and are supported by the provided context. It is essential for applications where contextual accuracy is key, such as information retrieval, question answering, and summarization.
What does it need as input? Query (Optional), Context, Response
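The judge model described above returns its verdict as text containing an integer on the 1-5 scale. As an illustrative sketch only (the evaluator's actual output format is defined by its prompty template, which is not shown here, and the pass threshold of 3 is a hypothetical choice), post-processing that raw output might look like this:

```python
import re


def parse_groundedness_score(raw_output: str) -> int:
    """Extract a 1-5 groundedness score from a judge model's raw text output.

    Hypothetical helper: the real evaluator's output format is set by its
    prompty template and may differ.
    """
    match = re.search(r"\b([1-5])\b", raw_output)
    if match is None:
        raise ValueError(f"no score in range 1-5 found in: {raw_output!r}")
    return int(match.group(1))


def is_grounded(score: int, threshold: int = 3) -> bool:
    """Map the 5-point scale to a pass/fail verdict (threshold is an assumption)."""
    return score >= threshold
```

A caller would feed the judge model's completion string to `parse_groundedness_score` and, if a binary verdict is needed, pass the result through `is_grounded`.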

Version: 4

Tags

hiddenlayerscanned

View in Studio: https://ml.azure.com/registries/azureml/models/Groundedness-Evaluator/version/4

Properties

is-promptflow: True

is-evaluator: True

show-artifact: True

_default-display-file: ./groundedness_without_query.prompty
