hallucination - chunhualiao/public-docs GitHub Wiki

Summary of Strategies

  • Provide complete input (code, callees)
  • Ask LLMs to check whether the input is sufficient and to refuse the request if it is not; use self- or third-party critique and repair
  • Use and test the generated docs: generate unit tests from the documentation, then run them against the original functions
  • Cross-check with static analysis
  • Human-in-the-loop: collect labels or ratings as feedback to fine-tune the models

Generating documentation for C++ (or any other language) using large language models (LLMs) is a powerful technique, but ensuring correctness and avoiding hallucinations (i.e., plausible-sounding but incorrect outputs) can be challenging. Below are several strategies and best practices:


1. Provide High-Quality, Context-Rich Input

  1. Give the model the complete source or relevant snippets (a prompt-assembly sketch follows this list):

    • If the model doesn’t see all necessary classes, functions, or usage examples, it is more likely to produce incorrect or incomplete documentation.
    • Ensure that all dependencies or relevant parts of the codebase are visible to the model—either through direct input (if feasible) or via a retrieval method that surfaces relevant snippets or references.
  2. Include relevant comments and docstrings:

    • If your code already contains comments or partial documentation, feed them to the model as well.
    • The model can then refine existing docs instead of inventing them from scratch.
  3. Provide usage context:

    • If a function is part of a library or framework with established conventions, share that context.
    • For example, if your class is an adapter pattern for a known library, mention that explicitly.
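
To make this concrete, here is a minimal Python sketch of assembling such a prompt: the target function, its callees, and any existing comments are spliced into one request that tells the model to rely only on the code shown. The function names and comment text are invented placeholders; a real pipeline would extract them from the codebase (for example with libclang or ctags).

```python
# Minimal sketch of assembling a context-rich prompt for one C++ function.
# All code snippets below are placeholder strings; in a real pipeline they
# would be extracted from the codebase (e.g., with libclang or ctags).

def build_doc_prompt(target_fn: str, callees: list[str], existing_comments: str) -> str:
    parts = [
        "You are documenting C++ code. Use ONLY the code shown below.",
        "If information is missing, say so instead of guessing.",
        "\n--- Target function ---\n" + target_fn,
    ]
    for i, callee in enumerate(callees, 1):
        parts.append(f"\n--- Callee {i} ---\n" + callee)
    if existing_comments:
        parts.append("\n--- Existing comments to refine ---\n" + existing_comments)
    parts.append("\nWrite Doxygen-style documentation for the target function.")
    return "\n".join(parts)

# Example usage with invented snippets:
prompt = build_doc_prompt(
    target_fn="int clamp_add(int a, int b);  // declared in math_utils.h",
    callees=["bool would_overflow(int a, int b);"],
    existing_comments="// Saturating addition used by the fixed-point layer.",
)
print(prompt)
```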

2. Use a Chunking and Retrieval Approach

  1. Chunk the source code:

    • Large codebases can exceed token limits if fed in full. Break the code into logical chunks (e.g., classes, modules, or namespaces).
    • For each chunk, consider generating or refining documentation separately.
  2. Retrieval-Augmented Generation (RAG):

    • Before calling the LLM, use a retrieval system (e.g., a vector database) to find the most relevant code snippets or references.
    • Feed those into the model as context so that the LLM can ground its output in the actual codebase.
    • This approach reduces the chance of the model hallucinating because it is “reminded” of real code snippets; a toy retrieval sketch follows this list.
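
The sketch below uses TF-IDF similarity from scikit-learn as a stand-in for an embedding model and vector database. The code chunks and the query are invented placeholders; a real pipeline would chunk by class or namespace and retrieve with proper embeddings.

```python
# Toy retrieval sketch: rank code chunks by TF-IDF similarity to a query and
# splice the top hits into the prompt. The chunks are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

chunks = [
    "class RingBuffer { void push(int v); int pop(); bool empty() const; };",
    "namespace net { int open_socket(const char* host, int port); }",
    "template <typename T> T clamp(T v, T lo, T hi);",
]
query = "document the RingBuffer::push function"

matrix = TfidfVectorizer().fit_transform(chunks + [query])
scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()

top_k = scores.argsort()[::-1][:2]            # two most relevant chunks
context = "\n\n".join(chunks[i] for i in top_k)
prompt = f"Relevant code:\n{context}\n\nTask: {query}"
print(prompt)
```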

3. Ground the Model in Established References

  1. Reference library documentation or language specs:

    • Provide authoritative sources (e.g., cppreference.com sections) relevant to the code at hand.
    • If a function uses specific standard library constructs, feed the model the official documentation excerpt to anchor its explanations.
  2. Link to your project’s style guides or design docs:

    • If your project has internal references—like a design doc or style guide—feed that text to the model.
    • This provides a “truth” baseline that the LLM can rely on when generating documentation.
  3. Leverage code comments as “mini ground-truth”:

    • If certain behaviors are well documented in comments (e.g., performance constraints, multi-threading assumptions), highlighting those for the LLM reduces the odds of misinterpretation; a small comment-harvesting sketch follows this list.
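
As one way to exploit those comments, the sketch below harvests line comments that sit directly above a declaration so they can be passed to the model as ground truth. The header text is an invented example and the regex is deliberately crude; a production tool would pair comments with declarations via libclang.

```python
# Sketch: harvest existing comments as "mini ground-truth" to anchor the model.
# Matches runs of // comments that sit directly above a declaration.
import re

header = """
// Thread-safe; may block up to timeout_ms.
int fetch(int key, int timeout_ms);

// Does NOT throw; returns -1 on error.
int try_fetch(int key);
"""

pattern = re.compile(r"((?://[^\n]*\n)+)\s*([^\n;{]+[;{])")
ground_truth = [
    {"comment": m.group(1).strip(), "declaration": m.group(2).strip()}
    for m in pattern.finditer(header)
]

for item in ground_truth:
    print(item["declaration"], "->", item["comment"])
```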

4. Apply Verification and Testing Loops

  1. Use a human-in-the-loop approach:

    • Have developers or domain experts review and refine the generated documentation.
    • A quick pass can catch glaring errors or omissions before they make it into a final docs set.
  2. Automate tests to confirm correctness:

    • For example, generate usage examples from the documentation snippets, then compile and run them.
    • Confirm that the sample code compiles, executes as documented, and produces the expected outputs; a compile-check sketch follows this list.
  3. Cross-check with static analysis or compiler warnings:

    • Tools like Clang-Tidy or static analyzers can reveal any mismatch between the documentation’s claims and the code’s actual behavior.
    • If the doc claims “no exceptions are thrown” but the function is not declared noexcept (or visibly throws), that’s a red flag.
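
Here is a rough sketch of the compile-check idea from point 2: pull the C++ usage examples out of a generated doc and verify that they at least compile. It assumes a C++ compiler is available on PATH as c++; running the binaries and diffing their output against the documented behavior would be the natural next step.

```python
# Sketch: extract C++ usage examples from a generated doc and check that they
# compile. Assumes a compiler named "c++" is on PATH (clang++ or g++ both work).
import pathlib
import re
import subprocess
import tempfile

FENCE = "`" * 3  # triple backtick, spelled out to avoid nesting code fences here

generated_doc = (
    "Example usage:\n"
    f"{FENCE}cpp\n"
    "#include <algorithm>\n"
    "#include <iostream>\n"
    'int main() { std::cout << std::clamp(7, 0, 5) << "\\n"; }\n'
    f"{FENCE}\n"
)

examples = re.findall(FENCE + r"cpp\n(.*?)" + FENCE, generated_doc, flags=re.S)

for i, code in enumerate(examples):
    src = pathlib.Path(tempfile.mkdtemp()) / f"example_{i}.cpp"
    src.write_text(code)
    result = subprocess.run(
        ["c++", "-std=c++17", "-fsyntax-only", str(src)],
        capture_output=True,
        text=True,
    )
    status = "OK" if result.returncode == 0 else "FAILED"
    print(f"example {i}: {status}")
    if result.stderr:
        print(result.stderr)
```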

5. Use Model Selection and Fine-Tuning Strategies

  1. Pick domain-specific or fine-tuned models where possible:

    • General-purpose LLMs can be good for broad tasks, but if you can fine-tune or customize a model using your project’s own code-and-doc pairs, you’ll reduce hallucinations and increase accuracy.
  2. Parameter selection and prompt engineering:

    • Experiment with temperature (lower values yield more deterministic, less “creative” output).
    • Use system prompts or special instructions that emphasize factual correctness over creativity; a short sketch follows this list.
  3. Continuously refine with iterative feedback:

    • If you find repeated errors—say, misunderstandings about a certain library or pattern—update your training data or prompt instructions to clarify those points in subsequent generations.
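
The sketch below illustrates points 1 and 2 using the OpenAI Python SDK (version 1.x) purely as an example; any chat-completion API with a system prompt and a temperature parameter works the same way. The model name is illustrative, and an API key is assumed to be set in the environment.

```python
# Sketch of prompt and parameter choices that bias toward factual output.
# The SDK, model name, and wording are illustrative, not a recommendation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM = (
    "You document C++ code. Base every statement on the code provided. "
    "If something cannot be determined from the code, write 'UNKNOWN' "
    "rather than guessing."
)

def document(code: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",      # illustrative model name
        temperature=0.1,          # low temperature: more deterministic, less "creative"
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": f"Document this function:\n{code}"},
        ],
    )
    return response.choices[0].message.content
```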

6. Enforce Transparency in the Output

  1. Encourage the model to cite code references:

    • Ask the LLM to reference line numbers or file paths.
    • This reduces the likelihood of purely invented statements and makes it easier to verify the doc’s claims.
  2. Include disclaimers for uncertain outputs:

    • If the model is not sure about a particular function’s use case or edge cases, encourage it to flag the statement explicitly as an “Assumption” or “Potential Issue.”
    • This signals to human reviewers where additional verification is needed; a small prompt-and-check sketch follows this list.
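
One lightweight way to enforce this is sketched below: an instruction block that asks for [file:line] citations and explicit assumption markers, plus a small check that routes docs with no citations (or with assumptions) to human review. The citation format and marker text are arbitrary conventions you would define for your own pipeline.

```python
# Sketch: ask the model to cite code references and mark assumptions, then
# flag generated docs that lack citations or contain assumption markers.
import re

TRANSPARENCY_INSTRUCTIONS = (
    "For every claim, cite the source as [file:line]. "
    "Prefix anything you cannot verify from the code with 'ASSUMPTION:'."
)

CITATION = re.compile(r"\[[\w./-]+:\d+\]")   # e.g. [ring_buffer.h:42]

def needs_review(doc_text: str) -> bool:
    """Flag docs with no citations or with explicit assumption markers."""
    return not CITATION.search(doc_text) or "ASSUMPTION:" in doc_text

print(needs_review("push() appends a value [ring_buffer.h:42]."))  # False
print(needs_review("ASSUMPTION: push() is thread-safe."))           # True
```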

7. Post-Processing and Analysis

  1. Linter or rule-based checks on the generated docs:

    • Simple rule-based systems can check for contradictory statements (e.g., “This function returns void” vs. “It returns an integer”).
    • Highlight and correct these conflicts automatically or send them for human review; a consistency-check sketch follows this list.
  2. Document structure templates:

    • Provide a standard doc structure (e.g., “Summary,” “Parameters,” “Return Value,” “Exceptions,” “Example usage,” etc.).
    • Using a consistent format helps both the LLM and your reviewers quickly identify missing or incorrect sections.
  3. Monitor doc usage and user feedback:

    • If users frequently ask about certain documented functions or report confusion, you know the doc might need refinement.
    • Collect feedback to iteratively improve the generation pipeline.
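
As an example of a rule-based check from point 1, the sketch below compares the declared return type of a C++ function against what the generated “Return Value” section claims. The declaration, doc text, and regex are simplified illustrations; real checks would cover more claim types (exceptions, thread safety, parameter counts).

```python
# Sketch of a rule-based consistency check: does the doc's return-value claim
# contradict the declared return type?
import re

def declared_return_type(declaration: str) -> str:
    # Crude: take the first token of the declaration ("int foo(...)" -> "int").
    return declaration.strip().split()[0]

def doc_claims_void(doc: str) -> bool:
    return bool(re.search(r"returns\s+void|does not return a value", doc, re.I))

declaration = "int count_items(const std::vector<int>& v);"
doc = "Return Value: This function returns void."

if doc_claims_void(doc) and declared_return_type(declaration) != "void":
    print("Contradiction: doc says void, declaration says",
          declared_return_type(declaration))
```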

Putting It All Together

By combining high-quality context, retrieval methods, human reviews, and iterative refinement, you can significantly improve the accuracy of LLM-generated C++ documentation and reduce hallucinations. Always view LLM output as a draft that needs validation and testing—especially for complex or business-critical APIs. With these strategies in place, LLMs can become powerful documentation aids that save time and ensure codebases remain well-documented and understandable.