adaptive testing: related work

Finding mature, open-source projects that specifically combine AI-driven (LLM) question generation with adaptive testing algorithms (such as IRT) is still challenging, since this is a relatively new intersection. Here is a breakdown of relevant open-source projects and tools based on the search results:

1. Established Open Source Adaptive Testing Platforms (IRT-based, non-LLM specific):

These platforms provide the core adaptive logic (IRT-based item selection and scoring) but typically expect a pre-calibrated bank of questions rather than generating them dynamically with an LLM. You could potentially integrate LLM question generation into these; a minimal sketch of the item-selection loop they implement follows the list below.

  • Concerto Platform:
    • Developed by The Psychometrics Centre at the University of Cambridge.
    • A well-regarded open-source (GPL) platform for creating various online assessments, including sophisticated IRT-based Computerized Adaptive Tests (CATs).
    • Uses the R statistical language for backend calculations, making it flexible for implementing various psychometric models.
    • Designed to be extensible, potentially allowing integration with external question sources (like an LLM API).
  • R Packages:
    • RSCAT: An R package for CAT simulation and implementation. R is widely used in psychometrics, and other packages such as catR and mirtCAT cover similar ground.
  • OSCATS (Open Source Computerized Adaptive Testing System):
    • Mentioned by the International Association for Computerized Adaptive Testing (IACAT) as an open-source option. Finding its specific repository or current status might require further searching.
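
To make the "core adaptive logic" above concrete, here is a minimal, self-contained sketch of the kind of IRT item-selection loop these platforms implement: a 2PL model with maximum-information item selection and a crude EAP ability update. It is illustrative only; the toy item parameters, ability grid, and function names are assumptions, not code from Concerto, RSCAT, or OSCATS.

```python
import numpy as np

def p_correct(theta, a, b):
    """2PL item response function: probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a**2 * p * (1.0 - p)

def estimate_theta(responses, items, grid=np.linspace(-4, 4, 161)):
    """Crude EAP ability estimate on a grid with a standard-normal prior."""
    prior = np.exp(-grid**2 / 2)
    like = np.ones_like(grid)
    for (a, b), correct in zip(items, responses):
        p = p_correct(grid, a, b)
        like *= p if correct else (1.0 - p)
    post = prior * like
    return float(np.sum(grid * post) / np.sum(post))

def next_item(theta, bank, administered):
    """Pick the unadministered item with maximum information at theta."""
    best, best_info = None, -1.0
    for idx, (a, b) in enumerate(bank):
        if idx in administered:
            continue
        info = item_information(theta, a, b)
        if info > best_info:
            best, best_info = idx, info
    return best

# Toy pre-calibrated bank: (discrimination a, difficulty b) pairs.
bank = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 1.5), (0.9, -0.5)]
theta, administered, items, responses = 0.0, set(), [], []
for _ in range(3):                      # administer three items adaptively
    idx = next_item(theta, bank, administered)
    administered.add(idx)
    correct = True                      # stand-in for the examinee's answer
    items.append(bank[idx]); responses.append(correct)
    theta = estimate_theta(responses, items)
    print(f"item {idx}, updated theta = {theta:.2f}")
```

An LLM could in principle replace the fixed bank, but each generated item would still need calibrated parameters (a, b) before the information-based selection and ability update above would be meaningful.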

2. Open Source Projects Incorporating LLMs for Assessment/Generation (Closer to your goal):

These projects leverage LLMs in the educational assessment space, though they might not be end-to-end adaptive testing systems themselves.

  • QGen Studio:
    • Described in recent research papers (April 2025) as an adaptive question-answer generation, training, and evaluation platform using LLMs.
    • Aims to enable users to create custom QA datasets with LLMs and fine-tune models.
    • The papers state it "will be open-sourced soon."
    • An IBM/qgen-studio repository exists on GitHub under an Apache-2.0 license, suggesting an open-source release is intended or in progress. Keep an eye on this project.
  • AERA Chat:
    • Described in research papers (October 2024) as the "first open-source interactive platform explicitly designed to utilize LLMs in explainable student answer scoring."
    • Focuses on automated scoring, generating rationales for grading decisions (especially for free-text answers), and providing tools for evaluating/annotating these rationales.
    • While it uses LLMs heavily for the assessment side, it is not described as a full CAT engine that adaptively selects the next question from ability estimates; it is nonetheless a significant open-source component for AI in assessment.
  • csv610/mcq_generator (GitHub):
    • A project focused specifically on generating Multiple-Choice Questions (MCQs) using LLMs (OpenAI or Ollama).
    • Allows specifying the subject, difficulty, and number of questions; provides explanations and prerequisite knowledge.
    • This could serve as the question-generation component within a larger adaptive system but isn't the adaptive engine itself; a hedged sketch of this kind of LLM-driven generation appears after this list.
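
As an illustration of how a generator like this could slot into a larger system, here is a sketch of prompting an LLM for difficulty-tagged MCQs via the OpenAI Python client. The prompt wording, model name, and JSON schema are assumptions for illustration, not the actual interface of csv610/mcq_generator.

```python
import json
from openai import OpenAI  # assumes the OpenAI Python SDK (v1+) is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_mcqs(subject: str, difficulty: str, n: int = 3) -> list[dict]:
    """Ask an LLM for n multiple-choice questions as structured JSON.

    The prompt, model name, and output schema here are illustrative
    assumptions, not the interface of csv610/mcq_generator.
    """
    prompt = (
        f"Write {n} multiple-choice questions on {subject} at {difficulty} "
        "difficulty. Return a JSON object with a single key 'questions' "
        "whose value is an array; each element must have the keys "
        "'question', 'options' (4 strings), 'answer', and 'explanation'."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)["questions"]

# Example: questions = generate_mcqs("binary search trees", "intermediate")
```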

3. Less Relevant (But Related) Open Source AI Projects:

  • Projects listed under "AI Testing Tools" (like Selenium, Appium, Robot Framework, Katalon) are primarily for software testing automation, not educational assessment.
  • Repositories like LLM-Testing/LLM4SoftwareTesting, codelion/adaptive-classifier, SakanaAI/self-adaptive-llms are research-focused on LLM applications in software testing, general classification, or LLM adaptation mechanisms, not specifically educational CAT.

In summary:

  • For the adaptive testing engine (IRT logic), Concerto is a strong, established open-source option.
  • For LLM-powered question generation, you might use libraries/projects like mcq_generator or build custom logic using LLM APIs.
  • For LLM-powered answer evaluation, AERA Chat provides an open-source framework.
  • A potentially integrated solution is QGen Studio, which aims to combine LLM QA generation and evaluation, although its full open-source release and features need monitoring.

You would likely need to combine elements from these different areas: for example, using Concerto for the adaptive framework and integrating your own LLM-based question-generation and evaluation modules (perhaps inspired by, or reusing parts of, QGen Studio or AERA Chat if they prove available and suitable). A toy sketch of such an integration loop follows.
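
Purely as a sketch of how the pieces might fit together, the loop below lets the IRT ability estimate drive the difficulty label passed to the LLM generator. The function names generate_mcqs and estimate_theta refer to the hypothetical helpers sketched earlier on this page, ask_examinee is a placeholder for whatever UI collects the answer, and the difficulty thresholds and provisional item parameters are arbitrary assumptions.

```python
def difficulty_label(theta: float) -> str:
    """Map an IRT ability estimate onto a coarse prompt-level difficulty.
    The thresholds are arbitrary assumptions for illustration."""
    if theta < -0.5:
        return "easy"
    if theta < 0.5:
        return "intermediate"
    return "hard"

def adaptive_llm_session(subject: str, rounds: int = 3) -> float:
    """Toy loop combining the two earlier sketches: the ability estimate
    selects the difficulty label passed to the LLM generator. Provisional
    (a, b) item parameters are hard-coded per label; in a real system,
    LLM-generated items would need calibration (or a separate difficulty
    estimation step) before they could feed an IRT update."""
    theta, items, responses = 0.0, [], []
    provisional = {"easy": (1.0, -1.0), "intermediate": (1.0, 0.0), "hard": (1.0, 1.0)}
    for _ in range(rounds):
        label = difficulty_label(theta)
        question = generate_mcqs(subject, label, n=1)[0]  # helper sketched above
        correct = ask_examinee(question)                  # hypothetical UI call
        items.append(provisional[label])
        responses.append(correct)
        theta = estimate_theta(responses, items)          # helper sketched above
    return theta
```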