FAQ - marcusgreen/moodle-qtype_aitext GitHub Wiki
AI Text FAQ
Frequently asked questions about AIText
AI Text is a Moodle question type that uses an external Large Language Model/AI System to evaluate student responses to quiz questions. This raises both teaching and technical issues.
Will it give accurate feedback?
TL;DR: mostly. LLM systems are inherently unreliable; they deliver responses built on statistical analysis of publicly available data. This data will include “well known” but sometimes entirely inaccurate information. They will also embed widely held prejudice, bias and misinformation. For this reason responses should always be regarded as preliminary, with an expectation that some will be misleading and/or wrong.
Why use something inherently inaccurate?
Two of the possible benefits of LLMs for feedback are that it arrives quickly and is mostly correct, and it can be made more correct with improved prompting. This raises the question of whether a quick, possibly incorrect answer is better than no answer at all, or than a delayed correct answer.
Should it be used for summative assessment?
This tool should never be used for “high stakes” summative assessments. It is designed to promote learning and to offer quick feedback. If the evaluation of learning is high stakes, e.g. it decides some significant benefit to a student, this tool should not be used.
Will it increase cheating?
What is and what is not cheating is subjective and depends on context. Since the dawn of educational technology, students have attempted to shortcut the need to learn content by doing things such as writing answers on their skin, using calculators for maths and copying from websites. This tool is no different.
Technical
Is it expensive to run?
Anecdote: my bill for 12 months' use of OpenAI ChatGPT, including making a set of questions available on a public website with instant sign up, is approximately USD $50. I have also made extensive use of Groq cloud, a high performance LLM system that offers access without financial cost (or at least I am not aware of any way to pay for it).
Once the cost of installation and maintenance has been covered, there is an ongoing cost for the LLM/Inference system. I do not have direct experience of measuring the costs of large numbers of students using Inference systems, but I am in contact with people who do and I will update this section as I get more information.
Typically Inference systems (e.g. OpenAI/ChatGPT) charge in units of millions of tokens. There are some high cost leading edge systems, but so far costs per student seem “reasonable”, i.e. within historical ranges of putting a computer on a desk, providing textbooks and the fractionalised cost of having a teacher in a classroom.
However, unlike those costs, Inference costs have fallen dramatically over the last two years and it is likely they will fall further. There is huge investment in alternatives to the GPU approach to Inference that is likely to significantly change the economics of inference.
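As an illustration of token-based pricing, the sketch below estimates a per-student cost. The price and token counts are made-up assumptions chosen to show the arithmetic, not any provider's actual rates:

```python
# Back-of-envelope estimate of LLM grading cost per student.
# Both constants below are illustrative assumptions, not real pricing.
PRICE_PER_MILLION_TOKENS = 0.50   # assumed combined input+output price, USD
TOKENS_PER_EVALUATION = 1_000     # assumed prompt + response size per answer

def cost_per_student(evaluations: int) -> float:
    """Estimated cost in USD for one student's graded responses."""
    total_tokens = evaluations * TOKENS_PER_EVALUATION
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# e.g. 200 graded answers over a course:
print(f"${cost_per_student(200):.2f}")  # → $0.10
```

Even with these rough numbers, the per-student figure is small compared with the historical costs mentioned above, which matches my experience so far.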
Throttling responses
AI systems can impose a limit on the number of requests they will process within a set amount of time, e.g. X thousand requests per second. Because quiz responses tend to arrive in bursts, for example when a whole class submits at once, this could result in hitting that sort of barrier.
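A common way client code handles this kind of limit is to retry throttled requests with exponential backoff. The sketch below is a generic illustration, not code from this plugin; `RateLimitError` is a hypothetical stand-in for whatever 429-style error a given provider actually raises:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for the provider's 'too many requests' error."""

def send_with_backoff(send, max_retries=5, base_delay=1.0):
    """Call send(); on a rate-limit error, wait with exponential
    backoff plus jitter and try again, up to max_retries attempts."""
    for attempt in range(max_retries):
        try:
            return send()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            # Double the wait each time; jitter spreads out retries
            # from many students so they do not all collide again.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

Spreading retries out like this smooths the burst from a class-sized group of simultaneous submissions into something the service can absorb.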
Will it be in the Plugins database?
I intend to submit it to the Moodle.org plugins database once I consider it sufficiently mature. The need for an external LLM adds a layer of complexity on top of a standard Moodle plugin. I want it to work with a variety of LLM systems and that means getting feedback from “real world use”. It is also necessary to manage expectations as LLM/AI systems are useful but are not “magic”.