Exploring Iterative Enhancement for Improving Learnersourced Multiple-Choice Question Explanations with Large Language Models
Large language models (LLMs) have demonstrated strong capabilities in language understanding and generation, and their potential in educational contexts is increasingly being explored. One promising area is learnersourcing, where students engage in creating their own educational content, such as multiple-choice questions.
Implementation
Source publication / research team or educational organization described in paper
Learning context
Higher education
AI role
Tutor
Outcome signal
Conceptual understanding
Registry Facets
- Higher education
- AI-supported learning
- Assessment / explanations
- LLM/Chat
- NLP / text classification
- Learning tool / resource design
- Assessment support
- Students
- Researchers
- Assessment / tutoring analytics
- Pre/post or experimental evidence
- Conceptual understanding
- Assessment / feedback quality
Implementing Organization
Source publication / research team or educational organization described in paper
New Zealand
Researchers, educators, instructors, or facilitators as described in the source publication
Learning Context
- Higher education
Tool / platform-supported learning activity
Group size: not specified in extracted text
Duration: not specified in extracted text
LLM/Chat, NLP / text classification, Assessment / tutoring analytics
- AI output reliability, hallucination, academic integrity, and age-appropriate use require safeguards.
Learner Profile
Higher education
Mixed or not explicitly specified; infer from target learner group and intervention design.
Varies by intervention; not specified unless the paper explicitly describes prerequisites.
Educational Intent
- Document the AI education intervention, course, tool, or resource described in the source publication.
- Extract the learner context, AI role, pedagogy, outcomes, and constraints for AAB registry comparison.
- Support AAB comparison across AI literacy, AI education, teacher training, higher education, and workforce contexts.
- Capture evidence maturity, transferability, and limitations rather than treating the publication as product endorsement.
- Not an AAB endorsement of the tool, curriculum, provider, or result.
- Not a direct replication record unless the source paper reports implementation details sufficient for replication.
AI Tool Description
LLM/Chat, NLP / text classification, Assessment / tutoring analytics
Language context discussed in source publication
- Tutor
- Evaluator
- Primary interaction pattern inferred from publication: Learning tool / resource design, Assessment support.
- AI capability focus: LLM/Chat, NLP / text classification, Assessment / tutoring analytics.
- Require human review of generated outputs and explicit guidance against over-reliance or answer copying.
Activity Design
- Review the publication’s reported context, learner group, AI tool or curriculum, implementation process, and outcome evidence.
- Map the case to AAB registry fields for comparison across educational levels and AI capability types.
- Use the source publication and PDF for any manual verification before public registry release.
- Human educators/researchers remain responsible for instructional design, supervision, interpretation, and ethical safeguards.
- AI systems or AI concepts provide the learning object, support tool, evaluator, simulator, or automation context depending on the paper.
- Tutoring / feedback-supported learning
- Registry extraction emphasizes explicit learning goals, observed outcomes, constraints, and safety limitations.
Observed Challenges
- AI output reliability, hallucination, academic integrity, and age-appropriate use require safeguards.
Design Adaptations
- Case classified under: Published empirical study.
- Pedagogical pattern: Tutoring / feedback-supported learning.
- Any additional adaptations should be verified against the full paper before public-facing publication.
Reported Outcomes
- Engagement evidence should be interpreted according to the source paper’s reported method and sample.
- One promising area is learnersourcing, where students engage in creating their own educational content, such as multiple-choice questions.
- A critical step in this process is generating effective explanations for the solutions to these questions, as such explanations aid in peer understanding and promote deeper conceptual learning.
- To support this task, we introduce “ILearner-LLM,” a framework that uses iterative enhancement with LLMs to improve generated explanations.
Ethical & Privacy Considerations
- Require human review of generated outputs and explicit guidance against over-reliance or answer copying.
Evidence Type
- Pre/post or experimental evidence
Relevance to Research
- Can be used as an AAB evidence record for cross-case comparison, standards drafting, and evidence-maturity mapping.
- Supports identification of recurring patterns in AI literacy, AI education implementation, teacher preparation, assessment, and responsible AI learning.
- Conceptual understanding
- Assessment / feedback quality
- Learning tool / resource design
- Assessment support
- LLM/Chat
- NLP / text classification
- Assessment / tutoring analytics
Case Status
- Completed
AAB Classification Tags
Higher education
Higher education
LLM/Chat, NLP / text classification, Assessment / tutoring analytics
Tutoring / feedback-supported learning
Medium
Medium
Source Publication
Exploring Iterative Enhancement for Improving Learnersourced Multiple-Choice Question Explanations with Large Language Models
- Qiming Bao
- Juho Leinonen
- Alex Yuxuan Peng
- Wanjun Zhong
- Gaël Gendron
- Timothy Pistotti
- Alice Huang
- Paul Denny
- Michael Witbrock
- Jiamou Liu
Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 39 No. 28, EAAI-25
2025
10.1609/aaai.v39i28.35164
https://ojs.aaai.org/index.php/AAAI/article/view/35164
https://ojs.aaai.org/index.php/AAAI/article/view/35164/37319
001_Exploring Iterative Enhancement for Improving Learnersourced Multiple-Choice Question Explanations with Large Language Models.pdf
9
Large language models (LLMs) have demonstrated strong capabilities in language understanding and generation, and their potential in educational contexts is increasingly being explored. One promising area is learnersourcing, where students engage in creating their own educational content, such as multiple-choice questions. A critical step in this process is generating effective explanations for the solutions to these questions, as such explanations aid in peer understanding and promote deeper conceptual learning. However, students often find it difficult to craft high-quality explanations due to limited understanding or gaps in their subject knowledge. To support this task, we introduce “ILearner-LLM,” a framework that uses iterative enhancement with LLMs to improve generated explanations. The framework combines an explanation generation model and an explanation evaluation model fine-tuned using student preferences for quality, where feedback from the evaluation model is fed back into the generation model to refine the output. Our experiments with LLaMA2-13B and GPT-4 using five large datasets from the PeerWise MCQ platform show that ILearner-LLM produces explanations of higher quality that closely align with those written by students. Our findings represent a promising approach for enriching the learnersourcing experience for students and for leveraging the capabilities of large language models for educational applications.
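The abstract describes a generate-evaluate-refine loop: a generation model drafts an explanation, a preference-fine-tuned evaluation model scores it, and the evaluator's feedback is fed back to the generator. The sketch below illustrates that loop under stated assumptions: the `call_llm` and `call_evaluator` stubs, prompt wording, 0-1 score scale, and the threshold-based stopping rule are all hypothetical, not the authors' implementation.

```python
# Minimal sketch of an iterative generate-evaluate-refine loop in the spirit
# of ILearner-LLM. Helpers, prompts, score scale, and stopping rule are
# illustrative assumptions, not the paper's actual implementation.

from dataclasses import dataclass


@dataclass
class Feedback:
    score: float   # quality rating from the evaluation model (assumed 0-1)
    critique: str  # textual feedback fed back into the generation model


def call_llm(prompt: str) -> str:
    # Stub: in practice this would query the generation LLM
    # (the paper experiments with LLaMA2-13B and GPT-4).
    return f"[draft explanation for prompt of {len(prompt)} characters]"


def call_evaluator(question: str, explanation: str) -> Feedback:
    # Stub: in practice an evaluation model fine-tuned on student
    # quality preferences would score the explanation.
    return Feedback(score=0.5, critique="Justify why each distractor is wrong.")


def generate_explanation(question: str, answer: str,
                         critique: str | None = None) -> str:
    # Build a prompt; on refinement rounds, include the evaluator's critique.
    prompt = f"Question: {question}\nCorrect answer: {answer}\n"
    if critique:
        prompt += f"Revise the explanation to address this feedback: {critique}\n"
    prompt += "Write a clear explanation of the solution."
    return call_llm(prompt)


def ilearner_loop(question: str, answer: str,
                  max_rounds: int = 5, target_score: float = 0.9) -> str:
    explanation = generate_explanation(question, answer)
    for _ in range(max_rounds):
        feedback = call_evaluator(question, explanation)
        if feedback.score >= target_score:
            break  # evaluator judges the explanation good enough
        # Feed the evaluator's critique back into the generator to refine.
        explanation = generate_explanation(question, answer, feedback.critique)
    return explanation


if __name__ == "__main__":
    # Hypothetical usage with a learnersourced MCQ-style item.
    print(ilearner_loop("Which sort is O(n log n) in the worst case?", "Merge sort"))
```

A score threshold with a round cap is one plausible stopping rule; a fixed number of refinement iterations would work equally well and is simpler to budget when calling a paid API.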
Transferability
- Higher education
- AI output reliability, hallucination, academic integrity, and age-appropriate use require safeguards.
Cost And Operations
Not specified in extracted text unless noted in duration field.
Requires educators/researchers/facilitators with sufficient AI literacy and pedagogy knowledge for the target learners.
Infrastructure depends on AI tool type, learner devices, data access, and institutional policy context.
Extraction Notes
High
- group_size
- duration
This entry was automatically extracted from the PDF text and manifest metadata. Fields should be manually verified before public registry publication, especially group size, location, duration, and outcome claims.
