Case Report · Published empirical study · 2023
AAB-CASE-2026-RV-099

Exploring Social Biases of Large Language Models in a College Artificial Intelligence Course

Large neural network-based language models play an increasingly important role in contemporary AI. Although these models demonstrate sophisticated text generation capabilities, they have also been shown to reproduce harmful social biases contained in their training data.

This page documents an AI literacy or AI education case for registry purposes. It is descriptive and does not imply AAB endorsement of any specific tool, provider, or intervention.
01

Implementation

Source publication / research team or educational organization described in paper

02

Learning context

Higher education

03

AI role

Evaluator

04

Outcome signal

Conceptual understanding

Registry Facets

0
Education Level
  • Higher education
Subject Area
  • AI ethics
  • LLM bias
  • LLM/Chat
  • NLP / text classification
Use Case Type
  • Curriculum / course design
  • Ethics / responsible AI education
Stakeholder Group
  • Students
AI Capability Type
  • LLM/Chat
  • NLP / text classification
  • ML concepts / supervised learning
  • Ethics / responsible AI
Implementation Model
  • Higher education
Evidence Type
  • Pre/post or experimental evidence
  • Activity documentation
Outcomes Domain
  • Conceptual understanding
  • Engagement / motivation
  • Ethics and responsible use
  • Assessment / feedback quality

Implementing Organization

1
Organization Type

Source publication / research team or educational organization described in paper

Location

Not specified in extracted text

Primary Facilitator Role

Researchers, educators, instructors, or facilitators as described in the source publication

Learning Context

2
Setting Type
  • Higher education
Session Format

Course implementation or course design

Duration

Not specified in extracted text

Group Size

Not specified in extracted text

Devices

Not specified in extracted text

Constraints
  • AI output reliability, hallucination, academic integrity, and age-appropriate use require safeguards.
  • High-stakes or student-data-centered AI use requires stronger governance, transparency, and bias monitoring.

Learner Profile

3
Age Range

Higher education

Prior AI Exposure Assumed

Mixed or not explicitly specified; infer from target learner group and intervention design.

Prior Programming Background Assumed

Varies by intervention; not specified unless the paper explicitly describes prerequisites.

Educational Intent

4
Primary Learning Goals
  • Document the AI education intervention, course, tool, or resource described in the source publication.
  • Extract the learner context, AI role, pedagogy, outcomes, and constraints for AAB registry comparison.
  • Guide students through an exploration of social biases in large language models, as described in the source abstract.
Secondary Learning Goals
  • Support AAB comparison across AI literacy, AI education, teacher training, higher education, and workforce contexts.
  • Capture evidence maturity, transferability, and limitations rather than treating the publication as product endorsement.
What This Was Not
  • Not an AAB endorsement of the tool, curriculum, provider, or result.
  • Not a direct replication record unless the source paper reports implementation details sufficient for replication.

AI Tool Description

5
Tool Type

LLM/Chat, NLP / text classification, ML concepts / supervised learning, Ethics / responsible AI

Languages

Language context discussed in source publication

AI Role
  • Evaluator
User Interaction Model
  • Primary interaction pattern inferred from publication: Curriculum / course design, Ethics / responsible AI education.
  • AI capability focus: LLM/Chat, NLP / text classification, ML concepts / supervised learning, Ethics / responsible AI.
Safeguards
  • Require human review of generated outputs and explicit guidance against over-reliance or answer copying.
  • Include bias, fairness, transparency, and social impact discussion as part of the learning design.

Activity Design

6
Activity Flow
  • Review the publication’s reported context, learner group, AI tool or curriculum, implementation process, and outcome evidence.
  • Map the case to AAB registry fields for comparison across educational levels and AI capability types.
  • Use the source publication and PDF for any manual verification before public registry release.
Human Vs AI Responsibilities
  • Human educators/researchers remain responsible for instructional design, supervision, interpretation, and ethical safeguards.
  • AI systems or AI concepts provide the learning object, support tool, evaluator, simulator, or automation context depending on the paper.
Scaffolding Strategies
  • Instructional / curriculum-based learning
  • Registry extraction emphasizes explicit learning goals, observed outcomes, constraints, and safety limitations.

Observed Challenges

7
Educators Reported
  • AI output reliability, hallucination, academic integrity, and age-appropriate use require safeguards.
  • High-stakes or student-data-centered AI use requires stronger governance, transparency, and bias monitoring.

Design Adaptations

8
Adaptations
  • Case classified under: Published empirical study.
  • Pedagogical pattern: Instructional / curriculum-based learning.
  • Any additional adaptations should be verified against the full paper before public-facing publication.

Reported Outcomes

9
Engagement
  • Engagement evidence should be interpreted according to the source paper’s reported method and sample.
  • Students reported their findings in an in-class presentation and a final report, recounting patterns of predictions that surprised, unsettled, and sparked interest in advocating for technology that reflects a more diverse set of backgrounds and experiences.
Learning Signals
  • Through the process of constructing a dataset and evaluation metric to measure bias, students mastered key technical concepts, including how to run contemporary neural networks for natural language processing tasks; construct datasets and evaluation metrics; and analyze experimental results.
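The evaluation-metric step described above can be sketched minimally. This is an illustrative example only, not code from the paper: the probe template, demographic terms, and model scores below are hypothetical, and `bias_gap` is one simple possible disparity measure (largest pairwise score difference), not the metric the students built.

```python
from itertools import combinations

def bias_gap(scores):
    """Largest pairwise absolute difference in model scores
    across demographic fill-ins for a single probe template.
    A larger gap suggests the model treats the fill-ins unevenly."""
    return max(abs(a - b) for a, b in combinations(scores.values(), 2))

# Hypothetical model probabilities for the template
# "The [TERM] worked as a nurse."
scores = {"woman": 0.62, "man": 0.11, "person": 0.35}
print(round(bias_gap(scores), 2))  # 0.51
```

In a real probe, the scores would come from a language model's predicted probabilities for each filled-in sentence, and the gap would be aggregated across many templates.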
Educators Reflection

Through this project, students engage with and even contribute to a growing body of scholarly work on social biases in large language models.

Ethical & Privacy Considerations

10
Privacy
  • Require human review of generated outputs and explicit guidance against over-reliance or answer copying.
  • Include bias, fairness, transparency, and social impact discussion as part of the learning design.

Evidence Type

11
Evidence
  • Pre/post or experimental evidence
  • Activity documentation

Relevance to Research

12
Potential Research Use
  • Can be used as an AAB evidence record for cross-case comparison, standards drafting, and evidence-maturity mapping.
  • Supports identification of recurring patterns in AI literacy, AI education implementation, teacher preparation, assessment, and responsible AI learning.
Relevant Research Domains
  • Conceptual understanding
  • Engagement / motivation
  • Ethics and responsible use
  • Assessment / feedback quality
  • Curriculum / course design
  • Ethics / responsible AI education
  • LLM/Chat
  • NLP / text classification

Case Status

13
Case Status
  • Completed

AAB Classification Tags

14
Age

Higher education

Setting

Higher education

AI Function

LLM/Chat, NLP / text classification, ML concepts / supervised learning, Ethics / responsible AI

Pedagogy

Instructional / curriculum-based learning

Risk Level

High

Data Sensitivity

Medium

Source Publication

15
Title

Exploring Social Biases of Large Language Models in a College Artificial Intelligence Course

Authors
  • Skylar Kolisko
  • Carolyn Jane Anderson
Venue

Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37 No. 13, EAAI-23

Year

2023

Doi

10.1609/aaai.v37i13.26879

Source URL

https://ojs.aaai.org/index.php/AAAI/article/view/26879

Pdf URL

https://ojs.aaai.org/index.php/AAAI/article/view/26879/26651

Pdf Filename

070_Exploring Social Biases of Large Language Models in a College Artificial Intelligence Course.pdf

Page Count

9

Abstract

Large neural network-based language models play an increasingly important role in contemporary AI. Although these models demonstrate sophisticated text generation capabilities, they have also been shown to reproduce harmful social biases contained in their training data. This paper presents a project that guides students through an exploration of social biases in large language models. As a final project for an intermediate college course in AI, students developed a bias probe task for a previously-unstudied aspect of sociolinguistic or sociocultural bias. Through the process of constructing a dataset and evaluation metric to measure bias, students mastered key technical concepts, including how to run contemporary neural networks for natural language processing tasks; construct datasets and evaluation metrics; and analyze experimental results. Students reported their findings in an in-class presentation and a final report, recounting patterns of predictions that surprised, unsettled, and sparked interest in advocating for technology that reflects a more diverse set of backgrounds and experiences. Through this project, students engage with and even contribute to a growing body of scholarly work on social biases in large language models.
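The probe-construction step the abstract describes (building a dataset to measure bias) can be illustrated with a small sketch. The templates and demographic terms here are hypothetical placeholders, not the ones the students used; the point is only the cross-product structure of a probe dataset.

```python
# Illustrative probe-dataset construction (hypothetical templates and
# terms, not taken from the paper): each template is crossed with each
# demographic fill-in to produce the sentences a model would score.

TEMPLATES = [
    "The {term} worked as a nurse.",
    "The {term} worked as an engineer.",
]
TERMS = ["woman", "man", "person"]

def build_probe_dataset(templates, terms):
    """Cross every template with every fill-in term."""
    return [
        {"template": t, "term": term, "sentence": t.format(term=term)}
        for t in templates
        for term in terms
    ]

dataset = build_probe_dataset(TEMPLATES, TERMS)
print(len(dataset))            # 6 probe sentences
print(dataset[0]["sentence"])  # The woman worked as a nurse.
```

Each sentence would then be scored by the language model under study, and per-template score disparities across terms would feed the evaluation metric.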

Transferability

16
Best Fit Contexts
  • Higher education
Likely Failure Modes
  • AI output reliability, hallucination, academic integrity, and age-appropriate use require safeguards.
  • High-stakes or student-data-centered AI use requires stronger governance, transparency, and bias monitoring.

Cost And Operations

17
Time Cost Notes

Not specified in extracted text unless noted in duration field.

Staffing Notes

Requires educators/researchers/facilitators with sufficient AI literacy and pedagogy knowledge for the target learners.

Infra Notes

Infrastructure depends on AI tool type, learner devices, data access, and institutional policy context.

Extraction Notes

18
Confidence

High

Missing Information
  • group_size
  • duration
Reasoning Limits

This entry was automatically extracted from the PDF text and manifest metadata. Fields should be manually verified before public registry publication, especially group size, location, duration, and outcome claims.

Duplicate Check Against Uploaded Cases Json
Closest Existing Title

Pedagogical Design of K-12 Artificial Intelligence Education: A Systematic Review

Similarity Score

0.486

Likely Duplicate

false

Registry Metadata

19
Case ID
AAB-CASE-2026-RV-099
Publication Status
Published empirical study
Tags
case, Higher education, AI ethics, LLM bias, LLM/Chat, NLP / text classification, Curriculum / course design, Ethics / responsible AI education