Automatic Item Generation and Validation

A Network Integrated Approach using Large Language Models in R

This innovative course introduces a new way to create and validate questionnaires and scales using artificial intelligence, specifically large language models (LLMs).

In this course you will learn a fully automated method for scale development and validation using R. Participants will learn to use LLMs and advanced network psychometric techniques both to develop new items and to carry out a complete structural validation without collecting data from human participants. This greatly reduces the time and resources traditionally required for scale development.

In simple terms, we’ll teach you how to:

Use AI to automatically generate questions for new scales.
Check if these items are good at measuring what they’re supposed to measure (structural validity) and if the items and dimensions are stable (dimensionality and item stability).
Do all of this without needing to test the questions on real people first.

Traditionally, creating a good questionnaire or test (usually called a “scale” in research) takes a lot of time and money. It typically involves writing many questions, testing them on hundreds of people, and then using complex statistics to figure out which questions work best.

Our course shows you how to do all this in R with AI, using a method called AI-GENIE (Automatic Item Generation and Validation via Network Integrated Evaluation).

Starting March 12, we are offering this seminar as an 8-hour synchronous*, livestream workshop held via the free video-conferencing software Zoom. Each day will consist of two lecture sessions which include hands-on exercises, separated by a 30-minute break. You are encouraged to join the lecture live, but will have the opportunity to view the recorded session later in the day if you are unable to attend at the scheduled time.

*We understand that finding time to participate in livestream courses can be difficult. If you prefer, you may take all or part of the course asynchronously. The video recordings will be made available within 24 hours of each session and will be accessible for four weeks after the seminar, meaning that you will get all of the class content and discussions even if you cannot participate synchronously.

Closed captioning is available for all live and recorded sessions. Captions can be translated to a variety of languages including Spanish, Korean, and Italian. For more information, click here.

More Details About the Course Content

Automatic Item Generation and Validation via Network Integrated Evaluation (AI-GENIE) leverages large language models (LLMs) and advanced network psychometric techniques to streamline the item generation and validation process without the need to collect data from human participants. (See the scientific preprint for technical details.) Traditional scale development is resource-intensive and time-consuming, often requiring extensive human expert intervention and costly data collection for psychometric validation.

Recent advancements in AI and LLMs offer promising solutions to generate expert-quality text for scale items. The challenge lies in efficiently selecting and validating non-redundant, high-quality items that accurately represent the intended psychological constructs and that show adequate dimensionality (structural validity) as well as item and dimension stability. AI-GENIE automates the entire process, from item generation to validation, enhancing efficiency and scalability in psychological assessment creation.

Previous research has shown that AI-generated items can create adequate psychological assessments, but the item selection process remains resource-intensive (requiring rounds of data collection from human participants). AI-GENIE eliminates the need for extensive human expert involvement in generating, selecting, and validating items, potentially saving researchers significant time and money.

The methodology combines open- and closed-source LLMs, generative AI, and network psychometrics to facilitate scale generation, selection, and validation. AI-GENIE is the first fully automated methodology to generate, assess, and validate the quality of AI-generated items for psychometric scales.

By the end of this course, you will be able to:

Understand the principles and applications of AI-GENIE in scale development.
Utilize various LLMs to generate new items using R.
Apply network psychometric techniques for item validation and selection in silico (i.e., without the use of human subjects) using R.
Critically evaluate the effectiveness and limitations of AI-generated items.
Design and implement a complete scale development project using the AI-GENIE methodology.

Computing

This is a hands-on course with instructor-led software demonstrations and guided exercises. These guided exercises are designed for the R language, so you should use a computer with a recent version of R (version 4.1.3 or later) and RStudio (version 2022.02.1+461 or later).

To follow along with the course exercises, you should have good familiarity with the use of R, including opening data files and running programs, as well as performing very basic data manipulation and analyses.

If you’d like to take this course but are concerned that you don’t know enough R, there are excellent online resources for learning the basics. Here are our recommendations.

Who Should Register?

This course is perfect for researchers, marketers, or anyone interested in creating more efficient surveys or tests. No prior knowledge of AI or advanced statistics is required, but you should have:

Basic understanding of psychometrics and scale development.
Familiarity with R programming, for example from an introductory seminar such as Introduction to R for Data Analysis, R for SPSS Users, or R for Stata Users.
Knowledge of statistical analysis and concepts.

Outline

Day 1

Introduction to AI-GENIE and LLMs

Overview of traditional scale development challenges
Introduction to AI-GENIE methodology
Exploration of LLMs: Gemma 2, Llama 3, Mixtral 8x7B, GPT-3.5, and GPT-4
Ethical considerations in AI-assisted scale development

Prompt Engineering for Item Generation

Principles of effective prompt design
Few-shot prompting techniques
Crafting prompts for item development
Hands-on practice with prompt engineering

Item Generation and Initial Pool Creation

Implementing LLMs for item generation
Managing temperature settings and their effects
Strategies for creating diverse and balanced item pools
Quality assessment of generated items

Network Psychometrics and Item Embedding

Introduction to network psychometrics
Text embedding techniques
Exploratory Graph Analysis (EGA)
Normalized Mutual Information (NMI) in item analysis

Day 2

Item Reduction and Redundancy Analysis

Unique Variable Analysis (UVA)
Weighted topological overlap for identifying locally dependent items
Iterative item reduction techniques
Optimizing Walktrap algorithm step size

Stability Analysis and Final Item Selection

Introduction to bootEGA for stability assessment
Implementing stability thresholds
Iterative stability analysis and item removal
Finalizing the item pool

Project Implementation and Evaluation

Designing a complete scale development project using AI-GENIE
Implementing the full AI-GENIE pipeline
Evaluating results and comparing to traditional methods
Discussion of potential improvements and future directions

Seminar Information

Wednesday, March 12 – Thursday, March 13, 2025

Daily Schedule: All sessions are held live via Zoom. All times are ET (New York time).

10:30am-12:30pm (convert to your local time)
1:00pm-3:00pm

Payment Information

The fee of $695 includes all course materials.

PayPal and all major credit cards are accepted.

Our Tax ID number is 26-4576270.

Contact Information

+1 610-715-0115 info@statisticalhorizons.com

Automatic Item Generation and Validation: A Short Course

An 8-Hour Livestream Seminar Taught by Hudson Golino, Ph.D.