Scientists develop student agents based on generative models, which are expected

2024-05-18

Recently, a team from the University of California, San Diego, has proposed a student intelligent agent system called EduAgent, which is based on generative models. Utilizing large language models, it simulates students' fine-grained physical behaviors, psychological states, and learning processes in an all-around manner.

Experiments have shown that EduAgent can not only simulate and predict the learning behaviors of real students but also generate reasonable learning behaviors of virtual students without real data.

The related paper, titled "EduAgent: Generative Student Agents in Learning", has been published on the preprint server arXiv [1].

Xu Songlin, a doctoral student at the University of California, San Diego, is the first author and corresponding author of the study, which was completed under the joint guidance of Assistant Professor Qin Lianhui and Professor Zhang Xinyu.

Challenge: Simulating Fine-Grained Student Behavior with Large Language Models

Although there has been extensive research on simulating student behavior using deep learning before the advent of large models, deep learning is limited by the need for a large amount of training data, making it difficult to directly simulate students of different personalities.

With the explosion of large language models, a growing number of studies have shown that these models have strong learning and simulation capabilities, even when they are not fine-tuned on new training data.

Through pre-training, large language models acquire strong in-context learning abilities and broad knowledge. It is therefore more feasible to use them to simulate student behavior, since they require only contextual information expressed in natural language.

Despite the fact that large language models have been well proven to have strong imitation capabilities for human behavior, simulating fine-grained student behavior with them is more challenging.

The team, building on previous research, has provided a more fine-grained dataset annotation, which includes real-time eye-tracking information collected during the online learning process, various psychological states, and the final learning outcomes of students.

However, there are too many student behaviors to simulate (including physical actions, psychological states, and the ability to understand knowledge points), and the context is too complex for large language models to grasp the key points from a large amount of information.

Xu Songlin said: "We propose a method based on cognitive priors to guide the model to think and reason about the potential connections and mutual influences between different behaviors, thereby achieving better student behavior simulation."

In addition, in many scenarios, it is difficult to obtain real student data. Therefore, researchers have also explored whether large language models can still produce reasonable behaviors when simulating virtual student agents without any real student data.

By simulating students with different personalities, they found that the cross-links between the physical behaviors, psychological states, and knowledge mastery abilities generated by the virtual student agents are consistent with the relationships between the three that have been proven by many existing cognitive science studies.

"This proves that it is feasible to generate reasonable and completely virtual student behaviors using large language models without relying on real data," he said.

It is reported that, in the field of digital twin systems for educational scenarios, researchers have developed intelligent agent simulation systems that are finer-grained and closer to the real learning states of students.

Such systems can comprehensively simulate various real learning behaviors and states of students.

For example, not only can they simulate students answering questions, but they can also simulate the eye movements of students during the teaching process, and even whether the student feels confused about a certain knowledge point and whether they are focused or distracted in class.

At the same time, the intelligent teaching system based on digital twins has further improved compared to existing intelligent teaching systems, and is no longer limited to simply providing suggestions for teachers or providing answers for students, but is deeply integrated into the teaching process.

Throughout the entire process of student learning, timely and personalized educational guidance is provided by focusing on fine-grained learning behaviors (such as points of concentration) and each student's unique knowledge background and comprehension abilities.

In addition to the field of education, the idea proposed in this paper of "using large language models to simulate human physical behaviors (such as eye movements)" is expected to be expanded to other scenarios that include user behaviors.

One example is interaction between real people and virtual digital humans. By generating eye contact between the virtual digital human's gaze and the real person's eye movements, such a system can give users a sense of identification and emotional connection.

Developing student agents based on generative models

According to the introduction, the research roughly went through five stages, including: identifying significant issues, literature review, proposing a new model, designing experiments to evaluate the model, and considering the potential application scenarios and limitations of the model.

Specifically:

First, identify significant issues.

In fact, researchers had already discovered this issue in previous projects: How to simulate students' more granular and comprehensive learning behaviors more realistically with less data?

Existing deep learning models require large datasets for training, so there is an urgent need to research new models that can achieve good results without requiring a large amount of additional training data.

Second, literature review, to understand the solution methods, effects, and limitations reported in the latest papers on this issue.

Xu Songlin introduced, "Through our research, we found that although existing studies have used large models to simulate student learning, the effects are not even as good as those of deep learning models."

One of the main reasons is that existing studies model only the accuracy of students' answers and ignore context information, such as the content of specific knowledge points and the behavioral state of students while they learn those knowledge points.

Third, propose a new model in response to the limitations of existing research.

To address these limitations, the research group proposed a more comprehensive model of student behavior that combines students' physical behavior, psychological state, and knowledge understanding ability, to create a truly intelligent student agent, EduAgent.

The intelligent agent can possess its own eye movements, psychological states, and comprehension abilities, just like a real student, instead of merely predicting the accuracy of students' exam results, Xu Songlin said.

Fourth, design experiments to evaluate the model.

Researchers designed two experiments to evaluate the model. The first experiment is for modeling a specific student, predicting future learning behaviors and states through a small amount of real students' historical learning behavior data.

The second experiment generates specific virtual learning behaviors for virtual students with different personalities without relying on any real experimental data. It can be applied to the generation of virtual student data to train specific teaching strategy models.

Xu Songlin explained, "Among them, the third and fourth stages are often iterative with each other, because it is impossible to achieve an ideal effect with a single model design."

Fifth, consider the potential application scenarios and limitations of the model, as well as unresolved issues.

For example, conduct a more in-depth study of the behavior of the generated virtual students to ensure their rationality. In addition, it is also necessary to address issues such as potential biases that may exist in large language models.
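The first experiment described above conditions a language model on a small amount of a student's historical learning data. A minimal sketch of how such context might be assembled into a prompt follows. The record fields, the persona string, and the prompt wording are all hypothetical and chosen for illustration; they are not the format used in the paper, and no model is actually called.

```python
from dataclasses import dataclass


@dataclass
class SlideRecord:
    """One past observation of a student on a lecture slide (hypothetical schema)."""
    slide_topic: str
    gaze_region: str          # where the student mostly looked, e.g. "diagram"
    confused: bool            # whether the student reported confusion
    answered_correctly: bool  # outcome of the follow-up question


def build_student_prompt(persona: str, history: list[SlideRecord], next_topic: str) -> str:
    """Assemble a text prompt asking a language model to act as this student."""
    lines = [f"You are simulating a student. Persona: {persona}"]
    if history:
        lines.append("Past observations:")
        for i, rec in enumerate(history, start=1):
            lines.append(
                f"{i}. topic={rec.slide_topic}; gaze={rec.gaze_region}; "
                f"confused={'yes' if rec.confused else 'no'}; "
                f"correct={'yes' if rec.answered_correctly else 'no'}"
            )
    else:
        # Corresponds to the virtual-student setting with no real data.
        lines.append("No past observations are available.")
    lines.append(f"Next slide topic: {next_topic}")
    # Assumed stand-in for a cognitive prior: ask the model to reason from
    # physical behavior to psychological state to the final answer.
    lines.append(
        "Reason step by step: first predict the gaze region, then whether the "
        "student is confused, then whether the answer will be correct."
    )
    return "\n".join(lines)


if __name__ == "__main__":
    history = [SlideRecord("fractions", "diagram", True, False)]
    print(build_student_prompt("easily distracted", history, "decimals"))
```

Passing an empty history gives the prompt for a purely virtual student, which mirrors the second experiment in spirit only.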

Hope to enhance personalized teaching for students.

At present, Xu Songlin is in the third year of his doctorate at the University of California, San Diego, mainly engaged in research on AI-based human enhancement (Human-AI Integration), that is, using AI to enhance human cognitive abilities, with applications in education, health, and other scenarios.

Based on this research, the research group will carry out more extended studies in the future. These include exploring more powerful models based on the dataset of this study, developing an intelligent teaching system for the team's experimental platform, and addressing students' personalized training through targeted teaching.

It is reported that they have developed a teaching website, which integrates the latest models and algorithms to provide targeted teaching and guidance for students with different personalities and learning backgrounds.

For example, based on students' historical learning behavior, an individualized database is automatically established for each student to describe that student's unique learning status and mastery of knowledge points.

A student simulator is established for each unique student to train the intelligent teaching model (AI teacher), so that the AI teacher can explore different teaching strategies repeatedly according to the historical learning records of different students, and select the best teaching strategy.
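The idea of letting an AI teacher try strategies against a student simulator and keep the best one can be sketched as follows. This is an illustrative toy under stated assumptions, not the team's system: the strategy names, the scoring function, and the `toy_student` simulator are all hypothetical, and the real simulator would be a student agent rather than a lookup table with noise.

```python
import random
from statistics import mean
from typing import Callable

# A simulator takes a strategy name and a random source and returns a
# learning score in [0, 1] (hypothetical interface).
Simulator = Callable[[str, random.Random], float]


def pick_best_strategy(
    simulate: Simulator,
    strategies: list[str],
    trials: int = 20,
    seed: int = 0,
) -> tuple[str, dict[str, float]]:
    """Run each strategy several times against the simulator and return the
    strategy with the highest average score, plus all averages."""
    rng = random.Random(seed)
    averages = {
        s: mean(simulate(s, rng) for _ in range(trials)) for s in strategies
    }
    best = max(averages, key=averages.get)
    return best, averages


def toy_student(strategy: str, rng: random.Random) -> float:
    """Stand-in simulator: a fixed base score per strategy plus small noise."""
    base = {"worked_examples": 0.8, "quiz_first": 0.6, "lecture_only": 0.5}
    return base[strategy] + rng.uniform(-0.05, 0.05)


if __name__ == "__main__":
    best, averages = pick_best_strategy(
        toy_student, ["worked_examples", "quiz_first", "lecture_only"]
    )
    print(best, averages)
```

Averaging over repeated trials matters because a simulated student, like a real one, will not respond identically every time; a single run per strategy could pick a winner by chance.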

"I hope our system can provide truly personalized teaching for all students, to overcome the obstacles of personalized and diversified teaching, and ensure the inclusiveness of education," said Xu Songlin.

It is reported that they are considering recruiting volunteers (including teachers and students) to experience this personalized teaching system for free remotely. Partners interested in this project are welcome to contact Xu Songlin at soxu@ucsd.edu.
