Six weeks of GPT-4-assisted tutoring = two years of normal learning?!
Here's the story.
A team of education experts, data scientists, and research analysts from the World Bank conducted a randomized controlled trial of GPT-4 tutoring with students in Nigeria.
They found that 6 consecutive weeks of AI-assisted after-school tutoring resulted in learning gains equivalent to two years of normal schooling.
Moreover, this intervention outperformed 80% of the other educational interventions in a database of randomized controlled trials conducted in developing countries.
Almost all participating students made progress, and the more AI-assisted sessions they attended, the greater their gains.
After Ethan Mollick, a professor at the Wharton School, posted the research on X, it quickly drew widespread attention.
Greg Brockman also reposted it.
Commenters shared their own experiences of using AI to assist learning.
My 13-year-old daughter has been using ChatGPT as a tutor for over a year. She can already discuss topics like derivatives and integrals in calculus, and electromagnetism and thermodynamics in physics. Last year the school offered to let her skip a grade, but we declined.
I'm introducing a student-designed LLM tutoring tool in my university courses. Does anyone have suggestions on how to implement this as a randomized controlled trial? Providing the tutoring service to only half the students seems unfair.
Ethan Mollick added that it is very important for teachers to guide students in using AI:
In some cases, using AI independently as a tutor may impair learning, because it can create an illusion of learning.
Project Details
In 1984, educational psychologist Benjamin Bloom demonstrated that students receiving one-on-one tutoring significantly outperformed those limited to traditional classroom settings. Although the benefits of one-on-one tutoring have been proven, the cost is high.
The education team from the World Bank believes that generative AI can create new human-like content, opening up broader possibilities for educational applications.
Based on this potential, they conducted an experiment in Edo State, Nigeria.
From June to July 2024, 800 first-year high school students from seven pilot schools were required to attend two AI-assisted English tutoring sessions per week in the computer lab.
Specifically, each session began with the teacher introducing the weekly theme, followed by the students interacting with Microsoft Copilot powered by GPT-4 to complete English grammar learning and writing tasks.
The teachers guided the students on how to use AI, provided prompt suggestions, and led a brief reflection exercise at the end of each session.
During the project, the team summarized some lessons learned:
The participating students showed extremely high engagement, with many expressing a strong desire to use AI tools in the computer lab.
After the pilot, teachers' initial concerns about AI gave way to recognition of its potential to improve student learning and of their own role in guiding its use.
The project lasted six weeks; a longer duration might be more effective. In the early stages, students spent much of their time learning to set up email, create Microsoft Copilot accounts, and use the computers, so extending the project would leave more time for students' actual learning needs.
Frequent power and network outages during the rainy season disrupted student-AI interactions, and providing backup power and network connections was crucial for maintaining smooth course delivery.
Students and teachers need adequate support; the project team, for example, developed a toolkit with carefully designed prompts to guide the sessions.
As with any project, there can be significant gaps between design and implementation. To address this, a small monitoring team closely supervised each pilot, collected key insights, and provided feedback to ensure the project progressed as planned.
Teachers also pointed out key risks of AI, such as over-reliance, incorrect feedback and misinformation, and misuse. Appropriate mitigation strategies for these risks are crucial as students explore this new way of learning.
After six weeks, students took a written test assessing their performance in three key areas: English (the focus of the intervention), AI knowledge, and digital skills.
The results showed that the randomly selected students who participated in the project significantly outperformed the non-participating students in all three areas.
Notably, the participating students also performed better in the school's regular year-end exams, even though the content of these exams far exceeded the topics covered during the six-week intervention.
This suggests that students who have learned to use AI effectively may have already applied these skills to independently explore and master other subjects.
Furthermore, the team found that the project benefited all students, not just high performers, and the more AI-assisted sessions students attended, the greater their progress.
As mentioned earlier, many students had difficulty attending due to factors such as rainy-season flooding, so the team built a rigorous monitoring system for this project to track student attendance accurately.
The results showed that each additional day of attendance significantly improved learning outcomes. As shown in the figure below, students' average assessment scores rose with the number of days attended:
The learning gains from AI assistance were substantial: about 0.3 standard deviations, equivalent to two years of normal learning compressed into just six weeks.
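For context, the 0.3 figure is a standardized effect size: the difference in mean scores between treated and untreated students, expressed in standard-deviation units. The study does not publish its exact formula here, but a common version is Cohen's d with a pooled standard deviation. A minimal sketch, using invented scores rather than the study's data:

```python
import statistics

def cohens_d(treatment, control):
    """Standardized mean difference (Cohen's d) with a pooled sample SD."""
    n1, n2 = len(treatment), len(control)
    m1, m2 = statistics.mean(treatment), statistics.mean(control)
    v1, v2 = statistics.variance(treatment), statistics.variance(control)
    # Pool the two sample variances, weighted by degrees of freedom
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / pooled_sd

# Hypothetical test scores for illustration only (not the study's data)
tutored = [68, 72, 75, 70, 74, 71]
comparison = [64, 66, 69, 65, 67, 68]
print(cohens_d(tutored, comparison))
```

An effect of 0.3 standard deviations is modest per student per test, but large for a six-week, low-cost intervention, which is why the study benchmarks it against a database of other trials.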




