Perspective

Beta Testing Childhood

Ryan Lee / Apr 23, 2026

Photo by Thomas Park on Unsplash.

In April 2025, the White House issued an Executive Order promoting the integration of artificial intelligence into American classrooms "from the earliest stages of the educational journey." In parallel, the UK Department for Education is investing millions in classroom AI rollout while openly conceding that evidence on its developmental impact remains "limited" and "emerging."

Between them, the US and UK governments have authorized the deployment of generative AI to tens of millions of children aged 3 to 12. Neither has commissioned a neurodevelopmental impact assessment. Neither has published age-appropriate guidelines grounded in cognitive science. Executive Order 14277 does not mention executive function, working memory, critical developmental periods, or evidence of developmental safety.

One year on, this is not a policy gap. It is a policy vacuum, and it is being filled by whichever edtech vendor reaches the procurement desk first.

I am completing an MSc in Psychology and Neuroscience of Mental Health at King's College London, and I direct JAI Behavioural, an independent behavioral science consultancy. Over the past months, I have conducted a structured review of the peer-reviewed literature on AI, cognitive development, and children — the kind of review governments should have commissioned before authorizing deployment.

The evidence base

Recent work has begun to map the edges of this issue. In a commentary for Brookings, Sweta Shah documents the invisible AI exposure of children from birth to age 8 through consumer products such as smart monitors, AI-enabled toys, and algorithmic feeds, and calls for data limits and age-appropriate design standards. Jason Lodge and Leslie Loble, writing for the Australian Network for Quality Digital Education, describe what they call the "performance paradox" and outline a pedagogical solution: redesigned tools, metacognitive prompts, and teacher expertise.

What neither addresses is what happens when governments push generative AI directly into K-12 classrooms during the windows when executive function is being built, or the absence of accountability behind those policy decisions.

The most robust experimental evidence comes from a study by Bastani et al., published in PNAS, involving nearly 1,000 high school mathematics students. Students given unfettered access to GPT-4 during practice sessions performed substantially better while they had access. Once access was removed, however, they scored 17% lower than peers who had never used AI. The short-term assistance masked a long-term capability deficit.

A second condition in the same study used a "GPT Tutor" with pedagogic guardrails designed to provide hints rather than answers. That group performed no worse than students who never had access to AI.

This is what Lodge and Loble term the performance paradox: short-term task performance improves while durable learning declines. It remains the clearest causal evidence currently available.

The implications become more serious when viewed through a developmental lens.

The students in Bastani's study were sixteen-year-olds whose executive function systems were already largely built. Executive functions such as working memory, cognitive flexibility, and inhibitory control develop through sustained cognitive effort during specific childhood windows. Best and Miller find that inhibitory control strengthens most rapidly between ages 5 and 8, while Gathercole et al. demonstrate that working memory capacity expands substantially between ages 6 and 12. These capacities are among the strongest predictors of academic success, stronger than IQ, and once these developmental windows close, they do not fully reopen.

If a 17% capability deficit appears in sixteen-year-olds after unfettered AI access is withdrawn, the relevant question is what the analogous effect looks like in a 7-year-old whose working memory system is still under construction. Unsupervised home use of ChatGPT carries none of the guardrails that Bastani's “GPT Tutor” condition imposed.

A 2025 systematic review by Pergantis et al. confirms that AI chatbots can enhance executive functions such as working memory. However, the researchers themselves warn that these successes occurred in "well-controlled settings," primarily with older students, adults, or specific neurodivergent populations (such as children with ADHD), and do not readily generalize to real-world environments.

Rather than assuming these cognitive benefits transfer to typically developing young children in standard classrooms, the authors emphasize the critical need for further longitudinal studies across diverse populations. Scaling these highly specific interventions into universal classrooms carries an unmapped biological cost.

In effect, a cognitive intervention is being scaled across a developmental population for which the longitudinal evidence base is literally empty, on the basis of executive action that did not wait for it to exist.

The natural experiment already happened

The closest analog is COVID-19, and the data is now in.

Longitudinal research from Wright et al. tracked executive functioning in 667 elementary-age children across the pandemic using direct assessment. School closures produced an estimated 11 to 12 months of lost executive function growth. Post-lockdown, executive functioning development resumed at only 65% to 74% of the pre-pandemic rate: partial recovery, not full. A 2026 longitudinal study by Jones et al. found a 0.5 standard deviation drop in executive function scores following pandemic disruptions, while Madigan documented a 52% rise in children's screen time over the same period.

When the normal practices that build executive function are displaced by passive screen exposure, executive function growth measurably slows.

The COVID pandemic was involuntary. Classroom AI deployment is a policy choice that replicates the mechanism Wright identified: displacement of active cognitive processing during critical developmental windows. Unlike the pandemic, the "before" data exists this time. And yet the policy apparatus designed to prevent precisely this kind of foreseeable harm is accelerating deployment rather than pausing to assess the evidence.

System pressure and classroom reality

This acceleration is not occurring in a vacuum. Education is in a workload crisis. Globally, between 25% and 74% of teachers report moderate to severe burnout. The 2026 Tes Global Wellbeing Report found that 50% of educators worldwide do not plan to stay in the profession long-term due to unmanageable workload. Echoing these global figures, the 2025 RAND State of the American Teacher survey reveals that 62% of US K-12 public school educators report experiencing frequent job-related stress. AI adoption is running ahead of evidence because teachers are grasping at cognitive relief from an impossible workload.

The systemic failure—unsustainable workload and absent policy—is being patched by individual teachers making ad hoc decisions about tools whose developmental consequences nobody has studied. This is the metacognitive equity gap Lodge and Loble identified, operating at the population level: students with the strongest scaffolding can potentially use AI productively; students without it, which is most students, are being handed tools designed to bypass the cognitive effort their brains are supposed to be building.

What evidence-based policy would require

The argument is not that AI should be banned from education. Bastani et al. specifically rule that position out: their guardrail “GPT Tutor” condition did not harm learning outcomes. Lodge and Loble's pedagogical framework is the right direction for the students already in the room. The argument is that unguarded generative AI access for children in critical developmental windows is now a controlled-trial harm signal, and the policy instruments currently governing deployment do not distinguish between guarded and unguarded conditions at all.

Evidence-based policy would follow from that distinction. No unguarded generative AI exposure during ages 3 to 6, when the cognitive infrastructure for everything else is forming. Computer literacy without generative AI at ages 7 to 9, during the period of most rapid inhibitory control development. Supervised, guardrailed, pedagogically structured AI use only at ages 10 to 12, delivered by teachers who have completed mandatory neurodevelopmental training. Supervised independence at 13 and over, once metacognitive awareness has matured enough for a student to recognize when they are engaging cognition versus outsourcing it. Parent education is a non-optional component because home use undermines school policy if parents do not understand why the guardrails exist.

None of this requires banning technology. All of it requires governments to treat children's developing brains with the same regulatory seriousness applied to children's food, children's medicine, and children's toys.

Authors

Ryan Lee
Ryan Lee is the founder of JAI Behavioural, an independent neuroscience-grounded behavioral science consultancy focused on burnout prevention and AI governance frameworks for international schools. He is completing an MSc in Psychology and Neuroscience of Mental Health at King's College London (IoPP...
