Skip to main content

Shifting from Superpower Factories to Superpower Classrooms: The New Landscape of Global Manufacturing

China’s dominance in manufacturing now rests on its vast talent pipelines, not just efficiency
Western economies risk losing ground unless education and training systems compress time-to-competence at scale
Factories of the future will be decided in classrooms as much as on shop floors

The Urgent Need to Address AI Security: Why AI Agents, Not LLMs, Are the Real Risk Surface

The Urgent Need to Address AI Security: Why AI Agents, Not LLMs, Are the Real Risk Surface

Picture

Member for

1 year 1 month
Real name
Catherine Maguire
Bio
Catherine Maguire is a Professor of Computer Science and AI Systems at the Gordon School of Business, part of the Swiss Institute of Artificial Intelligence (SIAI). She specializes in machine learning infrastructure and applied data engineering, with a focus on bridging research and large-scale deployment of AI tools in financial and policy contexts. Based in the United States (with summer in Berlin and Zurich), she co-leads SIAI’s technical operations, overseeing the institute’s IT architecture and supporting its research-to-production pipeline for AI-driven finance.

Modified

The real risk isn’t the LLM’s words but the agent’s actions with your credentials
Malicious images, pages, or files can hijack agents and trigger privileged workflows
Treat agents as superusers: least privilege, gated tools, full logs, and human checks

In 2025, the average cost of a U.S. data breach reached $10.22 million. Thirteen percent of surveyed organizations reported that an AI model or application was involved, yet 97% of them did not have proper AI access controls. This number should prompt us to rethink our threat model. We focus on the language model—its cleverness, its mistakes, its jailbreaks—while the real risk lies just one step away: the agent that holds keys, clicks buttons, opens files, fetches URLs, executes workflows, and impersonates us across business systems. When a seemingly "harmless" image, calendar invite, or web page provides instructions that an agent follows without question, the potential damage is not simply a bad paragraph; it is a chain of actions performed under your identity. Recent examples have demonstrated "silent hijacking" with minimal or no user interaction, involving mainstream assistants. If we continue to treat these incidents as model problems, we will continue to create superusers with inadequate controls. The solution starts by identifying the risk at its source: the LLM-based program that takes action.

From Chat to System Actor

The significant change is not about better or worse statistics, but about operations. An agent comprises prompt logic, tools and plugins, credentials, and policies. It can browse, run code, read emails, query CRMs, push tickets, edit spreadsheets, and make purchases. This combination brings traditional attack surfaces into a modern interface. Consider a support agent that reads PDFs, screenshots, and forms. A single compromised image or document can carry harmful instructions that the model unwittingly turns into actions—such as forwarding sensitive files, visiting attacker-controlled URLs, or stealing tokens. This is why image or pixel-level manipulations matter now: they don't just "trick" the model, but push a tool-enabled process to take action. Viewing this solely as a content moderation issue overlooks a larger systems problem: input handling, privilege boundaries, output filtering, and identity controls for software that takes action.

Security researchers and practitioners have started to recognize this shift. The OWASP Top 10 for LLM applications warns about prompt injection, insecure output handling, and vulnerabilities associated with plugins and third-party APIs. UK guidance on securely deploying machine-learning systems emphasizes the importance of the runtime environment, including credentials, storage, and monitoring. Attacks often combine methods and exploit areas where the software can act. Most agent incidents are not 'model failures' but failures related to identity and action: over-privileged tokens, unchecked toolchains, unrestricted browsing, and the absence of human oversight for high-risk activities. This necessitates a different solution: treating the agent like a privileged automation account with minimal privileges, outbound controls, and clear logs rather than as a chat interface with simple safeguards.

The description of credentials is particularly overlooked. Unit 42 points out that the theft or misuse of agent credentials puts every downstream system the agent can access at risk. Axios reported that "securing AI agent identities" was a significant theme at RSA this year. If your help desk agent can open HR tickets or your research agent can access lab drives and grant permissions, then agent impersonation represents a serious breach, not just a minor user interface annoyance. The central issue is simple: we connected a probabilistic planner to predictable tools and gave it the keys. The straightforward fix is to issue fewer keys, monitor the doors, and restrict the planner's access to certain areas.

What the Evidence Really Shows

Emerging evidence leads to one uncomfortable conclusion: AI agents are real targets today. In August, researchers revealed "silent hijacking" techniques that enable attackers to manipulate popular enterprise assistants with minimal interaction, thereby facilitating data theft and workflow changes. Shortly after, the trade press and professional organizations highlighted these findings: commonly used agents from major vendors can be hijacked, often by embedding malicious content in areas the agent already visits. The pattern is worryingly similar to web security—tainted inputs and over-trusted outputs—but the stakes are higher because the actor is different: software already authorized to operate within your company.

Figure 1: U.S. breach costs are more than double the global average—budgeting from “global” figures underestimates the exposure universities actually face.

Governments and standards organizations have responded by extending traditional cyber guidelines into the agent era. A joint U.S. and international advisory on safely deploying AI systems emphasizes that defenses should cover the entire environment, not just the model interface. NIST's Generative AI Profile translates this approach into operational controls: define and test misuse scenarios, carefully restrict tool access by design, and monitor for unusual agent behavior like you would for any privileged service. These measures align with OWASP's focus on output handling and supply-chain risks—issues that arise when agents access code libraries, plugins, and external sites. None of this is groundbreaking; it's DevSecOps for a new type of automation that communicates.

The cost and prevalence of these issues underscore the urgency of addressing this issue for educators and public-sector administrators. According to IBM's 2025 study, the global average breach cost is around $4.44 million, with the U.S. average exceeding $10 million. 'Shadow AI' appeared in about 20% of breaches, adding hundreds of thousands of dollars to response costs. In a large university, if 20% of incidents now involve unauthorized or unmanaged AI tools, and even a small percentage of those incidents go through an over-privileged agent connected to student records or grant systems, the potential loss—financial, operational, and reputational—quickly adds up. A conservative estimate for a 25,000-student institution facing a breach with U.S.-typical costs suggests that the difference can mean a challenging year or a canceled program. We can debate specific numbers, but we cannot ignore the trend.

The demonstration area continues to expand. Trend Micro's research series catalogs specific weaknesses in agents, ranging from code execution misconfigurations to data theft risks concealed in external content, illustrating how these vulnerabilities can be exploited when agents fetch, execute, and write. Additionally, studies have reported that pixel-level tricks in images and seemingly harmless user interface elements can mislead agents that rely on visual models or screen readers. The message is clear: once an LLM takes on the role of a controller for tools, the relevant safeguards shift to securing those tools rather than just maintaining prompt integrity. Focusing only on jailbreak prompts misses the bigger picture; often, the most harmful attacks don't aim to outperform the model at all. Instead, they exploit the agent's permissions.

A Safer Pattern for Schools and Public Institutions

The education sector presents an ideal environment for agent risk: large data sets, numerous loosely managed SaaS tools, diverse user groups, and constant pressure to "do more with less." The same features that make agents appealing—automating admissions, drafting grant reports, monitoring procurement, or large-scale tutoring—also pose risks when guidelines are unclear. Here's a practical approach that aligns with established security norms: begin with identity, limit the keys, monitor actions, capture outputs, and involve a human in any processes that impact finances, grades, credentials, or health records. In practice, this means using agent-specific service accounts, temporary and limited tokens, clear tool allow-lists, and runtime policies that prevent file modifications, network access, or API calls beyond approved domains. When an agent needs temporary elevated access—for example, to submit a purchase order—they require a second factor or explicit human approval. This is not just "AI safety"; it's access management for a communicative RPA.

Figure 2: The dominant failure is governance: almost all AI-related breaches occurred where basic AI access controls were absent—this is an agent/permissions problem, not a “chatbot” problem.

Implement that pattern with controls that educators and IT teams know well. Use NIST's GenAI profile as a checklist for procurement and deployment, map agent actions against your risk register, and develop scenarios for potential misuse, such as indirect prompt injection through student submissions, vendor documents, or public websites. Utilize the OWASP Top 10 for LLM apps to guide your testing: simulate prompt injection, ensure outputs do not activate unvetted tools, and test input variability. Follow the UK NCSC's deployment recommendations: safeguard sensitive data, run code in a secure environment, track all agent activities, and continually watch for unusual behaviors. Finally, treat agent credentials with utmost importance. If Unit 42 warns about agent impersonation, take it seriously—update keys, restrict tokens to single tools, and store them in a managed vault with real-time access. These are security practices already in place; the change lies in applying them to software that mimics user behavior.

Education leaders must also rethink their governance expectations. Shadow AI is not a moral issue; it's a procurement and enablement challenge. Staff members will adopt tools that work for them. If central IT does not provide authorized agents with clear features, people will use browser extensions and paste API keys into unapproved apps. The IBM data is crystal clear: unmanaged AI contributes to breaches and escalates costs. An effective response is to create a "campus agent catalog" of approved features, with levels of authorization: green (read-only), amber (able to write to internal systems with human oversight), and red (financial or identity actions requiring strict control). Combine this with a transparent audit process that tracks agent actions as you would for any other enterprise service. Encourage use by making the approved route the easiest option: set up pre-configured tokens, vetted toolchains, and one-click workspaces for departments. A culture of security will follow convenience.

Objections will arise. Some may argue that strict output filters and safer prompts suffice; others might insist that "our model never connects to the internet." However, documented incidents show that content-layer restrictions fail when the agent is trusted to act, and "offline" quickly becomes irrelevant once plugins and file access are included. A more thoughtful critique may center on costs and complexity. This is valid—segmented sandboxes, egress filtering, and human oversight can slow down teams. The key is focusing on impact: prioritize where agents can access finances or permissions, not where they merely suggest reading lists. The additional cost of limiting an agent that interacts with student finances is negligible compared to the expense of cleaning up after a credential-based breach. As with any modernization effort, we implement changes in stages and sequence risk reduction; we do not wait for a flawless control framework to ensure safer defaults.

Incidents in the U.S. cost double-digit millions, with a growing number tied to AI systems lacking essential access controls. The evidence has consistently pointed in one clear direction. The issue is not the sophistication of a model; it is the capability of software connected to your systems with your keys. Images, web pages, and files now serve as operational inputs that can prompt an agent to act. If we accept this perspective, the path forward is familiar. Treat agents as superusers deserving the same scrutiny we apply to service accounts and privileged automation: default to least privilege, explicitly restrict tool access, implement content origin and sanitization rules, maintain comprehensive logs, monitor for anomalies, and require human approval for high-impact actions. In education, where trust is essential and budgets are tight, this is not optional. It is necessary for unlocking significant productivity safely. Identify the risk where it exists, and we can mitigate it. If we keep viewing it as a "chatbot" issue, we will continue to pay for someone else's actions.


The views expressed in this article are those of the author(s) and do not necessarily reflect the official position of the Swiss Institute of Artificial Intelligence (SIAI) or its affiliates.


References

Axios. (2025, May 6). New cybersecurity risk: AI agents going rogue.
Cybersecurity and Infrastructure Security Agency (CISA) et al. (2024, Apr. 15). Deploying AI Systems Securely (Joint Cybersecurity Information).
Cybersecurity Dive (Jones, D.). (2025, Aug. 11). Research shows AI agents are highly vulnerable to hijacking attacks.
IBM. (2025). Cost of a Data Breach Report 2025.
IBM Newsroom. (2025, Jul. 30). 13% of organizations reported breaches of AI models or applications; 97% lacked proper AI access controls.
Infosecurity Magazine. (2025, Jun.). #Infosec2025: Concern grows over agentic AI security risks.
Kiplinger. (2025, Sept.). How AI puts company data at risk.
NCSC (UK). (n.d.). Machine learning principles: Secure deployment.
NCSC (UK). (2024, Feb. 13). AI and cyber security: What you need to know.
NIST. (2024). Artificial Intelligence Risk Management Framework: Generative AI Profile (NIST AI 600-1).
OWASP. (2023–2025). Top 10 for LLM Applications.
Palo Alto Networks Unit 42. (2025). AI Agents Are Here. So Are the Threats.
Scientific American (Béchard, D. E.). (2025, Sept.). The new frontier of AI hacking—Could online images hijack your computer?
Trend Micro. (2025, Apr. 22). Unveiling AI agent vulnerabilities—Part I: Introduction.
Trend Micro. (2025, Jul. 29). State of AI Security, 1H 2025.
Zenity Labs. (2025, May 1). RSAC 2025: Your Copilot Is My Insider.
Zenity Labs (PR Newswire). (2025, Aug. 6). AgentFlayer vulnerabilities allow silent hijacking of major enterprise.

Picture

Member for

1 year 1 month
Real name
Catherine Maguire
Bio
Catherine Maguire is a Professor of Computer Science and AI Systems at the Gordon School of Business, part of the Swiss Institute of Artificial Intelligence (SIAI). She specializes in machine learning infrastructure and applied data engineering, with a focus on bridging research and large-scale deployment of AI tools in financial and policy contexts. Based in the United States (with summer in Berlin and Zurich), she co-leads SIAI’s technical operations, overseeing the institute’s IT architecture and supporting its research-to-production pipeline for AI-driven finance.

The Barbell Workforce: How AI Lowers the Floor and Raises the Ceiling in Education and Industry

The Barbell Workforce: How AI Lowers the Floor and Raises the Ceiling in Education and Industry

Picture

Member for

1 year 2 months
Real name
Keith Lee
Bio
Keith Lee is a Professor of AI and Data Science at the Gordon School of Business, part of the Swiss Institute of Artificial Intelligence (SIAI), where he leads research and teaching on AI-driven finance and data science. He is also a Senior Research Fellow with the GIAI Council, advising on the institute’s global research and financial strategy, including initiatives in Asia and the Middle East.

Modified

AI lowers entry barriers, raises mastery standards
Novices gain most; experts move to oversight and design
Education must deliver operator training and governance mastery

A quiet result from a very loud technology deserves more attention. When a Fortune 500 company gave customer-support agents an AI assistant, productivity rose by 14% on average, but it jumped by 34% for the least-skilled workers. In other words, the most significant early gains from generative AI appeared at the bottom of the ladder, not the top. This single statistic changes how we view the skills debate. Automation is not just about replacing routine jobs; it also helps novices perform at nearly intermediate levels from day one. In factories and offices, this creates a barbell labor market: easier entry for low-skill roles and a faster-moving ceiling for highly skilled workers. Education systems—secondary, vocational, and higher—must change to a world where AI lowers the competence floor and raises the standard for mastery. The policy question is not whether AI will change work; it already has. The question is whether our learning institutions will change at the same speed. The urgency and necessity of ongoing curriculum evolution cannot be overstated.

The New Floor: Automation Expands the Entry Ramp

Evidence from large-scale deployments shows a clear trend. On the factory side, global robot stocks reached about 4.28 million operational units in 2023, increasing roughly 10% in a year, with Asia accounting for 70% of new installations. However, the new wave is less about brute automation and more about software that makes complex equipment understandable to less experienced operators. In administrative settings, AI replaces routine desk tasks; on shop floors, it often supports people by integrating guidance, quality control, and predictive maintenance into the tools themselves. Recent data show an overall shift towards higher demand for machine operators and younger, less-credentialed workers, even as clerical roles continue to shrink. This creates a two-sector split that policymakers cannot ignore.

Figure 1: Firm adoption accelerates sharply after 2015 and inflects again post-2020, showing that AI is now a broad managerial and production input rather than a niche tool.

The mechanism is straightforward. Generative systems capture the playbook of top performers and provide it to novices in real time. In extensive field studies, AI assistants helped less-experienced workers resolve customer issues faster and with better outcomes, effectively condensing months of on-the-job learning into weeks. This aligns with another line of research indicating that access to a general-purpose language model can reduce writing time by about 40% while improving quality, especially for those starting further from the frontier. These are not trivial edge cases or lab-only outcomes; they have now been observed in real workplaces across different task types. For entry-level employees, AI acts as a tutor and a checklist hidden within the workflow.

If this is the new floor, two implications arise for education. First, basic literacy—numeracy, writing, data hygiene—still matters, but the threshold for job-ready performance is shifting from “memorize the rules” to “operate the system that encodes the rules.” Second, AI exposure is unevenly distributed. Cross-country analysis suggests that occupations with high AI exposure are disproportionately white-collar and highly educated. However, this exposure has not led to widespread employment declines to date; in several cases, employment growth has even been positively linked to AI exposure over the past decade. This is encouraging, but it also means that entry ramps are being widened most where AI tools are actively used. Schools and training programs that treat AI as taboo risk exclude students from the very complementarity that drives early-career productivity. Educators and policymakers must ensure equitable access to AI, providing fair opportunities for all.

The Moving Ceiling: Why the Highly Skilled Must Climb Faster

It is tempting to think that if AI boosts lower performers the most, it lowers the value of expertise. However, the emerging evidence suggests the opposite: mastery changes shape and moves upward. In a study of software development across multiple firms, access to an AI coding assistant increased output by about 26% on average. Juniors saw gains in the high-20s to high-30s, while seniors experienced single-digit increases. This does not render senior engineers redundant; it raises the expectations for what “senior” should mean—less focus on syntax and boilerplate and more emphasis on architecture, verification, and socio-technical judgment. In writing and analysis, a similar pattern emerges: significant support on routine tasks and uneven benefits on complex, open-ended problems where human oversight and domain knowledge are crucial. The ceiling is not lower; it is higher and steeper.

This helps explain a paradox in worker sentiment. Even as tools improve speed and consistency, most workers say they rarely use AI, and many are unsure if the technology will benefit their job prospects. Only about 6% expect more opportunities from workplace AI in the long run; a third expect fewer. From a barbell perspective, this hesitation is logical: if AI handles standard tasks, the market will reward those who can operate the system reliably (new entry-level roles) and those who can design, audit, and integrate it across processes (new expert roles). The middle, where careers once developed over many years, is compressing. Education that does not teach students how to climb—from tool use to tool governance—will leave graduates stuck on the flattened middle rung.

For high-skill workers, the solution is not generic “upskilling” but specialization beyond the model’s capabilities: data stewardship, human-factors engineering, causal reasoning, adversarial testing, and cross-domain synthesis. Studies of knowledge workers show that performance can improve dramatically for tasks “inside” the model’s capabilities, but it can decline on tasks “outside” it if workers overly trust fluent outputs. This asymmetry is where advanced programs should focus: teaching when to rely on the model and when to question it. Think fewer assignments aimed at producing a clean draft and more assignments aimed at proving why a draft is correct, safe, and fair, with the model as a visible, critiqued collaborator rather than a hidden ghostwriter.

Designing a Two-Track Education Agenda

If AI lowers the entry threshold and raises the mastery bar, education policy should explicitly support both tracks. On the entry side, we need programs that quickly and credibly certify “operator-with-AI” competence. Manufacturing already sets an example. With robots at record scale and software guiding the production line, short, modular training can prepare graduates to operate systems that once required years of implicit knowledge. Real-time decision support, simulation-based training, and built-in diagnostics reduce the time it takes new hires to become productive. Community colleges and technical institutes that collaborate with local employers to design “Level 1 Operator (AI-assisted)” certificates will broaden access while addressing genuine demand.

Figure 2: Registrations cluster in a handful of application fields (panel a) and are concentrated in management and production uses (panel b), underscoring why education must prepare both operator-with-AI roles and system-level oversight.

The office counterpart is just as practical. Instead of prohibiting AI from assignments and then hoping for honesty, instructors should require paired submissions: a human-only baseline followed by an AI-assisted revision with a brief error log. This approach preserves practice in core skills while teaching students to view systems as amplifiers rather than crutches. It also instills the meta-skills that employers value but often do not assess: prompt management, fact verification, and iterative critique. Early field results indicate that novices benefit most from this structure; schools can gain a similar advantage by incorporating this scaffold into their rubrics.

For the mastery track, universities should shift focus toward governance literacy and system integration. Capstone projects should include model selection and evaluation under constraints, robustness testing, and comprehensive documentation that can be audited by a third party. Practicums can use real data from operations (help desks, registrars, labs) with explicit permissions and guidelines, allowing students to study not only performance improvements but also potential failures. Employers already indicate that adoption, not invention, is the primary barrier; surveys across industries show that enthusiasm outpaces readiness, with leadership, skills, and change management identified as key obstacles. This is a problem education can solve—if curricula are allowed to evolve at the pace of deployment rather than the pace of textbook cycles.

There is also a role for public policy to ensure that the floor rises effectively. Two main approaches stand out. First, expand last-mile apprenticeships linked to AI-enabled roles: a semester-long “operator residency” in advanced manufacturing, a co-op in data-supported student services, and a supervised stint in clinical administration using AI for scheduling and triage. Second, build assessment systems that align incentives: state systems could fund verification labs that test whether graduates can manage, monitor, and explain AI-assisted workflows to professional standards. These are foundational capacities, akin to welding booths or nursing mannequins from an earlier era. They make the invisible visible and certify what truly matters.

Skeptics will raise three reasonable critiques. One concern is that automation may lead to deskilling: if AI takes over tasks such as grammar or standard coding, will students lose foundational skills? Evidence suggests that when curricula sequence tasks—first unaided, then supported with explicit reflection—skills improve rather than deteriorate. A second critique is that AI adoption in the real world remains inconsistent; most workers currently report little to no use of AI in their jobs. This situation strongly argues for education to bridge the usage gap, enabling early-career workers to drive diffusion. A third critique concerns equity: will the benefits be distributed to those who already have access to better schools and devices? This risk is real; however, studies showing significant effects for novices also indicate that universal access and instruction can help reduce inequality. The challenge for policy is to ensure that this complementarity is broad, not exclusive.

34% productivity gains for the least-skilled workers serve as a reminder that AI’s most apparent benefits are not just for specialists. On the factory floor, software-guided machines allow newer operators to contribute sooner. In the office, embedded co-pilots help transform rough drafts into solid first versions. That is the “lowers the floor” aspect. The raised ceiling means that as standard tasks become faster and more uniform, fundamental value shifts to design, verification, and integration—the judgment calls that automation cannot replace. Education must embrace both sides. It must teach students how to use the tools without losing the craft and to take responsibility for the aspects of work that tools cannot handle. This requires credentialing operator competence early, developing governance mastery later, and measuring both through honest assessments. If we act now, the barbell workforce can be a deliberate policy choice rather than a chance occurrence—expanding opportunities at entry and deepening expertise at the top.


The views expressed in this article are those of the author(s) and do not necessarily reflect the official position of the Swiss Institute of Artificial Intelligence (SIAI) or its affiliates.


References

Brynjolfsson, E., Li, D., & Raymond, L. (2025). Generative AI at Work. Quarterly Journal of Economics, 140(2), 889–942.
International Federation of Robotics. (2024, Sept. 24). Record of 4 million robots working in factories worldwide. Press release and global market.
ManufacturingDive. (2025, Apr. 14). Top 3 challenges for manufacturing in 2025: Skills gap, turnover, and AI.
ManufacturingTomorrow. (2025, Aug.). How to Use AI to Close the Manufacturing Skills Gap.
MIT Sloan Ideas Made to Matter. (2024, Nov. 4). How generative AI affects highly skilled workers.
MIT Sloan Ideas Made to Matter. (2025, Mar. 10). 5 issues to consider as AI reshapes work.
Noy, S., & Zhang, W. (2023). Experimental evidence on the productivity effects of generative artificial intelligence. Science, 381.
Pew Research Center. (2025, Feb. 25). U.S. workers are more worried than hopeful about future AI use in the workplace; and Workers’ exposure to AI.
De Souza, G. (2025). Artificial Intelligence in the Office and the Factory: Evidence from Administrative Software Registry Data. Federal Reserve Bank of Chicago Working Paper 2025-11; and VoxEU column summary.

Picture

Member for

1 year 2 months
Real name
Keith Lee
Bio
Keith Lee is a Professor of AI and Data Science at the Gordon School of Business, part of the Swiss Institute of Artificial Intelligence (SIAI), where he leads research and teaching on AI-driven finance and data science. He is also a Senior Research Fellow with the GIAI Council, advising on the institute’s global research and financial strategy, including initiatives in Asia and the Middle East.

Impeachment’s Uncertainty Premium: How to Protect Accountability Without Unraveling Mandates

Impeachment carries an “uncertainty premium,” seen in a 14.6% drop in Korea’s FDI pledges
Use impeachment as an emergency brake, not routine politics, to protect electoral mandates and stability
Adopt guardrails—one-shot filings, fast voter-centered elections, judicial minimalism—and ring-fence education budgets

From Parrots to Partners: A Policy Blueprint for AI-Literate Learning

From Parrots to Partners: A Policy Blueprint for AI-Literate Learning

Picture

Member for

1 year 2 months
Real name
Keith Lee
Bio
Keith Lee is a Professor of AI and Data Science at the Gordon School of Business, part of the Swiss Institute of Artificial Intelligence (SIAI), where he leads research and teaching on AI-driven finance and data science. He is also a Senior Research Fellow with the GIAI Council, advising on the institute’s global research and financial strategy, including initiatives in Asia and the Middle East.

Modified

AI use is ubiquitous; current assessments reward fluency over thinking
Grade process, add brief vivas, and require transparent AI-use disclosure
Train teachers, ensure equity, and track outcomes to make AI a partner

Eighty-eight percent of university students report they now use generative AI to prepare assessed work, an increase from just over half a year ago. This significant shift in AI usage, while promising, also raises concerns. Nearly one in five students admits to pasting AI-generated text, whether edited or not, directly into their submissions. At the same time, new PISA results show that about one in five 15-year-olds across OECD countries struggle even with simpler creative-thinking tasks. This data highlights the need for a balanced approach to AI integration in education. The current trend reveals a growing disparity between the speed at which students can produce plausible text and the slower, more challenging task of generating ideas and making informed judgments. When we label chatbots as 'regurgitators,' we risk overlooking the real issue: a system that rewards fluent output over clear thinking, tempting students to outsource the work that learning should reinforce. The goal should not be to ban autocomplete; it should be to make cognitive effort noticeable again and valuable.

What We Get Wrong About "Regurgitation"

Calling large language models parrots lets institutions off the hook. Students respond to the incentives we create. For two decades, search engines have made copying easy; now language models make it quick to paraphrase and structure ideas. The issue isn't that students have suddenly become less honest. Many assessments still value smooth writing and recall more than the products of reasoning. Consider what educators report: a quarter of U.S. K-12 teachers believe AI does more harm than good, yet most lack clear guidance on how to use or monitor it. Teacher training is increasing but remains inconsistent. In fall 2024, fewer than half of U.S. districts had trained teachers on AI; by mid-2025, only about one in five teachers reported that their school had an AI policy. Confusion in the classroom translates into ambiguity for students. They will do what feels normal, quick, and safe.

The common, fast, "safe" use case is for ideation and summarization, not careful drafting. UK survey data from early 2025 indicate that the most frequent uses by students are explaining concepts and summarizing articles; using AI at any stage of assessment has become the standard rather than the exception. Teen usage for schoolwork is on the rise, but is still far from universal, suggesting a spread pattern where early adopters set norms that others follow under pressure. If we tell students to "use it like Google for ideas, not as a ghostwriter," we must assess in a way that clearly shows the difference. Right now, many find it hard to see a practical distinction. As detection methods become uncertain—and many major vendors avoid issuing low AI "scores" to minimize false positives—monitoring output quality alone cannot ensure academic integrity. We need a better design at the beginning.

Figure 1: Most students use AI upstream for understanding and summarising; a smaller—but policy-critical—minority move AI text into assessed work. The risk concentrates where grades reward fluent output over visible thinking.

The deeper risk isn't just copying; it's cognitive offloading. Several recent studies and evaluations, ranging from classroom surveys to EEG-based laboratory work, suggest that regular reliance on chatbots diverts mental effort away from planning, retrieval, and evaluation processes where learning actually occurs. These findings are still early and not consistent across tasks, but the trend is clear: when we let models draft or make decisions, our own attention and self-reflection can weaken. This doesn't mean AI cannot be helpful; it means we need to create tasks where human input is necessary and valued.

The Evidence—And What It Actually Implies

If 88% of students now use generative tools at some stage of assessment and 18% paste AI-generated text, we need to grasp the patterns behind these numbers. The same UK survey shows that the main uses dominate, with a quarter of students drafting with AI before making revisions; far fewer copy unedited text. In short, "regurgitation" isn't the average behavior, but it is a visible trend—and it becomes tempting in courses that reward speed and surface fluency. A Guardian analysis of misconduct cases in the UK shows that confirmed AI-related cheating increased from 1.6 to 5.1 per 1,000 students year-over-year, while traditional plagiarism declines; universities admit that detection tools struggle, and more than a quarter still do not track AI violations separately. Relying solely on enforcement cannot fix what assessment design encourages. (Method note: the Guardian figure combines institutional returns and likely undercounts untracked cases, potentially understating the actual issue.)

When we compare student abilities, we see the tension. In PISA 2022's first creative-thinking assessment, Singapore led with a mean score of 41/60; the OECD average was 33, and about one in five students couldn't complete simpler ideation tasks. Creative-thinking performance correlates with reading and math, but not as closely as those core areas relate to each other, suggesting that both practice and teaching—not just content knowledge—shape ideation skills. If AI speeds up production but our system does not clearly teach and evaluate creative thinking, students will continue to outsource the very steps we neglect.

Figure 2: Creative-thinking capacity is uneven: the OECD average has over one in five students below the baseline, while leading systems keep low-performer shares in single digits—evidence that practice and pedagogy matter.

What about the claim that AI is simply making us worse thinkers? Early findings are mixed and depend on context. Lab work from MIT Media Lab indicates reduced brain engagement and weaker recall in writing assisted by LLMs compared to "brain-only" conditions. Additionally, a synthesis notes that students offload higher-order thinking to bots in ways that can harm learning. Yet other studies, especially in structured settings, show improved performance when AI handles the routine workload, allowing students to focus their efforts on analysis. The key factor isn't the tool; it's the task and what earns credit. (Method note: many studies rely on small samples or self-reports; the best assumption is directional rather than definitive.)

Meanwhile, educators and systems are evolving, though unevenly. Duke University's pilot program offers secure campus access to generative tools, enabling the testing of learning effects and policies on a larger scale. Stanford's AI Index chapter on education notes an increasing interest among teachers in AI instruction, even as many do not feel prepared to teach it. Surveys through 2025 indicate that teachers using AI save time, and a growing, albeit still minority, share of schools have clear policies in place. In short, the necessary professional framework is developing, but slowly and with gaps. Students experience this gap as a result of mixed messages.

We should also be realistic about detection methods. Turnitin's August 2025 update specifically withholds percentage scores below 20% to reduce false positives, acknowledging that distinguishing between model-written and human-written text can be challenging at low levels. Academic integrity cannot depend on a moving target. Instead of searching for "AI DNA" after the fact, we can create assignments so that genuine thinking leaves evidence while it happens.

A Blueprint for Cognitive Integrity

If the ideal scenario is to use AI like a search tool—an idea partner rather than a ghostwriter—we need policies that make human input visible and valuable. The first step is to grade for process. Require a compact "thinking portfolio" for major assignments: a log of prompts used, a brief explanation of how the tool influenced the plan, the outline or sketch created before drafting, and a quick reflection on what changed after receiving feedback. This does not need to be burdensome: two screenshots, 150 words of rationale, and an outline snapshot would suffice. Give explicit credit for this work—perhaps 30–40% of the grade—so that the best way to succeed is to engage in thinking and demonstrate it. When possible, conclude with a brief viva or defense in class or online: five minutes, with two questions about choices and trade-offs. If a student cannot explain their claim in their own words, the problem lies in learning, not the software. (Method note: for a 12-week course with 60 students, two five-minute defenses per student add roughly 10 staff hours; rotating small panels can help manage this workload.)

The second step is to reframe tasks so that using ungrounded text is insufficient. Swap purely expository prompts with "situated" problems that require local data, classroom materials, or course-specific case notes that models will not know. Ask for two alternative solutions with an analysis of downsides; require one source that contradicts the student's argument and a brief explanation of why it was dismissed. Link claims to evidence from the course content, not just to generic literature. These adjustments force students to think within context, rather than just producing fluent prose.

Third, normalize disclosure with a simple classification. "AI-A" means ideation and outlining; "AI-B" refers to sentence-level editing or translation; "AI-C" indicates draft generation with human revision; "AI-X" means prohibited use. Students should state the highest level they used and provide minimal supporting materials. This treats AI like a calculator with memory: allowed in specific ways, with work shown, and banned where the skill being tested would be obscured. It also provides instructors with a common language, enabling departments to compare patterns across courses. (Method note: adoption is most effective when the classification fits on one rubric line, and the LMS provides a one-click disclosure form.)

Fourth, build teacher capacity quickly. Training at the district and campus levels increased in 2024, but it still leaves many educators learning on their own. Prioritize professional development on two aspects: designing tasks for visible thinking and providing feedback on process materials. Time saved by AI for routine preparation—which recent U.S. surveys estimate at around 1–6 hours weekly for active users—should be reinvested into richer prompts, oral evaluations, and targeted coaching. Teacher time is the most limited resource; policies must protect how it is used.

Fifth, address equity directly. Student interviews in the UK reveal concerns that premium models offer an advantage and that inconsistent policies across classes are perceived as unfair. Offer a baseline, institutionally supported tool with privacy safeguards; teach all students how to evaluate outputs; and ensure that those who choose not to use AI are not penalized by tasks that inherently favor rapid bot-assisted work. Gaps in creative thinking based on socioeconomic status indicate that we should prioritize practice that mitigates literacy bottlenecks—through visual expression, structured ideation frameworks, and peer review—so every student can develop the skills AI might distract them from.

Finally, measure what matters. Track the percentage of courses that evaluate process; the share employing short defenses; the distribution of student AI disclosures; and changes in results on assessments that cannot be faked by fluent text alone. Expect initial variation. Anticipate some resistance. But we make the human aspects of learning clear and valuable. In that case, the pressure to outsource will decline automatically in areas where we still need supervision—like professional licensure exams, clinical decisions, or original research—limit or prohibit generative use and explain the reasoning. The aim is not uniformity but clarity matched to the proper skills being assessed.

None of this requires waiting for standards bodies to take action. Universities can begin this semester; school systems can test it in upper-secondary courses right away. Institutions are already implementing this, with secure campus AI portals being tested in the U.S. and OECD member countries, which provide practical guidance on classroom use. Our policies should reflect this practicality: no panic or hype, just careful design.

The initial figure—eighty-eight percent—will only increase. We can continue to portray the technology as a parrot and hope to catch the worst offenders afterward, or we can adjust what earns grades so that the safest and quickest path is to think. The creative-thinking results remind us that many students need practice in generating and refining ideas, not just improving sentences. If we grade for process, hold small oral defenses, and normalize disclosure, we transform AI into the help it should be: a quick way to overcome obstacles, not a ghostwriter lurking in the shadows. This approach aligns incentives with learning honestly. It respects students by asking for their judgment and voice. It values teachers by compensating them in time for deeper feedback. And it reassures the public by ensuring that when a transcript indicates "competent," it means the student actually completed the work as required. The tools will continue to improve. Our policies can, too, if we design for visible thinking and view AI as a partner we guide, rather than a parrot we fear.


The views expressed in this article are those of the author(s) and do not necessarily reflect the official position of the Swiss Institute of Artificial Intelligence (SIAI) or its affiliates.


References

AP News. (2025, September 6). Duke University pilot project examining pros and cons of using artificial intelligence in college.

Gallup & Walton Family Foundation. (2025, June 25). The AI dividend: New survey shows AI is helping teachers reclaim valuable time.

Guardian. (2025, June 15). Revealed: Thousands of UK university students caught cheating using AI.

HEPI & Kortext. (2025, February). Student Generative AI Survey 2025.

Hechinger Report. (2025, May 19). University students offload critical thinking, other hard work to AI.

MIT Media Lab. (2025, June 10). Your Brain on ChatGPT: Accumulation of Cognitive Debt from LLM-Assisted Writing.

OECD. (2023, December 13). OECD Digital Education Outlook 2023: Emerging governance of generative AI in education.

OECD. (2024, June 18). New PISA results on creative thinking: Can students think outside the box? (PISA in Focus No. 125).

OECD. (2024, November 25). Education Policy Outlook 2024.

Pew Research Center. (2024, May 15). A quarter of U.S. teachers say AI tools do more harm than good in K-12 education.

Pew Research Center. (2025, January 15). About a quarter of U.S. teens have used ChatGPT for schoolwork—double the share in 2023.

RAND Corporation. (2025, April 8). More districts are training teachers on artificial intelligence.

Stanford HAI. (2025). The 2025 AI Index Report—Chapter 7: Education.

Turnitin. (2025, August 28). AI writing detection in the new, enhanced Similarity Report.

Picture

Member for

1 year 2 months
Real name
Keith Lee
Bio
Keith Lee is a Professor of AI and Data Science at the Gordon School of Business, part of the Swiss Institute of Artificial Intelligence (SIAI), where he leads research and teaching on AI-driven finance and data science. He is also a Senior Research Fellow with the GIAI Council, advising on the institute’s global research and financial strategy, including initiatives in Asia and the Middle East.