AI & Our Work

Why Anthropic's Coding Study Should Make Every IT Leader Think Twice

A new randomized trial finds AI assistance cut coding mastery by 17 percentage points in junior developers — not because the tools are bad, but because of how they were used. Higher education IT organizations have particular reason to pay attention.

[Photo: A student working on a laptop in a UT Austin campus setting]
New Research — Anthropic Economic Index

The problem isn't the tool. It's whether you think while you use it.

A randomized controlled trial from Anthropic finds that how junior developers interact with AI — not whether they use it — determines what they learn. The implications for higher education IT run deeper than most organizations have considered.

Anthropic published a randomized controlled trial this month that deserves careful reading by anyone leading a technology organization. The study assigned 52 junior software engineers to learn a new Python library either with or without AI assistance. The AI group finished faster — by about two minutes. But when tested on the same material, they scored 17 percentage points lower. Nearly two letter grades. The productivity gain was negligible. The knowledge gap was not.

The finding is not a verdict against AI tools. The researchers were careful to say so, and the data supports that caution. The developers who used AI well — who asked conceptual questions, who generated code and then interrogated it, who treated the tool as a thinking partner rather than an answer machine — performed just as well as the hand-coding group. The study's most important finding isn't the headline number. It's the interaction pattern data.

The pattern that predicted mastery

Participants who asked only conceptual questions, seeking understanding rather than code, were the fastest high-scorers in the study. Those who delegated entirely to AI averaged below 40 percent on the same test. The difference wasn't tool access. It was whether the developer stayed in the loop.

What this means for higher education IT specifically

Higher education IT organizations face this challenge at two levels simultaneously. First, like any technology organization, we are building and maintaining complex systems with teams that increasingly include early-career developers who have grown up with AI assistance as a default. Second, unlike most technology organizations, our institutional mission is explicitly tied to learning, which means the question of how AI affects skill formation is not just an internal HR concern. We owe it to the faculty and students who depend on us to think it through carefully.

The debugging finding in the study is the most operationally significant for teams like ours. The researchers found that the ability to understand when code is incorrect — and why — showed the largest gap between the AI-assisted and hand-coding groups. For an organization that maintains identity management, security infrastructure, learning management systems, and data platforms serving 80,000 community members, the ability to diagnose failures is not optional. It is the core competency. You cannot supervise AI-generated code you cannot read.
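To make that concrete, here is a minimal, hypothetical Python sketch of the kind of plausible-looking code an assistant can produce. It is not from the study; the function names and scenario are invented for illustration. The code runs and passes a casual glance, but it carries a defect that only someone who actually reads it will catch.

```python
from datetime import datetime, timedelta

# Hypothetical illustration (not from the Anthropic study): plausible-looking
# session-expiry code with a defect a reviewer must be able to spot.

def sessions_expiring_soon(sessions, within_minutes=30, now=datetime.utcnow()):
    """Return sessions whose expiry falls within the given window."""
    # BUG: the `now=datetime.utcnow()` default is evaluated ONCE, when the
    # function is defined. In a long-running service the reference time never
    # advances, so the check silently drifts and stops flagging sessions.
    cutoff = now + timedelta(minutes=within_minutes)
    return [s for s in sessions if s["expires_at"] <= cutoff]

def sessions_expiring_soon_fixed(sessions, within_minutes=30, now=None):
    """Same check, with the clock resolved on every call."""
    now = now if now is not None else datetime.utcnow()
    cutoff = now + timedelta(minutes=within_minutes)
    return [s for s in sessions if s["expires_at"] <= cutoff]
```

A test written five minutes after this code would pass; the failure only appears once the process has been running for a while. That is the kind of defect a team can only diagnose if someone can actually read the code.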

The workforce development question we haven't asked loudly enough

Most IT organizations have built their AI adoption strategies around access and efficiency. Which tools do we license? How do we get staff using them? What productivity gains can we measure? These are reasonable questions. But the Anthropic study suggests a prior question matters more: what usage patterns are we modeling and rewarding? If early-career developers observe that delegation is faster and that speed is what gets recognized, that is the behavior the organization will get — along with the knowledge gaps that follow.

At ET, we have been working to get this right on both sides. For our own technical staff, that means AI tools are part of the work environment — but so are code review practices, knowledge-sharing expectations, and a team culture that treats understanding as the metric, not output volume. For the campus community, it means programs like the ET Faculty Fellows are positioned to investigate exactly these questions in academic contexts: what does it mean to teach with AI in the room, and how do we know students are building knowledge rather than borrowing it?

You cannot supervise AI-generated code you cannot read. That is true for a junior developer shipping a feature. It is equally true for a student submitting an assignment.

What we think good practice looks like

The study's high-performing usage patterns share a common thread: the developer stayed cognitively active. They used AI to accelerate inquiry, not to replace it. The "conceptual inquiry" group — who asked only clarifying questions rather than requesting code — outperformed every other pattern and finished fastest. That finding aligns with what good mentorship has always looked like: you ask questions that make the learner think, not questions that hand them the answer.

For organizations deploying AI tools to technical staff, this suggests a training emphasis that is less about tool features and more about interaction discipline. What question are you actually trying to answer? Can you explain what the generated code is doing? What would you do if the tool weren't available? These are not restrictions on AI use. They are the habits that make AI use sustainable.

Read the research and explore ET's AI work
Anthropic: AI Assistance and Coding Skills (full study)
ET Faculty Fellows: investigating questions like these
AI Tools at UT Austin
UT AI Policy
AI-assisted draft

This story was developed with AI support as part of the writing and editing workflow.