AI tools don’t always increase productivity. A recent study from Model Evaluation and Threat Research found that when 16 software developers were asked to complete tasks using AI tools, they took longer than when they weren’t using the technology, despite their expectations that AI would increase productivity. The research challenges the dominant narrative of artificial intelligence leading to increased workplace efficiency.
It’s a new twist on “The Tortoise and the Hare”: a group of experienced software engineers took part in an experiment in which they completed part of their work with the help of AI tools. Playing the hare, the developers expected AI to speed up their work and boost their productivity. Instead, the technology slowed them down. The tortoise approach, working without artificial intelligence, would have been faster in the context of the experiment.
The results of this experiment, part of a recent study, came as a surprise to software developers tasked with using AI — and to the study’s authors, Joel Becker and Nate Rush, members of the technical staff at the nonprofit technology research organization Model Evaluation and Threat Research (METR).
The researchers enlisted 16 software developers, who had an average of five years of experience, to perform 246 tasks, each part of projects they were already working on. For half of the tasks, developers were allowed to use AI tools (most chose the code editor Cursor Pro with Claude 3.5/3.7 Sonnet), and for the other half, developers performed the tasks on their own.
Believing that AI tools would make them more productive, the developers predicted the technology would cut their time to complete tasks by an average of 24%. Instead, using AI increased their completion time by 19% compared with working without it.
“While I like to think that my productivity didn’t suffer when I used AI for my tasks, it’s not unlikely that it didn’t help me as much as I anticipated, or maybe even hindered my efforts,” Philipp Burckhardt, a study participant, wrote in a blog post about his experience.
So where did the hares stray from the path? Experienced developers, deep in their own projects, likely approached their work with a great deal of context that their AI assistants didn’t have, meaning they had to adapt their own plans and problem-solving strategies to the AI’s outputs, which they also spent a lot of time debugging, according to the study.
“Most of the developers who participated in the study noted that even when they get AI output that is broadly useful to them, and AI can often do very impressive work, they have to spend a lot of time cleaning up the resulting code to make it truly suitable for the project,” Rush, one of the study’s authors, told Fortune.
Other developers lost time writing prompts for the chatbots or waiting for the AI to generate results.
The study’s findings contradict lofty promises about AI’s ability to transform the economy and workforce, including a 15% increase in U.S. GDP by 2035 and, ultimately, a 25% increase in productivity. In fact, many companies have yet to see a return on investment in AI. An MIT report published in August found that out of 300 AI implementations, only 5% achieved rapid revenue acceleration. Only 6% of companies fully trust AI to manage core business practices, according to a Harvard Business Review Analytic Services research report published last month.
But Rush and Becker shied away from making sweeping claims about what their study results mean for the future of AI.
First, the study sample was small and not generalizable, comprising only a specialized group of developers for whom these AI tools were new. The study also measured the technology at a specific point in time, the authors said, and does not rule out the possibility that future AI tools could genuinely help developers improve their workflows.
Broadly, the aim of the study was to pump the brakes on the headlong adoption of AI in the workplace and elsewhere, on the grounds that more data about AI’s real-world effects should be gathered and made accessible before further decisions are made about its applications.
“Some of the decisions we’re making now about the development and deployment of these systems are potentially very far-reaching,” Rush said. “If we’re going to do this, let’s not just take the obvious answer. Let’s do high-quality measurements.”
Economists have already argued that METR’s research aligns with larger narratives about AI and productivity. While AI is beginning to encroach on entry-level positions, according to LinkedIn’s head of economic opportunity Aneesh Raman, it can offer diminishing returns for skilled workers like experienced software developers.
“For those people who already have 20 years of experience, or in this specific example five years of experience, maybe they’re not the first ones we should be targeting and pushing to start using these tools if they’re already working well with their existing methods,” Anders Humlum, assistant professor of economics at the University of Chicago’s Booth School of Business, told Fortune.
Humlum has conducted similar research on the impact of AI on productivity. In a May working paper, he found that among 25,000 workers across 7,000 workplaces in Denmark, a country with AI uptake similar to that of the U.S., productivity improved by 3% among employees using the tools.
Humlum’s research supports MIT economist and Nobel laureate Daron Acemoglu’s claim that markets have overestimated productivity gains from AI. Acemoglu estimates that only 4.6% of tasks in the U.S. economy will be streamlined with artificial intelligence.
“In the rush to automate everything, even processes that shouldn’t be automated, companies will waste time and energy and get none of the promised productivity benefits,” Acemoglu previously wrote for Fortune. “The hard truth is that achieving productivity gains from any technology requires organizational adjustment, a range of complementary investments, and improvements in worker skills through training and on-the-job learning.”
The case of hampered software-developer productivity points to the need for critical thinking about when AI tools are deployed, Humlum said. While previous AI productivity research has relied on self-reported data or on specific, contained tasks, data on the challenges skilled workers face when using the technology complicates the picture.
“In the real world, many tasks are not as easy as just typing a request into ChatGPT,” Humlum said. “Many experts have a lot of experience [they’ve] accumulated that is extremely beneficial, and we should not just ignore that and discard that valuable expertise.”
“I would take this as a good reminder to be very cautious when using these tools,” he added.
A version of this story was originally published on Fortune.com on July 20, 2025.