AI coding tools may not speed up every developer, study shows

Software engineering has shifted in recent years with the rise of AI coding tools like Cursor and GitHub Copilot. Powered by AI models from OpenAI, Google DeepMind, Anthropic, and xAI, these tools promise to boost developer productivity by automatically writing code, fixing bugs, and testing changes. The underlying models have also improved rapidly on a range of software engineering benchmarks.

However, a new study from METR, a non-profit AI research group, challenges the widespread assumption that today’s AI coding tools make experienced developers more productive.

For the study, METR ran a randomized controlled trial with 16 experienced open-source developers, who completed 246 real tasks on large code repositories they regularly contribute to. Roughly half of the tasks were designated “AI-allowed,” meaning developers could use state-of-the-art AI coding tools such as Cursor Pro. For the remaining tasks, AI assistance was explicitly prohibited.

Before starting their tasks, the developers forecast that AI coding tools would cut their completion time by an average of 24%. The results showed the opposite.

“Surprisingly, we find that allowing AI actually increases completion time by 19% — developers are slower when using AI tooling,” the researchers reported. This suggests that for experienced developers, the integration of AI tools might not always translate into immediate efficiency gains.
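To make the gap between forecast and result concrete, here is a minimal arithmetic sketch in Python. The 60-minute baseline and the function name `projected_minutes` are hypothetical, chosen only for illustration. The two percentages are the figures reported above, applied here as simple multipliers on a single task, even though the study reports them as averages across many tasks.

```python
def projected_minutes(baseline_minutes: float, percent_change: float) -> float:
    """Scale a baseline task time by a signed percentage change."""
    return baseline_minutes * (1 + percent_change / 100)


# Hypothetical task time without AI; this number is NOT from the study.
BASELINE_MINUTES = 60.0

# Figures reported in the article.
FORECAST_CHANGE = -24.0  # developers expected to take 24% less time with AI
OBSERVED_CHANGE = 19.0   # the study found tasks took 19% more time with AI

forecast = projected_minutes(BASELINE_MINUTES, FORECAST_CHANGE)
observed = projected_minutes(BASELINE_MINUTES, OBSERVED_CHANGE)

print(f"Forecast with AI: {forecast:.1f} min")
print(f"Observed with AI: {observed:.1f} min")
print(f"Gap between forecast and observed: {observed - forecast:.1f} min")
```

Under these assumptions, a task that takes an hour without AI would have been expected to take roughly 46 minutes with it, but would instead take roughly 71 minutes.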

Notably, only 56% of the developers in the study had prior experience with Cursor, the main AI tool provided. Nearly all of them (94%) had used web-based large language models (LLMs) in their coding workflows, but for some this was their first time using Cursor specifically. The researchers made sure participants received training on Cursor before the study began.

Nevertheless, METR’s findings raise questions about the universal productivity gains promised by AI coding tools in 2025. The study suggests developers should not assume that these tools, particularly so-called “vibe coders” (tools that generate code from natural-language prompts), will automatically speed up their workflows.

The METR researchers point to several possible reasons for the slowdown. When AI was allowed, developers spent considerably more time prompting the AI and waiting for its responses, and less time actually writing code. AI tools also tend to struggle with large, complex codebases, which is exactly the kind of environment the study used.

The study’s authors are careful not to draw overly broad conclusions. They explicitly state that they do not believe AI systems currently fail to speed up many or most software developers; in other words, the results should not be read as evidence that AI tools are unhelpful to developers in general. Other large-scale studies have shown AI coding tools speeding up software engineers’ workflows, with some reporting productivity gains of up to 26%.

The rapid pace of AI progress also means the picture could change quickly; the authors themselves suggest the results could differ significantly within a few months. METR has also observed that AI coding tools have become substantially better at completing complex, long-horizon tasks in recent years.

Despite these caveats, the METR study adds to a growing body of evidence that the immediate, universal benefits of AI coding tools deserve some skepticism. Prior research has found that today’s AI coding tools can introduce mistakes and, in some cases, security vulnerabilities. AI may hold real promise for software development, but developers adopting it should keep their expectations realistic.