1. Autonomous AI employees are inevitable
  • Over the past year, autonomous AI agents – which carry out self-determined tasks across multiple tools to meet predetermined, human-set goals – have evolved from a niche, emergent branch of generative AI into an inevitability.
  • As we’ve documented in great detail over the 18 months since OpenAI introduced ChatGPT in Nov 2022, generative AI has come to dominate the tech landscape, becoming increasingly commonplace. A Reuters Institute survey released last week found that 7% of the US online population now use ChatGPT every day. A Gartner survey released last month found that, as of Q4 2023, 29% of businesses are using generative AI.
  • Nevertheless, the space has kept moving. What began as a set of early experiments has since become a vision of LLMs that can carry out long-range activities and serve as virtual workers and super-capable senior colleagues. In an interview last month, OpenAI CEO Sam Altman said the next killer app for AI would be akin to a “super-competent colleague that knows absolutely everything about my whole life, every email, every conversation I’ve ever had, but doesn’t feel like an extension.”
  • One of the most notable developments in recent months is the debut in Mar 2024 of Cognition Labs’ Devin, which it calls “the first AI software engineer.” Unlike code-completion assistants, Devin operates autonomously once provided with an initial objective. It can use the same software a human would use – such as a command-line shell, code editor, and web browser inside its own compute environment – to build a project. It can develop a plan, review API documentation, debug its own software, provide regular reports on its progress, and seek human feedback as needed. The month after Devin’s release, Cognition Labs raised $175M at a $2B valuation.
  • Devin, which is available via an early request waitlist, was reportedly trained using OpenAI models. In demos, Devin has shown the ability to build and deploy apps end-to-end, learn to use unfamiliar technologies through web browsing, find and fix bugs in codebases, train and finetune its own AI models, address bugs and feature requests in open-source repositories, and contribute to mature production repositories – all autonomously. One early adopter (Ethan Mollick) had Devin create a Reddit thread to accept crowdsourced website-build requests to work on. The AI agent began charging for its work at a rate of $50-100/hour before the human overseer shut it down. Devin has also responded to a real job request posted on Upwork to make inferences with a computer-vision model – although some developers have said the demos are deceptive and Devin isn’t quite as capable as reported.
  • Devin is far from the only autonomous AI agent focused on coding. An independent open-source version called OpenDevin was released the same month Devin was introduced (Mar 2024). There is also Magic AI (which is working on its own models) and Replit’s Code Repair, as well as open-source tools Aider (May 2023), SWE-agent (Apr 2024), and AutoCodeRover (Apr 2024), among others.
  • There are also a growing number of platforms to help users build their own agents, such as Google’s Vertex AI Agent Builder, Microsoft’s Copilot Studio, SuperAGI’s full-stack AGI (artificial general intelligence) platform, Brevian’s no-code enterprise platform, NEXA AI’s AI Agent foundation models, and Luda’s agent-training platform.
  • We are still in the earliest stages of autonomous AI agents, as even Cognition Labs CEO Scott Wu has admitted. Agents such as Devin still “often make mistakes.” Devin, while significantly outperforming LLM rivals on the widely used SWE-bench benchmark for complex coding tasks, is still only able to correctly resolve 13.9% of issues end-to-end. What’s more, without close human oversight, mistakes can compound as AI agents assign themselves new tasks based on the output of prior activities. These agents can also be very expensive to operate, as they frequently call other LLMs (e.g. via OpenAI’s API) – which charge for both input and output – without necessarily optimizing for cost.
  • Businesses are frothing over the potential applications of AI agents. Box CEO Aaron Levie said recently that autonomous AI “is probably the biggest thing that’s ever happened” to his company. According to a Jan 2024 Accenture report, “96% of executives agree that leveraging AI agent ecosystems will be a significant opportunity for their organizations in the next three years.”
  • The role of humans in a workforce with AI agents is still an open question. Humans will still be essential, especially in the early days, for defining the objectives for AI agents and providing oversight to make sure agents stay on track. Over time, however, a subset of AI agents will become more capable and trustworthy. If 40% of all working hours – according to one estimate – will be impacted by generative AI, it’s not hard to imagine a world where certain white-collar work will be permanently automated away.
  • We should keep in mind that not all AI agents will be useful and benevolent. A subset of agents may be malicious and exploitative – and also increasingly capable. Very soon, the industry will have to consider how to manage the associated risks, perhaps with hardened simulation sandboxes or other controls. Given the availability of open-source LLMs and the presence of nation-state adversaries, it’s too late to put the toothpaste back in the tube.
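To make the compounding-error and cost points above more concrete, here is a rough Python sketch. It rests on simplifying assumptions that do not come from the reporting above: that each step of an agent's task succeeds independently with the same probability, and that every LLM call re-sends the full accumulated context. The function names, token counts, and per-1K-token prices are hypothetical placeholders, not any vendor's actual pricing.

```python
def chain_success_probability(per_step_success: float, steps: int) -> float:
    """Chance that every step in a chain succeeds.

    Assumption: steps succeed or fail independently with the same probability.
    """
    return per_step_success ** steps


def agent_run_cost(
    steps: int,
    tokens_added_per_step: int,
    output_tokens_per_step: int,
    input_price_per_1k: float,   # hypothetical price per 1,000 input tokens
    output_price_per_1k: float,  # hypothetical price per 1,000 output tokens
) -> float:
    """Total cost of a run in which each call re-sends the whole context.

    Assumption: the context only ever grows (new tool output plus the
    model's own previous output), so later calls cost more than earlier ones.
    """
    total_cost = 0.0
    context_tokens = 0
    for _ in range(steps):
        context_tokens += tokens_added_per_step
        total_cost += context_tokens / 1000 * input_price_per_1k
        total_cost += output_tokens_per_step / 1000 * output_price_per_1k
        context_tokens += output_tokens_per_step
    return total_cost


if __name__ == "__main__":
    # Illustrative inputs only; none of these figures come from the brief.
    print(chain_success_probability(0.95, 20))
    print(agent_run_cost(20, 1000, 500, 0.01, 0.03))
```

Under these assumptions, a 95% per-step success rate falls to roughly 36% across 20 chained steps, and input costs grow faster than the number of steps because each call pays again for everything that came before. Real agents may retry, verify their work, or trim their context, so this is a way to reason about the trend, not a forecast.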
Related Content:
  • May 14 2024 (3 Shifts): RAG, vector databases, and the alternatives to AI fine-tuning
  • Apr 14 2023 (3 Shifts): Proprietary enterprise LLMs vs. open-source LLMs
Disclosure: Contributors have financial interests in Meta, Microsoft, Alphabet, and OpenAI. Amazon, Google, and OpenAI are vendors of 6Pages.