Microsoft has built an AI agent that can act across an entire organization. The problem is most organizations aren't ready to manage a human workforce doing the same tasks, let alone an autonomous agent.
Copilot Cowork launched March 9, extending Microsoft's AI assistant beyond question-and-answer into autonomous task execution across Microsoft 365 — rescheduling meetings, drafting presentations, coordinating projects, compiling research. Powered in part by Anthropic's Claude, it is Microsoft's most ambitious attempt yet to establish AI as a workplace collaborator.
It is also arriving inside organizations where information is scattered, governance is still catching up to basic AI deployments and the employees meant to oversee the agent have no time to spare.
Where Workflows Wobble
There's a significant gap between what Cowork does in a demo and what it reliably does inside an organization. The problem is information structure, said Laura Saarainen, vice president of product management at enterprise content management firm M-Files.
"The places where these workflows usually break down are pretty familiar: information is scattered, people are working from different versions, permissions aren't consistent and the AI doesn't have enough business context to understand what matters," Saarainen said. "Documents by themselves are not enough. You need to understand the relationships around them: the customer, the contract, the process, the project. If that context is missing, the workflow starts to wobble very quickly."
Cowork's effectiveness, in other words, does not rest purely on the model behind it. As with all agents, success depends on how well an organization manages its information — and most organizations do not manage it well. When that foundation is weak, the AI layer will be weak too, regardless of the model's capability.
Work that breaks agentic AI is rarely the work that appears on an org chart, said Maria Potapkina, AI mentor and coach at TripleTen and formerly of PwC. Companies run on unwritten approvals, informal sensitivities and dependencies that never make it into a document. Cowork has no way to see any of that.
One further signal is worth flagging: Microsoft is rolling the product out through its Frontier research preview, which suggests the company itself is not yet treating this as a finished deployment.
The Check-In Model Is a Cookie Banner
Central to Microsoft's safety case is the check-in model, a series of human review points built into Cowork's plan-to-action loop. The idea is that employees retain control over what the agent does before it touches live data. It is a reasonable idea with an obvious flaw.
"The 'check-in-with-my-human' model is a UX compromise disguised as a safety feature," said Jim Sherlock, software development and AI leader at ProCircular, a cybersecurity and technology firm. "The reality is an already-busy employee who delegates a complex workflow to Cowork is going to rubber-stamp those check-in prompts the same way they click through cookie consent banners when visiting a website."
The logic is hard to dismiss. The premise of delegating a multi-step workflow to an AI agent is that the employee does not have the time or bandwidth to do it themselves. Asking that same employee to review each step of the agent's reasoning reintroduces the cognitive load they were trying to offload.
"That assumption will fail spectacularly the first time an agent takes an irreversible action nobody actually reviewed," Sherlock added.
The model works if review is calibrated to the risk level of the task, Potapkina said. The problem is that most organizations will not build that distinction into their rollout. They will apply the same approval habits to everything, and the check-in will become a formality.
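To make that distinction concrete, here is a minimal sketch of what risk-calibrated check-ins could look like in practice. The risk tiers, action names and default behavior are illustrative assumptions, not Cowork's actual controls.

```python
from enum import Enum

class Risk(Enum):
    LOW = 1      # e.g., drafting an internal status summary
    MEDIUM = 2   # e.g., rescheduling a meeting with external attendees
    HIGH = 3     # e.g., changing permissions or sending customer-facing email

# Illustrative mapping; a real deployment would derive this from policy,
# data classification and the systems the agent is allowed to touch.
ACTION_RISK = {
    "draft_summary": Risk.LOW,
    "reschedule_meeting": Risk.MEDIUM,
    "modify_sharepoint_permissions": Risk.HIGH,
    "send_external_email": Risk.HIGH,
}

def requires_human_review(action: str, reversible: bool) -> bool:
    """Escalate to a human only when the risk justifies the interruption.

    Low-risk, reversible steps proceed automatically and get logged;
    everything else stops and waits for explicit approval.
    """
    risk = ACTION_RISK.get(action, Risk.HIGH)  # unknown actions default to HIGH
    return risk is not Risk.LOW or not reversible
```

The specifics matter less than the principle: the approval prompt should appear only when the interruption is worth the employee's attention, so that saying yes still means something.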
IT Teams Got a New Employee, But No One Told Them
Cowork also creates structural challenges for enterprise IT teams that existing governance frameworks were not designed to manage. When an agent sends an email message, modifies permissions on a SharePoint file or updates a project timeline, who owns that action for compliance purposes? Most organizations haven't answered this question yet.
"IT teams will need to get ahead of this by treating Cowork like a new employee that needs its own identity, its own permissions and its own paper trail, not just another feature running under someone's existing account,” Sherlock said.
Valence Howden, advisory fellow at Info-Tech Research, an IT research and advisory firm focused on AI risk and governance, has questions about access levels, the adequacy of human oversight at scale and jurisdictional complexity.
Governing drift in high-complexity multi-agent environments is a particular worry — and one that most governance teams are not yet equipped for. "Most organizations are still learning how to govern AI and are trailing behind," Howden said.
"The concern isn't just whether an AI can complete a task," Saarainen said. "It's whether it can do it in a way that's defensible, traceable and aligned with policy."
The Multi-Model Question
Cowork's integration of Anthropic's Claude into the Microsoft Copilot stack also deserves scrutiny. Microsoft selects models dynamically based on the task, which means sensitive enterprise data may be routed through systems an organization did not explicitly choose and cannot easily inspect.
For heavily regulated industries, this is a problem. Many organizations will not have assessed those systems against their own compliance obligations by the time Cowork is already running inside them.
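One defensive pattern, sketched here with made-up model names and a hypothetical policy table, is to gate dynamic model selection behind an allowlist of systems the organization has actually assessed for each data classification.

```python
# Hypothetical allowlist: models the organization has assessed against its
# own compliance obligations, keyed by data classification.
ASSESSED_MODELS = {
    "public":       {"model-a", "model-b", "model-c"},
    "internal":     {"model-a", "model-b"},
    "confidential": {"model-a"},  # only one system cleared for sensitive data
}

def route_allowed(selected_model: str, data_classification: str) -> bool:
    """Allow a dynamically selected model only if it has been assessed for
    this classification; otherwise the request should be blocked or re-routed
    rather than silently sent on."""
    return selected_model in ASSESSED_MODELS.get(data_classification, set())

assert route_allowed("model-a", "confidential")
assert not route_allowed("model-c", "confidential")
```

Whether an enterprise can actually enforce a gate like this depends on the platform, which is exactly the point: the organization, not the routing layer, should decide which systems see which data.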
What this demands is a strategic risk lens, not just a technical one, Howden argued. Enterprises need clarity on the ethics and boundaries of every tool in the stack, not just the interface they interact with.
"I'd start with the basics: where is the content actually living, who has access to it, what controls are applied to it and can the organization audit what happened?” Saarainen said. “If you can't explain how an answer was generated or what data influenced it, it's going to be difficult to trust it in high-stakes workflows."
What Copilot Cowork Is Ready For
For now, the safe boundaries lie in meeting preparation, project status summaries, routine follow-up drafts, internal coordination and document-heavy workflows with well-structured information.
These tasks create drag in large organizations. Removing that drag is valuable, even if the ambitions of the product extend further.
The limits are equally clear. Compliance workflows, regulated reporting, legal approvals, financial data and anything customer-facing where an error has consequences should stay off-limits until observability and rollback tooling catches up.
"Cowork is a glimpse of where enterprise productivity is headed and it's genuinely exciting," said Sherlock. "However, organizations that rush to adopt it before the guardrails mature are likely going to learn some expensive lessons."
Microsoft built Cowork to act across an organization. What it has not built — and what no technology vendor can build on an enterprise's behalf — is the governance, the accountability structures and the cultural readiness to manage what happens when it gets things wrong. That work falls to the organizations deploying it, and most of them have not yet started.