Reference Guide

Using Cowork Safely

Anthropic's official Cowork safety guidance, adapted into the practices that matter: granting access narrowly, monitoring tasks rather than commands, scheduled-task discipline, when 'Act without asking' is appropriate, computer-use caution, MCP and plugin vetting, and what to do when something looks wrong.


Cowork can read, write, and (with explicit permission) permanently delete the files it can see. The single most effective guardrail is keeping the visible surface small. Mount a dedicated working folder rather than your home directory or your full Documents tree, and keep backups of anything you can't afford to lose. Deletion protection still requires you to click Allow, but a tighter scope reduces the blast radius of every other class of mistake too — including ones that don't trip the deletion prompt.

  • Create a dedicated /cowork (or similar) folder for Cowork work
  • Avoid granting access to financial documents, credentials, or personal records
  • Keep backups of important files independently of Cowork
  • Re-mount narrowly per session when possible — broad mounts persist
  • Deletion protection requires explicit per-action Allow, even inside a granted folder

Limitations: Doesn't help when the work genuinely requires sensitive data — in that case the right move is to scope the session itself, not the folder. Doesn't cover MCP-mediated file access or files reached through Claude in Chrome.
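The folder-and-backup practice above can be sketched in a few lines of Python. This is a minimal illustration, not part of Cowork: the function names `prepare_workspace` and `snapshot`, and the folder names in the usage note, are assumptions chosen for the example. The one design point it tries to show is that the backup lives outside the folder Cowork can see.

```python
from datetime import datetime
from pathlib import Path
import shutil


def prepare_workspace(workspace: Path) -> Path:
    """Create the dedicated folder that will be the only thing shared with Cowork."""
    workspace.mkdir(parents=True, exist_ok=True)
    return workspace


def snapshot(workspace: Path, backup_root: Path) -> Path:
    """Copy the workspace into a timestamped folder outside the shared scope."""
    workspace = workspace.resolve()
    backup_root = backup_root.resolve()
    # A backup stored inside the shared folder is exposed to the same mistakes
    # it is meant to protect against, so refuse that layout.
    if backup_root == workspace or workspace in backup_root.parents:
        raise ValueError("backup_root must live outside the workspace")
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    destination = backup_root / f"{workspace.name}-{stamp}"
    # copytree creates the destination (and any missing parents) itself.
    shutil.copytree(workspace, destination)
    return destination
```

A typical call, under the same assumptions, would be `snapshot(prepare_workspace(Path.home() / "cowork"), Path.home() / "cowork-backups")`, run before a session starts so that a copy exists independently of anything Cowork does.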

Foundational · Pre-Session

Cowork executes a lot of commands. Trying to validate each one is impractical and gives a false sense of safety. The right defensive posture is pattern recognition: is Claude touching files or sites you didn't mention? Is the task scope creeping? If something feels off, stop immediately. This is a different mental model from code review — you're watching for drift, not auditing instructions.

  • Watch for unexpected file or site access — the article calls this the primary tell
  • Watch for scope creep beyond the original ask
  • Stop the task immediately if something feels off; investigate after, not during
  • Read the high-level task surface, not every command line
  • Build a baseline expectation for each task type so drift is easier to spot

Limitations: Requires you to be actively present — doesn't transfer to scheduled tasks or 'Act without asking' runs. Skill-dependent: the more types of work you delegate, the better your pattern recognition gets.

In-Session · Always Applicable

Scheduled tasks run without you watching, which makes them structurally higher risk. The article's guidance is unambiguous: start with low-risk tasks like summaries and information compilation, avoid sensitive data and consequential actions (outbound email, purchases, financial transactions), review outputs after each run from the Scheduled page in the sidebar, and pause anything you're not actively using. Tasks only run while your computer is awake and the Claude Desktop app is open — that's a constraint, not a safety property.

  • Begin with summarizing, drafting, and information-compilation tasks
  • Avoid scheduling outbound email, payments, or anything hard to undo
  • Review the Scheduled page in the left sidebar after each run
  • Pause or delete tasks you no longer need — do not let them idle
  • Remember the runtime constraint: desktop awake + Claude Desktop open

Limitations: Discipline-dependent — easy to add a 'just one more' task that drifts past safe scope. Reviews lag the action: scheduled tasks have already executed by the time you read the output.
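The scheduling guidance above amounts to a short triage rule, sketched here in Python. The `ScheduledTask` type and its fields are hypothetical, invented for this illustration; Cowork does not expose tasks in this form, and the answers to each field are your own judgment about the task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledTask:
    """Hypothetical description of a task you are considering scheduling."""
    name: str
    touches_sensitive_data: bool
    takes_consequential_action: bool  # outbound email, purchases, payments
    in_active_use: bool


def triage(task: ScheduledTask) -> str:
    """Apply the guidance in order: rule out risky tasks, then idle ones."""
    if task.touches_sensitive_data or task.takes_consequential_action:
        return "do not schedule"
    if not task.in_active_use:
        return "pause"
    return "schedule and review after each run"
```

For example, `triage(ScheduledTask("weekly summary", False, False, True))` returns `"schedule and review after each run"`, while any task that sends email or moves money is rejected before the other checks are considered.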

Recurring Work · Higher Risk

'Act without asking' removes the per-step approval gate — the same gate that gives you a chance to interrupt a prompt-injection attempt mid-task. It is faster for well-defined work but materially increases injection risk. Use it only when all three conditions hold: you're actively supervising, you're working with trusted files/sites/tools, and you can stop Claude immediately if something looks wrong. If any of the three is shaky, leave per-step approval on.

  • Condition 1: active supervision throughout the run
  • Condition 2: trusted files, sites, and tools — no untrusted web content
  • Condition 3: you can stop the run immediately
  • Best fit: short, well-scoped tasks over files you just created or curated
  • Worst fit: anything that ingests external content (email, web pages, documents from outside parties)

Limitations: Not a sandbox — it removes the most accessible defense against injection. Speed gains do not justify the mode for ambiguous or unattended work.
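The three-condition rule above is a strict conjunction: one unmet condition is enough to keep per-step approval on. A minimal Python sketch makes that explicit. The function names and the returned strings are assumptions for illustration, not Cowork settings or API values.

```python
from typing import List


def unmet_conditions(
    actively_supervising: bool,
    inputs_trusted: bool,
    can_stop_immediately: bool,
) -> List[str]:
    """Return the conditions that do not hold, in the order the guide lists them."""
    checks = {
        "active supervision throughout the run": actively_supervising,
        "trusted files, sites, and tools only": inputs_trusted,
        "ability to stop the run immediately": can_stop_immediately,
    }
    return [label for label, holds in checks.items() if not holds]


def approval_mode(
    actively_supervising: bool,
    inputs_trusted: bool,
    can_stop_immediately: bool,
) -> str:
    """Recommend a mode: all three conditions must hold to drop per-step approval."""
    missing = unmet_conditions(actively_supervising, inputs_trusted, can_stop_immediately)
    return "per-step approval" if missing else "act without asking"
```

Returning the list of unmet conditions, rather than a bare boolean, keeps the reason for the recommendation visible: a task that ingests external email fails the trusted-inputs check even when you are watching closely.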

Mode Setting · Higher Risk

Computer use is the highest-risk Cowork capability because it bypasses both the file-operation permission prompts and the code-execution VM. Claude clicks, types, and navigates the actual screen with no intervening sandbox. Build trust gradually, as you would with a new colleague: start with lower-stakes tasks, block sensitive apps (banking, healthcare portals, dating apps, personal messaging), and stay aware that Claude takes screenshots to read the screen. Even within granted apps, clicking a link in an allowed app opens its target wherever that leads, so app-level permissions do not fully contain what Claude can reach.

  • Start with low-stakes tasks and expand only as trust accrues
  • Block sensitive apps explicitly (banking, healthcare, dating, personal messaging)
  • Be aware screenshots are taken to interpret the screen
  • Cross-app click-through: a link opened in an allowed app reaches its destination
  • No sandbox — different risk class from file or code-execution tools

Limitations: The blocklist only stops direct access to a blocked app; a chain through an allowed app can still reach blocked surfaces via opened links. Over-blocking costs real productivity.

Highest Risk · Per-App Permissions

Web pages, emails, and documents pulled in from outside are the primary delivery surface for prompt-injection attacks. Cowork's default network access is intentionally restricted; extend it only to sites you trust. Note the asymmetry: network egress permissions do NOT apply to web fetch, web search, or MCPs (including Claude in Chrome). Web fetch runs server-side and is limited to search results and URLs you've shared, but the broader Chrome extension surface is governed separately. Team and Enterprise admins can disable both at the org level in Organization settings: web search under Capabilities, and the extension under Claude in Chrome.

  • Default network access is restricted — extend deliberately, not by default
  • Egress allowlists do not cover web fetch, web search, or MCPs (Chrome included)
  • Web fetch is scoped to search results and URLs you have shared
  • Org-wide controls: Organization settings > Capabilities (web search)
  • Org-wide controls: Organization settings > Claude in Chrome (extension)
  • Avoid using Claude in Chrome on sensitive sessions (banking, payroll, identity)

Limitations: Egress allowlist gaps are easy to forget — adding a single MCP can reopen network reach you thought you closed. Org-wide controls are blunt instruments.

Network · Injection Defense

Each MCP or plugin you install introduces a new path for attacks to reach Claude. Plugins are bundles — a single install can add skills, connectors, and sub-agents at once, significantly expanding scope of action. Local MCP servers bundled with plugins run on your computer with the same permissions as any other program you run; there is no Cowork-imposed sandbox around them. Stick to verified extensions from the Claude Desktop directory and read the requested permissions before installing.

  • Install only from the Claude Desktop directory or other verified sources
  • Read what permissions an extension requests before clicking install
  • Treat plugins as multi-component installs — assess every bundled piece
  • Local MCP servers run with your user permissions — no Cowork sandbox
  • Audit installed MCPs and plugins periodically; remove ones you do not use

Limitations: Reviewing every permission is realistic for paid Cowork users but tedious. Bundled plugins make it harder to opt into a single capability without inheriting the rest.

Install-Time · Foundational

The Claude for Excel and Claude for PowerPoint add-ins can read, edit, and pass context between applications. Claude might analyze a workbook and slide a chart into a deck without you directly instructing that transfer. Treat the Office add-ins + Cowork pairing as a single shared context — don't work with sensitive data in those add-ins while Cowork is active unless you intend the data to be available to whatever else Claude touches in the session.

  • Add-ins share context with Cowork — data can move between apps
  • Transfers can happen without an explicit 'move this' instruction
  • Avoid sensitive data in Excel/PPT add-ins while Cowork is active
  • Disable add-ins for sensitive workbooks rather than relying on prompt discipline

Limitations: Limits productivity for workflows that genuinely benefit from cross-app context. Doesn't apply to Excel/PPT used without the Claude add-ins.

Microsoft 365 · Context Leakage

When you message Cowork from your phone, the agent still runs on your desktop with whatever access you previously granted. The phone is effectively a remote control for your desktop's resources. This matters most on managed corporate computers: granting Cowork desktop access now extends that access to a personal mobile device that may not be under the same management posture. Review what you've granted Cowork and confirm the scope is appropriate for off-device triggering, not just for the corporate laptop in front of you.

  • Phone messaging does not move Cowork — it triggers the desktop agent
  • All previously-granted folders, connectors, and plugins remain in scope
  • Especially relevant for MDM-managed laptops paired with personal phones
  • Audit granted access through the lens of remote triggerability
  • Treat phone-side authentication as the new perimeter for desktop scope

Limitations: No way to scope mobile-triggered sessions more narrowly than desktop-triggered ones today.

Cross-Device · Managed Devices

A prompt injection in progress has three recognizable signals: Claude suddenly discussing unrelated topics, attempting to access unexpected resources, or asking for sensitive information unprompted. The recommended response is to stop the task immediately and report it to security@anthropic.com or via the in-app feedback button. Reports feed back into model training and content classifiers, so each report sharpens the defenses that every other user relies on.

  • Signal 1: Claude shifts to unrelated topics mid-task
  • Signal 2: attempts to access resources you did not mention
  • Signal 3: unprompted requests for sensitive information
  • First action: stop the task — do not continue to gather more evidence
  • Then report: security@anthropic.com or in-app feedback button

Limitations: Recognition is skill-dependent — subtler injections that mimic plausible task drift are harder to flag.

Incident Response · Always Applicable

Your phone is now a remote control for your desktop

The least-discussed safety consideration in the official article is that mobile messaging doesn't move Cowork to the phone; it triggers the desktop agent with all previously granted scope. If your laptop is managed by an employer and your phone is not, the mobile entry point quietly extends managed-device access to an unmanaged personal device. Audit what you have granted before connecting mobile.