Reference Guide

Using Cowork Safely

Anthropic's official Cowork safety guidance, adapted into the practices that matter — granting access narrowly, monitoring tasks not commands, scheduled-task discipline, when 'Act without asking' is appropriate, computer-use caution, MCP and plugin vetting, and what to do when something looks wrong.


Grant File Access Narrowly

Cowork can read, write, and (with explicit permission) permanently delete the files it can see. The single most effective guardrail is keeping the visible surface small. Mount a dedicated working folder rather than your home directory or your full Documents tree, and keep backups of anything you can't afford to lose. Deletion protection still requires you to click Allow, but a tighter scope also reduces the blast radius of every other class of mistake, including ones that don't trip the deletion prompt.

Create a dedicated /cowork (or similar) folder for Cowork work
Avoid granting access to financial documents, credentials, or personal records
Keep backups of important files independently of Cowork
Re-mount narrowly per session when possible — broad mounts persist
Deletion protection requires explicit per-action Allow, even inside a granted folder
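
The checklist above can be partly automated before you mount a folder. The following is a minimal sketch, not a Cowork feature: it runs on your own machine, the function names are hypothetical, and the filename hints in SENSITIVE_HINTS are assumptions you would tune to your own files.

```python
from pathlib import Path

# Hypothetical filename hints; adjust these to match your own sensitive files.
SENSITIVE_HINTS = (".env", ".pem", ".key", "id_rsa", "password", "credential", "bank")


def find_sensitive_looking_files(folder: Path) -> list[Path]:
    """Return files under `folder` whose names contain a sensitive-looking hint."""
    hits = []
    for path in sorted(folder.rglob("*")):
        if path.is_file() and any(hint in path.name.lower() for hint in SENSITIVE_HINTS):
            hits.append(path)
    return hits


def is_too_broad(folder: Path) -> bool:
    """Flag the home directory, its ancestors, or the whole Documents tree."""
    home = Path.home().resolve()
    folder = folder.resolve()
    broad = {home, (home / "Documents").resolve()}
    return folder in broad or folder in home.parents
```

Running both checks on a candidate folder before a session gives a quick answer to "is this mount narrow, and does it contain anything I would not want read?" A name-based scan only catches obvious cases, so it supplements a manual look rather than replacing it.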

Limitations: Doesn't help when the work genuinely requires sensitive data — in that case the right move is to scope the session itself, not the folder. Doesn't cover MCP-mediated file access or files reached through Claude in Chrome.

Tags: Foundational, Pre-Session

Monitor Tasks, Not Commands

Cowork executes a lot of commands. Trying to validate each one is impractical and gives a false sense of safety. The right defensive posture is pattern recognition: is Claude touching files or sites you didn't mention? Is the task scope creeping? If something feels off, stop immediately. This is a different mental model from code review: you're watching for drift, not auditing instructions.

Watch for unexpected file or site access — the article calls this the primary tell
Watch for scope creep beyond the original ask
Stop the task immediately if something feels off; investigate after, not during
Read the high-level task surface, not every command line
Build a baseline expectation for each task type so drift is easier to spot
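
The idea of a baseline expectation can be made concrete with a small comparison. This is an illustrative sketch only: the source does not describe a Cowork API for listing touched files, so `observed` is assumed to be a list of paths you noted yourself while watching the task, and `out_of_scope` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def out_of_scope(observed: list[str], allowed_roots: list[str]) -> list[str]:
    """Return observed paths that fall outside every allowed root folder."""
    roots = [PurePosixPath(root) for root in allowed_roots]
    flagged = []
    for raw in observed:
        path = PurePosixPath(raw)
        # Compare path components, not string prefixes, so that
        # "/coworkers/x" is not mistaken for something inside "/cowork".
        if not any(path == root or root in path.parents for root in roots):
            flagged.append(raw)
    return flagged
```

For example, with `allowed_roots=["/cowork"]`, a path under `/home/me/.ssh` would be flagged while `/cowork/data/q3.csv` would not. Anything flagged is a reason to stop the task first and investigate afterwards.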

Limitations: Requires you to be actively present — doesn't transfer to scheduled tasks or 'Act without asking' runs. Skill-dependent: the more types of work you delegate, the better your pattern recognition gets.

Tags: In-Session, Always Applicable

Scheduled Task Discipline

Scheduled tasks run without you watching, which makes them structurally higher risk. The article's guidance is unambiguous: start with low-risk tasks like summaries and information compilation, avoid sensitive data and consequential actions (outbound email, purchases, financial transactions), review outputs after each run from the Scheduled page in the sidebar, and pause anything you're not actively using. Tasks only run while your computer is awake and the Claude Desktop app is open. That's a constraint, not a safety property.

Begin with summarizing, drafting, and information-compilation tasks
Avoid scheduling outbound email, payments, or anything hard to undo
Review the Scheduled page in the left sidebar after each run
Pause or delete tasks you no longer need — do not let them idle
Remember the runtime constraint: desktop awake + Claude Desktop open
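
One way to apply the "nothing hard to undo" rule is to screen a task description before scheduling it. The sketch below is a rough heuristic under stated assumptions: the word list in CONSEQUENTIAL_WORDS is invented for illustration, the function name is hypothetical, and a keyword match is no substitute for reading the task yourself.

```python
import re

# Hypothetical word list for actions that are hard to undo; extend as needed.
CONSEQUENTIAL_WORDS = {
    "send", "reply", "replies", "purchase", "buy",
    "pay", "payment", "transfer", "delete",
}


def consequential_terms(task_description: str) -> list[str]:
    """Return the consequential words found in a task description, sorted."""
    words = set(re.findall(r"[a-z]+", task_description.lower()))
    return sorted(words & CONSEQUENTIAL_WORDS)
```

An empty result suggests a read-only task such as a digest. A non-empty result means the task probably belongs in an attended session rather than on a schedule.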

Limitations: Discipline-dependent — easy to add a 'just one more' task that drifts past safe scope. Reviews lag the action: scheduled tasks have already executed by the time you read the output.

Tags: Recurring Work, Higher Risk

'Act Without Asking' Mode

'Act without asking' removes the per-step approval gate, the same gate that gives you a chance to interrupt a prompt-injection attempt mid-task. It is faster for well-defined work but materially increases injection risk. Use it only when all three conditions hold: you're actively supervising, you're working with trusted files/sites/tools, and you can stop Claude immediately if something looks wrong. If any of the three is shaky, leave per-step approval on.

Condition 1: active supervision throughout the run
Condition 2: trusted files, sites, and tools — no untrusted web content
Condition 3: you can stop the run immediately
Best fit: short, well-scoped tasks over files you just created or curated
Worst fit: anything that ingests external content (email, web pages, documents from outside parties)
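
The three conditions above behave like a strict AND, with external content acting as an automatic veto. A minimal sketch of that decision follows; the `RunContext` type and its field names are hypothetical and exist only to make the rule explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunContext:
    """Hypothetical description of a planned run; not a Cowork data structure."""
    actively_supervised: bool
    inputs_trusted: bool
    can_stop_immediately: bool
    ingests_external_content: bool = False


def may_act_without_asking(ctx: RunContext) -> bool:
    """Allow the mode only when all three conditions hold and nothing external is ingested."""
    if ctx.ingests_external_content:
        # Email, web pages, and outside documents count as untrusted input.
        return False
    return ctx.actively_supervised and ctx.inputs_trusted and ctx.can_stop_immediately
```

Writing it this way makes the default visible: any single False, or any external content, means per-step approval stays on.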

Limitations: Not a sandbox — it removes the most accessible defense against injection. Speed gains do not justify the mode for ambiguous or unattended work.

Tags: Mode Setting, Higher Risk

Computer Use Caution

Computer use is the highest-risk Cowork capability because it bypasses both the file-operation permission prompts and the code-execution VM. Claude clicks, types, and navigates the actual screen with no intervening sandbox. Build trust gradually, as you would with a new colleague: start with lower-stakes tasks, block sensitive apps (banking, healthcare portals, dating apps), and stay aware that Claude is taking screenshots to read the screen. Even within granted apps, a click on a link in one app will open the target, so app-level permissions do not fully contain what Claude can reach.

Start with low-stakes tasks and expand only as trust accrues
Block sensitive apps explicitly (banking, healthcare, dating, personal messaging)
Be aware screenshots are taken to interpret the screen
Cross-app click-through: a link opened in an allowed app reaches its destination
No sandbox — different risk class from file or code-execution tools

Limitations: The blocklist only stops direct access to a blocked app; a chain through an allowed app can still reach blocked surfaces via opened links. Over-blocking carries a real productivity cost.

Tags: Highest Risk, Per-App Permissions

Limit Browser & Web Access

Web pages, emails, and documents pulled in from outside are the primary delivery surface for prompt-injection attacks. Cowork's default network access is intentionally restricted, so only extend it to sites you trust. Note the asymmetry: network egress permissions do NOT apply to web fetch, web search, or MCPs (including Claude in Chrome). Web fetch runs server-side and is limited to search results and URLs you've shared, but the broader Chrome extension surface is governed separately. Team and Enterprise admins can disable web search and Claude in Chrome at the org level in Organization settings.

Default network access is restricted — extend deliberately, not by default
Egress allowlists do not cover web fetch, web search, or MCPs (Chrome included)
Web fetch is scoped to search results and URLs you have shared
Org-wide controls: Organization settings > Capabilities (web search)
Org-wide controls: Organization settings > Claude in Chrome (extension)
Avoid using Claude in Chrome on sensitive sessions (banking, payroll, identity)

Limitations: Egress allowlist gaps are easy to forget — adding a single MCP can reopen network reach you thought you closed. Org-wide controls are blunt instruments.

Tags: Network, Injection Defense

Evaluate MCPs and Plugins Carefully

Each MCP or plugin you install introduces a new path for attacks to reach Claude. Plugins are bundles: a single install can add skills, connectors, and sub-agents at once, significantly expanding scope of action. Local MCP servers bundled with plugins run on your computer with the same permissions as any other program you run; there is no Cowork-imposed sandbox around them. Stick to verified extensions from the Claude Desktop directory and read the requested permissions before installing.

Install only from the Claude Desktop directory or other verified sources
Read what permissions an extension requests before clicking install
Treat plugins as multi-component installs — assess every bundled piece
Local MCP servers run with your user permissions — no Cowork sandbox
Audit installed MCPs and plugins periodically; remove ones you do not use
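
A periodic audit is easier with a summary of what is actually configured. The sketch below assumes local MCP servers are declared in a JSON document with a top-level `mcpServers` object whose entries carry `command`, `args`, and optional `env` fields; treat that shape, and wherever the file lives on your system, as assumptions to verify against your installed version. The function name is hypothetical.

```python
import json


def list_local_mcp_servers(config_text: str) -> list[dict]:
    """Summarize each configured server: its name, launch command, and env variable names."""
    config = json.loads(config_text)
    servers = config.get("mcpServers", {})
    report = []
    for name, spec in sorted(servers.items()):
        command = " ".join([spec.get("command", ""), *spec.get("args", [])]).strip()
        report.append({
            "name": name,
            "command": command,
            # Only the variable names are listed, never their values.
            "env_keys": sorted(spec.get("env", {})),
        })
    return report
```

Each launch command in the report is a program that runs with your user permissions, so any entry you cannot explain is a candidate for removal.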

Limitations: Reviewing every permission is realistic for paid Cowork users but tedious. Bundled plugins make it harder to opt into a single capability without inheriting the rest.

Tags: Install-Time, Foundational

Watch Cross-App Data Sharing

The Claude for Excel and Claude for PowerPoint add-ins can read, edit, and pass context between applications. Claude might analyze a workbook and insert a chart into a deck without you directly instructing that transfer. Treat the Office add-ins + Cowork pairing as a single shared context: don't work with sensitive data in those add-ins while Cowork is active unless you intend the data to be available to whatever else Claude touches in the session.

Add-ins share context with Cowork — data can move between apps
Transfers can happen without an explicit 'move this' instruction
Avoid sensitive data in Excel/PPT add-ins while Cowork is active
Disable add-ins for sensitive workbooks rather than relying on prompt discipline

Limitations: Limits productivity for workflows that genuinely benefit from cross-app context. Doesn't apply to Excel/PPT used without the Claude add-ins.

Tags: Microsoft 365, Context Leakage

Mobile-to-Desktop Awareness

When you message Cowork from your phone, the agent still runs on your desktop with whatever access you previously granted. The phone is effectively a remote control for your desktop's resources. This matters most on managed corporate computers: granting Cowork desktop access now extends that access to a personal mobile device that may not be under the same management posture. Review what you've granted Cowork and confirm the scope is appropriate for off-device triggering, not just for the corporate laptop in front of you.

Phone messaging does not move Cowork — it triggers the desktop agent
All previously-granted folders, connectors, and plugins remain in scope
Especially relevant for MDM-managed laptops paired with personal phones
Audit granted access through the lens of remote triggerability
Treat phone-side authentication as the new perimeter for desktop scope

Limitations: No way to scope mobile-triggered sessions more narrowly than desktop-triggered ones today.

Tags: Cross-Device, Managed Devices

Report Suspicious Behavior Immediately

The recognizable signals of a prompt injection in progress are Claude suddenly discussing unrelated topics, attempting to access unexpected resources, or asking for sensitive information unprompted. The recommended response is to stop the task immediately and report to security@anthropic.com or via the in-app feedback button. Reports feed back into model training and content classifiers, so the defenses everyone benefits from get sharper because of yours.

Signal 1: Claude shifts to unrelated topics mid-task
Signal 2: attempts to access resources you did not mention
Signal 3: unprompted requests for sensitive information
First action: stop the task — do not continue to gather more evidence
Then report: security@anthropic.com or in-app feedback button

Limitations: Recognition is skill-dependent — subtler injections that mimic plausible task drift are harder to flag.

Tags: Incident Response, Always Applicable

Your phone is now a remote control for your desktop

The least-discussed safety consideration in the official article is that mobile messaging doesn't move Cowork to the phone — it triggers the desktop agent with all previously granted scope. If your laptop is managed by an employer and your phone is not, the mobile entry point quietly extends managed-device access into an unmanaged personal device. Audit accordingly before connecting mobile.

Practice	When it applies	Risk addressed	Effort	Tag
Narrow file access	Pre-session	Accidental read, write, deletion of sensitive files	Low (set once)	Foundational
Monitor tasks, not commands	In-session	Scope creep, hidden tool drift, prompt injection in progress	Low (passive attention)	Always
Scheduled task discipline	Recurring work	Unattended action, consequential side effects	Medium (setup + periodic review)	High value
Avoid 'Act without asking'	Mode setting	Mid-task prompt injection	Low (keep per-step approval, the default, on)	Default
Computer use caution	Per-task	Direct screen access without sandbox	Medium (blocklist + supervision)	Highest stakes
Restrict browser/web	Pre-session + ongoing	Prompt injection from untrusted content	Low (admin or per-session)	Foundational
Vet MCPs and plugins	Install-time	Bundled capability creep, untrusted code with full permissions	Low (read before install)	Pre-install
Mobile awareness	Cross-device	Remote triggering of desktop scope from personal phone	Low (audit scope once)	Managed devices

Foundational practices (narrow access, vet installs, restrict web) compound — they shrink the surface every other defense has to cover.

You just got Cowork on a new laptop. What's the first move?
Grant File Access Narrowly: create a dedicated working folder, do not mount your home directory. Then vet the MCPs and plugins you install before turning them on.

You want to schedule a daily morning task that summarizes overnight email.
Scheduled Task Discipline: start with a digest-only version that does NOT send replies or take action on the email. Review the output for a week before adding any outbound behavior.

You're tempted to flip on 'Act without asking' for a long refactor pass on a folder of files.
'Act Without Asking' Mode: acceptable only if you'll actively watch the run, the files are trusted, and you can stop it instantly. If the run is long enough that you'll wander off, leave per-step approval on.

Mid-task, Claude starts asking about a file you didn't mention and the topic has drifted from your original ask.
Report Suspicious Behavior Immediately: stop the task, then send the conversation context to security@anthropic.com or use the in-app feedback button.

You signed in to Claude on your personal phone for the first time and your work laptop is MDM-managed.
Mobile-to-Desktop Awareness: your phone now triggers the agent on the managed laptop. Re-audit what you've granted Cowork through the lens of 'is this access appropriate from off-device?'

You want to install a plugin that bundles three MCPs you don't fully recognize.
Evaluate MCPs and Plugins Carefully: read every requested permission, and remember that local MCP servers run with your full user permissions. If you can't account for what one of the bundled MCPs does, don't install the bundle.

Cowork safety is mostly about scope, not heroics

Most Cowork incidents trace back to broad scope at session start, not to clever attacks mid-session. The three pre-session practices — narrow file access, restrict web access, vet what you install — do more than any in-session vigilance can. The in-session practices (watch tasks not commands, leave per-step approval on by default) are the second layer. Reporting is the third. Each layer matters less if the layer above it was tight.
