GitHub Copilot Workspace: When AI Plans Your Entire Feature (Not Just Code)
Testing GitHub's new Copilot Workspace that generates implementation plans. We tried it on 5 real features—here's what it got right and wrong.
We've been using GitHub Copilot since it launched—autocomplete for code, that kind of thing. But Copilot Workspace is different. Instead of suggesting the next line, it reads your GitHub issue, generates a full implementation plan, and proposes all the file changes needed to ship the feature.
Sounds incredible. Or terrifying. We wanted to know which.
So we spent two weeks testing Copilot Workspace on 5 real features across client projects. Here's what actually happened.
What Is Copilot Workspace?
GitHub Copilot Workspace is GitHub's experimental tool that generates implementation plans from issues. You give it a GitHub issue, and it:
- Analyzes your codebase
- Creates a specification for what needs to change
- Proposes a step-by-step implementation plan
- Generates the actual code changes across multiple files
- Opens a PR with everything ready to review
This isn't autocomplete. It's AI doing the upfront thinking—the planning phase we normally spend an hour on before writing any code.
The promise: turn vague requirements into concrete implementation plans in minutes.
The 5 Features We Tested
We picked real features from active projects. Not trivial todos, but actual work we'd bill for:
Feature 1: Add dark mode toggle to settings page (React/TypeScript)
- Complexity: Medium
- Files expected: 4-6 (component, context, styles, tests)
- Estimated manual time: 2-3 hours
Feature 2: Implement CSV export for dashboard data (Next.js)
- Complexity: Medium-high
- Files expected: 3-4 (API route, utility functions, button component)
- Estimated manual time: 3-4 hours
Feature 3: Add optimistic UI updates to form submission (React/TypeScript)
- Complexity: High (state management, error handling)
- Files expected: 5-7
- Estimated manual time: 4-5 hours
Feature 4: Migrate component library from styled-components to CSS Modules (React)
- Complexity: Very high (large refactor)
- Files expected: 20+
- Estimated manual time: 8-12 hours
Feature 5: Add rate limiting to API endpoints (Node.js/Express)
- Complexity: Medium
- Files expected: 3-5 (middleware, config, tests)
- Estimated manual time: 2-3 hours
We wrote detailed GitHub issues for each, then let Copilot Workspace generate implementation plans.
What It Got Right
1. File Discovery Was Surprisingly Accurate
For Feature 1 (dark mode), Copilot Workspace identified 5 files that needed changes:
- `src/components/Settings.tsx` (the settings page)
- `src/context/ThemeContext.tsx` (it proposed creating this)
- `src/styles/theme.ts` (theme definitions)
- `src/App.tsx` (wrap with the provider)
- `src/components/DarkModeToggle.tsx` (the actual toggle)
We would've touched the exact same files. It understood the architecture from analyzing our codebase—how we use Context for global state, where theme definitions live, the component patterns we follow.
Verdict: This alone saved 15-20 minutes of planning.
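For context, here's a minimal sketch of what a `ThemeContext.tsx` along those lines typically looks like. This is our illustration of the pattern, not the code Copilot Workspace generated:

```tsx
import { createContext, useContext, useState, type ReactNode } from "react";

type Theme = "light" | "dark";

interface ThemeContextValue {
  theme: Theme;
  toggleTheme: () => void;
}

// Holds the current theme plus a toggle, exposed app-wide via a provider
const ThemeContext = createContext<ThemeContextValue | undefined>(undefined);

export function ThemeProvider({ children }: { children: ReactNode }) {
  const [theme, setTheme] = useState<Theme>("light");
  const toggleTheme = () => setTheme((t) => (t === "light" ? "dark" : "light"));

  return (
    <ThemeContext.Provider value={{ theme, toggleTheme }}>
      {children}
    </ThemeContext.Provider>
  );
}

// Consumed by Settings.tsx and DarkModeToggle.tsx
export function useTheme() {
  const ctx = useContext(ThemeContext);
  if (!ctx) throw new Error("useTheme must be used inside ThemeProvider");
  return ctx;
}
```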
2. It Understood Existing Patterns
For Feature 2 (CSV export), Copilot Workspace noticed we already had a utils/export.ts file with PDF export functions. It proposed:
- Adding `exportToCSV()` alongside the existing `exportToPDF()`
- Using the same error handling pattern
- Following our existing function naming conventions
It didn't reinvent the wheel. It extended what was already there, matching our style.
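A rough sketch of that kind of extension, assuming the typed-result and try/catch pattern described above (the `ExportResult` shape and function signature are illustrative, not pulled from the repo):

```ts
// utils/export.ts (sketch): exportToCSV added alongside the existing exportToPDF
type ExportResult = { ok: true; blob: Blob } | { ok: false; error: string };

export function exportToCSV(rows: Record<string, unknown>[]): ExportResult {
  try {
    if (rows.length === 0) return { ok: false, error: "Nothing to export" };

    const headers = Object.keys(rows[0]);
    const escape = (value: unknown) => `"${String(value ?? "").replace(/"/g, '""')}"`;
    const lines = [
      headers.join(","),
      ...rows.map((row) => headers.map((h) => escape(row[h])).join(",")),
    ];

    return { ok: true, blob: new Blob([lines.join("\n")], { type: "text/csv" }) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : "Export failed" };
  }
}
```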
3. The Spec Was Actually Useful
Before generating code, Copilot Workspace writes a specification—what the feature should do, what will change, potential edge cases to consider.
For Feature 3 (optimistic updates), the spec included:
- "Add optimistic update to form state before API call"
- "Implement rollback on error"
- "Add loading states to prevent duplicate submissions"
- "Consider race conditions if user edits during submission"
That last point—race conditions—we hadn't explicitly mentioned in the issue. Copilot Workspace inferred it from the problem domain. That's genuinely helpful.
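To make those bullets concrete, here's a minimal sketch of optimistic update with rollback as a React hook; the `submitForm` endpoint and form shape are hypothetical:

```ts
import { useState } from "react";

type FormData = { name: string };

// Hypothetical API call standing in for the real submission endpoint
async function submitForm(data: FormData): Promise<FormData> {
  const res = await fetch("/api/form", { method: "POST", body: JSON.stringify(data) });
  if (!res.ok) throw new Error("Submission failed");
  return res.json();
}

export function useOptimisticSubmit(initial: FormData) {
  const [value, setValue] = useState(initial);
  const [submitting, setSubmitting] = useState(false);

  async function submit(next: FormData) {
    if (submitting) return;      // guard against duplicate submissions
    const previous = value;
    setValue(next);              // optimistic update before the API call
    setSubmitting(true);
    try {
      setValue(await submitForm(next)); // reconcile with the server response
    } catch {
      setValue(previous);        // rollback on error
    } finally {
      setSubmitting(false);
    }
  }

  return { value, submitting, submit };
}
```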
What It Got Wrong
1. The Generated Code Was... Optimistic
For Feature 1 (dark mode), the generated code compiled and ran. It even worked—technically. But:
- Variable names were generic (`theme` instead of our pattern: `appTheme`)
- It created new utility functions instead of using our existing `cn()` helper
- The CSS-in-JS solution used inline styles, not our Tailwind setup
- No error boundaries around the theme provider
- Tests were shallow (just smoke tests, no edge cases)
We could use maybe 60% of it. The rest needed rewriting to match our actual codebase standards.
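For comparison, the direction we'd actually want looks roughly like this, composing Tailwind classes through the existing `cn()` helper instead of inline style objects (paths, class names, and the provider are illustrative, building on the context sketch above):

```tsx
import { useTheme } from "../context/ThemeContext"; // provider sketched earlier
import { cn } from "../utils/cn";                   // existing helper the generated code ignored

export function DarkModeToggle() {
  const { theme, toggleTheme } = useTheme();

  return (
    <button
      onClick={toggleTheme}
      // Tailwind utility classes composed with cn(), not inline style objects
      className={cn(
        "rounded px-3 py-2 text-sm",
        theme === "dark" ? "bg-slate-800 text-white" : "bg-slate-100 text-slate-900"
      )}
    >
      {theme === "dark" ? "Switch to light mode" : "Switch to dark mode"}
    </button>
  );
}
```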
2. It Struggled with Complexity
Feature 4 (styled-components to CSS Modules migration) was a disaster.
The plan looked reasonable: "Convert styled components to CSS Modules, update imports, maintain existing class names." But the implementation:
- Missed 8 out of 24 components that needed changes
- Generated CSS Modules that didn't account for dynamic props (see the sketch after this list)
- Broke the component composition patterns we relied on
- Suggested regex find-and-replace for imports (which would've broken things)
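Here's the dynamic-props problem in miniature, with a hypothetical Button component: styled-components can compute CSS from props, while CSS Modules only map to static classes, so every prop-driven value has to become a class lookup or an inline style. The generated migration did neither.

```tsx
import type { ReactNode } from "react";
import styles from "./Button.module.css";

// Before (styled-components) -- styles derived directly from props:
//   const Button = styled.button`
//     background: ${(p) => (p.variant === "danger" ? "crimson" : "royalblue")};
//     padding: ${(p) => p.size * 4}px;
//   `;

// After (CSS Modules) -- dynamic values must be handled explicitly:
export function Button({
  variant = "primary",
  size = 2,
  children,
}: {
  variant?: "primary" | "danger";
  size?: number;
  children: ReactNode;
}) {
  return (
    <button
      className={variant === "danger" ? styles.danger : styles.primary}
      style={{ padding: size * 4 }} // truly dynamic values end up inline anyway
    >
      {children}
    </button>
  );
}
```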
We ended up doing this migration manually. The AI plan was worse than useless—it would've taken longer to fix than starting from scratch.
Reality check: Large refactors are still a human job.
3. Context Limits Are Real
For Feature 5 (rate limiting), Copilot Workspace proposed adding middleware. Solid start. But it assumed we were using Express middleware patterns throughout the app.
We're not. We use a custom API wrapper that Copilot Workspace didn't have in its context window. So all the generated code was for the wrong abstraction layer.
The plan would've worked—in a different codebase. Not ours.
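For reference, the generated code was shaped like ordinary Express middleware, roughly like the sketch below (a naive in-memory fixed-window limiter of our own writing, not the actual output). Reasonable in isolation, but it hooks into app.use(), not into our wrapper.

```ts
import type { Request, Response, NextFunction } from "express";

// Naive per-IP fixed window; a real deployment would use shared storage like Redis
const WINDOW_MS = 60_000;
const MAX_REQUESTS = 100;
const hits = new Map<string, { count: number; resetAt: number }>();

export function rateLimit(req: Request, res: Response, next: NextFunction) {
  const key = req.ip ?? "unknown";
  const now = Date.now();
  const entry = hits.get(key);

  if (!entry || now > entry.resetAt) {
    hits.set(key, { count: 1, resetAt: now + WINDOW_MS });
    return next();
  }

  if (entry.count >= MAX_REQUESTS) {
    res.status(429).json({ error: "Too many requests" });
    return;
  }

  entry.count += 1;
  next();
}

// Usage the plan assumed: app.use("/api", rateLimit);
```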
4. No Awareness of Non-Code Requirements
Feature 2 (CSV export) needed:
- GDPR compliance (don't export PII without consent)
- File size limits (what if dataset is 10GB?)
- Background job for large exports
Copilot Workspace generated a simple "export everything to CSV" function with zero consideration for these real-world constraints. It solved the coding problem, not the actual problem.
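Even a minimal version of the real feature needs guards along these lines before any CSV is built; the limits and PII column list below are invented for illustration:

```ts
// Hypothetical pre-export checks the generated code skipped entirely
const MAX_SYNC_ROWS = 50_000;           // beyond this, hand the export off to a background job
const PII_COLUMNS = ["email", "phone"]; // placeholder; the real list comes from a compliance review

export function planExport(rows: Record<string, unknown>[], userConsentedToPII: boolean) {
  // Strip PII columns unless the user explicitly consented to exporting them
  const columns = Object.keys(rows[0] ?? {}).filter(
    (col) => userConsentedToPII || !PII_COLUMNS.includes(col)
  );

  return {
    columns,
    mode: rows.length > MAX_SYNC_ROWS ? "background-job" : "inline",
  };
}
```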
When to Use Copilot Workspace
After 5 features, here's our honest take on when it's worth using:
Use It For:
Well-defined, medium-complexity features - Dark mode toggle, pagination, sorting, filtering, basic CRUD. If you can explain it in a 3-sentence issue, Copilot Workspace will generate a decent starting point.
Planning unfamiliar areas - We used it for a Kubernetes config change. Neither of us knows K8s well. The generated plan was educational—it showed us what files to touch and what the changes should look like. We still reviewed every line, but it was a good roadmap.
Prototyping - Need to spike a feature quickly? Copilot Workspace gets you 70% of the way in 5 minutes. Perfect for "let's see if this is even feasible" work.
Don't Use It For:
Large refactors - Anything touching 15+ files or requiring architectural changes. Copilot Workspace loses the plot. You'll spend more time fixing the plan than doing it yourself.
Domain-specific logic - Business rules, compliance requirements, performance constraints. The AI doesn't know your domain. It'll generate code that works in a vacuum but fails in production.
Anything customer-facing without review - This should be obvious, but: never ship Copilot Workspace code without thorough review. It generates plausible code, not correct code.
Our Workflow Now
We've kept Copilot Workspace in our toolkit, but with constraints:
- Use it for planning, not implementation - We read the spec and file list, then write the code ourselves. This saves time without the risk of shipping AI slop.
- Great for second opinions - "What did I miss?" Run Copilot Workspace on your plan. If it suggests files or edge cases you didn't think of, that's valuable.
- Good for learning new tools - When working with unfamiliar frameworks or libraries, the generated code shows patterns we can learn from (even if we don't use it verbatim).
- Skip it for client work - For bespoke web development projects, we're not confident enough in the output quality yet. Maybe in 6 months. Not today.
The Honest Take
GitHub Copilot Workspace is impressive technology. Genuinely. The fact that it can read a GitHub issue, analyze a codebase, and propose a coherent implementation plan is remarkable.
But it's not replacing developers. Not even close.
What it is: a very smart intern who can draft plans quickly but needs supervision on everything. You still need to:
- Write detailed requirements
- Review the plan for gaps
- Rewrite most of the generated code
- Test thoroughly
- Understand the domain and edge cases
Where it shines: saving 30-60 minutes of planning time on straightforward features. Giving you a starting point when you're stuck. Catching edge cases you might've missed.
Where it fails: anything requiring judgment, domain expertise, or understanding of non-functional requirements.
If you're already a Copilot subscriber, Workspace is worth trying. It's free in technical preview. Just don't expect it to ship production features unsupervised.
If you're not using Copilot yet, this alone isn't worth the subscription. Regular Copilot autocomplete is more useful day-to-day.
Takeaways
After testing 5 real features:
- File discovery: 8/10 - Usually identifies the right files to touch
- Specification quality: 7/10 - Good starting point, misses domain-specific concerns
- Code generation: 5/10 - Works but rarely matches your actual codebase patterns
- Complex refactors: 2/10 - More harmful than helpful
- Time saved: 20-40% on planning, 0-10% on total feature time
It's a useful tool. Not a game-changer. Yet.
We'll keep testing it on new features and update this post as Copilot Workspace improves. For now, it's in the toolbox—used selectively, reviewed carefully, never trusted blindly.
Building complex features that need human judgment? We've shipped hundreds of custom applications and know when to use AI tools and when to think for ourselves. Get in touch if you need a team that understands the difference.