GitHub Copilot Workspace: When AI Plans Your Entire Feature (Not Just Code)
Testing GitHub's new Copilot Workspace that generates implementation plans. We tried it on 5 real features—here's what it got right and wrong.
We've been using GitHub Copilot since it launched—autocomplete for code, that kind of thing. But Copilot Workspace is different. Instead of suggesting the next line, it reads your GitHub issue, generates a full implementation plan, and proposes all the file changes needed to ship the feature.
Sounds incredible. Or terrifying. We wanted to know which.
So we spent two weeks testing Copilot Workspace on 5 real features across client projects. Here's what actually happened.
What Is Copilot Workspace?
GitHub Copilot Workspace is GitHub's experimental tool that generates implementation plans from issues. You give it a GitHub issue, and it:
- Analyzes your codebase
- Creates a specification for what needs to change
- Proposes a step-by-step implementation plan
- Generates the actual code changes across multiple files
- Opens a PR with everything ready to review
This isn't autocomplete. It's AI doing the upfront thinking—the planning phase we normally spend an hour on before writing any code.
The promise: turn vague requirements into concrete implementation plans in minutes.
The 5 Features We Tested
We picked real features from active projects. Not trivial todos, but actual work we'd bill for:
Feature 1: Add dark mode toggle to settings page (React/TypeScript)
- Complexity: Medium
- Files expected: 4-6 (component, context, styles, tests)
- Estimated manual time: 2-3 hours
Feature 2: Implement CSV export for dashboard data (Next.js)
- Complexity: Medium-high
- Files expected: 3-4 (API route, utility functions, button component)
- Estimated manual time: 3-4 hours
Feature 3: Add optimistic UI updates to form submission (React/TypeScript)
- Complexity: High (state management, error handling)
- Files expected: 5-7
- Estimated manual time: 4-5 hours
Feature 4: Migrate component library from styled-components to CSS Modules (React)
- Complexity: Very high (large refactor)
- Files expected: 20+
- Estimated manual time: 8-12 hours
Feature 5: Add rate limiting to API endpoints (Node.js/Express)
- Complexity: Medium
- Files expected: 3-5 (middleware, config, tests)
- Estimated manual time: 2-3 hours
We wrote detailed GitHub issues for each, then let Copilot Workspace generate implementation plans.
What It Got Right
1. File Discovery Was Surprisingly Accurate
For Feature 1 (dark mode), Copilot Workspace identified 5 files that needed changes:
- `src/components/Settings.tsx` (the settings page)
- `src/context/ThemeContext.tsx` (it proposed creating this)
- `src/styles/theme.ts` (theme definitions)
- `src/App.tsx` (wrap with the provider)
- `src/components/DarkModeToggle.tsx` (the actual toggle)
We would've touched the exact same files. It understood the architecture from analyzing our codebase—how we use Context for global state, where theme definitions live, the component patterns we follow.
Verdict: This alone saved 15-20 minutes of planning.
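For context, here's a minimal sketch of what a `ThemeContext.tsx` along those lines typically looks like. This is our illustration of the pattern, not the code Copilot Workspace generated:

```tsx
import { createContext, useContext, useState, type ReactNode } from "react";

type Theme = "light" | "dark";

interface ThemeContextValue {
  theme: Theme;
  toggleTheme: () => void;
}

// Holds the current theme plus a toggle, exposed app-wide via a provider
const ThemeContext = createContext<ThemeContextValue | undefined>(undefined);

export function ThemeProvider({ children }: { children: ReactNode }) {
  const [theme, setTheme] = useState<Theme>("light");
  const toggleTheme = () => setTheme((t) => (t === "light" ? "dark" : "light"));

  return (
    <ThemeContext.Provider value={{ theme, toggleTheme }}>
      {children}
    </ThemeContext.Provider>
  );
}

// Consumed by Settings.tsx and DarkModeToggle.tsx
export function useTheme() {
  const ctx = useContext(ThemeContext);
  if (!ctx) throw new Error("useTheme must be used inside ThemeProvider");
  return ctx;
}
```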
2. It Understood Existing Patterns
For Feature 2 (CSV export), Copilot Workspace noticed we already had a utils/export.ts file with PDF export functions. It proposed:
- Adding `exportToCSV()` alongside the existing `exportToPDF()`
- Using the same error handling pattern
- Following our existing function naming conventions
It didn't reinvent the wheel. It extended what was already there, matching our style.
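A rough sketch of that kind of extension, assuming the typed-result and try/catch pattern described above (the `ExportResult` shape and function signature are illustrative, not pulled from the repo):

```ts
// utils/export.ts (sketch): exportToCSV added alongside the existing exportToPDF
type ExportResult = { ok: true; blob: Blob } | { ok: false; error: string };

export function exportToCSV(rows: Record<string, unknown>[]): ExportResult {
  try {
    if (rows.length === 0) return { ok: false, error: "Nothing to export" };

    const headers = Object.keys(rows[0]);
    const escape = (value: unknown) => `"${String(value ?? "").replace(/"/g, '""')}"`;
    const lines = [
      headers.join(","),
      ...rows.map((row) => headers.map((h) => escape(row[h])).join(",")),
    ];

    return { ok: true, blob: new Blob([lines.join("\n")], { type: "text/csv" }) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : "Export failed" };
  }
}
```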
3. The Spec Was Actually Useful
Before generating code, Copilot Workspace writes a specification—what the feature should do, what will change, potential edge cases to consider.
For Feature 3 (optimistic updates), the spec included:
- "Add optimistic update to form state before API call"
- "Implement rollback on error"
- "Add loading states to prevent duplicate submissions"
- "Consider race conditions if user edits during submission"
That last point—race conditions—we hadn't explicitly mentioned in the issue. Copilot Workspace inferred it from the problem domain. That's genuinely helpful.
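To make those bullets concrete, here's a minimal sketch of optimistic update with rollback as a React hook; the `submitForm` endpoint and form shape are hypothetical:

```ts
import { useState } from "react";

type FormData = { name: string };

// Hypothetical API call standing in for the real submission endpoint
async function submitForm(data: FormData): Promise<FormData> {
  const res = await fetch("/api/form", { method: "POST", body: JSON.stringify(data) });
  if (!res.ok) throw new Error("Submission failed");
  return res.json();
}

export function useOptimisticSubmit(initial: FormData) {
  const [value, setValue] = useState(initial);
  const [submitting, setSubmitting] = useState(false);

  async function submit(next: FormData) {
    if (submitting) return;      // guard against duplicate submissions
    const previous = value;
    setValue(next);              // optimistic update before the API call
    setSubmitting(true);
    try {
      setValue(await submitForm(next)); // reconcile with the server response
    } catch {
      setValue(previous);        // rollback on error
    } finally {
      setSubmitting(false);
    }
  }

  return { value, submitting, submit };
}
```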
What It Got Wrong
1. The Generated Code Was... Optimistic
For Feature 1 (dark mode), the generated code compiled and ran. It even worked—technically. But:
- Variable names were generic (`theme` instead of our pattern: `appTheme`)
- It created new utility functions instead of using our existing `cn()` helper
- The CSS-in-JS solution used inline styles, not our Tailwind setup
- No error boundaries around the theme provider
- Tests were shallow (just smoke tests, no edge cases)
We could use maybe 60% of it. The rest needed rewriting to match our actual codebase standards.
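For comparison, the direction we'd actually want looks roughly like this, composing Tailwind classes through the existing `cn()` helper instead of inline style objects (paths, class names, and the provider are illustrative, building on the context sketch above):

```tsx
import { useTheme } from "../context/ThemeContext"; // provider sketched earlier
import { cn } from "../utils/cn";                   // existing helper the generated code ignored

export function DarkModeToggle() {
  const { theme, toggleTheme } = useTheme();

  return (
    <button
      onClick={toggleTheme}
      // Tailwind utility classes composed with cn(), not inline style objects
      className={cn(
        "rounded px-3 py-2 text-sm",
        theme === "dark" ? "bg-slate-800 text-white" : "bg-slate-100 text-slate-900"
      )}
    >
      {theme === "dark" ? "Switch to light mode" : "Switch to dark mode"}
    </button>
  );
}
```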
2. It Struggled with Complexity
Feature 4 (styled-components to CSS Modules migration) was a disaster.
The plan looked reasonable: "Convert styled components to CSS Modules, update imports, maintain existing class names." But the implementation:
- Missed 8 out of 24 components that needed changes
- Generated CSS Modules that didn't account for dynamic props (see the sketch after this list)
- Broke the component composition patterns we relied on
- Suggested regex find-and-replace for imports (which would've broken things)
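Here's the dynamic-props problem in miniature, with a hypothetical Button component: styled-components can compute CSS from props, while CSS Modules only map to static classes, so every prop-driven value has to become a class lookup or an inline style. The generated migration did neither.

```tsx
import type { ReactNode } from "react";
import styles from "./Button.module.css";

// Before (styled-components) -- styles derived directly from props:
//   const Button = styled.button`
//     background: ${(p) => (p.variant === "danger" ? "crimson" : "royalblue")};
//     padding: ${(p) => p.size * 4}px;
//   `;

// After (CSS Modules) -- dynamic values must be handled explicitly:
export function Button({
  variant = "primary",
  size = 2,
  children,
}: {
  variant?: "primary" | "danger";
  size?: number;
  children: ReactNode;
}) {
  return (
    <button
      className={variant === "danger" ? styles.danger : styles.primary}
      style={{ padding: size * 4 }} // truly dynamic values end up inline anyway
    >
      {children}
    </button>
  );
}
```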
We ended up doing this migration manually. The AI plan was worse than useless—it would've taken longer to fix than starting from scratch.
Reality check: Large refactors are still a human job.
3. Context Limits Are Real
For Feature 5 (rate limiting), Copilot Workspace proposed adding middleware. Solid start. But it assumed we were using Express middleware patterns throughout the app.
We're not. We use a custom API wrapper that Copilot Workspace didn't have in its context window. So all the generated code was for the wrong abstraction layer.
The plan would've worked—in a different codebase. Not ours.
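For reference, the generated code was shaped like ordinary Express middleware, roughly like the sketch below (a naive in-memory fixed-window limiter of our own writing, not the actual output). Reasonable in isolation, but it hooks into app.use(), not into our wrapper.

```ts
import type { Request, Response, NextFunction } from "express";

// Naive per-IP fixed window; a real deployment would use shared storage like Redis
const WINDOW_MS = 60_000;
const MAX_REQUESTS = 100;
const hits = new Map<string, { count: number; resetAt: number }>();

export function rateLimit(req: Request, res: Response, next: NextFunction) {
  const key = req.ip ?? "unknown";
  const now = Date.now();
  const entry = hits.get(key);

  if (!entry || now > entry.resetAt) {
    hits.set(key, { count: 1, resetAt: now + WINDOW_MS });
    return next();
  }

  if (entry.count >= MAX_REQUESTS) {
    res.status(429).json({ error: "Too many requests" });
    return;
  }

  entry.count += 1;
  next();
}

// Usage the plan assumed: app.use("/api", rateLimit);
```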
4. No Awareness of Non-Code Requirements
Feature 2 (CSV export) needed:
- GDPR compliance (don't export PII without consent)
- File size limits (what if dataset is 10GB?)
- Background job for large exports
Copilot Workspace generated a simple "export everything to CSV" function with zero consideration for these real-world constraints. It solved the coding problem, not the actual problem.
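Even a minimal version of the real feature needs guards along these lines before any CSV is built; the limits and PII column list below are invented for illustration:

```ts
// Hypothetical pre-export checks the generated code skipped entirely
const MAX_SYNC_ROWS = 50_000;           // beyond this, hand the export off to a background job
const PII_COLUMNS = ["email", "phone"]; // placeholder; the real list comes from a compliance review

export function planExport(rows: Record<string, unknown>[], userConsentedToPII: boolean) {
  // Strip PII columns unless the user explicitly consented to exporting them
  const columns = Object.keys(rows[0] ?? {}).filter(
    (col) => userConsentedToPII || !PII_COLUMNS.includes(col)
  );

  return {
    columns,
    mode: rows.length > MAX_SYNC_ROWS ? "background-job" : "inline",
  };
}
```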
When to Use Copilot Workspace
After 5 features, here's our honest take on when it's worth using:
Use It For:
Well-defined, medium-complexity features - Dark mode toggle, pagination, sorting, filtering, basic CRUD. If you can explain it in a 3-sentence issue, Copilot Workspace will generate a decent starting point.
Planning unfamiliar areas - We used it for a Kubernetes config change. Neither of us knows K8s well. The generated plan was educational—it showed us what files to touch and what the changes should look like. We still reviewed every line, but it was a good roadmap.
Prototyping - Need to spike a feature quickly? Copilot Workspace gets you 70% of the way in 5 minutes. Perfect for "let's see if this is even feasible" work.
Don't Use It For:
Large refactors - Anything touching 15+ files or requiring architectural changes. Copilot Workspace loses the plot. You'll spend more time fixing the plan than doing it yourself.
Domain-specific logic - Business rules, compliance requirements, performance constraints. The AI doesn't know your domain. It'll generate code that works in a vacuum but fails in production.
Anything customer-facing without review - This should be obvious, but: never ship Copilot Workspace code without thorough review. It generates plausible code, not correct code.
Our Workflow Now
We've kept Copilot Workspace in our toolkit, but with constraints:
- Use it for planning, not implementation - We read the spec and file list, then write the code ourselves. This saves time without the risk of shipping AI slop.
- Great for second opinions - "What did I miss?" Run Copilot Workspace on your plan. If it suggests files or edge cases you didn't think of, that's valuable.
- Good for learning new tools - When working with unfamiliar frameworks or libraries, the generated code shows patterns we can learn from (even if we don't use it verbatim).
- Skip it for client work - For bespoke web development projects, we're not confident enough in the output quality yet. Maybe in 6 months. Not today.
The Honest Take
GitHub Copilot Workspace is impressive technology. Genuinely. The fact that it can read a GitHub issue, analyze a codebase, and propose a coherent implementation plan is remarkable.
But it's not replacing developers. Not even close.
What it is: a very smart intern who can draft plans quickly but needs supervision on everything. You still need to:
- Write detailed requirements
- Review the plan for gaps
- Rewrite most of the generated code
- Test thoroughly
- Understand the domain and edge cases
Where it shines: saving 30-60 minutes of planning time on straightforward features. Giving you a starting point when you're stuck. Catching edge cases you might've missed.
Where it fails: anything requiring judgment, domain expertise, or understanding of non-functional requirements.
If you're already a Copilot subscriber, Workspace is worth trying. It's free in technical preview. Just don't expect it to ship production features unsupervised.
If you're not using Copilot yet, this alone isn't worth the subscription. Regular Copilot autocomplete is more useful day-to-day.
Takeaways
After testing 5 real features:
- File discovery: 8/10 - Usually identifies the right files to touch
- Specification quality: 7/10 - Good starting point, misses domain-specific concerns
- Code generation: 5/10 - Works but rarely matches your actual codebase patterns
- Complex refactors: 2/10 - More harmful than helpful
- Time saved: 20-40% on planning, 0-10% on total feature time
It's a useful tool. Not a game-changer. Yet.
We'll keep testing it on new features and update this post as Copilot Workspace improves. For now, it's in the toolbox—used selectively, reviewed carefully, never trusted blindly.
Building complex features that need human judgment? We've shipped hundreds of custom applications and know when to use AI tools and when to think for ourselves. Get in touch if you need a team that understands the difference.