The Agent-Team Model: PM, Architect, Developer, QA, Security as Specialised Roles
Software has roles. Agent teams should too. Concrete role definitions, typed handoffs, and why specialisation outperforms generalism for enterprise AI software development.
Article 2A made the negative case: a single-AI-engineer product silently relocates eight of nine SDLC stages to the buyer’s organisation. This article makes the positive case. The right alternative is not a smarter, more autonomous single agent. It is a team of specialised agents — different roles, different priors, different outputs — composed through typed handoffs the same way a real engineering team is.
This article — Series 2, part 2 — is concrete. Each role gets a single-sentence purpose, an input artifact, an output artifact, and a named failure mode. The point is to make the pattern reproducible, not to sell a product.
The Seven Roles of Enterprise SDLC
A software change moves through seven roles in any serious organisation. The roles can be performed by humans, by agents, or by a mix — but the boundaries are what produce a defensible result. Treating them as one undifferentiated “engineer” is what loses the boundaries.
```mermaid
sequenceDiagram
    participant PM as PM Agent
    participant A as Architect Agent
    participant D as Developer Agent
    participant Q as QA Agent
    participant S as Security Agent
    participant DO as DevOps Agent
    participant CX as Customer-facing Agent
    PM->>A: Scoped requirement + acceptance criteria
    A->>D: Design doc + invariants + file targets
    D->>Q: Diff + execution log
    Q->>S: Verified diff + coverage report
    S->>DO: Approved diff + audit record
    DO->>CX: Released artifact + release notes
    CX-->>PM: Customer signal back into the loop
```
1. PM Agent
Purpose: turn ambiguous requests into scoped, prioritised work items with explicit acceptance criteria.
- Input: ticket / customer signal / strategic directive
- Output: scoped work item (title, description, acceptance criteria, target_branch, priority)
- Failure mode if missing: developer agents start with prompts that aren’t requirements, output drifts, downstream rework
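As a minimal sketch of what that output artifact can look like as a typed payload (the field names and priority values here are illustrative, not any product's actual schema):

```python
from typing import Literal

from pydantic import BaseModel, Field

class ScopedWorkItem(BaseModel):
    """Hypothetical PM-agent output: the handoff payload the architect agent consumes."""
    title: str
    description: str
    acceptance_criteria: list[str] = Field(min_length=1)  # an empty criteria list is rejected
    target_branch: str
    priority: Literal["p0", "p1", "p2", "p3"]              # illustrative priority scale
```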
2. Architect Agent
Purpose: turn scoped requirements into a design that other agents can implement against.
- Input: PM-scoped work item, repo context, existing patterns
- Output: design document (file targets, invariants, blast radius, alternatives considered)
- Failure mode if missing: developer agents make architectural decisions inline; cross-cutting changes accumulate technical debt
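The design doc can be typed the same way; a sketch under the same assumption that the fields are illustrative:

```python
from pydantic import BaseModel, Field

class DesignDoc(BaseModel):
    """Hypothetical architect-agent output: what the developer agent implements against."""
    work_item_id: str                      # links back to the PM's scoped work item
    file_targets: list[str]                # files the change is expected to touch
    invariants: list[str]                  # properties the implementation must not break
    blast_radius: str                      # what else could plausibly be affected
    alternatives_considered: list[str] = Field(default_factory=list)
```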
3. Developer Agent
Purpose: turn a design into surgical, minimal-diff code that aligns with project conventions.
- Input: architect’s design doc + acceptance criteria
- Output: diff + tests + execution log
- Failure mode if missing: nothing ships
4. QA Agent
Purpose: enforce coverage thresholds, mutation testing, and adversarial-input testing on every diff.
- Input: developer’s diff
- Output: coverage delta + test report + verdict (pass / reject)
- Failure mode if missing: tests-pass-but-don’t-test ships; regressions surface in production
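The coverage gate is the easiest piece of this to make concrete. A sketch, with thresholds and field names chosen for illustration rather than taken from any product:

```python
from typing import Literal

from pydantic import BaseModel

class QAReport(BaseModel):
    """Hypothetical QA-agent output for a single diff."""
    diff_id: str
    coverage_before: float   # line coverage as a fraction, 0.0 to 1.0
    coverage_after: float
    mutation_score: float    # fraction of injected mutants killed by the suite
    verdict: Literal["pass", "reject"]

def qa_verdict(coverage_after: float, mutation_score: float,
               min_coverage: float = 0.80, min_mutation: float = 0.60) -> str:
    """Reject the diff if either gate fails; both thresholds are placeholder values."""
    ok = coverage_after >= min_coverage and mutation_score >= min_mutation
    return "pass" if ok else "reject"
```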
5. Security Agent
Purpose: run threat modelling, SAST, secret scanning, dependency review, and produce a structured audit record.
- Input: QA-verified diff
- Output: security finding artifact + audit record
- Failure mode if missing: AI-typical vulnerability classes (string-concatenated SQL, hardcoded secrets, prompt injection) merge silently
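A structured finding is worth more than a prose warning because downstream gates can act on it mechanically. A sketch, with severity levels and fields chosen for illustration:

```python
from typing import Literal

from pydantic import BaseModel

class SecurityFinding(BaseModel):
    """One finding from SAST, secret scanning, or dependency review (illustrative fields)."""
    rule_id: str                                    # e.g. a scanner rule or CWE identifier
    severity: Literal["critical", "high", "medium", "low"]
    file: str
    line: int
    summary: str

class SecurityAuditRecord(BaseModel):
    """Hypothetical security-agent output attached to the diff before DevOps sees it."""
    diff_id: str
    findings: list[SecurityFinding]
    approved: bool                                  # False while any blocking finding is open
```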
6. DevOps Agent
Purpose: own deployment gates, environment promotion, and rollback authority.
- Input: approved diff + audit record
- Output: deployment status + rollout decision + rollback artifact
- Failure mode if missing: deploy is risky and slow; rollback is manual
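The rollback artifact is the piece most often missing when deployment is an afterthought. A sketch with assumed field names:

```python
from typing import Literal

from pydantic import BaseModel

class RolloutDecision(BaseModel):
    """Hypothetical DevOps-agent output: enough information to undo a change without archaeology."""
    diff_id: str
    environment: Literal["staging", "production"]
    status: Literal["deployed", "held", "rolled_back"]
    previous_version: str      # the artifact to restore if rollback is triggered
    rollback_reference: str    # illustrative: a pre-verified pipeline or runbook reference
```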
7. Customer-facing / UX Agent
Purpose: translate engineering outputs into customer-visible artifacts and capture customer signals back into the loop.
- Input: deployment status, customer feedback channels
- Output: release notes, customer-facing copy review, signal funnelled back to PM
- Failure mode if missing: customer reality drifts from engineering’s understanding; product diverges from market
Role Boundaries Are the System Architecture
Notice what each role does NOT do. The architect doesn’t write code. The developer doesn’t approve security. The QA agent doesn’t ship to production. The security agent doesn’t decide priority. These boundaries are not bureaucratic — they are how independent eyes get applied to each stage.
A single agent that does all seven would have to model all seven sets of priors simultaneously. Even when an LLM is technically capable of that, its output carries no structural guarantee that any one stage was reviewed by a different mind. Independent review by different priors is a property of separation, not of model capability.
The boundary is also where typed handoff payloads enforce the architecture. Each agent’s output schema is fixed. The receiving agent validates it before consuming. A malformed handoff fails fast at the boundary instead of corrupting the entire downstream chain. This is the LangGraph / pydantic / zod pattern from Agent Workflows in Enterprise Software Development — it’s how the role-team model survives contact with messy inputs.
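Concretely, “fails fast at the boundary” means the receiving agent validates the payload before spending any work on it. A minimal sketch using pydantic, repeating the illustrative ScopedWorkItem model from the PM section so the example is self-contained:

```python
from typing import Literal

from pydantic import BaseModel, ValidationError

class ScopedWorkItem(BaseModel):
    title: str
    description: str
    acceptance_criteria: list[str]
    target_branch: str
    priority: Literal["p0", "p1", "p2", "p3"]

class HandoffRejected(Exception):
    """Raised when an upstream agent's payload fails schema validation."""

def architect_intake(raw_payload: dict) -> ScopedWorkItem:
    """The architect agent's intake step: validate the PM handoff before consuming it."""
    try:
        return ScopedWorkItem.model_validate(raw_payload)
    except ValidationError as exc:
        # Fail fast at the boundary instead of designing against a malformed requirement.
        raise HandoffRejected(f"PM handoff failed validation: {exc}") from exc

# A payload missing its description and acceptance criteria is rejected before design work starts.
try:
    architect_intake({"title": "Add SSO", "target_branch": "main", "priority": "p1"})
except HandoffRejected as err:
    print(err)
```

The same intake step sits in front of every role, so a malformed or drifting schema is caught at the first boundary it crosses rather than three stages later.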
Roles Table — Inputs, Outputs, Failure Modes
The compact view, all seven roles in one place:
| Role | Input artifact | Output artifact | Failure mode if absent |
|---|---|---|---|
| PM | Customer / strategic signal | Scoped work item w/ acceptance criteria | Drift, rework, prompt-shaped requirements |
| Architect | Scoped work item | Design doc w/ invariants + alternatives | Inline architectural decisions, debt |
| Developer | Design doc | Diff + tests + log | Nothing ships |
| QA | Diff | Coverage delta + test report | Tests-pass-but-don’t-test |
| Security | Verified diff | Security findings + audit record | Silent vuln-class merges |
| DevOps | Approved diff + audit record | Deployment status + rollback artifact | Risky / slow / un-rollbackable deploy |
| Customer-facing | Deployment + customer signal | Release notes + signal feedback | Customer reality drift |
Why Specialisation Outperforms Generalism
The instinctive objection to the agent-team model is: “isn’t a sufficiently capable single agent equivalent?” The honest answer is: not for the kind of work enterprise software actually is. Three reasons make the difference durable.
First, independence catches what fluency hides. A single agent reviewing its own code shares its blind spots. A QA agent built on a different prior, trained against test-design rather than code-generation, catches mistakes the developer agent’s training distribution doesn’t surface. Same for security — an agent whose job description is “find what’s wrong” is structurally different from an agent whose job description is “make it work.”
Second, specialisation makes the artifact reviewable. A team of seven specialised agents produces seven artifacts a human can inspect: design doc, diff, coverage report, security findings, audit record, deployment log, release notes. A single agent doing all seven produces one chat transcript. The reviewable surface is the artifact set. Fewer artifacts = thinner audit story = harder procurement defence.
Third, role boundaries are how concurrency scales. Two developer agents working in parallel on the same feature can coordinate via the same architect agent’s design doc, without re-resolving the design from scratch. A single-agent pattern forces serialisation of any concurrent work because there is no shared, pre-decided design to coordinate around. As enterprise software volume scales, the team-of-agents pattern is the one that doesn’t grind to a halt.
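The coordination point is the shared artifact, not a shared conversation. A sketch of the idea, with hypothetical function names standing in for whatever framework actually runs the agents:

```python
import asyncio

async def developer_agent(name: str, design_doc: dict, task: str) -> str:
    """Hypothetical developer agent: implements one slice of the feature against a fixed design."""
    # Both agents read the same pre-decided design; neither re-opens architectural questions.
    targets = design_doc["file_targets"]
    return f"{name}: diff for '{task}' touching {targets}"

async def main() -> None:
    design_doc = {
        "file_targets": ["api/auth.py", "api/sessions.py"],
        "invariants": ["no breaking changes to the public API"],
    }
    # Two developer agents work concurrently, coordinated only by the architect's design doc.
    diffs = await asyncio.gather(
        developer_agent("dev-a", design_doc, "token refresh endpoint"),
        developer_agent("dev-b", design_doc, "session revocation endpoint"),
    )
    print(diffs)

asyncio.run(main())
```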
Generalist vs Specialist — The Direct Comparison
The same SDLC concerns, scored against the single-AI-engineer frame and the agent-team frame:
| Concern | Single AI engineer | Agent team |
|---|---|---|
| Design quality | Inline, implicit, opaque | Explicit design doc with invariants |
| Review independence | None — same mind that wrote it | Built-in: separate QA + security agents |
| Security ownership | Implied / inline | Named role with structured findings |
| Audit completeness | Author-only attestation | Per-stage artifact + control mappings |
| On-call coverage | Out of scope | DevOps agent owns deployment + rollback |
| Customer feedback loop | Unaddressed | Customer-facing role closes the loop |
| Procurement story | “Buy our agent” | “Buy a coordinated team” |
The argument for specialisation in software is the same argument that makes a five-person team different from a five-times-experienced individual. Different priors catch different mistakes. The product of independent reviews is greater than the sum of the parts. When the unit of work is “every change to a regulated codebase,” that compounding is exactly what the buyer is paying for.
The Axiom Mapping
VibeFlow operationalises this directly. The default workflow ships with named personas — Aria the PM, an Architect, a Developer (Kai when the Principal Engineer override is active), Quinn the QA Lead, Sophie the Security Lead, plus customer-facing personas where applicable. Each persona has its own intake statuses, its own role-specific instructions, and its own contribution to the audit record. The status flow planning → implementing → done → security_review → qa_verified is the seven-role chain compressed to the stages that matter for the agent team’s daily work; the deploy and customer-facing steps live in the surrounding pipeline by design.
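As a sketch of how a status chain like that can be enforced rather than merely documented (this illustrates the pattern only; it is not VibeFlow’s actual implementation):

```python
from enum import Enum

class Status(str, Enum):
    PLANNING = "planning"
    IMPLEMENTING = "implementing"
    DONE = "done"
    SECURITY_REVIEW = "security_review"
    QA_VERIFIED = "qa_verified"

# Each status has exactly one legal successor, mirroring the flow described above.
NEXT_STATUS = {
    Status.PLANNING: Status.IMPLEMENTING,
    Status.IMPLEMENTING: Status.DONE,
    Status.DONE: Status.SECURITY_REVIEW,
    Status.SECURITY_REVIEW: Status.QA_VERIFIED,
}

def advance(current: Status, requested: Status) -> Status:
    """Reject any transition that skips or reorders a stage in the chain."""
    if NEXT_STATUS.get(current) != requested:
        raise ValueError(f"illegal transition: {current.value} -> {requested.value}")
    return requested
```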
The protocol layer matters here too. The A2A Protocol is what makes specialised agents from different teams or vendors composable into one coordinated team. The MCP Gateway is what gives every agent its tools without each agent owning its own integrations. The LLM Gateway is what gives every agent its model with per-role policy and cost controls. The agent-team frame is not a single product — it is a stack, with open protocols underneath and persona-typed agents on top.
For more on the daily operational shape, see Agent Workflows in Enterprise Software Development and From Individual Copilots to Team-Wide AI Orchestration. The platform comparison series (1A landing, 1B compliance, 1C integrations) shows what the agent-team shape looks like vs the single-agent and PM-augmented alternatives.
The Frame Choice Is a Procurement Choice
Buyers picking AI tools for software development are picking a frame, whether they realise it or not. “AI software engineer” pre-decides that the buyer’s organisation will absorb every role other than the developer’s. “Agent team” pre-decides that the team’s roles get specialised, repeatable representation. Neither is universally right — but the buyer should know which choice they’re making, and pick consciously.
Article 2C closes the series by arguing that even the agent-team framing isn’t the whole answer. The right unit of comparison is the entire delivery process — SDLC discipline as the actual differentiator.
For role-specific reading, see /for/engineering-leaders, /for/engineering-managers, /for/platform-teams, and /for/cisos. Or start free with VibeFlow to see the agent team running on your own repo.
Written by
AXIOM Team