The Agent-Team Model: PM, Architect, Developer, QA, Security as Specialised Roles
Software has roles. Agent teams should too. Concrete role definitions, typed handoffs, and why specialisation outperforms generalism for enterprise AI software development.
Article 2A made the negative case: a single-AI-engineer product silently relocates eight of nine SDLC stages to the buyer’s organisation. This article makes the positive case. The right alternative is not a smarter, more autonomous single agent. It is a team of specialised agents — different roles, different priors, different outputs — composed through typed handoffs the same way a real engineering team is.
This article — Series 2, part 2 — is concrete. Each role gets a single-sentence purpose, an input artifact, an output artifact, and a named failure mode. The point is to make the pattern reproducible, not to sell a product.
The Seven Roles of Enterprise SDLC
A software change moves through seven roles in any serious organisation. The roles can be performed by humans, by agents, or by a mix — but the boundaries are what produce a defensible result. Treating them as one undifferentiated “engineer” is what loses the boundaries.
```mermaid
sequenceDiagram
    participant PM as PM Agent
    participant A as Architect Agent
    participant D as Developer Agent
    participant Q as QA Agent
    participant S as Security Agent
    participant DO as DevOps Agent
    participant CX as Customer-facing Agent
    PM->>A: Scoped requirement + acceptance criteria
    A->>D: Design doc + invariants + file targets
    D->>Q: Diff + execution log
    Q->>S: Verified diff + coverage report
    S->>DO: Approved diff + audit record
    DO->>CX: Released artifact + release notes
    CX-->>PM: Customer signal back into the loop
```
1. PM Agent
Purpose: turn ambiguous requests into scoped, prioritised work items with explicit acceptance criteria.
- Input: ticket / customer signal / strategic directive
- Output: scoped work item (title, description, acceptance criteria, target_branch, priority)
- Failure mode if missing: developer agents start with prompts that aren’t requirements, output drifts, downstream rework
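As a minimal sketch of what that output artifact can look like as a typed payload (the field names and priority values here are illustrative, not any product's actual schema):

```python
from typing import Literal

from pydantic import BaseModel, Field

class ScopedWorkItem(BaseModel):
    """Hypothetical PM-agent output: the handoff payload the architect agent consumes."""
    title: str
    description: str
    acceptance_criteria: list[str] = Field(min_length=1)  # an empty criteria list is rejected
    target_branch: str
    priority: Literal["p0", "p1", "p2", "p3"]              # illustrative priority scale
```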
2. Architect Agent
Purpose: turn scoped requirements into a design that other agents can implement against.
- Input: PM-scoped work item, repo context, existing patterns
- Output: design document (file targets, invariants, blast radius, alternatives considered)
- Failure mode if missing: developer agents make architectural decisions inline; cross-cutting changes accumulate technical debt
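The design doc can be typed the same way; a sketch under the same assumption that the fields are illustrative:

```python
from pydantic import BaseModel, Field

class DesignDoc(BaseModel):
    """Hypothetical architect-agent output: what the developer agent implements against."""
    work_item_id: str                      # links back to the PM's scoped work item
    file_targets: list[str]                # files the change is expected to touch
    invariants: list[str]                  # properties the implementation must not break
    blast_radius: str                      # what else could plausibly be affected
    alternatives_considered: list[str] = Field(default_factory=list)
```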
3. Developer Agent
Purpose: turn a design into surgical, minimal-diff code that aligns with project conventions.
- Input: architect’s design doc + acceptance criteria
- Output: diff + tests + execution log
- Failure mode if missing: nothing ships
4. QA Agent
Purpose: enforce coverage thresholds, mutation testing, and adversarial-input testing on every diff.
- Input: developer’s diff
- Output: coverage delta + test report + verdict (pass / reject)
- Failure mode if missing: tests-pass-but-don’t-test ships; regressions surface in production
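The coverage gate is the easiest piece of this to make concrete. A sketch, with thresholds and field names chosen for illustration rather than taken from any product:

```python
from typing import Literal

from pydantic import BaseModel

class QAReport(BaseModel):
    """Hypothetical QA-agent output for a single diff."""
    diff_id: str
    coverage_before: float   # line coverage as a fraction, 0.0 to 1.0
    coverage_after: float
    mutation_score: float    # fraction of injected mutants killed by the suite
    verdict: Literal["pass", "reject"]

def qa_verdict(coverage_after: float, mutation_score: float,
               min_coverage: float = 0.80, min_mutation: float = 0.60) -> str:
    """Reject the diff if either gate fails; both thresholds are placeholder values."""
    ok = coverage_after >= min_coverage and mutation_score >= min_mutation
    return "pass" if ok else "reject"
```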
5. Security Agent
Purpose: run threat modelling, SAST, secret scanning, dependency review, and produce a structured audit record.
- Input: QA-verified diff
- Output: security finding artifact + audit record
- Failure mode if missing: AI-typical vulnerability classes (string-concatenated SQL, hardcoded secrets, prompt injection) merge silently
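A structured finding is worth more than a prose warning because downstream gates can act on it mechanically. A sketch, with severity levels and fields chosen for illustration:

```python
from typing import Literal

from pydantic import BaseModel

class SecurityFinding(BaseModel):
    """One finding from SAST, secret scanning, or dependency review (illustrative fields)."""
    rule_id: str                                    # e.g. a scanner rule or CWE identifier
    severity: Literal["critical", "high", "medium", "low"]
    file: str
    line: int
    summary: str

class SecurityAuditRecord(BaseModel):
    """Hypothetical security-agent output attached to the diff before DevOps sees it."""
    diff_id: str
    findings: list[SecurityFinding]
    approved: bool                                  # False while any blocking finding is open
```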
6. DevOps Agent
Purpose: own deployment gates, environment promotion, and rollback authority.
- Input: approved diff + audit record
- Output: deployment status + rollout decision + rollback artifact
- Failure mode if missing: deploy is risky and slow; rollback is manual
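The rollback artifact is the piece most often missing when deployment is an afterthought. A sketch with assumed field names:

```python
from typing import Literal

from pydantic import BaseModel

class RolloutDecision(BaseModel):
    """Hypothetical DevOps-agent output: enough information to undo a change without archaeology."""
    diff_id: str
    environment: Literal["staging", "production"]
    status: Literal["deployed", "held", "rolled_back"]
    previous_version: str      # the artifact to restore if rollback is triggered
    rollback_reference: str    # illustrative: a pre-verified pipeline or runbook reference
```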
7. Customer-facing / UX Agent
Purpose: translate engineering outputs into customer-visible artifacts and capture customer signals back into the loop.
- Input: deployment status, customer feedback channels
- Output: release notes, customer-facing copy review, signal funnelled back to PM
- Failure mode if missing: customer reality drifts from engineering’s understanding; product diverges from market
Role Boundaries Are the System Architecture
Notice what each role does NOT do. The architect doesn’t write code. The developer doesn’t approve security. The QA agent doesn’t ship to production. The security agent doesn’t decide priority. These boundaries are not bureaucratic — they are how independent eyes get applied to each stage.
A single agent that does all seven would have to model all seven sets of priors simultaneously. Even when an LLM is technically capable of that, its output carries no structural guarantee that any one stage was reviewed by a different mind. Independent review by different priors is a property of separation, not of model capability.
The boundary is also where typed handoff payloads enforce the architecture. Each agent’s output schema is fixed. The receiving agent validates it before consuming. A malformed handoff fails fast at the boundary instead of corrupting the entire downstream chain. This is the LangGraph / pydantic / zod pattern from Agent Workflows in Enterprise Software Development — it’s how the role-team model survives contact with messy inputs.
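Concretely, “fails fast at the boundary” means the receiving agent validates the payload before spending any work on it. A minimal sketch using pydantic, repeating the illustrative ScopedWorkItem model from the PM section so the example is self-contained:

```python
from typing import Literal

from pydantic import BaseModel, ValidationError

class ScopedWorkItem(BaseModel):
    title: str
    description: str
    acceptance_criteria: list[str]
    target_branch: str
    priority: Literal["p0", "p1", "p2", "p3"]

class HandoffRejected(Exception):
    """Raised when an upstream agent's payload fails schema validation."""

def architect_intake(raw_payload: dict) -> ScopedWorkItem:
    """The architect agent's intake step: validate the PM handoff before consuming it."""
    try:
        return ScopedWorkItem.model_validate(raw_payload)
    except ValidationError as exc:
        # Fail fast at the boundary instead of designing against a malformed requirement.
        raise HandoffRejected(f"PM handoff failed validation: {exc}") from exc

# A payload missing its description and acceptance criteria is rejected before design work starts.
try:
    architect_intake({"title": "Add SSO", "target_branch": "main", "priority": "p1"})
except HandoffRejected as err:
    print(err)
```

The same intake step sits in front of every role, so a malformed or drifting schema is caught at the first boundary it crosses rather than three stages later.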
Roles Table — Inputs, Outputs, Failure Modes
The compact view, all seven roles in one place:
| Role | Input artifact | Output artifact | Failure mode if absent |
|---|---|---|---|
| PM | Customer / strategic signal | Scoped work item w/ acceptance criteria | Drift, rework, prompt-shaped requirements |
| Architect | Scoped work item | Design doc w/ invariants + alternatives | Inline architectural decisions, debt |
| Developer | Design doc | Diff + tests + log | Nothing ships |
| QA | Diff | Coverage delta + test report | Tests-pass-but-don’t-test |
| Security | Verified diff | Security findings + audit record | Silent vuln-class merges |
| DevOps | Approved diff + audit record | Deployment status + rollback artifact | Risky / slow / un-rollbackable deploy |
| Customer-facing | Deployment + customer signal | Release notes + signal feedback | Customer reality drift |
Why Specialisation Outperforms Generalism
The instinctive objection to the agent-team model is: “isn’t a sufficiently capable single agent equivalent?” The honest answer is: not for the kind of work enterprise software actually is. Three reasons make the difference durable.
First, independence catches what fluency hides. A single agent reviewing its own code shares its blind spots. A QA agent built on a different prior, trained against test-design rather than code-generation, catches mistakes the developer agent’s training distribution doesn’t surface. Same for security — an agent whose job description is “find what’s wrong” is structurally different from an agent whose job description is “make it work.”
Second, specialisation makes the artifact reviewable. A team of seven specialised agents produces seven artifacts a human can inspect: design doc, diff, coverage report, security findings, audit record, deployment log, release notes. A single agent doing all seven produces one chat transcript. The reviewable surface is the artifact set. Fewer artifacts = thinner audit story = harder procurement defence.
Third, role boundaries are how concurrency scales. Two developer agents working in parallel on the same feature can coordinate via the same architect agent’s design doc, without re-resolving the design from scratch. A single-agent pattern forces serialisation of any concurrent work because there is no shared, pre-decided design to coordinate around. As enterprise software volume scales, the team-of-agents pattern is the one that doesn’t grind to a halt.
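The coordination point is the shared artifact, not a shared conversation. A sketch of the idea, with hypothetical function names standing in for whatever framework actually runs the agents:

```python
import asyncio

async def developer_agent(name: str, design_doc: dict, task: str) -> str:
    """Hypothetical developer agent: implements one slice of the feature against a fixed design."""
    # Both agents read the same pre-decided design; neither re-opens architectural questions.
    targets = design_doc["file_targets"]
    return f"{name}: diff for '{task}' touching {targets}"

async def main() -> None:
    design_doc = {
        "file_targets": ["api/auth.py", "api/sessions.py"],
        "invariants": ["no breaking changes to the public API"],
    }
    # Two developer agents work concurrently, coordinated only by the architect's design doc.
    diffs = await asyncio.gather(
        developer_agent("dev-a", design_doc, "token refresh endpoint"),
        developer_agent("dev-b", design_doc, "session revocation endpoint"),
    )
    print(diffs)

asyncio.run(main())
```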
Generalist vs Specialist — The Direct Comparison
The same SDLC concerns, scored against the single-AI-engineer frame and the agent-team frame:
| Concern | Single AI engineer | Agent team |
|---|---|---|
| Design quality | Inline, implicit, opaque | Explicit design doc with invariants |
| Review independence | None — same mind that wrote it | Built-in: separate QA + security agents |
| Security ownership | Implied / inline | Named role with structured findings |
| Audit completeness | Author-only attestation | Per-stage artifact + control mappings |
| On-call coverage | Out of scope | DevOps agent owns deployment + rollback |
| Customer feedback loop | Unaddressed | Customer-facing role closes the loop |
| Procurement story | “Buy our agent” | “Buy a coordinated team” |
The argument for specialisation in software is the same argument that makes a five-person team different from a five-times-experienced individual. Different priors catch different mistakes. The product of independent reviews is greater than the sum of the parts. When the unit of work is “every change to a regulated codebase,” that compounding is exactly what the buyer is paying for.
The Axiom Mapping
VibeFlow operationalises this directly. The default workflow ships with named personas — Aria the PM, an Architect, a Developer (Kai when the Principal Engineer override is active), Quinn the QA Lead, Sophie the Security Lead, plus customer-facing personas where applicable. Each persona has its own intake statuses, its own role-specific instructions, and its own contribution to the audit record. The status flow planning → implementing → done → security_review → qa_verified is the seven-role chain compressed to the stages that matter for the agent team’s daily work; the deploy and customer-facing steps live in the surrounding pipeline by design.
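As a sketch of how a status chain like that can be enforced rather than merely documented (this illustrates the pattern only; it is not VibeFlow’s actual implementation):

```python
from enum import Enum

class Status(str, Enum):
    PLANNING = "planning"
    IMPLEMENTING = "implementing"
    DONE = "done"
    SECURITY_REVIEW = "security_review"
    QA_VERIFIED = "qa_verified"

# Each status has exactly one legal successor, mirroring the flow described above.
NEXT_STATUS = {
    Status.PLANNING: Status.IMPLEMENTING,
    Status.IMPLEMENTING: Status.DONE,
    Status.DONE: Status.SECURITY_REVIEW,
    Status.SECURITY_REVIEW: Status.QA_VERIFIED,
}

def advance(current: Status, requested: Status) -> Status:
    """Reject any transition that skips or reorders a stage in the chain."""
    if NEXT_STATUS.get(current) != requested:
        raise ValueError(f"illegal transition: {current.value} -> {requested.value}")
    return requested
```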
The protocol layer matters here too. The A2A Protocol is what makes specialised agents from different teams or vendors composable into one coordinated team. The MCP Gateway is what gives every agent its tools without each agent owning its own integrations. The LLM Gateway is what gives every agent its model with per-role policy and cost controls. The agent-team frame is not a single product — it is a stack, with open protocols underneath and persona-typed agents on top.
For more on the daily operational shape, see Agent Workflows in Enterprise Software Development and From Individual Copilots to Team-Wide AI Orchestration. The platform comparison series (1A landing, 1B compliance, 1C integrations) shows what the agent-team shape looks like vs the single-agent and PM-augmented alternatives.
The Frame Choice Is a Procurement Choice
Buyers picking AI tools for software development are picking a frame, whether they realise it or not. “AI software engineer” pre-decides that the buyer’s organisation will absorb every role other than the developer’s. “Agent team” pre-decides that the team’s roles get specialised, repeatable representation. Neither is universally right — but the buyer should know which choice they’re making, and pick consciously.
Article 2C closes the series by arguing that even the agent-team framing isn’t the whole answer. The right unit of comparison is the entire delivery process — SDLC discipline as the actual differentiator.
For role-specific reading, see /for/engineering-leaders, /for/engineering-managers, /for/platform-teams, and /for/cisos. Or start free with VibeFlow to see the agent team running on your own repo.
Written by
AXIOM Team