Building Multi-Agent Systems with Claude Opus 4 for Complex Tasks
The era of the "single prompt" solution is ending. As we move from simple chatbots to autonomous systems capable of executing complex enterprise workflows (automated due diligence, legacy code migration, dynamic market analysis), the limiting factor is no longer the model's knowledge but the architecture of the system built around it.
To solve multi-step, ambiguous problems, we must move toward Multi-Agent Systems (MAS). In these architectures, a highly intelligent "Orchestrator" breaks down high-level goals into sub-tasks and delegates them to specialized agents.
Currently, Claude 3 Opus stands as the premier choice for this "Orchestrator" role due to its superior reasoning, long-context recall, and ability to follow complex chain-of-thought instructions with few hallucinations. While the industry anticipates the arrival of next-generation models (often speculated as "Claude 4"), the architectural patterns we build today with Opus are the foundation for those future capabilities.
In this article, we will engineer a robust multi-agent system using Python and the Anthropic API, designed for CTOs and Senior Engineers ready to move beyond proof-of-concept.
The Architecture: The Hub-and-Spoke "Supervisor" Pattern
For complex tasks, a flat structure where agents talk to everyone else often leads to infinite loops and state drift. Instead, we use a Hub-and-Spoke (Supervisor) pattern.
- The Brain (Supervisor): Powered by Claude 3 Opus. It holds the "Global State" and the "Plan." It does not execute tools (like scraping or database writes) directly unless critical. Its job is to think, critique, and route.
- The Workers (Sub-Agents): Powered by faster, cost-effective models like Claude 3.5 Sonnet or Haiku. These agents possess specific "Tools" (functions) and are myopic—they only care about their specific sub-task (e.g., "Scrape this URL" or "Run this SQL query").
- Shared State: A persistent JSON or Pydantic object that tracks the history of actions, results, and the current plan.
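The shared state in the last bullet can be sketched as a small typed object. The version below uses standard-library dataclasses so it runs without extra dependencies; a Pydantic BaseModel with the same fields would add runtime validation. The names `SharedState` and `ActionRecord`, and their fields, are illustrative assumptions rather than part of any SDK.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class ActionRecord:
    """One delegated step: which worker ran, what it was asked, what it returned."""
    worker: str
    task: str
    result: str


@dataclass
class SharedState:
    """Hypothetical global state owned by the Supervisor."""
    goal: str
    plan: List[str] = field(default_factory=list)
    history: List[ActionRecord] = field(default_factory=list)

    def record(self, worker: str, task: str, result: str) -> None:
        """Append a completed action so later planning steps can see it."""
        self.history.append(ActionRecord(worker, task, result))

    def to_json(self) -> str:
        """Serialize the whole state, e.g. for persistence between runs."""
        return json.dumps(asdict(self), indent=2)


state = SharedState(goal="Compare two vendors", plan=["research", "write"])
state.record("researcher", "Find pricing for vendor A", "Simulated result")
print(state.to_json())
```

Keeping the plan and the action history in one serializable object makes it straightforward to persist a run and resume it later.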
Technical Implementation
We will build a system where an Orchestrator (Opus) manages a research workflow. It will delegate tasks to a "Search Agent" and a "Writer Agent."
1. Prerequisites and Setup
We rely on the native anthropic SDK and pydantic for strict type validation—a non-negotiable for production Product Engineering.
pip install anthropic pydantic
2. Defining the State and Tools
First, we define the structure of our messages and the tools our agents can use.
import os
import json
from typing import List, Dict, Any, Optional
from pydantic import BaseModel, Field
from anthropic import Anthropic

# Initialize Client
client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

# Define a tool for our workers
def web_search_tool(query: str):
    # In production, replace with SerpAPI or similar
    return f"Simulated search results for: {query}"

TOOLS_DEFINITION = [
    {
        "name": "web_search",
        "description": "Searches the web for information.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query"}
            },
            "required": ["query"]
        }
    }
]
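As more tools are added, it helps to route tool calls through a single dispatcher instead of hard-coding each function call. The following is a minimal sketch under that assumption; `TOOL_REGISTRY`, `REQUIRED_ARGS`, and `dispatch_tool` are hypothetical names, and the required-argument list is copied by hand from the tool's `input_schema`.

```python
from typing import Any, Callable, Dict, List


def web_search_tool(query: str) -> str:
    # Stand-in for a real search backend, mirroring the simulated tool above.
    return f"Simulated search results for: {query}"


# Hypothetical registry: tool name (as declared to the model) -> Python callable.
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {
    "web_search": web_search_tool,
}

# Required argument names per tool, copied from each tool's input_schema.
REQUIRED_ARGS: Dict[str, List[str]] = {
    "web_search": ["query"],
}


def dispatch_tool(name: str, tool_input: Dict[str, Any]) -> str:
    """Look up a tool by name, check required arguments, and call it."""
    if name not in TOOL_REGISTRY:
        return f"Error: unknown tool '{name}'"
    missing = [arg for arg in REQUIRED_ARGS.get(name, []) if arg not in tool_input]
    if missing:
        return f"Error: missing arguments {missing} for tool '{name}'"
    return TOOL_REGISTRY[name](**tool_input)


print(dispatch_tool("web_search", {"query": "hub-and-spoke agents"}))
print(dispatch_tool("web_search", {}))
```

Returning error strings rather than raising keeps failures visible to whichever agent requested the tool.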
3. The Opus Orchestrator (The "Brain")
The critical engineering challenge is the System Prompt. We must force Claude 3 Opus to act strictly as a manager, not a worker. We use XML tags (which Claude handles well) to define clear boundaries, and we spell out the exact JSON shape the runtime expects.
ORCHESTRATOR_SYSTEM_PROMPT = """
You are the Chief Orchestrator of a research team.
Your Goal: Answer the user's complex question by delegating tasks to your workers.

<workers>
1. 'researcher': Can search the internet.
2. 'writer': Can compile information into a summary.
</workers>

<instructions>
1. Analyze the user's request.
2. Break it down into step-by-step sub-tasks.
3. Output only a JSON object with the keys "next_action" and "payload".
   - If you need information, set "next_action" to "researcher" and "payload" to the search task.
   - If you have enough info, set "next_action" to "writer" and "payload" to the material to summarize.
   - If the task is done, set "next_action" to "FINISH" and "payload" to the final answer.
</instructions>

Do not perform the search yourself. DELEGATE.
"""
def get_orchestrator_decision(messages: List[Dict]) -> Dict:
    """
    Asks Claude 3 Opus to decide the next step.
    """
    response = client.messages.create(
        model="claude-3-opus-20240229",  # Using Opus for high-level reasoning
        max_tokens=1024,
        system=ORCHESTRATOR_SYSTEM_PROMPT,
        messages=messages
    )
    # In a real app, use robust JSON parsing/validation here
    try:
        decision = json.loads(response.content[0].text)
        return decision
    except json.JSONDecodeError:
        # Fallback logic or retry mechanism
        return {"next_action": "ERROR", "payload": "Failed to parse JSON"}
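The "robust JSON parsing/validation" mentioned in the comment could look roughly like the sketch below. It assumes the model may wrap its JSON in extra prose and that only three actions are legal; `parse_decision` and `ALLOWED_ACTIONS` are hypothetical names introduced for illustration.

```python
import json
from typing import Any, Dict

# Assumed set of legal routing targets, taken from the system prompt.
ALLOWED_ACTIONS = {"researcher", "writer", "FINISH"}


def parse_decision(raw_text: str) -> Dict[str, Any]:
    """Extract and validate the orchestrator's JSON decision from raw model text."""
    # Take the span from the first "{" to the last "}" to tolerate surrounding prose.
    start = raw_text.find("{")
    end = raw_text.rfind("}")
    if start == -1 or end <= start:
        return {"next_action": "ERROR", "payload": "No JSON object found"}
    try:
        decision = json.loads(raw_text[start:end + 1])
    except json.JSONDecodeError as exc:
        return {"next_action": "ERROR", "payload": f"Invalid JSON: {exc.msg}"}
    action = decision.get("next_action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return {"next_action": "ERROR", "payload": f"Unexpected action: {action!r}"}
    return {"next_action": action, "payload": str(decision.get("payload", ""))}


print(parse_decision('Sure! {"next_action": "researcher", "payload": "find X"}'))
print(parse_decision("I could not decide."))
```

Rejecting unknown actions at this boundary means the runtime never has to route to a worker that does not exist.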
4. The Worker Agents (The Execution Layer)
For the workers, we don't need the massive reasoning cost of Opus. We can use Claude 3.5 Sonnet, which is faster and highly capable of tool execution.
def run_worker_agent(agent_name: str, task_description: str) -> str:
    """
    Executes a specific task using a sub-agent (Sonnet).
    """
    if agent_name == "researcher":
        # This agent is 'bound' to the web_search tool
        response = client.messages.create(
            model="claude-3-5-sonnet-20240620",
            max_tokens=1024,
            tools=TOOLS_DEFINITION,
            system=f"You are a specialist {agent_name}. Perform the requested task using your tools.",
            messages=[{"role": "user", "content": task_description}]
        )
        # Handle the tool-use response
        if response.stop_reason == "tool_use":
            tool_use = next(block for block in response.content if block.type == "tool_use")
            # Execute the function (simulated)
            result = web_search_tool(tool_use.input["query"])
            return f"Search Result: {result}"
        # The model answered without calling the tool: return its text rather
        # than falling through to "Unknown Agent"
        return "".join(block.text for block in response.content if block.type == "text")
    elif agent_name == "writer":
        # Pure generation task
        response = client.messages.create(
            model="claude-3-5-sonnet-20240620",
            max_tokens=1024,
            system="You are a technical writer. Summarize the provided context.",
            messages=[{"role": "user", "content": task_description}]
        )
        return response.content[0].text
    return "Unknown Agent"
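The researcher above returns the raw tool output directly. A fuller pattern sends the tool output back to the model so it can interpret the results before replying. The sketch below only builds the follow-up message list for that second call. It represents content blocks as plain dictionaries so it runs without an API call, and `build_tool_result_turn` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def build_tool_result_turn(
    prior_messages: List[Dict[str, Any]],
    assistant_content: List[Dict[str, Any]],
    tool_output: str,
) -> List[Dict[str, Any]]:
    """Return the message list for the follow-up call after a tool has run."""
    # Find the tool_use block so its id can be echoed back in the tool_result.
    tool_use = next(b for b in assistant_content if b["type"] == "tool_use")
    return prior_messages + [
        {"role": "assistant", "content": assistant_content},
        {
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": tool_use["id"],
                    "content": tool_output,
                }
            ],
        },
    ]


# Made-up example data standing in for a real model response.
first_turn = [{"role": "user", "content": "Find recent news on topic X"}]
model_blocks = [
    {"type": "text", "text": "I will search for that."},
    {"type": "tool_use", "id": "toolu_01", "name": "web_search", "input": {"query": "topic X"}},
]
follow_up = build_tool_result_turn(first_turn, model_blocks, "Simulated search results")
print(follow_up[-1])
```

Passing `follow_up` as `messages` in a second `client.messages.create` call (with the same `tools`) would let the worker turn the raw results into a finished answer.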
5. The Execution Loop (The "Runtime")
Finally, we need a while loop to maintain the life-cycle of the request. This allows the system to be "stateful" over time, a requirement for Intelligent AI Workflows.
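def run_multi_agent_system(user_query: str, max_steps: int = 10):
    conversation_history = [{"role": "user", "content": user_query}]
    print(f"Starting Task: {user_query}")
    steps = 0
    # Cap the number of turns so a confused Orchestrator cannot loop forever
    while steps < max_steps:
        steps += 1
        # 1. Orchestrator decides
        decision = get_orchestrator_decision(conversation_history)
        action = decision.get("next_action")
        payload = decision.get("payload")
        print(f"Orchestrator Decision: {action}")
        if action == "FINISH":
            print("Final Answer:", payload)
            break
        if action == "ERROR":
            print("Orchestration Error")
            break
        # 2. Worker executes
        worker_result = run_worker_agent(action, payload)
        # 3. Update State (Context)
        # Record the Orchestrator's own decision as the assistant turn, then feed
        # the worker's result back as a user turn so Opus knows what happened.
        # Ending the history on an assistant message would make the API treat
        # it as a prefill to continue rather than a result to react to.
        conversation_history.append({"role": "assistant", "content": json.dumps(decision)})
        conversation_history.append({
            "role": "user",
            "content": f"Result from {action}: {worker_result}"
        })
    else:
        print("Stopped: step limit reached without FINISH")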
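Because the loop calls the API on every turn, its control flow is awkward to test. One option, sketched here as an assumption rather than the article's design, is to inject the decision and worker functions so the loop can be exercised with stubs. `run_loop`, `fake_decide`, and `fake_work` are hypothetical names.

```python
import json
from typing import Callable, Dict, List, Optional


def run_loop(
    user_query: str,
    decide: Callable[[List[Dict[str, str]]], Dict[str, str]],
    work: Callable[[str, str], str],
    max_steps: int = 8,
) -> Optional[str]:
    """Run decide/work cycles until FINISH, ERROR, or the step budget is spent."""
    history = [{"role": "user", "content": user_query}]
    for _ in range(max_steps):
        decision = decide(history)
        action = decision.get("next_action")
        if action == "FINISH":
            return decision.get("payload")
        if action == "ERROR":
            return None
        result = work(action, decision.get("payload", ""))
        # Keep roles alternating: the decision as assistant, the result as user.
        history.append({"role": "assistant", "content": json.dumps(decision)})
        history.append({"role": "user", "content": f"Result from {action}: {result}"})
    return None  # Budget exhausted without a final answer


def fake_decide(history: List[Dict[str, str]]) -> Dict[str, str]:
    # Stub orchestrator: research once, then write, then finish.
    completed = (len(history) - 1) // 2
    if completed == 0:
        return {"next_action": "researcher", "payload": "find facts"}
    if completed == 1:
        return {"next_action": "writer", "payload": "summarize"}
    return {"next_action": "FINISH", "payload": "done"}


def fake_work(agent: str, task: str) -> str:
    return f"{agent} handled '{task}'"


print(run_loop("Explain X", fake_decide, fake_work))
```

In production the same function would receive `get_orchestrator_decision` and `run_worker_agent` as its two callables.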
Advanced Considerations for CTOs
Handling State and Context Drift
In production, you cannot simply append messages forever; the context window (even Opus's 200k tokens) will fill up, increasing latency and cost.
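A minimal sketch of bounding the Orchestrator's input, assuming worker results are available as plain strings. `compact_history` is a hypothetical helper, and simple truncation stands in for what would more likely be a summarization call to a cheaper model.

```python
from typing import Dict, List


def compact_history(
    results: List[str],
    objective: str,
    max_fact_chars: int = 120,
) -> List[Dict[str, str]]:
    """Collapse worker results into one message: a fact list plus the objective."""
    facts = []
    for text in results:
        fact = " ".join(text.split())  # Normalize whitespace and newlines
        if len(fact) > max_fact_chars:
            # Truncation is a placeholder for real summarization.
            fact = fact[:max_fact_chars].rstrip() + "..."
        facts.append(f"- {fact}")
    body = "Facts Known:\n" + "\n".join(facts) + f"\n\nCurrent Objective: {objective}"
    return [{"role": "user", "content": body}]


messages = compact_history(
    ["Vendor A charges  $10 per seat.", "x" * 300],
    objective="Write the comparison summary",
)
print(messages[0]["content"])
```

The returned list replaces the full message history for the next Orchestrator call, so its input size grows with the number of facts rather than with every raw result.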
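Periodically condense conversation_history into a bulleted list of "Facts Known", and pass only the "Facts Known" and the "Current Objective" to the Orchestrator.
Error Recovery and Self-Correction
Agents fail. A web scraper might return a 403; a database query might fail with a syntax error. Rather than letting the exception crash the loop, return the error text to the Orchestrator as the worker's result so it can retry or re-plan.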
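One way to handle such failures is to retry the worker a few times and, if it still fails, hand the error text back as an ordinary result. The sketch below assumes a worker is any callable that takes a task string; `run_with_recovery` and `flaky_scraper` are hypothetical names.

```python
import time
from typing import Callable


def run_with_recovery(
    worker: Callable[[str], str],
    task: str,
    max_attempts: int = 3,
    base_delay: float = 0.0,  # Set above zero in real use to enable backoff
) -> str:
    """Call a worker, retrying on exceptions; on final failure, report the error as text."""
    last_error = ""
    for attempt in range(1, max_attempts + 1):
        try:
            return worker(task)
        except Exception as exc:  # Broad on purpose: any worker failure becomes feedback
            last_error = f"{type(exc).__name__}: {exc}"
            if attempt < max_attempts:
                time.sleep(base_delay * 2 ** (attempt - 1))  # Exponential backoff
    return f"WORKER_FAILED after {max_attempts} attempts. Last error: {last_error}"


calls = {"n": 0}


def flaky_scraper(task: str) -> str:
    # Simulated worker that fails twice with a 403 before succeeding.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("HTTP 403 Forbidden")
    return f"Scraped: {task}"


print(run_with_recovery(flaky_scraper, "https://example.com"))
```

Because the failure message is a normal string, it can be appended to the history like any other worker result, which gives the Orchestrator the chance to choose a different approach.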
Cost vs. Accuracy
The "Orchestrator" pattern allows you to optimize costs. You pay for the intelligence of Opus only for the high-level planning (routing), while the bulk of the token volume (reading docs, scraping, summarizing) is handled by the cheaper Sonnet or Haiku models.
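The trade-off can be made concrete with some back-of-the-envelope arithmetic. In the sketch below, the per-million-token prices are illustrative assumptions that should be checked against current pricing, and the token volumes describe a made-up workload.

```python
# Illustrative per-million-token prices in USD (assumed; verify before relying on them).
PRICES = {
    "opus": {"input": 15.00, "output": 75.00},
    "sonnet": {"input": 3.00, "output": 15.00},
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one workload slice at the assumed prices."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


# Hypothetical workload: 1M tokens read and 100k generated in total,
# of which planning accounts for 100k input and 10k output tokens.
planning = cost_usd("opus", 100_000, 10_000)
execution = cost_usd("sonnet", 900_000, 90_000)
mixed = planning + execution
all_opus = cost_usd("opus", 1_000_000, 100_000)
print(f"mixed: ${mixed:.2f}  all-Opus: ${all_opus:.2f}")
```

Under these assumptions the mixed setup costs about $6.30 against $22.50 for running everything on Opus. The exact ratio depends on how much of the traffic is planning.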
Conclusion
Building multi-agent systems is no longer about prompt engineering; it is a software architecture discipline. It requires robust state management, clear interface definitions, and a strategic mix of models.
While we await the release of models like Claude 4, the patterns established here—Orchestration, Delegation, and Self-Correction—are future-proof. They allow you to swap in more powerful "brains" as they become available without rewriting your entire infrastructure.
If your organization is looking to scale its engineering capabilities to build these Custom AI Agents or complex Product Engineering solutions, partnering with a specialized team can accelerate your roadmap. 4Geeks Teams offers the on-demand, senior engineering talent required to turn these architectural concepts into deployed, revenue-generating software.