What Are AI Agents — and Why Multi-Agent Systems?
An AI agent is an LLM that can take actions: searching the web, reading files, writing code, calling APIs. Unlike a single prompt-response exchange, an agent can plan a sequence of steps, use tools, observe the results, and adjust its approach until a goal is achieved.
Multi-agent systems take this further by assigning specialized roles to different agents. Just as a company has a researcher, a writer, and an editor — each expert in their domain — a CrewAI crew has agents with defined personas, goals, and tool access. The agents collaborate, passing outputs between them, to produce results no single agent could achieve alone.
Common patterns: a Researcher gathers information → an Analyst synthesizes it → a Writer drafts the output → a Reviewer checks quality. CrewAI makes this orchestration simple.
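To make the handoff concrete, here is a toy plain-Python sketch of that sequential pattern, with ordinary functions standing in for LLM-backed agents (no LLMs involved; the function names are illustrative only). CrewAI's job is to replace each stage with a real agent and manage the passing of outputs for you:

```python
# Each "agent" here is just a function; in CrewAI each would be an
# LLM-backed Agent with its own role, goal, and tools.

def researcher(topic: str) -> str:
    return f"notes on {topic}"

def analyst(notes: str) -> str:
    return f"synthesis of ({notes})"

def writer(synthesis: str) -> str:
    return f"draft based on {synthesis}"

def reviewer(draft: str) -> str:
    return f"approved: {draft}"

def run_pipeline(topic: str) -> str:
    # Sequential handoff: each stage's output is the next stage's input.
    output = topic
    for stage in (researcher, analyst, writer, reviewer):
        output = stage(output)
    return output
```

The orchestration value of a framework shows up exactly here: once stages are real agents with retries, tool calls, and token budgets, the "just pass the output along" loop stops being trivial.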
Installing CrewAI
pip install crewai crewai-tools
# Verify installation
python -c "import crewai; print(crewai.__version__)"
You'll also need an OpenAI API key (or you can configure CrewAI to use other LLM providers like Anthropic, Groq, or local Ollama models):
export OPENAI_API_KEY="sk-..."
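As a hedged example of switching providers (recent CrewAI versions route model calls through LiteLLM, so this is usually just a different API key plus a provider-prefixed model string; the exact model IDs below are assumptions that may be outdated for your version):

```shell
# Anthropic: set the key, then use a provider-prefixed model string per agent,
# e.g. llm="anthropic/claude-3-5-sonnet-20240620"
export ANTHROPIC_API_KEY="sk-ant-..."

# Local Ollama: no API key needed, assumes an Ollama server on the default port,
# e.g. llm="ollama/llama3.1"
```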
Core Concepts
- Agent: An LLM with a role, goal, backstory, and optional tools. The backstory shapes its behavior — "You are a meticulous fact-checker who never assumes..." produces very different behavior than a generic agent.
- Task: A specific unit of work assigned to an agent. Has a description, expected output, and optionally an output file or structured output format.
- Crew: The orchestrator that manages multiple agents and tasks. Supports sequential (one after another) and hierarchical (manager delegates to workers) processes.
- Tools: Functions agents can call — web search, file operations, code execution, API calls. CrewAI ships with many built-in tools.
Creating Your First Agent: The Researcher
from crewai import Agent, Task, Crew, Process
from crewai_tools import SerperDevTool
# Tool for web search (requires SERPER_API_KEY env var, free tier available)
search_tool = SerperDevTool()
researcher = Agent(
    role="Senior Research Analyst",
    goal="Uncover cutting-edge developments in AI and summarize them accurately",
    backstory=(
        "You are a veteran technology researcher with 15 years of experience. "
        "You're known for your ability to cut through hype and identify what's "
        "genuinely important. You always verify claims with multiple sources."
    ),
    tools=[search_tool],
    verbose=True,            # logs agent's thinking
    allow_delegation=False,  # this agent won't delegate to others
    llm="gpt-4o",            # you can specify per-agent LLMs
)
Creating a Second Agent: The Writer
writer = Agent(
    role="Tech Content Writer",
    goal=(
        "Create engaging, accurate articles that make complex AI topics "
        "accessible to a general tech audience"
    ),
    backstory=(
        "You are an experienced technology journalist who has written for "
        "publications like Wired and MIT Technology Review. You translate "
        "technical findings into compelling narratives without dumbing them down."
    ),
    tools=[],  # writer doesn't need search tools
    verbose=True,
    allow_delegation=False,
    llm="gpt-4o",
)
Defining Tasks
research_task = Task(
    description=(
        "Research the latest developments in AI agents published in the last 30 days. "
        "Focus on: new frameworks, benchmark results, and real-world deployments. "
        "Identify the 3 most significant developments and explain why each matters."
    ),
    expected_output=(
        "A structured report with 3 sections, each covering one major development. "
        "Include: what it is, why it matters, and a direct source URL."
    ),
    agent=researcher,
    output_file="research_report.md",  # saves output to file
)
write_task = Task(
    description=(
        "Using the research report provided, write a 600-word blog article "
        "titled 'This Week in AI Agents'. Maintain an informative but conversational tone. "
        "Include an intro, one section per development, and a conclusion with your take."
    ),
    expected_output=(
        "A complete, publication-ready blog article in markdown format. "
        "The article should flow naturally and not feel like a list of facts."
    ),
    agent=writer,
    context=[research_task],  # tells CrewAI this task depends on research_task's output
    output_file="blog_article.md",
)
Assembling and Running the Crew
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],
    process=Process.sequential,  # tasks run in order
    verbose=True,
)
result = crew.kickoff()
print("=== FINAL OUTPUT ===")
print(result)
When you run this, you'll see each agent's thinking process in the terminal — what they search for, what they find, how they formulate their output. CrewAI handles the orchestration so you don't need to manually pass outputs between agents.
Using Tools
CrewAI provides many built-in tools. Here are the most useful:
from crewai_tools import (
    SerperDevTool,        # web search via Serper API
    FileReadTool,         # read local files
    FileWriterTool,       # write files
    DirectoryReadTool,    # list directory contents
    WebsiteSearchTool,    # search a specific website
    PDFSearchTool,        # RAG over PDF files
    CodeInterpreterTool,  # execute Python code
)
# Give an agent multiple tools
data_analyst = Agent(
    role="Data Analyst",
    goal="Analyze data files and produce insights",
    backstory="Expert data analyst who codes in Python",
    tools=[FileReadTool(), FileWriterTool(), CodeInterpreterTool()],
)
Debugging and Monitoring Your Crew
Set verbose=True on both agents and the crew to see the full thought process. In production, CrewAI integrates with AgentOps for observability: add agentops.init() before your crew runs to get a dashboard with timings, token usage, and full trace logs.
Common debugging tips:
- If an agent loops without progress, check that its tools are returning useful data
- If output quality is low, strengthen the `backstory` and `expected_output` fields
- Use `Process.hierarchical` with a manager LLM for complex workflows where a manager agent should coordinate workers
Real Use Cases
Content Pipeline
Researcher → Outline Writer → Article Writer → SEO Reviewer → Editor. Each agent specializes. The crew produces a complete, SEO-optimized article from a single topic keyword input.
Research Automation
Provide a list of companies. A Researcher agent searches each one, a Data Extractor pulls specific fields (revenue, headcount, funding), and a Report Writer formats a comparison table.
Code Review Crew
Feed a pull request diff. A Security Auditor checks for vulnerabilities, a Performance Analyst looks for bottlenecks, and a Code Quality Reviewer checks style — each files their findings, which a Summary Agent consolidates into a final review comment.
Limitations and Best Practices
- Costs add up fast: Multi-agent systems make many LLM calls. Always test with GPT-4o mini before switching to GPT-4o
- Non-determinism: Agents don't always produce the same output. Build workflows that are tolerant of variation
- Long tasks need checkpoints: Use `output_file` on intermediate tasks so you can resume from a checkpoint if something fails
- Avoid circular dependencies: Tasks should have a clear dependency order; circular task dependencies will cause infinite loops
- Keep backstories focused and specific — vague backstories lead to generic outputs
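On the circular-dependency point: the `context=[...]` links between tasks form a graph, and that graph must be acyclic. A quick plain-Python sanity check (no CrewAI required; the task names below are hypothetical) using depth-first cycle detection:

```python
def find_cycle(deps: dict[str, list[str]]) -> bool:
    """Return True if the task-dependency graph contains a cycle.

    `deps` maps each task name to the tasks it depends on,
    mirroring CrewAI's context=[...] links.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on current path / done
    color = {task: WHITE for task in deps}

    def visit(task: str) -> bool:
        color[task] = GRAY
        for dep in deps.get(task, []):
            if color.get(dep, WHITE) == GRAY:
                return True  # back edge: dependency cycle found
            if color.get(dep, WHITE) == WHITE and dep in deps and visit(dep):
                return True
        color[task] = BLACK
        return False

    return any(visit(t) for t in deps if color[t] == WHITE)

# Linear chain like research -> write is fine:
assert not find_cycle({"write": ["research"], "research": []})
# write depends on review, which depends on write: infinite-loop risk
assert find_cycle({"write": ["review"], "review": ["write"]})
```

Running a check like this before `crew.kickoff()` is cheap insurance when task graphs are built dynamically.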