Why this article exists
I’ve been building automations with n8n since the “please don’t sneeze on the SQLite DB” era. It’s brilliant for wiring APIs together, but the moment you add AI, the shape of a workflow changes. Your flows stop being rigid pipes and start looking like decision trees run by a very opinionated intern. If you go a step further and use multiple agents — each with its own tools, memory and responsibilities — you can ship automations that plan, research, act, and verify without human babysitting.
This post is a practical guide to multi‑agent automation in n8n: how the building blocks map to LangChain concepts, how to wire agents together, what to do about memory, and where people usually face‑plant (I did, repeatedly). You’ll get end‑to‑end examples you can steal.
Who’s this for? Developers and technical operators who already ship n8n flows and want to add agentic behaviour for real work: ticket triage, content ops, research + RAG, customer support, on‑call tooling, that sort of thing.
TL;DR
- n8n gives you an AI Agent that can call tools (HTTP, code, other workflows, vector stores) and stream output. You can nest agents and build hierarchies.
- Use AI Agent Tool to expose one agent as a tool to another. Use Call n8n Workflow Tool to turn any sub‑workflow into a callable tool.
- For memory, start with Simple/Window Buffer for prototypes; for production, use Redis/Postgres/Zep memory nodes or your own store. Avoid Simple Memory in queue mode.
- For RAG, use the Vector Store nodes (Weaviate, Qdrant, Pinecone) with Retriever or the Vector Store Question‑Answer Tool.
- Scale with queue mode (Redis + Postgres) and set worker concurrency sensibly.
- Force structure with Structured Output Parser or run a parsing chain after the agent.
The mental model: n8n’s agentic pieces
Think of an agent as a loop that plans → calls a tool → observes → repeats until the goal is satisfied. In n8n, that maps nicely:
- AI Agent: the planner/decider. Attach a chat model (OpenAI, Anthropic, Groq, Mistral Cloud, Azure OpenAI) and tick Require Specific Output Format when you need strict structure.
- Tools: plug‑in abilities for the agent. Out of the box you get HTTP Request, Custom Code, Wikipedia, SerpAPI, and crucially:
  - Call n8n Workflow Tool: treat a whole workflow as a tool.
  - AI Agent Tool: treat another agent as a tool. This is how you do multi‑agent orchestration on one canvas.
  - Vector Stores: Pinecone, Weaviate, Qdrant nodes as tools or via Retriever.
  - $fromAI(): let the model fill in node parameters at runtime (handy for tool inputs).
- Memory: Simple/Window Buffer for quick chats; Redis/Postgres/Zep/Mongo memory nodes for persistence and scale.
- Chat Trigger: gives you a chat UI, session IDs, optional streaming, and pluggable memory. Great for internal tools.
I’ll show how these fit with three real workflows.
Pattern 1: Single agent with a serious tool‑belt
Use when: one agent can complete the task but needs to call APIs, run code, or query a knowledge base.
Sketch
Chat Trigger → AI Agent (Tools Agent)
├─ HTTP Request Tool (generic API calls)
├─ Vector Store QA Tool (RAG over docs)
└─ Custom Code Tool (one‑off transforms)
→ Structured Output Parser (optional)
→ Slack/Email/DB
Gotchas I actually hit
- If you enable a strict output format inside the Agent, it’s not always reliable mid‑tool‑loop. I often pipe the agent’s final text to a tiny LLM chain for parsing instead.
- $fromAI() is brilliant, but remember it only works on tool parameters. I treat it like hints: $fromAI('email', 'customer contact address').
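For concreteness, this is roughly how a $fromAI() hint sits inside a tool parameter field. The expression takes a key and a human‑readable description the model uses to decide what to fill in; a third type argument exists in recent n8n versions, so check your release:

```
{{ $fromAI('email', 'customer contact address', 'string') }}
```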
Example: inbound support triage
Goal: classify a customer message, search the knowledge base, and create/update a Jira ticket with a clean summary.
- Chat Trigger with streaming on. Attach Window Buffer Memory with k=10 for context.
- AI Agent with tools:
  - Vector Store QA Tool bound to your Weaviate/Qdrant collection of product docs.
  - Call n8n Workflow Tool pointing at a sub‑workflow that creates/updates Jira issues.
- Add a Structured Output Parser with a JSON Schema like:
{
  "type": "object",
  "properties": {
    "intent": {"type": "string", "enum": ["bug", "question", "billing", "feature"]},
    "priority": {"type": "string", "enum": ["low", "medium", "high"]},
    "summary": {"type": "string"},
    "proposedFix": {"type": "string"}
  },
  "required": ["intent", "summary"]
}
- Pipe the parsed object to Jira. If parsing fails, fall back to an Auto‑fixing Output Parser or run a micro chain to coerce JSON.
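The "coerce JSON" fallback can be a few lines in a Code node. This is a minimal sketch of my approach, not an n8n built‑in: try a straight parse, then strip markdown fences, then grab the outermost object literal, and only escalate to the Auto‑fixing parser when all three fail.

```javascript
// Coerce an agent's final text into a JSON object, or return null.
// The three attempts cover the common failure shapes I see in practice.
function coerceJson(text) {
  const attempts = [
    text,                                         // already clean JSON
    text.replace(/`{3}(?:json)?/g, '').trim(),    // wrapped in markdown fences
    (text.match(/\{[\s\S]*\}/) || [''])[0],       // buried in chatty prose
  ];
  for (const candidate of attempts) {
    try { return JSON.parse(candidate); } catch { /* try the next shape */ }
  }
  return null; // caller routes nulls to the Auto-fixing parser / micro chain
}
```

A null return is your signal to spend more tokens; anything else goes straight to Jira.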
Why this works
A single agent keeps state simple. Tools handle IO, vector search handles facts, and you end with structured data your ops team trusts.
Pattern 2: Supervisor + specialists (true multi‑agent)
Use when tasks need planning and delegation: research → draft → review, or classify → route → act.
Sketch
Webhook/Chat → Supervisor Agent
├─ AI Agent Tool: Researcher (RAG + web)
├─ AI Agent Tool: Actions (Jira/Slack/Notion tools)
└─ AI Agent Tool: Auditor (only verifies / scores output)
→ Parser → Destinations
Key wiring details
- Build three separate agents on the same canvas.
- Expose each with AI Agent Tool and connect them to the Supervisor.
- Give each agent its own system prompt and tools. The Actions agent gets HTTP + Workflow tools, the Researcher gets Vector Store QA + web search, the Auditor only reads and returns a score with reasoning.
- Keep memory local to the Supervisor and pass compact context chunks to specialists. It cuts token spillover and weird cross‑talk.
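The "compact context" hand‑off is just a transform before each specialist call. A sketch under my own conventions (the maxTurns/maxChars knobs are mine, not n8n settings): keep the last few turns, clip each one, and pass a single string instead of the whole history.

```javascript
// Flatten the Supervisor's history into a short context string for a
// specialist agent: last maxTurns messages, each capped at maxChars.
function compactContext(history, maxTurns = 4, maxChars = 400) {
  return history
    .slice(-maxTurns)
    .map(m => `${m.role}: ${m.content.slice(0, maxChars)}`)
    .join('\n');
}
```

The specialist never sees the Supervisor's full memory, which is exactly what kills the cross‑talk.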
Example: content ops pipeline
Input: a product link. Output: a markdown brief, keywords, and two image prompts.
- Supervisor plans: “research, draft brief, audit tone, then hand over.”
- Researcher uses RAG over your product docs + optional web search.
- Drafting agent produces a markdown brief and 2 alt versions.
- Auditor checks tone and brand rules and returns a score + fix suggestions.
- Final parser normalises fields; the workflow posts to Notion and schedules a review in Slack.
Tip: use Call n8n Workflow Tool for messy steps like “render screenshots, upload to S3, return URL.” It keeps the agent’s surface area small and testable.
Pattern 3: Agentic RAG with tool‑taking actions
Use when: the agent must answer questions from a vector store and then do real work.
Sketch
Chat/Webhook → Tools Agent
├─ Vector Store Retriever → Vector Store QA Tool
├─ HTTP Request Tool (internal APIs)
└─ Call n8n Workflow Tool (side‑effects: create user, refund, etc.)
→ Response
Practical notes
- For Weaviate/Qdrant/Pinecone, prefer a Retriever → QA tool when you need summaries; wiring the vector store directly as a tool also works for quick answers.
- Chunking and metadata matter more than your model. Keep chunks 500–1,000 tokens with overlap; tag source URLs, versions, and ACLs.
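As a rough sketch of that chunking advice, here's a word‑based splitter with overlap. Whitespace words stand in for tokens here; for accurate counts you'd swap in a real tokenizer before ingesting into Weaviate/Qdrant/Pinecone.

```javascript
// Split text into overlapping chunks of ~size "tokens" (whitespace words
// as an approximation). Each chunk shares `overlap` words with the last.
function chunkText(text, size = 800, overlap = 100) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let start = 0; start < words.length; start += size - overlap) {
    chunks.push(words.slice(start, start + size).join(' '));
    if (start + size >= words.length) break; // last chunk reached the end
  }
  return chunks;
}
```

Attach your metadata (source URL, version, ACL) to each chunk at this point, before embedding.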
Memory, sessions, and why your bot “forgets”
- The Simple/Window Buffer nodes are fine for demos. In production, especially in queue mode, use a proper memory store (Redis, Postgres, Zep, Mongo) so any worker can reload context.
- With Chat Trigger, set “Load Previous Session” to From Memory and connect both Chat Trigger and Agent to the same memory node. Otherwise you’ll chase phantom context bugs.
- Cap the window. When you throw 100+ messages at a context window, quality tanks and cost spikes. I generally keep k=8–12 and summarise older turns into a single system note.
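The cap‑and‑summarise trick is a small transform. A sketch with my own names (capWindow, summarise are placeholders, not n8n nodes): keep the last k turns verbatim and collapse everything older into one system note; in production the summariser is a cheap model call rather than this naive concatenation.

```javascript
// Keep the last k messages and fold older ones into a single system note.
// The default summarise() just clips and joins; swap in an LLM call.
function capWindow(history, k = 10, summarise = msgs =>
  `Earlier (${msgs.length} msgs): ` +
  msgs.map(m => m.content.slice(0, 40)).join(' | ')) {
  if (history.length <= k) return history;
  const older = history.slice(0, history.length - k);
  return [{ role: 'system', content: summarise(older) }, ...history.slice(-k)];
}
```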
Reliability: retries, branching, parsing
- There’s no universal try/catch node. I fake it with If/Filter/Switch plus a Code node around likely failure points, and I send failures to an Error Workflow with enough context to replay.
- A lot of “agent flakiness” is actually output formatting. Use a strict schema at the end of the chain, and keep agent prompts short, with explicit tool selection instructions and success criteria.
- When you must fan‑out, use Loop Over Items (Split in Batches) to control rate limits, then Merge to rejoin results. For parallelism you scale at the worker/concurrency level.
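For the faked try/catch around flaky API calls, my Code nodes usually boil down to a retry wrapper like this sketch (fn is any async call; the knobs are mine):

```javascript
// Retry an async call with exponential backoff: baseMs, 2*baseMs, 4*baseMs...
// After the last attempt, rethrow so the Error Workflow sees the failure.
async function withRetry(fn, { retries = 3, baseMs = 500 } = {}) {
  let lastErr;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try { return await fn(); } catch (err) {
      lastErr = err;
      if (attempt < retries) {
        await new Promise(r => setTimeout(r, baseMs * 2 ** attempt));
      }
    }
  }
  throw lastErr;
}
```

Pair it with Loop Over Items so you're backing off per item, not per batch.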
Scaling: queue mode without tears
If you plan to run agents for real users, flip to queue mode. You’ll run one main (UI + triggers) and a pool of workers that pull jobs from Redis and write results to Postgres. Keep webhook processors separate if you’re handling bursty traffic.
Minimal docker‑compose.yml (works on my homelab and a small VPS)
version: "3.9"
services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_USER: n8n
      POSTGRES_PASSWORD: n8npassword
      POSTGRES_DB: n8n
    volumes:
      - pg:/var/lib/postgresql/data
  redis:
    image: redis:7
    command: ["redis-server", "--save", "", "--appendonly", "no"]
  n8n-main:
    image: docker.n8n.io/n8nio/n8n:latest
    depends_on: [postgres, redis]
    environment:
      # Core
      N8N_ENCRYPTION_KEY: "change-me-super-secret"
      GENERIC_TIMEZONE: "Europe/Bucharest"
      WEBHOOK_URL: "https://your.n8n.domain" # set correctly or webhooks sulk
      # DB
      DB_TYPE: postgres
      DB_POSTGRESDB_HOST: postgres
      DB_POSTGRESDB_PORT: 5432
      DB_POSTGRESDB_DATABASE: n8n
      DB_POSTGRESDB_USER: n8n
      DB_POSTGRESDB_PASSWORD: n8npassword
      # Queues
      EXECUTIONS_MODE: queue
      QUEUE_BULL_REDIS_HOST: redis
      QUEUE_BULL_REDIS_PORT: 6379
    ports: ["5678:5678"]
    volumes:
      - n8n:/home/node/.n8n
  n8n-worker:
    image: docker.n8n.io/n8nio/n8n:latest
    depends_on: [postgres, redis]
    environment:
      N8N_ENCRYPTION_KEY: "change-me-super-secret" # must match main
      DB_TYPE: postgres
      DB_POSTGRESDB_HOST: postgres
      DB_POSTGRESDB_PORT: 5432
      DB_POSTGRESDB_DATABASE: n8n
      DB_POSTGRESDB_USER: n8n
      DB_POSTGRESDB_PASSWORD: n8npassword
      EXECUTIONS_MODE: queue
      QUEUE_BULL_REDIS_HOST: redis
      QUEUE_BULL_REDIS_PORT: 6379
    command: ["n8n", "worker", "--concurrency=5"]
    deploy:
      replicas: 2 # scale workers
volumes:
  pg: {}
  n8n: {}
Notes
- Use Postgres for anything serious. SQLite plus queue mode is a great way to learn about locks the hard way.
- Set N8N_ENCRYPTION_KEY everywhere before first boot, or you’ll get credential decryption errors when scaling workers.
- If you need binary persistence at scale, push binary data to S3‑compatible storage.
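For the S3 route, the relevant environment block looks roughly like this. The variable names are n8n's external‑storage settings as I remember them, and S3 storage has been license‑gated in some versions, so verify both against your release before relying on this:

```yaml
environment:
  # Illustrative values; double-check the exact variable names in your
  # n8n version's docs before deploying.
  N8N_DEFAULT_BINARY_DATA_MODE: s3
  N8N_EXTERNAL_STORAGE_S3_HOST: s3.eu-central-1.amazonaws.com
  N8N_EXTERNAL_STORAGE_S3_BUCKET_NAME: n8n-binary
  N8N_EXTERNAL_STORAGE_S3_BUCKET_REGION: eu-central-1
  N8N_EXTERNAL_STORAGE_S3_ACCESS_KEY: your-key
  N8N_EXTERNAL_STORAGE_S3_ACCESS_SECRET: your-secret
```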
Three practical blueprints you can ship today
1) AI helpdesk orchestrator (Slack + Jira + RAG)
What it does: Watches a Slack channel, answers simple questions from docs, opens or updates Jira with a structured summary, and pings an on‑call user only when confidence is low.
Flow
Slack Trigger → Supervisor Agent
├─ AI Agent Tool: Knowledge (Vector Store QA Tool)
├─ AI Agent Tool: Actions (Call n8n Workflow → Create/Update Jira)
└─ AI Agent Tool: Auditor (scores confidence)
→ If (confidence < 0.6) → Slack Mention @human
Code node to normalise agent output:
// Run Once for All Items
const out = $json;
function clamp(x, a, b) { return Math.max(a, Math.min(b, x)); }

return [{
  json: {
    intent: String(out.intent || 'unknown').toLowerCase(),
    confidence: clamp(Number(out.confidence || 0), 0, 1),
    summary: out.summary?.slice(0, 4000) || '',
    proposedFix: out.proposedFix || ''
  }
}];
n8n tips
- In the Jira sub‑workflow, add a Loop Over Items when bulk‑updating comments to avoid rate limits.
- Stream agent responses back to Slack for long tasks and post a final summary once actions complete.
2) Multi‑agent Telegram concierge for a product team
What it does: Telegram bot that understands “ship notes for v1.2”, fetches merged PRs since last tag, drafts a changelog, and opens a Notion page. Human can reply “publish” to trigger a GitHub release.
Flow
Telegram Trigger → Supervisor Agent
├─ Researcher (GitHub API + Vector Store over internal runbooks)
├─ Writer (strict JSON schema for changelog sections)
└─ Publisher (Call n8n Workflow → Create Notion page + optional GitHub release)
Structured schema for Writer:
{
  "type": "object",
  "properties": {
    "highlights": {"type": "array", "items": {"type": "string"}},
    "changes": {"type": "array", "items": {"type": "string"}},
    "breaking": {"type": "array", "items": {"type": "string"}}
  },
  "required": ["highlights", "changes"]
}
n8n tips
- Use $fromAI() to let the agent pick the GitHub tag range by inferring dates from chat context.
- Keep the Publisher agent tool‑only and ban it from free‑text output to avoid accidental releases.
3) Agentic RAG for customer‑facing docs + follow‑up actions
What it does: Embedded chat widget on your docs site that answers questions from Weaviate/Qdrant, then offers to create a sandbox API key or schedule a call.
Flow
Chat Trigger (embedded) + Memory → Tools Agent
├─ Vector Store Retriever → QA Tool
├─ HTTP Request Tool (internal provisioning API)
└─ Call n8n Workflow Tool (Calendaring flow)
→ Respond to Chat
n8n tips
- Use the Chat Memory Manager node to occasionally trim memory and inject “system clarifications” like pricing changes.
- If your chat lives behind a load‑balancer, enable streaming responses and test CORS on the Chat Trigger.
Prompting and tool hygiene that actually helps
- Keep the system prompt under 20 lines. Define goal, tools policy, and output format. Add success criteria the Auditor can check.
- Ask the agent to show reasoning to itself, but return only JSON or a short answer to users.
- For tool selection, write one sentence: “Use tools whenever information must be verified, state changed, or retrieval needed.” It dramatically reduces “I guessed!” moments.
- Ban dangerous actions behind a human‑in‑the‑loop tool and require a confirmation token.
Cost, latency, and model picks
- For classification/structuring, cheap fast models are fine. For planning and multi‑step tool use, pick a model good at tool calling. I mix: a small planner model with strict JSON, then a bigger model for heavy RAG or generation.
- Groq is great when you need low latency; Mistral models are good value; OpenAI/Anthropic are still the safest for complex tool orchestration.
- Cache any deterministic steps. A 300‑line Code node can replace a thousand expensive tokens.
Debugging checklist I wish I had sooner
- Does the agent know the tool exists? Double‑check the tool is connected to the agent and not to the canvas by accident.
- Are you in queue mode with Simple Memory? That’s non‑deterministic across workers. Use a proper memory store.
- Is the output parser attached to the final node? If you parse inside a tool loop, you’ll get brittle failures.
- Are you rate‑limited? Add Loop Over Items, backoff, and proper error logging.
- Did you set the encryption key before first boot? Mismatched keys break credentials across workers.
Where to go next
- Wrap your risky steps in sub‑workflows and expose them via Call n8n Workflow Tool.
- Introduce a lightweight Auditor agent to stop junk from leaking into Jira/Notion/CRM.
- Flip to queue mode early. Even a single worker buys you headroom and easier debugging.