Multi-agent automation: when n8n gets AI

February 20, 2025 / 12 min read

Why this article exists

I’ve been building automations with n8n since the “please don’t sneeze on the SQLite DB” era. It’s brilliant for wiring APIs together, but the moment you add AI, the shape of a workflow changes. Your flows stop being rigid pipes and start looking like decision trees run by a very opinionated intern. If you go a step further and use multiple agents — each with its own tools, memory and responsibilities — you can ship automations that plan, research, act, and verify without human babysitting.

This post is a practical guide to multi‑agent automation in n8n: how the building blocks map to LangChain concepts, how to wire agents together, what to do about memory, and where people usually face‑plant (I did, repeatedly). You’ll get end‑to‑end examples you can steal.

Who’s this for? Developers and technical operators who already ship n8n flows and want to add agentic behaviour for real work: ticket triage, content ops, research + RAG, customer support, on‑call tooling, that sort of thing.


TL;DR

  • n8n gives you an AI Agent that can call tools (HTTP, code, other workflows, vector stores) and stream output. You can nest agents and build hierarchies.
  • Use AI Agent Tool to expose one agent as a tool to another. Use Call n8n Workflow Tool to turn any sub‑workflow into a callable tool.
  • For memory, start with Simple/Window Buffer for prototypes; for production, use Redis/Postgres/Zep memory nodes or your own store. Avoid Simple Memory in queue mode.
  • For RAG, use the Vector Store nodes (Weaviate, Qdrant, Pinecone) with Retriever or the Vector Store Question‑Answer Tool.
  • Scale with queue mode (Redis + Postgres) and set worker concurrency sensibly.
  • Force structure with Structured Output Parser or run a parsing chain after the agent.

The mental model: n8n’s agentic pieces

Think of an agent as a loop that plans → calls a tool → observes → repeats until the goal is satisfied (sketched in code after the list below). In n8n, that maps nicely:

  • AI Agent: the planner/decider. Attach a chat model (OpenAI, Anthropic, Groq, Mistral Cloud, Azure OpenAI) and tick Require Specific Output Format when you need strict structure.

  • Tools: plug‑in abilities for the agent. Out of the box you get HTTP Request, Custom Code, Wikipedia, SerpAPI, and crucially:

    • Call n8n Workflow Tool: treat a whole workflow as a tool.
    • AI Agent Tool: treat another agent as a tool. This is how you do multi‑agent orchestration on one canvas.
    • Vector Stores: Pinecone, Weaviate, Qdrant nodes as tools or via Retriever.
    • $fromAI(): let the model fill in node parameters at runtime (handy for tool inputs).
  • Memory: Simple/Window Buffer for quick chats; Redis/Postgres/Zep/Mongo memory nodes for persistence and scale.

  • Chat Trigger: gives you a chat UI, session IDs, optional streaming, and pluggable memory. Great for internal tools.
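
To make the loop concrete, here's a minimal, hypothetical sketch in code. n8n's actual agent is LangChain's Tools Agent under the hood, so treat this as the shape of the thing, not a real API: llm.decide and the tools map are stand-ins.

// Hypothetical pseudocode for the plan → act → observe loop.
async function runAgent(goal, tools, llm, maxSteps = 8) {
  const scratchpad = []; // what's been tried and observed so far
  for (let step = 0; step < maxSteps; step++) {
    const decision = await llm.decide(goal, scratchpad); // plan: pick a tool or finish
    if (decision.final) return decision.answer;
    const observation = await tools[decision.tool](decision.input); // act
    scratchpad.push({ tool: decision.tool, input: decision.input, observation }); // observe
  }
  throw new Error('Gave up after ' + maxSteps + ' steps');
}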

I’ll show how these fit with three real workflows.


Pattern 1: Single agent with a serious tool‑belt

Use when: one agent can complete the task but needs to call APIs, run code, or query a knowledge base.

Sketch

Chat Trigger → AI Agent (Tools Agent)
  ├─ HTTP Request Tool (generic API calls)
  ├─ Vector Store QA Tool (RAG over docs)
  └─ Custom Code Tool (one‑off transforms)
→ Structured Output Parser (optional)
→ Slack/Email/DB

Gotchas I actually hit

  • If you enable a strict output format inside the Agent, it’s not always reliable mid‑tool‑loop. I often pipe the agent’s final text to a tiny LLM chain for parsing instead.
  • $fromAI() is brilliant but remember it only works on tool parameters. I treat it like hints: $fromAI('email', 'customer contact address').
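
To make that concrete: a hypothetical HTTP Request Tool JSON body with $fromAI() slots the model fills at runtime. Field names are illustrative; the fourth argument is a default value.

{
  "email": "{{ $fromAI('email', 'customer contact address', 'string') }}",
  "priority": "{{ $fromAI('priority', 'one of low, medium, high', 'string', 'medium') }}",
  "source": "support-chat"
}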

Example: inbound support triage

Goal: classify a customer message, search the knowledge base, and create/update a Jira ticket with a clean summary.

  1. Chat Trigger with streaming on. Attach Window Buffer Memory with k=10 for context.

  2. AI Agent with tools:

    • Vector Store QA Tool bound to your Weaviate/Qdrant collection of product docs.
    • Call n8n Workflow Tool pointing at a sub‑workflow that creates/updates Jira issues.
  3. Add a Structured Output Parser with a JSON Schema like:

{
  "type": "object",
  "properties": {
    "intent": {"type": "string", "enum": ["bug", "question", "billing", "feature"]},
    "priority": {"type": "string", "enum": ["low","medium","high"]},
    "summary": {"type": "string"},
    "proposedFix": {"type": "string"}
  },
  "required": ["intent","summary"]
}
  4. Pipe the parsed object to Jira. If parsing fails, fall back to an Auto‑fixing Output Parser or run a micro chain to coerce JSON.
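
If you'd rather not spend tokens on the auto-fixer, a dumb Code node pass often rescues the JSON first. This sketch assumes the agent's final text lands in the Agent node's default output field:

// Coerce the agent's text into JSON before it reaches Jira.
const text = String($input.first().json.output ?? '');
// Models love wrapping JSON in prose or code fences; grab the first {...} block.
const match = text.match(/\{[\s\S]*\}/);
let parsed;
try {
  parsed = JSON.parse(match ? match[0] : text);
} catch (err) {
  // Don't fail silently; give the Error Workflow something to replay.
  parsed = { intent: 'unknown', summary: text.slice(0, 500), parseError: err.message };
}
return [{ json: parsed }];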

Why this works

A single agent keeps state simple. Tools handle IO, vector search handles facts, and you end with structured data your ops team trusts.


Pattern 2: Supervisor + specialists (true multi‑agent)

Use when: tasks need planning and delegation, like research → draft → review, or classify → route → act.

Sketch

Webhook/Chat → Supervisor Agent
  ├─ AI Agent Tool: Researcher (RAG + web)
  ├─ AI Agent Tool: Actions (Jira/Slack/Notion tools)
  └─ AI Agent Tool: Auditor (only verifies / scores output)
→ Parser → Destinations

Key wiring details

  • Build three separate agents on the same canvas.
  • Expose each with AI Agent Tool and connect them to the Supervisor.
  • Give each agent its own system prompt and tools. The Actions agent gets HTTP + Workflow tools, the Researcher gets Vector Store QA + web search, the Auditor only reads and returns a score with reasoning.
  • Keep memory local to the Supervisor and pass compact context chunks to specialists. It cuts token spillover and weird cross‑talk.
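
What a compact hand-off looks like in practice, as a hypothetical Code node between the Supervisor and a specialist (the task and messages field names are made up; match them to your payload):

// Forward the task plus a trimmed transcript, never the whole session.
const input = $input.first().json;
const recent = (input.messages ?? []).slice(-6); // roughly the last 3 exchanges
return [{ json: {
  task: input.task,
  context: recent.map(m => `${m.role}: ${m.content}`).join('\n').slice(0, 2000)
}}];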

Example: content ops pipeline

Input: a product link. Output: a markdown brief, keywords, and two image prompts.

  • Supervisor plans: “research, draft brief, audit tone, then hand over.”
  • Researcher uses RAG over your product docs + optional web search.
  • Drafting agent produces a markdown brief and 2 alt versions.
  • Auditor checks tone and brand rules and returns a score + fix suggestions.
  • Final parser normalises fields; the workflow posts to Notion and schedules a review in Slack.

Tip

Use Call n8n Workflow Tool for messy steps like "render screenshots, upload to S3, return URL." It makes the agent's surface area small and testable.


Pattern 3: Agentic RAG that takes actions

Use when: the agent must answer questions from a vector store and then do real work.

Sketch

Chat/Webhook → Tools Agent
  ├─ Vector Store Retriever → Vector Store QA Tool
  ├─ HTTP Request Tool (internal APIs)
  └─ Call n8n Workflow Tool (side‑effects: create user, refund, etc.)
→ Response

Practical notes

  • For Weaviate/Qdrant/Pinecone, prefer a Retriever → QA tool when you need summaries; wiring the vector store directly as a tool also works for quick answers.
  • Chunking and metadata matter more than your model. Keep chunks 500–1,000 tokens with overlap; tag source URLs, versions, and ACLs.
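
If you roll your own ingestion instead of the built-in text splitters, the core of a chunker is a few lines. A character-based sketch (roughly 4 characters per token, so 3,000 chars ≈ 750 tokens; the body/url/version fields are illustrative):

// Split a document into overlapping chunks, carrying metadata so the
// retriever can cite sources and filter by version or ACL later.
function chunk(text, size = 3000, overlap = 400) {
  const out = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    out.push(text.slice(start, start + size));
  }
  return out;
}
const doc = $input.first().json;
return chunk(String(doc.body ?? '')).map((text, i) => ({ json: {
  text,
  metadata: { source: doc.url, version: doc.version, chunkIndex: i }
}}));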

Memory, sessions, and why your bot “forgets”

  • The Simple/Window Buffer nodes are fine for demos. In production, especially in queue mode, use a proper memory store (Redis, Postgres, Zep, Mongo) so any worker can reload context.
  • With Chat Trigger, set “Load Previous Session” to From Memory and connect both Chat Trigger and Agent to the same memory node. Otherwise you’ll chase phantom context bugs.
  • Cap the window. When you throw 100+ messages at a context window, quality tanks and cost spikes. I generally keep k=8–12 and summarise older turns into a single system note.
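
The Chat Memory Manager node can handle the trimming; if you'd rather do it in a Code node, the mechanical half looks like this (the actual summarisation would be a small LLM chain upstream; the concatenation here is just a stand-in, and the messages shape is assumed):

// Keep the last k turns; fold everything older into one system note.
const k = 10;
const messages = $input.first().json.messages ?? [];
const older = messages.slice(0, -k);
const note = older.length
  ? [{ role: 'system',
       content: ('Earlier context: ' + older.map(m => m.content).join(' | ')).slice(0, 1500) }]
  : [];
return [{ json: { messages: [...note, ...messages.slice(-k)] } }];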

Reliability: retries, branching, parsing

  • There’s no universal try/catch node. I fake it with If/Filter/Switch plus a Code node around likely failure points, and I send failures to an Error Workflow with enough context to replay.
  • A lot of “agent flakiness” is actually output formatting. Use a strict schema at the end of the chain, and keep agent prompts short, with explicit tool selection instructions and success criteria.
  • When you must fan‑out, use Loop Over Items (Split in Batches) to control rate limits, then Merge to rejoin results. For parallelism you scale at the worker/concurrency level.
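
For reference, my fake try/catch is usually a Code node shaped like this. The URL is a placeholder, and I'm relying on this.helpers.httpRequest being available in the Code node (it is on current n8n versions):

// Retry a flaky call with exponential backoff before letting the
// Error Workflow take over.
async function withRetry(fn, attempts = 3) {
  for (let i = 0; i < attempts; i++) {
    try { return await fn(); }
    catch (err) {
      if (i === attempts - 1) throw err;
      await new Promise(r => setTimeout(r, 2 ** i * 1000)); // 1s, 2s, 4s
    }
  }
}
const data = await withRetry(() =>
  this.helpers.httpRequest({ url: 'https://api.example.com/thing', json: true })
);
return [{ json: { data } }];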

Scaling: queue mode without tears

If you plan to run agents for real users, flip to queue mode. You’ll run one main (UI + triggers) and a pool of workers that pull jobs from Redis and write results to Postgres. Keep webhook processors separate if you’re handling bursty traffic.

Minimal docker‑compose.yml (works on my homelab and a small VPS)

version: "3.9"
services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_USER: n8n
      POSTGRES_PASSWORD: n8npassword
      POSTGRES_DB: n8n
    volumes:
      - pg:/var/lib/postgresql/data
 
  redis:
    image: redis:7
    command: ["redis-server", "--save", "", "--appendonly", "no"]
 
  n8n-main:
    image: docker.n8n.io/n8nio/n8n:latest
    depends_on: [postgres, redis]
    environment:
      # Core
      N8N_ENCRYPTION_KEY: "change-me-super-secret"
      GENERIC_TIMEZONE: "Europe/Bucharest"
      WEBHOOK_URL: "https://your.n8n.domain"  # set correctly or webhooks sulk
      # DB
      DB_TYPE: postgres
      DB_POSTGRESDB_HOST: postgres
      DB_POSTGRESDB_PORT: 5432
      DB_POSTGRESDB_DATABASE: n8n
      DB_POSTGRESDB_USER: n8n
      DB_POSTGRESDB_PASSWORD: n8npassword
      # Queues
      EXECUTIONS_MODE: queue
      QUEUE_BULL_REDIS_HOST: redis
      QUEUE_BULL_REDIS_PORT: 6379
    ports: ["5678:5678"]
    volumes:
      - n8n:/home/node/.n8n
 
  n8n-worker:
    image: docker.n8n.io/n8nio/n8n:latest
    depends_on: [postgres, redis]
    environment:
      N8N_ENCRYPTION_KEY: "change-me-super-secret"  # must match main
      DB_TYPE: postgres
      DB_POSTGRESDB_HOST: postgres
      DB_POSTGRESDB_PORT: 5432
      DB_POSTGRESDB_DATABASE: n8n
      DB_POSTGRESDB_USER: n8n
      DB_POSTGRESDB_PASSWORD: n8npassword
      EXECUTIONS_MODE: queue
      QUEUE_BULL_REDIS_HOST: redis
      QUEUE_BULL_REDIS_PORT: 6379
    command: ["n8n", "worker", "--concurrency=5"]
    deploy:
      replicas: 2  # scale workers
volumes:
  pg: {}
  n8n: {}

Notes

  • Use Postgres for anything serious. SQLite plus queue mode is a great way to learn about locks the hard way.
  • Set N8N_ENCRYPTION_KEY everywhere before first boot, or you’ll get credential decryption errors when scaling workers.
  • If you need binary persistence at scale, push binary data to S3‑compatible storage.

Three practical blueprints you can ship today

1) AI helpdesk orchestrator (Slack + Jira + RAG)

What it does: Watches a Slack channel, answers simple questions from docs, opens or updates Jira with a structured summary, and pings an on‑call user only when confidence is low.

Flow

Slack Trigger → Supervisor Agent
  ├─ AI Agent Tool: Knowledge (Vector Store QA Tool)
  ├─ AI Agent Tool: Actions (Call n8n Workflow → Create/Update Jira)
  └─ AI Agent Tool: Auditor (scores confidence)
→ If (confidence < 0.6) → Slack Mention @human

Code node to normalise agent output:

// Run Once for All Items ($json only exists in per-item mode, so use $input)
const out = $input.first().json;
function clamp(x, a, b){ return Math.max(a, Math.min(b, x)); }
return [{ json: {
  intent: String(out.intent || 'unknown').toLowerCase(),
  confidence: clamp(Number(out.confidence || 0), 0, 1),
  summary: String(out.summary ?? '').slice(0, 4000),
  proposedFix: String(out.proposedFix ?? '')
}}];

n8n tips

  • In the Jira sub‑workflow, add a Loop Over Items when bulk‑updating comments to avoid rate limits.
  • Stream agent responses back to Slack for long tasks and post a final summary once actions complete.

2) Multi‑agent Telegram concierge for a product team

What it does: Telegram bot that understands “ship notes for v1.2”, fetches merged PRs since last tag, drafts a changelog, and opens a Notion page. Human can reply “publish” to trigger a GitHub release.

Flow

Telegram Trigger → Supervisor Agent
  ├─ Researcher (GitHub API + Vector Store over internal runbooks)
  ├─ Writer (strict JSON schema for changelog sections)
  └─ Publisher (Call n8n Workflow → Create Notion page + optional GitHub release)

Structured schema for Writer:

{
  "type": "object",
  "properties": {
    "highlights": {"type": "array", "items": {"type": "string"}},
    "changes": {"type": "array", "items": {"type": "string"}},
    "breaking": {"type": "array", "items": {"type": "string"}}
  },
  "required": ["highlights","changes"]
}

n8n tips

  • Use $fromAI() to let the agent pick the GitHub tag range by inferring dates from chat context.
  • Keep the Publisher agent tool‑only and ban it from free‑text output to avoid accidental releases.

3) Agentic RAG for customer‑facing docs + follow‑up actions

What it does: Embedded chat widget on your docs site that answers questions from Weaviate/Qdrant, then offers to create a sandbox API key or schedule a call.

Flow

Chat Trigger (embedded) + Memory → Tools Agent
  ├─ Vector Store Retriever → QA Tool
  ├─ HTTP Request Tool (internal provisioning API)
  └─ Call n8n Workflow Tool (Calendaring flow)
→ Respond to Chat

n8n tips

  • Use the Chat Memory Manager node to occasionally trim memory and inject “system clarifications” like pricing changes.
  • If your chat lives behind a load‑balancer, enable streaming responses and test CORS on the Chat Trigger.

Prompting and tool hygiene that actually helps

  • Keep the system prompt under 20 lines. Define goal, tools policy, and output format. Add success criteria the Auditor can check.
  • Ask the agent to show reasoning to itself, but return only JSON or a short answer to users.
  • For tool selection, write one sentence: “Use tools whenever information must be verified, state changed, or retrieval needed.” It dramatically reduces “I guessed!” moments.
  • Ban dangerous actions behind a human‑in‑the‑loop tool and require a confirmation token.
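
A sketch of that gate, with hypothetical names throughout: the point is that the approval lives outside the model, so the agent can't talk its way past it.

// Refuse destructive actions unless a human minted a confirmation token
// (e.g. an "approve" click in Slack that writes into workflow static data).
const { action, confirmationToken } = $input.first().json;
const staticData = $getWorkflowStaticData('global');
const pending = (staticData.pendingConfirmations ?? {})[action];
if (!pending || pending.token !== confirmationToken || Date.now() > pending.expiresAt) {
  throw new Error(`Action "${action}" has no valid human confirmation, refusing`);
}
delete staticData.pendingConfirmations[action]; // tokens are single-use
return [{ json: { action, approved: true } }];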

Cost, latency, and model picks

  • For classification/structuring, cheap fast models are fine. For planning and multi‑step tool use, pick a model good at tool calling. I mix: a small planner model with strict JSON, then a bigger model for heavy RAG or generation.
  • Groq is great when you need low latency; Mistral models are good value; OpenAI/Anthropic are still the safest for complex tool orchestration.
  • Cache any deterministic steps. A 300‑line Code node can replace a thousand expensive tokens.
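
A crude but effective cache via workflow static data. Caveats: static data only persists for active, trigger-based workflows, and the cache key and URL here are illustrative.

// Memoise a deterministic, expensive step across runs.
const staticData = $getWorkflowStaticData('global');
staticData.cache = staticData.cache ?? {};
const key = $input.first().json.productUrl; // hypothetical cache key
const hit = key in staticData.cache;
if (!hit) {
  staticData.cache[key] = await this.helpers.httpRequest({ url: key });
}
return [{ json: { result: staticData.cache[key], cacheHit: hit } }];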

Debugging checklist I wish I had sooner

  • Does the agent know the tool exists? Double‑check the tool is connected to the agent and not to the canvas by accident.
  • Are you in queue mode with Simple Memory? That’s non‑deterministic across workers. Use a proper memory store.
  • Is the output parser attached to the final node? If you parse inside a tool loop, you’ll get brittle failures.
  • Are you rate‑limited? Add Loop Over Items, backoff, and proper error logging.
  • Did you set the encryption key before first boot? Mismatched keys break credentials across workers.

Where to go next

  • Wrap your risky steps in sub‑workflows and expose them via Call n8n Workflow Tool.
  • Introduce a lightweight Auditor agent to stop junk from leaking into Jira/Notion/CRM.
  • Flip to queue mode early. Even a single worker buys you headroom and easier debugging.