v0.12.5 · AGPL-3.0 · Rust

Multi-agent AI orchestration

Open-source infrastructure for coordinating multiple AI agents. Full orchestration — sub-task decomposition, agent delegation, task verification. Streaming multi-modal chat, verify-then-respond hallucination prevention, self-healing inference, knowledge graph with query expansion, visual workflow editor, multi-Ollama load balancing, production security hardening, full audit trails. macOS, Linux, and Windows.

39K lines of code · 96 tools · 8 services · 381 unit tests · 97 E2E tests
[Screenshot: Broodlink Operations Dashboard showing agent roster, system health, task metrics, knowledge graph, and recent activity]

Everything agents need to collaborate

Broodlink provides the infrastructure layer between your AI agents and production deployment. Not a library — a full system.

💬

Conversational Gateway v0.12

Slack, Teams, Telegram, or the built-in dashboard chat UI — pick your interface. Streaming responses with tool call pause/resume. Multi-modal — images, audio, video, documents downloaded, classified, and stored. Auto-fetches URLs from conversation history so the model reads live web content without tool calling. Agents store memories mid-conversation.

🧠

Knowledge Graph Memory

LLM-powered entity extraction builds a relationship graph from agent memories. Multi-hop traversal, temporal edges, entity resolution via embedding similarity. Pure Postgres — no Neo4j required.

🔍

Hybrid Search + Query Expansion v0.11

BM25 keyword + semantic vector fusion with temporal decay and reranking. Query expansion generates 2–3 alternative phrasings via Ollama before searching — "database setup" also finds "PostgreSQL configuration." Smart chunking respects markdown headings, code fences, and tables.

Smart Task Routing + Decomposition v0.12

Complex tasks are automatically decomposed into sub-tasks via LLM, then each sub-task routed independently. 5-factor scoring picks the best agent per task — capability match, success rate, availability, cost, and recency. Parent tasks complete when all children finish. Fail-open — if decomposition fails, the task routes normally.
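
The 5-factor scoring described above can be sketched as a weighted sum. This is a minimal illustration, not Broodlink's routing code: the `AgentSignals` struct, the `WEIGHTS` values, and the assumption that every factor is already normalised to 0.0..=1.0 are all hypothetical.

```rust
/// Per-agent signals, each assumed to be normalised to 0.0..=1.0 (higher is better).
#[derive(Debug, Clone)]
struct AgentSignals {
    name: &'static str,
    capability_match: f64,
    success_rate: f64,
    availability: f64,
    cost_efficiency: f64, // 1.0 = cheapest
    recency: f64,
}

/// Hypothetical weights, in the same order as the factors below.
/// They are not taken from Broodlink's configuration.
const WEIGHTS: [f64; 5] = [0.35, 0.25, 0.20, 0.10, 0.10];

/// Weighted sum of the five factors.
fn score(a: &AgentSignals) -> f64 {
    let factors = [
        a.capability_match,
        a.success_rate,
        a.availability,
        a.cost_efficiency,
        a.recency,
    ];
    factors.iter().zip(WEIGHTS.iter()).map(|(f, w)| f * w).sum()
}

/// Returns the highest-scoring agent, or None when no agent is registered.
fn pick_best(agents: &[AgentSignals]) -> Option<&AgentSignals> {
    agents.iter().max_by(|a, b| score(a).total_cmp(&score(b)))
}
```

Because decomposition is fail-open, a caller would run the same `pick_best` on the original task whenever splitting it into sub-tasks fails.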

🔀

Versioned Agent Brain

Dolt gives you git-like version control on agent memory and decisions. Branch, diff, and rollback what your agents know. Query knowledge as of any commit.

🔗

MCP + A2A Protocols

Full MCP server (Streamable HTTP, SSE, stdio) for Claude Desktop and VS Code. A2A gateway for cross-platform agent interop with 150+ ecosystem partners.

🛡

Governance Stack

Guardrails (tool blocks, content filters, rate overrides), approval gates with auto-approve, per-agent budgets with daily replenishment, role-based dashboard access (viewer/operator/admin), access code auth for Telegram. Defence-in-depth security — HSTS/CSP headers, SSRF protection, parameterized SQL, JWT RS256 validation, NATS token auth, HMAC-signed webhooks, command allowlist, canonicalized paths. 15 hardening fixes in v0.12.5.

📋

Full Audit Trail

Every tool call logged with trace_id, agent_id, parameters, result, and duration. Distributed tracing via OTLP/Jaeger. Append-only. Nothing gets lost.

⚙️

Control Plane v0.6.0

10-tab dashboard control panel. Toggle agents, set budgets, cancel tasks, start workflows, manage guardrails, manage webhooks, retry dead letters, create formulas, manage chat, manage users. Not just a window — a cockpit.

📡

Webhook Gateway

Inbound commands + outbound notifications for Slack, Teams, Telegram. Events: agent.offline, task.failed, budget.low, workflow state changes, guardrail violations.

🔄

Formula Registry v0.7.0

Workflow formulas stored in Postgres, manageable from the dashboard. Create, edit, toggle formulas without SSH. Conditional branching, parallel steps, per-step retries, error handlers.

🤝

Agent Delegation + Negotiation v0.12

Agents delegate work to other agents through the coordinator with accept/decline lifecycle. Agents can decline tasks with reasons, suggest alternatives, request context before committing. Declined agents excluded from re-routing. Dead-letter after 3 declines. Originating agent notified on completion. Full negotiation audit trail.
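
A rough sketch of the decline bookkeeping, assuming that "3 declines" means three distinct agents declining the same task. The `Negotiation` type and `Outcome` enum are hypothetical names used only for illustration.

```rust
use std::collections::HashSet;

/// Dead-letter threshold stated in the text above.
const MAX_DECLINES: usize = 3;

#[derive(Debug, PartialEq)]
enum Outcome {
    Reroute,
    DeadLetter,
}

/// Tracks which agents have declined one task (hypothetical structure).
#[derive(Default)]
struct Negotiation {
    declined_by: HashSet<String>,
}

impl Negotiation {
    /// Records a decline and says whether the task is re-routed or dead-lettered.
    fn record_decline(&mut self, agent_id: &str) -> Outcome {
        self.declined_by.insert(agent_id.to_string());
        if self.declined_by.len() >= MAX_DECLINES {
            Outcome::DeadLetter
        } else {
            Outcome::Reroute
        }
    }

    /// Filters out agents that already declined, so they are not asked again.
    fn eligible<'a>(&self, candidates: &[&'a str]) -> Vec<&'a str> {
        candidates
            .iter()
            .copied()
            .filter(|c| !self.declined_by.contains(*c))
            .collect()
    }
}
```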

Proactive Ops (new)

Agents schedule their own future work (cron or one-shot). Notification rules auto-alert on error spikes, DLQ backlog, or low budgets — and can auto-launch incident postmortem workflows. The system acts on its own.

🔒

Security Hardened v0.12.5

15-fix security release across 8 binaries. RBAC on all mutations, HSTS+CSP+nosniff on every HTTP service, SSRF blocking, parameterized SQL throughout, JWT algorithm validation (RS256-only, blocks confusion attacks), localhost-only binding by default, NATS token auth, HMAC-SHA256 webhook signing, 256-bit session tokens, shell command allowlist with metacharacter rejection, path traversal protection via canonicalization, TLS enforced in non-dev profiles, no default passwords anywhere.
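
To make "shell command allowlist with metacharacter rejection" concrete, here is a small sketch. The `ALLOWED` programs and the `METACHARS` set are assumptions chosen for the example; Broodlink's actual lists are not shown on this page.

```rust
/// Hypothetical allowlist of programs an agent may run.
const ALLOWED: &[&str] = &["ls", "cat", "git"];

/// Characters that would let a command chain, substitute, or redirect (assumed set).
const METACHARS: &[char] = &[';', '|', '&', '$', '`', '>', '<', '(', ')', '\n', '\\'];

/// Returns the argv to execute, or an error explaining why the line was refused.
fn check_command(line: &str) -> Result<Vec<&str>, String> {
    // Reject metacharacters first, before any parsing.
    if let Some(c) = line.chars().find(|c| METACHARS.contains(c)) {
        return Err(format!("rejected metacharacter {:?}", c));
    }
    let argv: Vec<&str> = line.split_whitespace().collect();
    if argv.is_empty() {
        return Err("empty command".to_string());
    }
    // Only the program name is checked against the allowlist.
    if !ALLOWED.contains(&argv[0]) {
        return Err(format!("command {:?} is not on the allowlist", argv[0]));
    }
    Ok(argv)
}
```

Returning an argv vector rather than a string lets the caller spawn the process directly, without passing the line through a shell.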

🔬

Dual Verification v0.12

Two layers. Chat verification: every LLM response carries a confidence score (1–5) — low confidence triggers a second reasoning model to fact-check before the user sees it. Task verification: every completed task's output is automatically checked by the coordinator before finalizing. Both are fail-open — verification never blocks the pipeline. Analytics dashboard tracks correction rates.
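
The chat-side gate can be sketched as follows. The threshold of 3 comes from the conversational flow on this page (confidence < 3 triggers verification); the `Draft` type and the shape of the `verifier` callback are hypothetical.

```rust
/// Responses scoring below this value are sent to the verifier.
const CONFIDENCE_THRESHOLD: u8 = 3;

/// A model response plus its self-reported confidence (1 to 5).
struct Draft {
    text: String,
    confidence: u8,
}

/// `verifier` stands in for the second reasoning model (hypothetical signature):
/// Ok(Some(text)) is a correction, Ok(None) means the draft stands,
/// Err means the verifier itself failed.
fn verify_then_respond<F>(draft: Draft, verifier: F) -> String
where
    F: FnOnce(&str) -> Result<Option<String>, String>,
{
    if draft.confidence >= CONFIDENCE_THRESHOLD {
        return draft.text;
    }
    match verifier(&draft.text) {
        Ok(Some(corrected)) => corrected,
        Ok(None) => draft.text,
        // Fail-open: a broken verifier never blocks delivery.
        Err(_) => draft.text,
    }
}
```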

⚖️

Multi-Ollama Load Balancing v0.11

Distribute inference across multiple Ollama instances. Least-loaded routing with RAII permits, health checks every 30 seconds, automatic failover. One long generation no longer blocks every other request.
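
A minimal sketch of least-loaded routing with RAII permits, using only the standard library. The `Pool`, `Instance`, and `Permit` types are illustrative names, and the health flag is set by hand here rather than by a 30-second health check.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

struct Instance {
    url: String,
    in_flight: Arc<AtomicUsize>,
    healthy: bool,
}

/// Holding a Permit counts as one in-flight request; dropping it releases the slot.
struct Permit {
    counter: Arc<AtomicUsize>,
    url: String,
}

impl Drop for Permit {
    fn drop(&mut self) {
        self.counter.fetch_sub(1, Ordering::SeqCst);
    }
}

struct Pool {
    instances: Vec<Instance>,
}

impl Pool {
    fn new(urls: &[&str]) -> Self {
        let instances = urls
            .iter()
            .map(|u| Instance {
                url: u.to_string(),
                in_flight: Arc::new(AtomicUsize::new(0)),
                healthy: true,
            })
            .collect();
        Pool { instances }
    }

    /// Picks the healthy instance with the fewest in-flight requests.
    /// The load-then-increment pair is not atomic; good enough for a sketch.
    fn acquire(&self) -> Option<Permit> {
        let inst = self
            .instances
            .iter()
            .filter(|i| i.healthy)
            .min_by_key(|i| i.in_flight.load(Ordering::SeqCst))?;
        inst.in_flight.fetch_add(1, Ordering::SeqCst);
        Some(Permit {
            counter: Arc::clone(&inst.in_flight),
            url: inst.url.clone(),
        })
    }
}
```

Tying the counter to `Drop` means a request that errors or is cancelled still releases its slot, which is the point of using a permit instead of manual bookkeeping.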

💚

Self-Healing Inference v0.9

When the primary model OOMs or errors, the system attempts recovery automatically. If recovery fails, it enters degraded mode — routing to a fast fallback model instead of returning errors. After a 5-minute cooldown, it probes the primary model and auto-recovers. Zero operator intervention.
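
The recovery cycle above reads like a small state machine. The sketch below assumes two things not stated on this page: the cooldown restarts when a probe fails, and the caller supplies the probe as a closure. Time is passed in explicitly so the logic can be exercised without waiting.

```rust
use std::time::{Duration, Instant};

/// Cooldown before the primary model is probed again (5 minutes, per the text).
const COOLDOWN: Duration = Duration::from_secs(5 * 60);

#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Healthy,
    Degraded { since: Instant },
}

struct Inference {
    mode: Mode,
}

impl Inference {
    /// Called when the primary model errors; `recovered` is the result of the
    /// automatic recovery attempt. Failed recovery enters degraded mode.
    fn on_primary_failure(&mut self, now: Instant, recovered: bool) {
        if !recovered {
            self.mode = Mode::Degraded { since: now };
        }
    }

    /// Chooses which model serves a request at `now`.
    /// `probe_primary` is only invoked once the cooldown has elapsed.
    fn select_model(&mut self, now: Instant, probe_primary: impl FnOnce() -> bool) -> &'static str {
        if let Mode::Degraded { since } = self.mode {
            if now.duration_since(since) >= COOLDOWN {
                self.mode = if probe_primary() {
                    Mode::Healthy
                } else {
                    // Assumption: a failed probe restarts the cooldown.
                    Mode::Degraded { since: now }
                };
            }
        }
        match self.mode {
            Mode::Healthy => "primary",
            Mode::Degraded { .. } => "fallback",
        }
    }
}
```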

🖥️

Cross-Platform v0.12

Runs on macOS, Linux, and Windows. GitHub Actions builds 8 binaries for 5 platform targets on every tag. Download a zip, run it. No Rust toolchain required. Chocolatey + PowerShell on Windows, native everywhere else.

🔐

Trust-Gated Tools v0.12.3

Five progressive trust levels — untrusted, probation, standard, elevated, system. New agents start with no tools. Read-only capabilities earned at probation. Core tools at standard. Privileged operations at elevated. Per-agent allow/deny overrides. Agent-level RBAC nobody else ships in OSS.
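
One way to model the five levels is an ordered enum, so "level X or higher" becomes a comparison. The minimum level per tool and the precedence of deny over allow are assumptions made for this sketch.

```rust
/// Declaration order defines the ordering: Untrusted < Probation < ... < System.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum TrustLevel {
    Untrusted,
    Probation,
    Standard,
    Elevated,
    System,
}

struct Tool {
    name: &'static str,
    min_level: TrustLevel, // hypothetical per-tool requirement
}

struct AgentPolicy {
    level: TrustLevel,
    allow: Vec<&'static str>, // per-agent allow overrides
    deny: Vec<&'static str>,  // per-agent deny overrides
}

/// Deny overrides win, then allow overrides, then the trust-level check.
fn may_call(policy: &AgentPolicy, tool: &Tool) -> bool {
    if policy.deny.contains(&tool.name) {
        return false;
    }
    if policy.allow.contains(&tool.name) {
        return true;
    }
    // New (untrusted) agents start with no tools at all.
    policy.level != TrustLevel::Untrusted && policy.level >= tool.min_level
}
```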

📡

Streaming Event Protocol v0.12.3

17 typed event kinds across every service — route start, agent match, tool exec, message delta, permission denied, guardrail triggered, budget debit, task completed, verification result. Shared Rust crate and Python SDK. Formal observability, not callbacks.

🧬

Deterministic Bootstrap v0.12.3

Startup pipeline as a directed acyclic graph. Kahn's topological sort with cycle detection. Seven stages: prefetch → config → dependencies → databases → services → deferred init → ready. Per-stage timing and diagnostics. No more ad-hoc start scripts.
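
For readers unfamiliar with Kahn's algorithm, here is a compact version with cycle detection. It is a generic sketch, not the bootstrap code itself; the function name `topo_sort` is hypothetical, and it assumes every edge endpoint also appears in `nodes`.

```rust
use std::collections::{HashMap, VecDeque};

/// Orders `nodes` so that every edge (from, to) has `from` before `to`.
/// Returns an error if the graph contains a cycle.
fn topo_sort<'a>(
    nodes: &[&'a str],
    edges: &[(&'a str, &'a str)],
) -> Result<Vec<&'a str>, String> {
    let mut indegree: HashMap<&'a str, usize> = nodes.iter().map(|n| (*n, 0)).collect();
    let mut children: HashMap<&'a str, Vec<&'a str>> = HashMap::new();
    for (from, to) in edges {
        children.entry(*from).or_default().push(*to);
        *indegree.entry(*to).or_insert(0) += 1;
    }

    // Seed with dependency-free nodes, in declaration order for determinism.
    let mut queue: VecDeque<&'a str> = nodes
        .iter()
        .copied()
        .filter(|n| indegree[n] == 0)
        .collect();

    let mut order = Vec::with_capacity(nodes.len());
    while let Some(n) = queue.pop_front() {
        order.push(n);
        for child in children.get(n).into_iter().flatten() {
            let d = indegree.get_mut(child).unwrap();
            *d -= 1;
            if *d == 0 {
                queue.push_back(*child);
            }
        }
    }

    // Any node left unvisited still has an unmet dependency, which means a cycle.
    if order.len() == nodes.len() {
        Ok(order)
    } else {
        Err("cycle detected in startup graph".to_string())
    }
}
```

Feeding it the seven stages as a simple chain (an assumption about how they depend on each other) returns them in the order listed above, while a pair of stages that depend on each other yields the error.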

The most complete self-hosted agent memory

Query expansion + BM25 + vector + temporal decay + reranker + smart chunking + knowledge graph + versioned history + audit trail. BM25 and the knowledge graph run on commodity Postgres, with vectors in Qdrant and versioned history in Dolt. No Neo4j required.

🔍 Hybrid Search Pipeline v0.11

  • Agent calls hybrid_search with query
  • Query expansion: Ollama generates 2–3 alternative phrasings (new)
  • BM25 keyword search — Postgres tsvector, weighted A/B/C fields
  • Semantic vector search — Ollama → Qdrant, 768-dim cosine
  • All variants searched in parallel, merged by max score
  • Min-max normalisation + weighted score fusion
  • Temporal decay: score × e^(-λ·age)
  • Optional reranking via dedicated model (~200ms)
  • Results with BM25 + vector + fused scores
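
The normalisation, fusion, and decay steps in the pipeline above can be sketched in a few lines. The BM25 weight, the decay rate λ, the use of days as the age unit, and the `Doc` struct are assumptions for illustration; query expansion and reranking are left out.

```rust
use std::collections::HashMap;

/// Min-max normalise scores into 0.0..=1.0; if all scores are equal they map to 1.0.
fn min_max(scores: &HashMap<u32, f64>) -> HashMap<u32, f64> {
    let min = scores.values().cloned().fold(f64::INFINITY, f64::min);
    let max = scores.values().cloned().fold(f64::NEG_INFINITY, f64::max);
    scores
        .iter()
        .map(|(id, s)| {
            let n = if max > min { (s - min) / (max - min) } else { 1.0 };
            (*id, n)
        })
        .collect()
}

struct Doc {
    id: u32,
    age_days: f64, // assumed unit for the decay term
}

/// Weighted fusion of normalised BM25 and vector scores, then score × e^(-λ·age).
/// Returns (doc id, final score) pairs, best first.
fn fuse(
    bm25: &HashMap<u32, f64>,
    vector: &HashMap<u32, f64>,
    docs: &[Doc],
    w_bm25: f64,
    lambda: f64,
) -> Vec<(u32, f64)> {
    let (b, v) = (min_max(bm25), min_max(vector));
    let mut out: Vec<(u32, f64)> = docs
        .iter()
        .map(|d| {
            // A document missing from one retriever contributes 0 for that side.
            let fused = w_bm25 * b.get(&d.id).copied().unwrap_or(0.0)
                + (1.0 - w_bm25) * v.get(&d.id).copied().unwrap_or(0.0);
            (d.id, fused * (-lambda * d.age_days).exp())
        })
        .collect();
    out.sort_by(|x, y| y.1.total_cmp(&x.1));
    out
}
```

Normalising before fusing matters because raw BM25 scores are unbounded while cosine similarities are not, so a plain weighted sum of raw scores would let BM25 dominate.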

🧠 Knowledge Graph Pipeline v0.5.0

  • Agent stores memory via store_memory
  • Outbox queues for async processing
  • Smart chunking: split at headings, code fences, tables, never mid-structure (new)
  • Ollama generates 768-dim embedding per chunk → Qdrant
  • LLM extracts entities + relationships from content
  • Entity resolution: exact name → embedding sim → new
  • Edges with temporal validity + weight reinforcement
  • Queryable via graph_traverse — multi-hop CTE
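
The entity-resolution step (exact name → embedding similarity → new) might look like the sketch below. The `Entity` struct, the `resolve` function, and the similarity threshold are hypothetical, and toy 2-dimensional vectors stand in for the 768-dimensional embeddings.

```rust
struct Entity {
    name: String,
    embedding: Vec<f32>,
}

/// Cosine similarity; returns 0.0 if either vector has zero length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Returns the index of the entity this mention resolves to,
/// creating a new entity when nothing matches.
fn resolve(entities: &mut Vec<Entity>, name: &str, embedding: Vec<f32>, threshold: f32) -> usize {
    // 1. Exact name match.
    if let Some(i) = entities.iter().position(|e| e.name == name) {
        return i;
    }
    // 2. Closest existing entity by embedding similarity.
    let best = entities
        .iter()
        .enumerate()
        .map(|(i, e)| (i, cosine(&e.embedding, &embedding)))
        .max_by(|a, b| a.1.total_cmp(&b.1));
    if let Some((i, sim)) = best {
        if sim >= threshold {
            return i;
        }
    }
    // 3. Otherwise register a new entity.
    entities.push(Entity { name: name.to_string(), embedding });
    entities.len() - 1
}
```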

[Comparison table: Broodlink vs. Zep/Graphiti vs. LangGraph vs. Mem0, across the capabilities below]

  • BM25 keyword search
  • Vector search
  • Query expansion
  • Smart chunk boundaries
  • Knowledge graph
  • Temporal edges
  • Versioned memory
  • Verify-then-respond
  • Task output verification
  • Streaming responses
  • Sub-task decomposition
  • Agent delegation + negotiation
  • Self-healing inference
  • Multi-modal attachments
  • Native chat UI
  • Cross-platform binaries
  • Trust-gated tool access
  • Typed streaming event protocol
  • Conversational gateway
  • Role-based access control
  • Self-hosted / no API keys
  • Full audit trail

Talk to your agents from anywhere

Slack, Teams, Telegram, or the built-in dashboard chat. An agent picks up your message, runs tools, and replies. Not slash commands — full streaming conversations with multi-modal attachments.

💬 Conversational Flow v0.11

  • User sends message (text, image, audio, video, document) via Slack / Teams / Telegram / dashboard
  • Platform-specific file download + attachment classification + storage (v0.11)
  • a2a-gateway verifies access code auth (Telegram) or signing secret
  • Creates or resumes chat_session, dedup check (30s window)
  • Auto-fetches URLs from the last 5 messages and injects live web content into context (v0.11)
  • Streaming: "Thinking..." placeholder → progressive message edits every 800ms
  • Tool calls pause the stream, show "Using tool: {name}...", resume after
  • Thinking-only failure → auto-retry with reduced 3-tool set
  • Confidence < 3 → verify with DeepSeek-R1 before delivery
  • Response streamed back to platform thread

🛡 Enterprise Controls

  • Access code auth for Telegram — zero info leakage on wrong codes
  • Verify-then-respond: uncertain answers fact-checked by second model
  • Task output verification: coordinator checks results before finalizing (v0.12)
  • Sub-task decomposition: complex work split automatically, fail-open (v0.12)
  • Agent delegation: full request/accept/decline/complete lifecycle (v0.12)
  • Self-healing inference: OOM → auto-recovery → degraded mode → cooldown probe
  • Multi-Ollama: least-loaded routing across instances with health checks
  • Budget enforcement, guardrails, deadlock detection
  • Notification rules auto-alert on error spikes, DLQ backlog, low budgets
  • Scheduled tasks fire on cron — agents plan their own future work
  • Verification analytics dashboard tracks correction rates

8 Rust services, one system

Agents connect to beads-bridge. Everything else is internal. NATS decouples services. Dolt versions knowledge. Postgres handles hot paths, the knowledge graph, and the formula registry. Ollama self-heals on OOM. Runs on macOS, Linux, and Windows.

                    Agents (Claude, Qwen, custom bots, ...)
                            │
                    ┌───────▼───────┐
                    │ beads-bridge  │ :3310  Tool API (96 tools)
                    │               │   JWT RS256 + Rate Limiting
                    └──┬────┬───┬───┘       + Budget Enforcement
                       │    │   │
          ┌────────────┘    │   └────────────┐
          │                 │                │
  ┌───────▼──────┐  ┌──────▼──────┐  ┌──────▼──────┐
  │    Dolt      │  │  Postgres   │  │    NATS     │
  │  :3307       │  │  :5432      │  │  :4222      │
  │ (versioned)  │  │ (hot paths  │  │ (messaging) │
  │              │  │ + KG + DLQ  │  │             │
  │              │  │ + formulas) │  │             │
  └──────────────┘  └──────┬──────┘  └──────┬──────┘
                           │                │
                    ┌──────▼──────┐  ┌──────▼──────┐
                    │ embedding-  │  │ coordinator │
                    │ worker      │  │ + decompose │
                    │ + KG extract│  │ + delegate  │
                    └──────┬──────┘  │ + verify    │
                           │         └─────────────┘
                           │
              ┌────────────┼────────────┐
              │            │            │
      ┌───────▼──┐  ┌─────▼────┐  ┌───▼────────┐
      │  Ollama  │  │  Qdrant  │  │ heartbeat  │
      │  :11434  │  │  :6333   │  │ (5m cycle) │
      │embeddings│  │ 2 vector │  │ + KG sync  │
      │+ KG LLM  │  │ collectns│  │ + budgets  │
      │+ pool LB │  └──────────┘  │ + alerts   │
      │+ selfheal│                └────────────┘
      └──────────┘

  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
  │  status-api  │  │  mcp-server  │  │ a2a-gateway  │
  │  :3312       │  │  :3311       │  │  :3313       │
  │  + RBAC auth │  └──────────────┘  │ + chat GW    │
  │  + NATS pub  │                    │ + Brave 🔍   │
  └──────┬───────┘                    │ + webhooks   │
         │                            └──────────────┘
  ┌──────▼───────┐
  │  Hugo Site   │    Slack · Teams · Telegram
  │  :1313       │    + Dashboard Chat UI
  │  18 pages    │    ───────────────────────
  └──────────────┘    Streaming conversations
                      via webhook gateway

                      macOS · Linux · Windows
                      5 release targets

What's inside

Every component chosen for a reason. No cloud dependencies. Self-hosted and self-contained.

Service          | Port | Purpose
beads-bridge     | 3310 | Universal tool API: 96 tools, JWT auth, rate limiting, circuit breakers, guardrails, budget enforcement, query expansion
coordinator      | -    | Sub-task decomposition, agent delegation protocol, task output verification, smart routing, workflow orchestration, negotiation, deadlock detection, dead-letter queue
heartbeat        | -    | 5-min sync: Dolt commit, agent metrics, KG backfill, budget replenishment, session cleanup, notification rule evaluation, scheduled task dispatch
embedding-worker | -    | Outbox → smart chunking (headings, code fences, tables) → Ollama embeddings → Qdrant + LLM entity extraction → knowledge graph
status-api       | 3312 | Dashboard API + control panel + native chat UI + model download UI + RBAC auth (viewer/operator/admin), 55+ endpoints, verification analytics, agent onboarding
mcp-server       | 3311 | MCP protocol: Streamable HTTP + stdio. Proxies all bridge tools.
a2a-gateway      | 3313 | Google A2A protocol + streaming multi-modal chat gateway (Slack/Teams/Telegram) + Brave Search + verify-then-respond + multi-Ollama pool + self-healing inference + webhooks

Real numbers from a running system

Benchmarked on a Mac Studio — Apple M4 Max, 16 cores, 128 GB unified memory. Full stack on Google's Gemma 4 family (Apache 2.0, native tool calling, 256K context). Local models at ~85–95% of cloud quality, zero cost, full data privacy.

Inference Speed

  • gemma4:31b (primary + vision) — 19 GB, 256K context, native tool calling
  • gemma4:26b (code, MoE) — 17 GB, Apache 2.0
  • deepseek-r1:32b (verifier) — 19 GB, 9–22 tok/s
  • gemma4:e4b (fallback / KG / expansion) — 9.6 GB
  • nomic-embed-text — 72ms per embedding
  • ~65 GB total VRAM, 63 GB headroom on Mac Studio

📊 Service Latency

  • beads-bridge health check — 26ms
  • status-api health check — 23ms
  • list_tasks (Postgres) — 38ms
  • semantic_search (Ollama + Qdrant) — 135ms
  • Postgres queries — 0.3–1.1ms (8,233 audit rows)
  • Sub-second end-to-end for all operations

Running in minutes

One command installs all 8 binaries. Native infrastructure via brew and direct binaries — no container runtime required.

```shell
# Install (macOS / Linux)
curl -fsSL https://raw.githubusercontent.com/nevenkordic/broodlink/main/install.sh | sh

# Install (Windows PowerShell)
irm https://raw.githubusercontent.com/nevenkordic/broodlink/main/install.ps1 | iex

# Then launch the setup wizard
broodlink

# Or build from source
git clone https://github.com/nevenkordic/broodlink.git
cd broodlink && bash scripts/bootstrap.sh

# Verify
cargo test --workspace             # 381 unit tests
bash tests/e2e.sh                  # 97 end-to-end tests
```

The infrastructure your agents need

Orchestration, memory, messaging, governance, and audit — in one system. Stop stitching libraries together.