Vol. I — First Edition Nous Research For the Newly Arrived

Hermes & the Eight Powers A Field Guide for First-Time Travelers

Eight features, mapped & named, so you can choose the right one in under a minute and stop wondering whether you needed cron, delegation, or kanban.

Hermes is an agent that does work for you — but the question every newcomer hits is the same: which lever do I pull? Schedule something? Spawn helpers? Set a goal and walk away? The eight features in this guide each answer a different version of "how should this work get done?" — and once you can name them, the docs stop feeling like a maze.

Each entry below tells you, in roughly the order you'd ask: what it is, when to reach for it, when to not, and the smallest example that does something useful.

What do you want Hermes to do?

A Map of Intentions — Follow the Lines
A task needs doing START HERE when? how many? how long? how observed? later? on a schedule? in fresh sessions? a few right now, in parallel? work that survives restarts & humans? react to lifecycle events? recurring or one-shot isolated & ephemeral named, shared board tool / message / startup I · CRON II · DELEGATION III · KANBAN VII · HOOKS keep going? tool chains? at scale? don't stop until a goal is met 3+ tool calls in one Python script thousands of prompts in parallel V · GOALS VI · CODE EXEC VIII · BATCH N S
ISchedule

Scheduled Tasks a.k.a. Cron

"Set it. Forget it. Trust the tick."

Tells Hermes do this later — once, hourly, every Sunday at 9, whatever. The gateway daemon ticks every 60 seconds and runs anything that's due in a fresh agent session, then delivers the result wherever you said (chat, file, Telegram, Discord, email…).

Three ways to schedule: a relative delay (30m), an interval (every 2h), or a full cron expression (0 9 * * 1-5). You can also just ask Hermes in plain English — "every morning at 9, summarize Hacker News and DM me on Telegram" — and it'll wire up the cron job for you.

First-timer pitfall: cron runs in a fresh session with zero memory of your last chat. Whatever the agent needs to know, the prompt has to say.

Reach for it when
You want recurring or future-scheduled work — daily briefs, hourly polls, "remind me in 30 minutes."
Don't reach for it when
You need the result now in this conversation. Use the agent normally.
/cron add "every 1h" "Summarize new feed items" --skill blogwatcher
TICK · 60s fresh agent session RUNS PROMPT → chat / file / Slack / email
A daemon ticks; due jobs spawn fresh sessions; output goes where you asked.
IIHelpers

Subagent Delegation a.k.a. delegate_task

"Many hands — one mind to read the report."

Lets the main agent spawn child agents to do work in parallel, each with a fresh isolated context and a restricted toolset. Up to 3 children at once by default. Only their final summaries come back — intermediate noise stays out of your context.

Think fork & join: the parent blocks until every child returns, then continues. Children know nothing the parent knows unless the parent passes it explicitly via the context field.

Crucial distinction: delegation is synchronous and ephemeral. If the user interrupts the parent, every child dies and their work is discarded. For durable work, use Cron or Kanban.

Reach for it when
You need 3 research questions answered in parallel, want to refactor a big codebase without flooding context, or want to delegate cheaper subtasks to a smaller model.
Don't reach for it when
The work has to survive interrupts, needs human input, or might be picked up by another role.
delegate_task(tasks=[ {"goal": "Research WebAssembly 2025", "toolsets": ["web"]}, {"goal": "Research RISC-V 2025", "toolsets": ["web"]}, {"goal": "Research quantum 2025", "toolsets": ["web"]}, ])
parent YOUR AGENT child 1 child 2 child 3 PARALLEL · ISOLATED · FRESH CONTEXT → summaries return to parent
Fork into N isolated children, join their summaries.
IIIBoard

Kanban a.k.a. multi-agent task board

"A board your agents share — with rows that survive everything."

A durable SQLite-backed task board where multiple named profiles (researcher, reviewer, writer, deploy-bot) collaborate over time. Tasks have status, dependencies, comments, retries, and a full audit trail. The dispatcher inside the gateway claims ready tasks and spawns whichever profile owns them.

Where delegation is a function call, kanban is a work queue: every handoff is a row, humans can comment or unblock at any point, and a crashed worker gets reclaimed and retried instead of losing its task.

Open the dashboard (hermes dashboard → Kanban tab) and you get a Linear-style drag-and-drop board with columns for Triage / Todo / Ready / Running / Blocked / Done.

Reach for it when
Work crosses agent boundaries, needs to survive restarts, might need human input, or you have fleet work (one specialist + 50 subjects).
Don't reach for it when
You just want the parent agent to think about three things in parallel and merge results — that's delegation's job.
$ hermes kanban init $ hermes gateway start $ hermes kanban create "research AI funding" \ --assignee researcher $ hermes kanban watch # live event stream
TODO spec API draft post READY review PR RUNNING build translate BLOCKED deploy ⚠ DONE
Six columns. Real handoffs. Real audit log.
IVStories

Kanban Stories a.k.a. four worked examples

"See it in motion before you commit."

The companion tutorial walks four real shapes of work with dashboard screenshots:

① Solo dev shipping a feature — three linked tasks (schema → API → tests), each handing off structured --summary + --metadata so the next worker doesn't re-read a design doc.

② Fleet farming — one translator profile, one transcriber profile, one copywriter, all draining their queues in parallel while you sleep.

③ Role pipeline with retry — PM writes spec, engineer implements, reviewer rejects, engineer iterates, reviewer approves. Retry history is preserved, so the second attempt sees why the first failed.

④ Crash & circuit breaker — worker OOM-kills, dispatcher detects the dead PID, requeues. After N consecutive spawn-failures the breaker trips and the task auto-blocks.

If you're new, this is the page that makes Kanban click. The reference page tells you what; the stories show you why.

Read it when
You skimmed the Kanban reference and aren't sure when you'd actually pick it over a TODO list. The story comparing flat-todo vs. retry-aware runs is the moment of clarity.
STORY ③ · ROLE PIPELINE WITH RETRY spec @pm DONE impl @eng DONE (run 2) run 1 BLOCKED "missing strength check" run 2 COMPLETED review @rev READY
Story ③: every retry is a row. The next attempt reads the prior block reason.
VPersist

Persistent Goals a.k.a. /goal

"Don't stop until it's done — and stop saying 'keep going'."

You hand Hermes a goal that survives across turns. After every turn, a tiny judge model decides "done?" or "continue?" If continue, Hermes auto-feeds itself a continuation prompt and works again — up to 20 turns by default before pausing for you to /goal resume.

This is Hermes' take on the Ralph loop pattern. Use it for tasks where you'd otherwise be sitting there saying "keep going" three times: lint sweeps, multi-file ports, "investigate this drift bug and write a report."

User messages always preempt the loop — type something and it pauses. /goal pause, /goal resume, /goal clear, /goal status are the levers.

The judge is deliberately conservative: it only marks a goal done when the response explicitly confirms the work, the deliverable is clearly produced, or the goal is unachievable.

Reach for it when
The job is iterative and well-defined: "fix every failing test in tests/foo/ until scripts/run_tests.sh passes."
Don't reach for it when
The agent normally finishes in one turn. Goals are for tasks that need self-driven iteration.
/goal Fix every lint error in src/ and verify ruff check passes # → ⊙ Goal set (20-turn budget) # → turn 1 runs, judge says continue # → ↻ Continuing toward goal (1/20) # → ✓ Goal achieved
THE GOAL LOOP agent turn DOES THE WORK judge DONE? CONTINUE? ↻ continue → fresh continuation prompt DONE 20-turn budget · pauses if hit
Loop, judge, repeat — until done or the budget runs out.
VIScript

Code Execution a.k.a. execute_code

"One Python script. Many tools. One LLM turn."

Hermes can write a Python script that calls Hermes toolsweb_search, read_file, patch, terminal, search_files… — programmatically, with normal Python flow control between calls. The script runs in a sandboxed child process and only its print() output comes back.

Why this matters: in a normal multi-tool workflow, every intermediate result enters the LLM context. With execute_code, you can loop over 50 search results, filter, fetch, summarize — and the LLM only sees the final summary. Massive token savings on tool-heavy work.

The agent decides on its own when to use this. It picks execute_code when there are 3+ tool calls with logic between them (loops, filters, conditionals).

Reach for it when
Mechanical pipelines — "find every file with X, read each, extract Y, count and report." (Or just trust the agent to pick it.)
Don't reach for it when
The work needs reasoning between steps — that's what delegation is for. Also: Linux/macOS only (uses Unix domain sockets).
# intermediate results never enter the context from hermes_tools import web_search, web_extract results = web_search("Rust async runtimes", limit=5) summaries = [] for r in results["data"]["web"]: page = web_extract([r["url"]]) # filter, summarize… summaries.append(...) print(json.dumps(summaries)) # ← only this returns
CONTEXT-EFFICIENT TOOL CHAINS LLM turn writes PYTHON SCRIPT web_search(...) for r in results: web_extract(r) filter / sum… print(final) VIA UNIX SOCKET ONLY print() RETURNS
All the tool noise stays out of context.
VIIReact

Event Hooks a.k.a. listen, block, inject

"Run your code at the moments that matter."

Three hook flavors let you wedge custom logic into agent lifecycle: Gateway hooks (Python handlers that fire on Telegram/Slack/Discord events), Plugin hooks (Python, fire in CLI and gateway), and Shell hooks (drop-in shell scripts declared in config.yaml).

Three things hooks can do, in increasing power:

Observe — log every tool call, count subagent runs, ping Telegram when a long task crosses 10 steps.
Blockpre_tool_call can veto a dangerous shell command before it runs (e.g. forbid rm -rf /).
Injectpre_llm_call can prepend context to the user's message every turn (memory recall, "today is Friday," current git status).

A famous community pattern: drop a ~/.hermes/BOOT.md with natural-language startup instructions, wire a hook on gateway:startup, and Hermes runs your checklist on every boot.

Reach for it when
You want auditing, guardrails, auto-formatting after writes, RAG-style context injection, or a startup ritual.
hooks: pre_tool_call: - matcher: "terminal" command: "~/.hermes/agent-hooks/block-rm.sh" post_tool_call: - matcher: "write_file|patch" command: "~/.hermes/agent-hooks/auto-format.sh" pre_llm_call: - command: "~/.hermes/agent-hooks/inject-git-status.sh"
THE AGENT TURN PIPELINE USER MSG PRE_LLM PRE_TOOL POST_TOOL POST_LLM inject CTX block log log ↑ INJECT & BLOCK ↓ OBSERVE
Hooks slot into the turn pipeline at named moments.
VIIIScale

Batch Processing a.k.a. batch_runner.py

"Thousands of prompts. One run. ShareGPT trajectories out."

Batch processing is the training-data generation tool. Feed it a JSONL of prompts, set parallelism, and every prompt gets its own isolated agent session with sampled toolsets. Output is structured ShareGPT-format trajectories with full conversation history, tool stats, and reasoning coverage.

Resume is content-aware — if a run crashes halfway, --resume scans existing batch files, matches completed prompts by their actual text, and only re-runs what's missing. Quality filters drop samples without reasoning and entries with hallucinated tool names.

You can pin a per-prompt Docker image (so a "compile this Rust" prompt gets a Rust container) and a per-prompt working directory.

This is for fine-tuning & eval workflows, not for agent users. If you're not training a model, skip it.

Reach for it when
You're generating tool-use trajectories for fine-tuning, or you want to measure how well a model handles a benchmark of prompts.
Don't reach for it when
You just want the agent to handle one task — that's just… using the agent.
$ python batch_runner.py \ --dataset_file=data/prompts.jsonl \ --batch_size=20 \ --run_name=coding_v1 \ --num_workers=8 \ --max_turns=15 # → data/coding_v1/trajectories.jsonl
JSONL → WORKERS → TRAJECTORIES PROMPTS …1000s N WORKERS worker 1 worker 2 worker 3 TRAJECTORIES .jsonl RESUMABLE · QUALITY-FILTERED · SHAREGPT FORMAT
For training-data generation & eval workflows.

The trio that confuses everyone

Cron vs. Delegation vs. Kanban vs. Goals — named by what they are, not what they look like.

Cron Delegation Kanban Goals
Shape Scheduled fire-and-forget Synchronous fork & join Durable queue + state machine Auto-loop on the same session
When does it run? On a schedule (or future timestamp) Right now, blocks parent When dispatcher claims it (~60s tick) Every turn until done
Survives restart? ✓ — daemon picks up ✗ — dies with the parent turn ✓ — rows live forever ✓ — persisted in session state
Human in the loop? Notify only No — subagents can't ask ✓ — comment, unblock anytime ✓ — user msg always preempts
Multiple workers? Each job is solo Up to 3 concurrent (configurable) Many named profiles, fleet-able One agent, many turns
Best for "Every morning at 9, do X." "Research these 3 things in parallel." "Pipeline this work across roles." "Don't stop until all tests pass."
Reaches for it
via
/cron add or natural language Agent decides via delegate_task hermes kanban create /goal <text>