Introduction
Real-time cost observability and guardrails for AI agents.
What is AgentFlare?
AgentFlare is a lightweight observability and safety layer for AI agents. It sits between your agent code and the LLM API, watching every token that flows through. The moment your agent's spend crosses a limit you define, AgentFlare pauses it automatically and fires a Slack alert — before a runaway loop costs you hundreds of dollars.
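Conceptually, the layer is a metered wrapper around your LLM client: each call's cost is added to a running total, and the agent is paused the moment the total crosses the limit. The sketch below is illustrative only; the class and method names (BudgetGuard, record_call) are assumptions, not the real AgentFlare SDK.

```python
# Illustrative sketch only -- names here are hypothetical, not the AgentFlare SDK.

class BudgetExceeded(Exception):
    """Raised when an agent's tracked spend crosses its daily limit."""

class BudgetGuard:
    def __init__(self, agent_id: str, daily_limit_usd: float):
        self.agent_id = agent_id
        self.daily_limit_usd = daily_limit_usd
        self.spend_usd = 0.0
        self.paused = False

    def record_call(self, cost_usd: float) -> None:
        # Called once per LLM request with that request's dollar cost.
        self.spend_usd += cost_usd
        if self.spend_usd >= self.daily_limit_usd:
            self.paused = True  # auto-pause: the agent stops on the next check
            raise BudgetExceeded(
                f"{self.agent_id} hit its ${self.daily_limit_usd:.2f} daily limit"
            )

guard = BudgetGuard("sales-outreach-v2", daily_limit_usd=0.10)
guard.record_call(0.04)      # under the limit
guard.record_call(0.04)      # still under
try:
    guard.record_call(0.04)  # crosses 0.10 -> agent is paused
except BudgetExceeded:
    pass
```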
Why AgentFlare?
AI agents can fail silently in expensive ways. A bug in a loop, a poorly scoped prompt, or an accidental retry storm can rack up hundreds of dollars in API costs in minutes. Existing monitoring tools show you what happened — AgentFlare stops it while it's happening.
| Problem | AgentFlare's answer |
|---|---|
| Runaway agent loops | Auto-pause at your budget threshold |
| No visibility into token spend | Per-call cost tracking in real time |
| Silent failures in production | Slack alert when an agent is paused |
| Hard to integrate | 3-line SDK drop-in for any Python agent |
Core concepts
Agent
An agent is identified by a string agent_id you choose (e.g. "sales-outreach-v2"). Every event, cost total, and config setting is scoped to that ID.
Event
Every LLM call, tool call, agent start, and agent end is an event. Events carry token counts and model name, which AgentFlare uses to calculate dollar cost.
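The dollar cost of an event follows from its token counts and the per-token price of its model. A minimal sketch of that calculation, using a made-up price table (real prices vary by model and provider):

```python
# Hypothetical per-1K-token prices in USD -- illustrative, not real pricing.
PRICES = {
    "gpt-4o":      {"input": 0.0025,  "output": 0.0100},
    "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
}

def event_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single LLM-call event."""
    p = PRICES[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

# 2,000 prompt tokens + 500 completion tokens on gpt-4o:
cost = event_cost_usd("gpt-4o", 2000, 500)  # 0.005 + 0.005 = 0.010
```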
Threshold
A daily spending limit (in USD) you set per agent. When the agent's rolling 24-hour total crosses the threshold, AgentFlare marks it as paused. The SDK checks this flag and stops the agent.
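The threshold check amounts to a rolling sum of event costs over the last 24 hours. A self-contained sketch of the logic, with assumed data shapes (a list of timestamp/cost pairs per agent):

```python
from datetime import datetime, timedelta

def is_over_threshold(events, threshold_usd, now):
    """events: list of (timestamp, cost_usd) tuples for one agent.
    Returns True when the last 24 hours of spend meet the threshold."""
    cutoff = now - timedelta(hours=24)
    total = sum(cost for ts, cost in events if ts >= cutoff)
    return total >= threshold_usd

now = datetime(2025, 1, 2, 12, 0)
events = [
    (now - timedelta(hours=30),  5.00),  # outside the 24h window, ignored
    (now - timedelta(hours=2),   3.00),
    (now - timedelta(minutes=5), 2.50),
]
is_over_threshold(events, threshold_usd=5.00, now=now)  # True: 5.50 >= 5.00
```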
Pause / Resume
Agents can be paused automatically (by threshold) or manually (via dashboard or API). Pausing is reversible — resume from the dashboard or via POST /config/pause.
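A pause or resume call is a plain HTTP POST. Only the POST /config/pause path comes from the text above; the base URL and the request body (an agent_id plus a paused flag that toggles the state) are assumptions for illustration. This sketch builds the request without sending it:

```python
import json
from urllib.request import Request

API_BASE = "https://api.example.com"  # hypothetical base URL

def build_pause_request(agent_id: str, paused: bool) -> Request:
    """Build (but don't send) a POST /config/pause request.
    The body fields are assumed: 'paused' toggles pause vs. resume."""
    body = json.dumps({"agent_id": agent_id, "paused": paused}).encode()
    return Request(
        f"{API_BASE}/config/pause",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_pause_request("sales-outreach-v2", paused=False)  # resume the agent
```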