Introducing dart_agent_core — An Agent Framework That Runs on Your Phone

We needed agents on the client side

When we started building Memex, we knew early on that a single LLM call wouldn't cut it. Organizing someone's life records — text, photos, voice — into structured cards, extracting knowledge, and discovering cross-record insights requires multiple specialized agents working together, each with its own tools, memory, and decision loop.

The problem: every serious agent framework assumes a server. LangChain, CrewAI, AutoGen — they're all Python or TypeScript, designed to run on a backend. They expect a process that stays alive, a filesystem they control, and network access they manage.

But Memex runs on your phone. There is no backend. Your data never leaves the device. We needed an agent framework that works under these constraints — one that runs in Dart, inside a Flutter app, with no Python runtime, no Node.js sidecar, no cloud orchestration layer.

Nothing like that existed. So we built dart_agent_core.

Design principles

We didn't set out to build a general-purpose framework. We built what Memex needed, then realized it was general enough to be useful on its own. A few principles guided the design:

The agent loop is the core abstraction. At its heart, an agent is a loop: receive input → call LLM → maybe call tools → feed results back → repeat until done. Everything else — skills, planning, memory, sub-agents — layers on top of this loop without breaking it. If you understand the loop, you understand the framework.

Tools are just functions. Wrap any Dart function — sync or async — as a tool with a JSON Schema definition. The framework handles argument parsing, execution, and result feeding. No special base classes, no decorators, no ceremony.
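As a rough sketch of what this looks like in practice — note that the `Tool` constructor and its parameter names here are illustrative assumptions based on the description above, not the library's confirmed API:

```dart
import 'dart:convert';

// Illustrative only: `Tool` and its named parameters are assumptions.
// The point is that the handler is a plain Dart function.
Future<String> searchNotes(Map<String, dynamic> args) async {
  final query = args['query'] as String;
  // ...query local storage here...
  return jsonEncode({'query': query, 'results': <String>[]});
}

final searchTool = Tool(
  name: 'search_notes',
  description: "Search the user's local notes by keyword.",
  parameters: {
    'type': 'object',
    'properties': {
      'query': {'type': 'string', 'description': 'Keyword to search for'},
    },
    'required': ['query'],
  },
  handler: searchNotes, // any sync or async Dart function
);
```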

State is explicit and persistent. AgentState tracks conversation history, token usage, active skills, plan steps, and custom metadata. FileStateStorage persists it to disk as JSON. When the app restarts, the agent picks up where it left off. This matters on mobile — apps get killed, users switch away, phones restart.
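Resuming after a restart might look like the following sketch. The class names come from the text above; the constructors and method signatures are assumptions:

```dart
// Hypothetical wiring: exact constructors and method names are
// assumptions based on the classes described above.
final storage = FileStateStorage(directory: appDocumentsDir);

// Load persisted state if it exists, or start fresh.
final state = await storage.load('card_agent') ?? AgentState();

final agent = Agent(
  llm: client,
  state: state,
  storage: storage, // state is written to disk as JSON
);
// After the OS kills the app, the next launch reloads the same
// conversation history, token counts, and plan steps.
```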

Skills are composable. A Skill bundles a system prompt and tools under a name. Skills can be always-on or dynamically toggled by the agent at runtime. This keeps the context window focused — the agent activates what it needs for the current task and deactivates the rest.
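A Skill declaration might look roughly like this; the field names are illustrative assumptions:

```dart
// Illustrative: field names are assumptions based on the description.
final calendarSkill = Skill(
  name: 'calendar',
  systemPrompt: 'You can read and create events in the local calendar.',
  tools: [listEventsTool, createEventTool],
  alwaysOn: false, // the agent activates it only when a task needs it
);
```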

No opinions on the LLM. The framework provides a unified LLMClient interface. We ship implementations for OpenAI, Google Gemini, Anthropic Claude, and every OpenAI/Anthropic-compatible provider. The agent doesn't know or care which model it's talking to.

What it actually does

Multi-provider support. One interface, many backends. OpenAI, Gemini, Claude, Bedrock, Kimi, Qwen, Doubao, Zhipu, MiniMax, Ollama, OpenRouter — all work through the same LLMClient abstraction. Switch providers by changing one line.
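Conceptually, switching providers is one constructor swap behind the shared interface. The concrete client class names below are assumptions; everything downstream depends only on LLMClient:

```dart
// Illustrative: concrete client class names are assumptions.
final LLMClient client = OpenAIClient(apiKey: apiKey, model: modelName);
// final LLMClient client = GeminiClient(apiKey: apiKey, model: modelName);
// final LLMClient client = ClaudeClient(apiKey: apiKey, model: modelName);

final agent = Agent(llm: client, tools: tools);
```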

Streaming. runStream() yields fine-grained events — model chunks, tool call requests, tool results, retries. In Flutter, this means you can update the UI character by character as the model responds. No polling, no waiting for the full response.
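In a Flutter widget, consuming that stream might look like this sketch. The event class names are assumptions modeled on the event kinds listed above:

```dart
// Sketch: event class names are assumptions; the stream-of-events
// shape follows the runStream() description.
var buffer = '';
await for (final event in agent.runStream(userInput)) {
  if (event is ModelChunkEvent) {
    setState(() => buffer += event.text); // incremental UI update
  } else if (event is ToolCallEvent) {
    debugPrint('calling tool: ${event.toolName}');
  } else if (event is ToolResultEvent) {
    debugPrint('tool finished: ${event.toolName}');
  }
}
```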

Sub-agent delegation. Register named sub-agents or clone the current agent with a clean context. The parent dispatches tasks via a built-in delegate_task tool. Each worker runs in its own isolated state. In Memex, the orchestrator agent delegates to specialized agents for card generation, knowledge extraction, and insight discovery.
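A rough shape for registering workers, with an assumed registration API:

```dart
// Illustrative: the registration method is an assumption.
final orchestrator = Agent(llm: client)
  ..registerSubAgent('card_generation', cardAgent)
  ..registerSubAgent('insight_discovery', insightAgent);
// The model can now call the built-in delegate_task tool with a
// worker name and task description; each worker runs with its own
// isolated AgentState.
```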

Planning. Enable PlanMode and the agent gains a write_todos tool to maintain a step-by-step task list. Each step has a status — pending, in progress, completed, cancelled. The host app can observe plan changes via controller hooks and render progress in the UI.
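Rendering plan progress in the host app might look like this sketch, with an assumed hook name:

```dart
// Illustrative hook: the exact controller API is an assumption.
controller.onPlanChanged((plan) {
  for (final step in plan.steps) {
    // step.status is one of: pending, inProgress, completed, cancelled
    debugPrint('${step.status}: ${step.title}');
  }
});
```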

Context compression. Long-running sessions accumulate tokens. LLMBasedContextCompressor summarizes old messages into episodic memories when the count exceeds a threshold, keeping recent messages intact. The agent can retrieve original messages via a built-in retrieve_memory tool when the summary isn't detailed enough.
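Configuring the compressor might look like this; the parameter names are assumptions based on the threshold behavior described:

```dart
// Illustrative configuration; parameter names are assumptions.
final compressor = LLMBasedContextCompressor(
  llm: summaryClient,      // a cheaper model is fine for summaries
  triggerMessageCount: 40, // compress once history exceeds this
  keepRecentMessages: 10,  // never summarize the latest turns
);
final agent = Agent(llm: client, contextCompressor: compressor);
```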

Loop detection. Agents sometimes get stuck — calling the same tool with the same arguments in a loop. DefaultLoopDetector catches repeated identical calls and can run periodic LLM-based diagnosis for subtler patterns.

Controller hooks. AgentController provides interception points around every major step. Observe events (pub/sub style) or intercept requests to approve or block actions. In Memex, this is how we implement user confirmation before destructive operations.
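A user-confirmation hook might be sketched like this. The hook and result names are assumptions modeled on the approve/block behavior described; `delete_card` is a hypothetical tool name:

```dart
// Sketch of an interception hook; names are assumptions.
controller.onBeforeToolCall((call) async {
  if (call.toolName == 'delete_card') {
    final confirmed = await showConfirmDialog('Delete this card?');
    return confirmed ? HookResult.allow : HookResult.block;
  }
  return HookResult.allow;
});
```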

File-system Skills with JavaScript execution. Beyond pure Dart Skills, agents can discover and load Skills from SKILL.md files on the local filesystem. With a JavaScript runtime configured, these Skills can execute JS scripts — including fetch() for HTTP requests. This opens up extensibility without recompiling the app.

Why Dart, specifically

This is the question we get most. The answer is straightforward: Memex is a Flutter app. Flutter runs Dart. If the agent framework is also Dart, there's zero bridging cost — no FFI, no platform channels, no serialization between languages. The agent runs in the same process as the UI, shares the same memory, and can directly call any Dart function as a tool.

Dart is also surprisingly well-suited for this kind of work. It has strong async/await support, streams for real-time event handling, and good enough performance for orchestration logic. The heavy lifting — the actual inference — happens on the LLM provider's side. The framework just needs to manage the loop, and Dart handles that cleanly.

How Memex uses it

Every built-in agent in Memex runs on dart_agent_core:

  • Card Agent — takes raw input and generates the right type of structured timeline card
  • PKM Agent — extracts knowledge and files it using P.A.R.A. methodology
  • Insight Agent — discovers patterns across records and generates visualizations
  • Comment Agent — adds contextual AI commentary to cards
  • Memory Agent — summarizes and consolidates long-term memories
  • Super Agent — orchestrates the others, deciding what to run and when

Each agent has its own model configuration, skills, and tools. They communicate through an event bus — when a card is created, the insight agent picks it up; when knowledge is extracted, the memory agent processes it.

The custom agent system in Memex — where users build their own agents with event-driven triggers, custom prompts, and JavaScript execution — is also powered by the same framework. User-created agents are first-class citizens with the same capabilities as built-in ones.

Open source

dart_agent_core is published on pub.dev under the MIT license. It's a standalone library — you don't need Memex to use it. If you're building a Flutter app that needs AI agent capabilities, it's ready to go:

dependencies:
  dart_agent_core: ^1.0.6

The GitHub repo includes examples for every feature — basic tool use, streaming, persistent state, planning, skills, sub-agents, controller hooks, and provider-specific setups for OpenAI, Gemini, Claude, Kimi, Qwen, and more.

We built this because we needed it. We open-sourced it because we think the ecosystem needs it too. There should be a way to run real AI agents on mobile devices — not toy demos, but production agents with tool use, memory, planning, and multi-agent coordination. That's what dart_agent_core is.

If you're building something similar, we'd love to hear about it.