Building Custom AI Agents on Your Phone — No Server Required

Most AI agent frameworks assume you have a server. LangChain, CrewAI, AutoGen — they run on a backend with a Python runtime, persistent processes, and network access they control. That makes sense for enterprise workflows. It does not make sense if you want an agent that runs on your phone, works with your personal data, and does not require infrastructure you have to maintain.

Memex includes a custom agent system that runs entirely on your device. Every built-in agent — knowledge extraction, card generation, insight discovery, memory summarization — runs on this same infrastructure. And that infrastructure is fully open to you. You can create agents with the same capabilities as the built-in ones.

This post explains what you can build, how the system works, and where the limits are.

What you can build

A custom agent in Memex is not a chatbot. It is an autonomous process that activates on events, processes data using an LLM, and takes actions. Here are some concrete examples:

  • A daily summary agent that triggers at the end of each day, reads your records from the past 24 hours, and generates a structured daily review card.
  • A health tracking agent that watches for records mentioning exercise, sleep, or meals, extracts the relevant metrics, and maintains a running health log in its working directory.
  • A web research agent that uses JavaScript fetch() to call external APIs when you record a question or topic you want to explore.
  • A translation agent that triggers on new records and produces a translated version in a second language, filed alongside the original.
  • A multi-step workflow where Agent A extracts entities from your record, Agent B enriches them with external data, and Agent C generates a formatted summary — all chained together with dependency ordering.

Creating an agent

The setup is straightforward. In Memex, go to the agent configuration screen. You define:

  • Name — what you call the agent.
  • Host type — Pure mode for custom agents.
  • Trigger event — when the agent activates. Options include: on user input, after knowledge extraction, on card creation, on insight generation, or any system event.
  • System prompt — the instructions that shape the agent's behavior, personality, and output format.
  • Model — each agent can use a different LLM provider and model.
  • Working directory — a scoped filesystem where the agent can read and write files.

That is enough to create a basic agent. No code required for this part. The agent will activate on its trigger event, process the input using the LLM with your system prompt, and produce output.
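Putting those fields together, a configuration might look something like this. The shape and field names below are illustrative assumptions, not Memex's actual schema:

```json
{
  "name": "daily-summary",
  "hostType": "pure",
  "trigger": "user_input",
  "systemPrompt": "Summarize today's records into a structured daily review.",
  "model": { "provider": "anthropic", "name": "claude-sonnet" },
  "workingDirectory": "agents/daily-summary/"
}
```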

Adding skills with SKILL.md

For more sophisticated behavior, agents use the open Agent Skills standard. Each agent can have a SKILL.md file — a structured document that defines instructions, available tools, and resources the agent can use.

Skills are stored in a directory that the agent discovers and loads on demand. This means you can add new capabilities to an agent by dropping a skill file into its directory, without modifying the agent configuration itself.

The Agent Skills standard was originally developed by Anthropic for packaging agent capabilities. Memex adopts it as the native format for both built-in and custom agents.
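To make the format concrete, here is a minimal sketch of a SKILL.md. The frontmatter fields follow the Agent Skills convention of a name and description; the instructions body is free-form and entirely illustrative:

```markdown
---
name: daily-summary
description: Summarizes the day's records into a structured review card.
---

# Daily Summary

When triggered at the end of the day:

1. Read all records created in the past 24 hours.
2. Group them by topic.
3. Produce a review card with sections: Highlights, Tasks, Open Questions.
```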

JavaScript execution

This is where custom agents get genuinely powerful. Skills can include JavaScript code that the agent executes locally on your device. The JavaScript runtime supports fetch() for HTTP requests, which means your agent can:

  • Call external APIs — weather data, stock prices, news feeds, translation services.
  • Transform data — parse JSON responses, format text, compute statistics.
  • Scrape web content — fetch a URL and extract relevant information.

All of this runs locally. The JavaScript executes on your phone, not on a remote server. The fetch() calls go directly from your device to the target API. Memex does not proxy or log these requests.
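As a sketch of what a skill script might look like, the example below separates the network call from a pure formatting step. The API URL and response shape are made up for illustration; only fetch() itself is part of the documented runtime:

```javascript
// Pure helper: turn a parsed API response into a one-line summary.
// The response shape ({ temp_c, condition }) is a made-up example.
function formatWeather(data) {
  return `${data.condition}, ${data.temp_c}°C`;
}

// Network step: runs in the agent's sandboxed runtime via fetch().
// The URL is a placeholder, not a real service.
async function lookupWeather(city) {
  const res = await fetch(`https://api.example.com/weather?city=${encodeURIComponent(city)}`);
  if (!res.ok) throw new Error(`weather API returned ${res.status}`);
  return formatWeather(await res.json());
}
```

Keeping the formatting logic pure makes the skill easy to test without hitting the network.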

Inter-agent workflows

Agents can depend on each other using the dependsOn configuration. When Agent A finishes, Agent B starts. This lets you build multi-step pipelines:

  • Agent A extracts structured data from a voice memo.
  • Agent B enriches the data with external context via a JavaScript API call.
  • Agent C generates a formatted card from the enriched data.

Each agent in the chain runs in its own isolated state with its own model configuration. The output of one agent becomes available to the next through the event bus.
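A chain like this might be declared with dependsOn in each agent's configuration. The field names below are illustrative, not Memex's actual schema:

```json
[
  { "name": "extract-entities" },
  { "name": "enrich-entities", "dependsOn": ["extract-entities"] },
  { "name": "generate-summary", "dependsOn": ["enrich-entities"] }
]
```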

Agents can run synchronously (blocking, inline with the user flow) or asynchronously (queued as background tasks). Async agents also support auto-retry with a configurable maximum retry count, which is useful for agents that call unreliable external APIs.
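The auto-retry behavior can be approximated in plain JavaScript. This is an illustrative sketch of the pattern, not the actual dart_agent_core implementation:

```javascript
// Run an async task, retrying up to maxRetries times on failure.
// Rethrows the last error if every attempt fails.
async function withRetry(task, maxRetries) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}
```

A real implementation would typically add a delay between attempts (often with exponential backoff) so a struggling API is not hammered.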

How it works under the hood

The execution flow is the same for built-in and custom agents:

  • A system event fires (user input, card created, insight generated, etc.).
  • The event bus dispatches to all agents subscribed to that event type.
  • Each agent loads its SKILL.md and system prompt.
  • The LLM processes the event with the agent's available tools.
  • The agent executes actions — file I/O, JavaScript, fetch.
  • Results flow to downstream dependent agents and are presented to the user.

The entire pipeline runs on the dart_agent_core framework, which we built specifically for mobile agent execution. It handles state persistence (so agents survive app restarts), loop detection (so agents do not burn through your API budget), and context compression (so long-running agents do not overflow the context window).
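Conceptually, the dispatch step is a plain publish/subscribe bus: agents register for event types, and each event fans out to every subscriber. The sketch below is a simplification in JavaScript; dart_agent_core's actual implementation is Dart and handles persistence, retries, and dependency ordering on top of this:

```javascript
// Minimal event bus: agents subscribe to event types; dispatch fans out
// the event to every subscriber and collects their results.
class EventBus {
  constructor() {
    this.subscribers = new Map(); // eventType -> [handler, ...]
  }

  subscribe(eventType, handler) {
    if (!this.subscribers.has(eventType)) this.subscribers.set(eventType, []);
    this.subscribers.get(eventType).push(handler);
  }

  dispatch(eventType, payload) {
    const results = [];
    for (const handler of this.subscribers.get(eventType) ?? []) {
      results.push(handler(payload)); // each agent processes independently
    }
    return results;
  }
}
```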

What you cannot do (yet)

The custom agent system is powerful but has limits:

  • Agents cannot modify the UI directly. They produce data that the existing UI renders. Custom card templates are on the roadmap but not available yet.
  • There is no agent marketplace yet. You cannot share or install agents from other users with one tap. This is planned as part of the Extension Market roadmap item.
  • Agents run on the phone, so they are constrained by phone resources — battery, memory, and thermal limits. Long-running agents with many LLM calls will drain battery faster.
  • The JavaScript runtime is sandboxed. It supports fetch() and basic data processing but not arbitrary Node.js modules.

Getting started

If you want to try building a custom agent:

  • Make sure you have Memex installed and a model provider connected. If not, read the getting started guide first.
  • Go to the agent configuration screen and create a new agent with Pure mode.
  • Start simple — a system prompt that summarizes your daily records, triggered on user input. No JavaScript, no dependencies.
  • Once that works, add a working directory and try file read/write operations.
  • Then add JavaScript skills for external API calls if you need them.

The Agent Skills documentation explains the SKILL.md format in detail. The Memex source code includes the built-in agent implementations as reference examples.

For the technical architecture behind the agent system, read our post on what it takes to run AI agents on a phone. For the framework itself, see the dart_agent_core introduction.


FAQ

Do I need to know how to code to create custom agents?

No. The basic agent setup — name, trigger event, system prompt, and model — requires no coding. If you want agents that execute JavaScript or call external APIs, you will need some programming knowledge to write the skill scripts.

What can custom agents do?

Custom agents can read and write files in their working directory, execute JavaScript code including HTTP requests via fetch(), respond to system events like user input or card creation, and chain together with other agents using dependency ordering.

Are custom agents as powerful as built-in agents?

Yes. Custom agents plug into the same event bus, use the same tool system, and have the same capabilities as the built-in agents. The infrastructure is identical.

What is the Agent Skills standard?

Agent Skills is an open standard for packaging agent capabilities. Each skill is a folder containing a SKILL.md file — a structured document of instructions — along with any scripts and resources the agent needs. The standard was originally developed by Anthropic and is used by Memex for both built-in and custom agents.