Memfog — persistent memory for AI coding agents (Claude Code, Cursor, Codex)

Published 2026-06-28 · Updated 2026-06-28 · 22 min read · ~5,125 words · Memfog · memfog.com

tl;dr — the whole post in six bullets

Memfog is a persistent memory layer for AI coding agents — Claude Code, Cursor, Codex, Gemini CLI, OpenCode, Cline. It quietly captures what the agent does in every session (prompts, file edits, terminal output, decisions) and surfaces the right context the next time you sit down to work.
Auto-captured by design. Drop-in hooks for each supported agent. No manual saving, no "press a button to save this", no prompts to copy-paste. The capture runs as a background process; the surfacing runs at the moment of next-session context-load.
Sub-second BM25 search across thousands of sessions. Type the words you remember in plain language and the platform returns the matching captured events, ranked by relevance. Vector and graph retrieval are on the roadmap; BM25 is the substrate today.
Per-user isolation in the cloud sync — your account, your data, no shared corpus, no cross-tenant retrieval surface. Local-only mode is available for the desktop client when the customer needs the memory to never leave the machine.
Apache-licensed source code. The customer can read the platform's implementation, run an audit, fork the code, deploy a private build for high-compliance environments. The license discipline matters because the customer's coding-session content is high-sensitivity data.
Free during beta. Three steps from signup to memory: create an account, run `npx memfog connect claude-code` (or the equivalent for your agent), code as usual. Built with Rust + React.

#The setup: every coding agent forgets your project the moment the session ends

There is a small but constant friction in the daily workflow of every developer who has standardised on an AI coding agent — Claude Code, Cursor, the OpenAI Codex CLI, Gemini CLI, OpenCode, Cline, or any of the dozen others that have shipped in the past eighteen months. The agent is excellent inside a single session. The agent reads the file you point it at, navigates the project context the IDE surfaces, runs the terminal commands you approve, and produces the change you wanted. The agent is also, structurally, amnesiac the moment the session ends. The next session starts from scratch.

For small one-off edits, the amnesia is invisible — the next session doesn't need to remember anything because the next session is a different task. For real production work, where the developer is working on the same problem space across many sessions over days or weeks, the amnesia is a constant operating tax. The developer spends the first three minutes of every new session re-establishing the context the agent had perfectly internalised at the end of the previous session: which files matter, which decisions have already been made, which approaches have already been tried and rejected, which terminal commands have already been run, which test failures have already been investigated.

The cost compounds across the year. A developer running ten sessions a day spends thirty minutes a day on context re-establishment that did not have to happen. Across a year of full-time use (assuming about 240 working days), that's roughly 120 working hours, or three 40-hour weeks, of pure re-establishment overhead. The cost is also qualitatively worse than the time suggests: re-establishment is the kind of fiddly low-attention work that erodes the developer's flow state, and it primes the agent with a thinner context than the agent had at session-end.
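The overhead figure can be reproduced with a few lines of arithmetic. The three minutes per session and ten sessions per day come from the text; the 240 working days per year and the 40-hour week are assumptions used here to arrive at the same totals.

```python
# Back-of-envelope check of the context re-establishment overhead.
MINUTES_PER_SESSION = 3      # from the text: first three minutes of each session
SESSIONS_PER_DAY = 10        # from the text
WORKING_DAYS_PER_YEAR = 240  # assumption: roughly 48 five-day weeks
HOURS_PER_WORK_WEEK = 40     # assumption

minutes_per_day = MINUTES_PER_SESSION * SESSIONS_PER_DAY
hours_per_year = minutes_per_day * WORKING_DAYS_PER_YEAR / 60
weeks_per_year = hours_per_year / HOURS_PER_WORK_WEEK

print(minutes_per_day, hours_per_year, weeks_per_year)  # 30 120.0 3.0
```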

The structural fix is persistent memory. Capture what the agent did in the previous session, surface the relevant subset at the start of the next session, let the developer skip the re-establishment ritual entirely. The fix is not new conceptually — every developer who has worked seriously with AI coding agents has tried some variant of it, usually by writing a small script that dumps the session state to a file or by manually copy-pasting context summaries between sessions. The fix has never been productised cleanly for the multi-agent landscape that has emerged in 2026.

Memfog exists because the founders watched their own engineers work this way across the Ollasoftware portfolio, watched a growing crowd of developers in the broader market work this way against the same multi-agent stack, and concluded that the right move was to ship the memory layer as the product. The bet was simple: drop-in hooks for every major agent, auto-capture by default, sub-second recall, the customer's data in the customer's account, an open-source codebase the customer can audit.

#What Memfog actually is, in one paragraph and then in detail

Memfog is a cloud-operated persistent memory layer that runs as a managed SaaS service with a desktop client and a small server component. The mental model is a background process that listens to the events your AI coding agent produces — prompts you wrote, files the agent edited, terminal output it produced, decisions it made — captures them into a structured event log scoped to your account, and exposes a search-and-recall surface that the next session can read from. There is nothing to install on the agent side beyond a small hook file the connect command drops into the agent's standard hook directory; the memory infrastructure runs on Ollasoftware's own servers (or on the customer's machine in the local-only mode the desktop client supports).

Inside the platform there are three composable layers. The capture layer hooks into each supported agent at the agent's designated hook points — Claude Code's `~/.claude/hooks/` directory, Cursor's extension surface, the Codex CLI's session hooks, the Gemini CLI's session lifecycle, OpenCode's tool-invocation surface, Cline's message surface. The hook fires on the events that matter (session start, session end, prompt submission, file edit, terminal command, agent response) and forwards the structured event into the capture pipeline. The capture is auto-typed — the platform knows which event is which kind of work product and stores it with the right metadata.
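To make "structured event" concrete, here is a minimal sketch of what one captured event might look like. The event kinds mirror the list above; the class name, field names, and JSON shape are hypothetical and are not Memfog's actual schema.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from enum import Enum
import json


class EventKind(str, Enum):
    # Event kinds taken from the list in the text.
    SESSION_START = "session_start"
    SESSION_END = "session_end"
    PROMPT = "prompt"
    FILE_EDIT = "file_edit"
    TERMINAL_COMMAND = "terminal_command"
    AGENT_RESPONSE = "agent_response"


@dataclass
class CapturedEvent:
    # All field names are illustrative assumptions.
    agent: str
    kind: EventKind
    project_path: str
    body: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        # EventKind mixes in str, so it serialises as its plain string value.
        return json.dumps(asdict(self))


event = CapturedEvent("claude-code", EventKind.FILE_EDIT, "/repo", "edited auth.py")
print(event.to_json())
```

Typing the event at capture time is what lets a later stage apply per-kind handling without re-parsing the payload.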

The storage and indexing layer holds the captured events in a per-account corpus, indexed for sub-second BM25 retrieval against the conventional text-search workload that "find the session where I worked on the auth migration" produces. The corpus is per-account-isolated; one customer cannot retrieve another customer's events, even though both customers share the platform's infrastructure. The cloud sync is the canonical deployment; for customers whose work is high-sensitivity, the desktop client runs in local-only mode where the corpus never leaves the customer's machine.

The recall layer is the surface the next session reads from. When the developer starts a new session with the agent, the hook runs the recall query (driven by the project path, the current branch, the recent file activity, optional explicit context the developer provides), gets back the relevant subset of past events, and feeds the subset into the agent's initial context window. The developer does not have to know the recall layer is running; it surfaces context as if the agent had remembered.
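One plausible way to turn those session-start signals into a keyword query is sketched below. The signal names follow the text (project path, branch, recent file activity, optional explicit context); the `SessionSignals` type, the `build_recall_query` function, and the choice to use file stems are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class SessionSignals:
    project_path: str
    branch: str
    recent_files: list[str] = field(default_factory=list)
    explicit_context: str = ""


def build_recall_query(signals: SessionSignals, max_files: int = 5) -> str:
    """Combine session-start signals into one keyword query string."""
    terms = [PurePosixPath(signals.project_path).name, signals.branch]
    # Use file stems so "src/auth/jwt_verifier.py" contributes "jwt_verifier".
    terms += [PurePosixPath(f).stem for f in signals.recent_files[:max_files]]
    if signals.explicit_context:
        terms.append(signals.explicit_context)
    return " ".join(t for t in terms if t)


signals = SessionSignals(
    project_path="/home/dev/billing",
    branch="auth-migration",
    recent_files=["src/auth/jwt_verifier.py"],
)
print(build_recall_query(signals))  # billing auth-migration jwt_verifier
```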

Operationally, the platform is in beta as of mid-2026 and is free during the beta period. The team is explicit about the stage: pre-revenue, with production pricing not yet decided, and building in public against a roadmap published on the brand site. The customer joining now gets the full feature surface, the free billing, and an early influence on the roadmap; the customer joining at GA gets a more mature product at whatever pricing the team has decided on.

#Auto-captured by design: hooks for every major coding agent

The decision to auto-capture rather than ask the developer to opt in per session is the architectural decision that distinguishes the platform from the manual-save scripts most developers have tried before. The hook-based capture runs continuously in the background; the developer does not press a save button, does not approve each event, does not paste context summaries between sessions. The events arrive in the capture pipeline by virtue of the agent producing them.

For each supported agent, the hook surface is the agent's designated extension point. Claude Code's `~/.claude/hooks/` directory accepts pre-prompt, post-prompt, pre-tool-use, post-tool-use, session-start, and session-end hooks that fire on the events of the same name. The platform's connect command drops the appropriate hook scripts into the directory and they take effect on the next Claude Code invocation. The customer can audit the hook scripts before approving them; the scripts are short, plain shell with a clear handoff to the platform's capture endpoint.

Cursor's extension surface accepts a similar set of hooks through Cursor's extension API. The Codex CLI and the Gemini CLI both expose session-lifecycle hooks. OpenCode and Cline expose the tool-invocation and message surfaces. Across all six, the hook integration is approximately the same shape: a small script that fires on the agent's events and forwards them to the platform's capture endpoint, with the API key the customer minted at signup.

The auto-capture pattern is opt-out rather than opt-in at the session level. Once the hook is installed, every session captures unless the customer explicitly suppresses capture for a session (by setting the `MEMFOG_CAPTURE=off` environment variable for that session, or by toggling capture in the dashboard for a defined window). The opt-out model is the right one for the use case because the moments the developer most needs the captured context are exactly the moments the developer is least likely to have remembered to enable capture — the bug investigation that turns out to be load-bearing later, the architecture conversation that ends up being referenced for the next three months.
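The per-session opt-out could be checked along these lines. The `MEMFOG_CAPTURE=off` variable comes from the text; the function name and the case-insensitive, whitespace-tolerant comparison are assumptions about how a hook might read it.

```python
import os


def capture_enabled(environ=None) -> bool:
    """Return False only when MEMFOG_CAPTURE is explicitly set to "off"."""
    env = os.environ if environ is None else environ
    # Opt-out semantics: an unset or unrecognised value leaves capture on.
    return env.get("MEMFOG_CAPTURE", "").strip().lower() != "off"


print(capture_enabled({}))                         # True
print(capture_enabled({"MEMFOG_CAPTURE": "off"}))  # False
```

Defaulting to "on" for any value other than `off` matches the opt-out model described above: forgetting to configure anything never loses a session.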

The capture pipeline does some basic filtering at write time. Events that are clearly not useful for future recall — empty completions, single-character edits, terminal commands that produce no output — are dropped at the hook level before they reach the platform. The filtering is conservative enough that almost everything genuinely useful survives; the storage cost of capturing a slightly noisy corpus is meaningfully lower than the operational cost of debugging a recall that should have surfaced an event that was filtered out at write time.
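A conservative write-time filter of this kind might look like the sketch below. The three drop rules are the ones named above; the event dictionary keys (`kind`, `body`, `diff`, `output`) and the exact thresholds are hypothetical.

```python
def is_worth_capturing(event: dict) -> bool:
    """Conservative write-time filter: drop only clearly empty events."""
    kind = event.get("kind")
    if kind == "agent_response":
        return bool(event.get("body", "").strip())    # drop empty completions
    if kind == "file_edit":
        return len(event.get("diff", "")) > 1         # drop single-character edits
    if kind == "terminal_command":
        return bool(event.get("output", "").strip())  # drop commands with no output
    return True  # anything unrecognised is kept


events = [
    {"kind": "agent_response", "body": ""},
    {"kind": "file_edit", "diff": "+ retry(send_email)"},
    {"kind": "terminal_command", "output": ""},
]
kept = [e for e in events if is_worth_capturing(e)]
print(len(kept))  # 1
```

The final `return True` is the conservative part: when in doubt, the event is kept.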

#Sub-second BM25 search across thousands of sessions

The recall surface is the part of the platform the developer interacts with most directly. The substrate today is BM25 — the canonical text-search ranking function that powers most of the production search engines the developer has ever used — applied over the captured event corpus with the appropriate per-event-type tuning.

The choice to ship BM25 first rather than starting with vector search is a deliberate one and worth being explicit about. Vector search is the obvious "we have AI, we should use embeddings" choice for an agent-memory product, and the team has been clear that vector search plus graph retrieval are on the roadmap. BM25 ships first because BM25 is the right substrate for the dominant query shape the developer actually runs against the captured corpus: "find the session where I worked on X." X is a concrete keyword — a function name, a file path, a library name, a specific error message — and BM25's exact-term matching surfaces the relevant session with the precision the use case requires. Vector search would surface the conceptually-similar session, which is the wrong shape of result when the developer wants the specific session.
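For readers who have not met BM25 directly, the following is a minimal, self-contained sketch of Okapi BM25 scoring over a handful of session summaries. It is a textbook version, not Memfog's implementation: the whitespace tokenizer, the `k1` and `b` defaults, the smoothed IDF variant, and the sample sessions are all assumptions, and it expects a non-empty corpus.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in tokenized for term in set(d))

    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # exact-term matching: absent terms contribute nothing
            # Smoothed IDF, which stays non-negative for very common terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


sessions = [
    "added retry middleware to the email service",
    "fixed flaky test in the payment flow",
    "migrated auth to the new jwt_verifier module",
]
scores = bm25_scores("jwt_verifier auth", sessions)
best = max(range(len(sessions)), key=scores.__getitem__)
print(sessions[best])  # migrated auth to the new jwt_verifier module
```

Sessions that do not contain the query terms score exactly zero, which is the precision property the paragraph above relies on.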

The performance target the team has set itself is sub-second retrieval across a corpus of thousands of sessions, and the platform meets that target today on the typical developer workload (a corpus that has accumulated months of daily-use captures). The substrate is fast enough that the recall layer can run at the moment of next-session context-load without the developer noticing the latency; the agent's session starts with the right context already populated, indistinguishable in feel from the agent having remembered the previous session natively.

The query surface is plain language. The developer types the words they remember, such as "the session where I added the retry middleware to the email service", and the platform returns the matching sessions ranked by relevance. The result view lists the captured events for each session (the prompts, the file edits, the terminal output, the decisions) so the developer can drill into the specific event they need. For deeper queries, such as "find every session in the auth-migration branch where I touched the JWT verifier", the same surface accepts the conventional filter syntax (`branch:auth-migration AND file:jwt-verifier`).
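The filter syntax in that example can be read as a conjunction of `key:value` clauses. The sketch below parses only that form; the real grammar is not specified here, so support for `OR`, quoting, or negation, as well as the substring-match semantics, are left out or assumed.

```python
def parse_filters(query: str) -> dict[str, str]:
    """Parse 'key:value AND key:value' into a dict. Only AND is handled."""
    filters = {}
    for clause in query.split(" AND "):
        key, sep, value = clause.strip().partition(":")
        if not sep or not key or not value:
            raise ValueError(f"expected key:value, got {clause!r}")
        filters[key] = value
    return filters


def matches(event: dict, filters: dict[str, str]) -> bool:
    # Substring match on each filtered field; a missing field never matches.
    return all(v in event.get(k, "") for k, v in filters.items())


filters = parse_filters("branch:auth-migration AND file:jwt-verifier")
print(filters)  # {'branch': 'auth-migration', 'file': 'jwt-verifier'}
print(matches({"branch": "auth-migration", "file": "src/jwt-verifier.ts"}, filters))  # True
```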

On the roadmap, vector retrieval and graph retrieval are the two additions that will extend the substrate. Vector retrieval handles the conceptually-similar-session query shape that BM25 cannot serve well. Graph retrieval handles the cross-session-dependency query shape — "which sessions led up to this decision," "which sessions tested the implementation that emerged from this conversation" — that neither BM25 nor vector retrieval can serve well on its own. The team has been explicit that both are coming, that BM25 is the right substrate today for the dominant query shape, and that the additions will compose with BM25 rather than replacing it.


#Your account, your data: per-user isolation and local-only mode

The data ownership posture is the part of the platform that determines whether the developer's trust in it is justified. Coding sessions contain some of the highest-sensitivity content the developer produces — proprietary code, customer data the developer pasted into the agent for debugging, credentials accidentally surfaced in terminal output, internal-only architecture decisions, in-progress work that the developer's employer would not want exposed. The platform's posture on all of this matters more than the recall quality does.

Cloud sync is the default deployment and is per-account-isolated. The captured corpus for one customer is segregated at the storage layer from every other customer's corpus. The recall queries can only retrieve events from the calling customer's corpus; there is no shared cross-tenant retrieval surface, no "find similar code across all Memfog users" feature, no aggregated training signal extracted from customer corpora. The customer's data is the customer's data, period.
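The isolation property can be illustrated with a toy in-memory store in which every read and write takes an account identifier and never looks outside that account's partition. This is only a model of the guarantee described above; the class and method names are hypothetical, and a real deployment would enforce the boundary at the storage and authentication layers.

```python
from collections import defaultdict


class IsolatedCorpus:
    """Toy store where every read and write is scoped to one account."""

    def __init__(self):
        self._events = defaultdict(list)  # account_id -> list of event strings

    def add(self, account_id: str, event: str) -> None:
        self._events[account_id].append(event)

    def search(self, account_id: str, term: str) -> list[str]:
        # Only the caller's own partition is ever consulted.
        return [e for e in self._events.get(account_id, []) if term in e]


corpus = IsolatedCorpus()
corpus.add("alice", "fixed jwt expiry bug")
corpus.add("bob", "rewrote jwt signing")
print(corpus.search("alice", "jwt"))  # ['fixed jwt expiry bug']
```

There is deliberately no method that searches across accounts, which is the code-level equivalent of "no shared cross-tenant retrieval surface".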

For customers whose work is high-sensitivity enough that even encrypted cloud storage is the wrong posture — defence contractors, healthcare developers working on regulated codebases, financial-services teams under strict data-residency commitments — the desktop client runs in local-only mode. In local-only mode, the captured corpus lives entirely on the customer's machine. The cloud sync is disabled. The recall queries run against the local index. The platform never sees the customer's captured events. The trade-off is that the corpus is machine-local — switching machines means starting a new corpus — but for the use case that demands this posture, the trade-off is acceptable.

Apache-licensed source code is the third pillar of the data-ownership story. The customer can read the platform's implementation — see what the capture hooks actually send, see what the storage layer actually retains, see what the recall layer actually returns. The customer can run a security audit against the codebase. The customer can fork the code and deploy a private build for environments where running against the public cloud is the wrong choice. The license discipline matters because trust in a memory layer for high-sensitivity content is not earned by marketing copy; it is earned by letting the customer verify the claims.

#The three-step setup

The onboarding ritual is deliberately short. Three steps, roughly five minutes total. The first step is creating the account at the dashboard — email plus password, email verification, done. There is no qualification wall, no waitlist, no approval queue. The customer is logged in and ready to connect their first agent within ninety seconds of clicking signup.

The second step is the agent connect command. For Claude Code, the command is `npx memfog connect claude-code` and it drops the appropriate hooks into `~/.claude/hooks/`. For the other supported agents, the connect command takes the agent's name as the argument and drops the appropriate hook into the agent's standard hook directory. The customer can audit the hook scripts before they take effect; they are short, plain shell, with a clear handoff to the platform's capture endpoint and the customer's account token.

The third step is to code, ship, and forget. The platform's capture pipeline runs in the background. Every session captures into the customer's account. The next session starts with the right context already loaded, surfaced through the recall layer at the moment of context-load. The developer does not have to think about the platform during normal use; the value shows up as the absence of the re-establishment ritual.

For customers who want more control over the recall behaviour — the developer who wants to explicitly ask "what did I do last week on the payment-flow rewrite" rather than relying on the implicit recall at session start — the platform exposes a CLI and a dashboard search surface. The CLI runs the same queries the implicit recall layer runs and produces the same result set. The dashboard surfaces a more browse-friendly view for the offline exploration use case.

#Which agents Memfog supports

The platform ships hooks for the six AI coding agents that account for the dominant share of the production-developer market as of mid-2026: Claude Code, Cursor, the OpenAI Codex CLI, Gemini CLI, OpenCode, and Cline. Each has its own integration story and its own set of hook points; the platform's connect command handles the differences so the customer does not have to.

Claude Code is the most fully-supported agent and the canonical first deployment for most customers. Anthropic's native hook surface in `~/.claude/hooks/` accepts the full lifecycle of events the platform captures — pre-prompt, post-prompt, pre-tool-use, post-tool-use, session-start, session-end — and the platform fires on each. For developers running Claude Code at production volume, the recall quality is highest here because the agent surfaces the richest event metadata.

Cursor is the next-most-fully-supported. Cursor's extension surface exposes the agent's tool invocations and message events; the platform's extension hooks into both and captures the events with the appropriate metadata. The recall quality is comparable to Claude Code at the moment of capture; the slight difference shows up in the session-lifecycle metadata where Cursor's surface is shallower than Claude Code's.

The OpenAI Codex CLI exposes session-lifecycle hooks that the platform consumes; the capture handles prompts, tool invocations, and terminal output. Gemini CLI is similar in shape. OpenCode and Cline expose their own variants of the same surface and the platform's connect command handles the differences.

For agents the platform does not yet support natively, the team is direct about the path: the customer can build their own hook script against the platform's capture endpoint (which is a simple REST POST surface) and the events flow through the same pipeline as the natively-supported agents. The hook surface specification is documented in the platform's docs and is intentionally simple. As new coding agents emerge — and the rate of new entrants in this category has been roughly one per quarter — the team adds native support based on customer demand. The roadmap explicitly mentions that the bar for adding native support is "a customer has asked for it" rather than "we have a strategic relationship with the agent vendor."
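A custom hook for an unsupported agent would, in outline, wrap each event in a JSON body and POST it with the account's API key. The sketch below only builds the request and does not send it. The endpoint URL, the bearer-token header, and the payload fields are placeholders, since the actual specification lives in the platform's docs.

```python
import json
import urllib.request

# Hypothetical endpoint and auth scheme; consult the platform's docs for the real ones.
CAPTURE_URL = "https://memfog.example/v1/events"


def build_capture_request(event: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST carrying one captured event."""
    return urllib.request.Request(
        CAPTURE_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_capture_request(
    {"agent": "my-agent", "kind": "prompt", "body": "add retry to mailer"},
    api_key="YOUR_API_KEY",
)
print(req.get_method(), req.full_url)
# Sending would be: urllib.request.urlopen(req, timeout=5)
```

A real hook would also want a short timeout and a silent failure path so that a network problem never blocks the agent itself.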

#How Memfog compares to the alternatives

The agent-memory category has emerged in roughly the last twelve months and has a small set of credible options. It is worth being direct about how the platform sits against each.

mem0 is the closest peer for the general-purpose AI-memory category. mem0 is a memory layer for AI agents broadly — not specifically scoped to coding agents — with a broad supported-runtime surface and an active open-source community. Memfog extends past mem0 specifically on the coding-agent surface: the hook integrations are tuned for the six agents the platform targets; the captured event schema is specific to coding-session work product (file edits, terminal output, agent tool invocations) rather than general-purpose conversational events; the recall surface is tuned for the developer's actual query shape. For teams building memory for general-purpose agents, mem0 is the right answer; for teams whose use case is specifically the coding-agent workflow, the platform is the focused alternative.

Letta (formerly MemGPT) is the academic/research-driven alternative that pioneered the long-context virtual-memory pattern for LLMs. Letta is a meaningfully more complex system with a stronger story for very-long-running agent conversations. For the agent-memory category narrowly defined as "give my agent a longer effective context across sessions," Letta is appropriate; for the specific case of "remember what my coding agent did across sessions," the platform's narrower focus produces a better fit.

Cursor's built-in project memory and Cline's built-in context features are the agent-native alternatives. They are competent within the specific agent they ship inside and limited beyond it. The platform's value over the agent-native memory is that the memory is portable across agents — the developer who works in Claude Code in the morning and Cursor in the afternoon has one memory corpus that both agents read from, rather than two separate memories that do not see each other.

Manual scripts — the file-dumping, copy-pasting, custom-hook approaches developers have built before commercial alternatives existed — are the lower-cost alternative. The platform's value over the manual script is the integration breadth across six agents, the search-and-recall surface that is faster and more precise than grepping a directory of dumps, and the cloud-sync that makes the memory survive machine changes. For developers whose workload is exactly one agent and one machine, the manual script may be sufficient; for developers running the multi-agent multi-machine pattern most production developers actually run, the platform is the consolidation.

#The team behind the product

Memfog is built and operated by Ollasoftware, the AI software development company headquartered in Bengaluru that has shipped more than forty AI brands in production over the last four years. The platform is one of the more recent products in the portfolio and is built specifically against the agent-tooling workflows the Ollasoftware engineering team itself runs daily across the broader portfolio. The team is its own first user: the same engineers who ship Crawlcrawl, Aeoniti, and OllaDNS use Memfog for their own coding-session memory, which is one of the reasons the recall quality has reached production-shape within months rather than years.

The platform is built with Rust on the server side and React on the client side. The Rust choice carries the operational discipline the broader Ollasoftware Rust group has accumulated across the sibling products (OllaDNS, 24observe, Qcrawl, Crawlcrawl) — the same async-Rust patterns, the same Postgres + ClickHouse substrate for the storage and indexing layers, the same Caddy front. The React choice on the dashboard side is conventional and inherits the team's design-system work across the portfolio.

The parent group, Networkers Home, is the cybersecurity and networking training institute that has placed more than forty-five thousand alumni across eight hundred hiring partners since 2007. The institutional context matters for the platform because the data-handling posture (per-user isolation, local-only mode, Apache-licensed source code) draws on the practices of an organisation that has operated in a sensitive-data environment for nearly two decades.

#What is on the roadmap

The team publishes the roadmap on the brand site and updates it as work ships. The visible near-term threads are concrete: vector retrieval as the second retrieval substrate alongside BM25 (for the conceptually-similar-session query shape that BM25 cannot serve well); graph retrieval as the third (for cross-session-dependency queries that neither BM25 nor vector can serve well alone); expanded agent support beyond the current six (the bar is "a customer has asked for it"); and a richer event-type catalogue that captures additional agent-specific events the current schema treats generically.

Underneath those visible features is steady investment in the capture pipeline's reliability. The hook surface is the most fragile part of the platform — an upstream agent that changes its hook semantics in a release can silently break the capture for that agent's users — and the team has invested in detection-and-recovery patterns that surface the breakage in the dashboard the day it happens rather than three weeks later when a developer notices a missing session.
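The detection patterns themselves are not described here, but one simple heuristic in that spirit is to flag any connected agent that has gone quiet for longer than expected. The function name, the one-day threshold, and the input shape below are assumptions; silence can also just mean the developer did not use that agent, so a real system would need more signals than this.

```python
from datetime import datetime, timedelta


def stale_agents(
    last_event_at: dict[str, datetime],
    now: datetime,
    max_silence: timedelta = timedelta(days=1),
) -> list[str]:
    """Return connected agents that have sent no events within max_silence."""
    return sorted(a for a, ts in last_event_at.items() if now - ts > max_silence)


now = datetime(2026, 6, 28, 12, 0)
last_seen = {
    "claude-code": now - timedelta(hours=2),
    "cursor": now - timedelta(days=3),
}
print(stale_agents(last_seen, now))  # ['cursor']
```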

On the recall side, the team is investing in the implicit-recall layer that runs at session start. The current heuristic — driven by the project path, the current branch, recent file activity — handles the common case well; the roadmap extends the heuristic to handle the less-common cases (the developer who switches projects mid-session, the developer working across multiple repositories that share context, the developer whose work spans IDE sessions and terminal-only sessions).

The platform is free during the beta. The team has signalled that GA pricing will be designed to be sustainable for individual developers and small teams; the principle the team has stated is that a memory layer is most valuable when the developer has a long-tenured corpus, and a pricing model that disincentivises long-tenured use defeats the purpose. Enterprise contracts for teams that need SSO, RBAC, and the operational ergonomics that come with that are on the roadmap and will be added at GA.

#How to start

If you build software with AI coding agents (Claude Code, Cursor, Codex, Gemini CLI, OpenCode, Cline) and you have noticed that you spend the first few minutes of every new session re-establishing context the agent had perfectly internalised at the end of the previous one, the right next move takes about five minutes. Sign up at memfog.com, run `npx memfog connect` with your agent's name as the argument (for example, `npx memfog connect claude-code`), and start your next coding session normally.

The first session captures into your account. The second session starts with the recall layer surfacing the relevant context from the first. By the third or fourth session, the absence of the re-establishment ritual is noticeable. By the second week of normal use, the recall quality has crossed the threshold where the developer's default expectation becomes "of course the agent remembers." The platform succeeds when the developer stops thinking about it as a feature and starts treating the persistent memory as a property of the agent.

For deeper evaluation, the platform is fully usable during beta with no credit card and no usage caps. The team explicitly invites aggressive feedback during this phase: the customers whose feedback shapes the GA product get the most influence on the roadmap, and the platform's build-in-public posture means the feedback loop is short.

For customers whose work is high-sensitivity enough to require the local-only mode, the desktop client supports the local-only deployment from the same connect command (`npx memfog connect --local-only`). The corpus stays on the machine, the cloud sync is disabled, and the recall layer runs against the local index.

If you would like the team to walk you through a specific deployment — particularly the team-wide deployment for an engineering organisation that wants the platform standardised across multiple developers, or the enterprise integration for an environment with specific data-handling requirements — the Ollasoftware contact page reaches the engineers who built the platform directly.

#FAQs about Memfog

1. What is Memfog?

Memfog is a cloud-operated persistent memory layer for AI coding agents — Claude Code, Cursor, Codex, Gemini CLI, OpenCode, Cline. It captures what the agent does in every session (prompts, file edits, terminal output, decisions) into a per-user corpus, and surfaces the right context at the start of the next session through a sub-second BM25 search layer. Free during beta. Built and operated by Ollasoftware.

2. Which AI coding agents does Memfog support?

Six today: Claude Code (most fully supported, with the richest hook surface in `~/.claude/hooks/`), Cursor (via its extension API), OpenAI Codex CLI, Gemini CLI, OpenCode, and Cline. For agents not yet natively supported, the platform documents a simple REST hook surface that the customer can integrate against; new agents are added based on customer demand.

3. How does the capture work?

Auto-captured by design. The connect command drops a small hook script into the agent's designated hook directory. The hook fires on the agent's lifecycle events (session start/end, pre/post prompt, pre/post tool use) and forwards the structured event into the capture pipeline. The capture runs continuously in the background; the developer does not press a save button. Opt-out per session is available via the `MEMFOG_CAPTURE=off` environment variable.

4. Is the search vector-based or keyword-based?

BM25 (keyword-based) today. The team has been explicit that BM25 is the right substrate for the dominant query shape ("find the session where I worked on X" where X is a specific function name, file path, or error message). Vector retrieval and graph retrieval are on the roadmap as additional substrates for the conceptually-similar-session and cross-session-dependency query shapes that BM25 cannot serve well on its own.

5. How is my coding-session data protected?

Three layers. (1) Per-user isolation in the cloud sync — your captured corpus is segregated at the storage layer; there is no shared cross-tenant retrieval surface and no aggregated training signal extracted from customer corpora. (2) Local-only mode in the desktop client — the corpus never leaves your machine. Trade-off is that the corpus is machine-local rather than synced across machines. (3) Apache-licensed source code — you can read the implementation, audit the data handling, fork and deploy a private build for environments where the public cloud is the wrong choice.

6. How does Memfog compare to mem0, Letta and the built-in memory features in Cursor/Cline?

mem0 is a general-purpose agent-memory layer; Memfog is specifically tuned for coding-session work product with hook integrations for the six major coding agents. Letta (MemGPT) is the academic-driven long-context virtual-memory pattern; appropriate for very-long-running agent conversations but heavier than the specific case of "remember what my coding agent did across sessions." Cursor and Cline ship built-in memory features that work inside their own agents; Memfog is portable across agents — one corpus that Claude Code in the morning and Cursor in the afternoon both read from.

7. How much does Memfog cost?

Free during beta — no credit card, no usage caps. At GA the platform will introduce a pricing model designed to be sustainable for individual developers and small teams; the team has stated explicitly that the model will not disincentivise long-tenured corpora, since the memory layer is most valuable when the corpus has accumulated over time. Enterprise contracts for teams requiring SSO, RBAC, and operational ergonomics will be added at GA.

8. Who is behind Memfog?

Memfog is built and operated by Ollasoftware, the Bengaluru-headquartered AI software development company. The platform is built with Rust on the server side and React on the client side, and shares the operational substrate (async-Rust, Postgres + ClickHouse, Caddy) with the broader Rust portfolio (OllaDNS, 24observe, Qcrawl, Crawlcrawl). The Ollasoftware engineering team uses the platform daily for its own coding-session memory. The parent group is Networkers Home, the cybersecurity and networking training institute founded in 2007 with 45,000+ alumni placed across 800+ hiring partners.