Designing AI Agent Memory with Six Layers: The Implementation Cost of the ‘Memory OS’ That Eliminates Key-Person Dependency
“I don’t remember last week’s instructions”: An AI whose memory works like dementia
A common complaint from small and medium-sized enterprises that have introduced AI agents is:
“You’ve already forgotten what I instructed last week.”
This is no laughing matter. Most current AI agents reset their memory the moment a session ends, so every conversation starts from scratch. It is as if a veteran employee quit without a handover, every single day.
Companies adopt AI to reduce key-person dependency (work that only one person knows how to do), yet the AI itself creates a new form of it: knowledge that lives only inside a single session. It’s an ironic situation.
An open-source framework has emerged to tackle this issue: the “Memory OS” built on the Hermes Agent—a system that provides AI agents with a six-layer memory structure to implement long-term memory.
The question is simple: “Can this be used by local small and medium-sized enterprises? How much will it cost?”
To cut to the chase: depending on conditions, a “memory-enabled AI” can be set up for under 500,000 yen up front, plus 10,000 to 30,000 yen per month. Let’s break down how that works.
—
The Six-Layer Structure of the Memory OS—What to Remember and How to Recall
The design philosophy of the Memory OS is clear. Just like human memory, it separates “what to save,” “how to organize it,” and “when to recall it” into layers.
Let’s look at the six layers from the bottom up.
Layer 1: File Layer (Raw Storage)
This layer retains Hermes workspace files and session databases as they are. It’s like having documents stuffed into a drawer—a storage for raw data.
Layer 2: Vector DB Layer (Semantic Search)
This layer vectorizes the stored information (converts it into arrays of numbers) to make it searchable based on meaning. Instead of keyword matching, it can pull up “information with similar meanings.” This is the decisive difference from traditional search methods.
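The idea of retrieving by meaning can be illustrated with a minimal sketch. It assumes hand-made three-dimensional vectors in place of a real embedding model, and the names `cosine_similarity` and `search` are hypothetical, not part of the Memory OS API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" written by hand; a real system would get these from a model.
memories = {
    "Mr. Tanaka is strict about deadlines": [0.9, 0.1, 0.0],
    "Company B has a budget surplus every March": [0.1, 0.9, 0.1],
    "The office closes at 6 pm": [0.0, 0.1, 0.9],
}

def search(query_vec, store, top_k=1):
    """Return the top_k stored texts whose vectors are closest to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]

# Pretend embedding of a question about delivery dates (no shared keywords).
query = [0.8, 0.2, 0.1]
print(search(query, memories))  # ['Mr. Tanaka is strict about deadlines']
```

The query shares no keywords with the stored sentence; it is matched only because the vectors point in a similar direction.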
Layer 3: Structured Facts Layer
This layer stores factual information such as “The representative of Company A is Mr. Tanaka” and “The budget limit is 500,000 yen per month” as structured data. It serves to accurately retain “confirmed information” that could become ambiguous with vector searches alone.
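As a rough sketch of what “structured” means here, the two example facts could be stored as keyed records. The `Fact` and `FactStore` classes and the dates are hypothetical illustrations, not the Memory OS schema.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Fact:
    subject: str       # who or what the fact is about
    attribute: str     # which property is being recorded
    value: str         # the confirmed value
    recorded_on: date  # when it was confirmed (hypothetical field)

class FactStore:
    """Keeps exactly one confirmed value per (subject, attribute) pair."""

    def __init__(self):
        self._facts = {}

    def upsert(self, fact):
        key = (fact.subject, fact.attribute)
        current = self._facts.get(key)
        # Only overwrite when the incoming fact is at least as recent.
        if current is None or fact.recorded_on >= current.recorded_on:
            self._facts[key] = fact

    def get(self, subject, attribute):
        fact = self._facts.get((subject, attribute))
        return fact.value if fact else None

store = FactStore()
store.upsert(Fact("Company A", "representative", "Mr. Tanaka", date(2024, 1, 10)))
store.upsert(Fact("Company A", "budget_limit", "500,000 yen per month", date(2024, 1, 10)))
print(store.get("Company A", "representative"))  # Mr. Tanaka
```

An exact key lookup returns one definite answer, which is the property a similarity search alone cannot guarantee.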
Layer 4: Auto Curation Layer
This is where it gets interesting. Accumulated memories are automatically organized, merged, and deduplicated. It is akin to the way human memories are consolidated during sleep. This layer keeps search accuracy from declining as information grows.
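One small piece of curation, duplicate removal, can be sketched as follows. This is an assumption-laden illustration using string similarity from Python’s standard library; the `curate` function and the 0.9 threshold are hypothetical, and the actual layer may well work differently.

```python
from difflib import SequenceMatcher

def curate(memories, threshold=0.9):
    """Drop entries that are near-duplicates of an earlier entry.

    The threshold is an illustrative assumption, not a recommended value.
    """
    kept = []  # list of (original_text, normalized_text)
    for text in memories:
        # Normalize case and whitespace so trivial variants compare equal.
        normalized = " ".join(text.lower().split())
        is_duplicate = any(
            SequenceMatcher(None, normalized, seen).ratio() >= threshold
            for _, seen in kept
        )
        if not is_duplicate:
            kept.append((text, normalized))
    return [original for original, _ in kept]

raw = [
    "Mr. Tanaka is strict about deadlines",
    "mr. tanaka is  strict about deadlines",
    "The budget limit is 500,000 yen per month",
]
print(curate(raw))
```

Only the first and third entries survive, because the second differs from the first only in case and spacing.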
Layer 5: Context Layer (Context Management)
When the AI agent performs a task, this layer controls “what should be recalled at this moment.” Instead of loading every memory each time, it injects only the necessary context into the prompt, which directly reduces token costs.
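A minimal sketch of this selection step, assuming each memory already carries a relevance score and that tokens can be approximated by counting whitespace-separated words. The function name `select_context` and the sample scores are hypothetical.

```python
def select_context(scored_memories, token_budget):
    """Pick the most relevant memories that fit within a token budget.

    scored_memories: list of (relevance, text) pairs.
    Token cost is a crude word count, used here only for illustration.
    """
    selected = []
    used = 0
    for relevance, text in sorted(scored_memories, key=lambda m: m[0], reverse=True):
        cost = len(text.split())
        if used + cost <= token_budget:
            selected.append(text)
            used += cost
    return selected

candidates = [
    (0.92, "Mr. Tanaka is strict about deadlines"),
    (0.35, "The office closes at 6 pm"),
    (0.80, "The budget limit is 500,000 yen per month"),
]
# With a budget of 14 "tokens", only the two most relevant memories fit.
print(select_context(candidates, token_budget=14))
```

The prompt then contains two short memories instead of the entire store, which is where the token savings come from.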
Layer 6: Interface Layer (LLM Bridge)
This is an abstraction layer that can connect with any model, such as OpenAI, Anthropic, or local LLMs. Memory does not disappear even if you switch models. The design avoids vendor lock-in.
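The abstraction can be sketched with a small interface. The `LLMBackend` protocol, the two stand-in backends, and the `Agent` class below are hypothetical; they make no real API calls and only show that memory lives outside whichever model is plugged in.

```python
from typing import Protocol

class LLMBackend(Protocol):
    """Anything with a complete() method can serve as the model."""
    def complete(self, prompt: str) -> str: ...

class EchoBackend:
    """Stand-in for one vendor's model: returns the prompt unchanged."""
    def complete(self, prompt: str) -> str:
        return prompt

class ShoutingBackend:
    """Stand-in for a different model: returns the prompt in upper case."""
    def complete(self, prompt: str) -> str:
        return prompt.upper()

class Agent:
    def __init__(self, backend: LLMBackend) -> None:
        self.backend = backend
        self.memory = []  # owned by the agent, not by the backend

    def remember(self, note: str) -> None:
        self.memory.append(note)

    def ask(self, question: str) -> str:
        prompt = "\n".join(self.memory + [question])
        return self.backend.complete(prompt)

agent = Agent(EchoBackend())
agent.remember("The budget limit is 500,000 yen per month")
first = agent.ask("What is the budget limit?")

agent.backend = ShoutingBackend()  # switch models; memory is untouched
second = agent.ask("What is the budget limit?")
```

Both answers are built from the same remembered note, even though the backend changed between the two calls.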
The key point is that these six layers operate independently. You can use only the second layer or replace the fourth layer with your own logic, allowing for flexible operations. Not being forced into an “all-in-one” solution is significant for small and medium-sized enterprises.
—
So, how much will it cost?—Breaking Down Implementation Costs
It’s clear that the technology is interesting. The real question is, “Can our company use it?” Let’s break down the costs into three parts.
1. Initial Setup Cost: 150,000 to 500,000 yen
The Memory OS itself is open-source, so there are no software licensing fees. If you have an environment where Docker and Redis can run, basic setup can be completed in half a day.
The real cost lies in connecting it to your business operations:
- Designing the pipeline that feeds existing business data (customer information, past interactions, etc.) into the Memory OS: 50,000 to 150,000 yen
- Prompt design for the AI agent and adjustment of memory retrieval logic: 50,000 to 200,000 yen
- Testing and tuning: 50,000 to 150,000 yen
If you have an engineer who can handle Docker in-house, it can be set up for around 150,000 yen. Even if outsourced, 500,000 yen is often sufficient.
For comparison: building a similar “memory-enabled AI” from scratch has traditionally meant designing the vector DB, search logic, and curation functions yourself, and estimates easily reach 3,000,000 to 5,000,000 yen. Now it can be done for under 500,000 yen, which is 1/6 to 1/10 of that cost. This is a structural change.
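The ratio can be checked with a few lines of arithmetic, using the 500,000 yen upper bound against both ends of the scratch-development estimate. The variable names are illustrative only.

```python
from fractions import Fraction

PACKAGED_COST = 500_000                  # upper bound for the Memory OS setup
SCRATCH_COSTS = (3_000_000, 5_000_000)   # estimated range for scratch development

# Express the packaged cost as an exact fraction of each scratch estimate.
ratios = [Fraction(PACKAGED_COST, scratch) for scratch in SCRATCH_COSTS]

for scratch, ratio in zip(SCRATCH_COSTS, ratios):
    print(f"{scratch:,} yen from scratch -> {ratio} of the cost")
# 3,000,000 yen from scratch -> 1/6 of the cost
# 5,000,000 yen from scratch -> 1/10 of the cost
```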
2. Monthly Operating Cost: 10,000 to 30,000 yen
- Server costs (VPS or cloud): 5,000 to 15,000 yen per month
- LLM API usage fees (for memory organization and search): 3,000 to 10,000 yen per month
- DB operation costs (Redis, etc.): 2,000 to 5,000 yen per month
In total, 10,000 to 30,000 yen per month. That is roughly a few days of wages for one part-time employee.
If you use a local LLM (like Ollama), you can eliminate API costs. There is a trade-off with accuracy, but for internal knowledge search purposes, it is sufficiently practical.
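The monthly figures and a first-year total can be tallied as below. The sketch assumes that switching to a local LLM removes only the API line and leaves server costs unchanged, which may not hold if the local model needs a larger machine; the function names are hypothetical.

```python
# (low, high) monthly estimates in yen, taken from the list above
MONTHLY_ITEMS = {
    "server": (5_000, 15_000),
    "llm_api": (3_000, 10_000),
    "database": (2_000, 5_000),
}
INITIAL_SETUP = (150_000, 500_000)  # (low, high) initial setup cost in yen

def monthly_total(items, skip=()):
    """Sum the low and high ends, optionally skipping some line items."""
    low = sum(lo for name, (lo, hi) in items.items() if name not in skip)
    high = sum(hi for name, (lo, hi) in items.items() if name not in skip)
    return low, high

def first_year_total(initial, monthly):
    """Initial setup plus twelve months of operation."""
    return initial[0] + 12 * monthly[0], initial[1] + 12 * monthly[1]

monthly = monthly_total(MONTHLY_ITEMS)                       # (10000, 30000)
local_llm = monthly_total(MONTHLY_ITEMS, skip=("llm_api",))  # (7000, 20000)
first_year = first_year_total(INITIAL_SETUP, monthly)        # (270000, 860000)
print(monthly, local_llm, first_year)
```

Under these assumptions, the first year comes to roughly 270,000 to 860,000 yen in total.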
3. Learning and Retention Cost: Can be effectively zero
“Bringing in an external trainer for employee training at 50,000 yen per day”—this is an outdated idea.
The Memory OS works behind the scenes of the AI agent, not in the parts that end users (employees on the ground) interact with directly. Employees keep giving instructions via chat as before. The only difference is that “the AI remembers last week’s discussions.” If the UI doesn’t change, there are no training costs.
It’s sufficient for 1 to 2 administrators to understand memory settings and maintenance, which can be learned independently through documentation and GitHub issues.
—
The Structure That Causes Key-Person Dependency to “Die on Its Own”
Now we get to the main point. The true value of the Memory OS is not in cost reduction.
It is the structural elimination of key-person dependency.
What is the biggest management risk for small and medium-sized enterprises? The departure of veteran employees. “Only that person knows” and “It only works in that person’s way”—this state will change with AI agents powered by the Memory OS.
What specifically will happen?
- The context of customer interactions will be accumulated. “Mr. Tanaka from Company A is strict about deadlines” and “Company B has a budget surplus every March”—such tacit knowledge will be automatically accumulated in the structured facts layer. Even if the responsible person leaves, the memory remains.
- Business procedures will be automatically recorded. “This estimate was created using this procedure” and “This complaint was handled in this way”—the curation layer will automatically organize this information. There’s no need to write manuals.
- Handover will not “occur.” If a new person asks the AI, all past contexts will be available. There’s no need to create handover documents, nor is there a need for a handover period.
Instead of working hard to eliminate key-person dependency, you create a system in which it “naturally disappears.” This is the essential value of the Memory OS.
—
Keep an Eye on Competing Technologies—Eywa, ElasticMem
The Memory OS is not the only option. It’s important to be aware of other research that has emerged around the same time.
Eywa tracks the “provenance” of memories. This means it records “when and in what context this information was remembered.” It provides a mechanism to ensure the reliability of memories, making it strong in fields like healthcare and legal where accountability for “why that decision was made” is required.
ElasticMem implements a “learnable latent memory” that dynamically expands and contracts memory capacity. It automatically extends memory according to the complexity of tasks, minimizing resource waste.
Both are still in the research stage and are not yet in a state as “ready to use” as the Memory OS. However, there is a high likelihood that open-source implementations will emerge within six months to a year. By establishing a foundational memory design with the Memory OS now, it will be easier to switch or integrate later. Companies that act first will have a structural advantage.
—
So, what should we do in the end?
Just three things.
1. First, try it with one business operation.
You don’t need to think about full company implementation. “Have the AI remember customer interaction history” or “Have it memorize past patterns for estimate creation”—focus on one operation and run the agent with the Memory OS for two weeks.
2. Decide “what to have it remember.”
There’s no need to use all six layers. Initially, just the vector DB layer and the structured facts layer will be sufficient. By narrowing down the information to be remembered, both accuracy and costs become easier to manage.
3. Conduct a “memory inventory” once a month.
AI memory goes stale if left unattended. Accumulated old or contradictory information reduces search accuracy. Even with an automatic curation layer, it’s advisable to add a monthly human review.
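A monthly review could start from a simple report like the one sketched below, which flags facts older than a cutoff and attributes that hold more than one value. The `inventory` function, the 180-day cutoff, the dates, and the conflicting 400,000 yen entry are all hypothetical illustrations.

```python
from collections import defaultdict
from datetime import date, timedelta

def inventory(facts, today, max_age_days=180):
    """Return (stale_facts, conflicting_keys) for a human to review.

    facts: list of (subject, attribute, value, recorded_on) tuples.
    max_age_days is an assumed cutoff, not a recommended setting.
    """
    cutoff = timedelta(days=max_age_days)
    stale = [fact for fact in facts if today - fact[3] > cutoff]

    # Group values by (subject, attribute) to spot contradictions.
    values = defaultdict(set)
    for subject, attribute, value, _ in facts:
        values[(subject, attribute)].add(value)
    conflicts = sorted(key for key, seen in values.items() if len(seen) > 1)
    return stale, conflicts

facts = [
    ("Company A", "representative", "Mr. Tanaka", date(2023, 1, 10)),
    ("Company A", "budget_limit", "500,000 yen per month", date(2024, 5, 1)),
    ("Company A", "budget_limit", "400,000 yen per month", date(2024, 6, 1)),
]
stale, conflicts = inventory(facts, today=date(2024, 6, 15))
print(stale)      # the representative entry, not confirmed for over 180 days
print(conflicts)  # [('Company A', 'budget_limit')]
```

The report does not decide anything by itself; it only narrows down what the reviewer should look at.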
—
Initial costs under 500,000 yen and monthly fees of 30,000 yen or less. With this investment, you can acquire a “memory-enabled AI” and structurally eliminate key-person dependency.
The era of spending 3,000,000 to 5,000,000 yen on scratch development is over. Now that open source has cut costs to as little as 1/10, not acting is the costlier choice.
Just one operation, for two weeks. That’s all it takes.