The Gist
Organizational memory AI. Engram builds AI that forms proprietary memory for each enterprise.
Efficiency partnerships. Early partners include Microsoft, Notion and Harvey, with a focus on cutting token use.
Enterprise cost impact. CIOs may gain AI efficiency, lower costs and tighter control over proprietary knowledge if Engram delivers.
Engram, a San Francisco-based startup building a learned memory layer for AI, emerged from stealth on June 23 with $98 million in funding co-led by General Catalyst and Modern Capital. The round valued the 13-person company at $600 million. Kleiner Perkins, Sequoia Capital, Factory, Amplify Partners and Neo also participated, alongside angel investors including Wiz co-founder Assaf Rappaport, OpenAI co-founder Andrej Karpathy and Berkeley AI Research co-director Pieter Abbeel.
The company is building what it described as a persistent "memory layer" for enterprise AI. The technology is rooted in a June 2025 Stanford research paper, co-authored by CTO Sabri Eyuboglu, that introduced a method for compressing long-context information into reusable model memory. According to Engram, models built this way use up to 100x fewer tokens while matching or outperforming frontier models, a direct pitch to enterprises facing rising AI inference costs as agents scale across functions. General Catalyst framed the investment as addressing a structural gap in current AI, arguing that today's models remain "brilliant and amnesic" without persistent organizational context.
Engram launched with Microsoft, Notion and Harvey as early partners. Microsoft is evaluating the models within Microsoft 365, Notion is integrating the memory layer into custom agents and Harvey is applying it to legal workflows.
Engram Feature Breakdown
Engram's core capabilities center on compressing and retaining organizational knowledge for AI systems.
| Capability | Description |
|---|---|
| Learned memory layer | Trains models to form compact, reusable memory unique to each customer |
| Cartridges method | Converts large document sets into small reusable memories, according to Engram |
| Active Reading | Training method for deep model study of organizational material |
| Token efficiency | Engram claims its models use 1–10% of the tokens frontier models use |
| Partner integrations | Under evaluation in Microsoft 365, being integrated into Notion custom agents and applied to Harvey legal workflows |
AI Memory: What Executives Must Know
Long-lived AI agents create a new infrastructure problem: context windows choke on accumulated history, degrading performance. Systems built for sustained operation monitor token usage, trigger checkpoints before windows overflow and maintain durable audit trails outside the active conversation.
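The pattern described above can be illustrated with a minimal Python sketch. It is not drawn from Engram or any specific product: the `AgentContext` class, the 80% checkpoint threshold and the four-characters-per-token estimate are all assumptions made for illustration, and a production system would use a real tokenizer and a model-generated summary.

```python
import json
from dataclasses import dataclass, field


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: roughly 4 characters per token."""
    return max(1, len(text) // 4)


@dataclass
class AgentContext:
    max_tokens: int                    # hypothetical context window size
    checkpoint_ratio: float = 0.8      # assumed threshold: checkpoint at 80% of the window
    messages: list = field(default_factory=list)    # the active conversation
    audit_log: list = field(default_factory=list)   # durable trail kept outside the window

    def used_tokens(self) -> int:
        return sum(estimate_tokens(m) for m in self.messages)

    def add(self, message: str) -> None:
        # Log every message first, so it survives later compaction.
        self.audit_log.append(json.dumps({"event": "message", "text": message}))
        self.messages.append(message)
        if self.used_tokens() >= self.max_tokens * self.checkpoint_ratio:
            self.checkpoint()

    def checkpoint(self) -> None:
        # Placeholder summary; a real system would summarize with a model.
        archived = len(self.messages)
        self.audit_log.append(json.dumps({"event": "checkpoint", "archived": archived}))
        self.messages = [f"[checkpoint: {archived} earlier entries archived]"]


ctx = AgentContext(max_tokens=100)
for _ in range(10):
    ctx.add("x" * 60)  # about 15 estimated tokens per message

print(ctx.used_tokens(), len(ctx.audit_log))
```

The active window is compacted whenever usage crosses the threshold, while the audit log keeps every message and checkpoint event regardless of what remains in the conversation.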
The Stack Behind AI Memory
Enterprise AI memory relies on layered infrastructure: orchestration layers, vector databases, retrieval pipelines, monitoring tools and governance controls. Each layer adds cost and operational complexity.
Retrieval-augmented generation systems continuously query indexed enterprise data, while agentic workflows may call multiple tools, APIs and models before completing a single task.
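To make the retrieval step concrete, here is a toy Python sketch of how a retrieval-augmented prompt might be assembled. It is an illustration under stated assumptions, not any vendor's pipeline: bag-of-words cosine similarity stands in for a learned embedding model and a vector database, and the documents and function names are hypothetical.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    """Bag-of-words term counts; a stand-in for a learned embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = vectorize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, vectorize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Each call re-sends the retrieved passages, so their tokens are paid on every query."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical enterprise snippets used only for this example.
docs = [
    "vacation policy allows twenty days of paid leave",
    "expense reports are due by the fifth of each month",
    "the security policy requires hardware keys",
]
prompt = build_prompt("how many days of paid leave", docs, k=1)
print(prompt)
```

Because the retrieved passages are pasted into the prompt on every request, the token bill grows with query volume even when the underlying documents never change.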
As one infrastructure executive put it: "Inference cost is visible. What's less visible is all the operational drag around it — pipelines that fail silently and retry, messy, redundant data ingestion and agentic workflows with no real observability. That's where the money actually goes missing."
Governance & Data Ownership
Governance is not optional in persistent-memory architectures. An MIT Technology Review survey of more than 2,000 senior executives found that companies "deeply committed" to AI and data sovereignty reported roughly 5x higher ROI from generative and agentic AI deployments than those with weaker controls.
Nearly 95% of respondents said they plan to establish their own AI and data platforms within three years, with security, data localization and ownership ranking as the top drivers.
Infrastructure Gaps Persist
Despite AI budgets increasing by more than 10% at many organizations, only 22% of companies are considered "future ready" with their data infrastructure. More than half remain stuck with disconnected systems and incompatible technologies.
Compliance Starts at the Design Phase
Responsible AI design must be embedded in system architecture from the start. For memory-enabled agents that persist context across sessions and users, AI governance must extend to the user experience, the data layer and team operating procedures.
"When an AI reads a 70,000-word legal contract (~400 kilobytes), its internal memory can exceed 100 gigabytes, 250,000 times the original file, increasing costs and latency. We study once, ahead of time, compressing it into a compact memory reusable on every query," said Sabri Eyuboglu, CTO and co-founder of Engram.
Engram Background
Engram delivers custom AI solutions for mid-to-large knowledge-work organizations and platform partners, developing models that internalize a customer's unique context across internal documents, communications and knowledge workspaces. Its offerings emphasize context compression, retrieval, fine-tuning and long-context memory, primarily serving enterprise SaaS platforms, legal and professional services firms and large corporate teams.