How to Build a Universal Long-Term Memory Layer for AI Agents Using Mem0 and OpenAI


Source: MarkTechPost

In this tutorial, we build a universal long-term memory layer for AI agents using Mem0, OpenAI models, and ChromaDB. We design a system that can extract structured memories from natural conversations, store them semantically, retrieve them intelligently, and integrate them directly into personalized agent responses. We move beyond simple chat history and implement persistent, user-scoped memory with full CRUD control, semantic search, multi-user isolation, and custom configuration. Finally, we construct a production-ready memory-augmented agent architecture that demonstrates how modern AI systems can reason with contextual continuity rather than operate statelessly.

!pip install mem0ai openai rich chromadb -q

import os
import getpass
from datetime import datetime

print("=" * 60)
print("🔐  MEM0 Advanced Tutorial — API Key Setup")
print("=" * 60)

OPENAI_API_KEY = getpass.getpass("Enter your OpenAI API key: ")
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

print("\n✅ API key set!\n")

from openai import OpenAI
from mem0 import Memory
from rich.console import Console
from rich.panel import Panel
from rich.table import Table
from rich.markdown import Markdown
from rich import print as rprint
import json

console = Console()
openai_client = OpenAI()

console.rule("[bold cyan]MODULE 1: Basic Memory Setup[/bold cyan]")

memory = Memory()

print(Panel(
    "[green]✓ Memory instance created with default config[/green]\n"
    "  • LLM: gpt-4.1-nano (OpenAI)\n"
    "  • Vector Store: ChromaDB (local)\n"
    "  • Embedder: text-embedding-3-small",
    title="Memory Config", border_style="cyan"
))

We install all required dependencies and securely configure our OpenAI API key. We initialize the Mem0 Memory instance along with the OpenAI client and Rich console utilities. We establish the foundation of our long-term memory system with the default configuration powered by ChromaDB and OpenAI embeddings.

console.rule("[bold cyan]MODULE 2: Adding & Retrieving Memories[/bold cyan]")

USER_ID = "alice_tutorial"

print("\n📝 Adding memories for user:", USER_ID)

conversations = [
    [
        {"role": "user", "content": "Hi! I'm Alice. I'm a software engineer who loves Python and machine learning."},
        {"role": "assistant", "content": "Nice to meet you Alice! Python and ML are great areas to be in."}
    ],
    [
        {"role": "user", "content": "I prefer dark mode in all my IDEs and I use VS Code as my main editor."},
        {"role": "assistant", "content": "Good to know! VS Code with dark mode is a popular combo."}
    ],
    [
        {"role": "user", "content": "I'm currently building a RAG pipeline for my company's internal docs. It's for a fintech startup."},
        {"role": "assistant", "content": "That's exciting! RAG pipelines are really valuable for enterprise use cases."}
    ],
    [
        {"role": "user", "content": "I have a dog named Max and I enjoy hiking on weekends."},
        {"role": "assistant", "content": "Max sounds lovely! Hiking is a great way to recharge."}
    ],
]

results = []
for i, convo in enumerate(conversations):
    result = memory.add(convo, user_id=USER_ID)
    extracted = result.get("results", [])
    for mem in extracted:
        results.append(mem)
    print(f"  Conversation {i+1}: {len(extracted)} memory(ies) extracted")

print(f"\n✅ Total memories stored: {len(results)}")

We simulate realistic multi-turn conversations and store them using Mem0’s automatic memory extraction pipeline. We add structured conversational data for a specific user and allow the LLM to extract meaningful long-term facts. We verify how many memories are created, confirming that semantic knowledge is successfully persisted.

console.rule("[bold cyan]MODULE 3: Semantic Search[/bold cyan]")

queries = [
    "What programming languages does the user prefer?",
    "What is Alice working on professionally?",
    "What are Alice's hobbies?",
    "What tools and IDE does Alice use?",
]

for query in queries:
    search_results = memory.search(query=query, user_id=USER_ID, limit=2)
    table = Table(title=f"🔍 Query: {query}", show_lines=True)
    table.add_column("Memory", style="white", max_width=60)
    table.add_column("Score", style="green", justify="center")

    for r in search_results.get("results", []):
        score = r.get("score", "N/A")
        score_str = f"{score:.4f}" if isinstance(score, float) else str(score)
        table.add_row(r["memory"], score_str)

    console.print(table)
    print()

console.rule("[bold cyan]MODULE 4: CRUD Operations[/bold cyan]")

all_memories = memory.get_all(user_id=USER_ID)
memories_list = all_memories.get("results", [])

print(f"\n📚 All memories for '{USER_ID}':")
for i, mem in enumerate(memories_list):
    print(f"  [{i+1}] ID: {mem['id'][:8]}...  →  {mem['memory']}")

if memories_list:
    first_id = memories_list[0]["id"]
    original_text = memories_list[0]["memory"]

    print(f"\n✏️  Updating memory: '{original_text}'")
    memory.update(memory_id=first_id, data=original_text + " (confirmed)")

    updated = memory.get(memory_id=first_id)
    print(f"   After update: '{updated['memory']}'")

We perform semantic search queries to retrieve relevant memories using natural language. We demonstrate how Mem0 ranks stored memories by similarity score and returns the most contextually aligned information. We also perform CRUD operations by listing, updating, and validating stored memory entries.

console.rule("[bold cyan]MODULE 5: Memory-Augmented Chat[/bold cyan]")

def chat_with_memory(user_message: str, user_id: str, session_history: list) -> str:
    relevant = memory.search(query=user_message, user_id=user_id, limit=5)
    memory_context = "\n".join(
        f"- {r['memory']}" for r in relevant.get("results", [])
    ) or "No relevant memories found."

    system_prompt = f"""You are a highly personalized AI assistant. You have access to long-term memories about this user.

RELEVANT USER MEMORIES:
{memory_context}

Use these memories to provide context-aware, personalized responses.
Be natural — don't explicitly announce that you're using memories."""

    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(session_history[-6:])
    messages.append({"role": "user", "content": user_message})

    response = openai_client.chat.completions.create(
        model="gpt-4.1-nano-2025-04-14",
        messages=messages
    )
    assistant_response = response.choices[0].message.content

    exchange = [
        {"role": "user", "content": user_message},
        {"role": "assistant", "content": assistant_response}
    ]
    memory.add(exchange, user_id=user_id)

    session_history.append({"role": "user", "content": user_message})
    session_history.append({"role": "assistant", "content": assistant_response})

    return assistant_response


session = []
demo_messages = [
    "Can you recommend a good IDE setup for me?",
    "What kind of project am I currently building at work?",
    "Suggest a weekend activity I might enjoy.",
    "What's a good tech stack for my current project?",
]

print("\n🤖 Starting memory-augmented conversation with Alice...\n")

for msg in demo_messages:
    print(Panel(f"[bold yellow]User:[/bold yellow] {msg}", border_style="yellow"))
    response = chat_with_memory(msg, USER_ID, session)
    print(Panel(f"[bold green]Assistant:[/bold green] {response}", border_style="green"))
    print()

We build a fully memory-augmented chat loop that retrieves relevant memories before generating responses. We dynamically inject personalized context into the system prompt and store each new exchange back into long-term memory. We simulate a multi-turn session to demonstrate contextual continuity and personalization in action.

console.rule("[bold cyan]MODULE 6: Multi-User Memory Isolation[/bold cyan]")

USER_BOB = "bob_tutorial"

bob_conversations = [
    [
        {"role": "user", "content": "I'm Bob, a data scientist specializing in computer vision and PyTorch."},
        {"role": "assistant", "content": "Great to meet you Bob!"}
    ],
    [
        {"role": "user", "content": "I prefer Jupyter notebooks over VS Code, and I use Vim keybindings."},
        {"role": "assistant", "content": "Classic setup for data science work!"}
    ],
]

for convo in bob_conversations:
    memory.add(convo, user_id=USER_BOB)

print("\n🔐 Testing memory isolation between Alice and Bob:\n")

test_query = "What programming tools does this user prefer?"

alice_results = memory.search(query=test_query, user_id=USER_ID, limit=3)
bob_results = memory.search(query=test_query, user_id=USER_BOB, limit=3)

print("👩 Alice's memories:")
for r in alice_results.get("results", []):
    print(f"   • {r['memory']}")

print("\n👨 Bob's memories:")
for r in bob_results.get("results", []):
    print(f"   • {r['memory']}")

We demonstrate user-level memory isolation by introducing a second user with distinct preferences. We store separate conversational data and validate that searches remain scoped to the correct user_id. We confirm that memory namespaces are isolated, ensuring secure multi-user agent deployments.

print("\n✅ Memory isolation confirmed — users cannot see each other's data.")

console.rule("[bold cyan]MODULE 7: Custom Configuration[/bold cyan]")

custom_config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4.1-nano-2025-04-14",
            "temperature": 0.1,
            "max_tokens": 2000,
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",
        }
    },
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "advanced_tutorial_v2",
            "path": "/tmp/chroma_advanced",
        }
    },
    "version": "v1.1"
}

custom_memory = Memory.from_config(custom_config)

print(Panel(
    "[green]✓ Custom memory instance created[/green]\n"
    "  • LLM: gpt-4.1-nano with temperature=0.1\n"
    "  • Embedder: text-embedding-3-small\n"
    "  • Vector Store: ChromaDB at /tmp/chroma_advanced\n"
    "  • Collection: advanced_tutorial_v2",
    title="Custom Config Applied", border_style="magenta"
))

custom_memory.add(
    [{"role": "user", "content": "I'm a researcher studying neural plasticity and brain-computer interfaces."}],
    user_id="researcher_01"
)

result = custom_memory.search("What field does this person work in?", user_id="researcher_01", limit=2)
print("\n🔍 Custom memory search result:")
for r in result.get("results", []):
    print(f"   • {r['memory']}")

console.rule("[bold cyan]MODULE 8: Memory History[/bold cyan]")

all_alice = memory.get_all(user_id=USER_ID)
alice_memories = all_alice.get("results", [])

table = Table(title=f"📋 Full Memory Profile: {USER_ID}", show_lines=True, width=90)
table.add_column("#", style="dim", width=3)
table.add_column("Memory ID", style="cyan", width=12)
table.add_column("Memory Content", style="white")
table.add_column("Created At", style="yellow", width=12)

for i, mem in enumerate(alice_memories):
    mem_id = mem["id"][:8] + "..."
    created = mem.get("created_at", "N/A")
    if created and created != "N/A":
        try:
            created = datetime.fromisoformat(created.replace("Z", "+00:00")).strftime("%m/%d %H:%M")
        except Exception:
            created = str(created)[:10]
    table.add_row(str(i+1), mem_id, mem["memory"], created)

console.print(table)

console.rule("[bold cyan]MODULE 9: Memory Deletion[/bold cyan]")

all_mems = memory.get_all(user_id=USER_ID).get("results", [])
if all_mems:
    last_mem = all_mems[-1]
    print(f"\n🗑️  Deleting memory: '{last_mem['memory']}'")
    memory.delete(memory_id=last_mem["id"])

    updated_count = len(memory.get_all(user_id=USER_ID).get("results", []))
    print(f"✅ Deleted. Remaining memories for {USER_ID}: {updated_count}")

console.rule("[bold cyan]✅ TUTORIAL COMPLETE[/bold cyan]")

summary = """
# 🎓 Mem0 Advanced Tutorial Summary

## What You Learned:
1. **Basic Setup** — Instantiate Memory with default & custom configs
2. **Add Memories** — From conversations (auto-extracted by LLM)
3. **Semantic Search** — Retrieve relevant memories by natural language query
4. **CRUD Operations** — Get, Update, Delete individual memories
5. **Memory-Augmented Chat** — Full pipeline: retrieve → respond → store
6. **Multi-User Isolation** — Separate memory namespaces per user_id
7. **Custom Configuration** — Custom LLM, embedder, and vector store
8. **Memory History** — View full memory profiles with timestamps
9. **Cleanup** — Delete specific or all memories

## Key Concepts:
- `memory.add(messages, user_id=...)`
- `memory.search(query, user_id=...)`
- `memory.get_all(user_id=...)`
- `memory.update(memory_id, data)`
- `memory.delete(memory_id)`
- `Memory.from_config(config)`

## Next Steps:
- Swap ChromaDB for Qdrant, Pinecone, or Weaviate
- Use the hosted Mem0 Platform (app.mem0.ai) for production
- Integrate with LangChain, CrewAI, or LangGraph agents
- Add `agent_id` for agent-level memory scoping
"""

console.print(Markdown(summary))

We create a fully custom Mem0 configuration with explicit parameters for the LLM, embedder, and vector store. We test the custom memory instance and explore memory history, timestamps, and structured profiling. Finally, we demonstrate deletion and cleanup operations, completing the full lifecycle management of long-term agent memory.

In conclusion, we implemented a complete memory infrastructure for AI agents using Mem0 as a universal memory abstraction layer. We demonstrated how to add, retrieve, update, delete, isolate, and customize long-term memories while integrating them into a dynamic chat loop. We showed how semantic memory retrieval transforms generic assistants into context-aware systems capable of personalization and continuity across sessions. With this foundation in place, we are now equipped to extend the architecture into multi-agent systems, enterprise-grade deployments, alternative vector databases, and advanced agent frameworks, turning memory into a core capability rather than an afterthought.
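Because the entire tutorial runs through Mem0's config dictionary, extending it to an alternative vector database is mostly a configuration change. The sketch below is a hypothetical example of that next step: it reuses the same LLM and embedder settings but points the vector store at Qdrant, and the commented usage lines show agent-level scoping via `agent_id`. The provider and option names follow Mem0's documented config schema, but verify them against the docs for your installed version before relying on them.

```python
# Hypothetical config sketch: swapping ChromaDB for Qdrant.
# Assumes a Qdrant server reachable at localhost:6333; the collection
# name "agent_memories" and agent id "support_bot" are made up for
# illustration.
qdrant_config = {
    "llm": {
        "provider": "openai",
        "config": {"model": "gpt-4.1-nano-2025-04-14", "temperature": 0.1},
    },
    "embedder": {
        "provider": "openai",
        "config": {"model": "text-embedding-3-small"},
    },
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "agent_memories",
            "host": "localhost",
            "port": 6333,
        },
    },
}

# from mem0 import Memory
# memory = Memory.from_config(qdrant_config)
#
# Scope memories to both a user and a specific agent, so two agents
# serving the same user keep separate memory namespaces:
# memory.add(messages, user_id="alice_tutorial", agent_id="support_bot")
# memory.search("user preferences", user_id="alice_tutorial",
#               agent_id="support_bot", limit=5)
```

Because only the `vector_store` block changes, the rest of the pipeline (extraction, search, CRUD, the chat loop) carries over unmodified.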


Check out the Full Implementation Code and Notebook.
