Building Next-Gen Agentic AI: A Complete Framework for Cognitive Blueprint Driven Runtime Agents with Memory Tools and Validation


Source: MarkTechPost

In this tutorial, we build a complete cognitive blueprint and runtime agent framework. We define structured blueprints for identity, goals, planning, memory, validation, and tool access, and use them to create agents that not only respond but also plan, execute, validate, and retry when their outputs fail validation. Along the way, we show how the same runtime engine can support multiple agent personalities and behaviors through blueprint portability, which keeps the design modular, extensible, and practical for agentic AI experimentation.

import json, yaml, time, math, textwrap, datetime, getpass, os
from typing import Any, Callable, Dict, List, Optional
from dataclasses import dataclass, field
from enum import Enum

from openai import OpenAI
from pydantic import BaseModel
from rich.console import Console
from rich.panel import Panel
from rich.table import Table
from rich.tree import Tree

try:
    from google.colab import userdata
    OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
except Exception:
    OPENAI_API_KEY = getpass.getpass("🔑 Enter your OpenAI API key: ")

os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
client = OpenAI(api_key=OPENAI_API_KEY)
console = Console()


class PlanningStrategy(str, Enum):
    SEQUENTIAL   = "sequential"
    HIERARCHICAL = "hierarchical"
    REACTIVE     = "reactive"


class MemoryType(str, Enum):
    SHORT_TERM = "short_term"
    EPISODIC   = "episodic"
    PERSISTENT = "persistent"


class BlueprintIdentity(BaseModel):
    name: str
    version: str = "1.0.0"
    description: str
    author: str = "unknown"


class BlueprintMemory(BaseModel):
    type: MemoryType = MemoryType.SHORT_TERM
    window_size: int = 10
    summarize_after: int = 20


class BlueprintPlanning(BaseModel):
    strategy: PlanningStrategy = PlanningStrategy.SEQUENTIAL
    max_steps: int = 8
    max_retries: int = 2
    think_before_acting: bool = True


class BlueprintValidation(BaseModel):
    require_reasoning: bool = True
    min_response_length: int = 10
    forbidden_phrases: List[str] = []


class CognitiveBlueprint(BaseModel):
    identity: BlueprintIdentity
    goals: List[str]
    constraints: List[str] = []
    tools: List[str] = []
    memory: BlueprintMemory = BlueprintMemory()
    planning: BlueprintPlanning = BlueprintPlanning()
    validation: BlueprintValidation = BlueprintValidation()
    system_prompt_extra: str = ""


def load_blueprint_from_yaml(yaml_str: str) -> CognitiveBlueprint:
    return CognitiveBlueprint(**yaml.safe_load(yaml_str))


RESEARCH_AGENT_YAML = """
identity:
  name: ResearchBot
  version: 1.2.0
  description: Answers research questions using calculation and reasoning
  author: Auton Framework Demo
goals:
  - Answer user questions accurately using available tools
  - Show step-by-step reasoning for all answers
  - Cite the method used for each calculation
constraints:
  - Never fabricate numbers or statistics
  - Always validate mathematical results before reporting
  - Do not answer questions outside your tool capabilities
tools:
  - calculator
  - unit_converter
  - date_calculator
  - search_wikipedia_stub
memory:
  type: episodic
  window_size: 12
  summarize_after: 30
planning:
  strategy: sequential
  max_steps: 6
  max_retries: 2
  think_before_acting: true
validation:
  require_reasoning: true
  min_response_length: 20
  forbidden_phrases:
    - "I don't know"
    - "I cannot determine"
"""

DATA_ANALYST_YAML = """
identity:
  name: DataAnalystBot
  version: 2.0.0
  description: Performs statistical analysis and data summarization
  author: Auton Framework Demo
goals:
  - Compute descriptive statistics for given data
  - Identify trends and anomalies
  - Present findings clearly with numbers
constraints:
  - Only work with numerical data
  - Always report uncertainty when sample size is small (< 5 items)
tools:
  - calculator
  - statistics_engine
  - list_sorter
memory:
  type: short_term
  window_size: 6
planning:
  strategy: hierarchical
  max_steps: 10
  max_retries: 3
  think_before_acting: true
validation:
  require_reasoning: true
  min_response_length: 30
  forbidden_phrases: []
"""

We set up the core environment and define the cognitive blueprint, which structures how an agent thinks and behaves. We create strongly typed models for identity, memory configuration, planning strategy, and validation rules using Pydantic and enums. We also define two YAML-based blueprints, allowing us to configure different agent personalities and capabilities without changing the underlying runtime system.
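To make the role of the typed models concrete, here is a minimal standard-library sketch of the same idea: a raw blueprint dictionary (as `yaml.safe_load` would return it) is turned into typed objects, missing fields fall back to defaults, and an unknown planning strategy is rejected. `PlanningConfig`, `MiniBlueprint`, and `parse_blueprint` are hypothetical names used only for illustration; the tutorial itself relies on Pydantic for this, which does considerably more validation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class PlanningStrategy(str, Enum):
    SEQUENTIAL = "sequential"
    HIERARCHICAL = "hierarchical"
    REACTIVE = "reactive"


@dataclass
class PlanningConfig:
    # Hypothetical stdlib stand-in for the tutorial's BlueprintPlanning model
    strategy: PlanningStrategy = PlanningStrategy.SEQUENTIAL
    max_steps: int = 8
    max_retries: int = 2


@dataclass
class MiniBlueprint:
    # Hypothetical, reduced version of CognitiveBlueprint
    name: str
    goals: List[str]
    tools: List[str] = field(default_factory=list)
    planning: PlanningConfig = field(default_factory=PlanningConfig)


def parse_blueprint(raw: Dict[str, Any]) -> MiniBlueprint:
    """Build a typed blueprint from a plain dict, applying defaults."""
    planning_raw = dict(raw.get("planning", {}))
    if "strategy" in planning_raw:
        # Enum lookup by value raises ValueError for unknown strategy names
        planning_raw["strategy"] = PlanningStrategy(planning_raw["strategy"])
    return MiniBlueprint(
        name=raw["identity"]["name"],
        goals=list(raw["goals"]),
        tools=list(raw.get("tools", [])),
        planning=PlanningConfig(**planning_raw),
    )


raw = {
    "identity": {"name": "ResearchBot"},
    "goals": ["Answer user questions accurately using available tools"],
    "tools": ["calculator"],
    "planning": {"strategy": "sequential", "max_steps": 6},
}
bp = parse_blueprint(raw)
# max_retries was not given, so the default of 2 is used
print(bp.planning.strategy.value, bp.planning.max_steps, bp.planning.max_retries)

try:
    parse_blueprint({**raw, "planning": {"strategy": "random_walk"}})
except ValueError as exc:
    print("rejected:", exc)
```

The point of the sketch is that configuration errors surface when the blueprint is loaded, before any model call is made.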

@dataclass
class ToolSpec:
    name: str
    description: str
    parameters: Dict[str, str]
    function: Callable
    returns: str


class ToolRegistry:
    def __init__(self):
        self._tools: Dict[str, ToolSpec] = {}

    def register(self, name: str, description: str,
                 parameters: Dict[str, str], returns: str):
        def decorator(fn: Callable) -> Callable:
            self._tools[name] = ToolSpec(name, description, parameters, fn, returns)
            return fn
        return decorator

    def get(self, name: str) -> Optional[ToolSpec]:
        return self._tools.get(name)

    def call(self, name: str, **kwargs) -> Any:
        spec = self._tools.get(name)
        if not spec:
            raise ValueError(f"Tool '{name}' not found in registry")
        return spec.function(**kwargs)

    def get_tool_descriptions(self, allowed: List[str]) -> str:
        lines = []
        for name in allowed:
            spec = self._tools.get(name)
            if spec:
                params = ", ".join(f"{k}: {v}" for k, v in spec.parameters.items())
                lines.append(
                    f"• {spec.name}({params})\n"
                    f"  → {spec.description}\n"
                    f"  Returns: {spec.returns}"
                )
        return "\n".join(lines)

    def list_tools(self) -> List[str]:
        return list(self._tools.keys())


registry = ToolRegistry()


@registry.register(
    name="calculator",
    description="Evaluates a safe mathematical expression",
    parameters={"expression": "A math expression string, e.g. '2 ** 10 + 5 * 3'"},
    returns="Numeric result as float"
)
def calculator(expression: str) -> str:
    try:
        allowed = {k: v for k, v in math.__dict__.items() if not k.startswith("_")}
        allowed.update({"abs": abs, "round": round, "pow": pow})
        return str(eval(expression, {"__builtins__": {}}, allowed))
    except Exception as e:
        return f"Error: {e}"


@registry.register(
    name="unit_converter",
    description="Converts between common units of measurement",
    parameters={
        "value": "Numeric value to convert",
        "from_unit": "Source unit (km, miles, kg, lbs, celsius, fahrenheit, liters, gallons, meters, feet)",
        "to_unit": "Target unit"
    },
    returns="Converted value as string with units"
)
def unit_converter(value: float, from_unit: str, to_unit: str) -> str:
    conversions = {
        ("km", "miles"): lambda x: x * 0.621371,
        ("miles", "km"): lambda x: x * 1.60934,
        ("kg", "lbs"):   lambda x: x * 2.20462,
        ("lbs", "kg"):   lambda x: x / 2.20462,
        ("celsius", "fahrenheit"): lambda x: x * 9/5 + 32,
        ("fahrenheit", "celsius"): lambda x: (x - 32) * 5/9,
        ("liters", "gallons"): lambda x: x * 0.264172,
        ("gallons", "liters"): lambda x: x * 3.78541,
        ("meters", "feet"): lambda x: x * 3.28084,
        ("feet", "meters"): lambda x: x / 3.28084,
    }
    key = (from_unit.lower(), to_unit.lower())
    if key in conversions:
        return f"{conversions[key](float(value)):.4f} {to_unit}"
    return f"Conversion from {from_unit} to {to_unit} not supported"


@registry.register(
    name="date_calculator",
    description="Calculates days between two dates, or adds/subtracts days from a date",
    parameters={
        "operation": "'days_between' or 'add_days'",
        "date1": "Date string in YYYY-MM-DD format",
        "date2": "Second date for days_between (YYYY-MM-DD), or number of days for add_days"
    },
    returns="Result as string"
)
def date_calculator(operation: str, date1: str, date2: str) -> str:
    try:
        d1 = datetime.datetime.strptime(date1, "%Y-%m-%d")
        if operation == "days_between":
            d2 = datetime.datetime.strptime(date2, "%Y-%m-%d")
            return f"{abs((d2 - d1).days)} days between {date1} and {date2}"
        elif operation == "add_days":
            result = d1 + datetime.timedelta(days=int(date2))
            return f"{result.strftime('%Y-%m-%d')} (added {date2} days to {date1})"
        return f"Unknown operation: {operation}"
    except Exception as e:
        return f"Error: {e}"


@registry.register(
    name="search_wikipedia_stub",
    description="Returns a stub summary for well-known topics (demo — no live internet)",
    parameters={"topic": "Topic to look up"},
    returns="Short text summary"
)
def search_wikipedia_stub(topic: str) -> str:
    stubs = {
        "openai": "OpenAI is an AI research company founded in 2015. It created GPT-4 and the ChatGPT product.",
    }
    for key, val in stubs.items():
        if key in topic.lower():
            return val
    return f"No stub found for '{topic}'. In production, this would query Wikipedia's API."

We implement the tool registry that allows agents to discover and use external capabilities dynamically. We design a structured system in which tools are registered with metadata, including parameters, descriptions, and return values. We also implement several practical tools, such as a calculator, unit converter, date calculator, and a Wikipedia search stub that the agents can invoke during execution.
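The calculator tool evaluates model-generated text with `eval` and an emptied `__builtins__`. That restriction is not a real sandbox, so outside a demo it may be worth swapping in a different technique. The sketch below is one possible alternative, not the tutorial's method: it parses the expression with the standard `ast` module and walks only a whitelist of node types. `safe_calculator`, `_eval_node`, and the whitelists are hypothetical names, and the sketch makes no attempt to bound the cost of very large exponents.

```python
import ast
import math
import operator

# Whitelisted operators, functions, and constants (illustrative choices)
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_FUNCS = {"sqrt": math.sqrt, "log": math.log, "abs": abs, "round": round}
_CONSTS = {"pi": math.pi, "e": math.e}


def _eval_node(node):
    """Recursively evaluate a parsed expression, rejecting anything unlisted."""
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    if isinstance(node, ast.Name) and node.id in _CONSTS:
        return _CONSTS[node.id]
    if (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCS
            and not node.keywords):
        return _FUNCS[node.func.id](*[_eval_node(arg) for arg in node.args])
    raise ValueError(f"Unsupported expression element: {type(node).__name__}")


def safe_calculator(expression: str) -> str:
    """Same string-in, string-out contract as the tutorial's calculator tool."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_eval_node(tree.body))
    except Exception as e:
        return f"Error: {e}"


print(safe_calculator("2 ** 10 + 5 * 3"))    # arithmetic is evaluated
print(safe_calculator("__import__('os')"))   # arbitrary calls are rejected
```

Because the function keeps the same signature and return convention, it could be registered under the `calculator` name in place of the `eval`-based version.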

@registry.register(
    name="statistics_engine",
    description="Computes descriptive statistics on a list of numbers",
    parameters={"numbers": "Comma-separated list of numbers, e.g. '4,8,15,16,23,42'"},
    returns="JSON with mean, median, std_dev, min, max, count"
)
def statistics_engine(numbers: str) -> str:
    try:
        nums = [float(x.strip()) for x in numbers.split(",")]
        n = len(nums)
        mean = sum(nums) / n
        sorted_nums = sorted(nums)
        mid = n // 2
        median = sorted_nums[mid] if n % 2 else (sorted_nums[mid-1] + sorted_nums[mid]) / 2
        std_dev = math.sqrt(sum((x - mean) ** 2 for x in nums) / n)
        return json.dumps({
            "count": n, "mean": round(mean, 4), "median": round(median, 4),
            "std_dev": round(std_dev, 4), "min": min(nums),
            "max": max(nums), "range": max(nums) - min(nums)
        }, indent=2)
    except Exception as e:
        return f"Error: {e}"


@registry.register(
    name="list_sorter",
    description="Sorts a comma-separated list of numbers",
    parameters={"numbers": "Comma-separated numbers", "order": "'asc' or 'desc'"},
    returns="Sorted comma-separated list"
)
def list_sorter(numbers: str, order: str = "asc") -> str:
    nums = [float(x.strip()) for x in numbers.split(",")]
    nums.sort(reverse=(order == "desc"))
    return ", ".join(str(n) for n in nums)


@dataclass
class MemoryEntry:
    role: str
    content: str
    timestamp: float = field(default_factory=time.time)
    metadata: Dict = field(default_factory=dict)


class MemoryManager:
    def __init__(self, config: BlueprintMemory, llm_client: OpenAI):
        self.config = config
        self.client = llm_client
        self._history: List[MemoryEntry] = []
        self._summary: str = ""

    def add(self, role: str, content: str, metadata: Dict = None):
        self._history.append(MemoryEntry(role=role, content=content, metadata=metadata or {}))
        if (self.config.type == MemoryType.EPISODIC and
                len(self._history) > self.config.summarize_after):
            self._compress_memory()

    def _compress_memory(self):
        to_compress = self._history[:-self.config.window_size]
        self._history = self._history[-self.config.window_size:]
        text = "\n".join(f"{e.role}: {e.content[:200]}" for e in to_compress)
        try:
            resp = self.client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content":
                    f"Summarize this conversation history in 3 sentences:\n{text}"}],
                max_tokens=150
            )
            self._summary += " " + resp.choices[0].message.content.strip()
        except Exception:
            self._summary += f" [compressed {len(to_compress)} messages]"

    def get_messages(self, system_prompt: str) -> List[Dict]:
        messages = [{"role": "system", "content": system_prompt}]
        if self._summary:
            messages.append({"role": "system",
                "content": f"[Memory Summary]: {self._summary.strip()}"})
        for entry in self._history[-self.config.window_size:]:
            messages.append({
                "role": entry.role if entry.role != "tool" else "assistant",
                "content": entry.content
            })
        return messages

    def clear(self):
        self._history = []
        self._summary = ""

    @property
    def message_count(self) -> int:
        return len(self._history)

We extend the tool ecosystem and introduce the memory management layer that stores conversation history and compresses it when necessary. We implement statistical tools and sorting utilities that enable the data analysis agent to perform structured numerical operations. At the same time, we design a memory system that tracks interactions, summarizes long histories, and provides contextual messages to the language model.
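The interaction between `window_size` and `summarize_after` is easier to see without an API call. The following sketch reproduces only the trimming logic, with the model-based summarizer replaced by a stub. `WindowedMemory` and `stub_summarizer` are hypothetical names, and the small default sizes are chosen only to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Entry = Tuple[str, str]  # (role, content)


def stub_summarizer(entries: List[Entry]) -> str:
    # Hypothetical stand-in for the LLM summarization call
    return f"[compressed {len(entries)} messages]"


@dataclass
class WindowedMemory:
    window_size: int = 3       # messages kept verbatim after compression
    summarize_after: int = 5   # compression triggers once history exceeds this
    summarizer: Callable[[List[Entry]], str] = stub_summarizer
    history: List[Entry] = field(default_factory=list)
    summary: str = ""

    def add(self, role: str, content: str) -> None:
        self.history.append((role, content))
        if len(self.history) > self.summarize_after:
            # Everything older than the window is folded into the summary
            older = self.history[:-self.window_size]
            self.history = self.history[-self.window_size:]
            self.summary = (self.summary + " " + self.summarizer(older)).strip()


mem = WindowedMemory()
for i in range(6):
    mem.add("user", f"message {i}")

# The sixth message pushes the history past summarize_after, so the three
# oldest entries are summarized and the three newest are kept verbatim.
print(len(mem.history), mem.summary)
print(mem.history[0])
```

One consequence of this design is that compression happens in bursts: the history grows up to `summarize_after` entries, then drops back to `window_size`.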

@dataclass
class PlanStep:
    step_id: int
    description: str
    tool: Optional[str]
    tool_args: Dict[str, Any]
    reasoning: str


@dataclass
class Plan:
    task: str
    steps: List[PlanStep]
    strategy: PlanningStrategy


class Planner:
    def __init__(self, blueprint: CognitiveBlueprint,
                 registry: ToolRegistry, llm_client: OpenAI):
        self.blueprint = blueprint
        self.registry  = registry
        self.client    = llm_client

    def _build_planner_prompt(self) -> str:
        bp = self.blueprint
        return textwrap.dedent(f"""
            You are {bp.identity.name}, version {bp.identity.version}.
            {bp.identity.description}

            ## Your Goals:
            {chr(10).join(f'  - {g}' for g in bp.goals)}

            ## Your Constraints:
            {chr(10).join(f'  - {c}' for c in bp.constraints)}

            ## Available Tools:
            {self.registry.get_tool_descriptions(bp.tools)}

            ## Planning Strategy: {bp.planning.strategy}
            ## Max Steps: {bp.planning.max_steps}

            Given a user task, produce a JSON execution plan with this exact structure:
            {{
              "steps": [
                {{
                  "step_id": 1,
                  "description": "What this step does",
                  "tool": "tool_name or null if no tool needed",
                  "tool_args": {{"arg1": "value1"}},
                  "reasoning": "Why this step is needed"
                }}
              ]
            }}

            Rules:
            - Only use tools listed above
            - Set tool to null for pure reasoning steps
            - Keep steps <= {bp.planning.max_steps}
            - Return ONLY valid JSON, no markdown fences
            {bp.system_prompt_extra}
        """).strip()

    def plan(self, task: str, memory: MemoryManager) -> Plan:
        system_prompt = self._build_planner_prompt()
        messages = memory.get_messages(system_prompt)
        messages.append({"role": "user", "content":
            f"Create a plan to complete this task: {task}"})
        resp = self.client.chat.completions.create(
            model="gpt-4o-mini", messages=messages,
            max_tokens=1200, temperature=0.2
        )
        raw = resp.choices[0].message.content.strip()
        raw = raw.replace("```json", "").replace("```", "").strip()
        data = json.loads(raw)
        steps = [
            PlanStep(
                step_id=s["step_id"], description=s["description"],
                tool=s.get("tool"), tool_args=s.get("tool_args", {}),
                reasoning=s.get("reasoning", "")
            )
            for s in data["steps"]
        ]
        return Plan(task=task, steps=steps, strategy=self.blueprint.planning.strategy)


@dataclass
class StepResult:
    step_id: int
    success: bool
    output: str
    tool_used: Optional[str]
    error: Optional[str] = None


@dataclass
class ExecutionTrace:
    plan: Plan
    results: List[StepResult]
    final_answer: str


class Executor:
    def __init__(self, blueprint: CognitiveBlueprint,
                 registry: ToolRegistry, llm_client: OpenAI):
        self.blueprint = blueprint
        self.registry  = registry
        self.client    = llm_client

We implement the planning system that transforms a user task into a structured execution plan composed of multiple steps. We design a planner that instructs the language model to produce a JSON plan containing reasoning, tool selection, and arguments for each step. This planning layer allows the agent to break complex problems into smaller executable actions before performing them.
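Because the plan arrives as free-form model text, the parsing step is where malformed output first shows up. The sketch below illustrates one way to parse such a reply defensively: it strips markdown fences, checks that a non-empty `steps` list exists, caps the number of steps, normalizes a `"null"` string to `None`, and rejects tools that are not in the allowed list. `parse_plan` is a hypothetical helper, and these checks are assumptions about what is worth validating, not part of the tutorial's `Planner`.

```python
import json
from typing import Any, Dict, List, Optional

FENCE = "`" * 3  # markdown code fence, built programmatically


def parse_plan(raw: str, allowed_tools: List[str], max_steps: int) -> List[Dict[str, Any]]:
    """Parse a model reply into plan steps, rejecting malformed plans early."""
    cleaned = raw.replace(FENCE + "json", "").replace(FENCE, "").strip()
    data = json.loads(cleaned)  # raises json.JSONDecodeError on invalid JSON
    steps = data.get("steps") if isinstance(data, dict) else None
    if not isinstance(steps, list) or not steps:
        raise ValueError("Plan must contain a non-empty 'steps' list")

    parsed: List[Dict[str, Any]] = []
    for index, step in enumerate(steps[:max_steps], start=1):
        tool: Optional[str] = step.get("tool")
        if tool in (None, "", "null"):
            tool = None  # pure reasoning step
        elif tool not in allowed_tools:
            raise ValueError(f"Step {index} uses unknown tool '{tool}'")
        parsed.append({
            "step_id": step.get("step_id", index),
            "description": step.get("description", ""),
            "tool": tool,
            "tool_args": step.get("tool_args") or {},
            "reasoning": step.get("reasoning", ""),
        })
    return parsed


# A made-up model reply, wrapped in a fence despite the prompt's instructions
raw_reply = FENCE + "json\n" + json.dumps({
    "steps": [
        {"step_id": 1, "description": "Compute 15% of 2500", "tool": "calculator",
         "tool_args": {"expression": "0.15 * 2500"}, "reasoning": "Direct arithmetic"},
        {"step_id": 2, "description": "Explain the result", "tool": "null",
         "tool_args": {}, "reasoning": "Summarize for the user"},
    ]
}) + "\n" + FENCE

plan = parse_plan(raw_reply, allowed_tools=["calculator"], max_steps=6)
for step in plan:
    print(step["step_id"], step["tool"], step["tool_args"])
```

Rejecting an unknown tool at parse time gives the retry loop a specific error to act on, instead of a failed tool call halfway through execution.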

    # Executor methods, continued from the class definition above
    def execute_plan(self, plan: Plan, memory: MemoryManager,
                     verbose: bool = True) -> ExecutionTrace:
        results: List[StepResult] = []
        if verbose:
            console.print(f"\n[bold yellow]⚡ Executing:[/] {plan.task}")
            console.print(f"   Strategy: {plan.strategy} | Steps: {len(plan.steps)}")

        for step in plan.steps:
            if verbose:
                console.print(f"\n  [cyan]Step {step.step_id}:[/] {step.description}")
            try:
                if step.tool and step.tool != "null":
                    if verbose:
                        console.print(f"   🔧 Tool: [green]{step.tool}[/] | Args: {step.tool_args}")
                    output = self.registry.call(step.tool, **step.tool_args)
                    result = StepResult(step.step_id, True, str(output), step.tool)
                    if verbose:
                        console.print(f"   ✅ Result: {output}")
                else:
                    context_text = "\n".join(
                        f"Step {r.step_id} result: {r.output}" for r in results)
                    prompt = (
                        f"Previous results:\n{context_text}\n\n"
                        f"Now complete this step: {step.description}\n"
                        f"Reasoning hint: {step.reasoning}"
                    ) if context_text else (
                        f"Complete this step: {step.description}\n"
                        f"Reasoning hint: {step.reasoning}"
                    )
                    sys_prompt = (
                        f"You are {self.blueprint.identity.name}. "
                        f"{self.blueprint.identity.description}. "
                        f"Constraints: {'; '.join(self.blueprint.constraints)}"
                    )
                    resp = self.client.chat.completions.create(
                        model="gpt-4o-mini",
                        messages=[
                            {"role": "system", "content": sys_prompt},
                            {"role": "user",   "content": prompt}
                        ],
                        max_tokens=500, temperature=0.3
                    )
                    output = resp.choices[0].message.content.strip()
                    result = StepResult(step.step_id, True, output, None)
                    if verbose:
                        preview = output[:120] + "..." if len(output) > 120 else output
                        console.print(f"   🤔 Reasoning: {preview}")
            except Exception as e:
                result = StepResult(step.step_id, False, "", step.tool, str(e))
                if verbose:
                    console.print(f"   ❌ Error: {e}")
            results.append(result)

        final_answer = self._synthesize(plan, results, memory)
        return ExecutionTrace(plan=plan, results=results, final_answer=final_answer)

    def _synthesize(self, plan: Plan, results: List[StepResult],
                    memory: MemoryManager) -> str:
        steps_summary = "\n".join(
            f"Step {r.step_id} ({'✅' if r.success else '❌'}): {r.output[:300]}"
            for r in results
        )
        synthesis_prompt = (
            f"Original task: {plan.task}\n\n"
            f"Step results:\n{steps_summary}\n\n"
            f"Provide a clear, complete final answer. Integrate all step results."
        )
        sys_prompt = (
            f"You are {self.blueprint.identity.name}. "
            + ("Always show your reasoning. " if self.blueprint.validation.require_reasoning else "")
            + f"Goals: {'; '.join(self.blueprint.goals)}"
        )
        messages = memory.get_messages(sys_prompt)
        messages.append({"role": "user", "content": synthesis_prompt})
        resp = self.client.chat.completions.create(
            model="gpt-4o-mini", messages=messages,
            max_tokens=600, temperature=0.3
        )
        return resp.choices[0].message.content.strip()


@dataclass
class ValidationResult:
    passed: bool
    issues: List[str]
    score: float


class Validator:
    def __init__(self, blueprint: CognitiveBlueprint, llm_client: OpenAI):
        self.blueprint = blueprint
        self.client    = llm_client

    def validate(self, answer: str, task: str,
                 use_llm_check: bool = False) -> ValidationResult:
        issues = []
        v = self.blueprint.validation

        if len(answer) < v.min_response_length:
            issues.append(f"Response too short: {len(answer)} chars (min: {v.min_response_length})")

        answer_lower = answer.lower()
        for phrase in v.forbidden_phrases:
            if phrase.lower() in answer_lower:
                issues.append(f"Forbidden phrase detected: '{phrase}'")

        if v.require_reasoning:
            indicators = ["because", "therefore", "since", "step", "first",
                          "result", "calculated", "computed", "found that"]
            if not any(ind in answer_lower for ind in indicators):
                issues.append("Response lacks visible reasoning or explanation")

        if use_llm_check:
            issues.extend(self._llm_quality_check(answer, task))

        return ValidationResult(passed=len(issues) == 0,
                                issues=issues,
                                score=max(0.0, 1.0 - len(issues) * 0.25))

    def _llm_quality_check(self, answer: str, task: str) -> List[str]:
        prompt = (
            f"Task: {task}\n\nAnswer: {answer[:500]}\n\n"
            f'Does this answer address the task? Reply JSON: {{"on_topic": true/false, "issue": "..."}}'
        )
        try:
            resp = self.client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": prompt}],
                max_tokens=100
            )
            raw = resp.choices[0].message.content.strip().replace("```json","").replace("```","")
            data = json.loads(raw)
            if not data.get("on_topic", True):
                return [f"LLM quality check: {data.get('issue', 'off-topic')}"]
        except Exception:
            pass
        return []

We build the executor, which performs the steps generated by the planner, and the validator, which checks the final answer. For each step, the executor either calls a registered tool or asks the language model to reason, depending on the step definition, and then synthesizes a final answer from all step results. The validator checks that answer against the blueprint's validation rules: minimum length, forbidden phrases, and visible reasoning.
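The rule-based part of the validator can be exercised on its own. The sketch below restates those three checks and the scoring rule, `score = max(0, 1 - 0.25 * issues)`, as a standalone function. `rule_check` is a hypothetical name and the issue messages are shortened; the indicator list is copied from the validator code.

```python
from typing import List, Tuple

REASONING_INDICATORS = ["because", "therefore", "since", "step", "first",
                        "result", "calculated", "computed", "found that"]


def rule_check(answer: str, min_length: int, forbidden_phrases: List[str],
               require_reasoning: bool) -> Tuple[List[str], float]:
    """Return (issues, score); each issue costs 0.25, floored at 0."""
    issues: List[str] = []
    lowered = answer.lower()
    if len(answer) < min_length:
        issues.append("too short")
    for phrase in forbidden_phrases:
        if phrase.lower() in lowered:
            issues.append(f"forbidden phrase: {phrase}")
    if require_reasoning and not any(ind in lowered for ind in REASONING_INDICATORS):
        issues.append("no visible reasoning")
    return issues, max(0.0, 1.0 - 0.25 * len(issues))


rules = dict(min_length=20, forbidden_phrases=["I don't know"], require_reasoning=True)

good = rule_check("15% of 2,500 is 375 because 0.15 * 2500 = 375.", **rules)
bad = rule_check("I don't know.", **rules)
# "Sincerely" contains the substring "since", so this passes the reasoning check
loophole = rule_check("Sincerely yours, the answer is 375 units.", **rules)

print(good)      # no issues, full score
print(bad)       # three issues: too short, forbidden phrase, no reasoning
print(loophole)  # passes despite containing no actual reasoning
```

The third call shows a limitation of plain substring matching: an indicator can match inside an unrelated word. Matching on word boundaries, or enabling the optional LLM check, would be possible ways to tighten this.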

@dataclass
class AgentResponse:
    agent_name: str
    task: str
    final_answer: str
    trace: ExecutionTrace
    validation: ValidationResult
    retries: int
    total_steps: int


class RuntimeEngine:
    def __init__(self, blueprint: CognitiveBlueprint,
                 registry: ToolRegistry, llm_client: OpenAI):
        self.blueprint = blueprint
        self.memory    = MemoryManager(blueprint.memory, llm_client)
        self.planner   = Planner(blueprint, registry, llm_client)
        self.executor  = Executor(blueprint, registry, llm_client)
        self.validator = Validator(blueprint, llm_client)

    def run(self, task: str, verbose: bool = True) -> AgentResponse:
        bp = self.blueprint
        if verbose:
            console.print(Panel(
                f"[bold]Agent:[/] {bp.identity.name} v{bp.identity.version}\n"
                f"[bold]Task:[/] {task}\n"
                f"[bold]Strategy:[/] {bp.planning.strategy} | "
                f"Max Steps: {bp.planning.max_steps} | "
                f"Max Retries: {bp.planning.max_retries}",
                title="🚀 Runtime Engine Starting", border_style="blue"
            ))

        self.memory.add("user", task)
        retries, trace, validation = 0, None, None

        for attempt in range(bp.planning.max_retries + 1):
            if attempt > 0 and verbose:
                console.print(f"\n[yellow]⟳ Retry {attempt}/{bp.planning.max_retries}[/]")
                console.print(f"  Issues: {', '.join(validation.issues)}")

            if verbose:
                console.print("\n[bold magenta]📋 Phase 1: Planning...[/]")
            try:
                plan = self.planner.plan(task, self.memory)
                if verbose:
                    tree = Tree(f"[bold]Plan ({len(plan.steps)} steps)[/]")
                    for s in plan.steps:
                        icon = "🔧" if s.tool else "🤔"
                        branch = tree.add(f"{icon} Step {s.step_id}: {s.description}")
                        if s.tool:
                            branch.add(f"[green]Tool:[/] {s.tool}")
                            branch.add(f"[yellow]Args:[/] {s.tool_args}")
                    console.print(tree)
            except Exception as e:
                if verbose: console.print(f"[red]Planning failed:[/] {e}")
                break

            if verbose:
                console.print("\n[bold magenta]⚡ Phase 2: Executing...[/]")
            trace = self.executor.execute_plan(plan, self.memory, verbose=verbose)

            if verbose:
                console.print("\n[bold magenta]✅ Phase 3: Validating...[/]")
            validation = self.validator.validate(trace.final_answer, task)

            if verbose:
                status = "[green]PASSED[/]" if validation.passed else "[red]FAILED[/]"
                console.print(f"  Validation: {status} | Score: {validation.score:.2f}")
                for issue in validation.issues:
                    console.print(f"  ⚠️  {issue}")

            if validation.passed:
                break

            retries += 1
            self.memory.add("assistant", trace.final_answer)
            self.memory.add("user",
                f"Your previous answer had issues: {'; '.join(validation.issues)}. "
                f"Please improve."
            )

        if trace:
            self.memory.add("assistant", trace.final_answer)

        if verbose:
            console.print(Panel(
                trace.final_answer if trace else "No answer generated",
                title=f"🎯 Final Answer — {bp.identity.name}",
                border_style="green"
            ))

        return AgentResponse(
            agent_name=bp.identity.name, task=task,
            final_answer=trace.final_answer if trace else "",
            trace=trace, validation=validation,
            retries=retries,
            total_steps=len(trace.results) if trace else 0
        )

    def reset_memory(self):
        self.memory.clear()


def build_engine(blueprint_yaml: str, registry: ToolRegistry,
                 llm_client: OpenAI) -> RuntimeEngine:
    return RuntimeEngine(load_blueprint_from_yaml(blueprint_yaml), registry, llm_client)


if __name__ == "__main__":

    print("\n" + "="*60)
    print("DEMO 1: ResearchBot")
    print("="*60)
    research_engine = build_engine(RESEARCH_AGENT_YAML, registry, client)
    research_engine.run(
        task=(
            "how many steps of 20cm height would that be? Also, if I burn 0.15 "
            "calories per step, what's the total calorie burn? Show all calculations."
        )
    )

    print("\n" + "="*60)
    print("DEMO 2: DataAnalystBot")
    print("="*60)
    analyst_engine = build_engine(DATA_ANALYST_YAML, registry, client)
    analyst_engine.run(
        task=(
            "Analyze this dataset of monthly sales figures (in thousands): "
            "142, 198, 173, 155, 221, 189, 203, 167, 244, 198, 212, 231. "
            "Compute key statistics, identify the best and worst months, "
            "and calculate growth from first to last month."
        )
    )

    print("\n" + "="*60)
    print("PORTABILITY DEMO: Same task → 2 different blueprints")
    print("="*60)
    SHARED_TASK = "Calculate 15% of 2,500 and tell me the result."

    responses = {}
    for name, yaml_str in [
        ("ResearchBot",    RESEARCH_AGENT_YAML),
        ("DataAnalystBot", DATA_ANALYST_YAML),
    ]:
        eng = build_engine(yaml_str, registry, client)
        responses[name] = eng.run(SHARED_TASK, verbose=False)

    table = Table(title="🔄 Blueprint Portability", show_header=True, show_lines=True)
    table.add_column("Agent",   style="cyan",   width=18)
    table.add_column("Steps",   style="yellow", width=6)
    table.add_column("Valid?",  width=7)
    table.add_column("Score",   width=6)
    table.add_column("Answer Preview", width=55)

    for name, r in responses.items():
        table.add_row(
            name, str(r.total_steps),
            "✅" if r.validation.passed else "❌",
            f"{r.validation.score:.2f}",
            r.final_answer[:140] + "..."
        )
    console.print(table)

We assemble the runtime engine that orchestrates planning, execution, memory updates, and validation into a complete autonomous workflow. We run multiple demonstrations showing how different blueprints produce different behaviors while using the same core architecture. Finally, we illustrate blueprint portability by running the same task across two agents and comparing their results.
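Stripped of logging and model calls, the engine's control flow is a bounded plan-execute-validate loop in which validation issues are fed back as context for the next attempt. The sketch below isolates that loop with stub functions so it can run offline. `run_with_retries`, `fake_agent`, and `fake_validator` are hypothetical names, and planning and execution are collapsed into a single callable for brevity.

```python
from typing import Callable, List, Tuple


def run_with_retries(
    task: str,
    attempt_fn: Callable[[str, List[str]], str],
    validate_fn: Callable[[str], List[str]],
    max_retries: int,
) -> Tuple[str, int, bool]:
    """Try up to max_retries + 1 times; return (answer, retries, passed)."""
    feedback: List[str] = []  # plays the role of the memory entries added on failure
    answer, retries = "", 0
    for _ in range(max_retries + 1):
        answer = attempt_fn(task, feedback)
        issues = validate_fn(answer)
        if not issues:
            return answer, retries, True
        retries += 1
        feedback.append("Your previous answer had issues: " + "; ".join(issues))
    return answer, retries, False


def fake_agent(task: str, feedback: List[str]) -> str:
    # Stub: gives a bare number first, then shows its work once it has feedback
    if not feedback:
        return "375"
    return "Step 1: 0.15 * 2500 = 375, therefore the result is 375."


def fake_validator(answer: str) -> List[str]:
    # Stub: a single rule standing in for the blueprint's validation config
    if "therefore" in answer:
        return []
    return ["Response lacks visible reasoning or explanation"]


answer, retries, passed = run_with_retries(
    "Calculate 15% of 2,500", fake_agent, fake_validator, max_retries=2
)
print(retries, passed)
print(answer)
```

In this run the first attempt fails validation, the feedback is recorded, and the second attempt passes, so one retry is reported.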

In conclusion, we created a fully functional Auton-style runtime system that integrates cognitive blueprints, tool registries, memory management, planning, execution, and validation into a cohesive framework. We demonstrated how different agents can share the same underlying architecture while behaving differently through customized blueprints that change their goals, tools, memory settings, and validation rules. Through this implementation, we not only explored how modern runtime agents operate but also built a foundation that we can extend with richer tools, stronger memory systems, and more advanced autonomous behaviors.

