Search Is Becoming Agent Infrastructure

Dec 16, 2025 · 6 min read
For the past few years, the "Search" box has been the front door to AI. Whether through a chatbot or a Retrieval-Augmented Generation (RAG) pipeline, we treated search as a way to feed an LLM the documents it needs to provide a better answer.
But a fundamental shift is underway in how production teams build: search is being re-encapsulated as the context-acquisition layer of the Agent Runtime. We are moving from a world where retrieval is a passive "lookup" to one where it is an active, state-aware primitive that fuels autonomous action.
Here are the six engineering shifts defining the transition from search infrastructure to agent infrastructure.

Search used to be the endpoint

In classical information retrieval (IR), the primary objective was to return a ranked list of documents that best match a user query. Retrieval was the destination: once the system presented results, its job was done, leaving the human user to interpret and take action. Even early RAG systems followed this "retrieve-then-read" architecture—a single-turn process where the search was essentially the final evidence injection before the response.

In agent systems, search becomes a workflow step

In the agentic paradigm, search is no longer a destination; it is an internal tool used by the system to resolve uncertainty. Agents don't just "find information"—they invoke search primitives to:
  • Clarify intent by probing indices when a user request is ambiguous.
  • Retrieve evidence to support multi-step reasoning.
  • Decide on the next tool by searching through API documentation or tool registries.
  • Verify state and recover from failure by searching system logs or database events to see if a prior action succeeded.
As noted in A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications, agentic search represents a shift from retrieval as static evidence injection to retrieval as dynamic tool use for problem solving.
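The pattern above can be sketched as a minimal agent loop in which search is a tool invoked to resolve uncertainty before acting. Everything here is illustrative (the log format, the function names, the three-way decision) and stands in for whatever search primitives a real runtime exposes:

```python
# Illustrative sketch: search as an internal tool for state verification,
# not a user-facing answer interface. All names and data are hypothetical.

LOGS = [
    "2025-12-16 job-42 status=SUCCEEDED",
    "2025-12-16 job-43 status=FAILED reason=timeout",
]

def search_logs(query: str) -> list[str]:
    """Stand-in for a log/event search primitive."""
    return [line for line in LOGS if query in line]

def verify_prior_action(job_id: str) -> str:
    """The agent searches system logs to decide its next move."""
    hits = search_logs(job_id)
    if any("SUCCEEDED" in h for h in hits):
        return "skip"         # prior action worked; proceed
    if any("FAILED" in h for h in hits):
        return "retry"        # recover from failure
    return "investigate"      # uncertainty remains; search deeper

print(verify_prior_action("job-42"))  # skip
print(verify_prior_action("job-43"))  # retry
```

The point is the control flow: retrieval output feeds a decision inside the loop, rather than being handed back to a human.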

Retrieval must become state-aware

Standard search is often "stateless"—a query in, a result out. However, an agent’s needs change depending on where it is in its execution trajectory. State-aware retrieval means the retrieval query is conditioned on workflow state, task history, permissions, freshness, and the next action being considered.
Instead of a simple vector lookup, the system maintains task-scoped memory, workflow state, and long-term context separately, then assembles only the slice needed for the next action. This ensures the agent isn't just seeing what is relevant to a keyword, but what is correct for its current state of reasoning.
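One way to make this concrete is to condition the retrieval request on a state object rather than on the raw query alone. This is a hedged sketch; the `AgentState` fields and the request shape are assumptions, not a standard interface:

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    """Hypothetical workflow state carried through an agent's trajectory."""
    task: str
    step: str
    permissions: set = field(default_factory=set)
    history: list = field(default_factory=list)  # recently used context units

def build_retrieval_request(state: AgentState, user_query: str) -> dict:
    """Condition retrieval on workflow state, not just keywords."""
    return {
        "query": f"{user_query} {state.step}",          # bias toward current step
        "filters": {"acl": sorted(state.permissions)},  # only authorized sources
        "exclude": state.history[-5:],                  # avoid re-fetching recent slices
    }

state = AgentState(task="refund", step="validate-payment",
                   permissions={"billing"}, history=["doc-17"])
req = build_retrieval_request(state, "refund policy")
```

The same user query yields different requests at different points in the trajectory, which is exactly the property a stateless lookup lacks.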

Hybrid retrieval is runtime context assembly

While vector databases have dominated recent AI headlines, production teams are rediscovering that "vector DB is not enough". In many enterprise workflows, keyword search remains surprisingly competitive, especially for exact terms, high-freshness data, and structured business state.
According to research in Keyword search is all you need from Amazon Science, tool-based keyword search implementations can achieve over 90% of the performance metrics of traditional RAG systems without the complexity of a standing vector database. A production-grade agent runtime performs Hybrid Retrieval, dynamically assembling context from:
  • Keyword search (BM25) for exact matches.
  • Vector search for semantic similarity.
  • SQL / Structured search for business state and schema discovery.
  • Logs and event streams for observability and failure recovery.
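One common way to assemble these sources at runtime is rank fusion. The sketch below uses Reciprocal Rank Fusion (RRF) to merge ranked lists from a keyword retriever and a vector retriever; the document IDs are invented for illustration:

```python
def rrf_fuse(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: merge ranked doc-id lists from multiple retrievers.
    Each retriever contributes 1/(k + rank) per document; k dampens top-rank dominance."""
    scores: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["invoice-2025-12", "policy-v3", "faq-1"]   # e.g. BM25
vector_hits  = ["policy-v3", "faq-1", "handbook-7"]        # e.g. embeddings
fused = rrf_fuse([keyword_hits, vector_hits])
# "policy-v3" wins: it ranks highly in both retrievers
```

RRF needs no score calibration between retrievers, which is why it is a popular default for hybrid stacks; structured (SQL) and log results typically enter the context through filters rather than through the fused ranking.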

Context assembly needs policy

As agents move from public web search to internal enterprise knowledge, context selection becomes a policy problem. A retrieved document is not safe just because it is relevant. It must also be authorized, fresh, attributable, and appropriate for the current workflow state.
As outlined in Context Kubernetes: Declarative Orchestration of Enterprise Knowledge for Agentic AI Systems, managing agent context now involves responsibilities traditionally associated with operating systems:
  • Permission filtering: Ensuring the agent's authority is a strict subset of human authority.
  • PII Redaction: Automatically scrubbing sensitive data before it reaches the model.
  • Freshness Checks: Detecting stale or conflicting data in sub-millisecond time to prevent the agent from acting on "phantom" content.
  • Audit Logging: Recording exactly which context unit led to which agent action.

Evaluate retrieval by workflow outcome

In search infrastructure, we optimized for "Recall@K" or "Mean Reciprocal Rank" (MRR). In agent infrastructure, these metrics are secondary. The most important metrics move closer to Workflow Outcome.
Engineers are now evaluating their retrieval stacks based on:
  • Task Completion: Did the retrieved information actually solve the user's problem?
  • Tool Correctness: Did the context prevent the agent from misconfiguring an API call?
  • Failure Recovery Rate: How often did the agent use search to successfully undo a mistake or pivot after a crash?
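These three metrics can be computed directly from episode outcomes rather than from rank positions. The episode schema below is an assumption made for illustration:

```python
def workflow_metrics(episodes: list[dict]) -> dict:
    """Aggregate retrieval quality by workflow outcome.
    Each episode records what happened, not where any document ranked."""
    n = len(episodes)
    failures = [e for e in episodes if e["failed"]]
    return {
        "task_completion": sum(e["task_done"] for e in episodes) / n,
        "tool_correctness": sum(e["tool_ok"] for e in episodes) / n,
        "failure_recovery_rate": (
            sum(e["recovered"] for e in failures) / len(failures)
            if failures else None
        ),
    }

episodes = [
    {"task_done": True,  "tool_ok": True,  "failed": False, "recovered": False},
    {"task_done": True,  "tool_ok": True,  "failed": True,  "recovered": True},
    {"task_done": False, "tool_ok": False, "failed": True,  "recovered": False},
]
m = workflow_metrics(episodes)
```

A stack with perfect Recall@K can still score poorly here, which is the whole argument for moving the evaluation target downstream.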

Looking Ahead

Search did not disappear; it moved inside the agent loop. It has evolved from a user-facing answer interface into a runtime primitive for acquiring, filtering, and validating context before action. As we build these "Action Systems," the core engineering question changes: can your infrastructure supply the agent's next move with context that is verifiable, auditable, and authorized?