Why Closed Data Stacks Fail in the Age of AI Agents

2026-05-14 04:33:11

AI agents can dramatically increase query volumes in data warehouses, but closed ecosystems funnel all those queries through expensive compute paths. In a recent episode of The New Stack podcast, Fivetran CPO Anjan Kundavaram discussed the economics of this shift, warning that locking down data infrastructure is the wrong response. Instead, he advocates for open data infrastructure and semantic discipline to enable cost-effective, high-quality AI analytics. Below, we explore key insights from his conversation.

Why are closed data stacks particularly problematic for AI agents?

Closed data stacks route every query through the same expensive compute engine, regardless of the task's complexity. Kundavaram compares this to "using a Lamborghini to mow the lawn all the time." AI agents can run 10 to 100 times more queries than traditional analytics workflows, and in a closed system, each query incurs the same high cost. With multiple engines available in an open stack, agents can intelligently route simple queries to cheaper options and reserve costly resources for complex analytical questions. This flexibility is lost when the stack is locked, leading to inflated operational expenses that become unsustainable at scale.
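The routing idea above can be sketched in a few lines. This is a minimal illustration, not any vendor's actual routing logic: the engine names, per-query costs, and the integer "complexity score" are all invented for the example. The point is simply that when multiple engines are reachable, an agent can send each query to the cheapest engine capable of handling it, whereas a closed stack forces every query through the premium path.

```python
from dataclasses import dataclass

# Hypothetical engine profiles; names, prices, and complexity ceilings
# are illustrative, not taken from any real product.
@dataclass
class Engine:
    name: str
    cost_per_query: float   # dollars per query (illustrative)
    max_complexity: int     # highest complexity score the engine handles well

ENGINES = [
    Engine("duckdb_on_lake", cost_per_query=0.001, max_complexity=3),
    Engine("mid_tier_warehouse", cost_per_query=0.05, max_complexity=7),
    Engine("premium_warehouse", cost_per_query=0.50, max_complexity=10),
]

def route(query_complexity: int) -> Engine:
    """Pick the cheapest engine able to handle the query's complexity."""
    for engine in sorted(ENGINES, key=lambda e: e.cost_per_query):
        if query_complexity <= engine.max_complexity:
            return engine
    return ENGINES[-1]  # fall back to the most capable engine

# A burst of 100 agent queries, mostly simple lookups:
complexities = [1] * 90 + [8] * 10
total = sum(route(c).cost_per_query for c in complexities)
```

With these illustrative numbers, the routed workload costs $5.09, while forcing all 100 queries through the premium engine would cost $50.00, which is the "Lamborghini to mow the lawn" effect in miniature.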


How do agents change cost dynamics compared to human-driven analytics?

Human analysts expect near-instant responses, but agents can tolerate longer wait times if doing so significantly reduces cost. Kundavaram notes, "An agent could go spend more time if the agent thinks you're going to save 10x the cost." This tolerance lets agents choose slower, cheaper compute paths for routine queries and reserve faster engines for cases where speed actually matters. In a closed stack, however, every query, whether trivial or complex, traverses the same expensive path, negating this optimization. The result is a cost structure that penalizes high-volume agentic workloads rather than benefiting from them.
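The latency-for-cost tradeoff can be framed as a small constrained choice: pick the cheapest execution option that still meets the caller's deadline. The option names, latencies, and prices below are illustrative assumptions, not real benchmarks; the structure of the decision is what matters.

```python
# Hypothetical (name, latency_seconds, cost_dollars) options for one query;
# all numbers are invented for illustration.
OPTIONS = [
    ("premium_warehouse", 2.0, 0.50),
    ("mid_tier_warehouse", 20.0, 0.05),
    ("batch_on_lake", 300.0, 0.005),
]

def choose(deadline_seconds: float) -> str:
    """Pick the cheapest option that still meets the deadline."""
    feasible = [opt for opt in OPTIONS if opt[1] <= deadline_seconds]
    if not feasible:
        # Nothing meets the deadline; take the fastest option available.
        return min(OPTIONS, key=lambda o: o[1])[0]
    return min(feasible, key=lambda o: o[2])[0]
```

A human waiting on a dashboard effectively calls `choose(5)` and gets the premium engine; an agent running an overnight batch can call `choose(3600)` and pay a hundredth of the price for the same answer.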

What is the "triple whammy" Kundavaram describes for enterprises?

Kundavaram warns that when customer data is scattered across many systems without consolidated context, three problems compound: poor AI output because agents lack complete information, sharply higher costs from the sheer volume of queries, and wasted resources on queries that yield weak results. He calls this a "triple whammy." The root cause is fragmented data silos; even if a closed stack centralizes compute, it doesn't address the context gap. Enterprises must first unify data and establish semantic clarity to avoid this vicious cycle, which otherwise makes agentic AI both expensive and unreliable.

Why is the instinct to clamp down on analytics budgets counterproductive?

Many data leaders respond to rising query costs by implementing strict controls. Kundavaram recounts a conversation with a leader from a large company whose analytics budgets had soared. Fivetran's own internal team initially wanted to impose limits. But Kundavaram argues this is exactly the wrong move: "No, don't put controls. Let's innovate." Locking down usage stifles experimentation and prevents organizations from discovering the productivity gains that agents can unlock. Instead, companies should invest in open infrastructure that allows agents to optimize costs naturally. The real solution is not restriction but architectural flexibility and data governance.


What role does semantic discipline play in making AI agents cost-effective?

Open data infrastructure alone isn't enough—semantic discipline ensures that data is well-defined, consistent, and easily interpretable by agents. Without it, agents may still run expensive queries on poor-quality data, negating the benefits of open compute. Kundavaram emphasizes that consolidating context and defining clear business semantics allows agents to request only relevant information, reducing both cost and error. This discipline includes standardized naming, metadata management, and unified data models. When combined with open access to multiple compute engines, it creates an environment where agents operate efficiently, delivering accurate answers at minimal expense.
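One concrete form semantic discipline can take is a semantic layer: a single, governed mapping from business terms to their tables and definitions, which an agent consults instead of guessing at raw schemas. The sketch below is a toy assumption, with invented metric names, tables, and SQL expressions; the design point is that ambiguity fails loudly rather than producing an expensive, wrong query.

```python
# A tiny, hypothetical semantic layer. Metric names, tables, and SQL
# expressions are illustrative, not from any real schema.
SEMANTIC_MODEL = {
    "monthly_recurring_revenue": {
        "table": "finance.subscriptions",
        "expression": "SUM(plan_price) FILTER (WHERE status = 'active')",
        "grain": "month",
    },
    "active_customers": {
        "table": "crm.accounts",
        "expression": "COUNT(DISTINCT account_id) FILTER (WHERE churned_at IS NULL)",
        "grain": "day",
    },
}

def resolve_metric(name: str) -> dict:
    """Return one unambiguous definition for a metric, or fail loudly."""
    if name not in SEMANTIC_MODEL:
        raise KeyError(f"Unknown metric {name!r}; the agent should ask, not guess.")
    return SEMANTIC_MODEL[name]
```

Because every agent resolves "active customers" to the same table and expression, the organization gets one answer per business question, and queries touch only the relevant data.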

How does Fivetran's "Open Data Infrastructure" initiative address these challenges?

Fivetran promotes Open Data Infrastructure, a framework that prevents vendors from taxing AI workloads through proprietary compute paths. At Google Cloud Next, the company launched a Data Access Benchmark to measure and expose hidden costs. The goal is to give enterprises visibility into how their data flows across engines and to encourage interoperability. By supporting multiple compute destinations—like data lakes and warehouses—Fivetran enables agents to choose the most cost-effective path for each query. Kundavaram argues that this approach is not just pro-customer but essential for the agent era, where inflexible stacks will become obsolete.
