
Search as Code: When AI Stops Calling APIs and Starts Writing Them

Perplexity's new architecture lets models write custom search pipelines as Python code, slashing token usage by 85% on one research task and, by Perplexity's own benchmarks, outperforming frontier models. Here's why this matters for AI agents.

June 2026 4 min
Every AI agent that does research follows the same tired ritual: write a query, call the search API, get back ten blue links, read them, synthesize, repeat. This loop works, but it's fundamentally wasteful. The search engine was designed for humans scrolling through a browser, not for an LLM trying to ingest a hundred pages per minute. Most of what comes back is noise, and the agent has no way to filter it except through more token-hungry reasoning. Perplexity's new "Search as Code" architecture breaks this pattern by letting the model write its own search pipeline in Python instead of calling a fixed API. This is not an incremental improvement. It is a paradigm shift in how agents interact with information systems.

The practical win is obvious from the numbers. In a CVE research task involving 200 critical vulnerabilities, the Search as Code agent used 85 percent fewer tokens than Perplexity's own standard pipeline. That's not a minor optimization; it's a fundamental rethinking of the agent's relationship with the search layer. Instead of shoveling raw results into the context window and hoping the model can separate signal from noise, the agent writes code that does the filtering at the source. It runs parallel queries tailored to specific vendor formats, programmatically deduplicates, and verifies schemas before anything enters the context. The model becomes not just a consumer of search results, but the architect of the search itself.
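
To make "filtering at the source" concrete, here is a minimal sketch of the kind of script an agent might write: fan out a few queries in parallel, verify each record against a schema, and deduplicate before anything reaches the context window. Everything in it is an assumption for illustration. The `search` function is a stub over in-memory data rather than Perplexity's SDK, the field names are invented, and the CVE identifiers are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a search backend; a real agent would call a search SDK here.
FAKE_INDEX = {
    "vendor-a advisory": [
        {"cve_id": "CVE-0000-0001", "severity": "critical", "source": "vendor-a"},
        {"cve_id": "CVE-0000-0002", "severity": "critical", "source": "vendor-a"},
    ],
    "vendor-b bulletin": [
        {"cve_id": "CVE-0000-0002", "severity": "critical", "source": "vendor-b"},
        {"title": "marketing page with no CVE fields"},
    ],
}

# Assumed schema: the fields a record must carry to be worth keeping.
REQUIRED_FIELDS = {"cve_id", "severity", "source"}


def search(query):
    """Return raw records for one query (stubbed with in-memory data)."""
    return FAKE_INDEX.get(query, [])


def has_valid_schema(record):
    """Keep only records that carry every required field."""
    return REQUIRED_FIELDS.issubset(record)


def run_pipeline(queries):
    # Fan out the queries in parallel; map() preserves query order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        batches = list(pool.map(search, queries))

    seen = set()
    results = []
    for batch in batches:
        for record in batch:
            if not has_valid_schema(record):
                continue  # drop noise before it reaches the context window
            if record["cve_id"] in seen:
                continue  # deduplicate on the CVE identifier
            seen.add(record["cve_id"])
            results.append(record)
    return results


compact = run_pipeline(["vendor-a advisory", "vendor-b bulletin"])
print([r["cve_id"] for r in compact])  # ['CVE-0000-0001', 'CVE-0000-0002']
```

Four raw records go in and two come out. The model only ever reads the two that survived, which is where the token savings would come from in a setup like this.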

Let's talk about what makes this architecture tick. Perplexity describes three layers: the model that decides on a strategy, a sandbox that executes the generated code, and an Agentic Search SDK that exposes search primitives as composable functions. The key insight is that the model isn't writing arbitrary Python; it's orchestrating a library of building blocks. Retrieve, filter, deduplicate, rerank: each is a simple SDK call. The model's job is to sequence and parameterize them correctly. This is the same pattern we've seen in AI coding benchmarks for years: the most capable models aren't those that memorize answers, but those that can compose functions to solve novel problems. Here, the same skill is applied to the search process itself.
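
The "sequence and parameterize" idea can be sketched in a few lines. The primitives below (`retrieve`, `keep_if`, `dedupe_by`, `rerank_by`) and the `run_steps` helper are hypothetical names invented for this example, not the Agentic Search SDK's actual functions, and the corpus is made-up data. The point is only that a search strategy becomes an ordered list of small, parameterized steps.

```python
from functools import reduce

# Hypothetical corpus standing in for a search backend.
CORPUS = [
    {"url": "https://example.com/a", "text": "sandbox escape in parser", "score": 0.4},
    {"url": "https://example.com/b", "text": "release notes", "score": 0.9},
    {"url": "https://example.com/a", "text": "sandbox escape in parser", "score": 0.4},
    {"url": "https://example.com/c", "text": "sandbox hardening guide", "score": 0.7},
    {"url": "https://example.com/d", "text": "sandbox mentioned in passing", "score": 0.1},
]


def retrieve(term):
    """Step factory: ignore the incoming list and load documents mentioning `term`."""
    return lambda _docs: [d for d in CORPUS if term in d["text"]]


def keep_if(predicate):
    """Step factory: keep only documents for which `predicate` is true."""
    return lambda docs: [d for d in docs if predicate(d)]


def dedupe_by(key):
    """Step factory: keep the first document seen for each value of `key`."""
    def step(docs):
        seen, unique = set(), []
        for d in docs:
            if d[key] not in seen:
                seen.add(d[key])
                unique.append(d)
        return unique
    return step


def rerank_by(key):
    """Step factory: sort documents by `key`, highest first."""
    return lambda docs: sorted(docs, key=lambda d: d[key], reverse=True)


def run_steps(steps):
    """Thread an initially empty document list through each step in order."""
    return reduce(lambda docs, step: step(docs), steps, [])


# The "strategy" the model would write: which steps, in which order, with which parameters.
plan = [
    retrieve("sandbox"),
    keep_if(lambda d: d["score"] >= 0.3),
    dedupe_by("url"),
    rerank_by("score"),
]
ranked = run_steps(plan)
print([d["url"] for d in ranked])  # ['https://example.com/c', 'https://example.com/a']
```

Changing the strategy means editing the `plan` list, not the primitives, which is the division of labor the three-layer description implies.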

The implications for builders are profound. First, this approach eliminates the context pollution that has plagued research agents. Standard search tools stuff the window with irrelevant links; the model wastes tokens trying to ignore them. When the agent writes its own filters, it can be surgical. Second, it shifts the burden of optimization from the API designer to the model. If a new search strategy works better for a specific domain, the model can discover it and encode it directly, without waiting for an API update. This is a step toward agents that can build their own tools on the fly, adapting their infrastructure to the task rather than being locked into a rigid interface.

But let's not pretend this is without risk. Generating code that talks to a search backend introduces a whole new attack surface. Even with a sandbox, there are questions about prompt injection: can a malicious webpage trick the generated script into exfiltrating data? Perplexity appears to address this by isolating execution, but the complexity is real. Models also make mistakes. A model that writes a buggy filter might silently discard crucial results, and debugging generated code is far harder than debugging a prompt. The 85 percent token reduction is impressive, but it came from a specific cybersecurity task. How well does this generalize across domains with messier data? On Perplexity's own benchmarks, Search as Code leads in four of five categories, but self-reported numbers always warrant skepticism.
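
One cheap mitigation for the buggy-filter problem is to audit how much a generated filter throws away. The sketch below is my own illustration, not something Perplexity describes: `audited_filter`, `FilterAuditError`, and the 90 percent threshold are all assumptions, and the check does nothing about prompt injection. It only catches a filter that discards nearly everything.

```python
class FilterAuditError(Exception):
    """Raised when a filter discards more records than the caller allows."""


def audited_filter(records, predicate, max_drop_ratio=0.9):
    """Apply `predicate`, but refuse to return silently if it drops too much.

    The 0.9 default is an arbitrary illustrative threshold, not a recommendation.
    """
    kept = [r for r in records if predicate(r)]
    dropped = len(records) - len(kept)
    drop_ratio = dropped / len(records) if records else 0.0
    if drop_ratio > max_drop_ratio:
        raise FilterAuditError(
            f"filter dropped {dropped} of {len(records)} records "
            f"({drop_ratio:.0%}); check the generated predicate"
        )
    return kept


records = [{"severity": "Critical"}, {"severity": "critical"}, {"severity": "low"}]


def buggy(record):
    # A case-sensitive comparison of the kind a model might plausibly generate.
    return record["severity"] == "CRITICAL"


def fixed(record):
    # Normalizing case first matches both spellings in the sample data.
    return record["severity"].lower() == "critical"


try:
    audited_filter(records, buggy)
    audit_message = "no problem detected"
except FilterAuditError as exc:
    audit_message = str(exc)

print(audit_message)  # filter dropped 3 of 3 records (100%); check the generated predicate
print(len(audited_filter(records, fixed)))  # 2
```

The buggy predicate matches nothing, so the audit raises instead of handing the model an empty result that looks like "no critical vulnerabilities found."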

Still, the direction is unmistakable. A separate survey paper cited in the article argues that code is becoming the operational layer for agents: models handle strategy, deterministic runtimes handle execution, and the infrastructure of sandboxes and verification becomes the real bottleneck. Perplexity's Search as Code is a concrete realization of that vision. It treats the model as a programmer, not a tool user. And when models can program their own search infrastructure, the boundaries of what they can autonomously research expand dramatically. The API era for agents is ending. The code generation era is just beginning.

Toni Soriano
Principal AI Engineer at Cloudstudio. 18+ years building production systems. Creator of Ollama Laravel (87K+ downloads).