
Welcome to part two of our research series on AI agent swarms, led by our core contributor and agent connoisseur @ChappieOnChain.
In our first piece, Dawn, we laid out the big picture: six core principles for building open agent economies. Now we’re zooming in.
This post continues the story. We break it into four acts that trace how agents are beginning to communicate, coordinate, and adapt. At the center is a simple idea:
The real shift begins when agents can choose their collaborators in real time.
Let’s get going.
Constantinople, 847 CE.
A papal envoy approaches the golden gates of the Byzantine court, his entourage trailing behind in carefully choreographed formation.
The moment his feet touch the marble threshold, an elaborate dance begins, one designed to establish hierarchy, demonstrate power, and signal the authenticity of his diplomatic intent.
Measured steps. Fixed distances. Approved words of greeting. Even the color of his robes carries meaning.

That was more than 1,000 years ago. But it was not just ceremony. It was infrastructure. A way to build trust and prove intent between powers that didn’t share a language or a worldview.
Something similar is happening again today, not between empires, but between AI agents. To coordinate, they too need a shared infrastructure of trust and intent. But their language isn't made of words. It’s made of code.
The agent swarm is forming. We’ll watch it emerge in four acts.

Act 1: The Age of Puppets
(2022 - 2024)
Model Context Protocol
We’ve been thinking about how agents talk to each other today. Not in theory, but in the messy, already-deployed kind of way. And the answer, mostly, is: they don’t. Not really.
The dominant pattern right now is something called tool calling via the Model Context Protocol, or MCP. It sounds abstract, but it’s basically just this: when an agent needs to do something, it calls an external tool like a function.
Like a Swiss Army knife, where each blade is a different endpoint. Need the weather? Call the weather tool. Want to summarize a document? Summarizer tool.
But the interaction is shallow. One request, one response. No shared memory, no iterative planning, no feedback loop. That’s the ceiling of MCP: it’s a tool for agents, not a teammate.
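In code, that ceiling is easy to see. Here is a minimal sketch of the pattern (tool names and payloads are illustrative, not the actual MCP wire format): each tool is a stateless function, and every call stands alone.

```python
# Minimal sketch of MCP-style tool calling: one request, one response,
# no shared memory between calls. Tool names/payloads are illustrative.

TOOLS = {}

def tool(name):
    """Register a function as a callable tool."""
    def wrap(fn):
        TOOLS[name] = fn
        return fn
    return wrap

@tool("get_weather")
def get_weather(city: str) -> dict:
    # A real tool would call an external API; we return canned data.
    return {"city": city, "temp_c": 21}

@tool("summarize")
def summarize(text: str) -> str:
    # Stand-in summarizer: first sentence only.
    return text.split(".")[0] + "."

def call_tool(name: str, **args):
    """The agent treats each tool as a stateless function: in, out, done."""
    return TOOLS[name](**args)
```

Calling `call_tool("get_weather", city="Lisbon")` returns one payload and forgets everything. No plan survives between calls, which is exactly the limitation the rest of this piece is about.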
What’s much more interesting to us is when those tools are not just static, but other agents.
Imagine if we ask Claude to break down a startup’s cap table, and it reaches out to a domain-specific analyst agent to handle the task. The Octagon VC agents are a good example: AI-powered simulations modeled after well-known venture capitalists like Marc Andreessen. Your generalist model can call into their specialist models just like any other API.
Why Current Approaches Hit Walls
There’s a reason this is hard. For agents to genuinely communicate, they need shared representations of the world.
What does it mean to “prioritize” something? What’s the format for describing urgency, or location, or context?
These sound like small questions, but they’re foundational. Without agreement on the basics, the entire system falls apart.
We’ve actually…been here before. In the 1990s, academic projects like FIPA-ACL and the Contract Net Protocol tried to formalize how agents could negotiate and collaborate. They defined detailed communication structures, complete with performatives and task ontologies. But they hit three walls fast:
The first was ontology mismatch. If one agent defined “response time” in seconds and another in minutes, even simple coordination failed. This was a failure of shared meaning.
The second was scale. Parsing deeply nested messages and tracking multi-turn conversations turned out to be computationally expensive.
And the third was volume. The more agents you added, the more messages they blasted into the network. The Contract Net Protocol, in particular, collapsed under its own verbosity.
These failures revealed something deeper: the more a system tries to account for every scenario upfront, the more brittle and inefficient it becomes.
Complexity becomes self-defeating. What worked better, almost universally, were simpler, modular protocols. Systems that allowed meaning to emerge through context and iteration, not through total specification.
We are relearning that lesson now. But where 90s systems failed on rigid ontologies and compute costs, today's LLMs provide flexible semantics and modern infrastructure offers the necessary scale.
Act 2: The Walled Garden and the Open Bazaar
(2025)
We’ve been tracking two emerging approaches to agent communication. One comes from Google. The other, from the crypto fringes. Each reflects the assumptions of its ecosystem. One is built around orchestration and control. The other assumes autonomy and negotiation.
Let’s start with the Google approach.
Google's A2A: The Walled Garden
Agent-to-Agent, or A2A, is Google’s new standard (launched in April 2025) for how agents should talk. It avoids the brittle structure of shared ontologies by letting agents advertise their capabilities instead.
Think of it like a dynamic resume. Each agent publishes an “agent card” describing what it can do and how to reach it.
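In spirit, an agent card is just a small machine-readable document. The sketch below is a simplified illustration, not the exact A2A schema; the field names and endpoint URL are hypothetical.

```python
import json

# Simplified sketch of an A2A-style "agent card": a machine-readable
# resume an agent publishes so peers can discover it. Field names are
# illustrative, not the exact A2A schema.
agent_card = {
    "name": "movie-analyst",
    "description": "Analyzes movie datasets and finds similar titles",
    "url": "https://agents.example.com/movie-analyst",  # hypothetical endpoint
    "capabilities": {"streaming": True},
    "skills": [
        {"id": "similar-titles", "description": "Find movies similar to a given title"},
        {"id": "dataset-stats", "description": "Summary statistics over a movie dataset"},
    ],
}

def matches(card: dict, needed_skill: str) -> bool:
    """A caller scans published cards for a skill it needs."""
    return any(s["id"] == needed_skill for s in card["skills"])

card_json = json.dumps(agent_card, indent=2)  # what actually travels over the wire
```

The point of the resume metaphor: discovery becomes a filter over published cards (`matches(agent_card, "similar-titles")`), not a lookup in a fixed ontology.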
What makes A2A interesting is its emphasis on streaming. Instead of agents exchanging single-shot payloads, they open a continuous channel. You can watch the reasoning process unfold in real time.
Say you ask your assistant to write a Python script that analyzes movie data and finds similar titles. Here’s how it plays out in an A2A world:
Your local agent parses the task and realizes it needs help.
It scans the network for agents with coding and media expertise.
It coordinates a plan: one agent writes the script, another sources the data.
The results begin to stream back while the agents are still working.
You don’t wait for a finished product. You see the ideas forming as they go.
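The streaming half of that flow can be sketched with a plain generator. This is a toy illustration of the shape of the interaction, not A2A’s actual transport; the agent and event names are hypothetical.

```python
# Sketch of A2A-style streaming: the worker yields partial results as it
# goes, and the caller consumes them live instead of waiting for one
# final payload. Event shapes are illustrative.

def coding_agent(task: str):
    """Emits progress events as it works, then the finished artifact."""
    yield {"type": "status", "text": f"planning: {task}"}
    yield {"type": "status", "text": "writing script"}
    yield {"type": "artifact", "text": "import pandas as pd  # ...full script..."}

def consume(stream):
    """The caller watches reasoning unfold; only the last event is the result."""
    events = list(stream)
    return events, events[-1]

events, final = consume(coding_agent("analyze movie data"))
```

The caller sees two status events before the artifact arrives, which is the “ideas forming as they go” experience described above.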
Over 50 partners are already building around this standard. It’s clear Google sees A2A as the foundation for multi-agent collaboration in enterprise settings. But it still lives inside the corporate frame: structured environments, clear divisions of labor, a hub to route the work.
On the other side of the spectrum is something messier.
The Blockchain as the Open Bazaar
While A2A reflects a corporate model of agent interaction, with clean interfaces and managed collaboration, Web3 is leaning in a different direction.
The blockchain becomes a communication medium in itself. Coordination emerges through action. Agents watch the chain to see what others are doing. They infer intent from transaction patterns. They respond by writing their own moves into the shared record.
This is stigmergy: the kind of indirect signaling we see in ant colonies where coordination happens without direct contact.
In this world, smart contracts define the boundaries of interaction. When an agent posts a contract, it’s not just saying “do this job.” It’s saying: here are the terms under which I’m willing to collaborate. Other agents can observe, evaluate, and respond, all without any direct conversation.
Over time, this creates a form of memory. Less expressive than language, but more durable. Every action stays on record. Every negotiation becomes part of a shared history. And every new agent steps into a world where meaning has already begun to accumulate.
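Stigmergy in miniature looks like this. The sketch below uses a plain list as a stand-in for the chain; all names are hypothetical. The key property: agents never message each other, they only read and append to the shared record.

```python
# Stigmergy sketch: agents coordinate by reading and appending to a
# shared, append-only log (standing in for a chain). No direct messages.

chain = []  # append-only shared record

def post_job(poster: str, task: str, reward: int):
    chain.append({"kind": "job", "poster": poster, "task": task, "reward": reward})

def open_jobs():
    """Any agent can infer intent purely from what's on the record."""
    claimed = {e["task"] for e in chain if e["kind"] == "claim"}
    return [e for e in chain if e["kind"] == "job" and e["task"] not in claimed]

def claim_job(worker: str, task: str):
    chain.append({"kind": "claim", "worker": worker, "task": task})

post_job("agent-a", "summarize-filings", reward=5)
available = open_jobs()                      # agent-b observes the record
claim_job("agent-b", available[0]["task"])   # and responds by writing its move
```

After the claim, `open_jobs()` is empty for everyone: the record itself, not a conversation, carries the coordination.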
Example #1: Virtuals
Virtuals is working on something called ACP, or Agent Commerce Protocol. It hasn’t launched yet, but the premise is clear. ACP assumes that agents are economic actors from the start. They hold wallets. They transact. They form networks where value, not just information, moves between nodes.
That changes how communication works. It becomes less about sharing information and more about negotiating agreements.
We don’t have the full spec of ACP yet and we’ll write more once it’s public. But the framing alone is worth pausing on. If agents can take economic action on-chain—paying for services, staking capital, responding to incentives—then protocols need to do more than parse intent. They need to encode trust.
Example #2: Olas
Olas offers an early view of what that might look like. It’s an ecosystem for on-chain agents with over six million transactions to date, including 3.8 million agent-to-agent interactions.
In Olas, agents are identified by their wallet address. They discover each other using Olas Messaging and coordinate through the Mech Marketplace. That’s where agents post services, hire others, and settle payments.
The architecture reveals a key pattern: hybrid on-chain/off-chain coordination. Service requests are initiated on-chain for trust and payment settlement, but actual execution happens off-chain for computational efficiency. When a "Mech" agent completes a task, results are delivered back on-chain, creating a verifiable record.
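The hybrid pattern is simple to sketch. This mirrors the Mech flow in spirit only; the function names and the list standing in for on-chain state are hypothetical.

```python
# Sketch of hybrid coordination: request and settlement live on-chain
# (trusted, verifiable); the heavy compute happens off-chain. Names are
# illustrative, not the actual Olas/Mech API.

ledger = []  # stands in for on-chain state

def request_on_chain(requester: str, task: str, payment: int) -> int:
    ledger.append({"event": "request", "requester": requester,
                   "task": task, "payment": payment})
    return len(ledger) - 1  # request id

def execute_off_chain(task: str) -> str:
    # Expensive work happens here, invisible to the chain.
    return f"result-of:{task}"

def deliver_on_chain(request_id: int, result: str):
    # Only the verifiable outcome is written back to the record.
    ledger.append({"event": "delivery", "request_id": request_id, "result": result})

rid = request_on_chain("agent-a", "price-feed-analysis", payment=3)
deliver_on_chain(rid, execute_off_chain("price-feed-analysis"))
```

The chain never sees the computation, only the request and the delivery, which is exactly what makes the record both cheap and verifiable.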
Act 3: The Architecture of Attention
(2025 - 2026)
We believe the most important developments in agent communication in the near future will not be new messaging standards (there are plenty), but rather providing agents with proper context.
One early glimpse comes from GitTemporalAI, a research project at Florida International University. The team treated software repositories not as static archives but as living records of collaboration. They built a system that used temporal knowledge graphs to reconstruct how the code evolved—who wrote what, when, and why.

The system's three specialized agents work in concert:
An embedding agent encodes repository entities
A search agent traverses temporal graphs
A reasoning agent synthesizes contextual information
When the agents shared context, their ability to detect and explain bugs improved significantly. They didn’t just process code. They learned to interpret its history.
This model of contextual, temporal analysis is a powerful blueprint. Now, imagine applying it not to a code repository, but to the richest temporal data source we have: a blockchain.
Every blockchain is a timestamped public ledger. Transactions, DAO votes, asset flows—each one adds another entry to a shared economic and social memory. If agents could read this history with context, their decision-making would change.
We probably won’t get one agent who knows everything. We will get swarms. Some specialize in surfacing relevant context. Others act on it. Some filter. Some route. Context becomes fluid, moved between agents as needed, not stored in any one place.
This is where selective attention comes in.
The Structured Attentive Reasoning Network (SARNet) provides a working example. In simulation environments, SARNet agents don’t broadcast everything they observe. They learn what matters. They decide which peers to listen to, which signals to ignore, and when to act.
Each agent runs a loop:
Encodes what it sees
Weighs incoming signals
Updates memory
Decides what to do
No one tells them what to focus on. The agents discover relevance for themselves.
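The loop above can be sketched in a few lines. In real SARNet the attention weights are learned; here a fixed toy relevance function stands in for the learned part, so treat this as a shape, not the method.

```python
import math

# Sketch of a selective-attention step: score incoming peer signals,
# softmax-normalize, and attend mostly to what matters. The relevance
# function is a fixed toy here; SARNet learns it.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(own_state, messages, relevance):
    """Blend peer messages into memory, weighted by relevance."""
    weights = softmax([relevance(own_state, m) for m in messages])
    update = sum(w * m["value"] for w, m in zip(weights, messages))
    return own_state + 0.5 * update, weights

# Toy relevance: signals about our own task score higher than noise.
rel = lambda state, msg: 2.0 if msg["topic"] == "my-task" else -2.0

msgs = [{"topic": "my-task", "value": 1.0}, {"topic": "noise", "value": 100.0}]
new_state, weights = attend(0.0, msgs, rel)
```

Even though the noisy signal is a hundred times louder, the attention weights suppress it: the agent listens to the relevant peer, not the loud one.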
This changes how we design multi-agent systems. We don’t need every agent in constant dialogue. Some agents exist only to watch. Others condense and summarize. Most stay quiet unless triggered by something specific.
In this future, communication is not the goal. It’s a byproduct of attention and memory. The system speaks when it has something to say.
Act 4: The Blueprint for Intelligent Agents
(2026 and beyond)
It’s easy to wire agents together. You can script their interactions, assign fixed roles, and build workflows that look coordinated from the outside. But this isn’t emergent intelligence. It’s mostly choreography.
What you get is a brittle, closed system with the illusion of openness. Every interaction must be anticipated in advance, and the communication load increases with every new agent, rather than decreasing.
This will continue for most of 2025.
The real shift—from rigid orchestration to flexible swarms—happens when agents can choose their collaborators in real-time. For this to scale, one architectural lesson from the early web is paramount.
#1: Separate Communication and Payment
Some argue that the original sin of the internet was not having payments built into it, which led to advertising becoming the dominant business model (we’d all like fewer spammy ads).
But the internet’s strength has always come from its modularity. HTTP handles information. It doesn’t move money. Payment systems like Stripe were built on top, at a later layer. That separation turned out to be a feature, not a flaw.
The same principle applies to agents.
This separation of communication and payment is subtle but profound. It is the key that will unlock real agent collaboration.
When a communication protocol is freed from the burden of value transfer, it can evolve to do one thing exceptionally well: pass messages, negotiate terms, and coordinate action. This allows other specialized layers to emerge on top, when and where they’re needed.
Keeping these concerns apart is what will keep the whole system legible, modular, and capable of scaling.
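The layering principle can be made concrete. In this sketch (class and function names are hypothetical), the messaging layer never imports or references the payment layer; payments compose on top only when a negotiation calls for them.

```python
# Sketch of the separation principle: the messaging layer knows nothing
# about money; a payment layer composes on top. Names are hypothetical.

class Messaging:
    """Does one thing well: moves messages between agents."""
    def __init__(self):
        self.inboxes = {}
    def send(self, to: str, msg: dict):
        self.inboxes.setdefault(to, []).append(msg)
    def recv(self, agent: str):
        return self.inboxes.pop(agent, [])

class Payments:
    """A separate layer; Messaging never depends on this."""
    def __init__(self, balances):
        self.balances = balances
    def transfer(self, frm: str, to: str, amount: int):
        assert self.balances[frm] >= amount, "insufficient funds"
        self.balances[frm] -= amount
        self.balances[to] += amount

net = Messaging()
pay = Payments({"a": 10, "b": 0})
net.send("b", {"task": "summarize", "offer": 3})  # negotiate over messaging
pay.transfer("a", "b", 3)                         # settle on the payment layer
```

Swap out `Payments` for an on-chain settlement layer and `Messaging` is untouched, which is the whole argument for keeping the concerns apart.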
#2: Discovery Networks
An intelligent agent that can’t find the right partner is functionally useless.
Discovery needs to be continuous and adaptive. Not just a directory, but a live index of available capabilities. A reputation-aware network that lets agents identify who is available, who is credible, and who fits the task.
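A "live index" differs from a directory in two ways: entries go stale, and credibility filters results. A minimal sketch, with hypothetical names and arbitrary thresholds:

```python
import time

# Sketch of a live discovery index: agents register capabilities with a
# freshness timestamp and a reputation score; lookups drop stale or
# low-trust entries. Thresholds are arbitrary illustrations.

index = {}

def register(agent_id: str, skills: set, reputation: float):
    index[agent_id] = {"skills": skills, "reputation": reputation,
                       "last_seen": time.time()}

def discover(skill: str, min_rep: float = 0.5, max_age_s: float = 300.0):
    """Return credible, recently-seen agents with the skill, best first."""
    now = time.time()
    hits = [(aid, e) for aid, e in index.items()
            if skill in e["skills"]
            and e["reputation"] >= min_rep
            and now - e["last_seen"] <= max_age_s]
    return [aid for aid, _ in sorted(hits, key=lambda h: -h[1]["reputation"])]

register("coder-1", {"python", "sql"}, reputation=0.9)
register("coder-2", {"python"}, reputation=0.3)  # has the skill, lacks the trust
found = discover("python")
```

`coder-2` has the capability but falls below the credibility bar, so it never surfaces; this is the "who fits the task" filter in miniature.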
#3: Reputation as Memory
Once an agent can be found, it must be trusted.
When agents act on your behalf, or transfer funds, or make decisions with consequences, you need a way to evaluate risk. Basic reputation scores are a start, but they are brittle and easy to game.
What we need are memory systems: networks that track interactions over time and surface behavioral patterns. These let agents evaluate the trustworthiness of a potential collaborator before committing to an interaction.
This layer provides the social and economic context for trust, built on a permanent record of past actions, whether on-chain or off.
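The difference between a score and a memory is that memory keeps the history and lets trust be recomputed from it. A sketch, with an arbitrary decay constant and hypothetical names:

```python
# Sketch of reputation-as-memory: keep the full interaction record and
# derive trust from it, weighting recent behavior more heavily.
# The decay constant is an arbitrary illustrative choice.

history = []  # permanent record of past interactions

def record(agent_id: str, success: bool):
    history.append({"agent": agent_id, "success": success})

def trust(agent_id: str, decay: float = 0.8) -> float:
    """Exponentially weighted success rate; recent interactions count more."""
    events = [e["success"] for e in history if e["agent"] == agent_id]
    if not events:
        return 0.0  # no memory, no trust
    num = den = 0.0
    w = 1.0
    for ok in reversed(events):  # newest first
        num += w * (1.0 if ok else 0.0)
        den += w
        w *= decay
    return num / den

for ok in [True, True, True, False, False]:
    record("mech-7", ok)
score = trust("mech-7")
```

A flat average would put this agent at 0.6, but two recent failures drag the weighted score below 0.5. Because the raw history survives, the weighting scheme itself can evolve without losing the record.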
Reality Check
We’re still early, just getting into Act 3.
Most of what passes for agent communication today is just orchestration under a new name. Predefined workflows. Rigid APIs. One agent calling another like a glorified function. Even in more advanced systems like Olas, which has processed millions of interactions, daily active agents hover around 500.
This is not a mature ecosystem. It’s more like the days of dial-up internet. The infrastructure is scattered. Standards are still in flux. Most experiments break under real use.
But that’s what makes it interesting.
History has a pattern. The protocols that win tend to be the ones that stay small and composable. TCP/IP outlasted OSI. JSON replaced XML. The same logic will apply here. The successful protocols won’t be the ones that try to solve everything. They’ll be the ones that solve one thing cleanly, and leave the rest open.
What to Watch
Near Term (Next 6 Months)
Virtuals Protocol’s ACP launch and early adoption
First production deployments of cross-framework agent collaboration
Evolution of blockchain-based agent identity systems
Medium Term (6-18 Months)
Standardization battles between corporate (A2A) and crypto-native approaches
First examples of agents discovering and hiring each other autonomously (no human input)
Early reputation systems for agent reliability
Long Term (18+ Months)
Agents that learn communication strategies from blockchain history
Emergence of agent societies with their own economic dynamics
Communication protocols we can't yet imagine
🦄 Key Ideas to take away
AI agents will communicate via code, not words.
Today’s agent systems appear open but actually operate like rigid, closed pipelines. Flexibility requires more than modularity. It requires agents that can adapt.
The turning point is real-time collaboration. When agents choose partners on the fly, we move from orchestration to emergence.
New messaging protocols are not the answer. Giving agents useful context is.
The most powerful tools for agents are other agents—autonomous systems that can reason and react.
Keeping communication and payment separate makes both layers more scalable and easier to evolve.
Blockchains act as shared memory. Agents coordinate through action, not conversation, by reading and writing to a public, persistent record.
Toward Emergent Machine Societies
The agent swarm is not arriving all at once. It is assembling slowly, through protocols, standards, and behaviors that are just starting to cohere. Each successful agent interaction adds another node in an emerging network of machine intelligence.
We thought we were building tools.
What we are actually building is substrate for a new form of intelligence. One that coordinates in ways we are only beginning to parse. Communication layers, memory systems, value mechanisms. Components that, together, allow something new to emerge.
The agents that master discovery, reputation, and value exchange will define the language of the swarm itself.
And slowly, message by message, it is learning to speak.
How privileged we are to witness this unfold in real time!
Cheers,
ChappieOnChain & Teng Yan
Want More? Follow Teng Yan, ChappieOnChain & Chain of Thought on X
Subscribe for timely research and insights delivered to your inbox.
This report is intended solely for educational purposes and does not constitute financial advice. It is not an endorsement to buy or sell assets or make financial decisions. Always conduct your own research and exercise caution when making investment choices.