Introduction
This post looks at multi-agent architectures and compares them with microservice architectures.
Since multi-agent architecture and microservice architecture have similarities, we do not want to reinvent the wheel. We want to take the design patterns and lessons learnt from microservices and apply them where appropriate. We also need to recognize differences where they exist and incorporate them into our architecture. Finally, it is anticipated that many solutions will consist of a hybrid of AI agents and other services.
The intent of this post is to offer some useful comparisons and lay the groundwork for future research, thinking and development in this area.
What is an AI Agent
Broadly speaking, an AI agent is a software component capable of making decisions and taking actions to achieve specific objectives or perform specific tasks. It often operates autonomously or semi-autonomously, utilizing advanced technologies such as large language models (LLMs).
For more detail on AI agents, check out my post “What is an AI Agent anyway?”.
Multi-Agent Systems
In these systems multiple agents work together to solve problems and meet goals. Each agent has its own objectives and capabilities but operates in a shared environment. This involves communication and coordination between the agents.
In the example above, Agent D is acting as the orchestrator of the process. This is one form of collaboration, but there are other patterns for collaboration, each with pros and cons.
Microservice Principles
Some of the principles of microservices architecture are as follows:
Focus on Roles and Responsibilities
Each microservice focuses on a particular business role or technical function. It has a specific set of responsibilities.
Internal Implementation
The languages, tooling and approaches used can differ between services. Each service is free to implement its internals in a way that best meets the specific needs of the service.
Well-Defined Communication
Microservices communicate through APIs or messaging processes, emphasizing structured and clear interactions.
Loose Coupling
Services maintain independence, ideally without direct dependencies on other services. This allows flexibility and scalability.
Orchestration Patterns
Collaboration between services often involves an orchestrator, which coordinates processes. Other orchestration patterns such as sagas exist, each with pros and cons.
Monitoring
In a distributed system with multiple components, monitoring is especially important.
Principles in Common
Let’s take a look at multi-agent and microservices architectures through the lens of the microservice principles outlined above.
Focus on Roles and Responsibilities
We noted that each microservice focuses on a particular business role or technical function. It has a specific set of responsibilities.
This is exactly the same principle for an agent. Each agent focuses on a particular business role or technical function. It has a specific set of responsibilities.
Internal Implementation
Each microservice is free to implement its internals in whatever way best meets the specific needs of the service. How does this compare with an agent? They are pretty similar:
So, an agent is a microservice. It has been built with a particular set of internals that are best suited to its functions.
Being a little provocative we could say that the difference between an agent and a microservice is an implementation detail.
Or, since one of the key components of an agent is an LLM or other AI model, we could perhaps define an agent as a microservice with a brain.
As we explore some of the other principles, it will still be useful to draw a distinction between a microservice that is implemented as an agent and one that is not. I gave some thought to a name for a non-agentic microservice and, building on the “plain old” term, decided on Plain Old Microservice (POM).
Well-Defined Communication
Microservices typically communicate through APIs or message processing. Under that umbrella there are a number of different mechanisms and approaches.
Agents can also communicate through APIs or message processing. Here too there are a variety of mechanisms and approaches, including those developed specifically for agent-to-agent communication, such as Google’s A2A protocol.
To think further about communication structure let’s consider a hybrid system that includes both microservices that are agents and Plain Old Microservices (POMs), along with people and business events interacting with the system.
For Person to Agent (PtoA) communication, the interaction usually uses natural language.
In Plain Old Microservice to Plain Old Microservice (POMtoPOM) communication there are three commonly used options:
· Synchronous Communication. POM2 calls POM3 on POM3’s API and expects an immediate response.
· Asynchronous One-to-One Communication. POM2 calls POM3 on POM3’s API but does not expect an immediate response; it can wait for the result to arrive later. The response can be returned in several ways: POM2 can poll POM3 at intervals to see whether the response is ready, or POM2 can provide a callback for POM3 to use when the response is ready.
· Publish/Subscribe Communication. POM2 publishes an event; POM3 listens for this event and acts accordingly. Oftentimes POM2 has no awareness of who is listening for the event. Pub/sub can also be used as a form of asynchronous one-to-one communication, where POM2 publishes an event and expects POM3 to publish a certain event type with the results.
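To make the three options above concrete, here is a minimal in-process sketch. Everything in it is an assumption made for illustration: `pom3_handle`, `Pom3Worker` and `EventBus` are hypothetical names, and a real system would use HTTP calls and a message broker rather than plain function calls and in-memory lists.

```python
import queue
from collections import defaultdict
from typing import Callable


# Hypothetical stand-in for POM3's business logic.
def pom3_handle(order_id: str) -> str:
    return f"processed {order_id}"


# 1. Synchronous: POM2 calls POM3 and blocks until the result comes back.
def call_sync(order_id: str) -> str:
    return pom3_handle(order_id)


# 2. Asynchronous one-to-one: POM2 hands over a request plus a callback,
#    and POM3 invokes the callback once the result is ready.
class Pom3Worker:
    def __init__(self) -> None:
        self._inbox: queue.Queue = queue.Queue()

    def submit(self, order_id: str, on_done: Callable[[str], None]) -> None:
        # Returns immediately; no work has been done yet.
        self._inbox.put((order_id, on_done))

    def run_pending(self) -> None:
        # Simulates POM3 working through its backlog at a later time.
        while not self._inbox.empty():
            order_id, on_done = self._inbox.get()
            on_done(pom3_handle(order_id))


# 3. Publish/subscribe: the publisher only knows the event type,
#    not who (if anyone) is listening.
class EventBus:
    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        for handler in self._subscribers[event_type]:
            handler(payload)
```

In the callback variant, `submit` returns before any result exists, which is the essential difference from `call_sync`. In the pub/sub variant, publishing an event type with no subscribers simply does nothing, which illustrates how little the publisher knows about its consumers.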
With business events, the publish/subscribe (pub/sub) model is often a natural choice: business events are published to an event bus and subscribers listen for them. This approach works for Business Event to POM (BEtoPOM) and Business Event to Agent (BEtoA) communication.
In POMs, the instructions for how to respond to the business event are built into the business logic of the POM. For an agent, we need to have provided the agent with a prompt and context on how to respond to the event.
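The contrast can be sketched as follows. This is an illustration under stated assumptions, not a real agent framework: `BusinessEvent`, `pom_handle` and `EventAgent` are hypothetical names, the `invoice.overdue` event is made up, and `llm` is just a callable that takes prompt text and returns decision text. No real model API is implied.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BusinessEvent:
    type: str
    payload: dict


# POM: the response to the event is ordinary, hard-coded business logic.
def pom_handle(event: BusinessEvent) -> str:
    if event.type == "invoice.overdue":
        return f"send reminder for invoice {event.payload['invoice_id']}"
    return "ignore"


# Agent: the response is guided by a prompt and context supplied up front.
# `llm` is a hypothetical callable (prompt text in, decision text out).
@dataclass
class EventAgent:
    system_prompt: str
    context: dict
    llm: Callable[[str], str]

    def build_prompt(self, event: BusinessEvent) -> str:
        # Combine the standing instructions, the context and the event.
        return (
            f"{self.system_prompt}\n"
            f"Context: {self.context}\n"
            f"Event: {event.type} {event.payload}\n"
            "Decide what to do."
        )

    def handle(self, event: BusinessEvent) -> str:
        return self.llm(self.build_prompt(event))
```

Both components can subscribe to the same event. The difference is where the decision lives: in code for the POM, and in the prompt and context for the agent.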
So what is unique about Agent to Agent (AtoA) communication? Why might we use a new protocol for this, or would our previous approaches suffice? To get a sense of this, let’s take a look at Google’s A2A protocol.
The following table from Google describes the fundamental communication elements in A2A:
| Element | Description | Key Purpose |
|---|---|---|
| Agent Card | A JSON metadata document describing an agent’s identity, capabilities, endpoint, skills, and authentication requirements. | Enables clients to discover agents and understand how to interact with them securely and effectively. |
| Task | A stateful unit of work initiated by an agent, with a unique ID and defined lifecycle. | Facilitates tracking of long-running operations and enables multi-turn interactions and collaboration. |
| Message | A single turn of communication between a client and an agent, containing content and a role (“user” or “agent”). | Conveys instructions, context, questions, answers, or status updates that are not necessarily formal artifacts. |
| Part | The fundamental content container (for example, TextPart, FilePart, DataPart) used within Messages and Artifacts. | Provides flexibility for agents to exchange various content types within messages and artifacts. |
| Artifact | A tangible output generated by an agent during a task (for example, a document, image, or structured data). | Delivers the concrete results of an agent’s work, ensuring structured and retrievable outputs. |
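To see how the elements in the table relate to one another, here is a rough sketch of them as Python dataclasses. This is not the official A2A schema: the field names, the `state` values and the `auth_scheme` field are my own assumptions, chosen only to mirror the descriptions in the table.

```python
import uuid
from dataclasses import dataclass, field
from typing import Literal


@dataclass
class Part:
    # Content container: text, a file reference, or structured data.
    kind: Literal["text", "file", "data"]
    content: object


@dataclass
class Message:
    # A single turn of communication, with a role and its content parts.
    role: Literal["user", "agent"]
    parts: list[Part]


@dataclass
class Artifact:
    # A tangible output produced while working on a task.
    name: str
    parts: list[Part]


@dataclass
class AgentCard:
    # Metadata that lets clients discover an agent and how to call it.
    name: str
    endpoint: str
    skills: list[str]
    auth_scheme: str  # illustrative field name, not from the spec


@dataclass
class Task:
    # Stateful unit of work: unique ID, lifecycle state, dialog and outputs.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    state: str = "submitted"  # illustrative lifecycle value
    history: list[Message] = field(default_factory=list)
    artifacts: list[Artifact] = field(default_factory=list)
```

The point to notice is that a `Task` carries both the dialog (`history`) and the results (`artifacts`), so a multi-turn exchange and its outputs stay together under one ID.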
Google states that the A2A protocol supports the following forms of interaction:
- Request/Response (Polling): Clients send a request and the server responds. For long-running tasks, the client periodically polls the server for updates.
- Streaming with Server-Sent Events (SSE): Clients initiate a stream to receive real-time, incremental results or status updates from the server over an open HTTP connection.
- Push Notifications: For very long-running tasks or disconnected scenarios, the server can actively send asynchronous notifications to a client-provided webhook when significant task updates occur.
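The polling form of interaction can be sketched in a few lines. This is a generic polling loop, not code from an A2A SDK: `poll_until_done` and `get_status` are hypothetical names, and the set of terminal state names is an assumption made for the example.

```python
import time
from typing import Callable


def poll_until_done(
    get_status: Callable[[], str],
    interval_s: float = 1.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it reports a terminal state or attempts run out."""
    # Assumed terminal state names; a real protocol defines its own.
    terminal = {"completed", "failed", "canceled"}
    for attempt in range(max_attempts):
        status = get_status()
        if status in terminal:
            return status
        # Wait before the next poll, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"task not finished after {max_attempts} polls")
```

In practice `get_status` would wrap a request to the server for the task's current state. Streaming and push notifications remove this loop from the client, at the cost of keeping a connection open or exposing a webhook.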
They note that the protocol will evolve and that there is some flexibility around implementation. This is especially true for interaction around push notifications, where we can certainly imagine a pub/sub approach instead of pushing to webhooks.
In many ways the Task may be the most architecturally consequential element of the protocol. It is more conversational in nature than our classic POM to POM communications. Consider this scenario:
Note the ongoing dialog between the user and the agents. All the dialog is maintained within the context of a single task.
I previously noted that, being a little provocative, we could say that the difference between an agent and a microservice (POM) is an implementation detail. However, that is challenged by this scenario. The implementation detail of being an agent has changed the external nature of the communication between components.
Another key element of the A2A protocol is that all agents essentially support the same API. This differs from some of the POM approaches. For instance, in REST style the service typically defines a variety of endpoints.
Loose Coupling
There are many aspects of loose coupling. Let’s focus on three of the key patterns.
Minimal Shared Knowledge
Services should not share databases or internal logic; they should share data only through their APIs. An interesting area of shared knowledge that is worth exploring is an ontology. There are other techniques that can be used besides a shared ontology, but I do think it is useful to consider whether a shared ontology is of more value, or less value, in a multi-agent architecture.
Resilience and Fault Tolerance
If one service fails, we want the others to continue operating. Often we need to decide on a case-by-case basis how a service should behave if it attempts to access another service that is unavailable. Techniques such as retries, circuit breakers and caching can assist here. It would be instructive to take an example of agent-to-agent communication via a task, such as the one included above, and consider scenarios where one of the agents is unavailable. This topic is ripe for exploration.
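As a reminder of what two of these techniques look like, here is a minimal sketch of retries with exponential backoff and a simple circuit breaker. It is a simplified illustration rather than a production implementation: `CircuitBreaker`, `CircuitOpenError` and `with_retries` are hypothetical names, and the thresholds are arbitrary.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the breaker is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; skipping call")
            # Half-open: allow one trial call; one more failure reopens.
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result


def with_retries(fn: Callable[[], T], attempts: int = 3, base_delay_s: float = 0.1,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # no point retrying while the breaker is open
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)  # exponential backoff
    raise AssertionError("unreachable")
```

The two compose naturally, for example `with_retries(lambda: breaker.call(call_other_service))`, where `call_other_service` is whatever function reaches the remote POM or agent. For an agent working on a long-running task, the open question is what the caller should do with the task while the breaker is open.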
Asynchronous Communication
We might argue that event-based asynchronous communication through pub/sub is the ultimate loose coupling. However, we still need to handle the situation where the subscribing service is unavailable and thus the event is never handled.
Orchestration Patterns
All the multi-agent architecture diagrams I have seen so far use a single (Controller) agent as the orchestrator.
There are other patterns, such as the event-driven (pub/sub) approach, although strictly speaking this should be called choreography rather than orchestration.
As we might expect, each of these approaches has pros and cons. I do think that consideration of choreography in a multi-agent architecture is important and I would love to see that included in architecture diagrams and presentations.
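To illustrate the difference, here is a small sketch of the same three-step process done both ways. The steps (`check_stock`, `take_payment`, `ship`), the event names and the `EventBus` class are all hypothetical, and each step could equally be an agent or a POM.

```python
from collections import defaultdict
from typing import Callable


# Hypothetical steps standing in for agents or POMs.
def check_stock(order: dict) -> dict:
    return {**order, "in_stock": True}


def take_payment(order: dict) -> dict:
    return {**order, "paid": True}


def ship(order: dict) -> dict:
    return {**order, "shipped": True}


# Orchestration: one controller knows the whole sequence and drives it.
def orchestrate(order: dict) -> dict:
    for step in (check_stock, take_payment, ship):
        order = step(order)
    return order


# Choreography: no controller; each participant reacts to an event
# and publishes the next one.
class EventBus:
    def __init__(self) -> None:
        self._handlers = defaultdict(list)
        self.log = []  # event types in the order they were published

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        self.log.append(event_type)
        for handler in list(self._handlers[event_type]):
            handler(payload)


def wire_choreography(bus: EventBus, results: list) -> None:
    bus.subscribe("order.placed",
                  lambda o: bus.publish("stock.checked", check_stock(o)))
    bus.subscribe("stock.checked",
                  lambda o: bus.publish("payment.taken", take_payment(o)))
    bus.subscribe("payment.taken", lambda o: results.append(ship(o)))
```

With orchestration the sequence is visible in one place (`orchestrate`), which makes it easier to follow and monitor. With choreography the sequence only emerges from the subscriptions, which reduces coupling to a central controller but makes the overall flow harder to see.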
Monitoring
Put simply, monitoring is key in any distributed system, whether it is composed of microservices, agents or a hybrid. The part of monitoring that is common includes things such as looking for failures, components that are down, or response time issues.
There are certainly special considerations for monitoring agents, such as whether the agent is operating in accordance with its guidelines. Monitoring also covers checking for security issues or breaches, and there are considerations unique to agents here.
Monitoring and agent security are broad topics that I couldn’t hope to cover in detail here, but it is worth keeping our eyes open for thinking from others in this area.
Conclusion
It may be that my sample size is not large, but I have seen little or no discussion around applying the patterns and best practices that have been built up around microservices to multi-agent architectures.
There is a natural focus on new things, but in software development it is key that we embrace the new while not forgetting the lessons we have learnt in the past. The combination of these things helps us to build production-ready solutions that meet business needs.
Many solutions will be a hybrid of Plain Old Microservices (POMs) and agents, so it is important to consider how these components will interoperate.
This post attempts to offer some useful comparisons between microservice and multi-agent architectures and lay the groundwork for future research, thinking and development in this area. I continue to explore this area and will certainly be producing further posts on the topic.