The world of software is shifting. Systems are no longer static, monolithic blocks stitched together at deploy time. Instead, they're increasingly composed of dynamic, distributed parts—often spun up on demand, evolved independently, and interacted with by agents rather than traditional human users.
In this new reality, discoverability isn’t just about service location. It’s about affordances. It’s about enabling agents—autonomous or semi-autonomous clients—to not only find what services exist, but also understand how to use them.
That’s where a two-tiered service discovery model comes in. This approach separates the act of finding services (Tier One) from the act of interacting with them (Tier Two). By doing so, you can empower flexible, adaptive systems where clients can navigate from zero knowledge to effective action.
The Need for Rethinking Discovery
Traditional discovery systems answer a single question: "What services are available?" Tools like Consul or Eureka maintain a registry of known endpoints and let clients resolve a name to an address.
But that’s only part of the puzzle.
There are actually two distinct actions involved in putting an API service to use: finding the service and interacting with it. Locating a service by name or tag is only the beginning. To make use of that service, an agent must also understand its interface—what it does, how to call it, and what to expect in return.
To meet these demands, we need to split discovery into two phases:
Tier One: Service Catalog — Locating available, running services.
Tier Two: Service Interface — Introspecting the service to understand its behavior.
Tier One: Service Catalog
At the foundation of our model lies a stable, infrastructure-oriented registry—the first tier of discovery. This core registry acts like a real-time catalog of services that are live, available, and ready to participate in the system.
This is not a spreadsheet of potential integrations, nor a static inventory of possible endpoints. It’s a live, dynamic list of services that are actually up and running. Why? Because each service self-registers at startup. Registration is voluntary, automatic, and signals both availability and readiness.
Registration includes metadata such as:
A title (e.g., “Shipping Calculator”)
A service ID (used for programmatic lookup)
A set of tags (e.g., shipping, conversion, core)
Optional version, role, and documentation fields
Once registered, the registry periodically sends a health ping to validate that the service is still responsive. If a service fails to respond within a grace period, it’s marked inactive or removed. This ensures the catalog reflects actual system state: a living inventory that is accurate, updatable, and dependable.
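As a minimal sketch, the registration and liveness mechanics might look like the following. Everything here (the Registry class, its method names, the 30-second grace period) is an illustrative assumption, not a reference to any particular registry product:

```python
import time


class Registry:
    """Minimal in-memory service catalog (illustrative sketch)."""

    def __init__(self, grace_seconds=30):
        self.grace_seconds = grace_seconds
        self.services = {}  # service_id -> {"meta": ..., "last_seen": ...}

    def register(self, service_id, meta):
        # A service calls this at startup to announce availability.
        self.services[service_id] = {"meta": meta, "last_seen": time.time()}

    def heartbeat(self, service_id):
        # A successful health ping refreshes the last-seen timestamp.
        if service_id in self.services:
            self.services[service_id]["last_seen"] = time.time()

    def prune(self):
        # Drop services that missed the grace period.
        now = time.time()
        self.services = {
            sid: entry for sid, entry in self.services.items()
            if now - entry["last_seen"] <= self.grace_seconds
        }

    def find(self, tags):
        # Tier One query: live services matching all requested tags.
        self.prune()
        return [
            {"id": sid, **entry["meta"]}
            for sid, entry in self.services.items()
            if set(tags) <= set(entry["meta"].get("tags", []))
        ]
```

The important property is that find only ever returns services that registered themselves and have been seen within the grace period, so the catalog never drifts into a list of merely possible endpoints.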
Agents looking for general services (e.g., shipping, accounting, translation) query the catalog to find one or more services that match their need:
{
  "find": {
    "tags": ["translation"]
  }
}
The result is a list of live services tagged accordingly, each with metadata describing how to connect and what it might offer.
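To make that concrete, here is one way an agent might narrow and rank such a result list before committing to a candidate. The metadata fields used (tags, version) mirror the registration fields above, but the ranking policy itself is just an illustrative assumption:

```python
def pick_service(catalog_result, required_tags, min_version=None):
    """Choose one candidate from a Tier One query result.

    catalog_result: list of service metadata dicts from the registry.
    """
    matches = [
        svc for svc in catalog_result
        if set(required_tags) <= set(svc.get("tags", []))
        and (min_version is None or svc.get("version", 0) >= min_version)
    ]
    # Prefer the highest advertised version; any ranking policy would do.
    return max(matches, key=lambda svc: svc.get("version", 0), default=None)


catalog_result = [
    {"id": "translate-a", "tags": ["translation"], "version": 1},
    {"id": "translate-b", "tags": ["translation", "core"], "version": 2},
]
choice = pick_service(catalog_result, ["translation"])
```

Returning None when nothing matches leaves the agent free to retry, widen its tags, or fail gracefully.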
This first tier is deliberately simple and stable. It enables:
Predictability: Infrastructure and platform services are reliably discoverable.
Scalability: Agents can query by need, not by hardcoded service names.
Modularity: Services can be replaced or upgraded without breaking consumers.
Tier Two: Service Interface
Once a service is selected from the catalog, the agent enters the second tier: interface discovery. This is where the agent binds to the selected service and asks: "What can you do and how do I do it?"
In response, the service provides a self-description delivered in real time. This description is more than just a list of available methods. It provides the agent with a clear view into what the service affords: what actions it supports, what domain terms it recognizes, and how each interaction should be structured.
Typically, this includes:
A glossary of domain terms
A list of supported actions (e.g., create, translate, measure)
For each action: expected inputs and outputs
Here’s what a response might look like:
{
  "terms": ["sourceText", "targetLanguage", "translatedText"],
  "actions": {
    "translate": {
      "input": {
        "sourceText": "string",
        "targetLanguage": "string"
      },
      "output": {
        "translatedText": "string"
      }
    }
  }
}
With this data, an agent can construct a valid translation request and prepare for the response structure—no prior training, no hardcoded schema. The agent is equipped not just with a list of available actions, but with the knowledge of how to successfully execute each one.
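As a sketch of how little the agent needs beyond the description itself, a hypothetical helper could validate arguments against the advertised input schema before sending anything over the wire (build_request is an invented name, not part of any standard):

```python
def build_request(action_desc, values):
    """Build a request payload from a Tier Two action description.

    action_desc: one action's schema from the service's self-description.
    values: the agent's candidate arguments.
    """
    body = {}
    for field, expected_type in action_desc["input"].items():
        if field not in values:
            raise ValueError(f"missing required field: {field}")
        if expected_type == "string" and not isinstance(values[field], str):
            raise TypeError(f"{field} must be a string")
        body[field] = values[field]
    return body


translate_desc = {
    "input": {"sourceText": "string", "targetLanguage": "string"},
    "output": {"translatedText": "string"},
}
payload = build_request(translate_desc, {"sourceText": "Hello", "targetLanguage": "es"})
```

Because the schema is fetched at runtime, the same helper works unchanged if the service later adds or renames fields.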
This kind of affordance exposure can be delivered in many formats. Some services offer machine-readable descriptions through protocols like OpenAPI, SOAP/WSDL, ALPS, Smithy, or TypeSpec. Others might use a simple but effective HTML page with fully described forms and links as documentation. The delivery mechanism may vary—but the principle remains the same: the service tells the agent what’s possible, and how to do it.
This introspection capability is the key to agent-compatible APIs. It enables:
Zero-shot execution: Agents go from zero context to valid interaction.
Dynamic substitution: Multiple services can fulfill the same role.
Evolvability: Clients don't need to be rewritten as services change.
From Discovery to Execution: An End-to-End Flow
Getting from a goal to a result requires more than just calling a known endpoint. In adaptive systems, agents need to discover the right service, learn how to use it, and then execute a task, all dynamically. This is where the two-tiered model shines: it provides the structure for moving from discovery to execution in a clean, composable flow.
Let’s walk through the full lifecycle, step by step:
Agent needs translation: An agent is given a goal to "translate text from English to Spanish". It doesn’t have a fixed translation service in mind. Instead, it relies on the discovery infrastructure to find one.
Querying Tier One: The agent queries the core service registry (the catalog) for services tagged translation. This registry, kept up to date by live service self-registration and health checks, returns a real-time list of available options.
Selecting a service: Based on metadata (e.g., version, tags, recent availability), the agent selects one candidate service from the list. It’s not just picking a name; it’s choosing based on confidence that this service is currently available and appropriate for its needs.
Querying Tier Two: The agent sends a request to the selected service asking for its interface description (the terms, actions, and input/output formats). This introspection might come via an OpenAPI doc, ALPS descriptor, or even a structured HTML form.
Understanding affordances: The service responds with a set of supported actions, including (in this example) translate, and specifies that it requires a sourceText and targetLanguage, returning a translatedText. The agent now knows exactly how to interact. This description includes the protocol (HTTP) and method (GET) to use when passing arguments to the target action.
Executing the request: Using the metadata, the agent crafts a valid request and sends it to the translate endpoint of the selected service.
Receiving and interpreting the result: The service replies with the expected structure, and the agent extracts the translated text. The job is complete.
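The whole lifecycle above can be sketched in a few lines. The StubService and StubRegistry classes stand in for real Tier One and Tier Two endpoints; all names and payload shapes here are assumptions for illustration only:

```python
class StubService:
    """Hypothetical translation service standing in for a real endpoint."""

    def describe(self):
        # Tier Two: the service's self-description.
        return {
            "actions": {
                "translate": {
                    "input": {"sourceText": "string", "targetLanguage": "string"},
                    "output": {"translatedText": "string"},
                }
            }
        }

    def invoke(self, action, payload):
        # Stand-in for the real call to the service's translate endpoint.
        return {"translatedText": f"[{payload['targetLanguage']}] {payload['sourceText']}"}


class StubRegistry:
    """Hypothetical Tier One catalog returning live candidates."""

    def __init__(self, services):
        self._services = services

    def find_services(self, tags):
        return list(self._services)


def run_translation_task(registry, text, target_lang):
    candidates = registry.find_services(tags=["translation"])  # Tier One query
    for service in candidates:
        desc = service.describe()                              # Tier Two introspection
        action = desc["actions"].get("translate")
        if action is None:
            continue  # dynamic substitution: try the next candidate
        payload = {"sourceText": text, "targetLanguage": target_lang}
        result = service.invoke("translate", payload)
        return result["translatedText"]
    raise LookupError("no translation service available")
```

Note that run_translation_task hardcodes nothing about any particular provider: swap in a different registry or service and the same loop still works, which is exactly the substitution property the two tiers are meant to enable.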
This pattern can be applied to shipping, unit conversion, accounting, customer onboarding, or any domain where actions can be described in clear input/output contracts. This two-tier model supports both the discovery of the right tool and the affordance-level interaction required to use it effectively.
The key takeaway here is that discovery isn’t just about where. It’s also about how.
Conclusion
Runtime interface details (links and forms) matter just as much as location. A service you can find but not understand is as opaque as one that doesn’t exist. A two-tiered discovery model brings clarity to both layers: agents get a living directory of services, plus a pointer to a semantic contract for using them.
The two-tiered model works because it balances stability with flexibility:
Tier One is the reliable backbone of service location.
Tier Two is the adaptable, semantic layer for runtime understanding.
Together, they allow clients to:
Compose new workflows from building blocks.
Retry with fallback options when services fail.
Learn and adapt without developer intervention.
The good news is that the model is simple enough to implement today using current technologies. It is also powerful enough to support LLM-integrated agent platforms, microservice orchestration, and future goal-seeking (planner-less) environments.
In the world described here, agents don’t just connect services together. They compose and combine them into a solution. Servers and clients can adapt safely over time because the binding of service and function is dynamic and discoverable.