Updated March 5, 2026
0:00 Welcome to the Colaberry AI podcast, brought to you by Colaberry AI Research Labs and Carl Foundation. Today, we're doing a technically focused deep dive into the Model Context Protocol, MCP. That's the one Anthropic released back in November 2024. That's right. And our focus today is really on the tech specs.
0:17 We'll be digging into the architecture, the methods, basically how it works according to the documentation. Mhmm. The goal is to unpack, you know, its core mechanics and what that really means for AI systems. Absolutely. And just to set the scene, maybe let's quickly talk about why context is so critical for generative AI in the first place.
0:36 I mean, you can't get good output without good input. Exactly. It's fundamental. A model's baseline abilities, sure, they come from its training data, its architecture. Yeah. How it was trained.
0:45 But getting it to produce something relevant, something coherent for a specific task, that totally depends on providing the right context. It's like giving it the necessary coordinates to work within. Okay. So how is this context actually fed to the models? The sources mention a few ways depending on the type of AI.
1:01 Yeah. It varies. For text models, think GPT, DeepSeek, Llama. Yeah. You get several mechanisms.
1:08 There's the prompt context itself, obviously, the text you feed it. Then there's the token window, how much text it can remember at once. GPT-4 Turbo, for instance, handles around, what, 128,000 tokens. Wow. Yeah.
1:19 Conversation history is key for chatbots, keeps track of the back and forth. And retrieval-augmented generation, RAG, is a big one. It dynamically pulls in info from outside documents. Yeah. RAG.
1:30 So it's not just stuck with its training data. And this applies beyond text too, like, for images, DALL-E, Gemini, those kinds of models. Mhmm. For image and multimodal models, context comes in through text descriptions, the prompts guiding generation.
And if you give it an image to start with, that's visual context.
1:46 Mhmm. It analyzes that image before adding things. Then there's cross-modal context, combining text and images. The model has to interpret both together to create something meaningful. Interesting.
1:56 And I see specialized models, like, for code or speech. They have their own specific ways too. They do. Code models, like Codex or DeepSeek Coder, they look at the code that's already there, previous blocks, function names, even comments. That's context.
2:09 They also understand programming language syntax, the patterns. And sometimes, they can even pull context from external docs like APIs. Makes sense. For speech and audio, think Whisper or AudioPaLM. The context is often the audio segment that just came before.
2:25 It informs what comes next. Plus, they analyze linguistic stuff, acoustic features, tone, speed, intonation. That affects both understanding speech and generating it. So it seems like the common thread here, no matter the model type, is that short, really relevant context is crucial for getting good, coherent outputs. That's spot on. The quality of the context is fundamental.
2:44 And, you know, the trend now is towards AI agents. Systems that don't just passively receive context, but actively fetch it. They search data sources, request data, process it. It creates this loop: input, LLM, output, but with feedback through retrieval, using tools, and managing memory.
3:00 Okay. So this brings us to the problems MCP is trying to solve. Before MCP, what were the big headaches? The sources mentioned some limitations. Yeah.
3:08 The main issue is fragmentation. Before MCP, connecting AI to data sources usually meant building custom one-off integrations for every single application. This led to, well, inconsistent prompt logic across different apps and this big scaling problem they call the n times m problem. The n times m problem. Yeah.
3:27 Where you have n client apps needing to talk to m different back-end servers or data sources. It just got incredibly complex, point-to-point integrations everywhere. Really hard to scale and maintain. Sounds like a classic integration mess. So how does MCP actually tackle this fragmented data access issue?
3:43 MCP basically provides a standardized protocol, a common language, if you will, for how AI systems should talk to all these different data sources, repositories, business tools, IDEs, you name it. And what are the main advantages of having this standard protocol? What does it unlock? Several key things. First, it makes it much easier to share contextual information with the language models.
4:03 Second, it gives AI systems a standard way to discover and use external tools and capabilities. And third, it really helps in building composable applications and workflows. You can decouple the AI part from the specific data or tool providers. That means more flexibility, less integration pain. And the tech underneath it all, how does this communication actually happen?
4:25 It uses JSON-RPC 2.0. Mhmm. That's a pretty standard, lightweight remote procedure call protocol. It defines how messages are structured between the key components. Which are?
4:34 Right. The three main players are the host, which is usually the LLM application itself, the clients, which are like connectors running inside the host app, and the servers, which are the external services providing the context or tools. JSON-RPC just standardizes the chat between them. Okay. Are there tools already using this?
4:51 Any real-world examples of MCP support? Yeah. Definitely seeing some uptake. Tools like Cursor, Codeium's Windsurf. There's a VS Code extension called Cline, and Anthropic's own Claude Desktop and Code interfaces are using it too.
5:04 So it's getting traction. Let's dig into that architecture a bit more, this client-host-server model.
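Editor's illustration: the JSON-RPC 2.0 framing mentioned at 4:25 can be sketched in a few lines of Python. This is a minimal sketch, not code from the episode or from any MCP SDK. The helper names (`make_request`, `make_result`, `make_error`) are hypothetical, and the method name `tools/list` is assumed here to match the MCP specification, so check it against the spec before relying on it.

```python
import json
from itertools import count

_ids = count(1)  # simple source of unique request ids (an illustrative choice)

def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request: version tag, unique id, method, optional params."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg

def make_result(request, result):
    """Build a success response that echoes the id of the request it answers."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}

def make_error(request, code, message):
    """Build an error response; -32601 is JSON-RPC's standard 'Method not found' code."""
    return {"jsonrpc": "2.0", "id": request["id"],
            "error": {"code": code, "message": message}}

request = make_request("tools/list")            # method name assumed from the MCP spec
response = make_result(request, {"tools": []})  # hypothetical empty tool list
error = make_error(request, -32601, "Method not found")
wire = json.dumps(request)                      # what would travel over the transport
```

The shared `id` is what lets whoever sent a request pair it with its response, which matters once several requests are in flight within one session.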
What are the big wins with this setup? Security, state management? Both, actually.
5:14 Separation is key for security. It creates clear boundaries, helps isolate sensitive stuff. And for state management, MCP is actually a stateful session protocol. That means it can keep track of the interaction over time, coordinate things across multiple requests within a session. That's really important for more complex tasks.
5:32 Okay. Host, client, server. Can we break down their specific jobs in this MCP world? What does each one do? Sure.
5:39 The host is kind of the central coordinator. It manages the clients, starts them, stops them, sets their permissions. It handles user authorization, integrates with the actual AI or LLM for things like sampling, and, crucially, it aggregates all the context coming in from different servers. Got it. And the clients?
5:55 Clients are the AI apps or agents needing that external access. The neat thing is they can connect to any MCP server with pretty minimal setup because of the standard. Their job is to call the tools exposed by servers, query for resources, and also handle prompts, basically filling in the templates the server provides. Though importantly, the user is ultimately in control of the prompts via the client interface. Okay.
6:18 So the client executes, but the user guides. And the servers? Servers are the providers. They offer up specific context, maybe structured data, maybe access to a file system, and capabilities like tools. They expose these things as resources and prompts.
6:33 They run independently, handle requests from clients. Essentially, MCP servers act as standard wrappers or intermediaries for all sorts of back-end systems. That separation of concerns seems really powerful. Okay. Let's talk features.
6:45 The sources mentioned primitives, server features, composability. Starting with primitives, what are the absolute basic building blocks? MCP has three core primitives: prompts, resources, and tools.
Prompts are basically those predefined templates or instructions that guide how the LLM interacts with stuff. Resources are structured data, things that provide extra context.
7:06 Think files, database entries, et cetera. And tools are executable functions. They let the model actually do things in external systems, perform actions. These three cover most interaction types. Right.
7:17 Prompts, resources, tools. Simple, but covers a lot. Now looking at the server features, especially with that November 2024 protocol update, what's new with prompts? The update clarifies how servers expose these prompts as structured messages. Clients can discover them, get their structure, and fill in the blanks, essentially customize them.
7:36 A key point is that they're user controlled via the client, usually triggered by some user action or workflow. Servers declare they support prompts using a capabilities flag, and there's a list-changed notification so servers can tell clients, hey, I've got new or updated prompts available. Makes sense. How about resources in the updated protocol?
7:53 How is data shared? Resources let servers share all sorts of structured data. It's very application driven. Each resource gets a unique ID, a URI. Like prompts, servers declare a resources capability.
8:05 Clients can subscribe to specific resources using their URI to get notified of changes. There's also a list-changed notification for the overall list of resources. The docs show examples like a project directory using a file:// URI, or even structuring multiple code repositories. Okay. And the third feature, tools.
8:23 How does MCP let models use these external tools? MCP provides a standard way for models to interact with tools. These are considered model controlled in the sense that the LLM can discover what a tool does from its description and decide to invoke it based on the user's request. Again, servers declare a tools capability, and list-changed tells clients about updates.
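Editor's illustration: to make the resources and tools primitives concrete, here is a toy in-memory dispatcher in Python. It is a sketch under stated assumptions, not the official SDK. `ToyServer` and its `add_tool`, `add_resource`, and `handle` methods are hypothetical names. The method strings (`tools/list`, `tools/call`, `resources/read`) and the result shapes are assumed to follow the MCP specification and should be verified there. Prompts are left out for brevity.

```python
class ToyServer:
    """Hypothetical in-memory stand-in for an MCP server (illustration only)."""

    def __init__(self):
        self._tools = {}      # tool name -> (description, callable)
        self._resources = {}  # resource URI -> text content

    def add_tool(self, name, description, fn):
        self._tools[name] = (description, fn)

    def add_resource(self, uri, text):
        self._resources[uri] = text

    def handle(self, method, params=None):
        """Dispatch one request by method name and return its result payload."""
        params = params or {}
        if method == "tools/list":
            # Descriptions are what a model would read to decide whether to call a tool.
            return {"tools": [{"name": name, "description": desc}
                              for name, (desc, _) in self._tools.items()]}
        if method == "tools/call":
            _, fn = self._tools[params["name"]]
            value = fn(**params.get("arguments", {}))
            return {"content": [{"type": "text", "text": str(value)}]}
        if method == "resources/read":
            uri = params["uri"]
            return {"contents": [{"uri": uri, "text": self._resources[uri]}]}
        raise ValueError(f"unknown method: {method}")


server = ToyServer()
server.add_tool("add", "Add two integers", lambda a, b: a + b)
server.add_resource("file:///project/README.md", "hello")  # made-up example URI

listing = server.handle("tools/list")
call_result = server.handle("tools/call", {"name": "add", "arguments": {"a": 2, "b": 3}})
read_result = server.handle("resources/read", {"uri": "file:///project/README.md"})
```

A real server would also advertise its capabilities during initialization and emit list-changed notifications when its tool or resource set changes; both are omitted here to keep the sketch short.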
The protocol defines how to call a tool and get the result back.
8:45 The sources also mentioned client features like roots and sampling. What do those do? Right. Clients can have extra features too. Roots basically define a server's operating boundary within a local file system.
8:56 The client tells the server, okay, you can operate within this directory root, using a file:// URI. It standardizes file system access and notifications for changes within that root. Sampling lets the client specifically ask the server to perform an LLM completion request, maybe with specific parameters like temperature, but only if the server has explicitly given permission for that client to make sampling requests. And finally, composability. This sounds like a big architectural benefit.
9:22 How does MCP help build more complex AI systems? Composability is a huge plus. Because MCP standardizes the interfaces, you can easily chain things together. Imagine an AI agent that needs to perform a complex task. It could orchestrate by delegating subtasks to other, more specialized agents, each potentially connected to different MCP servers providing the specific data or tools needed for that subtask.
9:45 It allows for this sort of distributed intelligence. You could build complex, multilayered systems where agents cooperate, access resources through MCP, all pretty seamlessly. It encourages modularity. That potential for complex layered systems is really fascinating. Okay.
10:00 Shifting gears slightly, security and trust. What principles does MCP follow here? Security is definitely considered. The key principles are user consent and control: users must explicitly approve data access and actions. No surprises.
10:15 Data privacy: the host needs explicit consent before sharing user data with any server. Tool safety: especially with tools that might execute code, there's an emphasis on caution. Users need to understand what a tool does before approving its use.
And LLM sampling control: servers have to explicitly approve sampling requests from clients, adding a control layer over model generation initiated via MCP. Good to see those baked in.
10:38 The sources also mentioned some core design principles of MCP itself. What was the philosophy? Yeah. A few key ideas. Servers should be easy to build.
10:46 That encourages adoption. It should be highly composable, easy to plug together. Servers are designed not to hold the entire conversation state; that keeps them simpler and more scalable. And the protocol allows for progressive feature addition. You can add new capabilities over time without breaking everything.
11:00 Okay. Let's get back to the communication mechanics. Based on JSON-RPC 2.0, what are the actual message types flying back and forth? Just the standard three from JSON-RPC 2.0. Requests, which go either way, have parameters and expect a response.
11:17 Responses, which contain either the successful result or an error message. And notifications, which are just one-way messages, fire and forget, no response expected. That's the foundation. And how do a client and server figure out what the other one can actually do, the features they support? Through a capability negotiation system.
11:35 Right at the start when they connect, they exchange lists of supported features. A server might say, I support resources, tools, and prompts. A client might say, I support sampling and roots. They have to respect what the other side says it can handle. And the protocol even allows negotiating additional capabilities later in the session if needed.
11:52 It ensures they're speaking the same dialect of MCP, essentially. Okay. Let's dive into the base protocol details. You said it must follow JSON-RPC 2.0 strictly. Absolutely.
12:01 That's non-negotiable. All messages have to conform to that spec.
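Editor's illustration: the three message types described at 11:00 can be told apart purely by which fields a message carries. The Python sketch below applies the JSON-RPC 2.0 rules (a request has `method` and `id`, a notification has `method` but no `id`, a response has `id` and exactly one of `result` or `error`). The function name `classify` is hypothetical, and the example method names are assumed to match the MCP specification.

```python
def classify(msg):
    """Return 'request', 'notification', or 'response' for a JSON-RPC 2.0 message."""
    if msg.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "method" in msg:
        # A method with an id expects an answer; without an id it is fire-and-forget.
        return "request" if "id" in msg else "notification"
    # A response carries an id plus exactly one of 'result' or 'error'.
    if "id" in msg and (("result" in msg) != ("error" in msg)):
        return "response"
    raise ValueError("malformed message")


examples = [
    {"jsonrpc": "2.0", "id": 1, "method": "tools/list"},                 # request
    {"jsonrpc": "2.0", "method": "notifications/tools/list_changed"},    # notification
    {"jsonrpc": "2.0", "id": 1, "result": {"tools": []}},                # success response
    {"jsonrpc": "2.0", "id": 1,
     "error": {"code": -32601, "message": "Method not found"}},          # error response
]
kinds = [classify(m) for m in examples]
```

A check like this is roughly what a conformance gate at the edge of a client or server might do before handing a message to feature-specific handlers.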
It dictates the structure for requests, responses, notifications, things like unique IDs for requests and responses so they match up, notifications not needing an ID. Sticking to JSON-RPC 2.0 ensures basic interoperability. And the sources mentioned protocol layers. Can you quickly sketch those out?
12:23 Sure. Think of it like a stack. At the bottom is the base protocol; that's the core JSON-RPC messaging. On top of that is lifecycle management, handling connection setup, negotiation, shutdown. Then you have the server features layer, protocols for prompts, resources, tools, and a parallel client features layer for things like sampling and roots.
12:42 Finally, a utilities layer might provide common things like logging or maybe argument completion hints. It's a modular structure. Got it. Let's walk through the connection lifecycle then. From start to finish, what are the phases?
12:53 Okay. It starts with initialization. This is critical. Client and server negotiate the protocol version they'll use and exchange their capabilities lists. The client sends an initialize request with its info.
13:03 The server responds with its choices. Assuming that goes well, if they agree on a version and capabilities, they move into the operation phase. This is just normal communication, sending requests, responses, notifications based on what they negotiated. And to end it?
13:17 That's the shutdown phase. It's designed to be graceful. The client sends a shutdown request. Mhmm. The server acknowledges it, cleans up anything it needs to, like stopping child processes, and then the client closes the connection.
13:30 Can we zoom in on initialization? What absolutely has to happen there? Right. Initialization must cover a few things. First, version negotiation.
13:38 Client says what versions it supports. Server picks the latest one it also supports. If there's no overlap, the server should just disconnect. Second, capability negotiation.
This is where they agree on the specific optional features they'll use.
13:51 Client lists its capabilities, like roots, sampling; server lists its own, like prompts, resources, tools. They only use features both sides agree on. Makes sense. Both sides must respect the negotiated capabilities during the operation phase. You can't just try to use a feature that wasn't agreed upon.
14:08 Sometimes implementation details might also be shared during initialization. And the shutdown process, any specifics on how that graceful termination works? The client usually sends a status shutdown notification. The server gets it, confirms, and then tries to wrap things up cleanly within a reasonable timeout. It might wait for child processes to finish, then it signals it's terminating, maybe with SIGTERM or SIGKILL.
14:30 The client sees this and closes the connection. And good error handling is important throughout: handling things like version mismatches, negotiation failures, timeouts during operation or shutdown. Implementations need to deal with those gracefully. Okay. This provides a solid technical picture.
14:46 Let's talk benefits. For application developers building AI apps, what's the payoff with MCP? For devs, it's huge. Connecting to a new tool or data source that supports MCP becomes almost zero extra work because the interface is standard. They get access to this whole ecosystem, potentially.
15:02 They can focus on their app's core logic, not writing endless custom integration code. Plus, they can leverage the model's intelligence to use tools, making interactions richer without them having to code every specific API call sequence. And for the people providing those tools or APIs, why should they adopt MCP? For them, it means potentially way more adoption. If AI apps can easily connect via MCP, their tool gets used more.
15:28 It simplifies their integration story too, less need to build custom SDKs or libraries for every platform.
They tap into intelligent agents using their service, which might open up totally new use cases they hadn't even thought of. What about the end users? The people actually using these AI applications. End users get more powerful, more context-aware apps.
15:47 Things feel more seamless because the AI can connect to more of the tools and data they already use. It leads to smarter assistants overall and potentially more customizable, personalized experiences, because the AI has a standard way to access relevant personal or work context, with user permission, of course. And finally, for enterprises, how does MCP help in a business context? Enterprises get standardization, which is always good for managing complexity. It means more consistent AI development across teams.
16:14 The separation of concerns helps speed up development. It makes centralized access control and governance easier, managing who or what AI can access which data or tools. And it improves scalability and maintainability of their AI systems, letting them better leverage existing data silos and enterprise tools through the standard protocol. Now the sources mentioned an MCP registry API is in the works. What's that about?
16:37 What problem is it solving? Right. As more MCP servers pop up, finding the right one becomes a challenge. The registry API is meant to be a central directory. It aims to solve discoverability, helping clients find servers, providing protocol information about what each server offers, maybe adding trust and verification layers, simplifying publication for server providers, managing versioning info, and handling general metadata management.
17:01 Basically, it's needed to make the ecosystem navigable and trustworthy as it grows. That makes sense. And while MCP often works locally, there's mention of remote servers and OAuth, OAuth 2.0. Why is that important? Yeah.
17:13 While local communication is common, supporting remote servers, hosted publicly, dramatically increases the reach. An AI could potentially access tools hosted anywhere, but that brings security challenges. Integrating OAuth 2.0 provides a standard, secure way for clients to authenticate and get authorization to talk to these remote servers. It handles the user consent flow securely, issuing access tokens. It's pretty much essential for building trust in a distributed MCP network where clients talk to servers across the Internet.
17:42 Okay. This leads to the final really interesting point, self-evolving agents enabled by this registry. What's the idea there? This is where it gets futuristic, but plausible. With a registry, an AI agent wouldn't need all its tools preprogrammed.
17:56 It could dynamically query the registry at runtime: I need to achieve X. Are there any MCP servers that offer a tool for X? Find one, connect to it using the standard protocol, and use the new capability.
18:09 This could make agents incredibly adaptable. They could learn new skills on the fly by discovering and integrating new tools. It broadens their application range hugely, improves user experience because the agent gets smarter over time, and reduces initial development effort. Wow. Agents that can discover and add their own tools.
18:25 That's a pretty big leap. But it must come with challenges, right? Oh, absolutely. Huge considerations.
18:31 Governance and security become even more critical. What tools can an agent discover and use? How is data handled? Trust and verification in the registry is vital. Is this server safe?
18:41 Performance: discovery can't add too much lag. Tool selection and reasoning: the agent needs to choose tools wisely and safely. Debugging these dynamic agents would be tough. Versioning and compatibility between the agent and constantly changing tools is another headache.
And overall governance and auditability, tracking what these autonomous agents are doing, is crucial for responsible use.
19:03 That's a lot to think about. So this has been a really deep dive into MCP. We've covered the architecture, the primitives (prompts, resources, tools), the capability negotiation, the potential of the registry, OAuth for remote access, and this fascinating idea of self-evolving agents. Yeah. I think the takeaway is that MCP offers a really well thought out, standardized framework.
19:22 It's designed to boost AI capabilities and interoperability by tackling that core challenge of context and access to external systems in a structured, secure way. It really sets the stage for more integrated and capable AI. So the final thought for our listeners: considering this potential for self-evolving AI agents enabled by protocols like MCP, how might our whole interaction with AI fundamentally shift in the next few years? And, you know, what new challenges, but also opportunities, pop up when we think about managing these increasingly autonomous systems? Thank you for listening in.
19:52 Subscribe and follow Colaberry on social media, links in the description, and check out our website, www.colaberry.ai/podcast, for more insights like this.