
Add best practices when using load balancer#325

Closed
jizhuozhi wants to merge 1 commit into
modelcontextprotocol:mainfrom
jizhuozhi:main

Conversation

@jizhuozhi

Motivation and Context

When an MCP server is deployed with multiple replicas (especially by an external provider), it usually sits behind a load balancer. Unless a persistent connection is used, the load balancer typically assumes that requests are stateless. MCP, however, is a stateful protocol. With the HTTP (SSE) transport, the instance that receives a message request must be the same instance that established the SSE connection.

Therefore, as a best practice, session stickiness must be guaranteed when using a load balancer, and when the instance that established the SSE connection no longer exists, the request must be rejected rather than re-routed (which is why consistent hashing is not suitable in this scenario).

In addition, since session stickiness is commonly implemented with cookies, MCP clients may need to persist cookies (for example, with a cookie jar).
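
As a rough illustration of the routing rule described above, here is a minimal Python sketch of a toy balancer. It is not taken from any real load balancer: the class name `StickyBalancer`, the use of the instance name as the cookie value, and the 200/503 status codes are all assumptions made for the example. New clients are spread round robin and pinned with a cookie; a client whose pinned instance is gone is rejected instead of being silently moved.

```python
class StickyBalancer:
    """Toy balancer (hypothetical): round robin for new clients, cookie pinning afterwards."""

    def __init__(self, instances):
        self.live = list(instances)  # instances currently able to serve traffic
        self._counter = 0            # round-robin position for new clients

    def remove(self, instance):
        """Simulate an instance going away (crash, scale-down, redeploy)."""
        self.live.remove(instance)

    def route(self, cookie=None):
        """Return (status, instance, cookie_to_set) for one request."""
        if cookie is None:
            if not self.live:
                return 503, None, None
            # New client: pick the next instance and pin the client to it.
            chosen = self.live[self._counter % len(self.live)]
            self._counter += 1
            return 200, chosen, chosen  # assumption: cookie value == instance name
        if cookie in self.live:
            # Returning client: always the instance that holds its SSE connection.
            return 200, cookie, None
        # Pinned instance is gone: reject rather than re-route, because no
        # other instance knows about this session.
        return 503, None, None
```

Under these assumptions, a consistent-hashing balancer would differ exactly in the last branch: it would quietly pick a new instance, and the message request would then fail on a server that has never seen the session.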

How Has This Been Tested?

As described above, I ran an MCP server with multiple replicas behind OpenResty as the load balancer and compared round robin with cookie-based session stickiness, using a transport whose HTTP client supports a cookie jar. Round robin failed as soon as there were multiple sessions, while cookie-based session stickiness worked well.

Breaking Changes

None

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

@Joffref
Contributor

Joffref commented Apr 12, 2025

I wouldn’t list that as required; I’d rather present it as the easiest path. Session management is ultimately at the discretion of whoever implements the MCP server. Using Redis to store sessions and share it across instances can also solve the problem, but it introduces a more complex architecture.

@jizhuozhi
Author

Using Redis to store sessions and share it across instances can also solve the problem, but it introduces a more complex architecture.

The session management discussed here is not simply a matter of sharing some data. According to the MCP protocol definition, establishing a session takes a three-step handshake (the client sends a request to the SSE endpoint, the server returns the message endpoint information through SSE, and the client sends a notification to the message endpoint) that sets up a channel (client to server through the message endpoint, server to client through SSE). This channel is what we call a session.

The problem now is that if the load balancer sends the message request to a server that has no SSE connection (that is, it cannot recognize the sessionId), then this request will fail.
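
To make the failure mode concrete, here is a small Python sketch of two replicas that each keep their SSE sessions in local memory. The class `McpInstance`, the `/messages?sessionId=...` endpoint format, and the 202/404 status codes are illustrative assumptions, not the behaviour of any particular MCP SDK.

```python
import uuid


class McpInstance:
    """Hypothetical MCP server replica with purely local session state."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # sessionId -> messages accepted for that session

    def open_sse(self):
        """Steps 1 and 2 of the handshake: open SSE, hand back the message endpoint."""
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = []
        # Assumed endpoint format, for illustration only.
        return f"/messages?sessionId={session_id}", session_id

    def post_message(self, session_id, message):
        """Step 3 and later: client posts to the message endpoint."""
        if session_id not in self.sessions:
            return 404  # this replica never opened that SSE stream
        self.sessions[session_id].append(message)
        return 202


replica_a, replica_b = McpInstance("a"), McpInstance("b")
endpoint, sid = replica_a.open_sse()
ok_status = replica_a.post_message(sid, {"method": "notifications/initialized"})
wrong_status = replica_b.post_message(sid, {"method": "notifications/initialized"})
```

In this sketch the same request succeeds on the replica that opened the stream and fails on the other one, which is what a round-robin balancer produces as soon as there is more than one replica.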

@Joffref
Contributor

Joffref commented Apr 12, 2025

According to the current spec: https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#session-management, the session ID is returned during the init phase. This implies that the ID is not negotiated anywhere else in the protocol. Unless I’m missing something?

@jizhuozhi
Author

jizhuozhi commented Apr 12, 2025

According to the current spec: https://modelcontextprotocol.io/specification/2025-03-26/basic/transports#session-management, the session ID is returned during the init phase. This implies that the ID is not negotiated anywhere else in the protocol. Unless I’m missing something?

Yes. What I mean by "session stickiness is required" is that a load balancer claiming to support MCP should ideally provide stickiness without having to extract the MCP sessionId by parsing SSE events, for example with a sticky cookie as in nginx.

@Joffref
Contributor

Joffref commented Apr 12, 2025

I agree with you on LB/Gateway — ultimately, it's not required, just recommended. That said, the solution I mentioned earlier works fine as well. Where I’ve landed with this discussion is that we need an MCP Gateway specification, and it should be defined here.

@jizhuozhi
Author

Where I’ve landed with this discussion is that we need an MCP Gateway specification, and it should be defined here.

Yes, we need a specification that guides how load balancers and gateways support MCP. I have seen many different implementations so far. What they have in common is that the gateway itself acts as the MCP server and communicates with the business services through Redis PUB/SUB or another event bus or message middleware, which does not help build a standard, uniform ecosystem.
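
For comparison, the event-bus approach mentioned in this thread can be sketched roughly as follows. This is a simplified Python model, not a real Redis client: `InMemoryBus` stands in for Redis PUB/SUB, `Replica` is a hypothetical server instance, and the sketch only forwards the raw message to the replica that owns the SSE stream, whereas a real server would process it and push a response.

```python
from collections import defaultdict


class InMemoryBus:
    """Stand-in for Redis PUB/SUB or another event bus (assumption, not a Redis API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # channel -> callbacks

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, payload):
        """Deliver payload to every subscriber; return how many received it."""
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(payload)
        return len(callbacks)


class Replica:
    """Hypothetical replica: owns some SSE streams, accepts messages for any session."""

    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.sse_streams = {}  # sessionId -> events written to that SSE stream

    def open_sse(self, session_id):
        self.sse_streams[session_id] = []
        # The owning replica listens on a channel named after the session.
        self.bus.subscribe(session_id, self.sse_streams[session_id].append)

    def post_message(self, session_id, message):
        # Any replica may receive the POST; the bus carries it to the owner.
        delivered = self.bus.publish(session_id, message)
        return 202 if delivered else 404
```

With this model, stickiness is no longer needed, at the cost of running and operating the bus, which is the architectural trade-off the participants are weighing.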

@raphaelkieling

We have the same problem here. Redis pub/sub is doable, but it is too much work to wire into an existing application running on 10 to 20 pods ONLY to support the MCP protocol. For now, we resolved it by dedicating a single pod to handling MCP connections; this is usually more than enough to support internal chat calls.

Also, it's worth mentioning that the stateless HTTP proposal is coming and will hopefully make this a lot easier:
#102

@evalstate
Member

This is a really good discussion - do you think it would be a good idea to collaborate on this within the Hosting Working Group to publish the best practices for the community here?:

https://github.com/modelcontextprotocol-community/working-groups

@jizhuozhi
Author

This is a really good discussion - do you think it would be a good idea to collaborate on this within the Hosting Working Group to publish the best practices for the community here?:

https://github.com/modelcontextprotocol-community/working-groups

I will look at it later, after my holiday :)

@youxihu

youxihu commented Jun 27, 2025

So I use nginx. How do I configure the upstream for my MCP services? Could you show me a sample conf?

@jizhuozhi
Author

So I use nginx. How do I configure the upstream for my MCP services? Could you show me a sample conf?

A working solution is to use a sticky cookie (https://nginx.org/en/docs/http/ngx_http_upstream_module.html#sticky_cookie) so that the client keeps being routed to the instance that was selected for it.

A special reminder: the client must support a cookie jar so that this information persists between requests.
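
On the client side, the cookie jar requirement might look like the following Python sketch, which uses only the standard library. The local `StickyHandler` server is a hypothetical stand-in for a load balancer that issues a sticky cookie; the cookie name `srv_id`, the value `instance-a`, and the `/sse` path are made up for the demo. The server echoes back whatever `Cookie` header it received, so the effect of the jar is visible.

```python
import http.cookiejar
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StickyHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a load balancer that issues a sticky cookie."""

    def do_GET(self):
        seen = self.headers.get("Cookie")  # None when the client sent no cookie
        body = (seen or "no-cookie").encode()
        self.send_response(200)
        if seen is None:
            # Illustrative cookie name and value only.
            self.send_header("Set-Cookie", "srv_id=instance-a; Path=/")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_twice(use_cookie_jar):
    """Send two requests and return what the server saw in each Cookie header."""
    server = HTTPServer(("127.0.0.1", 0), StickyHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/sse"
    # An empty ProxyHandler keeps the demo from using environment proxies.
    handlers = [urllib.request.ProxyHandler({})]
    if use_cookie_jar:
        jar = http.cookiejar.CookieJar()
        handlers.append(urllib.request.HTTPCookieProcessor(jar))
    opener = urllib.request.build_opener(*handlers)
    try:
        return [opener.open(url, timeout=5).read().decode() for _ in range(2)]
    finally:
        server.shutdown()
        server.server_close()
```

With the jar enabled, the second request carries the cookie back and can be pinned; without it, every request looks like a new client to the balancer, and stickiness cannot work.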

@jonathanhefner
Member

I agree with what @Joffref said:

I wouldn’t list that as required; I’d rather present it as the easiest path. Session management is ultimately at the discretion of whoever implements the MCP server. Using Redis to store sessions and share it across instances can also solve the problem, but it introduces a more complex architecture.

So I will close, but thank you for the pull request! 😃


6 participants