MCP Operations Server: AI-Enabled Managed Ops Explained

To bridge your local Python code to a production-ready AKS environment, you need a Dockerfile that doesn’t just run the code, but does so securely and efficiently.

By 2026, the standard for production MCP servers is to move away from STDIO (the local command-line transport) and toward HTTP-based transports: SSE (Server-Sent Events) or the newer Streamable HTTP. This allows your AI agents to talk to the server over a network instead of a local pipe.

1. The Production Dockerfile

This Dockerfile uses a “non-root” user (a security best practice) and installs the client libraries needed to talk to the Docker socket or Kubernetes API.

Dockerfile

# Use a current, lightweight Python base image
FROM python:3.12-slim
# Install system dependencies (curl for health checks)
RUN apt-get update && apt-get install -y \
curl \
&& rm -rf /var/lib/apt/lists/*
# Set working directory
WORKDIR /app
# Create a non-root user for security
RUN groupadd -r mcpuser && useradd -r -g mcpuser mcpuser
# Copy requirements and install
# Note: includes 'mcp[cli]' for server capabilities
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy server code
COPY server.py .
# Give our non-root user access to the app folder
RUN chown -R mcpuser:mcpuser /app
USER mcpuser
# Expose the port for SSE/HTTP transport (Standard for 2026)
EXPOSE 8000
# Start the server using the FastMCP production runner
CMD ["python", "server.py", "--transport", "sse", "--port", "8000"]

2. The requirements.txt

You’ll need these specific libraries:

Plaintext

fastmcp>=1.0.0
docker>=7.0.0
kubernetes>=30.0.0
uvicorn # Required for high-performance HTTP transport
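The Dockerfile’s CMD assumes server.py accepts the --transport and --port flags. As a sketch of what that entry point might look like (assuming the fastmcp package’s FastMCP class; the ops-server name and the ping tool are purely illustrative):

```python
# Sketch of server.py: parse the flags passed by the Dockerfile's CMD,
# then start FastMCP on the requested transport.
import argparse


def parse_args(argv=None):
    """Parse the CLI flags supplied by the container's CMD."""
    parser = argparse.ArgumentParser(description="MCP Operations Server")
    parser.add_argument("--transport", choices=["stdio", "sse"],
                        default="stdio",
                        help="stdio for local dev, sse for in-cluster HTTP")
    parser.add_argument("--port", type=int, default=8000)
    return parser.parse_args(argv)


def main():
    args = parse_args()
    # Imported inside main() so the flag parsing above stays usable
    # even where the fastmcp package is not installed.
    from fastmcp import FastMCP

    mcp = FastMCP("ops-server")

    @mcp.tool()
    def ping() -> str:
        """Trivial health-check tool an AI agent can call."""
        return "pong"

    if args.transport == "sse":
        # Bind to all interfaces so other pods in the cluster can reach us.
        mcp.run(transport="sse", host="0.0.0.0", port=args.port)
    else:
        mcp.run(transport="stdio")


# In the real server.py this guard is active, so the container's CMD
# starts the server:
# if __name__ == "__main__":
#     main()
```

The deferred fastmcp import is just a sketch convenience; the key point is that the flags in the CMD line must map onto whatever your server’s entry point actually parses.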

3. Deploying to AKS (The “Support” Strategy)

When you deploy this to your client’s AKS cluster, you’ll use a standard Kubernetes Deployment.

Why this is better for your role:

  • Scaling: If the dev team grows, you can scale the MCP server to 3 replicas so the AI assistant never lags.
  • Security: Instead of sharing your personal kubeconfig, the MCP server uses a ServiceAccount with “View Only” permissions. This means the AI can see the logs but can’t accidentally delete the production database.
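Putting those two points together, a minimal Deployment sketch might look like this (the image name and labels are placeholders; the mcp-server-sa ServiceAccount is defined in the RBAC section later):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server
  namespace: default
spec:
  replicas: 3                       # scale out so the AI assistant never lags
  selector:
    matchLabels:
      app: mcp-server
  template:
    metadata:
      labels:
        app: mcp-server
    spec:
      serviceAccountName: mcp-server-sa   # view-only identity, not your kubeconfig
      containers:
        - name: mcp-server
          image: your-mcp-image:latest    # placeholder: your registry image
          ports:
            - containerPort: 8000         # SSE/HTTP transport
          readinessProbe:
            tcpSocket:
              port: 8000
```

A tcpSocket probe is used here rather than httpGet because an SSE endpoint holds connections open, which can confuse simple HTTP probes.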

4. How to Pitch the “AI Operations” Tier

You can now offer a new support tier called “AI-Enabled Managed Ops”:

“I’ve built a custom MCP Operations Server for our cluster. It allows our internal AI agents to perform health checks, retrieve logs, and analyze container stats using natural language. This doesn’t replace me; it allows me to respond to your requests 10x faster because the AI is doing the ‘data gathering’ for me inside our secure perimeter.”

One final piece of the puzzle

To finish the MCP server integration on AKS, you must grant the pod permission to talk to the Kubernetes API so it can “see” the other pods.

If you skip this step, the AI will be blind: every attempt to list pods will come back as a 403 Forbidden error.


1. The RBAC Strategy

We will use three Kubernetes objects:

  • ServiceAccount: The identity for your MCP pod.
  • ClusterRole: A set of rules that allow “Viewing” (reading pods, logs, and events).
  • ClusterRoleBinding: The “glue” that attaches the Role to the ServiceAccount.

2. The RBAC YAML (mcp-rbac.yaml)

YAML

# 1. The Identity
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mcp-server-sa
  namespace: default
---
# 2. The Permissions (Read-Only/Viewer)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: mcp-pod-viewer
rules:
  - apiGroups: [""]
    # Accessing 'pods' for list/get, and 'pods/log' specifically for tracing
    resources: ["pods", "pods/log", "pods/status", "events", "nodes", "services"]
    verbs: ["get", "list", "watch"]
---
# 3. The Connection
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: mcp-server-binding
subjects:
  - kind: ServiceAccount
    name: mcp-server-sa
    namespace: default
roleRef:
  kind: ClusterRole
  name: mcp-pod-viewer
  apiGroup: rbac.authorization.k8s.io

3. Updating your Deployment

Finally, ensure your MCP server deployment uses this serviceAccountName:

YAML

spec:
  template:
    spec:
      serviceAccountName: mcp-server-sa
      containers:
        - name: mcp-server
          image: your-mcp-image:latest
          # ... other config ...

4. Why this is “Safe” for your Client

When you explain this to the company, emphasize these three points:

  • Namespace Scoping: Even though it’s a ClusterRole, you can swap it for a namespaced Role (plus a RoleBinding) if you only want the AI to see specific namespaces (e.g., only production-frontend).
  • No Secrets Access: Notice that secrets is not in the list of resources. The AI literally cannot see the database passwords, even if it tries.
  • Auditability: Every time the AI agent asks the MCP server for logs, Kubernetes logs that action under the mcp-server-sa identity. You have a perfect audit trail.
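If you take the namespace-scoping route, a sketch of the namespaced variant might look like this (using the production-frontend example; both objects live in that namespace):

```yaml
# Namespaced alternative: the AI can only see production-frontend
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: mcp-pod-viewer
  namespace: production-frontend
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log", "pods/status", "events", "services"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: mcp-server-binding
  namespace: production-frontend
subjects:
  - kind: ServiceAccount
    name: mcp-server-sa
    namespace: default   # the SA still lives wherever the MCP pod runs
roleRef:
  kind: Role
  name: mcp-pod-viewer
  apiGroup: rbac.authorization.k8s.io
```

Note that nodes drops out of the resource list: nodes are cluster-scoped, so a namespaced Role cannot grant access to them.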

Putting it all together

You now have the Terraform for infrastructure, the Python for the server, the Docker for the container, and the RBAC for security.

You’re ready to pitch this as a “Self-Healing AI Operations Layer.”
