Building a Public MCP Server - pdf.c0xl.ch from FastMCP to Production

Christian Lehnert · 2026-03-20 · ~5 min read

Most MCP write-ups stop at "run a Python script on your laptop and point Claude Desktop at it." That is a demo, not a server. This post walks through what I actually run at pdf.c0xl.ch — a public, authenticated MCP server that converts DOCX to PDF, containerised end to end, with a fifteen-fold latency improvement from a single design choice. Small codebase. The production path is where the interesting decisions live.

What MCP Actually Is on the Wire

Model Context Protocol is two things people tend to conflate. One is a tool-calling contract between an LLM client and a tool provider. The other is a transport. For local tools the transport is stdio — the client spawns your process and talks over pipes. For anything remote the transport is SSE over HTTP, which is the part that matters the moment you want a tool that runs somewhere other than the machine the LLM is on.

SSE is the interesting choice. Not WebSockets, not bidirectional streaming, not gRPC. Plain HTTP with Server-Sent Events for server-to-client messages and regular POST for client-to-server. This makes the server trivially reverse-proxyable and cache-friendly, which is exactly what you want when you are putting it behind Caddy or an enterprise gateway.
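
On the wire the round trip is easy to picture. This is a paraphrase of the HTTP+SSE handshake rather than a byte-exact capture; the paths and session ID are whatever the server hands out:

GET /sse
  event: endpoint
  data: /messages?session_id=abc        <- server: "POST your messages here"

POST /messages?session_id=abc
  {"jsonrpc": "2.0", "method": "tools/call", ...}

  event: message
  data: {"jsonrpc": "2.0", "result": ...}   <- reply arrives on the SSE stream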

The Tool

The server exposes one tool: convert a base64-encoded DOCX file to PDF and return the PDF as base64. Useful on its own for any Claude conversation where you want to hand the model a generated Word doc and get a PDF back without leaving the chat.

The implementation uses FastMCP. The entire tool definition is about twenty lines:

from fastmcp import FastMCP
import base64
import subprocess
import tempfile

mcp = FastMCP("pdf-converter")

@mcp.tool()
def convert_docx_to_pdf(docx_base64: str) -> str:
    """Convert a base64-encoded .docx file to PDF using LibreOffice."""
    with tempfile.TemporaryDirectory() as workdir:
        # Decode the upload into a scratch file inside the tempdir.
        docx_path = f"{workdir}/input.docx"
        with open(docx_path, "wb") as f:
            f.write(base64.b64decode(docx_base64))
        # Headless LibreOffice writes input.pdf next to input.docx.
        subprocess.run(
            ["libreoffice", "--headless", "--convert-to", "pdf",
             "--outdir", workdir, docx_path],
            check=True, capture_output=True
        )
        # Return the PDF as base64 so it travels over MCP as a string.
        with open(f"{workdir}/input.pdf", "rb") as f:
            return base64.b64encode(f.read()).decode()
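
The snippet above defines the tool but never starts a server. With FastMCP that is one more call; the SSE transport and bind address here are my choices, picked to line up with the Caddy config further down:

if __name__ == "__main__":
    # Bind to localhost only; Caddy terminates TLS and checks auth in front.
    mcp.run(transport="sse", host="127.0.0.1", port=8080)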

Ship this as-is and you have a working MCP server. You also have a server that takes four to six seconds per conversion, because every call spins up a full LibreOffice process from cold.

The unoserver Fix

LibreOffice can run persistently via unoserver: one LibreOffice instance stays up as a daemon, and subsequent conversion requests talk to that already-running instance over a local socket. Per-call latency drops from the four-plus seconds of a cold start to under three hundred milliseconds, because the expensive startup now happens once, at boot, instead of once per request.

The subprocess call in the tool becomes:

pdf_path = f"{workdir}/input.pdf"  # same tempdir layout as before
subprocess.run(
    ["unoconvert", "--convert-to", "pdf", docx_path, pdf_path],
    check=True, capture_output=True
)

And the container runs unoserver as a separate process in the background. The Dockerfile entry point is a small shell script that starts unoserver, waits for its socket to appear, then starts the MCP server.
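
A minimal sketch of that script, probing unoserver's default listener port (2003); server.py is an illustrative name:

#!/bin/sh
set -e

# Start the LibreOffice daemon in the background.
unoserver &

# Block until unoserver is accepting connections.
until nc -z 127.0.0.1 2003; do
    sleep 0.2
done

# Hand over PID 1 to the MCP server.
exec python server.py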

This is the entire performance story. Ten-line change. Fifteen-fold latency improvement.

Container and Filesystem

The server runs in a single container. LibreOffice has a reputation for being heavy, which is deserved — the base image lands around 1.2 GB — but it is a one-time disk cost. Memory at idle is about 400 MB with unoserver loaded. Per-request memory is negligible because the conversion happens in a tempfile directory that is cleaned up by the context manager.

Two filesystem choices matter.

The working directory is a tmpfs mount. No PDF or DOCX ever touches persistent disk. If the container is killed mid-conversion, there is nothing to clean up and nothing to leak. This is both a performance win — tmpfs is RAM — and a privacy property I can point at.

The container itself runs read-only with tmpfs overlays for /tmp and /var. unoserver writes its socket to /tmp, which is tmpfs, and the OS has no writable persistent paths. If someone finds an RCE in unoserver, the blast radius is the container's ephemeral state.
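
Expressed as docker run flags (image name and published port are illustrative), the two choices look like this. Note that Python's tempfile defaults to /tmp, so the per-request workdirs land on the tmpfs automatically:

# Read-only root, writable tmpfs overlays only.
docker run -d --name pdf-mcp \
  --read-only \
  --tmpfs /tmp \
  --tmpfs /var \
  -p 127.0.0.1:8080:8080 \
  pdf-converter:latest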

Authentication

The server is public. That does not mean it is open. Every MCP request carries a Bearer token, checked by Caddy before the request ever reaches the FastMCP process.

The Caddyfile is five lines:

pdf.c0xl.ch {
    @authorized header Authorization "Bearer {env.MCP_TOKEN}"
    handle @authorized {
        reverse_proxy localhost:8080
    }
    respond 401
}

Token rotation is a single environment variable change and a Caddy reload. No code deploy.
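
A quick smoke test of the gate from a shell, assuming the token is exported as MCP_TOKEN:

# No token: Caddy answers 401 without touching the backend.
curl -i https://pdf.c0xl.ch/sse

# Valid token: the request is proxied and the SSE stream opens.
curl -i -H "Authorization: Bearer $MCP_TOKEN" https://pdf.c0xl.ch/sse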

This is the cheapest auth model that actually works for a single-tenant public service. If I needed multi-tenant I would move to per-user tokens with a small database, but I do not need that here. The server exists to serve me and whoever I hand a token to.

Connecting a Client

From Claude Desktop or any MCP-compatible client, the config is an SSE endpoint with a header:

{
  "mcpServers": {
    "pdf-converter": {
      "url": "https://pdf.c0xl.ch/sse",
      "headers": {
        "Authorization": "Bearer <token>"
      }
    }
  }
}

That is the entire client integration. From Claude's perspective the tool is indistinguishable from a local tool. From the server's perspective every request is authenticated HTTP. There is no trust boundary being stretched — the LLM never sees the token, the client handles auth, the server verifies, conversion happens, result comes back.

What This Is Not

This is not a replacement for a SaaS document conversion API. It produces more faithful output than Pandoc for the specific DOCX→PDF case, because LibreOffice actually understands Word formatting and rendering, which no other open-source tool does completely. It is slower and less scalable than a fleet of stateless serverless converters. For personal and small-team use it is the right shape.

It is also not a general MCP architecture post. There are choices I made — single tool, stateless, no conversation context, no tool-to-tool chaining — that would not hold up for a more complex server. Those are different posts.

The Larger Point

MCP is going to matter for the same reason LSP mattered for editors. A standardised protocol between a thing-that-reasons and things-that-do-stuff produces an ecosystem that is larger than any single vendor can build alone. Right now the ecosystem is tiny and the quality bar is low. Publishing real, production-quality MCP servers — authenticated, observable, containerised, documented — is how that changes.

If you are building anything that plugs into an LLM workflow, ship it as an MCP server instead of a custom integration. The wiring is almost free. The interoperability pays for itself the first time you switch clients.

Tagged:
#mcp #selfhosted #fastmcp #python #ai