OctoPerf MCP Server, Fully On-Premise: AI Load Testing With a Local LLM
When we released the OctoPerf MCP Server, it ran as a hosted endpoint at https://api.octoperf.com/mcp, and most teams still connect to it straight from Claude.ai or Claude Code. But a recurring question has come from banks, hospitals, defense and public-sector teams: what if nothing is allowed to leave our network, not even the prompt? This article answers that question with a full walkthrough.
We will stand up a 100% on-premise, air-gapped stack that takes only two things to install: OctoPerf Enterprise in Docker, and a local Qwen3 large language model running in LM Studio, with LM Studio doubling as the Model Context Protocol (MCP) client. By the end, you will drive your load tests in plain language from a chat window, with no API key, no cloud LLM and no outbound traffic.
