GHCPCLI-Local: Running GitHub Copilot CLI on Your Own Hardware

GHCPCLI-Local: Running GitHub Copilot CLI on Your Own Hardware
The primary purpose of GHCPCLI-Local is to run the GitHub Copilot CLI entirely on your own hardware, wired to a local LLM backend, so no prompts, code, or telemetry ever leave your network.
Purpose
The GitHub Copilot CLI is a capable terminal assistant, but by default every request travels to GitHub’s cloud inference and telemetry endpoints. For anyone working under data-residency rules, on an air-gapped network, or who simply prefers to keep their code on their own machine, that is a problem.
GHCPCLI-Local solves it without forking or patching anything. It is a thin wrapper script that sets the COPILOT_PROVIDER_* environment variables the Copilot CLI already reads at startup, then hands off to the unmodified copilot binary. Every Copilot CLI feature works as normal - it is just powered by a model running on your own GPU or CPU.
TL;DR: install the Copilot CLI, point GHCPCLI-Local at Ollama, llama.cpp, or LM Studio, and you have a fully local Copilot with optional self-hosted web search.
Source Code
The project is on GitHub: https://github.com/CraigWilsonOZ/GHCPCLI-Local
Use Cases
- Keeping source code and prompts on regulated or air-gapped networks where cloud inference is not permitted
- Running Copilot CLI offline or in low-connectivity environments
- Experimenting with different open-weight models (Qwen, Llama, and others) behind a familiar tool
- Reducing reliance on cloud inference for cost or privacy reasons while keeping the Copilot workflow
Prerequisites
A GitHub Copilot subscription is still required - the CLI binary is licensed by GitHub, and this project only redirects its inference calls. Individual, Business, and Enterprise plans all work.
| Tool | Purpose | Required |
|---|---|---|
copilot CLI |
The AI assistant being wrapped | Yes |
bash 4.0+ |
Runs all scripts | Yes |
curl |
Backend health checks | Yes |
python3 3.11+ and uv |
SearXNG MCP server runtime | For web search |
docker + docker compose |
Runs the SearXNG container stack | For web search |
| Ollama, llama.cpp, or LM Studio | Local LLM backend | At least one |
A GPU is strongly recommended for models 14B and above. As a rough guide, a 7-8B model needs 6-8 GB of VRAM, a 32-35B model needs 20-24 GB, and a 70B model needs 40 GB or more.
How It Works
setup.sh(orsetup.ps1on Windows) checks the required tools, prompts for backend URLs and default models, and installs the SearXNG MCP server.copilot-local.shloads your.envfile with the endpoint URLs and default models.- The script validates that the chosen backend is reachable before launching.
- It checks the model supports tool use and has a sufficient context window.
- It exports the
COPILOT_PROVIDER_*environment variables the Copilot CLI reads. - It sets
COPILOT_OFFLINE=trueso the CLI cannot reach GitHub telemetry or authentication endpoints. - It calls
exec copilot, replacing the shell with the Copilot CLI process. - Copilot talks to your local LLM server over an OpenAI-compatible HTTP API, and inference runs on your own CPU or GPU.
With SearXNG configured in .mcp.json, Copilot can call a search tool to retrieve web results. The MCP server is started and managed by Copilot CLI automatically over the stdio transport declared in .mcp.json.
Architecture
copilot-local.sh
|
+-- Loads .env (endpoint URLs and default models)
+-- Validates backend is reachable
+-- Checks model supports tool use and context window
+-- Exports COPILOT_PROVIDER_* environment variables
|
+-- exec copilot (replaces this shell with the Copilot CLI)
|
| OpenAI-compatible API (HTTP)
v
Local LLM server (Ollama / llama.cpp / LM Studio)
|
| model weights on local disk
v
LLM inference (runs on your CPU/GPU)
The repository also ships a Python MCP server (SearXNG-MCP/) that exposes a search() tool, and a self-hosted SearXNG Docker stack (searxng/) for private web search with no third-party search API.
Getting Started
On Linux:
git clone https://github.com/CraigWilsonOZ/GHCPCLI-Local.git
cd GHCPCLI-Local
./setup.sh
# Start SearXNG (optional)
cd searxng && ./manage.sh start && cd ..
# Start your LLM backend
ollama serve
# Launch Copilot against a local model
./copilot-local.sh --backend ollama --model qwen3.6:35b
The Windows flow is identical using setup.ps1 and copilot-local.ps1. If .env already exists, the configuration step is skipped - delete it and re-run setup to reconfigure.
Security Considerations
COPILOT_OFFLINE=trueis set by default, so the CLI cannot reach GitHub telemetry or authentication endpoints.- All inference runs on your hardware or local network - no prompts or code leave your network.
- API keys should be passed via
COPILOT_PROVIDER_API_KEYin the shell environment, never on the command line with--api-key. - The SearXNG secret key must be a unique random value, generated with
./searxng/manage.sh secret-key.
Limitations
- A GitHub Copilot subscription is still required - this project redirects inference, it does not replace the licensed CLI.
- Local model quality and speed depend entirely on your hardware; smaller models will not match cloud-hosted frontier models.
- The chosen model must support tool use and a large enough context window for Copilot’s agentic workflow.
- Web search requires the additional Docker, Python, and
uvdependencies for the SearXNG stack.
Future Work
- Broader backend coverage as more OpenAI-compatible local servers mature
- Model recommendation guidance tuned to common GPU tiers
- Tighter validation and clearer diagnostics when a model lacks tool-use support
Conclusion
GHCPCLI-Local keeps the GitHub Copilot CLI workflow developers already know while moving every inference call onto hardware you control. It is a small, transparent wrapper rather than a fork, which makes it easy to audit and easy to drop. If you want the convenience of Copilot in the terminal without sending your code to the cloud, this is a practical way to get there.
References
- GHCPCLI-Local repository - https://github.com/CraigWilsonOZ/GHCPCLI-Local
- GitHub Copilot CLI documentation - https://docs.github.com/en/copilot/how-tos/copilot-cli/set-up-copilot-cli/install-copilot-cli
- GitHub Copilot subscriptions - https://github.com/features/copilot
- Ollama - https://ollama.com
- llama.cpp - https://github.com/ggerganov/llama.cpp
- LM Studio - https://lmstudio.ai
- SearXNG - https://docs.searxng.org