Run Claude Code Cheaply: A Complete Guide to Using LiteLLM with the GitHub Copilot Chat API
1. Background and Motivation
Claude Code is Anthropic’s coding agent. Many people use it for “vibe coding”: write some code, ask questions, refactor inside the conversation, chase down bugs, and get something very close to pair programming with a smart teammate.
In practice, though, two problems come up quickly:
- It can get expensive: frequent requests burn through Anthropic API credits quickly.
- The network can be unstable: in some environments, direct requests to Anthropic may time out or fail often.
At the same time, many developers are already paying for GitHub Copilot. Behind Copilot, GitHub provides access to multiple large models, including Claude at different points in time, and you have already paid for that pool of compute.
So a natural question follows:
Can Claude Code use your GitHub Copilot quota instead?
Yes, it can.
This article explains how to use LiteLLM as a local proxy layer so that Claude Code talks to a local endpoint, and LiteLLM forwards those requests to the GitHub Copilot Chat API (referred to below as the Copilot API).
1.1 Compliance note before you begin
Before going further, there is one important caveat:
Warning: Connecting Claude Code to the Copilot API through a proxy layer is not the main workflow officially documented or guaranteed by GitHub. Before you use it in practice, you should read and evaluate the latest GitHub Copilot terms of service, usage limits, and risk controls for yourself, and make sure the way you use it is acceptable for your situation.
If you understand that tradeoff and are comfortable taking responsibility for it, read on.
2. Overall Architecture
Claude Code can be pointed at a custom BASE_URL through environment variables. We use that to route its traffic to LiteLLM first, and let LiteLLM call the Copilot API.
The full request flow looks like this:
-
Client: Claude Code
- You use the
claudeCLI, which is Claude Code’s command-line client. - It is configured to send requests to local LiteLLM at
http://localhost:4000.
- You use the
-
Middleware: LiteLLM proxy
- LiteLLM runs as a local proxy service.
- It receives Anthropic-style requests from Claude Code.
- It translates request parameters into a format the Copilot API accepts.
- It injects the required headers so the request looks like it came from an editor plugin.
- It passes the Copilot API response back to Claude Code.
-
Backend: GitHub Copilot Chat API
- The Copilot API receives the transformed request.
- LiteLLM forwards the returned model output back to Claude Code.
This preserves the familiar Claude Code experience while shifting the actual backend compute to GitHub Copilot, which helps you:
- Reduce additional API spending by using your existing Copilot subscription.
- Improve request stability by taking advantage of the local proxy layer and GitHub’s network path.
3. Prerequisites
Before you start, you need:
- An active GitHub Copilot subscription
- A machine where you can run:
uv(recommended) orpipclaude(the Claude Code CLI)
Assume that this command already works:
claude --help
And that it prints the expected help output.
4. Step One: Create a LiteLLM Configuration File
LiteLLM is the central entry point in this setup. Through its configuration file, we will:
- Define a logical model name that Claude Code will use.
- Tell LiteLLM which actual Copilot-backed model to use.
- Add the parameters and headers required for successful Copilot API calls.
Create a config.yaml in any directory with content like this:
model_list:
- model_name: claude-opus-4.5
litellm_params:
# Use GitHub Copilot as the actual provider
model: github_copilot/claude-opus-4.5
# Drop non-standard parameters sent by Claude Code
drop_params: true
# Pretend to be an editor client so Copilot responds correctly
extra_headers:
Editor-Version: "vscode/1.106.3"
Editor-Plugin-Version: "copilot/1.388.0"
Copilot-Integration-Id: "vscode-chat"
User-Agent: "GithubCopilot/1.388.0"
There are three key points here:
-
model_name- This is the logical model name exposed to Claude Code.
- Later,
ANTHROPIC_MODELmust match it exactly.
-
model- This is LiteLLM’s internal provider/model identifier.
- In the example above, it is
github_copilot/claude-opus-4.5, but you can change it based on LiteLLM documentation and the models Copilot currently supports.
-
drop_params: true- This is important.
- Claude Code may include Anthropic-specific extension fields that the Copilot API does not understand.
- When
drop_paramsis enabled, LiteLLM strips unsupported fields to avoid 4xx errors caused by incompatible parameters.
If you want to expose multiple models to Claude Code, just add more entries to
model_list, each with a differentmodel_name.
5. Step Two: Install and Start the LiteLLM Proxy
The recommended installation method is uv. It provides isolated environments and a faster install experience, but pip works too.
5.1 Install LiteLLM with proxy support
# Install LiteLLM with proxy support using uv
uv tool install "litellm[proxy]"
# Or with pip:
# pip install "litellm[proxy]"
Once installed, litellm should be available on your PATH.
5.2 Start the LiteLLM proxy
In the directory that contains config.yaml, run:
litellm --config config.yaml --port 4000
This is your Window A. Keep it open so you can watch the logs.
5.3 Device authorization the first time you use the Copilot API
The first time LiteLLM talks to the Copilot API, it should walk you through GitHub’s device authorization flow:
- The terminal prints a URL, usually something like
https://github.com/login/device, plus an 8-character device code. - Open the URL in a browser.
- Paste the device code and approve the request.
- Return to the terminal and LiteLLM will continue automatically.
LiteLLM caches the acquired token locally, so you usually do not need to authorize again unless the token expires or you delete it manually.
6. Step Three: Configure Claude Code to Use LiteLLM
Now we want Claude Code to believe it is still talking to Anthropic while actually sending traffic to the local LiteLLM proxy.
You can do that either through environment variables for quick tests or through Claude Code’s settings file for a persistent setup.
6.1 Environment variables
Before starting claude, run the following in Window B:
export ANTHROPIC_AUTH_TOKEN="sk-any-string"
export ANTHROPIC_BASE_URL="http://localhost:4000"
export ANTHROPIC_MODEL="claude-opus-4.5"
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
Notes:
-
ANTHROPIC_AUTH_TOKEN- This does not matter to LiteLLM and will not be forwarded to Copilot.
- Claude Code just expects a non-empty value locally.
-
ANTHROPIC_BASE_URL- This replaces the default Anthropic endpoint with your local LiteLLM proxy.
- Make sure the port matches the one you started LiteLLM on.
-
ANTHROPIC_MODEL- This must match
model_nameinconfig.yamlexactly. - Otherwise LiteLLM will report that the model does not exist or return a similar error.
- This must match
-
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC- This reduces non-essential requests such as telemetry.
6.2 Configuration file
If you want this setup to apply every time claude runs, create or edit:
~/.claude/settings.json
With content like this:
{
"env": {
"ANTHROPIC_AUTH_TOKEN": "sk-any-string",
"ANTHROPIC_BASE_URL": "http://localhost:4000",
"ANTHROPIC_MODEL": "claude-opus-4.5",
"CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
}
}
After that, Claude Code loads the variables automatically when it starts.
If you already have a
settings.json, merge these fields into the existing JSON instead of replacing the file wholesale.
7. Step Four: Start Claude Code and Verify the Full Chain
At this point you should have two terminal windows:
-
Window A runs LiteLLM:
litellm --config config.yaml --port 4000 -
Window B runs Claude Code:
claude
If everything is configured correctly:
claudeshould start normally.- You can send a simple request such as, “Write a Python script that prints the squares of 1 through 10.”
- Then check the logs in Window A.
You should see:
- Requests arriving from the client.
- Log entries showing something like
github_copilot/claude-opus-4.5.
If so, the full request path is working:
Claude Code -> LiteLLM (local proxy) -> Copilot API -> LiteLLM -> Claude Code
7.1 Quick troubleshooting checklist
If it does not work, here are the first things to check:
-
Claude Code says the model does not exist or returns a 404-like error
- Make sure
ANTHROPIC_MODELmatchesmodel_nameinconfig.yamlexactly, including case and hyphens.
- Make sure
-
LiteLLM never receives any request
- Confirm that
ANTHROPIC_BASE_URLreally points tohttp://localhost:4000. - Verify that LiteLLM is running on the same machine and not being blocked by a firewall.
- Confirm that
-
LiteLLM logs show GitHub-related 401 or 403 errors
- Your Copilot authorization likely failed or expired.
- Restart LiteLLM and complete the device authorization flow again.
- Check that your GitHub account still has an active Copilot subscription.
Closing Thoughts
By introducing LiteLLM as a local middleware layer, we can:
-
Use GitHub Copilot as Claude Code’s backend compute which helps reduce the need for separate Anthropic API spending if you already subscribe to Copilot.
-
Improve network stability through a local proxy because your client only needs stable access to GitHub instead of making direct requests to Anthropic.
-
Keep the original Claude Code experience since you are still running
claudein the terminal and interacting with the same familiar workflow.
It is worth emphasizing one more time:
This is an advanced workaround, not an officially promoted or guaranteed long-term GitHub workflow. Before using it seriously, please read the latest Copilot terms and usage rules yourself and evaluate the compliance and risk tradeoffs.
If you are:
- already a heavy Claude Code user,
- already paying for GitHub Copilot,
- and trying to find a better balance between cost and connection reliability,
then this approach is definitely worth experimenting with.
And if you want to take it further, LiteLLM can sit in front of more models too, including OpenAI and native Anthropic APIs, opening the door to even more flexible Claude Code workflows.