GitHub Copilot's BYOK (Bring Your Own Key) feature is more powerful than its name suggests. It's not just about importing your own API key—it's about building a development environment where you can register multiple LLM providers and switch between models as your coding tasks demand.
As of April 2026, BYOK supports Anthropic, Google Gemini, OpenAI, OpenRouter, Ollama (local), Microsoft Foundry Local, and others. You can now be precise: "I want Claude for this file" or "let me test this snippet with Gemini." This level of flexibility transforms how we approach code writing.
What BYOK Actually Does
Traditionally, GitHub Copilot relied on OpenAI's GPT-4 models running on GitHub's infrastructure. BYOK changes that by letting you bring your own credentials and specify a different model provider.
Here's the critical distinction: BYOK applies to Chat and slash commands, not to inline code completions. The ghost-text suggestions that appear inline as you type still come from GitHub's backend. With BYOK, you choose which model powers the Chat panel and commands like /explain or /tests.
In practical terms:
- ✅ You control which LLM the Chat panel uses
- ✅ Slash commands use your chosen model
- ❌ Inline completions remain GitHub-managed
This means structuring your development around Chat-driven workflows—asking, iterating, and refining—rather than relying on auto-completions alone.
Registering Multiple Providers
VS Code's GitHub Copilot extension allows you to register multiple providers in a few clicks.
// .vscode/settings.json (workspace configuration example)
{
  "github.copilot.advanced": {
    "providers": [
      {
        "name": "Anthropic",
        "models": [
          {
            "id": "claude-opus-4",
            "displayName": "Claude Opus 4",
            "apiKey": "YOUR_ANTHROPIC_API_KEY",
            "baseUrl": "https://api.anthropic.com/v1"
          },
          {
            "id": "claude-sonnet-4",
            "displayName": "Claude Sonnet 4",
            "apiKey": "YOUR_ANTHROPIC_API_KEY",
            "baseUrl": "https://api.anthropic.com/v1"
          }
        ]
      },
      {
        "name": "Google Gemini",
        "models": [
          {
            "id": "gemini-2.0-flash",
            "displayName": "Gemini 2.0 Flash",
            "apiKey": "YOUR_GEMINI_API_KEY",
            "baseUrl": "https://generativelanguage.googleapis.com/v1beta/openai/"
          }
        ]
      },
      {
        "name": "OpenAI",
        "models": [
          {
            "id": "gpt-4-turbo",
            "displayName": "GPT-4 Turbo",
            "apiKey": "YOUR_OPENAI_API_KEY",
            "baseUrl": "https://api.openai.com/v1"
          }
        ]
      },
      {
        "name": "Ollama (Local)",
        "models": [
          {
            "id": "llama2",
            "displayName": "Llama 2 (Local)",
            "apiKey": "not-needed-for-local",
            "baseUrl": "http://localhost:11434/v1"
          }
        ]
      }
    ]
  }
}
Key points:
- Each provider's apiKey holds that service's actual credentials
- baseUrl varies by provider; check their official documentation
- For local models like Ollama, no API key is required
- You can also configure this through VS Code's UI (Settings → GitHub Copilot → Advanced → Providers)
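
To make the roles of baseUrl and apiKey concrete, here is a minimal sketch of how a client might turn one model entry into an HTTP request. It assumes the provider exposes an OpenAI-compatible Chat Completions endpoint, as the OpenAI, Gemini, and Ollama URLs in the settings example do; Anthropic's native Messages API uses a different path and header. The buildChatRequest helper is hypothetical and is not a description of how Copilot itself sends requests.

```javascript
// Hypothetical helper: builds (but does not send) a chat request for one
// model entry shaped like the settings example.
// Assumes an OpenAI-compatible "/chat/completions" endpoint.
function buildChatRequest(model, prompt) {
  // Tolerate base URLs with or without a trailing slash.
  const url = model.baseUrl.replace(/\/+$/, "") + "/chat/completions";
  const headers = { "Content-Type": "application/json" };
  // Local servers such as Ollama ignore the key, so only send one if present.
  if (model.apiKey) {
    headers["Authorization"] = `Bearer ${model.apiKey}`;
  }
  const body = JSON.stringify({
    model: model.id,
    messages: [{ role: "user", content: prompt }],
  });
  return { url, method: "POST", headers, body };
}

const request = buildChatRequest(
  { id: "llama2", baseUrl: "http://localhost:11434/v1", apiKey: "" },
  "Explain this function"
);
console.log(request.url); // http://localhost:11434/v1/chat/completions
```

The returned object can be handed to fetch(request.url, request) once the local server or remote provider is reachable.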
Task-Based Model Selection
Smart development means matching your model to the work at hand. Different models have different strengths, and choosing the right one saves time and money.
Test Code Generation → Claude Opus
When you need comprehensive test coverage, deep reasoning is essential. Using /tests with Claude Opus catches edge cases you might otherwise miss.
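
To ground the test example that follows, here is one hypothetical calculateDiscount that such tests could target. The threshold of 100 and the 10% rate are assumptions inferred from the test descriptions, not a real pricing rule.

```javascript
// Hypothetical function under test; threshold and rate are assumptions
// inferred from the test names.
function calculateDiscount(amount) {
  if (typeof amount !== "number" || Number.isNaN(amount) || amount < 0) {
    throw new RangeError("amount must be a non-negative number");
  }
  // Purchases strictly over 100 get 10% off; exactly 100 does not qualify.
  return amount > 100 ? amount * 0.9 : amount;
}
```

The boundary at exactly 100 and the rejection of negative input are the kind of edge cases a thorough test suite should pin down.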
// Example of comprehensive test output from Claude Opus
describe("calculateDiscount", () => {
  it("should apply 10% discount for purchases over 100", () => {
    expect(calculateDiscount(150)).toBe(135);
  });
  // Edge cases automatically considered
  it("should return original amount for exact 100", () => {
    expect(calculateDiscount(100)).toBe(100);
  });
  it("should throw for negative amounts", () => {
    // toThrow needs a function to call, not the call's result
    expect(() => calculateDiscount(-50)).toThrow();
  });
});
Quick Documentation & Comments → Gemini Flash
For lightweight tasks such as adding JSDoc, naming variables, or writing comments, Gemini 2.0 Flash is a good fit. Its low latency suits quick requests like "add a docstring to this function."
// Gemini Flash handles quick documentation well
/**
 * Validates user authentication tokens and checks expiration
 * @param {string} token - JWT token to validate
 * @returns {boolean} True if valid, false otherwise
 */
function validateAuthToken(token) {
  // implementation...
}
Complex Refactoring Suggestions → Gemini 2.0 Pro
Gemini 2.0 Pro excels at offering several angles on a design. Ask it to "rewrite this class functionally" or "apply the Strategy pattern here," and it proposes thoughtful alternatives.
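
As an illustration of what a Strategy-pattern suggestion might look like, the sketch below replaces a hypothetical chain of if/else pricing branches with interchangeable strategy functions. The strategy names and rates are invented for the example and are not output from any model.

```javascript
// Hypothetical result of a Strategy refactoring: each pricing rule is an
// interchangeable function instead of a branch in one large conditional.
const discountStrategies = new Map([
  ["none", (amount) => amount],
  ["seasonal", (amount) => amount * 0.9],
  ["clearance", (amount) => amount * 0.5],
]);

function priceWith(strategyName, amount) {
  const strategy = discountStrategies.get(strategyName);
  if (!strategy) {
    throw new Error(`Unknown discount strategy: ${strategyName}`);
  }
  return strategy(amount);
}

console.log(priceWith("clearance", 80)); // 40
```

Adding a new rule then means registering one more entry rather than editing existing branches.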
Private/Sensitive Code → Ollama (Local)
For proprietary code, personal projects, or data requiring privacy, Ollama running locally is invaluable. Code never leaves your machine. The tradeoff is speed, but security-first contexts make this worthwhile.
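
The pairings in this section can be summarized as a small lookup table. This is only a sketch: the task keys and the pickModel helper are hypothetical names, and the rule that sensitive code overrides every other choice is an assumption layered on top of the article's recommendations.

```javascript
// Task-to-model pairings taken from this section; the keys and the
// pickModel helper are hypothetical names used for illustration.
const MODEL_FOR_TASK = new Map([
  ["tests", "Claude Opus"],
  ["docs", "Gemini 2.0 Flash"],
  ["refactor", "Gemini 2.0 Pro"],
  ["private", "Ollama (local)"],
]);

function pickModel(task, { sensitive = false } = {}) {
  // Assumption: sensitive code always stays local, whatever the task.
  if (sensitive) return MODEL_FOR_TASK.get("private");
  const model = MODEL_FOR_TASK.get(task);
  if (!model) throw new Error(`No model configured for task: ${task}`);
  return model;
}

console.log(pickModel("tests")); // Claude Opus
```

Writing the table down, even informally, makes it easier to share and revise the choices with a team.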
Enterprise & Team Management
For organizations rolling out BYOK across teams:
- Per-team provider assignment: Designate different providers for different teams
- Automatic credential rotation: Refresh API keys every 90 days
- Centralized usage monitoring: Track which models are used, how often, and by whom
- Budget controls: Set monthly spending caps per team to prevent overages
Configuration lives in GitHub Admin → Organization Settings → Copilot Policies.
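
The rotation and budget rules above reduce to simple checks. The sketch below shows that logic in isolation; it does not call any GitHub API, and the record shapes (team, costUsd) and function names are hypothetical.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// True when a key created at `createdAt` is at least `maxAgeDays` old.
function isRotationDue(createdAt, now = new Date(), maxAgeDays = 90) {
  return now.getTime() - createdAt.getTime() >= maxAgeDays * DAY_MS;
}

// Sums spend per team and returns the teams whose total exceeds their cap.
// `monthlyCaps` is a Map of team name to cap; teams without a cap never trip.
function teamsOverBudget(usageRecords, monthlyCaps) {
  const totals = new Map();
  for (const { team, costUsd } of usageRecords) {
    totals.set(team, (totals.get(team) ?? 0) + costUsd);
  }
  return [...totals]
    .filter(([team, total]) => total > (monthlyCaps.get(team) ?? Infinity))
    .map(([team]) => team);
}
```

In practice these checks would run on a schedule against whatever usage export your organization has available.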
Pairing BYOK with Antigravity
You can run Antigravity (Google's agent-first development environment) alongside BYOK-configured VS Code:
- Antigravity: Agent-driven tasks that span the editor, terminal, and browser
- VS Code + BYOK: Code-focused, with model choice per task
They operate independently, letting you optimize for different contexts without conflict.
Implementation Tips & Gotchas
- Store API keys securely
  - Never paste keys directly into settings.json
  - Use environment variables: process.env.ANTHROPIC_API_KEY
  - Always .gitignore your .env files
- Handle network latency
  - Set Ollama as a fallback for slow internet
  - If Chat responses feel sluggish, try a different model
- Learn each model's personality
  - Claude: logical, thorough, careful reasoning
  - Gemini: intuitive, fast, broad knowledge
  - GPT-4: versatile across domains
  - Share patterns with your team
- Remember: Completions stay with GitHub
  - BYOK doesn't change inline suggestions
  - Build workflows around Chat and slash commands instead
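
The first two tips can be sketched in a few lines for scripts you write around these providers. Both helpers are hypothetical: requireEnvKey reads a key from the environment and fails fast if it is missing, and withFallback tries a primary model call and switches to a fallback (for example, a local Ollama call) when the primary fails or takes too long. The 5-second default timeout is an arbitrary assumption.

```javascript
// Reads a key from the environment and fails fast if it is missing.
function requireEnvKey(name, env = process.env) {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// Tries `primary`; if it rejects or exceeds `timeoutMs`, runs `fallback`.
// Both arguments are hypothetical async functions that return a reply.
// Note: a timed-out primary call is abandoned, not cancelled.
async function withFallback(primary, fallback, timeoutMs = 5000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("primary timed out")), timeoutMs);
  });
  try {
    return await Promise.race([primary(), timeout]);
  } catch {
    return await fallback();
  } finally {
    clearTimeout(timer);
  }
}
```

A call might look like withFallback(() => askCloudModel(prompt), () => askLocalModel(prompt)), where both functions are placeholders for your own request code.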
A multi-model development environment isn't just convenient—it's liberating. You're no longer constrained to a single model's strengths and weaknesses. Build the setup that fits your work, and watch your productivity shift.