Reading through the VSCode 1.118 release notes, I kept thinking "this is quietly significant." On the surface it's about adapting to Copilot's pricing change. But underneath, there are architectural-level efficiency improvements that directly affect how AI-assisted coding works day to day.
GitHub Copilot Moves to Pay-As-You-Go in June 2026
The context first: starting June 1, 2026, GitHub Copilot adds a consumption-based pricing option alongside existing monthly plans. VSCode 1.118 was released at the end of April with these changes built in ahead of time.
Pay-as-you-go means heavy users might pay more in high-usage months, while lighter users can pay less. The flat monthly subscription isn't going away; it's becoming one option among several.
Prompt Caching (KV Cache) Integration
The change with the most direct cost impact is deeper use of prompt caching.
VSCode 1.118 stores frequently referenced context in a KV cache when using Anthropic models (Claude). When the cache hits, the cached portion of the prompt costs roughly one-tenth of what it would in a normal request.
This pays off when you're referencing the same files repeatedly, or when long system prompts are sent across multiple turns. Copilot users backed by Claude get this benefit automatically, with no configuration required.
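To make the "one-tenth" figure concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function name, the token counts, and the idea that only the cached part of the input gets the discount. It ignores output tokens and any surcharge for writing to the cache, and it says nothing about Copilot's actual rates.

```python
def input_cost(cached_tokens: int, fresh_tokens: int,
               price_per_token: float = 1.0,
               cache_read_multiplier: float = 0.1) -> float:
    """Estimate input cost for one request, in arbitrary units.

    cached_tokens are billed at the discounted rate (assumed ~1/10),
    fresh_tokens at the full rate. All numbers are hypothetical.
    """
    discounted = cached_tokens * cache_read_multiplier
    return (discounted + fresh_tokens) * price_per_token


# Hypothetical request: 10,000 tokens of repeated context plus a 500-token question.
miss = input_cost(cached_tokens=0, fresh_tokens=10_500)   # nothing cached yet
hit = input_cost(cached_tokens=10_000, fresh_tokens=500)  # context served from cache

print(f"cache miss: {miss:.0f} units")
print(f"cache hit:  {hit:.0f} units ({hit / miss:.0%} of the miss cost)")
```

Under these assumptions the whole request lands somewhat above one-tenth of the uncached cost, because the new question is still billed at the full rate. The larger the repeated context is relative to the new text, the closer you get to the one-tenth floor.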
Improved Agent Context Management
When custom agent skills call multiple tools or load large reference documents, the chat context can balloon quickly. This causes a well-known degradation: the AI's responses become less coherent mid-session as the context fills with redundant tool results.
VSCode 1.118 compresses already-processed tool results before storing them in context. If you've experienced "the AI went weird in the middle of a long session," context pollution was likely part of the cause. This update should make that happen less often.
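The release notes don't spell out how the compression works, so the following is only a minimal sketch of the general idea, not VSCode's implementation. It assumes a simple chat history where older tool results are truncated once the model has already seen them, while the most recent one stays intact. The `Message` type, the function name, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Message:
    role: str      # "user", "assistant", or "tool"
    content: str


def compress_processed_tool_results(history: list[Message],
                                    keep_recent: int = 1,
                                    max_chars: int = 80) -> list[Message]:
    """Return a copy of history with older tool results truncated.

    The last `keep_recent` tool results are left untouched, on the
    assumption that the model may still need them in full.
    """
    tool_indices = [i for i, m in enumerate(history) if m.role == "tool"]
    recent = set(tool_indices[-keep_recent:]) if keep_recent > 0 else set()

    compacted = []
    for i, msg in enumerate(history):
        is_old_tool = msg.role == "tool" and i not in recent
        if is_old_tool and len(msg.content) > max_chars:
            omitted = len(msg.content) - max_chars
            short = f"{msg.content[:max_chars]} [truncated {omitted} chars]"
            compacted.append(replace(msg, content=short))
        else:
            compacted.append(msg)
    return compacted


history = [
    Message("user", "Why is the build failing?"),
    Message("tool", "log line\n" * 200),   # old, already-processed result
    Message("assistant", "The linker step fails."),
    Message("tool", "diff output\n" * 50),  # latest result, kept in full
]
before = sum(len(m.content) for m in history)
after = sum(len(m.content) for m in compress_processed_tool_results(history))
print(f"context size: {before} -> {after} characters")
```

A real implementation would more likely summarize than truncate, but the effect on the context is the same: stale tool output stops crowding out the parts of the conversation that still matter.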
Cleaner Extension API Permissions
The extension API got updated alongside the Copilot pricing changes, clarifying the permission model for third-party extensions that access Copilot's context.
This makes it easier for tools like database connectors and API testing extensions to integrate properly with Copilot's agent features, rather than working around ambiguous access boundaries.
What the Pay-As-You-Go Shift Means in Practice
Some usage patterns will cost more, some less, once consumption billing is active.
Patterns that may cost more:
- Keeping Copilot open all day with constant real-time completions
- Sending large code blocks for re-generation repeatedly
Patterns that may cost less:
- Asking questions in ways that hit the prompt cache
- Batching related work into fewer, longer conversations rather than many short ones
The shift from flat monthly pricing to consumption billing is a useful forcing function for thinking about efficiency. Being intentional about when to call the AI and how to structure conversations can make a real difference.
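As a rough illustration of why batching can help, the sketch below compares one longer conversation against several one-off conversations that each resend the same context. It is a toy model under loud assumptions: cached input is billed at one-tenth of the normal rate, the cache does not carry over between separate conversations, and model output is ignored. The function name and all token counts are made up for the example.

```python
def conversation_input_cost(shared_context: int, tokens_per_turn: int,
                            turns: int,
                            cache_read_multiplier: float = 0.1) -> float:
    """Estimate total input cost of one conversation, in full-price token units.

    Turn 1 pays full price for everything. Later turns pay the discounted
    rate for the prefix already sent and full price for the new message.
    """
    total = 0.0
    prefix = shared_context  # tokens sent before the current message
    for turn in range(turns):
        prefix_rate = 1.0 if turn == 0 else cache_read_multiplier
        total += prefix * prefix_rate + tokens_per_turn
        prefix += tokens_per_turn  # this turn's message joins the history
    return total


# Hypothetical workload: six questions about the same 8,000 tokens of code.
one_long = conversation_input_cost(8_000, 400, turns=6)
six_short = 6 * conversation_input_cost(8_000, 400, turns=1)

print(f"one 6-turn conversation:      {one_long:,.0f} units")
print(f"six single-turn conversations: {six_short:,.0f} units")
```

With these made-up numbers the single long conversation comes out at well under half the cost of the six short ones. If the cache is in fact shared across conversations, the gap narrows, so treat this as a way to reason about the pattern rather than a prediction of your bill.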
Looking back
VSCode 1.118 isn't about headline features. It's about the AI coding infrastructure maturing: prompt caching, context management, and pricing model adaptation all in one release. For developers who rely on Copilot daily, these are improvements you'll feel without necessarily noticing them explicitly.
Before June's pay-as-you-go transition, it's worth taking a look at your own usage patterns. Small adjustments in how you interact with Copilot can meaningfully change what you end up paying.