From reading the article: they offered their developers both Claude Code and Copilot.
What they wanted was for them to use both and give feedback on which was better.
The developers voted with their feet and didn’t use Copilot.
What Microsoft were hoping was that the opposite would happen...
There's definitely a way to use Claude Code that is token-conscious.
I've tried throwing unsupervised agentic software factory workflows against the wall, and they burned through my tokens like nobody's business but didn't produce much.
A supervised, human-in-the-loop process, on the other hand, is much more productive and doesn't consume nearly as many tokens. Maybe that's why everyone's pushing agentic approaches so much.
So, a snippet from the article says the following:
> I understand that Microsoft is planning to remove most of its Claude Code licenses and push many of its developers to use Copilot CLI instead. While Claude Code has been a popular addition, it has also undermined Microsoft’s new GitHub Copilot CLI coding tool — a command line version of GitHub Copilot that runs outside of development apps like Visual Studio Code.
And people here are interpreting this as mainly related to Claude burning too many tokens too quickly, and suggesting Microsoft should rather use SomeOtherLLM©?
Is this Hacker News or rather Marketing Wars?
Feels about right.
I launched an internal demo of Claude Code and Deepseek on the same day, and we burned through our monthly allowance for Claude in just over a week, with more than half of that budget being spent in one day. With DS, people are unable to go through that same amount of money in a month, not even close.
As a result, Claude feels like an expensive toy, while DS is a shovel, purely because developers do not feel like they are eating into a precious resource while using it. Also, it does not feel like there is much of a difference in capability between Claude and DS-pro. DS-pro and flash do feel like Sonnet/Opus and Haiku, but flash is still very, very capable.
Related: Microsoft-owned GitHub recently switched to token-based billing:
https://github.blog/news-insights/company-news/github-copilo...
Claude tokens are priced by GitHub at a disproportionately premium price compared to Gemini and OpenAI. I wonder why?
https://docs.github.com/en/copilot/reference/copilot-billing...
Our shop is forced to use Copilot on gov cloud, and it’s so useless I usually stick to manually coding. Its syntax is messy, it randomly combines lines together, flips order, or drops a couple of tokens' worth of output in the middle of a line, and for some reason it consistently drops the last line of every code block. I assume we’re getting a few versions back of GPT under the hood. But it does make me appreciate how the models of the past year or so crossed the threshold from interesting to truly productivity-enhancing.
Between Copilot, Claude, and Gemini, I still actually prefer Gemini. I do a lot of scientific writing in addition to coding and Gemini is the only model I can trust to “just be right”. This trust then transfers over to its code output.
I have noticed, particularly in recent weeks and maybe the last couple of months, that token costs are just ridiculous. I can understand the upcoming IPOs and instinctive pressure to show profits ... but let's be honest, showcasing burning 1.3 million USD in tokens by a single developer in a month is the most ridiculous thing I have seen in my entire life. The general principles still apply. You expect to invest X and get a return on that investment. Unfortunately that's not so easy to promise or expect. There's no real 1 to 1 correlation between amount of code written and returns, and even less between tokens burned and returns. I'm starting to believe that the current token pricing approach, followed at the moment by all leading labs (especially considering OS models' capabilities), is borderline delusional ...
My experience is that Claude Code burns way more tokens than other agents, probably to ensure high levels of perceived quality, which is, most of the time, not worth the bloat for the user. The bloat works for Anthropic as an advertisement at the cost of your tokens.
This does kind of beg the question: If developers are being laid off because AI is better/faster/cheaper or makes all their people 10x or whatever fig leaf, what happens if the required tooling ends up being more expensive? From the investor’s point of view, is the drag of employee costs better or worse than a ballooning expense item?
I’ve been quite content with Copilot’s $10/mo plan. It still offers access to Claude models (limited tokens) but has no time limits like the $20 Claude plan, so no interruptions in workflow. I use one of the free models for the more pedestrian tasks, then sic Claude on the particularly thorny problems. Works very well for me.
More here: Microsoft reports are exposing AI's real cost problem: Using the tech is more expensive than paying human employees - https://fortune.com/2026/05/22/microsoft-ai-cost-problem-tok...
Cancellation effective June 30. This was a _pilot_ launched in December that accidentally consumed their 2026 yearly target spend on AI!
I expect the r/LocalLLaMA guys to be going nuts about this news.
The title is somewhat bait. It reads like MSFT is using less AI, while in fact it's just a forced swap to Copilot.
Arguably, Copilot is GPT 5? Not sure what the CLI offers behind the covers.
I'm surprised they even had them in the first place. Doesn't Microsoft have a deep partnership with OpenAI? Aren't all Copilot things powered by various GPT models? I would assume the two companies have barter agreements of sorts.
This might actually be clever, since Microsoft devs will be longing for Claude Code features, which might result in Copilot getting way better.
Well, that's the inevitable outcome of token-maxxing :shrugs:
Lots of these places measure employee token use with managers having dashboards. It seems like performative code production rather than making anything useful.
Speed without judgement always compounds badly.
Reminds me of when Steve Ballmer forbade his children to use iPods and pushed towards the Zune instead. Hahaha
How efficient is Claude at cleaning up unused code and making things more simple - as good as it is at adding code / features?
I switched to OpenRouter and OpenCode a while ago. It is much cheaper, much much cheaper, and A LOT more reliable. Gemini in particular was a piece of trash when it came to uptime.
If you properly keep documents, architecture, and decision records, token consumption can be pretty low. I am managing everything with two Codex Plus subscriptions. Repo size is 300k LOC (backend).
I switched from Claude Code to the GitHub Copilot app recently. Since our repositories are hosted on GitHub, I find the Copilot app better integrated for the PR workflow, with PR management available in the app. I don’t think I miss any of the features of Claude Code. I never thought I would make the switch, but Copilot upped its game.
Also, it became very hard to convince management to keep both Claude Code and GitHub Copilot enterprise licenses.
I think what's funny is that employees were most likely already covering the cost of these tools because they are useful. Companies didn't believe employees were using these tools, and now they have forced their usage and no longer have the costs subsidized.
Similarly, companies seem to reward high token usage as a sign of someone willing to play ball with AI, and again have forced higher costs on themselves through people reward hacking or using tokens out of spite.
That's very interesting to reconcile with the fact that, not too far away, Amazon employees feel incentivized to use as many tokens as possible.
after having used claude for quite some time, i would buy puts on microsoft
Surely a company as large as Microsoft is actively attempting to build their own models. They couldn't possibly have expected to stake the future of their software development on the conditions of a third party company?
The way coding agents work is fantastically wasteful. All the megabytes of code are processed over and over and over, sometimes within just one session.
There are papers describing KV cache precomputation for commonly used documents (e.g. KVLink), but, of course, it's not a priority for model providers: they'd rather sell you more tokens, also they would rather get to AGI/ASI first than optimize usage of existing models...
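To put rough numbers on the "over and over" part, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function name, the context sizes, and the simplification that each turn re-reads the full context while an idealized prefix cache would only process tokens it has not seen before. It ignores output tokens and whatever caching a given provider already applies.

```python
def session_input_tokens(base_context, tokens_added_per_turn, turns):
    """Return (uncached, cached) input tokens processed over one session.

    Hypothetical model: the agent starts with `base_context` tokens
    (system prompt plus code) and every turn appends
    `tokens_added_per_turn` tokens of history.
    """
    uncached = 0  # every turn re-processes the entire context
    cached = 0    # idealized prefix cache: only unseen tokens are processed
    context = base_context
    for turn in range(turns):
        uncached += context
        cached += base_context if turn == 0 else tokens_added_per_turn
        context += tokens_added_per_turn
    return uncached, cached


# Made-up session: 100k tokens of code, 2k new tokens per turn, 50 turns.
uncached, cached = session_input_tokens(100_000, 2_000, 50)
print(f"uncached: {uncached:,} tokens, cached: {cached:,} tokens")
```

Without reuse, the total grows with the number of turns times the context size, plus a quadratic term from the growing history, which is the waste described above.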
"everybody needs to use these new AI tools or you will be left behind. no! not like that! the cheap, worser ones!"
Microsoft should host DeepseekV4 internally for its developers. And you're welcome.
What's the point of eating your own dog food when the only thing you are doing is reselling other people's dog food? Microsoft don't have any competing LLM.
Tokens aren’t that much of an issue when you're not evaluated on the usage.
They got DeepSeek on Azure, would cut costs by 10x … if they ran it on Huawei
I think tech companies are doing layoffs partly because they need to cover AI operating expenses.
How would one call such a strategy? Embrace and extend comes to mind.
I switched from Anthropic to OpenAI after spending ~$40K in equivalent token costs using Claude over 3 months.
I found Opus 4.7 to be slow and wasteful with token usage. It's shocking how inefficient it is with tasks like bash tool usage and web searching, delegating them to a dozen subagents only to get stuck and never return until you esc and intervene. That, in addition to all of the broken tooling Anthropic built in to limit token usage, like the broken monitoring tool, made managing Claude a chore. I was happy to pay $200/month for Opus 4.5 when they had more capacity, but 4.7 felt like a huge step back and no longer worth the price and inconvenience.
I remember an OpenAI employee comment on the GPT5.5 release post about how they specifically geared it towards long-horizon tasks, and it's been a breath of fresh air in that regard. I have five two-week long sessions going right now and there's been no degradation in performance or efficiency. It's much better at carrying rules/learnings forward even in long-running sessions and grounding/refreshing itself in verified facts when it loses context.
It's funny because in two weeks I've gotten way more done with GPT5.5 with way fewer tokens and way less handholding. I think this goes to show how important the tooling and the harness are, and how a capable model like Opus 4.7 can be severely handicapped by bad product decisions.
It seems that people are using LLMs to generate code, but many complain of subpar code. I recall the early days of virtualization, when folks would use it but complain about performance. HW capacity continued to improve until virtualization became the de facto standard. I wonder if subpar code will become better as more powerful agents, models, or compute become available.
My impression is they're being cancelled in favor of full internal adoption of Copilot CLI, which has got much better over the past few months.
It's been said that technologies are not products. CC might be better, but at the end of the day M$ is going to want to cut costs and have employees use their own technology. Perhaps Copilot CLI is close enough, and the CC product doesn't justify the cost of the Claude (technology) license when M$ has their own technology to leverage.
Side note, it's so frustrating that The Verge puts a paywall at the fold. It makes me feel like the rest of the story is not worth reading. I'm not inclined to pay $2 to read a link that was posted on an aggregator.
To be fair, Microsoft dogfooding something for once would be great.
What per cent of internal Microsoft IP runs through Anthropic? Do they not care about trade secrets, or are certain groups allowed or not allowed to use tools that expose IP to external vendors?
This feels like the kind of bad incentive problem we always hear about on here ... like bugs and vipers.
Doesn't MS have the compute to run GPT 5.5 for all its employees?
This is an AI generated summary of a blog post (https://www.thelowdownblog.com/2026/05/microsoft-cancels-int...) which is a summary of an AI generated article (https://blazetrends.com/microsoft-cancels-claude-code-pilot-...) which is a summary of another AI generated article (https://www.themodelwire.com/article/microsoft-starts-cancel...) which is a summary of an article from The Verge (https://www.theverge.com/tech/930447/microsoft-claude-code-d...). I guess it would be better to link the Verge article instead.
AI slop ruined a story about AI? This thread is a story about itself.
Microsoft poorly manages token use of the most expensive models in a pilot. Then they use that failure to advertise/position their own GitHub Copilot agents to procurement teams, over the now widely validated Claude Code-based agents.
At least Codex is trying to win validation on merit.
The comments I see recommending selective use of cheaper models don't match the reality I experience working in the industry. I have the constant threat hanging over my head of being fired if I don't churn out code quickly enough. I'm not willing to gamble with my livelihood by using a less effective model.
Saving money on tokens isn't something that's rewarded during performance reviews, particularly because it's difficult to quantify how much you saved versus hypothetically using a more expensive model.