Claude: Elevated errors across many models [resolved]
Posted by forks 19 hours ago
Comments
Comment by evilturnip 19 hours ago
Claude Code is especially buggy in Windows Terminal. The rendering is slow and choppy, and lines frequently get garbled.
In contrast, the antigravity CLI is the exact opposite: fast, smooth and very responsive.
Comment by crystal_revenge 18 hours ago
This is an aside, but I'm really struck by how many people on HN use Windows (based on repeated mentions I've seen in comments). I've worked for a pretty wide range of companies over the last decade and only one, maybe two companies even had any people that worked on Windows machines. I haven't worked at a company where devs used Windows in 15 years (and even that company eventually switched to Linux).
As I've gotten deeper into LLM/AI roles, even Macs seem to have only about an equal share compared to devs running full Linux setups.
Is this just a sign that a larger and larger portion of HN users are working for large corporations? I honestly can't even remember the last time I saw a serious developer pull out a Windows laptop.
Comment by pixelpoet 18 hours ago
For a long time, high-end graphics and games were mainly done on Windows and Visual Studio. I'm from that world, and only made the Linux transition in November last year after Microsoft forced everyone's hand.
Comment by dmd 16 hours ago
Comment by kowbell 15 hours ago
This implies games are not still mainly done on Windows, which they absolutely are!
Comment by smcleod 16 hours ago
Comment by recursive 18 hours ago
Comment by maccard 17 hours ago
Comment by lmc 18 hours ago
Comment by tracker1 17 hours ago
Comment by freedomben 17 hours ago
Comment by latentsea 18 hours ago
Comment by JauntyHatAngle 18 hours ago
A lot of devs like gaming. Gaming is simpler on Windows. Gaming PCs are usually high spec. High spec is good for most coding.
That's why I use Windows quite often. My laptop is Linux, but when I'm running heavy models I'll still remote into my main Windows PC, which I also use for gaming.
Though in terms of workplaces - sure, I reckon you're on the money. Big corps often still force Windows onto their devs.
Comment by coldtea 18 hours ago
Comment by computerex 18 hours ago
Comment by alephnerd 17 hours ago
Even the HN dataset on HuggingFace shows that most engagement on HN is now during non-US hours [0] and drops off as Europe goes to sleep.
[0] - https://huggingface.co/datasets/open-index/hacker-news
Comment by nilkn 13 hours ago
Comment by slashdave 16 hours ago
That said, I bet you will be really hard pressed to find a single Windows machine at Anthropic.
Comment by leemoore 17 hours ago
Comment by garyfirestorm 16 hours ago
Comment by gambiting 18 hours ago
Windows really has fantastic support for C++ and rendering programmers imho, the tooling is world class and Visual Studio has no match as an IDE. Even if somehow my tools worked on Mac or Linux I'd still pick Windows out of the sheer convenience of using it for work.
But as things stand - all major console toolchains are Windows only. If you're making a game for PlayStation, Xbox or Switch, you have to be on Windows.
Comment by vidarh 18 hours ago
Comment by sterlind 16 hours ago
Comment by Atotalnoob 11 hours ago
Comment by szatkus 16 hours ago
Comment by watwut 16 hours ago
I think that your range of companies was much less wide than you think.
> I honestly can't even remember the last time I saw a serious developer pull out a Windows laptop.
What is an unserious developer?
Comment by ls612 17 hours ago
Comment by recursive 16 hours ago
Comment by cute_boi 18 hours ago
Comment by simsla 17 hours ago
But I do prefer Linux or a Mac for development, just because it's so much less hassle to set up.
Windows developers who don't set up a good terminal environment... I honestly don't know how they manage.
Comment by moronicles 17 hours ago
Comment by mgfist 19 hours ago
Comment by simplyluke 18 hours ago
Comment by Closi 18 hours ago
Coding is probably solved, at least to a large extent, but that doesn't mean engineering is solved too.
This is like someone saying that the Wright brothers solved sustained, powered human flight, and another person saying "well if that's the case, why do planes still crash? Obviously flight isn't solved." Well, there are always improvements, but planes can fly and LLMs can code.
Comment by smoe 17 hours ago
They've been claiming more than just “coding is solved” for a while now.
https://www.businessinsider.com/anthropic-claude-code-founde...
Comment by codeflo 18 hours ago
Comment by jacobgold 18 hours ago
The risk for them is that someone matches their products while also having non-janky products and reliable services.
Distributed systems infrastructure, especially, is much less forgiving of vibe coding than application code. Coding agents are not even close to being good enough to design and build large-scale systems the way expert humans can.
There is nothing wrong with using agents to help write infrastructure code, but these systems have a way of punishing anyone who builds things they do not fully understand.
I'd love to see either Anthropic or OpenAI really step up their infrastructure game.
Comment by smoe 17 hours ago
It's worth noting that OpenAI's official uptime numbers are significantly better than Anthropic's:
99.9x% for API/Codex, versus <99.5% for Claude API/Claude Code.
I'd obviously like OpenAI's numbers to be higher too, but this is one reason it really annoys me when the head of Claude Code goes on podcasts implying that software engineering as a whole, not just the act of writing out code, is basically solved.
One wonders why months of presumably near-unlimited internal Mythic use haven't yet solved the issues unrelated to hardware shortages.
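For a sense of scale, here is a rough Python sketch of what those uptime figures mean as downtime per year. It is only back-of-the-envelope arithmetic: it assumes 99.9% as a floor for the "99.9x%" figure and 99.5% as a ceiling for the "<99.5%" figure, and the function name is made up for illustration.

```python
# Back-of-the-envelope downtime implied by an uptime percentage.
# Assumptions: a 365-day year, and the two percentages below are
# bounds read off the figures quoted above, not exact values.
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Hours per year a service is down at the given uptime percentage."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    for pct in (99.9, 99.5):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% uptime -> about {hours:.1f} hours down per year")
```

Under those assumptions, 99.9% works out to under 9 hours of downtime a year, while anything below 99.5% means more than about 44 hours.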
Comment by simsla 17 hours ago
I see a lot of people speaking about a 10x productivity improvement and so on. When I work on hobby projects, I do see that. Just last weekend, I set up a hobby project that I've been thinking about for a while. I'm pretty sure it would have taken me at least a week to implement manually, but instead, it took me three hours.
But some of the systems I have to work on during the day are so big and complicated that you can spend multiple days on a small feature or even just tracking down a bug, even with the support of Opus.
Expecting even a 2x productivity improvement on some of those systems is wildly unrealistic. I'm seeing a lot of people get stressed out because the productivity gains from simple application building trickle into the expectations for these complex systems.
That said, if things keep improving at this rate it might just be a matter of time.
Comment by mdavid626 19 hours ago
Claude Code is sluggish, buggy, slow. Typical big enterprise garbage. The only good things at Anthropic are the models.
Comment by CharlesW 19 hours ago
The same post on Claude Code: "Even though the System Prompt and tool descriptions are clearly more verbose, most of the extra tokens encode product features and rational design choices: a memory system, scheduled tasks, sub-agents, plan mode, worktree support. Whether those features are worth paying for depends on your needs. Calling the prompt 'bloated' without looking at the whole picture feels wrong to me."
Comment by skybrian 18 hours ago
Comment by mdavid626 7 hours ago
Did you even use pi.dev?
I've used them both and I can tell you there isn't a single Claude Code feature that pi.dev lacks.
Comment by ai_slop_hater 18 hours ago
Comment by orphea 19 hours ago
Comment by isoprophlex 18 hours ago
Comment by 0123456789ABCDE 17 hours ago
Comment by JVerstry 18 hours ago
Comment by _pdp_ 18 hours ago
Comment by spondyl 15 hours ago
Our org has been attempting to trial Fast Mode on and off but enabling it in Claude Code just says something like "not available with your cloud provider"
It turns out that for all of the matrices on the documentation site, there is a secret "other" set of infrastructure for those who bill Enterprise via AWS Marketplace where certain features like Fast Mode are incompatible when using "aws routing"
Anthropic Support straight up mentioned that Claude Code just doesn't handle this case correctly and erroneously gives you the impression that it's supported against your billing method, and that it's all effectively undocumented.
I suppose this means there are like 5 different AWS methods of use:
- Bedrock (Legacy)
- Bedrock
- Claude for AWS
- Claude via Marketplace
- Anthropic's own "primary infrastructure"
and then roll in other cloud provider variations
On that note, I recall Datadog coupling billing and infrastructure per cloud (i.e., billing via GCP requires using the GCP infra). Do any commenters have insight into whether there is some special requirement or complexity around marketplace billing for cloud providers, or is it just a weird design choice?
Comment by winstonp 19 hours ago
Comment by jatora 19 hours ago
Comment by Raj_Sidwadkar 11 hours ago
It takes hours to hunt down these errors, and you can't risk telling the local LLM to fix them again.
Comment by sunaookami 19 hours ago
Comment by comboy 3 hours ago
Comment by chankstein38 17 hours ago
Comment by monooso 19 hours ago
Or possibly as a result of.
Comment by smcleod 16 hours ago
Comment by segmondy 17 hours ago
Comment by celsoazevedo 18 hours ago
Comment by hotfixguru 18 hours ago
Comment by vidarh 18 hours ago
Comment by pixard 16 hours ago
But hey coding is a solved problem.
Comment by rpcope1 18 hours ago
Comment by maleldil 18 hours ago
Comment by matheusmoreira 17 hours ago
Good to see it's not just me...
Comment by witx 17 hours ago
Comment by quatonion 19 hours ago
Can you post some images of lines getting garbled? That sounds like a genuine bug Anthropic might want to look into. I haven't ever seen that.
Comment by idiotsecant 19 hours ago
Comment by quijoteuniv 19 hours ago
Comment by pier25 16 hours ago
Comment by cmrdporcupine 18 hours ago
I prefer it over opencode, which is the other option I use with my Codex sub.
Comment by vidarh 18 hours ago
Comment by cmrdporcupine 18 hours ago
Comment by vidarh 17 hours ago
Comment by tcp_handshaker 19 hours ago
Comment by colechristensen 19 hours ago
Claude harnesses have plenty of bugs but I prefer capability over interface shininess any day. (though if I were running the show I'd have a sizable team set aside to do exclusively boring stability and polish work)
Comment by throwaway613746 18 hours ago
Comment by eatsyourtacos 19 hours ago
That sounds like a you issue. It's wonderful in the terminal. It's their GUI which needs work (they have been improving it, but I'm still not a fan).
I've been using it on multiple computers for months and it's generally rock solid and lovely.
Comment by Wowfunhappy 18 hours ago
Except, starting this morning, one very long running session decided to start spawning subagents for each task. I'm not sure what caused this emergent behavior, but it seemed to be working fine, so I was eager to see where it went.
Except, as soon as a subagent hits a 500 error, the main agent seemingly doesn't know what to do. It kind of panics—"now the tree/install state is unknown!"—and ultimately does a git checkout "to verify and restore a known-good state before anything else".
I've paused the job for now since it's a sort of background experiment.
Comment by wxw 19 hours ago
Comment by testfrequency 17 hours ago
That said, I feel icky, like I just made a Facebook account in 2026 :(
Comment by redox99 15 hours ago
Codex ever since ~5.2 has been better at long tasks in large codebases.
Comment by throwaw12 18 hours ago
Can you move it to background connection?
Comment by Alifatisk 16 hours ago
Comment by enraged_camel 19 hours ago
I still use Codex, but mostly when I need to check Opus 4.8's work. Pretty sure I will stop doing that soon, because during the short time Fable was available, Codex was not able to find any important issues with the code Fable wrote.
Comment by nostrebored 18 hours ago
Both were trivial to set up with codex.
Comment by wxw 19 hours ago
Haven’t tried Cowork, interesting. Isn’t it just the same agent minus the git worktree based UI?
Frankly, neither Claude nor Codex is as good as the hype suggests.
Comment by antupis 19 hours ago
Comment by sunaookami 19 hours ago
Comment by nostrebored 18 hours ago
Comment by orphea 18 hours ago
Comment by black_knight 16 hours ago
Personally, Claude Opus (and in the few interactions I had with it, Fable) has been by far the superior experience. GPT-5.5 seems dumber and more certain when presenting me with bullshit. Opus has better humor, and is less pretentious in its presentation. But this may all boil down to how the models react to my prompting.
What is without a doubt is that I wish they both were more intelligent – or maybe it is their wisdom I find lacking!
Comment by ai_slop_hater 18 hours ago
Comment by vmg12 18 hours ago
This is all wrong.
Comment by cute_boi 18 hours ago
Comment by bastard_op 19 hours ago
Comment by jonas21 18 hours ago
Comment by nikanj 18 hours ago
Comment by Wowfunhappy 18 hours ago
Comment by thinkingtoilet 18 hours ago
Comment by blitzar 19 hours ago
Comment by time0ut 19 hours ago
Comment by InsideOutSanta 19 hours ago
Comment by nullpoint420 19 hours ago
Comment by re-thc 19 hours ago
Comment by tcp_handshaker 19 hours ago
Comment by blitzar 19 hours ago
Crazy thing is ... it's true.
Comment by rootlocus 19 hours ago
Comment by jdiff 19 hours ago
Comment by ceejayoz 18 hours ago
Comment by spicyusername 19 hours ago
Comment by iAMkenough 18 hours ago
Comment by pton_xd 17 hours ago
Comment by solomonb 17 hours ago
Comment by bravetraveler 19 hours ago
Can't wait for debugging to be solved. Hell, I might even subscribe for 'mostly'.
Comment by m_ke 19 hours ago
Comment by unshavedyak 19 hours ago
Comment by quatonion 19 hours ago
Comment by InsideOutSanta 19 hours ago
Comment by winstonp 19 hours ago
You just have to pay API prices.
Comment by maleldil 17 hours ago
Comment by uhuhuhuhuhuh 19 hours ago
Comment by 0xbadcafebee 17 hours ago
Comment by dwa3592 19 hours ago
Comment by jatora 19 hours ago
Comment by DonHopkins 19 hours ago
Comment by sharts 18 hours ago
Comment by re-thc 19 hours ago
Don't jinx it. They might use that name for their next model.
Comment by polack 17 hours ago
Neat. Didn't have a single request go through for 2 hours. Guess they need to improve their metrics before the IPO...
Comment by acedTrex 19 hours ago
Comment by figmert 19 hours ago
Yes I know you can run offline models, but it's hard to pass up on a little bit of snark.
Comment by t1234s 18 hours ago
Comment by xmprt 18 hours ago
Comment by ta-run 19 hours ago
Comment by Animats 16 hours ago
Comment by radium3d 18 hours ago
Comment by sharts 18 hours ago
Comment by timmytokyo 17 hours ago
Comment by tom_808 16 hours ago
Comment by jamesgrimshaw 19 hours ago
Comment by paulddraper 19 hours ago
Comment by sixothree 19 hours ago
Comment by cesarvarela 19 hours ago
Comment by champagnepapi 19 hours ago
Comment by Surac 19 hours ago
Comment by gottagocode 19 hours ago
Comment by caycep 19 hours ago
Comment by EstanislaoStan 19 hours ago
Comment by jr3592 19 hours ago
Comment by viccis 18 hours ago
Comment by tcp_handshaker 19 hours ago
Comment by throwaw12 18 hours ago
Comment by swader999 19 hours ago
Comment by RishiByte 19 hours ago
Comment by pedromlsreis 18 hours ago
Comment by quatonion 19 hours ago
Comment by consumer451 18 hours ago
Comment by 7e 19 hours ago
Comment by rvz 19 hours ago