After outages, Amazon to make senior engineers sign off on AI-assisted changes
Posted by ndr42 11 hours ago
https://www.ft.com/content/7cab4ec7-4712-4137-b602-119a44f77... (https://archive.ph/wXvF3)
https://twitter.com/lukolejnik/status/2031257644724342957 (https://xcancel.com/lukolejnik/status/2031257644724342957)
Comments
Comment by cobolcomesback 9 hours ago
This meeting happens literally every week, and has for years. Feels like the media is making a mountain out of a molehill here.
Comment by davidclark 9 hours ago
>He asked staff to attend the meeting, which is normally optional.
Is that false? It also discusses a new policy:
>Junior and mid-level engineers will now require more senior engineers to sign off any AI-assisted changes, Treadwell added.
Is that inaccurate? It is good context that this is a regularly scheduled meeting. But, regularly scheduled meetings can have newsworthy things happen at them.
Comment by djb_hackernews 4 hours ago
My SVP asks me to do things all the time, indirectly. I do probably 5% of them.
Comment by MikeTheGreat 2 hours ago
Ok, this is pretty off-topic, but is this still true? I get that you can't have 10K people all actively participate in the meeting at the same time, but doesn't Zoom have a feature where you can broadcast to thousands and thousands?
Doesn't X/Twitter have a feature like this? (Although, to be fair, the last time I heard about that it was part of a headline like "DeSantis announcement of Presidential run on X/Twitter delayed for hours as X/Twitter's tech stack collapses under 200K viewers")
But still - nowadays it seems like it should be possible to have 10K employees all tune in at the same time and then call it a meeting, yes?
Comment by hibikir 2 hours ago
Very different from the typical weekly/monthly outage meeting, where discussion is actually expected, instead of being a ritual.
Comment by sheept 2 hours ago
Comment by javcasas 3 hours ago
Comment by hyperpape 2 hours ago
Scale cuts both ways.
What matters isn't how big the meeting is, it's how important the material is, and how well presented it is.
Comment by wolvoleo 2 hours ago
If I ever attend, I just put it on mute and look at the slides while I do some real work. That way my attendance gets registered and it doesn't stress me out later with too much stuff left hanging.
That percolation is also translation of what they say to things that are relevant at my level. Like what we will be working on next year, if there's going to be bonus or job losses.
I couldn't give a crap about the company's strategy as a whole, and that's not my job anyway. Why should I? I'm not here because I believe in some holy mission. I just wanna do something I like and get paid.
Comment by hyperpape 1 hour ago
But this meeting is a course correction for how they're using AI, which is a huge initiative. He'll be trying to sell the right balance of "keep using the technology, but don't fuck anything up."
Too cautious, everyone freezes and there's a slowdown[0]. Too soft, everyone thinks it's "another empty warning not to fuck up" and they go right back to fucking everything up because the real message was "don't you dare slow down." After the talk, people will have conversations about "what did they really mean?"
[0] If you hate AI, feel free to flip the direction of the effect.
Comment by wolvoleo 1 hour ago
How are they expecting some juniors to do this when the industry as a whole doesn't know where to begin yet?
Like that Meta AI expert who wiped her whole mailbox with openclaw. These are the people who should come up with the answers.
PS: I mostly hate AI but I do see some potential. Right now it feels like we're entering a fireworks bunker looking for a pot of gold, with only a box of matches for illumination.
What we need to know from management is exactly what you mention. Do we go all out and accept that shit will hit the fan once in a while (the old "move fast and break things"), or do we micromanage and basically work manually like before? And that they accept the risk either way. That kind of strategy is really business-leader kind of work. Blaming it on your techs when it inevitably goes wrong is not.
Because the tech as it is right now is very non-deterministic. One day it works magic and the next day it blows up.
And yes that SMILE thing was a good example. Been in too many of those time wasters.
Comment by swader999 3 hours ago
Comment by tmoertel 2 hours ago
Comment by encom 1 hour ago
Sorry, I got flashbacks...
Comment by FuckButtons 2 hours ago
Comment by airstrike 2 hours ago
Comment by LPisGood 31 minutes ago
It’s not really possible to measure how much it would cost to not have a meeting, and I think it’s pretty obvious that if there were no meetings ever, it would hurt a company a lot.
Comment by airstrike 4 minutes ago
Comment by tibbar 3 hours ago
Comment by RealityVoid 3 hours ago
Comment by hnguyen1412 5 minutes ago
Comment by ljm 2 hours ago
Why is an SVP doing this if it's just gonna be ignored?
Comment by messh 2 hours ago
Comment by skeeter2020 8 hours ago
Comment by ceejayoz 5 hours ago
If I get a note from my boss like that, I consider it mandatory.
Comment by idiotsecant 3 hours ago
Comment by brewdad 2 hours ago
Comment by delecti 2 hours ago
Comment by dpark 1 hour ago
Comment by s3p 4 hours ago
Judging from the comment above, no, the meeting happens every week, and this week they were asked to attend.
Comment by cobolcomesback 8 hours ago
Note that the article doesn’t say that he told staff they have to attend the meeting. It says he “asked” staff to attend the meeting. Which again, it’s really really normal for there to be an encouragement of “hey, since we just had an operational event, it would be good to prioritize attending this meeting where we discuss how to avoid operational events”.
As for the second quote: senior engineers have always been required to sign off on changes from junior engineers. There’s nothing new there. And there is nothing specific to AI that was announced.
This entire meeting and message is basically just saying “hey we’ve been getting a little sloppy at following our operational best practices, this is a reminder to be less sloppy”. It’s a massive nothingburger.
Comment by BigTTYGothGF 7 hours ago
Being "asked" by your boss to attend an optional meeting is pretty close to being required, it's just got a little anti-friction coating on it.
Comment by cobolcomesback 4 hours ago
Different companies have different cultures. Weird that people can’t grok this.
Comment by ryandrake 3 hours ago
"Did ya get the memo... about that meeting? I'll just have my secretary forward you another copy of that memo, OK? Yeaaaaaaah..."
Comment by ragall 4 hours ago
Comment by i_cannot_hack 7 hours ago
> Under “contributing factors” the note included “novel GenAI usage for which best practices and safeguards are not yet fully established”.
Comment by 8note 7 hours ago
definitely a team by team question. if it was required it would be a crux rule that the code review isn't approved without an l6 approver.
Comment by BikiniPrince 3 hours ago
Comment by CoolGuySteve 9 hours ago
Items weren't displaying prices and it was impossible to add anything to your cart. It lasted from about 2pm to 5pm.
It's especially strange because if a computer glitch brought down a large retail competitor like Walmart I probably would have seen something even though their sales volume is lower.
Comment by malfist 8 hours ago
Comment by chatmasta 3 hours ago
Comment by BikiniPrince 3 hours ago
Comment by m3047 6 hours ago
Comment by kotaKat 9 hours ago
Comment by groundzeros2015 47 minutes ago
Comment by otterley 9 hours ago
That's been their job ever since cable news was invented.
Comment by ses1984 9 hours ago
https://en.wikipedia.org/wiki/Yellow_journalism
It probably goes back to people shouting news in the town square in Rome, or even before that.
Comment by belval 9 hours ago
Comment by cmiles74 3 hours ago
Comment by osigurdson 3 hours ago
Comment by cobolcomesback 1 hour ago
Comment by falsemyrmidon 1 hour ago
Comment by 8note 7 hours ago
Comment by coredog64 4 hours ago
Comment by 8note 2 hours ago
"get a person to look at it" is a cop-out action item, and best intentions only. nothing that you could actually apply to make development better across the whole company
Comment by furyofantares 3 hours ago
Must have as the comments are hours older than OP.
Comment by embedding-shape 8 hours ago
Are you completely missing the point of the submission? It's not about "Amazon has a mandatory weekly meeting" but about the contents of that specific meeting, about AI-assisted tooling leading to "trends of incidents", having a "large blast radius" and "best practices and safeguards are not yet fully established".
No one cares how often the meeting in general is held, or if it's mandatory or not.
Comment by skeeter2020 8 hours ago
no, and that's what people are noting: the headline deliberately tries to blow this up into a big deal. When did you last see the HN post about Amazon's mandatory meeting to discuss a human-caused outage, or a post mortem? It's not because they don't happen...
Comment by ummonk 2 hours ago
Comment by thepasch 4 hours ago
I do not understand how “company that runs half the internet has had major recent outages and now explicitly names lax/non-existent LLM usage guidelines as a major reason” can possibly not be a big deal in the midst of an industry-wide hype wave over how the world’s biggest companies now run agent teams shipping 150 pull requests an hour.
The chain of events is “AWS has been having a pretty awful time as far as outages go”, and now “result of an operational meeting is that the company will cut down on the use of autonomous AI.” You don’t need CoT-level reasoning to come to the natural conclusion here.
If we could, as a species, collectively, stop measuring the relevance of a piece of news proportionally by how much we like hearing it, please?
Comment by mattgreenrocks 4 hours ago
Comment by emp17344 3 hours ago
Comment by cobolcomesback 1 hour ago
I'm a massive AI skeptic. If anyone were to be jumping up and down on the corpse of AI and this incessant drive to use it everywhere, it’d be me. But I also work at Amazon. I got the email. I attended the meeting. I can personally attest that there are no new requirements for AI-generated code. The articles about this meeting are extremely misleading, if not outright wrong. But instead of believing the person that was actually there in the room, this thread is full of people dismissing my first-hand account of the situation because it doesn’t align with the “haha AI failed” viewpoint.
Comment by autoexec 2 hours ago
Comment by cobolcomesback 4 hours ago
I don’t blame you, because this is just bad reporting (and potentially intentionally malicious to make you think it’s about AWS). But the meeting and discussion was with the Amazon retail teams, talking about Amazon retail processes, and Amazon retail services. The teams and processes that handle this are entirely separate from any AWS outages you are thinking of.
The outages that Amazon retail has faced also have nothing to do with AI, and there was no “explicit call out” about AI causing anything.
Comment by age1mlclg6 4 hours ago
https://www.theguardian.com/us-news/ng-interactive/2026/jan/...
Comment by cmiles8 8 hours ago
Comment by coredog64 4 hours ago
Comment by inquirerGeneral 3 hours ago
Comment by Clent 8 hours ago
What is worth being pointed out is how quickly people blame "The Media" for how people use, consume and spread information on social networks.
Comment by otterley 8 hours ago
Comment by niwtsol 9 hours ago
Comment by happytoexplain 9 hours ago
Review by a senior is one of the biggest "silver bullet" illusions managers suffer from. For a person (senior or otherwise) to examine code or configuration with the granularity required to verify that it even approximates what their own level of experience would have produced, even only in terms of security/stability/correctness, requires an amount of time approaching what it would have taken to just do it themselves.
I.e. senior review is valuable, but it does not make bad code good.
This is one major facet of probably the single biggest problem of the last couple decades in system management: The misunderstanding by management that making something idiot proof means you can now hire idiots (not intended as an insult, just using the terminology of the phrase "idiot proof").
Comment by ardeaver 9 hours ago
The more expensive and less sexy option is to actually make testing easier (both programmatically and manually), write more tests and more levels of tests, and spend time reducing code complexity. The problem, I think, is people don't get promoted for preventing issues.
Comment by VorpalWay 2 hours ago
The key to making this scalable is to make as few parts as possible critical, and make the potential bad outcomes as benign as possible. (This lets you go to a lower rating in whatever safety standard applies to your industry.) You still need tests for the less critical parts though, while downtime is better than injury, if you want to sell future machines to your customers you need to have a good track record. At least if you don't want to compete on cost.
Comment by happyghost 2 hours ago
This is a good lesson for anyone I think. Definitely something I’m going to think more about. Thanks for sharing!
Comment by asdfman123 2 hours ago
If you told someone "I don't trust you, run all code by me first" it wouldn't go well. If you tell them "everyone's code gets reviewed" they're ok with it.
Comment by bluGill 8 hours ago
they do - but only after a company has been burned hard. They also can be promoted for their area being enough better that everyone notices.
still the best way to a promotion is write a major bug that you can come in at the last moment and be the hero for fixing.
Comment by tartoran 8 hours ago
Comment by recursive 8 hours ago
Comment by marcta 7 hours ago
Comment by bluGill 7 hours ago
Comment by joquarky 2 hours ago
Two years afterward, we got hit with ransomware. And obviously "I told you so" isn't a productive discussion topic at that point.
Comment by johnnyanmac 6 hours ago
Comment by bloppe 3 hours ago
Comment by brianwawok 2 hours ago
Comment by 8note 8 hours ago
cleaning up structural issues across a couple orgs is a senior => principal promo ive seen a couple of times
Comment by marginalia_nu 9 hours ago
Unchecked, AI models output code that is as buggy as it is inefficient. In smaller greenfield contexts, it's not so bad, but in a large codebase it performs much worse, as it will not have access to the bigger picture.
In my experience, you should be spending something like 5-15X the time the model takes to implement a feature on reviewing and making it fix its errors and inefficiencies. If you do that (with an expert's eye), the changes will usually have a high quality and will be correct and good.
If you do not do that due diligence, the model will produce a staggering amount of low-quality code, at a rate that is probably something like 100x what a human could output in a similar timespan. Unchecked, it's like having a small army of the most eager junior devs you can find going completely fucking ape in the codebase.
Comment by locusofself 8 hours ago
Comment by happytoexplain 8 hours ago
Comment by brandensilva 6 hours ago
Comment by ritlo 8 hours ago
What do the relatively hands-off "it can do whole features at a time" coding systems need to function without taking up a shitload of time in reviews? Great automated test coverage, and extensive specs.
I think we're going to find there's very little time-savings to be had for most real-world software projects from heavy application of LLMs, because the time will just go into tests that wouldn't otherwise have been written, and much more detailed specs that otherwise never would have been produced. The bright-side take is that we may end up with better-tested and better-specified software. But so much of the industry is used to skipping those parts — and especially the less-capable (so far as software goes) orgs that really need the help, and the relative amateurs and non-software-professionals that some hope will become extremely productive with these tools — that I'm not sure we'll manage to drag processes and practices to where they need to be to get the most out of LLM coding tools anyway. Especially if the benefit to companies is "you will have better tests for... about the same amount of software as you'd have written without LLMs".
We may end up stuck at "it's very-aggressive autocomplete" as far as LLMs' useful role in them, for most projects, indefinitely.
On the plus side for "AI" companies, low-code solutions are still big business even though they usually fail to deliver the benefits the buyer hopes for, so there's likely a good deal of money to be made selling companies LLM solutions that end up not really being all that great.
Comment by ansibsha 6 hours ago
Code is the most precise specification we have for interfacing with computers.
Comment by tmaly 5 hours ago
Comment by marginalia_nu 4 hours ago
Comment by interestpiqued 2 hours ago
Comment by slopinthebag 7 hours ago
So I expect over time we will see genuine performance improvements, but Amdahl's law dictates it won't be as much as some people and CEOs are expecting.
Comment by dboreham 4 hours ago
Comment by _wire_ 8 hours ago
Writing tests to ensure a program is correct is the same problem as writing a correct program.
Evaluating conformance is a different category of concern from ensuring correctness. Tests are about conformance not correctness.
Ensuring correct programs is like cleaning in the sense that you can only push dirt around, you can't get rid of it.
You can push uncertainty around, but you can't eliminate it.
This is the point of Gödel's theorem. Shannon's information theory observes similar aspects for fidelity in communication.
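A toy illustration of the conformance-vs-correctness distinction (my own made-up example, not from the thread): a program can conform to every test in its suite while still being incorrect, because tests only sample behavior at a few points.

```python
def max_of(xs):
    # Deliberately wrong in general (fails on all-negative lists),
    # yet it conforms to every test below.
    m = 0
    for x in xs:
        if x > m:
            m = x
    return m

# The suite checks conformance on sampled inputs; it passes.
assert max_of([1, 5, 3]) == 5
assert max_of([0, 2]) == 2
# Outside the sample the program is incorrect:
# max_of([-3, -1]) returns 0, but the true maximum is -1.
```

Passing tests certifies agreement with the suite, not with the specification in your head.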
As Douglas Adams noted: ultimately you've got to know where your towel is.
Comment by layer8 3 hours ago
Comment by shimman 8 hours ago
One thing I hope we'll all collectively learn from this is how grossly incompetent the elite managerial class has become. They're destroying society because they don't know what to do outside of copying each other.
It has to end.
Comment by SchemaLoad 2 hours ago
Comment by marginalia_nu 8 hours ago
For fairly straightforward changes it's probably a wash, but ironically enough it's often the trickier jobs where they can be beneficial as it will provide an ansatz that can be refined. It's also very good at tedious chores.
Comment by misnome 6 hours ago
Comment by bluGill 8 hours ago
Comment by hard24 8 hours ago
People seem to gloss over this... As a CEO, if people didn't function like this I'd be awake at night sweating.
Comment by bonesss 7 hours ago
Which results in the software engineering issue I’m not seeing addressed by the hype: bugs cost tens to hundreds of times their coding cost to resolve if they require internal or external communication to address. Even if everyone has been 10x’ed, the math still strongly favours not making mistakes in the first place.
An LLM workflow that yields 10x an engineer but psychopathically lies and sabotages client facing processes/resources once a quarter is likely a NNPP (net negative producing programmer), once opportunity and volatility costs are factored in.
Comment by demosito666 3 hours ago
The math depends on importance of the software. A mistake in a typical CRUD enterprise app with 100 users has zero impact on anything. You will fix it when you have time, the important thing is that the app was delivered in a week a year ago and was solving some problem ever since. It has already made enormous profit if you compare it with today’s (yesterday’s ?) manual development that would take half a year and cost millions.
A mistake in a nuclear reactor control code would be a total different thing. Whatever time savings you made on coding are irrelevant if it allowed for a critical bug to slip through.
Between the two extremes you thus have a whole spectrum of tasks that either benefit or lose from applying coding with LLMs. And there are also more axes than this low-to-high failure cost, which also affect the math. For example, even a non-important but large app will likely soon degrade into an unmanageable state if developed with too little human intervention, and you will be forced to start from scratch, losing a lot of time.
Comment by bluGill 3 hours ago
Comment by bluGill 8 hours ago
Comment by raw_anon_1111 7 hours ago
Comment by ansibsha 6 hours ago
We as an industry have been able to offload a lot of “how” via deterministic systems built by humans with expert understanding. LLMs give you the illusion of this.
Comment by raw_anon_1111 6 hours ago
1. I spoke to sales to find out about the customer
2. I read every line of the contract (SOW)
3. I did the initial requirements gathering over a couple of days with the client - or maybe up to 3 weeks
4. I designed every single bit of AWS architecture and code
5. I did the design review with the client
6. I led the customer acceptance testing
> We as an industry have been able to offload a lot of “how” via deterministic systems built by humans with expert understanding. LLMs
I assure you the mid level developers or god forbid foreign contractors were not “experts” with 30 years of coding experience and at the time 8 years of pre LLM AWS experience. It’s been well over a decade - ironically before LLMs - that my responsibility was only for code I wrote with my own two hands
Comment by ansibsha 2 hours ago
I’m not saying trusting cheap devs is a good idea either. I do think cheap devs are actually at risk here.
Comment by raw_anon_1111 1 hour ago
I didn’t blindly trust the Salesforce consultants either. I also didn’t verify every line of oSql (not a typo) they wrote.
Comment by icedchai 39 minutes ago
Comment by rectang 8 hours ago
I disagree, in the sense that an engineer who knows how to work with LLMs can produce code which only needs light review.
* Work in small increments
* Explicitly instruct the LLM to make minimal changes
* Think through possible failure modes
* Build in error-checking and validation for those failure modes
* Write tests which exercise all paths
This is a means to produce "viable" code using an LLM without close review. However, to your point, engineers able to execute this plan are likely to be pretty experienced, so it may not be economically viable.
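For the last bullet, "tests which exercise all paths" can be as simple as one assertion per branch — a generic sketch of my own (the `clamp` function is illustrative, not from any codebase discussed here):

```python
def clamp(x, lo, hi):
    # Three paths: below range, inside range, above range.
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x

# One test per path, so every branch is exercised.
assert clamp(-5, 0, 10) == 0    # below-range path
assert clamp(5, 0, 10) == 5     # in-range path
assert clamp(15, 0, 10) == 10   # above-range path
```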
Comment by marginalia_nu 8 hours ago
Comment by rectang 8 hours ago
The gains are especially notable when working in unfamiliar domains. I can glance over code and know "if this compiles and the tests succeed, it will work", even if I didn't have the knowledge to write it myself.
Comment by rsynnott 3 hours ago
... Errr... Yeah, that's not a great approach, unless you are defining 'work' extremely vaguely.
Comment by rectang 1 hour ago
I still make an effort to understand the generated code. If there’s a section I don’t get, I ask the LLM to explain it.
Most of the time it’s just API conventions and idioms I’m not yet familiar with. I have strong enough fundamentals that I generally know what I’m trying to accomplish and how it’s supposed to work and how to achieve it securely.
For example, I was writing some backend code that I knew needed a nonce check but I didn’t know what the conventions were for the framework. So I asked the LLM to add a nonce check, then scanned the docs for the code it generated.
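For readers unfamiliar with the pattern: a nonce/anti-forgery check generally means issuing a token bound to the session and verifying it server-side on submission. A framework-agnostic sketch using only the Python stdlib (the helper names and secret here are illustrative, not from any particular framework):

```python
import hashlib
import hmac
import secrets

SECRET_KEY = b"server-side-secret"  # illustrative; load from config in practice

def issue_nonce(session_id: str) -> str:
    # Bind a random token to the session via HMAC so it can't be forged.
    token = secrets.token_hex(16)
    sig = hmac.new(SECRET_KEY, f"{session_id}:{token}".encode(),
                   hashlib.sha256).hexdigest()
    return f"{token}:{sig}"

def verify_nonce(session_id: str, nonce: str) -> bool:
    try:
        token, sig = nonce.split(":")
    except ValueError:
        return False
    expected = hmac.new(SECRET_KEY, f"{session_id}:{token}".encode(),
                        hashlib.sha256).hexdigest()
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(sig, expected)
```

Real frameworks (WordPress, Django, etc.) ship their own helpers for this, which is exactly the kind of convention worth checking the docs for.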
Comment by johnnyanmac 6 hours ago
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
>When developers are allowed to use AI tools, they take 19% longer to complete issues—a significant slowdown that goes against developer beliefs and expert forecasts. This gap between perception and reality is striking: developers expected AI to speed them up by 24%, and even after experiencing the slowdown, they still believed AI had sped them up by 20%.
If we're being honest with ourselves, it's not making devs work faster. It at best frees their time up so they feel more productive.
Comment by rectang 5 hours ago
I'd like to think that I have this under control because the methodology of working in small increments helps me to recognize when I've gotten stuck in an eddy, but I'll have to watch out for it.
I still maintain that the LLM is saving me time overall. Besides helping in unfamiliar domains, it's also faster than me at leaf-node tasks like writing unit tests.
Comment by tmaly 5 hours ago
Comment by johnnyanmac 5 hours ago
Comment by marginalia_nu 8 hours ago
Comment by rectang 8 hours ago
Yes, code produced this way will have bugs, especially of the "unknown unknown" variety — but so would the code that I would have written by hand.
I think a bigger factor contributing to unforeseen bugs is whether the LLM's code is statistically likely to be correct:
* Is this a domain that the LLM has trained on a lot? (i.e. lots of React code out there, not much in your home-grown DSL)
* Is the codebase itself easy to understand, written with best practices, and adhering to popular conventions? Code which is hard for humans to understand is also hard for an LLM to understand.
Comment by marginalia_nu 8 hours ago
It introduces unnecessary indirection, additional abstractions, fails to re-use code. Humans do this too, but AI models can introduce this type of architectural rot much faster (because it's so fast), and humans usually notice when things start to go off the rails, whereas an AI model will just keep piling on bad code.
Comment by rectang 7 hours ago
---
applyTo: '**'
---
By default:
Make the smallest possible change.
Do not refactor existing code unless I explicitly ask.
Under this, Claude Opus at least produces pretty reliable code with my methodology even under surprisingly challenging circumstances, and recent ChatGPTs weren't bad either (though I'm no longer using them). Less powerful LLMs struggle, though.
Comment by raw_anon_1111 7 hours ago
But I would never do the same for Azure.
Comment by jonnycoder 7 hours ago
Comment by Skidaddle 8 hours ago
Comment by UncleMeat 4 hours ago
"Seniors will do expert review" will slowly collapse.
Comment by raw_anon_1111 8 hours ago
No one cares about handcrafted artisanal code as long as it meets both functional and non-functional requirements. The sooner geeks get over thinking they are some type of artist, the happier they will be.
I’ve had a job that requires coding for 30 years, and before that I was a hobbyist, and I’ve worked at everything from 60-person startups to BigTech.
For my last two projects (consulting) and my current project, I led the project, got the requirements, designed the architecture from an empty AWS account (yes, using IaC), and delivered it. I didn’t look at a line of code. I verified the functional and non-functional requirements, wrote the hand-off documentation, etc.
The customer is happy, my company is happy, and I bet you not a single person will ever look at a line of code I wrote. If they do get a developer to take it over, the developer will be grateful for my detailed AGENTS.md file.
Comment by sarchertech 7 hours ago
We know from experimentation that agents will change anything that isn’t nailed down. No natural language spec or test suite has ever come close to fully describing all observable behaviors of a non-trivial system.
This means that if no one is reviewing the code, agents adding features will change observable behaviors.
This gets exposed to users as churn, jank, and broken work flows.
Comment by raw_anon_1111 6 hours ago
Comment by sarchertech 6 hours ago
2. Assuming that techniques that work with human developers will also work with agents that have severely impaired judgement but are massively faster at producing code is a bad idea.
3. There’s no way you have enough experience with maintaining code written in this way to confidently hand wave away concerns.
Comment by raw_anon_1111 6 hours ago
Comment by sarchertech 3 hours ago
Comment by raw_anon_1111 3 hours ago
So many people on HN are insulted that the people who put money in our bank accounts, and in some cases stock in our brokerage accounts, never cared about their bespoke clean code and GoF patterns. They never did; LLMs just made it more apparent.
It’s always been dumb for PRs to focus on for loops vs while loops instead of on whether functional and non-functional requirements are met.
Comment by sarchertech 36 minutes ago
Comment by raw_anon_1111 28 minutes ago
Comment by hard24 8 hours ago
Speak for yourself. I don't hire people like you.
Comment by raw_anon_1111 8 hours ago
Even in late 2023 with the shit show of the current market, I had no issues having multiple offers within three weeks just by reaching out to my network and companies looking for people with my set of skills.
Comment by YCpedohaven 8 hours ago
Comment by raw_anon_1111 8 hours ago
Guess what? I also stopped caring how registers are used and counting clock cycles in my assembly language code like it’s the 80s and I’m still programming on a 1 MHz 65C02.
Comment by icedchai 6 hours ago
But do you look at any of the AI output? Or is it just "it works, ship it"?
Comment by raw_anon_1111 4 hours ago
What I checked:
1. The bash shell scripts I had it write as my integration test suite
2. To make sure it wasn’t loading the files into Postgres the naive way (loading the file from S3 and doing bulk inserts) instead of using the AWS extension that lets it load directly from S3. It’s the difference between taking 20 minutes and 20 seconds.
3. I had strict concurrency and failure recovery requirements. I made sure it was done the right way.
4. Various security, logging, log retention requirements
What I didn’t look at - a line of the code for the web admin site. I used AWS Cognito for authentication and checked to make sure that unauthorized users couldn’t use the website. Even that didn’t require looking at the code - I had automated tests that tested all of the endpoints.
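For reference, the "load directly from S3" approach in item 2 uses the `aws_s3` extension available on RDS/Aurora PostgreSQL (`aws_s3.table_import_from_s3`). A sketch of composing that call from Python — the table, bucket, key, and region names are made up, and actually executing it (e.g. via psycopg) is omitted:

```python
def build_s3_import_sql(table: str, bucket: str, key: str, region: str) -> str:
    # Builds the aws_s3.table_import_from_s3 call that bulk-loads a CSV
    # straight from S3 into an existing table, instead of streaming rows
    # through the application and doing bulk inserts.
    return (
        "SELECT aws_s3.table_import_from_s3("
        f"'{table}', "                 # target table
        "'', "                         # column list: empty = all columns
        "'(format csv)', "             # COPY options
        f"aws_commons.create_s3_uri('{bucket}', '{key}', '{region}'))"
    )

# Hypothetical names, for illustration only.
sql = build_s3_import_sql("my_table", "my-bucket", "data/file.csv", "us-east-1")
```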
Comment by icedchai 2 hours ago
I've witnessed human developers produce incredibly convoluted, slow "ETL pipelines" that took 10+ minutes to load single digit megabytes of data. It could've been reduced to a shell script that called psql \copy.
Comment by unshavedyak 2 hours ago
Hell, often it feels slower/worse. Foreign code is easily confusing at first, which slows you down - and bad code quickly gets bewildering and sends you down paths of clarifications that waste time.
Comment by SchemaLoad 2 hours ago
Then often it blows up in production. Makes me almost want to blanket reject PRs for being too difficult to understand. Hand written code almost has an aversion to complexity, you'd search around for existing examples, libraries, reusable components, or just a simpler idea before building something crazy complex. While with AI you can spit out your first idea quickly no matter how complex or flawed the original concept was.
Comment by MarkSweep 1 hour ago
Comment by js8 9 hours ago
It's actually often harder to fix something sloppy than to write it from scratch. To fix it, you need to hold in your head both the original, the new solution, and calculate the difference, which can be very confusing. The original solution can also anchor your thinking to some approach to the problem, which you wouldn't have if you solve it from scratch.
Comment by ummonk 2 hours ago
Comment by bluGill 8 hours ago
Comment by js8 4 hours ago
Comment by bluGill 3 hours ago
Comment by steveBK123 9 hours ago
If AI is a productivity boost and juniors are going to generate 10x the PRs, do you need 10x the seniors (expensive) or 1/10th the juniors (cost save).
A reminder that in many situations, pure code velocity was never the limiting factor.
Re: idiot proofing - I think this is a natural evolution: as companies get larger they try to limit their downside and manage for the median, rather than having a growth mindset in hiring/firing/performance.
Comment by AgentOrange1234 7 hours ago
Comment by onion2k 8 hours ago
I suspect that isn't the goal.
Review by more senior people shifts accountability from the Junior to a Senior, and reframes the problem from "Oh dear, the junior broke everything because they didn't know any better" to "Ah, that Senior is underperforming because they approved code that broke everything."
Comment by hintymad 4 hours ago
Especially in a big co like Amazon, most senior engineers are box drawers, meeting goers, gatekeepers, vision setters, org lubricants, VPs' trustees, glorified product managers, etc. They don't necessarily know more context than the more junior engineers, and they will most likely review slowly while uncovering fewer issues.
Comment by bs7280 8 hours ago
Whether or not these productivity gains are realized is another question, but spreadsheet based decision makers are going to try.
Comment by czscout 8 hours ago
Comment by bs7280 6 hours ago
Also - the definition of Senior will change, and a lot of current Seniors will not transition, while plenty of Juniors that put in a lot of time using code agents will transition.
Comment by ForHackernews 3 hours ago
But will they? I'm not at all convinced that babysitting an AI churning out volumes of code you don't understand will help you acquire the knowledge to understand and debug it.
Comment by esafak 2 minutes ago
Comment by simplyluke 8 hours ago
Comment by lovich 7 hours ago
American corporate culture has decided that training costs are someone else’s problem. Since every corporation acts this way it means all training costs have been pushed onto the labor market. Combine that with the past few decades of “oops, looks like you picked the wrong career that took years of learning and/or 10 to 100s of thousands of dollars to acquire but we’ve obsoleted that field” and new entrants into the labor market are just choosing not to join.
Take trucking for example. For the past decade I’ve heard logistics companies bemoan the lack of CDL holders, while simultaneously gleefully talk about how the moment self driving is figured out they are going to replace all of them.
We’re going to be outpaced by countries like China at some point because we’re doing the industrial equivalent of eating our seed corn and there is seemingly no will to slow that trend down, much less reverse it.
Comment by bluefirebrand 3 hours ago
I know I'm probably coming across as a lunatic lately on HN but I really do think we're on the path towards violence thanks to AI
You just cannot destroy this many people's livelihoods without backlash. It's leading nowhere good
But a handful of people are getting stupidly rich/richer so they'll never stop
Comment by lovich 3 hours ago
If you look at the luddite rebellion they weren't actually against industrial technology like looms. They were against being told they weren't needed anymore and thrown to the wolves because of the machines.
The rich have forgotten they are made of meat and/or are planning on returning to feudalism ala Yarvin, Thiel, Musk, and co's politics.
Comment by bluefirebrand 2 hours ago
I guess that makes me a modern luddite then
A software engineer luddite
A techno-luddite if you will
Maybe I have a new username
Comment by jetrink 9 hours ago
Maybe I don't have the correct mental model for how the typical junior engineer thinks though. I never wanted to bug senior people and make demands on their time if I could help it.
Comment by devonbleak 8 hours ago
Comment by 8note 1 hour ago
With a layout of 4 juniors, 5 intermediates, and 0-1 seniors per team, putting all the changes through senior review means you mostly won't be able to get CRs approved.
I guess it could force everyone who's sandbagging at intermediate, instead of going for senior, to finally get promoted?
Comment by SpicyLemonZest 7 hours ago
Comment by suzzer99 7 hours ago
Comment by zamalek 3 hours ago
My manager has been urging us to truly vibe code, just yesterday saying that "language is irrelevant because we've reached the point where it works - so you don't need to see it." This article is a godsend; I'll take this flawed silver bullet any day of the week.
Comment by raw_anon_1111 8 hours ago
I’m probably not going to review a random website built by someone except for usability, requirements and security.
Comment by OrangeDelonge 2 hours ago
Comment by happytoexplain 8 hours ago
I also said senior review is valuable, but I'm not 100% sure if you're implying I didn't.
Comment by mrothroc 6 hours ago
The other problem is that the types of errors LLMs make are different from the ones juniors make. There are huge sections of genuinely good code, so the senior gets "review fatigue": so much looks good that they just start rubber-stamping.
I use an automated pipeline to generate code (including terraform, risking infrastructure nukes), and I am the senior reviewer. But I have gates that do a whole range of checks, both deterministic and stochastic, before it ever gets to me. Easy things are pushed back to the LLM for it to autofix. I only see things where my eyes can actually make a difference.
Amazon's instinct is right (add a gate), but the implementation is wrong (make it human). Automated checks first, humans for what's left.
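The gate-before-human idea above can be sketched as follows. This is a minimal, hypothetical illustration: the check names, the `autofixable` flag, and the routing logic are stand-ins, not a real pipeline.

```python
# Hypothetical sketch of an automated review gate: deterministic checks run
# first, fixable failures are routed back to the LLM, and only what survives
# reaches the human reviewer. All names here are illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class CheckResult:
    name: str
    passed: bool
    autofixable: bool  # can the LLM be asked to fix this without a human?

def run_gates(change: str, checks: list[Callable[[str], CheckResult]]):
    """Partition check failures into 'send back to LLM' and 'needs a human'."""
    back_to_llm, needs_human = [], []
    for check in checks:
        result = check(change)
        if result.passed:
            continue
        (back_to_llm if result.autofixable else needs_human).append(result.name)
    return back_to_llm, needs_human

# Toy deterministic checks standing in for linters, tests, and policy rules.
def lint(change):
    return CheckResult("lint", "TODO" not in change, True)

def infra_guard(change):
    # Anything touching destructive infra ops always needs human eyes.
    return CheckResult("infra", "destroy" not in change, False)

fix, escalate = run_gates("terraform destroy  # TODO", [lint, infra_guard])
print(fix)       # ['lint'] -> failures the LLM is asked to autofix
print(escalate)  # ['infra'] -> failures a human must look at
```

The point of the design is that human attention is spent only where the deterministic checks can't decide.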
Comment by qnleigh 8 hours ago
1. They can assess whether the use of AI is appropriate without looking in detail. E.g. if the AI changed 1000 lines of code to fix a minor bug, or changed code that is essential for security.
2. To discourage AI use, because of the added friction.
Comment by belval 9 hours ago
Comment by grvdrm 9 hours ago
I hear “x tool doesn’t really work well” and then I immediately ask: “does someone know how to use it well?” The answer “yes” is infrequent. Even a yes is often a maybe.
The problem is pervasive in my world (insurance). Number-producing features need to work in a UX and product sense but also produce the right numbers, and within range of expectations. Just checking the UX does what it’s supposed to do is one job, and checking the numbers an entirely separate task.
I don’t know many folks who do both well.
Comment by rco8786 2 hours ago
Comment by mmcconnell1618 1 hour ago
Comment by lokar 8 hours ago
Comment by skeeter2020 8 hours ago
Comment by lokar 8 hours ago
Comment by sumeno 3 hours ago
Comment by radiator 7 hours ago
Comment by yalogin 2 hours ago
Comment by yifanl 9 hours ago
Comment by tartoran 8 hours ago
Comment by hnthrow0287345 8 hours ago
I would actually say having at least 2 people on any given work item should probably be the norm at Amazon's size if you also want to churn through people as Amazon does and also want quality.
Doing code reviews are not as highly valued in terms of incentives to the employees and it blocks them working on things they would get more compensation for.
Comment by mrbonner 8 hours ago
Comment by tartoran 8 hours ago
Comment by remarkEon 7 hours ago
Comment by happytoexplain 1 hour ago
We need smart people at every layer. If leadership isn't in that category, it spreads to all layers.
I don't know how we defeat capitalism to incentivize smart leadership. It's fundamentally opposed to market forces.
Comment by femiagbabiaka 9 hours ago
Comment by munk-a 4 hours ago
Comment by napolux 8 hours ago
Comment by RamblingCTO 9 hours ago
So you're saying that peer reviews are a waste of time and only idiots would use/propose them?
Comment by happytoexplain 8 hours ago
To partially clarify: "Idiot proof" is a broad concept that here refers specifically to abstraction layers, more or less (e.g. a UI framework is a little "idiot proof"; a WYSIWYG builder is more "idiot proof"). With AI, it's complicated, but bad leadership is over-interpreting the "idiot proof" aspects of it. It's a phrase, not an insult to users of these tools.
Comment by prakhar897 8 hours ago
1. Shipping: deliver tickets or be pipped.
2. Having fewer comments on their PRs: for some drastically dumb reason, having a PR thoroughly reviewed is seen as a sign of bad quality. L7 and above use this metric to PIP folks.
3. Docs: write docs, get them reviewed to show you're high level.
Without AI, an employee is worse off in all of the above compared to folks who will cheat to get ahead.
I can't see how "requesting" folks to forego their own self-preservation will work, especially when you've spent years pitting people against each other.
Comment by malfist 7 hours ago
Comment by dude250711 4 hours ago
Comment by embedding-shape 4 hours ago
I'm very far away from liking Amazon's engineering culture and general work culture, but having PRs with countless discussions and feedback on them does signal that you did a lot of work without collaborating with others first. Generally, in teams that work well together and build great software, PRs tend to have very little on them, because most of the issues were resolved while designing together with others.
Comment by tom_ 15 minutes ago
(And/but yes/no, I have never worked at NAGFAM...)
Comment by joeframbach 3 hours ago
Comment by ex-aws-dude 3 hours ago
Comment by dboreham 4 hours ago
Comment by 999900000999 2 hours ago
I missed my FAANG chance during the good years. No retirement for me!
Comment by philip1209 3 hours ago
People push AI-reviewed code like they wrote it. In the past, "wrote it" implies "reviewed it." With AI, that's no longer true.
I advocate for GitHub and other code-review systems to add a "Require self-review" option, where people must attest that they reviewed and approved their own code. This change might seem symbolic, but it sets clear expectations for the workflow.
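No such built-in option exists today, but one hedged way to approximate it is a CI step that fails unless the PR description carries a checked attestation box. The attestation text below is invented for illustration:

```python
# Hypothetical CI check: pass only if the PR description contains a ticked
# self-review checkbox. The exact wording is an invented convention.
ATTESTATION = "- [x] I have reviewed every line of this change myself"

def self_review_attested(pr_body: str) -> bool:
    """Return True only if the author ticked the self-review checkbox."""
    return ATTESTATION.lower() in pr_body.lower()

print(self_review_attested(
    "Adds retry logic.\n- [x] I have reviewed every line of this change myself"
))  # True
print(self_review_attested(
    "- [ ] I have reviewed every line of this change myself"
))  # False: box left unchecked
```

A PR template would pre-populate the unchecked box, so ticking it is a deliberate act rather than a default.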
Comment by billbrown 46 minutes ago
Comment by kuekacang 1 hour ago
It also makes me more comfortable figuring out what a project's pull-acceptance norms are like (maybe due to how fast a local UI is compared to web-based git). On the other hand, I can only run some basic git CLI commands and can't quickly comprehend a raw text-based diff, especially when encountering Linux patches from time to time.
Comment by Tyr42 3 hours ago
Comment by nothrabannosir 2 hours ago
Comment by paxys 2 hours ago
Comment by jeremyjh 2 hours ago
Comment by 8note 1 hour ago
Working at Amazon, when I wanted to review code myself through the CR tool, I'd still end up publishing it to the whole team and have to add some title shenanigans saying it was a self-review or WIP and that others shouldn't look at it yet.
Comment by captainkrtek 27 minutes ago
In the pre-gen-AI days, if an engineer put up a PR, it implied (somewhat) they wrote their code, reviewed it implicitly as they wrote it, and made choices (ie: why is this the best approach).
If Claude is just the new high level programming language, in terms of prompting in natural language, the challenge is that we're not reviewing the natural language, we're reviewing the machine code without knowing what the inputs were. I'm not sure of a solution to this, but something along the lines of knowing the history of the prompting that ultimately led to the PR, the time/tokens involved, etc. may inform the "quality" or "effort" spent in producing the PR. A one-shotted feature vs. a multi-iteration feature may produce the same lines of code and general shape, but one is likely to be higher "quality" in terms of minimal defects.
Along the same lines, when I review some gen-AI produced PR, it feels like I'm reading assembly and having to reverse how we got here. It may be code that runs and is perfectly fine, but I can't tell what the higher level inputs were that produced it, and if they were sufficient.
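One hedged way to capture that "history of the prompting" idea is to have the agent harness attach session metadata to commits as git-style trailers and surface it at review time. The trailer names below are invented, not an existing convention:

```python
# Hypothetical convention: the agent harness appends trailers like
#   AI-Iterations: 7
#   AI-Tokens: 52310
# to each commit message, and review tooling parses them so a reviewer can
# distinguish a one-shotted change from a multi-iteration one.
def parse_ai_trailers(commit_message: str) -> dict[str, int]:
    """Extract the invented AI-* trailers from a commit message."""
    trailers = {}
    for line in commit_message.splitlines():
        if line.startswith("AI-") and ":" in line:
            key, value = line.split(":", 1)
            trailers[key.strip()] = int(value.strip())
    return trailers

msg = "Add retry logic to S3 client\n\nAI-Iterations: 7\nAI-Tokens: 52310"
print(parse_ai_trailers(msg))  # {'AI-Iterations': 7, 'AI-Tokens': 52310}
```

Trailers survive rebases and show up in `git log`, so the provenance travels with the code rather than living in a separate tool.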
Comment by jwpapi 3 minutes ago
Comment by cmiles8 8 hours ago
News from the inside makes it sound like things are getting pretty bad.
Comment by the_biot 3 hours ago
You mean senior programmers that have been there for ages don't want to spend their time reviewing AI slop? Who'd a thunk it!
Comment by rubyrfranklin2 18 minutes ago
Comment by sdevonoes 8 hours ago
There’s also this implicit imbalance engineers typically don’t like: it takes me 10 min to submit a complete feature thanks to Claude… but for the human reviewing my PR in a manual way it will take them 10-20 times that.
Edit: at the end real engineers know that what takes effort is a) to know what to build and why, b) to verify that what was built is correct. Currently AI doesn’t help much with any of these 2 points.
The inbetweens are needed but they are a byproduct. Senior leadership doesn’t know this, though.
Comment by hard24 8 hours ago
I'd prefer people wrote good quality code and checked it as they went along... whilst allowing room for other stuff they didn't think of to come to the front. The production process of using LLMs is entirely different, in its current state I don't see the net benefit.
E.g. if you have a very crystalised vision of what you want, why would I want an engineer to use an LLM to write it, when the LLM can't do both raw production and review? Could this change? Sure. But there's no benefit for me personally to shift toward working that way now - I'd rather it came into existence first before I expose myself to incremental risk that affects business operations. I want a comprehensive solution.
Comment by beardedetim 8 hours ago
It sounds like a piss poor deal for seniors unless senior engineer now means professional code reviewer.
Comment by malfist 3 hours ago
Comment by znpy 3 hours ago
This resonates with my experience.
The only thing you forgot is that you can also use the 12^H^H 14 leadership principles to argue whatever you want (and then the opposite of what you argued last month, still using the same leadership principles).
Comment by malfist 1 hour ago
Were you a knowledge source for the entire team? Well, you weren't learning and being curious. Did you ask a lot of questions to learn everything? Well, then you weren't "are right a lot".
Did you think big and come up with an architecture that saved Amazon a lot of money? Then you weren't inventing and simplifying. Build something simple to get it out the door quickly? Well, you weren't thinking big.
Did you act quickly without consulting others to fix an issue? Well you weren't earning trust. Did you consult people to make sure they were happy with the solution? Well you weren't biased for action.
Thats just a few examples, there's so many more
Comment by rhubarbtree 2 hours ago
Comment by radiator 7 hours ago
Well, you'd think senior leadership should know how their business and their people work.
Comment by Barrin92 3 hours ago
Despite the name, not a lot of seniority, leadership, or engineering going around.
Comment by asadotzler 4 hours ago
Comment by qnleigh 8 hours ago
Comment by 827a 43 minutes ago
We love this for Amazon, they're a very strong company making bold decisions.
Comment by lokar 9 hours ago
Code review should not be (primarily) about catching serious errors. If there are always a lot of errors, you can’t catch most of them with review. If there are few it’s not the best use of time.
The goal is to ensure the team is in sync on design, standards, etc. To train and educate Jr engineers, to spread understanding of the system. To bring more points of view to complex and important decisions.
Meeting these goals helps you reduce the number of errors going into the review process in the first place; that should be the actual aim.
Comment by rossdavidh 4 hours ago
The fact that software is "soft" makes it seem like this doesn't apply, but it does, not least because once you have gone down the wrong path with a software design, it is very difficult to pull back and realize you need to go down an entirely different one.
Comment by lokar 1 hour ago
The analogy to manufacturing would be something like if the parts coming out a machine are all bad, just sending them to re-work is not a solution, you need to re-calibrate the machine.
Comment by paxys 1 hour ago
Comment by Herring 18 minutes ago
Comment by booleandilemma 1 hour ago
Comment by ritlo 9 hours ago
They're torn between "we want to fire 80% of you" and "... but if we don't give up quality/reliability, LLMs only save a little time, not a ton, so we can only fire like 5% of you max".
(It's the same in writing: these things are only a huge speed-up if it's OK for the output to be low-quality, but producing good output with LLMs only saves a little time versus writing entirely by hand. So far, anyway; of course these systems are changing by the day, but this specific limitation has remained true for about four years now, without much improvement.)
Comment by SoftTalker 8 hours ago
That has always been my feeling. Once I really understand what I need to implement, the code is the easy part. Sure it takes some time, but it's not the majority. And for me, actually writing the code will often trigger some additional insight or awareness of edge cases that I hadn't considered.
Comment by 8note 1 hour ago
If I wanted, I could queue up weeks' worth of review in a couple of days, but that's not getting the whole team more productive.
Spending more time on documents and chatting proved much more useful for getting more output overall.
Even without LLMs, I've been nearby and on teams where the review burden from developers building away-team code was already so high that you'd need to bake an extra month into your estimates to get somebody to actually look.
Comment by hard24 8 hours ago
Of course it wasn't! Do you think people can envision the right objects to produce all the time? Yeah.. we have a lot of Steve Jobs walking around lol.
As you say, there's 'other stuff' that happens naturally during the production process that add value.
Comment by somewhereoutth 2 hours ago
Thinking through making.
Comment by hard24 8 hours ago
Essentially something big has to happen that affects the revenue/trust of a large provider of goods, stemming from LLM-use.
They won't go away entirely. But this idea that they can displace engineers at a high rate will.
Comment by Terr_ 7 hours ago
I feel the current proliferation of LLMs is going to resemble the asbestos problem: a cheap miracle material, overused in several places, with slow gradual regret and chronic harms/costs. Although I suppose the "undocumented nasty surprise" aspect would depend on adoption of local LLMs; if it's a monthly subscription to cloud stuff, people are far less likely to lose track of where the systems are and what they're doing.
Comment by _wire_ 6 hours ago
Comment by rglover 3 hours ago
Comment by mentos 3 hours ago
Feels inevitable that code for aviation will slowly rot from the same forces at play but with lethal results.
Comment by rhubarbtree 2 hours ago
Just because nearly all software is going to be written by AI, does not mean critical infrastructure will be.
Comment by ndr42 11 hours ago
Comment by Lalabadie 9 hours ago
Comment by gtowey 9 hours ago
Comment by sethops1 9 hours ago
So basically, kill the productivity of senior engineers, kill the ability for junior engineers to learn anything, and ensure those senior engineers hate their jobs.
Bold move, we'll see how that goes.
Comment by whateveracct 8 hours ago
Comment by sdevonoes 8 hours ago
Comment by ritlo 8 hours ago
It's basically an even-more-ridiculous version of ranking programmers by lines-of-code/week.
What's especially comical is I've seen enormous gains in my (longish, at this point) career from learning other tools (e.g. expanding my familiarity with Unix or otherwise fairly common command line tools) and never, ever has anyone measured how much I'm using them, and never, ever has management become in any way involved in pushing them on me. It's like the CEO coming down to tell everyone they'll be making sure all the programmers are using regular expressions enough, and tracking time spent engaging with regular expressions, or they'll be counting how many breakpoints they're setting in their debuggers per week. WTF? That kind of thing should be leads' and seniors' business, to spread and encourage knowledge and appropriate tool use among themselves and with juniors, to the degree it should be anyone's business. Seems like yet another smell indicating that this whole LLM boom is built on shaky ground.
Comment by tavavex 7 hours ago
That's because they weren't sold regex-as-a-service by a massive company, while also being reassured by everyone that any person not using at least one regular expression per line of code is effectively worthless and exposes their business to a threat of immediate obsolescence and destruction. They finally found a way to sell the same kind of FOMO to a majority of execs in the software industry.
Comment by to11mtm 7 hours ago
Gotta be careful if you do that, though; e.g. Copilot can monitor 'accept' rate, so at bare minimum you'd have to accept the changes, then immediately back them out...
Comment by tavavex 7 hours ago
Comment by ourmandave 7 hours ago
Comment by lovich 6 hours ago
Did industrial psychology die out as a field? Why do we keep reinventing the wheel when it comes to perverse incentives? It's like working on a scrum team where the big bosses expect the average velocity to go up every sprint, forever, but the engineers are the ones deciding the point totals on tickets.
Comment by bonesss 7 hours ago
I mean… throw some docs into the context window, see it explode. Repeat that a few times with some multi-step workflows. Presto, hundreds of dollars in “AI” spending accomplishing nothing. In olden days we’d just burn the cash in a waste paper basket.
Comment by tren_hard 5 hours ago
Comment by dboreham 4 hours ago
Comment by slopinthebag 7 hours ago
Comment by baal80spam 4 hours ago
In my case it's morality.
Comment by bravetraveler 2 hours ago
edit: Peer said it well, IMO. The consequences aren't really yours. Also: something, something, Goodhart's Law.
Comment by ummonk 1 hour ago
Comment by thewhitetulip 8 hours ago
I am saying in General, I've never worked in Amazon
Comment by throw_m239339 8 hours ago
Comment by dragonelite 9 hours ago
Comment by altairprime 9 hours ago
Comment by dude250711 4 hours ago
Comment by almostdeadguy 9 hours ago
Comment by zdragnar 9 hours ago
There's a lot of learning opportunity in failing, but if failure just means spam the AI button with a new prompt, there's not much learning to be had.
Comment by ritlo 9 hours ago
Jesus, yes. Maybe I'm an oddball but there's a limit to how much PR reviewing I could do per week and stay sane. It's not terribly high, either. I'd say like 5 hours per week max, and no more than one hour per half-workday, before my eyes glaze over and my reviews become useless.
Reviewing code is important and is part of the job but if you're asking me to spend far more of my time on it, and across (presumably) a wider set of projects or sections of projects so I've got more context-switching to figure out WTF I'm even looking at, yes, I would hate my job by the end of day 1 of that.
Comment by almostdeadguy 7 hours ago
I don't disagree, I think reviewing is laborious, I just don't see how this causes any unintended consequences that aren't effectively baked into using an AI assistant.
Comment by bluefirebrand 2 hours ago
Code Review is hard and tiring, much moreso than writing it
I've never met anyone who would be okay reviewing code for their full time job
Comment by VorpalWay 2 hours ago
Comment by shepherdjerred 2 hours ago
Comment by fmajid 3 hours ago
Comment by 8note 1 hour ago
still within the engineering IC role, but on a different track
Comment by AlotOfReading 9 hours ago
I wonder if it's an early step towards an apprenticeship system.
Comment by monarchwadia 9 hours ago
Comment by bilbo0s 9 hours ago
How else would they train the LLM PR reviewers to their standards?
I've never personally been in the position, because my entire career has been in startups, but I've had many friends be in the unenviable position of training their replacements. Here's the thing though, at least they knew they were training their replacements. We could be looking at a potential future where an employee or contractor doesn't realize s/he is actually just hired to generate training data for an LLM to replace them, and then be cut.
Comment by zouhair 32 minutes ago
Comment by daxfohl 3 hours ago
Obviously it's probably cost-prohibitive to do an all-to-all analysis for every PR, but I imagine that with some intelligent optimizations around likelihood and similarity analysis, something along those lines would be possible and practical.
Comment by 8note 1 hour ago
Amazon does have those things, and has fine tuning on models based on those postmortems.
Noisy reviews are also a problem. A PR reviewer doesn't know what scale a chunk of code is running at without access to 20 more packages and other details.
Comment by iLoveOncall 2 hours ago
COEs and Operation Readiness Reviews are already the documents that you mention, but they are largely useless in preventing incidents.
Comment by sailfast 2 hours ago
In the meantime they will be quite a bit slower I’d imagine.
Also wonder if those seniors will ever get to actually do any engineering themselves now that they’re the bottleneck. :)
Comment by sizzzzlerz 2 hours ago
And what are they going to do when they've fired all the senior engineers because they make too much money, leaving just juniors and AI?
Comment by rhubarbtree 2 hours ago
When they fire everyone, juniors will fix it with AI.
This is in general. I wouldn’t recommend this at critical services like AWS.
Comment by sizzzzlerz 1 hour ago
Comment by rhubarbtree 1 hour ago
But yes agree with the rest, which probably makes up a tiny tiny fraction of the software created today, and will be orders of magnitude smaller as a fraction in the future.
Comment by throwaway613746 2 hours ago
Comment by julienchastang 7 hours ago
The way I am working with AI agents (Codex) these days is to have the AI generate a spec as a series of MD documents, where the implementation of each document is a bite-sized chunk that can be tested and evaluated by a human before moving to the next step, and roughly matches a commit in version control. The version-control history then reflects the logical progression of the code. In this manner, I have decent knowledge of the code, and one I am more comfortable with than one-shotting.
Comment by danjl 3 hours ago
Prior to each step, I prompt the AI to review the step and ask clarifying questions to fill any missing details. Then implement. Then prompt the AI after to review the changes for any fixes before moving on to the next step. Rinse, repeat.
The specs and plans are actually better for sharing context with the rest of the team than a traditional review process.
I find the code generated by this process to be better in general than the code I've generated over my previous 35+ years of coding. More robust, more complete, better tested. I used to "rush" through this process before, with less upfront planning, and more of a focus on getting a working scaffold up and running as fast as possible, with each step along the way implemented a bit quicker and less robustly, with the assumption I'd return to fix up the corner cases later.
Comment by AlexeyBrin 11 hours ago
Comment by hrmtst93837 4 hours ago
Comment by quantified 10 hours ago
The presumably human mid-level or junior engineer has their own issues with this, but the point of the LLM is that you don't need that engineer. For productivity purposes, the dev org only needs the seniors to wrangle all the LLMs they can. That doesn't sustain itself, though, so you keep a couple of more-junior engineers doing similar work so they can mature.
Comment by MichaelRo 4 hours ago
LOL, it's the age old "responsibility without authority". The pressure to use AI will increase and basically you'll be fired for not using it. Simultaneously with the pressure to take the blame when AI fucks up and you can't keep up with the bullshit, leading you to get fired. One way or the other, get some training on how to stack shelves at the supermarket because that's how our future looks, one way or the other.
Comment by throwaw12 8 hours ago
So you have 2 systems of engineers: Sr- and Sr+
1. Both should write code to justify their work and impact
2. Sr- code must be reviewed by Sr+
What happens:
a. Sr+ output drops because review takes more and more of their time
b. Sr+ just blindly accepts because the volume is too high, and they also have their own work to do
c. Sr+ asks Sr- to slow down; then Sr- can get bad performance reviews for their output, because on average Sr+ will produce more code
I think (b) will happen
Comment by zcw100 6 hours ago
Comment by agoodusername63 2 hours ago
The impression I get from SWEs I’ve met throughout my life is that most of them don’t actually care about their job. They got in because it paid well and demand was plentiful.
Comment by booleandilemma 1 hour ago
Comment by LogicFailsMe 8 hours ago
And from their sagely reviews, we shall train a large language model to ultimately replace them because the most fungible thing at Amazon is the leadership.
Comment by jacknews 3 minutes ago
The seniors will now be directly responsible for all the AI slop that goes in. But how can they possibly review reams of code thoroughly enough to personally vouch for it?
Comment by mhogers 7 hours ago
Force agents not to touch mission-critical things; fail in CI otherwise.
Let them work on frontends and things at the frontier of the dependency tree, where it is worth the risk.
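That guardrail can be sketched as a CI check. The protected path list and the agent-generated flag are assumptions for illustration, not any existing policy:

```python
# Hedged sketch of the "fail in CI otherwise" idea: if a change is labeled
# as agent-generated, reject it when it touches protected paths.
PROTECTED_PREFIXES = ("infra/", "auth/", "billing/")  # mission-critical areas

def agent_change_allowed(changed_files: list[str], agent_generated: bool) -> bool:
    """Agent-written diffs may only touch the frontier of the dependency tree."""
    if not agent_generated:
        return True  # human-authored changes go through normal review
    return not any(f.startswith(PROTECTED_PREFIXES) for f in changed_files)

print(agent_change_allowed(["frontend/App.tsx"], agent_generated=True))  # True
print(agent_change_allowed(["infra/vpc.tf"], agent_generated=True))      # False
```

In practice the `agent_generated` signal might come from a PR label or commit metadata; the check itself stays a few lines either way.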
Comment by readthemanual 7 hours ago
Comment by mhogers 4 hours ago
Comment by andai 2 hours ago
(Before injecting it into global infra...)
Comment by varenc 2 hours ago
Comment by dragonelite 9 hours ago
Comment by daheza 8 hours ago
Comment by booleandilemma 1 hour ago
Comment by gdulli 8 hours ago
Comment by hard24 8 hours ago
Comment by AlexeyBrin 9 hours ago
/s
So now, you can speed up using Claude Code and use Code Review to keep it in check.
Comment by znpy 3 hours ago
First thing that comes to mind is: reminds me of those movie where some dictatorship starts to crumble and the dictator start being tougher and tougher on generals, not realizing the whole endeavor is doomed, not just the current implementation.
Then again, as a former amazon (aws) engineer: this is just not going to work. Depending how you define "senior engineer" (L5? L6? L7?) this is less and less feasible.
L5 engineers are already supposed to work pretty much autonomously, maybe with L6 sign-off when changes are a bit large in scope.
L6 engineers already have their own load of work, and a fairly large number of engineers "under" them (anywhere from 5 to 8). Properly reviewing changes from all of them, and taking responsibility for that, is going to be very taxing on such people.
L7 engineers work across teams and they might have anywhere from 12 to 30 engineers (L4/5/6) "under" them (or more). They are already scarce in number and they already pretty much mostly do reviews (which is proving not sufficient, it seems). Mandating sign-off and mandating assumption of responsibility for breaking changes means these people basically only do reviews and will be stricter and stricter[1] with engineers under them.
L8 engineers, they barely do any engineering at all, from what I remember. They mostly review design documents, in my experience not always expressing sound opinions or having proper understanding of the issues being handled.
In all this, considering the low morale (layoffs), the reduced headcount (layoffs) and the rise in expectations (engineers trying harder to stay afloat[2] due to... layoffs)... It's a dire situation.
I'm going to tell you, this stinks A LOT like rotting day 2 mindset.
----
1. keep in mind you can't, in general, determine the absence of bugs
2. Also cranking out WAY MORE code due to having gen-AI tools at their fingertips...
Comment by monster_truck 2 hours ago
Comment by butILoveLife 8 hours ago
I am seeing this mindset still, with AI Agents. I imagine they will slowly realize they need to use this stuff to be competitive, but being slow to adopt AI seems like it could have been the source of this.
Comment by lmc 7 hours ago
Comment by bigstrat2003 2 hours ago
Comment by butILoveLife 5 hours ago
Comment by kmg_finfolio 11 hours ago
Comment by Insanity 8 hours ago
Imagine having to debug code that caused an outage when 80% of it was written by an LLM and you now have to start actually figuring out the codebase at 2am.. :)
Comment by 8note 1 hour ago
i think the team i was on was a bit of an outlier in terms of owning 40 dumpster fires at once, and the first time reading any one of them was at 2AM because it was down.
having an LLM give early passes on reading the godawful C++ code that you can tell at a glance isn't gonna work as expected, but you can't tell why (or what "expected" actually is), would have been phenomenal, and gotten me back to sleep at 3 on those codebases rather than 5.
Comment by tcbrah 7 hours ago
Comment by smy20011 8 hours ago
Comment by newobj 4 hours ago
Imagine if the #1 problem of your woodworking shop is staff injuries, and the solution that management foists on you is higher RPM lathes.
Comment by dedoussis 7 hours ago
Comment by emotiveengine 7 hours ago
Comment by testbjjl 1 hour ago
Seems to me too low level in everyone’s stack to not have humans doing the work, especially at this stage. But what do I know, I certainly am not at the helm of a multibillion dollar operation.
Comment by booleandilemma 1 hour ago
Comment by adamzwasserman 48 minutes ago
Comment by jmspring 1 hour ago
Comment by mattschaller 9 hours ago
Comment by wenc 11 minutes ago
In my experience, Kiro is high-quality for creating and iterating on specs. Tools like Cursor are optimized for human-driven vibing -- they have great autocomplete, etc. Kiro, by contrast, is optimized around the spec, which ironically has been the most effective approach I've found for driving agents.
I'd argue that Cursor, Antigravity, and similar tools are optimized for human steering, which explains their popularity, while Kiro is optimized for agent harnesses. That's also why it’s underused: it's quite opinionated, but very effective. Vibe-coding culture isn't sold on spec driven development (they think it's waterfall and summarily dismiss it -- even Yegge has this bias), so people tend to underrate it.
Kiro writes specs using structured formats like EARS and INCOSE. It performs automated reasoning to check for consistency, then generates a design document and task list from the spec -- similar to what Beads does. I usually spend a significant amount of time pressure-testing the spec before implementing (often hours to days), and it pays off. Writing a good, consistent spec is essentially the computer equivalent of "writing as a tool of thought" in practice.
Once the spec is tight, implementation tends to follow it closely. Kiro also generates property-based tests (PBTs) using Hypothesis in Python, inspired by Haskell's QuickCheck. These tests sweep the input domain and, when combined with traditional scenario-based unit tests, tend to produce code that adheres closely to the spec. I also add a small instruction "do red/green TDD" (I learned this from Simon Willison) and that one line alone improved the quality of all my tests.
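The property-based testing idea above can be sketched by hand with just the stdlib. A real setup would use Hypothesis's `@given` decorator and get smarter input generation and shrinking for free; `sort_records` and the properties below are made-up illustrations, not anything Kiro emits:

```python
import random

def sort_records(xs):
    # Hypothetical function under test; the "spec" says it returns a
    # sorted permutation of its input.
    return sorted(xs)

def check_property(prop, gen, trials=200):
    """Sweep `trials` randomly generated inputs through property `prop`."""
    for _ in range(trials):
        xs = gen()
        assert prop(xs), f"property failed for input: {xs}"

# Generator: random-length lists of random ints (the "input domain").
gen = lambda: [random.randint(-1000, 1000) for _ in range(random.randint(0, 50))]

# Idempotence: sorting an already-sorted list changes nothing.
check_property(lambda xs: sort_records(sort_records(xs)) == sort_records(xs), gen)
# Permutation: the output is a reordering of the input.
check_property(lambda xs: sorted(sort_records(xs)) == sorted(xs), gen)
```

The point is that each property is derived from a line of the spec, so the test sweep checks spec adherence rather than a handful of hand-picked scenarios.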
Kiro can technically implement the task list itself, but this is where agents come in. With the spec in hand, I use multiple headless CLI agents in tmux (e.g., Kiro CLI, Claude Code) for implementation. The results have been very good. With a solid Kiro spec and task list, agents usually implement everything end-to-end without stopping -- I haven’t found a need for Ralph loops.
Kiro didn't have the strongest start, but the Kiro IDE is one of the best spec generators I’ve used, and it integrates extremely well with agent-driven workflows.
Comment by riknos314 4 hours ago
Comment by daheza 8 hours ago
Haven't tried Kiro CLI.
Comment by letitgo12345 8 hours ago
Comment by locopati 50 minutes ago
no! not that way!
Comment by bigbuppo 9 hours ago
Comment by moomoo11 1 hour ago
Comment by m3kw9 3 hours ago
Comment by luxuryballs 2 hours ago
Comment by dlev_pika 7 hours ago
Over the next few days my account history came back, except purchases made Q1 2026. Those are still missing. There are a few substantial purchases I made that are nowhere to be found anymore.
I attributed this to Iranian missiles hitting some of their infrastructure in the EU, as had been reported.
Now I am not sure if it was blast radius from missiles or AI mishaps. Lmao - couldn’t happen to a worse company…
Comment by rushabh 2 hours ago
Comment by rvz 5 hours ago
Comment by softwaredoug 2 hours ago
Comment by xodn348 2 hours ago
Comment by CodingJeebus 11 hours ago
I find myself context-switching all the time and it's pretty exhausting, while also finding that I'm not retaining as much deep application domain knowledge as I used to.
On the surface, it's nice that I can give my LLM a well-written bug ticket and let it loose since it does a good job most of the time. But when it doesn't do a good job or it's making a change in an area of the codebase I'm not familiar with, auditing the change gets tiring really fast.
Comment by skeledrew 9 hours ago
I thought this blurb the most interesting. What's the between-the-lines subtext here? Are they deliberately serving something they know to be faulty to the Chinese? Or is it the case that the Chinese use it with little to no issue/complaint? Or...?
Comment by secondcoming 3 hours ago
Comment by oxqbldpxo 9 hours ago
Comment by teeray 4 hours ago
So what incentive is there for juniors to look at the code at all? Seniors are now just another CI stage for their slop to pass.
Comment by 10xDev 7 hours ago
Comment by mikkupikku 4 hours ago
Comment by th2o34i3432897 8 hours ago
Has Seattle now become the code-slop capital? Or is SFO still on top?
Comment by ChrisArchitect 4 hours ago
Comment by MDGeist 9 hours ago
Comment by dboreham 4 hours ago
Comment by dude250711 9 hours ago
Take a perfectly productive senior developer and instead make him be responsible for output of a bunch of AI juniors with the expectation of 10x output.
Comment by frogperson 9 hours ago
Comment by hard24 8 hours ago
Think about it: how do you increase the speed at which someone can review code? Well, first it must be attractive to look at; the more attractive the code, the faster you review, understand, and move through it. Now, this won't be the case everywhere - e.g. in outsourced regions the conditions will force people to operate a certain way.
I'm not a SWE by trade, I just try to look at things from the pragmatic standpoint of how orgs actually make incremental progress faster.
Comment by bombdailer 2 hours ago
A beautiful building is only as good as the correctness of its foundation, framework, materials, and construction. Those qualities can only be assessed by people with enough expertise to understand their importance. Beauty, in its proper place, is the output of the intersection between a craftsman and an engineer. Beauty is optional, but it makes life more worth living. The same is true for code: attractive code is optional, but it makes being a SWE more rewarding.
Comment by recallingmemory 2 hours ago
Comment by AlexandrB 8 hours ago
"No, not like that though!"
Comment by fredgrott 8 hours ago
If you know CS, you know two things:
1. AI cannot judge whether code is signal or noise; AI cannot tell.
2. CS-wise, we use static analysis to judge good code from bad.
How much time does it take to run the basic static analysis tools for most computer languages over AI output?
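As a toy illustration of the kind of automated check being described, here is a crude stand-in for the complexity passes real linters (flake8, pylint, clang-tidy) perform, using only Python's stdlib `ast` module. The function names and threshold are illustrative, not any tool's actual API:

```python
import ast

# Sample source to analyze: one clean function, one deeply branched one.
SOURCE = '''
def ok(x):
    return x + 1

def messy(x):
    if x > 0:
        if x > 10:
            for i in range(x):
                if i % 2:
                    x -= 1
    return x
'''

def branchy_functions(source, max_branches=2):
    """Flag functions whose branch count exceeds a threshold -- a crude
    proxy for cyclomatic complexity."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            branches = sum(isinstance(n, (ast.If, ast.For, ast.While))
                           for n in ast.walk(node))
            if branches > max_branches:
                flagged.append((node.name, branches))
    return flagged

print(branchy_functions(SOURCE))  # flags 'messy', not 'ok'
```

Running checks like this over generated code is cheap; the open question in the thread is whether it catches the kinds of defects that caused the outages.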
Some juniors need firing outright
Comment by 8note 1 hour ago
maybe as software engineering topics, but that's a different discipline
Comment by desireco42 2 hours ago
I do consulting and use AI a lot. You just have to take responsibility for the code. We are delivering like never before, but we have a lot of experience in how to do it as safely as possible. And we are learning along the way. They say you need a year to build up that experience, FYI.
I feel bad for those engineers who will have to sign off for things they will most likely not have enough time to review. Kiro is nice and all.
Comment by oliver_dr 7 minutes ago
Comment by josefritzishere 9 hours ago
Comment by throw_m239339 8 hours ago
Comment by 10xDev 7 hours ago
Comment by throw_m239339 2 hours ago
Comment by aplomb1026 1 hour ago
Comment by adrien_dev 5 hours ago
Comment by ihsw 2 hours ago
Comment by throwaway613746 4 hours ago
Comment by andsoitis 11 hours ago
The environment breathed a little.
Comment by 8note 1 hour ago
as an alternative, a bunch of people got into their one-person trucks and drove to the store to buy whatever would have been efficiently delivered