A Brief History of Ralph
Posted by dhorthy 5 days ago
Comments
Comment by shanewwarren 5 days ago
I created a small Claude skill that helps create the "specs" for a new or existing project. It adds a /specs folder with a README that acts as a lookup for topics and features of the app, the technical approach, and the feature set. Once we've chatted, it spawns subagents to do research and present their findings in the relevant spec. In terms of improvements, I'd almost like a more opinionated back-and-forth between "PM-type" agents to help test ideas and implementation approaches.
I've got the planning and build loop set up in the Claude devcontainer, which is somewhat fragile at the moment, but works for now.
In terms of chewing up context, I've noticed that depending on the size of the project, the IMPLEMENTATION_PLAN.md can get pretty massive. If each agent run needs to parse that plan to figure out what to do next, that's a lot of wasted parsing. I'm working on making the implementation plan more granular so there is less to parse when figuring out what to do next.
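One possible way to make the plan more granular (a hypothetical sketch, assuming tasks appear in the plan as `- [ ]` checkbox lines) is to split it into per-task files, so each agent run parses a single small task instead of the whole plan:

```shell
# Hypothetical sketch: split a monolithic IMPLEMENTATION_PLAN.md into
# one small file per task. Assumes tasks are "- [ ] ..." lines; all
# other lines (headers, notes) are skipped.
split_plan() {
  mkdir -p tasks
  n=0
  while IFS= read -r line; do
    case "$line" in
      "- [ ] "*)
        n=$((n + 1))
        # Strip the checkbox prefix and write the task to its own file.
        printf '%s\n' "${line#"- [ ] "}" > "tasks/$(printf '%03d' "$n").md"
        ;;
    esac
  done < IMPLEMENTATION_PLAN.md
}
```

Each loop iteration can then feed the agent one `tasks/NNN.md` file instead of the full plan.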
Overall, it's been fun and has kept me really engaged the past week.
Comment by vemv 5 days ago
If you're coding a demo MVP, a one-off idea, etc., alright: go ahead, have your fun, and waste your tokens.
However, I'll be happy when the (forced) hype fades as people realise there's nothing novel, insightful, or even well-defined behind Ralph.
Loops have always been possible. Coordination frameworks (for tracking TODOs, spawning agents, supervising completion, etc.) have too, and would be better embodied as a program than as an ad-hoc prompt.
Comment by realityfactchex 4 days ago
With YOLO on full-auto, you can give a wrapping rule/prompt that says more or less: "Given what I asked you to do as indicated in the TODO.md file, keep going until you are done, expanding and checking off the items, no matter what that means -- fix bugs, check work, expand the TODO. You are to complete the entire project correctly and fully yourself by looping and filling in what is missing or could be improved, until you find it is all completely done. Do not ask me anything, just do it with good judgement and iterating."
Which is simultaneously:
1. an effective way to spend tokens prodigiously
2. an excellent way to get something working 90% of the way there with minimal effort, if you already set it up for success and the anticipatable outcomes are within acceptable parameters
3. a most excellent way to test how far fully autonomous development can go -- in particular, to test how good the "rest of" one's configuration/scaffolding/setup is for such "auto builds"
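The wrapper described above boils down to a short loop. A hypothetical sketch (the `AGENT_CMD` name is an assumption standing in for whatever CLI you actually run, and TODO.md items are assumed to be Markdown checkboxes):

```shell
# Hypothetical sketch: keep re-running the agent against TODO.md until
# every checkbox is checked. AGENT_CMD stands in for your real CLI;
# it defaults to a no-op so the sketch is inert on its own.
ralph_todo_loop() {
  while grep -q '^- \[ \]' TODO.md; do
    ${AGENT_CMD:-true} < PROMPT.md
  done
}
```

PROMPT.md would carry the "keep going until you are done" instruction quoted above; the loop just re-applies it until no unchecked items remain.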
Setting aside origin stories, it's honestly very hard to tell whether Ralph, and full-auto YOLO before it, are tightly coupled to some kind of "guerrilla marketing" effort (or whatever that's called these days) or really are organic phenomena. It almost doesn't matter. The whole idea with auto-YOLO and Ralph seems to be that you loop a lot and see what you can get. Very low effort, surprisingly good results. Just minor variations on branding and implementation.
Either way, in my experience, auto-YOLO can actually work pretty well. 2025 proved to be cool in that regard.
Comment by jes5199 5 days ago
there’s some debate about whether this is in the spirit of the _original_ Ralph, because it keeps too much context history around. But in practice Claude Code compactions are so low-quality that it’s basically the same as clearing the history every few turns
I’ve had good luck giving it goals like “keep working until the integration test passes on GitHub CI” - that was my longest run, actually, it ran unattended for 24 hours before solving the bug
Comment by Juvination 5 days ago
I still have a manual part, which is breaking the design document down into multiple small gh issues after a review, but I think that is fine for now.
Using codex exec, we start working on a GitHub issue with a supplied design document, creating a PR on completion. Then we perform a review using a made-up review skill, which is effectively just a "cite your sources" skill applied to the review, along with Open Questions.
Then we iterate through the open questions, doing a minimum of 3 reviews (somewhat arbitrary, but multiple reviews sometimes catch things). Finally, I have a step for checking SonarCloud, fixing the findings, and pushing the changes. Realistically this step should be broken out into multiple iterations to avoid large context rot.
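As a rough sketch, that pipeline could be scripted like this (`codex exec` is the real entry point, but the prompt wording, the review floor of 3, and the `CODEX` override are assumptions mirroring the description above):

```shell
# Hypothetical sketch of the issue -> PR -> review -> SonarCloud loop.
# CODEX lets you point the sketch at a stand-in CLI; it defaults to
# the real codex binary.
run_issue_pipeline() {
  issue="$1"
  codex="${CODEX:-codex}"
  min_reviews="${MIN_REVIEWS:-3}"
  $codex exec "Implement GitHub issue #$issue per the design doc, then open a PR."
  i=1
  while [ "$i" -le "$min_reviews" ]; do
    $codex exec "Review the PR for issue #$issue: cite your sources and resolve the Open Questions."
    i=$((i + 1))
  done
  $codex exec "Check SonarCloud findings on the PR, fix them, and push the changes."
}
```

Breaking the SonarCloud step into its own loop iteration, as suggested, would just mean moving the last call into a second `while`.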
What I miss the most is output: seeing what's going on in either Codex or Claude in real time. I can output the last response, but it just gets messy until I make something a bit more formal.
Comment by wild_egg 5 days ago
The key bit is right under that though. Ralph is literally just this:
while :; do cat PROMPT.md | npx --yes @sourcegraph/amp ; done
Comment by mkl 4 days ago
https://github.com/repomirrorhq/repomirror/blob/main/repomir... (discussed in https://news.ycombinator.com/item?id=45005434) provides a bit more detail, and prompts, but only seems to use the method for porting existing software.
Comment by msla 5 days ago
cat PROMPT.md | cat | npx --yes @sourcegraph/amp
Comment by odie5533 5 days ago
I think the key to it is having lots of smaller tasks with fresh context each loop. A Ralph run starts, picks the most important task, completes it, and ends its loop. Then the next run starts with fresh context, grabs the next most important task, and the loop continues. I have not tried this method yet.
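A minimal sketch of that one-task-per-run loop (hypothetical: `AGENT_CMD` stands in for the real agent CLI, and TASKS.md is assumed to be a plain one-task-per-line queue, most important first):

```shell
# Hypothetical sketch: one task per iteration, fresh context each run.
# The agent only ever sees the top line of TASKS.md; on success that
# line is removed, so the next run starts clean on the next task.
# AGENT_CMD stands in for your real agent CLI (defaults to a no-op).
ralph_task_loop() {
  while [ -s TASKS.md ]; do
    task=$(head -n 1 TASKS.md)
    if printf 'Complete this single task, then stop: %s\n' "$task" | ${AGENT_CMD:-true}; then
      # Task succeeded: drop it from the queue; a failed task is retried.
      tail -n +2 TASKS.md > TASKS.tmp && mv TASKS.tmp TASKS.md
    fi
  done
}
```

Because each iteration starts a new agent process, no context carries over between tasks, which is the point of the technique.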
Comment by ossa-ma 5 days ago
And it all ends with the grift of all grifts: promoting a crypto token in a nonchalant 'hey whats this??!!??' way...
Comment by f311a 5 days ago
It's complete garbage, and since it runs in a loop, the amount of garbage multiplies over time.
Comment by dhorthy 5 days ago
Even if the absolute ceiling remains low, it's interesting how much good context engineering raises it.
Comment by ossa-ma 5 days ago
I’m all for AI, and it’s evident that the future of AI is more transparency (MLOps, tracing, mech interp, AI safety), not less.
Comment by skerit 5 days ago
And this dream of "having Claude implement an entire project from start to finish without intervention" came crashing down with this realization: Coding assistants 100% need human guidance.
Comment by articulatepang 5 days ago
More generally, I've noticed that people who spend a lot of time interacting with LLMs sometimes develop a distinct brain-fried tone when they write or talk.
Comment by GibbonBreath 5 days ago
So their comment is really a vehicle for them to deliver an insult and doesn't represent significant engagement with the material or a thoughtful digression that could foster curious conversation.
Note that that doesn't mean it's a good article or that Ralph is a good idea.
Comment by articulatepang 4 days ago
"If you've seen my socials lately, you might have seen me talking about Ralph and wondering what Ralph is. Ralph is a technique. In its purest form, Ralph is a Bash loop.
while :; do cat PROMPT.md | claude-code ; done
Ralph can replace the majority of outsourcing at most companies for greenfield projects. It has defects, but these are identifiable and resolvable through various styles of prompts."
but the contents of PROMPT.md are behind a paywall. In spirit that is not so different from
gcc program.c; ./a.out
while program.c is behind a paywall. It's nearly impossible to reason about what the system will do and how it works without knowing more about PROMPT.md. For example, PROMPT.md could say "Build the software" or it could say "Write formal proofs in Lean for each function" or ...
In the spirit of curiosity, I'd appreciate a summary of a couple of sentences describing the approach, aimed at a technically sophisticated audience that understands LLMs and software engineering, but not the specifics of this particular system.
Comment by ahurmazda 4 days ago
And then all these planning steps … if CC/codex interviews a ~million senior devs, perhaps the next iteration of models will know how to plan way better than today?