Apache Burr: Build reliable AI agents and applications

Posted by anhldbk 6 days ago

Comments

Comment by brotchie 6 days ago

I'm still on the fence about agent frameworks, they have their place, and it depends on the nature of the agent: e.g. "Low latency, return a good enough response in 3 seconds, vs. working for 3 hours on a problem."

BUT, if you boil it down, an agent really is context building, making an LLM call, executing requested tool calls, parsing the final model output, returning it to some frontend. There's extensions like memory, async tool calls, etc, but not THAT complicated from a traditional software engineering perspective.

Everyone seems to want to build their agent framework. But if you're tasked with building an agent, I've found it much easier and more maintainable to just build 1:1 code for THAT agent: most of the abstractions you get from an agent framework purely get in the way and obfuscate core agent logic.

You end up being forced to use the abstractions chosen by the agent framework, which sometimes are a mismatch for what you're actually trying to do.

Comment by peterbell_nyc 6 days ago

For me the heart of an agentic system is NOT using agents (except when you really have to). Components of a working system include: - Pipelines/recipes to describe multi-step flows (deterministic, agentic and HiTL steps), loops, conditionals, exit-on's for max loop iteractions, etc - The logistics to actually run the model and HiTL steps reliably across multiple agent worker pools - Management and delivery (and security/governance and permissioning) of thick skills with code to do as much as possible - Context management so the right agents have the right context for the right sessions at the right time - Project management - ability to store and access tickets, dependencies, track progress, restart stuck ticket claims, etc - Transcript saving, memory features and dreaming/compounding capabilities so the agents continue to learn from each session - o11y for understanding whats happening, tracking costs and usages, etc - Evals and auto-tuning of prompts so you can go cross model provider and also lock to a model version so you can do an ROI on each model version upgrade - Sandboxes for running the actual model sessions

Don't need to get it all from one vendor, but that feels to me like the toolkit and for most use cases I'd argue: - Don't limit yourself to a single model provider (anthropic, openai, etc) - Own your context - Own your compounding

Comment by sdesol 6 days ago

Can you comment more on

> Context management so the right agents have the right context for the right sessions at the right time

I'm going to do a show HN tomorrow that explains how you can give your agents years of experience. The basic idea is, you would commit in your repo or download manifests (JSON files) that can be converted to "Brains" (SQLite databases). Each brain can have its own properties.

For example, I provide a "code intent" analyzer (instructions for AI) that says when analyzing a file, extract this metadata. For the code intent analyzer, I have the AI extract a single sentence purpose for the file. So if you execute:

gsc rg cache --db code-intent --fields purpose

you get all matches for 'cache' plus the matching file's purpose like "Modify file to update caching strategy". This is how the agent can tell if the file is talking about cache vs. whether this file is what you should change if you want to update the caching strategy.

So for what you described, you can have a brain for different stages of a task. It can be as simple as, in the planning stage, make sure you do this if you need to touch this file.

I am working on a rust-blast-radius brain that uses `syn` + AI generated metadata to help you understand "what if I changed this file, what would be affected". With the rust-blast-radius brain, the AI can summarize the types of files that will be affected without having to open the file based on what has been changed or discussed.

So you can have a rule like, if I make changes to a Rust file, make sure to do a blast radius analysis so we don't forget to consider something.

Does this align with what you are looking for?

Comment by andai 6 days ago

(I'm not the guy but) That's funny, I had the same idea the other day. Keeping summaries of files. Haven't tested that yet.

Another thing I've been thinking is how, most parts of a file are not relevant to the whole system.

Like there are parts where they intersect, and those seem to be the most important ones for capturing the big picture. You wanna be able to see the entire "skeleton".

So I thought the summary maybe shouldn't be English but it should be a subset of the code — the subset that's relevant to the rest of the program.

`grep import` gets you 90% of the way there.

Comment by sdesol 6 days ago

If you include the following:

https://github.com/gitsense/chat/blob/main/base-state/analyz...

In your chat with AI, include the above file and let it know what your requirements are and I can create the analyzer and include it.

You can also think of my tool as data prepping tool. So if you have a clear prompt the AI can review the file during analysis and remove all unnecessary code so the extracted metadata will the stripped text which you can use search against.

Comment by andai 6 days ago

> If a developer wanted to change X, would these keywords help them find this file?

I think the best way to generate these is with a sub-agent. Tell it to try and solve a problem that involves editing this file, and see what it starts grepping for.

This ties in with this idea that the tools and designs should be what comes naturally to the LLM, i.e. what it's already been trained on. And the most straightforward way to do that is to let it reach for it.

Like when you reach in the darkness for an object. Where your hand lands is exactly where it should be.

Comment by sdesol 6 days ago

My solution has a natural self improvement loop. Once you have finished a task, you just ask the agent "If you had more information, how would you have finished the task sooner and/or better?" This was how I came about the rust blast radius brain.

I need to modify OpenAI's Codex agent to support slash commands that can help humans better guide agents, and I needed a solution with the least impact. They don't accept contributions so I need to plan for syncing with the upstream.

Comment by andai 6 days ago

Nice, tell me more about your Codex fork.

Are you running it with official or custom models? I've been trying to get custom models working in Codex and haven't been able to figure it out. (A lot of providers support Responses API, but they don't actually work with Codex.)

Comment by sdesol 5 days ago

I haven't made any changes yet and I think the changes that I do make, they will want. I want to create a `/knowledge` slash command that can quickly tell me what the Agent currently knows so I can determine if I need to perform "lobotomy surgery" to make it not know something or add what it needs.

I created a new brain that helped me find the answer for what you described:

> Codex does not just need a /v1/responses endpoint. It needs an OpenAI Responses-compatible agent surface. Many providers implement enough Responses API for text streaming, but not enough for Codex’s tool-call loop and event mapping.

I can understand why they might have done this for performance and/or lock-in and/or AI thinking reasons.

I don't think I will create a translation layer, as that would be a sync nightmare, so based on what I found and what you said, it doesn't look like you can use other providers unless you introduce a proxy layer to translate things.

I should also note, even if you have the translation layer, you might end up breaking harness capabilities.

I am going to update

https://github.com/gitsense/smart-codex

to include the `codex-rust-navigation` brain that you can use to chat with AI about. And you will probably want to use it since `gpt-5.5` estimated that 25 - 50 files did not have to be read:

> Roughly 300-500 files avoided, with a defensible lower bound around 25-50 files.

This brain is designed specifically for rust files so you will need to use code-intent if you want to ask more documention/config questions.

Comment by internet101010 6 days ago

I do file summaries as well. Basically a knowledge base commit hook + agent that creates/updates a file containing a summary and list of its dependencies, followed by updating an index to include it. Super useful for creating system migration scopes and just managing context in general.

Comment by kristjansson 6 days ago

Obscuring core logic is the most egregious part of most agent frameworks. One needs a clear view of what, exactly, is being sent to the underlying language model, and what's coming back. Everything in an 'agentic' application is realized as a sequence of tokens or a call to a provider eventually. It should be clear and obvious from ~all layers of the app what that's going to look like.

Comment by cpard 6 days ago

Most framework vendors don’t have an incentive to make things less obscure. The agent framework is free/open source and they make money primarily from selling observability products for agents. Even if they don’t intentionally obscure things, they just don’t have the motivation to optimize that part.

Comment by pjmlp 6 days ago

Unfortunately agent orchestration frameworks feels like the second coming of BEPL, and the incentives are all wrong.

Comment by throw1234567891 6 days ago

Have a look at pi.

Comment by tomrod 6 days ago

Whst about pi do you like?

Comment by peterbell_nyc 6 days ago

Pi is a nice multi-agent wrapper. I use it to wrap my OpenAI max plan calls and my API calls. It takes care of some of the agent plumbing - still need sandbox, orchestrator, compounding, context, evals, etc but it's a nice component.

Comment by chuckadams 6 days ago

Any particular plugins you'd recommend? For orchestration I gave pi-subagents a try, and didn't care for it, ended up with hung agents sticking around forever and I wasn't even doing anything terribly fancy. Claude's subagent control is annoying and clumsy, but it works.

Comment by krawczstef 6 days ago

FYI - Burr is designed with recursion in mind, i.e. the ability to kick off Burr within Burr. So you could have an action that is managing several Burr subgraphs... Or you write your own management layer here.

Comment by lukebuehler 6 days ago

Arguably, if your agent needs a lot of custom logic to drive the agent loop it isn't an "agent" at all. At best it is an agentic workflow, or rather a workflow with some LLMs calls in between.

I think that's why agent SDKs feel like the wrong abstraction. If you are writing a workflow, use a workflow engine (Airflow, Temporal, etc), and call some LLMs with a small LLM library. If you need a "real" agent, use a full-featured agent harness, like Codex or CC or Pi or whatever, then load it up with all the tools, skills, mcps that it needs, and let it rip.

Incidentally I've been building a full featured agent harness that runs inside durable workflow engines [0], but it is designed _not_ as an SDK but rather as a standalone, full-featured harness with an API.

[0] https://github.com/smartcomputer-ai/forge

Comment by dd8601fn 6 days ago

I’ve said something similar about dozens of frontend frameworks. It’s massive abstraction and convolution for some future payoff that’s obviously never going to happen.

But sometimes people just need something to do, or something fun to play with, and “the next guy” rarely matters that much… so who cares that you’ve saddled them with the result of your paid playtime?

Comment by andai 6 days ago

So before AI I had the experience, more often than not, that it would take me longer to figure out how to use someone else's thing (or get it to do some particular thing, which often turned out to be impossible), than to just make my own.

And that was before I could just ask the computer to make it for me!

But most people seem to be the other way around. They'd rather deal with abstractions and boilerplate instead of writing the actual code.

Comment by mbreese 6 days ago

I'm somewhat in agreement. I like building 1:1 code for that specific agent.

Where I'm starting to question this is maintainability. When I come up with a new technique or way of doing something in my new agent, how can I update an older agent. Do I want to update the older agent?

But, I get what you're talking about w.r.t. building for the exact problem at hand. For example, I'm guessing that Apache Burr has support for a plugin-able vector RAG system (or at least it will if it doesn't now). That's great, but I want my RAG system to add documents to the context and keep them as part of an updated system prompt with some very specific tweaks that happen as part of that process. This is a bespoke way of working with an existing concept (RAG) that doesn't lend itself to using any specific framework.

In my use-case, bespoke is the way to go. But then I'm still stuck with having to make engineering choices for updating older agents. So, I see your point.

Comment by cyanydeez 6 days ago

Agents are a way to de-bloat the context. The way LLMs function, you absolutely need to find the sweet spot for a given task, and if the primary LLM has to go through a bunch of failures to find a working function, those failures are better contained in an agent and disposed of.

Obviously, you could have a different LLM like a "angel" that prunes a primary agent of the context it doesn't need, but I think the realistic KV cache problem is will determine the optimal structure: you want the work do be done in the most efficience KV cache (context-reuse) as much as possible.

There's definitely more to it than just spawning agents.

Comment by freakynit 6 days ago

Yep.. same. I build my own agents... all use-case specific. Keeps the code super minimal, and avoid unnecessary complexity. I have tried a few of these, but nop.. no help.. only more work (and issues).

Comment by hilariously 6 days ago

Couldn't agree more - tried to convince a business that doubling down on OpenClaw wasn't going to solve problems except for some 0-1 stuff, and that almost immediately they'd run into roadblocks because most of the product wouldn't serve their use case.

4 months of mostly spinning their wheels later they launched a really lackluster OC product that's effectively DOA.

Comment by tcdent 6 days ago

OpenClaw is an application, not a harness. Yes, it contains a harness, but it is a complete product.

When building an agentic workflow there are enough primitives that rewriting them from scratch every time makes zero sense.

What is a tool? How does the LLM understand the tool? Formatting a native function into a serializable input/output pattern makes sense to generalize and that does not need to exist repeated in everyones application code.

We use libraries to interact with the APIs themselves; nobody would say writing a spec-compliant API client was poor practice. Agentic harnesses are just one layer above: I need to call the API and I need to do it with certain expected conventions.

Comment by hilariously 6 days ago

Sure, but that seems liked two different points.

One, obviously yes OC contains a lot more than a harness, but my point was that it was too much for their use case and constrained their choices, not enabled them, and that choosing the right layer of abstraction is important.

There's good indirection/abstraction and there's ones that do not serve your use case, eg what was obviously day one regarding Langchain.

Comment by pianopatrick 6 days ago

I like to think of it as "AI prompting algorithms". Like instead of just this prompt gets this result it's A prompt then B prompt the C prompt gets a result.

And just like when people were trying to figure out which sorting algorithm made the most sense, we are all just trying to figure out which prompt algorithms with which models lead to good results.

Comment by chuckadams 6 days ago

The most interesting evolution of my agent workflow over the past year is that I've dropped all the language gimmicks: I used to use particular emoji to mark parts of code as hints to the AI, I structured planning docs in very rigid language for different types of instructions, and generally optimizing my language for machine consumption. That's all gone now: all my comments, directives, plans and so on are just plain and clear English, nothing more. They're direct and unceremonious, but no longer bullet points that were nearly caveman-speak. It just works better that way.

Comment by vanuatu 6 days ago

my job rn is just building agents

the hard part about building agents isnt the framework it's discovery, context, traditional engineering, handling the last mile

there are some invariants like the loop, tools, observability, guardrails, monitors etc...

Comment by brotchie 6 days ago

100% agreed, the "this is what an agent looks like to write" is the wrong pitch for a new agent framework.

The better pitch would be, "this is how easy observability, guardrails, monitoring, deployment, evals, versioning, A/B testing are with our framework." What the agent code looks like is somewhat incidental.

Comment by peterbell_nyc 6 days ago

This this this!

Anyone have something they genuinely like for all of this? For now I'm rolling my own, but I can't believe I won't find a better OSS alternative soon...

Comment by agentdev001 6 days ago

Nvidia Openshell solves most of the hard problems I've run into while building stuff in this space.

Observability is, for my purposes, solved by a given framework supporting OpenTelemetry.

Guardrails is where I've gotten the most value of openshell being a neat package. Agent workload scope is written as policy in openshell, and capability is backed by openshell handling all execution.

Monitoring/deployment/versioning is helped as well, depending on how agents/runners are slotted into the system. Deployment namely is quite well supported- openshell has kube/helm bits that are experimental atm, but seem like a logical approach imho.

Evals and a/b testing isnt something ive explored in depth, considering that agents with composable tool sets + frontier models are beyond my expectations already.

Comment by brotchie 6 days ago

We need the equivalent of the MEAN / LAMP stackronym for agents.

Comment by msradam 18 hours ago

[dead]

Comment by elijahbenizzy 6 days ago

Right I think this is why we made it unopinioated to a fault. Burr doesn't really do these things rather it just provides an orchestration framework. So it's pure BYO functions, classes, components, etc...

Comment by fxwin 6 days ago

The advantage of frameworks isn't that they make it easier to write the actual agent, it's tooling + observability + ... Even Langchain, for all the (deserved) criticism it gets made this very clear very early: It might be easy/easier to write your own chatbot from the ground up, but what happens if you have to add observability/tracing? Being able to just add one environment variable and instantly have a UI where i can nicely go through all of my traces with basically 0 additional effort is something a hand rolled solution just can't really compete with

Comment by freakynit 6 days ago

This only becomes relevant if your execution graph is complex/big enough. Otherwise, all it takes is less than 30 minutes to add telemetry to all needed points. Doing manually also gives you better control on what you really want to track (to save costs).

Comment by trollbridge 6 days ago

It’s painfully obvious that you can just open your coding harness and… tell it you’d like to make an agent. They’re simple to write.

Comment by peterbell_nyc 6 days ago

The agent isn't the hard part - it's the orchestration, skills, research systems, adversarial reviews, dreaming/compounding, context management and all the rest. Plus all the annoying hygiene tools to "poke the agent that got a clear prompt and decided to just sit there and wait for no good reason" and "delete the remote branches that the prompt told all the agents to delete but some of them forgot to":)

Comment by elijahbenizzy 6 days ago

This is where it's nice to have some guardrails -- coding agents work etremely well with limitations.

Comment by toddmorey 6 days ago

Anthropic seems to agree with you as more recent Claude updates have it just building task specific harnesses as needed.

Comment by bko 6 days ago

Take a simple workflow. You have a query it goes to a classifier. The classifier determines what workflow it should route the request to.

Then you have a general workflow that has a set of skills (prompts) and tools. And that could be recursive.

So if you do something like "rename this file" you have to build up a workflow like:

[classifier]

what's the workflow -> rename

[rename workflow]

list files (tool call)

figure out relevant predicate (LLM)

convert predicate into a filter query give the context of the files (LLM)

figure out what you want the new name to be (LLM)

create the request body and hit the tool

approval workflow

formatting

It's a lot to manage and orchestrate and that's just one simple example. You'd like want to use the same building blocks to delete a file or move it. Even to know the right concepts is difficult as we're a bit deluded on whats going on in the background of these modern AI apps like Claude and GPT that do a lot of this stuff for you

Comment by krawczstef 6 days ago

yep - here's something cool an end user wrote with Burr - https://github.com/msradam/phoebe

Burr just helps you, the engineer, to really control the primitives. Then adds some cool features you don't have to think about -- like observability :)

Comment by agentifysh 6 days ago

100% this

you dont need a framework

Comment by aplomb1026 6 days ago

[flagged]

Comment by eranation 6 days ago

Was this built using https://vorpus.github.io/performativeUI/?

It hits every AI generated landing page trope possible.

Or was it done ironically?

Comment by glenngillen 6 days ago

I was just wondering the same thing!

I do suspect it was built with some form of AI though because the handful of links I've tried to dig into have all linked to the wrong place/are invalid :/

Comment by n2h4 6 days ago

amazing set of components. just noticed my "ai-native" company has almost all of them.

Comment by hmokiguess 6 days ago

How does this compare to https://strandsagents.com/ ? I'm interested in tools in this space, right now I'm not attached to one, but Bedrock + Serverless on Agent Core feels like the "easy guided path" though I don't like the platform lock-in

Comment by thedougd 6 days ago

Curious about other experiences.

I’ve been playing with this stack and left wondering if Strands provides any secret sauce with Agent Core. So far it doesn’t feel that way and sometimes they even feel at odds with each other.

Comment by fnordpiglet 6 days ago

I think strands and agentcore don’t have a ton of overlap but I have found agentcore makes running strands frameworks pretty easy and relatively inexpensive. I like strands in that it’s the most mature (IMO) that isn’t specific to a model provider - although I found using things like vertex to require a lot of custom work to get caching and other things to work properly, the maturity is all on bedrock.

I’ll need to dig into burr - I’m not finding strands has an insurmountable maturity to it but it’s not carrying a lot of weird opinion (like some of the other OSS frameworks seem to be highly infected with) and is pretty practical in what it exposed and does, and is most like the agentic frameworks I’ve worked with inside FAANG. If burr can meet that and keep growing I’d probably look at moving to it as I also get a sense strands has a bit of the Amazon “highly probable to be abandoned once the managers and PMs get promoted” feeling.

Comment by elijahbenizzy 5 days ago

Strands is more opinionated and higher-level, also not vendor neutral.

Comment by msradam 6 days ago

I have enjoyed using this framework in my personal and work projects, having a reliable stateful workflow for AI models while getting free observability. I stitched a tool that allows mounting a Burr state machine as an MCP, giving agents a rail to follow, and no matter how complex the state machine gets the MCP tools are constrained to state machine navigation: https://github.com/msradam/theodosia

I am currently working on skills-to-state-machine conversions, since a lot of popular skills out there are already written as phases for an AI model to follow, so it would be great to leverage the explicit functionality of Burr to make that more reliable. Thank you for this amazing project.

Comment by tcdent 6 days ago

A builder pattern and decorators.

Yes, Python has decorators, but they're best used as "filters" that apply to functions or methods. Cache this, serialize the output of this function always, prepare this function to be used as a tool by an agentic harness. Not registration, not flow control. You may disagree but someone has to say it; FastAPI influenced the modern use of decorators far too much in the wrong direction.

Builder patterns are a Rust convention, because Rust has no named keyword arguments. A Python function already exposes a named contract. There is very little reason to ever to sequentially pass configuration parameters in chained method calls. If you need to add state that doesn't exist yet to a constructor or factory, that is not a builder pattern. That is registration. The one place where builder patterns should be tolerated is query builders. They iteratively build on a concept and having the additional "slot" for metadata (method name plus keyword arguments) is genuinely useful. Using methods which accept single parameter instead of keyword arguments is incorrect.

Comment by elijahbenizzy 6 days ago

So decorators here specifically attach metadata to make a function a reusable component. Builder makes a workflow. In Hamilton it's all decorators because it's purely declarative construction (sans reusability, really).

Comment by mkarrmann 6 days ago

Builder pattern isn't only used in Rust, but I agree it's hideous to use in Python.

Comment by giancarlostoro 6 days ago

Doesn't look any different than doing the same in C# or Java to me, it is kind of pointless in Python, the one thing the pattern gives you is building a class in such a way that you the developer know exactly what's what, so its really a developer ergonomics thing is how it looks like to me.

Comment by tcdent 6 days ago

Fair point. I should have said "popularized in the modern software vernacular by Rust".

Comment by giancarlostoro 6 days ago

I think of Java immediately when I hear Builder Pattern, and I think anyone who has ever touched Java does as well.

Comment by anentropic 6 days ago

What is the argument against implementing registration pattern with decorators?

Comment by Oras 6 days ago

First time I hear about Burr, curious why it was incubated in Apache.

Comment by elric 6 days ago

Why wouldn't it? The ASF has a long history of incubating new FOSS projects. Some graduate and become household names. Others fail and end up in the attic. The ASF can provide organisational support and generally fosters good communities.

Comment by Oras 6 days ago

My point was this is a crowded market now, why would they pick a platform that is not known? I did search HN and this platform was only shown once 2 years ago, and from their releases, they are still 0.42 after two years.

It might sounded that I’m against the move, but I’m just curious as what apache found in the platform to get incubated

Comment by krawczstef 6 days ago

Cause I submitted it. Learning the Apache process and cranking on other things has been a slow process. But we've got some momentum and beginning more regular releases.

Comment by pratio 6 days ago

I've been working with jido https://jido.run and would definitely recommend it

Comment by elijahbenizzy 6 days ago

One of the co-creators/maintainers here! Will try to answer Qs over the day.

Comment by bananamogul 6 days ago

Do you ever abbreviate the project name as A. Burr and how many Hamilton jokes[1] result from this?

[1] https://en.wikipedia.org/wiki/Burr%E2%80%93Hamilton_duel

Comment by elijahbenizzy 6 days ago

wait till you see our other project github.com/apache/hamilton ;)

Comment by otterley 6 days ago

How does it compare to the Strands Agents SDK? https://strandsagents.com

Comment by mzaccari 6 days ago

I couldn't find an explicit reference for the naming, but for anyone wondering there is a Hamilton example: https://github.com/apache/burr/tree/main/examples/multi-agen...

Comment by abirch 6 days ago

Burr is named after Aaron Burr, founding father, third VP of the United States, and murderer/arch-nemesis of Alexander Hamilton. What's the connection with Hamilton? This is DAGWorks' second open-source library release after the Hamilton library We imagine a world in which Burr and Hamilton lived in harmony and saw through their differences to better the union. We originally built Burr as a harness to handle state between executions of Hamilton DAGs (because DAGs don't have cycles), but realized that it has a wide array of applications and decided to release it more broadly.

https://pypi.org/project/burr/

Comment by elijahbenizzy 6 days ago

Right it was a bit of a joke. Originally stefan and I presented frameworks when we were at stitch fix -- stefan called his "hamilton" and I called mine "burr". His was better for the use-case. But then we wanted to build something for state machines as opposed to DAGs, so we called it Burr. I wanted the git tagline to be "make your agents go burr..."

Comment by suction 6 days ago

[dead]

Comment by lnenad 6 days ago

Claude Opus really loves this template when building websites. It's very funny how many times I've seen it for recent launches.

Comment by doublerabbit 6 days ago

And it lags my desktop every-time, I hate it. It's the default bootstrap theme all over again but instead with SVG's.

Comment by nico 6 days ago

On a tangent, can anyone recommend good coding agent orchestration tools or platform? Something to launch, manage and monitor codex or claude agents in multiple machines

Ideally self-hostable/open source

I know claude code has a lot of that internally built in already, but it’s claude-only

Comment by sgc 6 days ago

I was looking at their docs and Burr has agent cookbooks to get started with this, and it can handle multi-machine workflows. Is this not what you were looking for? I am not sure how it integrates and uses skills etc, but it seems like it should work to me.

https://burr.apache.org/docs/examples/agents/

Comment by nico 6 days ago

Thank you. I want something like that, but that uses codex/claude code as agents (through their corresponding subscriptions), instead of having to create ad-hoc agents + api keys

Comment by sgc 4 days ago

You can absolutely do that by using subprocess.run, or use the codex sdk

https://github.com/openai/codex/tree/main/sdk/python

Comment by sabzil37 4 days ago

I had a similar need to yours, so if you're looking for a job launcher for multi-agents, I'd like to recommend "Prompt-Combo", it's freeware.

Comment by flakiness 6 days ago

Wow, such a un-apache-y homepage I've ever seen, vs. the canonical one: https://httpd.apache.org/ (And wow, they still keep releasing it!)

Comment by doublerabbit 6 days ago

Here's another ancient relic of a website design. They even used a free template from back in the day.

https://tcl.apache.org/rivet/

Comment by fantasizr 6 days ago

it looks like it was built using the satire vibecoding components listed here yesterday https://vorpus.github.io/performativeUI/

Comment by elijahbenizzy 6 days ago

Ha! We went all out on the modern one (user contribution!).

Comment by mooreds 6 days ago

How are agents authenticated?

I searched the docs for authentication and mcp (one of the protocols which, among other things, handles some pieces of authentication/authorization) but didn't see any results.

What did I miss?

Comment by yaodub 6 days ago

[dead]

Comment by redlewel 6 days ago

Didn't bother reading more after seeing the vibeslopped landing page that took < 2 hours to make.

Also "700+ Discord Members" is not any type of endorsement of a technology or service.

Comment by ivanmontillam 6 days ago

Not to kick dirt on their faces, but I also think Discord is just not the proper chat community open source projects should have.

Might be my IRC mind talking, but a proprietary service like Discord, or Slack, or even Telegram at times, is just not suitable for the target community, as there's often a data privacy concern.

I never had a Discord account, and I'm hoping I can keep that streak.

Comment by werdnapk 6 days ago

They're all looking the same now. The AI version of bootstrap.

Comment by amne 6 days ago

I have to say, after the recent Performative UI post here on HN, I was not expecting an apache foundation project to have gradient words on the landing page.

Comment by coneonthefloor 6 days ago

So many software homepages use that tailwind template, or a variety of it. As soon as I see it, it suggests to me a poor quality/rushed product.

Comment by suction 6 days ago

I agree. As with Bootstrap, it was fine in the beginning but at some point turned sour - strictly speaking about Look & Feel.

Comment by coneonthefloor 6 days ago

Yes, I have nothing against Tailwind as a technology. My gripe is that you can now prompt an LLM “make a stylish homepage for my product” and they almost exclusively spit out this Tailwind template.

If that’s all the effort you are going through to make your brochure site, I can only assume the same care was given to the actual product.

Comment by vanuatu 6 days ago

vibe coded landing page

reddit user testimonial

framework is for state machines

why man..

Comment by ivanmontillam 6 days ago

Don't ask the why, ask the how. How did they get acceptance into an incubation stage with what you just mentioned?

Comment by pixel_popping 6 days ago

The vibe coded landing page (at least in its look) is really degrading Apache foundation image imo.

Comment by pramodbiligiri 6 days ago

How does one know that a website is "vibe coded"? Any good indicators?

Comment by enragedcacti 6 days ago

https://vorpus.github.io/performativeUI/

so far I'm seeing: GradientText, Animated button, EyebrowPill, Aurora background, MockIDE, LogoRow, SlippyWords, StatCounter, CommunityBadge

also: "No DSL, no YAML — just Python functions and decorators."

'It's not X, its Y' but with an added em dash is crazy work.

Comment by doublerabbit 6 days ago

The flair is a big give away. View source. Look for the SVGs.

Comment by krawczstef 6 days ago

user contributed :)

Comment by chill_ai_guy 6 days ago

I think the marketing copy probably needs to focus on differentiating features vs any myriad of agent frameworks. I took one look at the sample and immediately said "This is literally just langgraph with a builder pattern"

Comment by elijahbenizzy 6 days ago

Right -- possible it's slightly out of date https://github.com/apache/burr#-comparison-against-common-fr.... Good point on differentiating.

Comment by _pdp_ 6 days ago

I don't know. This is a very naive version of what reliable AI agents are all about.

Reliable means "can it finish the job that was tasked to do". It certainly has nothing to do with state machines.

Comment by drchaim 6 days ago

I just create a MVP chatbot for a client that has a Django app. I took the route to no frameworks. Claude/codex wrote the agent loop, the tools, the streaming..it’s working well for the MVP, we’ll see

Comment by big-chungus4 6 days ago

I am interested to know why people use those graph based agentic frameworks. Why not just define the behavior in python?

Comment by hbarka 6 days ago

Is this comparable to https://dspy.ai/ ?

Comment by shhhhhplease 6 days ago

Definitely not.

Comment by krawczstef 6 days ago

no. more lower level.

Comment by schainks 5 days ago

Dumb question: why would I use this instead of airflow with some custom built agent bits?

Comment by g42gregory 6 days ago

What are the pros and cons vs. Pydantic?

Comment by vivekchand19 6 days ago

the observability layer is almost always what DIY agent setups skip, and it's the first thing you miss in production. (built clawmetry for agent runtimes like openclaw, claude code, hermes etc specifically because of this)

Comment by CuriouslyC 6 days ago

The best agent framework is Pi (pi.dev). It is minimal and doesn't assume a use case, runs fine interactively or non-interactively, has an active community building with it and supports everything you need to build whatever kind of agent you want with plugins.

Comment by hedgehog 6 days ago

After trying a few I like NanoBot more than Pi. Also popular, pretty clean code, I found fewer bugs than I did in Pi.

Comment by TZubiri 6 days ago

oh god, the vibecoded homepage, I instantly closed that. sorry

Comment by iririririr 6 days ago

of course the page is some disposable garbage generated by Claude code that doesn't even try to work without JavaScript. at least it's on brand.

Comment by Meneth 6 days ago

There's no such thing as "reliable" AI.

Comment by cohix 6 days ago

And me? I’m the damn fool that shot him…

Comment by sermakarevich 6 days ago

My strong opinion is that every SE should go for agentic engineering these days: understand the limitations, how to shift them, and how to squeeze the maximum possible out of AI. The AI boom is seeing hundreds of billions, close to a trillion, in investments — with yesterday's announcement from China about investing $300B in a nationwide grid of data centers. Models are becoming smarter, and the gap where yesterday's models were unable to do something is closing fast. I look at agentic engineering from the complexity perspective. Give an agent a simple task like inserting an assertion and you can be 100% sure it will do it correctly. Give it a complex task made of many subtasks in a complex codebase and it will fail. There is a boundary of complexity that a single call to an agent can handle. We can push this boundary if we decompose the task into many small ones and provide a sufficient description of what has to be done.

I found the Spec Driven Development approach works nicely for me, and I've been using it since Feb 2026 for all my mid+ size projects. I started with the GSD plugin, but it soon became too heavy, so I implemented my own lightweight SDD-based workflow for Claude. A friend of mine ported it to gemini-cli, and that version was added to Google's approved third-party frameworks for internal usage. The idea is to decompose feature implementation into multiple steps, task implementation into multiple subtasks, and be able to clear the context after every task/implementation.

Repo: https://github.com/sermakarevich/sddw

Slides: https://docs.google.com/presentation/d/1SjKXF7hkoqyiN9-3tBGY...

When SDD was not enough, I started playing with scaling a single AI worker to multiple, and then got to an agent swarm. Built on top of a centralized Beads database, claude -p headless execution, a UI, a custom ask_user MCP, and Telegram integrations, fleet (the app name) lets me add many tasks in advance, control the number of workers executing them, and use any kind of coder/model. It works nicely with the SDDW implementation phase. It shines when you keep creating tasks, define dependencies between them, and give clear descriptions. For personal projects I can queue up 70 tasks for an overnight run, set the number of workers to 1 to not be blocked by usage limits, and let it roll.

Repo: https://github.com/sermakarevich/fleet

Slides: https://docs.google.com/presentation/d/1O_pXyKdtpRG2ORD1xw7s...

Since Fable 5 appeared, I've been changing the way I work with fleet. Instead of adding tasks/descriptions/dependencies to fleet myself, I talk to Fable: specify the goal, ensure understanding, and let Fable 5 add tasks to fleet. Fable is expensive, but in this setup it doesn't code — it just investigates, designs, decomposes, and creates tasks. Workers use the cheaper Sonnet 4.6 model.

Reliability comes with task implementation decomposition into multiple steps, feature decomposition into many smaller and simpler subtasks, having better description, clean and focused context.