Things I've Done with AI
Posted by shepherdjerred 1 day ago
Comments
Comment by brotchie 1 day ago
2025 Taxes
Dumped all PDFs of all my tax forms into a single folder and asked Claude to rename them nicely. Asked it to use Gemini 2.5 Flash to extract all tax-relevant details from all statements / tax forms. Had it put together a web UI showing all income, deductions, etc., for the year. Had it estimate my 2025 tax refund / underpayment.
The result was amazing. I now actually fully understand my tax position. It broke down all the progressive tax brackets and added notes for all the extra federal and state taxes (e.g. Medicare, the CA mental health tax).
Finally had Claude prepare all of my docs for upload to my accountant: FinCEN reporting, summary of all docs, etc.
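The bracket breakdown is the kind of thing that's cheap to spot-check by hand. A minimal sketch of a progressive-bracket calculator; the bracket numbers here are illustrative only, not the actual IRS or CA schedules:

```python
def tax_owed(income: float, brackets: list[tuple[float, float]]) -> float:
    """Progressive tax: each (threshold, rate) taxes the slice of income
    between that threshold and the next one."""
    owed = 0.0
    for i, (lo, rate) in enumerate(brackets):
        hi = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        if income > lo:
            owed += (min(income, hi) - lo) * rate
    return owed

# Illustrative brackets only -- not a real tax schedule.
BRACKETS = [(0, 0.10), (11_000, 0.12), (44_725, 0.22), (95_375, 0.24)]
print(tax_owed(100_000, BRACKETS))  # about 17400 on this made-up schedule
```

Having the model emit a check like this, rather than doing the arithmetic in token-space, is what makes the result verifiable.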
Desk Fabrication
Planning on having a furniture maker fabricate a custom solid-walnut top for a custom office standing desk. I want to create a STEP file of the exact cuts / bevels / countersinks / etc. to help with fabrication.
Worked with Codex to plan out and then build an interactive in-browser 3D CAD experience. I can ask Codex to add some component (e.g. a grommet) and it will generate parameterized B-rep geometry for that feature and then let me control the parameters live in the web UI.
Codex found Open CASCADE Technology (OCCT), a B-rep modeling library that has a WebAssembly-compiled version, and integrated it.
Now have a WebGL view of the desk, can add various components, change their parameters, and see the impact live in 3D.
Comment by cj 1 day ago
What scares me though is how I've (still) seen ChatGPT make up numbers in some specific scenarios.
I have a ChatGPT project with all of my bloodwork and a bunch of medical info from the past 10 years uploaded. I think it's more context than ChatGPT can handle at once. When I ask it basic things like "Compare how my lipids have trended over the past 2 years," it will sometimes make up numbers for tests, or it will mix up the dates on certain data points.
It's usually very small errors that I don't notice until I really study what it's telling me.
And also the opposite problem: A couple days ago I thought I saw an error (when really ChatGPT was right). So I said "No, that number is wrong, find the error" and instead of pushing back and telling me the number was right, it admitted to the error (there was no error) and made up a reason why it was wrong.
Hallucinations have gotten way better compared to a couple of years ago, but in my experience ChatGPT at least still seems to break down, especially when it's overloaded with a ton of context.
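One mitigation (the "write a script" tactic mentioned downthread) is to have the model extract the raw values once, then compute trends deterministically rather than asking it to do arithmetic over a huge context. A sketch with made-up field names and values:

```python
from datetime import date

# Hypothetical records, extracted once from the uploaded lab reports.
results = [
    {"date": "2023-03-01", "ldl": 131, "hdl": 48},
    {"date": "2024-03-05", "ldl": 122, "hdl": 51},
    {"date": "2025-02-20", "ldl": 110, "hdl": 55},
]

def trend(records, marker):
    """Return (first, last, delta) for one marker, sorted by date."""
    ordered = sorted(records, key=lambda r: date.fromisoformat(r["date"]))
    values = [r[marker] for r in ordered]
    return values[0], values[-1], values[-1] - values[0]

print(trend(results, "ldl"))  # deterministic: same input, same answer
```

The extraction step can still hallucinate, but it only has to be verified once, and every question after that is plain arithmetic.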
Comment by arjie 1 day ago
1. I keep all my accounts in accounting software (originally Wave, then beancount)
2. Because the machinery is all programmatically queryable, the data is not in token-space; only the schema and logic are.
I then use tax software to prep my professional and personal returns. The LLM acts as a validator, and ensures I've done my accounts right. I have `jmap` pull my mail via IMAP, my Mercury account via a read-only transactions-only token and then I let it compare against my beancount records to make sure I've accounted for things correctly.
For the most part, you want it to be handling very little arithmetic in token-space, though the SOTA models can do it pretty flawlessly. I did notice that they would occasionally make arithmetic errors in numerical comparisons, but when using them as an assistant you're not using them directly: you're using them as a hypothesis generator and a checker tool, and if you ask them to write out the reasoning they're pretty damned good.
For me Opus 4.6 in Claude Code was remarkable for this use-case. These days, I just run `,cc accounts` and then look at the newly added accounts in fava and compare with Mercury. This is one of those tedious-to-enter trivial-to-verify use-cases that they excel at.
To be honest, I was fine using Wave, but without machine-access it's software that's dead to me.
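The validator pattern described above stays out of token-space: fetch transactions, fetch ledger entries, diff them mechanically, and leave the LLM only the mismatches to reason about. A stdlib-only sketch with hypothetical records (real use would pull from the Mercury API and a beancount export):

```python
from collections import Counter

# Hypothetical data; in practice these come from the bank API and the ledger.
bank = [("2025-01-03", -42.00, "AWS"), ("2025-01-07", 1500.00, "Client")]
ledger = [("2025-01-03", -42.00, "AWS")]

def unreconciled(bank_txns, ledger_txns):
    """Transactions in the first source that are missing from the second.
    Counter subtraction handles duplicates correctly."""
    diff = Counter(bank_txns) - Counter(ledger_txns)
    return sorted(diff.elements())

print(unreconciled(bank, ledger))  # only the mismatch is left for review
```

Running it in both directions catches entries missing from either side; the tedious-to-enter, trivial-to-verify split is exactly what makes this safe to delegate.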
Comment by shepherdjerred 1 day ago
Comment by brotchie 1 day ago
To then "aggregate" all of the JSON outputs, I had Claude look at them and then iterate on a Python tool to do it programmatically. I watched it iterate a few times on this: write the most naive Python tool, run it, hit an exception, rinse and repeat, until it was able to parse all the JSON files sensibly.
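The tool that loop converges on presumably looks something like a tolerant aggregator; a minimal stdlib sketch (the file layout and failure handling are assumptions):

```python
import json
from pathlib import Path

def aggregate(folder):
    """Merge every parseable JSON file in a folder, collecting the
    unparseable ones so the next iteration can handle them."""
    merged, failures = [], []
    for path in sorted(Path(folder).glob("*.json")):
        try:
            merged.append(json.loads(path.read_text()))
        except json.JSONDecodeError:
            failures.append(path.name)  # surface these instead of crashing
    return merged, failures
```

Reporting failures rather than raising is what lets each iteration shrink the failure list instead of dying on the first bad file.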
Comment by dmd 1 day ago
Comment by cj 1 day ago
Which should pair well with the “write a script” tactic.
Comment by tavavex 1 day ago
Comment by ElFitz 1 day ago
And it usually takes just as long.
Comment by thijsvandien 1 day ago
Comment by basch 1 day ago
Comment by AlecSchueler 14 hours ago
Comment by jumpman500 1 day ago
Comment by thijsvandien 1 day ago
Comment by whackernews 1 day ago
Comment by whackernews 1 day ago
Comment by generallyjosh 4 hours ago
Personally, I know coding pretty well. So when I'm using it for coding, I can spot most of its mistakes / misunderstandings
I would not trust using it on a complex domain I'm not super familiar with, like doing taxes
A mistake here is pretty high cost (getting audited, and/or having to pay a bunch in penalties)
Comment by mandeepj 1 day ago
You couldn’t do that with TurboTax or Block’s tax software? You don’t have to submit or pay.
Comment by MikeNotThePope 1 day ago
Comment by g947o 17 hours ago
Comment by slopinthebag 1 day ago
I imagine your accountant had the same reaction I do when an amateur shows me their vibe codebase.
Comment by whattheheckheck 1 day ago
Hope you don't get audited
Comment by semiquaver 1 day ago
Comment by stavros 1 day ago
* https://www.stavros.io/posts/i-made-a-voice-note-taker/ - A voice note recorder.
* https://github.com/skorokithakis/stavrobot - My secure AI personal assistant that's made my life admin massively easier.
* https://github.com/skorokithakis/macropad - A macropad.
* https://github.com/skorokithakis/sleight-of-hand - A clock that ticks seconds irregularly but is accurate for minutes.
* https://pine.town - A whimsical little massively multiplayer drawing town.
* https://encyclopedai.stavros.io - A fictional encyclopedia.
* https://justone.stavros.io - A web implementation of the board game Just One.
* https://www.themakery.cc - The website and newsletter for my maker community.
* https://theboard.stavros.io - A feature board that implements itself.
* https://github.com/skorokithakis/dracula - A blood test viewer.
* https://github.com/skorokithakis/support-email-bot - An email bot to answer common support queries for my users.
Maybe some of these will beat the rap.
Comment by profsummergig 1 day ago
Sounds like something that could be tried as a fix for a kind of OCD (obsessive seconds counting).
Comment by stavros 1 day ago
Comment by bencyoung 19 hours ago
Comment by observationist 1 day ago
Something like this would be anxiety-inducing for most people, I bet. That'd be an excellent experiment: track heart rate, EEG, and performance on a range of cognitive tasks with 2-minute breaks between tasks, with one group exposed to the irregular ticking, another to regular ticking, another to silence, and one last group to pleasant white noise.
Comment by pinkmuffinere 1 day ago
Comment by stavros 1 day ago
It's just the right amount of "did that clock just skip a beat? Nah must just be my imagination".
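One plausible way to get "irregular seconds, exact minutes" (an assumption about the approach, not a description of the actual project) is to jitter each tick and rescale so every group of 60 sums to exactly one minute:

```python
import random

def irregular_ticks(n=60, total=60.0, jitter=0.3, seed=None):
    """n tick intervals, each roughly 1s +/- jitter seconds,
    rescaled so they sum to exactly `total` seconds."""
    rng = random.Random(seed)
    raw = [1.0 + rng.uniform(-jitter, jitter) for _ in range(n)]
    scale = total / sum(raw)
    return [t * scale for t in raw]

ticks = irregular_ticks(seed=42)
print(min(ticks), max(ticks), sum(ticks))  # ticks vary; the minute is exact
```

The rescaling step is why no individual error accumulates: whatever the jitter did, the minute boundary always lands on time.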
Comment by pinkmuffinere 1 day ago
Comment by stavros 1 day ago
Comment by risyachka 1 day ago
And with AI, 99.9% of the results are abandonware. Just piles of code no one will ever touch again.
Which proves the point of no productivity gains. It's just cheap dopamine hits.
Comment by danso 1 day ago
Comment by sarchertech 1 day ago
That’s not even mentioning that this tool doesn’t do much beyond wrap a call to Claude. And it’s using Claude to display blood test data to the end user. This is not something I’d trust an LLM not to mess up. You’d really want to double-check every single result.
Comment by dsf2 1 day ago
We hate feeling like we have to double-check everything. We have an asymmetric relationship with gains and losses, etc.
Is it me, or is this stuff flying over people's heads?
Comment by slopinthebag 1 day ago
Comment by simmerup 1 day ago
Comment by tempaccount5050 1 day ago
Comment by incr_me 1 day ago
Comment by dsf2 1 day ago
Steve Jobs once said that the belief that an idea is 90% of the work is a disease. He was, and still is, absolutely right.
Comment by sarchertech 1 day ago
Comment by grim_io 1 day ago
Constant enshittification and UI redesigns are driven by the provider to justify monthly extortion.
Comment by saulpw 1 day ago
And even for the ones that might "beat the rap", I don't understand from your descriptions why they are interesting or unique. A voice note recorder? Cool. There are already hundreds if not thousands of those, why did you need to make your own in the first place? I'm not saying that yours isn't special, I'm just saying that it doesn't help to post the blandest description possible if you're trying to impress people with the utility of your utility.
Comment by senko 1 day ago
Seems like the bar now is that it has to be a mass-market product. On another post, someone commented that a SaaS doesn't count if it doesn't earn sustainable revenue.
I guess OpenClaw also doesn't count because we don't know how much Peter got from OpenAI.
This is an ideological flame war, not a rational discussion. There's no convincing anyone.
Comment by timacles 13 hours ago
Yeah, they're interesting, and I guess they do something, but are any of them actually delivering value? That gets you into the argument of what value is and to whom, but as for AI's role in generating productivity for society, it's pretty disputable whether every person being able to build their own train set that turns on the toaster and makes coffee will move us forward as a species the way, say, the internet did.
That's really the only argument: is the use of LLMs worth the trillions of dollars, and worth selling out the future of humanity for? Not whether it's fun to build quirky apps really fast.
Comment by munksbeer 1 day ago
The bar for this will just keep moving. Some people are heavily invested in the anti-stance, so human nature being what it is, you've little hope of changing their minds anyway.
Comment by saulpw 1 day ago
For example, I checked out their "Fictional Encyclopedia". It's an absolutely terrible project, much worse than useless, because it claims to be an "encyclopedia" right in the name (the tagline is "Everything about everything"), yet it's engineered to just completely make things up, and nowhere on the page does it indicate this! I looked up my own niche open-source project, and was prepared to be at least somewhat impressed that it pulled together facts on the fly into an encyclopedic form. For the first couple of paragraphs that seemed like it might be the case, then it veered into complete fantasy and just kept going.
Like, what is the point of this? I can already ask a chatbot the same question, and at least then I have explicit indicators that it might be hallucinating. But this page deliberately confuses fiction and reality for absolutely zero purpose. It's a waste of brain cells, for both the creator and the consumer, with no redeeming value. It's neither interesting, nor different, nor valuable. AND it's burning tokens to boot!
I mean, come on, the bar is not that high. Some of stavros' projects may even be over it. But the first projects I checked were sub-basement, and I am not interested in searching through mounds of trash for what might be a quarter dollar. I'm actually kind of disappointed that stavros didn't have (or apply) the sense or taste to whittle down that list of 11 (!) projects to some 3 that show off the value of their work. Which I'm starting to understand is everyone's issue with AI brain rot; it seems to just encourage "here's everything, I dunno, you figure it out" which is maddening and deserves the pushback it gets.
Comment by Grimblewald 17 hours ago
Comment by stavros 1 day ago
Comment by saulpw 1 day ago
Moreover, though, I'm not even saying you shouldn't do those things. I'm actually playing around with AI quite a bit, and have certainly created my share of useless "productivity" tools. But it's not a flex to show off your own Flappy Bird or OpenNanoClaw clone, even if they are written in COBOL or MUMPS.
And they definitely do not have to be "extremely useful". But they should answer the question: what problem does it solve?
Comment by jjee 1 day ago
And it’s exactly what I expected: lines of code. Cute. But… so what? This is not good for the AI hype, nor for any continued support for future investment.
On the other hand, all this stuff is going to drive continual innovation. The more tokens generated, the more model producers invest. And we might eventually get to a place of local models.
Comment by stavros 1 day ago
Comment by tuesdaynight 1 day ago
Comment by stavros 1 day ago
Thanks for the support!
Comment by slopinthebag 1 day ago
Comment by munksbeer 1 day ago
The thing that triggers people is comments like yours still, even at this point, claiming that AI just produces slop and everyone is just lying.
It is absurd, and people are obviously going to react to it.
Comment by slopinthebag 1 day ago
If by "react" you mean make stuff up, sure.
Comment by Grimblewald 17 hours ago
Comment by lukan 1 day ago
I get the sentiment, but this is natural with a groundbreaking new technology. We are still in the process of figuring out how to best apply generative LLMs in a productive way. Lots of people tinker and share their results. Most of it is surely hype and will get thrown away and forgotten soon, but some of it is solid. And I am glad for it: I did not take part in that tinkering, but I now enjoy the results, as the agents have become really good.
Comment by harry8 1 day ago
This is exactly the same reason why the appropriate question to ask about Haskell is "where are the open source projects that are useful for something that is not programming?"
The answer for Haskell after 3 decades is very, very little. Pandoc, git-annex, xmonad. There might be something else since I last did the exercise, but for Haskell the answer is not much. Then we examine why the kids (us kids of all ages) can't or don't write Haskell programs.
The answer for LLM coding may be very different. But the question "where is the software that does something that solves a problem outside its own orbit" is crucial. (You have a problem. You want to use foo to solve it, now you have two problems but you can use foo to solve a part of the second one!!)
The price of getting code written just went down. Where are the site/business launches? Apps? New ideas being built? Specifically. With links. Not general, hand-wavy "these are the sorts of things that ..." because even if it's superb analysis, without some data that can be checked it's indistinguishable from hype.
Whatever data we get will be very informative.
Comment by lukan 1 day ago
I looked into doing it manually, but gave up. Way too much dirty work, and I had no energy for that.
Then I discovered that the Claude CLI had gotten good, and told it to do it (with some handholding).
And it did it. Build process modernized. No more outdated dependencies. Then I added some features I missed in the original Wick editor. Again, it did it, and it works.
A working editor that was abandoned and missing features is now working again, with those features added. With minimal work from my side (but I did put in work beforehand to understand the source).
I call this a very useful result. There are lots of abandoned, half-working projects out there. Lots of value to be recovered. Unlike Haskell, the agents are not just busy building agents, but real tools. Currently I have the agents refactoring an old codebase of mine. Lots of tech debt. Lots of hacks. Bad documentation. There are features I wanted to implement for ages but never did, because I did not want to touch that ugly code again. But Claude did it. It is almost scary what they are already capable of.
Comment by shepherdjerred 1 day ago
At work, I would say I've done plenty of "useful" things with AI, but that's hard to show off given that I work on an internal application.
Comment by peteforde 1 day ago
Quite simply, I don't think that they are asking or arguing in good faith.
Comment by SunshineTheCat 1 day ago
I chuckle when I see some of them because you could achieve the same (or often faster) result by jotting a note onto a notecard and sticking it in your pocket.
Most of the other automations running don't really seem to serve any real purpose at all.
But hey, if it's fun, have at it.
Comment by gopher_space 1 day ago
Comment by bronlund 1 day ago
Comment by ssrshh 1 day ago
Comment by bronlund 1 day ago
Comment by simmerup 1 day ago
Comment by archagon 1 day ago
Comment by bronlund 1 day ago
Comment by archagon 1 day ago
Comment by bronlund 1 day ago
If you want to be passive-aggressive without AI, that just leaves more tokens for the rest of us ;)
Comment by slopinthebag 1 day ago
These AI tools are not hard to learn, in fact they're super easy when you have some experience programming, so the only people who are going to be left behind are the ones who simply refuse to use the tools out of principle. And why would they care about being "left behind"? They're making a conscious choice to not use the tools. They want to be left behind!
And not everyone who is skeptical is skeptical out of principle; some just don't see the value yet, or are slowly and cautiously adopting it into their workflow. If AI-powered coding ends up being even half as good as promised, so good that denying the evidence is impossible, they can just start using it and catch right up. So who exactly is "being left behind" here? It's complete nonsense, while simultaneously being extremely condescending, and I get triggered every time I read the phrase.
I don't mean anything against you personally with my ranting, it's more a general observation. Perhaps you and some others do mean it as a genuine bit of advice, like "hey, you should learn these tools or else you might struggle to find work in the future", but the sense I get most of the time are people who are gleeful that the non-believers are soon to be homeless or whatever.
Comment by bronlund 17 hours ago
We know books have been used for both good and evil, but I still think books are a wonderful invention. Same with television or the internet: the quality of the content doesn't take anything away from the fact that the technology itself is absolutely amazing. In hindsight (it has been a minute since Gutenberg), how society has adopted the written word does have real implications for the people living in it, though. If you can't read today, you will struggle. Not because there is anything wrong with you, but because the system more or less takes it for granted that you can.
And it is going to be the same with AI, but even more so. The ones who learn to master it will dominate those who don't. It will create a new form of class divide, where access to tokens, and knowing how to use them, are the main drivers. AI is still in its early stages, and not everything about it is alright: take for instance the economics around it, or its environmental impact. But still, I do believe it is an amazing invention, and that if you do not embrace it, you are missing out.
Comment by slopinthebag 12 hours ago
Maybe don't say it online if you wouldn't say it offline.
Comment by bronlund 11 hours ago
Comment by slopinthebag 11 hours ago
Comment by bronlund 10 hours ago
Comment by adampunk 14 hours ago
I think people are talking like this because we have not lived through a genuine computing revolution like this since probably the introduction of the microcomputer. It's been more than 40 years.
I get that people are mad about this. That's real obvious when you comment in any way about the use of AI. You get told that you're a robot, you get told that you're not a real engineer, you get told that you're insecure, you get told all kinds of things. So it's super clear that people are upset, because they're being fucking childish about it. Even on a post like this one, where the author tries hard to be pretty nice, we see the same sneering comments about training your own replacement and shit like that. It's not subtle.
Where I get off the train is concluding that because they’re upset that they don’t need to be told what’s happening. All of computing is already changing. It’s already happening. It’s like if the sun winked out right now we would discover it in eight minutes, but the event has already happened. We are merely outside the cone of visibility. This shit is all happening right now. It is all real. I think it does a disservice to people to pretend as though it’s not.
Comment by slopinthebag 12 hours ago
Comment by bronlund 11 hours ago
People are bitching now about how AI has ruined coding, not fully grasping that for most people, there will be no code, no applications, no operating systems. AI will pretend to be all of that, and do it way better. A six-year-old will be able to "out-code" all of us.
This is half a year ago: https://www.youtube.com/watch?v=dGiqrsv530Y
Comment by archagon 11 hours ago
Comment by adampunk 10 hours ago
Am I supposed to talk to you like this? Should I do some psychoanalysis here? If you wanna say I’m pantomimed machinery, then I think we may need to have a discussion.
Because here’s what it looks like to me: I think there’s a lot of people who arguably had a pretty good handle on how their corner of computing worked. They can understand a pretty deep dive into the stack they use, and where they have to deposit something into the intellectual hinterlands, it can safely be abstracted away on dependable, engineered machines or standards. That is no mean feat; lots of people cannot say that. The fact that someone who does say it doesn’t fully understand paging or floating point arithmetic is not a sign that something is wrong, but rather that we have succeeded in big shared engineering problems. Cool.
Some new shit is afoot. We are entering a new, turbulent, uncertain era of computing. A lot of people who previously had a pretty confident grasp of both the core and the frontier of their work now do not understand what is driving the frontier. They have made the fact that they do not understand this everyone else's problem. Rather than admit that they do not understand an area they used to understand, we are subjected to an incessant, infantile progression through what I hope are stages of grief. Because at least then it might come to an end.
Everyone has an explanation for why this is all gonna collapse tomorrow and why they don’t need to learn about it. Everyone has a smart remark about the use of AI for some very important moral reason which also means they don’t need to learn about it. They both add up to the same thing which might just be healthier if treated as a true admission of ignorance.
Comment by adampunk 9 hours ago
Some of the money thrown around is by companies that want to lock in some kind of interdependency, because the only downside they can conceive of is that Oracle or Dell or whoever invents AGI and they get left behind. So huge circular deals are getting made, which increases the correlation coefficient for any collapse to 1, lmao. These deals are getting made for reasons that feel like 2004-2005 American real estate, where the downside risk of a national portfolio of mortgages was actually (not joking) taught in textbooks to be 0. So naturally, if you're maximizing revenue by making things interdependent, you really only consider the upside.
All these forward-looking energy contracts and local generation of energy are signs that the market is under strain, more than signs that we expect increasingly exponential future use. Giant companies think they're locking counterparties into the right risk structure (here with an energy company somehow willing to forsake the infinite future energy price it could charge by just waiting), but really that energy company is perfectly happy to accept some money to start a project that will generate revenue long before electricity. That energy company has an idea of its own risk and revenue profile, and it can extract money from a bubble, too.
I give us... months? Maybe 18 months, probably less, before things get really nasty and messy for the firms who thought they were buying a golden ticket. It then gets messy for users (if not sooner), who are right now being subsidized nicely. Not the ludicrous 5k being thrown around recently, but compute is at a subsidy right now, so long as you want to rent it or can run a model like Qwen locally. Lots of other people are paying for that subsidy while extraordinary amounts of money flow from one part of computing to another. That's already having weird consequences, as firms who spent money on what is essentially rental compute push their employees to use more of it in order to keep the person who made the contract safe. More companies will go the Microsoft route (no, I'm not talking about Copilot!) and try to push tasks into their internal pipelines like MSFT does with Azure, where with e.g. GitHub, what GitHub needed to do took a back seat (literally, haha) to integration with the compute pipeline. That's good-bad-whatever, depending on how you want to think about it. But it's certainly disruptive, and I think right now a lot of genAI is doing that kind of disrupting, where people and orgs are being forced through money-shaped holes.
I don't know what happens when the music stops. I just know that it is playing.
Comment by bronlund 8 hours ago
And I do think you are underestimating just how much money these guys can print, if some event is disrupting the machine. If this thing really goes down, it will be by design and because they got the Thing 2.0 ready to go.
Comment by adampunk 7 hours ago
What I am saying, and what I think a lot of folks who are trying to get this point across are saying, is that this will be critical infrastructure sooner than 20 years from now. The right frame of mind is to look at this the way we look at the Internet circa the 1980s or the 1970s. This is big, messy, and experimental right now. We are in the middle of rebuilding computing on foundation models. That is happening at an enormous subsidy for the time being.
Comment by lagrange77 1 day ago
Comment by bronlund 1 day ago
I try to explain stuff to my kids, to the best of my ability, but give them room to draw their own conclusions. As an old fart, I have to acknowledge that there is a limit to how relevant my world will be to them.
Change is scary and not always for the better, but in my humble opinion; we have nothing to lose and everything to gain.
I, For One, Welcome Our New AI Overlords :]
Comment by piker 1 day ago
I'm starting to believe using them is more likely to make you obsolete than not.
Comment by vermilingua 1 day ago
Comment by jjee 1 day ago
From where I stand this thing is going to provide great leverage to those who don’t simply just write code. I personally doubt the thing will ever get to a place where it can be trusted to operate alone - it needs a team of people and to go super fast you need more people.
Moreover, the price won’t be high due to competition.
I’ve changed my view on LLMs as being good, as long as competition is fierce.
Comment by alas44 1 day ago
Comment by shepherdjerred 1 day ago
Comment by jjee 1 day ago
Comment by shepherdjerred 1 day ago
I do think it'll be a while before LLMs make significant contributions to complex projects, though. For example I can't imagine many maintainers of the Linux kernel use LLMs much.
Comment by piker 1 day ago
I believe your skills are atrophying when you use these things, no matter how trivial the case. That compounds with their bias towards solving problems by producing more code, which further reduces your productivity without them.
Comment by shepherdjerred 1 day ago
I do agree with you to some extent. I think anyone who uses LLMs will need to set aside some time writing code by hand to keep their skills sharp.
Comment by max_streese 1 day ago
Comment by keybored 1 day ago
Comment by smokel 1 day ago
What I find interesting is that I have little motivation to open source it. Making it usable for others requires a substantial amount of time, which would otherwise be just a fraction of the development time.
Comment by xorvoid 1 day ago
Comment by smokel 21 hours ago
I think I've spent ~20 hours and a couple of $100 of Claude Opus tokens in Cursor. So it's not cheaper or easier, but the amount of frustration saved with having proper Emacs keybindings might delay catastrophic global warming by a few days.
Oh, and of course I'm not compatible with all the Obsidian extensions, nor do I have proper hosting for server-side sync yet. All in all, a fool's errand, but I'm having fun.
Comment by xorvoid 12 hours ago
Comment by smokel 8 hours ago
A major advantage of (certain) extension mechanisms is that you can update them in real-time. For example, in Emacs you can change functions without losing the current state of the application. In Processing or live coding environments, you can even update functions that affect real-time animation or audio.
Another advantage is that they can expose a very nice API that allows other people to learn an abstraction of the core application. If you are the sole developer, and if you can spend the time to keep an active memory of the core application, this does not help much. But it can certainly help others build upon your foundation. Gimp and Emacs are great examples of this.
A disadvantage is that you have to keep supporting the extension mechanism, or extensions will break. That makes an ecosystem somewhat slower to adapt. Emacs is the prime example here. We're still stuck with single-threaded text mode :)
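The "change functions without losing state" property is easy to demonstrate in any late-bound language; a toy Python sketch of the idea (Python's late binding of global names stands in for Emacs's live function redefinition):

```python
# A long-lived "application" holds state; behavior lives in a global name.
state = {"count": 0}

def step():
    state["count"] += 1

def run(n):
    for _ in range(n):
        step()  # looked up globally on every call, so it can be swapped live

run(3)

def step():  # "live update": redefine the behavior mid-run
    state["count"] += 10

run(2)
print(state["count"])  # 3 + 20 = 23: accumulated state survived the redefinition
```

Emacs takes this further by letting you re-evaluate a `defun` inside a running editor, but the mechanism is the same: behavior is resolved by name at call time, so state outlives code.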
Comment by bityard 1 day ago
(I actually did write my own note-taking application, but that was before LLMs were any good at writing code.)
Comment by archagon 1 day ago
Comment by TheAceOfHearts 1 day ago
Comment by archagon 1 day ago
Comment by smokel 21 hours ago
As stated in my first comment, Obsidian does not support Emacs keybindings properly, nor is it open source. Writing an extension to add Emacs keybindings is not at all trivial, because you have to work around a lot of existing and undocumented functionality.
There are other reasons for not vibe coding your own alternative, but as LLMs keep progressing, these reasons may become less relevant.
Comment by ipaddr 1 day ago
What do people think of it?
I personally don't think that's a badge of honor. Aside from losing your coding skills, you miss opportunities to take AI-generated pieces and connect them to existing systems that can't be fed into the AI. Plus, making small changes yourself is easier than having the AI make them without messing something else up.
Comment by Maxatar 1 day ago
I prefer having Claude make even small changes at this point, since every change teaches it a bit more about my coding conventions, standards, interpretations, etc. It does pick up on these little changes and commits them to memory, so that in the long run you end up not having to make any little changes at all.
And to drive this point further, even prior to using LLMs, if I review someone's work and see even a single typo or something minor that I could probably just fix in a second, I still insist that the author is the one to fix it. It's something my mentor at Google did with me which at the time I kind of felt was a bit annoying, but I've come to understand their reason for it and appreciate it.
Comment by sarchertech 1 day ago
Comment by Maxatar 1 day ago
The second thing Claude Code does is when it reaches the end of its context window it /compact the session, which takes a summary of the current session, dumps it into a file, and then starts a new session with that summary. But it also retains logs of all the previous sessions that it can use and search through.
Looking over my session of Claude Code, out of the 256k tokens available, about 50k are used for "memory" and session summaries, and 200k tokens are available to work with. The reality is that the vast majority of tokens Claude Code uses are for its own internal reasoning, as opposed to being "front-end" facing, so to speak.
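The compaction flow described here can be sketched as a simple loop. All of the names below and the 90% threshold are hypothetical stand-ins, not Claude Code's actual internals:

```python
# Sketch of the /compact idea: when a session nears the context limit,
# archive the full transcript, then replace it with a short summary.
# CONTEXT_LIMIT matches the 256k figure above; everything else is assumed.
CONTEXT_LIMIT = 256_000
COMPACT_AT = 0.9  # compact when the window is ~90% full (assumed)

def maybe_compact(messages, count_tokens, summarize, archive):
    used = sum(count_tokens(m) for m in messages)
    if used < CONTEXT_LIMIT * COMPACT_AT:
        return messages                  # plenty of room, keep going
    archive(messages)                    # keep searchable logs of the old session
    summary = summarize(messages)        # condense the session into a few k tokens
    return [{"role": "system",
             "content": f"Summary of prior session:\n{summary}"}]
```

A real agent would also fold in its persistent memory files when starting the fresh session; this only shows the summarize-and-restart step.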
Additionally given that ChatGPT Codex just increased its context length from 256k to 1 million tokens, I expect Anthropic will release an update within a month or so to catch up with their own 1 million token model.
Comment by sarchertech 1 day ago
1. The closer the context gets to full the worse it performs.
2. The more context it has the less it weights individual items.
That is, Claude might learn you hate long functions and add a line about short functions. When that is the only thing in the context, it is likely to follow it very closely. But when it's one piece of a much longer context, it is much more likely to ignore it.
3. Tokens cost money, even if you are currently being subsidized.
4. You have no idea how new models and new system prompts will perform with your current memory.md file.
5. Unlike learning something yourself, anything you teach Claude is likely to start being controlled by your employer. They might not let you take it with you when you go.
Comment by shepherdjerred 1 day ago
keep in mind that those 50k memory tokens would likely be cached after the first run and thus significantly cheaper
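A back-of-envelope on what that saves, using illustrative per-token prices (the rates and the ~10x cached-read discount are assumptions, not Anthropic's actual pricing):

```python
# Cost of re-reading 50k "memory" tokens on every turn, with and without
# prompt caching. Both $/Mtok rates below are illustrative assumptions.
INPUT_PER_MTOK = 3.00    # $ per million uncached input tokens (assumed)
CACHED_PER_MTOK = 0.30   # $ per million cached-read tokens (assumed ~10x cheaper)

def memory_cost(turns, memory_tokens=50_000, cached=True):
    rate = CACHED_PER_MTOK if cached else INPUT_PER_MTOK
    return turns * memory_tokens * rate / 1_000_000

# For a 100-turn session:
#   uncached: 100 * 50k tokens at $3/Mtok   = $15.00
#   cached:   100 * 50k tokens at $0.30/Mtok = $1.50
```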
Comment by sarchertech 1 day ago
Comment by shepherdjerred 23 hours ago
Comment by sarchertech 16 hours ago
Comment by CyberDildonics 9 hours ago
It seems like people who concede control to an AI are mostly people who didn't feel in control of it in the first place while keeping every detail intentional is no longer a priority.
Comment by aerhardt 1 day ago
Comment by stavros 1 day ago
Comment by shepherdjerred 1 day ago
It still has gaps. I don't think they've landed on the right model for CI. Like Earthly, their model is a CI runner + local cache. I believe a distributed cache (like Bazel) makes more sense.
If I were choosing between the two I'd personally always pick Dagger, but I think there is a strong argument for Earthly for simpler projects. If you're using multiple Earthfiles or a few hundred lines of Earthly, I think you've outgrown it.
Comment by stavros 1 day ago
Comment by JeanMarcS 1 day ago
Comment by shepherdjerred 1 day ago
Comment by vunderba 1 day ago
*Piece Together*
An animated puzzle game that I built with a fairly heavy reliance on agentic coding, especially for scaffolding. I did have to jump in and tweak some things manually (the piece-matching algorithm, responsive design, etc.), but overall I’d estimate that LLMs handled about 80% of the work. It's heavily based on the concept of animated puzzles in the early edutainment game The Island of Dr. Brain.
https://animated-puzzles.specr.net
*Lend Me Your Ears*
Lend Me Your Ears is an interactive web-based game inspired by the classic Simon toy (originally by Milton Bradley). It presents players with a sequence of musical notes and challenges them to reproduce the sequence using either an on-screen piano, MIDI keyboard, or an acoustic instrument such as a guitar.
https://lend-me-your-ears.specr.net
*Shâh Kur - Invisible Chess*
A voice-controlled blindfold chess game that uses novel approaches (last N pieces moved hidden, fade over time, etc). I've already been playing it daily on my walks.
*Word game to find the common word*
It's based off an old word game where one person tries to come up with three words: sign, watch, bus. The other person has to think of a common word that forms compound-style words with each of them: stop.
I was quite surprised to see that this didn't exist online already.
https://common-thread.specr.net
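The core check behind a game like this is small enough to sketch. The toy dictionary and the compound-joining rule below are assumptions for illustration, not how the linked game is implemented:

```python
# Toy check for the common-word game: does a candidate form a known
# compound with each of the three cue words, in either order?
# The dictionary here is a tiny assumed stand-in for a real word list.
COMPOUND_WORDS = {"stopsign", "stopwatch", "busstop"}

def is_common_word(candidate, cues, dictionary=COMPOUND_WORDS):
    # Accept the candidate if every cue pairs with it in some order
    # (spaces and hyphens in real compounds are ignored in this toy version).
    return all(
        cue + candidate in dictionary or candidate + cue in dictionary
        for cue in cues
    )
```

With the sign/watch/bus example, "stop" passes because "stop sign", "stopwatch", and "bus stop" are all compounds.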
*A Slide Puzzle*
Slide puzzles for qualified MENSA members. I built it for a friend who's basically a real-life equivalent of Dustin Hoffman's character from Rain Man. So you might have to rearrange a slide puzzle from the periodic table of elements, or the U.S. presidents by portrait, etc.
https://slide-puzzles.specr.net
*Glyphshift*
Transforms random words on web pages into different writing systems like Hiragana, Braille, and Morse Code to help you learn and practice reading these alphabets, so you can practice the most functionally pointless tasks, like being able to read Braille visually.
https://github.com/scpedicini/glyph-shift
All of these were built with varying levels of assistance from agentic coding. None of them were purely vibe-coded and there was a great deal of manual and unit testing to verify functionality as it was built.
Comment by fmbb 1 day ago
It also seems like none of them are particularly novel, and all of them have been done before.
Comment by vunderba 1 day ago
Simon toy that's integrated into an ear training tool?
Blindfold chess with Last N moves hidden?
Mensa-style slide puzzles?
An extension that converts random words into phonetic equivalents like morse, braille, and vorticon?
I've also made some way less useful stuff, like a win32 app that lets you physically grab a window and hurl it, which sends a WM_DESTROY once it's completely off the screen.
And an app that measures low frequencies to tell if you are blowing into the mic and then increases the speed of the CPU fan to cool it down.
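A toy version of that detection: blowing into a mic dumps most of its energy below a couple hundred Hz, so you can compare low-band energy against total energy. The cutoff and threshold here are guesses, and a real app would read frames from the audio device rather than take a plain sample list:

```python
# Toy "are you blowing into the mic?" check using a plain DFT over one
# window of samples. Cutoff (200 Hz) and threshold (0.8) are assumptions.
import math

def low_freq_ratio(samples, sample_rate, cutoff_hz=200):
    n = len(samples)
    total = low = 0.0
    for k in range(1, n // 2):  # skip DC, go up to Nyquist
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        power = re * re + im * im
        total += power
        if k * sample_rate / n < cutoff_hz:
            low += power
    return low / total if total else 0.0

def is_blowing(samples, sample_rate, threshold=0.8):
    # Mostly low-frequency energy -> probably breath noise, not speech.
    return low_freq_ratio(samples, sample_rate) > threshold
```

The O(n²) DFT is fine for a short window; a real implementation would use an FFT.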
Comment by lowsong 1 day ago
This is a narrow view of software engineering. Thinking that your role is "code that works" is hardly better than thinking you're a "(human) resource that produces code". Your job is to provide value. You do that by building knowledge, not only of the system you're developing but of the problem space you're exploring, the customers you're serving, the innovations you can do that your competitors can't.
It's like saying that a soccer player's purpose is "to kick a ball" and therefore a machine that launches balls faster and further than any human will replace all soccer players, and soon all professional teams will be made up of robots.
Comment by saint-evan 1 day ago
Comment by lowsong 1 day ago
Businesses wish this were the case, and many will even say it or start to believe it. But it doesn't bear out to be true in practice.
Think about it this way, engineers are expensive so a company is going to want to have as few of them as possible to do as much work as possible. Long before LLMs came along there have been many rounds of "replace expensive engineers" fads.
Visual programming was going to destroy the industry, where any idiot could drag and drop a few boxes and put together software. Turns out that didn't work out and now visual programming is all but dead. Then we had consultants and software consultancies. Why keep engineers on staff and have to deal with benefits and HR functions when you can hire consultants for just long enough to get the job done and end their contracts. Then we had offshoring. Why hire expensive developers in markets like California when you can hire far cheaper engineers abroad in a country with lower wages and laxer employment law. (It's not a quality thing either, many of these engineers are unquestionably excellent.)
Or, think about what happens when software companies get acquired. It's almost unheard of for the acquiring company to layoff all of the engineering staff from the acquired company right away, if anything it's the opposite with vesting incentives to convince engineers to stay.
If all that mattered was the code and the systems, and people were cogs that produced code that businesses wanted to optimise, then none of these actions make sense. You'd see companies offshore and use consultants with the company that does "good enough" as cheaply as possible. You'd see engineers from acquisitions be laid off immediately, replaced with cheaper staff as fast as possible.
There are businesses that operate like this; it happens all the time. But all of the most successful and profitable tech companies in the world don't do this. Why?
Comment by saint-evan 1 day ago
No, no... Of course the code isn't all that matters. My framing was about how organizations model the work SWEs do economically.
>Visual programming was going to destroy the industry, where any idiot could drag and drop a few boxes and put together software. Turns out that didn't work out and now visual programming is all but dead. Then we had consultants and software consultancies. Why keep engineers on staff and have to deal with benefits and HR functions when you can hire consultants for just long enough to get the job done and end their contracts. Then we had offshoring. Why hire expensive developers in markets like California when you can hire far cheaper engineers abroad in a country with lower wages and laxer employment law. (It's not a quality thing either, many of these engineers are unquestionably excellent.)
It seems like we're agreeing along the same tangent. With this argument, you're admitting that businesses do see SWEs as cogs in a wheel and seasonally try to replace them... The seasonality of 'make the engineer replaceable' fads really does point to businesses trying to simplify what devs actually do, since most of what they measure is working code output, because it's a tangible artifact (this is what the OP meant by implying you're a working-code producer at work). Knowledge, judgment, architectural intuition, and domain understanding are harder to quantify, so they disappear from the model even though they ARE the real constraint. So for the record, I do agree with you that code isn't everything, but I maintain that SWEs are modelled based on working code produced, even in the more successful companies that invest in domain knowledge and long-term system understanding.
Metrics, performance reviews, sprint velocity, delivery timelines all orbit around observable artifacts because those are what management systems can actually track objectively and equitably. It's a handy abstraction, just like looking only at the ins/outs of a logic gate as opposed to looking at the implementation and wiring. Of course, a NOT gate would get upset over being called a 'bit flipper'; it's not all that physically exists, but from our POV it doesn't exactly matter. This applies to human labor too, even if it's a leaky abstraction.
Comment by lowsong 11 hours ago
Not quite. I agree that companies will try to do this, but every company that has tried to treat engineering staff as replaceable units of person-hours has failed.
> Metrics, performance reviews, sprint velocity, delivery timelines, all orbit around observable artifacts because those are what management systems can actually track objectively and equitably. It's a handy abstraction just like looking only at the ins/outs of a logic gate as opposed to looking at the implementation and wiring.
Yes, and these metrics are, usually, worthless.
It's not that companies and managers will not try to replace engineers with AI. I'm sure they will. I'm sure many will be laid off because "AI does it cheaper now".
My point is that companies that have gone down this route in the past have failed, and AI is no different. Companies that lean strongly into AI as a workforce replacement will fail too.
Comment by slopinthebag 1 day ago
Code doesn't need to be "beautiful", but the beauty of code has nothing to do with maintainability. Linus once said "Bad programmers worry about the code. Good programmers worry about data structures and their relationships." The actual hard part of software is not the code, it's what isn't in the code: the assumptions, relationships, feedback loops, emergent behaviours, etc. Maintainability in that regard is about system design. Imagine software as a graph, the nodes being pieces of code and the edges being those implicit relationships. LLMs are good at generating the nodes but useless at the edges.
The only thing that seems to work is to have validation criteria (e.g. a test suite) that the LLM can use to do a guided random walk towards a solution where the edges and nodes align to satisfy the criteria. This can be useful if what you are doing doesn't really matter, like in the case of all the pet projects and tools people share. But it does matter if your program assumes responsibility somewhere, like if you're handling user data. This idea of guardrail-style programming has been around for a while, but nobody drives by bouncing off the guardrails to get to their destination, because it's much more efficient to encode what a program should do instead of what it shouldn't, which is the case with this type of mega-test-driven development. Is it more efficient to tell someone where not to go when giving directions, as opposed to telling them how to get there?
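That guided random walk can be sketched as a generate-and-check loop, with `generate` and `run_tests` standing in for an LLM call and a real test runner:

```python
# Sketch of guardrail-style generation: keep sampling candidates and
# accept the first one that passes the validation suite. `generate` and
# `run_tests` are hypothetical stand-ins, not a real agent framework.
def guided_walk(generate, run_tests, max_attempts=10):
    feedback = None
    for _ in range(max_attempts):
        candidate = generate(feedback)       # LLM proposes code (the "nodes")
        ok, feedback = run_tests(candidate)  # the guardrail: failures feed back in
        if ok:
            return candidate                 # passes the suite, but only the
                                             # properties the suite encodes are checked
    return None                              # walk never reached the guardrails' goal
```

Note the limit the comment points at: a returned candidate is only as correct as whatever the test suite happens to encode.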
Take the Cloudflare Next.js experiment for example - their version passed all the Next.js tests but still had issues because the test suite didn't even come close to encoding how the system works.
So no, you still need to care about maintainability. You don't need to obsess over code aesthetics or design patterns or whatever, but you never needed to do that. In fact, more than ever programmers need to be concerned with the edges of their software and how they can guide the LLM's to generate the nodes (code) while maintaining the invariants of the edges.
Comment by sarchertech 1 day ago
Similarly to your directions analogy, I've been using the analogy of trying to ensure that a 1000-restaurant franchise produces the exact same peanut butter sandwich for every customer.
It’s much easier to figure out the primitives that your employees understand and then use those primitives to describe exactly how to build a sandwich than it is to write a massive specification that describes what they should produce and just let them figure it out.