Average is all you need
Posted by AlexC04 4 days ago
Comments
Comment by jihadjihad 21 hours ago
But nobody bothered to check if it was correct. It might seem correct, but I've been burned by queries exactly like these many, many times. What can often happen is that you end up with multiplied rows, and the answer isn't "let's just add a DISTINCT somewhere".
The answer is to look at the base table and the joins. You're joining customers to two (implied) one-to-many tables, charges and email_events. If there are multiple charges rows per customer, or an email can match multiple email_events rows, it can lead to a Cartesian multiplication of the rows since any combination of matches from the base table to the joined tables will be included.
If that's the case, the transactions and revenue values are likely to be inflated, and therefore the pretty pictures you passed along to your boss are wrong.
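The row multiplication described above can be reproduced in a few lines. This is only a sketch: the table names follow the comment (customers, charges, email_events), but the columns and sample rows are assumptions made up for illustration, not the schema from the original post.

```python
import sqlite3

# In-memory database with a hypothetical schema (columns are assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE charges (customer_id INTEGER, amount INTEGER);
    CREATE TABLE email_events (email TEXT, event TEXT);

    INSERT INTO customers VALUES (1, 'a@example.com');
    -- one customer with two charges totalling 150
    INSERT INTO charges VALUES (1, 100), (1, 50);
    -- the same customer matches three email events
    INSERT INTO email_events VALUES
        ('a@example.com', 'sent'),
        ('a@example.com', 'opened'),
        ('a@example.com', 'clicked');
""")

# Naive version: both one-to-many joins hang off customers, so each charge
# row is repeated once per matching email event (2 charges x 3 events).
naive_revenue = conn.execute("""
    SELECT SUM(ch.amount)
    FROM customers c
    JOIN charges ch ON ch.customer_id = c.id
    JOIN email_events e ON e.email = c.email
""").fetchone()[0]

# Safer version: aggregate charges to one row per customer first, and test
# for email activity with EXISTS so it cannot multiply rows.
safe_revenue = conn.execute("""
    SELECT SUM(ch.revenue)
    FROM customers c
    JOIN (SELECT customer_id, SUM(amount) AS revenue
          FROM charges
          GROUP BY customer_id) ch ON ch.customer_id = c.id
    WHERE EXISTS (SELECT 1 FROM email_events e WHERE e.email = c.email)
""").fetchone()[0]

print(naive_revenue)  # 450
print(safe_revenue)   # 150
```

The naive query reports three times the real revenue. Swapping in `SUM(DISTINCT ch.amount)` would happen to give 150 on this data, but it would silently drop any two legitimate charges of the same amount, which is why DISTINCT is not the fix.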
Further reading, and a terrific resource:
https://kb.databasedesignbook.com/posts/sql-joins/#understan...
Comment by chairmansteve 13 hours ago
Comment by bluefirebrand 11 hours ago
I will never understand Engineers who struggle with SQL lookups. The vast majority of queries are extremely basic set theory
Comment by krackers 9 hours ago
Comment by KronisLV 4 hours ago
As someone who's seen queries that are hundreds of lines long, involve a bunch of CTEs, nested SELECTs as well, upwards of a dozen joined tables with OTLT and EAV patterns all over the place (especially the kind of polymorphic links where you get "type" not "table_name" so you also need to look at the app code to understand it), I'd say that SQL can be too hard for people to reason about well.
Bonus points for having to manually keep like 5 Oracle package contents in your working memory cause that's where the other devs on the 10 year old project stored some of the logic, while the remainder is sort-of-dynamic codegen in the app.
Same as with most app code, it shouldn't be like that, but you sometimes get stuff that is really badly developed, and the cognitive load (from both inherent and accidental complexity) increases until people just miss things and no longer have the full picture.
Comment by mattmanser 13 hours ago
I can write that script faster than I can write the text asking the AI to write the script as SQL is concise and my IDE has auto-complete.
Comment by DeepDuh 2 hours ago
Comment by Axel2Sikov 18 hours ago
Comment by paulryanrogers 17 hours ago
Comment by bluefirebrand 11 hours ago
Comment by Axel2Sikov 16 hours ago
I do not sell a wrapper on top of some LLM; you can absolutely write your SQL directly. There is an engine, there are iceberg tables. You can just live your best life doing your own SQL by hand.
Now if you couldn't do it before and you have a sensible understanding, you can likely do a bit more with the CLI tooling. And if you know a lot more, you can still do that. The queries are not hidden or abstracted. If you need them, they will be saved transparently in SQL.
So I don't know what the answer is to the question "how do people do things they don't know how to do?"
Comment by paulryanrogers 14 hours ago
The status quo had been to learn SQL or ask a human you trust to check their own work, which hopefully you can reuse.
Now it's ask AIs that are intentionally a bit random, and less likely to (or incapable of) check(ing) their work. Perhaps without seeing the SQL at all, requiring you to trust it for every interaction. And in a culture that moves so fast that there is no checking by any(one|thing).
Comment by XenophileJKO 14 hours ago
Modern models are quite capable at surfacing and validating their assumptions and checking correctness of solutions.
Oversight helps you build confidence in the solutions. Is it perfect? No, but it's way better than most engineers I also ask to check things.
Comment by Bridged7756 12 hours ago
If you think LLMs can check their work, then you are doing a terrible job at writing software. Plain and simple.
They even go as far as "cheating", so tests fail, writing incorrect tests, or straight out leaking code (lol) like the latest Claude Code blunder. Is this the tool the original comment "is using wrong, plain and simple"? Or do you have access to some other model that works in a wildly different way than generating text predictions?
Comment by xg15 19 hours ago
I think this is important, because if his hypothesis is right, then LLMs behave differently here: They really are average in all dimensions. They are the pilots the Air Force thought they had before Daniels made the study.
So if he is right, we'd be changing from a mostly-non-average to a mostly-average society, which would really be a massive change - and probably not a good one IMO.
[1] https://noblestatman.com/uploads/6/6/7/3/66731677/cockpit.fl...
Comment by hackncheese 15 hours ago
Comment by codethief 14 hours ago
Comment by xg15 13 hours ago
Comment by 9991 14 hours ago
Comment by drfloyd51 22 hours ago
Why didn’t the boss ask the AI for the charts to begin with?
Everyone’s income is going to be below average, because they got fired.
Comment by CodeyWhizzBang 22 hours ago
I might not agree with the point, but I can see the idea that many things just need to be "good enough" (which we might define as "average") and we save our real expertise for the things that really matter.
Comment by sva_ 22 hours ago
s/average/median
Comment by jagged-chisel 21 hours ago
Comment by wongarsu 21 hours ago
But it is useful to question whether that is true in all cases. The cases that aren't normally distributed might be exactly the cases where it pays off to be neither average nor median.
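A small sketch of why the mean/median distinction matters once the data is skewed. The salary figures are hypothetical, chosen only so that a single outlier drags the mean away from the median.

```python
from statistics import mean, median

# Hypothetical salaries in $K: nine typical values and one large outlier.
salaries = [50, 52, 55, 58, 60, 62, 65, 68, 70, 1000]

avg = mean(salaries)    # arithmetic mean, pulled up by the outlier
mid = median(salaries)  # middle of the sorted values, barely affected

# How many people earn less than the "average" salary?
below_mean = sum(1 for s in salaries if s < avg)

print(avg)         # 154
print(mid)         # 61.0
print(below_mean)  # 9
```

In this sample nine of ten people are "below average" in the arithmetic-mean sense, while by construction about half sit below the median. For roughly symmetric data the two numbers nearly coincide, which is the case where the substitution does not matter.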
Comment by skeeter2020 16 hours ago
Comment by programjames 21 hours ago
Comment by paulddraper 15 hours ago
Though usually "average" implies arithmetic mean.
Comment by analog31 21 hours ago
Comment by raw_anon_1111 21 hours ago
No one has ever differentiated themselves based on how good of a ticket taker they are. Coding, especially on the enterprise dev side where most developers work, has been getting commoditized since at least 2016; compensation has stagnated since then and hasn't come near keeping up with inflation.
In 2016, a good solid full stack, mobile or web developer working in the enterprise could make $135K working in a second tier city. That’s $185K inflation adjusted today. Those same companies aren’t paying $185K for the same position.
My one anecdote: the company I worked for back then, where I was making $125K and some of my coworkers were making $135K, just posted a position on LinkedIn with the same requirements (SQL Server + C#) offering $145K fully remote.
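The arithmetic behind that comparison, as a rough sketch. It takes the figures quoted above at face value ($135K in 2016, $185K as its inflation-adjusted equivalent, $145K as the new offer) and derives everything else from them; no external inflation data is used.

```python
# Figures quoted in the comment, taken at face value.
salary_2016 = 135_000           # typical enterprise salary in 2016
salary_2016_adjusted = 185_000  # the same salary in today's dollars
new_offer = 145_000             # the recently posted position

# Cumulative inflation factor implied by the first two numbers.
implied_factor = salary_2016_adjusted / salary_2016

# What the new offer is worth in 2016 dollars, and the real change
# relative to the $135K salary.
offer_in_2016_dollars = new_offer / implied_factor
real_change = offer_in_2016_dollars / salary_2016 - 1

print(round(implied_factor, 2))        # 1.37
print(round(offer_in_2016_dollars))    # 105811
print(round(real_change * 100, 1))     # -21.6
```

On these numbers the nominal raise from $135K to $145K works out to a real pay cut of roughly 22%.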
Comment by Ancapistani 18 hours ago
I 100% agree here.
AI has been a huge boon for me personally, because I stopped spending most of my time writing code years ago. I was reviewing code, writing procedures, handling incidents, and generally just looking for pain points across the entire company and solving them before they became critical.
Those skills have transferred directly to working with AI.
Comment by bluegatty 21 hours ago
Comment by HWR_14 21 hours ago
Comment by bluegatty 20 hours ago
That's like saying 'cars were better made in the 1950's because they used tons of steel'. Like they were 'heavier and more robust' - but that doesn't mean better.
Foundations are way better, more robust, especially weatherized. Windows today are like magic compared to windows 100 years ago.
What we do more poorly now is we don't use wood everywhere, aka doors, and certain kinds of workmanship are not there - like winding staircases, mouldings - but you can easily have that if you want to pay for it. That's a choice.
AI is power and leverage, it will make better things as long as it's directed by skilled operators.
Comment by HWR_14 20 hours ago
The precision of how the wood or material meets is worse (when cut at the site). There is a huge amount of sloppy work in modern construction.
Comment by kaashif 16 hours ago
It seems to me that in the past there probably was lots of shoddy workmanship and just no-one paid attention to it.
But I have no proof of that.
Comment by bluegatty 10 hours ago
Comment by bluegatty 8 hours ago
And we can accommodate for 'selection bias'.
We have all of the historical evidence we could ever want for 'how things were built', basically 'infinity examples'.
I think some things were more robust, particularly some of the old framing, like in Europe, with non load-bearing walls etc. Those will stand for 1K years, but arguably unnecessary.
Comment by kaashif 8 hours ago
You have to get a representative sample, that's the tricky part.
So there's that!
Comment by motoroco 13 hours ago
Comment by HWR_14 11 hours ago
Comment by marcosdumay 13 hours ago
Comment by roenxi 21 hours ago
The people who need to be above average and exceptional are senior management and maybe a few bright sparks in middle management. Most of the value-add happens there: they build the social machines that then do the work.
> If average is all we need, then anyone can do it.
Pretty much, yes. That is why the range of salaries on offer is pretty compressed compared to the range of returns capitalists get.
Comment by drfloyd51 16 hours ago
That is the dream. Upper management can get software made without talent.
But it seems the greatest ideas of the last 30 years didn't start in boardrooms. They started with a couple of coders creating a new idea.
No boardroom could have invented Google. It was so fundamentally different than what other search engines were doing.
We have this myth that upper management is so important. It is, as the business grows in size; they are excellent for coordination. But ideas come from people closer to the problems.
Comment by roenxi 5 hours ago
You might want to try a different example, that one rather undermines the point you're trying to make. PageRank [0] was developed by Page & Brin as original research/based on the work of other people who weren't employees.
Comment by j45 21 hours ago
Comment by jerf 21 hours ago
How stable that is on the long term, I don't know any more than the next guy, but it is where I'm contributing now.
Comment by localhoster 15 hours ago
You might say it's "still less work" and that's true, perhaps, only for the first few times. After a while you _learn_ how to do it, and understand how to _think_ with the language of your data. With LLMs, you never get this benefit, and you also lose your ability to judge the LLM's output properly.
But again, that might be enough in your case, or you simply don't _know_.
Comment by skybrian 14 hours ago
Let's say you start with a report someone else wrote. It seems like you still need to read it and understand what it's telling you. Sometimes plotting all the points helps, or drilling down and looking at the raw data.
Comment by cremer 13 hours ago
Comment by movedx01 21 hours ago
When it comes to a bs dashboard where "average is all you need", maybe the "better than average" result would be asking yourself if it's even worth doing in the first place?
Comment by montroser 22 hours ago
Comment by jagged-chisel 21 hours ago
The Business simply cannot admit that it’s really doing nothing above average. If they did, investment dries up.
Comment by Axel2Sikov 18 hours ago
Comment by ljhsiung 16 hours ago
There's a market for both, but the furniture slop of Ikea is dominant.
Comment by winterbloom 21 hours ago
Do you know enough about JOINs and how they work to be able to break those big queries down and figure out whether they are doing exactly what you're asking for in English?
Comment by antihipocrat 21 hours ago
Comment by Ancapistani 18 hours ago
Comment by Axel2Sikov 18 hours ago
Comment by lifestyleguru 54 minutes ago
Comment by busfahrer 21 hours ago
> ninety percent of everything is crud
Comment by tsimionescu 21 hours ago
Yes, thinking about your data and how to check it is so annoying. Much better to do something average, see if the result puts you in a good light, and share that insight into your company's working with ~~everyone on the internet~~ your boss.
Rarely have I seen "we help you create meaningless slop more easily" advertised so explicitly. Or is this also average?
Comment by throw310822 21 hours ago
Comment by llmssuck 14 hours ago
I think many people here work at nice, large places with reasonable and knowledgeable colleagues that are cooperative and mostly rational and try to do the right thing. In my experience that is not a common or widespread thing. Of course I only have small to medium business experience, but that's still a pretty good chunk of the economy. LLMs are an absurd, ridiculous win in those kinds of environments.
Comment by chriswait 21 hours ago
It makes me wonder if Hacker News has a silent majority of people who would actually use AI in this way without wanting to admit it, and a vocal minority of people who wouldn't.
Comment by Ancapistani 18 hours ago
Comment by underlines 14 hours ago
where it gets interesting is when you have a custom system that your LLM surely never saw (custom ERP) that has 50 sometimes cryptic tables, unclear lookup tables and unexplained flags.
something no text2sql solution solved for us.
we built a second mcp that lets the agent look up business logic (generated from source code) and then does better queries. that i think is something i never read in a blog post about a text2sql solution.
Comment by Axel2Sikov 13 hours ago
You could use claude code for the "text2sql" kind of part, but this is not why this tool exists. Nor what the article advocates.
Comment by segh 21 hours ago
Comment by marginalia_nu 21 hours ago
Comment by movedx01 21 hours ago
The question is, do we have good enough feedback loops for that, and if not, are we going to find them? I would bet they will be found for a lot of use cases.
Comment by bluGill 21 hours ago
/end extreme over optimism.
Comment by Retr0id 21 hours ago
I think you can have LLMs do that too, and then generate synthetic training data for "high-effort code".
Comment by marginalia_nu 21 hours ago
Part of the problem is that better code is almost always less code. Where a skilled programmer will introduce a surgical 1-3 LOC diff, an incompetent programmer will introduce 100 LOC. So you'll almost always have a case where the bad code outnumbers the good.
Comment by Retr0id 21 hours ago
Comment by monkaiju 15 hours ago
Comment by utopiah 21 hours ago
Comment by bashwizard 21 hours ago
Comment by programjames 21 hours ago
Comment by montroser 22 hours ago
And there is a lot of that type of work to do if you're trying to grow a business. But, something in there should be trying to be exceptional or else you have no moat. Claude will probably not be able to breeze through that part with the same amount of ease...
Comment by utopiah 21 hours ago
It's a post claiming average AI is useful... by a for-profit "data platform with a CLI that LLM agents can use directly". What are they going to do? Criticize the whole industry they are selling to?
Comment by Axel2Sikov 21 hours ago
Comment by pc86 15 hours ago
Comment by myhf 15 hours ago
Comment by kfk 21 hours ago
Not all context is documented, and some context has to even be changed because it doesn't make sense.
I find AI very useful, but I think a lot of these AI SQL products are misleading.
Comment by fedeb95 21 hours ago
Comment by tqi 12 hours ago
Why is that a good thing? Claude didn't ask any obvious follow up questions, like what determined whether a user got an email or not? It is using the ab test terminology in Step 3 without any kind of confirmation that this is, you know, a valid test.
Comment by throwaway98797 21 hours ago
if anything it makes the world more dangerous
a reckoning is coming
the top decile will be janitors for the rest
Comment by leecommamichael 13 hours ago
Comment by JackSlateur 22 hours ago
A car that starts 50% of the time ?
A plane that stops on 50% of the flights ?
A pacemaker that beats only 50% of the time ?
David Goodenought said that average is enough ..
Comment by CodeyWhizzBang 22 hours ago
Comment by JackSlateur 21 hours ago
Comment by CodeyWhizzBang 19 hours ago
"Whereas before, average was expensive in terms of both time and effort, average became cheap."
Comment by antisthenes 21 hours ago
Pass.
Comment by mpalmer 22 hours ago
This is not only average. This is actual magic.
So let's be real: the SQL is average. The joins are average. The chart is average. And that took us less than 5 minutes and that was amazing, that is the entire point.
You did not need a data engineer to model your HubSpot data, or a meeting to agree on whether it should be last-click or first-click or linear or time-decay or whatever.
You needed a query, written fast, on data you already own. Your LLM wrote it. You confirmed it made sense. Your manager got a link.
Honestly, average is clearly magic; prove me wrong.
I'll give it a go. This is generated slop, and the poor, factory-made quality of the writing undercuts every aspect of the argument. It is like nails on a chalkboard.
Comment by Axel2Sikov 21 hours ago
Comment by throwaway613746 21 hours ago