AI hallucinates. Do you ever double-check the output?
Posted by jackota 1 day ago
Been building AI workflows, and they randomly hallucinate and do something stupid, so I end up manually checking everything anyway before approving the AI-generated content (messages, emails, invoices, etc.), which defeats the whole point.
Anyone else? How did you manage it?
Comments
Comment by prepend 22 hours ago
I treat it like hiring a consultant. They do a lot of work, but I still review the output before making a decision or passing it on.
Sending something with errors to my boss or peers makes me look stupid. Saying it was caused by unrevised AI makes me look stupider.
Comment by Zigurd 1 day ago
So before anyone concludes that coding agents prove that AI can be useful, find some use cases with similar characteristics.
Comment by jackfranklyn 17 hours ago
The trap is when correctness is subjective. Tone, phrasing, whether something 'sounds right' - no automated check helps there, so you're back to reviewing everything.
For structured data like invoices, I've found pattern-matching against known values beats LLMs anyway. Less hallucination risk, faster, and when it fails it at least fails obviously rather than being confidently wrong.
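A minimal sketch of what that can look like, assuming illustrative field formats and a known-vendor list (the regexes, names, and field layout are made up for the example):

    import re

    # Illustrative known values; in practice these come from your own records.
    KNOWN_VENDORS = {"Acme Corp", "Globex Ltd"}
    AMOUNT_RE = re.compile(r"^\d+\.\d{2}$")      # e.g. "1234.50"
    INVOICE_ID_RE = re.compile(r"^INV-\d{6}$")   # e.g. "INV-004217"

    def validate_invoice(invoice: dict) -> list[str]:
        """Return a list of problems; an empty list means it passed."""
        problems = []
        if invoice.get("vendor") not in KNOWN_VENDORS:
            problems.append(f"unknown vendor: {invoice.get('vendor')!r}")
        if not AMOUNT_RE.match(str(invoice.get("amount", ""))):
            problems.append(f"malformed amount: {invoice.get('amount')!r}")
        if not INVOICE_ID_RE.match(str(invoice.get("id", ""))):
            problems.append(f"malformed invoice id: {invoice.get('id')!r}")
        return problems

    # Fails loudly instead of confidently wrong:
    print(validate_invoice({"vendor": "Acme Corp", "amount": "99.00", "id": "INV-000123"}))  # []
    print(validate_invoice({"vendor": "Acme Inc", "amount": "99", "id": "INV-123"}))         # three problems

The point is that every failure mode is a visible mismatch, not a plausible-looking fabrication.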
Comment by 19arjun89 22 hours ago
To minimize hallucinations, yes, the AI should be set up for deterministic behaviour where the use case calls for it (a recruiting tool, for example, should produce the same evaluation for the same candidate every time). Secondly, having another AI check for hallucinations can be a good starting point; assigning scores and penalizing the first AI can also lead to more grounded responses.
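A minimal sketch of both ideas, assuming the OpenAI Python SDK; the model name, prompts, seed, and score threshold are all illustrative, and repeatability with temperature=0 plus a seed is best-effort on most hosted APIs, not guaranteed:

    from openai import OpenAI

    client = OpenAI()
    MODEL = "gpt-4o-mini"  # illustrative model name

    def evaluate_candidate(resume_text: str) -> str:
        # temperature=0 and a fixed seed push toward repeatable output
        resp = client.chat.completions.create(
            model=MODEL,
            temperature=0,
            seed=42,
            messages=[{"role": "user",
                       "content": f"Evaluate this candidate:\n{resume_text}"}],
        )
        return resp.choices[0].message.content

    def grounding_score(source: str, answer: str) -> int:
        # Second pass: score how well the answer is supported by the source.
        resp = client.chat.completions.create(
            model=MODEL,
            temperature=0,
            messages=[{"role": "user",
                       "content": ("On a scale of 0-10, how well is the ANSWER "
                                   "supported by the SOURCE? Reply with a number only.\n"
                                   f"SOURCE:\n{source}\n\nANSWER:\n{answer}")}],
        )
        # Raises if the model replies with anything but a number; fine for a sketch.
        return int(resp.choices[0].message.content.strip())

    resume = "...resume text..."
    evaluation = evaluate_candidate(resume)
    if grounding_score(resume, evaluation) < 7:  # illustrative threshold
        print("Low grounding score; route to manual review.")

The checker is an LLM too, so it can be wrong in the same ways; treat a low score as a flag for human review, not a verdict.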
Comment by aavci 22 hours ago
This is a valuable read: https://www.ufried.com/blog/ironies_of_ai_1/
Comment by Xorakios 15 hours ago
Part of the reason I like Perplexity is the embedded references, and I always, always double-check the sources and holler at the Perp AI when it is clearly confabulating or misinterpreting. It still gives me insights and is useful, but trust-but-verify isn't just about arms control ;)
Comment by wormpilled 17 hours ago
Not at all