Ask HN: How are teams validating AI-generated tests today?

Posted by sriramgonella 8 hours ago

With the rise of AI-assisted development, many tools generate tests automatically.

But validating whether those tests actually cover meaningful edge cases seems harder.

Curious how teams here handle this in real workflows.

Comments

Comment by david_iqlabs 8 hours ago

One thing I've noticed with AI generated tests is they can look very convincing even when they're wrong. The output reads confidently but there's not always anything grounding it in real signals.

I've found it works better when the AI is just explaining results that come from deterministic metrics rather than inventing the analysis itself.

Curious how other teams are dealing with that.

Comment by sriramgonella 8 hours ago

really good observation. The confidence of the output can sometimes mask the lack of grounding behind it. It almost feels like the emerging pattern is, let AI assist with generation and explanation, but keep the verification layer deterministic and measurable. Curious if you’ve seen teams building internal tooling around that, or if people are mostly relying on existing CI/testing framew

Comment by david_iqlabs 42 minutes ago

It's exactly the experience I had while building iQWEB.

I spent months trying to make an executive narrative generated by AI, but eventually moved away from that approach. The results were often inconsistent or overly generic, which made it difficult to rely on the output for serious reporting.

In the end I shifted to a fully model-driven approach where the narrative is built directly from structured signals and scoring logic. That made the reports far more accurate and evidence-based, and it keeps the output consistent from scan to scan.

Comment by itigges22 8 hours ago

For security vunerability testing on websites I have been making for clients- I almost always hire a senior developer to look over the work and or tests that were created. AI can pass a test, and it can make something that passes a test, but there almost ALWAYS are problems that the senior dev finds with the tests, or with the code that was being tested. Sometimes AI will adjust the code entirely to pass the test or adjust the test to pass failing code.

Another counter-measure I have is to simply lock code before testing. Look over test files, and ensure its not following the happy path.

Comment by sriramgonella 7 hours ago

can we even depend on End to end Testing on this AI Tools? but how far these founders can able to rely on that with confidence. I totaly agree for VAPT it will be better

Comment by itigges22 2 hours ago

In my opinion they cannot confidently rely on it- I think its great to get a product out there, but I think we can draw parallels to things like Dropshipping- where the "Get quick rich" thought process is a major failure. Its better to think about it as that you still need a large amount of energy, time, or capital to achieve a certain level of success. (INCLUDING production coding projects that req security and strong engineering practices). We can use Spec Driven AI Assisted coding to add a level of rigor to "vide coding", but I still don't think models are where they need to be in order to get it perfect. We will almost always have issues.