Stop using JSON for LLM structured output

Posted by 44za12 3 hours ago

Comments

Comment by 44za12 3 hours ago

For simple extraction tasks, a delimiter-separated string takes 11 output tokens versus 35 for the equivalent JSON. Output tokens are generated one at a time, so they are the latency bottleneck.
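A rough sketch of what this could look like on the parsing side, in Python. The field names, the `|` delimiter, and the sample strings are all made up for illustration and are not from the post; the script compares character counts only, since real token counts depend on the model's tokenizer.

```python
import json

# Hypothetical schema: the field order is fixed in the prompt, so the model
# only has to emit the values, not the keys, quotes, or braces.
FIELDS = ("name", "city", "age")
DELIM = "|"


def parse_delimited(raw: str, fields=FIELDS, delim=DELIM) -> dict:
    """Split a delimiter-separated model output into a dict keyed by `fields`."""
    parts = [p.strip() for p in raw.strip().split(delim)]
    if len(parts) != len(fields):
        # A wrong field count usually means the model drifted from the format.
        raise ValueError(f"expected {len(fields)} fields, got {len(parts)}")
    return dict(zip(fields, parts))


# Illustrative outputs for the same extraction (not real model responses).
delimited_output = "Ada Lovelace|London|36"
json_output = '{"name": "Ada Lovelace", "city": "London", "age": "36"}'

record = parse_delimited(delimited_output)

# Both formats carry the same information.
assert record == json.loads(json_output)

# Character counts as a crude proxy for output size.
print(len(delimited_output), len(json_output))
```

The trade-off is that all values come back as strings and a value containing the delimiter breaks the parse, so this only fits flat records where you control the field set.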