Stop using JSON for LLM structured output
Posted by 44za12 3 hours ago
Comments
Comment by 44za12 3 hours ago
For simple extraction tasks, a delimiter-separated string takes 11 output tokens vs 35 for the equivalent JSON. Output tokens are generated one at a time, so they are the latency bottleneck.
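To make the idea concrete, here is a rough sketch of what parsing the delimited form might look like. The field names, the `|` delimiter, and the sample values are all made up for illustration; they are not from the post. The length comparison at the end counts characters, not tokens, since real token counts depend on the model's tokenizer.

```python
import json

# Hypothetical schema: the field order is fixed and stated in the prompt,
# so the model only has to emit the values.
FIELDS = ("name", "city", "age")
DELIM = "|"


def parse_delimited(text, fields=FIELDS, delim=DELIM):
    """Split a delimiter-separated model response into a dict keyed by `fields`."""
    parts = [p.strip() for p in text.strip().split(delim)]
    if len(parts) != len(fields):
        # Fail loudly rather than silently mis-assigning values.
        raise ValueError(f"expected {len(fields)} fields, got {len(parts)}")
    return dict(zip(fields, parts))


# Illustrative responses carrying the same information in both formats.
json_output = '{"name": "Ada Lovelace", "city": "London", "age": "36"}'
delimited_output = "Ada Lovelace|London|36"

record = parse_delimited(delimited_output)
same = record == json.loads(json_output)

# Character counts only; a rough proxy, not a token measurement.
print(len(json_output), len(delimited_output), same)
```

One caveat with this approach: a value that itself contains the delimiter breaks the split, so it probably only suits simple extraction where the values are predictable.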