Show HN: I run a vision model on every screenshot, locally, on a 4GB GPU
Posted by skye0110 3 days ago
Comments
Comment by torunar 3 days ago
Considering the massive backlash it caused, it showed the exact opposite.
Comment by RobotToaster 3 days ago
Comment by skye0110 2 days ago
Comment by shmoogy 3 days ago
Comment by skye0110 2 days ago
Comment by skye0110 3 days ago
Comment by aynite 3 days ago
but i found gemma-e4b is still too "dumb", and barely capable of providing any good response.
could you share your experience with how you use e2b to generate good results?
Comment by skye0110 3 days ago
for screenshots it's not open-ended generation: the image OCR text and the window title are fed in, and only structured JSON is asked for in response. it works fine.
so i just designed around it - small model + tight prompt + real context instead of hoping the model is clever. what were you trying to generate? i can share the exact prompt setup if it helps.
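a rough sketch of what "tight prompt + real context + JSON only" can look like, in Python. this is an illustration, not the exact prompt setup: the field names (`app`, `activity`, `tags`), the prompt wording, the `max_ocr_chars` limit and both helper functions are assumptions made up for the example. the model call itself is left out, you pass the prompt to whatever local runtime you use.

```python
import json
from typing import Optional

# Hypothetical schema: these field names are illustrative assumptions,
# not the ones used in the actual project.
REQUIRED_FIELDS = {"app": str, "activity": str, "tags": list}


def build_prompt(ocr_text: str, window_title: str, max_ocr_chars: int = 1500) -> str:
    """Pack real context (OCR text + window title) into a prompt that asks only for JSON."""
    # Collapse whitespace and truncate so a small model is not flooded with OCR noise.
    ocr_snippet = " ".join(ocr_text.split())[:max_ocr_chars]
    return (
        "You label screenshots. Reply with one JSON object and nothing else.\n"
        'Keys: "app" (string), "activity" (string), "tags" (list of strings).\n'
        f"Window title: {window_title}\n"
        f"OCR text: {ocr_snippet}\n"
        "JSON:"
    )


def parse_response(raw: str) -> Optional[dict]:
    """Pull the JSON object out of the model reply; return None if it is malformed."""
    # Small models sometimes wrap the object in extra words, so slice from the
    # first "{" to the last "}" before parsing.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # Reject replies that miss a key or use the wrong type for it.
    for key, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(key), expected_type):
            return None
    return data
```

the point of `parse_response` returning `None` is that a bad reply can be retried or skipped instead of trusted, which matters more with a small model than a clever prompt does.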
Comment by skye0110 3 days ago
Comment by heydog 3 days ago