Show HN: OculOS – Any desktop app as a JSON API via OS accessibility tree
Posted by stif1337 3 days ago
Single Rust binary (~3 MB) that reads the OS accessibility tree and gives every UI element a REST endpoint. Click buttons, type text, toggle checkboxes — all via JSON. Works as an MCP server too, so Claude/Cursor/Windsurf can control any desktop app out of the box.
Windows + Linux + macOS. MIT licensed.
Comments
Comment by stif1337 3 days ago
- Screenshot capture: GET /windows/{pid}/screenshot → returns PNG
- Batch operations: POST /interact/batch → multiple actions per request
- Wait/poll: GET /windows/{pid}/wait?q=Submit&timeout=5000
- Python & TypeScript SDKs (local install, PyPI/npm coming soon)
- OpenAPI spec, Dockerfile, 7 example scripts
- Demo GIF in README showing Calculator automation via Claude Code
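For anyone scripting against the wait/poll and batch endpoints listed above, here is a minimal Python sketch that only builds the requests. The port 7878 comes from the curl examples elsewhere in this thread. The `{"actions": [...]}` body shape and the fields inside each action are assumptions, not the documented schema, so check the OpenAPI spec before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Port taken from the curl examples in this thread.
BASE_URL = "http://localhost:7878"


def wait_url(pid: int, query: str, timeout_ms: int = 5000) -> str:
    """Build the URL for GET /windows/{pid}/wait?q=...&timeout=..."""
    params = urlencode({"q": query, "timeout": timeout_ms})
    return f"{BASE_URL}/windows/{pid}/wait?{params}"


def batch_request(actions: list) -> Request:
    """Build a POST /interact/batch request.

    ASSUMPTION: the body is {"actions": [...]}; the real schema may differ.
    """
    body = json.dumps({"actions": actions}).encode("utf-8")
    return Request(
        f"{BASE_URL}/interact/batch",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical action objects, for illustration only.
example = batch_request([{"id": "btn-1", "action": "click"}])
print(wait_url(1234, "Submit"))
print(example.get_method(), example.full_url)
```

Nothing is sent here. Passing the `Request` to `urllib.request.urlopen` would perform the call against a running server.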
Thanks for the feedback everyone!
Comment by ktpsns 3 days ago
Comment by stif1337 3 days ago
- Windows: UI Automation (works with Win32, WPF, WinForms, Qt, Electron)
- Linux: AT-SPI2 (GTK, Qt, Electron)
- macOS: AXUIElement (Cocoa, Qt, Electron)
The coverage varies by toolkit. Win32/WPF/GTK expose rich trees. Electron apps expose key elements but the tree is shallower. Custom-drawn UIs (games, OpenGL) have minimal or no accessibility tree. That's the main limitation.
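The platform-to-backend mapping above can be written down as a small lookup table. This is only an illustration in Python of the mapping described in this comment, not how the Rust binary selects its backend. The names `BACKENDS` and `backend_for` are hypothetical.

```python
import sys

# Accessibility API and toolkits per platform, as listed in this comment.
# Keys are the values Python reports in sys.platform.
BACKENDS = {
    "win32": ("UI Automation", ["Win32", "WPF", "WinForms", "Qt", "Electron"]),
    "linux": ("AT-SPI2", ["GTK", "Qt", "Electron"]),
    "darwin": ("AXUIElement", ["Cocoa", "Qt", "Electron"]),
}


def backend_for(platform: str = sys.platform) -> str:
    """Return the accessibility API name for a platform, or 'unsupported'."""
    entry = BACKENDS.get(platform)
    return entry[0] if entry else "unsupported"


print(backend_for("linux"))
```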
Comment by Frannky 3 days ago
Comment by stif1337 3 days ago
The tree is shallower than native Win32/WPF apps, but key interactive elements (buttons, inputs, lists) are usually exposed. You can check what's available with:
curl "localhost:7878/windows/{pid}/find?interactive=true"
Comment by tadfisher 3 days ago
Comment by stif1337 3 days ago
curl localhost:7878/windows
curl -X POST localhost:7878/interact/{id}/click
MCP mode is an optional layer for AI agents (Claude, Cursor, etc.) that already speak MCP. The REST API works standalone for scripts, testing, CI/CD — no AI required.
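As a rough sketch of the "no AI required" path, the same calls can be made from a plain Python script using only the standard library. The endpoints and port are the ones from the curl examples in this thread. The helper names are hypothetical, and parsing the response as JSON is an assumption based on the "all via JSON" description, since the response shapes are not shown here.

```python
import json
from urllib.request import Request, urlopen

# Port taken from the curl examples in this thread.
BASE_URL = "http://localhost:7878"


def list_windows_request() -> Request:
    """GET /windows"""
    return Request(f"{BASE_URL}/windows", method="GET")


def find_interactive_request(pid: int) -> Request:
    """GET /windows/{pid}/find?interactive=true"""
    return Request(f"{BASE_URL}/windows/{pid}/find?interactive=true", method="GET")


def click_request(element_id: str) -> Request:
    """POST /interact/{id}/click (no request body, like the curl example)."""
    return Request(f"{BASE_URL}/interact/{element_id}/click", method="POST")


def send(request: Request, timeout: float = 5.0):
    """Send a request and decode the reply.

    ASSUMPTION: the server answers with a JSON body.
    """
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send(list_windows_request())` followed by `send(click_request(some_id))` would mirror the two curl commands above, where `some_id` is an element id taken from an earlier response.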