Show HN: First Claude Code client for Ollama local models
Posted by SerafimKorablev 1 day ago
Just to clarify the background a bit. This project wasn’t planned as a big standalone release at first. On January 16, Ollama added support for an Anthropic-compatible API, and I was curious how far this could be pushed in practice. I decided to try plugging local Ollama models directly into a Claude Code-style workflow and see if it would actually work end to end.
Here is the release note from Ollama that made this possible: https://ollama.com/blog/claude
Technically, what I do is pretty straightforward (a rough sketch follows the list):
- Detect which local models are available in Ollama.
- When internet access is unavailable, the client automatically switches to Ollama-backed local models instead of remote ones.
- From the user’s perspective, it is the same Claude Code flow, just backed by local inference.
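In rough terms, the fallback boils down to something like this. This is a simplified sketch, not the exact code: it assumes Ollama is running on localhost:11434 with its standard /api/tags listing endpoint plus the new Anthropic-compatible API, and that Claude Code picks up the ANTHROPIC_BASE_URL, ANTHROPIC_MODEL, and ANTHROPIC_AUTH_TOKEN environment variables.

    import json
    import os
    import socket
    import subprocess
    import urllib.request

    OLLAMA = "http://localhost:11434"

    def online(host="api.anthropic.com", port=443, timeout=2) -> bool:
        """Cheap connectivity check against the real Anthropic endpoint."""
        try:
            socket.create_connection((host, port), timeout=timeout).close()
            return True
        except OSError:
            return False

    def local_models() -> list[str]:
        """List the models already pulled into the local Ollama instance."""
        with urllib.request.urlopen(f"{OLLAMA}/api/tags", timeout=5) as resp:
            return [m["name"] for m in json.load(resp).get("models", [])]

    env = os.environ.copy()
    if not online():
        models = local_models()
        if not models:
            raise SystemExit("offline and no local Ollama models pulled")
        # Prefer a coding model if one is available locally.
        model = next((m for m in models if "qwen3-coder" in m), models[0])
        env["ANTHROPIC_BASE_URL"] = OLLAMA       # route Claude Code to Ollama
        env["ANTHROPIC_MODEL"] = model           # back it with the local model
        env["ANTHROPIC_AUTH_TOKEN"] = "ollama"   # placeholder, not a real key

    subprocess.run(["claude"], env=env)  # same Claude Code flow, local inference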
In practice, the best-performing model so far has been qwen3-coder:30b. I also tested glm-4.7-flash, which was released very recently, but it struggles with reliably following tool-calling instructions, so it is not usable for this workflow yet.
Comments
Comment by d4rkp4ttern 22 hours ago
https://github.com/pchalasani/claude-code-tools/blob/main/do...
One tricky thing that took me a whole day to figure out: Claude Code in this setup kept failing with network errors caused by its telemetry pings, so I had to set this env var to 1: CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC
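If you script the launch, that is just one extra variable in the environment before spawning Claude Code (a tiny sketch, same idea as exporting it in the shell):

    import os, subprocess

    env = os.environ.copy()
    # Disable Claude Code's non-essential network traffic (telemetry pings),
    # so fully offline runs don't fail on unreachable endpoints.
    env["CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC"] = "1"
    subprocess.run(["claude"], env=env)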
Comment by g4cg54g54 1 day ago
In particular, I'd like to call Claude models (hosted by a reseller in OpenAI schema) through some proxy that presents an Anthropic-format API to my claude, but nothing seems to fully line things up (double-translated tool names, for example).
The reseller is abacus.ai. I tried BerriAI/litellm, musistudio/claude-code-router, ziozzang/claude2openai-proxy, 1rgs/claude-code-proxy, and fuergaosi233/claude-code-proxy.
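For concreteness, the translation every one of these proxies has to get right is roughly this (a stripped-down sketch of mapping one Anthropic-style tool definition onto OpenAI's function-calling shape; the helper itself is illustrative, but the field names are the ones from the two public APIs). When the proxy and the reseller each apply a pass like this, tool names come out double-translated.

    def anthropic_tool_to_openai(tool: dict) -> dict:
        """Map one Anthropic-style tool definition onto OpenAI's
        function-calling shape. If two layers each do a pass like this,
        names and fields get translated twice."""
        return {
            "type": "function",
            "function": {
                "name": tool["name"],
                "description": tool.get("description", ""),
                "parameters": tool["input_schema"],  # JSON Schema in both APIs
            },
        }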
Comment by kristopolous 1 day ago
The invocation would be like this
llsed --host 0.0.0.0 --port 8080 --map_file claude_to_openai.json --server https://openrouter.ai/api
Where the JSON has something like { tag: ..., from: ..., to: ..., params: ..., pre: ..., post: ... }
So if one call becomes two, you can call multiple in the pre or post, or rearrange things accordingly. This sounds like the proper separation of concerns here... probably
The pre/post should probably be json-rpc that get lazy loaded.
Writing that now. Let's do this: https://github.com/day50-dev/llsed
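To make the map file idea concrete, one entry might end up looking something like this. Purely a guess at the shape, since the tool is still being written: the field names are just the ones listed above, and every value here is hypothetical.

    # Hypothetical entry in claude_to_openai.json, shown as the Python dict
    # it would serialize from; "pre"/"post" are the lazy-loaded JSON-RPC hooks.
    entry = {
        "tag": "tool-name-rewrite",                    # label for this rule
        "from": "tools[].name",                        # field in the Anthropic-style request
        "to": "tools[].function.name",                 # field in the OpenAI-style request
        "params": {"strip_prefix": "functions."},
        "pre": "rpc://localhost:9000/split_call",      # e.g. fan one call out into two
        "post": "rpc://localhost:9000/merge_results",  # and merge the results back
    }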
Comment by kristopolous 20 hours ago
This will be a bit challenging, I'm sure, but I agree: litellm and friends do too many things, and it takes too long to get simple asks out of them.
I've been pitching this suite I'm building as "GNU coreutils for the LLM era"
It's not sticking and nobody is hyped by it.
I don't know if I should keep going, or if this is my same old pattern cropping up again: things I really, really like but that are just kinda for me.
Comment by eli 1 day ago
But I'm surprised litellm (and its wrappers) don't work for you and I wonder if there's something wrong with your provider or model. Which model were you using?
Comment by d4rkp4ttern 22 hours ago
But with Qwen3-30B-A3B I get 20 tokens/sec in Claude Code.