What hardware are you running the 30b model on? I guess it needs at least 24GB VRAM for decent inference speeds.
The Anthropic API was already supported by llama.cpp (the project Ollama ripped off, and which Ollama typically trails in features by 3-6 months), which works perfectly fine with Claude Code once you set a simple environment variable.
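For reference, a minimal sketch, assuming a recent llama.cpp build whose llama-server exposes the Anthropic-compatible /v1/messages endpoint (the model path and port are placeholders, adjust for your setup):

    # Serve a local GGUF model; path and port are placeholders.
    llama-server -m ./your-model.gguf --port 8080

    # Point Claude Code at the local server instead of api.anthropic.com.
    export ANTHROPIC_BASE_URL="http://127.0.0.1:8080"
    claude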
And they reference that announcement and related information in the second line.
Which announcement are you looking at? I see no references to llama.cpp in either Ollama's blog post or this project's GitHub page.
I was trying to get Claude Code to work with llama.cpp but could never get anything functional. It always insisted on a phone-home login for first-time setup. In Cline I'm getting better results with glm-4.7-flash than with qwen3-coder:30b.
Creating ~/.claude.json with {"hasCompletedOnboarding":true} is the key; after that, ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN work as expected.
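In shell form, a sketch of the whole client-side setup (the URL and token values are placeholders; note the first line overwrites any existing ~/.claude.json):

    # Mark onboarding as complete so Claude Code skips the phone-home login.
    echo '{"hasCompletedOnboarding":true}' > ~/.claude.json

    # Point Claude Code at the local Anthropic-compatible server;
    # any non-empty token works if the server doesn't check auth.
    export ANTHROPIC_BASE_URL="http://127.0.0.1:8080"
    export ANTHROPIC_AUTH_TOKEN="dummy-token"

    claude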
There are already various proxies to translate between OpenAI-style models (local or otherwise) and an Anthropic endpoint that Claude Code can talk to. Is the advantage here just one less piece of infrastructure to worry about?
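For anyone unfamiliar with what those proxies do: they sit between the two wire formats, roughly like this (hosts, ports, and the model name below are made up; the headers and body shapes are the documented Anthropic and OpenAI ones):

    # What Claude Code sends to the proxy (Anthropic Messages format):
    curl http://localhost:4000/v1/messages \
      -H "x-api-key: $ANTHROPIC_AUTH_TOKEN" \
      -H "anthropic-version: 2023-06-01" \
      -H "content-type: application/json" \
      -d '{"model": "some-local-model", "max_tokens": 1024,
           "messages": [{"role": "user", "content": "hello"}]}'

    # What the proxy forwards upstream (OpenAI chat-completions format):
    curl http://localhost:11434/v1/chat/completions \
      -H "Authorization: Bearer $UPSTREAM_API_KEY" \
      -H "content-type: application/json" \
      -d '{"model": "some-local-model",
           "messages": [{"role": "user", "content": "hello"}]}'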
Sidetracking here, but have you got one that _actually_ works?
In particular, I'd like to call Claude models, hosted by a reseller in the OpenAI schema, through some proxy that presents the Anthropic format to my Claude Code. But it seems like nothing fully lines things up (double-translated tool names, for example).
The reseller is abacus.ai. I've tried BerriAI/litellm, musistudio/claude-code-router, ziozzang/claude2openai-proxy, 1rgs/claude-code-proxy, and fuergaosi233/claude-code-proxy.
This is cool. Not sure it's the first Claude Code-style coding agent that runs against Ollama models, though. Goose, OpenCode, and others have been able to do that for a while, no?
Does this UI work with Open Code?
hey, thanks for sharing. I had to go to the Twitter feed to find the GitHub link:
Thanks for catching that. I've changed the URL at the top to the GitHub link, replacing https://twitter.com/serafimcloud/status/2014266928853110862.