Hacker News

It works super well!

You'll have to compile llama.cpp from source; the build produces a llama-mtmd-cli binary.
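For reference, a from-source build typically looks like this (the standard CMake steps from the llama.cpp README; the CUDA flag is only needed for NVIDIA GPU builds):

```shell
# Clone and build llama.cpp (CMake is the supported build system)
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build                        # add -DGGML_CUDA=ON for NVIDIA GPUs
cmake --build build --config Release -j
# The binaries, including llama-mtmd-cli, land in build/bin/
./build/bin/llama-mtmd-cli --help
```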

I made some quants with vision support - literally run:

./llama.cpp/llama-mtmd-cli -hf unsloth/gemma-3-4b-it-GGUF:Q4_K_XL -ngl -1

./llama.cpp/llama-mtmd-cli -hf unsloth/gemma-3-12b-it-GGUF:Q4_K_XL -ngl -1

./llama.cpp/llama-mtmd-cli -hf unsloth/gemma-3-27b-it-GGUF:Q4_K_XL -ngl -1

./llama.cpp/llama-mtmd-cli -hf unsloth/Mistral-Small-3.1-24B-Instruct-2503-GGUF:Q4_K_XL -ngl -1

Then load the image with /image image.png inside the chat, and chat away!

EDIT: -ngl -1 is no longer needed on Metal backends (it's still needed for CUDA), since llama.cpp now offloads to the GPU by default. -1 means all layers are offloaded to the GPU.



If it helps, I updated https://docs.unsloth.ai/basics/gemma-3-how-to-run-and-fine-t... to show you can use llama-mtmd-cli directly - it should work for Mistral Small as well


Is there a simple GUI available for running LLaMA on my desktop that I can access from my laptop?


If you are on a Mac, give https://recurse.chat/ a try. It's as simple as downloading the model and starting to chat. It just added llama.cpp's new multimodal support.


Give https://docs.openwebui.com/ a look; you'll be able to access it from your laptop using your desktop's IP address (provided you're on the same network).
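The usual way to stand Open WebUI up is via Docker, roughly like this (image name and default port from their docs; mapping to host port 3000 is an arbitrary choice):

```shell
# Run Open WebUI in Docker, exposing it on port 3000 of the desktop
docker run -d -p 3000:8080 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
# Then browse to http://<desktop-ip>:3000 from the laptop
```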


Isn't that just ollama + any client supporting it?

Using Tailscale for the internal network works really well.


If you install llama.cpp via Homebrew, llama-mtmd-cli is already included. So you can simply run `llama-mtmd-cli <args>`
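Concretely, the Homebrew route is (using the gemma-3-4b quant from the parent comment as an example):

```shell
# Install llama.cpp from the Homebrew formula; llama-mtmd-cli comes with it
brew install llama.cpp
llama-mtmd-cli -hf unsloth/gemma-3-4b-it-GGUF:Q4_K_XL
```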


Oh even better!!


OK, it's actually better to use -ngl 99 rather than -ngl -1; -1 may or may not work!


I can't see the letters "ngl" anymore without wanting to punch something.


That's your problem. Hope you do something about that pent-up aggression.


Oh, it's shorthand for the number of layers to offload to the GPU for faster inference :) but yes, it's probably not the best abbreviation.


It probably isn't, not gonna lie.





