Ollama on Mac Studio M1 Max not using GPU

I have installed Ollama on my Mac Studio M1 Max, which has 32GB of RAM, and downloaded Qwen2.5-Coder:32b. There is enough memory to load the model, but when it runs it uses the CPU, not the GPU. How can I force it to use the GPU?

Try these two settings:

  • export OLLAMA_GPU=true (set in the shell before starting Ollama)
  • /set parameter num_ctx 4096 (entered at the prompt inside an ollama run session)
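
If you call the model over the HTTP API rather than the interactive prompt, the same context-size setting can be passed per request. The sketch below is only an illustration: build_payload is a hypothetical helper name, the prompt text is a placeholder, and it assumes the server is listening on the default port 11434. A smaller num_ctx reduces the memory the model needs, which may let more of it fit on the GPU.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ollama): prints a JSON request body for
# the /api/generate endpoint with the context window set via options.num_ctx.
build_payload() {
    model="$1"      # model tag, e.g. qwen2.5-coder:32b
    num_ctx="$2"    # context window size in tokens, e.g. 4096
    printf '{"model":"%s","prompt":"hello","stream":false,"options":{"num_ctx":%s}}\n' \
        "$model" "$num_ctx"
}

# Usage, assuming an Ollama server is running on the default port:
#   build_payload qwen2.5-coder:32b 4096 \
#       | curl -s http://localhost:11434/api/generate -d @-
#
# Then check where the loaded model is running:
#   ollama ps
```

The PROCESSOR column of ollama ps reports the CPU/GPU split for each loaded model, so you can see whether a change actually moved the model onto the GPU.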