ollama ps shows 28% CPU / 72% GPU when the model is called via oTToDev on a 3090.
This happens on both Windows and Linux.
Screen updates and code generation are very slow.
On Windows, if the same model is called via Open WebUI, ollama ps shows 100% GPU, Task Manager shows 100% GPU usage, and code and text are generated quickly.
So it seems this model runs fine on this computer, and something is different about how Ollama is called.
Any ideas on how to get full utilization of the 3090 when using oTToDev?
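One thing that can differ between two clients is the context length they request. As a rough way to test that from outside either app, the sketch below calls Ollama's REST API directly with an explicit num_ctx. It assumes Ollama is listening on its default local port; the function names and the model name in the usage note are hypothetical placeholders, not anything taken from oTToDev or Open WebUI.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_payload(model, prompt, num_ctx=8192):
    """Build a non-streaming /api/generate request body with an explicit context size."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # num_ctx sets the context window Ollama allocates for this request.
        "options": {"num_ctx": num_ctx},
    }


def generate(model, prompt, num_ctx=8192):
    """Send the request to Ollama and return the generated text."""
    body = json.dumps(build_generate_payload(model, prompt, num_ctx)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling generate("your-model", "hello", num_ctx=8192) and then running ollama ps in another terminal should show whether that context size alone changes the CPU/GPU split.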
8192 is as high as I can go with one 3090 at 100% GPU while things stay snappy. 9216 is still quick (3% CPU), 10240 is OK, and above that it starts to get slow. Curious what the tradeoffs will be like...
# DEFAULT_NUM_CTX=32768 # Consumes 36GB of VRAM
# DEFAULT_NUM_CTX=24576 # Consumes 32GB of VRAM
# DEFAULT_NUM_CTX=12288 # Consumes 26GB of VRAM
# DEFAULT_NUM_CTX=11264 # Consumes 24GB of VRAM
# DEFAULT_NUM_CTX=10240 # Consumes 24GB of VRAM
# DEFAULT_NUM_CTX=9216 # Consumes 24GB of VRAM
DEFAULT_NUM_CTX=8192 # Consumes 24GB of VRAM
# DEFAULT_NUM_CTX=7168 # Consumes 24GB of VRAM
# DEFAULT_NUM_CTX=6144 # Consumes 24GB of VRAM
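To compare settings like these without eyeballing the terminal each time, a small helper can turn the split that ollama ps reports into numbers. This is only a sketch: it assumes the PROCESSOR column reads either "28%/72% CPU/GPU" or "100% GPU", and the helper names are made up for illustration.

```python
import re


def parse_processor(field):
    """Parse an `ollama ps` PROCESSOR value into CPU and GPU percentages.

    Assumed formats: "28%/72% CPU/GPU", "100% GPU", or "100% CPU".
    """
    field = field.strip()
    split = re.fullmatch(r"(\d+)%/(\d+)%\s+CPU/GPU", field)
    if split:
        return {"cpu": int(split.group(1)), "gpu": int(split.group(2))}
    single = re.fullmatch(r"(\d+)%\s+(CPU|GPU)", field)
    if single:
        percent = int(single.group(1))
        device = single.group(2).lower()
        other = "cpu" if device == "gpu" else "gpu"
        return {device: percent, other: 100 - percent}
    raise ValueError(f"unrecognized PROCESSOR value: {field!r}")


def fully_on_gpu(field):
    """Return True when the model is reported as running entirely on the GPU."""
    return parse_processor(field)["gpu"] == 100
```

With this, fully_on_gpu("28%/72% CPU/GPU") is False, while fully_on_gpu("100% GPU") is True, which is the condition to aim for when picking DEFAULT_NUM_CTX.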
We don't really have a dev environment at the moment. With the project being so young, it's probably down to a lack of resources as well. That said, it would be good to have a dev branch where PRs could be merged in faster and then merged into a more stable branch once tested by the masses.
I agree, and this is something we are talking about setting up as a maintainer team! @bolto90 is right that we are definitely lacking resources right now with how quickly the project is growing, but I'm working hard on building up the team so we can continue to get all these awesome PRs merged!