Can you confirm that I don’t need to open up and save a new Modelfile for the 32b model, as mentioned in the readme, in order to use its full capacity? Or is the 32b version also limited by default in Ollama?
Can confirm that it is not necessary to manually create a model with a larger context window (as previously described). I believe there was a PR that addressed this.
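For reference, the manual step that should no longer be needed was creating a custom Ollama model from a Modelfile with a larger num_ctx. A minimal sketch (the model tag, custom name, and context value below are just example values, not taken from the readme):

FROM qwen2.5-coder:32b
# Raise the context window; 32768 is an example value, pick what fits your VRAM
PARAMETER num_ctx 32768

Then build it with ollama create qwen2.5-coder-32k -f Modelfile and select that model in the UI.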
For anyone who is interested: I successfully tested running oTToDev using qwen2.5-coder:32b on my Windows 10 system with an RTX 4070 Ti Super (16 GB VRAM) and a Ryzen 7 5700 with 32 GB of system RAM. It wasn’t quick; “Build a todo app in React using Tailwind” took just over 13 minutes.
At its peak it used:
Dedicated GPU memory: 14.3 GB
Shared GPU memory: 11.6 GB
It must have done a fair bit of shuffling between VRAM and Shared GPU RAM because the GPU utilization never got higher than 12%, while the CPU utilization sat at around 95% for most of the time.
I am just stoked that it works. I didn’t have much luck with the smaller models, so I’m glad that I now have a working solution. (Might have to save for an xx90 with 24 GB of VRAM so that it runs in real time.)
Yes, it’s this one, but when the UI loads it, it grows! That’s my issue.
Here is my bug report: Ollama custom Modelfile is listed in the models but reloads it with larger token value · Issue #313 · coleam00/bolt.new-any-llm · GitHub
Even without the custom Modelfile, it still loads the model at double the size.
I’ve just put in a pull request to allow easy changing of the context size, which helps reduce the VRAM requirements.
I applied the changes locally to test and they work great! Nice work!
Sorry if it’s a dumb question, but could you please guide me and tell me what exactly you changed in the .env (or .env.local) to choose specifically Hyperbolic or DeepInfra? My .env.local only contains the OPEN_ROUTER_API_KEY, but that alone doesn’t determine which provider will be used, does it? As I understand it, to choose an alternative provider you need to use dynamic routing, but I can’t seem to figure out how to do it in oTToDev. Thanks in advance!
Hi,
my file name is .env.
I changed these lines (if you don’t have these lines, you should re-download the repository from GitHub; there will be a file .env.example, which you need to rename to .env or .env.local):
OPENAI_LIKE_API_BASE_URL=https://api.deepinfra.com/v1/openai
OPENAI_LIKE_API_KEY=<your DeepInfra API key here>
For Hyperbolic:
OPENAI_LIKE_API_BASE_URL=https://api.hyperbolic.xyz/v1
OPENAI_LIKE_API_KEY=<your Hyperbolic API key here>
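If you want to sanity-check that the base URL and key work before starting oTToDev, here is a minimal standalone sketch (not oTToDev code; it just assumes the provider exposes the standard OpenAI-compatible /models endpoint, which both DeepInfra and Hyperbolic advertise, and that the same OPENAI_LIKE_* variables are exported in your shell):

// check-provider.ts: standalone sanity check, not part of oTToDev
const baseUrl = process.env.OPENAI_LIKE_API_BASE_URL; // e.g. https://api.deepinfra.com/v1/openai
const apiKey = process.env.OPENAI_LIKE_API_KEY;

async function main() {
  if (!baseUrl || !apiKey) {
    throw new Error('Set OPENAI_LIKE_API_BASE_URL and OPENAI_LIKE_API_KEY first');
  }
  // List the models the provider exposes via its OpenAI-compatible API
  const res = await fetch(`${baseUrl}/models`, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  console.log(res.status, await res.json()); // 200 plus a model list means the URL and key are good
}

main().catch(console.error);

Run it with something like npx tsx check-provider.ts on Node 18+; if it prints 200 and a list of models, the OpenAI-like provider in oTToDev should be able to reach the same endpoint.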
Currently I’m using Hyperbolic, because they fixed the Qwen stutters.
P.S. Qwen 32b is amazing. It fixed an error that Claude made and couldn’t fix.
Just checked this out on OpenRouter. It’s cheaper than DeepSeek.
DeepSeek: $0.35/M tokens
Qwen Instruct 32B via OpenRouter: $0.18/M tokens
Just a tad more than half the price.
Yeah the pricing is one of the reasons I love the model so much! For those who can’t run it with Ollama it’s still super affordable.
It works, thank you!!!
It’s acting weird though. I tried setting up a project with Vite and shadcn, and the process was riddled with errors and incorrect structure-setup scripts. Same with Next + Tailwind: the same files are rebuilt over and over again, everything is reinstalled after each prompt, and errors are present here as well. It doesn’t act very smart, to be honest… is it the model or my system somehow?
I think it’s still the Hyperbolic stutters. I’m not sure why, but I had those problems too, in oTToDev and in other coders: it rewrote the same code many times in a row, creating the same files, stuttering weird words, and just consuming tokens.
But then the problem just disappeared; I didn’t do anything.
Try DeepInfra. It has lower output, but maybe it will be enough for your needs.
So is all this hype around the model about general use outside of oTToDev? Can anyone show or share something awesome that has been built with oTToDev and Qwen Instruct 32B?
There’s hype both within and outside of oTToDev, @nickmartin! Generally, smaller models like Qwen-2.5-Coder-32b aren’t going to be strong enough to build huge apps, but it certainly is enough to help you iterate on a proof of concept, like I showed in my video on Qwen!