Trouble with Qwen2.5 Coder 14B on MacBook Pro with OttoDev (Bolt.new)

I’m curious if anyone here has successfully run Qwen2.5 Coder 14B on a MacBook Pro with OttoDev (Bolt.new). I’ve been using the 7B model without any issues, but when I try loading the 14B model it just takes forever and never actually works. I have an M1 Mac with 16 GB of unified memory (shared between the CPU and GPU). The strange thing is that the 14B model runs fine in other apps, so I’m not sure what the problem is with OttoDev in particular. I also created the Modelfile and specified the model the same way the instructions describe for the 7B version.
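For reference, this is the pattern I followed, mirroring the 7B instructions — the model tag and `num_ctx` value here are my assumptions, not something from the official docs, so adjust them to your setup:

```
# Modelfile — same pattern as the 7B instructions.
# The tag and context size below are assumptions; adjust as needed.
FROM qwen2.5-coder:14b
PARAMETER num_ctx 32768
```

Then registered it with `ollama create qwen2.5-coder-14b-ottodev -f Modelfile` and pointed OttoDev at that model name. Note that a larger `num_ctx` also inflates the KV cache, which matters on a 16 GB machine.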

Is anyone else experiencing this, or does anyone have a fix? Would appreciate any insights or advice. Thanks!

I’m using 7B Instruct because 14B wasn’t playing nice with the other RAM demands on my machine. In my experience it behaves similarly to the larger models I was using before Qwen.

M1 Max, 32 GB, 2 efficiency cores and 8 or 10 performance cores; if the model isn’t offloaded 100% to the GPU, it comes to a screeching halt.
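A back-of-envelope check helps explain why 14B struggles on a 16 GB machine while 7B is fine. The figures below are rough assumptions (≈4.5 bits/weight for a typical Q4_K_M quantization, plus a flat allowance for KV cache and runtime overhead), not measurements:

```python
# Rough estimate (assumed figures, not measured) of whether a quantized
# model fits in the GPU-visible slice of unified memory.

def model_memory_gb(params_b: float, bits_per_weight: float,
                    overhead_gb: float = 2.0) -> float:
    """Approximate resident size: quantized weights plus a rough
    allowance for KV cache, context buffers, and runtime overhead."""
    weights_gb = params_b * bits_per_weight / 8  # billions of params -> GB
    return weights_gb + overhead_gb

q4_14b = model_memory_gb(14, 4.5)  # ~9.9 GB
q4_7b = model_memory_gb(7, 4.5)   # ~5.9 GB

print(f"14B @ ~4.5 bpw: ~{q4_14b:.1f} GB")
print(f" 7B @ ~4.5 bpw: ~{q4_7b:.1f} GB")
```

macOS reserves a sizeable share of unified memory for the system, so the GPU budget on a 16 GB Mac is well under 16 GB — which is why the 14B model ends up partially on CPU and crawls, while the 7B model fits comfortably.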


Oh jeez, I ran the old Qwen 7B on a 14-inch M1 Pro with 16 GB. There was some delay when prompting, but running it locally with bolt.new the performance was pretty meh.

Curious whether the modified 16B models would work? I guess I’ll have to give it a shot and find out!


I do think that a lot of the improvement will come not solely from the LLM, but also from additional guidance within the application to support smaller or less “intuitive” models that don’t make the necessary leaps during project initialization. Some ideas around this are kicking around and will surely end up on the roadmap.