As the topic says, I can't make the DeepSeek R1 model work in the bolt.diy project.
This is my setup and everything I have already done:
1) Reinstalled Ollama to get the latest version (0.5.7).
2) Pulled the DeepSeek R1 14B model onto my system (deepseek-r1:14b).
3) Ran "ollama serve" to start the server that gives me access to the models (it runs at 127.0.0.1:11434 / localhost:11434).
4) Cloned the bolt.diy project and ran "npm install".
5) Created a .env file and set OLLAMA_API_BASE_URL to the recommended value (http://127.0.0.1:11434); see the sanity-check sketch after this list.
6) Ran the project with "npm run dev".
6.1) Ollama is correctly set in the provider settings.
6.2) The model is correctly selected as deepseek-r1:14b.
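For anyone reproducing this, a quick sanity check along these lines (a sketch, assuming the default Ollama port and the OLLAMA_API_BASE_URL variable name used above) confirms the server and the model are reachable before starting bolt.diy:

  # .env entry, scheme included (matches the base URL the debug log reports below)
  OLLAMA_API_BASE_URL=http://127.0.0.1:11434

  # Verify the server answers and the model was actually pulled
  curl http://127.0.0.1:11434/api/tags   # should list deepseek-r1:14b
  ollama list                            # same check from the CLI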
Once I run my prompt, I receive these logs:
DEBUG api.chat Total message length: 2, words
INFO LLMManager Getting dynamic models for Ollama
INFO LLMManager Got 4 dynamic models for Ollama
WARN stream-text MODEL [claude-3-5-sonnet-latest] not found in provider [Ollama]. Falling back to first model. deepseek-r1:14b
INFO stream-text Sending llm call to Ollama with model deepseek-r1:14b
DEBUG Ollama Base Url used: http://127.0.0.1:11434
ERROR api.chat AI_RetryError: Failed after 3 attempts. Last error: Internal Server Error
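To narrow down whether that Internal Server Error comes from Ollama itself rather than from bolt.diy, one thing worth trying (a sketch using the standard Ollama REST API; the prompt is just a placeholder) is calling the generate endpoint directly:

  curl http://127.0.0.1:11434/api/generate \
    -d '{"model": "deepseek-r1:14b", "prompt": "hello", "stream": false}'
  # If this also fails with a 500, the problem is on the Ollama side (often memory or
  # context related) rather than in bolt.diy's request handling.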
@thecodacus As far as I can see, it is loading the model and trying to answer, but it fails after 3 attempts, just like GPT-4o does when you reach the token limit.
So I don't think it's a connection problem, if that was your thought.
It looks like it's not working well with bolt.diy at the moment. Maybe a different system prompt would help, I don't know. Strangely, the deepseek-r1 from OpenRouter works.
In my case the R1 8B model works fine with bolt.diy, whereas R1 14B thinks a lot (I can hear the fans actively working to control the temperature) and after a while it just types a few words and hangs.
I am trying it on a Mac with the M4 chip (10-core CPU, 10-core GPU),
16GB unified memory,
1TB SSD storage.
Not sure if the issue is related to the GPU or something else. However, the model works fine via the command line.
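One way to check whether the 14B model actually fits on the GPU is to look at how Ollama has scheduled it while it is loaded (the standard "ollama ps" command, run from another terminal while bolt.diy or the CLI is using the model):

  ollama ps
  # The SIZE and PROCESSOR columns show how much memory the model takes and whether part of it
  # spilled over to the CPU, which would explain the long "thinking" and the eventual hang.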
14B is a bit too much for 16GB of unified memory. It usually works fine with a small context, but bolt uses an 8k-10k context on average (with context optimization enabled), which is a bit too much for a 14B model on 16GB.
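If context size is the limiting factor, one experiment (a sketch, assuming the standard Ollama Modelfile mechanism; the tag name is made up) is to build a variant of the model with a pinned, smaller context window and select that tag in bolt.diy. Bolt's prompts may get truncated at this size, so it is mainly a way to confirm the memory hypothesis:

  # Modelfile capping the context window; num_ctx is a standard Ollama parameter
  cat > Modelfile <<'EOF'
  FROM deepseek-r1:14b
  PARAMETER num_ctx 4096
  EOF

  # Build the new tag (arbitrary name) and pick it in the bolt.diy model dropdown
  ollama create deepseek-r1-14b-4k -f Modelfile

Otherwise, sticking with the 8B tag that reportedly works is the simpler option on 16GB.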