- Intro
- I, like many others, am not an expert in models, parameters, or quantization. I’d like some simple advice on best practices for model selection.
- Gold Standard
- Is Anthropic Claude 3.5 Sonnet (new) always going to give the best results, i.e. the gold standard?
- Is there anything else which is better or similar? From other commercial providers?
- What is the best OpenAI model to use, and how does it compare to the gold standard?
- Criteria
- For everything below, how does it compare to the gold standard in terms of:
- quality of results
- cost
- speed
- Bolt.new
- How does oTToDev with Claude 3.5 Sonnet compare to Bolt.new’s official product? (Remember: in terms of results, cost, and speed.)
- Free (Local and Online)
- Which free option will give the best results, and how does it compare to the gold standard?
- What is the best free local model, both if you have lots of GPU power and if you have less?
- What is the best free online model, e.g. via OpenRouter?
- Budget Online
- What is the best online option that is not free but much less expensive than the gold standard?
- How does OpenRouter compare to, say, hosting your own model with something like Novita?
- Uncensored
- Is it relevant at all to Bolt development whether a model is uncensored? For example, would it let you create an AI girlfriend site where other models wouldn’t?
Gold Standard & Comparison:
- Claude 3.5 Sonnet represents one of the current leading models in terms of reasoning, analysis, and general capabilities
- GPT-4 Turbo is comparable in many aspects, sometimes performing better on certain technical/coding tasks
- Both are considered top-tier, with different strengths and slight variations in performance
For development specifically:
- Gold Standard Options:
- Claude 3.5 Sonnet: Excellent reasoning, strong coding abilities, well-suited for complex tasks
- GPT-4 Turbo: Very strong technical capabilities, good for coding
- Results: Both excellent
- Cost: Both premium pricing
- Speed: Claude 3.5 tends to be faster than GPT-4
- Free Options (Local):
- Qwen-2.5-Coder-32b: Best free local option if you have significant GPU power
- Mistral or DeepSeek: Good for limited GPU resources
- Results: roughly 60-80% of “gold standard” capability (see the endpoint sketch at the end of this answer for wiring these up)
- oTToDev vs Bolt.new with Claude 3.5 Sonnet:
You’ll get fairly similar results, though the commercial Bolt.new includes some prompt additions that make it perform a bit better at times.
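To make the free-local vs. budget-online comparison concrete, here is a minimal sketch (not oTToDev’s actual implementation) assuming Node 18+ for the built-in fetch, an Ollama install for the local case, and an OpenRouter account for the hosted case. The model names and the OPENROUTER_API_KEY variable are assumptions to verify against each provider’s docs.

```typescript
// Minimal sketch of an OpenAI-compatible chat request. The same call shape
// works against a local Ollama server (free, your GPU) or a hosted router
// like OpenRouter (pay per token, or a free-tier model); only the base URL,
// API key, and model name change. Model names and the env var below are
// assumptions; check each provider's docs for the exact identifiers.

type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

async function chat(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
  apiKey?: string,
): Promise<string> {
  const res = await fetch(`${baseUrl}/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Local servers usually don't need a key; hosted ones do.
      ...(apiKey ? { Authorization: `Bearer ${apiKey}` } : {}),
    },
    body: JSON.stringify({ model, messages }),
  });
  if (!res.ok) {
    throw new Error(`Request failed: ${res.status} ${await res.text()}`);
  }
  const data = await res.json();
  return data.choices[0].message.content;
}

async function main() {
  const messages: ChatMessage[] = [
    { role: "user", content: "Write a TypeScript debounce helper." },
  ];

  // Free local option: Qwen2.5-Coder served by Ollama on its default port
  // (assumes you have already pulled the model, e.g. qwen2.5-coder:32b).
  const local = await chat("http://localhost:11434/v1", "qwen2.5-coder:32b", messages);

  // Budget online option: the same request against OpenRouter. The model
  // slug is an assumption; check OpenRouter's model list for the exact name.
  const hosted = await chat(
    "https://openrouter.ai/api/v1",
    "qwen/qwen-2.5-coder-32b-instruct",
    messages,
    process.env.OPENROUTER_API_KEY,
  );

  console.log({ local, hosted });
}

main().catch(console.error);
```

The practical takeaway is that switching between a free local model and a cheap hosted one is mostly a configuration change (base URL, key, model name), so it is easy to try Qwen2.5-Coder locally first and fall back to OpenRouter when your GPU is the bottleneck.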