Which LLM give best result?

What models are people using for main and reasoning models? I’ve been using gpt 4o-mini for main, and o3-mini for reasoning. I haven’t tried a lot of combinations yet but wondering if other have.

1 Like

I’ve gotten best results with o3-mini for the reasoning LLM and Claude 3.7 Sonnet for the main LLM! I use OpenRouter so that I can use both an OpenAI and Claude model at the same time.

Thanks, that was helpful.

What about for Embedding?

1 Like

I’m glad! For embedding, I typically just use the text-embedding-3-small from OpenAI.

I tired multiple models, using OpenRouter, to see which one works better. Some of them work and some don’t. I notice a pattern, the ones that end in “:free” don’t work.

Don’t work

  • meta-llama/llama-4-maverick:free
  • google/gemini-2.5-pro-exp-03-25:free
  • deepseek)/deepseek-chat-v3-0324:free
  • google/gemma-3-1b-it:free

Work

  • google/gemini-2.5-pro-preview-03-25
  • anthropic/claude-3.7-sonnet
1 Like

Huh that’s really good to know! I actually have no idea why the free ones wouldn’t work but I’m thinking they have rate limits that make Archon fail.