Embedding Dimensions for Ollama Incorrect - Need Clarification

I was hoping for a little clarification regarding embedding dimensions.

  1. Installed Archon using Docker
  2. Used .env to set variables
  3. Set base URL: http://localhost:11434/v1
  4. Set reasoner and primary: qwq:latest
  5. Embedding model: nomic-embed-text

According to the documentation, I set Embedding Dimensions to 768 and used the provided SQL for Supabase.

I then ran Crawl Pydantic AI Docs. The crawl appeared to succeed: the log showed that all but 3 URLs were processed and stored.

But they were never actually stored. Database Statistics shows zero.

I checked my Docker logs, and the inserts into the database had failed: the table expected 768-dimensional embeddings, but the vectors being inserted had 1536 dimensions.

I deleted the database and recreated it with 1536 dimensions (even though I am using Ollama), leaving the embedding model as nomic-embed-text. It worked.

So is the embedding dimension with Ollama not always 768? Does it have something to do with the new qwq:32b? I’m confused: how can we be sure what to set this to?

Thank you for the help!

This is actually baffling to me because I’m 99% sure nomic-embed-text is always 768 dimensions! Even did a sanity check Google search just now to confirm. The LLM you use shouldn’t matter since it’s all up to the embedding model. What command did you use to pull nomic-embed-text? And did you change anything through config/environment variables for Ollama?

No, this happened just from following the instructions given in the Archon Streamlit UI.

Not sure what you mean by command to pull nomic-embed-text. This happens using the Documents tab and supplied buttons.

I haven’t used this in Archon, but in my previous use of Ollama the standard practice was to run ollama pull from the CLI. I believe it will also pull automatically if you try to use a model you don’t have locally. I’m not sure how Archon uses the info provided. If it assumes Ollama is already running with the model available, then maybe you need to pull it yourself.

Yeah exactly! @Dupre Archon won’t pull the embedding model for you, so it assumes you have already run the “ollama pull” command to fetch nomic-embed-text, for example:

ollama pull nomic-embed-text
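
Once the model is pulled, one way to settle the 768 vs 1536 question is to ask the endpoint for a single embedding and count its length. The sketch below is only a rough check, not how Archon does it. It assumes the base URL from the original post (http://localhost:11434/v1) serves an OpenAI-style /embeddings route and returns the usual {"data": [{"embedding": [...]}]} shape. The function names are made up for illustration.

```python
import json
import urllib.request


def embedding_dimension(response_body: dict) -> int:
    """Return the vector length from an OpenAI-style embeddings response.

    Assumes the shape {"data": [{"embedding": [...]}]}.
    """
    return len(response_body["data"][0]["embedding"])


def fetch_embedding_response(base_url: str, model: str, text: str = "hello") -> dict:
    """POST one short input to <base_url>/embeddings and return the parsed JSON.

    The route and payload follow the OpenAI embeddings convention; that the
    local server accepts them is an assumption, not something confirmed here.
    """
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/embeddings",
        data=payload,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

With Ollama running, calling embedding_dimension(fetch_embedding_response("http://localhost:11434/v1", "nomic-embed-text")) gives the number to compare against the dimension used in the Supabase SQL. If the app runs inside Docker, it is worth running the same check from inside the container, since localhost there may not point at the host's Ollama.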

I’m getting the same problem. When I check the logs in the Docker container, it looks like the documentation insert isn’t respecting the dimension I set for the embedding model. I am using nomic-embed-text and specifically created the database with 768 dimensions, but when the documentation is inserted, the vectors still have 1536 dimensions.

Error inserting chunk: {'code': '22000', 'details': None, 'hint': None, 'message': 'expected 768 dimensions, not 1536'}
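
The error above says the table wants 768 values and the vector being inserted has 1536, so the mismatch is on the vector side, not the table side. A small check run just before the insert can show what kind of vector is arriving. This is a hypothetical helper, not part of Archon. The all-zeros branch is there because one possibility (unconfirmed in this thread) is that the 1536-length vector is a placeholder or default instead of a real nomic-embed-text embedding.

```python
from typing import Sequence


def diagnose_embedding(vector: Sequence[float], expected_dim: int) -> str:
    """Describe how a vector compares with the dimension the table expects."""
    if len(vector) == expected_dim:
        return "ok"
    # A non-empty vector made only of zeros is unlikely to be a real embedding.
    if len(vector) > 0 and all(value == 0 for value in vector):
        return (
            f"length {len(vector)} != {expected_dim}, and all zeros "
            "(possibly a placeholder, not a real embedding)"
        )
    return f"length {len(vector)} != {expected_dim}"
```

Logging the result of diagnose_embedding(vector, 768) for each chunk would show whether the 1536-length vectors carry real values or not, which narrows down where to look next.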

Hi, did you find a solution to this problem?

I am getting the same problem too.

Hmm that’s really strange… I’ll have to look into this.

I didn’t find a solution, but I did what OP did: I updated the table to use 1536 dimensions and ran the application again. It worked well and the database was seeded properly.