Hello everyone,
I’m making steady progress on Archon, my AI agent that builds AI agents!
My goal is to develop a tool that both helps me understand the process of building AI agents and improves my own agent over time.
My Use Case
I want to create a clean Supabase knowledge base containing all the documentation I need (e.g., Pydantic AI, LangGraph, n8n).
Over time, different versions of Archon will have access to this structured and well-maintained knowledge base.
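To make this concrete, here is a minimal sketch of how I'm picturing the ingestion step. The table name, column names, and model choice are just my assumptions, not the real Archon schema:

```python
import os

from openai import OpenAI
from supabase import create_client

# Hypothetical names; adjust to whatever the real schema ends up being.
TABLE = "documents"
EMBED_MODEL = "text-embedding-3-small"  # 1536-dimensional OpenAI model

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

def ingest_chunk(source: str, content: str) -> None:
    """Embed one documentation chunk and store it alongside its metadata."""
    response = openai_client.embeddings.create(model=EMBED_MODEL, input=content)
    embedding = response.data[0].embedding  # list[float], length 1536 for this model

    supabase.table(TABLE).insert({
        "source": source,        # e.g. "pydantic-ai" or "langgraph"
        "content": content,
        "embedding": embedding,  # pgvector column declared as vector(1536)
    }).execute()

ingest_chunk("pydantic-ai", "Agents are the primary interface in Pydantic AI...")
```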
Key Questions
Embeddings Model Choice:
• Should I always use the same embedding model for everything I add to the documentation knowledge base?
• Do the embeddings for the stored content and the embeddings for the queries need to come from the same provider, or even the exact same model?
Model-Specific Considerations:
• Each model has its own vector size (e.g., 1536 dimensions for OpenAI's text-embedding-3-small).
• How do I find the vector size of a specific model? I checked the Hugging Face model card for nomic-embed-text, but I couldn't find this information. Any tips? (I've sketched my current workaround just below.)
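The most reliable trick I've found so far is to embed a throwaway string and measure the result, since that works regardless of what the model card says. A quick sketch; the Ollama call assumes nomic-embed-text is pulled locally, and the exact dimensions in the comments are my own reading:

```python
from openai import OpenAI
import ollama  # assumes the `ollama` Python package and a running local Ollama server

def embedding_dimension_openai(model: str = "text-embedding-3-small") -> int:
    """Embed a dummy string and return the vector length."""
    response = OpenAI().embeddings.create(model=model, input="dimension check")
    return len(response.data[0].embedding)

def embedding_dimension_ollama(model: str = "nomic-embed-text") -> int:
    """Same idea, but via a locally pulled Ollama model."""
    response = ollama.embeddings(model=model, prompt="dimension check")
    return len(response["embedding"])

print(embedding_dimension_openai())  # 1536 for text-embedding-3-small
print(embedding_dimension_ollama())  # 768 for nomic-embed-text, if I've read correctly
```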
Long-Term Strategy:
• What’s the best approach to building a knowledge database that can evolve over time?
• I've seen Cole switch between models (e.g., nomic-embed-text vs. OpenAI's text-embedding-3-small). How should I think about embedding model updates? (My current idea is sketched below.)
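One idea I'm considering is to record which model produced each embedding, so that switching models later becomes an explicit re-embedding migration rather than a silent mismatch between stored vectors and query vectors. A rough sketch of what I mean; the table and column names are placeholders:

```python
import os

from openai import OpenAI
from supabase import create_client

NEW_MODEL = "text-embedding-3-small"  # the model I'm migrating to

openai_client = OpenAI()
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

def reembed_outdated_rows() -> None:
    """Find rows embedded with an older model and refresh them with NEW_MODEL.

    Assumes each row stores `embedding_model` next to its `embedding`,
    so queries can always be embedded with the same model as the stored vectors.
    """
    rows = (
        supabase.table("documents")
        .select("id, content, embedding_model")
        .neq("embedding_model", NEW_MODEL)
        .execute()
        .data
    )
    for row in rows:
        response = openai_client.embeddings.create(model=NEW_MODEL, input=row["content"])
        supabase.table("documents").update({
            "embedding": response.data[0].embedding,
            "embedding_model": NEW_MODEL,
        }).eq("id", row["id"]).execute()
```

If the new model has a different vector size, the pgvector column type would also have to change, which is part of why I'd like to settle on an approach early.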
Would love to hear your insights! Thanks in advance for your help.