Hey all!
I’ve noticed that sometimes RAG can’t return any data to me or returns incorrect data. For example, I added a PDF file successfully to the Qdrant database. After that, I asked for some information in the RAG chat, and after thinking for 1-2 minutes, it returned “I don’t know anything about …” But a day ago, I added another file, and that one works fine and is still working as expected. So, does it depend on the file size or format, the database I’m using, or the LLM type? Or maybe better results require a GPU rather than just a Core i5 CPU?
I currently host everything locally and use the “ai starter kit”.
I think correct tuning of the “Recursive Character Text Splitter” could be a factor as well?
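For anyone curious what tuning the splitter actually changes, here’s a minimal sketch of the *idea* behind recursive character splitting (illustrative only; this is not the actual LangChain implementation the starter kit uses). The key knobs are the chunk size and the separator order: it splits on the coarsest separator that keeps chunks under the limit, and recurses when a piece is still too big.

```python
# Minimal sketch of recursive character text splitting (illustrative,
# not the real LangChain RecursiveCharacterTextSplitter).
# It tries separators from coarse to fine and hard-cuts as a last resort.

def recursive_split(text, chunk_size=200, separators=("\n\n", "\n", " ")):
    if len(text) <= chunk_size:
        return [text]
    for sep in separators:
        if sep in text:
            parts = text.split(sep)
            chunks, current = [], ""
            for part in parts:
                candidate = current + sep + part if current else part
                if len(candidate) <= chunk_size:
                    current = candidate  # keep growing the current chunk
                else:
                    if current:
                        chunks.append(current)
                    if len(part) > chunk_size:
                        # This piece alone is too big: recurse with finer separators
                        chunks.extend(recursive_split(part, chunk_size, separators))
                        current = ""
                    else:
                        current = part
            if current:
                chunks.append(current)
            return chunks
    # No separator present at all: hard-cut into fixed-size slices
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
```

A smaller `chunk_size` gives more focused chunks (good for small LLMs), at the cost of losing surrounding context; overlap between chunks (which the real splitter also supports) helps with that trade-off.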
Hey @M3RCY! I see this kind of thing happen a lot with smaller LLMs in n8n, which I’m guessing you are using since you said you’re running things locally on a CPU. Which model are you using?
You could probably optimize your text splitter (or chunking in general), but I’d first try with a larger LLM and just see if the results change a lot. I’m guessing they will! I know with a CPU you won’t be able to run a larger model yourself, but you could always use GPT or some other LLM through OpenRouter temporarily just to test results with something bigger.
Hey @ColeMedin ,
Thanks for the answer!
I will try it with Gemini or DeepSeek, but am I wrong if I say that when we use RAG, we just want a simple answer from the database and not any kind of analyzed information with distorted details from the LLM? So, I’m not sure why larger LLMs would help with their wider knowledge of the world. All they have to do is take information from the database. Anyway, I’m no expert in the LLM world; it’s just a guess.
No, that’s a good thought! The reason a larger LLM helps is that, a lot of the time, there is a LOT of text returned from a RAG retrieval. Smaller LLMs get confused by all that information and sometimes don’t pick out the single piece they need to answer the question correctly.
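One practical way to help a small model with this, beyond swapping the LLM, is to limit how many retrieved chunks reach the prompt. Here’s a rough sketch of that idea: rank chunks by cosine similarity to the query embedding and keep only the top k (in a real setup, Qdrant does this ranking server-side, and the `top_k_chunks` helper and plain-list embeddings here are just for illustration).

```python
# Sketch: keep only the top-k most relevant chunks before prompting the LLM,
# so a small model sees less distracting text. Embeddings are plain lists of
# floats here; a vector DB like Qdrant would do this ranking for you.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k_chunks(query_vec, chunks, k=3):
    """chunks: list of (text, embedding) pairs; returns the k best texts."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

Lowering k (e.g. from 10 to 3) shrinks the context a small LLM has to sift through, which often fixes exactly the “I don’t know anything about …” failure mode, as long as the right chunk still makes the cut.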