Crawl4AI + RAG - question

Hi everyone,

Wondering if anyone has used Crawl4AI for RAG use cases like Cole shows in this video?

I’d like to do this kind of setup where I use Crawl4AI to regularly check a given list of URLs, see whether the different websites have been updated, and, if so, update a database with that data.

Can you use the multi-URL function in Crawl4AI if each URL is a different website, as opposed to a different page on the same site?


Yes, you certainly can! It really doesn’t make a difference whether the pages you are scraping in parallel are from the same site or not! :smiley:
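
To make it concrete, here’s a minimal sketch of crawling pages from several different sites in one call, assuming a recent crawl4ai where AsyncWebCrawler.arun_many accepts a plain list of URLs (the pricing URLs below are just the placeholders from your post):

```python
import asyncio
from crawl4ai import AsyncWebCrawler

# Placeholder URLs from the post above -- each one lives on a different site.
URLS = [
    "https://newyorktimes.com/pricing",
    "https://wallstreetjournal.com/pricing",
    "https://washingtonpost.com/pricing",
]

async def main():
    async with AsyncWebCrawler() as crawler:
        # arun_many takes the whole list and crawls it concurrently;
        # it returns one CrawlResult per URL.
        results = await crawler.arun_many(urls=URLS)
        for result in results:
            # result.markdown holds the extracted content you would chunk
            # and embed for RAG; here we just report success/failure.
            print(result.url, "ok" if result.success else "failed")

asyncio.run(main())
```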

Thanks, Cole!

So I guess if I wanted to run the agent, and the only change I wanted to make was to crawl not all the pages on the one website (e.g. ai(dot)pydantic(dot)dev) but instead, say, a list of pages on different websites,

e.g.

  • newyorktimes(dot)com/pricing
  • wallstreetjournal(dot)com/pricing
  • washingtonpost(dot)com/pricing
    etc

Then to implement that change I guess I just need to make a small change to the following function, and it should run as you have implemented it, but for the different sources?

def get_pydantic_ai_docs_urls() -> List[str]:

Having read through the code, this seems like the only area that needs to be modified for the use case I mentioned above. Or have I missed something?
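
In other words, swapping in something like this (just a sketch, keeping your function name and assuming the rest of the agent only cares about the list it returns):

```python
from typing import List

def get_pydantic_ai_docs_urls() -> List[str]:
    """Return the pages to crawl -- now a hand-picked list of pages
    spread across different sites instead of one site's sitemap."""
    return [
        "https://newyorktimes.com/pricing",
        "https://wallstreetjournal.com/pricing",
        "https://washingtonpost.com/pricing",
    ]
```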

Thanks again!


That is exactly right! The only place you have to edit is where you get the list of URLs, and that’s it!