Self-Hosted Web-Crawler and Search Engine

I'd like to research what options exist in this area. I know Dave has had trouble finding things on his blog using Google or DuckDuckGo.

Playing with Roam Research has shown how useful it would be if there were a comprehensive search engine of Dave's writing, covering everything and not just what Google deems important. Maybe something where Dave could manually put extra weight on key pages.
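To make the "extra weight" idea a bit more concrete, here is a rough sketch in Python of how manual weighting could work. It is not tied to any particular search engine. The TinyIndex class, the set_boost method, and the term-count scoring are all made up for illustration, and a real engine would use a better ranking formula.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Hypothetical in-memory index with a manual per-page boost."""

    def __init__(self):
        self.postings = defaultdict(Counter)  # term -> {url: occurrences}
        self.boost = {}                       # url -> manual multiplier

    def add(self, url, text):
        """Index one page by counting how often each term appears in it."""
        for term in tokenize(text):
            self.postings[term][url] += 1

    def set_boost(self, url, factor):
        """Manually weight a key page (1.0 means no change)."""
        self.boost[url] = factor

    def search(self, query):
        """Return matching URLs, best first, after applying manual boosts."""
        scores = Counter()
        for term in tokenize(query):
            for url, count in self.postings.get(term, {}).items():
                scores[url] += count
        ranked = {u: s * self.boost.get(u, 1.0) for u, s in scores.items()}
        # Sort by descending score, then by URL so ties are deterministic.
        return sorted(ranked, key=lambda u: (-ranked[u], u))
```

The point of the design is that the boost is applied after the ordinary text score, so a page Dave marks as important moves up for any query it already matches, without showing up for queries it does not match.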

It would also be useful because Dave has writing on several domains. It would be cool to have a single search engine that searched all of his sites.

Examples: scripting.com, this.how, essaysfromexodus.scripting.com
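As a small sketch of the multi-site part, a crawler would need some way to decide which site a URL belongs to, so it stays within Dave's domains and can label results by site. The Python below is only an assumption about how that might look. The SITES list comes from the examples above, and site_for is a hypothetical helper name.

```python
from urllib.parse import urlparse

# The sites to cover, taken from the examples above.
SITES = ["scripting.com", "this.how", "essaysfromexodus.scripting.com"]


def site_for(url):
    """Return the configured site a URL belongs to, or None if it is off-site."""
    host = (urlparse(url).hostname or "").lower()
    # Check longer names first so essaysfromexodus.scripting.com is not
    # swallowed by the broader scripting.com entry.
    for site in sorted(SITES, key=len, reverse=True):
        if host == site or host.endswith("." + site):
            return site
    return None
```

With something like this, the crawler could skip any link where site_for returns None, and the search results could show which of the sites each hit came from.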

I wonder if Dave has an archive of dave.editthispage.com?