Most of that seems geared toward full, manual archival rather than automated (lightweight-ish) text caching for search. ArchiveBox (https://github.com/pirate/ArchiveBox) looks promising, although it doesn't capture the "live" browser feed. Readable-Web Proxy (https://github.com/fake-name/ReadableWebProxy) also looks interesting, but it seems to run outside the browser as well.
I was going to permanently save my browsing history by running all my browsers through squid. If squid's StorageManager supports it, I'd put page storage in Postgres, then use PG's FTS with a light UI around it.
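To make the Postgres half of that concrete, here is a rough sketch of what I have in mind; it is not a working squid integration. The table and column names (`pages`, `body`, `tsv`) and the helpers `extract_text`, `store_page` and `search` are all made up for illustration. How the HTML gets out of squid is left open, and `conn` is assumed to be a psycopg-style DB-API connection. The generated column needs Postgres 12 or newer.

```python
from html.parser import HTMLParser

# Hypothetical schema; names are illustrative. Requires Postgres 12+
# for the generated tsvector column.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS pages (
    url        text PRIMARY KEY,
    fetched_at timestamptz NOT NULL DEFAULT now(),
    body       text NOT NULL,
    tsv        tsvector GENERATED ALWAYS AS (to_tsvector('english', body)) STORED
);
CREATE INDEX IF NOT EXISTS pages_tsv_idx ON pages USING gin (tsv);
"""

UPSERT_SQL = """
INSERT INTO pages (url, body) VALUES (%s, %s)
ON CONFLICT (url) DO UPDATE SET body = EXCLUDED.body, fetched_at = now()
"""

SEARCH_SQL = """
SELECT url, ts_rank(tsv, q) AS rank
FROM pages, plainto_tsquery('english', %s) AS q
WHERE tsv @@ q
ORDER BY rank DESC
LIMIT %s
"""


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping script/style/noscript contents."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def extract_text(html):
    """Reduce an HTML document to whitespace-normalized plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def store_page(conn, url, html):
    """Insert or refresh one page. `conn` is assumed to be psycopg-like."""
    with conn.cursor() as cur:
        cur.execute(UPSERT_SQL, (url, extract_text(html)))
    conn.commit()


def search(conn, terms, limit=20):
    """Return (url, rank) rows for a plain-language query."""
    with conn.cursor() as cur:
        cur.execute(SEARCH_SQL, (terms, limit))
        return cur.fetchall()
```

Storing only the extracted text rather than the raw response keeps the table small, which fits the "lightweight-ish" goal, at the cost of not being able to re-render the page later.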
Maybe I just need to suck it up and roll my own.