If you are interested in optimizing the project further, I suggest rebuilding SQLite with Profile-Guided Optimization (PGO). I have collected as much material as I could (including many related benchmarks) in my repo: https://github.com/zamazan4ik/awesome-pgo . For SQLite and PGO specifically, see this forum post: https://sqlite.org/forum/forumpost/19870fae957d8c1a
No, I didn't test PGO specifically on its FTS functionality. However, I am 99% sure that enabling PGO for FTS would give a ~5-10% performance win too.
Thanks. I looked through the notes you posted but, honestly, they seem a bit disorganised and rather complicated. I'm not sure the juice is worth the squeeze for a 10% gain at most. We are getting 2-3s query times on 15m records after sharding, and that seems good enough for now. We can speed this up further with more focused queries. Most of the time is spent on the bm25 ranking step. For anything smaller, like 200k records, it's already blazingly fast.
Whether a 10% performance win is worth it depends, of course, on your case. In some situations even a 2x-3x improvement is not worth the effort; in others, even a few percent (or half a percent) is a huge win, especially at Google-like scale. If other approaches work for you, great: you don't need to spend your time on PGO!
I do some sharding, mainly by grid-based bounding boxes. This lets a search use the FTS index for a single area, or for several overlapping areas. It means there is some data duplication, but that is minimal compared to the speed boost.
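For anyone wondering what grid-based sharding might look like, here is a minimal Python sketch. The 1-degree cell size, the function names, and the `(min_lon, min_lat, max_lon, max_lat)` box layout are all assumptions for illustration, not the actual setup described above.

```python
import math
from collections import defaultdict

CELL_DEG = 1.0  # assumed cell size in degrees; tune to the data density


def cells_for_bbox(min_lon, min_lat, max_lon, max_lat, cell=CELL_DEG):
    """Return every grid cell (col, row) that the bounding box overlaps."""
    c0, c1 = math.floor(min_lon / cell), math.floor(max_lon / cell)
    r0, r1 = math.floor(min_lat / cell), math.floor(max_lat / cell)
    return [(c, r) for c in range(c0, c1 + 1) for r in range(r0, r1 + 1)]


def assign_to_shards(records, cell=CELL_DEG):
    """Group records by cell; a record spanning several cells is duplicated."""
    shards = defaultdict(list)
    for rec_id, bbox in records:
        for key in cells_for_bbox(*bbox, cell=cell):
            shards[key].append(rec_id)
    return shards


# Hypothetical records: (id, (min_lon, min_lat, max_lon, max_lat))
records = [
    ("a", (0.2, 0.2, 0.4, 0.4)),  # fits in one cell
    ("b", (0.9, 0.5, 1.1, 0.6)),  # straddles two cells, so it is stored twice
    ("c", (5.5, 5.5, 5.6, 5.6)),  # far away, never touched by the query below
]
shards = assign_to_shards(records)

# A query only needs the shards whose cells overlap the search area.
query_cells = cells_for_bbox(0.0, 0.0, 0.95, 0.95)
candidates = {rec for key in query_cells for rec in shards.get(key, [])}
```

In a setup like this, each shard would presumably hold its own FTS index, and a query would run only against the shards listed in `query_cells`. The duplication cost is limited to records that cross a cell boundary.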
A lot of the issues here have been resolved in the documentation.
For example, there are now new docs on permissions and deployment, as well as more information on backups. Others are currently being worked on ;)
The issue is still open as we haven’t resolved all of it yet, so it makes a good reference.
TL;DR - Profile-Guided Optimization (PGO) can improve Typst's performance by 10%+