> Maybe an additional "lower resolution" index on top of the map could help as a heuristic to skip whole regions
That sounds a bit like circling back around to a tree-based system (albeit with depth=2) where each parent is somehow summarizing the state of child nodes below it.
Yeah it's definitely similar to having parent nodes summarize their children. The motivation for the flat array structure was to have an entirely fixed-size array that's contiguous in memory to make it friendly for SIMD filtering. Having the data in the tree on the other hand could probably filter a little better since it can summarize at multiple levels, but my bet is that the flat array allows for a faster implementation for the case here with a few thousand intervals. On the one hand, the whole working set should probably fit in L1 cache for both the tree or a flat array, but on the other hand, a tree with pointers needs to do some pointer chasing. The tree could alternatively be represented in a flat array, but then elements need to be shuffled around. I don't know which is faster in practice for this problem, but my gut says the speed of a flat array + SIMD probably outweighs potential asymptotic improvements for the few thousand case
I don't think I've put the same amount of thought into it, but my gut feeling is:
1. If you're summarizing "a contiguous chunk exists from A to B", that would require a mutable rebalancing tree and the borders constantly shift and merge and subdivide.
2. If you're instead just summarizing "the range between constant indices X-Y is [1/0/both] right now", then that can be done with a fixed tree arrangement that can have an exact layout in memory.
A simple example of the latter would be 8 leaves of 01110000, 4 parents of [both, 1, 0, 0], 2 grandparents of [both, 0], etc. The 3-value aspects could also be encoded into two bitmasks, if "has a 1" and "has a 0".
That sounds a bit like circling back around to a tree-based system (albeit with depth=2) where each parent is somehow summarizing the state of child nodes below it.