If I am right, if you delete all elements in F14, there will be no tombstone. However, if you delete some, there may still be some tombstones left. You may still need rehashing due to those remaining tombstones. In that sense, F14 is not eliminating the worst case; it makes the worst case happens less often. Then the question is how "less often". This probably depends on the access pattern.
What is your access pattern? Are insertions and deletions frequently interleaved? Or do you insert a batch and then delete another batch? It will be interesting to have a micro-benchmark to measure how the F14 strategy works in practice.
> I believe khash does as well.
Yes, khash uses the traditional tombstone solution. It rehashes in the same space when there are too many tombstones.
It would be nice to have a deeper understanding of how/why this works, but in practice it seems fine. Under a continuous insert and erase workload the number of chunks with an overflow fluctuates a bit but stays low enough to keep probe lengths short, so F14 never needs to rehash to clean them up. Batched adds and then removes might let you bias a bit toward having longer probes for the remaining keys, but it never gets too bad. A truly adversarial workload could make things pretty bad, but they could also just insert lots of keys that all have the same hash code.
> What is your access pattern? Are insertions and deletions frequently interleaved?
It's basically random, but generally they are very interleaved and most of the time the number of insertions and deletions are approximately equal over some period of time (~seconds). This is for keeping track of orders while processing a depth of book feed, in case you're familiar with that sort of thing.
F14 is using some kind of reference counting scheme for tombstones, described in the article. I don't fully understand it, but it definitely seems different than the standard tombstone-based algorithm, which I do understand. In any case, when using something like dense_hash_map, I see huge spikes in processing time (due to rehashing) that I don't see when using F14. I also resize the hash tables up front to avoid having to do so later on, but of course for my use case that doesn't help with implementations that use a traditional tombstone algorithm unless I use ridiculously large initial sizes.
What is your access pattern? Are insertions and deletions frequently interleaved? Or do you insert a batch and then delete another batch? It will be interesting to have a micro-benchmark to measure how the F14 strategy works in practice.
> I believe khash does as well.
Yes, khash uses the traditional tombstone solution. It rehashes in the same space when there are too many tombstones.