This may not be part of the spec per se, but you can invalidate them by distributing a Bloom filter of revoked tokens and their revocation times; services just need to poll the auth service at whatever granularity they need. That still makes scaling much simpler than one big session store.
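For concreteness, here's a minimal sketch of such a filter in Python, keyed on the token's jti claim. The sizing and hash scheme are arbitrary choices on my part, not anything from a spec:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter over revoked token IDs (jti claims)."""

    def __init__(self, size_bits: int = 1 << 20, num_hashes: int = 7):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item.encode()).digest()
        for i in range(self.num_hashes):
            chunk = digest[i * 4:(i + 1) * 4]
            yield int.from_bytes(chunk, "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False positives are possible (a valid token looks revoked);
        # false negatives are not.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

Each service pulls a fresh filter on its own schedule, and a false positive just means an occasional valid token gets treated as revoked and the user re-authenticates, which is a tolerable failure mode.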
The problem, as the linked 'Slightly Sarcastic Flowchart' says, is: how do you handle the invalidation server going down? If you just assume that tokens are valid in this case, then an attacker just has to kill the server and they're back to being impossible to invalidate. If you assume they're invalid, you're back to having centralised state, which mostly defeats the purpose.
The Bloom filter allows you to largely distribute the work of the server, and serves as a reasonable proxy during a failure of the server. More importantly, though, there's no reason it needs to be a centralized server. Invalidated tokens could be broadcast to a wide number of servers that each maintain the invalidated token list (it's a great case for a CRDT, since it's append-only with a TTL). Normally it'd be a pretty compact list, and if it isn't, you probably want to take a more defensive posture anyway.
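A toy sketch of that CRDT, as I'd picture it: a grow-only set where merge is a commutative, idempotent union, and where entries can be dropped once the underlying token's own exp has passed (which is what keeps the list compact). The names here are mine, not from any library:

```python
import time

class RevocationSet:
    """Grow-only set of revoked token IDs with per-entry expiry.

    Merge is a commutative, idempotent union, so replicas can gossip
    updates in any order and still converge (a G-Set CRDT). An entry
    is safe to garbage-collect once the JWT's own 'exp' has passed,
    because the token would be rejected as expired anyway.
    """

    def __init__(self):
        self.entries: dict[str, float] = {}  # jti -> expiry timestamp

    def revoke(self, jti: str, exp: float) -> None:
        # Keep the later expiry if the same token is revoked twice.
        self.entries[jti] = max(self.entries.get(jti, 0.0), exp)

    def merge(self, other: "RevocationSet") -> None:
        for jti, exp in other.entries.items():
            self.revoke(jti, exp)

    def gc(self) -> None:
        # Drop entries whose tokens have expired on their own.
        now = time.time()
        self.entries = {j: e for j, e in self.entries.items() if e > now}

    def is_revoked(self, jti: str) -> bool:
        return self.entries.get(jti, 0.0) > time.time()
```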
...and beyond that, if an attacker can take out your invalidation server, which needn't be directly accessible to the public, you've already had a pretty serious security breach. I think the least of your problems would be the invalidation server.
Thank you for this. While I love a good old nerd round of 'stump the wizard', I also appreciate someone pumping the brakes with some realism.
Some security conversations reach the point of stringing together a highly unlikely chain of scenarios requiring multiple pivots, multiple concurrent failures (stochastic or otherwise), and a state-sponsored actor, all in order to slap a 'do not use' recommendation on something.
If companies like Nike can run _Magento_ and educational sites still require Flash then I can use JWTs. Maybe...
It’s largely irrelevant, because the revocation Bloom filters are cached on each service, and if the auth service is down then no new tokens can be revoked anyway, so the cached list is still accurate enough.
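A sketch of how I'd expect that caching to work, poll loop and stale-on-failure included; the endpoint URL and the parse_filter hook are placeholders I made up, not a real API:

```python
import threading
import time
import urllib.request

class CachedRevocationFilter:
    """Keeps a locally cached copy of the revocation filter.

    The polled URL and parse_filter callable are hypothetical stand-ins
    for however the auth service actually serializes its filter.
    """

    def __init__(self, url, parse_filter, poll_seconds=30):
        self.url = url
        self.parse_filter = parse_filter
        self.poll_seconds = poll_seconds
        self.filter = None  # last successfully fetched filter
        threading.Thread(target=self._poll, daemon=True).start()

    def _poll(self):
        while True:
            try:
                raw = urllib.request.urlopen(self.url, timeout=5).read()
                self.filter = self.parse_filter(raw)
            except OSError:
                # Auth service unreachable: keep serving the stale copy.
                # Nothing new can be revoked while it's down, so the
                # stale list is still accurate enough.
                pass
            time.sleep(self.poll_seconds)

    def is_revoked(self, jti):
        # Before the first successful fetch we fail open here;
        # failing closed instead is the other defensible choice.
        return self.filter is not None and self.filter.might_contain(jti)
```

Whether to fail open or closed before the first successful fetch is the one real policy decision left, and it's the same trade-off the flowchart pokes at.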
TBH I don’t think the author of the article has experienced the nightmare that is a hot session store at large scale. You end up troubleshooting IO latency issues with basically no tooling that can show you where the problem is, up against hardware limits and whatever black box your cloud provider has built. Whereas with JWT, everything happens in normal user space and can be reasoned about, with a bit of complexity, but without razor-thin latency dependencies on IO performance.
I'm confused by your comment. Session stores seem like the easiest storage to scale horizontally. The workload is just a distributed hash, _maybe_ with atomic updates. "Three nines" durability over one day is perfectly acceptable. Use a consistent hash ring, no replication, and add nodes until you achieve the performance required.
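Agreed that the shape of the problem is simple; a bare-bones consistent hash ring is only a few lines (the vnode count and hash function here are arbitrary picks):

```python
import bisect
import hashlib

class HashRing:
    """Consistent hash ring: each node gets many virtual points on the
    ring, and a session key maps to the first node clockwise from its
    hash. Adding a node only remaps roughly 1/N of the keyspace."""

    def __init__(self, nodes: list[str], vnodes: int = 128):
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s: str) -> int:
        return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")

    def node_for(self, session_id: str) -> str:
        idx = bisect.bisect(self.keys, self._hash(session_id)) % len(self.ring)
        return self.ring[idx][1]
```

With no replication, adding or losing a node drops the sessions on roughly 1/N of the ring, which is fine under the durability target above.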
It’s easier if you only ever have one lookup key per session, but often that’s not sufficient; consider terminating sessions by email address as well as by session id (sketched below). The other issue is that every service is still making high rates of network calls that need low latency, since they’re often inline with the UX. So you’re relying on a chain of IO between services that is nearly impossible to performance-troubleshoot with today’s tooling.
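To illustrate the two-key point, a toy in-memory version; a real store has to keep both indexes consistent across nodes and concurrent updates, which is where it stops being a simple distributed hash:

```python
class SessionStore:
    """Sessions are keyed by session id, but 'log this user out
    everywhere' needs a second index from email to all of that
    user's session ids, and the two must stay in sync."""

    def __init__(self):
        self.sessions: dict[str, dict] = {}      # session_id -> data
        self.by_email: dict[str, set[str]] = {}  # email -> session ids

    def create(self, session_id: str, email: str, data: dict) -> None:
        self.sessions[session_id] = {"email": email, **data}
        self.by_email.setdefault(email, set()).add(session_id)

    def terminate(self, session_id: str) -> None:
        session = self.sessions.pop(session_id, None)
        if session:
            self.by_email.get(session["email"], set()).discard(session_id)

    def terminate_all_for(self, email: str) -> None:
        for sid in list(self.by_email.get(email, ())):
            self.terminate(sid)
```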