I still don't understand why CPUs don't provide support for 'fat pointers'.
ARM has 'load pair' to update two registers in one go. It would just need a 'load/store from array' instruction that checks that a pointer is inside the address pair, and you'd have a very efficient mechanism (efficient from an instruction count point of view at least; of course the data bandwidth and cache impact are still there).
"Fatter than normal" pointers have a lot of disadvantages for the upsides they give you (this has been explored extensively as part of CHERI, etc). For the purpose of finding use-after-free bugs and buffer overflows, they are, IMHO, not a great fit (there are other uses, of course).
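To make the proposed check concrete, here is a rough Python model of what such a 'load from array' instruction might do. Everything in it (`FatPointer`, `checked_load`, the half-open `[base, limit)` convention) is a made-up illustration, not an existing ARM instruction or ABI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FatPointer:
    """Hypothetical fat pointer: an address plus the bounds of its array."""
    addr: int   # address being accessed
    base: int   # first valid address of the array
    limit: int  # one past the last valid address (assumed convention)


def checked_load(memory: bytes, ptr: FatPointer, size: int = 1) -> bytes:
    """Model of a hypothetical 'load from array' instruction.

    Raises IndexError (standing in for a hardware trap) unless the whole
    access [addr, addr + size) lies inside [base, limit).
    """
    if size <= 0 or ptr.addr < ptr.base or ptr.addr + size > ptr.limit:
        raise IndexError(
            f"access at {ptr.addr:#x} (+{size}) outside "
            f"[{ptr.base:#x}, {ptr.limit:#x})"
        )
    return memory[ptr.addr:ptr.addr + size]
```

The bounds pair is what a 'load pair' would fetch in one go; the comparison is the part that would have to be fused into the load or store itself.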
Gwydion Dylan used a two-word representation for values - a full word as a type tag, and then a word for the value itself (whether pointer or int). It was an interesting experiment, but multicore basically killed it. Once your basic value representation is more than a word, then every load/store/manipulation of a value requires a lock around it, to prevent corruption if the thread is preempted when the type has been written but the value has not. The locking overhead kills performance.
Or you can go with a GIL (but for compiled code) the way Python and early Ruby/JS implementations did, but that hasn't worked out terribly well for them either.
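The corruption scenario can be sketched by spelling out the interleaving by hand. This is a deterministic Python simulation with hypothetical names (`TAG_INT`, `TAG_PTR`, `torn_read_demo`), not Gwydion Dylan's actual representation. It only shows why two separate word-sized stores are not atomic as a pair.

```python
TAG_INT = 0  # hypothetical type tag: payload is an integer
TAG_PTR = 1  # hypothetical type tag: payload is a pointer


def torn_read_demo():
    """Simulate a reader running between the two stores of a writer."""
    # A two-word value slot: [type tag, payload]. Starts as the integer 42.
    slot = [TAG_INT, 42]

    # Writer wants to store (TAG_PTR, 0x1000). First word goes out...
    slot[0] = TAG_PTR

    # ...then the writer is preempted and a reader loads both words.
    observed = (slot[0], slot[1])

    # Writer resumes and stores the second word.
    slot[1] = 0x1000

    return observed, (slot[0], slot[1])
```

In this simulation the reader observes `(TAG_PTR, 42)`, i.e. the integer 42 labelled as a pointer, even though the slot ends up consistent afterwards. Avoiding that is what forces a lock (or some other atomic two-word operation) around every access.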
No, by fatter than normal I mean, say, 96 or 128 bit pointers.
I'm aware of SPARC ADI :)
It does not have fatter-than-normal pointers; it uses 4-bit tagging of the memory address by reusing bits 63-60 of the pointer.
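As a rough illustration of that kind of tagging, here is the bit arithmetic in Python. The helper names are invented for this sketch, and the comparison in `tags_match` is a simplified stand-in for the hardware check, not a description of the real ADI semantics.

```python
TAG_SHIFT = 60                    # tag lives in bits 63-60
TAG_BITS = 0xF                    # 4-bit tag
ADDR_MASK = (1 << TAG_SHIFT) - 1  # bits 59-0 carry the address


def tag_pointer(addr: int, tag: int) -> int:
    """Pack a 4-bit tag into the top bits of a 64-bit pointer value."""
    if not 0 <= tag <= TAG_BITS:
        raise ValueError("tag must fit in 4 bits")
    if addr & ~ADDR_MASK:
        raise ValueError("address must fit in bits 59-0")
    return (tag << TAG_SHIFT) | addr


def pointer_tag(ptr: int) -> int:
    """Extract the 4-bit tag from bits 63-60."""
    return (ptr >> TAG_SHIFT) & TAG_BITS


def pointer_address(ptr: int) -> int:
    """Strip the tag and return the plain address."""
    return ptr & ADDR_MASK


def tags_match(ptr: int, memory_tag: int) -> bool:
    """Simplified check: does the pointer's tag equal the memory's tag?"""
    return pointer_tag(ptr) == memory_tag
```

The pointer stays 64 bits wide, which is the whole point: no extra register or memory traffic, at the cost of only 16 distinct tag values.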
The IBM iSeries, or whatever its latest name is, has 128-bit software pointers, with 65-bit HW pointers, where the extra bit indicates whether the value has been tampered with in userland. AFAIR, checks are still all implemented in software, by relying on a trusted runtime code generator.