> For example, you can't get mutable references to items in a shared container by thread id or loop iteration.
This would be a good candidate for a specialised container that uses unsafe internally. For thread id at least: since the user of the API doesn't supply it, you could mark the API safe, because there are no incorrect inputs to worry about.
Loop iteration would be an input to the API, so you'd mark the API unsafe.
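As a rough sketch of the loop-iteration case (all names here are hypothetical, not from any real crate): a container that hands out `&mut T` from `&self` by index has to push the "no two callers use the same index" rule onto the caller, so the accessor is marked `unsafe`.

```rust
use std::cell::UnsafeCell;
use std::thread;

/// Hypothetical container that hands out `&mut T` from a shared `&self`.
pub struct SharedSlots<T> {
    slots: Vec<UnsafeCell<T>>,
}

// SAFETY: sharing is only sound if callers uphold the contract on `slot_mut`.
// `T: Send` is required because `&mut T` ends up on other threads.
unsafe impl<T: Send> Sync for SharedSlots<T> {}

impl<T> SharedSlots<T> {
    pub fn new(items: Vec<T>) -> Self {
        Self { slots: items.into_iter().map(UnsafeCell::new).collect() }
    }

    /// # Safety
    /// The index is caller-provided, so the caller must guarantee that no
    /// other reference to the same slot is alive at the same time.
    /// (Out-of-bounds indices still panic; only aliasing is unchecked.)
    #[allow(clippy::mut_from_ref)]
    pub unsafe fn slot_mut(&self, index: usize) -> &mut T {
        unsafe { &mut *self.slots[index].get() }
    }

    pub fn into_inner(self) -> Vec<T> {
        self.slots.into_iter().map(UnsafeCell::into_inner).collect()
    }
}

/// Each spawned thread writes only to the slot matching its loop iteration.
pub fn fill_in_parallel(n: usize) -> Vec<usize> {
    let slots = SharedSlots::new(vec![0usize; n]);
    thread::scope(|s| {
        for i in 0..n {
            let slots = &slots;
            s.spawn(move || {
                // SAFETY: every thread gets a distinct `i`, so no slot is aliased.
                let slot = unsafe { slots.slot_mut(i) };
                *slot = i * 10;
            });
        }
    });
    slots.into_inner()
}
```

A thread-id variant could hide the index behind a safe method, though it would presumably still need to guard against re-entrant calls on the same thread.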
> In larger projects things like data structures and libraries are going to dominate over slightly different compiler optimizations.
At this level of abstraction you'll probably see on average an effect based on how easy it is to access/use better data structures and algorithms.
Both the ease of access to those (whether the language supports generics, how easy it is to use libraries/dependencies) and whether the available algorithms and data structures are up to date or decades old would have an impact.
Optimising out TLS isn't going to be a good example of compiler capability. Whether another thread exists is a global property of a process, and beyond that the system that process operates in.
The compiler isn't going to know, for instance, that LD_PRELOAD won't be set to load a library that creates a thread.
> Say the program is not dynamically linked. Still no?
Whether the program has dynamic dependencies does not dictate whether a thread can be created; that's a property of the OS. Windows has CreateRemoteThread, and I'd be shocked if similar capabilities didn't exist elsewhere.
If I mark something as thread-local, I want it to be thread-local.
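For concreteness, here is a minimal sketch of what that guarantee looks like in Rust using the standard `thread_local!` macro. The `COUNTER`, `bump` and `counts_per_thread` names are made up for illustration.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    // Every thread lazily gets its own independent copy, starting at 0.
    static COUNTER: Cell<u32> = Cell::new(0);
}

/// Increments the calling thread's counter and returns the new value.
pub fn bump() -> u32 {
    COUNTER.with(|c| {
        c.set(c.get() + 1);
        c.get()
    })
}

/// Runs two fresh threads; neither sees the other's increments.
pub fn counts_per_thread() -> (u32, u32) {
    let a = thread::spawn(|| {
        bump();
        bump();
        bump()
    })
    .join()
    .unwrap();
    let b = thread::spawn(bump).join().unwrap();
    (a, b)
}
```

If a compiler collapsed `COUNTER` into a plain global on the assumption that only one thread exists, a thread appearing later would silently share state that was declared per-thread.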
> As with C, there is nothing preventing anyone from writing all of that generated code by hand. It is just far more work and much less maintainable than e.g. using C++20.
It's also still less elegant, but compile-time codegen for specialisation is part of the language (or the build system?) via build.rs and macros. serde makes strong use of this to generate its serialisation/deserialisation code.
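A small illustration of macro-driven codegen, using a declarative `macro_rules!` macro rather than the procedural derive macros serde relies on. The `ByteSize` trait and `impl_byte_size!` macro are invented for this sketch.

```rust
/// Hypothetical trait we want a specialised impl of for several types.
pub trait ByteSize {
    fn byte_size(&self) -> usize;
}

// Expands to one concrete `impl` per listed type at compile time,
// instead of writing each impl out by hand.
macro_rules! impl_byte_size {
    ($($t:ty),* $(,)?) => {
        $(
            impl ByteSize for $t {
                fn byte_size(&self) -> usize {
                    std::mem::size_of::<$t>()
                }
            }
        )*
    };
}

impl_byte_size!(u8, u32, u64);
```

Adding another type is a one-word change to the macro invocation, which is roughly the maintainability argument being made.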
I'm curious whether this is tracked or observed somewhere. Crater runs are a huge source of information, and metrics about the compilation time of crates would be quite interesting.
> This does come with code-bloat. So the Rust std sometimes exposes a generic function (which gets monomorphized), but internally passes it off to a non-generic function.
There's no free lunch here. Reducing the amount of code that's monomorphised reduces the code emitted & improves compile times, but it reduces the scope of the code that's exposed to the input type, which reduces optimisation opportunities.
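The pattern in question looks roughly like this (a sketch with a made-up `extension_len` function, in the style of what std does for path-taking APIs): the generic shell is tiny and gets monomorphised per input type, while the real work lives in a single non-generic body.

```rust
use std::path::Path;

/// Generic entry point: one thin copy is emitted per `P` that callers use.
pub fn extension_len<P: AsRef<Path>>(path: P) -> usize {
    // Non-generic inner function: compiled once, shared by every `P`.
    // It only ever sees `&Path`, so the optimiser can no longer use
    // anything it knew about the original input type.
    fn inner(path: &Path) -> usize {
        path.extension().map_or(0, |ext| ext.len())
    }
    inner(path.as_ref())
}
```

Calling it with `&str`, `String` and `PathBuf` yields three small shells but only one `inner`.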
In C, the only way to write a monomorphized hash table or array list involves horribly ugly macros that are difficult to write and debug. Rust does monomorphization by default, but you can also use `&dyn Trait` for vtable-like behaviour if you prefer.
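Side by side, the two dispatch styles look something like the following. The `Shape` trait and its implementors are hypothetical.

```rust
pub trait Shape {
    fn area(&self) -> f64;
}

pub struct Square(pub f64);
pub struct Rect(pub f64, pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Monomorphised: a separate copy is generated for each concrete `S`.
pub fn area_static<S: Shape>(shape: &S) -> f64 {
    shape.area()
}

/// Dynamic dispatch: one copy, with the call going through a vtable.
pub fn area_dyn(shape: &dyn Shape) -> f64 {
    shape.area()
}

/// Trait objects also allow mixing concrete types in one slice.
pub fn total_area(shapes: &[&dyn Shape]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```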
It's harder than you'd expect. Depending on what kind of bucketing an arena does (by size or by type), a stale reference may end up pointing to another piece of memory of the correct type, which is still wrong, but more subtly than a crash.
I'm not familiar enough with Zig to want to dive into architecture, the point I wanted to make is general to arenas in any language that can have a stale reference.
I once had a stale stack reference bug in C that lived for a year, because the exact same object was created at the exact same offset every time it was used, which is a similar situation.
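To make the failure mode concrete without any undefined behaviour, here is a toy index-based arena in safe Rust (all names hypothetical, and not modelled on Zig's allocators). A stale handle doesn't crash; it quietly resolves to whatever object reused the slot.

```rust
/// Toy arena: handles are plain slot indices, and freed slots are reused.
pub struct Arena {
    slots: Vec<Option<String>>,
    free_list: Vec<usize>,
}

impl Arena {
    pub fn new() -> Self {
        Self { slots: Vec::new(), free_list: Vec::new() }
    }

    /// Stores a value, preferring a previously freed slot.
    pub fn alloc(&mut self, value: &str) -> usize {
        if let Some(index) = self.free_list.pop() {
            self.slots[index] = Some(value.to_string());
            index
        } else {
            self.slots.push(Some(value.to_string()));
            self.slots.len() - 1
        }
    }

    /// Frees a slot. Nothing invalidates handles that still point at it.
    pub fn release(&mut self, handle: usize) {
        self.slots[handle] = None;
        self.free_list.push(handle);
    }

    pub fn get(&self, handle: usize) -> Option<&str> {
        self.slots.get(handle)?.as_deref()
    }
}

/// Reads through a handle after its slot has been freed and reused.
pub fn stale_handle_demo() -> Option<String> {
    let mut arena = Arena::new();
    let alice = arena.alloc("alice");
    arena.release(alice);
    let _bob = arena.alloc("bob"); // reuses alice's slot
    // `alice` is stale, yet it still yields a valid value of the right type.
    arena.get(alice).map(str::to_string)
}
```

One common mitigation is to store a generation counter alongside each slot and in each handle, so that a stale handle can be detected on lookup.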