
It used to be that DLLs loaded by the .exe at startup (i.e. implicitly linked) would get their thread-local variables (TLS) set up correctly, but DLLs loaded later (via /DELAYLOAD or LoadLibrary) would not. The workaround was to manage those slots yourself with TlsAlloc/TlsFree and add a hook in DllMain to clean up.

But then Microsoft added /Zc:tlsGuards - https://learn.microsoft.com/en-us/cpp/build/reference/zc-tls... - which is now the default and fixes the issue, but at a significant performance cost (e.g. the "bug" that I've listed).

I guess you can't easily have it both ways...

On the clang/clang-cl side, there is https://clang.llvm.org/docs/ClangCommandLineReference.html#c... to support this.

So check your compiler version and options :)

Also the notes posted here about CRT mixing might apply to you (not sure though) - https://learn.microsoft.com/en-us/cpp/porting/binary-compat-...

I work in gamedev, where plugins, FFI, delay-loaded DLLs, etc. are a constant pain that you have to keep investigating and working around.



So this was only an MSVC bug? Most people compile Pd with MinGW, which would explain why we never ran into this issue.

Do you happen to have a link to the original MSVC bug report (i.e. the wrong thread locals, not the performance regression)?


Note that MinGW uses libwinpthread, which is known to have slow TLS behavior anyway (I've observed a 100% overhead compared to running the same program under WSL using a Linux-native GCC). Cf. https://github.com/msys2/MINGW-packages/discussions/13259


I haven't looked into it, but going through the release notes for tlsGuards turned up this, though it's not directly a bug report:

https://learn.microsoft.com/en-us/cpp/overview/cpp-conforman...

and also the implementation in clang (so that clang-cl stays compatible with MSVC) - https://reviews.llvm.org/D115456#3217595

Then last year clang-cl also added ways to disable this (if you need to); probably it caused some internal issue that had to be resolved. Maybe "thread_local" has become more widely used (unlike the OS-specific "TlsAlloc").


Thanks! Fortunately, this issue does not affect us because our thread locals are all zero-initialized integers or pointers.
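
For anyone wondering why zero-initialized thread locals get a pass here, a rough sketch of the distinction as I understand it: a constant-initialized thread_local needs no code to run per thread, while a dynamically initialized one needs its initializer to run once on every thread before first use, and that is the case an on-access check has to cover. All names below (tl_counter, tl_name, tl_dynamic, make_initial_value, observe_on_new_thread) are made up for illustration and are not from Pd or the linked pages. It is plain portable C++, so it shows the language-level difference only, not the DLL loading behavior itself.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names throughout; this only illustrates the two kinds of
// thread_local initialization, not the MSVC DLL loading scenario.

// Counts how many times the dynamic initializer below has run.
std::atomic<int> g_init_runs{0};

int make_initial_value() {
    ++g_init_runs;
    return 42;
}

// Constant-initialized: every new thread simply starts with these values,
// no initializer code has to run.
thread_local int tl_counter = 0;
thread_local const char* tl_name = nullptr;

// Dynamically initialized: make_initial_value() has to run once per thread
// before the first use on that thread.
thread_local int tl_dynamic = make_initial_value();

// What a freshly started thread sees when it reads the variables above.
struct Observed {
    int counter;
    bool name_is_null;
    int dynamic;
};

Observed observe_on_new_thread() {
    Observed out{};
    std::thread t([&out] {
        out.counter = tl_counter;
        out.name_is_null = (tl_name == nullptr);
        out.dynamic = tl_dynamic;  // initializer must have run on this thread
    });
    t.join();
    return out;
}
```

Each call to observe_on_new_thread() starts a new thread, so g_init_runs should go up by one per call, while the zero-initialized variables never need an initializer to run.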



