
I wonder if it is worth it to select a few core libraries like glibc, and get the distro to install multiple variants, each optimized for a different CPU.


For x64, you don't need to distribute multiple binaries. You can have one binary with variants of a function each optimized for a particular microarch, and resolve which variant to use at runtime using the CPUID instruction. gcc already does this for you with function multiversioning, and glibc makes use of it for its string routines.
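
To make that concrete, here is a minimal sketch of gcc's function multiversioning using the `target_clones` attribute. The function name `sum_i32` and the chosen targets are assumptions for illustration, and the guard assumes x86-64 with glibc, since the attribute relies on IFUNC support.

```c
#include <stdlib.h> /* size_t; on glibc this also defines __GLIBC__ */

/* Hypothetical example function; name and target list are assumptions. */
#if defined(__x86_64__) && defined(__GLIBC__) && defined(__GNUC__)
/* The compiler emits one clone per listed target plus a resolver that
   picks a clone at load time based on CPUID results. */
__attribute__((target_clones("avx2", "sse4.2", "default")))
#endif
long sum_i32(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

Callers just call `sum_i32(v, n)` as usual; the choice of clone is invisible to them. On other platforms the guard drops the attribute and you get a single generic build.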


Intel's Clear Linux does this, IIRC. It wins most comparisons on Phoronix and works on AMD CPUs too.

(edit) Even though I'm a sucker for efficiency/performance, I prefer Tumbleweed myself, since I like KDE, whose integration is as good as it gets on SUSE but left a lot to be desired on Clear, which ships with GNOME by default. Both are rolling releases. SUSE has automatic btrfs snapshots on updates for easy rollback, but Clear has the performance edge.

https://www.phoronix.com/scan.php?page=article&item=ubuntu-2...

https://www.phoronix.com/scan.php?page=article&item=cascade-...

https://www.phoronix.com/scan.php?page=article&item=icelake-...

Oh nice, this one shows Tumbleweed is not far behind:

https://www.phoronix.com/scan.php?page=article&item=amd-epyc...


glibc already does dynamic dispatch for the memcpy family of functions.
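
For anyone curious what that kind of dispatch looks like, here is a rough sketch done by hand. It is not glibc's code: glibc uses IFUNC resolvers and hand-tuned assembly, while this uses a plain function pointer and two ordinary C loops. The names `my_copy`, `copy_generic` and `copy_avx2` are made up for the example.

```c
#include <stddef.h>

/* Portable baseline copy; hypothetical stand-in for a tuned routine. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

#if defined(__x86_64__) && defined(__GNUC__)
/* Same loop, but the compiler may use AVX2 inside this function only. */
__attribute__((target("avx2")))
static void *copy_avx2(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}
#endif

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Query the CPU and return the best variant it can run. */
static copy_fn resolve_copy(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return copy_avx2;
#endif
    return copy_generic;
}

/* Entry point: resolves on first call, then reuses the cached pointer.
   Not synchronized; a real implementation would resolve at load time. */
void *my_copy(void *dst, const void *src, size_t n)
{
    static copy_fn impl = NULL;
    if (impl == NULL)
        impl = resolve_copy();
    return impl(dst, src, n);
}
```

The AVX2 variant is only ever called after the CPU check passes, which is what keeps the single binary safe on older hardware.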


Recently (in the last year or so), the x86-64 psABI has been augmented with four microarchitecture levels: baseline x86-64 (i.e., up to SSE2), then x86-64-v2 (up to SSE4.2), x86-64-v3 (up to AVX2), and x86-64-v4 (a base AVX-512 subset). (Not spelled out for each level here are the assorted other instructions, such as BMI/BMI2, that get sprinkled throughout.)
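
If you want a rough idea of which level a machine reaches, something like the sketch below works with gcc or clang. The function name `x86_64_level_estimate` is hypothetical, and it only checks a few representative features per level, so treat the result as an estimate and not a full conformance check.

```c
/* Returns an estimated x86-64 level (1..4) based on a few representative
   feature bits, or 0 when not compiled for x86-64 with GCC/clang builtins.
   Each real level requires more features than are tested here. */
int x86_64_level_estimate(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    __builtin_cpu_init();

    int level = 1; /* every x86-64 CPU has at least SSE2 */

    if (!(__builtin_cpu_supports("sse4.2") &&
          __builtin_cpu_supports("popcnt")))
        return level;
    level = 2;

    if (!(__builtin_cpu_supports("avx2") &&
          __builtin_cpu_supports("bmi2") &&
          __builtin_cpu_supports("fma")))
        return level;
    level = 3;

    if (!(__builtin_cpu_supports("avx512f") &&
          __builtin_cpu_supports("avx512bw") &&
          __builtin_cpu_supports("avx512vl")))
        return level;
    return 4;
#else
    return 0;
#endif
}
```

A distro or installer could use a check along these lines to pick which set of optimized libraries to load.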


Imo, the solution is to stop using C/C++ for high performance applications. Today's computers are diverse enough that distributing compiled binaries leaves a ton of performance on the table.


What would you like to use instead? C/C++ code that targets generic x86_64 is a few percent slower than -march=native, but that still leaves it way faster than most other languages.


I think Rust and Julia are 2 of the stronger contenders.


Rust also produces compiled binaries?



