So would turning off denormals fix the problem? That concept has given me 100 pr...

kr7 · on Feb 26, 2017

The solution is to compile with SSE2 on x86 (flags: -mfpmath=sse -msse -msse2).

On x86-64, the compiler should default to SSE2.

SSE2 is ~16 years old so compatibility shouldn't be an issue.

phire · on Feb 26, 2017

Technically, you only need the instructions from the original SSE set to do floating-point operations. SSE2 adds a bunch of really useful integer instructions.

But the only extra CPUs that gets you are the Pentium III, AMD Athlon XP, and AMD Duron.

SSE2 is supported on every single x86 CPU released after those, such as the Pentium 4, Pentium M, and Athlon 64.

It's a real shame that people are still using CPUs that don't support SSE4, such as the AMD Phenom and Phenom II; otherwise everyone would have moved exclusively to SSE4.

kr7 · on Feb 26, 2017

SSE1 is single-precision only. SSE2 added double precision.

So the bug will still appear for 'double' using just SSE1.

Narishma · on Feb 27, 2017

Some Atoms only support up to SSSE3 too.

wtallis · on Feb 26, 2017

Turning off denormals won't fix the issue that a long double has a wider exponent field than a double, and can represent smaller magnitudes without relying on denormals.
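
To put rough numbers on that, here is a minimal sketch of the exponent arithmetic in Python, used only for illustration since the thread is about C compilers. It assumes the usual layouts: an 11-bit exponent and 52 fraction bits for double, and a 15-bit exponent for the x87 80-bit extended format. The helper names `min_normal` and `min_subnormal` are made up for this example.

```python
import math
import sys
from fractions import Fraction


def min_normal(exponent_bits: int) -> Fraction:
    """Smallest positive normal value for a binary format with this exponent width."""
    bias = 2 ** (exponent_bits - 1) - 1
    return Fraction(2) ** (1 - bias)


def min_subnormal(exponent_bits: int, fraction_bits: int) -> Fraction:
    """Smallest positive subnormal (denormal) value for the format."""
    bias = 2 ** (exponent_bits - 1) - 1
    return Fraction(2) ** (1 - bias - fraction_bits)


# Assumed layouts: double = 11 exponent bits, 52 fraction bits;
# x87 extended = 15 exponent bits.
DOUBLE_MIN_NORMAL = min_normal(11)            # 2**-1022
DOUBLE_MIN_SUBNORMAL = min_subnormal(11, 52)  # 2**-1074
X87_MIN_NORMAL = min_normal(15)               # 2**-16382

# Python floats are doubles, so the computed value matches the platform's.
print(DOUBLE_MIN_NORMAL == Fraction(sys.float_info.min))

# The smallest *normal* extended value is far below even the smallest
# double *denormal*, so the extended format needs no denormals to get there.
print(X87_MIN_NORMAL < DOUBLE_MIN_SUBNORMAL)

# The same magnitude cannot be held in a double at all: it underflows to zero.
print(math.ldexp(1.0, -16382))
```

So a value held in an x87 register can sit anywhere between 2**-16382 and 2**-1022 as an ordinary normal number, and flushing denormals to zero does nothing to it until it is rounded to a double.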