> I also implemented all the wide math functions to work with this. It seems that I have arranged all the data perfectly for the compiler to use automatic vectorization. But it seems this doesn’t really happen to a sufficient degree to compete with my hand written SSE2.
I will keep this example in mind the next time somebody trots out the line that you should just trust the compiler.