Memory Barriers: A Hardware View for Software Hackers (2010) [pdf] (rdrop.com)
44 points by Tomte on May 12, 2016 | hide | past | favorite | 9 comments


This is a lot of interesting information. I hope that one day we can avoid this level of complexity though. We have enough to worry about with modern computer systems.

I wrote a driver for FreeBSD recently and had to implement memory barriers to get it accepted into mainline. I had never dealt with memory barriers before and I couldn't help but think "great! what is this new thing I have to keep in mind?" It wasn't all that difficult, but still...
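For anyone who hasn't run into this before: the usual pattern is "write the data, fence, then publish a flag." Here's a minimal sketch in portable C11 atomics, not FreeBSD's actual kernel macros; the names `publish`/`consume` are illustrative, not from any real driver:

```c
#include <stdatomic.h>

static int payload;            /* data the consumer will read      */
static atomic_int ready;       /* flag published after the payload */

/* Producer: write the payload, then fence, then set the flag.
 * The release fence keeps the payload store from being reordered
 * after the flag store. */
void publish(int value) {
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Consumer: poll the flag, fence, then read the payload.
 * The acquire fence pairs with the release fence above, so the
 * payload write is guaranteed visible once the flag is seen. */
int consume(void) {
    while (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        ;  /* spin until published */
    atomic_thread_fence(memory_order_acquire);
    return payload;
}
```

Without the two fences, nothing stops the compiler or a weakly ordered CPU from letting the consumer see `ready == 1` before the payload store lands.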

We have built up so many layers of complexity that it's becoming very difficult to secure things, for example. I really hope we can somehow drastically simplify computers/software in the future.


Most of the performance gained in processors over the last decade has come from increasing complexity. Compare the single-threaded performance of the fastest P4 processor to a recent i7.

A more complex manufacturing process (smaller nodes), more complex instructions (AVX2, crypto extensions), more complex branch prediction, more complex micro/macro-op decoding, a more complex bus (QuickPath); the list goes on.

The memory barrier semantics on x86(_64) are the strictest of the major architectures. This comes at the expense of hardware complexity (cache synchronization protocols). Compare that to the other archs, where the semantics presented to software are more complex.

In my mind the chip designers have taken on a lot of complexity on their end to simplify things for software designers. If you go far enough down the rabbit hole (drivers, high-performance code) you might still have to deal with some of that complexity.
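To sketch the kind of complexity weaker models push onto software: the classic store-buffering litmus test, written here with C11 atomics and pthreads (my own illustrative code, not from the paper). Each thread stores to one variable and loads the other; without full fences, a weakly ordered machine (and even x86's store buffers) can let both threads read 0. With the fences, that outcome is forbidden:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int x, y;
static int r1, r2;

static void *thread0(void *arg) {
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* full barrier */
    r1 = atomic_load_explicit(&y, memory_order_relaxed);
    return NULL;
}

static void *thread1(void *arg) {
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* full barrier */
    r2 = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/* Returns 1 if the forbidden outcome (r1 == 0 && r2 == 0) did NOT
 * occur on this run; with the fences in place it never should. */
int run_once(void) {
    pthread_t a, b;
    atomic_store(&x, 0);
    atomic_store(&y, 0);
    pthread_create(&a, NULL, thread0, NULL);
    pthread_create(&b, NULL, thread1, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return !(r1 == 0 && r2 == 0);
}
```

Delete the two fences and `run_once` can (intermittently) return 0, which is exactly the sort of surprise the article is about.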


Only if we stop worshipping at the altar of speed. We are collectively so obsessed about maximizing speed related metrics that simplicity often takes a backseat.


If anything I'd argue we see the opposite. Billions of cycles are wasted on abstraction versus the few that do actual work. You'll see this when profiling any modern application: extra copying, useless computation, and other ills abound.

Copying data and computing things "just in case" makes the software much easier to modify, but it is not a good way to maximize "speed related metrics".


Speed = battery life, which I for one find somewhat important.


Agree with you on that front. In fact, on my laptop I would happily trade some latency/responsiveness for battery life, i.e. let the OS batch some wireless/IO/computation together and do it in one go when it wakes the relevant hardware from its sleep state.


Modern hardware is actually really good at DVFS (dynamic voltage and frequency scaling). You don't trade latency for battery life; just doing less work gets you better battery efficiency.

There's a really gnarly V*I curve which is not linear (as clock speed increases, voltage also needs to come up), which makes good performance even more key.
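A rough way to see the non-linearity: dynamic CPU power scales approximately as P ≈ C·V²·f, so a frequency bump that also requires a voltage bump costs more than proportionally. The specific ratios below are made-up round numbers for illustration, not measurements:

```c
/* Illustrative only: dynamic power scales roughly as C * V^2 * f,
 * so the power ratio between two operating points is
 * (f_new/f_old) * (V_new/V_old)^2. */
double dynamic_power_ratio(double f_ratio, double v_ratio) {
    return f_ratio * v_ratio * v_ratio;
}
```

For example, a hypothetical 30% frequency increase that needs 10% more voltage costs about 1.3 * 1.1² ≈ 1.57x the power for 1.3x the speed, which is why "race to idle" at a modest clock often wins on battery.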


I feel like speed often takes the backseat, especially in large big-corp projects designed by committee. At the end, during user testing, everyone realizes the architecture is just too slow, and all too often this leads to a failed project.


Collectively, we don't have to change anything. A coder today can already avoid 99% of this complexity with some full memory barriers and message passing.
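As a sketch of that approach: a tiny single-producer/single-consumer queue that leans on full (seq_cst) fences everywhere rather than carefully chosen weaker orderings. This is my own illustrative code (the `spsc_*` names are invented), trading some performance for not having to reason about subtle reordering:

```c
#include <stdatomic.h>

#define QSIZE 8u  /* illustrative capacity, power of two */

/* Single-producer/single-consumer ring buffer. Full fences make the
 * ordering argument trivial at the cost of stronger-than-needed
 * barriers on every operation. */
struct spsc {
    int buf[QSIZE];
    atomic_uint head;  /* next slot to write (producer-owned) */
    atomic_uint tail;  /* next slot to read  (consumer-owned) */
};

int spsc_send(struct spsc *q, int v) {
    unsigned h = atomic_load(&q->head);
    unsigned t = atomic_load(&q->tail);
    if (h - t == QSIZE)
        return 0;                               /* full */
    q->buf[h % QSIZE] = v;
    atomic_thread_fence(memory_order_seq_cst);  /* full barrier: data
                                                   before head bump  */
    atomic_store(&q->head, h + 1);
    return 1;
}

int spsc_recv(struct spsc *q, int *out) {
    unsigned t = atomic_load(&q->tail);
    unsigned h = atomic_load(&q->head);
    if (h == t)
        return 0;                               /* empty */
    atomic_thread_fence(memory_order_seq_cst);  /* full barrier: see
                                                   data behind head  */
    *out = q->buf[t % QSIZE];
    atomic_store(&q->tail, t + 1);
    return 1;
}
```

With everything funneled through a queue like this, the rest of the program never touches shared mutable state directly, which is most of the 99% the comment is talking about.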



