Just two examples; there is more wrong, beginning with the title.
>It creates a “memory barrier” CPU are usually free to shuffle around the stream of instructions that flow through them and it is done all the time as the CPU constantly optimizes the instruction stream to keep all its resources busy. A lock creates a barrier and the CPU is not allowed to move stuff past this barrier. Branch prediction across memory barriers is also something I don’t think is doable.
He's confusing instruction retiring with memory ordering. On modern fast CPUs they are only vaguely related.
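To make the distinction concrete, here is a minimal C++ sketch of what a barrier actually constrains: the order in which memory operations become visible to other threads. The helper name `publish_and_consume` is made up for illustration, and this only shows the language-level memory model, not how any particular CPU implements it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: one thread publishes a value with a release store,
// another thread reads it after an acquire load of the same flag.
int publish_and_consume() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // the flag that carries the ordering
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // ordinary store
        ready.store(true, std::memory_order_release);  // earlier writes become visible first
    });
    std::thread consumer([&] {
        // Spin until the release store is observed.
        while (!ready.load(std::memory_order_acquire)) {
        }
        seen = payload;  // ordered after the acquire load, so it reads 42
    });

    producer.join();
    consumer.join();
    return seen;
}
```

The release/acquire pair only says which writes must be visible once the flag is seen. It says nothing about stopping out-of-order execution or branch prediction in general.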
>Cache coherency. The address line that stores the lock gets invalidated and the content in that cache gets thrown out.
That's not how a modern CPU with the MESI protocol operates.
As long as the cache line stays EXCLUSIVE to that core it does not get invalidated.
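A toy state machine shows the point. This is a simplified textbook MESI model, not a description of any real CPU; the `Line` type and its `read`/`write` methods are names I made up, and writebacks and bus messages are left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

enum class Mesi { Modified, Exclusive, Shared, Invalid };

// Toy model: the state of ONE cache line in each of N cores' caches.
template <std::size_t N>
struct Line {
    std::array<Mesi, N> state;

    Line() { state.fill(Mesi::Invalid); }

    void read(std::size_t core) {
        if (state[core] != Mesi::Invalid) return;  // hit: no state change
        bool others = false;
        for (std::size_t i = 0; i < N; ++i) {
            if (i != core && state[i] != Mesi::Invalid) {
                state[i] = Mesi::Shared;  // a holder in M or E downgrades to S
                others = true;
            }
        }
        state[core] = others ? Mesi::Shared : Mesi::Exclusive;
    }

    void write(std::size_t core) {
        // Already owned (M or E): the write stays local, nothing is invalidated.
        if (state[core] == Mesi::Modified || state[core] == Mesi::Exclusive) {
            state[core] = Mesi::Modified;
            return;
        }
        // S or I: take ownership, which invalidates every other copy.
        for (std::size_t i = 0; i < N; ++i) {
            if (i != core) state[i] = Mesi::Invalid;
        }
        state[core] = Mesi::Modified;
    }
};
```

With `Line<2> line;`, calling `line.read(0)` and then `line.write(0)` any number of times keeps the line in core 0's cache (E, then M). Core 0's copy is invalidated only when core 1 writes, which is the contended case.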