Another thing to note is that loop unrolling can hurt performance, because the larger unrolled code takes up more of the instruction cache. Beyond branch prediction, modern compilers can also convert plain loops into vector (SIMD) instructions, among other optimizations.
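For reference, here is a rough sketch of the two shapes being compared: a plain byte-copy loop and a Duff's device version unrolled eight times. The function names `copy_plain` and `copy_duff` are made up for illustration, and whether a given compiler actually unrolls or vectorizes the plain loop depends on the compiler and its flags.

```c
#include <stddef.h>

/* Plain loop: simple enough that an optimizing compiler may unroll
   or vectorize it on its own (compiler- and flag-dependent). */
void copy_plain(char *dst, const char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Duff's device: manual 8x unrolling. The switch jumps into the middle
   of the do-while body to handle the n % 8 leftover bytes first. */
void copy_duff(char *dst, const char *src, size_t n)
{
    if (n == 0)
        return;

    size_t passes = (n + 7) / 8;  /* trips through the loop, rounded up */

    switch (n % 8) {
    case 0: do { *dst++ = *src++;
    case 7:      *dst++ = *src++;
    case 6:      *dst++ = *src++;
    case 5:      *dst++ = *src++;
    case 4:      *dst++ = *src++;
    case 3:      *dst++ = *src++;
    case 2:      *dst++ = *src++;
    case 1:      *dst++ = *src++;
            } while (--passes > 0);
    }
}
```

Both functions copy the same bytes; the second just carries eight copies of the loop body, which is where the extra instruction-cache footprint comes from.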
Duff's device is still useful for creating coroutines, though: it is a handy way of yielding and then resuming at the yield point.
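A minimal sketch of that coroutine trick, assuming a simple counting generator: the `switch` jumps straight back into the loop body at the point where the previous call returned. The `struct counter` type and `counter_next` function are hypothetical names used only for this example. Note that any state that has to survive a yield (here the loop variable) must live outside the function's locals.

```c
/* Hypothetical generator state. */
struct counter {
    int resume_at;  /* 0 = start from the top, 1 = resume after the yield, 2 = done */
    int i;          /* loop variable, kept here so it survives each return */
    int limit;
};

/* Returns 0, 1, ..., limit-1 on successive calls, then -1 once exhausted. */
int counter_next(struct counter *c)
{
    switch (c->resume_at) {
    case 0:
        for (c->i = 0; c->i < c->limit; c->i++) {
            c->resume_at = 1;
            return c->i;   /* "yield" the current value */
    case 1:;               /* the next call resumes here, inside the loop */
        }
    }
    c->resume_at = 2;      /* no matching case: later calls skip the switch */
    return -1;
}
```

Starting from `struct counter c = {0, 0, 3};`, repeated calls to `counter_next(&c)` walk through the loop one iteration per call. The case label inside the `for` body is legal C for the same reason Duff's device is: `switch` labels may appear anywhere within the switch's statement.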