Understandably confusing for a non-C programmer, but it is idiomatic C. It's not code golf; it's just the way one does machine-independent bit twiddling.
A mask is used to test each bit position in turn. The tests look like this if written using binary literals:
0b1000 & x
0b0100 & x
0b0010 & x
0b0001 & x
The '&' is the bitwise AND operator in C. Of course, we'd have to do as many of these as the word size, so 1010111 uses a for loop that starts with the first mask, a 1 in the leftmost position, and shifts it right one position every time through the loop (using the C right shift operator >> on an unsigned mask value). When the one bit is eventually shifted out the right side of the mask, the mask is all zeros, so the loop terminates because zero acts like false in the for loop test.
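Putting that description into code, the idiom looks something like this (a sketch of the technique being described, not the exact listing from the thread):

```c
#include <stdio.h>

/* Print the bits of x, most significant first.  The mask starts with
   only the leftmost bit set and is shifted right each iteration; when
   the 1 finally falls off the right end, the mask becomes zero and
   the loop ends. */
void print_bits(unsigned x) {
    for (unsigned mask = ~(~0u >> 1); mask != 0; mask >>= 1)
        putchar(x & mask ? '1' : '0');
    putchar('\n');
}
```

Calling `print_bits(5)` on a 32-bit unsigned prints 29 zeros followed by `101`.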
The only other tricky thing is initializing the mask. To set only the leftmost bit in a word the code uses the bit complement operator ~ of C. Breaking it down for a four bit example looks like:
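The breakdown itself didn't survive here; it presumably looked something like this (my reconstruction, showing the values for a hypothetical four-bit word):

```c
/* For a hypothetical four-bit word:                         */
/*   ~0u          ->  1111   all bits set                    */
/*   ~0u >> 1     ->  0111   a zero shifts in from the left  */
/*   ~(~0u >> 1)  ->  1000   complement: leftmost bit only   */
unsigned mask = ~(~0u >> 1);   /* leftmost bit set, for any word size */
```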
This is the expression that appears in the for loop initializing the mask value, and it works for any word size.
The original article's code was definitely not idiomatic, efficient, or safe (the memory allocation for an array of characters could fail, and dereferencing the unchecked result would segfault). The book Hacker's Delight is a great reference for those wanting to understand how to do low-level coding, a requirement for close-to-the-hardware work like writing device drivers.
It uses a bit mask from the leftmost bit to the rightmost and prints the respective bits in x. Is that more complicated than allocating memory on the heap, storing the bits, and then printing them in reverse?
Also, the code above does not contain anything C specific, with the exception of `putchar`, let's say.
I think it's *much* easier to read compared to the one in the article.
Although the usual approach is bit shifting the input and checking only the lowest/highest bit.
Here is an example in Java:
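The Java example itself isn't preserved here; a sketch of what such shift-based code typically looks like (class and method names are mine), shifting the input left one position per iteration and testing only the highest bit:

```java
public class Bits {
    // Shift x left one bit at a time and test the highest bit, so the
    // bits come out most significant first.  Java's int is always a
    // 32-bit two's complement value, so no mask derivation is needed.
    static String bits(int x) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Integer.SIZE; i++) {
            sb.append((x & 0x80000000) != 0 ? '1' : '0');
            x <<= 1;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(bits(5));   // 29 zeros followed by 101
        System.out.println(bits(-1));  // 32 ones
    }
}
```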
This code (`x & m`) promotes x to unsigned, which means you're not actually printing the bit representation of a signed integer. Because signed-to-unsigned conversion is well defined in C and obeys modulo semantics, you'd get the wrong bit pattern for a negative number on ones' complement machines. So, for example, `-1` would print as `1 ... 111` instead of `1 ... 110`.
I was going to say that you just need to change `m` to int and fix the mask derivation to avoid directly or indirectly manipulating or reading the sign bit. And to do that you _only_ need to know the number of value bits. It turned out to be more complicated than that.
You can't reliably derive the number of value bits from the unsigned type on evil implementations. Using the range limits like INT_MIN and INT_MAX, though, you can deduce the number of value bits. It's useful that the definition of precision and width in 6.2.6.2p6 of C11 effectively precludes, I think, a range which doesn't make full use of the available value bits. Also that the standard effectively only permits ones' complement, two's complement, and sign magnitude. Being able to reliably determine the number of value bits means you could carefully shift a masking bit through the set of value bits.
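A sketch of that deduction (my code, not the commenter's): since INT_MAX is 2^p − 1 where p is the precision, counting its bits gives the number of value bits directly.

```c
#include <limits.h>

/* Count the value bits of int by counting the bits of INT_MAX, which
   per C11 6.2.6.2 is 2^p - 1 for precision p (sign bit excluded). */
int int_value_bits(void) {
    int n = 0;
    for (unsigned long long v = INT_MAX; v != 0; v >>= 1)
        n++;
    return n;   /* 31 on common implementations with 32-bit int */
}
```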
But confirming with the standard I was reminded that left shifts of negative values are undefined (and right shifts of them are implementation-defined). But I think we could use arithmetic to shift the bit. Preferably multiplication, because I'm not quite grokking the requirements for signed division, and I _just_ learned that INT_MIN / -1 will cause a floating point exception on x86. Cool!
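A sketch of the arithmetic approach (my code, under the assumptions above): the mask stays positive throughout, multiplication builds it up to the highest value bit, and division by 2 of a positive value walks it back down, which is fully defined.

```c
#include <limits.h>

/* Write the value bits of x (most significant first) into out as a
   NUL-terminated string.  No signed value is ever shifted: the mask
   is built by multiplication and walked down by division, and it is
   positive the whole time.  Returns the number of bits written.
   (Reading x & mask is itself only portable for nonnegative x.) */
int format_value_bits(int x, char *out) {
    int bits = 0;
    for (unsigned long long v = INT_MAX; v != 0; v >>= 1)
        bits++;                       /* value bits, from INT_MAX = 2^p - 1 */
    int mask = 1;
    for (int i = 1; i < bits; i++)
        mask *= 2;                    /* highest value bit: 2^(bits-1) <= INT_MAX */
    int n = 0;
    for (; mask != 0; mask /= 2)      /* positive / positive: well defined */
        out[n++] = (x & mask) ? '1' : '0';
    out[n] = '\0';
    return n;
}
```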
Also, with this general approach we'd never be able to peek at all the representation bits for the value and sign bits. We wouldn't see the high bit on two's and ones' complement implementations to show our hypothetical skeptic how it changes.
We could inspect all the representation bits by inspecting the int object as an unsigned char. But I don't think we could reliably differentiate the padding bits from the value and sign bits, especially on an evil implementation where the value bits weren't all contiguous or which toggled the padding bits semi-randomly just to screw with us.
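The unsigned char inspection itself is straightforward (a sketch; byte order and the location of any padding bits are implementation-specific, so this shows raw representation bytes and nothing more):

```c
#include <stdio.h>

/* View every representation byte of an object through unsigned char,
   which the standard explicitly permits for any object type. */
void dump_bytes(const void *obj, size_t n) {
    const unsigned char *p = obj;
    for (size_t i = 0; i < n; i++)
        printf("%02x%c", p[i], i + 1 < n ? ' ' : '\n');
}
```

For example, `int x = -1; dump_bytes(&x, sizeof x);` prints `ff ff ff ff` on a typical two's complement machine with 32-bit int and no padding bits.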
On the contrary, I'd say. When you're talking about low level (and basic CS101 details like 2's complement) I think it makes sense to be lucid rather than write "needlessly complicated short" code that does too much in a single line.