
The first example (tree represented in Japanese) seemed a bit misleading, because the "alphabet" is not held constant. Since the Japanese "alphabet" is much larger, one could argue that "本" and "tree" actually occupy about the same number of bits in storage. Could someone clarify whether this reasoning is correct?


One could argue that, in information-theoretic terms, more information is encoded in a single "本" than in a single "t".
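To put rough numbers on this, here is a minimal Python sketch. It is not from the article: the helper names are made up for illustration, and the symbol counts (26 lowercase letters, 2,136 jōyō kanji) plus the assumption that every symbol is equally likely are simplifications chosen only to make the comparison concrete.

```python
import math

def utf8_bytes(text: str) -> int:
    """Number of bytes the text occupies when stored as UTF-8."""
    return len(text.encode("utf-8"))

def uniform_bits(alphabet_size: int, length: int) -> float:
    """Bits needed for `length` symbols drawn from an alphabet of
    `alphabet_size` symbols, ASSUMING every symbol is equally likely."""
    return length * math.log2(alphabet_size)

# Actual storage cost in one common encoding (UTF-8).
print(utf8_bytes("本"), "bytes for the single kanji")
print(utf8_bytes("tree"), "bytes for the English word")

# Idealised cost under the uniform-alphabet assumption.
# 26 and 2136 are illustrative alphabet sizes, not figures from the article.
print(round(uniform_bits(2136, 1), 1), "bits for one kanji")
print(round(uniform_bits(26, 4), 1), "bits for four letters")
```

Under these assumptions the single kanji carries more bits than any one letter, but the gap between the whole word and the single character is much smaller in storage than it looks on the page.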

However, this article is dealing with the concept of compression in terms of a simple symbolic representation of data.


Certainly, and I deliberately didn't get into bytes and encoding until after this - I was trying to get across the softer idea that, in terms of space-on-a-page-using-a-pen, you've saved space.



