HTML documents are hardly unusual examples. Also, look at the other column in that table, where he stripped out the HTML tags and looked only at the body text: UTF-16 was somewhat smaller, and gzipping them made the difference negligible.
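If anyone wants to check this on their own text, here is a rough Python sketch. The sample string, the markup wrapper, and the function name `encoded_sizes` are my own placeholders, not taken from the table being discussed, and the numbers will vary with the text you feed it.

```python
import gzip

def encoded_sizes(text: str) -> dict:
    """Return raw and gzipped byte counts for UTF-8 and UTF-16 encodings."""
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return {
        "utf8": len(utf8),
        "utf16": len(utf16),
        "utf8_gz": len(gzip.compress(utf8)),
        "utf16_gz": len(gzip.compress(utf16)),
    }

# Hypothetical sample: 8 Japanese characters repeated, all in the BMP,
# so each takes 3 bytes in UTF-8 and 2 bytes in UTF-16.
body = "日本語のテキスト" * 50

# The same text wrapped in ASCII markup, which costs 1 byte per
# character in UTF-8 but 2 bytes per character in UTF-16.
page = '<p class="body">' + body + "</p>"

body_sizes = encoded_sizes(body)
page_sizes = encoded_sizes(page)

for label, sizes in (("body only", body_sizes), ("with markup", page_sizes)):
    print(label, sizes)
```

A repeated string like this compresses far better than real prose, so treat the gzipped figures as illustrative only; running the function on an actual page and on its stripped body text would be the fairer comparison.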
Does UTF-16 really have such a great advantage for non-Roman writing systems? Or is this motivated more by a dislike of Anglocentrism?