
The compression achieved on the first billion bytes of English Wikipedia is only 0.88 bits per byte (bpb). Even with a decompressing program as large as almost 200KB, that is... wow!
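For a sense of scale, here is a rough back-of-the-envelope sketch of what 0.88 bpb means in bytes. The helper names are made up for illustration, "almost 200KB" is rounded up to 200,000 bytes, and whether the quoted 0.88 figure already counts the decompressor is not stated above, so the overhead is computed separately.

```python
# Back-of-the-envelope arithmetic for the figures quoted in the comment.
# Assumptions (not from the thread): decimal kilobytes, and the decompressor
# size rounded up to 200_000 bytes.

def bits_per_byte(compressed_bytes: float, original_bytes: int) -> float:
    """Compressed size expressed as bits per original byte."""
    return 8 * compressed_bytes / original_bytes

def compressed_size(bpb: float, original_bytes: int) -> float:
    """Compressed size in bytes implied by a bits-per-byte figure."""
    return bpb * original_bytes / 8

ORIGINAL_BYTES = 10**9          # "first billion bytes"
DECOMPRESSOR_BYTES = 200_000    # "almost 200KB", taken as an upper bound

payload = compressed_size(0.88, ORIGINAL_BYTES)
overhead_bpb = bits_per_byte(DECOMPRESSOR_BYTES, ORIGINAL_BYTES)

print(f"compressed payload: about {payload / 1e6:.0f} MB")
print(f"decompressor overhead: {overhead_bpb:.4f} bpb")
```

So the output is roughly 11% of the input, and a 200KB decompressor would add under 0.002 bpb if it were counted on top, which is why its size barely matters at this scale.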


Note that it is an XML-formatted version of Wikipedia, so some of that data is very predictable XML tags.


True, but that holds for all common compressors too, so beating them is still a huge win.
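If anyone wants to check how general-purpose compressors do on tag-heavy input, a minimal sketch using only the Python standard library could look like this. The sample is a made-up, highly repetitive XML-like string, not the actual Wikipedia dump, so the numbers it prints say nothing about real results on that data; swap in your own bytes to get a meaningful comparison.

```python
import bz2
import lzma
import zlib

def bits_per_byte(data: bytes, compress) -> float:
    """Compress `data` with `compress` and report bits per original byte."""
    return 8 * len(compress(data)) / len(data)

# Hypothetical stand-in input: repeated XML-like records, NOT real Wikipedia.
sample = (
    "<page><title>Example</title><text>Some article text.</text></page>\n" * 2000
).encode("utf-8")

compressors = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}

# Same input for every compressor, so the bpb values are directly comparable.
results = {name: bits_per_byte(sample, fn) for name, fn in compressors.items()}

for name, bpb in results.items():
    print(f"{name}: {bpb:.3f} bpb")
```

Since every compressor sees the same predictable tags, the tags lower everyone's bpb, and the comparison between compressors stays fair.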



