
Some real cognitive dissonance in this article…

“The PDF Association operates under a strict principle—any new feature must work seamlessly with existing readers” followed by introducing compression as a breaking change in the same paragraph.

All this for Brotli… on a read-many format like PDF, zstd's decompression speed is a much better fit.

Yup, zstd is better. Overall, use zstd for pretty much anything that can benefit from general-purpose compression. It's an excellent library, tool, and family of algorithms.

Brotli w/o a custom dictionary is a weird choice to begin with.


Brotli makes a bit of sense considering this is a static asset; it compresses somewhat more than zstd. This is why brotli is pretty ubiquitous for precompressed static assets on the Web.

That said, I personally prefer zstd as well, it's been a great general use lib.


You need to crank up zstd compression level.

zstd is Pareto better than brotli - compresses better and faster


I thought the same, so I ran brotli and zstd on some PDFs I had lying around.

  brotli 1.0.7 args: -q 11 -w 24
  zstd v1.5.0  args: --ultra -22 --long=31 
                 | Original | zstd    | brotli
  RandomBook.pdf | 15M      | 4.6M    | 4.5M
  Invoice.pdf    | 19.3K    | 16.3K   | 16.1K
I made a table because I wanted to test more files, but almost all PDFs I downloaded/had stored locally were already compressed and I couldn't quickly find a way to decompress them.

Brotli seemed to have a very slight edge over zstd, even on the larger PDF, which I did not expect.
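For anyone who wants to reproduce this: a rough shell sketch with the same flags, assuming the brotli and zstd CLIs are installed. sample.pdf is a hypothetical filename; the script substitutes compressible placeholder data if it's missing.

```shell
# Sketch: run both compressors with the flags quoted above and compare sizes.
f=sample.pdf
[ -f "$f" ] || yes 'placeholder stream data' | head -c 200000 > "$f"

brotli -f -q 11 -w 24 -o "$f.br" "$f"         # quality 11, 16 MiB window
zstd -q -f --ultra -22 --long=31 -o "$f.zst" "$f"

wc -c "$f" "$f.br" "$f.zst"                   # compare byte counts directly
```

Note that sizes should be compared in bytes (wc -c or du with apparent size), not disk blocks.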


EDIT: Something weird is going on here. When compressing with zstd in parallel it produces the garbage results seen here, but when compressing on a single core it produces results competitive with Brotli (37M). See: https://news.ycombinator.com/item?id=46723158

I did my own testing where Brotli also ended up better than ZSTD: https://news.ycombinator.com/item?id=46722044

Results by compression type across 55 PDFs:

    +------+------+-----+------+--------+
    | none | zstd | xz  | gzip | brotli |
    +------+------+-----+------+--------+
    | 47M  | 45M  | 39M | 38M  | 37M    |
    +------+------+-----+------+--------+

Turns out that these numbers are caused by APFS weirdness. I used 'du' to get them, which reports the size on disk; that is weirdly bloated for some reason when compressing in parallel. I should've used 'du -A', which reports the apparent size.

Here's a table with the correct sizes, reported by 'du -A' (which shows the apparent size):

    +---------+---------+--------+--------+--------+
    |  none   |  zstd   |   xz   |  gzip  | brotli |
    +---------+---------+--------+--------+--------+
    | 47.81M  | 37.92M  | 37.96M | 38.80M | 37.06M |
    +---------+---------+--------+--------+--------+
These numbers are much more impressive. Still, Brotli has a slight edge.
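For anyone curious about the distinction: a quick sketch showing on-disk vs. apparent size with a sparse file, where the two diverge in the opposite direction from the APFS case above. GNU du spells the flag --apparent-size; -A is the BSD/macOS spelling.

```shell
# Sketch: on-disk size vs. apparent size.
# A sparse file has a large apparent size but occupies almost no blocks.
truncate -s 10M sparse.bin           # 10 MiB apparent, nearly empty on disk
du -k sparse.bin                     # size on disk: near zero
du --apparent-size -k sparse.bin     # apparent size; `du -A -k` on BSD/macOS
```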

Does your source .pdf material have FlateDecode'd chunks or did you fully uncompress it?

> I couldn't quickly find a way to decompress them

    pdftk in.pdf output out.pdf decompress

What's the assumption we can potentially target as the reason for the counter-intuitive result?

That the data in PDF files is noisy, and zstd should perform better on noisy files?


What's counter-intuitive about this outcome?

Maybe that was too strongly worded, but there was an expectation for zstd to outperform, so the fact that it didn't means the result was unexpected. I generally find it helpful to understand why something performs differently than expected.

Isn't zstd primarily designed to provide decent compression ratios at amazing speeds? The reason it's exciting is mainly that you can add compression to places where it didn't necessarily make sense before because it's almost free in terms of CPU and memory consumption. I don't think it has ever had a stated goal of beating compression ratio focused algorithms like brotli on compression ratio.
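That dial is easy to see from the CLI. A rough sketch, assuming the zstd CLI is installed; sample.txt is a hypothetical input, with placeholder data generated if it's missing:

```shell
# Sketch: zstd's speed/ratio dial. Low levels are nearly free, while the
# top levels (--ultra up to -22) spend a lot of CPU chasing ratio.
f=sample.txt
[ -f "$f" ] || yes 'repetitive placeholder line' | head -c 200000 > "$f"

for lvl in 1 3 19 22; do
  zstd -q -f --ultra "-$lvl" -o "$f.$lvl.zst" "$f"
  printf 'level %-2s -> %s bytes\n' "$lvl" "$(wc -c < "$f.$lvl.zst")"
done
```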

I actually thought zstd was supposed to be better than Brotli in most cases, but a bit of searching reveals you're right... Brotli, especially at the highest compression levels (10/11), often exceeds zstd at its highest compression levels (20-22). Both are very slow at those levels, although perfectly suitable for "compress once, decompress many" applications, of which the PDF spec is obviously one.

I love zstd but this isn't necessarily true.

Are you sure? Admittedly I only have 1 PDF in my homedir, but no combination of flags to zstd gets it to match the size of brotli's output on that particular file. Even zstd --long --ultra -22.

Not with small files.

If that's about using predefined dictionaries, zstd can use them too.
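For reference, a minimal sketch of zstd's dictionary workflow (--train / -D), assuming the zstd CLI; the samples/ corpus below is toy stand-in data:

```shell
# Sketch: train a shared zstd dictionary from many small, similar files,
# then use it for compression and decompression of a tiny input.
mkdir -p samples
for i in $(seq 1 200); do
  for j in $(seq 1 20); do
    printf 'common header: alpha beta gamma; user=%s line=%s\n' "$i" "$j"
  done > "samples/s$i.txt"
done
printf 'common header: alpha beta gamma; user=999 line=1\n' > doc.txt

zstd -q --train samples/*.txt --maxdict=16384 -o shared.dict
zstd -q -f -D shared.dict -o doc.txt.zst doc.txt       # compress with dict
zstd -q -f -d -D shared.dict -o doc.rt.txt doc.txt.zst # decompress with dict
cmp doc.txt doc.rt.txt                                 # byte-identical round trip
```

The dictionary has to ship on both sides, which is why this works best for fleets of small, similar files rather than one-off documents.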

If brotli has a different advantage on small source files, you have my curiosity.

If you're talking about max compression, zstd likely loses out there; the answer seems to vary based on the tests I look at, but it seems to be better across a very wide range.


> Pareto

I don’t think you’re using that correctly.


It's correct use of Pareto, short for Pareto frontier, if the claim being made is "for every needed compression ratio, zstd is faster; and for every needed time budget, zstd compresses better". (Whether this claim is true is another matter.)

brotli is ubiquitous because Google recommends it. While Deflate definitely sucks and is old, Google ships brotli in Chrome, and since Chrome is the de facto default platform nowadays, I'd imagine it was chosen because it was the lowest-effort lift.

Nevertheless, I expect this to be JBIG2 all over again: almost nobody will use this because we've got decades of devices and software in the wild that can't, and 20% filesize savings is pointless if your destination can't read the damn thing.


Brotli compresses my files way better, but it's way slower doing it. Anyway, the universal statement "zstd is better" is not valid.

This bizarre move has all the hallmarks of embrace-extend-extinguish rather than technical excellence.

Well, except for speed, compression algorithms need to be compared in terms of compression, you know.

Here's discussion by brotli's and zstd's staff:

https://news.ycombinator.com/item?id=19678985


Note the language: "You're not creating broken files—you're creating files that are ahead of their time."

Imagine a sales meeting where someone pitched that to you. They have to be joking, right?

I have no objection to adding Brotli, but I hope they take the compatibility more seriously. You may need readers to deploy it for a long time - ten years? - before you deploy it in PDF creation tools.


(sarcasm warning...)

You're absolutely right! It's not just an inaccurate slogan—it's a patronizing use of artificial intelligence. What you're describing is not just true, it's precise.



