Web's random numbers are too weak, researchers warn

tptacek · on Aug 9, 2015

This presentation makes very little sense to me.

It appears to revolve around the idea that Linux servers "produce" entropy at an unexpectedly low rate over time, and "consume" entropy quickly.

But that's not how CSPRNGs work. A CSPRNG is a stream cipher, keyed by whatever entropy is available to seed it at boot, and, for purposes of forward secrecy, periodically (or continually) rekeyed by more entropy.

Just as for all intents and purposes AES-CTR never "runs out" of AES key, a CSPRNG doesn't "run out of" or "deplete" entropy. The entire job of a CSPRNG, like that of a stream cipher, is to take a very small key and stretch it out into a gigantic keystream. That's a very solved problem in cryptography.

I am automatically wary of anyone who discusses "entropy depletion", even moreso when they appear to be selling "entropy generators".

JoshTriplett · on Aug 9, 2015

So you're saying that Linux's /dev/random should go away, /dev/urandom or getrandom is always what you want as long as the system has collected enough entropy (even for key material), and Linux could stop "collecting entropy" once it has enough to seed the CSPRNG?

witty_username · on Aug 9, 2015

Correct. If the entropy is say 256 bits, then the attacker has to try 2^256 combinations. It's quite like encryption; you can encrypt a 1 TB file with a 256 bit key. (And indeed some stream ciphers like RC4 are just a PRNG xored with plaintext).

wolf550e · on Aug 9, 2015

AES-CTR keystream is also a CSPRNG xor'd with plaintext and is a better example because it's the recommended encryption mode (AES-GCM is AES-CTR + GMAC and is what everyone recommends).

tptacek · on Aug 9, 2015

Pedantic: there are better modes to use than GCM, for a couple reasons. GCM is the most performant widely available AEAD though.

JoshTriplett · on Aug 9, 2015

> there are better modes to use than GCM, for a couple reasons.

What are the better modes, and the reasons?

sdevlin · on Aug 9, 2015

The authentication mode for GCM is sort of fragile. While nonce reuse is always bad, it's particularly disastrous in GCM in that it immediately leaks the authentication key. Similarly, using GCM with a truncated authentication tag makes forgery easier than you'd expect and again leaks the authentication key in the process.

GCM is also difficult to implement in software for the same reasons AES is: the high-performance implementation strategies tend to rely on precomputed tables. This puts memory pressure on servers that handle a large number of keys concurrently. Table-based implementations also tend to expose cache-timing side channels. Fortunately, modern Intel machines have instructions (e.g. PCLMULQDQ) that aid implementations, though I'm not sure how widespread their use is in practice.

To be very clear, GCM is still a fine choice, and much safer than composing authentication and encryption yourself.

If you have access to it, NaCl's Secret Box is a good choice that avoids these problems. Libsodium implements NaCl and is pretty widely available, I think. OCB is also a good choice, though I haven't seen many implementations of this.

EDIT: For those interested, Niels Ferguson's criticism of GCM (http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comment...) is a great read. Lots of minor practical issues (e.g. specifying bit strings rather than byte strings, performance measurement across platforms, etc.) along with the aforementioned attack on short authentication tags.

tptacek · on Aug 9, 2015

JoshTriplett · on Aug 9, 2015

Any useful citations that could be used if one wanted to submit a patch doing exactly that and have it taken seriously? Something that would spell out exactly why it makes sense to rip out the entire runtime entropy estimator mechanism beyond making sure there's enough entropy to begin with?

tptacek · on Aug 9, 2015

I believe any such patch is doomed; the Linux randomness maintainer appears to be thoroughly convinced of the utility of /dev/random, which is unfortunate.

Many people have tried pointing out this design flaw to him.

JoshTriplett · on Aug 9, 2015

That's exactly what I had in mind when I was asking for any specific citations you might have that address this. Random patch versus self-appointed "expert" won't go nearly as well as supporting evidence versus lack of supporting evidence.

I have a little bit of hope due to this quote from the getrandom(2) manpage: "Unless you are doing long-term key generation (and perhaps not even then), you probably shouldn't be using GRND_RANDOM. The cryptographic algorithms used for /dev/urandom are quite conservative, and so should be sufficient for all purposes." The parenthetical and the second sentence give me some hope.

(Related question: is Linux using the most sensible CSPRNG it could be using?)

But even if that doesn't work, I can see a few ways around that. Most notably, /dev/random and its associated ongoing accounting could become optional. (This is already something I'd like to do for size reasons.) Then, for compatibility, random could be a symlink to urandom.

And yeah, between that and the ongoing battle against the Great Evils of hardware random number generators, the potential of Linux's randomness mechanism seems rather maintainer-limited.

tptacek · on Aug 9, 2015

Making /dev/random blocking optional doesn't help; in fact, it probably hurts, because it allows more Linux code to "safely" use /dev/random, and that code is dangerous on platforms where /dev/random does the old Linux thing.

JoshTriplett · on Aug 9, 2015

I'm not saying /dev/random's blocking would be optional; I'm saying the driver backing /dev/random would be entirely optional. Making it a link to urandom rather than having it entirely missing just avoids breaking old (misguided) userspace that still uses it; breaking userspace would make the proposal less likely to be accepted. Convincing various userspace components to switch to urandom, and for that matter fixing the documentation to stop recommending /dev/random, is a separate battle.

aidenn0 · on Aug 9, 2015

Perhaps you can clear something up: my understanding is that linux /dev/urandom will drain the entropy pool first and then use a CSPRNG when it is empty.

This is what I was told back in the 2.6 days, and it always seemed to me that you would want to stretch your entropy out more.

tptacek · on Aug 9, 2015

That has never been true, but it is for reasons passing understanding a very popular urban legend.

/dev/random and /dev/urandom use the same CSPRNG code. There are only two differences: there are separate state pools for them, and /dev/random includes logic to block on requests that exceed an estimator number.

Tomte · on Aug 9, 2015

And to make it even more clear: the notion that /dev/urandom first does "the right thing" and then (after entropy has been exhausted) falls back on the "worthless thing" (the CSPRNG) is absolutely and categorically wrong.

Both /dev/urandom and /dev/random always do "the worthless thing".

What people consider "the right thing", namely handing out raw collected random bits without any postprocessing, would even be catastrophic.

__Joker · on Aug 9, 2015

The blackhat whitepaper[1] and presentation[2] has more info, which might appetite curious HNers.

[1] https://www.blackhat.com/docs/us-15/materials/us-15-Potter-U...

[2] https://www.blackhat.com/docs/us-15/materials/us-15-Potter-U...

qrmn · on Aug 9, 2015

Honestly, this sales brochure of a "paper" tastes even worse than the BBC fluff piece. This is below the standard of paper I would have expected Black Hat to accept.

Good CSPRNG design is not a "dark art", and entropy is not "consumed" when a good CSPRNG is used. Any good CSPRNG uses a good PRF - any good block cipher in CTR mode, a hash, or a HMAC, perhaps - to stretch one good, solid, 256-bit entropy seed into as much cryptographically-secure random data as you'll ever need over the lifetime of your cryptosystem, and ratchets forward through the PRF after each call so the state cannot later be reversed (without breaking the PRF, but you're using a good one, so you'll be fine).

You need quality entropy to seed a CSPRNG - not quantity. Yes, it is, as we all know, very important you don't try to use a CSPRNG before its initial seed has collected enough good entropy - which is, yes, a particular problem in embedded systems or headless servers - but after that, the entropy in your CSPRNG seed isn't something that magically disappears, as you'll see from the design of Linux's newest random-number API, getrandom, patterned after the OpenBSD folks' ideas.

Reseeding a CSPRNG's state with more entropy is not a benefit, but in fact a potential risk every time you do it: it can result in entropy injection attacks if an attacker can observe the state, and control some of the entropy. That, in turn, could break your whole cryptosystem, especially if you're using fragile primitives like DSA or ECDSA. One source: http://blog.cr.yp.to/20140205-entropy.html

Detuned ring oscillator TRNGs [p2] can be troublesome to protect from RF side-channel attacks, or even RF injection attacks in pathological cases. Carefully used, they are fine, but best used when combined with shot-noise-based TRNGs. You can find those in surprising places: even the Raspberry Pi's BCM2835/BCM2836 has a perfectly serviceable one, available from /dev/hwrng after bcm2708-rng/bcm2835-rng has been loaded, and which rngd can use with no trouble.

Forgive me if, therefore, I perhaps wouldn't like to buy a "quantum random number generator" from Allied Minds Federal Innovations, Inc, who are behind this "paper", or to replace the OpenSSL RNG with theirs. That all feels far too much like a black box, and Dual_EC_DRBG is still fresh in our memory. I'd rather use the one Alyssa Rowan described to me on a napkin, thanks, or LibreSSL's/OpenBSD's arc4random with ChaCha20, or CTR_DRBG, or HMAC_DRBG.

bhickey · on Aug 9, 2015

> You need quality entropy to seed a CSPRNG - not quantity.

For the most part I think you're spot on, but I don't follow here. Entropy is measured in bits and bits are bits are bits. When we ask /dev/random for 256-bits it should return a 256-bit sequence, possibly after blocking. If that sequence exhibits less than 256-bits of entropy, it just means that the pool had a bad entropy estimate. Is there some nuance I'm missing?

qrmn · on Aug 9, 2015

The point I'm making is that people should probably balk at the suggestion that they need 200Mbps/sec of entropy from a mysterious black box on a PCIe card sold to them by an NSA affiliate who want them to put it into their critical servers. No. Just... no. Don't do that.

256 good bits, once, is quite enough, as long as they are good. You might well try to collect more entropy, and your CSPRNG's setup might use a compression function (e.g. a cryptographic hash) to combine them into the seed to try to hedge against failures. That's quite a good idea, as long as the last one you sample is your most trusted (see djb's blog for why). But you don't need megabits of entropy, and you don't need it on an ongoing basis. That task is solved by the PRF.

So what you should perhaps be doing is not using /dev/random at all, but using Linux's default getrandom syscall to get 256 bits to seed your userspace CSPRNG instead. The urandom mode of that will block if it hasn't collected enough entropy, and will never block thereafter, and it also doesn't need a device node handy.

Even attempting to estimate entropy is perilous, so most modern CSPRNGs don't try. (Note, by way of example, the difference between the earlier Yarrow and the later Fortuna.)

PhantomGremlin · on Aug 9, 2015

sold to them by an NSA affiliate

So if you're against quantum RNGs you probably won't be buying anything from ID Quantique[1]. I don't know if they're an NSA affiliate. But they are Swiss, and another Swiss company, Crypto AG,[2] reportedly backdoored their crypto at the behest of the NSA.

I don't share your enthusiasm for shot-noise-based RNGs. There's a lot to go wrong there as well.

[1] http://www.idquantique.com/ [2] https://en.wikipedia.org/wiki/Crypto_AG#Back-doored_machines

pimlottc · on Aug 9, 2015

So basically, entropy bits do get used up, but it's not the problem you should worrying about.

tptacek · on Aug 9, 2015

In the sense you're thinking about, entropy bits do not get "used up". The reason they're continuously refreshed is because something could theoretically happen to your system that compromises your CSPRNG internal state, and if the CSPRNG wasn't rekeyed you'd be permanently compromised.

geofft · on Aug 9, 2015

If you ask a CSPRNG for 256 bits, it should return 256 bits chosen uniformly at random from 0 to 2^256-1.

If you ask it again for another 256 bits, it should return another 256 bits chosen uniformly at random from 0 to 2^256-1. That's pretty straightforward.

But you can satisfy both of these without the combined 512 bits being chosen uniformly at random from 0 to 2^512-1. For instance, if you generate a random 256-bit key for a 256-bit-block cipher, and encrypt 0 and 1 with the cipher, the two blocks are uniformly at random from their range, but they're not independent. Since there are only 2^256 possible keys, not all 2^512 outputs are possible by a straightforward counting argument.

You need quality entropy, but only up to the quantity of the security level of your system.

bhickey · on Aug 9, 2015

> You need quality entropy, but only up to the quantity of the security level of your system.

What do you mean when you say entropy quality? As I said before, entropy is measured in bits. If you have two coin tosses that are correlated, you don't say "I've got 2-bits of low quality entropy". You simply have strictly less than 2-bits of entropy.

pedrocr · on Aug 9, 2015

Going through these it seems they've gone to a whole lot of trouble to implement something no one really needs (an extra layer for entropy management). Their reasoning seems to be:

1) OpenSSL seeds its CSPRNG once on startup from /dev/urandom

2) A Linux server will often have low entropy when responding to that call

3) The OpenSSL CSPRNG is thus compromised and some extra logic is needed to only take the value once there's enough entropy

The thing is 2) is not really a problem. As long as the pool had enough entropy at the beginning to seed the /dev/urandom CSPRNG the output is good forever even if the pool is now empty. I think most distros already make sure /dev/urandom is properly seeded on startup so there should be no real attack here.

On 3) we should probably be going the other way (less code). Apparently OpenSSL actually has its own CSPRNG instead of just reading from /dev/urandom when it needs random numbers. Maybe there's a valid performance reason to do that (less context switches) but I doubt it.

rsy96 · on Aug 9, 2015

The definition of cryptographic pseudorandom generators are deterministic functions that turn a true random number (seed) of bit length k into a stream of seemingly random numbers of bit length n, where n is much larger than k. A computationally bounded attacker cannot distinguish this stream of pseudo random numbers from true random numbers, unless he/she knows the seed.

So if the CSPRNG are truly cryptographic secure, you don't need constant stream of high entropy input. You only need enough starting entropy, say 256 bits, and you will be fine for a long time.

pilsetnieks · on Aug 9, 2015

Is this really the kind of article we want here?

> A study found shortcomings in the generation of the random numbers used to scramble or encrypt data.

> The hard-to-guess numbers are vital to many security measures that prevent data theft.

> But the sources of data that some computers call on to generate these numbers often run dry.

> This, they warned, could mean random numbers are more susceptible to well-known attacks that leave personal data vulnerable.

I get that you have to simplify for the ordinary people but this looks like talking to a five-year-old.

colinbartlett · on Aug 9, 2015

Funny, I came to the comments to remark about how nice it was to see an article very clearly articulate the issue.

Yes, I generally understand these concepts, but I am not a security professional and found the explanation useful both for my own comprehension and for improving my ability to relay complex technical topics to non technical people.

pilsetnieks · on Aug 9, 2015

I get that it's good to educate the general public but first of all, we're not exactly the general public here, and something even a little bit more substantial would have been nice.

Second, this description is so vague, it could apply to almost anything. When the next openssl vulnerability, or android bug, or any other crypto weakness appears, you could almost run the same article: "Web's secret numbers are too weak. These numbers are vital to security. They prevent data theft. Your data could be vulnerable." While true, it's pointless.

amouat · on Aug 9, 2015

I submitted it, and that thought did cross my mind.

However, I see someone has linked the papers with the original research in a comment which allows anyone curious to dig into the actual details.

At any rate, it's always interesting to see how the mainstream media cover computing/hacking news.

davnicwil · on Aug 9, 2015

I guess you're not familiar with the BBC's tech journalism?

TLDR: It's interesting to HN readers only on the meta level (how does a non-technical readership consume tech news in 2015).

BBC tech news makes very strange reading for someone in the tech world. All the technical terms we use to be precise about things are replaced with broad, ostensibly more understandable, terms that do often sound very childish in comparison - sometimes so much so that they do actually obscure the meaning and even the truthfulness of the message.

As such it can make for frustrating reading, only because you wish there was a better way to communicate these ideas that doesn't mislead people or dumb down the information so much that it's basically without value.

There almost certainly is, to be honest, but it's very difficult, and so probably an unwise investment of effort for the likes of the BBC, whose readers are mainly there to read 'general' news stories and might skim a couple of tech stories as an aside.

It's probably much like a Physics PhD going back to science classes in school - "I see what you're trying to say here, and I perhaps can see why you're saying it in the way you are, to introduce a concept in a way you think is most accessible, but what you're saying isn't actually fully correct, and I find it irritating that you're implying it is".

davidgerard · on Aug 9, 2015

It's relevant and useful to know that this is now a matter of mainstream concern.

sarciszewski · on Aug 9, 2015

Uh oh, is this another Debian bug or the underpinnings of a fundamental weakness in Linux's CSPRNG?

I wonder if OpenBSD's arc4random_buf() is unaffected?

cc 'tptacek :)

kaesve · on Aug 9, 2015

As far as I could see, it's not a problem in the CSPRNG itself, but in how it is used. More specifically it seems like a lot of applications use more entropy bits per second than servers generate by normal use. I'd say this is the result of not understanding how CSPRNG works and how to use it safely. Adding more or better sources of entropy to your systems would solve this.

pedrocr · on Aug 9, 2015

>I'd say this is the result of not understanding how CSPRNG works and how to use it safely.

My take from previous discussions is that once you seed a CSPRNG properly you can take secure random numbers from it forever. So in a linux server once /dev/urandom has been properly seeded you can take random numbers from it forever with no issues.

So if what they discovered in this research is that "Linux's /dev/urandom entropy pool often runs empty on servers" this shouldn't really be much of an issue.

sarciszewski · on Aug 9, 2015

So it's, once again, an issue where cloud servers aren't being seeded before their keys are generated?

tedunangst · on Aug 9, 2015

OpenBSD does not allow entropy to be "used up".

sarciszewski · on Aug 9, 2015

Yeah, and the concept of "entropy can be used up" is stupid. :)

I thought maybe this was a flaw in the PRFs used in Linux, for example.

atoponce · on Aug 10, 2015

This paper asserts something innacurate: entropy pools can be "used". Entropy is not an object, it's an estimation. Just as you don't use temperature, or barometric pressure, you don't "use" entropy.

Further, once a CSPRNG is properly seeded, there is no need to concern yourself with whether or not it can produce "high quality random numbers", provided the cryptographic primitive behind the CSPRNG contains conservative security margins. The Linux kernel CSPRNG uses SHA-1 as the underlying primitive. While SHA-1 is showing weaknesses in blind collision attacks, it still contains wide security margins for preimage attacks, which is what would need to be applied to a key generated by a CSPRNG (you have the output, now predict the state that produced it). Even MD5 remains secure against preimage and second preimage attacks.

Again, once properly seeded, the Linux CSPRNG can produce data indistinguishable from true random indefinitely until SHA-1 is sufficiently broken.

praseodym · on Aug 9, 2015

I have a VM running Debian Jessie (Linux 3.16) that has very low entropy available (cat /proc/sys/kernel/random/entropy_avail returns <200 most of the time) even though the Intel RDRAND instruction is available. Shouldn't it be using that to fill up the entropy pool, or am I misunderstanding how the entropy_avail value works?

Freaky · on Aug 9, 2015

Linux is very conservative with how much entropy it credits to things like RDRAND since they can't be easily trusted. Looks like you get one extra bit per interrupt:

https://github.com/torvalds/linux/blob/master/drivers/char/r...

You'll note no other use of arch_get_random_* throws anything at credit_entropy_bits().

Linux 3.17 did bump up the assumed quality of virtio-rnd:

https://github.com/torvalds/linux/commit/34679ec7a0c45da8161...

mukyu · on Aug 9, 2015

The talk about needing to constantly add more entropy or 'manage' it is nonsense. djb says it best: http://blog.cr.yp.to/20140205-entropy.html

Briefly, once you have say 256 random bits it is trivial to use AES and CTR mode and turn that into 2^71 random bits until you need to rekey. If you cannot get more entropy in the time it takes to use up all of those numbers something is completely broken. The only problem you can have is not having enough entropy to bootstrap (such as VMs or needing to generate a key at poweron on an embedded device), but this paper gives little more than lipservice to it.

im3w1l · on Aug 9, 2015

How big problem is this in practice? Let's say you only have 256 bits of "real entropy" and you then stretch that into large amounts of pseudo-random bits using a state of the art CSPRNG and use those bits for all your randomness needs. Let's say worst case scenario here, so a server that is online for several years, with no reseeding at all. Are there any practical attacks against that?

marcosdumay · on Aug 9, 2015

If you stretch it into a set of numbers with 256 bits or less, you are good. If you expect to generate bigger random numbers from it, you have a problem.

But the pool does not stay with only 256 bits for long (if at all). It's always accumulating more.

Anyway, if the pool ever get to zero, it means that an attacker with infinite resources that can see the entire sequence generated by the CSPRNG could predict the next numbers it'll generate. On practice none of those conditions are met.

Tomte · on Aug 9, 2015

So you don't trust modern block ciphers and avoid encrypting more than 128 (or 256) bits of data with a single key?

Isn't that extreme key rotation a bit bothersome?

No, stretching your seed of 256 bits into terabytes of pseudorandom numbers is normal and absolutely fine.

mangeletti · on Aug 9, 2015

It would be interesting to get Bruce Scheier's take on this.

He wrote this in 99, and it talks about key length near the bottom (though doesn't cover this exact scenario): https://www.schneier.com/crypto-gram/archives/1999/0215.html