The bell has tolled for rand()

jlebar · on Feb 12, 2015

Not to focus on a question unrelated to this blog post's point, but

* "auto main() -> int" could be just "int main()".

* "auto v = vector<int>(20);" could be just "vector<int> v(20)".

* "auto print_value = [](auto&& v)" could be (I'd argue should be) "auto print_value = [](const auto& v)".

"auto" is useful, "auto" is great. But "auto" is not an end in and of itself.

(Bring on the C++(11) haters, blah blah.)
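For readers who want to see the three suggested rewrites side by side, here is a minimal sketch. The function names `count_values` and `format_value` are hypothetical and exist only for illustration; the generic lambda assumes a C++14 compiler.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Classic return type instead of "auto count_values() -> int".
int count_values() {
    std::vector<int> v(20);  // instead of: auto v = vector<int>(20);
    assert(!v.empty());
    return static_cast<int>(v.size());
}

// Generic lambda (C++14) taking its argument by const reference, which
// states in the signature that the argument is only read.
std::string format_value(int value) {
    std::ostringstream out;
    auto print_value = [&out](const auto& v) { out << v << ' '; };
    print_value(value);
    print_value("done");
    return out.str();
}
```

Both spellings compile to the same thing; the difference is only in what the reader sees at the declaration.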

fafner · on Feb 12, 2015

1. The new declaration style might look a bit strange to people used to the old style. But one could argue that using it will make it more consistent for the cases when you really do need to use it. Anyway it's just a matter of style. Not really a need to discuss it IMO.

2. Some people argue that

  auto v = T{x};

should be the new way of defining variables with types. The use of {}-initialization should prevent unnecessary narrowing. In this case () has to be used because a specific ctor is to be called. But using that style would make it more consistent with the rest.
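A minimal sketch of the two points made here: braces reject narrowing, and parentheses are still needed when a specific constructor has to be selected. The function names are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <vector>

// Parentheses select the size constructor: 20 value-initialized ints.
std::vector<int> make_with_parens() {
    auto v = std::vector<int>(20);
    return v;
}

// Braces prefer the initializer_list constructor: one element, value 20.
std::vector<int> make_with_braces() {
    auto v = std::vector<int>{20};
    return v;
}

// Brace initialization rejects narrowing conversions at compile time.
int braces_reject_narrowing() {
    auto ok = int{42};           // fine: no narrowing
    // auto bad = int{3.7};      // ill-formed: double -> int narrows
    auto truncated = int(3.7);   // compiles and silently truncates to 3
    assert(ok == 42);
    return truncated;
}
```

This is why the `vector<int>(20)` case in the article cannot simply switch to braces: the meaning changes from "20 elements" to "one element equal to 20".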

forrestthewoods · on Feb 12, 2015

I'm not convinced auto should ever be used for anything other than iterators and anonymous types. There are other instances where it's relatively fine to use. Perhaps a few where it's even slightly better to use. But in C++11 I think it's by far best to limit it to the explicit cases I mentioned.

In C++14 I'll extend support to lambdas in some cases but will need to experiment to say for sure.

fafner · on Feb 12, 2015

Sorry, but your comment lacks substance because you don't provide any argument for the way you think. Why should it only be used for iterators and anonymous types?

There is a pretty good argument for using the

  auto var = T{val};

style of defining variables. The {}-initialization won't narrow literals. You can find more discussion about this in GotW.

Maybe you have a good reason for not liking the style and there certainly can be arguments against it. But from experience I find that many people opposing more use of auto simply oppose it because it is different to their old ways. Therefore I'd like to see a proper argument for your comment.

EliRivers · on Feb 12, 2015

I really, really like knowing what kind of object something is by looking at the code where it is created. I really, really like it. I find it extremely helpful.

This sort of thing:

auto x = function();

is so unhelpful

I find it frustrating and an active impediment when I can look at an object being created and not know what kind of object it is. In some programming languages, knowing what kind of object something is doesn't matter so much. Not the case here.

fafner · on Feb 12, 2015

Why do you need to know the exact type? Why does it matter so much for C++ but not for other languages? I mean there are templates in C++ and a lot of code even in C++98 was written not knowing the exact type and just assuming or expecting certain properties/methods to work.

EliRivers · on Feb 18, 2015

> Why do you need to know the exact type?

Okay, YOU tell me what class functions that variable "x" has. Code isn't written just for the compiler to read.

forrestthewoods · on Feb 12, 2015

I specifically referred to C++11. And I believe your example with the T{val} is C++14 so I don't know much about it.

My experience is working on video game engines with other people. As a senior developer a large part of my job is jumping through a wide range of systems written by other people to debug problems, identify ways to increase performance, add features, etc. There's also a lot of code written by people who have moved on to other jobs.

We've had quite a bit of code that was full of autos. In my experience thus far auto has never made code significantly more readable or easier to understand than not using auto (excluding iterators/lambdas). Not only has auto not made code easier to comprehend, the use of auto has made it significantly more difficult to understand in more than a few contexts. If deducing a type as a human reader of text requires backtracing a half dozen calls of code across who knows how many lines of code, and even files, then I'm gonna be justifiably grumpy. It's a huge burden. And for what benefit? Damn near nothing in my experience so far.

fafner · on Feb 13, 2015

> And I believe your example with the T{val} is C++14 so I don't know much about it.

No, it's C++11.

MrDosu · on Feb 12, 2015

This is something an IDE does for you...

forrestthewoods · on Feb 12, 2015

In theory, but not in practice. Visual Studio starts off great but it eventually chokes and dies for C++. Tools like Visual Assist can extend the lifetime but eventually it too will fail. This is true for every game and every engine I've ever worked on.

I now do 100% of code editing in Sublime Text. Other people use other text editors. I'm now of the opinion that code bases should be useable and searchable in plaintext form. It's not difficult and even with an IDE makes things better imo.

MrDosu · on Feb 12, 2015

I would (almost) completely disagree with this.

C++ support in VS is horrible, that's correct, but having the ability to be supported by compiler services that can parse (invalid) code is a major milestone when it comes to handling more complex codebases.

It does not matter how smart you are, the easier it is for you to understand and reason about the code, the more will fit in your head.

Refactoring is just one of the many amazing tools that make me a much better and more productive programmer today than I was without them when using vi in the 90s. Add in static analysis, intellisense etc... Every bit of complexity that tooling can hide from you is worth gold.

Reality for C++ is grim though in this regard. Let's hope we get better compiler services for it soon.

forrestthewoods · on Feb 12, 2015

I agree that the easier it is to understand and reason about code the better. That's why I'm opposed to most uses of auto. It makes code harder to understand. It might not make code harder to understand if it was used with tools that don't exist. But those tools don't exist. I'm constantly re-evaluating my opinions and for auto I keep reaching the same conclusion. I pray that someday new tools are released that make my work life better. When they are, some of my re-evaluations, related to auto or otherwise, will certainly change. But sadly that day is not today.

MrDosu · on Feb 13, 2015

Just interested: Do you use static analysis tools (like for example coverity) in the gamedev industry?

papaf · on Feb 12, 2015

Most people think that auto should not be used much, but Herb Sutter recommends that everyone should use it much more. His argument is convincing:

https://github.com/CppCon/CppCon2014/blob/master/Presentatio...

gpvos · on Feb 12, 2015

Interesting to see that OpenBSD recently went the other way: break the standard and make rand() a good random generator by default. Even to the point of making srand(time(0)) a no-op.

https://news.ycombinator.com/item?id=8719593

brudgers · on Feb 12, 2015

That's part of exactly what makes the problems with rand() so intractable. Whether a program performs as specified is an arbitrary function of real world state.

nly · on Feb 12, 2015

It's good that they did that, as a safety net for old code, but none of the standards which define rand() guarantee or even suggest uniform randomness, so the bell has still certainly tolled.

DonHopkins · on Feb 12, 2015

"The C random library has always been… to put it politely… less than ideal. Okay, it’s pretty fucking horrible. It’s so bad that the C standard itself suggests you’d be better off not using it."

Back in the days of 4.2 BSD or so, the BUGS section of the manual entry for rand understatedly mansplained that it had "bad spectral characteristics". In fact, it was so bad that the lowest bit alternated between 0 and 1 every time you called it. Hard to miss that bright line on a spectrogram.

If you couldn't figure out what to expect from such a forthright disclosure in the manual, then you were in for quite a shock when you did the obvious thing and tried to use "rand() & 1" to simulate flipping a coin!
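The alternating low bit is a general property of linear congruential generators with a power-of-two modulus and odd multiplier and increment. The sketch below uses a toy generator to show it; `ToyLcg` and `coin_flips` are hypothetical names, the constants are illustrative, and this is not the historical BSD source.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy linear congruential generator with modulus 2^32 (unsigned wraparound).
// Assumption: constants picked for illustration, multiplier and increment odd.
class ToyLcg {
public:
    explicit ToyLcg(std::uint32_t seed) : state_(seed) {}

    // Returns the low 31 bits of the state, low-order bits included.
    std::uint32_t next() {
        state_ = state_ * 1103515245u + 12345u;
        return state_ & 0x7fffffffu;
    }

private:
    std::uint32_t state_;
};

// Simulates n coin flips using "value & 1".
std::vector<int> coin_flips(std::uint32_t seed, int n) {
    assert(n >= 0);
    ToyLcg gen(seed);
    std::vector<int> flips;
    for (int i = 0; i < n; ++i) {
        flips.push_back(static_cast<int>(gen.next() & 1u));
    }
    return flips;
}
```

With an odd multiplier and an odd increment, the lowest bit of the new state is always the inverse of the lowest bit of the old state, so `coin_flips` yields a strictly alternating sequence for any seed.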

fulafel · on Feb 12, 2015

The claim that the C standard suggests not using it is disingenuous: The footnote in question just says that there aren't implementation quality guarantees and specific requirements can be met with application specific RNGs. This applies to most of the standard library...

Also the anecdote about bad rand() in a 1983 BSD libc says little about current rand() implementations!

These look ok, for example:

https://github.com/lattera/glibc/blob/master/stdlib/random_r...

https://www.opensource.apple.com/source/Libc/Libc-320/stdlib...

Of course it's true that relying on libc implementation quality is always risky for maximally portable programs. Same can be said for stdio, malloc, etc etc.

tedunangst · on Feb 12, 2015

rand() and random() are different functions.

lloeki · on Feb 12, 2015

If I were to use C, what should I be using instead of rand(3)?

A cursory look at rand(3) SEE ALSO hints at candidates but random(3) seems to hardly fare better†, arc4random(3) isn't available on glibc.

† A few notes about rand as the parent suggests, but nothing regarding mod bias, srandom initial state, or threads:

> It returns successive pseudo-random numbers in the range from 0 to (2**31)-1. The period of this random number generator is very large, approximately 16*((2**31)-1).

> All of the bits generated by random() are usable. For example, `random()&01' will produce a random binary value.
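The "mod bias" mentioned above can be shown with plain arithmetic. This sketch assumes an idealized generator whose outputs 0 .. range-1 are all equally likely, and counts how the remainders fall out under `% 3`; `remainder_counts` is a hypothetical helper name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Counts how often each remainder occurs when every value of a generator
// with `range` equally likely outputs (0 .. range-1) is reduced with "% 3".
std::array<int, 3> remainder_counts(int range) {
    assert(range > 0);
    std::array<int, 3> counts{};  // zero-initialized
    for (int value = 0; value < range; ++value) {
        ++counts[static_cast<std::size_t>(value % 3)];
    }
    return counts;
}
```

With 8 possible outputs the remainders 0 and 1 each occur three times but 2 occurs only twice, so `x % 3` is not uniform even for a perfect source. The bias disappears only when the range is a multiple of the divisor, as with 9 outputs.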

tedunangst · on Feb 12, 2015

There are several paths to getting arc4random or something like it.

You can just use it and optionally tell people to link with e.g. libcrypto from libressl.

You can include the portable code yourself, though that's yucky. Aging software that includes never updated very early versions of arc4random is actually kind of a problem because that code still gets used when better versions are available.

You can link with https://github.com/nmathewson/libottery-lite which is approximately arc4random with a different name.

Some combination of the above.

For example, sqlite3 includes an arc4random workalike (rc4 rng), but doesn't discard the early stream (critical because it leaks the key) nor include any degree of fork safety. Nor does it check if the host provides a better version. sqlite3 shipped with OpenBSD is patched to use libc arc4random instead, but building from source means you're back to square one. Not a big deal in the case of sqlite3, but try not to build something that doesn't improve as the world around it improves.

justincormack · on Feb 12, 2015

arc4random (under a different name as the name is misleading) will be standardised, and Linux now has a getrandom() syscall that is a necessary building block, so it doesn't fail.

In the meantime, use arc4random on openbsd and netbsd 7 and read from /dev/urandom on all other platforms (FreeBSD is in process of fixing arc4random, currently still uses rc4 and may have problems with state being same in multiple threads; I guess OSX is the same version).
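Reading from /dev/urandom can be sketched roughly as follows. The helper name `read_random_u32` and its signature are assumptions made for this example, and real code would also want to think about partial reads, fork safety and platforms without the device.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Reads one 32-bit value from a random device file into `out`.
// Returns false if the file cannot be opened or the read comes up short.
bool read_random_u32(const std::string& path, std::uint32_t& out) {
    assert(!path.empty());
    std::ifstream device(path, std::ios::in | std::ios::binary);
    if (!device) {
        return false;
    }
    std::uint32_t value = 0;
    device.read(reinterpret_cast<char*>(&value), sizeof value);
    if (device.gcount() != static_cast<std::streamsize>(sizeof value)) {
        return false;
    }
    out = value;
    return true;
}
```

A caller would write something like `std::uint32_t seed; if (read_random_u32("/dev/urandom", seed)) { ... }` and treat a `false` result as a hard error rather than falling back to a weaker source.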

Android does have arc4random, although it can fall back to weaker random sources if it does not manage to open /dev/urandom https://code.google.com/p/android-source-browsing/source/bro...

noselasd · on Feb 12, 2015

How well does the rand48 family of functions fare ? http://pubs.opengroup.org/onlinepubs/7908799/xsh/drand48.htm...

AceJohnny2 · on Feb 12, 2015

> manual entry for rand understatedly mansplained

Heh, I see what you did there, but the downvotes imply others didn't.

recursive · on Feb 12, 2015

Presumably "man" for "manpages".

unwind · on Feb 12, 2015

Probably part that, part joking reference to the term as used outside computing (see http://www.urbandictionary.com/define.php?term=Mansplain).

I guess the idea is that manual pages can sometimes seem to be intentionally obscure, and kind of proud of it. I don't have an example handy, nor am I even sure I agree, but I think I got the joke, at least. :)

DonHopkins · on Feb 12, 2015

Yes, that was my intent: the manual entry used the term "bad spectral characteristics" condescendingly, instead of actually admitting that rand was a terrible mistake and nobody should ever use it. The effort the manual writer put into rationalizing the bug would have been better put into fixing it.

I looked up "mansplaining" on the urban dictionary while writing that, and was disgusted to see that it was full of sexist definitions written by obviously butt-hurt men's rights advocates, literally blaming angry misandrist pseudo-intellectual ugly insecure radical fat women for using it as a get-out-of-jail-free card against daring opinionated oppressed men. http://www.urbandictionary.com/define.php?term=mansplaining

So it's possible the downvotes came from red pill poppers reflexively reacting to my use of the word "mansplaining", and if so, I suffer those fools gladly, and hope to earn their downvotes again. http://www.salon.com/2014/07/01/feminism_is_a_sexual_strateg...

Here's a Unix manual entry that I wrote, which made it into Solaris and SVR4, whose BUGS section is now obsolete thanks to technological advances based on high quality pseudo random number generators, like https and PayPal: https://stuff.mit.edu/afs/net/system/sun4c_41/rsp.01/usr/ope...

recursive · on Feb 12, 2015

I mean, in general, if I see the term "mansplain" on the internet with no context, I'm going to assume the usage is in line with what's described on urban dictionary. Just because that's how it's actually predominantly used. This is the first usage I've ever seen that was something else. So it's not totally unreasonable.

leni536 · on Feb 12, 2015

Why should a specific random generator be in the standard at all? Random generators are quite a dangerous area.

Picking one is not without compromises: Do you want your PRNG fast? Or do you want it cryptographically secure?

It could become obsolete in less time than expected. It's mostly true for CSPRNGs. Maybe that's why they are considering Mersenne Twister which is "good enough" for many use cases but not meant to be cryptographically secure. Sure, I'm using it right now for physics simulation and it's certainly good enough for that and it's hard to imagine a case where a 623-dimensionally equidistributed PRNG could fail. But it most certainly can if it can't be used for cryptography. It is much better than rand() though and one could argue that it will be good enough 99% of the time. The problem is that you can't replace it if it ever becomes obsolete, since some software depends on the predictability and still random-like features of PRNGs (like game map generators, digital art, etc...).

I think the one boost library they should standardize is boost's random device. PRNGs could become obsolete but "truly random" will always mean the same. However it's not trivial that one has access to a truly random source and in that case it should fall back to fail. At least it could kill the practice of seeding with time.

lmm · on Feb 12, 2015

> Picking one is not without compromises: Do you want your PRNG fast? Or do you want it cryptographically secure?

Many library functions involve tradeoffs. Do you want your malloc() fast? Or do you want it to minimize memory usage? Do you want your sorting function to be fast in the average case or be without pathological edge cases? A PRNG is in no way unique in this respect.

The standard library is ultimately a tool like any other; it should contain things that are useful. Of course there are compromises involved in any particular choice of PRNG, but that doesn't mean not including any is a better choice.

My view is that a lot more things should be rigidly specified by the standards. The days of radically varying CPU architectures and radically different OSes are gone; these days underspecification mostly serves to trip programmers up. Standardizing the memory model was a huge step forward, and it's time for C++ to go further in that direction, time to start defining what was previously undefined or implementation-defined behaviour.

ygra · on Feb 12, 2015

> hard to imagine a case where a 623-dimensionally equidistributed PRNG could fail

Well, there is the generalization and improvement of MT19937 in form of the WELL generators. They go beyond that, so there probably are cases where it's needed.

> PRNGs could become obsolete but "truly random" will always mean the same

I'd argue that far fewer people need truly random than those who just need a dice roll.

There are those who work on crypto. They won't need anything in the C++ stdlib because there are libraries specifically for that. They also provide much more that is needed for programs utilizing cryptographic primitives. This group should know what they're doing and choosing, and if not should have no business writing such software. And they won't even accidentally pick MT.

There are those who do numerical simulations. They neither need nor can use a "truly-random" generator because submitting a paper with the words "to reproduce our results, grab the following 2 TiB file of random bits ..." is probably frowned upon. Reproducibility of a sequence is a feature, and a good one. Heck, quasi-random algorithms also have their place. In any case, you shouldn't do simulations with just a single PRNG either, to rule out interactions between intricacies of your model/simulator and the PRNG. So people in this group are likely to use a framework or library that caters to their needs, too, which includes several different PRNGs (even obsolete ones for checking older results), distributions, special data structures needed for certain kinds of simulations (e.g. event queues), etc. This group also should know what they're doing and why, otherwise they shouldn't publish research at all.
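The reproducibility point can be illustrated with `std::mt19937`, whose output sequence for a given seed is fixed by the C++ standard. `draw_sequence` is a hypothetical helper; note that it returns raw engine outputs, since the standard distributions are not guaranteed to produce the same values across library implementations.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Draws `count` raw 32-bit outputs from std::mt19937 seeded with `seed`.
std::vector<std::uint32_t> draw_sequence(std::uint32_t seed, int count) {
    assert(count >= 0);
    std::mt19937 engine(seed);
    std::vector<std::uint32_t> values;
    for (int i = 0; i < count; ++i) {
        values.push_back(static_cast<std::uint32_t>(engine()));
    }
    return values;
}
```

Publishing the seed is then enough to let someone else replay the exact same stream, which is what a 2 TiB file of random bits would otherwise be needed for.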

And then there is everyone else who just needs a random number every now and then. Maybe for shuffling a playlist, maybe for rolling a die. Having easy access to a PRNG that works well for a large number of use cases (they wouldn't have any idea what to choose anyway) is a major benefit to this group. They don't care about (or notice) the difference between pseudo-random and truly random (even though the latter sounds more impressive and the former somehow not random enough, even though anything that doesn't look random in the former case is often bad seeding, e.g. in a loop). They just want something to work. MT19937 is a very safe choice for this group, and as an added benefit, it's much better and often faster than LCGs.
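For the "roll a die" case, a sketch with the C++11 `<random>` facilities might look like this. `roll_die` and `make_seeded_engine` are hypothetical names; seeding from a single `std::random_device` value is a simplification, and the quality of `std::random_device` depends on the implementation.

```cpp
#include <cassert>
#include <random>

// Rolls a die with `sides` faces using the supplied engine.
int roll_die(std::mt19937& engine, int sides = 6) {
    assert(sides >= 1);
    std::uniform_int_distribution<int> face(1, sides);  // bounds are inclusive
    return face(engine);
}

// Builds an engine seeded from std::random_device instead of time(0).
std::mt19937 make_seeded_engine() {
    std::random_device device;
    return std::mt19937(device());
}
```

Typical use is to create the engine once with `make_seeded_engine()` and pass it to every `roll_die` call. Creating and seeding a new engine inside a loop is the "bad seeding" pattern mentioned above.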

StephanTLavavej · on Feb 12, 2015

std::random_device is part of C++11, and VC's implementation guarantees that it's crypto-secure.

tedunangst · on Feb 12, 2015

The problem is the standard doesn't guarantee that, so people will inevitably read the standard, decide that they can't actually trust random_device, and then build their own, much worse version.

nly · on Feb 12, 2015

And on Linux you can pass the string "/dev/urandom" to the constructor

arsv · on Feb 12, 2015

The title is grossly misleading. It should read "C++17 people choose Boost over std::rand() for their RNG needs" or something like that, a hardly surprising statement since C++17 people would choose Boost for pretty much anything else as well.

In particular, it has little to do with rand() (as in rand(3) from libc), which has its uses as well as well-known alternatives within C world.

As a side note, it's funny to see fresh new C++ code that boils down to srand(readintfrom("/dev/random")), except /dev/random is now given an "abstract standard name" random_device.

And that part about limited seeding options beats me. Boost (the library) alone won't fit in the memory of a device with 16-bit ints, so inability to seed the RNG will be among the least of their problems.

unscaled · on Feb 14, 2015

Some C++ programmers love writing cross-platform code, with platform including non-UNIX platforms here, so /dev/random just won't work. That's why you'd need an "abstract standard name".

One of the largest user audiences of C++ is game developers, and (excluding mobile) 99% of their target platforms don't have /dev/random, so it's perfectly normal to want an abstract random device.

As a side note, if you've actually read the article through you'd learn that:

1. The Boost.Random stuff entered the C++ standard back in TR1 (published in 2007), way earlier than C++17.

2. Even if your seed is perfectly random, rand() would give you crappy distribution.

3. srand()/rand() is not re-entrant, which makes it a really bad idea to use it in any codebase that has a chance to grow larger one day.

4. Even if the seed is truly random, rand would give you crappy distribution and thus it has no legitimate uses other than making a game where it's easy to cheat.

5. None of the rand() alternatives is in the C standard, and most are highly platform-specific. C++, on the other hand, had very good standard random generators for the last 8 years or so, which is very nice for the language that didn't even have standard strings for 15 years of its existence.

DonHopkins · on Feb 12, 2015

"(not to be confused with std::mem_fn() – not the dropped ‘u’)" -- note the dropped 'e'.