Why Is Randomness So Hard in JavaScript?
So it turns out the random number generator long used by developers working with Google's V8 JavaScript engine doesn't really generate random numbers at all.
That's being fixed in the latest release of Google's Chrome browser (and the latest Firefox and Safari releases), but the whole issue begs the question: Why is generating a random number so hard?
Beats me. My brain starts to hurt the very second I begin to even think about understanding any math concept. And, I've discovered, a seemingly simple concept such as random number can be incredibly complicated. Perhaps nothing speaks to the mind-numbing complexity of computer programming like the vast amount of verbiage and math dedicated to making sure a blackjack player can't predict the next playing card to be flipped up.
Even talking about the problem is laden with complexity. People in this world use phrases like "external electromagnetic and quantum phenomena" and "natural entropy." And they don't say the random number generator function in JavaScript -- Math.random() -- is broken. They say it "offers sub-par quality."
Specifically, V8 used a pseudorandom number generator (PRNG) algorithm called MWC1616. The "psuedo" part comes from the fact that this doesn't even generate a true random number. Wikipedia says PRNGs produce "apparently random results" and the generated "seemingly random sequence" can be reproduced under certain circumstances. This is because the algorithm requires a seed value, or key, and if this is known, the seemingly random sequence can be reproduced.
The chances of that actually happening in the real world seem pretty remote to me, especially if the seed is the number of seconds elapsed since midnight on Jan. 1, 1980, or something. But some people care -- probably gambling sites and such where big money is at stake. For some cases, there are cryptographically secure algorithms available to generate true random numbers, but apparently at the expense of time or computing power. Everything is a tradeoff -- you can be wicked fast and less than perfect or you can be perfect and relatively slow, or prohibitively memory-expensive.
Anyway, Google software engineer Yang Guo last month pointed out that MWC1616 is flawed. Or rather: "MWC1616 uses little memory and is pretty fast to compute, but unfortunately offers sub-par quality."
Specifically, he said:
- The number of random values it can generate is limited to 232 as opposed to the 252 numbers between 0 and 1 that double precision floating point can represent.
- The more significant upper half of the result is almost entirely dependent on the value of state0. The period length would be at most 232, but instead of few large permutation cycles, there are many short ones.
- With a badly chosen initial state, the cycle length could be less than 40 million.
- It fails many statistical tests in the TestU01 suite.
According to its site, "TestU01 is a software library, implemented in the ANSI C language, and offering a collection of utilities for the empirical statistical testing of uniform random number generators." I have no idea what the rest of his bullet points mean.
Regardless, Guo pointed out that the problem is being fixed by using a new algorithm called xorshift128+, which "uses 128 bits of internal state, has a period length of 2128 - 1, and passes all tests from the TestU01 suite."
But, he warns, even that isn't cryptographically secure.
"For use cases such as hashing, signature generation, and encryption/decryption, ordinary PRNGs are unsuitable," the Google engineer said. "The Web Cryptography API introduces window.crypto.getRandomValues, a method that returns cryptographically secure random values, at a performance cost."
At any rate, the Math.random() function used V8 is being fixed in the Chrome 49 release, currently in Google's Dev Channel and scheduled for "stable" release during the week that ends March 8.
I still don't understand why true randomness is so hard to do without a big performance hit. It seems to me that literally every single thing that happens in nature (uninfluenced by humans) is random, so why should that be so hard to duplicate in computer programming?
I also don't understand, if the V8 problem has been known for years, why it hasn't been fixed before this.
I also don't understand if the problem even had any proven real-world significance. Did it really matter to anyone, even big Internet gambling sites where millions of dollars are at stake? Did anyone ever benefit by exploiting the flaw, or was pseudorandomness good enough, after all?
Then again, I've noticed I mysteriously continue to fail in picking winning lottery numbers. In fact, in the past few days, I've been unable to pick even one of the Powerball numbers on any of several $2 tickets I've bought in an attempt to win hundreds of millions (now nearing billions) of dollars. Hmm, I'd better check on what algorithm the "pick-six" lottery uses. Maybe it's time to use a manual card and start filling in those little ellipses with a pen. That should pass everything in the TestU01 suite.
So what's the deal with randomness? Does it really matter? Comment here or drop me a line.
Posted by David Ramel on January 13, 2016