## Just how random are random numbers in MATLAB?

### Stephen (view profile)

Asked on 21 March 2019

### John D'Errico (view profile)

Edited on 22 March 2019

Hello,
I am working on a project that requires generating random whole numbers between 1 and 20,000, repeated 10 times. Every single time so far, a number of 19,950 or above has appeared within 10 generations, and sometimes in far fewer.
When I generate a set of random integers using randi(), is MATLAB producing an approximately even spread across the range of values? Or is it more random than that?
I have found the pages on the nature of randi etc., but to be honest they are a bit beyond me. If someone could explain in plain English, it would be greatly appreciated!

#### 1 Comment

21 March 2019
I just generated 10 random numbers with randi and none of them exceeded 15886. Out of 1000 generated, I got 2 above 19,950, which is around the expected number.

R2018b

## 2 Answers

### Guillaume (view profile)

21 March 2019

"Every single time so far there has been a number 19,950+ within 10 generations or sometimes far less."
You will have to explain what that means (what is a generation?) or show the code you're using, because you certainly shouldn't have a 1 in 10 chance of getting a number between 19950 and 20000 using randi. If you do get that, you've hit a major bug, so we'd want the recipe.
The distribution is supposed to be uniform, and if you're doing things like Monte Carlo simulations, the randomness is plenty good enough. If you are doing cryptography, the randomness probably does not have enough entropy to qualify the RNG as cryptographically secure. (On the other hand, if you're doing cryptography you should already know a lot about RNGs.)

#### 4 Comments

21 March 2019
Ah, well, if you are generating 100 random numbers in each of your 10 'generations', that is a different matter (you left out that key detail in your question when you just said 10!), although you still should not be getting one > 19,950 every time you run it, as the chance is still not that high.
My quick maths would suggest you have around a 22% chance of generating at least one number between 19950 and 20000 from 100 draws, but if you run that 10 times then the chance of getting at least one number in that range is more like 92% (though my maths may be wrong there; I always hated statistics and probabilities!)
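
The quick maths above can be checked in a couple of lines. This is only a sketch: it assumes the range 19950 to 20000 includes both endpoints (51 integers) and that the draws are independent; the variable names are made up for illustration.

```matlab
% Probability that a single randi(20000) draw lands in 19950..20000
% (51 integers, assuming both endpoints are included).
p = 51/20000;

% Chance of at least one such value among n independent draws: 1 - (1-p)^n.
pAtLeastOne100  = 1 - (1 - p)^100;    % one block of 100 draws
pAtLeastOne1000 = 1 - (1 - p)^1000;   % ten blocks of 100 draws

fprintf('100 draws:  %.3f\n', pAtLeastOne100);
fprintf('1000 draws: %.3f\n', pAtLeastOne1000);
```

Both figures come out close to the 22% and 92% estimated above.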
### Stephen (view profile)

21 March 2019
Yeah, my bad, haha. I came to the same conclusions after I posted. Still eager to know the mechanisms, however.

### Guillaume (view profile)

22 March 2019
"You mentioned the distribution is supposed to be uniform, how is that enforced?"
By very careful design of the RNG. If you want more details than that, you're going to have to dig deep into the maths of RNGs.
You can easily check that the distribution is uniform:
histogram(randi(20000, 1, 1e7))
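
For a numeric version of the same check, one possible sketch is to bin the draws and compare each bin count with what a uniform distribution predicts. The seed, sample size, and bin width below are arbitrary choices for illustration, not anything prescribed in this thread.

```matlab
rng(1);                          % arbitrary seed so the run is repeatable
nSamples = 1e6;
x = randi(20000, 1, nSamples);

% Split 1..20000 into 20 equal bins of 1000 integers each.
edges  = 0.5:1000:20000.5;
counts = histcounts(x, edges);

expected = nSamples / numel(counts);        % 50000 per bin if uniform
relDev   = (counts - expected) / expected;  % relative deviation per bin

fprintf('Largest relative deviation: %.4f\n', max(abs(relDev)));
```

With a uniform generator the deviations should be small and should shrink as nSamples grows.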

### John D'Errico (view profile)

22 March 2019

The random number generation scheme in MATLAB is fairly sophisticated, having evolved greatly over the years. It currently uses the Mersenne Twister generator by default, which is pretty good as I understand things. Older versions were somewhat more flawed, although uniformity over such a small span would not have been an issue even with the older methods that were in MATLAB when we started out.
You might enjoy reading some of the hits based on a Google search for the problems with old random number generators, say this one:
Can you have any pseudo-random number generator that is truly random? Well, not so easily. The best is arguably using things like random decays of particles.
Your problem is that you are choosing far too small a sample. 10 iterations of 100 numbers is not at all large in this context. Over that span, you will see only 1000 numbers in total.
What is the average number of samples at 19950 or above in that span? Here is where you have a serious problem with your methodology. There are 51 integers that are at 19950 or above. (I'm going to assume that 19950+ includes 19950. But the difference is hardly relevant to what I've said below.)
51/20000
ans =
0.00255
51/20000*1000
ans =
2.55
So on average, I would expect to see 2.55 such events in that span of samples. But that is a relatively small number. It could easily be as small as 0. Or it could just as easily be 5. For example, what is the probability that you would get 0 such events in a span of 1000 uniformly random samples?
(1 - 51/20000)^1000
ans =
0.077828
So even from a TRULY, PURELY random stream, almost 8% of the time, I would expect to see no such event happen at all.
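
That figure can also be checked empirically. The following is a rough Monte Carlo sketch, not a rigorous test; the seed and the number of simulated experiments are arbitrary assumptions.

```matlab
rng(2);                                  % arbitrary seed for repeatability
nTrials  = 5000;                         % number of simulated experiments (illustrative)
x        = randi(20000, 1000, nTrials);  % each column is one experiment of 1000 draws
hits     = sum(x >= 19950, 1);           % values in 19950..20000 per experiment
fracZero = mean(hits == 0);              % fraction of experiments with no hit at all

fprintf('Simulated: %.4f, theory: %.4f\n', fracZero, (1 - 51/20000)^1000);
```

The simulated fraction should land near the theoretical value, with some scatter because nTrials is finite.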
With a little thought, you should see that in fact this is just a binomial distribution. If you have the stats toolbox, we can compute that probability very easily using binocdf or binopdf.
binocdf(0,1000,51/20000)
ans =
0.077828
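
Without the stats toolbox, the same binomial probabilities can be sketched in base MATLAB. This version uses gammaln for the log of the binomial coefficient, which is one way to avoid overflow for large n; the variable names are illustrative.

```matlab
n = 1000;          % total number of samples
p = 51/20000;      % probability that one sample is 19950 or above
k = 0:5;           % numbers of hits to evaluate

% log of nchoosek(n,k), computed with gammaln to avoid overflow
logC = gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1);

pmf = exp(logC + k*log(p) + (n - k)*log(1 - p));  % P(exactly k hits)
cdf = cumsum(pmf);                                % P(at most k hits), valid because k starts at 0

disp([k; pmf; cdf]);
```

The first entry of pmf is the probability of zero hits, matching the binocdf(0,1000,51/20000) result above.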
Over 1000 total samples, the number of hits in that interval (first row), the probability of seeing exactly that many (second row), and the cumulative probability (third row) look like this:
[0:5;binopdf(0:5,1000,51/20000);binocdf(0:5,1000,51/20000)]
ans =
0 1 2 3 4 5
0.077828 0.19897 0.25408 0.21609 0.13769 0.070121
0.077828 0.2768 0.53087 0.74696 0.88465 0.95477
But as you see, it is not at all unlikely that you will see no events at all, and as many as 5 or even 6 is also entirely possible. Again, you are choosing far too small a sample size to have any real statistical significance. And blocks of 100 such samples? PPPPPPLEASE. Get serious. In any given stream of 100 random samples...
[0:3;binopdf(0:3,100,51/20000)]
ans =
0 1 2 3
0.77466 0.19804 0.025062 0.002093
So in any such small sample, most of the time you will see 0 such events, and 1 event around 20% of the time.
In order to ask questions about any decent random number generator, you want to look at much larger samples than this. It is far too easy to delude yourself by drawing conclusions from small sample sizes. Carried to an extreme, suppose you looked at a random coin flip.
Flip a truly random coin twice, and fully 50% of the time you will get the same result both times. How can that be random?
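
The coin example is easy to simulate. A minimal sketch, assuming a fair coin modelled with randi([0 1]) and an arbitrary seed:

```matlab
rng(3);                             % arbitrary seed for repeatability
nPairs = 1e5;                       % number of two-flip experiments (illustrative)
flips  = randi([0 1], 2, nPairs);   % each column: two flips, 0 = tails, 1 = heads
same   = flips(1, :) == flips(2, :);
fracSame = mean(same);              % fraction of pairs where both flips agree

fprintf('Fraction of matching pairs: %.3f\n', fracSame);
```

The fraction should sit close to 0.5, even though any single pair of matching flips might look suspicious.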

#### 0 Comments
