Just how random are random numbers in MATLAB?

Stephen, 21 March 2019
Edited: John D'Errico, 22 March 2019
Hello,
I am working on a project that requires random whole number generation between 1:20,000, repeated 10 times. Every single time so far there has been a number 19,950+ within 10 generations or sometimes far less.
When I generate a set of random integers using randi(), is MATLAB producing an approximately even spread across the range of values? Or is it more random?
I have found the pages on the nature of randi etc. but to be honest it's a bit beyond me. If someone could explain in plain English it would be greatly appreciated!!
  1 Comment
Adam, 21 March 2019
I just generated 10 random numbers with randi and didn't get one > 15886 and I got 2 above 19,950 out of 1000 generated, which is around the expected number.


Answers (2)

Guillaume, 21 March 2019
"Every single time so far there has been a number 19,950+ within 10 generations or sometimes far less."
You will have to explain what it means (what is a generation) or show the code you're using because you certainly shouldn't get 1 in 10 chance of getting a number between 19950 and 20000 using randi. If you do get that, you've hit a major bug so we'd want the recipe.
The distribution is supposed to be uniform, and if you're doing things like Monte Carlo simulations, the randomness is plenty. If you are doing cryptography, the generator probably does not have enough entropy to qualify as cryptographically secure. (On the other hand, if you're doing cryptography you should already know a lot about RNGs.)
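Returning to the "within 10 generations" claim: assuming each draw is independent and uniform on 1 to 20000, and assuming 19950 itself counts as a hit (51 values in total), the chance of seeing at least one such value in 10 draws works out to roughly 2.5%. A minimal sketch of that arithmetic, with purely illustrative variable names:

```matlab
% Chance that at least one of nDraws uniform integers from 1..20000 is >= 19950.
% Assumption: 19950 counts as a hit, so there are 51 qualifying values.
p = 51/20000;                        % probability of a hit on a single draw
nDraws = 10;
pAtLeastOne = 1 - (1 - p)^nDraws;    % complement of "no hit in any draw"
fprintf('P(at least one hit in %d draws) = %.4f\n', nDraws, pAtLeastOne);
```

Seeing such a value in every single run of 10 would therefore be very surprising for a uniform generator.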
  4 Comments
Stephen, 21 March 2019
Yeah my bad haha I came to the same conclusions after I posted, still eager to know the mechanisms however
Guillaume, 22 March 2019
"You mentioned the distribution is supposed to be uniform, how is that enforced?"
By very careful design of the RNG. If you want more details than that you're going to have to dig deep into the maths of RNG.
You can easily check that the distribution is uniform:
histogram(randi(20000, 1, 1e7))
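If a plot is hard to judge by eye, the same check can be done numerically. The sketch below is only one possible approach: it groups the range into 20 equal bins of 1000 integers each (an arbitrary choice), counts the draws per bin with histcounts, and reports how far the fullest or emptiest bin strays from the expected count. The seed passed to rng is also arbitrary and is there only to make the run repeatable.

```matlab
rng(0);                                   % arbitrary fixed seed, for repeatability only
n = 1e7;                                  % same sample size as the histogram call above
x = randi(20000, 1, n);

nBins = 20;                               % arbitrary: 20 bins of 1000 integers each
edges = linspace(0.5, 20000.5, nBins + 1);
counts = histcounts(x, edges);            % number of draws falling in each bin

expected = n / nBins;                     % count per bin under a uniform distribution
relDev = (counts - expected) / expected;  % relative deviation of each bin
fprintf('Largest relative deviation: %.4f%%\n', 100 * max(abs(relDev)));
```

With this many samples the deviations should be a small fraction of a percent, and they shrink further as n grows.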



John D'Errico, 22 March 2019
Edited: John D'Errico, 22 March 2019
The random number generation scheme in MATLAB is fairly sophisticated, having evolved greatly over the years. It is currently using the Mersenne Twister generator by default, which seems to be pretty good as I understand things. Older versions were somewhat more flawed, although uniformity over such a small span would not have been an issue even with the older methods that were in MATLAB when we started out.
You might enjoy reading some of the hits based on a Google search for the problems with old random number generators, say this one:
Can you have any pseudo-random number generator that is truly random? Well, not so easily. The best source of true randomness is arguably a physical process, such as the random decay of particles.
Your problem is that you are choosing far too small a sample. 10 iterations of 100 numbers is not at all large in context. Over that span, you will see only 1000 numbers in total.
What is the average number of samples at 19950 or above in that span? Here is where you have a serious problem with your methodology. There are 51 integers that are at 19950 or above. (I'm going to assume that 19950+ includes 19950. But the difference is hardly relevant to what I've said below.)
51/20000
ans =
0.00255
51/20000*1000
ans =
2.55
So on average, I would expect to see 2.55 such events in that span of samples. But that is a relatively small number. It could easily be as small as 0. Or it could just as easily be 5. For example, what is the probability that you would get 0 such events in a span of 1000 uniformly random samples?
(1 - 51/20000)^1000
ans =
0.077828
So even from a TRULY, PURELY random stream, almost 8% of the time, I would expect to see no such event happen at all.
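The 8% figure can also be checked empirically. The following is a rough simulation sketch rather than a rigorous test: it draws many independent blocks of 1000 values, counts the hits (values of 19950 or above) in each block, and reports how often a block has none. The number of blocks and the seed are arbitrary choices.

```matlab
rng(1);                                  % arbitrary fixed seed, for repeatability only
blockSize = 1000;                        % samples per block, as in the discussion above
nBlocks = 10000;                         % arbitrary number of simulated blocks

x = randi(20000, blockSize, nBlocks);    % each column is one block of 1000 draws
hitsPerBlock = sum(x >= 19950, 1);       % hits in each block

fracZero = mean(hitsPerBlock == 0);      % share of blocks with no hit at all
meanHits = mean(hitsPerBlock);           % average hits per block
fprintf('Blocks with zero hits: %.3f, mean hits per block: %.3f\n', fracZero, meanHits);
```

The fraction of empty blocks should land near 0.078 and the mean near 2.55, with some scatter from run to run.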
With a little thought, you should see that in fact this is just a binomial distribution. If you have the stats toolbox, we can compute that probability very easily using binocdf or binopdf.
binocdf(0,1000,51/20000)
ans =
0.077828
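Over a total of 1000 samples, the probabilities look like this. The first row is the number of hits k, the second row is the probability of exactly k hits, and the third row is the probability of at most k hits: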
[0:5;binopdf(0:5,1000,51/20000);binocdf(0:5,1000,51/20000)]
ans =
         0         1         2         3         4         5
  0.077828   0.19897   0.25408   0.21609   0.13769  0.070121
  0.077828    0.2768   0.53087   0.74696   0.88465   0.95477
But as you see, it is not at all unlikely that you will see no events at all, and as many as 5 or even 6 is also entirely possible. Again, you are choosing far too small a sample size to have any real statistical significance. And blocks of 100 such samples? PPPPPPLEASE. Get serious. In any given stream of 100 random samples...
[0:3;binopdf(0:3,100,51/20000)]
ans =
         0         1         2         3
   0.77466   0.19804  0.025062  0.002093
So in any such small sample, most of the time you will see 0 such events, and 1 event around 20% of the time.
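For readers without the stats toolbox, the binopdf and binocdf values above can be reproduced from the binomial formula in base MATLAB. The helper name binoPmf below is made up for this sketch; it evaluates the formula through gammaln so the binomial coefficient does not overflow for n = 1000.

```matlab
% Binomial pmf via log-gamma, base MATLAB only (binoPmf is an illustrative name).
binoPmf = @(k, n, p) exp(gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1) ...
                         + k .* log(p) + (n - k) .* log(1 - p));

p = 51/20000;                        % chance that a single draw is >= 19950

pmf1000 = binoPmf(0:5, 1000, p);     % P(exactly k hits in 1000 draws), k = 0..5
cdf1000 = cumsum(pmf1000);           % P(at most k hits in 1000 draws)
pmf100  = binoPmf(0:3, 100, p);      % P(exactly k hits in 100 draws), k = 0..3

disp([0:5; pmf1000; cdf1000]);
disp([0:3; pmf100]);
```

Taking cumsum of the pmf matches binocdf here only because the list of k values starts at 0.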
In order to ask questions about any decent random number generator, you want to look at far larger samples than this. It is far too easy to delude yourself by drawing conclusions from small sample sizes. Carried to an extreme, suppose you looked at a random coin flip?
Flip a truly random coin twice, and fully 50% of the time, I will get the same result both times. How can that be random?
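The coin example is easy to try out. This sketch assumes a fair coin modelled as randi([0 1]), which is a stand-in for a real coin, and simulates a large, arbitrary number of two-flip experiments to see how often both flips agree.

```matlab
rng(2);                                         % arbitrary fixed seed, for repeatability only
nTrials = 1e6;                                  % arbitrary number of two-flip experiments
flips = randi([0 1], 2, nTrials);               % each column holds two fair coin flips
fracSame = mean(flips(1, :) == flips(2, :));    % share of trials where both flips match
fprintf('Both flips matched in %.1f%% of trials\n', 100 * fracSame);
```

The result should sit close to 50%, which is exactly what a random coin is expected to do.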

Release

R2018b
