
Given a function rand7() that returns a random integer from 1 to 7, use rand7() to construct rand10(), which returns a random integer from 1 to 10.

2011-09-26 15:50

http://www.ihas1337code.com/2010/11/rejection-sampling.html

This question has been asked in Google and Amazon interviews, and it appears quite often as one of the few probabilistic analysis questions. You should be familiar with the concept of expected value, as it can be extremely helpful in probabilistic analysis.

Hint:

Assume you could generate a random integer in the range 1 to 49.
How would you generate a random integer in the range of 1 to 10?
What would you do if the generated number is in the desired range? What if it’s not?

Solution:

This solution is based upon rejection sampling. The main idea is that when you generate a number in the desired range, you output that number immediately. If the number is outside the desired range, you reject it and sample again. As each number in the desired range has the same probability of being chosen, a uniform distribution is produced.

Obviously, we have to call rand7() at least twice, since a single call yields only 7 distinct values, fewer than the 10 we need. By calling rand7() twice, we can get integers from 1 to 49 uniformly. Why?

    |  1  2  3  4  5  6  7
  --+---------------------
  1 |  1  2  3  4  5  6  7
  2 |  8  9 10  1  2  3  4
  3 |  5  6  7  8  9 10  1
  4 |  2  3  4  5  6  7  8
  5 |  9 10  1  2  3  4  5
  6 |  6  7  8  9 10  *  *
  7 |  *  *  *  *  *  *  *


A table is used to illustrate the concept of rejection sampling. Calling rand7() twice gives us a row index and a column index that together correspond to a unique position in the table above. Imagine that you are choosing a position at random from the table. If you hit a number, you return that number immediately. If you hit a *, you repeat the process until you hit a number.
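
To see why the table works, it can be rebuilt from the two indices. The sketch below assumes the cell index is computed as (row - 1) * 7 + col and that indices 1 to 40 are mapped to answers with 1 + (idx - 1) % 10; the helper name cell is made up for illustration.

```python
from collections import Counter


def cell(row: int, col: int):
    """Table entry for a (row, col) pair, or None for a rejected '*' cell."""
    idx = (row - 1) * 7 + col              # unique index in 1..49
    return 1 + (idx - 1) % 10 if idx <= 40 else None


pairs = [(r, c) for r in range(1, 8) for c in range(1, 8)]

# Each of the 49 (row, col) pairs maps to a different index in 1..49.
indices = sorted((r - 1) * 7 + c for r, c in pairs)
print(indices == list(range(1, 50)))       # True

# Every answer 1..10 fills exactly 4 cells; the other 9 cells are rejected.
counts = Counter(cell(r, c) for r, c in pairs)
print(all(counts[v] == 4 for v in range(1, 11)), counts[None])   # True 9
```

Because the 49 pairs are equally likely and every answer owns the same number of cells, each answer is returned with the same probability.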

Since 49 is not a multiple of 10, we have to use rejection sampling. Our desired range is the integers from 1 to 40, for which we can return the answer immediately. Otherwise (the integer falls between 41 and 49), we reject it and repeat the whole process.
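
A minimal Python sketch of this basic method is shown here. It assumes rand7() is supplied by the problem; the stub based on random.randint only stands in for it so that the example runs.

```python
import random


def rand7() -> int:
    """Stand-in for the given rand7(): uniform integer in 1..7 (assumption)."""
    return random.randint(1, 7)


def rand10() -> int:
    """Uniform integer in 1..10, built from rand7() by rejection sampling."""
    while True:
        row = rand7()
        col = rand7()
        idx = (row - 1) * 7 + col          # uniform in 1..49
        if idx <= 40:                      # accept 1..40, reject 41..49
            return 1 + (idx - 1) % 10


print([rand10() for _ in range(10)])
```

The loop has no fixed bound on its running time, but each pass accepts with probability 40/49, so long runs of rejections are very unlikely.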

Now let's get our hands dirty and calculate the expected value of the number of calls to rand7().
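$$
\begin{aligned}
E(\text{\# calls to rand7})
&= 2 \cdot \frac{40}{49} + 4 \cdot \frac{9}{49} \cdot \frac{40}{49} + 6 \cdot \left(\frac{9}{49}\right)^{2} \cdot \frac{40}{49} + \cdots \\
&= \sum_{k=1}^{\infty} 2k \left(\frac{9}{49}\right)^{k-1} \frac{40}{49} \\
&= \frac{80/49}{\left(1 - 9/49\right)^{2}} \\
&= 2.45
\end{aligned}
$$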
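
The closed form relies on the identity sum over k >= 1 of k * x^(k-1) = 1 / (1 - x)^2 with x = 9/49. As a quick check, the following sketch evaluates both the closed form and a truncated version of the series with exact rationals; the variable names are illustrative only.

```python
from fractions import Fraction

p_accept = Fraction(40, 49)    # a pass ends with an answer
p_reject = Fraction(9, 49)     # a pass is rejected and repeated

# Closed form: (80/49) / (1 - 9/49)^2
closed_form = (2 * p_accept) / (1 - p_reject) ** 2
print(closed_form)             # 49/20, i.e. 2.45

# Truncated series: the k-th term costs 2k calls after k-1 rejections.
partial_sum = sum(2 * k * p_reject ** (k - 1) * p_accept for k in range(1, 60))
print(abs(float(partial_sum) - float(closed_form)) < 1e-9)   # True
```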


Optimization:

The method above makes 2.45 calls to rand7() on average. Can we do better? Glad that you asked. In fact, we can improve the method so that it needs about 10% fewer calls.

It seems wasteful to throw away the integers in the range 41 to 49. In fact, we can reuse them in the hope of minimizing the number of calls to rand7(). In the event that we do not generate a number in the desired range (1 to 40), each number from 41 to 49 is equally likely to have been chosen. In other words, by subtracting 40 we obtain an integer in the range 1 to 9 uniformly. Now call rand7() again; combining the two values gives an integer in the range 1 to 63 uniformly. Apply rejection sampling where the desired range is 1 to 60. If the generated number is in the desired range (1 to 60), we return the corresponding answer. If it is not (61 to 63), we at least obtain an integer from 1 to 3 uniformly. Call rand7() again and we obtain an integer in the range 1 to 21 uniformly. The desired range is 1 to 20, and in the unlikely event that we get 21, we reject it and repeat the entire process.

Below is the code for this optimized method. Note that there are code sections that are repeated, but I leave it as it is for code clarity. (Take it as a challenge to refactor the code below!)
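
A Python sketch of the optimized method, written from the steps described above. It assumes rand7() is supplied by the problem (stubbed here with random.randint), and the name rand10_optimized is made up for illustration. The three stages are spelled out one after another rather than folded into a loop.

```python
import random


def rand7() -> int:
    """Stand-in for the given rand7(): uniform integer in 1..7 (assumption)."""
    return random.randint(1, 7)


def rand10_optimized() -> int:
    """Uniform integer in 1..10 that reuses rejected values between stages."""
    while True:
        a = rand7()
        b = rand7()
        idx = (a - 1) * 7 + b              # uniform in 1..49
        if idx <= 40:
            return 1 + (idx - 1) % 10

        a = idx - 40                       # leftover, uniform in 1..9
        b = rand7()
        idx = (a - 1) * 7 + b              # uniform in 1..63
        if idx <= 60:
            return 1 + (idx - 1) % 10

        a = idx - 60                       # leftover, uniform in 1..3
        b = rand7()
        idx = (a - 1) * 7 + b              # uniform in 1..21
        if idx <= 20:
            return 1 + (idx - 1) % 10
        # idx == 21: nothing left to reuse, so start over


print([rand10_optimized() for _ in range(10)])
```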

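The expected value of the number of calls to rand7() using this optimization is:

$$
\begin{aligned}
E(\text{\# calls to rand7})
&= 2 \cdot \frac{40}{49} + 3 \cdot \frac{9}{49} \cdot \frac{60}{63} + 4 \cdot \frac{9}{49} \cdot \frac{3}{63} \cdot \frac{20}{21} \\
&\quad + \left(\frac{9}{49} \cdot \frac{3}{63} \cdot \frac{1}{21}\right) \left[ 6 \cdot \frac{40}{49} + 7 \cdot \frac{9}{49} \cdot \frac{60}{63} + 8 \cdot \frac{9}{49} \cdot \frac{3}{63} \cdot \frac{20}{21} \right] \\
&\quad + \left(\frac{9}{49} \cdot \frac{3}{63} \cdot \frac{1}{21}\right)^{2} \left[ 10 \cdot \frac{40}{49} + 11 \cdot \frac{9}{49} \cdot \frac{60}{63} + 12 \cdot \frac{9}{49} \cdot \frac{3}{63} \cdot \frac{20}{21} \right] \\
&\quad + \cdots \\
&\approx 2.1933
\end{aligned}
$$

Each full pass makes 2 + 9/49 + (9/49)(3/63) = 752/343 calls on average and restarts with probability (9/49)(3/63)(1/21) = 1/2401, so the sum equals (752/343) / (2400/2401) = 329/150.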
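
The series can also be evaluated exactly. The sketch below uses the recurrence E = calls_per_pass + p_restart * E (a pass either returns an answer or restarts from scratch) and compares the result with a truncated version of the series as written; the variable names are illustrative only.

```python
from fractions import Fraction

p1 = Fraction(9, 49)     # stage 1 rejects (idx in 41..49)
p2 = Fraction(3, 63)     # stage 2 rejects (idx in 61..63)
p3 = Fraction(1, 21)     # stage 3 rejects (idx == 21)

# Always 2 calls, a 3rd call with probability p1, a 4th with probability p1*p2.
calls_per_pass = 2 + p1 + p1 * p2
p_restart = p1 * p2 * p3
expected = calls_per_pass / (1 - p_restart)
print(expected)                      # 329/150
print(round(float(expected), 4))     # 2.1933

# The same value from the series: the k-th restart adds 4k calls to each term.
a1 = Fraction(40, 49)
a2 = p1 * Fraction(60, 63)
a3 = p1 * p2 * Fraction(20, 21)
series = sum(
    p_restart ** k * ((4 * k + 2) * a1 + (4 * k + 3) * a2 + (4 * k + 4) * a3)
    for k in range(20)
)
print(abs(float(series) - float(expected)) < 1e-12)   # True
```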