Skip to content

uniform_real_distribution<double> produces identical values too often #61

@danieltuzes

Description

@danieltuzes

If I generate uniformly distributed numbers with boost::random::uniform_real_distribution<double> it produces identical values too often. For further information, please read the original issue on SO and an example code on wandbox. I post my problem here as somebody suggested that it could be a bug. I am still not convinced about that but I am also not sure if there is any bug in my code that produces this strange behavior.

I copy the question from SO here but follow the link for further updates and suggestions.

Problem description

Sometimes I get the same random number from a uniform distribution using a Mersenne Twister engine even I properly used the engine and iterated it. I know that the number of possible states of the engine is finite and number of possible generated values is also finite, but this is not the case now.

Using boost's implementation, 1e6 number of uniformly distributed random values are generated on the range [0; 1e7). That means that there way more possible values than required number of random values. However, I get quite often the same values, sometimes more than 100 times in this range. How is it possible?

Code

A simple code is provided to reproduce the situation. On both platforms I get the same problem:

  • MSVS 2019 with boost-random:x64-windows 1.71.0, and
  • g++ (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 with libboost-dev 1.58.0.1ubuntu1
#include <iostream>
#include <chrono>

#include <boost/random/mersenne_twister.hpp>          // random number generator
#include <boost/random/uniform_real_distribution.hpp> // uniform distribution generator
using namespace std;

int main()
{
    size_t seed = static_cast<int> (std::chrono::system_clock::now().time_since_epoch().count());
    cout << "seed = " << seed << endl;
    
    boost::random::mt19937 engine(seed);                         // the random number generator engine
    boost::random::uniform_real_distribution<double> u(0, 1e7);  // uniformly distributed double values on the range [0; 1e7)
    cout.precision(20);
    vector<double> history;                                      // stores the generated values for comparison
    for (size_t i = 0; i < 1e6; ++i)
    {
        history.push_back(u(engine));
        for (size_t j = 0; j < i; ++j)
            if (history[i] == history[j])
                cout << "Equal values ("<< history[i] <<") at ID = " << i << " and " << j << endl;
    }
}

Question

Is there a bug in the code that produces the same values? Or is it a bug in boost?

For my task it is important to generate numbers with uniform distribution. Finding identical values is one of the easiest tests but there are many more and I am sure I don't want to do quality analysis on a well-known library like Boost. I didn't want to use the standard library, because it is not guaranteed that two different compilers will give the same sequence for the same seed values, but it was a requirement for the task. What kind of a solution can you suggest?

Note

A strange behavior can be seen if one compares the generated values with the one std::random generates. Example for values from random::boost for seed 4561565448989 is

1755586.0406719148159
3354420.976247638464   <--
3630764.0071026980877
3488445.2889673411846  <--
7920481.4555123448372
8773544.1024415194988  <--

while standard library generates

3354420.9766563926823  <--
3488445.2898126943037  <--
8773544.1042856499553  <--
...

That is, every second generated value in the boost's sequence is very close to a corresponding value in the standard library's implementation. When two values in the boost-sequence are equal, the values in the standard-library-sequence are not equal, but close to each other. The similarity holds for MSVS and g++ compilers too, which have the right to have different implementation for Mersenne Twister and distributions.

It is the distribution generator

If the standard library's MT engine is used, but boost's distribution, the problem persists. But if the engine is the one from boost and the distribution is standard, the problem disappears. The problem is, as Peter pointed out, that the uniform distribution is platform depend for which I use boost.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions