PHP: Random\Randomizer::getFloat

Description

public Random\Randomizer::getFloat(float $min, float $max, Random\IntervalBoundary $boundary = Random\IntervalBoundary::ClosedOpen): float

Returns a uniformly selected, equidistributed float from a requested interval.

Due to the limited precision, not all real numbers can be exactly represented as a floating point number. If a number cannot be represented exactly, it is rounded to the nearest exactly representable value. Furthermore, floats are not equally dense across the whole number line. Because floats use a binary exponent, the distance between two neighboring floats doubles at each power of two. In other words: There are the same number of representable floats between 1.0 and 2.0 as there are between 2.0 and 4.0, 4.0 and 8.0, 8.0 and 16.0, and so on.
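The doubling of the distance between neighboring floats can be observed directly. The following is a minimal sketch, assuming a 64-bit PHP build; nextFloatUp() is a hypothetical helper written for this illustration and is not part of PHP. It only handles positive, finite inputs.

```php
<?php
// Hypothetical helper (not part of PHP): returns the next representable
// float above a positive, finite $x. Assumes a 64-bit PHP build, because
// the 'q' pack format requires 64-bit integer support.
function nextFloatUp(float $x): float
{
    // Reinterpret the IEEE 754 double as a 64-bit integer, add one,
    // and convert the bit pattern back into a float.
    $bits = unpack('q', pack('d', $x))[1];
    return unpack('d', pack('q', $bits + 1))[1];
}

$gaps = [];
foreach ([1.0, 2.0, 4.0, 8.0] as $x) {
    // Distance between $x and its upper neighbor.
    $gap = nextFloatUp($x) - $x;
    $gaps[] = $gap;
    printf("gap after %.1f: %.3e\n", $x, $gap);
}
```

Each printed gap should be twice the previous one, and the gap after 1.0 should equal PHP_FLOAT_EPSILON.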

Randomly sampling an arbitrary number within the requested interval, for example by dividing two integers, might result in a biased distribution for this reason. The necessary rounding will cause some floats to be returned more often than others, especially around powers of two when the density of floats changes.

Random\Randomizer::getFloat() implements an algorithm that will return a uniformly selected float from the largest possible set of exactly representable and equidistributed floats within the requested interval. The distance between the selectable floats (“step size”) matches the distance between the floats with the lowest density, i.e. the distance between floats at the interval boundary with the larger absolute value. This means that not all representable floats within a given interval may be returned if the interval crosses one or more powers of two. Stepping will start from the interval boundary with the larger absolute value to ensure the steps align with the exactly representable floats.

Closed interval boundaries will always be included in the set of selectable floats. Thus, if the size of the interval is not an exact multiple of the step size and the boundary with the smaller absolute value is a closed boundary, the distance between that boundary and its nearest selectable float will be smaller than the step size.

Caution

Post-processing the returned floats is likely to break the uniform equidistribution, because the intermediate floats within a mathematical operation are implicitly rounded. The requested interval should match the desired interval as closely as possible, and rounding should only be performed as an explicit operation right before displaying the selected number to a user.
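As an illustration of this advice, the following sketch requests the desired interval directly instead of scaling a float from a unit interval, and rounds only when formatting the output. It assumes PHP 8.3 or later; the percentage interval and the fixed Mt19937 seed are arbitrary choices made for this example.

```php
<?php
// Assumes PHP 8.3 or later. The fixed seed is an arbitrary choice that
// only serves to make repeated runs of this example reproducible.
$randomizer = new \Random\Randomizer(new \Random\Engine\Mt19937(1234));

// Request the desired interval directly, rather than multiplying a
// float from the interval [0, 1) by 100 afterwards.
$percentage = $randomizer->getFloat(0.0, 100.0, \Random\IntervalBoundary::ClosedOpen);

// Round explicitly, and only while formatting the value for display.
printf("%.2f %%\n", $percentage);
```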

Explanation of the Algorithm Using Example Values

To give an example of how the algorithm works, consider a floating point representation that uses a 3-bit mantissa. (In reality PHP’s floats use a 52-bit mantissa and can represent 2⁵² different values between consecutive powers of two.) The 3-bit representation is capable of representing 8 different floating point values between consecutive powers of two. This means that between 1.0 and 2.0 all steps of size 0.125 are exactly representable and between 2.0 and 4.0 all steps of size 0.25 are exactly representable. This means that

1.0
1.125
1.25
1.375
1.5
1.625
1.75
1.875
2.0
2.25
2.5
2.75
3.0
3.25
3.5
3.75
4.0

are the exactly representable floats between 1.0 and 4.0.

Now consider that $randomizer->getFloat(1.625, 2.5, IntervalBoundary::ClosedOpen) is called, i.e. a random float starting at 1.625 until, but not including, 2.5 is requested. The algorithm first determines the step size at the boundary with the larger absolute value (2.5). The step size at that boundary is 0.25.

Note that the size of the requested interval is 0.875, which is not an exact multiple of 0.25. If the algorithm started stepping at the lower bound 1.625, it would encounter 2.125, which is not exactly representable and would be implicitly rounded. Thus the algorithm starts stepping at the upper boundary 2.5. The selectable values are:

2.25
2.0
1.75
1.625

2.5 is not included, because the upper boundary of the requested interval is an open boundary. 1.625 is included, even though its distance to the nearest value 1.75 is 0.125, which is smaller than the previously determined step size of 0.25. The reason for that is that the requested interval is closed at the lower boundary (1.625) and closed boundaries are always included.

Finally the algorithm uniformly selects one of the four selectable values at random and returns it.
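The enumeration of selectable values in this example can be sketched as follows. This is a simplified illustration of the stepping described above, not the actual implementation: toySelectableValues() is a hypothetical helper, the step size is passed in by hand, and the sketch assumes a positive interval in which max is the boundary with the larger absolute value.

```php
<?php
// Hypothetical helper for the toy example only. It assumes that
// 0 < $min < $max, so that $max is the boundary with the larger
// absolute value, and that $step is the step size at $max.
function toySelectableValues(
    float $min,
    float $max,
    float $step,
    bool $minClosed,
    bool $maxClosed
): array {
    $values = [];

    // A closed upper boundary is itself selectable.
    if ($maxClosed) {
        $values[] = $max;
    }

    // Step downwards from the upper boundary while staying above $min.
    for ($k = 1; $max - $k * $step > $min; $k++) {
        $values[] = $max - $k * $step;
    }

    // A closed lower boundary is always included, even if its distance
    // to the nearest selectable value is smaller than the step size.
    if ($minClosed) {
        $values[] = $min;
    }

    return $values;
}

// The ClosedOpen example from the text: 1.625 is included, 2.5 is not.
$selectable = toySelectableValues(1.625, 2.5, 0.25, true, false);
echo implode(', ', $selectable), "\n";
```

All values involved are exactly representable as PHP floats, so the comparisons in the loop are not affected by rounding.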

Why Dividing Two Integers Does Not Work

In the previous example, there are eight representable floating point numbers within each sub-interval delimited by consecutive powers of two. To give an example why dividing two integers would not work well to generate a random float, consider that there are 16 equidistributed floating point numbers in the right-open interval from 0.0 until, but not including, 1.0. Half of them are the eight exactly representable values between 0.5 and 1.0, the other half are the values between 0.0 and 0.5 at the same step size of 0.0625. These can easily be generated by dividing a random integer between 0 and 15 by 16 to obtain one of:

0.0
0.0625
0.125
0.1875
0.25
0.3125
0.375
0.4375
0.5
0.5625
0.625
0.6875
0.75
0.8125
0.875
0.9375

This random float could be scaled to the right-open interval from 1.625 until, but not including, 2.5 by multiplying it by the size of the interval (0.875) and adding the minimum 1.625. This so-called affine transformation would result in the values:

1.625 rounded to 1.625
1.679 rounded to 1.625
1.734 rounded to 1.75
1.789 rounded to 1.75
1.843 rounded to 1.875
1.898 rounded to 1.875
1.953 rounded to 2.0
2.007 rounded to 2.0
2.062 rounded to 2.0
2.117 rounded to 2.0
2.171 rounded to 2.25
2.226 rounded to 2.25
2.281 rounded to 2.25
2.335 rounded to 2.25
2.390 rounded to 2.5
2.445 rounded to 2.5

Note how the upper boundary of 2.5 would be returned, despite being an open boundary and thus being excluded. Also note how 2.0 and 2.25 are twice as likely to be returned compared to the other values.
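The bias can be reproduced with a short tally. The sketch below emulates the 3-bit toy format on top of PHP’s regular floats; roundToToyFloat() is a hypothetical helper that is only valid for values between 1.0 and 4.0, and it relies on none of the sixteen inputs being an exact tie between two toy floats.

```php
<?php
// Hypothetical helper: rounds $x to the nearest float of the 3-bit toy
// format. Only valid for 1.0 <= $x <= 4.0. Ties are not handled
// specially, because none occur for the inputs used below.
function roundToToyFloat(float $x): float
{
    $step = $x < 2.0 ? 0.125 : 0.25;
    return round($x / $step) * $step;
}

$min = 1.625;
$max = 2.5;

$counts = [];
for ($i = 0; $i < 16; $i++) {
    // One of the 16 equidistributed values in [0, 1).
    $unit = $i / 16;

    // Affine transformation to the requested interval.
    $scaled = $unit * ($max - $min) + $min;

    // String keys are used, because float array keys would be truncated.
    $key = sprintf('%.3f', roundToToyFloat($scaled));
    $counts[$key] = ($counts[$key] ?? 0) + 1;
}

foreach ($counts as $value => $count) {
    printf("%s: %d\n", $value, $count);
}
```

The tally should show 2.000 and 2.250 four times each and every other value twice, including the excluded upper boundary 2.500.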

Parameters

min: The lower bound of the interval.
max: The upper bound of the interval.
boundary: Specifies whether the interval boundaries are possible return values.

Return Values

A uniformly selected, equidistributed float from the interval specified by min, max, and boundary. Whether min and max are possible return values depends on the value of boundary.

Errors/Exceptions

If the value of min is not finite (is_finite()), a ValueError will be thrown.
If the value of max is not finite (is_finite()), a ValueError will be thrown.
If the requested interval does not contain any values, a ValueError will be thrown.
Any Throwables thrown by the Random\Engine::generate() method of the underlying Random\Randomizer::$engine.
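The argument checks listed above can be exercised as in the following sketch, which assumes PHP 8.3 or later. The three argument pairs are arbitrary examples of a non-finite min, a non-finite max, and an interval that does not contain any values because max is smaller than min.

```php
<?php
// Assumes PHP 8.3 or later.
$randomizer = new \Random\Randomizer();

// Arbitrary example inputs: non-finite min, non-finite max, and an
// interval without any values (max is smaller than min).
$invalidIntervals = [
    [INF, 1.0],
    [0.0, NAN],
    [2.0, 1.0],
];

$messages = [];
foreach ($invalidIntervals as [$min, $max]) {
    try {
        $randomizer->getFloat($min, $max);
    } catch (\ValueError $e) {
        // Collect the message instead of letting the error propagate.
        $messages[] = $e->getMessage();
    }
}

foreach ($messages as $message) {
    echo $message, "\n";
}
```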

Examples

Example #1 Random\Randomizer::getFloat() example

<?php
$randomizer = new \Random\Randomizer();

// Note that the latitude granularity is double the
// longitude’s granularity.
//
// For the latitude the value may be both -90 and 90.
// For the longitude the value may be 180, but not -180, because
// -180 and 180 refer to the same longitude.
printf(
    "Lat: %+.6f Lng: %+.6f",
    $randomizer->getFloat(-90, 90, \Random\IntervalBoundary::ClosedClosed),
    $randomizer->getFloat(-180, 180, \Random\IntervalBoundary::OpenClosed),
);
?>

The above example will output something similar to:

Lat: +69.244304 Lng: -53.548951

Notes

Note:
This method implements the γ-section algorithm as published in Drawing Random Floating-Point Numbers from an Interval. Frédéric Goualard, ACM Trans. Model. Comput. Simul., 32:3, 2022 to obtain the desired behavioral properties.
