feat!: high precision random number generator #2357

Merged
merged 55 commits into from
Feb 27, 2024
Changes from 50 commits
Commits
18e2522
fix: mersenne and number function precision loss
ST-DDT Aug 29, 2023
522c279
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Aug 30, 2023
ddf2927
test: add twister tests
ST-DDT Aug 30, 2023
7f4db97
test: move mersenne test to correct location
ST-DDT Aug 30, 2023
0ce9831
test: add tests for number.int and float
ST-DDT Aug 30, 2023
8fe4773
test: use some inline snapshots
ST-DDT Aug 30, 2023
e6ff244
test: select seed for max genrandInt32
ST-DDT Aug 30, 2023
b6554c6
refactor(mersenne): only increase precision
ST-DDT Aug 31, 2023
177d0dd
chore: use constant for magic numbers
ST-DDT Aug 31, 2023
c2d1393
chore: move test constants to separate file
ST-DDT Aug 31, 2023
4c51f70
test: check for correct bounds
ST-DDT Aug 31, 2023
a736534
test: normalize seed
ST-DDT Aug 31, 2023
8ead8ea
chore: only do the minimum
ST-DDT Sep 1, 2023
25fb0d9
Update test/modules/date.spec.ts
ST-DDT Sep 1, 2023
612ff4d
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Sep 7, 2023
08aec88
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Sep 11, 2023
156c15c
test: update snapshots
ST-DDT Sep 11, 2023
042609f
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Sep 11, 2023
b34be3d
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Sep 13, 2023
9435106
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Sep 15, 2023
fb2d80b
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Sep 17, 2023
b6fe199
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Sep 19, 2023
16310da
chore: update snapshots
ST-DDT Sep 19, 2023
fa39fc4
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Oct 5, 2023
fa24446
chore: adjust test names
ST-DDT Oct 5, 2023
73b3a08
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Oct 7, 2023
9c032e2
test: update snapshots
ST-DDT Oct 7, 2023
9fcb82a
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Oct 7, 2023
3ef3fbd
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Oct 8, 2023
50d4b70
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Oct 9, 2023
75b5dba
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Oct 11, 2023
a524ed1
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Oct 12, 2023
1c7ffc2
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Oct 15, 2023
5fcfc92
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Oct 26, 2023
cb9ced7
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Oct 29, 2023
6698d83
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Oct 30, 2023
6602be2
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Nov 7, 2023
5936b49
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Nov 14, 2023
49cbe86
Merge next
ST-DDT Feb 8, 2024
26bcd0c
docs: Write upgrading guide
ST-DDT Feb 8, 2024
501e453
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Feb 9, 2024
911248c
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Feb 19, 2024
a2c0773
chore: actually export the generateMersenneXRandomizer functions
ST-DDT Feb 19, 2024
6851a84
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Feb 21, 2024
4aa3740
docs: imporve upgrading guide
ST-DDT Feb 21, 2024
df889cc
chore: adjust highlight
ST-DDT Feb 21, 2024
41b00cb
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Feb 22, 2024
bd81d1e
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Feb 25, 2024
8c79f39
docs: extend documentation
ST-DDT Feb 25, 2024
4b76395
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Feb 26, 2024
3083e8e
Update docs/guide/randomizer.md
ST-DDT Feb 26, 2024
ebf5b2e
chore: apply suggestions
ST-DDT Feb 26, 2024
dbe8b68
docs: improve wording
ST-DDT Feb 26, 2024
a4b4203
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Feb 27, 2024
998611b
Merge branch 'next' into fix/mersenne-and-number-precision
ST-DDT Feb 27, 2024
17 changes: 17 additions & 0 deletions docs/guide/randomizer.md
Expand Up @@ -12,6 +12,23 @@ There are two connected use cases we have considered where this might be needed:
1. Re-Use of the same `Randomizer` within multiple `Faker` instances.
2. The use of a random number generator from a third party library.

## Built-In `Randomizer`s

Faker ships with two variations of the built-in `Randomizer`:

```ts
import {
  generateMersenne32Randomizer, // Default prior to v9
  generateMersenne53Randomizer, // Default since v9
} from '@faker-js/faker';

const randomizer = generateMersenne53Randomizer();
```

The 32-bit `Randomizer` is faster, but the 53-bit `Randomizer` generates better random values (with significantly fewer duplicates).

You can also provide your own by implementing the [`Randomizer` interface](/api/randomizer.html).
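As a rough illustration of a custom implementation, the sketch below builds a `Randomizer`-shaped object around a small linear congruential generator. The interface is declared locally, mirroring the `next()` and `seed()` methods visible in this PR's `src/internal/mersenne.ts`. The helper name `createLcgRandomizer`, the LCG itself, and the way array seeds are folded into a single number are all assumptions made for the example, not part of Faker. It also assumes `next()` is expected to return a float in `[0, 1)`.

```typescript
// Local copy of the interface shape used in this PR (`next()` and `seed()`).
// In a real project, import the `Randomizer` type from '@faker-js/faker' instead.
interface Randomizer {
  next(): number;
  seed(seed: number | number[]): void;
}

// Hypothetical helper: a tiny linear congruential generator (LCG).
// Chosen only for brevity; its statistical quality is poor.
function createLcgRandomizer(initialSeed = 1): Randomizer {
  let state = initialSeed >>> 0;

  return {
    next(): number {
      // Numerical Recipes LCG constants; Math.imul keeps the product in 32 bits.
      state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
      return state / 4294967296; // scale to [0, 1)
    },
    seed(seed: number | number[]): void {
      // Fold array seeds into one 32-bit number (an arbitrary choice for this sketch).
      const values = Array.isArray(seed) ? seed : [seed];
      state = values.reduce((acc, v) => (Math.imul(acc, 31) + v) >>> 0, 0);
    },
  };
}

const custom = createLcgRandomizer();
custom.seed(42);
const first = custom.next();
custom.seed(42);
const again = custom.next(); // same as `first`, because the seed was reset
```

Such an object could then be passed as the `randomizer` option when constructing a `Faker` or `SimpleFaker` instance.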

## Using `Randomizer`s

A `Randomizer` has to be set during construction of the instance:
70 changes: 70 additions & 0 deletions docs/guide/upgrading_v9/2357.md
@@ -0,0 +1,70 @@
# Use High Precision RNG by default

In v9 we switch from a 32-bit random value to a 53-bit random value.
We don't change the underlying algorithm much, but we now consume two seed values each step instead of one.
This affects generated values in three ways:

- In large lists or long numbers the values are spread more evenly.
  This also reduces the number of duplicates generated.
  For `faker.number.int()` this reduces the duplicates from `1 / 10_000` to less than `1 / 8_000_000`.
- Some seeded runs now return slightly different values.
  Because the last digits of the random numbers change, some methods generate slightly different results.
- Subsequent values change. Because the generator now consumes two values per step,
  the values that follow differ from before.

```ts
import {
  SimpleFaker,
  generateMersenne32Randomizer,
  generateMersenne53Randomizer,
} from '@faker-js/faker';

// < v9 default
const f32 = new SimpleFaker({ randomizer: generateMersenne32Randomizer() });
f32.seed(123);
const r32 = f32.helpers.multiple(() => f32.number.int(10), { count: 10 });
// >= v9 default
const f53 = new SimpleFaker({ randomizer: generateMersenne53Randomizer() });
f53.seed(123);
const r53 = f53.helpers.multiple(() => f53.number.int(10), { count: 5 });

// `diff` is illustrative only; it is not a Faker export
diff(r32, r53);
//[
//  7,
//  7, // [!code --]
//  3,
//  4, // [!code --]
//  2,
//  7, // [!code --]
//  6,
//  7, // [!code --]
//  7,
//  5, // [!code --]
//]
```
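To make the "two values per step" point more concrete, here is a minimal sketch of how two 32-bit integers can be combined into one 53-bit float. It follows the `genrand_res53` recipe from the reference MT19937 implementation. That Faker's `genrandRes53()` works exactly this way is an assumption, and the function names `toFloat32` and `toFloat53` as well as the raw input values are made up for the example.

```typescript
// Scale one 32-bit unsigned integer to [0, 1): 2^32 possible results.
function toFloat32(a: number): number {
  return (a >>> 0) / 4294967296; // 2^32
}

// Combine two 32-bit unsigned integers into one 53-bit float in [0, 1).
// Mirrors the reference MT19937 `genrand_res53` recipe (assumed, not confirmed,
// to match Faker's `genrandRes53()`).
function toFloat53(a: number, b: number): number {
  const high = a >>> 5; // top 27 bits of the first value
  const low = b >>> 6; // top 26 bits of the second value
  return (high * 67108864 + low) / 9007199254740992; // (high * 2^26 + low) / 2^53
}

// Hypothetical raw 32-bit outputs, standing in for the twister's stream.
const raw = [0x12345678, 0x9abcdef0, 0x0fedcba9, 0x87654321];

// 32-bit mode: one float per raw output.
const floats32 = raw.map(toFloat32);

// 53-bit mode: one float per pair of raw outputs, so half as many floats.
const floats53: number[] = [];
for (let i = 0; i + 1 < raw.length; i += 2) {
  floats53.push(toFloat53(raw[i], raw[i + 1]));
}
```

Under this recipe the first 53-bit float shares its leading digits with the first 32-bit float, because both take their high bits from the same raw value. Every later float is built from different raw values, which matches the pattern in the diff above.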

## Adoption

If you don't have any seeded tests and just want some random values, then you don't have to change anything.

If you have seeded tests, you have to update most test snapshots or similar comparisons to new values.

If you are using vitest, you can do that using `pnpm vitest run -u`.

## Keeping the old behavior

You can keep the old behavior by creating your own `Faker` instance
and passing it a `Randomizer` created by the `generateMersenne32Randomizer()` function.

```ts{8}
import {
  Faker,
  generateMersenne32Randomizer, // < v9 default
  generateMersenne53Randomizer, // >= v9 default
} from '@faker-js/faker';

const faker = new Faker({
  randomizer: generateMersenne32Randomizer(),
  ...
});
```
2 changes: 1 addition & 1 deletion src/faker.ts
Expand Up @@ -152,7 +152,7 @@ export class Faker extends SimpleFaker {
* Specify this only if you want to use it to achieve a specific goal,
* such as sharing the same random generator with other instances/tools.
*
- * @default generateMersenne32Randomizer()
+ * @default generateMersenne53Randomizer()
*/
randomizer?: Randomizer;
});
4 changes: 4 additions & 0 deletions src/index.ts
Expand Up @@ -75,6 +75,10 @@ export type {
export { FakerError } from './errors/faker-error';
export { Faker } from './faker';
export type { FakerOptions } from './faker';
export {
generateMersenne32Randomizer,
generateMersenne53Randomizer,
} from './internal/mersenne';
export * from './locale';
export { fakerEN as faker } from './locale';
export * from './locales';
27 changes: 24 additions & 3 deletions src/internal/mersenne.ts
Expand Up @@ -328,9 +328,7 @@ export class MersenneTwister19937 {

/**
* Generates a MersenneTwister19937 randomizer with 32 bits of precision.
- * This is the default randomizer used by Faker.
- *
- * @internal
+ * This is the default randomizer used by faker prior to v9.0.
*/
export function generateMersenne32Randomizer(): Randomizer {
const twister = new MersenneTwister19937();
Expand All @@ -350,3 +348,26 @@ export function generateMersenne32Randomizer(): Randomizer {
},
};
}

/**
* Generates a MersenneTwister19937 randomizer with 53 bits of precision.
* This is the default randomizer used by faker starting with v9.0.
*/
export function generateMersenne53Randomizer(): Randomizer {
const twister = new MersenneTwister19937();

twister.initGenrand(Math.ceil(Math.random() * Number.MAX_SAFE_INTEGER));

return {
next(): number {
return twister.genrandRes53();
},
seed(seed: number | number[]): void {
if (typeof seed === 'number') {
twister.initGenrand(seed);
} else if (Array.isArray(seed)) {
twister.initByArray(seed, seed.length);
}
},
};
}
6 changes: 3 additions & 3 deletions src/simple-faker.ts
@@ -1,4 +1,4 @@
- import { generateMersenne32Randomizer } from './internal/mersenne';
+ import { generateMersenne53Randomizer } from './internal/mersenne';
import { DatatypeModule } from './modules/datatype';
import { SimpleDateModule } from './modules/date';
import { SimpleHelpersModule } from './modules/helpers';
Expand Down Expand Up @@ -117,12 +117,12 @@ export class SimpleFaker {
* Specify this only if you want to use it to achieve a specific goal,
* such as sharing the same random generator with other instances/tools.
*
- * @default generateMersenne32Randomizer()
+ * @default generateMersenne53Randomizer()
*/
randomizer?: Randomizer;
} = {}
) {
- const { randomizer = generateMersenne32Randomizer() } = options;
+ const { randomizer = generateMersenne53Randomizer() } = options;

this._randomizer = randomizer;
}
12 changes: 12 additions & 0 deletions test/internal/__snapshots__/mersenne.spec.ts.snap
Expand Up @@ -11,3 +11,15 @@ exports[`generateMersenne32Randomizer() > seed: 42 > should return deterministic
exports[`generateMersenne32Randomizer() > seed: 1211 > should return deterministic value for next() 1`] = `0.9285201537422836`;

exports[`generateMersenne32Randomizer() > seed: 1337 > should return deterministic value for next() 1`] = `0.2620246761944145`;

exports[`generateMersenne53Randomizer() > seed: [42,1,2] > should return deterministic value for next() 1`] = `0.8562037477947296`;

exports[`generateMersenne53Randomizer() > seed: [1211,1,2] > should return deterministic value for next() 1`] = `0.8916433279801969`;

exports[`generateMersenne53Randomizer() > seed: [1337,1,2] > should return deterministic value for next() 1`] = `0.17990487224060836`;

exports[`generateMersenne53Randomizer() > seed: 42 > should return deterministic value for next() 1`] = `0.3745401188473625`;

exports[`generateMersenne53Randomizer() > seed: 1211 > should return deterministic value for next() 1`] = `0.9285201539025842`;

exports[`generateMersenne53Randomizer() > seed: 1337 > should return deterministic value for next() 1`] = `0.2620246750155817`;
8 changes: 6 additions & 2 deletions test/internal/mersenne.spec.ts
Expand Up @@ -2,6 +2,7 @@ import { beforeAll, beforeEach, describe, expect, it } from 'vitest';
import {
MersenneTwister19937,
generateMersenne32Randomizer,
generateMersenne53Randomizer,
} from '../../src/internal/mersenne';
import type { Randomizer } from '../../src/randomizer';
import { seededRuns } from '../support/seeded-runs';
Expand Down Expand Up @@ -84,8 +85,11 @@ describe('MersenneTwister19937', () => {
});
});

- describe('generateMersenne32Randomizer()', () => {
- const randomizer: Randomizer = generateMersenne32Randomizer();
+ describe.each([
+ ['generateMersenne32Randomizer()', generateMersenne32Randomizer],
+ ['generateMersenne53Randomizer()', generateMersenne53Randomizer],
+ ])('%s', (_, factory) => {
+ const randomizer: Randomizer = factory();

it('should return a result matching the interface', () => {
expect(randomizer).toBeDefined();
82 changes: 41 additions & 41 deletions test/modules/__snapshots__/airline.spec.ts.snap
Expand Up @@ -23,33 +23,33 @@ exports[`airline > 42 > airport 1`] = `
}
`;

exports[`airline > 42 > flightNumber > flightNumber addLeadingZeros 1`] = `"0089"`;
exports[`airline > 42 > flightNumber > flightNumber addLeadingZeros 1`] = `"0097"`;

exports[`airline > 42 > flightNumber > flightNumber length 2 to 4 1`] = `"891"`;
exports[`airline > 42 > flightNumber > flightNumber length 2 to 4 1`] = `"975"`;

exports[`airline > 42 > flightNumber > flightNumber length 2 to 4 and addLeadingZeros 1`] = `"0891"`;
exports[`airline > 42 > flightNumber > flightNumber length 2 to 4 and addLeadingZeros 1`] = `"0975"`;

exports[`airline > 42 > flightNumber > flightNumber length 3 1`] = `"479"`;
exports[`airline > 42 > flightNumber > flightNumber length 3 1`] = `"497"`;

exports[`airline > 42 > flightNumber > flightNumber length 3 and addLeadingZeros 1`] = `"0479"`;
exports[`airline > 42 > flightNumber > flightNumber length 3 and addLeadingZeros 1`] = `"0497"`;

exports[`airline > 42 > flightNumber > noArgs 1`] = `"89"`;
exports[`airline > 42 > flightNumber > noArgs 1`] = `"97"`;

exports[`airline > 42 > recordLocator > allowNumerics 1`] = `"DTY7RT"`;
exports[`airline > 42 > recordLocator > allowNumerics 1`] = `"DYRM66"`;

exports[`airline > 42 > recordLocator > allowVisuallySimilarCharacters 1`] = `"JUYETU"`;
exports[`airline > 42 > recordLocator > allowVisuallySimilarCharacters 1`] = `"JYTPEE"`;

exports[`airline > 42 > recordLocator > both allowNumerics and allowVisuallySimilarCharacters 1`] = `"DSY6QS"`;
exports[`airline > 42 > recordLocator > both allowNumerics and allowVisuallySimilarCharacters 1`] = `"DYQL55"`;

exports[`airline > 42 > recordLocator > noArgs 1`] = `"JVYETU"`;
exports[`airline > 42 > recordLocator > noArgs 1`] = `"JYTQDD"`;

exports[`airline > 42 > seat > aircraftType narrowbody 1`] = `"14E"`;
exports[`airline > 42 > seat > aircraftType narrowbody 1`] = `"14F"`;

exports[`airline > 42 > seat > aircraftType regional 1`] = `"8D"`;

exports[`airline > 42 > seat > aircraftType widebody 1`] = `"23H"`;
exports[`airline > 42 > seat > aircraftType widebody 1`] = `"23K"`;

exports[`airline > 42 > seat > noArgs 1`] = `"14E"`;
exports[`airline > 42 > seat > noArgs 1`] = `"14F"`;

exports[`airline > 1211 > aircraftType 1`] = `"widebody"`;

Expand All @@ -74,33 +74,33 @@ exports[`airline > 1211 > airport 1`] = `
}
`;

exports[`airline > 1211 > flightNumber > flightNumber addLeadingZeros 1`] = `"5872"`;
exports[`airline > 1211 > flightNumber > flightNumber addLeadingZeros 1`] = `"9296"`;

exports[`airline > 1211 > flightNumber > flightNumber length 2 to 4 1`] = `"5872"`;
exports[`airline > 1211 > flightNumber > flightNumber length 2 to 4 1`] = `"9296"`;

exports[`airline > 1211 > flightNumber > flightNumber length 2 to 4 and addLeadingZeros 1`] = `"5872"`;
exports[`airline > 1211 > flightNumber > flightNumber length 2 to 4 and addLeadingZeros 1`] = `"9296"`;

exports[`airline > 1211 > flightNumber > flightNumber length 3 1`] = `"948"`;
exports[`airline > 1211 > flightNumber > flightNumber length 3 1`] = `"982"`;

exports[`airline > 1211 > flightNumber > flightNumber length 3 and addLeadingZeros 1`] = `"0948"`;
exports[`airline > 1211 > flightNumber > flightNumber length 3 and addLeadingZeros 1`] = `"0982"`;

exports[`airline > 1211 > flightNumber > noArgs 1`] = `"5872"`;
exports[`airline > 1211 > flightNumber > noArgs 1`] = `"9296"`;

exports[`airline > 1211 > recordLocator > allowNumerics 1`] = `"XGWT86"`;
exports[`airline > 1211 > recordLocator > allowNumerics 1`] = `"XW8ZPQ"`;

exports[`airline > 1211 > recordLocator > allowVisuallySimilarCharacters 1`] = `"YLXUFD"`;
exports[`airline > 1211 > recordLocator > allowVisuallySimilarCharacters 1`] = `"YXFZRR"`;

exports[`airline > 1211 > recordLocator > both allowNumerics and allowVisuallySimilarCharacters 1`] = `"XGWS84"`;
exports[`airline > 1211 > recordLocator > both allowNumerics and allowVisuallySimilarCharacters 1`] = `"XW8ZOO"`;

exports[`airline > 1211 > recordLocator > noArgs 1`] = `"YMXUFC"`;
exports[`airline > 1211 > recordLocator > noArgs 1`] = `"YXFZSS"`;

exports[`airline > 1211 > seat > aircraftType narrowbody 1`] = `"33C"`;
exports[`airline > 1211 > seat > aircraftType narrowbody 1`] = `"33F"`;

exports[`airline > 1211 > seat > aircraftType regional 1`] = `"19B"`;
exports[`airline > 1211 > seat > aircraftType regional 1`] = `"19D"`;

exports[`airline > 1211 > seat > aircraftType widebody 1`] = `"56E"`;
exports[`airline > 1211 > seat > aircraftType widebody 1`] = `"56J"`;

exports[`airline > 1211 > seat > noArgs 1`] = `"33C"`;
exports[`airline > 1211 > seat > noArgs 1`] = `"33F"`;

exports[`airline > 1337 > aircraftType 1`] = `"narrowbody"`;

Expand All @@ -125,30 +125,30 @@ exports[`airline > 1337 > airport 1`] = `
}
`;

exports[`airline > 1337 > flightNumber > flightNumber addLeadingZeros 1`] = `"0061"`;
exports[`airline > 1337 > flightNumber > flightNumber addLeadingZeros 1`] = `"0022"`;

exports[`airline > 1337 > flightNumber > flightNumber length 2 to 4 1`] = `"61"`;
exports[`airline > 1337 > flightNumber > flightNumber length 2 to 4 1`] = `"22"`;

exports[`airline > 1337 > flightNumber > flightNumber length 2 to 4 and addLeadingZeros 1`] = `"0061"`;
exports[`airline > 1337 > flightNumber > flightNumber length 2 to 4 and addLeadingZeros 1`] = `"0022"`;

exports[`airline > 1337 > flightNumber > flightNumber length 3 1`] = `"351"`;
exports[`airline > 1337 > flightNumber > flightNumber length 3 1`] = `"312"`;

exports[`airline > 1337 > flightNumber > flightNumber length 3 and addLeadingZeros 1`] = `"0351"`;
exports[`airline > 1337 > flightNumber > flightNumber length 3 and addLeadingZeros 1`] = `"0312"`;

exports[`airline > 1337 > flightNumber > noArgs 1`] = `"61"`;
exports[`airline > 1337 > flightNumber > noArgs 1`] = `"22"`;

exports[`airline > 1337 > recordLocator > allowNumerics 1`] = `"AK68AJ"`;
exports[`airline > 1337 > recordLocator > allowNumerics 1`] = `"A6AGBJ"`;

exports[`airline > 1337 > recordLocator > allowVisuallySimilarCharacters 1`] = `"GOEFHO"`;
exports[`airline > 1337 > recordLocator > allowVisuallySimilarCharacters 1`] = `"GEHLIN"`;

exports[`airline > 1337 > recordLocator > both allowNumerics and allowVisuallySimilarCharacters 1`] = `"9K57AJ"`;
exports[`airline > 1337 > recordLocator > both allowNumerics and allowVisuallySimilarCharacters 1`] = `"95AGBI"`;

exports[`airline > 1337 > recordLocator > noArgs 1`] = `"GPDEGP"`;
exports[`airline > 1337 > recordLocator > noArgs 1`] = `"GDGMHN"`;

exports[`airline > 1337 > seat > aircraftType narrowbody 1`] = `"10D"`;
exports[`airline > 1337 > seat > aircraftType narrowbody 1`] = `"10A"`;

exports[`airline > 1337 > seat > aircraftType regional 1`] = `"6C"`;
exports[`airline > 1337 > seat > aircraftType regional 1`] = `"6A"`;

exports[`airline > 1337 > seat > aircraftType widebody 1`] = `"16F"`;
exports[`airline > 1337 > seat > aircraftType widebody 1`] = `"16B"`;

exports[`airline > 1337 > seat > noArgs 1`] = `"10D"`;
exports[`airline > 1337 > seat > noArgs 1`] = `"10A"`;