Fast PRNG (suitable for games + demos)
Post by David Williams on Apr 6th, 2010, 08:14am
You probably wouldn't want to use this if high quality pseudo-random numbers are required, but this fast PRNG seems adequate for games and demos, and is intended to be called from assembly language routines (I wouldn't bother with it if calling from BASIC). I don't know why it works.
Code: MODE 8 : OFF
PROCasm
seed% = RND
FOR I% = 1 TO 1000000
SYS FastRnd%, ^seed%, 640 TO X%
SYS FastRnd%, ^seed%, 512 TO Y%
PLOT 2*X%, 2*Y%
NEXT I%
END
DEF PROCasm
LOCAL I%, P%, code%
DIM code% 45
FOR I% = 0 TO 2 STEP 2
P% = code%
[OPT I%
.FastRnd%
; REM. seed% = 36969*(seed% AND 65535) + (seed% >> 16)
; [ESP + 4] = ^seed
; [ESP + 8] = range
mov ecx, [esp + 4]
mov eax, [ecx]
mov ebx, eax
shr ebx, 16
and eax, 65535
imul eax, 36969
add eax, ebx
mov [ecx], eax
and eax, &FFFF
imul eax, [esp + 8]
shr eax, 16
add eax, 1
ret 8
]
NEXT I%
ENDPROC
David.
Re: Fast PRNG (suitable for games + demos)
Post by admin on Apr 6th, 2010, 08:59am
on Apr 6th, 2010, 08:14am, David Williams wrote:I don't know why it works. |
|
It looks like the well-known "Multiply-With-Carry" generator by George Marsaglia, but with a chunk missing! The related MWC generator is this (C code):
Code: m_z = 36969 * (m_z & 65535) + (m_z >> 16);
m_w = 18000 * (m_w & 65535) + (m_w >> 16);
return (m_z << 16) + m_w; /* 32-bit result */
but your code appears to have omitted the m_w part entirely. An obvious result of that simplification is that the number of state bits is halved (from 64 to 32) so the sequence length is reduced from about 2^60 to no better than 2^32-1, making it worse than (rather than far better than) BBC BASIC's RND.
Incidentally, since the MWC generator is quite similar to the older linear congruential generator you might want to consider using a 3D plot rather than a 2D plot as in your program, because that can reveal shortcomings. See this Wikipedia article for details:
http://en.wikipedia.org/wiki/RANDU
Here is George Marsaglia's own post in sci.crypt, listing the above generator and several others:
http://groups.google.com/group/sci.crypt/browse_thread/thread/ca8682a4658a124d/
Richard.
Re: Fast PRNG (suitable for games + demos)
Post by David Williams on Apr 6th, 2010, 1:12pm
on Apr 6th, 2010, 08:59am, Richard Russell wrote:It looks like the well-known "Multiply-With-Carry" generator by George Marsaglia, but with a chunk missing! |
|
Yes it is, but like I said, my modified version is a simple and fast PRNG that may be suited to games and graphics demos (not meant for anything else), and is intended to be called only from assembly language routines (not BASIC).
Speed was the point.
I will probably use this PRNG myself!
David.
Re: Fast PRNG (suitable for games + demos)
Post by admin on Apr 6th, 2010, 10:37pm
on Apr 6th, 2010, 1:12pm, David Williams wrote:Yes it is, but like I said, my modified version is a simple and fast PRNG |
|
George Marsaglia's versions are "simple and fast" too; maybe not quite as simple or fast as yours, but with a known level of performance. What I would like to know is the extent to which your modifications have impaired the performance.
I wouldn't care if you were simply using it for your own purposes, but you chose to list it here, presumably for other people to use. Before anybody can decide whether it is or isn't suitable for their application they need to know what its properties are.
Richard.
Re: Fast PRNG (suitable for games + demos)
Post by David Williams on Apr 7th, 2010, 05:49am
on Apr 6th, 2010, 10:37pm, Richard Russell wrote:George Marsaglia's versions are "simple and fast" too; maybe not quite as simple or fast as yours, but with a known level of performance. What I would like to know is the extent to which your modifications have impaired the performance.
I wouldn't care if you were simply using it for your own purposes, but you chose to list it here, presumably for other people to use. Before anybody can decide whether it is or isn't suitable for their application they need to know what its properties are.
Richard. |
|
A proper statistical analysis is currently beyond my capabilities, however I can tell you that it seems adequate for making 'random' binary decisions (something I do a lot in my game programming). I wouldn't recommend it be used for anything other than that.
Given what I've already said about it, only an idiot would want to call it from BASIC (other than for test/example purposes), or use it in anything other than a game or graphics demo, or rely on it for high quality random numbers.
For making fast pseudo-random binary decisions (from within an assembly language routine), it seems OK.
David.
Re: Fast PRNG (suitable for games + demos)
Post by admin on Apr 7th, 2010, 09:12am
on Apr 7th, 2010, 05:49am, David Williams wrote:Given what I've already said about it, only an idiot would want to call it from BASIC.... For making fast pseudo-random binary decisions (from within an assembly language routine), it seems OK. |
|
I don't really see why you emphasise so strongly the distinction between BASIC and assembler. One might write a simple video game in BASIC for which your PRNG was perfectly suitable. Equally, one might write a Patience program in assembler (requiring card shuffling) for which your PRNG would be entirely unsuited.
Surely whether a program is written in BASIC or assembler is irrelevant; the suitability of a PRNG depends on its properties and the requirements of the program using it. After all, if you initially write a program in BASIC to prove the algorithm, then port it to assembler for speed, the quality of the PRNG it needs doesn't suddenly diminish!
Richard.
Re: Fast PRNG (suitable for games + demos)
Post by David Williams on Apr 7th, 2010, 09:53am
on Apr 7th, 2010, 09:12am, Richard Russell wrote:I don't really see why you emphasise so strongly the distinction between BASIC and assembler. One might write a simple video game in BASIC for which your PRNG was perfectly suitable. Equally, one might write a Patience program in assembler (requiring card shuffling) for which your PRNG would be entirely unsuited. |
|
Speed, obviously.
Why would anyone want to call FastRnd from a BASIC program when that would be slower than using BB4W's standard (and more versatile) RND function?
Here's the proof:
Code: MODE 8
PROCasm
seed% = RND
PRINT "Running test 1 (BASIC RND)..."
A% = 0
SYS "GetTickCount" TO timeA0%
FOR I% = 1 TO 1000000
IF RND(2)=1 THEN
A% += 1
ELSE
A% -= 1
ENDIF
NEXT I%
SYS "GetTickCount" TO timeA1%
PRINT "Running test 2 (FastRnd)..."
A% = 0
F% = FastRnd%
S% = ^seed%
SYS "GetTickCount" TO timeB0%
FOR I% = 1 TO 1000000
SYS F%, S%, 2 TO N%
IF N%=1 THEN
A% += 1
ELSE
A% -= 1
ENDIF
NEXT I%
SYS "GetTickCount" TO timeB1%
PRINT "RND took ";timeA1%-timeA0%;" ms."
PRINT "FastRnd took ";timeB1%-timeB0%; " ms."
END
DEF PROCasm
LOCAL I%, P%, code%
DIM code% 45
FOR I% = 0 TO 2 STEP 2
P% = code%
[OPT I%
.FastRnd%
; REM. seed% = 36969*(seed% AND 65535) + (seed% >> 16)
; [ESP + 4] = ^seed
; [ESP + 8] = range
mov ecx, [esp + 4]
mov eax, [ecx]
mov ebx, eax
shr ebx, 16
and eax, 65535
imul eax, 36969
add eax, ebx
mov [ecx], eax
and eax, &FFFF
imul eax, [esp + 8]
shr eax, 16
add eax, 1
ret 8
]
NEXT I%
ENDPROC
Using FastRnd from a BASIC program is significantly slower than using RND and offers no other benefit over RND.
However, for making fast 'random' Yes/No (or True/False) decisions within an assembly language program, I think it's quite ideal.
If you know of a faster way of making random binary (50/50) decisions in an assembly language routine then please do share. :-)
David.
Re: Fast PRNG (suitable for games + demos)
Post by admin on Apr 7th, 2010, 1:30pm
on Apr 7th, 2010, 09:53am, David Williams wrote:
Really? Two points to make: firstly, the reason the assembler code is so much slower than RND is because you've failed to ensure the necessary gap between the data and code to avoid cache thrashing.
Modify this line as shown and you should find the assembler code runs much more quickly (depending on your CPU):
Code:
Secondly, your so-called 'fast' PRNG runs no faster than the assembler version of RND listed on the Wiki! Try this extended program, which compares BASIC RND, your FastRnd and the assembler Rnd. On my PC, Rnd and FastRnd take virtually the same time (sometimes one is slightly faster, sometimes the other):
Code: MODE 8
PROCasm
DIM seed% 4
!seed% = RND
PRINT "Running test 1 (BASIC RND)..."
A% = 0
SYS "GetTickCount" TO timeA0%
FOR I% = 1 TO 1000000
IF RND(2)=1 THEN
A% += 1
ELSE
A% -= 1
ENDIF
NEXT I%
SYS "GetTickCount" TO timeA1%
PRINT "Running test 2 (FastRnd)..."
A% = 0
F% = FastRnd%
C% = seed%
D% = 2
SYS "GetTickCount" TO timeB0%
FOR I% = 1 TO 1000000
IF USR(F%)=1 THEN
A% += 1
ELSE
A% -= 1
ENDIF
NEXT I%
SYS "GetTickCount" TO timeB1%
PRINT "Running test 3 (assembler Rnd)..."
A% = 0
R% = Rnd%
C% = seed%
D% = 2
SYS "GetTickCount" TO timeC0%
FOR I% = 1 TO 1000000
IF USR(R%)=1 THEN
A% += 1
ELSE
A% -= 1
ENDIF
NEXT I%
SYS "GetTickCount" TO timeC1%
PRINT "BASIC RND took ";timeA1%-timeA0%;" ms."
PRINT "FastRnd took ";timeB1%-timeB0%; " ms."
PRINT "Assembler Rnd took ";timeC1%-timeC0%; " ms."
END
DEF PROCasm
LOCAL I%, L%, P%, code%
DIM code% 100, L% -1, gap% 2047
FOR I% = 8 TO 10 STEP 2
P% = code%
[OPT I%
.FastRnd%
; REM. seed% = 36969*(seed% AND 65535) + (seed% >> 16)
; ecx = ^seed
; edx = range
mov eax, [ecx]
mov ebx, eax
shr ebx, 16
and eax, 65535
imul eax, 36969
add eax, ebx
mov [ecx], eax
and eax, &FFFF
imul eax, edx
shr eax, 16
add eax, 1
ret
.Rnd%
; REM. Identical algorithm to RND
; ecx = ^seed
; edx = range
mov esi,ecx
mov eax,[esi]
mov cl,[esi+4]
mov ebx,eax
shr cl,1
rcr ebx,1
rcl cl,1
shl eax,12
xor ebx,eax
mov eax,ebx
shr eax,20
xor eax,ebx
mov [esi+4],cl
mov [esi],eax
mul edx
lea eax,[edx+1]
ret
]
NEXT I%
ENDPROC
Richard.
Re: Fast PRNG (suitable for games + demos)
Post by David Williams on Apr 7th, 2010, 6:44pm
on Apr 7th, 2010, 1:30pm, Richard Russell wrote:Really? Two points to make: firstly, the reason the assembler code is so much slower than RND is because you've failed to ensure the necessary gap between the data and code to avoid cache thrashing. |
|
On my system (Intel Centrino), at least on this occasion, the inclusion of a 2Kb (or 4Kb) gap makes no difference to execution speed (I do normally include a gap, actually).
on Apr 7th, 2010, 1:30pm, Richard Russell wrote:Secondly, your so-called 'fast' PRNG runs no faster than the assembler version of RND listed on the Wiki! Try this extended program ... |
|
Yes, I confirm that the performance of assembler RND and FastRnd are about the same.
BB4W's standard RND function is still the clear winner on my system.
Many of my ASM programs include a function which returns a random sign integer (+1 or -1), and so I think it's possible to shave a few instructions off FastRnd to make it a dedicated, fast pseudo-random binary (50/50) decision maker. That's really all I'd use it for. For other things, I'd probably use (and have several times used) your RND (and associated) assembler routines.
David.
Re: Fast PRNG (suitable for games + demos)
Post by admin on Apr 7th, 2010, 10:18pm
on Apr 7th, 2010, 6:44pm, David Williams wrote:Yes, I confirm that the performance of assembler RND and FastRnd are about the same. |
|
Incidentally, they are within one byte of being the same size too (and I could have made the assembler Rnd slightly smaller than FastRnd by changing the registers in which the parameters are supplied, thus avoiding the initial mov esi,ecx).
Quote:BB4W's standard RND function is still the clear winner on my system. |
|
The actual code executed is of course the same as the assembler version. The difference in execution speed comes from the overhead of the USR function.
Quote:I think it's possible to shave a few instructions off FastRnd to make it a dedicated, fast pseudo-random binary (50/50) decision maker. |
|
If all you want is a fast binary decision maker I reckon I could write one significantly faster than either FastRnd or Rnd. What minimum sequence length do you require?
Richard.
Re: Fast PRNG (suitable for games + demos)
Post by David Williams on Apr 8th, 2010, 7:32pm
on Apr 7th, 2010, 10:18pm, Richard Russell wrote:If all you want is a fast binary decision maker I reckon I could write one significantly faster than either FastRnd or Rnd. |
|
Thanks, that would be interesting to see.
on Apr 7th, 2010, 10:18pm, Richard Russell wrote:What minimum sequence length do you require? |
|
If for use with gaming (which is what I'd be inclined to use it for), wouldn't a sequence length in the range 2^16 to 2^32 suffice?
David.
Re: Fast PRNG (suitable for games + demos)
Post by admin on Apr 8th, 2010, 11:05pm
on Apr 8th, 2010, 7:32pm, David Williams wrote:Thanks, that would be interesting to see. |
|
It may be possible to do better than this, but as a naive first attempt:
Code: mov eax,[esp]
mov ebx,eax
shr ebx,28
adc eax,eax
shr ebx,2
and ebx,1
xor eax,ebx
mov [esp],eax
and eax,1
ret
This returns 0 or 1 on a pseudo-random basis, with a sequence length of 2^31-1. Here the 'state' is stored on the stack, but of course could be anywhere.
Richard.
Re: Fast PRNG (suitable for games + demos)
Post by admin on Apr 9th, 2010, 02:56am
Further, if you're happy to test the sign of the result as your 'binary decision' then you can eliminate the and eax,1 (effectively you're testing the MSB rather than the LSB):
Code: mov eax,[esp]
mov ebx,eax
shr ebx,28
adc eax,eax
shr ebx,2
and ebx,1
xor eax,ebx
mov [esp],eax
ret
This returns >0 or <0 on a pseudo-random basis, with a sequence length of 2^31-1.
Richard.
Re: Fast PRNG (suitable for games + demos)
Post by David Williams on Apr 9th, 2010, 7:00pm
on Apr 8th, 2010, 11:05pm, Richard Russell wrote:It may be possible to do better than this, but as a naive first attempt:
Code: mov eax,[esp]
mov ebx,eax
shr ebx,28
adc eax,eax
shr ebx,2
and ebx,1
xor eax,ebx
mov [esp],eax
and eax,1
ret
This returns 0 or 1 on a pseudo-random basis, ... |
|
Thanks very much for that bit of code (which I haven't yet tested but I feel certain will work just fine and is a keeper!).
I'm just trying to figure out how I might turn a pseudo-random 0 or 1 into a random sign (-1 or +1) in an efficient way without resorting to a CMP (or TST) + jump instruction.
David.
Re: Fast PRNG (suitable for games + demos)
Post by David Williams on Aug 20th, 2014, 03:09am
For a current project, I needed fast (but not 'high quality') generation of pseudo-random floats between 0.0 and 1.0. I'm sharing this code here in case it may possibly be of some interest to others.
David.
Code:
REM Fast -- but not high quality -- PRNG (Pseudo-Random Number Generator)
REM Suited to certain game applications, but NOT statistical simulations.
REM It is NOT intended to replace BB4W's RND function (that would be silly),
REM but it is meant to be called from other assembly language routines.
REM Returns a 64-bit float between 0.0 and 1.0 (exclusive, probably!)
REM
REM Calling this RNG from BASIC is slower than simply using RND(1)
REM (which also returns values in the range 0 to 1)
REM
REM Thanks to George Marsaglia for his 'congruent RNG' on which this code is based:
REM
REM x = 69069*x + 362437
REM
REM Obscenely simple, isn't it?
*FLOAT 64
MODE 8 : OFF
DIM code% 2048+128
FOR I% = 0 TO 2 STEP 2
P% = code%
[OPT I%
.rand
sub esp, 12
xor edx, edx
imul eax, DWORD [seed], 69069
mov DWORD [esp+4], edx
add eax, 362437
mov DWORD [esp], eax
fild QWORD [esp]
fmul QWORD [const]
mov DWORD [seed], eax
mov eax, DWORD [esp+16]
fstp QWORD [eax]
add esp, 12
ret 4
;
; aligned 2Kb code/data separation
;
.gap% : ] : P%+=2048 : P%=(P%+15) AND -16 : [OPT I%
.const dd 1048576 : dd 1039138816 ; 64-bit double representation of 1/(2^32 - 1)
.seed dd 1234567897 ; initial seed value
]
NEXT I%
N% = ^n#
F% = rand
FOR I% = 1 TO 1000000
SYSF%,N% : X%=1280*n#
SYSF%,N% : Y%=1024*n#
PLOTX%,Y%
NEXT
CLS
FOR I% = 1 TO 1000000
SYSF%,N% : X%=1280*n#
SYSF%,N% : Y%=1024*n#
SYSF%,N% : GCOL 16*n#
PLOTX%,Y%
NEXT