Raymond Chen is continuing his series of Monday articles on SSE2 integer arithmetic. This week he is dealing with the signum function, i.e. SGN in BASIC. Here is a corrected version of his code for returning the signum of eight 16-bit integers in parallel:
Code:
INSTALL @lib$+"ASMLIB2"
DIM P% 100, L% -1, memory% 31
memory% = (memory% + 15) AND -16
ON ERROR LOCAL [OPT FN_asmext : ]
[OPT 10
.signum
movdqu xmm0, [memory%] ; input in xmm0
pxor xmm1, xmm1
pxor xmm2, xmm2
pcmpgtw xmm1, xmm0 ; xmm1 = pcmpgt(0, x): -1 where x < 0, else 0
pcmpgtw xmm0, xmm2 ; xmm0 = pcmpgt(x, 0): -1 where x > 0, else 0
psubw xmm1, xmm0 ; xmm1 = xmm1 - xmm0 = signum(x)
movdqu [memory%], xmm1 ; output in xmm1
ret
]
RESTORE ERROR
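To see why the compare-and-subtract sequence gives the signum, here is a minimal scalar sketch in C of what happens in each 16-bit lane. It is only a model of the instructions, not SSE2 code, and the names pcmpgt_lane and signum_8x16 are made up for illustration:

```c
#include <stdint.h>

/* Model of one 16-bit lane of pcmpgtw: all ones (-1) if a > b
   as signed values, otherwise 0. Hypothetical helper name. */
int16_t pcmpgt_lane(int16_t a, int16_t b)
{
    return (int16_t)(a > b ? -1 : 0);
}

/* Scalar model of the signum routine above for eight 16-bit lanes.
   Hypothetical name; the real routine does all eight lanes at once. */
void signum_8x16(const int16_t in[8], int16_t out[8])
{
    for (int i = 0; i < 8; i++) {
        int16_t neg = pcmpgt_lane(0, in[i]); /* -1 where x < 0 */
        int16_t pos = pcmpgt_lane(in[i], 0); /* -1 where x > 0 */
        out[i] = (int16_t)(neg - pos);       /* psubw: -1, 0 or +1 */
    }
}
```

A negative input gives -1 - 0 = -1, a positive input gives 0 - (-1) = +1, and zero gives 0 - 0 = 0. Because nothing is ever negated, -32768 needs no special handling.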
This can be converted to perform sixteen 8-bit signums by changing pcmpgtw and psubw to pcmpgtb and psubb, or four 32-bit signums by changing them to pcmpgtd and psubd.
You can also adapt it to use the 64-bit MMX registers, which hold half as many values (four 16-bit integers in the version below); in that case no library is required, of course:
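For anyone working in C rather than BBC BASIC, the same width change can be sketched with the SSE2 intrinsics from <emmintrin.h>, where _mm_cmpgt_epi8 / _mm_sub_epi8 correspond to pcmpgtb / psubb and _mm_cmpgt_epi32 / _mm_sub_epi32 to pcmpgtd / psubd. This assumes an x86 compiler with SSE2 available; the function names signum_epi8 and signum_epi32 are my own, not part of any library:

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Signum of sixteen signed 8-bit lanes (pcmpgtb / psubb).
   Hypothetical helper name. */
__m128i signum_epi8(__m128i x)
{
    __m128i zero = _mm_setzero_si128();
    __m128i neg = _mm_cmpgt_epi8(zero, x); /* -1 where x < 0 */
    __m128i pos = _mm_cmpgt_epi8(x, zero); /* -1 where x > 0 */
    return _mm_sub_epi8(neg, pos);
}

/* Signum of four signed 32-bit lanes (pcmpgtd / psubd).
   Hypothetical helper name. */
__m128i signum_epi32(__m128i x)
{
    __m128i zero = _mm_setzero_si128();
    __m128i neg = _mm_cmpgt_epi32(zero, x); /* -1 where x < 0 */
    __m128i pos = _mm_cmpgt_epi32(x, zero); /* -1 where x > 0 */
    return _mm_sub_epi32(neg, pos);
}
```

Only the lane width changes between the two functions. The choice matters because the compares are signed: the byte &80 is -128 as an 8-bit lane, but the same bit pattern in the low byte of a wider lane is a positive number.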
Code:
DIM P% 100, L% -1, memory% 15
memory% = (memory% + 7) AND -8
[OPT 10
.signum
movq mm0, [memory%] ; input in mm0
pxor mm1, mm1
pxor mm2, mm2
pcmpgtw mm1, mm0 ; mm1 = pcmpgt(0, x): -1 where x < 0, else 0
pcmpgtw mm0, mm2 ; mm0 = pcmpgt(x, 0): -1 where x > 0, else 0
psubw mm1, mm0 ; mm1 = mm1 - mm0 = signum(x)
movq [memory%], mm1 ; output in mm1
ret
]
Richard.