BBC BASIC for Windows - Quad-precision multiplication

rtr2
Guest

« Thread started on: Dec 8th, 2014, 11:01pm »

Once again Raymond Chen has tackled an assembly language problem on his blog. This time it's multiplying two 64-bit integers to give a 128-bit result:

http://blogs.msdn.com/b/oldnewthing/archive/2014/12/08/10578956.aspx

Here's a BBC BASIC version of the unsigned multiplication (BB4W v6 is required):

Code:

      INSTALL @lib$+"ASMLIB2"

      DIM crossterms 15+15, result 15+15, gap% 2047, P% 120
      crossterms = (crossterms + 15) AND -16
      result = (result + 15) AND -16 : REM align

      ON ERROR LOCAL [OPT FN_asmext : ]
      [
      .mul
      movq xmm0, [^x%%]            ; xmm0 = { 0, 0, A, B } = { *, *, A, B }
      movq xmm1, [^y%%]            ; xmm1 = { 0, 0, C, D } = { *, *, C, D }
      punpckldq xmm0, xmm0         ; xmm0 = { A, A, B, B } = { *, A, *, B }
      punpckldq xmm1, xmm1         ; xmm1 = { C, C, D, D } = { *, C, *, D }
      pshufd xmm2, xmm1, %01001110 ; xmm2 = { D, D, C, C } = { *, D, *, C } (swap 64-bit halves)

      pmuludq xmm1, xmm0           ; xmm1 = { AC, BD } // provisional result
      pmuludq xmm2, xmm0           ; xmm2 = { AD, BC } // cross-terms

      movdqa [result], xmm1
      movdqa [crossterms], xmm2

      mov    eax, crossterms[0]
      mov    edx, crossterms[4]    ; edx|eax = BC
      add    result[4], eax
      adc    result[8], edx
      adc    dword result[12], 0   ; add the first cross-term

      mov    eax, crossterms[8]
      mov    edx, crossterms[12]   ; edx|eax = AD
      add    result[4], eax
      adc    result[8], edx
      adc    dword result[12], 0   ; add the second cross-term

      ret
      ]
      RESTORE ERROR

      *hex64
      x%% = &1234567812345678
      y%% = &8765432187654321
      *hex32
      CALL mul
      PRINT ~result!12, ~result!8, ~result!4, ~result!0

(The answer is correct, by the way!)
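
For anyone wanting to check the arithmetic independently of the assembler, here is a minimal sketch in Python (rather than BBC BASIC, purely because Python's arbitrary-precision integers make the cross-check trivial). It assumes the same split as the comments in the listing, x = A:B and y = C:D with 32-bit halves, so that x*y = AC*2^64 + (AD + BC)*2^32 + BD. The function name mul64x64 and the words list are illustrative names only; they are not part of ASMLIB2 or of the original blog post.

```python
MASK32 = 0xFFFFFFFF


def mul64x64(x, y):
    """Multiply two unsigned 64-bit integers using only 32x32->64 products.

    Returns the 128-bit product as four 32-bit words, least significant
    first (the same layout as result!0, result!4, result!8, result!12).
    """
    a, b = x >> 32, x & MASK32  # x = A:B
    c, d = y >> 32, y & MASK32  # y = C:D

    # Provisional result { AC, BD }: BD fills words 0-1, AC fills words 2-3.
    bd, ac = b * d, a * c
    words = [bd & MASK32, bd >> 32, ac & MASK32, ac >> 32]

    # Each cross-term is added starting at word 1, with the carry rippling
    # upwards, mirroring the add / adc / adc sequence in the listing.
    for cross in (b * c, a * d):
        carry = 0
        parts = (cross & MASK32, cross >> 32, 0)
        for i, part in enumerate(parts, start=1):
            total = words[i] + part + carry
            words[i] = total & MASK32
            carry = total >> 32
        assert carry == 0  # the full product always fits in 128 bits

    return words


def words_to_int(words):
    """Reassemble little-endian 32-bit words into a single integer."""
    return sum(w << (32 * i) for i, w in enumerate(words))


if __name__ == "__main__":
    pairs = [
        (0x1234567812345678, 0x8765432187654321),  # values used in the post
        (0x0123456789ABCDEF, 0xFEDCBA9876543210),  # halves differ
    ]
    for x, y in pairs:
        words = mul64x64(x, y)
        assert words_to_int(words) == x * y
        # Print most significant word first, like the PRINT statement above.
        print(" ".join(f"{w:X}" for w in reversed(words)))
```

The second test pair has different high and low halves in both operands, so it would also catch the two cross-terms being mixed up, which the values used in the post (where A = B and C = D) cannot do.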

Richard.
