BBC BASIC for Windows
http://bb4w.conforums.com/index.cgi?board=assembler&action=display&num=1328010187

Antialiased line drawing (a timed test)
Post by David Williams on Jan 31st, 2012, 10:43am

This morning I devised an antialiased line drawing routine for GFXLIB which uses only fixed-point arithmetic.
It is intended to replace the existing "experimental" routine which makes very heavy use of FPU instructions.

To my surprise and slight disappointment, it turns out that the fixed-point version is only about 14% faster
than the quite ghastly FPU-based version! I was honestly expecting something between 50 and 100% faster.

But 14% ?

Anyway, until it's deleted in the next few days, the timed test (EXE) can be downloaded from here:

http://www.bb4wgames.com/temp/timed_test.zip


DrawAntialiasedLine0 is the FPU-intensive routine.

DrawAntialiasedLine uses just fixed-point maths (and a single IDIV instruction).


Probably the last time I'll fret about making heavy use of the FPU in my Asm programs. :)


David.



Re: Antialiased line drawing (a timed test)
Post by admin on Jan 31st, 2012, 12:07pm

on Jan 31st, 2012, 10:43am, David Williams wrote:
But 14% ?

On my PC it reports "The fixed point-based routine is 9.71% faster than the FPU-based one"!

Quote:
uses just fixed-point maths (and a single IDIV instruction).

IDIV is very slow. If it's in a loop and executed many times that may explain the poor performance.

Quote:
Probably the last time I'll fret about making heavy use of the FPU in my Asm programs.

Have you tried coding the floating-point version in SSE (or SSE2) and/or the integer version in MMX? One or other of those might do better.

How do the timings compare with non-antialiased lines? If they are not very different it may indicate that the time is dominated by plotting the pixels, which would imply there's little point trying to optimise the line-drawing calculations.

Richard.



Re: Antialiased line drawing (a timed test)
Post by admin on Jan 31st, 2012, 1:48pm

Here's what I get if I compare your results with the MMX-based line-drawing code I wrote a while ago. You didn't include the listing of your test program so I don't know precisely the coordinates you used; if it turns out my test harness is significantly different from yours then the comparison may be meaningless:

GFXLIB_DrawAntialiasedLine0 (FPU) took 5485 ms.
GFXLIB_DrawAntialiasedLine (fixed-point) took 5031 ms.
RTR's MMX-based line-drawing code took 1478 ms.
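
As a quick check on what those three figures imply, the ratios can be worked out directly. This is only an illustrative Python sketch; the helper names `speedup` and `percent_faster` are made up for this note and are not part of either test harness.

```python
# Timings quoted above, in milliseconds.
FPU_MS = 5485
FIXED_MS = 5031
MMX_MS = 1478


def speedup(slow_ms, fast_ms):
    """Return how many times faster the second routine is than the first."""
    return slow_ms / fast_ms


def percent_faster(slow_ms, fast_ms):
    """Return the same ratio expressed as 'N% faster'."""
    return (speedup(slow_ms, fast_ms) - 1.0) * 100.0


print(f"fixed-point vs FPU: {percent_faster(FPU_MS, FIXED_MS):.1f}% faster")
print(f"MMX vs fixed-point: {speedup(FIXED_MS, MMX_MS):.2f}x")
print(f"MMX vs FPU:         {speedup(FPU_MS, MMX_MS):.2f}x")
```

On these numbers the fixed-point routine comes out roughly 9% faster than the FPU one, and the MMX code a little over three times faster than either.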


The code is here:

http://www.rtr.myzen.co.uk/DrawAntialiasedLineRTR.zip

Richard.



Re: Antialiased line drawing (a timed test)
Post by David Williams on Jan 31st, 2012, 7:52pm

on Jan 31st, 2012, 12:07pm, Richard Russell wrote:
On my PC it reports "The fixed point-based routine is 9.71% faster than the FPU-based one"!


From bad to worse then.


Quote:
IDIV is very slow. If it's in a loop and executed many times that may explain the poor performance.


In the fixed point-based code, IDIV is executed only once per line drawn.

In contrast, in the ghastly FPU-based code, per line drawn, there are:

5x fidiv
1x fdivr
1x fsqrt
3x fmul

And in the point-plotting loop, per point plotted, there are 2 fimuls, and 2 fmuls!

In (partial) conclusion, the single IDIV instruction (per line drawn) in the fixed-point code almost certainly has no bearing on what appears to be the code's very poor performance.
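
To illustrate why one division per line is cheap, here is a minimal Python sketch of fixed-point stepping along an X-major line. It is not the GFXLIB code: the function name, the restriction to left-to-right X-major lines, and the choice of 8 fractional bits are all assumptions made for the example.

```python
FRAC_BITS = 8            # assumed 24.8 fixed-point format
ONE = 1 << FRAC_BITS


def line_points_fixed(x1, y1, x2, y2):
    """Yield (x, y_fixed) for each column of an X-major line.

    y_fixed carries FRAC_BITS fractional bits. Hypothetical helper,
    not taken from GFXLIB.
    """
    dx = x2 - x1
    dy = y2 - y1
    if dx <= 0 or abs(dy) > dx:
        raise ValueError("sketch handles only left-to-right, X-major lines")

    # The only division for the whole line (the role IDIV plays).
    # Note: Python's // rounds down, whereas IDIV truncates toward zero.
    slope = (dy << FRAC_BITS) // dx

    y = y1 << FRAC_BITS
    for x in range(x1, x2 + 1):
        yield x, y
        y += slope       # per-point cost: one addition


for x, y in line_points_fixed(0, 0, 8, 4):
    # Integer row and fractional part of y at each column.
    print(x, y >> FRAC_BITS, y & (ONE - 1))
```

The division happens once outside the loop, so its cost is spread over every point on the line; the per-point work is an add plus whatever the plotting routine does.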


Quote:
Have you tried coding the floating-point version in SSE (or SSE2) and/or the integer version in MMX? One or other of those might do better.


No, I haven't.

I consider getting the bog-standard non-SIMD, non-FPU version up and running something of an achievement!


Quote:
How do the timings compare with non-antialiased lines? If they are not very different it may indicate that the time is dominated by plotting the pixels, which would imply there's little point trying to optimise the line-drawing calculations.


Good point. I will have to check that.



David.



Re: Antialiased line drawing (a timed test)
Post by David Williams on Jan 31st, 2012, 10:47pm

on Jan 31st, 2012, 1:48pm, Richard Russell wrote:
http://www.rtr.myzen.co.uk/DrawAntialiasedLineRTR.zip


Thanks for the source!

So, yours is approaching three times faster than mine (I made the necessary modifications to your test harness in order to make it more-or-less identical to mine).

Also, what I then did was to remove the call to the 'put subpixel' subroutines in the three versions, and got these timings for the line calculations alone:

DW (FPU): 1531 ms
DW (Fixed-point): 1360 ms
RTR (Fixed-point): 1094 ms

100,000 lines (identical sets of co-ordinates); 640x512 drawing area
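
The measurement idea (time the same set of lines once with plotting and once without) can be sketched as follows. This is a hedged Python illustration, not the actual test harness: `draw_line`, `time_lines`, `make_counting_plot` and `null_plot` are hypothetical names, and the line stepper is a stand-in for the real routines.

```python
import time


def draw_line(x1, y1, x2, y2, plot):
    """Hypothetical X-major stepper; calls plot(x, y_fixed) per point."""
    dx = max(x2 - x1, 1)
    slope = ((y2 - y1) << 8) // dx      # 8 fractional bits (assumed)
    y = y1 << 8
    for x in range(x1, x2 + 1):
        plot(x, y)
        y += slope


def make_counting_plot(buffer, width):
    """Return a plot function that touches one cell of a flat buffer."""
    def plot(x, y_fixed):
        buffer[(y_fixed >> 8) * width + x] += 1
    return plot


def null_plot(x, y_fixed):
    """Stand-in for 'plotting removed': does nothing."""
    pass


def time_lines(lines, plot):
    """Return the seconds taken to step through every line in `lines`."""
    start = time.perf_counter()
    for x1, y1, x2, y2 in lines:
        draw_line(x1, y1, x2, y2, plot)
    return time.perf_counter() - start


width = height = 64
lines = [(0, 0, 63, 63), (0, 63, 63, 0), (0, 10, 63, 40)] * 1000
buffer = [0] * (width * height)

with_plot = time_lines(lines, make_counting_plot(buffer, width))
calc_only = time_lines(lines, null_plot)
print(f"with plotting: {with_plot:.4f} s, calculation only: {calc_only:.4f} s")
```

Unlike deleting the call instruction from the assembler loop, `null_plot` still costs a function call per point, so the "calculation only" figure from this sketch is an upper bound rather than an exact equivalent.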


Comparing our line calculation loops:

DW

Code:
        .plotYagainstX_loop%
        
        push eax                               ; preserve X
        push ebx                               ; preserve Y
        
        sar eax, 8                             ; EAX = X >> 8
        sar ebx, 8                             ; EBX = Y >> 8
        
        push edx                               ; colour
        push ebx                               ; Y >> 8
        push eax                               ; X >> 8
        push [ebp + 8]                         ; dispVars.bmBuffH%
        push [ebp + 4]                         ; dispVars.bmBuffW%
        push [ebp]                             ; dispVars.bmBuffAddr%
        call GFXLIB_Plot2x2FilteredPoint%
        
        pop ebx                                ; restore EBX (Y)
        pop eax                                ; restore EAX (X)
        
        add eax, edi                           ; X += step
        add ebx, esi                           ; Y += m
        cmp eax, ecx                           ; X <= x2 ?
        jle plotYagainstX_loop%

 



RTR

Code:
        .loopx
        call plotsubpixel
        add edi,eax
        add esi,&10000
        cmp esi,ebx
        jc loopx
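
The RTR loop appears to advance one coordinate by &10000 (1.0 in 16.16 fixed point) and the other by a slope held in EAX, leaving the sub-pixel work to `plotsubpixel`. The source of `plotsubpixel` is not shown in this thread, so the Python sketch below is only the generic two-pixel weighting used in Wu-style antialiasing, with a hypothetical function name, to show how a 16.16 fraction can become a pair of intensities.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS     # &10000 in BB4W assembler notation


def coverage_pair(y_fixed):
    """Split a 16.16 coordinate into (row, weight_row, weight_next_row).

    Weights are in 0..255 and sum to 255. Hypothetical helper; not a
    transcription of plotsubpixel.
    """
    row = y_fixed >> FRAC_BITS
    frac = y_fixed & (ONE - 1)
    next_weight = frac >> (FRAC_BITS - 8)   # top 8 bits of the fraction
    row_weight = 255 - next_weight
    return row, row_weight, next_weight


# y = 3.5 lands half-way between rows 3 and 4.
print(coverage_pair(0x00038000))
```

A coordinate exactly on a row gives all of its weight to that row; as the fraction grows, the weight shifts to the next row down.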
 



I'm forced to do all that PUSHing and POPing. :-[

I strongly suspect that your line calculation code is faster than my pi**poor implementation of Bresenham's line drawing algorithm as employed in the standard GFXLIB_Line routine.

Which is why I'm tempted...



David.



Re: Antialiased line drawing (a timed test)
Post by David Williams on Jan 31st, 2012, 11:55pm

An earlier program modified to use RTR's antialiased line drawing routine (includes EXE and source):

http://www.bb4wgames.com/misc/2d_asteroids_v1_5.zip


For 20 asteroids, I get the maximum VSync-locked frame rate of 60 fps on my laptop (60 Hz screen refresh rate).

I think the real bottleneck is the BASIC code to calculate the line endpoint coordinates. It's quite involved.

I am aware that the lines disappear (aren't drawn) if one or both endpoints leave the 'viewport'.


David.