BBC BASIC for Windows: GFXLIB

http://bb4w.conforums.com/index.cgi?board=general&action=display&num=1219877763

GFXLIB
Post by David Williams on Aug 27^th, 2008, 10:56pm

The latest version of GFXLIB is available for download via the link below. GFXLIB is a library of machine code graphics routines primarily for use in games. The package includes dozens of fully commented example programs which, until I've written the tutorial, will have to serve as tutorials in themselves (not ideal, I admit). I do intend to write a proper tutorial, a FAQ, and an HTML-based reference. These things take time :)

(...although I do wonder if it's really worth the effort?)

GFXLIB version 1.0.00 link:

~~http://www.bb4w-games.com/gfxlib_1_0_00.zip~~

The example programs will not run directly from the ZIP folder, so extract the GFXLIB folder first to a suitable place (e.g. your My Documents folder).

There are still no bitmap scaling or rotation routines in GFXLIB, but they'll come eventually.

The quality of the routines is quite variable; some (for example, GFXLIB_PlotColourBlendOpaque or GFXLIB_Plot32as8) are terribly inefficient -- there are surely much more elegant and efficient ways to do what those routines do.

(In case you're wondering, GFXLIB_Plot32as8 attempts to render a 32 bits-per-pixel bitmap on an 8 bits-per-pixel DIB section/bitmap buffer (no dithering, unfortunately) -- my solution to the colour-matching problem is horrendously slow, and I'm too embarrassed to go into the details here.)

Regards,

David.

PS. Setting up and using GFXLIB is now as easy as this:

Code:

MODE 8

INSTALL @lib$ + "GFXLIB"
PROCAutoInit32(0)

FOR I%=1 TO 1000
  X% = RND(640)
  Y% = RND(512)
  SYS GFXLIB_Plot, dispvars{}, demoBm32%, 64, 64, X%, Y%
NEXT I%

SYS "InvalidateRect", @hwnd%, 0, 0
*REFRESH

Copy, paste & run :)

Re: GFXLIB
Post by admin on Aug 28^th, 2008, 08:19am

Quote:

my solution to the colour-matching problem is horrendously slow

There are two issues here: generating an 'optimum' palette and then choosing which entry is the best match to a specified colour. If you're trying to match to an existing palette, without dithering, why don't you simply use the GetNearestPaletteIndex API?

Generating an optimum palette is another matter, and I don't think there's a GDI function to do it. I devised my own method many years ago, which was responsible for the image here:

ftp://ftp.bbc.co.uk/pub/video/stills/tcf.gif

That has received favourable comments as a 256-colour image.

Richard.

Re: GFXLIB
Post by David Williams on Aug 28^th, 2008, 2:59pm

on Aug 28^th, 2008, 08:19am, Richard Russell wrote:

Yes, in most cases it'll be an existing 256-colour palette. I had never heard of the GetNearestPaletteIndex API! So thanks for acquainting me with its existence. That said, when I entered the following string into Google:

+GetNearestPaletteIndex +slow

the first result that came up said:

Quote:

"Looking up palette index values with GetNearestPaletteIndex is extremely slow, so we build a lookup table of 256 x 256, which contains the results of ..."

The question then is "is their 'extremely slow' algorithm any faster than my 'extremely slow' algorithm?"

Perhaps the folk at MS could have a word with a certain Ms. Wilson whose legendary colour-matching algorithm (as employed in her image mastering software ChangeFSI) was reportedly very fast. (No I don't know how it works!)

Mine works by 'scoring' the sums of the squared differences between the RGB values (per pixel) of the 32bpp or 24bpp source bitmap, and the RGB values of the available colours in the palette. Lowest score wins.
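
As a rough illustration only (in Python rather than the x86 assembler GFXLIB uses), the scoring described above can be sketched as follows. The function name `nearest_palette_index` and the five-entry palette are assumptions made up for this example; they are not part of GFXLIB.

```python
def nearest_palette_index(palette, r, g, b):
    """Return the index of the palette entry with the lowest squared-distance score."""
    best_index = 0
    best_score = None
    for index, (pr, pg, pb) in enumerate(palette):
        # Sum of the squared differences between the palette entry and the target colour
        score = (pr - r) ** 2 + (pg - g) ** 2 + (pb - b) ** 2
        if best_score is None or score < best_score:  # lowest score wins
            best_score = score
            best_index = index
    return best_index


# Hypothetical palette of (R, G, B) tuples: black, red, green, blue, white
palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
print(nearest_palette_index(palette, 250, 10, 10))
```

Because the comparison is strict, a tie goes to the earliest palette entry. The cost is one full pass over the palette per pixel, which is why the approach gets slow on large bitmaps.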

Based on the standard Windows XP palette, GFXLIB_Plot32as8 renders this:

User Image

as this...

User Image

Dithering would definitely be a good idea.

(Cue the hues and cries over my inclusion of images)

Quote:

Generating an optimum palette is another matter, and I don't think there's a GDI function to do it. I devised my own method many years ago, which was responsible for the image here:

...

That has received favourable comments as a 256-colour image.

That looks very good!

Regards,

David.

Re: GFXLIB
Post by admin on Aug 28^th, 2008, 4:45pm

Quote:

Perhaps the folk at MS could have a word with a certain Ms. Wilson whose legendary colour-matching algorithm (as employed in her image mastering software ChangeFSI) was reportedly very fast.

I'm guessing that Sophie's method may have been fast because of her unique ARM coding skills rather than a clever algorithm. Your approach sounds like the right one; what makes it so slow? I would imagine MMX instructions may well be of value in doing the computations.

Richard.

Re: GFXLIB
Post by David Williams on Aug 28^th, 2008, 5:28pm

on Aug 28^th, 2008, 4:45pm, Richard Russell wrote:

GFXLIB_Plot32as8 calls an external colour matching function for each pixel that it plots, and so time is wasted in performing this CALL (since this flushes the pipeline, I believe), and more clock cycles are eaten up by register preservation (PUSHAD) in said external function. There are six memory accesses (reads) per plotted pixel, although three of these are almost certainly read from cached locations (ESP+offset), and I'm hoping the palette entries get cached quickly since they're accessed multiple times in most cases.

Here is the code pasted straight out of GFXLIB:

Code:

        .GFXLIB_ColourMatch
        
        ; SYS GFXLIB_ColourMatch, palAddr, numCols, R`, G`, B`
        
        pushad
        
        ; ESP+36 = palAddr
        ; ESP+40 = numCols
        ; ESP+44 = R`
        ; ESP+48 = G`
        ; ESP+52 = B`
        
        ;----*----*----*----*----*----*----*----|
        
        mov edi, &7FFFFFFF                      ; EDI = least squares max sum (initially set to &7FFFFFFF)
        xor ecx, ecx                            ; ECX = least squares index
        
        mov edx, [esp + 36]                     ; EDX = palette addr
        
        xor ebp, ebp                            ; EBP = loop counter (palette index)
        
        .GFXLIB_ColourMatch__lp
        
        movzx eax, BYTE [edx + 4*ebp + 2]       ; load palette R byte
        movzx ebx, BYTE [edx + 4*ebp + 1]       ; load palette G byte
        movzx esi, BYTE [edx + 4*ebp + 0]       ; load palette B byte
        
        sub eax, [esp + 44]                     ; = R-R`
        sub ebx, [esp + 48]                     ; = G-G`
        sub esi, [esp + 52]                     ; = B-B`
        
        imul eax, eax                           ; = (R-R`)^2
        imul ebx, ebx                           ; = (G-G`)^2
        imul esi, esi                           ; = (B-B`)^2
        
        add eax, ebx                            ; = (R-R`)^2 + (G-G`)^2
        add eax, esi                            ; = (R-R`)^2 + (G-G`)^2 + (B-B`)^2
        
        cmp eax, edi                            ; compare current sum with least squares sum
        jge GFXLIB_ColourMatch__skip
        
        mov edi, eax                            ; least squares sum = current sum
        mov ecx, ebp                            ; lsq index = ebp
        
        .GFXLIB_ColourMatch__skip
        
        inc ebp
        cmp ebp, [esp + 40]                     ; compare loop counter with numCols
        jne GFXLIB_ColourMatch__lp
        
        mov BYTE [varsblk], cl                  ; store final lsq index
        
        popad
        mov al, BYTE [varsblk]
        ret (5*4)

I'm thinking those IMULs could be pre-calc'd (squares looked-up from a table), but perhaps that might prove more 'expensive'.
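
For what it's worth, the table idea might look something like the Python sketch below. It is illustrative only: `SQUARES` and `score` are hypothetical names, and whether a table look-up actually beats IMUL on a given CPU would have to be measured. The difference between two byte values always lies in the range -255 to 255, so 511 entries cover every case.

```python
# Squares of every possible difference between two byte values (-255..255).
# Index with (difference + 255) so that negative differences map to valid slots.
SQUARES = [d * d for d in range(-255, 256)]


def score(entry, target):
    """Sum of squared channel differences, using table look-ups instead of multiplies."""
    return sum(SQUARES[e - t + 255] for e, t in zip(entry, target))


# Example: palette entry (10, 20, 30) scored against target colour (13, 16, 30)
print(score((10, 20, 30), (13, 16, 30)))
```

The trade-off is a memory read per channel in place of a multiply, so the table only helps if it stays in cache.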

Regards,

David.

Re: GFXLIB
Post by admin on Aug 28^th, 2008, 10:27pm

Quote:

time is wasted in performing this CALL (since this flushes the pipeline, I believe)

Are you sure? I can't see any mention of that in the Intel Architecture Optimization Reference Manual. Inlining CALLs is recommended, but only for 'peripheral' reasons:

• Parameter passing overhead can be eliminated.
• In a compiler, inlining a function exposes more opportunity for optimization.
• If the inlined routine contains branches, the additional context of the caller may improve branch prediction within the routine.
• A mispredicted branch can lead to larger performance penalties inside a small function than if that function is inlined.

I doubt that any of these apply significantly in your case. In general the CPU is "optimized specifically for calls and returns" (e.g. the trace cache) so I don't think you need worry too much about the overhead.

Richard.

Re: GFXLIB (Fast Text Drawing)
Post by David Williams on Aug 29^th, 2008, 02:53am

The next release of GFXLIB will feature some fast text drawing subroutines.

Here's a demo:

~~http://www.bb4w-games.com/fastfontdemo.zip~~

The screen redraw is supposed to be sync'd with the monitor's VBlank, but if the synchronisation is not good then please don't form the impression that the text drawing routine is slow!

Regards,

David.

Re: GFXLIB (demo of PlotDissolve3 routine)
Post by David Williams on Aug 31^st, 2008, 8:21pm

The next public release of GFXLIB will include a new routine called PlotDissolve3.

Watch this demo to see what it does:

~~http://www.bb4w-games.com/plotdissolve3demo.zip~~

The routine is currently very suboptimal -- it calls Richard's pseudo-random number generator for every d**ned pixel, so some kind of shortcut needs to be devised, even if that means A) a huge table of random numbers, or B) a faster but lower quality random number generator.

(Not suggesting Richard's routine is slow -- it isn't -- just that I'm happy to compromise high quality pseudo-randomness for speed in this case).
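
To make option B concrete, here is a hedged Python sketch of a cheap generator driving a per-pixel dissolve. Everything in it is an assumption for illustration: it is not the PlotDissolve3 code, `FastLcg` and `dissolve` are made-up names, and it assumes a dissolve simply copies each source pixel with a probability of roughly level/256. The generator is a plain 32-bit linear congruential generator using the well-known Numerical Recipes constants.

```python
class FastLcg:
    """32-bit linear congruential generator: fast, but statistically weak."""

    def __init__(self, seed=1):
        self.state = seed & 0xFFFFFFFF

    def next(self):
        # state = (a * state + c) mod 2^32
        self.state = (self.state * 1664525 + 1013904223) & 0xFFFFFFFF
        return self.state


def dissolve(src, dst, level, rng):
    """Return a copy of dst in which each src pixel is copied with probability ~level/256."""
    out = list(dst)
    for i, pixel in enumerate(src):
        # Use the top 8 bits (0..255); the low bits of an LCG are the least random
        if (rng.next() >> 24) < level:
            out[i] = pixel
    return out


src = list(range(100, 164))   # 64 hypothetical source pixels
dst = [0] * 64                # 64 background pixels
half = dissolve(src, dst, 128, FastLcg(7))
```

One multiply, one add and a shift per pixel is about as cheap as a generator gets; option A would swap that arithmetic for a memory read from a precomputed table.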

Regards,

David.

Re: GFXLIB
Post by admin on Aug 31^st, 2008, 8:46pm

Quote:

Watch this demo to see what it does

You may not like David Tennant as Doctor Who, but at least you have the satisfaction of knowing that BBC BASIC for Windows may end up having a significant (retrospective) contribution to make to Jon Pertwee's depiction of the role! For more details see the September 2008 edition of Everyday Practical Electronics (page 16).

Richard.

Re: GFXLIB (full screen demo)
Post by David Williams on Sep 4^th, 2008, 12:38am

A simple full screen demo:

~~http://www.bb4w-games.com/fullscreendemo.zip~~

I was surprised to get the 'ideal' (VBlank-sync'd) frame rate of 60 fps on my 1.86GHz Centrino-based laptop. However, the CPU load was rather high at approx. 50%. Also, the VBlank synchronisation isn't perfect, but it's better than no sync, IMO.

Regards,

David.

Re: GFXLIB (Fast bitmap scaling)
Post by David Williams on Sep 4^th, 2008, 09:12am

Some very fast -- albeit low quality nearest-neighbour -- bitmap scaling:

~~http://www.bb4w-games.com/fastscalingdemo.zip~~

Re: GFXLIB
Post by admin on Sep 4^th, 2008, 2:36pm

Quote:

Some very fast -- albeit low quality nearest-neighbour -- bitmap scaling

This appears to be broken on my PC: the 'GFXLIB' text, which I presume is intended to be in the foreground, is partially hidden most of the time:

User Image

Richard.

Re: GFXLIB
Post by David Williams on Sep 4^th, 2008, 4:51pm

on Sep 4^th, 2008, 2:36pm, Richard Russell wrote:

This appears to be broken on my PC: the 'GFXLIB' text, which I presume is intended to be in the foreground, is partially hidden most of the time:

Oops... yes, I had REM'd out the *REFRESH statement and forgot to un-REM it prior to compilation.

It should work o.k. now.

~~http://www.bb4w-games.com/fastscalingdemo.zip~~

David.

Re: GFXLIB (text squashing)
Post by David Williams on Sep 5^th, 2008, 12:05am

This will be the last GFXLIB demo for a month or two because I really must get the documentation and example programs written...

~~http://www.bb4w-games.com/textsquash.zip~~

I intend to release the next version of GFXLIB (with lots of new routines plus decent docs) by the end of this month, or early October. I hope then that it'll not just be me and Simon writing games based on it.

Check out Simon's game 'Blast' which promises some frantic arcade action (you'll probably need to extract the files from the ZIP folder first before running it):

~~http://www.bb4w-games.com/blast.zip~~

Regards,

David.

Re: GFXLIB
Post by admin on Sep 5^th, 2008, 08:22am

Quote:

Check out Simon's game 'Blast' which promises some frantic arcade action (you'll probably need to extract the files from the ZIP folder first before running it)

Do you happen to know why he doesn't package all the 'resource' files into the executable? Personally I can't be bothered to download the zip and find somewhere suitable to extract all the files.

Your programs are so much easier to run; I don't even have to download them (explicitly), I just 'open' the link in your post then double-click on the executable. Wonderful!

Re: GFXLIB (Scaled game graphics)
Post by David Williams on Sep 8^th, 2008, 12:47am

I wanted to try an experiment with a view to perhaps creating a game that is largely independent of screen resolution. The method used in this demo (link below) involves the pre-scaling of bitmaps using simple nearest-neighbour scaling, and then these pre-scaled bitmaps are drawn in the usual way using the reasonably fast standard GFXLIB_Plot routine.
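
A minimal Python sketch of the pre-scaling step, assuming bitmaps are flat lists of pixels stored row by row (the name `scale_nearest` and that layout are assumptions for the example, not the GFXLIB routine): each destination pixel simply takes the source pixel whose position it maps back to, using integer arithmetic only.

```python
def scale_nearest(src, src_w, src_h, dst_w, dst_h):
    """Nearest-neighbour scale of a row-major pixel list to dst_w x dst_h."""
    dst = []
    for y in range(dst_h):
        sy = y * src_h // dst_h          # source row for this destination row
        for x in range(dst_w):
            sx = x * src_w // dst_w      # source column for this destination column
            dst.append(src[sy * src_w + sx])
    return dst


# Hypothetical 2x2 bitmap scaled once, up front, to 4x4
small = [1, 2,
         3, 4]
big = scale_nearest(small, 2, 2, 4, 4)
```

Scaling once at start-up means the per-frame cost is just an ordinary plot of the larger bitmap, at the price of the extra memory the scaled copies occupy.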

~~http://www.bb4w-games.com/scaledgamegraphicsdemo.zip~~

The demonstration 'game' doesn't do much -- use the arrow keys to move around and collect objects. Not much fun... but then, the point of the program is to demonstrate an idea/concept, not to entertain.

You have to re-start the program in order to change the resolution.

Regards,

David.

Re: GFXLIB
Post by 81RED on Sep 10^th, 2008, 07:02am

on Sep 5^th, 2008, 08:22am, Richard Russell wrote:

To quote what I wrote to David on that subject:
"Can only speak from personal experience, but in my end of the world, users have a nasty tendency to download stuff directly to their desktop.
Now having a Blast.exe that "explodes" into 29 additional items on said desktop is not the ideal way to make friends."
And I could go on and on about that particular subject, but I guess I'm as opposed to programs that uncritically and without warning clutter up the folder they happen to be run in, as you are to unzipping anything.

Simon

Re: GFXLIB
Post by admin on Sep 10^th, 2008, 09:14am

Quote:

Now having a Blast.exe that "explodes" into 29 additional items on said desktop is not the ideal way to make friends

I suggested that you "package the resource files into the executable", not that you 'explode' 29 items onto the desktop. One doesn't follow from the other!

For a start, I would always recommend putting the resource files into a single sub-directory, not keeping them in the same directory as the executable. Thus if one were to download the executable to the desktop and run it there the most that would happen is that a single additional folder icon would appear.

Arguably the appearance of that icon isn't in itself a bad thing, since it would draw attention to what is in any case bad practice - putting an executable file on the desktop. However it could easily be removed by setting the resource directory's attributes to 'hidden' early in your program.

But what I think is more important is that David's method of embedding all the resource files means that you don't have to (explicitly) download the programs at all. To run one of his programs I just 'open' it from the web site - the downloading and extraction of resource files to a temporary directory happens 'behind the scenes'. Literally his programs are four mouse-clicks away from a message on this forum.

Anyway it's ultimately up to you. I've marvelled at David's programs but I've not even looked at yours because I can't be bothered with the hassle of downloading, extracting and subsequently deleting it.

Richard.

Re: GFXLIB
Post by 81RED on Sep 10^th, 2008, 10:37am

Quote:

I suggested that you "package the resource files into the executable", not that you 'explode' 29 items onto the desktop. One doesn't follow from the other!

Admittedly, no. But, at the risk of repeating myself, there will always be a risk when you unpack something without the user's consent or control.

Quote:

For a start, I would always recommend putting the resource files into a single sub-directory, not keeping them in the same directory as the executable. Thus if one were to download the executable to the desktop and run it there the most that would happen is that a single additional folder icon would appear.

Blast.zip contains *two* files in the root of the archive - a Blast.exe and a readme. The rest is in a \data folder.

Quote:

Arguably the appearance of that icon isn't in itself a bad thing, since it would draw attention to what is in any case bad practice - putting an executable file on the desktop. However it could easily be removed by setting the resource directory's attributes to 'hidden' early in your program.

Hang on a moment, did you just suggest that I clutter the user's hard drive, only to hide it afterwards? I sincerely hope I read that paragraph wrong..

Quote:

But what I think is more important is that David's method of embedding all the resource files means that you don't have to (explicitly) download the programs at all. To run one of his programs I just 'open' it from the web site - the downloading and extraction of resource files to a temporary directory happens 'behind the scenes'. Literally his programs are four mouse-clicks away from a message on this forum.

May I suggest WinRAR. It makes running anything inside a zip file equally simple to what you describe above.
Be warned though - WinRAR requires that you actually install it.

Quote:

Anyway it's ultimately up to you. I've marvelled at David's programs but I've not even looked at yours because I can't be bothered with the hassle of downloading, extracting and subsequently deleting it.

Hmm.. I'm still glad I could "be bothered" to download a certain compiler, run the installer, click "next" an amount of times, enter a serial number, click "next" a few more times etc. so that I could produce Blast in the first place.

That you cannot "be bothered" to unzip my game is just something I will have to live with.

Simon

Re: GFXLIB
Post by admin on Sep 10^th, 2008, 2:42pm

Quote:

Blast.zip contains *two* files in the root of the archive - a Blast.exe and a readme. The rest is in a \data folder.

So where do those 29 items on the desktop come from when you embed them? I would have expected there to be just two (in addition to the executable): a folder icon and a readme icon.

Quote:

Hang on a moment, did you just suggest that I clutter the user's hard drive, only to hide it afterwards? I sincerely hope I read that paragraph wrong..

What's wrong with that? Loads of applications install 'hidden' files, and as for cluttering the user's hard drive you can simply delete the folder when your program exits. Another solution is to store your resource files in a subdirectory of @lib$, in which case they are deleted automatically on exit.

Quote:

I'm still glad I could "be bothered" to download a certain compiler, run the installer, click "next" an amount of times, enter a serial number, click "next" a few more times etc. so that I could produce Blast in the first place.

I don't see that it's a valid comparison. If you develop your game into a fully-fledged application that needs to be 'installed' then of course I'd have no objection to carrying out those steps. I'd still encourage you to use a proper installer (so that again I can just 'run' the program from a web page) rather than require me to download it and extract the files manually.

I'm puzzled at your negative reaction to what was intended to be a practical suggestion to improve the user-friendliness of your software. It's a strength of BBC BASIC for Windows that you can embed resource files in the executable, and being able just to 'run' a program from a website or forum message seems to me a useful feature, so long as it is used appropriately.

Re: GFXLIB
Post by 81RED on Sep 10^th, 2008, 4:48pm

on Sep 10^th, 2008, 2:42pm, Richard Russell wrote:

I'm puzzled at your negative reaction to what was intended to be a practical suggestion to improve the user-friendliness of your software. It's a strength of BBC BASIC for Windows that you can embed resource files in the executable, and being able just to 'run' a program from a website or forum message seems to me a useful feature, so long as it is used appropriately.

I did not mean to sound negative. A little bit sarcastic perhaps, but not negative.
Must admit it amuses, more than it irritates me that it's now the second time that you have complained about the packaging rather than the content of my creation.
First time was when the original (non GFXLIB version) version of Blast was posted to the yahoo files area - you did not like/trust/approve of my Windows Installer MSI file.
I duly removed that and replaced it with a zip file. This was all a long time ago.
Now, as it turns out, you cannot be bothered with zip files either.
Have removed the current zip file from yahoo, will replace it with an exploding version at some later stage. And then again, I might not.

Simon

Re: GFXLIB
Post by admin on Sep 10^th, 2008, 7:08pm

Quote:

Now, as it turns out, you cannot be bothered with zip files either.

You mustn't take that personally. Computer games have never been my 'thing'; really they don't interest me at all (possibly because I'm hopeless at writing them and hopeless at playing them). David has achieved the near impossible by making his games so easy to run that even I bother to do so - although I still just watch the demo and listen to the music rather than actually play the thing.

Re: GFXLIB (GFXLIB Example #35)
Post by David Williams on Sep 11^th, 2008, 07:17am

Although I won't publish the source yet (since it relies on subroutines not implemented in the publicly available version of GFXLIB), here's Example 35, which demonstrates two subroutines:

1. PROCLoadBMP24Scaled
2. GFXLIB_PlotColourBlendOpaque

(1) Loads and scales a 24 bits-per-pixel bitmap (original bitmap discarded).
(2) 'Colourizes' a bitmap to a specified colour and strength, and then alpha-blends the resultant colour with the background pixel colour.

~~http://www.bb4w-games.com/example35.zip~~

[ Michael: this proggy uses only ~15% CPU on my laptop ;) ]

Regards,

David.

Re: GFXLIB
Post by admin on Sep 11^th, 2008, 08:44am

Quote:

http://www.bb4w-games.com/example35.zip

Beautiful, as always.

If I move the window partially off the bottom of the screen and then back, a (one-pixel wide?) border around your graphics appears not to be repainted. Is that me or is it you?

Richard.

Re: GFXLIB
Post by David Williams on Sep 11^th, 2008, 08:55am

on Sep 11^th, 2008, 08:44am, Richard Russell wrote:

If I move the window partially off the bottom of the screen and then back, a (one-pixel wide?) border around your graphics appears not to be repainted. Is that me or is it you?

Richard.

It's "fixing the window size" that causes this problem (which affects nearly all of my graphics programs).

Code:

MODE 8 : OFF
      
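REM. Fix window size: clear WS_THICKFRAME (&40000) and WS_MAXIMIZEBOX (&10000)
REM. in the window style (GWL_STYLE = -16)
SYS "GetWindowLong", @hwnd%, -16 TO ws%
SYS "SetWindowLong", @hwnd%, -16, ws% AND NOT &50000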

IIRC, you suggested a possible solution a few months ago, however it didn't work for me.

Regards,

David.

Re: GFXLIB
Post by admin on Sep 12^th, 2008, 09:39am

Quote:

IIRC, you suggested a possible solution a few months ago, however it didn't work for me.

Hmm, it should work. Whenever you change the window style in a way which might change the size of the border, you should force a redraw of the border:

Code:

MODE 8 : OFF
      
REM. Fix window size
SYS "GetWindowLong", @hwnd%, -16 TO ws%
SYS "SetWindowLong", @hwnd%, -16, ws% AND NOT &50000 
SYS "SetWindowPos", @hwnd%, 0, 0, 0, 0, 0, 32+7

If it doesn't work please let me know rather than grumbling to yourself that I've given you duff gen!

Richard.

Re: GFXLIB
Post by David Williams on Sep 12^th, 2008, 12:34pm

on Sep 12^th, 2008, 09:39am, Richard Russell wrote:

Hmm, it should work ...

But it doesn't!

Try this (most of the code taken straight from the Wiki):

Code:

      MODE 8 : OFF
      
      REM. Fix window size
      SYS "GetWindowLong", @hwnd%, -16 TO ws%
      SYS "SetWindowLong", @hwnd%, -16, ws% AND NOT &50000
      SYS "SetWindowPos", @hwnd%, 0, 0, 0, 0, 0, 32+7
      
      DIM BITMAPINFOHEADER{Size%, Width%, Height%, Planes{l&,h&}, BitCount{l&,h&}, \
      \                    Compression%, SizeImage%, XPelsPerMeter%, YPelsPerMeter%, \
      \                    ClrUsed%, ClrImportant%}
      
      DIM bmi{Header{} = BITMAPINFOHEADER{}, Palette%(255)}
      
      bmi.Header.Size% = DIM(BITMAPINFOHEADER{})
      bmi.Header.Width% = @vdu%!208
      bmi.Header.Height% = @vdu%!212
      bmi.Header.Planes.l& = 1
      bmi.Header.BitCount.l& = 32
      
      SYS "CreateDIBSection", @memhdc%, bmi{}, 0, ^bits%, 0, 0 TO hbitmap%
      IF hbitmap% = 0 ERROR 100, "Couldn't create DIBSection"
      
      SYS "SelectObject", @memhdc%, hbitmap% TO oldhbm%
      SYS "DeleteObject", oldhbm%
      CLS
      
      *REFRESH OFF
      REPEAT
        CLS
        SYS "InvalidateRect", @hwnd%, 0, 0
        *REFRESH
      UNTIL INKEY(1)=0

Upon closing or dragging a background window, or often moving the program window around (and especially when much of it disappears off the screen), I get two extraneous borders (two pixels in width) on the right and bottom edges of the window.

Even if SetWindowPos is placed after the CLS command, I still get the said borders.

Regards,

David.

Re: GFXLIB
Post by admin on Sep 12^th, 2008, 5:55pm

Quote:

But it doesn't!

Oh yes it does! The code I listed correctly changes the window size, but it doesn't update @vdu%!208 and @vdu%!212 because you've forgotten the VDU 26. You must always have a VDU 26 somewhere between changing the window size and using those variables otherwise they will reflect the size *before* you changed the style.

Re: GFXLIB
Post by David Williams on Sep 12^th, 2008, 7:13pm

on Sep 12^th, 2008, 5:55pm, Richard Russell wrote:

Richard 1, David 0.

Yes, VDU 26 does indeed do the trick.

Thanks! I'm really pleased that I can now fix a problem that affects most of my programs.

David.

Re: GFXLIB
Post by David Williams on Sep 13^th, 2008, 02:08am

Code:

      MODE 8 : OFF
      
      REM. Fix window size
      SYS "GetWindowLong", @hwnd%, -16 TO ws%
      SYS "SetWindowLong", @hwnd%, -16, ws% AND NOT &50000
      SYS "SetWindowPos", @hwnd%, 0, 0, 0, 0, 0, 32+7
      VDU 26

Actually, I don't really want the dimensions of the window to change (mainly because it breaks one or two of my GFXLIB routines which either expect a 640x512 DIB section/bitmap buffer, or require the window width to be divisible by 4). The window is initially 640 by 512 pixels; after applying SetWindowPos and VDU 26, it becomes 642 by 514.

I take it the '32' component of the flags parameter is SWP_FRAMECHANGED, I'm not sure where the +7 comes from.

Whilst it fixes the problem, I'd rather it did so without altering the window dimensions.

Regards,

David.

Re: GFXLIB
Post by admin on Sep 13^th, 2008, 08:53am

Quote:

I take it the '32' component of the flags parameter is SWP_FRAMECHANGED, I'm not sure where the +7 comes from.

The 7 is SWP_NOSIZE + SWP_NOMOVE + SWP_NOZORDER.
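
The arithmetic behind those magic numbers can be checked with a few lines of Python. The constant values below are the standard ones from the Windows headers; the helper name `fixed_size_style` is made up for this sketch, and reading the &50000 mask used earlier in the thread as WS_THICKFRAME plus WS_MAXIMIZEBOX is an interpretation based on those header values.

```python
# Win32 constants (values as defined in winuser.h)
GWL_STYLE = -16               # index passed to GetWindowLong/SetWindowLong

SWP_NOSIZE = 0x0001
SWP_NOMOVE = 0x0002
SWP_NOZORDER = 0x0004
SWP_FRAMECHANGED = 0x0020

WS_MAXIMIZEBOX = 0x00010000
WS_THICKFRAME = 0x00040000    # the resizable border

# The flags value written as 32+7 in the SetWindowPos call
flags = SWP_FRAMECHANGED | SWP_NOSIZE | SWP_NOMOVE | SWP_NOZORDER


def fixed_size_style(style):
    """Clear the resizable-border and maximise-box bits (style AND NOT &50000)."""
    return style & ~(WS_THICKFRAME | WS_MAXIMIZEBOX)


print(flags, hex(WS_THICKFRAME | WS_MAXIMIZEBOX))
```

With the three SWP_NO* flags set, SetWindowPos ignores its position, size and Z-order arguments, which is why they can all be passed as zero.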

Quote:

Actually, I don't really want the dimensions of the window to change

In that case you're going about things in the wrong order. You need to change the window style before the MODE 8:

Code:

SYS "GetWindowLong", @hwnd%, -16 TO ws%
SYS "SetWindowLong", @hwnd%, -16, ws% AND NOT &50000
SYS "SetWindowPos", @hwnd%, 0, 0, 0, 0, 0, 32+7
MODE 8 : OFF

Richard.

Re: GFXLIB (clipped scaling)
Post by David Williams on Sep 18^th, 2008, 4:48pm

Just finished translating the BASIC version of my new bitmap scaler (with edge clipping when required), and it's significantly faster than my previous attempt (as seen in Spacerocks). Two demos:

~~http://www.bb4w-games.com/example39.zip~~

~~http://www.bb4w-games.com/example40.zip~~

Example 39 shows real-time scaling of the ubiquitous ball bitmap, and Example 40 shows full-window scaling plus 'darkening' via GFXLIB_MMXSubtract64.

When I say it's faster than my previous attempt, the new bitmap scaling routine (GFXLIB_PlotScale) is still not as fast as it could be -- and probably slow compared to what an expert coder could achieve. I have a faster method of doing it (prototype version still in BASIC), but whilst it clips correctly (and without memory leaks) at the screen edges, whole scaled pixels just 'pop off' at the left and bottom edges of the viewport rather than sliding off smoothly. I know why this is; I just don't know how to fix it (I tried plotting right-to-left, or top-to-bottom, but that didn't work).
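
One common way to get smooth clipping, offered here only as a hedged Python sketch and not as a description of GFXLIB_PlotScale or of the BASIC prototype, is to clip the destination rectangle to the buffer and then derive each source pixel from the offset relative to the unclipped origin. The function name, the flat row-major buffers and the top-down row order are all assumptions for the example.

```python
def plot_scaled_clipped(dst, dst_w, dst_h, src, src_w, src_h, x0, y0, out_w, out_h):
    """Draw src scaled to out_w x out_h at (x0, y0) in dst, clipped to the dst buffer."""
    # Clip the destination rectangle to the buffer edges
    x_start, x_end = max(x0, 0), min(x0 + out_w, dst_w)
    y_start, y_end = max(y0, 0), min(y0 + out_h, dst_h)
    for y in range(y_start, y_end):
        # Offset is measured from the UNCLIPPED origin, so a partly hidden
        # scaled pixel is trimmed rather than dropped whole
        sy = (y - y0) * src_h // out_h
        for x in range(x_start, x_end):
            sx = (x - x0) * src_w // out_w
            dst[y * dst_w + x] = src[sy * src_w + sx]


# Hypothetical 2x2 bitmap drawn at 4x4, hanging one pixel off the top-left corner
screen = [0] * 16             # 4x4 destination buffer
plot_scaled_clipped(screen, 4, 4, [1, 2, 3, 4], 2, 2, -1, -1, 4, 4)
```

In the example the first scaled pixel is two destination pixels wide, but only one of them is visible, so it appears half-width at the edge instead of vanishing.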

One more note: as usual with my example programs, I get near-perfect VBlank synchronisation on my laptop with its mediocre integrated graphics, and this leads to beautifully slick, fluid animation. However, on my P4-based desktop, with its supposedly superior graphics hardware, I'm almost never able to get decent synchronisation, and usually it's quite terrible. Very annoying!

Regards,

David.

Re: GFXLIB (yet another 'example')
Post by David Williams on Sep 21^st, 2008, 10:05pm

~~http://www.bb4w-games.com/example41.zip~~

The point of this is to demonstrate one of a dozen-or-so new GFXLIB routines; in this case, GFXLIB_PlotScaleColourBlend.

There'll be a dedicated (albeit very incomplete) GFXLIB website up by the end of this month.

David.

Re: GFXLIB (yet another 'example')
Post by 81RED on Sep 25^th, 2008, 07:03am

on Sep 21^st, 2008, 10:05pm, David Williams wrote:

http://www.bb4w-games.com/example41.zip

The point of this is to demonstrate one of a dozen-or-so new GFXLIB routines; in this case, GFXLIB_PlotScaleColourBlend.

There'll be a dedicated (albeit very incomplete) GFXLIB website up by the end of this month.

David.

Looks great as always, and that font looks strangely familiar. :D

Simon

Re: GFXLIB (yet another 'example')
Post by David Williams on Sep 25^th, 2008, 07:13am

on Sep 25^th, 2008, 07:03am, Simon Mathiassen wrote:

Looks great as always, and that font looks strangely familiar. :D

Simon

I like that font.

Re: GFXLIB
Post by admin on Sep 25^th, 2008, 08:34am

Quote:

There'll be a dedicated (albeit very incomplete) GFXLIB website up by the end of this month.

Can I encourage you to write up your library (or at least provide a link to your own site) in the 'Third Party Libraries' section of the BB4W Wiki: http://bb4w.wikispaces.com/Libraries

Richard.

Re: GFXLIB
Post by David Williams on Sep 25^th, 2008, 12:50pm

on Sep 25^th, 2008, 08:34am, Richard Russell wrote:

Can I encourage you to write up your library (or at least provide a link to your own site) in the 'Third Party Libraries' section of the BB4W Wiki: http://bb4w.wikispaces.com/Libraries

Richard.

No encouragement is necessary -- I intend to do just as you suggested.

One potential issue that may be cause for grumbles, is the size of the library: it's currently 872Kb, and will probably be around 2Mb by the time I've finished. Having said that, there are a lot of repeated code segments outside time-critical sections across many of the routines, so some space can be saved there. If I were to re-write GFXLIB, I would have the user INSTALL the 'core library' (comprising a couple of essential common routines), and then the user could CALL library components as required, viz.

REM. Install core library
INSTALL @lib$ + "GFXLIB"
PROCInitGfxLib

REM. Install required library routines
CALL @lib$ + "gfxlib\plot"
CALL @lib$ + "gfxlib\plotscale"
CALL @lib$ + "gfxlib\boxblur3x3"
CALL @lib$ + "gfxlib\alphablend"

And so on.

Regards,

David.

Re: GFXLIB
Post by admin on Sep 25^th, 2008, 5:07pm

Quote:

One potential issue that may be cause for grumbles, is the size of the library: it's currently 872Kb

I assume much of that is assembler source code. If there's little or no BASIC code how about converting the library into a DLL, which (containing only machine code) ought to be substantially smaller?

Given that (I believe) you already use a 'SYS' interface to your routines - rather than CALL for example - conversion to a DLL should be made easier.

The only area where you might perhaps have problems is the use of 'global' (i.e. BASIC) variables which would have to be reworked for a DLL that doesn't share the same address space.

Richard.

Re: GFXLIB
Post by David Williams on Sep 26^th, 2008, 7:12pm

on Sep 25^th, 2008, 5:07pm, Richard Russell wrote:

When I started writing GFXLIB, I set up a jump table of routine addresses at the beginning of the code, then I *SAVEd the assembled code. When I *LOADed the code into memory and tried to execute one of the routines, it didn't work. I can't remember the reason for this; I think it was due to the addresses of the global variables changing, or maybe the jumps were going to incorrect addresses. This version of GFXLIB employs relatively few global variables -- it relies on a single large variables block (varsblk). I'll have another crack at it at some point but, really, it'll have to wait until I'm motivated to undertake the considerable work involved.

David.

Re: GFXLIB
Post by admin on Sep 27^th, 2008, 09:12am

Quote:

I *SAVEd the assembled code. When I *LOADed the code into memory and tried to execute one of the routines, it didn't work

It's jolly difficult to write 'position independent' code for the x86 CPU family, and the overheads involved would adversely affect performance anyway. That's why you really need a relocatable object format (such as a DLL or other PE-file) in which all the cross-references and jump destinations are automatically adjusted by the loader.

Richard.

GFXLIB v1.1.00 released and website launched
Post by David Williams on Oct 3^rd, 2008, 7:10pm

New GFXLIB website:

~~http://www.bb4w-games.com/gfxlib/gfxlib.html~~

Version 1.1.00 is a major update; a lot of work has gone into it, and
whilst there are plenty of new routines to come, it won't be easy for
me to remain motivated to implement them if there's little evidence
of use (you can understand that, can't you?).

I've made a start on the GFXLIB Reference section (on the website),
but there's so much more to do. Meanwhile, there are plenty of
commented example programs (with listings also given in HTML form) to
study.

Some are bound to be offended by the size of the GFXLIB.BBC file --
currently 882Kb and growing. On a personal level though, it wouldn't
bother me if it was 5Mb.

To be done:

* Finish the documentation of current routines
* Write a simple tutorial
* More fully-commented example programs
* Add bitmap rotation routines
* Improve font drawing routines
* Add line, circle and polygon plotters (will need help with these)
* Add bitmap blurring routines
* Possibly polygon texturing routines

I'm also happy to accept routines from anyone who can contribute.
Or, if you're not able to program in assembly language (I barely
can), then send me a working algorithm in BASIC and I'll probably be
able to convert it to assembly language.

Regards,

David.

Re: GFXLIB
Post by admin on Oct 15^th, 2008, 2:23pm

Quote:

No encouragement is necessary -- I intend to do just as you suggested.

I don't want to hurry you more than you're comfortable with, but I notice that GFXLIB still isn't listed in the Libraries page on the Wiki:

http://bb4w.wikispaces.com/Libraries

Richard.

Re: GFXLIB
Post by David Williams on Oct 16^th, 2008, 5:53pm

on Oct 15^th, 2008, 2:23pm, Richard Russell wrote:

I don't want to hurry you more than you're comfortable with, but I notice that GFXLIB still isn't listed in the Libraries page on the Wiki:

http://bb4w.wikispaces.com/Libraries

Richard.

Done.

Regards,
David.

Re: GFXLIB
Post by Michael Hutton on Oct 26^th, 2008, 07:02am

David,

I have made a routine to sort an array of structures by a 4-byte float key. It is whittled down to the bare essentials and is fast! (well, I think so anyway).

It will only accept one sort key parameter and you need to pass the structure as the first parameter.

http://tech.groups.yahoo.com/group/bb4w/files/Libraries/SORTSAF4LIB.bbc

Let me know if you think it is useful at all.

Michael

Re: GFXLIB
Post by David Williams on Oct 27^th, 2008, 11:32am

on Oct 26^th, 2008, 07:02am, Michael Hutton wrote:

David,

I have made a routine to sort an array of structures by a 4-byte float key. It is whittled down to the bare essentials and is fast! (well, I think so anyway).

It will only accept one sort key parameter and you need to pass the structure as the first parameter.

http://tech.groups.yahoo.com/group/bb4w/files/Libraries/SORTSAF4LIB.bbc

Let me know if you think it is useful at all.

Michael

Well, great work I'm sure, and beautiful-looking assembly language to boot, but it's not immediately useful to me because I don't use four-byte floats (although I had to use them of course during my brief adventure with D3DLIB).

I was interested in the instruction timings/clock cycles you neatly gave alongside most of the assembler instructions, but are you sure they're correct (even for the specified Pentium processor)?

For instance, aren't instruction pairs like

add eax,9
shl edx,3

fetched 'simultaneously' into the two separate pipelines (U and V), effectively having both instructions executed in 1 cycle?

Similarly for this pair:

mov edx, ecx
inc ebx

I don't think memory accesses are completed in one cycle either:

mov esi, [ebx]
mov edi, [ebp+7]

As for fld dword [ecx], well, no way is that completed in one cycle!

Please forgive my nitpicking

Regards,

David.

Re: GFXLIB
Post by admin on Oct 27^th, 2008, 2:47pm

Quote:

For instance, aren't instruction pairs like add eax,9 : shl edx,3 fetched 'simultaneously'

I'm not sure that's such a good example: both those instructions affect the flags, so the CPU has to ensure that they are effectively executed in the specified sequence (of course that doesn't necessarily imply more than one clock cycle).

Quote:

Similarly for this pair: mov edx, ecx : inc ebx

That's a better example, because they are genuinely independent. Having said that, 'inc ebx' isn't a particularly fast instruction on some modern processors (because of the need to preserve the carry flag) - 'add ebx,1' may execute faster despite requiring more bytes.

Richard.

Re: GFXLIB
Post by Michael Hutton on Oct 28^th, 2008, 07:53am

on Oct 27^th, 2008, 11:32am, David Williams wrote:

As for fld dword [ecx], well, no way is that completed in one cycle!

Please forgive my nitpicking

I know, I'm not sure I believe it either, and I remember being quite surprised, but I wasn't going to argue. There are obviously qualifiers to the memory access opcodes, but I think I was right in that it didn't seem to require adding any other cycles.

I will definitely go and check again, it's on another computer.

Don't thank me for the elegant code! It's a complete ripoff of SORTLIB, which contains the beef of the QUICKSORT routine. I've only added the addressing of structures...

Re: GFXLIB
Post by admin on Oct 28^th, 2008, 09:17am

Quote:

I will definitely go and check again, it's on another computer.

The only way is to measure it. This paper gives some useful hints, including the use of CPUID as a 'serialising' instruction: http://pasta.east.isi.edu/algorithms/IntegerMath/Timers/rdtscpm1.pdf

Richard.

Re: GFXLIB
Post by Michael Hutton on Oct 30^th, 2008, 12:38pm

Thanks Richard, I had remembered something about 'a time stamp' or CPUID feature and was looking for it again, but that saves me a bit of reading.

I got the original timing from

http://www.packetstormsecurity.com/programming-tutorials/Assembly/fpuopcode.html

FLD     Floating point load

    operand       8087         287        387      486     Pentium
    reg          17-22        17-22       14        4       1      FX
    mem32       (38-56)+EA    38-56       20        3       1      FX
    mem64       (40-60)+EA    40-60       25        3       1      FX
    mem80       (53-65)+EA    53-65       44        6       3      NP

but I will try and use Richard's link..

Michael

Re: GFXLIB
Post by admin on Oct 30^th, 2008, 2:45pm

Code:

operand       8087         287        387      486    Pentium
mem32     (38-56)+EA       38-56       20        3       1
mem64     (40-60)+EA       40-60       25        3       1
mem80     (53-65)+EA       53-65       44        6       3

Don't those timings exclude the memory fetch? Any memory access may incur an overhead, depending on where the data is (L1 cache, L2 cache, main memory) unless it has been 'prefetched'.

Richard.

Re: GFXLIB
Post by Michael Hutton on Oct 30^th, 2008, 11:05pm

Hmm, yes it seems so. So where exactly is the data assumed to be for those timings?

(Should I start a new topic, OPCODE TIMINGS, in the assembler section?)

I spent last night using the timestamp instruction to time fld dword [mem], will post later. Have to go on ward round!

Re: GFXLIB
Post by Michael Hutton on Nov 15^th, 2008, 03:19am

Hello,

I've been using GFXLIB. I've got the latest version and I like the Autoinit bit!

http://tech.groups.yahoo.com/group/bb4w/files/Graphics/Plasmas/

I've been making some simple plasmas. They remind me of the FRACTINT plasmas, but at the moment they are all fixed SIN functions. I am coding a 'realtime' RGB plasma which should change the pattern with time, not just the colours. I was also going to try to use the plasma as an alpha mask...

At the moment I am using my own bit of code to do the palette rotation but I thought I might be able to use GFXLIB_****LUT1/2/3 but I wasn't quite sure how to implement them.

LUT1 looks up the colour value (RGB) from a one-dimensional lookup table, yes? So in effect I was thinking of filling a bitmap with the plasma palette values, each separate byte containing a separate palette code. For example, if the palette colour was 255 the pixel would be &FFFFFF, not just &FF. I would then use LUT1 to find my colour and BPlot the resultant bitmap... I wasn't quite sure from the documentation what exactly I should be doing. Could I use LUT3 instead? Also, I noticed a line REM'd out in the LUT1 function. I presume this is correct, but was just wondering!

I love the Bplotscale(NC) functions. I got some very 'nice' plasma effects in BASIC by drawing a 50x50 grid using PLOT and then scaling it up to full screen. The 'Blockyness' was quite good.

I have found a good way (well, I think it's good) to make a blank bitmap which avoids all the CreateDC, CreateCompatibleBitmap, SelectObject calls (which I HATE by the way, as I always get lost (same as idiv/div - always gets me down... ;D). Just *SCREENSAVE a bit of the screen (I found you have to make it four times the x and y co-ordinates you want) and then load it back in with GFX. I noticed that when I got the size wrong, any variables I subsequently DIM'd ended up in the middle of my bitmap! Very frustrating, with multiple crashes...

Are there any transforming routines (i.e. rotation, sphere, cylinder stuff) coming up? I know I don't ask for much! I have tried a sort of cylinder but can only manage it aligned along the x-axis using Plot**row (can't remember the name sitting here!) and some sin/cos functions. It would be good to be able to wrap it 'properly'.

Anyway, keep up the good work. Is there more GFXLIB documentation coming up?

Re: GFXLIB
Post by David Williams on Nov 16^th, 2008, 12:24am

on Nov 15^th, 2008, 03:19am, Michael Hutton wrote:

I've been making some simple plasmas

They're great! Plasma 1.03 is especially impressive (and the closest I'll ever get to experiencing an acid trip).

on Nov 15^th, 2008, 03:19am, Michael Hutton wrote:

At the moment I am using my own bit of code to do the palette rotation but I thought I might be able to use GFXLIB_****LUT1/2/3 but I wasn't quite sure how to implement them.

Sorry about the lack of documentation concerning these and other GFXLIB routines -- it'll come, eventually.

LUT1 takes an RGB word from a source bitmap (the one you want to plot), and extracts the individual RGB bytes from it, each of course being in the range 0 to 255 (I'm being verbose here for the benefit of others!). The extracted byte values are then used as indices for a single 256-byte lookup table, from which bytes are read and it is these bytes which are written to the destination bitmap (or DIB section). It's important to note that the three RGB components are indices for just one single table -- greater versatility would be had if each component indexed its own table (I'll probably implement this for a future LUT4).

Copy, paste and run this example program:

Code:

      M% = 2
      HIMEM = LOMEM + M%*&FA000
      
      MODE 8 : OFF
      
      INSTALL @lib$+"GFXLIB"
      PROCAutoInit32
      
      REM. Reserve memory for a 640x512 32bpp bitmap
      DIM bm% 4*640*512-1
      
      REM. Reserve space for a 256-byte colour table
      DIM table% 255
      
      REM. Fill the colour table
      FOR I%=0 TO 255
        table%?I% = 255-I%
      NEXT I%
      
      REM. Redirect GFXLIB's output to addr pointed to by bm%
      REM. (Normally, GFXLIB's output is to the DIB section/screen memory)
      SYS GFXLIB_QuickSetDispVars, dispVars{}, bm%
      
      REM. QuickSetDispVars can be used here because bitmap pointed to by bm%
      REM. is the same dimensions as the program window
      
      REM. Draw 100 bitmaps to the bitmap pointed to by bm%
      REM. Note that the global variable demoBm32% points to a 64x64 32bpp bitmap
      REM. set up when PROCAutoInit32 is called (it doesn't just appear out of thin air!)
      
      FOR I%=1 TO 100
        SYS GFXLIB_Plot, dispVars{}, demoBm32%, 64, 64, RND(640), RND(512)
      NEXT I%
      
      REM. Restore GFXLIB's output to the DIB section pointed to by dibSectionAddr%
      SYS GFXLIB_QuickSetDispVars, dispVars{}, dibSectionAddr%
      
      REPEAT
        IF (TIME DIV 100) MOD 2=0 THEN
          REM. Draw the bitmap (bm%) normally using BPlot
          SYS GFXLIB_BPlot, dispVars{}, bm%, 640, 512, 0, 0
        ELSE
          REM. Draw the bitmap (bm%) using BPlotLUT1
          SYS GFXLIB_BPlotLUT1, dispVars{}, bm%, table%, 640, 512, 0, 0
        ENDIF
        
        SYS "InvalidateRect", @hwnd%, 0, 0
        PROCWait(4)
        *REFRESH
      UNTIL FALSE

After running, replace the line

Code:

     table%?I% = 255-I%

with

Code:

     table%?I% = I% DIV 2
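
For anyone who wants the lookup spelled out, here is a minimal pure-BASIC sketch of what the LUT1 description above amounts to for a single pixel. It is not GFXLIB's own code: the helper name FNlut1 and the test pixel value are made up for illustration, and it assumes the three colour bytes sit in the low 24 bits of the pixel word.

```bbcbasic
      REM. Sketch only: emulate the LUT1 idea for one 32bpp pixel
      REM. FNlut1 is a hypothetical helper, not part of GFXLIB
      DIM table% 255

      REM. The same 'inverting' table as in the example program
      FOR I%=0 TO 255
        table%?I% = 255-I%
      NEXT I%

      REM. Components &FF, &80, &00 map to &00, &7F, &FF
      pixel% = &FF8000
      PRINT ~FNlut1(pixel%, table%)
      END

      DEF FNlut1(rgb%, t%)
      LOCAL r%, g%, b%
      REM. Split the pixel into its three byte components
      r% = (rgb% >> 16) AND &FF
      g% = (rgb% >> 8) AND &FF
      b% = rgb% AND &FF
      REM. Each component indexes the one shared 256-byte table
      = (t%?r% << 16) OR (t%?g% << 8) OR t%?b%
```

With the inverting table this should print 7FFF, i.e. each byte replaced by 255 minus its value. Swapping in the I% DIV 2 table halves every component instead.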

LUT2 is similar, except that it uses the RGB components of the background pixels as indices into a 256-byte table, rather than the source bitmap pixels. LUT2 can be used for shadow effects, for example.

LUT3 relies on a 2D table (256*256 bytes) as the RGB components of both the background and source bitmap pixels are taken into account. LUT3 lends itself to various transparency effects, although, as with LUT1 and LUT2, the individual RGB components of the background and source pixels are indices for a single common colour table. Would be nice if they could each have their own table (a future LUT5, perhaps).

If you have the time and inclination, then copy, paste and run this proggy:

Code:

      M% = 2
      HIMEM = LOMEM + M%*&FA000
      
      MODE 8 : OFF
      
      INSTALL @lib$+"GFXLIB"
      PROCAutoInit32
      
      DIM table% 256*256-1
      
      opacity% = 25
      f# = 1.0 - opacity%/100
      
      FOR I%=0 TO 255
        FOR J%=0 TO 255
          table%?(256*I% + J%) = I% + f#*(J%-I%)
        NEXT J%
      NEXT I%
      
      FOR I%=1 TO 100
        SYS GFXLIB_PlotLUT3, dispVars{}, demoBm32%, table%, 64, 64, RND(640)-32, RND(512)-32
      NEXT
      
      SYS "InvalidateRect", @hwnd%, 0, 0
      PROCWait(4)
      *REFRESH

on Nov 15^th, 2008, 03:19am, Michael Hutton wrote:

I love the Bplotscale(NC) functions. I got some very 'nice' plasma effects in BASIC by drawing a 50x50 grid using PLOT and then scaling it up to full screen. The 'Blockyness' was quite good.

Just in case anyone's wondering... the NC part of the routine name GFXLIB_BPlotScaleNC means "Not Clipped" or "No Clipping". It's a fast and rather dangerous routine! But if used carefully (as I'm sure you've done), it's a mighty fast bitmap scaler - almost as fast as straightforward GFXLIB_Plot.

on Nov 15^th, 2008, 03:19am, Michael Hutton wrote:

I have found a good way (well, I think it's good) to make a blank bitmap which avoids all the CreateDC,CreateCompatibleBitmap, SelectObject calls (which I HATE by the way, as I always get lost (same as idiv/div - always gets me down... ;D). Just *SCREENSAVE a bit of the screen (I found you have to make it four times the x and y co-ordinates you want)

I've found a good way of achieving the same thing -- that is simply to reserve a suitably sized area of memory using DIM (or whatever). Most of GFXLIB's routines aren't concerned with bitmap headers, as they assume a 32bpp bitmap, and the width and height of the bitmap are - in most cases - specified when the routine is called.

So this bit of code from Plasma 1.03 ...

Code:

     REM Create blank bitmaps to store the plasma and the buffer bitmap
      PRINT"Creating Blank Bitmap...";
      plasma_file$="plasma.bmp"
      OSCLI "SCREENSAVE """+plasma_file$+""" "+STR$0+","+STR$0+","+STR$(cx%*4)+","+STR$(cy%*4)
      PROCLoadBMP(@dir$+plasma_file$, plasmaBM%, FALSE )
      OSCLI "DEL """+plasma_file$+""""
      PRINT"OK!"

can be replaced by

Code:

      DIM plasmaBM% 4*(cx%*cy% + cx%) +4
      plasmaBM%=(plasmaBM%+3) AND -4

The additional cx% is required in the DIM declaration because your plasma routine is either overrunning its buffer (poking data outside the 640x512 bitmap) or attempting to read data from a location outside the bitmap. I suspect that the former is afoot here.

For those who don't know, the plasmaBM%=(plasmaBM%+3) AND -4 statement ensures that plasmaBM% points to a word-aligned address (always a good idea as non-aligned memory accesses can incur a speed penalty, on some systems). The +4 tacked on the end ensures there's sufficient room in case plasmaBM% does need to be incremented by at most 3 bytes.
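
A quick way to convince yourself of the rounding is to run the expression over a few made-up addresses. This is only an illustrative sketch; the addresses are arbitrary. Since -4 is &FFFFFFFC, the AND clears the bottom two bits, and adding 3 first means the result rounds up rather than down.

```bbcbasic
      REM. Sketch only: effect of (addr% + 3) AND -4 on sample addresses
      REM. Columns: original address (hex), aligned address (hex), bytes skipped
      FOR addr% = &1000 TO &1004
        aligned% = (addr% + 3) AND -4
        PRINT ~addr%, ~aligned%, aligned% - addr%
      NEXT addr%
      REM. &1000 stays put; &1001, &1002 and &1003 all move up to &1004
      REM. so the pointer never advances by more than 3 bytes
```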

on Nov 15^th, 2008, 03:19am, Michael Hutton wrote:

Are there any transforming routines (i.e. rotation, sphere, cylinder stuff) coming up?

I started coding coordinate rotation routines for GFXLIB a few weeks ago, with the idea of providing high-precision but slow-ish (FPU-based) routines, and a parallel set of routines based on fast fixed-point integers. I'll have to resume work on this, but at the moment, my 'batteries' are a bit flat, as it were.

on Nov 15^th, 2008, 03:19am, Michael Hutton wrote:

I have tried a sort of cylinder but can only manage it aligned along the x-axis using Plot**row (can't remember the name sitting here!) and some sin/cos functions. It would be good to be able to wrap it 'properly'.

I've got such a routine to do just that... but it's currently in BASIC! I think it's called HLineShift or something like that (horizontal line shift with wraparound (on same raster)).

on Nov 15^th, 2008, 03:19am, Michael Hutton wrote:

Is there more GFXLIB documentation coming up?

Yes, and I'm sorry I've been dragging my feet on this -- my batteries really are low! Just can't get motivated. I'll do some work on the docs this week -- certainly on the PlotLUT routines. Will try to provide plenty of examples.

Again, your plasmas look great (and not too nausea-inducing) and very slick. I look forward to seeing the shapes/patterns change in subsequent versions!

Regards,

David.

Re: GFXLIB
Post by Michael Hutton on Nov 16^th, 2008, 02:24am

Quote:

I've found a good way of achieving the same thing --

Ah, yes. At the time I was looking for the whole bitmap including the header, but good point!

As with my 'plot' code, I always get mixed up with the addressing. Not good.

Thanks for the examples for LUT1 and LUT2, I think I see what is happening a bit more clearly. I think I will be able to use them. I was thinking that I might be able to get the graphics card to cycle the colour palette rather than me do it but I suppose that involves DirectX (Draw?) which I haven't really got into yet.

I have a working version of the moving plasma but I'll 'tidy' it up today before posting.

Michael

Re: GFXLIB
Post by admin on Nov 16^th, 2008, 09:53am

Quote:

For those who don't know, the plasmaBM%=(plasmaBM%+3) AND -4 statement ensures that plasmaBM% points to a word-aligned address

Bear in mind that in a Windows bitmap (either in the form of a .BMP file or a DIB in memory) every row must be DWORD-aligned.

That doesn't involve any overhead if you're using a 32-bpp bitmap (if the first row is aligned, so must all the others) but if you're using a 24-bpp bitmap then it means every row must be padded to a multiple of 4 bytes if it isn't already.

So the general form of memory allocation for a bitmap is:

Code:

DIM bmp% cy% * ((bpp% DIV 8 * cx% + 3) AND -4) + 3
bmp% = (bmp% + 3) AND -4
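
As a worked illustration of that formula (a sketch with made-up dimensions, not part of any library), the following prints the padded row length and total size for a 5x3 bitmap at several colour depths:

```bbcbasic
      REM. Sketch only: row stride and total size for a 5 x 3 bitmap
      cx% = 5 : cy% = 3
      REM. Columns: bits per pixel, bytes per padded row, bytes in bitmap
      FOR bpp% = 8 TO 32 STEP 8
        stride% = (bpp% DIV 8 * cx% + 3) AND -4
        PRINT bpp%, stride%, cy% * stride%
      NEXT bpp%
      REM.  8 bpp:  5 bytes of pixels padded to  8 per row, 24 in total
      REM. 16 bpp: 10 bytes of pixels padded to 12 per row, 36 in total
      REM. 24 bpp: 15 bytes of pixels padded to 16 per row, 48 in total
      REM. 32 bpp: 20 bytes of pixels, no padding needed,   60 in total
```

The extra + 3 in the DIM statement is separate from the row padding: it leaves room for the start address itself to be rounded up to a multiple of 4.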

Richard.

Re: GFXLIB
Post by David Williams on Nov 17^th, 2008, 5:45pm

on Nov 16^th, 2008, 09:53am, Richard Russell wrote:

So the general form of memory allocation for a bitmap is:

Code:

DIM bmp% cy% * ((bpp% DIV 8 * cx% + 3) AND -4) + 3
bmp% = (bmp% + 3) AND -4

Thanks for that bit of code, Richard.

GFXLIB is intended for use mostly with 32-bpp bitmaps (although there is a handful of routines for 8-bpp bitmaps). The one routine available for simply displaying a 24-bpp bitmap (GFXLIB_BPlotBMP24) can only be safely written to a 32-bpp DIB section or bitmap buffer.

GFXLIB provides the subroutine PROCLoadBMP which will load 8, 24 and 32-bpp bitmaps, and in the first two cases, will convert the bitmap to 32-bpp. 16-bpp isn't currently catered for, but then who uses those nowadays?

David.

Re: GFXLIB
Post by David Williams on Dec 24^th, 2008, 11:51am

The next version of GFXLIB (version 1.2) will feature several new routines, including line, circle and filled triangle plotters, and possibly also routines for rotating lists of 3D coordinates.

Meanwhile, this wee 'demo' indicates the speed of GFXLIB's new line plotter (bear in mind that the coordinates of the line endpoints are updated and checked in BASIC, and that one SYS GFXLIB_Line... statement is issued for each line drawn).

~~http://www.bb4w-games.com/temp/gfxlib_line.zip~~

I'll make some much more interesting demos at a later date.

David.

Re: GFXLIB
Post by David Williams on Apr 29^th, 2009, 01:30am

Here's another example of bitmap shape manipulation using GFXLIB:

~~http://www.bb4w-games.com/gfxlib/ex/32bpp/exe/example32.zip~~

Distorting the bitmap also in the vertical direction might produce a fairly interesting result. Maybe I'll try it.

Re: GFXLIB
Post by Michael Hutton on May 2^nd, 2009, 12:05pm

Nice. That is a really good effect.

What function did you use for that or is it a new one? I would guess at GFXLIB_PlotBMRow?

Re: GFXLIB
Post by David Williams on May 2^nd, 2009, 10:17pm

> Nice. That is a really good effect.
>
> What function did you use for that or is it a new one?
> I would guess at GFXLIB_PlotBMRow?

No, PlotBMRow doesn't do any scaling.

What's used here is BPlotScale:

Code:

SYS GFXLIB_BPlotScale, dispVars{}, bmAddr, bmW, bmH, newBmW, newBmH, x, y

When used ordinarily and correctly, this would render a scaled bitmap in the way you'd expect. However, to draw an individual (scaled) horizontal line of a bitmap (pointed to by bmAddr), set bmH and newBmH to 1, and add a suitable offset to the bmAddr param:

Code:

SYS GFXLIB_BPlotScale, dispVars{}, bmAddr + 4*row%*bmW, bmW, 1, newBmW, 1, x, y

where row% (given from 0 to bmH-1) is the row of pixels that you wish to draw.

I know you know this, but for other people's benefit, the x4 multiplier is due to the fact that we're dealing with four bytes per pixel here (GFXLIB operates mostly on 32-bits-per-pixel bitmaps).
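
To make the offset arithmetic concrete, here is a tiny sketch (plain BASIC, no GFXLIB calls) that prints the byte offset of the first few rows of a 480-pixel-wide 32bpp bitmap. The variable names are illustrative only.

```bbcbasic
      REM. Sketch only: byte offset of each row in a 32bpp bitmap
      bmW% = 480
      FOR row% = 0 TO 2
        REM. 4 bytes per pixel, bmW% pixels per row
        PRINT row%, 4*row%*bmW%
      NEXT row%
      REM. Rows 0, 1 and 2 start at offsets 0, 1920 and 3840
```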

FWIW, here's the drawing loop from Example #32:

Code:

REM. Draw the individually scaled rows of the 480x256 BB4W graphic
FOR I%=0 TO 255
    x1% = 80  + 128*SIN( a1# + I%/64 ) * COS( a3# + I%/32 )
    x2% = 560 + 128*COS( a2# + I%/64 ) * SIN( a3# + I%/32 )
    SYS GFXLIB_BPlotScale, dispVars{}, bitmap%+4*480*I%, 480, 1, x2%-x1%, 1, (640-(x2%-x1%))/2, 128+I%
NEXT

URL: ~~http://www.bb4w-games.com/gfxlib/ex/32bpp/html/example32.html~~

Regards,

David.

~~http://www.bb4w-games.com~~

Re: GFXLIB
Post by David Williams on May 3^rd, 2009, 8:04pm

Would anyone out there like to contribute graphics routines to GFXLIB? Come on folks, this should be a collaborative effort!

Here are some ideas:

* Bitmap blurrers / smoothers (inc. Gaussian blurring)
* Higher quality bitmap scaling -- bicubic interpolation or Richard's somewhat esoteric method
* Lensing effects
* Lighting effects
* Other special effects
* Bitmap distortion routines (e.g., barrel distortion)
* Colour manipulation -- HSV-altering routines, for example
* Faster alpha blending routines (MMX-based, perhaps)
* Robust line, circle and polygon plotters -- with anti-aliasing if possible
* Texture mappers -- even if it's just so-called "affine" texture mapping
* Fast bitmap rotation
* Bitmap rotation *with* anti-aliasing
* 3D routines of various kinds
* Bezier / spline curve rendering
* Fast font renderers

For those not able to implement such routines in assembly language, how about creating a working prototype version in BASIC and have one of the BB4W community's willing assembler experts 'translate' it to glorious assembly language?

Help make GFXLIB into something *special* -- and useful.

Even better, perhaps, someone could start to create a competing graphics library with graphics routines superior to those currently available in GFXLIB. laugh

Regards,

David.

Re: GFXLIB
Post by admin on May 3^rd, 2009, 9:52pm

Quote:

Richard's somewhat esoteric method

I would like to point out that it's hardly 'my' method; it's well known and documented, and based on sound fundamental mathematics. For example, it's been routinely used for years at BBC Research & Development (indeed long before I ever joined that august organisation) and is the method used in all Snell & Wilcox Standards Converters, Aspect Ratio Converters, PAL Decoders etc. (one reason why they have the reputation of being the best of their kind).

Calling it "esoteric" does it an injustice, in my opinion, because it's neither difficult to understand nor particularly complicated. It could also discourage people from trying it, when it's almost always the best choice.

Richard.

Re: GFXLIB
Post by David Williams on May 3^rd, 2009, 11:29pm

on May 3^rd, 2009, 9:52pm, Richard Russell wrote:

My word, you seem to be almost offended (I had half-expected a response just like that one from you!).

Can the "frequency domain method" be implemented in IA-32 assembly language so that it's fast enough for use with games? After all, that's primarily what GFXLIB is intended for, as I've always maintained. I suspect it's overkill under these circumstances.

A fast (bi)cubic interpolator would be sufficient for GFXLIB's primary intended purpose. Tony Tooth's excellent "Spline Resize" program employs cubic spline interpolation, however it may be too slow for real-time use in games (I'm *not* at all knocking his program, just "telling it like it is").

David.

Re: GFXLIB
Post by admin on May 4^th, 2009, 09:28am

Quote:

My word, you seem to be almost offended

It's not a case of being offended, it's a case of not wanting a false impression to be given.

Quote:

Can the "frequency domain method" be implemented in IA-32 assembly language so that it's fast enough for use with games

The 'frequency domain method' is simply an FIR (Finite Impulse Response) filter, as are bicubic interpolation and virtually every other known interpolation technique! Therefore not only does it take exactly the same time (assuming the same number of taps is used) but indeed it uses exactly the same code as bicubic interpolation!

If you refer back to my article, you will see that the only difference between cubic interpolation and 'my' method (with 4 taps) is in the coefficients by which the input samples are multiplied. Simply by altering the values of the coefficients, but otherwise making no changes to the code, the performance is improved.

Of course, if you're actually calculating the coefficients 'at run time' in your program then there could be an impact on complexity and speed, but that's a one-time calculation for any given degree of scaling.

Perhaps you can now see why I reacted to your terminology. To convert bicubic interpolation into a comparable 'frequency-domain-based' method involves no changes to the code and no changes to the speed. You get a performance improvement for no cost whatever, other than the need to calculate the coefficients differently.
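
The "same code, different coefficients" point can be sketched in a few lines of BASIC. This is only an illustration in one dimension: the procedure and function names are hypothetical, the sample values are made up, and the two coefficient sets shown are plain linear interpolation and a Catmull-Rom cubic. The coefficients from the article referred to above are not reproduced in this thread, so they are not shown here.

```bbcbasic
      REM. Sketch only: one 4-tap FIR routine, two sets of coefficients
      REM. FNfir4, PROClinear and PROCcubic are hypothetical names
      DIM s(3), c(3)

      REM. Four neighbouring input samples (made-up values)
      s(0) = 10 : s(1) = 20 : s(2) = 40 : s(3) = 30

      REM. Interpolate halfway between s(1) and s(2)
      t = 0.5

      PROClinear(t, c())
      PRINT "Linear: "; FNfir4(s(), c())

      PROCcubic(t, c())
      PRINT "Cubic:  "; FNfir4(s(), c())
      END

      REM. The filter itself: multiply-accumulate over four taps
      DEF FNfir4(samp(), coef())
      LOCAL k%, acc
      FOR k% = 0 TO 3
        acc += samp(k%) * coef(k%)
      NEXT k%
      = acc

      REM. Linear interpolation written as 4 taps (outer taps are zero)
      DEF PROClinear(frac, coef())
      coef(0) = 0 : coef(1) = 1 - frac : coef(2) = frac : coef(3) = 0
      ENDPROC

      REM. Catmull-Rom cubic coefficients for fractional position frac
      DEF PROCcubic(frac, coef())
      LOCAL f2, f3
      f2 = frac * frac : f3 = f2 * frac
      coef(0) = (2*f2 - f3 - frac) / 2
      coef(1) = (3*f3 - 5*f2 + 2) / 2
      coef(2) = (4*f2 - 3*f3 + frac) / 2
      coef(3) = (f3 - f2) / 2
      ENDPROC
```

With these samples the linear result is 30 and the cubic result is 31.25. FNfir4 is untouched between the two calls; only the contents of c() change, and for a fixed scale factor those can be calculated once and stored in a table.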

Richard.

Re: GFXLIB
Post by David Williams on May 14^th, 2009, 2:32pm

I'm quite pleased with this animated background made in readiness for the completion of my slightly improved 'colour keying' algorithm. It was made almost entirely with GFXLIB - no tricks, except for a very slight Gaussian blur applied in VirtualDub (it didn't occur to me to use the 5x5 box blur routine available in GFXLIB!).

(BTW - anyone fancy contributing a Gaussian blur routine to GFXLIB? Just wondering...)

File name: bg2_3.avi
Format: DivX AVI
Dimensions: 720 x 576 pixels
Duration: 60 seconds
File size: 6 MB

URL: ~~http://www.bb4w-games.com/138519651/bg2_3.avi~~

The freely available DivX codec is required to view this video.

Regards,

David.

~~http://www.bb4w-games.com~~

Re: GFXLIB
Post by David Williams on May 15^th, 2009, 3:54pm

Another background animation (created with GFXLIB and VirtualDub) just crying out to be used with Richard's CSO utility:

~~http://www.bb4w-games.com/138519651/bg3.avi~~

File size: 4.26 MB
Format: DivX AVI
Codec: DivX
Dimensions: 720 x 576
Frame rate: 25 fps
Data rate: 582 kbps
Duration: 60 seconds
No audio

--

Re: GFXLIB
Post by Michael Hutton on May 16^th, 2009, 08:12am

on May 14^th, 2009, 2:32pm, David Williams wrote:

(BTW - anyone fancy contributing a Gaussian blur routine to GFXLIB? Just wondering...)

Didn't Tony Tooth do a Gaussian Blur routine?

Michael

Re: GFXLIB
Post by David Williams on May 16^th, 2009, 10:45am

on May 16^th, 2009, 08:12am, Michael Hutton wrote:

Didn't Tony Tooth do a Gaussian Blur routine?

Michael

I think you may be referring to his image smoothing program (SmoothX), which, as far as I can tell, performs a 3x3 'box blur' with user-specifiable relative weightings for the central and surrounding pixels.

For the benefit of others viewing this thread, with Gaussian blurring you specify a fractional blur radius 'r' (in pixels). The image below demonstrates Gaussian blurring for r=0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0 and 10.0:

~~http://www.bb4w-games.com/138519651/gaussianblurring.jpg~~

(That was done with Adobe Photoshop).

David.

Re: GFXLIB
Post by David Williams on May 17^th, 2009, 10:57am

Okay, here's another video which, for its making, relied heavily on the following tools:

- BBC BASIC for Windows (naturellement)
- My CSO utility
- GFXLIB
- VirtualDub
- Adobe Photoshop
- My fat little fingers

"Obscured by fluffy clouds"

YouTube (view in HQ mode, please)
http://www.youtube.com/watch?v=F-hS1Zhqjvc

Direct download (34 MB, DivX AVI)
~~http://www.bb4w-games.com/138519651/obscured_by_fluffy_clouds.avi~~

Regards,
David.

Re: GFXLIB
Post by Michael Hutton on May 18^th, 2009, 11:13am

Only one suggestion - repaint the guitar! grin

I like the background.

Michael

Re: GFXLIB
Post by David Williams on May 22^nd, 2009, 8:15pm

Here's a little demo of one of the new GFXLIB routines, provisionally named BrushBlur:

~~http://www.bb4w-games.com/138519651/brushblur.zip~~ [1.2 MB]

Re: GFXLIB
Post by David Williams on May 23^rd, 2009, 6:49pm

GFXLIB.BBC file size now exceeds 1 MB (assembled code size 60 Kb).

With so many variables declared (albeit most of them local), there may be trouble ahead.

Sooner rather than later, I'm going to have to consider practically rewriting GFXLIB so that it can be distributed as a compact DLL + packaging. But I wonder if it would really be worth the tremendous effort involved?

David.

Re: GFXLIB
Post by admin on May 23^rd, 2009, 9:26pm

Quote:

With so many variables declared (albeit most of them local), there may be trouble ahead.

What kind of trouble do you anticipate?

Quote:

Sooner rather than later, I'm going to have to consider practically rewriting GFXLIB so that it can be distributed as a compact DLL

Have you considered less drastic solutions? For example you could arrange to assemble the code using CALL filename$ (which would mean the memory occupied by the 'source' would be required only transitorily) and even - with care - discard the memory used by your temporary variables.

By judicious use of these techniques it should be possible to reduce the memory 'footprint' of GFXLIB to little more than the 60 kB code size.

Richard.

Re: GFXLIB
Post by David Williams on May 23^rd, 2009, 9:50pm

on May 23^rd, 2009, 9:26pm, Richard Russell wrote:

What kind of trouble do you anticipate?

You might remember that some time ago, in an early pre-release version of GFXLIB, so many variables were declared that when the main program was compiled with the 'Abbreviate names' option set, one of the variables in the assembler section of the main program was renamed to (IIRC) esi or edi, which of course happens to be a register name! You did mention that well over a thousand variables would have to be declared before such 'collisions' (with names of registers) occur, and you also asked why on earth I needed to declare so many variables in the first place. The number of variables was drastically reduced, but the numbers are creeping up again...

I was going to suggest (or had I already suggested?) that perhaps you could modify the relevant code in the compiler to not replace variables with register names.

on May 23^rd, 2009, 9:26pm, Richard Russell wrote:

Have you considered less drastic solutions? For example you could arrange to assemble the code using CALL filename$ (which would mean the memory occupied by the 'source' would be required only transitorily) and even - with care - discard the memory used by your temporary variables.

Yes, I started to jot down ideas (actually, I made a start on the code a few weeks ago) for a possible fully modularized GFXLIB II, whereby the user can install just the routines he or she requires. There would be a core set of routines mostly for internal use by GFXLIB, and the rest can be chosen as and when.

Regards,

David.

Re: GFXLIB
Post by admin on May 24^th, 2009, 10:09am

Quote:

Yes, I remember that, but I don't believe the cruncher can ever create one of the 32-bit extended register names (eax, ebx, ecx...) because, since they start with the valid hexadecimal character e (in *LOWERCASE mode), they are specifically disallowed.

The first valid register name created by the cruncher is 'GS' which is the 1273rd variable (I think). That really is a bug, because register names like SI and SP are already explicitly tested for and disallowed. I'll make a note to correct that if I ever release another version.

In the meanwhile I'm sure you can keep your number of label names below 1273 by sensible use of macros (with 'local' or 'private' labels as appropriate) or even using array elements as labels as documented on the Wiki.

Richard.

Re: GFXLIB
Post by David Williams on May 25^th, 2009, 12:34am

on May 24^th, 2009, 10:09am, Richard Russell wrote:

Yes, right you are. I had tried to find the e-mail that I originally sent to you which mentioned the actual register name, but it appears that Hotmail has either deleted it from their system, or has made it unavailable to me (I doubt they actually erase any e-mails from their servers).

on May 24^th, 2009, 10:09am, Richard Russell wrote:

In the meanwhile I'm sure you can keep your number of label names below 1273 by sensible use of macros (with 'local' or 'private' labels as appropriate) or even using array elements as labels as documented on the Wiki.

Array elements as labels sounds like a good idea, so I'll consider going that route; I'll consult the Wiki.

David.

Re: GFXLIB
Post by David Williams on May 25^th, 2009, 12:42am

A quick demo of a new routine called PlotBMColumn (plots a single 1-pixel-wide column of pixels from a bitmap):

http://www.bb4w-games.com/138519651/gfxlib_vplot_demo.zip (500 Kb)

If it seems a bit sluggish then bear in mind that the BB4W interpreter is doing a lot of work!

This routine will form the basis of several other routines.
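
For illustration only, here is a minimal C sketch of what a column plot of this sort boils down to. It is not the GFXLIB assembler: the name `plot_column`, the 32-bpp row-major buffers and the simple clipping are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy column `col` of a 32-bpp source bitmap (sw x sh) into a 32-bpp
   destination buffer (dw x dh) so that its first pixel lands at (dx, dy).
   Pixels that would fall outside the destination are skipped. */
static void plot_column(const uint32_t *src, int sw, int sh, int col,
                        uint32_t *dst, int dw, int dh, int dx, int dy)
{
    assert(src != NULL && dst != NULL);
    assert(col >= 0 && col < sw);

    if (dx < 0 || dx >= dw)
        return;                         /* whole column is off-screen */

    for (int y = 0; y < sh; y++) {
        int ty = dy + y;
        if (ty < 0 || ty >= dh)
            continue;                   /* clip this pixel */
        dst[(size_t)ty * dw + dx] = src[(size_t)y * sw + col];
    }
}
```

Note the source is read with a stride of one whole row per pixel, which is less cache-friendly than plotting rows.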

Re: GFXLIB
Post by David Williams on May 28^th, 2009, 02:48am

GFXLIB's alpha blending routines are set to become a hell of a lot faster thanks to this sweet bit of
code I recently discovered on Avery Lee's VirtualDub site (http://tinyurl.com/obqpyt):

Code:

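unsigned blend2(unsigned src, unsigned dst) {
    unsigned alpha = src >> 24;          /* per-pixel alpha from the top byte */
    alpha += (alpha > 0);                /* map 1..255 to 2..256; 0 stays 0 */

    unsigned srb = src & 0xff00ff;       /* source red and blue */
    unsigned sg = src & 0x00ff00;        /* source green */
    unsigned drb = dst & 0xff00ff;       /* destination red and blue */
    unsigned dg = dst & 0x00ff00;        /* destination green */

    /* red and blue share one multiply; green gets the other */
    unsigned orb = (drb + (((srb - drb) * alpha + 0x800080) >> 8)) & 0xff00ff;
    unsigned og = (dg + (((sg - dg ) * alpha + 0x008000) >> 8)) & 0x00ff00;

    return orb+og;
}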

It works very well, and my ASM implementation of the above code is a big improvement over how
GFXLIB's routines currently perform the task (although here the code is modified to work with
a constant alpha value, rather than a per-pixel one as is done in the original code):

Code:

        .blend
        
        ;REM. ESP+4  -> src pxl (RGB32)
        ;REM. ESP+8  -> dst pxl (RGB32)
        ;REM. ESP+12 -> alpha (0-255)
        
        mov ebp, [esp + 12]          ; alpha value (0 to 255)
        mov esi, [esp + 4]           ; src pxl &xxRRGGBB
        mov edi, [esp + 8]           ; dest pxl &xxRRGGBB
        
        mov eax, esi                 ; copy ESI
        and eax, &FF00FF             ; EAX = srb
        and esi, &00FF00             ; ESI = sg
        
        mov edx, edi                 ; copy EDI
        and edi, &FF00FF             ; EDI = drb
        and edx, &00FF00             ; EDX = dg
        
        ;REM. EAX = srb
        ;REM. ESI = sg
        ;REM. EDI = drb
        ;REM. EDX = dg
        
        sub eax, edi                 ; srb - drb
        sub esi, edx                 ; sg - dg
        
        imul eax, ebp                ; (srb - drb)*alpha
        imul esi, ebp                ; (sg - dg)*alpha
        
        add eax, &800080             ; (srb - drb)*alpha + &800080
        add esi, &008000             ; (sg - dg)*alpha + &008000
        
        shr eax, 8                   ; ((srb - drb)*alpha + &800080) >> 8
        shr esi, 8                   ; ((sg - dg)*alpha + &008000) >> 8
        
        add eax, edi                 ; drb + ((srb - drb)*alpha + &800080) >> 8
        add esi, edx                 ; dg + ((sg - dg)*alpha + &008000) >> 8
        
        and eax, &FF00FF             ; (drb + ((srb - drb)*alpha + &800080) >> 8) AND &FF00FF
        and esi, &00FF00             ; (dg + ((sg - dg)*alpha + &008000) >> 8) AND &00FF00
        
        add eax, esi
        
        ret 12
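
For comparison, here is a C sketch of the same constant-alpha variant, following the register steps of the listing above one for one. The name `blend_const` is made up for this example, and the alpha range of 0 to 255 is taken from the comments in the listing.

```c
#include <assert.h>
#include <stdint.h>

/* Blend src over dst using a single alpha (0..255) supplied by the caller,
   rather than the alpha byte of each source pixel. Pixels are &xxRRGGBB;
   the top byte of the result is always zero. */
static uint32_t blend_const(uint32_t src, uint32_t dst, uint32_t alpha)
{
    assert(alpha <= 255);

    uint32_t srb = src & 0xFF00FFu;   /* EAX = srb */
    uint32_t sg  = src & 0x00FF00u;   /* ESI = sg  */
    uint32_t drb = dst & 0xFF00FFu;   /* EDI = drb */
    uint32_t dg  = dst & 0x00FF00u;   /* EDX = dg  */

    /* Unsigned wrap-around stands in for the 32-bit register arithmetic. */
    uint32_t orb = (drb + (((srb - drb) * alpha + 0x800080u) >> 8)) & 0xFF00FFu;
    uint32_t og  = (dg  + (((sg  - dg ) * alpha + 0x008000u) >> 8)) & 0x00FF00u;

    return orb + og;
}
```

Because alpha stops at 255 here and is not bumped to 256, a fully opaque blend is very slightly short of the source colour: white over black at alpha 255 comes out as 0xFEFEFE.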

If anyone can spot any optimisations that can be made, perhaps shaving off an instruction or two,
then please let me know!

Regards,

David.

Re: GFXLIB
Post by David Williams on Jun 2^nd, 2009, 11:16pm

More trivial nonsense:

~~http://www.bb4w-games.com/138519651/example48.zip~~ (529Kb)

Re: GFXLIB
Post by admin on Jun 3^rd, 2009, 08:21am

Quote:

More trivial nonsense:
http://www.bb4w-games.com/138519651/example48.zip (529Kb)

Very nice.

Richard.

Re: GFXLIB
Post by David Williams on Jun 3^rd, 2009, 5:45pm

Example 49:

~~http://www.bb4w-games.com/138519651/example49.zip~~ (557Kb)

Just demo'ing a few new routines, namely:

GFXLIB_BrushBlur
GFXLIB_BPlotBMRowList
GFXLIB_PlotBMColumnList
GFXLIB_Line

David.

Re: GFXLIB
Post by David Williams on Jun 6^th, 2009, 8:01pm

Another day, another new routine for an already bloated GFXLIB. This one's called GFXLIB_RotateScaleTile:

~~http://www.bb4w-games.com/138519651/rotatescaletile.zip~~

It uses a quite fast method of rotating a bitmap, although my code is far from optimal. Still, this
demo averages only 16% CPU load on my laptop which is rather good in my opinion.

David.

Re: GFXLIB
Post by Michael Hutton on Jun 7^th, 2009, 6:41pm

Another great routine! When is the next version of GFXLIB coming out?

Michael

Re: GFXLIB
Post by David Williams on Jun 7^th, 2009, 8:34pm

on Jun 7^th, 2009, 6:41pm, Michael Hutton wrote:

Another great routine! When is the next version of GFXLIB coming out?

Thanks, Michael.

The next version (i.e., the second publicly released version) will probably be out
in a few months; plenty of work to do until then.

Currently trying to get an MMX-powered 'alpha blending' routine up and running.

David.

Re: GFXLIB
Post by admin on Jun 7^th, 2009, 9:58pm

Quote:

Currently trying to get an MMX-powered 'alpha blending' routine up and running.

I presume you are aware of (and are probably using) this document:

ftp://download.intel.com/ids/mmx/MMX_App_Alpha_Blending.pdf

Richard.

Re: GFXLIB
Post by David Williams on Jun 7^th, 2009, 10:31pm

on Jun 7^th, 2009, 9:58pm, Richard Russell wrote:

I presume you are aware of (and are probably using) this document:

ftp://download.intel.com/ids/mmx/MMX_App_Alpha_Blending.pdf

Yes, thanks, I've downloaded that document twice now... first time was some months ago,
and then again a few days ago. I was put off by the fact that it 'outputs' 16-bit 5:5:5 colour
values, rather than 32-bit 8:8:8:8.

I'm trying to adapt the alpha-blending code I posted here a few days ago -- it should be easy,
or at least it will be when I've worked out how to do MMX multiplies.

David.

Re: GFXLIB
Post by admin on Jun 8^th, 2009, 09:10am

Quote:

I was put off by the fact that it 'outputs' 16-bit 5:5:5 colour values, rather than 32-bit 8:8:8:8.

To be precise, both the 'background' input and the output are 16-bpp, whereas the 'foreground' input is 32-bpp ARGB. No doubt this is because, when that article was written, 16-bpp was a common setting for PC displays.

It's not too difficult to strip out the code that unpacks and packs the 16-bpp pixels and replace it with code for 32-bpp RGB (with the 'alpha' byte unused). The resulting code is simpler, too.

Later: I've modified the Intel code to work with 32-bpp input and output. On my PC it's taking 2.2ms for a 640x480 image. Are you interested in it, or have you got your own working?

Richard.

Re: GFXLIB
Post by David Williams on Jun 8^th, 2009, 4:05pm

on Jun 8^th, 2009, 09:10am, Richard Russell wrote:

Later: I've modified the Intel code to work with 32-bpp input and output. On my PC it's taking 2.2ms for a 640x480 image. Are you interested in it... ?

In one word: Yes !

I am definitely interested in it, thank you.

Now...

I recently knocked up an MMX version of a supposedly optimised '50%' alpha blender (simply averages
the RGB32 colour values of corresponding foreground and background pixels). Here's the inner loop
from the non-MMX version:

Code:

        mov edx, [edi + 4*esi]                  ; load RGB32 pixel from source bitmap
        mov ebx, [ecx + 4*esi]                  ; load RGB32 pixel from dest addr
        and edx, &FEFEFE
        and ebx, &FEFEFE
        shr edx, 1
        shr ebx, 1
        add edx, ebx
        mov [ecx + 4*esi], edx                  ; write RGB32 pixel to destination bitmap buffer
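
The same averaging trick, written out as a small C sketch for readers who don't speak x86. The names `avg50` and `avg50_row` are invented for the example; the mask and shift are the ones used in the loop above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Average two xRGB pixels channel by channel. Clearing the lowest bit of
   each channel first means the shift cannot drag a bit from one channel
   into the next. */
static uint32_t avg50(uint32_t a, uint32_t b)
{
    return ((a & 0xFEFEFEu) >> 1) + ((b & 0xFEFEFEu) >> 1);
}

/* Blend n source pixels 50/50 into the destination buffer. */
static void avg50_row(uint32_t *dst, const uint32_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = avg50(src[i], dst[i]);
}
```

The price of the masking is that the lowest bit of each channel is thrown away, so averaging a pixel with itself can come out one level darker per channel.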

Here's my MMX version, operating on four pixels per iteration of the inner X-loop:

Code:

        .GFXLIB_MMXBPlotAvgNC__xloop
        
        mov ebx, ecx
        shl ebx, 4
        
        movq mm1, [edi + ebx + 0]             ; load 2 pxls from bg      \
        movq mm2, [esi + ebx + 0]             ; load 2 pxls from srcBm    \
        ;                                     ;                            > 4 pixels
        movq mm3, [edi + ebx + 8]             ; load 2 pxls from bg       /
        movq mm4, [esi + ebx + 8]             ; load 2 pxls from srcBm   /
        
        pand mm1, mm0
        pand mm2, mm0
        pand mm3, mm0
        pand mm4, mm0
        
        psrld mm1, 1
        psrld mm2, 1
        psrld mm3, 1
        psrld mm4, 1
        
        paddd mm1, mm2
        paddd mm3, mm4
        
        movq [edi + ebx + 0], mm1
        movq [edi + ebx + 8], mm3
        
        dec ecx
        jge GFXLIB_MMXBPlotAvgNC__xloop

To my amazement, it's really no faster (or just marginally so) than the non-MMX version.

I was expecting something approaching a 2x speed improvement. :-(

David.

Re: GFXLIB
Post by admin on Jun 8^th, 2009, 5:35pm

Quote:

To my amazement, it's really no faster (or just marginally so) than the non-MMX version.

I'm amazed that you were amazed! Your code really takes no advantage of the MMX features at all (e.g. parallel operation on up to 4 independent values, automatic clipping of results etc.). Really all you're doing is using 64-bit wide registers rather than 32-bit wide registers.

In addition to this, the operations you are performing (basically just an AND and a SHIFT) are so simple the likelihood is that the speed is determined largely by the bottleneck of getting the data out of and into memory, and not by the processing. Since exactly the same amount of data is transferred in each case, it's not surprising the speed is similar.

(Incidentally I assume you ensured your data was always QWORD-aligned; if it isn't the MMX method will be substantially slower than it otherwise would be).

Quote:

I am definitely interested in it, thank you.

Here's my alphablending code. I would hope and expect it to be substantially faster than a non-MMX version:

Code:

        ;
        ; eax = pointer to array of 32-bit ARGB pixels ('foreground')
        ; ebx = pointer to array of 32-bit xRGB pixels ('background')
        ; ecx = pixel count DIV 2
        ;
        .roundf  dd &00800080 : dd &00000080
        ;
        .blend
        mov esi, eax       ; foreground pointer
        mov edi, ebx       ; background pointer
        
        movq mm4, [roundf] ; mm4 = 0000 0080 0080 0080 (rounding factor)
        pxor mm5, mm5      ; mm5 = 0000 0000 0000 0000
        
        .blendloop
        movq mm6, [esi]    ; mm6 = a2r2 g2b2 a1r1 g1b1 (foreground)
        movq mm7, [edi]    ; mm7 = xxR2 G2B2 xxR1 G1B1 (background)
        
        movq mm0, mm6      ; mm0 = xxxx xxxx a1r1 g1b1
        movq mm2, mm7      ; mm2 = xxxx xxxx xxR1 G1B1
        
        punpcklbw mm0, mm5 ; mm0 = 00a1 00r1 00g1 00b1 (p1)
        punpcklbw mm2, mm5 ; mm2 = 00xx 00R1 00G1 00B1 (q1)
        
        movq mm1, mm0      ; mm1 = 00a1 xxxx xxxx xxxx
        punpckhwd mm1, mm1 ; mm1 = 00a1 00a1 xxxx xxxx
        punpckhdq mm1, mm1 ; mm1 = 00a1 00a1 00a1 00a1
        
        psubw mm0, mm2     ; mm0 = p1 - q1
        psllw mm2, 8       ; mm2 = q1 * 256
        paddw mm2, mm4     ; mm2 = q1 * 256 + 128
        
        pmullw mm0, mm1    ; mm0 = (p1 - q1) * a1
        paddw mm2, mm0     ; mm2 = (p1 - q1) * a1 + q1 * 256 + 128
        psrlw mm2, 8       ; mm2 = xxxx 00R1 00G1 00B1
        
        psrlq mm6, 32      ; mm6 >>= 32 for second pixel
        psrlq mm7, 32      ; mm7 >>= 32 for second pixel
        
        movq mm0, mm6      ; mm0 = xxxx xxxx a2r2 g2b2
        movq mm3, mm7      ; mm3 = xxxx xxxx xxR2 G2B2
        
        punpcklbw mm0, mm5 ; mm0 = 00a2 00r2 00g2 00b2 (p2)
        punpcklbw mm3, mm5 ; mm3 = 00xx 00R2 00G2 00B2 (q2)
        
        movq mm1, mm0      ; mm1 = 00a2 xxxx xxxx xxxx
        punpckhwd mm1, mm1 ; mm1 = 00a2 00a2 xxxx xxxx
        punpckhdq mm1, mm1 ; mm1 = 00a2 00a2 00a2 00a2
        
        psubw mm0, mm3     ; mm0 = p2 - q2
        psllw mm3, 8       ; mm3 = q2 * 256
        paddw mm3, mm4     ; mm3 = q2 * 256 + 128
        
        pmullw mm0, mm1    ; mm0 = (p2 - q2) * a2
        paddw mm3, mm0     ; mm3 = (p2 - q2) * a2 + q2 * 256 + 128
        psrlw mm3, 8       ; mm3 = xxxx 00R2 00G2 00B2
        
        packuswb mm2, mm3  ; mm2 = xxR2 G2B2 xxR1 G1B1 (result)
        movq [edi], mm2    ; save result
        mov byte [edi+3],0 ; zero 'alpha' of pixel 1 (if necessary)
        mov byte [edi+7],0 ; zero 'alpha' of pixel 2 (if necessary)
        
        add esi, 8         ; next pixel-pair (foreground)
        add edi, 8         ; next pixel-pair (background)
        loop blendloop
        
        emms
        ret

I haven't done any extensive 'pairing' optimisation so there's a possibility it could be made more efficient.
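
As a reading aid, here is a scalar C model of what the listing does to one colour channel, going by its comments. The function name `blend_channel` is invented; the arithmetic is 16-bit and wrapping, as it is in each MMX word lane.

```c
#include <assert.h>
#include <stdint.h>

/* One channel of the blend: p = foreground, q = background, a = alpha. */
static uint8_t blend_channel(uint8_t p, uint8_t q, uint8_t a)
{
    /* The true value q*(256-a) + p*a + 128 always fits in 16 bits, which is
       why the wrapped intermediate (p - q) * a does no harm. */
    assert((uint32_t)q * (256u - a) + (uint32_t)p * a + 128u <= 65535u);

    uint16_t diff = (uint16_t)(p - q);               /* psubw  */
    uint16_t prod = (uint16_t)((uint32_t)diff * a);  /* pmullw */
    uint16_t base = (uint16_t)((q << 8) + 128);      /* psllw, paddw */
    uint16_t sum  = (uint16_t)(base + prod);         /* paddw  */
    return (uint8_t)(sum >> 8);                      /* psrlw  */
}
```

An alpha of 0 returns the background exactly. An alpha of 255 lands within one level of the foreground, since the divisor is 256 and not 255.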

Richard.

Re: GFXLIB
Post by David Williams on Jun 9^th, 2009, 06:24am

on Jun 8^th, 2009, 5:35pm, Richard Russell wrote:

I thought I was operating on two pixels in parallel (four pixels per iteration of the inner loop); the operations on the second pixel of each pair being almost 'free'.

on Jun 8^th, 2009, 5:35pm, Richard Russell wrote:

(Incidentally I assume you ensured your data was always QWORD-aligned; if it isn't the MMX method will be substantially slower than it otherwise would be).

No, I didn't! But I sure as heck do now. I've modified GFXLIB's bitmap loading routine to ensure that the start address of the data is divisible by 8. I notice that the DIB section ('screen memory') base address always seems to be QWORD-aligned, too. So, all data is now QWORD-aligned.
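
The alignment fix itself is just "round the address up to the next multiple of 8". A C sketch of the idea, with invented names (`align_up8`, `alloc_qword_aligned`) and ordinary malloc standing in for however GFXLIB actually reserves its memory:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round an address up to the next multiple of 8 (a QWORD boundary). */
static uintptr_t align_up8(uintptr_t addr)
{
    return (addr + 7u) & ~(uintptr_t)7u;
}

/* Reserve `bytes` of pixel storage whose start address is divisible by 8.
   *raw receives the pointer that must later be handed to free(). */
static void *alloc_qword_aligned(size_t bytes, void **raw)
{
    assert(raw != NULL);
    *raw = malloc(bytes + 7);          /* 7 spare bytes cover the worst case */
    if (*raw == NULL)
        return NULL;
    return (void *)align_up8((uintptr_t)*raw);
}
```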

on Jun 8^th, 2009, 5:35pm, Richard Russell wrote:

Here's my alphablending code. I would hope and expect it to be substantially faster than a non-MMX version:

Beautiful... thanks. I've done some timing and frame rate tests involving your MMX routine, a more recent non-MMX routine that I had believed was fast and not far from optimal, and what I thought was the stinky old GFXLIB alphablend routine (GFXLIB_BPlotAlphaBlend2). The results are surprising and a bit puzzling.

Here are the timing results of alphablending two 640x512 32bpp bitmaps, 1000 blends (times given in seconds):

Your MMX routine: 2.4
New supposedly fast non-MMX routine: 5.6
'Old' GFXLIB_BPlotAlphaBlend2 routine: 3.0

So, MMX wins. But... it doesn't seem to be that much faster than GFXLIB_BPlotAlphaBlend2.

Compare the code from the inner loop of the new non-MMX alphablend routine with that of the old GFXLIB_BPlotAlphaBlend2:

New non-MMX

Code:

        ._esp dd 0         ; will be temporarily stashing ESP in here
        
        .blend2
        
        ; EAX = fg
        ; EBX = bg
        ; ECX = pixel count
        
        mov [_esp], esp               ; naughty ;-)
        
        mov esp, eax                  ; ESP = fg
        
        .blend2_lp
        mov esi, [esp + 4*ecx - 4]    ; fg pxl (ARGB)
        mov edi, [ebx + 4*ecx - 4]    ; bg pxl (xRGB)
        
        mov ebp, esi                  ; extract alpha value from src ARGB pxl
        ; EBP is shifted (>> 24) later (aids pipelining ?)

        mov eax, esi                 ; copy ESI
        and eax, &FF00FF             ; EAX = srb
        and esi, &00FF00             ; ESI = sg
        
        mov edx, edi                 ; copy EDI
        and edi, &FF00FF             ; EDI = drb
        and edx, &00FF00             ; EDX = dg
        
        shr ebp, 24
        adc ebp, 0
        
        ;REM. EAX = srb
        ;REM. ESI = sg
        ;REM. EDI = drb
        ;REM. EDX = dg
        
        sub eax, edi                 ; srb - drb
        sub esi, edx                 ; sg - dg
        
        imul eax, ebp                ; (srb - drb)*alpha
        imul esi, ebp                ; (sg - dg)*alpha
        
        add eax, &800080             ; (srb - drb)*alpha + &800080
        add esi, &008000             ; (sg - dg)*alpha + &008000
        
        shr eax, 8                   ; ((srb - drb)*alpha + &800080) >> 8
        shr esi, 8                   ; ((sg - dg)*alpha + &008000) >> 8
        
        add eax, edi                 ; drb + ((srb - drb)*alpha + &800080) >> 8
        add esi, edx                 ; dg + ((sg - dg)*alpha + &008000) >> 8
        
        and eax, &FF00FF             ; (drb + ((srb - drb)*alpha + &800080) >> 8) AND &FF00FF
        and esi, &00FF00             ; (dg + ((sg - dg)*alpha + &008000) >> 8) AND &00FF00
        
        add eax, esi
        
        mov [ebx + 4*ecx - 4], eax   ; write alpha-blended pixel
        
        loop blend2_lp
        
        mov esp, [_esp]
        
        ret

Old GFXLIB_BPlotAlphaBlend2

Code:

        .GFXLIB_BPlotAlphaBlend2__xloop

        movzx edx, BYTE [edi + 4*esi + 3]       ; load alpha mask byte
        neg edx
        add edx, 255
        imul edx, (1.0/255.0)*(2^20)            ; = mulfac
        
        movzx eax, BYTE [edi + 4*esi + 2]       ; load src bmp Red byte
        movzx ebx, BYTE [ecx + 4*esi + 2]       ; load dst bmp Red byte
        sub ebx, eax                            ; (dst - src)
        imul ebx, edx                           ; mulfac*(dst - src)
        shr ebx, 20                             ; (mulfac*(dst - src)) >> 20
        add eax, ebx
        mov BYTE [ecx + 4*esi + 2], al
        
        movzx eax, BYTE [edi + 4*esi + 1]       ; load src bmp Green byte
        movzx ebx, BYTE [ecx + 4*esi + 1]       ; load dst bmp Green byte
        sub ebx, eax                            ; (dst - src)
        imul ebx, edx                           ; mulfac*(dst - src)
        shr ebx, 20                             ; (mulfac*(dst - src)) >> 20
        add eax, ebx
        mov BYTE [ecx + 4*esi + 1], al
        
        movzx eax, BYTE [edi + 4*esi + 0]       ; load src bmp Blue byte
        movzx ebx, BYTE [ecx + 4*esi + 0]       ; load dst bmp Blue byte
        sub ebx, eax                            ; (dst - src)
        imul ebx, edx                           ; mulfac*(dst - src)
        shr ebx, 20                             ; (mulfac*(dst - src)) >> 20
        add eax, ebx
        mov BYTE [ecx + 4*esi + 0], al
        
        dec esi                                 ; X -= 1
        jge GFXLIB_BPlotAlphaBlend2__xloop      ; loop if X >= 0

How could GFXLIB_BPlotAlphaBlend2 possibly be faster than the new non-MMX routine? I'd like to know where the bottleneck lies with the new non-MMX routine.

BPlotAlphaBlend2 has 10 memory accesses (7 reads, 3 writes) per pixel.

New non-MMX has just 3 memory accesses (2 reads, 1 write) per pixel.

BPlotAlphaBlend2 has 4 multiply instructions (IMUL) per pixel. New non-MMX has 2.

Both routines have roughly the same number of instructions (26) in the inner loop.
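
To make the comparison concrete, here is a C model of what the old routine computes for one colour channel. It is only a sketch: the name `blend_channel_q20` is invented, and the constant 4112 assumes the assembler truncates (1.0/255.0)*(2^20) = 4112.06... to an integer.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed integer value of (1.0/255.0)*(2^20) after truncation. */
#define RECIP255_Q20 4112

/* One channel of the old blend: src + (1 - alpha/255) * (dst - src),
   done in 20-bit fixed point. */
static uint8_t blend_channel_q20(uint8_t src, uint8_t dst, uint8_t alpha)
{
    int32_t mulfac = (255 - alpha) * RECIP255_Q20;  /* neg, add 255, imul */
    int32_t prod   = mulfac * ((int32_t)dst - src); /* imul ebx, edx */

    /* The product stays well inside 32 bits. */
    assert(prod > -(1 << 28) && prod < (1 << 28));

    uint32_t shifted = (uint32_t)prod >> 20;        /* logical shift, like SHR */
    return (uint8_t)(src + shifted);                /* only the low byte is stored */
}
```

Storing only the low byte is what makes the logical shift safe for negative differences: the junk in the upper bits never reaches memory.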

ADC

The ADC instruction in the non-MMX routine is quite an execution speed killer, it would seem. Removing the ADC ebp,0 instruction (which is probably largely superfluous) lops off an incredible 2 seconds! So the new league table is as follows:

Your MMX routine: 2.4
'Old' GFXLIB_BPlotAlphaBlend2 routine: 3.0
New non-MMX routine (without ADC instruction): 3.5
New non-MMX routine: 5.6

Still, GFXLIB_BPlotAlphaBlend2 is the faster non-MMX routine.
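
One detail worth keeping in mind when dropping the ADC: SHR leaves the last bit shifted out in the carry flag, so after shr ebp,24 the carry holds bit 23 of the pixel (the top bit of red). The shr/adc pair is therefore not the same adjustment as alpha += (alpha > 0) in the C original. A small C sketch of the two, with made-up function names:

```c
#include <assert.h>
#include <stdint.h>

/* What "shr ebp,24 : adc ebp,0" leaves in EBP for the pixel `src`. */
static uint32_t alpha_shr_adc(uint32_t src)
{
    uint32_t carry = (src >> 23) & 1u;  /* CF = last bit shifted out = bit 23 */
    return (src >> 24) + carry;
}

/* The adjustment made by the C routine: alpha += (alpha > 0). */
static uint32_t alpha_c_original(uint32_t src)
{
    uint32_t alpha = src >> 24;
    return alpha + (alpha > 0);
}
```

For an opaque pixel with red below 128, the first gives 255 where the second gives 256.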

Timing and frame rate test programs (executables)

The frame rate test programs are largely identical, except a different alphablending routine is used in each case.

Richard's MMX routine (averages 226 fps on my PC):
~~http://www.bb4w-games.com/138519651/alphablend_mmx.zip~~ (1 MB)

'Old' GFXLIB_BPlotAlphaBlend2 routine (averages 206 fps on my PC):
~~http://www.bb4w-games.com/138519651/alphablend_bplotalphablend2.zip~~ (1 MB)

New non-MMX routine (averages 127 fps on my PC):
~~http://www.bb4w-games.com/138519651/alphablend_non-mmx.zip~~ (1 MB)

New non-MMX routine without ADC instruction (averages 174 fps on my PC):
~~http://www.bb4w-games.com/138519651/alphablend_non-mmx_no-adc.zip~~ (1 MB)

Timing tests:

~~http://www.bb4w-games.com/138519651/alphablend_timingtest_2.zip~~ (1 MB)

By the way, I've been repeatedly referring to the 'New' non-MMX routine -- I don't intend to replace BPlotAlphaBlend2 with it until I can figure out (or someone else can tell me) where the bottleneck lies with it. Just to reiterate, New non-MMX *should* be faster than BPlotAlphaBlend2, but it isn't.

Anyway, Richard, your MMX-powered alphablending routine would be a fine and very welcome addition to GFXLIB (if I may !?).

Thanks again.

David.

Re: GFXLIB
Post by admin on Jun 9^th, 2009, 08:52am

Quote:

I thought I was operating on two pixels in parallel

Because of the processor's superscalar architecture, it's entirely possible your non-MMX version was processing "two pixels in parallel" too. Anyway, as I said, they may both have been memory-bandwidth bound, in which case the speed would be much the same.

Quote:

The ADC instruction in the non-MMX routine is quite an execution speed killer, it would seem.

As I expect you appreciate, it's not that ADC is particularly slow but it's because of the dependencies it introduces. You have to think of the effect of the instruction not on its own, but in the context of the instructions which surround it.

In this particular case the main significance is that since adc ebp,0 depends on the state of the carry flag, and since the preceding shr ebp,24 affects the carry flag, the two instructions are forced to be serialised and run on the same execution unit.

I expect that, when you remove the ADC, it gives the processor more opportunity to take advantage of out-of-order execution, and of scheduling instructions on the different execution units. Therefore the speed improvement is disproportionate to the time taken by the ADC in isolation.

It's for this kind of reason that modern compilers can sometimes beat 'hand assembly' for speed. They understand (better than the average human programmer!) the internal architecture of the CPU, and when it can be of benefit to change the sequence of instructions to improve performance even if the clarity of the code suffers.

Whether a similar issue explains the 'anomalous' speed difference between your two non-MMX versions I can't say, but it may do.

Quote:

Anyway, Richard, your MMX-powered alphablending routine would be a fine and very welcome addition to GFXLIB (if I may !?).

Of course you may, with the appropriate grovelling acknowledgement!!

Richard.

Re: GFXLIB
Post by David Williams on Jan 1^st, 2010, 6:57pm

~~http://www.bb4w-games.com/bb4wprogs/superposealphamapdemo.zip~~

This demo employs a few new(-ish) GFXLIB routines:

* Richard's MMXAlphaBlend
* PlotBlendLD (LD = Luminosity-Dependent)
* MMXScale2X -- fast 2x scaling of a bitmap
* SuperposeAlphaMap -- Superposes (or should that be Superimposes?) an alpha map (or mask) over a bitmap
* BoxBlur3x3

This is all rather burdensome for the CPU, and I'm sure a similar effect can be achieved much more efficiently by other means, but really the point of the demo is to er... demonstrate SuperposeAlphaMap.

Regards,
David.

Re: GFXLIB
Post by David Williams on Jan 2^nd, 2010, 07:59am

This one's a little bit prettier:

~~http://www.bb4w-games.com/bb4wprogs/superposealphamapdemo2.zip~~ -- (1.3 MB)

David.
--

Re: GFXLIB
Post by David Williams on Jan 9^th, 2010, 8:38pm

Another day, another GFXLIB routine.

This one's called PlotRearrangeInvertRGB.

(Yikes !)

It rearranges the RGB colour components of every non-black pixel in a given bitmap, and then inverts the specified RGB components, then plots the pixel.

Code:

SYS GFXLIB_PlotRearrangeInvertRGB%, dispVars{}, bmAddr, bmW, bmH, x, y, rgbRearrangementCode, rgbInversionFlags

The RGB rearrangement parameter is an integer in the range 0 to 5:

0. RGB (no rearrangement)
1. RBG
2. GRB
3. GBR
4. BRG
5. BGR

The RGB inversion flags parameter:

bit 0 ---> invert Red value
bit 1 ---> invert Green value
bit 2 ---> invert Blue value

So supplying a value of 1 for the inversion flags parameter will result in the red channel being inverted, 6 would result in both green and blue channels being inverted, etc.
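
Here is the per-pixel operation as a short C sketch, for illustration. It makes two assumptions the description above doesn't settle: that a code such as "GBR" means the output red, green and blue take the source's green, blue and red respectively, and that the inversion flags apply to the output channels after rearrangement. The name `rearrange_invert` is invented.

```c
#include <assert.h>
#include <stdint.h>

/* Rearrange and optionally invert the colour channels of one xRGB pixel.
   code  : 0=RGB 1=RBG 2=GRB 3=GBR 4=BRG 5=BGR
   flags : bit 0 inverts red, bit 1 green, bit 2 blue
   Black pixels are returned unchanged (they are not transformed). */
static uint32_t rearrange_invert(uint32_t px, int code, unsigned flags)
{
    /* order[code][i] = source channel (0=R, 1=G, 2=B) feeding output i */
    static const uint8_t order[6][3] = {
        {0, 1, 2}, {0, 2, 1}, {1, 0, 2}, {1, 2, 0}, {2, 0, 1}, {2, 1, 0}
    };
    assert(code >= 0 && code <= 5);

    if ((px & 0xFFFFFFu) == 0)
        return px;

    uint8_t in[3] = {
        (uint8_t)(px >> 16), (uint8_t)(px >> 8), (uint8_t)px
    };
    uint8_t out[3];
    for (int i = 0; i < 3; i++) {
        out[i] = in[order[code][i]];
        if (flags & (1u << i))
            out[i] = (uint8_t)(255 - out[i]);
    }
    return (px & 0xFF000000u) | ((uint32_t)out[0] << 16)
         | ((uint32_t)out[1] << 8) | out[2];
}
```

A table lookup per pixel like this is simple but certainly not the fastest way to do it.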

Here's a demo of the routine (the above colour-transforming operation is applied in real-time in this program):

~~http://www.bb4w-games.com/bb4wprogs/plotrearrangeinvertrgb_demo.zip~~

Below is the original, unadulterated 64x64 ball bitmap (from the demo):

User Image

I may soon post a question in the Assembly Language section on what is the fastest possible way of rearranging and/or inverting RGB components of a pixel, because I'm probably doing it in one of the slowest ways.

Regards,
David.

Re: GFXLIB
Post by David Williams on Jan 11^th, 2010, 9:49pm

Just coloured balls with blurred trails:

~~http://www.bb4w-games.com/bb4wprogs/blurredcolouredballs.zip~~ -- (92 Kb)

The colouring is done with PlotRearrangeInvertRGB, and the blurring is achieved with Michael Hutton's fast blurring routine.

David.

Re: GFXLIB
Post by David Williams on Feb 6^th, 2010, 10:01am

Thought I'd post this as it looks quite pretty (and uses not much CPU bandwidth):

~~http://www.bb4w-games.com/bb4wprogs/plotpixellist3_ex4.zip~~

Yes, OK, the Bézier curves are precalculated, but only because if they had to be calculated in real time (in BASIC), the CPU usage would shoot right up (to ~50% on my laptop). The actual plotting of the curves takes next-to-no time.
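
The precalculation amounts to evaluating each curve once into a list of pixel positions, which the plotting routine can then walk through. A C sketch of that step, assuming cubic curves (the demo's curve degree isn't stated) and using invented names throughout:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y; } Pt;     /* control point */
typedef struct { int x, y; } Pixel;     /* precalculated pixel position */

/* Round to the nearest integer without needing the maths library. */
static int round_to_int(double v)
{
    return (int)(v >= 0.0 ? v + 0.5 : v - 0.5);
}

/* Evaluate a cubic Bezier curve at n evenly spaced parameter values
   (t = 0 .. 1 inclusive) and store the rounded positions in out[]. */
static void bezier_precalc(Pt p0, Pt p1, Pt p2, Pt p3, Pixel *out, int n)
{
    assert(out != NULL && n >= 2);
    for (int i = 0; i < n; i++) {
        double t = (double)i / (n - 1);
        double u = 1.0 - t;
        double b0 = u * u * u;           /* Bernstein weights */
        double b1 = 3.0 * u * u * t;
        double b2 = 3.0 * u * t * t;
        double b3 = t * t * t;
        out[i].x = round_to_int(b0 * p0.x + b1 * p1.x + b2 * p2.x + b3 * p3.x);
        out[i].y = round_to_int(b0 * p0.y + b1 * p1.y + b2 * p2.y + b3 * p3.y);
    }
}
```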

What's supposed to be being demonstrated here is a new (but not terribly exciting) GFXLIB routine PlotPixelList3, which is faster and more flexible than the first two.

Interested folks might like to keep an eye on this page over the coming weeks:

~~http://www.bb4w-games.com/gfxlib2/gfxlib2page.html~~

Regards,
David.

Re: GFXLIB
Post by 81RED on Feb 6^th, 2010, 10:34am

on Feb 6^th, 2010, 10:01am, David Williams wrote:

http://www.bb4w-games.com/gfxlib2/gfxlib2page.html

Yay!

Any benefits to be gained from recoding my stuff to use the new version (other than smaller code, which does not bother me), or is that better left off until my next project actually happens?

Simon

P.S. Weren't you supposed to be in rehab? ;-)

Re: GFXLIB
Post by David Williams on Feb 6^th, 2010, 11:29am

on Feb 6^th, 2010, 10:34am, Simon Mathiassen wrote:

Yay!

Any benefits to be gained from recoding my stuff to use the new version (other than smaller code, which does not bother me), or is that better left off until my next project actually happens?

Not really. The routines may have been 'modularized', but they haven't (yet) been optimized or improved in any other way.

A multicore version of the 'workhorse' routine BPlot - the one you'd normally use to draw backgrounds - may be in the pipeline. Whilst I doubt that four cores would necessarily translate to "four times faster" (which would be nice), BPlot (and Plot, as it happens) are such frequently used routines that it may be worthwhile trying to produce multicore/multithreading versions of them. The skills of Michael Hutton and/or Richard may (or rather, will) be indispensable here.

(Not that I would ever take either of them for granted, let me just say.)

on Feb 6^th, 2010, 10:34am, Simon Mathiassen wrote:

P.S. Weren't you supposed to be in rehab? ;-)

I evidently appear to be slipping.

Regards,
David.

Re: GFXLIB
Post by David Williams on Feb 11^th, 2010, 9:49pm

Here is a simple demo of a new routine called DrawTileMap:

(Use the arrow keys to move around)

~~http://www.bb4w-games.com/bb4wprogs/tilemapdemo1.zip~~

Using DrawTileMap is simplicity itself. First you need a set of tile bitmaps (all of the same dimensions - 64x64 is quite typical). Then you create a tile map using a map editor (a simple one will ship with GFXLIB 2). Then once you've installed and initialised GFXLIB 2, DrawTileMap and TileMapFunctions, and loaded the tile map with PROCLoadTileMap, you draw the portion of the map you wish to display using:

Code:

SYS GFXLIB_DrawTileMap%, dispVars{}, mapInfo{}, x%, y%

Simple as that.

Something to look forward to, right?

Regards,

David.

PS. There will be a number of routines for converting between screen and map world coordinates, and routines intended for the purposes of collision detection. In fact, the following routines are already in place (with lots more to come):

Code:

GFXLIB_TMConvertScrCoordsToWorldCoords%        | *mapInfoStruc{}, scrX%, scrY%, *worldX%, *worldY%
GFXLIB_TMConvertWorldCoordsToScrCoords%        | *mapInfoStruc{}, worldX%, worldY%, *scrX%, *scrY%
GFXLIB_TMConvertScrCoordsToWorldCoordsF64%     | *mapInfoStruc{}, *scrX#, *scrY#, *worldX%, *worldY%
GFXLIB_TMConvertWorldCoordsToScrCoordsF64%     | *mapInfoStruc{}, *worldX#, *worldY#, *scrX%, *scrY%
GFXLIB_TMGetTileIndexAtWorldPos%               | *mapInfoStruc{}, worldX%, worldY% (returns tile index in EAX)
GFXLIB_TMGetTileIndexAtScrPos%                 | *mapInfoStruc{}, scrX%, scrY% (returns tile index in EAX)
GFXLIB_TMGetPixelAtWorldPos%                   | *mapInfoStruc{}, worldX%, worldY% (returns 32-bit ARGB pixel in EAX)
GFXLIB_TMGetPixelAlphaValueAtWorldPos%         | *mapInfoStruc{}, worldX%, worldY% (returns 8-bit alpha value in EAX)
GFXLIB_TMTestPixelAlphaBitAtWorldPos%          | *mapInfoStruc{}, worldX%, worldY%, testBit% (returns 0 or 1 in EAX)
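
To give a feel for the kind of work a routine such as GFXLIB_TMGetTileIndexAtWorldPos% has to do, here is a plain-BASIC sketch. It is not GFXLIB's implementation: the array map%(), the map and tile sizes, and FNtileIndexAt are all made up for illustration, and it assumes world coordinate (0,0) sits at one corner of tile (0,0).

Code:

REM Illustrative only: not GFXLIB code. All names here are hypothetical.
tileW% = 64 : tileH% = 64
mapW% = 16 : mapH% = 16
REM One tile index per map cell, all initially zero
DIM map%(mapW%-1, mapH%-1)
map%(3, 2) = 7

REM World position (200,130) falls in column 200 DIV 64 = 3, row 130 DIV 64 = 2
PRINT FNtileIndexAt(200, 130)
END

DEF FNtileIndexAt(x%, y%)
LOCAL col%, row%
REM DIV truncates towards zero, so reject negative coordinates first
IF x% < 0 OR y% < 0 THEN = -1
col% = x% DIV tileW% : row% = y% DIV tileH%
IF col% >= mapW% OR row% >= mapH% THEN = -1
= map%(col%, row%)

Returning -1 for positions outside the map is just one possible convention; the thread does not say how the GFXLIB routines treat out-of-range coordinates.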

Re: GFXLIB
Post by Michael Hutton on Feb 12^th, 2010, 02:12am

Very nifty piece of work!

My question is: What do the * mean in

*mapInfoStruc{}, *worldX#, *worldY#, *scrX%, *scrY%

I presume they are not C++ pointers?! Or is this a new type of BB4W addressing mode I haven't come across? ;)

Michael

Re: GFXLIB
Post by David Williams on Feb 12^th, 2010, 02:49am

on Feb 12^th, 2010, 02:12am, Michael Hutton wrote:

My question is: What do the * mean in

*mapInfoStruc{}, *worldX#, *worldY#, *scrX%, *scrY%

I presume they are not C++ pointers?! Or is this a new type of BB4W addressing mode I haven't come across? ;)

Actually, I don't think it was correct of me to prefix mapInfoStruc{} with an asterisk.

Yes, an asterisk indicates that the parameter is a pointer to a memory location that contains (or is to contain) either an integer or a 64-bit float.

I realise that 32-bit floats can be supplied 'directly' using FN_f4, but it's probably not suitable for heavy use within a game loop.
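
As a rough sketch of what this means when calling from BASIC, the pointer parameters would be passed using BB4W's address-of operator (^), and a value returned in EAX would be collected with TO. This assumes the parameter order given in the notes above, that GFXLIB 2 and the tile map modules have been installed and initialised, and that mapInfo{} describes a tile map already loaded with PROCLoadTileMap; the variable names are only illustrative.

Code:

REM Sketch only: depends on GFXLIB 2 and its tile map modules being set up
scrX% = 320 : scrY% = 256
REM Variables must exist before ^ can take their addresses
worldX% = 0 : worldY% = 0
SYS GFXLIB_TMConvertScrCoordsToWorldCoords%, mapInfo{}, scrX%, scrY%, ^worldX%, ^worldY%

REM Routines that return a result in EAX hand it back via TO
SYS GFXLIB_TMGetTileIndexAtWorldPos%, mapInfo{}, worldX%, worldY% TO tile%

REM The F64 variants take pointers to 64-bit floats for the inputs as well
fx# = 320.5 : fy# = 256.25
SYS GFXLIB_TMConvertScrCoordsToWorldCoordsF64%, mapInfo{}, ^fx#, ^fy#, ^worldX%, ^worldY%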

By the way, those comments are really just notes to myself. I don't normally use asterisks to indicate pointer variables! I might from now on, though.

Regards,

David.

Re: GFXLIB
Post by Richard Russell on Feb 12^th, 2010, 08:11am

on Feb 12^th, 2010, 02:49am, David Williams wrote:

I don't normally use asterisks to indicate pointer variables! I might from now on, though.

Is there a particular reason why you don't use ^, to reflect the syntax you'd use if calling the routine from BASIC?

Richard.

Re: GFXLIB
Post by David Williams on Feb 12^th, 2010, 3:05pm

on Feb 12^th, 2010, 08:11am, Guest-Richard Russell wrote:

Is there a particular reason why you don't use ^, to reflect the syntax you'd use if calling the routine from BASIC?

Perhaps I suffered a momentary lapse of common sense.

:-/

David.

Re: GFXLIB
Post by 81RED on Feb 14^th, 2010, 7:19pm

on Feb 11^th, 2010, 9:49pm, David Williams wrote:

Code:

SYS GFXLIB_DrawTileMap%, dispVars{}, mapInfo{}, x%, y%

Simple as that.

Something to look forward to, right?

Bloody hell! That looks VERY useful indeed.
Am I correct in guessing your experiences coding Crystal Hunter II had something to do with it?

Simon

Re: GFXLIB
Post by David Williams on Feb 15^th, 2010, 01:36am

on Feb 14^th, 2010, 7:19pm, Simon Mathiassen wrote:

Bloody hell! That looks VERY useful indeed.
Am I correct in guessing your experiences coding Crystal Hunter II had something to do with it?

Simon

No, I don't think so.

In addition to those tile map-related routines, I've started work on a new library called GAMEOBJLIB (Game Object Library), although don't be too surprised if the final release version is called something else. Its aim (whether it'll be achieved or not) is to simplify and speed up the development of 2D games, especially ones with extended 'worlds' - such as scrolling platformers, shoot-'em-ups, etc. I'll probably have some demos of it in action soon.

Regards,
David.

Re: GFXLIB
Post by David Williams on Jun 27^th, 2010, 02:33am

I've uploaded GFXLIB version 2.00 to my website's FTP area, although I don't yet have a separate web page dedicated to it. That'll come in a few days.

Download link
http://www.bb4wgames.com/progs/zip/gfxlib2.zip [3.93 MB]

Quick Start Guide
http://www.bb4wgames.com/misc/GFXLIB2QuickStartGuide.pdf

There is plenty of documentation in the form of example programs (at least one example program and Info.TXT file for almost every GFXLIB routine).

Also included in the package are dozens of demo programs originally written for GFXLIB 1, but which have since been modified to work with GFXLIB 2. I apologise in advance for the fact that the program code for some of the demos is rather cryptic and poorly commented.

One of the most important differences between this version (2.00) and previous beta versions of GFXLIB 2 is that the name of the display variables structure (dispVars{}) must now be declared when initialising GFXLIB 2:

Code:

INSTALL @lib$ + "GFXLIB2"
PROCInitGFXLIB( dispVars{}, 0 )

Happy game writing!

Regards,
David.

http://www.bb4wgames.com

Re: GFXLIB
Post by 81RED on Jun 27^th, 2010, 09:44am

Congratulations! Wow, that was a LONG and difficult birth – glad you got there in the end.

EDIT: Has dibSectionaddr% simply been replaced by dibs%, or is there more to it than that?

Re: GFXLIB
Post by David Williams on Jun 27^th, 2010, 1:26pm

on Jun 27^th, 2010, 09:44am, Simon Mathiassen wrote:

Congratulations! Wow, that was a LONG and difficult birth – glad you got there in the end.

Thanks. Things are still far from ideal, though.

on Jun 27^th, 2010, 09:44am, Simon Mathiassen wrote:

EDIT: Has dibSectionaddr% simply been replaced by dibs%, or is there more to it than that?

The global variable dibs% has simply replaced dibSectionAddr%, although they did exist concurrently for a while.

Perhaps the global variables declared upon initialising GFXLIB will disappear altogether in subsequent versions. Their fate depends on how much hassle this would cause the end-user (and myself!).

Regards,
David.

Re: GFXLIB
Post by admin on Jun 27^th, 2010, 5:57pm

on Jun 27^th, 2010, 1:26pm, David Williams wrote:

Perhaps the global variables declared upon initialising GFXLIB will disappear altogether in subsequent versions.

Am I right in thinking that none of these 'global variables' are actually needed by the user's program, but are provided only as an 'optional extra' to save him some effort (for example the predefined bitmap bm32%)? I understood that all the 'essential' shared variables are stored in dispVars{} and therefore not actually 'global'.

Richard.

Re: GFXLIB
Post by David Williams on Jun 27^th, 2010, 7:00pm

on Jun 27^th, 2010, 5:57pm, Richard Russell wrote:

A number of basic 'core' routines are assembled in GFXLIB2.BBC, including Plot, BPlot, and Clr. The addresses of these 'workhorse' routines are stored in global variables (GFXLIB_Plot%, GFXLIB_BPlot%, GFXLIB_Clr%, etc.). Which is why a user of GFXLIB 2 can simply do this to get a bitmap plotted:

Code:

REM. Display a pre-defined 64x64 bitmap
MODE 8
INSTALL @lib$ + "GFXLIB2"
PROCInitGFXLIB( dispVars{}, 0 )
SYS GFXLIB_Plot%, dispVars{}, bm32%, 64, 64, 320, 256
PROCdisplay

(Yes, PROCdisplay is defined in GFXLIB2.BBC).

A few other global variables are created in GFXLIB2.BBC as well, including those containing the address of some trigonometric and division tables required by some of the external GFXLIB modules.

For every external GFXLIB module that the user INSTALLs, another global variable is created. For example, the external module PlotBlend creates a new global called GFXLIB_PlotBlend%:

Code:

MODE 8
INSTALL @lib$ + "GFXLIB2" : PROCInitGFXLIB( dispVars{}, 0 )
INSTALL @lib$ + "GFXLIB_modules\PlotBlend.BBC" : PROCInitModule
SYS GFXLIB_PlotBlend%, dispVars{}, bm32%, 64, 64, 320, 256, 80
PROCdisplay

I was going to have the user initialise the module using something like (for example):

Code:

INSTALL @lib$ + "GFXLIB_modules\PlotBlend.BBC"
plotblend% = FNInitModule

Which would at least have the global created within the user's program, rather than in the module.

But I'd completely forgotten about implementing this idea until about er... 15 seconds ago.

Not to mention the considerable extra work that would be involved.

Regards,

David.

Re: GFXLIB
Post by admin on Jun 27^th, 2010, 9:51pm

on Jun 27^th, 2010, 7:00pm, David Williams wrote:

The addresses of these 'workhorse' routines are stored in global variables (GFXLIB_Plot%, GFXLIB_BPlot%, GFXLIB_Clr%, etc.)

OK. I suppose I wasn't counting the actual routine addresses as global variables, but I see what you mean.

Quote:

I was going to have the user initialise the module using something like (for example)...
plotblend% = FNInitModule

Yes, that seems a nice way to do it and decouples the library's name space from the user's name space.

Richard.

Re: GFXLIB
Post by David Williams on Jul 2^nd, 2010, 01:51am

"Flying pink worm" demo (DirectX9 fullscreen version)

Nothing new here really, just a previously uploaded GFXLIB demo modified to employ Michael Hutton's GFXD3D9LIB for fullscreen DirectX9 display.

DirectX9 is therefore required. Users running the demo on older computers (more than 8 years old, let's say) should expect a low frame rate.

~~http://www.bezu.co.uk/filesdump/temp/progs/flyingpinkworm-dx9.zip~~

Press SPACE BAR during the demo to exit. Don't press Alt+F4 to exit.

The forthcoming release of GFXLIB (version 2.01) will, apart from some important modifications, contain a few additional programs demonstrating the use of GFXD3D9LIB.

David.

Re: GFXLIB
Post by admin on Jul 3^rd, 2010, 9:06pm

on Jul 2^nd, 2010, 01:51am, David Williams wrote:

Don't press Alt+F4 to exit.

Why not? It's what I always do to exit a full-screen program (if there's no obvious alternative). Tony Tooth's full-screen programs specifically tell you to quit them that way.

If for some reason you really don't want Alt-F4 used, put an ON CLOSE RETURN in the program, but in that case I'd expect pressing Esc to work.

Richard.

Re: GFXLIB
Post by David Williams on Jul 3^rd, 2010, 11:30pm

on Jul 3^rd, 2010, 9:06pm, Richard Russell wrote:

Why not?

Circuitous answer: Why would you want to press Alt+F4 (two keystrokes) to exit the program when you can simply press Space Bar?

Alt+F4 is more fiddly and more energy consuming.

The other reason is that this particular version of the program might not exit cleanly if you press Alt+F4.

David.

Re: GFXLIB
Post by admin on Jul 4^th, 2010, 09:54am

on Jul 3^rd, 2010, 11:30pm, David Williams wrote:

Why would you want to press Alt+F4 (two keystrokes) to exit the program when you can simply press Space Bar?

Because by that time I've forgotten that it said to press the Space Bar (if I ever noticed in the first place!).
Because Alt+F4 is a standard way of quitting a program (there's a standard for a single-key exit too, but it's Esc not Space).

Quote:

The other reason is that this particular version of the program might not exit cleanly if you press Alt+F4

I'm surprised. Normally Windows does a good job of cleaning up after itself when quitting an executable (of course Alt+F4 might not work when running the program in the IDE, but I don't have the opportunity to do that!). But, as I said, in that case you should include ON CLOSE RETURN in your code, not rely on the user reading the instructions!

Richard.

Re: GFXLIB
Post by David Williams on Jul 4^th, 2010, 9:44pm

on Jul 4^th, 2010, 09:54am, Richard Russell wrote:

But, as I said, in that case you should include ON CLOSE RETURN in your code, not rely on the user reading the instructions!

Thanks.

If only I'd known about the ON CLOSE RETURN trick when I was making Alien Eliminator.

The updated version of the flying pink worm demo (which has replaced the old) now ignores Alt+F4 and has the user exiting the program either by pressing Escape or by clicking a mouse button.
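
In outline, that behaviour can be had with something like the following. This is a minimal sketch rather than the demo's actual code; the loop body and clean-up are placeholders, and the variable names are made up.

Code:

REM Sketch: ignore close requests (Alt+F4 or the close button)
REM and exit on Escape or a mouse click instead
ON CLOSE RETURN
REM Stop Escape raising an error so the key can be tested directly
*ESC OFF
REPEAT
  REM ... draw and display a frame here ...
  MOUSE mx%, my%, mb%
  WAIT 1
UNTIL INKEY(-113) OR mb% <> 0
REM ... release any resources here ...
QUIT

INKEY(-113) tests whether the Escape key is currently down, and mb% is non-zero while any mouse button is pressed.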

David.

Re: GFXLIB
Post by David Williams on Jul 8^th, 2010, 07:45am

I'll be releasing an updated version (2.01) of GFXLIB on Friday (9th July),
but in the meantime here's a 'mini-game' which will be included in the
GFXLIB package. Not much of a game, admittedly; however, the
source code is meant to be instructive:

~~http://www.bezu.co.uk/filesdump/temp/progs/cowboyshootout.zip~~ (283 KB)

(The ZIP package only contains the compiled EXE, not the source.)

Re: GFXLIB
Post by David Williams on Jul 9^th, 2010, 10:08pm

GFXLIB version 2.01 has just been released:

http://www.bb4wgames.com/gfxlib/gfxlibpage.html

Happy game-making. :D

----------------------------------------------------
2.01 09-Jul-2010

- Most GFXLIB routines have been modified to read
the dispVars.flags.paint& and dispVars.flags.flipY&
flags. A few routines remain, however, to be similarly
modified. This will be done in due course.

- Added new routine GetQuarticBezierCurvePoint, and
made some example programs for it.

- Made some example programs for GetQuadraticBezierCurvePoint.

- Added new routines PlotGetCumulativeAlphaBits and
ShapeGetCumulativeAlphaBits and wrote an example program
for each of them.

- PROCWait (as employed by PROCdisplay - you don't normally
call this subroutine directly yourself) is now much more efficient.

- Added PROCFlipBmFont subroutine which vertically flips
all bitmap characters in a GFXLIB bitmap font definition
file.

- Re-introduced parameterless PROCInitGFXLIB call.
Calling simply PROCInitGFXLIB is equivalent to
PROCInitGFXLIB(dispVars{}, 0).

- Added new core routine: SetDispVars2

- Core routines now documented -
(see "Core GFXLIB routines.TXT").

- GFXLIB package now includes Michael Hutton's
GFXD3D9LIB.BBC (with some minor modifications by D.W.),
along with an example program for it.

- Modified some old GFXLIB demos to make use of GFXD3D9LIB
(see the DX9 subfolder in the GFXLIB_demos folder).

- Corrected or otherwise modified some of the GFXLIB
routine documentation. Still quite a bit of correcting and
modifying to do!

- Included a new mini-game "Cowboy Shootout" whose purpose,
apart from entertaining for a few seconds, is to demonstrate
the use of a number of GFXLIB routines.

- Several other minor changes here and there.
----------------------------------------------------

Re: GFXLIB
Post by admin on Jul 10^th, 2010, 09:18am

on Jul 8^th, 2010, 07:45am, David Williams wrote:

the source code is meant to be instructive.... (The ZIP package only contains the compiled EXE, not the source.)

ROFL!

Richard.

Re: GFXLIB
Post by David Williams on Jul 10^th, 2010, 09:34am

on Jul 10^th, 2010, 09:18am, Richard Russell wrote:

ROFL!

If you insist.

I should have stated that the source program (for "Cowboy Shootout") comes with the GFXLIB package, whereas the ZIP folder (cowboyshootout.zip) contains only the compiled executable.

The GFXLIB package contains an updated (i.e. slightly better) version of Cowboy Shootout.

Re: GFXLIB
Post by David Williams on Jul 14^th, 2010, 01:38am

I've been working on a new bitmap rotation routine for GFXLIB.

Despite still partially being in BASIC(!), this one is significantly faster than "my old one",
and yet it's still very far from optimal. In fact, until I can get my Sutherland-Hodgman polygon clipper
working properly, this routine will remain very inefficient.

Here's a preview:

~~http://www.bezu.co.uk/filesdump/temp/progs/bitmaprotator.zip~~

Use the left and right mouse buttons to zoom in and out.

David.

Re: GFXLIB
Post by David Williams on Jul 21^st, 2010, 12:09am

New routine (alpha-blend with master opacity control) as recently requested by a BB4W user who's developing a GFXLIB-based game.

Here's a quick demo of PlotAlphaBlend4:

~~http://www.bezu.co.uk/filesdump/temp/progs/plotalphablend4demo.zip~~

Notice the nice smooth sprite edges? Use the left/right arrow keys to decrease/increase the number of sprites.

This routine will be included in the next release (v2.02) of GFXLIB due out in a month or three.

David.

Re: GFXLIB
Post by David Williams on Jan 18^th, 2012, 5:23pm

NOTE: Most of the web links (URLs) in this thread (prior to this post) are no longer valid.

~ ~ ~

I have modified an old GFXLIB example program to display a rotating 3D toroidal ring donut:

http://www.bb4wgames.com/gfxlibdemos/progs/exe/donut.zip

I was 'inspired' by this little DarkBASIC demo (YouTube video):

"Oldschool Demo - Shaded vector balls"
http://www.youtube.com/watch?v=XUGHoqM7myk

EDIT (21/01/2012): Here's my attempt so far (using above YouTube video as a reference).
It only runs at half the 'optimum' frame rate on my laptop (30 fps instead of 60 fps):
http://www.bb4wgames.com/gfxlibdemos/progs/exe/donut2.zip

Bear in mind that the DarkBASIC program will have been compiled to native x86 machine code, whereas donut2.exe is still actually interpreted BASIC.

Other GFXLIB demos:

http://www.bb4wgames.com/gfxlibdemos/gfxlib_demos_index.html

David.