Topic: GFXLIB (Read 2187 times)

Michael Hutton
Developer
Posts: 248
Re: GFXLIB
« Reply #43 on: Oct 26th, 2008, 07:02am »
David,
I have made a routine to sort an array of structures by a 4-byte float key. It is whittled down to the bare essentials and is fast! (Well, I think so anyway.)
It will only accept one sort key parameter and you need to pass the structure as the first parameter.
http://tech.groups.yahoo.com/group/bb4w/files/Libraries/SORTSAF4LIB.bbc
Let me know if you think it is useful at all.
Michael

David Williams
Developer
Posts: 452
Re: GFXLIB
« Reply #44 on: Oct 27th, 2008, 11:32am »
on Oct 26th, 2008, 07:02am, Michael Hutton wrote:| David,
I have made a routine to sort an array of structures by a 4-byte float key. It is whittled down to the bare essentials and is fast! (Well, I think so anyway.)
It will only accept one sort key parameter and you need to pass the structure as the first parameter.
http://tech.groups.yahoo.com/group/bb4w/files/Libraries/SORTSAF4LIB.bbc
Let me know if you think it is useful at all.
Michael |
|
Well, great work I'm sure, and beautiful-looking assembly language to boot, but it's not immediately useful to me because I don't use four-byte floats (although I had to use them of course during my brief adventure with D3DLIB).
I was interested in the instruction timings/clock cycles you neatly gave alongside most of the assembler instructions, but are you sure they're correct (even for the specified Pentium processor)?
For instance, aren't instruction pairs like
add eax,9
shl edx,3
fetched 'simultaneously' into the two separate pipelines (U and V), effectively having both instructions executed in 1 cycle?
Similarly for this pair:
mov edx, ecx
inc ebx
I don't think memory accesses are completed in one cycle either:
mov esi, [ebx]
mov edi, [ebp+7]
As for fld dword [ecx], well, no way is that completed in one cycle!
Please forgive my nitpicking.
Regards,
David.

admin
Administrator
Posts: 1145
Re: GFXLIB
« Reply #45 on: Oct 27th, 2008, 2:47pm »
Quote:| For instance, aren't instruction pairs like add eax,9 : shl edx,3 fetched 'simultaneously' |
|
I'm not sure that's such a good example: since both of those instructions affect the flags, the CPU must ensure that they are effectively executed in the specified sequence (of course that doesn't necessarily imply more than one clock cycle).
Quote:| Similarly for this pair: mov edx, ecx : inc ebx |
|
That's a better example, because they are genuinely independent. Having said that, 'inc ebx' isn't a particularly fast instruction on some modern processors (because of the need to preserve the carry flag) - 'add ebx,1' may execute faster despite requiring more bytes.
Richard.

Michael Hutton
Guest
Re: GFXLIB
« Reply #46 on: Oct 28th, 2008, 07:53am »
on Oct 27th, 2008, 11:32am, David Williams wrote:As for fld dword [ecx], well, no way is that completed in one cycle!
Please forgive my nitpicking 
|
|
I know. I'm not sure I believe it either, and I remember being quite surprised, but I wasn't going to argue. There are obviously qualifiers to the memory access opcodes, but I think I was right that it didn't seem to require adding any other cycles.
I will definitely go and check again; it's on another computer.
Don't thank me for the elegant code! It's a complete ripoff of SORTLIB, which contains the beef of the QUICKSORT routine. I've only added the addressing of structures...

admin
Administrator
Posts: 1145
Re: GFXLIB
« Reply #47 on: Oct 28th, 2008, 09:17am »
Quote:| I will definitely go and check again, it's on another computer. |
|
The only way is to measure it. This paper gives some useful hints, including the use of CPUID as a 'serialising' instruction: http://pasta.east.isi.edu/algorithms/IntegerMath/Timers/rdtscpm1.pdf
Richard.

Michael Hutton
Developer
Posts: 248
Re: GFXLIB
« Reply #48 on: Oct 30th, 2008, 12:38pm »
Thanks Richard, I had remembered something about a 'time stamp' or CPUID feature and was looking for it again, but that saves me a bit of reading.
I got the original timing from
http://www.packetstormsecurity.com/programming-tutorials/Assembly/fpuopcode.html
FLD Floating point load
operand  8087        287    387  486  Pentium
reg      17-22       17-22  14   4    1        FX
mem32    (38-56)+EA  38-56  20   3    1        FX
mem64    (40-60)+EA  40-60  25   3    1        FX
mem80    (53-65)+EA  53-65  44   6    3        NP
but I will try and use Richard's link..
Michael

admin
Administrator
Posts: 1145
Re: GFXLIB
« Reply #49 on: Oct 30th, 2008, 2:45pm »
Code:
operand  8087        287    387  486  Pentium
mem32    (38-56)+EA  38-56  20   3    1
mem64    (40-60)+EA  40-60  25   3    1
mem80    (53-65)+EA  53-65  44   6    3
Don't those timings exclude the memory fetch? Any memory access may incur an overhead, depending on where the data is (L1 cache, L2 cache, main memory) unless it has been 'prefetched'.
Richard.

Michael Hutton
Developer
Posts: 248
Re: GFXLIB
« Reply #50 on: Oct 30th, 2008, 11:05pm »
Hmm, yes it seems so. So where exactly is the data assumed to be for those timings?
(Should I start a new topic, OPCODE TIMINGS, in the assembler section?)
I spent last night using the timestamp instruction to time fld dword [mem]; will post later. Have to go on ward round!

Michael Hutton
Developer
Posts: 248
Re: GFXLIB
« Reply #51 on: Nov 15th, 2008, 03:19am »
Hello,
I've been using GFXLIB. I've got the latest version and I like the Autoinit bit!
http://tech.groups.yahoo.com/group/bb4w/files/Graphics/Plasmas/
I've been making some simple plasmas, reminds me of the FRACTINT plasmas but at the moment they are all fixed SIN functions. I am coding a 'realtime' RGB plasma which should change the pattern with time not just the colours. I was also going to try to use the plasma as an alpha mask...
At the moment I am using my own bit of code to do the palette rotation but I thought I might be able to use GFXLIB_****LUT1/2/3 but I wasn't quite sure how to implement them.
LUT1 looks up the colour value (RGB) from a one-dimensional lookup table, yes? So in effect I was thinking of filling a bitmap with the plasma palette values, with each separate byte containing a separate palette code. For example, if the palette colour was 255 the pixel would be &FFFFFF, not just &FF. I would then use LUT1 to find my colour and BPlot the resultant bitmap... I wasn't quite sure from the documentation what exactly I should be doing. Could I use LUT3 instead? Also, I noticed a line REM'd out in the LUT1 function. I presume this is correct but was just wondering!
I love the Bplotscale(NC) functions. I got some very 'nice' plasma effects in BASIC by drawing a 50x50 grid using PLOT and then scaling it up to full screen. The 'blockiness' was quite good.
I have found a good way (well, I think it's good) to make a blank bitmap which avoids all the CreateDC, CreateCompatibleBitmap and SelectObject calls (which I HATE, by the way, as I always get lost; same as idiv/div, which always gets me down...). Just *SCREENSAVE a bit of the screen (I found you have to make it four times the x and y co-ordinates you want) and then load it back in with GFXLIB. I noticed that when I got the size wrong, variables I subsequently DIM'd ended up in the middle of my bitmap! Very frustrating, with multiple crashes...
Is there any transformation stuff (i.e. rotation, sphere, cylinder) coming up? I know I don't ask for much! I have tried a sort of cylinder but can only manage it aligned along the x-axis, using Plot**row (can't remember the name sitting here!) and some sin/cos functions. It would be good to be able to wrap it 'properly'.
Anyway, keep up the good work. Is there more GFXLIB documentation coming up?

David Williams
Developer
Posts: 452
Re: GFXLIB
« Reply #52 on: Nov 16th, 2008, 12:24am »
on Nov 15th, 2008, 03:19am, Michael Hutton wrote:| I've been making some simple plasmas |
|
They're great! Plasma 1.03 is especially impressive (and the closest I'll ever get to experiencing an acid trip).
on Nov 15th, 2008, 03:19am, Michael Hutton wrote:| At the moment I am using my own bit of code to do the palette rotation but I thought I might be able to use GFXLIB_****LUT1/2/3 but I wasn't quite sure how to implement them. |
|
Sorry about the lack of documentation concerning these and other GFXLIB routines -- it'll come, eventually.
LUT1 takes an RGB word from a source bitmap (the one you want to plot), and extracts the individual RGB bytes from it, each of course being in the range 0 to 255 (I'm being verbose here for the benefit of others!). The extracted byte values are then used as indices for a single 256-byte lookup table, from which bytes are read and it is these bytes which are written to the destination bitmap (or DIB section). It's important to note that the three RGB components are indices for just one single table -- greater versatility would be had if each component indexed its own table (I'll probably implement this for a future LUT4).
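The per-pixel effect described above can be sketched in plain BASIC. This is only an illustration of the description, not GFXLIB's own code; the variable names and the sample pixel value are made up for the example, and the table is a simple inverting one (each entry is 255 minus its index).

```bbcbasic
REM Sketch only: emulates, for one pixel, the LUT1 behaviour described above.
REM pixel%, r%, g%, b% and result% are illustrative names, not GFXLIB variables.
DIM table% 255
FOR I% = 0 TO 255
  table%?I% = 255 - I%
NEXT I%

pixel% = &204080 : REM sample 32bpp pixel, laid out as &RRGGBB

REM Extract the three colour components (each in the range 0 to 255)
r% = (pixel% DIV &10000) AND &FF
g% = (pixel% DIV &100) AND &FF
b% = pixel% AND &FF

REM Each component is used as an index into the same 256-byte table
r% = table%?r%
g% = table%?g%
b% = table%?b%

REM Recombine into the value that would be written to the destination
result% = r% * &10000 + g% * &100 + b%
REM Components &20, &40, &80 map to &DF, &BF, &7F with this table
PRINT "&"; ~pixel%; " becomes &"; ~result%
```

Because all three components share one table, any table that is not symmetric across R, G and B would need the separate-table variant mentioned above.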
Copy, paste and run this example program:
Code:
M% = 2
HIMEM = LOMEM + M%*&FA000
MODE 8 : OFF
INSTALL @lib$+"GFXLIB"
PROCAutoInit32
REM. Reserve memory for a 640x512 32bpp bitmap
DIM bm% 4*640*512-1
REM. Reserve space for a 256-byte colour table
DIM table% 255
REM. Fill the colour table
FOR I%=0 TO 255
  table%?I% = 255-I%
NEXT I%
REM. Redirect GFXLIB's output to addr pointed to by bm%
REM. (Normally, GFXLIB's output is to the DIB section/screen memory)
SYS GFXLIB_QuickSetDispVars, dispVars{}, bm%
REM. QuickSetDispVars can be used here because bitmap pointed to by bm%
REM. is the same dimensions as the program window
REM. Draw 100 bitmaps to the bitmap pointed to by bm%
REM. Note that the global variable demoBm32% points to a 64x64 32bpp bitmap
REM. set up when PROCAutoInit32 is called (it doesn't just appear out of thin air!)
FOR I%=1 TO 100
  SYS GFXLIB_Plot, dispVars{}, demoBm32%, 64, 64, RND(640), RND(512)
NEXT I%
REM. Restore GFXLIB's output to the DIB section pointed to by dibSectionAddr%
SYS GFXLIB_QuickSetDispVars, dispVars{}, dibSectionAddr%
REPEAT
  IF (TIME DIV 100) MOD 2=0 THEN
    REM. Draw the bitmap (bm%) normally using BPlot
    SYS GFXLIB_BPlot, dispVars{}, bm%, 640, 512, 0, 0
  ELSE
    REM. Draw the bitmap (bm%) using BPlotLUT1
    SYS GFXLIB_BPlotLUT1, dispVars{}, bm%, table%, 640, 512, 0, 0
  ENDIF
  SYS "InvalidateRect", @hwnd%, 0, 0
  PROCWait(4)
  *REFRESH
UNTIL FALSE
After running, replace the line
Code:
with
Code:
LUT2 is similar, except that it uses the RGB components of the background pixels as indices into a 256-byte table, rather than the source bitmap pixels. LUT2 can be used for shadow effects, for example.
LUT3 relies on a 2D table (256*256 bytes) as the RGB components of both the background and source bitmap pixels are taken into account. LUT3 lends itself to various transparency effects, although, as with LUT1 and LUT2, the individual RGB components of the background and source pixels are indices for a single common colour table. Would be nice if they could each have their own table (a future LUT5, perhaps).
If you have the time and inclination, then copy, paste and run this proggy:
Code:
M% = 2
HIMEM = LOMEM + M%*&FA000
MODE 8 : OFF
INSTALL @lib$+"GFXLIB"
PROCAutoInit32
DIM table% 256*256-1
opacity% = 25
f# = 1.0 - opacity%/100
FOR I%=0 TO 255
  FOR J%=0 TO 255
    table%?(256*I% + J%) = I% + f#*(J%-I%)
  NEXT J%
NEXT I%
FOR I%=1 TO 100
  SYS GFXLIB_PlotLUT3, dispVars{}, demoBm32%, table%, 64, 64, RND(640)-32, RND(512)-32
NEXT
SYS "InvalidateRect", @hwnd%, 0, 0
PROCWait(4)
*REFRESH
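To get a feel for what the table built by the program above contains, one entry can be worked through on its own. A minimal sketch, using the same formula and opacity; the sample indices 200 and 100 are arbitrary, and since the post does not say which index corresponds to the source pixel and which to the background, no assumption is made about that here.

```bbcbasic
REM Sketch only: evaluate one entry of the 256x256 table from the listing above
opacity% = 25
f# = 1.0 - opacity%/100 : REM 0.75 when opacity% is 25
I% = 200 : J% = 100 : REM arbitrary sample component values
entry% = I% + f#*(J% - I%)
REM 200 + 0.75*(100-200) = 125, i.e. 25% of I% plus 75% of J%
PRINT "Entry for I%="; I%; ", J%="; J%; " is "; entry%
PRINT "Stored at offset "; 256*I% + J%; " in the table"
```

With opacity% set to 100 the factor f# becomes 0 and every entry equals I%; with opacity% set to 0 every entry equals J%.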
on Nov 15th, 2008, 03:19am, Michael Hutton wrote:| I love the Bplotscale(NC) functions I got some very 'nice' plasma effects in BASIC by drawing a 50x50 grid using PLOT and then scaling it up to full screen. The 'Blockyness' was quite good. |
|
Just in case anyone's wondering... the NC part of the routine name GFXLIB_BPlotScaleNC means "Not Clipped" or "No Clipping". It's a fast and rather dangerous routine! But if used carefully (as I'm sure you've done), it's a mighty fast bitmap scaler - almost as fast as straightforward GFXLIB_Plot.
on Nov 15th, 2008, 03:19am, Michael Hutton wrote:| I have found a good way (well, I think it's good) to make a blank bitmap which avoids all the CreateDC,CreateCompatibleBitmap, SelectObject calls (which I HATE by the way, as I always get lost (same as idiv/div - always gets me down... ;D). Just *SCREENSAVE a bit of the screen (I found you have to make it four times the x and y co-ordinates you want) |
|
I've found a good way of achieving the same thing -- that is simply to reserve a suitably sized area of memory using DIM (or whatever). Most of GFXLIB's routines aren't concerned with bitmap headers, as they assume a 32bpp bitmap, and the width and height of the bitmap are - in most cases - specified when the routine is called.
So this bit of code from Plasma 1.03 ...
Code:
REM Create blank bitmaps to store the plasma and the buffer bitmap
PRINT"Creating Blank Bitmap...";
plasma_file$="plasma.bmp"
OSCLI "SCREENSAVE """+plasma_file$+""" "+STR$0+","+STR$0+","+STR$(cx%*4)+","+STR$(cy%*4)
PROCLoadBMP(@dir$+plasma_file$, plasmaBM%, FALSE )
OSCLI "DEL """+plasma_file$+""""
PRINT"OK!"
can be replaced by
Code:
DIM plasmaBM% 4*(cx%*cy% + cx%) +4
plasmaBM%=(plasmaBM%+3) AND -4
The additional cx% is required in the DIM declaration because your plasma routine is either writing outside the bitmap (a buffer overrun: poking data beyond the end of the 640x512 bitmap) or attempting to read data from a location outside it. I suspect that the former is afoot here.
For those who don't know, the plasmaBM%=(plasmaBM%+3) AND -4 statement ensures that plasmaBM% points to a word-aligned (4-byte-aligned) address, which is always a good idea as non-aligned memory accesses can incur a speed penalty on some systems. The +4 tacked on the end ensures there's sufficient room in case plasmaBM% does need to be incremented, by at most 3 bytes.
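The rounding step can be checked on a few sample addresses. A small sketch; the addresses are arbitrary values chosen for illustration, not real allocations, and aligned% is just a name used here.

```bbcbasic
REM Sketch only: show how (addr% + 3) AND -4 rounds up to a multiple of 4
FOR addr% = &1000 TO &1004
  aligned% = (addr% + 3) AND -4
  PRINT "&"; ~addr%; " -> &"; ~aligned%; "  (moved up by "; aligned% - addr%; ")"
NEXT addr%
REM &1000 stays at &1000; &1001, &1002 and &1003 all move up to &1004
```

The pointer never moves by more than 3 bytes, which is why a few spare bytes in the DIM are enough.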
on Nov 15th, 2008, 03:19am, Michael Hutton wrote:| Are there any transforming (ie rotation, sphere, cylinder stuff coming up?) |
|
I started coding coordinate rotation routines for GFXLIB a few weeks ago, with the idea of providing high-precision but slow-ish (FPU-based) routines, and a parallel set of routines based on fast fixed-point integers. I'll have to resume work on this, but at the moment, my 'batteries' are a bit flat, as it were.
on Nov 15th, 2008, 03:19am, Michael Hutton wrote:| I have tried a sort of cylinder but only can manage it aligned along the x-axis using Plot**row (can't remember the name sitting here!) and some sin/cos functions. It would be good to be able to wrap it 'properly'. |
|
I've got such a routine to do just that... but it's currently in BASIC! I think it's called HLineShift or something like that (horizontal line shift with wraparound (on same raster)).
on Nov 15th, 2008, 03:19am, Michael Hutton wrote:| Is there more GFXLIB documentation coming up? |
|
Yes, and I'm sorry I've been dragging my feet on this -- my batteries really are low! Just can't get motivated. I'll do some work on the docs this week -- certainly on the PlotLUT routines. Will try to provide plenty of examples.
Again, your plasmas look great (and not too nausea-inducing) and very slick. I look forward to seeing the shapes/patterns change in subsequent versions!
Regards,
David.

Michael Hutton
Developer
Posts: 248
Re: GFXLIB
« Reply #53 on: Nov 16th, 2008, 02:24am »
Quote:I've found a good way of achieving the same thing -- |
|
Ah, yes. At the time I was looking for the whole bitmap including the header, but good point!
As with my 'plot' code, I always get mixed up with the addressing. Not good.
Thanks for the examples for LUT1 and LUT2, I think I see what is happening a bit more clearly. I think I will be able to use them. I was thinking that I might be able to get the graphics card to cycle the colour palette rather than me do it but I suppose that involves DirectX (Draw?) which I haven't really got into yet.
I have a working version of the moving plasma but I'll 'tidy' it up today before posting.
Michael

admin
Administrator
Posts: 1145
Re: GFXLIB
« Reply #54 on: Nov 16th, 2008, 09:53am »
Quote:| For those who don't know, the plasmaBM%=(plasmaBM%+3) AND -4 statement ensures that plasmaBM% points to a word-aligned address |
|
Bear in mind that in a Windows bitmap (either in the form of a .BMP file or a DIB in memory) every row must be DWORD-aligned.
That doesn't involve any overhead if you're using a 32-bpp bitmap (if the first row is aligned, so must all the others) but if you're using a 24-bpp bitmap then it means every row must be padded to a multiple of 4 bytes if it isn't already.
So the general form of memory allocation for a bitmap is:
Code:
DIM bmp% cy% * ((bpp% DIV 8 * cx% + 3) AND -4) + 3
bmp% = (bmp% + 3) AND -4
Richard.
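As a worked example of the row padding, the per-row byte count from the formula above can be evaluated for a few pixel depths. A sketch only; raw%, stride% and the sample width of 5 pixels are illustrative choices, not part of the original post.

```bbcbasic
REM Sketch only: bytes per row after padding each row to a multiple of 4
cx% = 5 : REM sample width in pixels
FOR bpp% = 8 TO 32 STEP 8
  raw% = bpp% DIV 8 * cx%
  stride% = (raw% + 3) AND -4
  PRINT ; bpp%; "-bpp: "; raw%; " bytes of pixels, "; stride%; " bytes per row"
NEXT bpp%
REM For cx% = 5: 8-bpp 5 -> 8, 16-bpp 10 -> 12, 24-bpp 15 -> 16, 32-bpp 20 -> 20
```

Only the 32-bpp case needs no padding regardless of width, which matches the remark that 32-bpp bitmaps incur no overhead.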

David Williams
Developer
Posts: 452
Re: GFXLIB
« Reply #55 on: Nov 17th, 2008, 5:45pm »
on Nov 16th, 2008, 09:53am, Richard Russell wrote:So the general form of memory allocation for a bitmap is:
Code:
DIM bmp% cy% * ((bpp% DIV 8 * cx% + 3) AND -4) + 3
bmp% = (bmp% + 3) AND -4 |
|
Thanks for that bit of code, Richard.
GFXLIB is intended for use mostly with 32-bpp bitmaps (although there is a handful of routines for 8-bpp bitmaps). The one routine available for simply displaying a 24-bpp bitmap (GFXLIB_BPlotBMP24) can only be safely written to a 32-bpp DIB section or bitmap buffer.
GFXLIB provides the subroutine PROCLoadBMP which will load 8, 24 and 32-bpp bitmaps, and in the first two cases, will convert the bitmap to 32-bpp. 16-bpp isn't currently catered for, but then who uses those nowadays?
David.

David Williams
Developer
Posts: 452
Re: GFXLIB
« Reply #56 on: Dec 24th, 2008, 11:51am »
The next version of GFXLIB (version 1.2) will feature several new routines, including line, circle and filled triangle plotters, and possibly also routines for rotating lists of 3D coordinates.
Meanwhile, this wee 'demo' indicates the speed of GFXLIB's new line plotter (bear in mind that the coordinates of the line endpoints are updated and checked in BASIC, and that one SYS GFXLIB_Line... statement is issued for each line drawn).
http://www.bb4w-games.com/temp/gfxlib_line.zip
I'll make some much more interesting demos at a later date. 
David.