Topic: 256 Thread fast ping sweep (Read 1528 times)

sveinioslo
256 Thread fast ping sweep
« Thread started on: Mar 27th, 2015, 01:00am »
I wanted to see how fast one could do a single ping sweep; 0.1 second seems to be the fastest on a LAN. Sometimes it's a bit too fast, but then one can just run it again.
Does someone know the 'exact' requirement for a thread-safe mov instruction? Is it the instruction itself, its data, or both that must be on a dword boundary?
Also, does someone know if a thread can report being finished before it actually is? Without WAITing a bit before reading the IP table result, I have seen an occasional printout like '192 0 0 0' or '0 0 0 0', which is pretty hard to explain otherwise.
Remember to edit Ip$ to your local IP address, with the last octet set to zero (the sweep starts at zero).
Svein
Code:
REM 19.okt.2014 Svein Svensson, edit 26.mar.2015
REM Fast local ping scanner
REM 8*32=256 threads
Ip$="192.168.0.0"+CHR$0 : REM local net
ON ERROR PROCclose : REPORT : END
ON CLOSE PROCclose : END
DIM Ping{(255) hping%, ip%, dta&(31), rep&(63), buf&(15), fill&(7)}
DIM hMain%(7), hThread%(255)
DIM Code% NOTEND AND 2047, Code% 500, L%-1
SYS "LoadLibrary", "iphlpapi.dll" TO Iphlpapi%
SYS "GetProcAddress", Iphlpapi%, "IcmpSendEcho" TO IcmpSendEcho%
SYS "GetProcAddress", Iphlpapi%, "IcmpCreateFile" TO IcmpCreateFile%
SYS "GetProcAddress", Iphlpapi%, "IcmpCloseHandle" TO IcmpCloseHandle%
SYS "LoadLibrary", "Ntdll.dll" TO Ntdll%
SYS "GetProcAddress", Ntdll%, "RtlIpv4AddressToStringA" TO Rtl2String%
SYS "GetProcAddress", Ntdll%, "RtlIpv4StringToAddressA" TO RtlString2adr%
FOR Pass%=8 TO 10 STEP 2
P% = Code%
[OPT Pass%
.Ping%
cld
mov ebp,[esp+4]
mov eax,500 : push eax ; timeout, can't go faster ! (change to 1000 if on wlan)
mov eax,64 : push eax ; replysize
mov eax,ebp : add eax,40 : push eax ; replybuffer rep&(0)
xor eax,eax : push eax ; options
mov eax,32 : push eax ; sendsize, MSDN says WORD but that doesn't work
mov eax,ebp : add eax,8 : push eax ; sendbuffer dta&(0)
push [ebp+4] ; ip%
push [ebp] ; hping%
call IcmpSendEcho%
or eax,eax ; if result=0 then timeout or some error else we got a reply
jnz Ping4%
call "GetLastError"
mov [ebp+8],eax
jmps Ping3%
.Ping4%
mov eax,[ebp+44] ; if reply_stat=0 then valid ip else some error
mov [ebp+12],eax
or eax,eax
jz Ping2%
.Ping3%
xor eax,eax
]
WHILE P%AND3:[OPT Pass%:nop:]:ENDWHILE : REM dword alignment for mov [ebp+4],eax
[OPT Pass%
mov [ebp+4],eax ; no reply, clear ip table entry
.Ping2%
push [ebp] : call IcmpCloseHandle% : ret 4 ; stdcall thread procedure: pop its one parameter
.Wait% ; 32 sub wait threads
mov ebp,[esp+4]
mov eax,2000 : push eax
mov eax,1 : push eax
push ebp
mov eax,32 : push eax
call "WaitForMultipleObjects" : ret 4 ; stdcall thread procedure: pop its one parameter
]
NEXT Pass%
REM create ip table
SYS RtlString2adr%, !^Ip$, 1, ^C%, ^Ip% TO D%
IF D%<>0 THEN ERROR 100, "Ip$ convert error"
FOR I%=0 TO 255
SYS IcmpCreateFile% TO C%
Ping{(I%)}.hping%=C%
Ping{(I%)}.ip%=Ip%
?(^Ping{(I%)}.ip%+3)=I%
NEXT I%
T%=TIME
REM create 256 worker threads
FOR I% = 0 TO 255
H%=^Ping{(I%)}.hping%
SYS "CreateThread", 0, 1024, Ping%, H%, 0, 0 TO hThread%(I%)
IF hThread%(I%) = 0 THEN ERROR 100,"Failed to create Thread."
NEXT I%
REM create 8 main wait threads each waiting for 32 sub threads
FOR I%=0 TO 7
H%=^hThread%(I%*32)
SYS "CreateThread", 0, 1024, Wait%, H%, 0, 0 TO hMain%(I%)
IF hMain%(I%) = 0 THEN ERROR 100,"Failed to create MainThread."
NEXT I%
REM wait for the 8 main wait threads
SYS "WaitForMultipleObjects", 8, ^hMain%(0), 1, 5000 : REM count must be 8 to cover hMain%(0) to hMain%(7)
WAIT 10 : REM wait a bit before scanning ip table, threads not immediately ready ?
REM print non zero values from ip table
FOR I%=0 TO 255
IF Ping{(I%)}.ip%<>0 THEN
D%=^Ping{(I%)}.ip%
PRINT STR$(D%?0);" ";STR$(D%?1);" ";STR$(D%?2);" ";STR$(D%?3);
IF D%!4 THEN PRINT " IcmpErrorCode=";D%!4 ELSE PRINT
IF D%!8 THEN PRINT " IcmpReplyStat=";D%!8
ENDIF
NEXT I%
PRINT "Scan complete in ";(TIME-T%)/100;" seconds"
FOR I%=0 TO 7
SYS "CloseHandle", hMain%(I%)
NEXT I%
FOR I%=0 TO 255
SYS "CloseHandle", hThread%(I%)
NEXT I%
PROCclose
END
DEF PROCclose
Ntdll%+=0 : IF Ntdll% SYS "FreeLibrary", Ntdll%
Iphlpapi%+=0 : IF Iphlpapi% SYS "FreeLibrary", Iphlpapi%
ENDPROC

rtr2
Re: 256 Thread fast ping sweep
« Reply #1 on: Mar 27th, 2015, 09:36am »
on Mar 27th, 2015, 01:00am, sveinioslo wrote: Does someone know the 'exact' requirement for a thread safe mov instruction? Is it the instruction itself, its data, or both that must be on dword boundary?
It's only data alignment that matters for an atomic read or write.
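A minimal sketch of how one might check this for the table in the first post, assuming the same Ping{()} layout. The counter Bad% is a name made up for this illustration. An address is on a dword boundary exactly when its two lowest bits are zero, which is what the AND 3 test looks at.

```bbcbasic
REM Sketch only: count ip% members that are not on a dword boundary
DIM Ping{(255) hping%, ip%, dta&(31), rep&(63), buf&(15), fill&(7)}
Bad% = 0 : REM hypothetical counter, not part of the original program
FOR I% = 0 TO 255
  REM ^ gives the address of the member; AND 3 isolates the two low bits
  IF (^Ping{(I%)}.ip% AND 3) <> 0 THEN Bad% += 1
NEXT I%
PRINT "Misaligned ip% members: "; Bad%
```

If the count is zero, every write of the form mov [ebp+4],eax in the worker threads goes to aligned data, whatever the address of the instruction itself.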
Quote: Also, does someone know if a thread can report being finished, before it actually is?
Extremely unlikely, I would have thought.
Incidentally, there are a few places in your program where you do something like this:
Code:
mov eax,32 : push eax
The code would be shorter, and easier to read, if you did:
Code:
push 32
Richard.

sveinioslo
Re: 256 Thread fast ping sweep
« Reply #2 on: Mar 27th, 2015, 5:37pm »
That is because 'push 32' gives the op-code '6A 20', which is 'push imm8' in my manual. I have not read anywhere whether that means only one byte is pushed or whether it is padded to a dword. MSDN specifies a dword (actually they say word, but that doesn't work), so better safe than sorry.
Svein

rtr2
Re: 256 Thread fast ping sweep
« Reply #3 on: Mar 27th, 2015, 6:46pm »
on Mar 27th, 2015, 5:37pm, sveinioslo wrote:
You perhaps forget how much experience I have had of writing x86 assembler code - the entire BBC BASIC for Windows interpreter is implemented that way! I would not have recommended that you use push 32 if there was a risk associated with it; there isn't. The imm8 referred to is the size of the operand (32 fits into a signed 8-bit number); it doesn't relate to the number of bytes pushed onto the stack, which is always 4 (or a multiple thereof).
As I said, using push 32 will make your code shorter and easier to read. If you think there is some sort of risk associated with doing that you should stop using BB4W immediately because there are probably hundreds of such instructions in the code of the interpreter!
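For anyone who wants to see this for themselves, here is a minimal sketch, assuming 32-bit BB4W and its built-in assembler. The names code%, pushed% and here% are made up for this illustration. It assembles a single push 32, prints the two bytes it was encoded as, and returns how far the stack pointer moved.

```bbcbasic
REM Sketch only: how many bytes does 'push 32' really push?
DIM code% 31
P% = code%
[OPT 2
.pushed%
mov eax,esp        ; remember the stack pointer
.here%
push 32            ; the instruction under test
sub eax,esp        ; eax = number of bytes esp moved by
add esp,4          ; rebalance the stack (assumes 4 bytes were pushed)
ret
]
PRINT "Encoding: "; ~?here%; " "; ~here%?1
PRINT "Bytes pushed: "; USR(pushed%)
```

If the statements in this thread hold, it should report the encoding 6A 20 and 4 bytes pushed.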
Richard.
« Last Edit: Mar 27th, 2015, 7:30pm by rtr2 »

sveinioslo
Re: 256 Thread fast ping sweep
« Reply #4 on: Mar 30th, 2015, 08:00am »
Hehe, I used to use 'push imm' but changed it to 'push reg' because I wasn't sure how many bytes were pushed. This program wasn't easy to get working; it is my first multithreaded project and it required a lot of research. I am making a note on the mov/push instructions, thank you.
Svein

Ric
Re: 256 Thread fast ping sweep
« Reply #5 on: May 28th, 2016, 1:26pm »
Svein,
I notice that you use the phrase "multi-threading", which has caught my eye. I am currently playing around with 3D graphics using asm and wondered if I could get it to go faster by multi-threading. Unfortunately I have been unable to find satisfactory explanations on the net. Does your code enable two or more sections of code to execute at the same time?
The project I am working on is in General Board under 3D gaming project.
Any help would be greatly appreciated.
Ric
It's always possible, but not necessarily how you first thought. Chin up and try again.

michael
Re: 256 Thread fast ping sweep
« Reply #6 on: May 31st, 2016, 11:59am »
Apparently you would need to ask David Williams, as he was behind creating the GFXLIB library. I am also curious about being able to use ASM to draw super fast to the screen. It's all about stepping stones. If David were willing to repost the research and the library for us, maybe we could have some fun. It's up to you, David. PLEASE?
I like making program generators and like reinventing the wheel

Ric
Re: 256 Thread fast ping sweep
« Reply #9 on: Jun 16th, 2016, 07:59am »
Thanks guys, I'll look into it.