In this posting, we'll take a look at how the datatype we use for a loop variable influences the way the compiler constructs the loop at the lowest level.
EXAMPLE 1:
We start with our fastest loop: a REGISTER variable which is, however, NOT used inside the loop.
Please note that this makes the compiler reverse the loop direction and test with "JNZ" (Jump-if-Not-Zero),
as this is the fastest way of looping.
REGISTER R01 AS LONG
FOR R01=10 TO 200
!NOP
NEXT R01
' becomes our fastest Loop:
4023D6 MOV ESI, DWORD 000000BF
4023DC NOP
4023DD DEC ESI
4023DF JNZ SHORT L4023DC
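The constant 000000BF loaded into ESI is simply the trip count of the original loop: 200 - 10 + 1 = 191 = BFh. The following Python sketch (not PowerBASIC; the function names are made up for illustration) models both loop forms and shows that they run the body the same number of times. It assumes a trip count of at least 1.

```python
def count_up_iterations(first, last):
    """Model FOR i = first TO last (STEP 1): count how often the body runs."""
    iterations = 0
    i = first
    while i <= last:              # corresponds to INC / CMP / JLE
        iterations += 1
        i += 1
    return iterations


def count_down_iterations(first, last):
    """Model the reversed loop: load the trip count, then DEC / JNZ."""
    counter = last - first + 1    # the value loaded into ESI
    iterations = 0
    while True:
        iterations += 1           # loop body (the NOP)
        counter -= 1              # DEC ESI
        if counter == 0:          # JNZ is not taken once the counter is zero
            break
    return iterations


print(hex(200 - 10 + 1))                 # 0xbf
print(count_up_iterations(10, 200))      # 191
print(count_down_iterations(10, 200))    # 191
```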
Please note that you can work around this mechanism if you make sure that you use the REGISTER variables in the right order.
Knowing that ESI holds the first declared REGISTER variable and EDI the second, you can still access these registers directly from ASM code.
EXAMPLE 2:
We can still use the REGISTER variable and keep the down-counting loop by accessing the register directly.
I do not really recommend this; I show it only to demonstrate that the compiler just checks whether "R01" is found inside the loop.
If it is not, the compiler reverses the loop direction. Note that ESI then holds the reversed counter (191 down to 1), not the values 10 to 200.
FOR R01=10 TO 200
!NOP
! MOV EAX,ESI
NEXT R01
4023D6 MOV ESI, DWORD 000000BF
4023DC NOP
4023DD MOV EAX, ESI
4023DF DEC ESI
4023E1 JNZ SHORT L4023DC
EXAMPLE 3:
In this example we do something really useless: we assign R01 its own content.
The reason is to see how the compiler changes the loop IF R01 is used inside the loop.
FOR R01=10 TO 200
!NOP
R01=R01
NEXT R01
4023D6 MOV ESI, DWORD 0000000A
4023DC NOP
4023DD MOV EAX, ESI
4023DF MOV ESI, EAX
4023E1 INC ESI
4023E3 CMP ESI, DWORD 000000C8
4023E9 JLE SHORT L4023DC
Now we see that this loop is a bit bigger.
Instead of just DEC and JNZ, we now have INC, CMP and JLE (Jump-if-Less-or-Equal).
We also see in this example that the compiler really makes use of the REGISTER variables (R01 = ESI register).
EXAMPLE 4:
Let's now change the loop direction manually. I only want to see whether I then get the quick loop with JNZ.
FOR R01=200 TO 0 STEP -1
!NOP
R01=R01
NEXT R01
4023D6 MOV ESI, DWORD 000000C8
4023DC NOP
4023DD MOV EAX, ESI
4023DF MOV ESI, EAX
4023E1 DEC ESI
4023E3 CMP ESI, BYTE 00
4023E6 JGE SHORT L4023DC
No chance. We just get a JGE (Jump-if-Greater-or-Equal) instead of the JLE from Example 3.
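One possible way to read this result: because the counter is visible to the loop body, the body has to see every value from 200 down to 0, and a bare DEC/JNZ pair would leave the loop before 0 is processed. This is an interpretation of the listing, not documented compiler behaviour. The Python sketch below (hypothetical helper names) models which counter values the body would see in each case.

```python
def dec_jnz_values(start):
    """Counter values seen by the body with DEC reg / JNZ."""
    seen = []
    counter = start
    while True:
        seen.append(counter)       # loop body uses the counter
        counter -= 1               # DEC ESI
        if counter == 0:           # JNZ not taken, leave the loop
            break
    return seen


def dec_cmp_jge_values(start, limit):
    """Counter values seen by the body with DEC reg / CMP reg, limit / JGE."""
    seen = []
    counter = start
    while True:
        seen.append(counter)       # loop body uses the counter
        counter -= 1               # DEC ESI
        if not counter >= limit:   # JGE not taken, leave the loop
            break
    return seen


print(len(dec_jnz_values(200)), dec_jnz_values(200)[-1])                # 200 1
print(len(dec_cmp_jge_values(200, 0)), dec_cmp_jge_values(200, 0)[-1])  # 201 0
```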
EXAMPLE 5:
Please note on this occasion that you can't do this:
REGISTER R01 as DWORD
FOR R01=200 TO 0 STEP -1
!NOP
R01=R01
NEXT R01
While you might say "The DWORD is always in range", it does not compile.
A DWORD cannot hold negative numbers, and therefore the compiler does not accept a negative STEP in this case.
Please note that for the loops shown so far the optimization is the same whether you declare
REGISTER R01 AS LONG or
REGISTER R01 AS DWORD
It doesn't make a difference in the generated code.
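To illustrate why a DWORD counter and a downward loop to 0 do not fit together, here is a small Python model of a 32-bit unsigned decrement. It is only a sketch of the arithmetic (MASK32 and dec_dword are illustrative names) and does not claim to reproduce the compiler's actual check.

```python
MASK32 = 0xFFFFFFFF


def dec_dword(value):
    """Decrement with 32-bit unsigned wrap-around, as DEC does on a DWORD."""
    return (value - 1) & MASK32


counter = 2
history = []
for _ in range(4):
    history.append(counter)
    counter = dec_dword(counter)

print(history)  # [2, 1, 0, 4294967295]
```

The value after 0 is 4294967295, not -1, so an unsigned test for "counter >= 0" can never become false.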
EXAMPLE 6:
What definitely makes a difference is using EXTENDED (a floating-point datatype, here declared as REGISTER):
REGISTER E01 AS EXTENDED
FOR E01=1 TO 200
!NOP
NEXT
4023D6 FILD INTEGER PTR [00406730]
4023DC FSTP EXT (TBYTE) PTR [EBP+FFFFFF2C]
4023E2 FLD1
4023E4 FSTP EXT (TBYTE) PTR [EBP+FFFFFF38]
4023EA FLD1
4023EC FSTP ST, ST(1)
4023EE NOP
4023EF FLD EXT (TBYTE) PTR [EBP+FFFFFF38]
4023F5 FLD ST, ST(1)
4023F7 FADDP ST(1), ST
4023F9 FSTP ST, ST(1)
4023FB FLD ST, ST(0)
4023FD FLD EXT (TBYTE) PTR [EBP+FFFFFF2C]
402403 FCOMPP
402405 FNSTSW AX
402407 SAHF
402408 JNB SHORT L4023EE
There is not much to say about this, other than that you get an automatic REGISTER assignment for your first 4 EXTENDED variables, even if they are only declared using LOCAL instead of REGISTER,
unless you specify an explicit #REGISTER NONE. As expected, #REGISTER NONE does not prevent you from explicitly declaring
REGISTER R01 AS LONG, R02 AS LONG as REGISTER variables.
EXAMPLE 6 (continued):
For testing we do this:
REGISTER R01 AS DWORD,R02 AS DWORD
LOCAL E01,E02,E03,E04,E05,E06 AS EXTENDED
R01=R01
!NOP
FOR E01=1 TO 200
!NOP
R01=R01
NEXT
! NOP
' These 4 instructions perform the REGISTER assignment for the EXTENDED variables.
' They are not generated if you do not assign REGISTERs to the EXTENDED variables.
4023D1 FLDZ
4023D3 FLDZ
4023D5 FLDZ
4023D7 FLDZ
4023D9 MOV EAX, ESI
4023DB MOV ESI, EAX
4023DD NOP
4023DE FILD INTEGER PTR [00406730]
4023E4 FSTP EXT (TBYTE) PTR [EBP+FFFFFF14]
4023EA FLD1
4023EC FSTP EXT (TBYTE) PTR [EBP+FFFFFF20]
4023F2 FLD1
4023F4 FSTP ST, ST(1)
4023F6 NOP
' This is inside our Loop
4023F7 MOV EAX, ESI
4023F9 MOV ESI, EAX
' Until here
4023FB FLD EXT (TBYTE) PTR [EBP+FFFFFF20]
402401 FLD ST, ST(1)
402403 FADDP ST(1), ST
402405 FSTP ST, ST(1)
402407 FLD ST, ST(0)
402409 FLD EXT (TBYTE) PTR [EBP+FFFFFF14]
40240F FCOMPP
402411 FNSTSW AX
402413 SAHF
402414 JNB SHORT L4023F6
402416 NOP
' These 4 instructions undo the REGISTER assignment for the EXTENDED variables.
' They are not generated if you do not assign REGISTERs to the EXTENDED variables.
402417 FSTP ST, ST(0)
402419 FSTP ST, ST(0)
40241B FSTP ST, ST(0)
40241D FSTP ST, ST(0)
EXAMPLE 7:
Now let's take a look at QUAD loops.
LOCAL E01,E02,E03,E04,E05,E06 AS QUAD
FOR E01=1 TO 200
!NOP
R01=R01
NEXT
' becomes
4023D6 FILD INTEGER PTR [00406730]
4023DC FISTP QUAD PTR [EBP+FFFFFF34]
4023E2 FLD1
4023E4 FISTP QUAD PTR [EBP+FFFFFF3C]
4023EA FLD1
4023EC FISTP QUAD PTR [EBP+FFFFFF6C]
'-----------------------------------------------
' Here we're inside the Loop
'-----------------------------------------------
4023F2 NOP
4023F3 MOV EAX, ESI
4023F5 MOV ESI, EAX
'-----------------------------------------------
' Here the Loop-Counter Quad is incremented
'-----------------------------------------------
4023F7 FILD QUAD PTR [EBP+FFFFFF3C]
4023FD FILD QUAD PTR [EBP+FFFFFF6C]
402403 FADDP ST(1), ST
402405 FISTP QUAD PTR [EBP+FFFFFF6C]
'-----------------------------------------------
' The QUADs are loaded onto the FPU stack as 64-bit integers (FILD) and then compared as floating-point values
'-----------------------------------------------
40240B FILD QUAD PTR [EBP+FFFFFF6C]
402411 FILD QUAD PTR [EBP+FFFFFF34]
402417 FCOMPP
402419 FNSTSW AX
40241B SAHF
40241C JNB SHORT L4023F2
What we see here is that, as expected, on a 32-bit system QUADs are treated less efficiently
than 32-bit LONGs. For QUADs there is no REGISTER allocation possible.
EXAMPLE 8:
Let's take a look at DOUBLE-variable loops. If you have followed me this far, there should not be many surprises any more.
LOCAL E01,E02,E03,E04,E05,E06 AS DOUBLE
R01=R01
!NOP
FOR E01=1 TO 200
!NOP
R01=R01
NEXT
4023D6 FILD INTEGER PTR [00406730]
4023DC FSTP DOUBLE PTR [EBP+FFFFFF34]
4023E2 FLD1
4023E4 FSTP DOUBLE PTR [EBP+FFFFFF3C]
4023EA FLD1
4023EC FSTP DOUBLE PTR [EBP+FFFFFF6C]
'-----------------------------------------------
' Here we're inside the Loop
'-----------------------------------------------
4023F2 NOP
4023F3 MOV EAX, ESI
4023F5 MOV ESI, EAX
'-----------------------------------------------
' Here the Loop-Counter is incremented
'-----------------------------------------------
4023F7 FLD DOUBLE PTR [EBP+FFFFFF3C]
4023FD FADD DOUBLE PTR [EBP+FFFFFF6C]
402403 FSTP DOUBLE PTR [EBP+FFFFFF6C]
402409 FLD DOUBLE PTR [EBP+FFFFFF6C]
40240F FCOMP DOUBLE PTR [EBP+FFFFFF34]
402415 FNSTSW AX
402417 SAHF
402418 JBE SHORT L4023F2
EXAMPLE 9:
Let's try a BYTE loop.
LOCAL E01,E02,E03,E04,E05,E06 AS BYTE
LOCAL R01 AS LONG
R01=R01
!NOP
FOR E01=1 TO 200
!NOP
R01=R01
NEXT
4023D4 MOV DWORD PTR [EBP+FFFFFF78], DWORD 000000C8
4023DE NOP
4023DF MOV EAX, ESI
4023E1 MOV ESI, EAX
4023E3 DEC DWORD PTR [EBP+FFFFFF78]
4023E9 JNZ SHORT L4023DE
4023EB NOP
What we see is that we get an automatic REGISTER assignment for the LONG variable. And the loop direction is reversed again, because the loop counter is not used inside the loop.
This happens with the BYTE, WORD, INTEGER, LONG and DWORD datatypes.
EXAMPLE 10:
WORD and INTEGER
Let's start with WORD. A WORD cannot be assigned to a REGISTER, so it is at a disadvantage against LONG, and the same is true for INTEGER.
The loop code is not much different from the loop code for a LONG, so in these cases you gain no advantage from choosing an INTEGER or WORD over a LONG, even if you do not need all the bits.
LOCAL E01,E02,E03,E04,E05,E06 AS WORD
REGISTER R01 AS LONG
R01=R01
!NOP
FOR E01=1 TO 200
!NOP
R01=E01
NEXT
' will become:
4023D6 MOV EDI, WORD 0001
4023DB NOP
4023DC MOVZX EAX,EDI
4023DF MOV ESI, EAX
4023E1 INC EDI
4023E4 CMP EDI, WORD 00C8
4023E9 JBE SHORT L4023DB
And finally INTEGER (signed 16-bit). The BASIC code is the same as before, except that we have changed WORD to INTEGER.
4023D5 NOP
4023D6 MOV EDI, WORD 0001
4023DB NOP
4023DC XADD EDI, EAX
4023DF MOV ESI, EAX
4023E1 INC EDI
4023E4 CMP EDI, WORD 00C8
4023E9 JLE SHORT L4023DB
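The visible difference between the two listings is the treatment of the sign: the WORD loop uses MOVZX and JBE (unsigned), the INTEGER loop ends with JLE (signed). The Python sketch below models the two interpretations of a 16-bit value (zero_extend16 and sign_extend16 are illustrative names, not PowerBASIC functions). For the range 1 to 200 used here both agree; they only differ from 8000h upwards.

```python
def zero_extend16(value):
    """MOVZX-style: treat the low 16 bits as an unsigned WORD."""
    return value & 0xFFFF


def sign_extend16(value):
    """Sign-extending style: treat the low 16 bits as a signed INTEGER."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


for raw in (0x0001, 0x00C8, 0x8000, 0xFFFF):
    print(hex(raw), zero_extend16(raw), sign_extend16(raw))
# 0x1 1 1
# 0xc8 200 200
# 0x8000 32768 -32768
# 0xffff 65535 -1
```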
That's where we end ... but about loops, there is more.