• Welcome to Jose's Read Only Forum 2023.
 

Divide and Multiply LONGS

Started by Theo Gottwald, December 23, 2009, 06:52:34 AM


Theo Gottwald

Hi Hutch, take a look at these.
It's been quite a while since I wrote these; are they still state of the art?
Or would you suggest a change somewhere to squeeze out some additional cycles?


'------------------------------------------------------------------------------------------------
' Unsigned multiply: P1 = P2 * P3
' Uses flags, EAX, EDX. The result is also left in EAX.
' P1, P2, P3 are variables or register names
MACRO A_MUL(P1,P2,P3)
! MOV EAX,P2
! MUL P3
! MOV P1,EAX
END MACRO
'------------------------------------------------------------------------------------------------
' Signed multiply: P1 = P2 * P3
' Uses flags, EAX, EDX. The result is also left in EAX.
' P1, P2, P3 are variables or register names
MACRO A_SMUL(P1,P2,P3)
! MOV EAX,P2
! CDQ
! IMUL P3
! MOV P1,EAX
END MACRO
'------------------------------------------------------------------------------------------------
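' Unsigned divide: P1 = P2 / P3
' Uses flags, EAX, EDX. The result is also left in EAX.
' P1, P2, P3 are variables or register names
' EDX is zeroed (not sign-extended with CDQ) because DIV divides the unsigned value EDX:EAX
MACRO A_DIV(P1,P2,P3)
! MOV EAX,P2
! XOR EDX,EDX
! DIV P3
! MOV P1,EAX
END MACRO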
'------------------------------------------------------------------------------------------------
' Signed divide: P1 = P2 / P3
' Uses flags, EAX, EDX. The result is also left in EAX.
' P1, P2, P3 are variables or register names
MACRO A_SDIV(P1,P2,P3)
! MOV EAX,P2
! CDQ
! IDIV P3
! MOV P1,EAX
END MACRO
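
For anyone reading along, here is a rough C model of what the 32-bit MUL, DIV and CDQ instructions do with the EDX:EAX pair. It is only a sketch to illustrate the register semantics; the names model_mul, model_div and model_cdq are made up for this example and are not part of any library. It shows why an unsigned divide wants EDX cleared while a signed divide wants it sign-extended.

```c
#include <stdint.h>
#include <stdbool.h>

/* Model of 32-bit MUL: EDX:EAX = EAX * src (unsigned 64-bit product). */
static void model_mul(uint32_t *eax, uint32_t *edx, uint32_t src) {
    uint64_t product = (uint64_t)*eax * src;
    *eax = (uint32_t)product;          /* low half  */
    *edx = (uint32_t)(product >> 32);  /* high half */
}

/* Model of CDQ: returns the EDX value, i.e. the sign bit of EAX
   copied into all 32 bits. */
static uint32_t model_cdq(uint32_t eax) {
    return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}

/* Model of 32-bit DIV: divides the unsigned 64-bit value EDX:EAX by src.
   Quotient goes to EAX, remainder to EDX. Returns false in the cases
   where the CPU would raise a divide error (src == 0, or the quotient
   does not fit in 32 bits). */
static bool model_div(uint32_t *eax, uint32_t *edx, uint32_t src) {
    if (src == 0) return false;
    uint64_t dividend = ((uint64_t)*edx << 32) | *eax;
    uint64_t quotient = dividend / src;
    if (quotient > UINT32_MAX) return false;
    *edx = (uint32_t)(dividend % src);
    *eax = (uint32_t)quotient;
    return true;
}
```

With EAX = 0x80000000, CDQ sets EDX to 0xFFFFFFFF, so an unsigned DIV by 2 sees a huge 64-bit dividend and the quotient no longer fits in EAX. With EDX = 0 the same divide gives 0x40000000 as expected.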

Steve Hutchesson

Theo,

Just taking a quick look at the macros, I doubt there are any real gains to be had, as the speed limit will be the DIV and MUL instructions themselves, which have always been slow operations. If you wanted to try a different technique you could try the FP unit, but from memory it's no faster. An option I have not looked at is some of the very late SSE instructions, which may be faster, but they are not general purpose like the macros you have posted.

Theo Gottwald

Sometimes you just need an integer divide; you don't need anything after the decimal point.
Do you expect the FP unit to be faster than these integer operations?

Even if they can be executed in parallel to some extent on modern CPUs?
That's a point I am not sure about.

Charles Pegge


On recent CPUs, MUL is as fast as ADD, but DIV is the really slow one, usually 20-40 times slower, because it uses microcoded successive approximation. This is true of both the CPU and the FPU. So to get the best execution times, avoid division wherever possible. In many situations where you would normally divide by a constant, you can multiply by the reciprocal instead.

Steve Hutchesson

Charles,

I remember the technique you mentioned from my school days, and while I know how to do it with fractions on paper, I have not seen a simple way to do it with bare integers. Do you have anything floating around, or do you remember how to do this with integer values?

Frederick J. Harris

Quote
...where you would normally divide by a constant, you can multiply by the reciprocal instead.

Getting the reciprocal for the multiplication would itself involve a division, would it not?  (Oops!  Divide the constant yourself first, dummy!)

Charles Pegge

Yes Frederick, you only get the performance benefit if the reciprocal is a constant that can be used repeatedly.

Steve,
I would do all the arithmetic in the FPU; the performance hit is minimal. You could devise a fixed-point reciprocal system for the CPU, using EAX for the fraction and pulling the integer result from EDX, but I think this would be too restrictive to be safe for general use.
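
To make the fixed-point idea concrete, here is a small C sketch of it, not a tuned implementation. The helper names recip32, div_by_recip and div10_exact are hypothetical, made up for this example. The reciprocal is stored as a 32-bit fraction (2^32 / d, rounded up); multiplying by it and keeping the high 32 bits of the 64-bit product corresponds to a MUL followed by reading EDX.

```c
#include <stdint.h>

/* Hypothetical helper: reciprocal of d as a 0.32 fixed-point fraction,
   i.e. 2^32 / d rounded up. Assumes d >= 2 so the value fits in 32 bits.
   This is the one real division, done once per constant. */
static uint32_t recip32(uint32_t d) {
    return (uint32_t)((0x100000000ull + d - 1) / d);
}

/* n / d approximated as the high 32 bits of n * recip
   (the value MUL would leave in EDX). Because recip is rounded up,
   the result can be one too large when n is big. */
static uint32_t div_by_recip(uint32_t n, uint32_t recip) {
    return (uint32_t)(((uint64_t)n * recip) >> 32);
}

/* Divide by 10 using a wider reciprocal: 0xCCCCCCCD is 2^35 / 10
   rounded up, and the extra 3 bits of shift give enough precision
   for every 32-bit n. */
static uint32_t div10_exact(uint32_t n) {
    return (uint32_t)(((uint64_t)n * 0xCCCCCCCDull) >> 35);
}
```

For d = 10 the 32-bit reciprocal is 429496730, and div_by_recip matches n / 10 for every n below 2^30, but at n = 1073741829 it returns 107374183 instead of 107374182. That limited range is the restriction mentioned above. Carrying a few more bits of the reciprocal and shifting further, as in div10_exact, removes the error for this divisor at the cost of a shift after the multiply.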