 

Dynamic Assembly:

Started by Charles Pegge, May 28, 2007, 12:14:21 PM



Charles Pegge

Strings of Machine Code


This shows how to set up machine code in a string and execute it. The technique can beef up the performance of scripting engines, giving speed as well as flexibility.

It is how we used to do it on home computers before the PC came along.


' Dynamic Assembly
' How to create strings of machine code during run time and execute them.

' 28 May 2007
' Charles E V Pegge


' Useful Instruction set reference:
' Intel Architecture Software Developer's Manual Vol 2: Instruction Set Reference
' http://developer.intel.com/design/pentiumii/manuals/243191.htm


' Console trace using FreeBasic 0.16 under Linux
' (don't worry about the warning messages)
'
' [charles@localhost script0702L]$ fbc run.bas
' fbc: Symbol `ospeed' has different size in shared object, consider re-linking
' run.bas(463) : warning level 0: Suspicious pointer assignment
' [charles@localhost script0702L]$ 
' [charles@localhost script0702L]$ ./run
' Answer (ABCD) = ABCD


'----------------------------------------------------

DIM goCPU AS FUNCTION() AS LONG
DIM s AS STRING
DIM p AS LONG
DIM a AS LONG

' b8 cc ab 00 00    ' mov eax,&h0000ABCC
' 40                ' inc eax
' c3                ' ret

' hex codes in a string
  s=chr$(&hb8)+mkl$(&hABCC)+chr$(&h40)+chr$(&hC3)

' In Powerbasic
'--------------
' p= strptr(s)
' call dword p using goCPU() to a

' In Freebasic
'-------------
  goCPU = strptr(s)
  a=goCPU()

print "Answer (ABCD) = "; hex$(a)


end
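
To check the encoding before executing anything, the string can be dumped byte by byte. This is only a minimal sketch, assuming a current FreeBasic compiler on a little-endian x86 machine; it runs no machine code and simply shows that mkl$ stores the 32-bit operand low byte first, as the mov eax instruction expects.

```freebasic
' Hex dump of the machine-code string (nothing is executed here).
dim as string s = chr$(&hb8) + mkl$(&hABCC) + chr$(&h40) + chr$(&hC3)
dim as integer i

for i = 1 to len(s)
  print hex$(asc(s, i), 2); " ";   ' two hex digits per byte
next
print

' prints: B8 CC AB 00 00 40 C3
```

The dump matches the opcode listing in the comments above: B8 followed by the operand CC AB 00 00, then 40 (inc eax) and C3 (ret).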



Charles Pegge


Combining Strings of Machine Code

- with the prospect of doing something useful.



' Combining strings of machine code:
' (This uses Freebasic syntax)

DIM goCPU AS FUNCTION() AS LONG
DIM s AS STRING
DIM p AS LONG
DIM a AS LONG

DIM AS STRING push_regs= chr$(&h51)+chr$(&h52)+chr$(&h53)     ' pushes registers ecx edx ebx onto the stack
DIM AS STRING pop_regs=  chr$(&h5b)+chr$(&h5a)+chr$(&h59)     ' pops registers  ebx edx ecx from the stack
DIM AS STRING new_frame= chr$(&h83)+chr$(&hc6)+chr$(&h40)     ' add 64 to index register esi
DIM AS STRING old_frame= chr$(&h83)+chr$(&hee)+chr$(&h40)     ' subtract 64 from index register esi
DIM AS STRING rets=      chr$(&hc3)                           ' standard return


' use these to test the strings above:

DIM AS STRING movAXSI=   chr$(&h8b)+chr$(&hc6)                          ' move esi to eax
DIM AS STRING subAXSI=   chr$(&h2b)+chr$(&hc6)                          ' subtract esi from eax


'Run these 2 strings to test our basic instruction strings:

s=push_regs + movAXSI + new_frame + subAXSI +old_frame + pop_regs + rets
goCPU=strptr(s): a=goCPU()
print "Answer: ";hex$(a)

s=push_regs + movAXSI + new_frame + old_frame + subAXSI + pop_regs + rets
goCPU=strptr(s): a=goCPU()
print "Answer: ";hex$(a)

end

'
'Answer: FFFFFFC0
'Answer: 0
'
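
The two answers can be checked by hand. The sketch below replays both sequences in plain BASIC, with ordinary variables standing in for the registers. The starting value of esi is an arbitrary assumption (the real value is whatever the register holds at the time of the call) and does not affect either result.

```freebasic
' Trace of the two test sequences in plain BASIC (no machine code runs here).
dim as long esi = &h1000     ' assumed starting value, chosen only for illustration
dim as long eax

' Run 1: movAXSI + new_frame + subAXSI + old_frame
eax = esi                    ' mov eax,esi
esi += 64                    ' add esi,64
eax -= esi                   ' sub eax,esi  -> eax = -64
esi -= 64                    ' sub esi,64
print "Answer: "; hex$(eax)  ' FFFFFFC0

' Run 2: movAXSI + new_frame + old_frame + subAXSI
eax = esi                    ' mov eax,esi
esi += 64                    ' add esi,64
esi -= 64                    ' sub esi,64   -> esi back to its original value
eax -= esi                   ' sub eax,esi  -> eax = 0
print "Answer: "; hex$(eax)  ' 0
```

In the first run the subtraction happens while esi is still 64 higher, so eax ends up as -64 (FFFFFFC0). In the second run the frame has already been restored, so the difference is zero.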




Charles Pegge


Combining Strings of Machine Code

Next stage

This does the same as above but with two helper functions to simplify coding.


'------------------------------------------------------------------------
' How to create strings of machine code during run time and execute them.
'------------------------------------------------------------------------
' Version 2
' with helper functions


' 28 May 2007
' Charles E V Pegge
' (Using FreeBasic)


'------------------------------------------------------------------------

' pass location of code and location of data space to this function:

function caller( byval p as byte pointer, byval q as byte pointer) as long
asm                   '
  mov esi,[q]          ' location of work space into index register esi
  mov eax,[p]          ' location of code into eax register
  call eax             ' call to the address contained in eax
  mov [function],eax   ' assume eax contains something meaningful and return it
end asm               '
end function

'------------------------------------------------------------------------


'pass the hexadecimal text string to this function and obtain a binary string:

function hexconvert(byref s as string) as string
  dim as long i=1                 ' read position in the hex text
  dim as long j=1                 ' write position in the binary string
  dim as long l
  s=ltrim$(s)
  dim t as string
  t=string$(len(s)/2,chr$(0))     ' estimate max length
  l=len(s)
  do
    if i>l then exit do
    mid$(t,j)=chr$(val("&h"+mid$(s,i,2)))   ' convert one pair of hex digits to a byte
    i+=2: j+=1
    ' skip space for next hex code
    do
      if asc(s,i)>32 then exit do
      i+=1
      if i>l then exit do
    loop
  loop
  function=left$(t,j-1)
end function

'-------------------------------------

' Combining strings of machine code:

DIM AS STRING w = string$( 8192,chr$(0) )           ' indexed work space for the code
DIM s AS STRING                                     ' string for the executable code

DIM AS STRING push_regs = hexconvert("51 52 53")    ' pushes registers ecx edx ebx onto the stack
DIM AS STRING pop_regs  = hexconvert("5b 5a 59")    ' pops registers  ebx edx ecx from the stack
DIM AS STRING new_frame = hexconvert("83 c6 40")    ' add 64 to index register esi
DIM AS STRING old_frame = hexconvert("83 ee 40")    ' subtract 64 from index register esi
DIM AS STRING rets      = chr$(&hc3)                ' standard return

' use these to test the strings above:

DIM AS STRING movAXSI   = hexconvert("8b c6")       ' move esi to eax
DIM AS STRING subAXSI   = hexconvert("2b c6")       ' subtract esi from eax

s=push_regs + movAXSI + new_frame + subAXSI + old_frame + pop_regs + rets

'-------------------------------------

DIM a AS LONG                                       ' receives the value returned in eax
a=caller(strptr(s),strptr(w))

print "Answer: ";hex$(a)

'-------------------------------------

end



Charles Pegge

Generating Relocatable Code.

To allow assembled code to run from any address without having to patch up a whole load of location-dependent code, all the calls, jumps and static data references must be made relative to the EIP instruction pointer.

The x86 does this for procedures by providing relative jumps and calls but not for data. Until you come to use the x86-64 in long mode, there is no instruction to read the instruction pointer directly so you have to trick the processor into giving you this information.

This is one way to do it:

Suppose we want to set up a data area which is private and does not use the current stack. We get the CPU to call the address that immediately follows the call instruction. The CPU pushes its return address onto the stack, and this is that very same address. The instruction at this location is a POP, so we can pop the address into any index register to reference the data. After the POP comes either a JUMP instruction to step over the data block and continue execution, or a RET to go back to a calling routine. Because the popped address is that of the POP itself, the value in the index register must then be advanced past the POP and the JUMP or RET to reach the actual start of the data block.



'-----------------------'
setup_datablock:        '
'-----------------------'
call here              ' WHERE AM I?
here:                  '
pop esi                '
ret                    '
........data           '
'-----------------------'
main_code:             '
'-----------------------'
call setup_datablock   '
inc esi                ' adjust ESI to the start of the data
                        ' (POP and RET are single bytes)
inc esi                '
...                    ' continue code ....
ret                    '
'-----------------------'



Step Over

With JMP instructions, you can't be sure whether the assembler will adopt a short jump or a long jump, which affects the adjustment we need to make to the index register afterwards.

Working in pure opcodes this time:



'------------------'
' ...              ' WHERE AM I?
E8 00 00 00 00     ' call the next location
5E                 ' pop the address into esi
E9 00 01 00 00     ' jump over 256 bytes using a long jump
'------------------'
'...               ' data block of 256 bytes...
'------------------'
83 c6 06           ' add 6 to esi for start of data block
'....              ' continue code ....
C3                 ' ret
'------------------'
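
The adjustment value can be worked out by counting bytes rather than by eye. The sketch below only measures the opcode strings and executes nothing. The variable names are made up for illustration, and the short-jump line is included purely for comparison: a short jump (EB with an 8-bit offset) reaches at most 127 bytes forward, so it could not step over the 256-byte block used above.

```freebasic
' Byte counts for the WHERE AM I? sequence (nothing is executed here).
dim as string call_next = chr$(&hE8) + mkl$(0)      ' call the next location   (5 bytes)
dim as string pop_esi   = chr$(&h5E)                ' pop esi                  (1 byte)
dim as string jmp_long  = chr$(&hE9) + mkl$(256)    ' long jump over 256 bytes (5 bytes)
dim as string jmp_short = chr$(&hEB) + chr$(&h7F)   ' short jump, 127 bytes    (2 bytes)

' ESI receives the address of the POP byte, so the data block starts
' at ESI + length of the POP + length of the jump that follows it.
dim as string prefix = call_next + pop_esi + jmp_long

print "offset of POP:         "; len(call_next)                  ' 5
print "offset of data block:  "; len(prefix)                     ' 11
print "long jump adjustment:  "; len(pop_esi) + len(jmp_long)    ' 6
print "short jump adjustment: "; len(pop_esi) + len(jmp_short)   ' 3
```

With the long jump the difference is 6, which is the operand of 83 c6 06 in the listing. Had a short jump been used, the same instruction would need 3 instead.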