(Optimization) The RETURN of GOSUB

Started by Theo Gottwald, January 02, 2007, 09:47:20 AM

Theo Gottwald

There is a small sentence in the PowerBasic help file, which I once read as a mere side note:

For time critical or high-performance code, using a GOSUB to perform a repetitive task is almost always faster than performing a call to a Sub or Function, since there is no overhead in setting up a stack frame for a GOSUB.

More and more I have realized that this is not just a side note.
It's one of those small sentences that deserve a closer look.

Some time ago I tried to put everything inside a SUB or FUNCTION.

Lately I have realized that in time-critical code GOSUB is better than using Subs or Functions, and the code still stays clean.
It can even be faster than writing the code as a MACRO, provided the code is used in multiple places,
because shorter code is more likely to fit entirely into the CPU cache.

GOSUB keeps it SHORT. It's faster, it's smaller. :-)

It's one of those small optimization possibilities that are easy to overlook.
Just a subroutine call, no new stack frame.

Example:


FUNCTION XY() AS LONG
    REGISTER lResult AS LONG
    ...
enx:
    FUNCTION = lResult
    EXIT FUNCTION
    '
    ' Here is a good place for the subroutines
MyLabelA:
    ...
    RETURN

MyLabelB:
    ...
    RETURN

    ' Here's a good place for an error handler
ero:
    ...
    lResult = 0
    GOTO enx
    ' Finally we jump back to the exit to be sure we
    ' don't have multiple exit points.
END FUNCTION


There is one thing you should know when using labels in PowerBasic.

A label named "XYZ" will conflict with any other variable of this name in your whole PowerBasic program.

The help file says it like this:

Quote: ..., it should be noted that symbol names must be unique: a label may not share the name of any other symbol (Sub name, Function name, user-defined type or union definition, variable name, etc), and they are local to the Sub or Function in which they appear.


Donald Darden

The fastest approach is sequential coding without any deviations. But sequential coding is very limited, because it means avoiding branching, and this requires limiting the use of IF - THEN, SELECT CASE, FOR - NEXT, DO - LOOP, and WHILE - WEND structures.  It is often not very practical.  However, you will see cases where attempts are made to reduce decision trees as much as possible to speed through decision points.

The use of GOSUB - RETURN involves a branch which actually happens
twice - first you branch to a separate body of code and execute, then you
branch back.  But for repeated tasks, it means much smaller code might be
possible, and it can make it much easier to follow how the code works.  So for
the small penalty of time involved, the use of GOSUB is often worthwhile.

Some people prefer the concept of using a MACRO instead, which can also
represent a body of code, but now that body of code is not separate, but
included in the current sequence of code during the compile stage.  Yet the
Macro allows the same ease of being able to read how the code works as most
GOSUBs, FUNCTIONs, and SUBs allow.  So it represents an attempt to get the
maximum speed out of code that is kept sequential and allowed to grow quite
large in some cases.  The advantage of GOSUB - RETURN is that for a very
small penalty in time, it actually helps shrink overall code size.
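
The size trade-off between a MACRO and a GOSUB can be sketched with a little arithmetic. This is only a rough model in Python, not PowerBasic output: the function name and the byte counts (5 bytes per call, 1 byte per return, roughly what a near CALL and RET take on 32-bit x86) are assumptions for illustration.

```python
def code_size(body_bytes, call_sites, call_bytes=5, ret_bytes=1):
    """Rough code-size model; all byte counts are illustrative assumptions."""
    # A macro copies the whole body into every place it is used.
    macro_size = body_bytes * call_sites
    # A GOSUB keeps one copy of the body (plus its RETURN) and a short call per use.
    gosub_size = body_bytes + ret_bytes + call_bytes * call_sites
    return macro_size, gosub_size

macro_size, gosub_size = code_size(body_bytes=60, call_sites=8)
print(macro_size, gosub_size)   # 480 101
```

With a single use the macro is the smaller one (60 against 66 bytes in this model), so the saving only shows up once the same body is used in several places.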

SUBs and FUNCTIONs can be thought of as GOSUBs on steroids.  They offer
a number of substantial benefits in terms of capabilities, but come at a major
sacrifice in performance speed.  First, they allow a number of parameters to be
passed when they are called.  These have to be prepared and placed on the
stack prior to the call, then the return address is placed on the stack as
the call is made.  This takes up time and space on the stack, and the
stack has to be purged of the added parameters and the return address
recovered on exit.  Again, there is a time penalty involved.

So what is the Stack?  Well, it is actually just a part of your main memory, your
RAM, and it is used in conjunction with a special register called the Stack
Pointer, or SP.  On 32-bit x86 processors, it is the ESP, or Extended Stack
Pointer.  The SP, or ESP, works with a certain group of instructions built into
the processor for certain stack-oriented operations, such as with GOSUB,
where the SP, or ESP, is decremented and the return address is automatically
stored at the new stack address.  On the RETURN instruction, the return
address is automatically copied from the current stack location into the IP
(Instruction Pointer) register and the stack pointer is incremented again,
which sends you back to the point in the program where the next instruction
is to be executed.

It may seem complicated, but it comes down to this:  When something is put
on the Stack, using a stack-oriented instruction like GOSUB or PUSH, the stack
pointer is first decremented and the value is stored at the new stack location; when
something is taken from the stack, using an instruction like RETURN or POP, the
value previously stored is returned to one of the designated registers in the
processor and the stack pointer is then incremented.  The stack then is a
top-down temporary storage mechanism that is automatically associated
with the actions of some commands, such as GOSUB, RETURN, PUSH and POP.
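
To make the mechanics concrete, here is a toy model of a downward-growing stack, written in Python rather than PowerBasic. The class name, the starting address and the 4-byte slot size are assumptions chosen to resemble 32-bit x86, where a push moves the pointer down before storing and a pop reads the value before moving the pointer back up.

```python
class Stack:
    """Toy model of a downward-growing stack with 4-byte slots (assumed sizes)."""

    def __init__(self, top=0x1000):
        self.sp = top      # stack pointer starts at the high end of the region
        self.mem = {}      # address -> stored value

    def push(self, value):
        self.sp -= 4                 # the pointer moves down first...
        self.mem[self.sp] = value    # ...then the value is stored at the new address

    def pop(self):
        value = self.mem.pop(self.sp)  # the value is read from the current address...
        self.sp += 4                   # ...then the pointer moves back up
        return value

s = Stack()
s.push(111)
s.push(222)
print(hex(s.sp))         # 0xff8
print(s.pop(), s.pop())  # 222 111
print(hex(s.sp))         # 0x1000
```

The last value pushed is the first one popped, and after matching pushes and pops the pointer is back where it started.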

The question is, what if you push or pop the wrong thing?  That can and does happen on occasion, particularly when just developing or modifying a low-level
(assembler) program, and the result is almost always an immediate program
crash.  So it has to be avoided, and it explains why most high level languages
do not have PUSH and POP in their syntax.  Those that do probably support a
pseudo-stack, which would actually be an array structure with some
safeguards against incorrect use.

PowerBasic, on the other hand, does support PUSH and POP via the built-in
inline assembler.  The PUSH and POP here are
the real thing.  You can certainly use them, but remember this rule:  PUSH
comes before POP, and there should be a POP for every PUSH.
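
The rule can be written down as a small check. This is a Python sketch over a plain list of operation names, not a tool that inspects real assembler code, and the function name is made up for illustration.

```python
def stack_is_balanced(ops):
    """Return True if every POP has an earlier PUSH and no PUSH is left over."""
    depth = 0
    for op in ops:
        if op == "PUSH":
            depth += 1
        elif op == "POP":
            depth -= 1
            if depth < 0:      # a POP with nothing pushed before it
                return False
    return depth == 0          # a leftover PUSH also breaks the rule

print(stack_is_balanced(["PUSH", "PUSH", "POP", "POP"]))  # True
print(stack_is_balanced(["POP", "PUSH"]))                 # False
print(stack_is_balanced(["PUSH", "PUSH", "POP"]))         # False
```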

So now you understand that FUNCTIONs and SUBs both involve more use of
the stack than just simply using GOSUB and RETURN, and more use means more
instructions and more time as well.

But PowerBasic goes the extra mile with FUNCTIONs and SUBs.  As mentioned,
there is a Stack Frame, which involves some extra overhead.  First, the
processor has only a few registers available that have to be used everywhere
for everything, and yet when you enter a Function or Sub, you want some (or
most) of them available for your own purposes.  So how do we save the content
of registers temporarily?  Simple:  We put their content onto the stack.  Only
we don't have to do it, PowerBasic does it for us, automatically.  So between
the bottom of the stack and the point where the return address is, there are now more things on the stack - the content of all the registers that need to be saved.

Second, PowerBasic takes responsibility for restoring the content to all the registers when we exit the Function or Sub, and to do that, it directs the EXIT
statement to a routine that it embeds in the Stack Frame that handles that
job.  But it also takes responsibility for removing all the parameters that were
placed on the stack during the call to the Function or Sub.  All this activity and
use of the stack make for a fairly sizeable Stack Frame, but it also means that
you can call and use Functions and Subs without concern for what lies outside
that Function or Sub.
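
As a rough way to compare the two, the stack traffic described above can be listed step by step. This is a simplified Python model and not what the PowerBasic compiler actually emits: the function names, the wording of the steps and the number of saved registers are assumptions.

```python
def gosub_stack_ops():
    """GOSUB/RETURN: only the return address goes on and off the stack."""
    return ["push return address", "pop return address"]

def sub_call_stack_ops(n_params, n_saved_regs):
    """Call to a Sub or Function, following the description in the text."""
    ops = [f"push parameter {i + 1}" for i in range(n_params)]
    ops.append("push return address")
    ops += [f"save register {i + 1}" for i in range(n_saved_regs)]
    # ... the body of the Sub or Function would run here ...
    ops += [f"restore register {i}" for i in range(n_saved_regs, 0, -1)]
    ops.append("pop return address")
    if n_params:
        ops.append(f"discard {n_params} parameter(s)")
    return ops

print(len(gosub_stack_ops()))           # 2
print(len(sub_call_stack_ops(3, 4)))    # 14
```

The exact counts do not matter much; the point is that the GOSUB stays at two stack operations while the Sub call grows with every parameter and every saved register.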

GOSUB and RETURN are essentially the same at the high level as the CALL and
RET instructions found in low level assembly language.  They are not fully
separated from the variables and labels and other things found in the main
body of the program.  In fact GOSUB can only be used within the same function
or sub, and jumps to a line label in that function or sub, whereas a call to a
function or sub would be to code elsewhere outside the immediate function or
sub (except in the rare case of recursive programming, where the same function
or sub would continue to call itself until a task was complete).

So you do not pass anything when you call a GOSUB statement, you just
continue working with the same variables as used elsewhere in the function or
sub, and you do not return any results.  You just do what you do, and whatever
changes you made to the variables are retained.
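
A tiny interpreter shows the idea: GOSUB saves only a return position, RETURN picks it up again, and every instruction works on the same set of variables. This is a Python sketch with a made-up instruction format (`GOSUB`, `RETURN`, `ADD`, `LABEL`, `END` as tuples), not PowerBasic code.

```python
def run(program):
    """Run a tiny instruction list; the format is hypothetical, for illustration."""
    labels = {arg: i for i, (op, arg) in enumerate(program) if op == "LABEL"}
    variables = {"total": 0}   # one shared set of variables, as inside a single Function
    return_stack = []          # holds return positions only, no parameters
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "GOSUB":
            return_stack.append(pc)   # remember where to continue afterwards
            pc = labels[arg]          # first branch: into the subroutine
        elif op == "RETURN":
            pc = return_stack.pop()   # second branch: back to the caller
        elif op == "ADD":
            variables["total"] += arg
        elif op == "END":
            break
    return variables

program = [
    ("GOSUB", "AddTen"),
    ("GOSUB", "AddTen"),
    ("END", None),
    ("LABEL", "AddTen"),
    ("ADD", 10),
    ("RETURN", None),
]
print(run(program))   # {'total': 20}
```

Nothing is passed in and nothing is handed back; the subroutine simply changes `total`, and the change is still there after the RETURN.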

Not so with FUNCTIONs and SUBs.  There, you get to define which variables
are GLOBAL, STATIC, or LOCAL.  Global references apply to all functions and
subs.  Static references are preserved as they were within a given function
or sub (and are not initialized on entry).  Local references are always set to
zero (strings are marked as empty) each time that function or sub is called (and this requires a nulling operation on the stack, which PowerBasic does
automatically during the call).
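
The three lifetimes can be imitated in Python, with the caveat that this is only an analogy: Python has no STATIC keyword, so a function attribute stands in for it here, and all the names are made up for illustration.

```python
call_count_global = 0   # stands in for a GLOBAL: one value shared by all functions

def counter():
    global call_count_global
    local_value = 0     # stands in for a LOCAL: starts at zero on every call
    # Stand-in for a STATIC: created once, then kept between calls.
    counter.static_value = getattr(counter, "static_value", 0)
    call_count_global += 1
    counter.static_value += 1
    local_value += 1
    return call_count_global, counter.static_value, local_value

print(counter())   # (1, 1, 1)
print(counter())   # (2, 2, 1)
```

The global and the static stand-in both keep counting across calls, while the local starts over each time. The difference between the first two is visibility: the global is meant to be used by every function, the static only by the one that owns it.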

Because functions and subs can be written to work with specific groups of
global, static, and local references, and because the variable and label names
used within a given function or sub do not conflict with identical names that
are used in other contexts, there is good reason to resort to the use of a function or sub where it is deemed appropriate.  It is really a question of what best serves in a given situation.


Theo Gottwald

Thanks,
that was a really good explanation for those who are new to this topic, Donald!