ProgEx31 - Add More String Overloads on operator= and operator+

Started by Frederick J. Harris, November 17, 2009, 09:17:45 PM

Frederick J. Harris


/*
 ProgEx31  --  Add More String Overloads on operator= and operator+

 In this enhancement of our String Class we've added support for the assignment
 of a char to a String and for the assignment of another String object to a
 String, and lastly, for the type of String concatenation operations we do in
 PowerBASIC all the time involving adding strings with the '+' operator such as
 s1 = s1 + s2.  If you recall in ProgEx29 and 30 we extensively discussed the
 operator= member with regard to assigning a char* to a String.  With the
 support added here we can do the following...

 String s1, s2;

 s1 = "Hello, ";     //Assign char* to String (We've already done this).
 s1 = s1 + "World";  //Addition operator is new.  We can now add strings.
 s2 = s1;            //This is also new.  Assign one string to another.

 If you've been able to follow what I consider a rather detailed examination
 of the operator= member with regard to a char* as in ProgEx29, then I believe
 you'll have no problem following the logic of the treatment of char and String
 assignment in Strings.cpp.  So at this point our Strings.h header containing
 the declaration of the String class looks like this...

 //Strings.h                         Class Declaration For String Class

 class String
 {
  public:
  String();                         //Uninitialized Constructor - sets pStrBuffer to NULL
  String& operator=(const char);    //Operator= for assigning char to String
  String& operator=(const char*);   //Operator= for assigning literal or char* to String
  String& operator=(const String&); //Operator= for assigning 2nd String to 1st
  String& operator+(const char);    //Operator+ for adding char to String
  char* lpStr(void);                //Returns Address of String data Member  >> this->lpStr()
  ~String();                        //String Destructor - deallocates String memory

  private:
  char* pStrBuffer;                 //This char* member holds the address of an allocated String
 };

 Below is a fun little program to run to examine the sequence of member
 function calls involved when String concatenations are done in a loop.  It's
 easy for programmers who might not be experienced in languages other than
 various BASIC dialects to lose sight of the underlying fury and viciousness
 of memory activity that goes on in these sorts of operations.

 It might be worthwhile to just run through quickly what must happen when
 repetitive String assignments and concatenations occur, and to do this without
 reference to any particular programming language.  Since I've been using my
 name a lot, that will suffice.  Say we want to assign my name to a String...

 s1 = "Frederick"

 First, a memory allocation has to be made for at least 10 characters.  If we
 are doing this in BASIC the language will also allocate enough additional
 memory to maintain whatever memory structures it needs to track and handle
 the String.  I'm referring here to such things as String descriptors.  If we
 wish then to add another String to the one above such as my middle name
 " John" then we'll have this...

 s1 = s1 + " John"

 In that case the language will have to determine whether the original String
 s1 is holding enough memory to add the additional five characters.  When it
 made the allocation for "Frederick" it may have assumed that concatenations
 might occur and acquired additional memory; perhaps it doubled the original
 String length and asked for and received 20 bytes from the operating system's
 memory allocator.  But then again, it may not.  It might only have a 10 byte
 buffer and no more.  What I'm saying here is that this sort of operation in
 BASIC family languages involves proprietary code.  If the String s1 has
 enough memory associated with it, the second String can simply be copied
 after the first.  If not, a new allocation must be made for enough bytes to
 accommodate both Strings, both are copied into it, and the memory for the
 first String is then released.  In terms of PowerBASIC, I do believe the
 manual states that every String re-assignment will cause the underlying
 String's memory to move.  If that is so then it's likely PowerBASIC isn't
 requesting additional memory beyond the String's length to accommodate
 repeated String concatenations.

 The issue here is a speed versus memory frugality tradeoff.  Memory
 allocations are one of the more troublesome issues an operating system faces.
 An operating system must at all times keep track of where programs are
 loaded and where memory is free.  It must be able to hand out memory to
 programs when requests are made, and it must be able to re-consolidate memory
 back into its free store when programs close or release memory.
 Operating systems have extremely complicated algorithms to track and manage
 these things, and the intensity of the measures they undertake to maintain
 free memory depends largely on the current state of the system in terms of its
 'memory stress'.  If many programs are running, or even just a few that are
 making massive memory requests, then the operating system will be aggressively
 attempting to reconsolidate fragmented memory back into the free store.  Under
 low memory stress conditions it may leave lots of chunks of various sizes
 scattered about for a while.

 Because of issues such as these there is a very good argument that can be
 made for attempting to minimize the number of memory allocation requests a
 program makes.  This is especially true if execution speed is an issue,
 because the time a memory allocation takes is not deterministic; it depends
 on the memory state of the machine at the time of the request.

 Getting back to our String Class, as it presently stands it's very frugal in
 terms of memory usage at the cost of making quite excessive use of memory
 allocations.  Take for example the program below that runs a for loop starting
 at 65 - which is the ASCII code for 'A', and continues up to 67 - which is the
 ASCII code for 'C'.  Looking at the output directly after the program you can
 see that after the uninitialized String constructor call, the first function
 call is the operator+ call with a const char parameter.  In that call an
 allocation is made for a miserly two bytes; one for the char and one for the
 null.  That function completes the copy of the char to the String buffer.
 Then the next member function call, operator= with a const String& parameter,
 completes the assignment to the String on the left side of the assignment
 statement.  Because operator+ returns a reference to s1 itself, operator=
 sees a self-assignment and leaves through its early exit.

 This process is repeated for the 'B' and 'C' chars except in those cases the
 memory for the original String is released after a new allocation for each
 single extra byte occurs.  It should be obvious that this technique seems to
 save memory by not requesting many bytes, but it does this at the expense of
 a new release/memory allocation for each char.  If we did the whole alphabet
 in our for loop we would have 26 memory allocations!  This is excessive.

 It's important to consider that the granularity of memory allocations from
 Windows is some multiple of 4 bytes.  I'm personally not sure what it is and
 the number might even vary with the memory allocation function used but some
 of the older Win32 documentation I've seen specified it as 16 bytes.  In any
 case some number that is a multiple of 4 bytes would be a more efficient
 starting point for memory allocations.  What might be a possible starting
 point for improving our String class would be a starting allocation of 8 or
 16 bytes, and the addition of another class member variable to keep track of
 the present allocation for the String.  Also, we might add some logic to
 further increase the allocation in operator+ overloads in anticipation of
 additional concatenations.  How this will affect the logic seen so far is that
 when concatenations occur, the new member storing the maximum allocation will
 have to be checked to see if there is enough unused but allocated memory to
 add the new String to the already allocated one without a new allocation and
 release sequence.  We'll take that up in the next ProgEx.
*/

#include <cstdio>
#include <cstdlib>
#include <new>
#include "Strings.h"

void OutOfMemory()                 //When the program starts I do a call to
{                                  //set_new_handler().  If a memory allocation
puts("No More Memory!  Too Bad!  I'm Outta Here!"); //with new fails OutOfMemory()
exit(1);                          //will be called and the exit(1) function in
}                                  //cstdlib will be called and the program will exit.

int main(void)                     //Let's concatenate the capital letters from
{                                  //A to C together in a loop.  If you look at
String s1;                        //the operator+ member in Strings.cpp that
                                  //takes a char as a parameter, you'll see
std::set_new_handler(OutOfMemory);//that the function allocates memory for a
for(unsigned int i=65; i<68; i++) //String equal in length to the existing
{                                 //String plus an extra byte for the char to
    s1=s1+i;                      //be added plus another for the terminating
    printf("%s\n\n",s1.lpStr());  //NULL.  The operator= then assigns the
}                                 //object on the right to the one on the left
getchar();                        //of the equal sign.

return 0;
}

/*            --Output--
Entering Uninitialized String Constructor!
Leaving Uninitialized String Constructor!

Entering operator+(const char ch)
 ch   = A
 iLen = 0
Leaving operator+(const char ch)

Entering operator=(const String& strRight)
 strRight.pStrBuffer = A
Leaving operator=(const String& strRight) In Early Exit

A

Entering operator+(const char ch)
 ch   = B
 iLen = 1
Leaving operator+(const char ch)

Entering operator=(const String& strRight)
 strRight.pStrBuffer = AB
Leaving operator=(const String& strRight) In Early Exit

AB

Entering operator+(const char ch)
 ch   = C
 iLen = 2
Leaving operator+(const char ch)

Entering operator=(const String& strRight)
 strRight.pStrBuffer = ABC
Leaving operator=(const String& strRight) In Early Exit

ABC
*/


Here is Strings.h

//Strings.h                         Class Declaration For String Class So Far

class String
{
 public:
 String();                         //Uninitialized Constructor - sets pStrBuffer to NULL
 String& operator=(const char);    //Operator= for assigning char to String
 String& operator=(const char*);   //Operator= for assigning literal or char* to String
 String& operator=(const String&); //Operator= for assigning 2nd String to 1st
 String& operator+(const char);    //Operator+ for adding char to String
 char* lpStr(void);                //Returns Address of String data Member  >> this->lpStr()
 ~String();                        //String Destructor
 
 private:
 char* pStrBuffer;                 //This char* member holds the address of an allocated String
};


And here is Strings.cpp

//Strings.cpp            Implementation Of String Class
#include  <stdio.h>      
#include  <string.h>
#include  "Strings.h"


/* ****************************************************************************/
// String::String()  This is an uninitialized String Constructor.  It is
// called when a declaration of a String such as this occurs...
//  
// String s1;
//  
// All it does is assign NULL to the pStrBuffer member.
/* ****************************************************************************/
String::String()
{
puts("Entering Uninitialized String Constructor!");
pStrBuffer=NULL;     //no buffer is allocated until something is assigned
puts("Leaving Uninitialized String Constructor!\n");
}



/* ************************************************************************** */
// String& String::operator=(const char c)   This is an overloaded operator=
// called when a String is assigned a char like so...
//
// String s1, s2;
//
// s1='F';
// s2=114;  //'r'
/* ************************************************************************** */
String& String::operator=(const char c)  //Overloaded operator = for assigning a
{                                        //character to a String
puts("Entering operator=(const char)");  
if(this->pStrBuffer)            //If String already is storing something delete
   delete [] this->pStrBuffer;  //it.  Then allocate a new buffer of two bytes;
pStrBuffer=new char[2];         //one for the char and one for the obligatory
this->pStrBuffer[0]=c;          //NULL.  Then, using base pointer offset
this->pStrBuffer[1]='\0';       //notation, copy char to offset zero and the
                                //NULL right after it.
puts("Leaving operator=(const char)");
return *this;
}



/* ************************************************************************** */
//  This overloaded operator= is called when a String is assigned a char*.  In
//  other words, something like this...
//  
//  String s1;
//  
//  s1="Hello, World!"  
//  
//  or
//  
//  char* pStr="Hello, World!";
//  s1=pStr;
//  
//  The 1st thing the function does is check to see if this->pStrBuffer isn't
//  NULL.  If it isn't then the String is holding some previous String the user
//  doesn't want anymore.  In that case delete [] is called on the String.  In
//  any case a new buffer is then allocated of a length equal to the length of
//  the char* pStr passed in plus an extra byte for the null terminator.  Then
//  the passed in parameter string is copied to the newly acquired buffer.  
//  While this points to a String, *this is a String, and now since the char*
//  has been transferred to the String's storage, we simply return *this.
/* ************************************************************************** */
String& String::operator=(const char* pStr)
{
puts("Entering operator=(const char* pStr)");  
if(this->pStrBuffer)
   delete [] pStrBuffer;   //buffer came from new char[], so use delete []
pStrBuffer=new char[strlen(pStr)+1];
strcpy(pStrBuffer,pStr);
puts("Leaving operator=(const char* pStr)");  
   
return *this;
}



/* ************************************************************************** */
// String& String::operator=(const String& strRight)  Overloaded operator= for
// assigning a second string to a first.  It is triggered by a statement such as
// this
//
// String s1, s2;
//
// s1 = "PowerBASIC Put The Power In BASIC";
// s2 = s1;
//
// The only thing tricky here is the top if.  If you attempt to assign a String
// to itself, e.g.,
//
// s1 = s1;
//
// what would happen without the check is that the String would delete [] its
// own buffer, then call strlen() and strcpy() on the memory it just released.
// The only sane alternative is to return the existing String 'as is', as you
// can see by the return *this.
//
// Otherwise, the function deletes [] the existing buffer, allocates a new one
// adequate to hold the parameter's String, then copies the latter to the new
// buffer.
/* ************************************************************************** */
String& String::operator=(const String& strRight) //Overloaded operator = for
{                                                 //assigning another String to
puts("Entering operator=(const String& strRight)");  //a String
printf("  strRight.pStrBuffer = %s\n",strRight.pStrBuffer ? strRight.pStrBuffer : "(null)");
if(this==&strRight)
{
   puts("Leaving operator=(const String& strRight) In Early Exit\n");    
   return *this;
}  
delete [] this->pStrBuffer;
this->pStrBuffer=NULL;
if(strRight.pStrBuffer)      //an empty (NULL) source just leaves this String empty
{
   pStrBuffer=new char[strlen(strRight.pStrBuffer)+1];
   strcpy(pStrBuffer,strRight.pStrBuffer);
}
puts("Leaving operator=(const String& strRight)\n");  

return *this;
}



/* ************************************************************************** */
// String& String::operator+(const char ch)  Adds a character to a String.  If
// you have this...
//
// String s1;
//
// s1 = "ABC";
// s1 = s1 + 'D';
//
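// ...then this overloaded operator+ String member function will be called.
// As written it appends ch to the left operand in place and returns a
// reference to that same String, so the operator= that follows receives s1
// itself and takes its early exit.
/* ************************************************************************** */
String& String::operator+(const char ch) //Overloaded operator + (Adds char to
{                                        //String)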
unsigned int iLen=0;
char* pNew;

puts("Entering operator+(const char ch)");
printf("  ch   = %c\n",ch);
if(this->pStrBuffer) //1st term (pStrBuffer) isn't empty (NULL)
{
   iLen=strlen(this->pStrBuffer);  //find out length of existing String
   pNew=new char[iLen+2];          //allocate new buffer with two extra bytes
   strcpy(pNew,this->pStrBuffer);  //copy existing String to new buffer
   delete [] this->pStrBuffer;     //delete [] existing String
   this->pStrBuffer=pNew;          //set 'this' String to new buffer address
   this->pStrBuffer[iLen]=ch;      //add char to end of existing String
   this->pStrBuffer[iLen+1]='\0';  //add NULL byte right after that
}
else                 //1st string (this->pStrBuffer) is empty (NULL)
{
   pStrBuffer=new char[2];
   this->pStrBuffer[0]=ch;
   this->pStrBuffer[1]='\0';
}
printf("  iLen = %u\n",iLen);
puts("Leaving operator+(const char ch)\n");

return *this;
}



/* ************************************************************************** */
// This function simply returns the pStrBuffer private char* address of the
// String.  It is useful for example in outputting the string to a file or
// display with the printf function.
/* ************************************************************************** */
char* String::lpStr()
{
return this->pStrBuffer;
}



/* ************************************************************************** */
// This is the String Destructor.  It is called when a String is being
// destroyed.  In such cases it is considered good practice to set any class
// buffers to 0 so the same address doesn't accidentally get deleted twice.
// Calling delete on a NULL pointer is OK; calling it twice on a non null
// pointer isn't OK, because there may be something else at that address.
/* ************************************************************************** */
String::~String()    //String Destructor
{
delete [] pStrBuffer;   //buffer came from new char[], so use delete []
pStrBuffer=NULL;
}