ProgEx29 - Beginning Of A C++ String Class

Started by Frederick J. Harris, November 09, 2009, 01:34:00 AM

Frederick J. Harris


/*
  ProgEx29  C++ Classes Using Creation Of A String Class As An Example

  Creating a String Class is actually a very good exercise for an aspiring C++
  programmer.  Recall that where we left off in ProgEx28 we found that if we
  place an uninitialized Box object on the left side of an equal sign
  (assignment operator) and an initialized Box object on the right side, the
  object on the right will be assigned member by member to the one on the left.
  Considering the difficulties we've had with C strings it might be interesting
  to see what we can get to work in terms of this behavior with a String class.

  This ProgEx shows the skeleton for a String class.  For this bare bones
  example the only really necessary data member is a char* variable to point to
  the Null terminated String.  Here I've named it pStrBuffer and you can see it
  down in the private section of the String Class below.  Like our Box class we
  have an uninitialized String Constructor that doesn't do very much except
  assign a NULL to pStrBuffer when a String declaration such as this is found in
  the program...

  String s1;

  In fact, we have such a declaration in the program below, and if you run it or
  look at the output I've provided, all that happens is this message is
  displayed...

  Entering Uninitialized Constructor!
    this        = 2293584
  Leaving Uninitialized Constructor!

  The C++ keyword 'this' is kind of interesting.  It's like 'me' in PowerBASIC or
  VisualBasic.NET.  It refers to the present object for which member functions
  or data members are being acted upon.  If you think of a struct, a struct is
  actually a block of memory where the various member variables are laid out. In
  terms of a struct 'this' would refer to the base memory allocation address. In
  terms of a class it's essentially the same thing, and C's indirect member
  selection operator '->' (which I haven't covered yet) is used instead of the
  '.' to select the member of the class you want to refer to if you are working
  with a pointer to the base allocation instead of a class object itself.  For
  example, this term...

  this->pStrBuffer

  is the same within the class as this one...

  pStrBuffer
 
  However, if you wish to specify simply the address of the instantiated object
  itself, that would simply be...
 
  this
 
  So, 'this' is an object's address, and *this is the object itself.  Clear as
  mud I expect, but I'm doing my best.  You only learn this stuff by working
  with it awhile - unless you're a genius, of course.
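
  To make the 'this' versus *this distinction above a little more concrete,
  here is a minimal sketch.  The Counter class and its member names are made up
  purely for illustration and are not part of the String class in this ProgEx.

  ```cpp
  #include <stdio.h>

  // Counter is a hypothetical class used only to illustrate 'this'.
  class Counter
  {
   public:
    Counter() : iCount(0) {}
    const Counter* Address() const { return this; }  // 'this' is the object's address
    Counter& Self() { return *this; }                // '*this' is the object itself
    void Increment() { this->iCount++; }             // same as writing iCount++
    int Count() const { return iCount; }

   private:
    int iCount;
  };

  int main(void)
  {
    Counter c1;

    c1.Increment();                 // acts on c1 directly
    c1.Self().Increment();          // acts on c1 again, reached through *this
    printf("&c1 == this : %s\n", (&c1 == c1.Address()) ? "yes" : "no");
    printf("Count       = %d\n", c1.Count());

    return 0;
  }
  ```

  Both increments land on the same object, because Self() returns c1 itself
  rather than a copy of it.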

  Within the class member functions the use of 'this' to refer to data or
  methods is always optional.  However, 'this', which is sometimes called the
  implicit class pointer, is oftentimes useful to output the address of the
  class object being acted upon, which in this example is the String object s1
  down in main().

  In examining the output you can see that the value printed out for 'this' in
  the uninitialized constructor agrees with the address of s1 down in main() as
  derived from this statement in main()...

  printf("&s1           = %u\n",&s1);

  They are both equal to 2293584.  You can also see that the sizeof a String
  object is only four bytes.  Those four bytes are the char* single data member
  pStrBuffer; this was a 32 bit build, and a 64 bit build would typically
  report eight.

  Looking down at the code in main() you'll see this statement several lines
  down from where String s1 was declared...

  s1="Hello, World!";

  It is that assignment statement, with a String object on the left of the
  equal sign and a character string literal on the right side, that triggers
  the C++ class mechanism to look for an operator= member function in the
  String Class that takes a pointer to a character string as its parameter.
  The above term is also equivalent to this, I might add...
 
  s1.operator=("Hello, World!"); 
 
  In either case this is the String Class operator= member function that will
  get called...

  String& operator=(const char* pStr)
  {                             
   if(this->pStrBuffer)
   {       
      delete [] this->pStrBuffer;
      this->pStrBuffer=NULL;
   }       
   this->pStrBuffer=new char[strlen(pStr)+1];
   strcpy(this->pStrBuffer,pStr);
     
   return *this;
  }

  I'll try my best to explain this for you but I have to warn you this is
  conceptually difficult - so get ready.  For one thing, note the return value
  of the function.  It is this...

  String&

  You've never seen that before (I feared I was springing too much on you all at
  once here, so several days after I wrote this I created ProgEx28a where I
  tried to explain reference return values and provide a nice example.  See that
  for more details).  It's a reference to a String object.  I have not discussed
  this so far, but functions in C++ can return a reference.  The interesting
  thing about a function call that returns a reference is that it is something
  termed an lvalue.  A characteristic of an lvalue is that it can appear on the
  left side of an assignment statement.  So, believe it or not, if you have a
  function named Fn1(SomeParam) that returns a reference, instead of this...
 
  iValue = Fn1(SomeParam);
 
  you can have this instead...
 
  Fn1(SomeParam) = iValue;    //  !!!!!!
 
  Strange as that looks, think for a moment of PowerBASIC's Mid$ statement. 
  Here's a short example where I stick my middle initial and a period within
  a BASIC String just consisting of my first and last names...
 
  #Compile Exe
  #Dim All

  Function PBMain() As Long
    Local strBuffer As String

    strBuffer="Frederick    Harris"
    Print "strBuffer = " strBuffer
    Mid$(strBuffer, 11, 2) = "J."
    Print "strBuffer = " strBuffer
    Waitkey$

    PBMain=0
  End Function

  'strBuffer = Frederick    Harris
  'strBuffer = Frederick J. Harris 
 
  So the above placement of a function on the left side of an equal sign isn't
  such a foreign idea after all.  Here is a small C++ program showing a similar
  idea...
     
  #include <stdio.h>
  typedef const char* PCHAR;      //const, since it will point at a literal

  PCHAR& MakeString(PCHAR& pCharacter)
  {
   return pCharacter;
  }

  int main(void)
  {
   PCHAR pChar=NULL;

   MakeString(pChar)="Hello, World!";
   printf("pChar = %s\n",pChar);
   getchar();

   return 0;
  }

  //       --Output--
  //   pChar = Hello, World!
  //
 
  Note that I used a typedef of a const char* so that I could use PCHAR in place
  of it.  The const is there because the pointer ends up aimed at a string
  literal, which must not be written to.  If you remove the typedef the
  MakeString() function has to be prototyped as follows...
 
  const char*& MakeString(const char*& pCharacter);
 
  ...and that's just a bit too ugly for me!  The effect of the reference return
  value of MakeString in the above program is that the
 
  return pCharacter;
 
  is the same as putting the pChar on the left of the assignment statement down
  in main() like so...
 
  pChar="Hello, World!";
 
  Possibly another way to think about it is to compare a reference return value
  to a reference parameter in a function.  In the case of a reference parameter
  the function is working with the actual variable declared in the calling
  procedure, because it was passed that variable's address through the
  parameter.  This is so even if the parameter is named differently than the
  variable in the calling procedure; in other words, the function is working
  with the original variable through an alias name.  In the case above pChar is
  being assigned the "Hello, World!" string through the alias 'pCharacter' in
  the MakeString function.
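
  The same idea can be sketched with plain ints, which may be easier to follow
  than the pointer version.  ElementAt is a made up function name for this
  illustration only; it returns a reference to one slot of the caller's array,
  so the call can sit on either side of the equal sign.

  ```cpp
  #include <stdio.h>

  // ElementAt is a hypothetical helper: it returns a reference to (an alias
  // for) one element of the caller's array, not a copy of its value.
  int& ElementAt(int* pArray, int iIndex)
  {
    return pArray[iIndex];
  }

  int main(void)
  {
    int iValues[3] = {10, 20, 30};

    ElementAt(iValues, 1) = 99;              // function call on the left side
    int iCopy = ElementAt(iValues, 2);       // and the usual right side use

    printf("iValues[1] = %d\n", iValues[1]);
    printf("iCopy      = %d\n", iCopy);

    return 0;
  }
  ```

  The assignment through ElementAt() changes iValues[1] in main() directly,
  just as MakeString() changed pChar through its alias.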
 
  As I intimated, this may take some work on your part to understand, unless you
  just want to 'cookbook' it and move on.  In that case you can simply use what
  I provide here.  I'll eventually just produce the String class for you, but
  I'll proceed on the assumption that you want to understand it.
 
  Having gotten through that part about the reference return value, I believe
  we're through the worst of it.  Referring to the program run below, you can
  see that the address of the String object s1 down in main() is 2293584.  This
  fact is reinforced when the declaration of s1 results in a call to the default
  uninitialized String Class Constructor and the address of the object for which
  the call is being made, i.e., s1, reveals that this = 2293584.
 
  Right after producing the 4 byte sizeof(s1) we have the assignment statement
  which triggers the call to the String operator= member function.  You can see
  in the output from operator= that the right side of the assignment statement,
  i.e., "Hello, World!" came through in the const char* pStr parameter.  The
  operator= declaration uses the const keyword to promise that the characters
  pStr points to will not be modified within the function.
 
  Before proceeding let me first describe in general terms what has to happen
  for the character array "Hello, World!" to be owned by the String object s1.
  First, it's important to recognize that the character array "Hello, World!" is
  known as a literal string constant.  When the program is compiled the
  compiler will allocate memory to hold the characters plus the null terminator
  in the program's data segment.  As such it will be in memory when the program
  is loaded and it will have an address.  However, at this point it may be in
  the program alright, but it doesn't belong to the String s1.  For it to belong
  to the String s1, the String is going to have to allocate memory to hold it,
  then the String literal "Hello, World!" is going to have to be copied to that
  recently acquired memory.  The String Class has as its only data member a
  variable named pStrBuffer and it is the type of C/C++ variable that can hold
  the address of a character array.  So what we need to do is acquire a block of
  memory large enough to hold the String and assign that address to
  this->pStrBuffer.
 
  Referring to both the operator= function and the output from same you should
  see that initially pStrBuffer = 0.  This value was set in the Constructor.
  Right below the line that produces that output of zero you'll see this...
 
  if(this->pStrBuffer)         //If this string is already holding
  {                            //another String - delete its memory
     delete [] pStrBuffer;       
     pStrBuffer=NULL;
  }
 
  The purpose of this code (which doesn't execute in this example) is to check
  to see if s1 is already holding a String.  In our case it isn't, but if it
  were we would want to call the C++ delete operator on it to release the memory
  and String so there would be no memory leak.  If one is deleting any kind of
  array, whether it be an array of chars or an array of objects, one should place
  the '[]' brackets in between the delete operator and the pointer to be deleted.
  The effect this has is to not only release the block of memory pointed to by
  the pointer variable, but to call the Destructors on all the elements of the
  array.  A char is one of the C primitives and has no Destructor to call, but
  memory obtained with new[] must still be released with delete[]; those are
  the rules one must live by.  After those lines you'll see this...
 
  pStrBuffer=new char[strlen(pStr)+1];       //Allocate sufficient memory
  printf("  pStrBuffer  = %u\n",pStrBuffer); //for this String
  strcpy(pStrBuffer,pStr);                   //Move bytes at pStr to this String
 
  The 1st line of the three uses the C++ 'new' operator which is a memory
  allocation device.  It likely calls malloc() but does a number of other rather
  sophisticated things to enable C++ functionality.  After the new keyword is
  the datatype we wish associated with the memory allocation and it is an array
  of chars.  Inside the '[]' brackets is the strlen() function from the C
  Standard Library which was actually incorporated into the C++ libraries.  That
  particular function call will report that "Hello, World!" is 13 characters
  long, and an extra byte is added to that for the null terminator.
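
  The new[] and delete[] pairing discussed above can be watched in action with
  a class that actually has a Constructor and Destructor.  This is only a
  sketch; the Tracer class and its iAlive counter are invented for the purpose.

  ```cpp
  #include <stdio.h>

  // Tracer is a hypothetical class that counts how many of its objects exist.
  class Tracer
  {
   public:
    Tracer()  { ++iAlive; }
    ~Tracer() { --iAlive; }
    static int iAlive;
  };

  int Tracer::iAlive = 0;

  int main(void)
  {
    Tracer* pArray = new Tracer[3];          // three Constructors run
    printf("alive after new[]    = %d\n", Tracer::iAlive);

    delete [] pArray;                        // three Destructors run, then the
    printf("alive after delete[] = %d\n", Tracer::iAlive);   // memory is freed

    return 0;
  }
  ```

  The counter goes to 3 and back to 0.  With a char array there is nothing for
  a Destructor to do, but the allocation and release forms still have to match.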
 
  After the allocation, the address of the new memory is output and you can see
  the number is 4072496.  That address represents the starting byte where our 14
  bytes begin.  The String s1 owns those 14 bytes.  The strcpy (String Copy)
  function from string.h is then used to copy the "Hello, World!" in the
  program's data segment to the newly acquired memory at 4072496.  At this point
  s1 owns that String. It has been assigned to its this->pStrBuffer member.
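
  The allocate-then-copy steps just described can be tried on their own,
  outside of any class.  This is a rough sketch only; DuplicateString is a
  name I made up here, and the caller is responsible for the delete [].

  ```cpp
  #include <stdio.h>
  #include <string.h>

  // DuplicateString is a hypothetical helper: it returns a heap copy of pStr.
  // The caller owns the copy and must release it with delete [].
  char* DuplicateString(const char* pStr)
  {
    char* pCopy = new char[strlen(pStr) + 1];   // +1 for the null terminator
    strcpy(pCopy, pStr);                        // copy the bytes, null included
    return pCopy;
  }

  int main(void)
  {
    const char* pLiteral = "Hello, World!";
    char* pOwned = DuplicateString(pLiteral);

    printf("strlen      = %u\n", (unsigned)strlen(pOwned));
    printf("same text   : %s\n", (strcmp(pLiteral, pOwned) == 0) ? "yes" : "no");
    printf("same memory : %s\n", (pLiteral == pOwned) ? "yes" : "no");
    delete [] pOwned;

    return 0;
  }
  ```

  The text matches but the two addresses differ, which is the whole point: the
  copy lives in memory that was allocated for it, separate from the literal.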
 
  The next part is unfortunately pretty confusing.  What is returned through the
  return statement is...
 
  return *this;
 
  Try to realize that the String object s1 is not the same thing as the char
  array it owns, i.e., "Hello, World!".  When we output the address of s1 in
  main() and the number held in 'this' in the default constructor we came up
  with this equality...
 
  &s1 = this = 2293584
 
  That is the address of the String object itself.  Its single member variable
  pStrBuffer is the first (and only) thing stored there, which is why the output
  shows &pStrBuffer = 2293584 too.  pStrBuffer in turn holds another address,
  and that is the address of the allocation where "Hello, World!" is stored.  So
  s1 is at 2293584, and the char string "Hello, World!" is at 4072496.  In other
  words, what's stored at the address held in 'this' is 4072496, or...
 
  this->pStrBuffer = 4072496
 
  ...and from 4072496 to 4072508 are the 13 characters of "Hello, World!", with
  the null terminator at 4072509.  Since *this is the object itself, the return
  statement hands back a reference to s1 (the object at 2293584), not the
  4072496 buffer address.  That is what lets the result of an assignment be used
  in a larger expression.  After the return and back in main() you can see I
  immediately called my .lpStr() member function on s1 to output both the String
  "Hello, World!", and its address of 4072496.  If you take a quick look at the
  lpStr() member you'll see this...
 
  char* lpStr(void)
  {
   return this->pStrBuffer;
  }
 
  All it is is an accessor member to the private data member pStrBuffer.  We
  need the address of the string to output it through the printf() function.
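
  To see what the reference returned by operator= really refers to, here is a
  small sketch.  MiniString is a cut-down stand-in I invented for this
  illustration (it is not the String class of this ProgEx), and it allocates
  the new buffer before releasing the old one, which is my own choice here.

  ```cpp
  #include <stdio.h>
  #include <string.h>

  // MiniString is a hypothetical, cut-down stand-in for a String class.
  class MiniString
  {
   public:
    MiniString() : pStrBuffer(NULL) {}
    ~MiniString() { delete [] pStrBuffer; }

    MiniString& operator=(const char* pStr)
    {
      char* pNew = new char[strlen(pStr) + 1];   // allocate and copy first,
      strcpy(pNew, pStr);
      delete [] pStrBuffer;                      // then release the old buffer
      pStrBuffer = pNew;
      return *this;                              // the object, not the buffer
    }

    const char* lpStr() const { return pStrBuffer; }

   private:
    MiniString(const MiniString&);               // copying is left out of
    MiniString& operator=(const MiniString&);    // this sketch on purpose
    char* pStrBuffer;
  };

  int main(void)
  {
    MiniString s1;
    MiniString& rResult = (s1 = "Hello, World!");   // bind what operator= returns

    printf("same object : %s\n", (&rResult == &s1) ? "yes" : "no");
    printf("chained call: %s\n", (s1 = "Goodbye!").lpStr());

    return 0;
  }
  ```

  The reference that comes back is s1 itself, so a member function such as
  lpStr() can be called directly on the result of the assignment.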
   
  Good news!  The worst is over!  The rest is pure fun!  Before long we'll be
  using Mid, Left, Right, Instr, Print, Parse, etc!  And in C++!
*/

#include <stdio.h>
#include <string.h>

class String
{
  public:
  String()                      //Uninitialized Constructor
  {
   printf("Entering Uninitialized Constructor!\n");
   pStrBuffer=NULL;
   printf("  this        = %u\n",this);
   printf("Leaving Uninitialized Constructor!\n\n");
  }

  String& operator=(const char* pStr) //Operator Equal -- Used For Assignment
  {                             //of char*, i.e., another char string
   printf("  Entering String& operator=(const char* pStr)\n");
   printf("    pStr        = %s\n",pStr);
   printf("    pStrBuffer  = %u\n",pStrBuffer);
   printf("    this        = %u\n",this);   
   printf("    &pStrBuffer = %u\n",&pStrBuffer);
   if(this->pStrBuffer)         //If this string is already holding
   {                            //another String - delete its memory
      delete [] pStrBuffer;     //and set pStrBuffer equal to 0 (null)   
      pStrBuffer=NULL;
   }
   pStrBuffer=new char[strlen(pStr)+1];          //Allocate sufficient memory
   printf("    pStrBuffer  = %u\n",pStrBuffer);  //for this String and move
   strcpy(pStrBuffer,pStr);                      //bytes at pStr to this String
   printf("  Leaving String& operator=(const char* pStr)\n\n");
   
   return *this;
  }
             
  char* lpStr(void)
  {
   return this->pStrBuffer;
  }
                   
  ~String()                           
  {
   delete [] pStrBuffer;              //allocated with new[], so use delete[]
  }

  private:
  char* pStrBuffer;
};

int main(void)
{
String s1;
puts("In Main()");
printf("  &s1           = %u\n",&s1);
printf("  sizeof(s1)    = %u\n\n",sizeof(s1));
s1="Hello, World!";
printf("  s1.lpStr()    = %s\n",s1.lpStr());
printf("  s1.lpStr()    = %u\n",s1.lpStr());
puts("Leaving Main()");
getchar();

return 0;
}

/*         --Output--
Entering Uninitialized Constructor!
  this        = 2293584
Leaving Uninitialized Constructor!

In Main()
  &s1           = 2293584
  sizeof(s1)    = 4

  Entering String& operator=(const char* pStr)
    pStr        = Hello, World!
    pStrBuffer  = 0
    this        = 2293584
    &pStrBuffer = 2293584
    pStrBuffer  = 4072496
  Leaving String& operator=(const char* pStr)

  s1.lpStr()    = Hello, World!
  s1.lpStr()    = 4072496
Leaving Main()
*/