[comp.lang.c++] How are strings efficiently concatenated?

jamesh@cs.umr.edu (James Hartley) (06/16/91)

Most implementations of strings that I have seen are centered around
retention of a character pointer in the class;  specifically,

        class string {
            char *str;
            ...
        };

Because of the possibility of differing extents, memberwise initialization
or assignment of one string from another is unwise.  For example,

        string s0, s1("one");
        s0 = s1;

s0 should dynamically allocate space for a copy of "one" rather than receive
a copy of the pointer to "one".
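
A minimal sketch of the deep-copy behaviour described above. The class is
spelled String here, and the names copy_of() and c_str() are assumptions
made for illustration; the post only shows the char *str member.

```cpp
#include <cassert>
#include <cstring>

class String {
public:
    String(const char* s = "") : str(copy_of(s)) {}
    // Initialization from another String: allocate a private copy.
    String(const String& other) : str(copy_of(other.str)) {}
    // Assignment: copy the characters, not the pointer.
    String& operator=(const String& other) {
        if (this != &other) {
            char* fresh = copy_of(other.str);  // allocate before releasing
            delete[] str;
            str = fresh;
        }
        return *this;
    }
    ~String() { delete[] str; }
    const char* c_str() const { return str; }

private:
    // Hypothetical helper: returns a heap-allocated duplicate of s.
    static char* copy_of(const char* s) {
        char* p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }
    char* str;
};
```

With this in place, s0 = s1 leaves s0 and s1 holding equal text in two
separate buffers, so destroying one does not invalidate the other.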

How is one to reclaim heapspace when concatenating strings together as in:

        string s2("two");
        s0 = s1 + s2;

The expression s1 + s2 will allocate space for "onetwo", but s0 will create
a new copy "onetwo".  How is the intermediate string deallocated?  What I
would really like to do is write efficient code for the following --

        s0 = s1 + s2 + s3;

No reasonable request will be turned down;  inquiring minds want to know...
-- 
James J. Hartley               _   /| Internet: jamesh@cs.umr.edu
Department of Computer Science \'o.O'   Bitnet: jamesh@cs.umr.edu@umrvmb.bitnet
University of Missouri - Rolla =(___)=    UUCP: ...!uunet!cs.umr.edu!jamesh
"Life is like an analogy..."      U    ACK!  PHFFT!

schemers@vela.acs.oakland.edu (Roland Schemers III) (06/16/91)

In article <2825@umriscc.isc.umr.edu> jamesh@cs.umr.edu (James Hartley) writes:
>
>How is one to reclaim heapspace when concatenating strings together as in:
>
>        string s2("two");
>        s0 = s1 + s2;
>
>The expression s1 + s2 will allocate space for "onetwo", but s0 will create
>a new copy "onetwo".  How is the intermediate string deallocated?  What I
>would really like to do is write efficient code for the following --
>
>        s0 = s1 + s2 + s3;

The way I have seen it done is to create a second string type, called
StringTmp for example, and have + return a StringTmp. The first 's1+s2'
returns a StringTmp, which is then added to s3. Because the left operand is
now a StringTmp, you can write StringTmp &operator+(StringTmp &,String &) so
that it just appends the String to the existing buffer. The assignment to s0
then only involves freeing whatever s0 held before, changing s0 to point to
the StringTmp's buffer, and at the same time nulling out the pointer in the
StringTmp so that it does not get deallocated by the destructor. This can be
done through the function String &operator = (StringTmp &).
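
A rough sketch of the StringTmp idea, under some assumptions. The names
release(), kSlack, and c_str() are hypothetical. The append is written as a
member operator+ and the assignment takes a const StringTmp & with a mutable
pointer, because current compilers will not bind a temporary to the
non-const references shown in the signatures above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class String;

// Hypothetical temporary type: owns the intermediate buffer of a '+' chain.
class StringTmp {
public:
    enum { kSlack = 32 };  // assumed spare room for later appends

    StringTmp(const char* a, const char* b)
        : len_(std::strlen(a) + std::strlen(b)), cap_(len_ + kSlack),
          buf_(new char[cap_ + 1]) {
        std::strcpy(buf_, a);
        std::strcat(buf_, b);
    }
    // Copying hands the buffer over instead of duplicating it.
    StringTmp(const StringTmp& o)
        : len_(o.len_), cap_(o.cap_), buf_(o.release()) {}
    ~StringTmp() { delete[] buf_; }  // deleting a null pointer is a no-op

    StringTmp& operator+(const String& rhs);  // appends in place

    // Give up ownership; the destructor then has nothing to free.
    char* release() const { char* p = buf_; buf_ = 0; return p; }

private:
    StringTmp& operator=(const StringTmp&);  // not needed in this sketch
    std::size_t len_, cap_;
    mutable char* buf_;  // mutable so a temporary can be emptied
};

class String {
public:
    String(const char* s = "") : str_(dup(s)) {}
    String(const String& o) : str_(dup(o.str_)) {}
    ~String() { delete[] str_; }
    String& operator=(const String& o) {
        if (this != &o) { char* p = dup(o.str_); delete[] str_; str_ = p; }
        return *this;
    }
    // Adopt the temporary's buffer: no second copy of the result.
    String& operator=(const StringTmp& t) {
        delete[] str_;
        str_ = t.release();
        return *this;
    }
    const char* c_str() const { return str_; }

private:
    static char* dup(const char* s) {
        char* p = new char[std::strlen(s) + 1];
        return std::strcpy(p, s);
    }
    char* str_;
};

// String + String starts a chain by building one StringTmp.
inline StringTmp operator+(const String& a, const String& b) {
    return StringTmp(a.c_str(), b.c_str());
}

// StringTmp + String reuses the buffer, growing it only when the slack runs out.
inline StringTmp& StringTmp::operator+(const String& rhs) {
    std::size_t n = std::strlen(rhs.c_str());
    if (len_ + n > cap_) {
        cap_ = len_ + n + kSlack;
        char* p = new char[cap_ + 1];
        std::strcpy(p, buf_);
        delete[] buf_;
        buf_ = p;
    }
    std::strcpy(buf_ + len_, rhs.c_str());
    len_ += n;
    return *this;
}
```

With these definitions, s0 = s1 + s2 + s3 builds one buffer for s1 + s2,
appends s3 into it, and lets s0 take that buffer over.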

If I ever get time, I plan to add StringTmps back to my String class. I
took them out because of all the extra functions they were adding, and I
was trying to keep the class to a minimum. Another suggestion for the
StringTmp class might be to have it initially allocate more space than a
normal String, since it is more likely to grow. My String class has a
static variable called hunksize, which is the amount by which a String
grows when it is expanded.
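
One way such a hunksize scheme might look, as a sketch only. The member
name hunksize comes from the description above; the value 16, the append()
and capacity() functions, and the rounding rule are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class String {
public:
    static std::size_t hunksize;  // growth increment, in bytes

    String(const char* s = "") : str(0), len(0), cap(0) { append(s); }
    ~String() { delete[] str; }

    void append(const char* s) {
        std::size_t n = std::strlen(s);
        std::size_t need = len + n + 1;  // characters plus the '\0'
        if (str == 0 || need > cap) {
            // Round the required size up to a whole number of hunks.
            std::size_t newcap = ((need + hunksize - 1) / hunksize) * hunksize;
            char* p = new char[newcap];
            if (str) std::memcpy(p, str, len);
            delete[] str;
            str = p;
            cap = newcap;
        }
        std::memcpy(str + len, s, n + 1);  // copies the terminating '\0' too
        len += n;
    }

    const char* c_str() const { return str; }
    std::size_t capacity() const { return cap; }

private:
    String(const String&);             // copying omitted from this sketch
    String& operator=(const String&);
    char* str;
    std::size_t len, cap;
};

std::size_t String::hunksize = 16;     // assumed value
```

Short appends then fit into the spare room of the current hunk, and a new
allocation happens only when a hunk boundary is crossed.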

Roland
-- 
Roland J. Schemers III                              Systems/Network Manager
schemers@vela.acs.oakland.edu (Ultrix)              Oakland University 
schemers@argo.acs.oakland.edu (VMS)                 Rochester, MI 48309-4401
OU in Michigan! Say it slow: M-i-c-h-i-g-a-n        (313)-370-4323

wicklund@intellistor.com (Tom Wicklund) (06/18/91)

In <2825@umriscc.isc.umr.edu> jamesh@cs.umr.edu (James Hartley) writes:

>The expression s1 + s2 will allocate space for "onetwo", but s0 will create
>a new copy "onetwo".  How is the intermediate string deallocated?  What I
>would really like to do is write efficient code for the following --

>        s0 = s1 + s2 + s3;


Another way is to implement an append operation analogous to the "<<"
of stream output:

   s0 = "";
   s0 << s1 << s2 << s3;

Ideally this should then be merged into streams so that a string can
act as an input or output stream.
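
A minimal sketch of such an append operator, assuming a String class with a
single char pointer member. Everything beyond the "<<" spelling itself
(the class layout, copy_of(), c_str()) is a hypothetical illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class String {
public:
    String(const char* s = "") : str(copy_of(s)) {}
    String(const String& o) : str(copy_of(o.str)) {}
    ~String() { delete[] str; }
    String& operator=(const String& o) {
        if (this != &o) {
            char* fresh = copy_of(o.str);
            delete[] str;
            str = fresh;
        }
        return *this;
    }

    // Append rhs in place and return *this so that calls can be chained.
    String& operator<<(const String& rhs) {
        std::size_t a = std::strlen(str), b = std::strlen(rhs.str);
        char* p = new char[a + b + 1];
        std::memcpy(p, str, a);
        std::memcpy(p + a, rhs.str, b + 1);  // includes the '\0'
        delete[] str;                        // safe even when &rhs == this
        str = p;
        return *this;
    }

    const char* c_str() const { return str; }

private:
    static char* copy_of(const char* s) {
        char* p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }
    char* str;
};
```

Because every "<<" works on s0 itself, no temporary String objects are
created; this simple version still reallocates s0's buffer once per append.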

android@ccwf.cc.utexas.edu (Andy Wilks) (06/18/91)

In article <1991Jun17.234430.8877@intellistor.com> wicklund@intellistor.com (Tom Wicklund) writes:
)In <2825@umriscc.isc.umr.edu> jamesh@cs.umr.edu (James Hartley) writes:
)
)>The expression s1 + s2 will allocate space for "onetwo", but s0 will create
)>a new copy "onetwo".  How is the intermediate string deallocated?  What I
)>would really like to do is write efficient code for the following --
)
)>        s0 = s1 + s2 + s3;
)
)
)Another way is to implement an append operation analogous to the "<<"
)of stream output:
)
)   s0 = "";
)   s0 << s1 << s2 << s3;
)
)Ideally this should then be merged into streams so that a string can
)act as an input or output stream.

Why don't you overload the "+" and "=" operators, have them return a
string&, write the constructors and the destructor, and let the compiler
take care of the rest?  This is very similar to the stream operator,
but "+" would be a more natural operator for concatenation.
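
A sketch of letting the compiler manage the temporaries. One assumption to
flag: "+" returns a String by value here rather than a string&, since a
reference to a local result would dangle once the function returns. The
live counter, operator+=, copy_of(), and c_str() are hypothetical additions
used only to make the cleanup visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class String {
public:
    static int live;  // hypothetical count of String objects currently alive

    String(const char* s = "") : str(copy_of(s)) { ++live; }
    String(const String& o) : str(copy_of(o.str)) { ++live; }
    ~String() { delete[] str; --live; }

    String& operator=(const String& o) {
        if (this != &o) {
            char* fresh = copy_of(o.str);
            delete[] str;
            str = fresh;
        }
        return *this;
    }

    String& operator+=(const String& o) {
        std::size_t a = std::strlen(str), b = std::strlen(o.str);
        char* p = new char[a + b + 1];
        std::memcpy(p, str, a);
        std::memcpy(p + a, o.str, b + 1);
        delete[] str;
        str = p;
        return *this;
    }

    const char* c_str() const { return str; }

private:
    static char* copy_of(const char* s) {
        char* p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }
    char* str;
};

int String::live = 0;

// Returned by value: the compiler creates the temporary and runs its
// destructor at the end of the full expression that used it.
inline String operator+(const String& a, const String& b) {
    String result(a);
    result += b;
    return result;
}
```

After s0 = s1 + s2 + s3, the intermediate strings have already been
destroyed by the compiler-generated destructor calls, though each "+" still
allocates and copies a fresh buffer.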

/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/
*   I don't express opinions, I just follow orders...             (___)   *
/                                                                 (o o)   /
*   Andy Wilks                   One of the few .sig's->   /-------\ /    *
/   andy@fiskville.mc.utexas.edu   with ASCII livestock.  / |     ||O     /
*   android@ccwf.cc.utexas.edu                           *  ||,---||      *
/   University of Texas at Austin                           ^^    ^^      /
*   copyright (c) 1934,1942,1961,1990,1991                    BEVO        *
/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/*/