[comp.lang.c++] String class

schemers@vela.acs.oakland.edu (Roland Schemers III) (07/23/90)

Hello! As part of a summer project on learning C++, I have written
a dynamic String class for C++. I have examined the NIH String class, 
and the GNU libg++ String class and have designed my own class from the two.

It differs from both in that the SubString class is derived from the String
class. This means I didn't have to write special functions to deal with
SubStrings, and all my String functions work with them. The actual
class header for the SubString class is only 18 lines, and the only other
place you will see a SubString mentioned is the return value of a function.
For example, the after function returns the SubString after a given value.
Since a SubString is derived from a String, you can use the SubString
returned from the after function with another function, such as the before
function:

String s1("hello there world");
s1.after("hello").before("world") = " "; //s1=="hello world"
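To illustrate the idea without the actual class, here is a minimal sketch built on std::string. SubView, its members, and collapse_middle are hypothetical names invented for this example, not part of the posted String class, and the real SubString may well be implemented differently. The sketch only shows how a substring object can write back into its owner when assigned to:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the post's SubString: a view of [pos, pos+len)
// inside an owner string that writes back to the owner on assignment.
class SubView {
public:
    SubView(std::string &owner, std::size_t pos, std::size_t len)
        : owner_(owner), pos_(pos), len_(len) {
        assert(pos_ + len_ <= owner_.size());
    }

    // Assigning replaces the viewed range inside the owner.
    SubView &operator=(const std::string &text) {
        owner_.replace(pos_, len_, text);
        len_ = text.size();
        return *this;
    }

    // View of everything after the first occurrence of key in this view.
    SubView after(const std::string &key) const {
        std::size_t hit = find(key);
        if (hit == std::string::npos) return SubView(owner_, pos_ + len_, 0);
        std::size_t start = hit + key.size();
        return SubView(owner_, start, pos_ + len_ - start);
    }

    // View of everything before the first occurrence of key in this view.
    SubView before(const std::string &key) const {
        std::size_t hit = find(key);
        if (hit == std::string::npos) return SubView(owner_, pos_, 0);
        return SubView(owner_, pos_, hit - pos_);
    }

    std::string str() const { return owner_.substr(pos_, len_); }

private:
    // Absolute position of key in the owner, restricted to this view.
    std::size_t find(const std::string &key) const {
        std::size_t hit = owner_.find(key, pos_);
        if (hit == std::string::npos || hit + key.size() > pos_ + len_)
            return std::string::npos;
        return hit;
    }

    std::string &owner_;
    std::size_t pos_;
    std::size_t len_;
};

// Mirrors the example from the post using the stand-in type.
std::string collapse_middle() {
    std::string s1("hello there world");
    SubView whole(s1, 0, s1.size());
    whole.after("hello").before("world") = " ";
    return s1;
}
```

Because after and before both return the same view type, they chain in either order, which is the property the derived SubString gives for free.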

I have also included complete support for regular expressions with a Regex
class (which is also derived from class String). The actual code that 
handles the regular expressions is taken from the latest version of 
GNU EMACS. Since regular expressions are derived from the String class,
you can say something like:
Regex RXa("[aA]*"), RXb("[bB]*"), RXab;
RXab = RXa+RXb;
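A rough sketch of why this works, using std::regex rather than the GNU EMACS code: Pattern, matches, and example_combined are made-up names for this illustration, and std::regex uses ECMAScript syntax by default, which differs from EMACS syntax in places. Deriving from std::string is done here only to mirror the design described above; it is not generally recommended.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical stand-in for a Regex derived from String: the pattern text is
// an ordinary string, so string operations such as + apply to it directly.
class Pattern : public std::string {
public:
    Pattern() {}
    Pattern(const char *text) : std::string(text) {}
    Pattern(const std::string &text) : std::string(text) {}

    // Compile on demand and test whether the whole input matches.
    // An empty (never assigned) pattern is treated as a caller error here.
    bool matches(const std::string &input) const {
        assert(!empty());
        return std::regex_match(input, std::regex(c_str()));
    }
};

// Concatenating two patterns yields a pattern matching one then the other.
inline Pattern operator+(const Pattern &a, const Pattern &b) {
    return Pattern(static_cast<const std::string &>(a) +
                   static_cast<const std::string &>(b));
}

// Mirrors the example from the post using the stand-in type.
bool example_combined(const std::string &input) {
    Pattern RXa("[aA]*"), RXb("[bB]*"), RXab;
    RXab = RXa + RXb;
    return RXab.matches(input);
}
```

The combined pattern accepts any run of a's followed by any run of b's, in either case.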

Anyway, here is a brief description of the class. If there is interest, 
I will make the source, tests, and docs available via anonymous ftp.

----------------------------------------------------------------------

  In the following descriptions assume:

	- S is a String
	- ch is a char
	- n,i,p,l are ints
	- s is a side type (Left,Right,Both)
	- C is either a char, const char *, const String &
	- X is either a C or a const Regex &
	- Y is either an X or an int (pos)
 	- Z is either a Y or an int,int (pos and len)
	- R is a Regular Expression

	return types --

	- bool is an int value where zero is false and non-zero is true
	- ushort is an unsigned short 
	- String& is a reference to S
	- SubString is a SubString within S, and can be used on 
	  the left-hand side of an assignment, i.e. S.after("hello")="good bye";

Constructors -

  String()
     creates an empty String // String s;
  String(n)
     creates a String with pre-allocated size n // String s(32);
  String(ch)
     creates a String equal to character ch // String space(' ');
  String(const char *)
     creates a String equal to const char * // String word("hello");
  String(const char *s,int n)
     creates a String equal to const char *,with length n // String word(s,4);
  String(const String &s)
     creates a String equal to another String // String s(word);

returns		function		description
-------		--------		-----------
String&		S  = X			assigns X to S
String&		S += Y			appends Y to the end of S
String&		S -= Y			removes Y if at end of S
String&		S *= n			multiplies S by n
String&		S /= X			removes all occurrences of X from S
String		C1 + C2			concatenates C2 onto C1
String		C1 - C2			removes C2 if at the end of C1
String		C1 * C2
String		S  / X			removes all occurrences of X from S
char		S[i]			returns char indexed by 'i'
SubString	S(Z)			returns the substring Z in S
bool		!S			true if length=0
bool		C1 <  C2		standard relational
bool		C1 <= C2		  "          "
bool		C1 == C2		  "          "
bool		C1 != C2		  "          "
bool		C1 >= C2		  "          "
bool		C1 >  C2		  "          "
ushort		S.length()		returns length of string
bool		S.empty()		true if length=0
const char*	(const char *) S 	converts string to const char *
char*		S.cptr()		converts string to char *
bool(void *)	S			true if length!=0 // if (S) ... 
ostream&	ios << S		outputs string
istream&	ios >> S		inputs string
int		getline(ios,S,ch)	reads a line into S from stream ios, 
					using delimiter ch,returns length
int		S.index(X)		position of X within S
bool		S.contains(X)		true if S contains X
SubString	S.substr(p)		substring starting at p to end of S
SubString	S.substr(p,l)		substring starting at p with length l
SubString	S.left(n)		left n characters in S
SubString	S.right(n)		right n characters in S
SubString	S.between(n1,n2)	characters between n1 and n2
String&		S.insert(p,C)		inserts C into S at position p
String&		S.prepend(C)		prepends C to S
String&		S.append(C) 		appends C to S
String&		S.remove(Z)		removes Z from S
SubString	S.before(Y)		substring S before Y
SubString	S.through(Y)		substring through (up to and including) Y
SubString	S.at(Z)			substring at Z
SubString	S.from(Y)		substring from Y to end of S
SubString	S.after(Y)		substring after Y to end of S
String		S.except(Z)		everything in S except Z
SubString	S.skip(X)		skips optional X, returns substring after
SubString	S.ws()			skips white space
String&		S.replace(X1,X2)	replaces X1 with X2
SubString	S.pos(i)		assigns i current position within S
SubString	S.moveto(X)		moves up to X and returns from X to end
SubString	S.find(X)		finds X in S, return after X to end
SubString	S.match(X)		matches X with leftmost side of S
int		S.split(R,S1[],n)	splits S into string array S1[0..n],
					on Regex R and returns number split
String&		S.trim(s)		trims white space on side s
String&		S.pad(n,s,ch)		pads to length n with char ch on side s
String&		S.trunc(n)		truncates S to length n
String&		S.upper()		upper cases string
String&		S.lower()		lower cases string
String&		S.reverse()		reverse string
String&		S.icase()		ignores case during next relational 
					operation // S.icase() < "HELLO"
String&		S.ucase()		uses case during next relational op.
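
For readers who want a feel for the less familiar operators in the table, here is a sketch of -=, *= and /= written as free functions on std::string. The function names are invented for this example. Reading "multiplies S by n" as "n copies of S" is an assumption, and so is the choice to rescan from the same position after each removal in remove_all.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Sketch of "S -= Y": drop the suffix only when s actually ends with it.
std::string &remove_suffix(std::string &s, const std::string &suffix) {
    if (s.size() >= suffix.size() &&
        s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0)
        s.erase(s.size() - suffix.size());
    return s;
}

// Sketch of "S *= n", assuming it means n copies of the original text.
std::string &repeat(std::string &s, int n) {
    assert(n >= 0);
    const std::string unit(s);
    s.clear();
    for (int i = 0; i < n; ++i) s += unit;
    return s;
}

// Sketch of "S /= X": erase every occurrence of x, left to right.
// After an erase the search resumes at the same position, so an occurrence
// formed by the removal itself is also removed.
std::string &remove_all(std::string &s, const std::string &x) {
    if (x.empty()) return s;
    std::size_t at = 0;
    while ((at = s.find(x, at)) != std::string::npos) s.erase(at, x.size());
    return s;
}
```

Each function returns a reference to the modified string, matching the String& return type listed for the compound operators.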
-- 
Roland J. Schemers III                              Systems Programmer 
schemers@vela.acs.oakland.edu (Ultrix)              Oakland University 
schemers@argo.acs.oakland.edu (VMS)                 Rochester, MI 48309-4401
"Get off your LEF and do something!"                (313)-370-4323

schemers@vela.acs.oakland.edu (Roland Schemers III) (07/24/90)

Hello! Due to the number of requests I am getting for my String class,
I have placed a copy in the anonymous ftp directory on the machine
vela.acs.oakland.edu (35.146.10.2). Simply login as anonymous, and
cd /pub/C++. The file is called String.tar.Z and is about 100K.

It was written using AT&T's cfront 2.0, on a VAX6310 running Ultrix 3.1b, 
a DECsystem 5820 running Ultrix 3.1c, and a VAXstation 3100 running Ultrix 4.0. 
My code is very portable since I don't use any of the strxxx functions
(except strncmp and strncasecmp, which I haven't phased out yet), and
the GNU stuff should also be fairly portable.

Do what you want with my code. The regular expression stuff is from 
GNU EMACS 18.55, so the normal rules apply there (whatever that means...).

For all the people who don't have ftp access, how do you want the sources
mailed? They are about 300K uncompressed. I could use shar and then
break it up into small chunks if that's OK.

Roland

daf@public.BTR.COM (David A. Feustel daf@btr.com) (07/25/90)

How about COMPRESSing the source and then UUENCODEing it?

schemers@vela.acs.oakland.edu (Roland Schemers III) (08/13/90)

Hello! For those of you using my String class, I thought I would mention
I have a new release, which is quite improved over the last one. The main
changes are with substrings and regular expressions.

Since the SubString class is derived from the String class, you can now use
any of the String functions safely with a SubString, for example:

String s1("ab 12 cd");
s1.at("12") += "34";  // s1=="ab 1234 cd"

s1="12 y 34";
s1.at("y").append("z").prepend("x").insert(0,"w"); // s1 =="12 wxyz 34"

Also, you can now even input directly into a SubString:

s1="hello XXX world";
cin >> s1.substr(6,3) ;  // input from standard input to substring XXX

type: there

// s1 == "hello there world"
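
One way such an input operator could be arranged is sketched below with std::string and an in-memory stream standing in for cin. Range, substr_of, and fill_placeholder are hypothetical names for this illustration only. The sketch assumes the operator reads one whitespace-delimited word, which may not be what the real class does.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical substring handle: a range inside an owner string.
struct Range {
    std::string &owner;
    std::size_t pos;
    std::size_t len;
};

inline Range substr_of(std::string &owner, std::size_t pos, std::size_t len) {
    assert(pos + len <= owner.size());
    Range r = {owner, pos, len};
    return r;
}

// Taking Range by value lets a temporary appear on the right of >>.
// On a failed read the owner string is left untouched.
inline std::istream &operator>>(std::istream &in, Range r) {
    std::string word;
    if (in >> word) r.owner.replace(r.pos, r.len, word);
    return in;
}

// Mirrors the example from the post, reading from a string instead of cin.
std::string fill_placeholder(const std::string &typed) {
    std::string s1("hello XXX world");
    std::istringstream in(typed);
    in >> substr_of(s1, 6, 3);
    return s1;
}
```

The replacement word does not have to be the same length as the range it replaces; the owner string grows or shrinks as needed.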

---

The second major change is in the way the Regex class works. The Regex
class is now derived from the StringSearch class, and you can now
create your own class which is derived from the StringSearch class.
Your class can then be used with any of the String functions that
take a StringSearch argument. All you have to do is write the virtual
search function. For example:

class SSwhitespace : public StringSearch {
public:
	SSwhitespace(){}
	int search(const String &s, int &matchlen) const ;
};

int SSwhitespace::search(const String &s, int &len) const
{
int p1;
StringIterator next(s);
char ch;

   while (next(ch)) 
	if (isspace(ch)) {
	      p1=next.pos();
	      while (next(ch) && isspace(ch));
	      len=next.pos()-p1; // set length to length of match
	      return p1;	 // return start of match
	}

   len=0;		// set matched length to 0
   return -1;		// return -1, not found!

}

const SSwhitespace SSwhite;

int main() 
{
String s1("This	 	  is a test");

s1.at(SSwhite)=" "; // s1 == "This is a test"

}

Using this technique you can do away with the Regex class, and simply
write your own searching functions. Note that you don't have to derive
a class to use the StringSearch class, you can simply declare a
StringSearch variable:

int whitespace_search(const String &s, int &len);
StringSearch SSwhite(whitespace_search);
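
The function-pointer form can be sketched in a self-contained way as follows. Searcher, replace_first, and squeeze_first_gap are invented names standing in for StringSearch and S.at(...) = ...; the real class interfaces are not shown in this post, so treat every signature here as an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for StringSearch: either override search() in a
// derived class, or hand a plain function to the constructor.
class Searcher {
public:
    typedef int (*SearchFn)(const std::string &s, int &matchlen);

    Searcher() : fn_(0) {}
    explicit Searcher(SearchFn fn) : fn_(fn) {}
    virtual ~Searcher() {}

    // Returns the start of the match and sets matchlen, or returns -1.
    virtual int search(const std::string &s, int &matchlen) const {
        assert(fn_ != 0);
        return fn_(s, matchlen);
    }

private:
    SearchFn fn_;
};

// Plain search function: finds the first run of white space.
int whitespace_search(const std::string &s, int &len) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (std::isspace(static_cast<unsigned char>(s[i]))) {
            std::size_t end = i;
            while (end < s.size() &&
                   std::isspace(static_cast<unsigned char>(s[end])))
                ++end;
            len = static_cast<int>(end - i);  // length of the match
            return static_cast<int>(i);       // start of the match
        }
    }
    len = 0;
    return -1;  // not found
}

// Rough counterpart of s.at(search) = text: replace the first match.
bool replace_first(std::string &s, const Searcher &search,
                   const std::string &text) {
    int len = 0;
    int pos = search.search(s, len);
    if (pos < 0) return false;
    s.replace(static_cast<std::size_t>(pos), static_cast<std::size_t>(len),
              text);
    return true;
}

std::string squeeze_first_gap() {
    const Searcher SSwhite(whitespace_search);
    std::string s1("This \t  is a test");
    replace_first(s1, SSwhite, " ");
    return s1;
}
```

The caller only ever sees the search() interface, so a derived class and a wrapped function are interchangeable.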

This example also shows the use of the new StringIterator class, which
creates inline functions that safely step through a String.

Classes derived from the StringSearch class could be quite powerful.
You could write classes that look up keywords in a symbol table, that use 
a hashing function, and so on.
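
As a sketch of the keyword idea, the searcher below reports the first word in a string that appears in a small table. Searcher, KeywordSearch, and first_keyword are hypothetical names, and the table is a std::set (an ordered tree) rather than a hash table, purely to keep the example short.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for the StringSearch interface.
class Searcher {
public:
    virtual ~Searcher() {}
    // Returns the start of the match and sets matchlen, or returns -1.
    virtual int search(const std::string &s, int &matchlen) const = 0;
};

// Finds the first whole word of s that is present in the keyword table.
class KeywordSearch : public Searcher {
public:
    void add(const std::string &word) {
        assert(!word.empty());
        table_.insert(word);
    }

    int search(const std::string &s, int &matchlen) const {
        std::size_t i = 0;
        while (i < s.size()) {
            if (!is_word_char(s[i])) { ++i; continue; }
            std::size_t end = i;
            while (end < s.size() && is_word_char(s[end])) ++end;
            if (table_.count(s.substr(i, end - i)) != 0) {
                matchlen = static_cast<int>(end - i);
                return static_cast<int>(i);
            }
            i = end;  // not a keyword, continue after this word
        }
        matchlen = 0;
        return -1;
    }

private:
    static bool is_word_char(char c) {
        return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
    }

    std::set<std::string> table_;
};

// Example use with a two-entry table.
int first_keyword(const std::string &line, int &len) {
    KeywordSearch keywords;
    keywords.add("while");
    keywords.add("return");
    return keywords.search(line, len);
}
```

Matching whole words means "whiled" is not reported as containing the keyword "while".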

Anyway, enough of my babbling. The new String class is available via 
anonymous ftp at vela.acs.oakland.edu (35.146.10.2). 
It is in pub/C++/String1.2.tar.Z, and is about 93K.

cheers, Roland


