[comp.lang.perl] Changing the first character of a string.

sean@ms.uky.edu (Sean Casey) (06/30/90)

How can I change an arbitrary character of a string? I've got a fixed length
string, $flags, that is used in a printf. I need to be able to do something
like:

	$flags = "   ";

	if ($mb_flags[$num] & $FL_UNREAD) {
		set first character of $flags to 'U';
	}
	if ($mb_flags[$num] & $FL_SENT) {
		set second character of $flags to 'S';
	}
	and so on

That's why we need a perl book. I've written 40 or so odd perl scripts, and
something this simple evades me.

Sean

emv@math.lsa.umich.edu (Edward Vielmetti) (06/30/90)

(we need a perl book)

	$flags = '';
	if ($mb_flags[$num] & $FL_UNREAD) {	# bitwise &, to test the flag bit
		$flags .= "U" ;
	} else {
		$flags .= " " ;
	}

would concatenate the flag character onto your string.  Or you
could use an array and use join to print them out.  Or probably
some other method is more efficient...
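
The array-and-join idea might look something like the sketch below. The
flag bit values and the helper name flags_field are made up for
illustration; the thread never gives the real ones.

```perl
use strict;
use warnings;

# Hypothetical flag bits; the real values are not given in the thread.
my ($FL_UNREAD, $FL_SENT, $FL_DELETED) = (1, 2, 4);

# Build the fixed-width flags field for one message's flag word.
sub flags_field {
    my ($bits) = @_;
    my @chars = (
        ($bits & $FL_UNREAD)  ? 'U' : ' ',
        ($bits & $FL_SENT)    ? 'S' : ' ',
        ($bits & $FL_DELETED) ? 'D' : ' ',
    );
    return join '', @chars;    # always exactly three characters
}

printf "[%s] %s\n", flags_field($FL_UNREAD | $FL_DELETED), 'message 1';
```

Each slot is filled independently, so the field keeps its width no matter
which flags are set.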

--Ed

Edward Vielmetti, U of Michigan math dept <emv@math.lsa.umich.edu>

merlyn@iwarp.intel.com (Randal Schwartz) (07/01/90)

In article <sean.646706730@s.ms.uky.edu>, sean@ms (Sean Casey) writes:
| 
| How can I change an arbitrary character of a string? I've got a fixed length
| string, $flags, that is used in a printf. I need to be able to do something
| like:
| 
| 	$flags = "   ";
| 
| 	if ($mb_flags[$num] & $FL_UNREAD) {
| 		set first character of $flags to 'U';
| 	}
| 	if ($mb_flags[$num] & $FL_SENT) {
| 		set second character of $flags to 'S';
| 	}
| 	and so on

substr() can be used as an lvalue.  From the manpage:

     substr(EXPR,OFFSET,LEN)
             Extracts a substring out of  EXPR  and  returns  it.
             First  character  is at offset 0, or whatever you've
             set $[ to.  If OFFSET is negative, starts  that  far
             from  the  end  of  the  string.   You  can  use the
                                                =================
             substr() function as an lvalue, in which  case  EXPR
             ====================================================
             must  be an lvalue.  If you assign something shorter
             ===================
             than LEN, the string will shrink, and if you  assign
             something  longer  than LEN, the string will grow to
             accommodate it.  To keep the string the same  length
             you  may  need  to  pad  or  chop  your  value using
             sprintf().

In English (or something resembling English :-), this means that you
can select portions of strings with substr, and then assign into them.
So, a sample of what you wanted to do is:

	$flags = " " x 3; # hard to count spaces... :-)
	substr($flags,0,1) = "U";
	print "<$flags>\n";
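
Carried through to the original question, a sketch might read as follows.
The flag bit values and the helper name flags_for are assumptions for
illustration. The last two assignments show the manpage's point about
length: a replacement longer than LEN grows the string unless it is
trimmed first with sprintf.

```perl
use strict;
use warnings;

# Hypothetical flag bits; the real values are not given in the thread.
my ($FL_UNREAD, $FL_SENT) = (1, 2);

sub flags_for {
    my ($bits) = @_;
    my $flags = ' ' x 3;
    substr($flags, 0, 1) = 'U' if $bits & $FL_UNREAD;   # first character
    substr($flags, 1, 1) = 'S' if $bits & $FL_SENT;     # second character
    return $flags;
}

printf "<%s>\n", flags_for($FL_UNREAD | $FL_SENT);

# Assigning something longer than LEN makes the string grow.
my $grown = ' ' x 3;
substr($grown, 0, 1) = 'UNREAD';                    # 8 characters now

# Trimming the value to one character keeps the length fixed.
my $fixed = ' ' x 3;
substr($fixed, 0, 1) = sprintf '%-1.1s', 'UNREAD';  # still 3 characters
```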

| That's why we need a perl book. I've written 40 or so odd perl scripts, and
| something this simple evades me.

I've written probably ten times that many *production* scripts (and
many more one-offs), and I still find myself reading the manpage (and
occasionally the code :-) from cover to cover to get what I need to
know.  I hope the book clears up some of the folklore and allows you
to have access to our collective experience.

for("hacker","Perl","another","Just"){substr($x,0,0)="$_ ";}substr($x,-1,1)=",";print$x
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/

worley@compass.com (Dale Worley) (07/02/90)

Of course, there's always:

	$flags =~ s/^./U/;

(If $flags doesn't contain newlines, or $(whatever) is 0.)
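
The newline caveat can be seen in a small sketch; the variable names are
illustrative only. The /s modifier in the last case is a later Perl 5
feature, not something available when this was posted.

```perl
use strict;
use warnings;

# Ordinary case: the first character is a space and gets replaced.
my $spaces = '   ';
my $hit = ($spaces =~ s/^./U/);

# A leading newline defeats the pattern, because . does not match "\n".
my $nl_first = "\n  ";
my $miss = ($nl_first =~ s/^./U/);

# Perl 5's /s modifier lets . match a newline as well.
my $nl_fixed = "\n  ";
$nl_fixed =~ s/^./U/s;

print "hit\n"  if $hit;
print "miss\n" unless $miss;
```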

Dale Worley		Compass, Inc.			worley@compass.com
--
PHOTOVOLTAICS: safe and clean (but not cheap) electricity from the SUN.

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (07/03/90)

In article <1990Jul2.153957.24671@uvaarpa.Virginia.EDU> worley@compass.com writes:
: Of course, there's always:
: 
: 	$flags =~ s/^./U/;
: 
: (If $flags doesn't contain newlines, or $(whatever) is 0.)

If you want to avoid that problem, you can get fancy and say

	vec($flags,$offset,8) = ord('U');

That will extend the string as necessary.  (Of course, it'll extend it with
null characters if $offset is greater than the current length...)
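
A short sketch of both behaviours, with made-up variable names: with
8-bit elements, vec addresses one byte per offset, and writing past the
end pads the gap with NULs.

```perl
use strict;
use warnings;

# Within the string: byte 0 is overwritten, the length is unchanged.
my $flags = '   ';
vec($flags, 0, 8) = ord('U');

# Past the end: the string grows and the gap is filled with "\0".
my $short = 'ab';
vec($short, 4, 8) = ord('U');

printf "<%s> length %d\n", $flags, length $flags;
printf "length %d\n", length $short;
```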

You could also say

	$flags =~ s/^[^\0]?/U/;

But I still vote for substr.

Larry

worley@compass.com (Dale Worley) (07/03/90)

   From: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)

	   $flags =~ s/^[^\0]?/U/;

Hmmm...  Why is NUL treated specially?

Also, this illustrates one thing I don't like about regexps -- people
write code which depends on the order in which the alternatives are
matched.  For instance, in the regexp above, the case where [^\0]?
matches the null string can always match, so it implicitly depends on
the fact that the non-null match is tried first.  On the other hand,
it's hard (impossible?) to write a regexp which matches in only the
right way without some way to specify context for the match (shades of
\: and \;!!!).

Dale Worley		Compass, Inc.			worley@compass.com
--
They are actually planning to produce object oriented COBOL.
If it comes out, it will undoubtedly be called "ADD 1 TO COBOL".
	-- Cay Horstmann

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (07/04/90)

In article <1990Jul3.144552.5407@uvaarpa.Virginia.EDU> worley@compass.com writes:
: 
:    From: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
: 
: 	   $flags =~ s/^[^\0]?/U/;
: 
: Hmmm...  Why is NUL treated specially?

You'll have to ask the designers of C that question.  Perl isn't treating
it at all specially here--it's just relying on the convention that normal
text never contains it.  /[^\0]/ is just an idiom for matching any textual
character including newline, which /./ specifically won't match.
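
The difference shows up when the first character is a newline. A minimal
sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my $s = "\n  ";

# .? cannot match the newline, so it matches the empty string at the
# start and U is inserted rather than substituted.
(my $dot = $s) =~ s/^.?/U/;

# [^\0]? does match the newline, so the first character is replaced.
(my $not_nul = $s) =~ s/^[^\0]?/U/;

# On an empty string the ? lets the pattern match nothing, so the
# string is extended to hold the U.
(my $empty = '') =~ s/^[^\0]?/U/;

printf "%d %d %d\n", length $dot, length $not_nul, length $empty;
```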

: Also, this illustrates one thing I don't like about regexps -- people
: write code which depends on the order in which the alternatives are
: matched.  For instance, in the regexp above, the case where [^\0]?
: matches the null string can always match, so it implicitly depends on
: the fact that the non-null match is tried first.  On the other hand,
: it's hard (impossible?) to write a regexp which matches in only the
: right way without some way to specify context for the match (shades of
: \: and \;!!!).

The longest-match-first principle is a long-standing tradition.  It seems
more useful (or at least, less confusing) to have things be self-fitting than
to have to take external measures to make them fit.

One interesting exception to this rule is that alternatives (in a typical
backtracking regexp package, anyway) are matched left to right, even if
the left alternative is shorter:

	$_ = 'abccccc';
	/(ab|abc)c*/;
	print $1;

will print "ab".  In general, this doesn't get in your way.
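
The same example can be turned into a side-by-side sketch (variable names
are illustrative): reordering the alternatives changes what $1 captures,
while a greedy ? tries the one-character match before the empty one.

```perl
use strict;
use warnings;

my $text = 'abccccc';

# Alternatives are tried left to right, so the shorter "ab" wins here.
my ($left_first) = $text =~ /(ab|abc)c*/;

# Putting the longer alternative first lets it capture "abc".
my ($reordered) = $text =~ /(abc|ab)c*/;

# A greedy ? prefers to consume a character when it can, which is what
# the s/^[^\0]?/U/ idiom relies on.
(my $flags = '   ') =~ s/^[^\0]?/U/;

print "$left_first $reordered <$flags>\n";
```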

The place where longest-first makes problems is when you're scanning
for opening and closing delimiters when there might be more than one
pair on the line.  In this case, you have to find some way of restricting
the * from matching everything from the first opening delimiter to the
final closing delimiter.  With a single character delimiter, you can
do it easily (neglecting backslashes for the moment):

	/"[^"]*"/

But what about multi-character delimiters such as C comments?  The easiest
way in Perl is to first translate the multi-character delimiters to
single characters that don't otherwise occur, and then do the above trick.
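
Both tricks can be sketched as below. The sample strings are made up, and
the choice of \x01 and \x02 as stand-in delimiters assumes those bytes
never occur in the text being scanned.

```perl
use strict;
use warnings;

my $line = 'say "one" and "two"';

# A greedy .* runs from the first quote to the last one on the line.
my ($greedy) = $line =~ /(".*")/;

# Excluding the delimiter from the body stops at the first closing quote.
my ($tight) = $line =~ /("[^"]*")/;

# C comments: translate each two-character delimiter to a single
# character (assumed absent from the input), then use the same trick.
my $code = 'a = 1; /* first */ b = 2; /* second */';
(my $marked = $code) =~ s{/\*}{\x01}g;
$marked =~ s{\*/}{\x02}g;
my @comments = $marked =~ /\x01([^\x02]*)\x02/g;

print "$greedy\n$tight\n";
print "comment:$_\n" for @comments;
```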

Larry

worley@compass.com (Dale Worley) (07/05/90)

   From: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)

   You'll have to ask the designers of C that question.  Perl isn't treating
   it at all specially here--it's just relying on the convention that normal
   text never contains it.  /[^\0]/ is just an idiom for matching any textual
   character including newline, which /./ specifically won't match.

Sounds like a good way to get screwed if the string actually contains
a NUL.  Why not "(.|\n)"?
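
To make the concern concrete, here is a sketch with a string that does
start with a NUL (variable names are illustrative):

```perl
use strict;
use warnings;

my $with_nul = "\0  ";

# [^\0]? cannot match the leading NUL, so it matches the empty string
# and U is inserted in front of it; the string grows by one.
(my $idiom = $with_nul) =~ s/^[^\0]?/U/;

# (.|\n)? matches any single character, NUL included, so the first
# character really is replaced.
(my $alt = $with_nul) =~ s/^(.|\n)?/U/;

printf "%d %d\n", length $idiom, length $alt;
```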

   But what about multi-character delimiters such as C comments?

Well, my preference is something like

   .* \- .*\*/.*

that is, any string except one containing the closing "*/" (exploiting
the fact that the complement of a regular language is regular), but
nobody implements that.
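
Later Perl 5 releases do not offer complement as such, but they get a
similar effect with non-greedy quantifiers and negative lookahead. The
sketch below assumes Perl 5 and a made-up input string; neither feature
existed in the Perl of this thread.

```perl
use strict;
use warnings;

my $code = "x = 1; /* one */ y = 2; /* two\n lines */";

# Non-greedy .*? stops at the first closing delimiter; /s lets . match
# the newline inside the second comment.
my @lazy = $code =~ m{/\*(.*?)\*/}gs;

# Negative lookahead states the "body does not contain */" condition
# directly: consume a character only if "*/" does not start there.
my @strict_body = $code =~ m{/\*((?:(?!\*/).)*)\*/}gs;

print scalar(@lazy), " comments found\n";
```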

Dale Worley		Compass, Inc.			worley@compass.com
--
Software without hardware is an idea.
Hardware without software is a space heater.

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (07/06/90)

In article <1990Jul5.135434.11674@uvaarpa.Virginia.EDU> worley@compass.com writes:
: 
:    From: lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall)
: 
:    You'll have to ask the designers of C that question.  Perl isn't treating
:    it at all specially here--it's just relying on the convention that normal
:    text never contains it.  /[^\0]/ is just an idiom for matching any textual
:    character including newline, which /./ specifically won't match.
: 
: Sounds like a good way to get screwed if the string actually contains
: a NUL.  Why not "(.|\n)"?

It's not likely to contain a NUL in the context we were originally talking
about, in which the programmer will probably have initialized the string to
spaces.  Then again, it's unlikely to contain a newline for the same reason.

And (.|\n) is considerably slower, at least the way things are currently
set up.  It wouldn't be if we were using a DFA.  But we aren't.

Larry