[comp.lang.perl] path truncation quest

barn@convex.com (Tim Barney) (03/21/91)

Any suggestions on how, given the argument "/directory/sub/100.host.0"
that one could take off the "100.host.0" component, leaving
"/directory/sub" ?

I've thought of splitting it and just taking the first 2 components,
but there could be any number of directory levels.

tim_

barn@convex.com (03/21/91)

Does anybody have a mechanism for semaphoring exclusive access among
multiple instances of a Perl script (for the purpose of accessing files
and directories)? I especially desire a mechanism without race conditions
(using test-and-set type features of someone's hardware). Thanks! tim_

ziegast@eng.umd.edu (Eric W. Ziegast) (03/21/91)

In article <1991Mar21.010547.13331@convex.com> you write:
>Any suggestions on how, given the argument "/directory/sub/100.host.0"
>that one could take off the "100.host.0" component, leaving
>"/directory/sub" ?

At first, I thought about using an array:

	@tmp = split(/\//,$path);
	pop(@tmp);			# discard the last component
	$dir = join("/",@tmp);

But in Perl, there's always a better way of doing things:

	$path =~ s/((\/[^\s\/]+)+)\/([^\s\/]+)/$1/;

A simplified version would be:

	/((\/F)+)\/(F)/

	F matches a one word file/directory name.
		In my example I use [^\s\/]+ which simply matches
		any word without whitespace or "/" in it.
		You may want to use something more strict instead.

	$1 will match the directory path.  (man dirname)
	$2 is used only for /F grouping
	$3 will match the base name.  (man basename)

This is, of course, only if it's a full path name.  Anything else
won't match (and won't be changed).
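
A minimal sketch of the capture groups described above, written for a
current Perl (strict, lexical variables) rather than the 1991 interpreter.
The helper name split_full_path and the added ^ and $ anchors are
assumptions made for illustration; the one-liner in the post is unanchored.

```perl
use strict;
use warnings;

# Hypothetical helper: split a full path into (directory, base name)
# using the pattern from the post. Returns an empty list on no match.
sub split_full_path {
    my ($path) = @_;
    if ($path =~ m{^((/[^\s/]+)+)/([^\s/]+)$}) {
        # $1 is the directory path, $3 the base name;
        # $2 only holds the last "/F" repetition.
        return ($1, $3);
    }
    return ();
}

my ($dir, $base) = split_full_path("/directory/sub/100.host.0");
print "dir=$dir base=$base\n";
```

A relative name such as "sub/100.host.0" does not match, so the helper
returns nothing for it, in line with the caveat above.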


--
Perl - 1. (n) The Mother of All Languages
       2. (v) To convert an otherwise boring program to use Perl.
	  [-ed, -ing]
________________________________________________________________________
Eric W. Ziegast, University of Merryland, Engineering Computing Services
ziegast@eng.umd.edu - Eric@(301.405.3689)

composer@chem.bu.edu (Jeff Kellem) (03/22/91)

In article <1991Mar21.064642.29427@eng.umd.edu> ziegast@eng.umd.edu (Eric W. Ziegast) writes:
 > In article <1991Mar21.010547.13331@convex.com> you write:
 > >Any suggestions on how, given the argument "/directory/sub/100.host.0"
 > >that one could take off the "100.host.0" component, leaving
 > >"/directory/sub" ?
 >
 > But in Perl, there's always a better way of doing things:
 >
 >	   $path =~ s/((\/[^\s\/]+)+)\/([^\s\/]+)/$1/;

You could make the above a tiny bit more readable by writing it as:

	   $path =~ s:((/[^\s/]+)+)/([^\s/]+):$1:;

or, use whatever delimiters you'd like.  That way you don't have to deal
with quoting the slashes.  On the other hand, if I were to write the above,
assuming I didn't want to save the filename component, I'd probably write
it as:

	   $path =~ s|/[^/]*$||;

This would get rid of any trailing slash, but it will also turn "/" into "",
which is probably not what's wanted.  If you wanted to leave any trailing
slash, you could write it as:

	   $path =~ s|[^/]*$||;
or
	   $path =~ s|/[^/]+$||;

There are, of course, other ways of dealing with the problem.  That should
give you a start, though...
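
To make the differences between the three substitutions concrete, here is
a small comparison sketch.  The wrapper names (drop_last, keep_slash,
drop_nonempty) and the sample paths other than the one in the question are
made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical wrappers, one per substitution discussed above.
sub drop_last     { my ($p) = @_; $p =~ s|/[^/]*$||; return $p; }
sub keep_slash    { my ($p) = @_; $p =~ s|[^/]*$||;  return $p; }
sub drop_nonempty { my ($p) = @_; $p =~ s|/[^/]+$||; return $p; }

# Print each input followed by the result of each variant, quoted so
# that empty strings stay visible.
for my $p ("/directory/sub/100.host.0", "/directory/sub/", "/") {
    printf "%-26s '%s' '%s' '%s'\n",
        $p, drop_last($p), keep_slash($p), drop_nonempty($p);
}
```

The first variant is the only one that turns "/" into the empty string;
the other two leave both "/" and "/directory/sub/" untouched.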

Cheers...

			-jeff

Jeff Kellem
Internet: composer@chem.bu.edu

merlyn@iwarp.intel.com (Randal L. Schwartz) (03/22/91)

In article <1991Mar21.064642.29427@eng.umd.edu>, ziegast@eng (Eric W. Ziegast) writes:
| In article <1991Mar21.010547.13331@convex.com> you write:
| >Any suggestions on how, given the argument "/directory/sub/100.host.0"
| >that one could take off the "100.host.0" component, leaving
| >"/directory/sub" ?
| 
| At first, I thought about using an array:
| 
| 	@tmp = split(/\//,$path);
| 	pop(@tmp);			# discard the last component
| 	$dir = join("/",@tmp);
| 
| But in Perl, there's always a better way of doing things:
| 
| 	$path =~ s/((\/[^\s\/]+)+)\/([^\s\/]+)/$1/;
| 
| A simplified version would be:
| 
| 	/((\/F)+)\/(F)/
| 
| 	F matches a one word file/directory name.
| 		In my example I use [^\s\/]+ which simply matches
| 		any word without whitespace or "/" in it.
| 		You may want to use something more strict instead.
| 
| 	$1 will match the directory path.  (man dirname)
| 	$2 is used only for /F grouping
| 	$3 will match the base name.  (man basename)
| 
| This is, of course, only if it's a full path name.  Anything else
| won't match (and won't be changed).

Ouch.  Why not just:

	$path =~ s#^(/.*)/.*$#$1#;

(If you have newlines in your filenames, replace "." with "(.|\n)".)

The nested "+" ops aren't buying you anything, nor is the paren-pair
on the right side.  And .* in the left is guaranteed to match the
longest string.
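
A short sketch of the greedy-match point, assuming a current Perl; the
wrapper name strip_last_component and the extra sample paths are
hypothetical.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution shown above.
sub strip_last_component {
    my ($path) = @_;
    # The captured .* is greedy, so it extends to the last slash that
    # still lets the trailing "/.*" match.
    $path =~ s#^(/.*)/.*$#$1#;
    return $path;
}

print strip_last_component("/directory/sub/100.host.0"), "\n";
print strip_last_component("/a/b/c/d"), "\n";
print strip_last_component("/toplevel"), "\n";
```

Note that a single-component path such as "/toplevel" has no second slash
for the pattern to match, so it comes back unchanged.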

$_ = "/Just/another/Perl/hacker,/OK?"; s#^(/.*)/.*$#$1#; y#/# #; s/^.//; print
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Intel: putting the 'backward' in 'backward compatible'..."====/

rbj@uunet.UU.NET (Root Boy Jim) (03/22/91)

In article <1991Mar21.064642.29427@eng.umd.edu> ziegast@eng.umd.edu (Eric W. Ziegast) writes:
>In article <1991Mar21.010547.13331@convex.com> you write:
>>given the argument "/directory/sub/100.host.0"
>>[how do you remove the last component], leaving "/directory/sub" ?
>
>At first, I thought about using an array:
>
>	@tmp = split(/\//,$path);
>	pop(@tmp);			# discard the last component
>	$dir = join("/",@tmp);

Or, you could reverse it, s;[^/]*;;, and reverse again.
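
Spelled out as a sketch (the sub name dir_by_reverse is made up, and I've
used braces as delimiters and an explicit ^ for readability; the
unanchored form matches at the start of the string anyway):

```perl
use strict;
use warnings;

# Hypothetical helper: reverse, strip the leading run of non-slashes,
# and reverse back.
sub dir_by_reverse {
    my ($path) = @_;
    my $r = scalar reverse $path;   # force string reversal
    $r =~ s{^[^/]*}{};              # drop what was the last component
    return scalar reverse $r;
}

print dir_by_reverse("/directory/sub/100.host.0"), "\n";
```

Unlike the other approaches, this one keeps the trailing slash, giving
"/directory/sub/".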

>But in Perl, there's always a better way of doing things:

At least one.

>	$path =~ s/((\/[^\s\/]+)+)\/([^\s\/]+)/$1/;
>
>A simplified version would be:
>
>	/((\/F)+)\/(F)/

Too complex. First, pick another delimiter, and get rid of the \/'s.
Second, there is no sense saving the last pattern. Third, and
most importantly, you are going overboard. Restated, the goal is
"keep everything up to but not including the last slash".

Translated, this becomes: $path =~ s"^(.*)/.*$"$1"

Alternately, the goal can be restated "throw away everything
including and after the last slash".

Translated, this becomes: $path =~ s"/[^/]*$""

>	F matches a one word file/directory name.
>		In my example I use [^\s\/]+ which simply matches
>		any word without whitespace or "/" in it.
>		You may want to use something more strict instead.

Actually, less strict. Slashes are all that count. OK, nulls too,
but I'm ignoring them for now. s/\0.*// if you really care.

>	$1 will match the directory path.  (man dirname)
>	$2 is used only for /F grouping
>	$3 will match the base name.  (man basename)

Aha! It seems that you have adapted a general routine.

>This is, of course, only if it's a full path name.  Anything else
>won't match (and won't be changed).

Which may or may not matter to the caller.

But onward. I can think of a few other ways.
1) use rindex and substr
2) 1 while (chop($path) ne '/')
3) first, we generate pi/4 :-)
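
A rough sketch of options 1 and 2, assuming a current Perl.  The sub names
are hypothetical, and the no-slash guards are my additions: as far as I
can tell, the bare chop loop would never terminate on a string with no
slash in it, since chop on an empty string just returns an empty string.

```perl
use strict;
use warnings;

# Option 1: find the last slash with rindex, keep what precedes it.
sub dir_by_rindex {
    my ($path) = @_;
    my $i = rindex($path, "/");
    return $path if $i < 0;          # no slash: leave unchanged
    return substr($path, 0, $i);
}

# Option 2: chop characters off the end until a slash has been removed.
sub dir_by_chop {
    my ($path) = @_;
    return $path if index($path, "/") < 0;   # guard against endless loop
    1 while chop($path) ne "/";
    return $path;
}

print dir_by_rindex("/directory/sub/100.host.0"), "\n";
print dir_by_chop("/directory/sub/100.host.0"), "\n";
```

Both behave like s"/[^/]*$"" on paths that contain a slash, including
turning "/file" into the empty string.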

-- 
		[rbj@uunet 1] stty sane
		unknown mode: sane

ziegast@eng.umd.edu (Eric W. Ziegast) (03/22/91)

In article <126224@uunet.UU.NET> rbj@uunet.UU.NET (Root Boy Jim) writes:
>>A simplified version would be:
>>	/((\/F)+)\/(F)/
>
>Too complex. First, pick another delimiter, and get rid of the \/'s.
>Second, there is no sense saving the last pattern. Third, and
>most importantly, you are going overboard. Restated, the goal is
>"keep everything up to but not including the last slash".
. . . etc . . .
>>	$1 will match the directory path.  (man dirname)
>>	$2 is used only for /F grouping
>>	$3 will match the base name.  (man basename)
>
>Aha! It seems that you have adapted a general routine.

Call it "feeping creaturism". ;-)

Eric Z