[comp.sys.mac] What do you use to remove Paragraph Returns from your Text?

moriarty@tc.fluke.COM (Jeff Meyer) (10/26/87)

As a frequent downloader of Unix articles to my Macintosh, I have a great
need to have a utility that removes the Return characters from the ends of
every line in a paragraph.  MacWrite does it when you open a text document
and hit the LINE FEEDS button, and the Fkey Clipper 1.5 USED to do this to
any text in the Clipboard... until I started using System/Finder 4.1/5.5.

Does anyone have an easy way?  A PD/ShareWare application that does this?
I'd appreciate any info...

                        "Not only is Bill Atkinson a genius, but he produces
                         shippable code."
                                            -- John Sculley

                                        Moriarty, aka Jeff Meyer
INTERNET:     moriarty@tc.fluke.COM
Manual UUCP:  {uw-beaver, sun, allegra, hplsla, lbl-csam}!fluke!moriarty
CREDO:        You gotta be Cruel to be Kind...
<*> DISCLAIMER: Do what you want with me, but leave my employers alone! <*>

chuq%plaid@Sun.COM (Chuq Von Rospach) (10/26/87)

>As a frequent downloader of Unix articles to my Macintosh, I have a great
>need to have a utility that removes the Return characters from the ends of
>every line in a paragraph.

>Does anyone have an easy way?  A PD/ShareWare application that does this?
>I'd appreciate any info...

I picked up a program off of Delphi called Macify. It does the paragraph
cleanup, converts '--' to em-dashes, handles typeset quotes and double
quotes, and all sorts of other things. Prior to this I'd had my own LSC
application that did some of it, and I did the rest with search/replace in
Word. It's wonderful. If there is enough interest, I'll send it to
comp.binaries.mac.

chuq
Chuq Von Rospach					chuq@sun.COM
Editor, OtherRealms					Delphi: CHUQ

whp@apr.UUCP (Wayne Pollock) (10/27/87)

I know of three methods; I've had occasion to use all of them:
1) Use a decent telecommunications program.  Red Ryder has an option for this.

2) Use a word processor.  Both MacWrite & Word 3.01 allow global replace.  I
   used to use MacWrite, but I find Word is faster on large documents.  For
   MacWrite, you "cut" the line-feed (the rectangle) and "paste" it into the
   find dialog.  In Word, for some reason you can't paste into the find dialog,
   but you can use "^010"; these four characters are a line-feed to Word.
   For either, leave the replacement text empty, and choose "change all".

3) Use the shareware program Macify.  This neat program has several options,
   including stripping control characters, removing all carriage returns except
   between paragraphs, converting spaces to tabs, converting strings like "\*"
   to bullets, converting "fl", "fi", etc. to their proper ligatures,
   converting "..." to sets of open (left) quote ... close (right) quote, and
   more.

Wayne Pollock,	...!{cbatt, ihnp4}!cbosgd!apr!whp

paulm@nikhefk.UUCP (Paul Molenaar) (10/27/87)

In article <2089@sputnik.COM> moriarty@tc.fluke.COM (Jeff Meyer) writes:
>As a frequent downloader of Unix articles to my Macintosh, I have a great
>need to have a utility that removes the Return characters from the ends of
>every line in a paragraph.
>
>                                        Moriarty, aka Jeff Meyer

I use MS-Word 3.01 to perform this function. It really is quite simple,
it only requires three 'search&replace' sequences (oh yes, a specific
utility would be better... but then again.. ;)

What you want is to remove all the returns at the end of the lines.
But removing all returns would also remove paragraph formatting. So what
you do first is to change all _double_ returns (at the end of a paragraph)
to a temporary code (like two asterisks: **). This is achieved with:

Find: ^13^13 (you actually type the caret in the dialog box)
Replace with: ** (or whatever unique code you like)

Now it's time to remove all _single_ returns (at the end of each line)

Find: ^13 (again, also type the caret)
Replace with: <zilch>

And now turn the temporary code back into double returns:

Find: **
Replace with: ^13^13

That's it.

        Paul Molenaar

	"Just checking the walls"
		- Basil Fawlty -

relph@presto.ig.com (John M. Relph) (10/27/87)

I was unable to reply directly, so here's my $.02:

In article <272@nikhefk.UUCP> paulm@nikhefk.UUCP (Paul Molenaar) writes:
>Find: ^13^13 (you actually type the caret in the dialog box)
>Replace with: ** (or whatever unique code you like)
>
>Now it's time to remove all _single_ returns (at the end of each line)
>
>Find: ^13 (again, also type the caret)
>Replace with: <zilch>
>
>And now turn the temporary code back into double returns:
>
>Find: **
>Replace with: ^13^13

The way I do it is very similar to yours except that I use "^p" to
represent paragraph marks (it's in the documentation).  I also replace
all single returns with " " (space).  However, I have sometimes had
problems with the third step, replacing all "**" with "^p^p".  If I
close the file and reopen it I don't have any problem.
        -- John
----
John M. Relph
IntelliGenetics, Inc.  700 East El Camino Real, Mountain View, CA 94040
Internet: relph@bionet-20.arpa

cheeser@dasys1.UUCP (10/29/87)

In article <31935@sun.uucp> chuq@sun.UUCP (Chuq Von Rospach) writes:
>>As a frequent downloader of Unix articles to my Macintosh, I have a great
>>need to have a utility that removes the Return characters from the ends of
>>every line in a paragraph.
>I picked up a program off of Delphi called Macify. It does the paragraph
>cleanup, converts '--' to em-dashes, handles typeset quotes and double
>quotes, and all sorts of other things. Prior to this I'd had my own LSC

On the same general subject, there are several good programs that do this
sort of work:

MacSink  This is another DA that does damn near anything you can think of to
a text file, even edit it -- a feature missing in most of the other
applications that do conversion -- and sort it.

Apple File Exchange  Not quite the best bet.  It does some nice conversion
from MessyDos and ProDOS, that kind of thing, but has never heard of Unix:
there is no option for CR to LF or LF to CR.

There are also a number of DAs and apps that split files up into smaller
files (split), join files together into one large file (join), and pack and
unpack them (pack and unpack), though with the advent of PackIt and StuffIt
these last two are out of date.

All of these are available on systems such as CI$ and GEnie and on BBSes.

cheeser

-- 
===============================================================================
Jonathan Bing, Master (cheeser)			...ihnp4!hoptoad!dasys1!cheeser
"Pereant, iniquit, qui ante nos nostra dixerunt!"                          also
"Non illegitimus carborundum!"        crash!pnet01!pro-sol!pro-carolina!cheeser

roberts@cognos.uucp (Robert Stanley) (10/30/87)

In article <2089@sputnik.COM> moriarty@tc.fluke.COM (Jeff Meyer) writes:
>As a frequent downloader of Unix articles to my Macintosh, I have a great
>need to have a utility that removes the Return characters from the ends of
>every line in a paragraph.

As a long-time Word victim I used to do a trio of search and replace
operations, first translating pairs of carriage-returns (CR) into a
hopefully unique string, then replacing all remaining CR's with a space,
then finally replacing the unique string with a CR to give me back my
paragraphs.  Had it down to a fine art, at least for small docs.

Now I use Emacs at both ends, Gnu on the UNIX machines and micro-Emacs
on the Mac, and remove the CR whichever end I remember I need to do it.
Micro-Emacs came over the net a few months ago (version 3.8).

Works for me.

Robert_S
-- 
Robert Stanley           Cognos Incorporated     S-mail: P.O. Box 9707
Voice: (613) 738-1440 (Research: there are 2!)           3755 Riverside Drive 
  FAX: (613) 738-0002    Compuserve: 76174,3024          Ottawa, Ontario 
 uucp: decvax!utzoo!dciem!nrcaer!cognos!roberts          CANADA  K1G 3Z4

dave@onfcanim.UUCP (11/11/87)

Well, here is what we use.  It's a program called "macfmt" that massages
the file on the UNIX machine before it is sent to the Mac.  Basically,
it turns trailing newlines into blank space, but does understand that
there should be two blanks after a period.

This probably ought to be a "lex" program, but it is such a simple problem
that it's coded directly as a finite state machine.

Macfmt is a pure filter - it takes no arguments.  It is usually called
by a shell script called "macsend" that runs macfmt first, then does
the transfer to the Mac using macput.

(this isn't big enough to be worth submitting to the moderated source group)

--------------------------------------------------------------------
#include <stdio.h>

#define	S_TEXT		0	/* in normal text */
#define	S_PERIOD	1	/* just saw a period */
#define	S_ENDWORD	2	/* found newline preceded by text character */
#define	S_ENDSENT	3	/* found newline preceded by period */
#define	S_ENDPARA	4	/* newline followed by newline */

int main(void)
{
	register int	c, state = S_TEXT;

	while ((c=getchar()) != EOF) {
		switch (c) {

		default:
			switch (state) {
			case S_ENDWORD:
				putchar(' ');
				break;
			case S_ENDSENT:
				putchar(' ');
				putchar(' ');
				break;
			}
			putchar(c);
			state = S_TEXT;
			break;

		case '.':
		case '!':
		case '?':	/* any sentence ender gets two blanks when joined */
			switch (state) {
			case S_ENDWORD:
				putchar(' ');
				break;
			case S_ENDSENT:
				putchar(' ');
				putchar(' ');
				break;
			}
			putchar(c);
			state = S_PERIOD;
			break;

		case '\n':
			switch (state) {
			case S_TEXT:
				state = S_ENDWORD;
				break;
			case S_PERIOD:
				state = S_ENDSENT;
				break;
			case S_ENDWORD:
			case S_ENDSENT:
			case S_ENDPARA:
				putchar(c);
				state = S_ENDPARA;
				break;
			}
			break;
		}
	}

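	/* finish the last line unless a paragraph break was just written */
	switch (state) {
	case S_TEXT:
	case S_PERIOD:	/* input ended right after a period, no newline */
	case S_ENDWORD:
	case S_ENDSENT:
		putchar('\n');
		break;
	}
	return 0;
}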