[comp.unix.shell] ^M in vi

martin@mwtech.UUCP (Martin Weitzel) (09/23/90)

In article <1990Sep21.134832.21480@virtech.uucp> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
>In article <2965@wyse.wyse.com> bob@wyse.UUCP (Bob McGowen x4312 dept208) writes:
>>>> In article <1990Sep18.023649.1336@virtech.uucp> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
>>>>   >:1,$s/!/<ctrl>-V<Return>C/g	(this will appear as:1,$s/!/^MC/g)
>>>> This will put carriage returns in the file, not newlines.
>>
>>If the input ^M were made into a linefeed the file should still be UNIX
>>only double spaced (in this example).  So under XENIX 2.[23].x and AT&T
>>UNIX SysV/386 Rel 3.2 there is no conversion of cr to nl set by vi.
>>Other systems may do this and I would be interested in knowing which ones.
>
>As far a I know it works (replaces with a newline) this way on every system. 
>Obviously that isn't the case. I have tried it (and know it works that way) 
>on Sun OS, HP-UX, Ultrix, and Interactive (both 2.2 and 2.0.2).

In fact it is easy to get confused about how vi treats a ^M.
I'll try to summarize, where NL stands for a newline character and CR
stands for carriage return (to disambiguate further, I have always added
the ASCII value in parentheses):

a) In command mode ^M advances the cursor to the beginning of the next
   line (down) just like '+'; ^J moves it down in the current column
   just like 'j'.
b) In insert mode ^M and ^J are treated the same, i.e. they insert a NL
   (ASCII 10).
   b1) If protected by a preceding ^V, in insert mode ^M inserts a CR
       (ASCII 13) but ^J still inserts a NL (ASCII 10).
c) If typed at the ex prompt (e.g. after starting a ':' command),
   both ^M and ^J terminate the command, like ESCAPE does.
   c1) If protected by a preceding ^V, ^J still terminates the command,
       but ^M is treated depending on the context:
       c1.1) In the "pattern" text of a substitute command, ^M is a CR
             (ASCII 13). (To match at a NL in the pattern text, the special
             char $ at the end of the pattern can be used, but the meaning
             is a little different: $ only specifies a context for the
             pattern and is not part of it, so the NL itself is not
             touched by the replacement!)
       c1.2) In the "replacement"-text of a substitute command, ^M is
             translated to a NL (ASCII 10)
   c2) If the translation of ^M to NL is not desired, it can be disabled
       with a leading backslash. A leading backslash doesn't change the
       meaning of ^M in a context where *no* translation occurs, so
       you can always protect ^M if you want a CR (ASCII 13).

Let me clarify this summary with some examples (note that you must
TYPE each ^M as the two-key combination ^V^M; I cannot show that here,
so the lines appear as they look on screen if you try them yourself in vi):

	:s/^M/X/     -- replace CR (ASCII 13) with the character X
	:s/^M/^M/    -- replace CR (ASCII 13) with NL (ASCII 10)
	:s/\^M/^M/   -- same as above
	:s/X/^M/     -- replace X with NL (ASCII 10)
	:s/X/\^M/    -- replace X with CR (ASCII 13)
	:s/X/\\^M/   -- replace X with backslash and NL (ASCII 10)
	:s/X/\\\^M/  -- replace X with backslash and CR (ASCII 13)
	:s/X/\^M^M/  -- replace X with CR+NL (ASCII 13+10)
	:s/X/^M\^M/  -- replace X with NL+CR (ASCII 10+13)
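
The replacement-text rules (c1.2 and c2) can be modelled in a few lines
of Python. This is only a sketch of the behaviour described above, not
vi's actual code: the function name is made up, the input is the
replacement text as ex sees it after the ^V quoting (one real CR per ^M),
and other special replacement characters such as & are ignored.

```python
CR = "\r"   # ASCII 13, what ^V^M puts on the command line
NL = "\n"   # ASCII 10

def interpret_replacement(text):
    """Model of rules c1.2/c2 (hypothetical helper, not part of vi)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # A backslash protects the next character:
            # backslash+CR stays CR, backslash+backslash gives one backslash.
            out.append(text[i + 1])
            i += 2
        elif ch == CR:
            # An unprotected CR in the replacement text becomes a NL.
            out.append(NL)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

# The six replacement texts from the examples, written as Python strings.
for repl in [CR, "\\" + CR, "\\\\" + CR, "\\\\\\" + CR,
             "\\" + CR + CR, CR + "\\" + CR]:
    print(repr(repl), "->", repr(interpret_replacement(repl)))
```

Feeding it the replacement parts of the six :s/X/.../ lines reproduces
the results listed next to them.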

I told you it's easy to get confused :-) ...
and for the other ':' commands I would have to try it myself,
though I "guess" right in most cases :-) ...

Note that I tried all the above combinations while writing this article
with the vi of ISC 386/ix Rel 2.0.2, but it would be nice if others
checked whether the results are the same on other systems (the examples
are simple enough).
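
For anyone repeating the experiment, one way to see what really ended up
in the file is to write it out and count the CR and NL bytes. A minimal
Python sketch; the function name and the sample data are made up for
illustration:

```python
def count_cr_nl(data):
    """Return (number of CR bytes, number of NL bytes) in a bytes object."""
    return data.count(b"\r"), data.count(b"\n")

# In-memory sample standing in for a file written from vi.
sample = b"one\r\ntwo\nthree\r"
print(count_cr_nl(sample))
```

For a real file, pass the result of open(name, "rb").read(); binary mode
matters because reading in text mode may translate line endings.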

Special note for all readers in Germany who use SINIX systems: in case
of doubt use the vi of the "ucb"-universe; the version in the
"att"-universe seems to be heavily broken in this area and is even more
confusing :-().
-- 
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83