mark@unisec.USI.COM (Mark Rinfret) (03/23/87)
In the past few months, I've used about 3.5 different editors (the .5 is for the ones I tried for a couple of days but gave up on quickly). First, there was Ed, next came MicroEmacs and now Z. I'll not attempt to justify my switching to Z, though it's the reason for this posting.

In the course of all these changes, my "tabbing and indentation philosophy" (that sounds lofty!) has undergone severe trauma. With Ed, the default tab setting was 3 characters, but tab characters didn't actually get stored in the file. With MicroEmacs, the default was 8 but if you changed it, you got spaces (right?). Finally, I have Z which, for all its faults, allows me to set tabs where I want them (4) and stores true tab characters in the file. This should result in a significant reduction in the sizes of some of my sources, considering the 4:1 reduction of spaces to tabs.

Unfortunately, most of my files are currently tabbed at 8 characters, with sprinklings of spaces for intervening indentations. If you find yourself in the same or similar situation, the following utility may be of use to you. This trivial offering simply expands an input file using its current tab setting (known by you, hopefully), reformatting the file with a new tab setting. It does nothing about C indentation - it just plays with tabs. It is smart in a small way about not entabbing quoted strings.

This is the first opportunity I've had to offer anything to this group, though I've taken much. It's one of those stupid little things that anyone can write if they need it badly enough - I did. I compiled this under Aztec C V3.4 but it's vanilla enough to port just about anywhere.

Mark

==========================================================================
/* :set ts=4 */

/*
 * Redefine tabs in a text file.
 * Mark Rinfret, 03/18/87        mark@unisec.USI.COM
 * Filename: retab.c
 *
 * Description:
 *   This program inputs a text file with one tab width setting and
 *   creates a new output file which has either a new tab setting or no
 *   tabs. A few smarts have been included to avoid introducing tab
 *   characters into quoted strings and character constants. This program
 *   only supports tab settings which are an even multiple of a given value.
 *   For instance, a tab width of 4 results in tab stops at columns
 *   5, 9, 13, etc. Minimum tab width is 3 columns, maximum is 32.
 *
 * Examples:
 *
 *   retab -i8 -o4 infile outfile
 *     Converts infile, currently set at 8 column tabs to outfile which
 *     will have 4 column tabs.
 *
 *   retab -i8 -o0 -q infile outfile
 *     Expands all tabs in infile and places the result in outfile,
 *     suppressing statistical info. There will be no tab characters
 *     in outfile.
 *
 *   retab infile outfile
 *     Converts infile, currently set at 4 column tabs, to outfile, which
 *     will also have 4 column tabs. Used in this manner, a cleanup
 *     function is provided, optimizing file size by replacing spaces
 *     with tabs, where possible.
 *
 * The author releases this source to the public domain with no
 * restrictions which means that you can use, rewrite, redistribute,
 * sell or eat it.
 */

#include <stdio.h>
#include <ctype.h>
#include <errno.h>

#define LINEMAX 255     /* max input line length */
#define MAXTAB  32      /* maximum tab width allowed */
#define MINTAB  3       /* minimum tab width allowed (except 0) */
#define TABIN   4       /* default input file tab setting */
#define TABOUT  4       /* default output file tab setting */

FILE *OpenFile();

FILE *infile,*outfile;          /* input / output files */
char *iname, *oname;            /* input / output file names */
char linebuf[LINEMAX+1];        /* line buffer */
unsigned intab = TABIN, outtab = TABOUT;
unsigned incol, outcol;
unsigned iccnt = 0, occnt = 0, ilcnt = 0, olcnt = 0;
unsigned statistics = 1;

main(argc,argv)
    int argc; char **argv;
{
    char c,*arg;

    ++argv;                     /* skip program name */
    while (--argc && **argv == '-') {
        arg = *argv;
        if ((c = *++arg) == 'i') {
            intab = atoi(++arg);
            cktab(intab);       /* check tab value */
            if (intab == 0)     /* zero is only legal for output */
                Usage();
        }
        else if (c == 'o') {
            outtab = atoi(++arg);
            cktab(outtab);
        }
        else if (c == 'q')      /* quiet mode */
            statistics = 0;
        else
            Usage();            /* bad option */
        ++argv;                 /* point to next arg */
    }
    if (argc < 2)
        Usage();
    iname = argv[0];
    oname = argv[1];
    infile = OpenFile(iname,"r");
    outfile = OpenFile(oname,"w");
    retab();
    stats();
    exit(0);                    /* flushes and closes both files */
}

/* Perform the retabbing function. */

retab()
{
    int c;
    unsigned endfile = 0;

    while (!endfile) {
        incol = 1;
        outcol = 0;
        while ((c = fgetc(infile)) != '\n') {
            if (c == EOF) {
                ++endfile;
                break;
            }
            ++iccnt;
            if (c == '\t') {    /* input was a tab? */
                do {
                    putbuf(' ');
                } while (outcol % intab != 0);
            }
            else {
                putbuf(c);
            }
        }
        if (c == '\n') {
            ++iccnt;
            ++ilcnt;
        }
        else if (outcol)        /* something on last line? */
            ++ilcnt;
        linebuf[outcol] = '\0';
        if (c == '\n' || outcol)    /* don't emit an empty line at EOF */
            outline();
    }
}

/* Put one character in the line buffer, testing for overflow and
 * maintaining the output column, outcol. A line longer than LINEMAX
 * columns is flushed and continued on a new output line.
 */

putbuf(c)
{
    if (outcol == LINEMAX) {
        outline();
        outcol = 0;             /* start refilling the buffer */
    }
    linebuf[outcol++] = c;
}

/* Output the current line. Trailing blanks are dropped. */

outline()
{
    char c,*s;
    unsigned blanks = 0,escape = 0, i, outpos = 0, quote = 0;
    unsigned pass_through;

    s = linebuf;
    for (i = 0; i < outcol; ++i) {      /* scan characters in buffer */
        if (outtab) {                   /* entab output line? */
            if (i % outtab == 0) {      /* at a tab stop? */
                if (blanks && !quote) {
                    if (blanks > 1) {
                        fputc('\t',outfile);
                        ++occnt;
                        outpos = i;
                    }
                    else {
                        fputc(' ',outfile);
                        ++occnt;
                        ++outpos;
                    }
                    blanks = 0;
                }
            }
            pass_through = 0;           /* allow blank checking */
            c = *s++;                   /* get next character */
            if (escape) {               /* pass through as is */
                escape = 0;
                ++pass_through;
            }
            else if (c == '"' || c == '\'') {   /* quotes? */
                if (quote) {
                    if (quote == c)
                        quote = 0;      /* end of quote */
                }
                else
                    quote = c;
                ++pass_through;
            }
            else if (c == '\\') {       /* character escape */
                escape = 1;
                ++pass_through;
            }
            if (c == ' ' && !pass_through)
                ++blanks;
            else {
                blanks = 0;
                while (outpos < i) {    /* emit any pending blanks */
                    fputc(' ',outfile);
                    ++occnt;
                    ++outpos;
                }
                fputc(c,outfile);
                ++occnt;
                ++outpos;
            }
        }
        else {                          /* no tabs wanted: copy as is */
            fputc(*s++,outfile);
            ++occnt;
        }
    }
    fputc('\n',outfile);                /* line terminator */
    ++occnt;                            /* count it, as on input */
    ++olcnt;                            /* count output lines */
}

/* Display correct program usage. */

Usage()
{
    printf(
"Usage: retab [-i<input tab>] [-o<output tab>] [-q] <input> <output>\n\n");
    printf(
"<input tab> is the tab value of the input file. If not given,\n");
    printf("4 is assumed.\n");
    printf(
"<output tab> is the new tab value for the output file. If not\n");
    printf(
"given, 4 is assumed. Zero is also legal, the net effect of which\n");
    printf(
"is to expand tabs in the input file to spaces.\n");
    printf(
"-q specifies quiet mode - no statistics will be output.\n");
    exit(1);
}

/* Check tab value for allowable range */

cktab(val)
    unsigned val;
{
    if (val && (val < MINTAB || val > MAXTAB)) {
        printf("Tab value must be in the range of %d..%d\n",
            MINTAB,MAXTAB);
        exit(1);
    }
}

FILE *OpenFile(name,how)
    char *name, *how;
{
    FILE *fp;

    if ((fp = fopen(name,how)) == NULL) {
        printf("Can't open %s for %s access, errno is %d.\n",name,how,errno);
        exit(1);
    }
    return fp;
}

/* Report program statistics. */

stats()
{
    char *format = " %s %-12s Tabs: %2d, %5u Characters, %5u Lines\n";

    if (statistics) {
        printf("retab statistics:\n");
        printf(format,"Input: ", iname, intab, iccnt, ilcnt);
        printf(format,"Output: ", oname, outtab, occnt, olcnt);
    }
}

--
| Mark R. Rinfret, SofTech, Inc. mark@unisec.usi.com |
| Guest of UniSecure Systems, Inc., Newport, RI |
| UUCP: {gatech|mirror|cbosgd|uiucdcs|ihnp4}!rayssd!unisec!mark |
| work: (401)-849-4174 home: (401)-846-7639 |
cjp@vax135.UUCP (03/24/87)
In article <443@unisec.USI.COM> mark@unisec.USI.COM (Mark Rinfret) writes:
>(right?). Finally, I have Z which, for all its faults, allows me to set
>tabs where I want them (4) and stores true tab characters in the file.

Well for my money, I like Z a lot but the tabs handling is one thing that's badly flawed. I don't care a bit whether the file is a few percent larger, but it is imperative that I be able to set 4-wide indentation stops that *print* *out* *on* *my* *printer* as 4-wide. My printer has only 8-space tabs and it does not auto-wrap. A major point of 4-spaces indentation is that I can more easily read deeply nested code. What use is it if that info all falls into a blot in the 80th column? And it's a pain in the behind to run a reformatter for each printing.

In short: Jim, please bring back the vi usage of ^T, ^D and optimize out any superfluous spaces on the fly.

Charles Poirier
dillon@CORY.BERKELEY.EDU.UUCP (03/24/87)
Both VI and Z have the same problem... namely that there is only one type of 'tab'. When you set tabs to 4 in either VI or Z, it writes out to the file using the tab character but assuming it's 4 rather than 8.

Rightly, you should either have two separate variables, or should always write out files using tabs of 8 (i.e. translate the internal tab format of 4 to the external format of 8 without affecting the apparent text).

-Matt
cjp@vax135.UUCP (03/24/87)
In article <8703240644.AA00426@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
>
> Both VI and Z have the same problem... namely that there is only one
>type of 'tab'. When you set tabs to 4 in either VI or Z, it writes out
>to the file using the tab character but assuming it's 4 rather than 8.

Are we speaking of the same vi? On 4.2BSD, :set shiftwidth=4, then use ^T to shift line right, ^D to shift left (insert mode), or >>, << command mode. You get all (8-space) tabs except the rightmost 4 spaces every other shift. (I like this arrangement by the way.)

Charles Poirier
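The arrangement Charles describes (indentation in steps of 4 columns, stored with 8-column tabs) comes down to a division and a remainder. A minimal sketch follows; the function name indent_split and its parameters are made up for illustration and are not taken from vi's source.

```c
/* Split an indent of `levels` shifts into hard tabs plus padding blanks.
 * shiftwidth is the number of columns per shift (4 in the post) and
 * tabstop is the width of a stored tab (8 in the post).
 * tabstop must be nonzero. Hypothetical helper, for illustration only. */
void indent_split(unsigned levels, unsigned shiftwidth, unsigned tabstop,
                  unsigned *tabs, unsigned *spaces)
{
    unsigned column = levels * shiftwidth;  /* visual width of the indent */

    *tabs = column / tabstop;       /* whole tab stops covered by tabs */
    *spaces = column % tabstop;     /* remainder is padded with blanks */
}
```

With shiftwidth 4 and tabstop 8, one shift yields 0 tabs and 4 spaces, two shifts yield 1 tab and no spaces, three shifts yield 1 tab and 4 spaces, which is the "rightmost 4 spaces every other shift" pattern.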
mark@unisec.UUCP (03/25/87)
In article <1814@vax135.UUCP>, cjp@vax135.UUCP (Charles Poirier) writes:
> In article <443@unisec.USI.COM> mark@unisec.USI.COM (Mark Rinfret) writes:
> >(right?). Finally, I have Z which, for all its faults, allows me to set
> >tabs where I want them (4) and stores true tab characters in the file.
>
> Well for my money, I like Z a lot but the tabs handling is one thing
> ...
> ... , but it is imperative that I be able to set 4-wide
> indentation stops that *print* *out* *on* *my* *printer* as 4-wide. My
> printer has only 8-space tabs and it does not auto-wrap. A major point
> of 4-spaces indentation is that I can more easily read deeply nested
> code.

I hear you! One of the first things I did when I started developing code on the Amiga was to get a "detabbing print utility" (expands tabs) and customize it for my own use. The command line takes an option (-i<tabs>) which allows you to specify what your tab setting is. I've also added Unix-style wildcarding (via Aztec's "scdir" function). It (pr) also outputs an optional header with filename, date, page number and the line number of the first line on the page.

My reason for posting this rather than sending e-mail is to get some response/opinion to a concern related to "public domain" software. Just for fun, I added a copy of C. Heath's "getfile" requester package so that you can call my "pr" without filename parameters and it will put up a requester (it currently only works for one file, but I could easily add a "more?" loop to it). Here's the rub - I modified the "getfile" package to return a status code which informs me when the CANCEL gadget has been clicked. The author explicitly states in his source that no modified version of this package is to be released without clearing it through him (though it is on the update disks for the Aztec C compiler and probably a thousand other BBS's, etc.).

I have so far respected the author's wishes, but I have no desire to send a letter out into the void, waiting for "permission" to re-release the modified source. What I'm leading up to is this - would it be ethical for me to maintain opening credits to C. Heath but rename the modified routine? I've been hesitant to do this since I don't want to step on any toes. I could just release the source (just another dumb little tool, mind you) with pointers to the changes that must be made, but that's a hassle.

When I put things in the public domain (not much for Amiga yet, much for C64), I wave goodbye and encourage the world to do with it as they will. Though I am grateful to C. Heath for releasing his code in the first place, I wish he had been less restrictive in his "conditions".

Thanks for listening.

Mark

--
| Mark R. Rinfret, SofTech, Inc. mark@unisec.usi.com |
| Guest of UniSecure Systems, Inc., Newport, RI |
| UUCP: {gatech|mirror|cbosgd|uiucdcs|ihnp4}!rayssd!unisec!mark |
| work: (401)-849-4174 home: (401)-846-7639 |
glee@cognos.UUCP (Godfrey Lee) (03/27/87)
I think all editors should support settable tabs. The tabs should be stored as tabs in the file. I hate editors that change what I type in, because I always end up in the situation of not being able to produce a file of exactly what I want. Vi does not commit that sin; aside from NULs, you can get anything into the file.

If your printer doesn't support settable tabs, your print program should. If your print program doesn't, use "detab" or "expand" (they are also trivial to write).

--
-----------------------------------------------------------------------------
Godfrey Lee, Cognos Incorporated, 3755 Riverside Drive,
Ottawa, Ontario, CANADA K1G 3N3
(613) 738-1440 decvax!utzoo!dciem!nrcaer!cognos!glee
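To give a sense of how small such a "detab" routine is, here is a minimal sketch that expands the tabs in one line. The name detab_line and its signature are invented for this example; it is not the source of any existing detab or expand program, and it assumes every byte occupies one column.

```c
#include <stddef.h>

/* Expand tabs in one line of text into blanks.
 * in       - NUL-terminated input; copying stops at '\n' or '\0'
 * out      - destination buffer of outsize bytes, always NUL-terminated
 * tabwidth - distance between tab stops (must be nonzero)
 * Returns the number of characters written, not counting the NUL.
 * Output is silently truncated if the buffer is too small. */
size_t detab_line(const char *in, char *out, size_t outsize, unsigned tabwidth)
{
    size_t col = 0;             /* output column == characters written */

    if (outsize == 0)
        return 0;
    out[0] = '\0';
    if (tabwidth == 0)
        return 0;
    while (*in != '\0' && *in != '\n' && col + 1 < outsize) {
        if (*in == '\t') {
            /* pad with blanks up to the next multiple of tabwidth */
            do {
                out[col++] = ' ';
            } while (col % tabwidth != 0 && col + 1 < outsize);
        } else {
            out[col++] = *in;
        }
        ++in;
    }
    out[col] = '\0';
    return col;
}
```

Calling detab_line("ab\tc", buf, sizeof buf, 8) leaves "ab", six blanks and "c" in buf, so the "c" lands in the ninth column just as it would on an 8-column printer.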
vanam@pttesac.UUCP (03/28/87)
Here's my 2 cents on the tab subject. I think Z is doing it just right when it displays tabs at whatever setting you choose, but still stores them internally as tab characters. It's not the fault of Z if a particular printer forces tabs to be every 8 characters. It's up to the editor, the printer driver (or printer itself) to allow the user to set tab stops wherever she wants. Anyhow, that's my opinion. Marnix -- Marnix (ain't unix!) A. van\ Ammers Work: (415) 545-8334 Home: (707) 644-9781 CEO: MAVANAMMERS:UNIX UUCP: {ihnp4|ptsfa}!pttesac!vanam CIS: 70027,70
cjp@vax135.UUCP (03/31/87)
In article <402@pttesac.UUCP> vanam@pttesac.UUCP (Marnix van Ammers) writes:
>them internally as tab characters. It's not the fault of Z if a particular
>printer forces tabs to be every 8 characters. It's up to the editor, the
>printer driver (or printer itself) to allow the user to set tab stops
>wherever she wants.

I am not flaming here, but to defend my point: Do you propose then that I also modify type, more, and every other editor I occasionally use, as well as the printer device, to compensate for Z's too-simple treatment of indents? Sir, I claim that 8-space tabs are the standard and variable-length tabs are a frill, a kludge, and a hack. Vi did it right.

I also claim that there is *no* saving in file length by using 4-space tabs as opposed to all 8-space tabs plus occasional runs of 4 spaces, for source code indented an average of roughly 3 times or more per line. (Analysis available on request.)

Charles Poirier vax135!cjp
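The analysis itself is not in the post, so the following is not Charles's calculation; it is only a sketch of the per-line arithmetic such a comparison would rest on. The function names and the idea of feeding it a histogram of lines per indent level are assumptions made for illustration.

```c
/* Bytes of leading whitespace for one line indented `level` times,
 * when each indent is shiftwidth columns and the file stores as many
 * tabs of width tabstop as fit, padding the rest with blanks.
 * tabstop must be nonzero. */
unsigned long indent_cost(unsigned level, unsigned shiftwidth, unsigned tabstop)
{
    unsigned long column = (unsigned long)level * shiftwidth;

    return column / tabstop + column % tabstop;
}

/* Total indentation bytes for a file, given lines_at_level[n] = number
 * of lines indented n times (a hypothetical histogram supplied by the
 * caller). */
unsigned long total_cost(const unsigned long *lines_at_level, unsigned nlevels,
                         unsigned shiftwidth, unsigned tabstop)
{
    unsigned long total = 0;
    unsigned level;

    for (level = 0; level < nlevels; ++level)
        total += lines_at_level[level] * indent_cost(level, shiftwidth, tabstop);
    return total;
}
```

For indent levels 1 through 4, 4-column tabs cost 1, 2, 3 and 4 bytes per line, while 8-column tabs plus runs of 4 spaces cost 4, 1, 5 and 2. Even levels favor the 8-column scheme and odd levels favor the 4-column one, so which comes out ahead depends on how a given file's lines are distributed across levels.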
dillon@CORY.BERKELEY.EDU.UUCP (04/01/87)
>I think all editors should support settable tabs. The tabs should be stored
>as tabs in the file. I hate editors that change what I type in, because I
>always end up in the situation of not being able to produce a file of
>exactly what I want. Vi does not commit that sin, aside from NULs, you
>can get anything into the file.
>
>If your printer doesn't support settable tabs, your print program should.
>If your print program doesn't, use "detab" or "expand" (they are also trivial
>to write).

I disagree. Editors such as VI, EMACS, ED, and DME write out files as normal text files, and thus there is no way for the printer driver to know what tab size to use unless *you* tell it.. for each file. I think the only way one can avoid having to know what the tab size should be for a given file is to always use a standard tab size (i.e. 8) when reading and writing files, and to convert to whatever internal tabbing you prefer. Then, you could VI, EMACS, ED, or DME any arbitrary programmer's files without knowing which tab size he likes to use.

To prevent misunderstanding, here is an example: person A uses a tab size of 7 inside his editor. He has the following line IN THE EDITOR:

<TAB><TAB>x

He then writes the file to disk. On disk, the tabs are 8, and the file looks like this:

<TAB><6 spaces>x

ANYBODY can then load that file into their own editor with their own personal tabbing... since in the load process the editor *knows* the tabs are always 8 on disk, and converts. So person B likes tabs of 4. He loads person A's file and gets this:

<TAB><TAB><TAB><2 spaces>x

NOTE: The text file looks *exactly* the same whether you CAT it from disk, or EDIT it with your favorite tabbing.

I personally use tabs of 4 in VI when I'm using UNIX systems, and I find it a bi#$@ch to have to 'expand -4' 40 source files before sending them to the printer.

-Matt
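Matt's load/save conversion can be sketched as two small steps: measure how wide a line's leading whitespace is under one tab size, then write the same width back out under another. The names indent_width and write_indent are hypothetical and only the leading whitespace is handled; a real editor would also have to deal with tabs in the middle of a line.

```c
#include <stddef.h>

/* Visual width of the leading blanks and tabs of `line`, assuming tab
 * stops every tabwidth columns. Returns 0 if tabwidth is 0. */
size_t indent_width(const char *line, unsigned tabwidth)
{
    size_t col = 0;

    if (tabwidth == 0)
        return 0;
    for (; *line == ' ' || *line == '\t'; ++line) {
        if (*line == '\t')
            col += tabwidth - col % tabwidth;   /* jump to next tab stop */
        else
            ++col;
    }
    return col;
}

/* Write leading whitespace `column` columns wide into out, using as many
 * tabs of width tabwidth as fit and blanks for the remainder. A tabwidth
 * of 0 means "blanks only". Returns the number of bytes written, not
 * counting the terminating NUL; output is truncated to fit outsize. */
size_t write_indent(char *out, size_t outsize, size_t column, unsigned tabwidth)
{
    size_t tabs = tabwidth ? column / tabwidth : 0;
    size_t spaces = tabwidth ? column % tabwidth : column;
    size_t n = 0, i;

    if (outsize == 0)
        return 0;
    for (i = 0; i < tabs && n + 1 < outsize; ++i)
        out[n++] = '\t';
    for (i = 0; i < spaces && n + 1 < outsize; ++i)
        out[n++] = ' ';
    out[n] = '\0';
    return n;
}
```

Running the example from the post: indent_width("\t\tx", 7) gives 14 columns; write_indent with a tab width of 8 produces one tab and six spaces, and with a tab width of 4 produces three tabs and two spaces.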
farmer@ico.UUCP (David Farmer) (04/02/87)
In article <8704010701.AA20834@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
>>I think all editors should support settable tabs. The tabs should be stored
:>as tabs in the file. I hate editors that change what I type in, because I
:>always end up in the situation of not being able to produce a file of
:>exactly what I want. Vi does not commit that sin, aside from NULs, you
:>can get anything into the file.

Actually it does mangle characters with the HI-BIT set.

:>
:>If your printer doesn't support settable tabs, your print program should.
:>If your print program doesn't, use "detab" or "expand" (they are also trivial
:>to write).
:
: I disagree. Editors such as VI, EMACS, ED, and DME write out files
:as normal text files, and thus there is no way for the printer driver to
:know what tab size to use unless *you* tell it.. for each file. I think
:find it a bi#$@ch to have to 'expand -4' 40 source files before sending them
:to the printer.
:
: -Matt

Exactly. But I would suggest that with VI you set your shift-width (sw) to 4, and leave your tab-stops (ts) set at 8. This way, if you make use of the auto indent, and << and >>, VI will automatically use tabs, or 4 spaces where appropriate. The only drawback is that when I am typing an indented line, sometimes I hit the TAB key, and then BACKSPACE and 4 spaces, since a TAB put me in too far.

I hope this discussion doesn't continue forever, but nobody seems to have mentioned this so far.

David Farmer.

Disclaimers?