davis@pacific.mps.ohio-state.edu ("John E. Davis") (03/25/91)
Hi,

It seems to me that a tab character in a text file is a bad thing since there is some ambiguity over what it means. That is, it depends on the tab settings of the output device. Do most programs which deal with tabs in ascii files assume they are eight column tabs?

The reason I want to know is that I wrote a program that strips excess whitespace at the end of lines of text files. I also made the program convert tab characters to the proper number of spaces (based on 8 column tabs). However, based on the length of the output file compared with the input file it seems that I should have made the program convert excess spaces to tab characters due to the large number of tabs that appear in text files.

So my question is: Are tabs bad? If not then how does one interpret them in a device independent way?

Thanks,
--
John
bitnet: davis@ohstpy
internet: davis@pacific.mps.ohio-state.edu
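[Editorial sketch] The two conversions described above might look roughly like the following in Python. This assumes fixed tab stops every 8 columns (the very assumption the post questions) and one line of input at a time with no line terminator. The names `detab` and `entab` are illustrative, not taken from the original program.

```python
def detab(line: str, tabsize: int = 8) -> str:
    """Expand tabs to spaces, assuming fixed stops every `tabsize`
    columns, then strip trailing whitespace."""
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            pad = tabsize - (col % tabsize)  # distance to the next stop
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += 1
    return "".join(out).rstrip(" ")


def entab(line: str, tabsize: int = 8) -> str:
    """Inverse direction: replace a run of two or more spaces that ends
    exactly on a tab stop with one tab. Assumes `line` has no tabs."""
    out = []
    pending = 0  # spaces seen since the last non-space or tab stop
    col = 0
    for ch in line:
        if ch == " ":
            pending += 1
            col += 1
            if col % tabsize == 0:
                # The run reaches a stop; a lone space stays a space.
                out.append("\t" if pending > 1 else " ")
                pending = 0
        else:
            out.append(" " * pending)  # run ended before a stop
            pending = 0
            out.append(ch)
            col += 1
    out.append(" " * pending)
    return "".join(out)
```

For single lines, `detab` should agree with the built-in `str.expandtabs(8)` followed by stripping trailing spaces. Both functions only give sensible results if the file really was written with 8-column stops in mind.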
enag@ifi.uio.no (Erik Naggum) (03/26/91)
Tabs are bad only when you think they are spaces, or vice versa.

You have to _be_ the output device to get the tab expansion right, or you would have to shun (prohibit) all kinds of non-printing characters or character sequences from documents. That's definitely a lot worse than letting a tab remain a tab.

For some applications, a tab character is a separator between elements that may contain spaces. For some output devices, control sequences can set tabs at any position on the line, some even individual tab stops for each line. You don't want to emulate that.

All you know about a TAB is that it's horizontal (linear) white space. You don't know how many, nor whether it erases what it skips.

Moral: Don't do anything with a TAB, it might mean something you cannot anticipate.

--
[Erik Naggum] <enag@ifi.uio.no>
Naggum Software, Oslo, Norway <erik@naggum.uu.no>
yfcw14@castle.ed.ac.uk (K P Donnelly) (03/27/91)
davis@pacific.mps.ohio-state.edu ("John E. Davis") writes:

> It seems to me that a tab character in a text file is a bad thing since
> there is some ambiguity over what it means.

Hear hear! I hate tabs. I wish they had never been invented. The operating system we use does not support tabs. Every so often I get a mail message which looks a real mess - I now know to try writing it to a file and detabbing it, because it probably contains tab characters.

We are starting to move to the Unix operating system. It seems to me that tabs are used all over the place in Unix as a primitive file structuring mechanism. The tabs are usually invisible by default when files are displayed, and they add complication to all programs which are written to process text files. Two files which look the same may be structurally different - one may have <TAB><TAB> between two fields, while the other may just have a lot of spaces which is usually taken to be equivalent to a single tab. There is ambiguity about whether tabs are structuring elements or space saving devices. File compression is a far better method of space saving anyway.

Kevin Donnelly
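[Editorial sketch] The "look the same but structurally different" point can be illustrated in a few lines of Python. The sample strings are made up for illustration, and the display is assumed to use 8-column tab stops.

```python
# Two lines that render identically with 8-column tab stops.
tabbed = "name\t\tvalue"                # two tab characters
spaced = "name" + " " * 12 + "value"    # twelve spaces instead

# On screen (8-column stops) they are indistinguishable...
same_on_screen = tabbed.expandtabs(8) == spaced

# ...but as tab-separated records they differ: the first has three
# fields (the middle one empty), the second is a single field.
tabbed_fields = tabbed.split("\t")
spaced_fields = spaced.split("\t")

print(same_on_screen)
print(tabbed_fields)
print(spaced_fields)
```

Whether the empty middle field is meaningful depends on the program reading the file, which is exactly the ambiguity described above.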
woods@eci386.uucp (Greg A. Woods) (04/03/91)
In article <9336@castle.ed.ac.uk> yfcw14@castle.ed.ac.uk (K P Donnelly) writes:
> davis@pacific.mps.ohio-state.edu ("John E. Davis") writes:
> > It seems to me that a tab character in a text file is a bad thing since
> > there is some ambiguity over what it means.
>
> Hear hear! I hate tabs. I wish they had never been invented. The
> operating system we use does not support tabs. Every so often I get a
> mail message which looks a real mess - I now know to try writing it to a
> file and detabbing it, because it probably contains tab characters.

Then again, setting "standard" tabstops on your "terminal" would probably "fix" your problem too. Almost every sane terminal I've dealt with defaults to fixed tabstops every 8 spaces, starting at the first position (i.e. before any spaces). Oh well, for those insane terminals that keep creeping up there's a tty driver flag to cause tab expansion.

[ Thank goodness this isn't comp.terminals, or I'd probably get even more flames than I anticipate! :-) ]

> We are starting to move to the Unix operating system. It seems to me
> that tabs are used all over the place in Unix as a primitive file
> structuring mechanism.

Yes, and any sane person who uses tabs for such things follows the same conventions as the sane terminals I outlined above.

What I hate about the way some people and programmes use tabs is when they have a logical tabsize of, say 4, and yet they use tab characters for 8-space physical fills. Vi is especially prone to this abuse.

> The tabs are usually invisible by default when
> files are displayed, and they add complication to all programs which are
> written to process text files.

Say what? Yes, tabs are either expanded by the UNIX terminal driver, or by the terminal itself. Any programme processing text files should treat tabs like any other character, such as space or newline.

Any ambiguity can usually be removed by collapsing tabs and spaces into one character (unless "empty" fields are allowed for some weird reason).

> There is ambiguity about whether tabs
> are structuring elements or space saving devices.

Not if you are aware of the source of the file. :-)

In general input, especially human generated input, to programmes is, IMHO, best with tabs, and output, especially formatted (i.e. columnar) output is best with spaces.

--
Greg A. Woods
woods@{eci386,gate,robohack,ontmoh,tmsoft}.UUCP ECI and UniForum Canada
+1-416-443-1734 [h] +1-416-595-5425 [w] VE3TCP Toronto, Ontario CANADA
Political speech and writing are largely the defense of the indefensible-ORWELL
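[Editorial sketch] The suggestion of collapsing tabs and spaces into one separator could be written like this in Python. The function name `split_fields` is hypothetical, and the sketch deliberately shows the caveat about "empty" fields mentioned in the post.

```python
import re


def split_fields(line: str) -> list:
    """Split a line into fields, treating any run of spaces and/or
    tabs as a single separator. Empty fields cannot be represented."""
    stripped = line.strip(" \t")
    if not stripped:
        return []
    return re.split(r"[ \t]+", stripped)


# Tabs, spaces, or a mixture all give the same fields...
a = split_fields("alpha\t\tbeta")
b = split_fields("alpha     beta")
c = split_fields("  alpha \t beta\t")

# ...but a deliberately empty field between two tabs is lost, unlike
# a strict split on single tab characters.
strict = "alpha\t\tbeta".split("\t")
```

Here `a`, `b` and `c` all come out as two fields, while `strict` keeps the empty middle field. Which behaviour is right depends on knowing the source of the file, as the post says.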