[comp.unix.wizards] Filtering Everything

smvorkoetter@watmum.UUCP (04/02/87)

I wish to write a program in C that I can somehow set up so all output
to my terminal goes through it first, and all input from my terminal goes
through it too.  I thought of a simple filter using read() and write(),
but it only seems to pass things through when a CR is received or sent.
I want immediate filtering.  My eventual aim is to write an inline data
compression routine to speed up 1200 baud terminal use (esp. in vi) to
something approximating 2400 baud.  Can anyone point me in the right
direction for such a project?  Thanks.

edw@ius2.cs.cmu.edu (Eddie Wyatt) (04/12/87)

In article <919@watmum.UUCP>, smvorkoetter@watmum.UUCP writes:
> 
> 
> I wish to write a program in C that I can somehow set up so all output
> to my terminal goes through it first, and all input from my terminal goes
> through it too.  I thought of a simple filter using read() and write(),
> but it only seems to pass things through when a CR is received or sent.
> I want immediate filtering.  My eventual aim is to write an inline data
> compression routine to speed up 1200 baud terminal use (esp. in vi) to
> something approximating 2400 baud.  Can anyone point me in the right
> direction for such a project?  Thanks.

    You will want to call either 'ioctl' or 'stty' to turn on cbreak
mode (cbreak does not buffer input or output of characters, i.e. it
makes characters available as soon as they are typed) and raw mode (raw
mode does no processing of character input, i.e. parity bits are passed
back, and erase and interrupt characters are not acted on.)
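
A minimal sketch of the mode switch described above. It assumes the
POSIX termios interface, which postdates this thread; a 4.2BSD program
of the time would set CBREAK or RAW through ioctl() on a struct sgttyb
instead. The names tty_set_raw and tty_restore are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <termios.h>
#include <unistd.h>

/* Put the terminal on fd into a raw-like mode and keep the old settings
 * in *saved so they can be restored later.  Returns 0 on success, -1 on
 * error (for example when fd is not a terminal). */
int tty_set_raw(int fd, struct termios *saved)
{
    struct termios t;

    assert(saved != NULL);
    if (tcgetattr(fd, saved) == -1)
        return -1;
    t = *saved;
    t.c_lflag &= ~(ICANON | ECHO | ISIG);   /* no line editing, echo, signals */
    t.c_iflag &= ~(IXON | ICRNL | ISTRIP);  /* no flow control, CR mapping, bit stripping */
    t.c_oflag &= ~OPOST;                    /* no output post-processing */
    t.c_cc[VMIN] = 1;                       /* read() returns after one byte */
    t.c_cc[VTIME] = 0;                      /* no inter-byte timeout */
    return tcsetattr(fd, TCSAFLUSH, &t);
}

/* Put back the settings saved by tty_set_raw(). */
int tty_restore(int fd, const struct termios *saved)
{
    assert(saved != NULL);
    return tcsetattr(fd, TCSAFLUSH, saved);
}
```

A filter would call tty_set_raw(0, &saved) before its read()/write()
loop and tty_restore(0, &saved) on the way out, so the terminal is not
left unusable.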

   You may want to take a look at Huffman codes and other variations of
frequency-dependent codes.

-- 
					Eddie Wyatt

They say there are strangers, who threaten us
In our immigrants and infidels
They say there is strangeness, too dangerous
In our theatres and bookstore shelves
Those who know what's best for us-
Must rise and save us from ourselves

Quick to judge ... Quick to anger ... Slow to understand...
Ignorance and prejudice and fear [all] Walk hand in hand.
					- RUSH 

bzs@bu-cs.UUCP (04/12/87)

>I wish to write a program in C that I can somehow set up so all output
>to my terminal goes through it first, and all input from my terminal goes
>through it too.

You don't say what version of UNIX (maybe I should teach an Emacs key
to insert this at the top as a line eater giveaway :-)

On 4.2/4.3 you want to do this via pty's. The script program is almost
exactly what you want as a starting point (I don't know if you have
access to sources.) Gnu Emacs is a program available publicly in source
form which does this sort of thing (actually, if you run GNU emacs you
could probably write this as an embedded lisp function, see terminal.el.)

On SYSV I think you'll have difficulties doing what you say, especially
for terminal oriented programs (you mention vi.)

	-Barry Shein, Boston University

mcvoy@uwvax.UUCP (04/13/87)

(Someone) asks:
>>I wish to write a program in C that I can somehow set up so all output
>>to my terminal goes through it first, and all input from my terminal goes
>>through it too.

(Barry Shein) writes:
>On 4.2/4.3 you want to do this via pty's. The script program is almost
>
>On SYSV I think you'll have difficulties doing what you say, especially
>for terminal oriented programs (you mention vi.)

Barry is quite right here (right, right, you're bloody well right [oops:
digression]).  To put it more succinctly, you cannot do this without some
sort of pseudo tty support (I _know_, I've tried).  The problem is that
a lot of programs try to be (too?) smart about their environment and
what they do is call isatty(3), which returns false if its arg is not
a file descriptor connected to a tty.  [I learned this after writing
a history filter for xlisp: it worked fine until I tried to call vi]
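
The check in question can be sketched like this. The function name
looks_interactive is hypothetical; only isatty() itself is the real
library call. A program that finds a pipe on either side gets 0 here
and typically falls back to non-interactive behaviour.

```c
#include <assert.h>
#include <unistd.h>

/* Returns 1 when both descriptors refer to terminals, 0 otherwise.
 * This is roughly the test a screen-oriented program makes at startup. */
int looks_interactive(int in_fd, int out_fd)
{
    assert(in_fd >= 0 && out_fd >= 0);   /* caller passes open descriptors */
    return isatty(in_fd) && isatty(out_fd);
}
```

Behind a pipe-based filter, looks_interactive(0, 1) is 0, which is why
a pty (whose slave side does pass isatty) is needed.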

Now, there is a way around this problem:  It's a kludge, and
depends on you using shared libs (like on the pc7300), but it's worth
mentioning:

Snag the source to isatty (or rewrite it) and have it return true for
file descriptors that are pipes.  Reinstall the function in the shared
lib, reboot and away you go.  Stuff like "vi foo | wc" will be
mishandled, but who does that anyway?  :-)

Yeah, well, you asked, ok?

BTW - SysV.3 *has* pttys - they're called sxt's (I think).  Someone
will jump on this if I'm wrong....
-- 
Larry McVoy 	        mcvoy@rsch.wisc.edu  or  uwvax!mcvoy

"It's a joke, son! I say, I say, a Joke!!"  --Foghorn Leghorn

guy@gorodish.UUCP (04/13/87)

> BTW - SysV.3 *has* pttys - they're called sxt's (I think).  Someone
> will jump on this if I'm wrong....

You bet.  System V has "sxt"s, but they are *not* pseudo-ttys.  They are just
switches that get stuck in front of regular ttys.  To quote from a
comment at the beginning of "sxt.c":

	/*  A real terminal is associated with a number of virtual tty's.
	 *  The real terminal remains the communicator to the low level
	 *  driver,  while the virtual terminals and their queues are
	 *  used by the upper level (ie., ttread, ttwrite).
	...

mcvoy@uwvax.UUCP (04/13/87)

I wrote:
   BTW - SysV.3 *has* pttys - they're called sxt's (I think).  Someone
   will jump on this if I'm wrong....

Guy wrote:
   You bet.  System V has "sxt"s, but they are *not* pseudo-ttys.  They are just
   switches that get stuck in front of regular ttys.  To quote from a

I was wondering about this... I've never used sxt's (pc7300 doesn't run 5.3)
so I don't know what the difference is.   Can't you use them just like
ptty's?  If I wanted to write an application that preprocesses tty input
but is transparent to applications underneath it, can I do that?  I.e., 
a history filter for sh (for those w/o [c|k]sh)...
-- 
Larry McVoy 	        mcvoy@rsch.wisc.edu  or  uwvax!mcvoy

"It's a joke, son! I say, I say, a Joke!!"  --Foghorn Leghorn

bzs@bu-cs.UUCP (04/14/87)

>I was wondering about this... I've never used sxt's (pc7300 doesn't run 5.3)
>so I don't know what the difference is.   Can't you use them just like
>ptty's?  If I wanted to write an application that preprocesses tty input
>but is transparent to applications underneath it, can I do that?  I.e., 
>a history filter for sh (for those w/o [c|k]sh)...
>-- 
>Larry McVoy 	        mcvoy@rsch.wisc.edu  or  uwvax!mcvoy

Actually, sxt's were in the 5.2 release, so your 7300 is still somewhere
around 5.1.

No, sxt's are more like a form of job control. You switch one terminal
among processes via some ioctl's. The others get blocked on input.
Don't be confused by the similarity of the names (is there one?) More
like a window system controlling who is the current tty input window
by clicking a window with the mouse (and guess what it was used for!)

	-Barry Shein, Boston University

guy@gorodish.UUCP (04/14/87)

> Can't you use them just like ptty's?

No, you can't.  That's the whole point.  Think of an "sxt" as an
8-position switch connecting one physical terminal to up to 8 virtual
ones.  Data coming from the physical terminal is routed to the input
queue of one of the 8 virtual terminals.  Data written to any of the
virtual terminals is routed to the output queue of the
physical terminal, unless the virtual terminal in question isn't the
current one and LOBLK is set on that virtual terminal, in which case
the "write" blocks until the virtual terminal becomes the current
one.

There's no place here to plug in a user process so that it sees the
data flying by; the whole point of a pseudo-tty is to plug a user
process into the data stream.

mike@BRL.ARPA (04/15/87)

If your system is BSD, the place to start is with "script", which
inserts a PTY between you and your shell, for purposes of recording
the output.  It would be easily modified to alter the output with
some compression mechanism.
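
A rough sketch of the copy step such a modified "script" would run.
Obtaining the pty is left out (on a modern POSIX system the master side
would come from something like posix_openpt()); the names pump_once,
byte_filter and pass_through are assumptions for illustration and are
not taken from the script source.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical per-byte filter hook. */
typedef unsigned char (*byte_filter)(unsigned char);

/* Identity filter: a compressor or translator would replace this. */
unsigned char pass_through(unsigned char c)
{
    return c;
}

/* Write all n bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const unsigned char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Move whatever one read() delivers from src to dst, passing each byte
 * through f.  Returns the byte count, 0 on end of file, -1 on error. */
ssize_t pump_once(int src, int dst, byte_filter f)
{
    unsigned char buf[512];
    ssize_t n;

    assert(f != NULL);
    do {
        n = read(src, buf, sizeof buf);
    } while (n == -1 && errno == EINTR);
    if (n <= 0)
        return n;
    for (ssize_t i = 0; i < n; i++)
        buf[i] = f(buf[i]);
    return write_all(dst, buf, (size_t)n) == -1 ? -1 : n;
}
```

For output, src would be the pty master and dst the real terminal; for
input, the other way round. A byte-at-a-time hook is enough for
translation; a compressor would need to work on the whole buffer.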

If your system is AT&T System V, this may be more difficult.
	-Mike

rbj@icst-cmr.arpa (Root Boy Jim) (04/17/87)

   My eventual aim is to write an inline data
   compression routine to speed up 1200 baud terminal use (esp. in vi) to
   something approximating 2400 baud.

Sorry, but to get a better baud rate, the program would have to live
in your terminal. I'm surprised nobody mentioned that.

Others have mentioned the use of script. That is the place to start.
I once wrote a filter for an Intecolor 8001 terminal. This beast was
designed back in the days when lower case was rarely used, and so they
thought they would do you a favor by causing the shift key to generate
lower case! I fixed the input with if (isalpha(c)) c ^= ' ';

The thing also wanted a ^Z for backspace, but the tty driver always
uses ^H, so on output if (c == 8) c = 26; was needed. But wait! Character
addressing was ^C followed by absolute x/y coordinates, and other
escape sequences could have ^H in them, so I ended up with a generalized
output parser before it was all over.
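
The two fixes, and the reason the output side needs state, might look
like the sketch below. It assumes exactly two coordinate bytes follow
^C (the text above says only "absolute x/y coordinates") and it ignores
the other escape sequences mentioned. The names fix_input, fix_output
and struct out_state are made up.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Input side: swap upper and lower case on letters. */
unsigned char fix_input(unsigned char c)
{
    if (isalpha(c))
        c ^= ' ';
    return c;
}

/* Output side: remembers how many coordinate bytes are still to come. */
struct out_state {
    int skip;
};

unsigned char fix_output(struct out_state *s, unsigned char c)
{
    assert(s != NULL);
    if (s->skip > 0) {      /* coordinate byte: pass through untouched */
        s->skip--;
        return c;
    }
    if (c == 3) {           /* ^C starts cursor addressing (assumed: 2 bytes follow) */
        s->skip = 2;
        return c;
    }
    if (c == 8)             /* ^H from the tty driver becomes ^Z */
        return 26;
    return c;
}
```

Without the skip counter, a cursor address whose x or y happened to be
8 would be rewritten to 26 and the cursor would land in the wrong place.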

We also have an AED that does cursor motion by pixels, not chars. It's
a toss up whether I hack script or implement another :cm: capability.
I'll probably do the former, it's less linking.

	(Root Boy) Jim "Just Say Yes" Cottrell	<rbj@icst-cmr.arpa>
	YOW!!  The land of the rising SONY!!

smvorkoetter%watmum.waterloo.edu@RELAY.C (Stefan M. Vorkoetter) (04/17/87)

What do you mean, the program would have to live in my terminal?  If you
mean that my terminal would have to know about the data compression, then
that is no problem.  I am not using a terminal; I am using a terminal 
emulator which my company (S. M. Vorkoetter Software) markets, and to which
I obviously have the source code.  My idea is to compress data going from
the host, and uncompress it when it reaches the emulator, a process that is
completely transparent to the user of the system.

rbj@icst-cmr.arpa (Root Boy Jim) (04/17/87)

? What do you mean, the program would have to live in my terminal?

Just what I said. 

? If you
? mean that my terminal would have to know about the data compression, then
? that is no problem.  I am not using a terminal; I am using a terminal 
? emulator which my company (S. M. Vorkoetter Software) markets, and to which
? I obviously have the source code.

My mistake, I didn't really think about emulators as terminals.

?  My idea is to compress data going from
? the host, and uncompress it when it reaches the emulator, a process that is
? completely transparent to the user of the system.

Except that in order to compress data, you need to collect statistics on
frequency use. So you need to synchronize the xmt'er and rcv'er tables,
unless you periodically transmit them. Also,
suppose you *can* compress the data, say, four input bytes down to three.
What do you do when the third character is a line feed, or if you are
using character I/O for input? You lose your advantage, and you force
some means of deciding whether to buffer and encode or send it now,
because the data is needed.

It's *NOT* hopeless, but it's not trivial either. Telnet protocol
provides `macros' as well as other features. Other protocols provide
abbreviations for repeated characters. Have fun in any case.
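
"Abbreviations for repeated characters" can be sketched as a run-length
coder. The marker byte RLE_ESC, the three-byte run format and the
function names are assumptions for illustration; they are not taken
from Telnet or from any other protocol.

```c
#include <assert.h>
#include <stddef.h>

#define RLE_ESC 0x1D   /* assumed marker byte, followed by count and byte */

/* Encode n bytes; out needs room for 3 * n bytes in the worst case. */
size_t rle_encode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t k = 0, i = 0;

    assert(out != NULL);
    while (i < n) {
        size_t run = 1;
        while (i + run < n && in[i + run] == in[i] && run < 255)
            run++;
        if (run >= 4 || in[i] == RLE_ESC) {   /* worth abbreviating, or must be escaped */
            out[k++] = RLE_ESC;
            out[k++] = (unsigned char)run;
            out[k++] = in[i];
        } else {
            for (size_t j = 0; j < run; j++)
                out[k++] = in[i];
        }
        i += run;
    }
    return k;
}

/* Decode; out must be large enough for the expanded data. */
size_t rle_decode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t k = 0, i = 0;

    assert(out != NULL);
    while (i < n) {
        if (in[i] == RLE_ESC && i + 2 < n) {
            for (unsigned j = 0; j < in[i + 1]; j++)
                out[k++] = in[i + 2];
            i += 3;
        } else {
            out[k++] = in[i++];               /* literal byte */
        }
    }
    return k;
}
```

A screen redraw full of blank-padded lines shrinks a good deal under
this scheme, while a single typed character costs nothing extra, which
sidesteps the "buffer or send now" question for echoes.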

	(Root Boy) Jim "Just Say Yes" Cottrell	<rbj@icst-cmr.arpa>
	I want to kill everyone here with a cute colorful Hydrogen Bomb!!

JOSH@ibm.COM (Joshua W. Knight) (04/17/87)

In response to earlier remarks by Jim Cottrell, Stefan Vorkoetter
(smvorkoetter%watmum.waterloo.edu) asks

 >
 > What do you mean, the program would have to live in my terminal?  If you
 > mean that my terminal would have to know about the data compression, then
 > that is no problem.  I am not using a terminal; I am using a terminal
 > emulator which my company (S. M. Vorkoetter Software) markets, and to which
 > I obviously have the source code.  My idea is to compress data going from
 > the host, and uncompress it when it reaches the emulator, a process that is
 > completely transparent to the user of the system.

To which Jim Cottrell replies:

 >
 > Except that in order to compress data, you need to collect statistics on
 > frequency use. So you need to synchronize the xmt'er and rcv'er tables,
 > unless you periodically transmit them. Also,
 > suppose you *can* compress the data, say, four input bytes down to three.
 > What do you do when the third character is a line feed, or if you are
 > using character I/O for input? You lose your advantage, and you force
 > some means of deciding whether to buffer and encode or send it now,
 > because the data is needed.
 >
 > It's *NOT* hopeless, but it's not trivial either. Telnet protocol
 > provides `macros' as well as other features. Other protocols provide
 > abbreviations for repeated characters. Have fun in any case.
 >
 >     (Root Boy) Jim "Just Say Yes" Cottrell    <rbj@icst-cmr.arpa>
 >     I want to kill everyone here with a cute colorful Hydrogen Bomb!!
 >

Internal to IBM there is a terminal emulation program that does exactly
what Stefan asks about, but for VM/CMS.  It uses a modification of the
Lempel-Ziv compression algorithm which constantly updates the "tables"
(actually a cache-like algorithm).  On normal English text, this method
commonly gets a factor of 2 compression.

			Josh Knight
			IBM T.J. Watson Research Center
josh@ibm.com, josh@yktvmh.BITNET

bzs@bu-cs.BU.EDU (Barry Shein) (04/17/87)

>Internal to IBM there is a terminal emulation program that does exactly
>what Stefan asks about, but for VM/CMS.  It uses a modification of the
>Lempel-Ziv compression algorithm which constantly updates the "tables"
>(actually a cache-like algorithm).  On normal English text, this method
>commonly gets a factor of 2 compression.
>
>			Josh Knight

But Josh, doesn't this take advantage of the half-duplex nature of the
terminals involved? I assume this is aimed at IBM327x terminals where
you essentially do local formatting of an entire screen and then hit
ENTER to send the entire screen at once. I think what Jim was worrying
about was full-duplex, character at a time terminal I/O. How well does
this compression work if all you hit are Function keys? I thought so.

Anyhow, now *I'LL* throw my 3c in...you're both right.

Jim, it won't work well on input, but who cares? It will work well
on OUTPUT which tends to come in much more voluminous bursts, single
chars (eg. echos) may as well come plaintext, who cares? 1200b is
plenty fast enough. But if you hit ^L for a full screen redraw then
it should be advantageous, or MORE a file etc.

So (to the original poster) don't listen to any of us, we do this
all the time (prove that the sun does/doesn't rise in the east), just
go ahead and write your program and post it when it's working, thanks.

	-Barry Shein, Boston University

kdavies@dalcsug.UUCP (04/17/87)

In article <6447@bu-cs.BU.EDU> bzs@bu-cs.BU.EDU (Barry Shein) writes:
>
>>I wish to write a program in C that I can somehow set up so all output
>>to my terminal goes through it first, and all input from my terminal goes
>>through it too.
>
>You don't say what version of UNIX (maybe I should teach an Emacs key
>to insert this at the top as a line eater giveaway :-)
>
>On SYSV I think you'll have difficulties doing what you say, especially
>for terminal oriented programs (you mention vi.)
>
>	-Barry Shein, Boston University


Actually, I had to do a program like this under Xenix 3.0 using a C compiler
that was ~1983 vintage. I had to "catch" all user input and had to "catch"
all output from a program (this was for security reasons).
(Neat way to let someone use vi without getting a 'shell' :-)

Anyway, it worked just great. The program is run as a front-end and a back-end
to the system program. The front-end sets the terminal in CBREAK mode (I think
the back-end did too) and just did multiple getc's.  And YES, it DOES work
well with vi. Almost guaranteed to work under 4.x BSD, definitely SysV (any
flavour). It doesn't use curses but does use the ioctl calls to set up the
terminal.

A great tool for security reasons -- that's why it was done in the first place.
If enough are interested I will post it, otherwise I will email it.

BEWARE: May not be the neatest (lintest?) program but it does the job.
Oh, it forks itself and the system program. This could be changed with a
command line argument I guess.


-- 
Kevin Davies	 ...{seismo|watmath|utai|garfield} !dalcs!dalcsug!kdavies
Kirk :  "Spock, I do wish you'd stop using those colourful metaphors"
Spock:  "The _hell_ I will, Captain"
---------------------------------------------------------------

ken@argus.UUCP (Kenneth Ng) (04/18/87)

In article <1097@ius2.cs.cmu.edu>, edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:
[edited query on trying to compress data coming across 1200 baud lines]
>    You may want to take a look at Huffman codes and other variations of
> frequency-dependent codes.

I would recommend work done by Lempel, Ziv, and Welch.  Huffman
typically depends on having the entire file to be compressed first.  This
would be quite difficult in a terminal session (:->.  Lempel-Ziv methods
build dual tree structures which are kept in sync on both ends of the
finite bandwidth medium.
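
How the two ends stay in sync can be seen in a small LZW-style sketch.
This is a textbook variant written for illustration, not the algorithm
of any program mentioned in this thread. The 4096-entry table, the use
of plain ints for codes (a real link would pack them into 12 bits) and
all the names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define LZW_MAX 4096               /* assumed table size (12-bit codes) */

struct lzw_dict {
    int prefix[LZW_MAX];           /* code of the string minus its last byte */
    unsigned char last[LZW_MAX];   /* last byte of the string */
    int n;                         /* codes defined so far */
};

static void lzw_init(struct lzw_dict *d)
{
    for (d->n = 0; d->n < 256; d->n++) {   /* codes 0..255: single bytes */
        d->prefix[d->n] = -1;
        d->last[d->n] = (unsigned char)d->n;
    }
}

/* Linear search keeps the sketch short; a real coder would hash. */
static int lzw_find(const struct lzw_dict *d, int prefix, unsigned char c)
{
    for (int i = 256; i < d->n; i++)
        if (d->prefix[i] == prefix && d->last[i] == c)
            return i;
    return -1;
}

/* Both ends stop growing the table at the same point, so they stay in step. */
static void lzw_add(struct lzw_dict *d, int prefix, unsigned char c)
{
    if (d->n < LZW_MAX) {
        d->prefix[d->n] = prefix;
        d->last[d->n++] = c;
    }
}

/* Write the string for code into buf and return its length. */
static size_t lzw_expand(const struct lzw_dict *d, int code, unsigned char *buf)
{
    size_t len = 0;
    for (int c = code; c >= 0; c = d->prefix[c])
        buf[len++] = d->last[c];   /* collected last byte first */
    for (size_t i = 0; i < len / 2; i++) {
        unsigned char t = buf[i];
        buf[i] = buf[len - 1 - i];
        buf[len - 1 - i] = t;
    }
    return len;
}

/* Sender: out needs room for n codes.  Returns the number of codes. */
size_t lzw_encode(const unsigned char *in, size_t n, int *out)
{
    struct lzw_dict d;
    size_t k = 0;
    int cur = n ? in[0] : 0;

    lzw_init(&d);
    for (size_t i = 1; i < n; i++) {
        int next = lzw_find(&d, cur, in[i]);
        if (next < 0) {            /* unknown string: emit, learn, restart */
            out[k++] = cur;
            lzw_add(&d, cur, in[i]);
            next = in[i];
        }
        cur = next;
    }
    if (n)
        out[k++] = cur;
    return k;
}

/* Receiver: rebuilds the same table from the codes alone. */
size_t lzw_decode(const int *codes, size_t n, unsigned char *out)
{
    struct lzw_dict d;
    unsigned char buf[LZW_MAX];
    size_t k = 0;

    lzw_init(&d);
    for (size_t i = 0; i < n; i++) {
        int code = codes[i], known = code < d.n;
        size_t len;
        assert(code >= 0 && code < LZW_MAX && (i ? code <= d.n : code < 256));
        if (i == 0) { out[k++] = (unsigned char)code; continue; }
        /* New entry = previous string + first byte of this one. */
        len = lzw_expand(&d, known ? code : codes[i - 1], buf);
        lzw_add(&d, codes[i - 1], buf[0]);
        if (!known)
            len = lzw_expand(&d, code, buf);
        for (size_t j = 0; j < len; j++)
            out[k++] = buf[j];
    }
    return k;
}
```

No table is ever transmitted: the receiver learns each entry one code
after the sender does. The assert in lzw_decode is where a corrupted
code would be caught, and a code damaged in a less obvious way would
quietly put the two tables out of step.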

Just a note to the skeptics, it has been done, not on Unix, but it
has been done, and it works quite well.  What I like best is that
if you display page a, go down to page b, and then up to page a, the
second time page a is displayed, the display is much faster.

> 					Eddie Wyatt
-- 
Kenneth Ng: Post office: NJIT - CCCC, Newark New Jersey  07102
uucp !ihnp4!allegra!bellcore!argus!ken
     ***   WARNING:  NOT ken@bellcore.uucp ***
bitnet(prefered) ken@orion.bitnet

Kirk: "I don't care if you hit the broadside of a barn"
Spock: "Why should I aim at such an object?"

hutch@sdcsvax.UUCP (04/18/87)

<>
Well, after you get your streams/pty/tty code finished, you will still
be stuck with a truly nasty problem.  Line errors.  If you compress
the signal (characters) at 2:1, you will be multiplying the effects of
line noise.  If you use a state table driven compressor ala LZ you will
have to write code to recover from noise (interesting trick that will be),
or else go into hopeless noise mode until manual intervention.

With MNP and a good line, you may be able to get away with a lot, but
during this response I have been troubled by more donuts and squiggles
than I would care to have multiplied by any number > 1.0.

Good luck with it.  With modems at ~$250 for 2400 baud, it would still be
nice to be told that I am completely wrong and can have a 4800+ baud connection
to a local host from my pc (non-IBM).

-- 
    Jim Hutchison   		UUCP:	{dcdwest,ucbvax}!sdcsvax!hutch
		    		ARPA:	Hutch@sdcsvax.ucsd.edu
Disklame'r:
    One greater than the greatest signature representable with 184 symbols.

casey@vangogh.Berkeley.EDU (Casey Leedom) (04/21/87)

In article <3011@sdcsvax.UCSD.EDU> hutch@sdcsvax.UCSD.EDU (Jim Hutchison) writes:
> Well, after you get your streams/pty/tty code finished, you will still be
> stuck with a truly nasty problem.  Line errors. ... If you use a state table
> driven compressor ala LZ you will have to write code to recover from noise
> (interesting trick that will be), or else go into hopeless noise mode until
> manual intervention.

  Run an error free protocol and LZ on top of that.  What you'll probably end
up with is something barely above 1200 baud, but error free.  I'd take that.

Casey.

ken@argus.UUCP (Kenneth Ng) (04/21/87)

In article <3011@sdcsvax.UCSD.EDU>, hutch@sdcsvax.UCSD.EDU (Jim Hutchison) writes:
> <>
> Well, after you get your streams/pty/tty code finished, you will still
> be stuck with a truly nasty problem.  Line errors.  If you compress
> the signal (characters) at 2:1, you will be multiplying the effects of
> line noise.  If you use a state table driven compressor ala LZ you will
> have to write code to recover from noise (interesting trick that will be),
> or else go into hopeless noise mode until manual intervention.

Actually it's worse, you don't multiply the effects of line noise, you
propagate it!  The one saving grace is that truly adaptive LZ methods
tend to isolate, and then remove the effects of line noise, in most cases.
Shortly after a local flood from upstairs, I saw the effects of REALLY
BAD lines, the screen threw a fit, and then the program reset the table,
and displayed the screen correctly.  Last year I saw an LZ based program
on a telephone call across the country via ordinary 1200 baud modems, with
only occasional line problems.  But you must remember, if you can reduce
the bytes sent by half, how expensive is a byte to do a CRC check?  Granted
there will be problems with finding optimal packet size, but the possibilities
are interesting.
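
A per-packet check of this kind could be sketched as follows. The
choice of the 16-bit CCITT polynomial (0x1021, initial value 0xFFFF) is
an assumption, so the cost here is two bytes per packet rather than
one; the function name is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16 over n bytes, polynomial 0x1021, initial value 0xFFFF. */
uint16_t crc16_ccitt(const unsigned char *p, size_t n)
{
    uint16_t crc = 0xFFFF;

    assert(p != NULL || n == 0);
    while (n--) {
        crc ^= (uint16_t)(*p++ << 8);      /* bring the next byte into the top */
        for (int i = 0; i < 8; i++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}
```

The sender appends the CRC to each packet of compressed data; the
receiver recomputes it and, on a mismatch, asks for a resend or resets
its table instead of decoding garbage.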

>     Jim Hutchison   		UUCP:	{dcdwest,ucbvax}!sdcsvax!hutch
> 		    		ARPA:	Hutch@sdcsvax.ucsd.edu

-- 
Kenneth Ng: Post office: NJIT - CCCC, Newark New Jersey  07102
uucp !ihnp4!allegra!bellcore!argus!ken
     ***   WARNING:  NOT ken@bellcore.uucp ***
bitnet(prefered) ken@orion.bitnet

Kirk: "I don't care if you hit the broadside of a barn"
Spock: "Why should I aim at such an object?"

dlc@zog.cs.cmu.edu (Daryl Clevenger) (04/21/87)

    A professor who used to be here at CMU put data compression firmware
into Concept 100/108 terminals.  His name was Leonard Zubkoff, thus the
terminals became Concept-lnz's.  There is a technical paper here, but I have
never read it and I don't know if he published any official paper.  If anyone
cares, send me mail and I will see what I can dig up.  Unfortunately, I don't
think that the I/O drivers use this feature and I think the only program that
actually takes advantage of it is Gosling's emacs.  It may not be that
helpful for a software approach, but the algorithm used may be able to be
applied in software.  I imagine it uses some variant of Lempel-Ziv, but I
really don't know.

							Daryl Clevenger

jsdy@hadron.UUCP (Joseph S. D. Yao) (04/27/87)

In article <3448@rsch.WISC.EDU> mcvoy@rsch.WISC.EDU (Larry McVoy) writes:
>I was wondering about this... I've never used sxt's (pc7300 doesn't run 5.3)
>so I don't know what the difference is.   Can't you use them just like
>ptty's?  If I wanted to write an application that preprocesses tty input
>but is transparent to applications underneath it, can I do that?

Sxt's are an attempt to use the xt driver (for the blit / tty5620) for
other purposes.  It is  n o t  a general-purpose pty driver.  On the
BLIT, it is used to have one TTY port mapped to <= 7 windows.  What
happens is that xt000 (xt/000) is the control xt, and the other seven
(or six, I'm not 100% sure) are used to attach to different processes
in the different windows.  Similarly, with the sxt driver and shl
(which is the s5 "job control" shell), you have shl attached to sxt/000
and the processes attached to sxt/00[1-7].  The next shl takes sxt/01?,
etc.  You may not shl (or run layers on a blit) unless you are actually
on an interactive tty; which means that if you are on an xt or sxt (a
layer or shl sub-shell), then you can't re-shl to get another layer of
sub-shells, so you are limited to seven! (six?)  Curiously, on the VAX
11/7??, the console terminal was NOT considered an interactive tty, as
it was a subdriver of the front-end-processor driver!  (I modified both
xt and sxt drivers to understand that it really was kosher.)

	Joe Yao		jsdy@hadron.COM (not yet domainised)
	hadron!jsdy@{seismo.CSS.GOV,dtix.ARPA,decuac.DEC.COM}
{arinc,att,avatar,cos,decuac,dtix,ecogong,kcwc}!hadron!jsdy
     {netex,netxcom,rlgvax,seismo,smsdpg,sundc}!hadron!jsdy