jerry@oliveb.UUCP (Jerry Aguirre) (07/20/85)
Considering the number of garbled articles and the constant discussion about what site using what form of batching clobbered what line, shouldn't we consider adding a header line containing a checksum?

The checksum would have to include the unchanging header lines and the body of the article. The path, forwarding version, xref, etc. headers would have to be excluded from the checksum. Or, the checksum could be recalculated as the headers are modified.

With a checksum and the sendbad control message we might have some chance of cleaning up this problem. The log entry left by the sendbad control message would also be an indication to the SA that there is something wrong in the feed.

The only hard part of this is coming up with a checksum that is portable across the wide variety of machines on the net.

Comments?

Jerry Aguirre @ Olivetti ATC
{hplabs|fortune|idi|ihnp4|tolerant|allegra|tymix}!oliveb!jerry
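As a rough illustration of the proposal above, the following C sketch checksums an article while skipping the header lines that relaying sites rewrite. Everything named here is an assumption made for illustration: the function names (`stable_sum`, `is_mutable`, `mix`), the exact list of excluded headers (with "Relay-Version:" standing in for the "forwarding version" header), and the placeholder rotate-and-XOR mixing step. It also ignores folded (continuation) header lines and header-name case, which a real implementation would have to handle.

```c
#include <stddef.h>
#include <string.h>

/* Headers assumed to change in transit; this list is illustrative only. */
static const char *const mutable_headers[] = {
    "Path:", "Relay-Version:", "Xref:"
};

/* Return 1 if the line starts with one of the excluded header names. */
static int is_mutable(const char *line, size_t len)
{
    size_t i;

    for (i = 0; i < sizeof mutable_headers / sizeof mutable_headers[0]; i++) {
        size_t n = strlen(mutable_headers[i]);
        if (len >= n && strncmp(line, mutable_headers[i], n) == 0)
            return 1;
    }
    return 0;
}

/* Placeholder mixing step: 16-bit rotate left by one, then XOR the byte. */
static unsigned int mix(unsigned int sum, const char *p, size_t len)
{
    const unsigned char *u = (const unsigned char *)p;
    size_t i;

    for (i = 0; i < len; i++) {
        sum = ((sum << 1) | (sum >> 15)) & 0xFFFFu;
        sum ^= u[i] & 0xFFu;
    }
    return sum;
}

/* Checksum a NUL-terminated article: all body lines, plus every header
 * line except the excluded ones. The first blank line ends the headers. */
unsigned int stable_sum(const char *article)
{
    unsigned int sum = 0;
    int in_headers = 1;
    const char *line = article;

    while (*line != '\0') {
        const char *nl = strchr(line, '\n');
        size_t len = nl ? (size_t)(nl - line) + 1 : strlen(line);

        if (in_headers && line[0] == '\n')
            in_headers = 0;              /* blank line: body starts here */
        if (!in_headers || !is_mutable(line, len))
            sum = mix(sum, line, len);
        line += len;
    }
    return sum;
}
```

With this arrangement, two copies of an article that differ only in their Path line yield the same value, while a change to any other header or to the body changes it. The alternative mentioned in the post, recalculating the checksum whenever a header is modified, would avoid the exclusion list but would also let a site unknowingly bless an article it had already received garbled.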
guy@sun.uucp (Guy Harris) (07/22/85)
> The only hard part of this is coming up with a checksum that is portable
> across the wide variety of machines on the net.

Given that the vast majority of UUCP connections on USENET run over serial lines using the checksummed "g" protocol, I think we have an existence proof for such a protocol. We may not want to use that protocol, though, due to 1) trade secret restrictions, although Lauren Weinstein could just tell us what the algorithm is, and 2) the fact that at least some implementations of it within UUCP don't compile into correct code with some C compilers.

To make the checksum portable:

1) always checksum bytes, not anything larger - but since we're processing text here, it's unlikely that anybody'd go to the trouble of accumulating two bytes and checksumming two-byte quantities.

2) don't use "native" arithmetic operations like addition and subtraction; there is, I believe, at least one Univac 1100 on the net and it's a one's-complement machine.

3) don't assume that characters are signed or that they're unsigned - there are lots of signed-character VAXes on the net and there are lots of unsigned-character 3Bs on the net.

4) don't assume that casts will be done in the order you think they will - I believe that was the cause of lots of porting problems for the UUCP "g" protocol checksum code.

Guy Harris
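The four rules above could be followed in C roughly as in the sketch below. This is not the UUCP "g" protocol checksum, whose algorithm is not given in this thread; it is a simple 16-bit rotate-and-XOR chosen only because it needs no addition or subtraction. The function name `article_sum` and the 16-bit width are assumptions.

```c
#include <stddef.h>

/* Sketch of a checksum that follows the four portability rules:
 *   1) it consumes one byte at a time;
 *   2) it uses only shifts, AND, OR and XOR on unsigned values,
 *      never addition or subtraction;
 *   3) it reads bytes through an unsigned char pointer and masks
 *      them, so signed-char and unsigned-char machines agree;
 *   4) each conversion gets its own statement, so nothing depends
 *      on the order in which casts inside one expression happen.
 */
unsigned int article_sum(const char *buf, size_t len)
{
    const unsigned char *p = (const unsigned char *)buf;
    unsigned int sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned int byte;

        byte = p[i];
        byte &= 0xFFu;                 /* keep exactly 8 bits per byte */

        /* rotate the 16-bit running value left by one bit */
        sum = ((sum << 1) | (sum >> 15)) & 0xFFFFu;
        sum ^= byte;                   /* fold the byte in */
    }
    return sum;
}
```

The rotation makes the result depend on byte order, so swapped characters are detected as well as changed ones. A stronger function such as a CRC could be substituted under the same four rules; the weak one here just keeps the sketch short.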
wls@astrovax.UUCP (William L. Sebok) (07/25/85)
I rather wish that there were some sort of article consistency check, like a checksum or maybe even just the line count, and that, if a "good" copy of an article arrived at a site after a "bad" copy, the "good" copy would replace the "bad" copy rather than being rejected as a "duplicate article".

The branches in the network give it a potential for redundancy. However, with the present arrangement, if an article is garbled somewhere, the garbled copy is still the one propagated if it arrives at a branch point first.

--
Bill Sebok
Princeton University, Astrophysics
{allegra,akgua,cbosgd,decvax,ihnp4,noao,philabs,princeton,vax135}!astrovax!wls
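The replacement rule described above amounts to a small decision table, sketched here in C. The type and function names (`copy_state`, `action`, `on_arrival`) are hypothetical, and the sketch assumes the site has recorded whether its stored copy passed the consistency check; it does not model how the history file is kept.

```c
/* What the site already holds for this message ID (assumed bookkeeping). */
enum copy_state { NOT_SEEN, HAVE_GOOD, HAVE_BAD };

/* What to do with the copy that has just arrived. */
enum action { STORE, REPLACE, REJECT };

/* Decide the fate of an incoming copy; incoming_ok is nonzero when the
 * new copy passes the consistency check. */
enum action on_arrival(enum copy_state have, int incoming_ok)
{
    switch (have) {
    case NOT_SEEN:
        return STORE;                    /* first copy: keep it, good or bad */
    case HAVE_BAD:
        return incoming_ok ? REPLACE : REJECT;
    case HAVE_GOOD:
    default:
        return REJECT;                   /* ordinary duplicate */
    }
}
```

For the redundancy of the network to help, a site returning REPLACE would presumably also need to forward the good copy to its neighbors, which in turn would apply the same rule.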