[comp.lang.c] Definition of isprint

scjones@sdrc.UUCP (Larry Jones) (12/14/88)

We're using SAS C on an IBM mainframe and just ran into an
interesting problem.  SAS has defined "printing character" in
terms of what is printable on an ancient line printer so some
obviously printable characters such as "{", "}", "[", "]", "\",
and "!" cause isprint() to return 0!

Am I the only one who thinks this is less than useful?

The dpANS seems to allow this, since "printing character" is
implementation defined, but it seems to me that the C source
character set should be specified as printable in the "C" locale.
The problem seems to be that the exact semantics of ispunct() are
not specified for the "C" locale.

Was this an oversight or was it intentional?  I was there, but I
don't remember.
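
One way to see which source characters a given library rejects is a
small check along the following lines.  This is only a sketch: the
names c_source_chars and unprintable_source_chars are made up for
illustration, and the list is assumed to be the graphic characters of
the C source character set plus space.

```c
#include <ctype.h>
#include <stddef.h>

/* Assumed list: the graphic characters of the C source character set,
   plus space.  (Hypothetical name, not from any library.) */
static const char c_source_chars[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~ ";

/* Counts the characters above that isprint() rejects.  A program
   starts in the "C" locale, so no setlocale() call is made here.
   If out is non-null and outsize > 0, the rejected characters are
   also copied into out (truncated if needed) as a string. */
size_t unprintable_source_chars(char *out, size_t outsize)
{
    size_t found = 0, stored = 0;
    const char *p;

    for (p = c_source_chars; *p != '\0'; p++) {
        if (!isprint((unsigned char)*p)) {
            found++;
            if (out != NULL && stored + 1 < outsize)
                out[stored++] = *p;
        }
    }
    if (out != NULL && outsize > 0)
        out[stored] = '\0';
    return found;
}
```

Calling unprintable_source_chars(buf, sizeof buf) from main() and
printing buf lists the offenders.  On an ASCII system one would expect
the count to be zero; on an implementation like the one described
above it presumably would not be.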

----
Larry Jones                         UUCP: uunet!sdrc!scjones
SDRC                                      scjones@sdrc.uucp
2000 Eastman Dr.                    BIX:  ltl
Milford, OH  45150                  AT&T: (513) 576-2070
"Save the Quayles" - Mark Russell

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/15/88)

In article <474@sdrc.UUCP> scjones@sdrc.UUCP (Larry Jones) writes:
>SAS has defined "printing character" in
>terms of what is printable on an ancient line printer so some
>obviously printable characters such as "{", "}", "[", "]", "\",
>and "!" cause isprint() to return 0!

No doubt about it, one does find some stupid interpretations
of the CTYPE macros in many implementations.  Berkeley's was
pretty strange for a release or two (fixed now).

>The dpANS seems to allow this, since "printing character" is
>implementation defined, but it seems to me that the C source
>character set should be specified as printable in the "C" locale.
>The problem seems to be that the exact semantics of ispunct() are
>not specified for the "C" locale.

I don't have the proposed standard at hand, but I believe
that the intention was that in the "C" locale all the C source
graphic characters (glyphs) plus the space character are
supposed to satisfy isprint().  I believe that other non-control
characters were also supposed to be allowed (but not required)
to satisfy isprint() in the "C" locale, since that is the existing
practice and base document requirement from which the standard
was derived.  If indeed we missed this, implementors should
still do it right, if only as a favor to programmers of portable
applications.

I have to admit I don't find much use for isprint().
If it is broken in the standard, I can live with it.

henry@utzoo.uucp (Henry Spencer) (12/16/88)

In article <474@sdrc.UUCP> scjones@sdrc.UUCP (Larry Jones) writes:
>We're using SAS C on an IBM mainframe and just ran into an
>interesting problem.  SAS has defined "printing character" in
>terms of what is printable on an ancient line printer so some
>obviously printable characters such as "{", "}", "[", "]", "\",
>and "!" cause isprint() to return 0!

One should remember that on an IBM system speaking EBCDIC, there *is*
no firm definition of which characters print and which don't, or what
they print as when they print, because there *is* no single character
code named "EBCDIC".  EBCDIC is a large family of different, often
incompatible, character codes.  Some EBCDICs, e.g. the one on the IBM
360/370 summary card (at least, on my old copy of it), do not have some
of those characters at all.  "!" seems a curious omission, but the
location and presence of the other characters you mention are highly
variable in EBCDIC.
-- 
"God willing, we will return." |     Henry Spencer at U of Toronto Zoology
-Eugene Cernan, the Moon, 1972 | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

henry@utzoo.uucp (Henry Spencer) (12/16/88)

In article <9182@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>I don't have the proposed standard at hand, but I believe
>that the intention was that in the "C" locale all the C source
>graphic characters (glyphs) plus the space character are
>supposed to satisfy isprint()...

Alas, the wording at the head of section 4.3 (October draft) just says
"implementation-defined" without narrowing it down further.  A footnote
[note that footnotes are not binding on implementors, as they are not
officially part of the standard] pins down what this means for ASCII,
but not more generally.
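
For an ASCII implementation the footnote can be checked mechanically.
The sketch below assumes the footnote reads as it does in the
published standard (printing characters 0x20 through 0x7E, control
characters 0 through 0x1F plus 0x7F); the function name is invented
for the example, and the check means nothing on a non-ASCII host.

```c
#include <ctype.h>

/* Hypothetical helper: counts the disagreements between isprint() and
   iscntrl() and the assumed ASCII ranges, over the 7-bit codes.
   Only meaningful on an ASCII host, in the "C" locale a program
   starts in. */
int ascii_footnote_mismatches(void)
{
    int c, mismatches = 0;

    for (c = 0; c <= 0x7F; c++) {
        int want_print = (c >= 0x20 && c <= 0x7E);
        int want_cntrl = (c <= 0x1F || c == 0x7F);

        if ((isprint(c) != 0) != want_print)
            mismatches++;
        if ((iscntrl(c) != 0) != want_cntrl)
            mismatches++;
    }
    return mismatches;
}
```

A nonzero result on an ASCII system would mean the library departs
from the footnote, which, being a footnote, it is apparently free to do.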
-- 
"God willing, we will return." |     Henry Spencer at U of Toronto Zoology
-Eugene Cernan, the Moon, 1972 | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/17/88)

In article <1988Dec15.181144.2066@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>Alas, the wording at the head of section 4.3 (October draft) just says
>"implementation-defined" without narrowing it down further.  A footnote
>[note that footnotes are not binding on implementors, as they are not
>officially part of the standard] pins down what this means for ASCII,
>but not more generally.

Yes, I checked this with our fearless Redactor, who assures me that
the Committee deliberately decided not to be more specific about
these functions even in the "C" locale.  Given the wide variety of
display technologies in use, it's not clear that many of these
functions are all that useful anyway.  (isspace() is about the best.)

bill@twwells.uucp (T. William Wells) (12/18/88)

In article <9182@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
: In article <474@sdrc.UUCP> scjones@sdrc.UUCP (Larry Jones) writes:
: >SAS has defined "printing character" in
: >terms of what is printable on an ancient line printer so some
: >obviously printable characters such as "{", "}", "[", "]", "\",
: >and "!" cause isprint() to return 0!

: >The dpANS seems to allow this, since "printing character" is
: >implementation defined, but it seems to me that the C source
: >character set should be specified as printable in the "C" locale.
: >The problem seems to be that the exact semantics of ispunct() are
						       ^isprint()?
: >not specified for the "C" locale.
:
: I don't have the proposed standard at hand, but I believe
: that the intention was that in the "C" locale all the C source
: graphic characters (glyphs) plus the space character are
: supposed to satisfy isprint().  I believe that other non-control
: characters were also supposed to be allowed (but not required)
: to satisfy isprint() in the "C" locale, since that is the existing
: practice and base document requirement from which the standard
: was derived.  If indeed we missed this, implementors should
: still do it right, if only as a favor to programmers of portable
: applications.

Here is what the May 13 draft says:

4.3

"The term `printing character' refers to an implementation-defined
set of characters, each of which occupies one printing position on a
display device;..."

4.3.1.7

"The isprint function tests for any printing character including
space(' ')."

However, the standard doesn't require that all characters which are
printable be "printing characters". In fact, the set of printing
characters could, without contradicting the wording, be empty!  (Or,
maybe if you think of 4.3.1.7 as a constraint on printing characters,
the set could contain only space.)

Uk!

[One more example of why some kind of "reasonable interpretation"
clause needs to be an implicit or explicit part of any standard, even
though it opens up another can of worms.

Standards for something as complicated as C are never complete; there
is always *something* which is left out.  Implementers need to
interpolate this information, but the mere absence of the information
ought not be taken as a licence for wildly inappropriate
implementations.]

---
Bill
{uunet|novavax}!proxftl!twwells!bill

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (12/20/88)

In article <9206@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:

| Yes, I checked this with our fearless Redactor, who assures me that
| the Committee deliberately decided not to be more specific about
| these functions even in the "C" locale.  Given the wide variety of
| display technologies in use, it's not clear that many of these
| functions are all that useful anyway.  (isspace() is about the best.)

  Perhaps what we need is "isC()" which is like isprint(), but defines
any required character in C source to be "printing" for purposes of
compiler action. I usually want to avoid printing "magic" characters
when I use isprint. I suppose that there are (or will be) versions in
which characters like '{' will not be printing, unless represented by an
escape sequence.
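
  A rough sketch of what such an isC() might look like follows.  It is
an assumption-laden illustration, not an existing library routine: the
character list is taken to be the graphic characters of the C source
character set plus space, and the test is a plain table lookup that
ignores the locale and the library's idea of "printing" entirely.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical isC(): nonzero if c is one of the characters assumed
   to be required in C source (graphic characters plus space), zero
   otherwise.  Independent of isprint() and of the current locale. */
int isC(int c)
{
    static const char required[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789"
        "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~ ";

    /* Reject EOF, NUL and anything outside unsigned char range;
       strchr() would otherwise match the terminating NUL for c == 0. */
    if (c <= 0 || c > UCHAR_MAX)
        return 0;
    return strchr(required, c) != NULL;
}
```

  With this, isC('{') and isC('!') are true whatever isprint() says,
while characters such as '@', '$' and '`', which C source does not
require, are rejected.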
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me