[comp.lang.c] Summary: Converting ascii hex to pure hex values

mpledger@cti1.UUCP (Mark Pledger) (10/26/90)

I sure caused a lot of traffic for this little question.

Thanks to everyone who responded.

I guess I didn't make myself clear enough on my question though.  I know I
can use scanf() or cast a char to int (for a single char).  I DIDN'T want to
use scanf() and casting does not work for my question.  The original question
went something like this:  If you have a character string with the ascii 
representation of hex values (e.g. s[] = "63", which will be stored as TWO
hex byte values \x9 & \x9.  I only want to store them as ONE hex byte of \x63.
Without using scanf() (or even sprintf())  what is the best (fastest for me) way
to convert a two-digit ascii code to a one digit hex code, so I can put it back
into the character string s[], append a null, and write it back out to disk.  I
currently use atoi() and just write out the int to disk.  I am interested in
creating (or finding) a routine that will take a character string as the 
argument, and returning the hex result in the same character string.

Any suggestions?



-- 
Sincerely,


Mark Pledger

--------------------------------------------------------------------------
CTI                              |              (703) 685-5434 [voice]
2121 Crystal Drive               |              (703) 685-7022 [fax]
Suite 103                        |              
Arlington, DC  22202             |              mpledger@cti.com
--------------------------------------------------------------------------

chris@mimsy.umd.edu (Chris Torek) (10/27/90)

In article <302@cti1.UUCP> mpledger@cti1.UUCP (Mark Pledger) writes:
>I sure caused a lot of traffic for this little question.
>I guess I didn't make myself clear enough on my question though.

Indeed.

>Without using scanf() (or even sprintf())  what is the best (fastest
>for me) way to convert a two-digit ascii code to a one digit hex code
>so I can put it back into the character string s[], append a null, and
>write it back out to disk.

This question is *still* confused.

The Merriam-Webster dictionary defines `code' with four meanings.  The
one relevant to this discussion is number 3, `a system of signals', or
number 4, `a system of symbols or letters used (as in secret
communication or in a computing machine) with special meanings' (the
latter is more intended to refer to encryption codes and instruction
codes, I think).  In either case, there is no such thing as a `one
digit hex code'.  Hexadecimal is a base, not a value, and as such any
number of hexadecimal digits might be required to represent any
particular value.  Since hexadecimal is base sixteen, a single
hexadecimal digit can represent one of the values from the set
{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15} (where these values are coded
in decimal).  Two hex digits can represent a value in the range
[0..255] (where the `[x..y]' notation implies `all integer values in
the range given with the lower bound x on the left and the upper bound
y on the right, inclusive of both bounds').

Now, as it happens, an eight bit byte, as is found on the most common
computers these days, can also represent a value in the range
[0..255].  The phrase `two-digit ASCII code', however, suggests a pair
of values from the set ['0'..'9'], which can represent at most 100
distinct values (usually the set [0..99]).  I.e., we have an ASCII
encoding of an integral value in [0..99].  There is no way this can be
represented with a single hexadecimal digit.  On the other hand, the
phrase `two-digit ASCII code' in close proximity to `hex code' suggests
that perhaps what is meant is a pair of values from the set
{['0'..'9'],['A'..'F']} (or the same but with lowercase).  Such a pair
can represent at most 256 distinct values, usually the set [0..255].
Coincidentally this corresponds exactly to the set of values that can
be represented by an (unsigned) eight bit byte.  But Mr. Pledger goes
on to say:

>I currently use atoi() and just write out the int to disk.

This suggests that the guess that `two ASCII digits' means `two ASCII
hexadecimal digits' (i.e., the set including the letters A through F,
rather than just the decimal digits zero through nine) is incorrect.
Now we are back to the concept of `two ASCII decimal digits', or a
printable ASCII representation of a value in the range [0..99].
We can now be fairly confident that what is desired is to take two
characters, each of which is guaranteed to be in the range ['0'..'9'],
and come up with an integer value in the range [0..99] according to
the obvious mapping ... until we reach his final sentence:

>I am interested in ... a routine that will take a character string as the 
>argument, and returning the hex result in the same character string.

But `hex', as noted above, is a particular ENCODING of a value.  It
is not a RESULT---the result of transforming two decimal ASCII digits
to a value is an integer in the range [0..99], and has nothing at all
to do with hexadecimal encodings.
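
To make the value-versus-encoding point concrete, here is a minimal sketch
that spells one and the same value in decimal, hexadecimal, and octal.  The
name show_encodings is made up for this illustration, and snprintf() assumes
a C99 or later library.

```c
#include <stdio.h>

/* Hypothetical helper: writes the decimal, hex, and octal spellings of one
   value into buf.  The value never changes; only the digits printed do. */
int show_encodings(char *buf, size_t size, unsigned value)
{
    return snprintf(buf, size, "%u %x %o", value, value, value);
}
```

Called with the value 99, this fills the buffer with "99 63 143": three
encodings, one integer.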

Is it any wonder that such a question generates a multitude of answers?
There is no correct answer; the question is rather like asking how to
construct a circle with four sides.

Now, if all you want to do is convert an ASCII decimal representation of
a value to a binary value stored in a single `char', with the `char' being
the first of the set of `char's that held the original ASCII decimal
representation---and assuming that the original ASCII sequence is ended
with a NUL (character 0) or a non-digit, you can simply write:

	void foo(char *s) { *s = atoi(s); }

If you want the original ASCII sequence modified so that the result is
still a NUL-terminated string, you can add one more statement:

	void foo(char *s) { *s = atoi(s); s[1] = 0; }

If you want to take pairs of ASCII decimal digits and convert each pair
(with exactly two, no more and no fewer, required) into binary values and
store those in the original string, you can change this to:

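	#include <ctype.h>	/* isdigit() */

	void foo(char *s) {
		char *out = s;

		/* isdigit() needs an unsigned char value, hence the casts */
		while (isdigit((unsigned char)*s) && isdigit((unsigned char)s[1])) {
			*out++ = (*s - '0') * 10 + s[1] - '0';
			s += 2;
		}
		*out = 0;
	}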
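
One caveat with the pair-wise version: the pair "00" converts to a zero
byte, which is indistinguishable from the terminating NUL.  A possible
variant (a sketch only; the name dec_pairs_to_bytes is invented here)
returns the number of bytes produced so the caller does not have to rely
on strlen():

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical variant of the loop above: converts pairs of decimal digits
   in place and returns how many bytes were produced. */
size_t dec_pairs_to_bytes(char *s)
{
    char *out = s;
    unsigned char *in = (unsigned char *)s;

    while (isdigit(in[0]) && isdigit(in[1])) {
        *out++ = (char)((in[0] - '0') * 10 + (in[1] - '0'));
        in += 2;
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

For the input "630099" this returns 3 and leaves the bytes 63, 0, 99 in
the array, although strlen() on the result would report only 1.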

If you want something else, you will have to figure out what it is first.
There is no such thing as a `pure hexadecimal value', because hexadecimal
is a representation, not a value.  (Ask yourself just what an `impure
hexadecimal value' might be.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (10/29/90)

In article <302@cti1.UUCP>, mpledger@cti1.UUCP (Mark Pledger) writes:
> I guess I didn't make myself clear enough on my question though. ...
> If you have a character string with the ascii 
> representation of hex values (e.g. s[] = "63",
> which will be stored as TWO hex byte values \x9 & \x9.
> I only want to store them as ONE hex byte of \x63.
> Without using scanf() (or even sprintf())  what is the best (fastest for me)
> way to convert a two-digit ascii code to a one digit hex code,
> so I can put it back into the character string s[], append a null,
> and write it back out to disk.

I still don't see what's wrong with using sscanf().  Suppose
s currently points to "63\0" and you want it to point to "\x63\0".
	{ unsigned i; sscanf(s, "%2x", &i); s[0] = i, s[1] = '\0'; }
does the job just fine.  If you want speed, use a table:

	char hexval[256];	/* for EBCDIC, ISO 646, or ISO 8859 */

	void init_hexval()
	    {
		unsigned char *p;
		for (p = (unsigned char *)"0123456789"; *p; p++)
		    hexval[*p] = *p-'0';
		for (p = (unsigned char *)"ABCDEF"; *p; p++)
		    hexval[*p] = (*p-'A')+10;
		for (p = (unsigned char *)"abcdef"; *p; p++)
		    hexval[*p] = (*p-'a')+10;
		/* hexval is a file-scope array, so it starts out all zero: */
		/* this leaves hexval[c] == 0 for characters c which are    */
		/* not in [0-9A-Fa-f], i.e. bad input is NOT detected       */
	    }

Now do
	s[0] = (hexval[(unsigned char)s[0]] << 4) + hexval[(unsigned char)s[1]], s[1] = '\0';
and presto chango, the job is done.

That is, of course, for the ultimate in speed with absolutely no
error checking at all.  If your objection to using C library functions
applies only to sscanf(), what's wrong with strtol()?
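
For what it is worth, a strtol()-based conversion of one pair might look
like the sketch below.  The name hex_pair_value is hypothetical.  Because
strtol() itself skips leading white space and accepts a sign, the sketch
checks both characters with isxdigit() first.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: convert exactly two hex digits at s into one byte
   value in [0..255].  Returns -1 if s does not start with two hex digits. */
int hex_pair_value(const char *s)
{
    char pair[3];

    if (!isxdigit((unsigned char)s[0]) || !isxdigit((unsigned char)s[1]))
        return -1;
    pair[0] = s[0];
    pair[1] = s[1];
    pair[2] = '\0';
    /* Both characters are known hex digits, so the result is 0..255. */
    return (int)strtol(pair, (char **)0, 16);
}
```

With this, hex_pair_value("63") yields 0x63, while inputs such as "6" or
"+5" are rejected with -1.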

What I'm rather worried by is the "append a NUL and write it back out
to disc" bit.  Why append a NUL?  Why not just do
	{ int byte_value = (hexval[(unsigned char)s[0]] << 4) + hexval[(unsigned char)s[1]];
	  putchar(byte_value);
	}
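
A sketch of the same table idea with error checking added, writing the
decoded bytes straight to a stream instead of back into the string.  This
is a variation, not the code above: the table here holds -1 for
non-digits, and put_hex_bytes is an invented name.

```c
#include <stdio.h>
#include <string.h>

static signed char hexval[256];    /* -1 marks "not a hex digit" */

/* Fill the table; safe to call more than once. */
void init_hexval(void)
{
    const char *upper = "0123456789ABCDEF";
    const char *lower = "abcdef";
    int i;

    memset(hexval, -1, sizeof hexval);
    for (i = 0; upper[i]; i++)
        hexval[(unsigned char)upper[i]] = (signed char)i;
    for (i = 0; lower[i]; i++)
        hexval[(unsigned char)lower[i]] = (signed char)(10 + i);
}

/* Hypothetical helper: decode pairs of hex digits from s and write each
   decoded byte to fp.  Stops at the first character that does not begin a
   complete, valid pair.  Returns the number of bytes written. */
size_t put_hex_bytes(FILE *fp, const char *s)
{
    size_t n = 0;

    for (;;) {
        int hi = hexval[(unsigned char)s[0]];
        int lo;

        if (hi < 0)
            break;                 /* also catches the terminating NUL */
        lo = hexval[(unsigned char)s[1]];
        if (lo < 0)
            break;
        if (putc((hi << 4) | lo, fp) == EOF)
            break;
        s += 2;
        n++;
    }
    return n;
}
```

Since nothing is stored back into a string, a decoded zero byte (from the
pair "00") is written like any other byte and no NUL needs appending.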


-- 
Fear most of all to be in error.	-- Kierkegaard, quoting Socrates.

eager@ringworld.Eng.Sun.COM (Michael J. Eager) (10/31/90)

In article <302@cti1.UUCP> mpledger@cti1.UUCP (Mark Pledger) writes:
>I guess I didn't make myself clear enough on my question though.  I know I
>can use scanf() or cast a char to int (for a single char).  I DIDN'T want to
>use scanf() and casting does not work for my question.  The original question
>went something like this:  If you have a character string with the ascii 
>representation of hex values (e.g. s[] = "63", which will be stored as TWO
>hex byte values \x9 & \x9.  I only want to store them as ONE hex byte of \x63.
>Without using scanf() (or even sprintf())  what is the best (fastest for me) way
>to convert a two-digit ascii code to a one digit hex code, so I can put it back
>into the character string s[], append a null, and write it back out to disk.  I
>currently use atoi() and just write out the int to disk.  I am interested in
>creating (or finding) a routine that will take a character string as the 
>argument, and returning the hex result in the same character string.


Well, I have to agree that the previous posting wasn't clear; but
I'm not sure this one makes anything more clear.

If you want to convert the ASCII representation of an 8-bit hex value
into that value, here is a simple way (although returning the value
in the same string seems abhorrent):

#include <ctype.h>	/* toupper() */

void atox (char *s)
{
  char * p = s;
  int left, right;

  while (s[0] && s[1]) {	/* stop at the end or at an unpaired digit */
    left = *s - '0';
    right = *(s+1) - '0';
    if (left > 9) left = toupper(left + '0') - 'A' + 10;
    if (right > 9) right = toupper(right + '0') - 'A' + 10;
    *p = (left << 4) + right;
    p++;
    s += 2;
  }
}


Note that you need to know how many characters you are passing to
this function so you will know how many bytes are returned.

-- Mike Eager

unhd (Jonathan W Miner) (11/05/90)

In article <302@cti1.UUCP> mpledger@cti1.UUCP (Mark Pledger) writes:
> [... DELETED STUFF ...]
>currently use atoi() and just write out the int to disk.  I am interested in
>creating (or finding) a routine that will take a character string as the 
>argument, and returning the hex result in the same character string.

Try this:

    void convert(s)
         char s[3];
    {
         int value;

         value = ((s[0] - '0') * 100) + ((s[1] - '0') * 10) + (s[2] - '0');
         s[0] = value / 256;
         s[1] = value % 256; 
         s[2] = 0;
    }

This assumes that the input will always be in the range "000" to "999".

-- 
-----------------------------------------------------------------
Jonathan Miner        | I don't speak for UNH, and UNH does not 
jwm775@unhd.unh.edu   | speak for me! 
(603)868-3416         | Rather be downhill skiing... or hacking... 

gwyn@smoke.brl.mil (Doug Gwyn) (11/07/90)

In article <302@cti1.UUCP> mpledger@cti1.UUCP (Mark Pledger) writes:
>If you have a character string with the ascii representation of hex values
>(e.g. s[] = "63", which will be stored as TWO hex byte values \x9 & \x9.

No, the two successive values will be 0x36 and 0x33.

>I only want to store them as ONE hex byte of \x63.  ...  I am interested in
>creating (or finding) a routine that will take a character string as the
>argument, and returning the hex result in the same character string.

This is a trivial exercise.  WARNING: String literals (as opposed to
arrays of char) are not necessarily modifiable.

	void func( char *s ) {
		register int i = s[0] - '0' << 4 | s[1] - '0';
		s[0] = i;
		s[1] = '\0';
	}

This could easily be made into a macro.
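
To illustrate the warning about string literals, here is a minimal sketch
that converts a writable copy rather than the caller's text.  The name
convert_copy is hypothetical, and the input is assumed to start with two
decimal digits.

```c
/* The conversion from the post, with parentheses added for readability:
   two decimal digits become one byte whose hex spelling matches the text. */
static void func(char *s)
{
    int i = (s[0] - '0') << 4 | (s[1] - '0');

    s[0] = (char)i;
    s[1] = '\0';
}

/* Hypothetical wrapper: converts a copy, so it is safe to pass a string
   literal.  Returns the resulting byte value. */
int convert_copy(const char *text)
{
    char buf[3];            /* writable array: two digits plus NUL */

    buf[0] = text[0];
    buf[1] = text[1];
    buf[2] = '\0';
    func(buf);              /* fine: buf is an array, not a literal */
    /* func((char *)text) would be undefined behavior for a literal */
    return (unsigned char)buf[0];
}
```

So convert_copy("63") gives 0x63 without ever writing to the literal.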

bhoughto@cmdnfs.intel.com (Blair P. Houghton) (11/08/90)

In article <14367@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>In article <302@cti1.UUCP> mpledger@cti1.UUCP (Mark Pledger) writes:
>>If you have a character string with the ascii representation of hex values
>>(e.g. s[] = "63", which will be stored as TWO hex byte values \x9 & \x9.
>
>No, the two successive values will be 0x36 and 0x33.

No, the three successive values will be 0x36, 0x33, and 0x00.
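
For anyone who wants to check, a small sketch that formats every byte of
an array, terminator included, as two hex digits.  The name dump_bytes is
made up, and the values quoted above assume an ASCII system.

```c
#include <stdio.h>

/* Hypothetical helper: format each of the n bytes as two-digit hex,
   separated by spaces.  Assumes out has room for 3*n bytes (at least 1). */
void dump_bytes(char *out, const char *bytes, size_t n)
{
    size_t i;

    out[0] = '\0';
    for (i = 0; i < n; i++)
        out += sprintf(out, i ? " %02x" : "%02x",
                       (unsigned)(unsigned char)bytes[i]);
}
```

Given char s[] = "63" and a length of sizeof s (which is 3), the output
on an ASCII machine is "36 33 00".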

				--Blair
				  "Dead horse.  Whip.  Whap!"

sasrer@unx.sas.com (Rodney Radford) (11/10/90)

In article <302@cti1.UUCP> mpledger@cti1.UUCP (Mark Pledger) writes:
>I sure caused a lot of traffic for this little question.
>
>Thanks to everyone who responded.
>
>I guess I didn't make myself clear enough on my question though.  I know I
>can use scanf() or cast a char to int (for a single char).  I DIDN'T want to
>use scanf() and casting does not work for my question.  The original question
>went something like this:  If you have a character string with the ascii 
>representation of hex values (e.g. s[] = "63", which will be stored as TWO
>hex byte values \x9 & \x9.  I only want to store them as ONE hex byte of \x63.
>Without using scanf() (or even sprintf())  what is the best (fastest for me) way
>to convert a two-digit ascii code to a one digit hex code, so I can put it back
>into the character string s[], append a null, and write it back out to disk.  I
>currently use atoi() and just write out the int to disk.  I am interested in
>creating (or finding) a routine that will take a character string as the 
>argument, and returning the hex result in the same character string.
>
>Any suggestions?

I think the following routine (and its test program) will do what you want...

/**********************************************************************/
/* Simple test program for the function atohex() below.               */
/**********************************************************************/

#include <stdio.h>

void atohex(char *s);

int main(void)
{
    char test_string[] = "3132332e2E2E20697420776F726B7321";

    printf("input ascii string is................ '%s'\n", test_string);
    atohex(test_string);
    printf("output converted string is........... '%s'\n", test_string);
    return 0;
}

/**********************************************************************/
/* Author: Rodney Radford       SAS Institute, Cary, NC 27513         */
/**********************************************************************/
/* The following function atohex() converts the printable hex string, */
/* that is its first argument, into a new string,  whose length is    */
/* 1/2 the original string's length, of the decoded hex values (???). */
/**********************************************************************/
/* ASSUMPTION: in its current form, this routine assumes that the     */
/*     input string's length is even. If this is not ALWAYS true then */
/*     you should #define NOT_SURE_ABOUT_EVEN_LENGTH.                 */
/* This also assumes the ASCII character set, but can be in either    */
/* upper or lower case.                                               */
/**********************************************************************/

#define ASCTOHEX(c) (((c) > '9') ? (((c)&0x1F)+9) : ((c)-'0'))

void atohex(char *s)
{
   char *source,*dest;

   for (source=dest=s; *source; source+=2,dest++) {
#ifdef NOT_SURE_ABOUT_EVEN_LENGTH
      if (*(source+1) == '\0') break;
#endif
      *dest = (ASCTOHEX(*source)<<4) + ASCTOHEX(*(source+1));
      }
   *dest = '\0';
}
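
If the ASCII assumption is a concern, one possible replacement for the
ASCTOHEX macro looks each digit up in a string of digits instead of doing
arithmetic on character codes.  This is only a sketch; hex_digit_value is
an invented name and it is slower than the macro.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical replacement for ASCTOHEX: finds the digit's position in a
   string, so it does not depend on ASCII.  Returns 0..15, or -1 if c is
   not a hex digit. */
int hex_digit_value(int c)
{
    static const char digits[] = "0123456789abcdef";
    const char *p;

    if (c == '\0')
        return -1;          /* strchr would otherwise match the terminator */
    p = strchr(digits, tolower((unsigned char)c));
    return p ? (int)(p - digits) : -1;
}
```

Inside the loop one would then combine hex_digit_value(source[0]) and
hex_digit_value(source[1]) as before, after checking that neither is -1.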

-- 
Rodney E. Radford        SAS Institute, Inc.        sasrer@unx.sas.com
DG/UX AViiON developer   Box 8000, Cary, NC 27512   (919) 677-8000 x7703