[comp.lang.ada] C Strings in Ada?

defaria@hpclapd.HP.COM (Andy DeFaria) (06/05/90)

I would like to create a small package that interfaces to some C code via a
pragma INTERFACE and I have a question to all you Ada pros out there.  What
is the best way to represent C_Strings in Ada?  C's notion of a string is a
pointer to a  char of undetermined length, terminated  by a null character.
Ada has a  totally  different notion of  what  a string is.   I  can  see 3
instances where I need to use a "C_String":

	1) As parameters to a subprogram call.
	2) As a value returned from a function call.
	3) As a part of a record passed to (or returned by) a function.

I am concerned  with usability  and  efficiency as I  intend to reuse  this
package.  I  could  declare an  Ada  string  and pass the 'ADDRESS   to the
interfaced C  code  but then  I  need to append the   null character before
calling  the routine in  the package.  I could have  the package append the
null character but this would require one or two heap allocations  on every
call  and  return which I  consider a big drawback  from  the  viewpoint of
efficiency.  I have tried defining this "C_String" as it is described to C,
as an access to CHARACTER but  with this representation  I can  not look at
the  value of the Ada string  in the debugger  because the debugger faithfully
shows me only the first character.  I consider this a big drawback from the
viewpoint of usability.  And how should  strings be  described as part of a
record structure?

I can see four possibilities, none of which are ideal:

(1) Represent C strings as access types to STANDARD.CHARACTER

The pointer points to a  null terminated string.  Since this  uses the same
representation as C no conversions  are  necessary.   I would  then have to
provide subprograms to convert regular Ada strings into this data type.

Advantages: 	Speed, Efficiency
Disadvantages:	Only the first character is visible in the debugger.
		Need to call the conversion routines before making the call
		to the package.  No conversion is needed if the parameter
		itself was obtained from a previous call to one of my
		routines in the package or if the same value was used as a
		parameter in a previous call.

(2) Represent the strings as Ada strings

In this scheme, the caller would  use only Ada  strings and my package will
always perform the conversion of the string into  a null terminated string.
  

Advantages:	Ada-like, only one interpretation of a "string" needs to be
		considered     by    the    users      of     the  package.
Disadvantages:	The heap manager has to be called not only to allocate
		strings, but  also  records that  contain  strings.   Since
		there  are now C  versions of records  and Ada versions  of
		records, a subprogram should  be provided  to convert the C
		version of  the record to  the Ada version  of  the record.
		Also messier to implement and  users of my package may come
		upon the C version of the record by chasing pointers within
		a record.

(3) Represent the C strings as Ada strings, but ...

In this scheme,  the caller of my package  should use Ada strings that  are
terminated with the null character thereby placing  some  responsibility on
them.  My package can  then pass these  strings directly to  the C routines
I'm interfacing to.  If I discover that the caller was lazy and shucked his
responsibility by not terminating the string  with a null character, then I
can do  it for them  and thereby incur the   overhead of calling  the  heap
manager only  if  the caller  forgot to terminate  the  string with a null.
When my package returns character  strings, then  I  would have to allocate
space for these strings by calling the heap  manager but I  would be giving
the caller a string that they can in turn pass back  to  my package without
additional heap manager overhead.

Advantages:	Compromise between the two other methods presented.  Heap
		manager overhead  is  incurred only   when  necessary.   If
		strings that are contained in record structures are defined
		as access to Ada strings then the records  are of identical
		size.

Disadvantages:	Does   not solve  the problem of  strings  contained within
		records.  Since there is a C  version  of the record and an
		Ada version,   I  still have   the   problems of  providing
		conversion routines.

(4) Combine methods 1 & 2.

This  scheme  would   basically combine    methods 1 &  2 by    using Ada's
overloaded function  call  capability.   Thus the  caller  could use either
method. 

Advantages:	Caller could  use the  Ada-like calls while  debugging  and
		migrate to the faster C-like calls.

Disadvantages:	Much more   work involved for   me in writing  the package.
		Still have a problem with strings inside of  records.  Once
		the user migrates to the faster  C-like calls it is hard to
		debug because they can't look at any "C_Strings".  Requires
		that  they change  their   code to  use  "Ada Strings"  and
		recompile.

What do you think?  How would you implement "C_Strings" in Ada?
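
As a rough illustration of method 3 above, the sketch below copies the
string only when the caller did not supply the terminating null.  The
names Method_3_Demo and Length_Seen_By_C are made up for the example,
the C library's strlen stands in for "the C routine", and pragma Import
and Interfaces.C come from later Ada standards than the pragma INTERFACE
mentioned in the post.  It also assumes the compiler stores a String as
contiguous characters.

```ada
with Ada.Text_IO;
with Interfaces.C;
with System;

procedure Method_3_Demo is

   --  Assumption: strlen from the C library plays the part of the
   --  interfaced C routine.
   function C_Strlen (S : System.Address) return Interfaces.C.size_t;
   pragma Import (C, C_Strlen, "strlen");

   --  Returns the length that C sees.  A copy is made only when the
   --  caller did not terminate the string with ASCII.NUL.
   function Length_Seen_By_C (S : String) return Natural is
   begin
      if S'Length > 0 and then S (S'Last) = ASCII.NUL then
         --  Already terminated: hand the caller's storage straight to C.
         return Natural (C_Strlen (S (S'First)'Address));
      else
         declare
            Copy : constant String := S & ASCII.NUL;
         begin
            return Natural (C_Strlen (Copy (Copy'First)'Address));
         end;
      end if;
   end Length_Seen_By_C;

begin
   --  Assertions are active only if the compiler enables them
   --  (for GNAT, the -gnata switch).
   pragma Assert (Length_Seen_By_C ("abc" & ASCII.NUL) = 3);
   pragma Assert (Length_Seen_By_C ("abc") = 3);
   pragma Assert (Length_Seen_By_C ("") = 0);
   Ada.Text_IO.Put_Line (Natural'Image (Length_Seen_By_C ("hello")));
end Method_3_Demo;
```

Whether the local Copy lands on the stack or the heap is still up to the
compiler, which is the question the rest of the thread takes up.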

stt@inmet.inmet.com (06/08/90)

Re: C_Strings in Ada

The simplest way to turn an Ada string into a string acceptable to C is:
  STR & ASCII.NUL
This works quite well in several compilers, and only
requires that the called C code ignore the additional
array descriptor often passed along with an unconstrained array.

On the other hand, if you want a type which you can manipulate
in Ada which represents a null-terminated string of unknown length,
it is best to define an access type to a *constrained* string subtype.
You can then pretty safely use unchecked conversion
to convert the address of the C string into a value of this access type.
You may then refer to the characters of the C string so long
as the C string does not exceed the length of the constrained
string subtype.  For instance:

    type C_String_Ptr is access String(1..Integer'LAST);
    function To_C_String_Ptr is new Unchecked_Conversion(
      System.Address, C_String_Ptr
    );
    Addr : System.Address;
    Ptr : C_String_Ptr;
begin
    Addr := <address of C-string>;
    Ptr := To_C_String_Ptr(Addr);
    for I in Ptr'RANGE loop
        exit when Ptr(I) = ASCII.NUL;
        <do something with Ptr(I)>
    end loop;
 
If you use unchecked conversion between an address and
an access to an unconstrained array type, on many
compilers all hell breaks loose, because the access value
is assumed to be pointing at a descriptor, but you
have set it to point directly at data.

BTW, None of this works on a RATIONAL, at least not until
they get their C++ compiler working :-*.

S. Tucker Taft
Intermetrics, Inc.
Cambridge, MA  02138
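
For readers who want to try the access-to-constrained-string idea, here
is a self-contained variant of the fragment above.  It is only a sketch:
Scan_C_String, Big_String and Source are names invented for the example,
a local Ada string with a trailing NUL stands in for the address a C
routine would return, and the unit names follow later Ada standards
(Ada.Unchecked_Conversion rather than Unchecked_Conversion).  The
No_Strict_Aliasing pragma is GNAT-specific; other compilers should
ignore it.

```ada
with Ada.Text_IO;
with Ada.Unchecked_Conversion;
with System;

procedure Scan_C_String is

   --  Constrained subtype, so the access value can be a bare address
   --  with no bounds descriptor expected behind it.
   subtype Big_String is String (1 .. Integer'Last);
   type C_String_Ptr is access Big_String;
   pragma No_Strict_Aliasing (C_String_Ptr);  --  GNAT only

   function To_C_String_Ptr is
     new Ada.Unchecked_Conversion (System.Address, C_String_Ptr);

   --  Stand-in for a string handed back by C (assumption for the demo).
   Source : constant String := "hello" & ASCII.NUL;

   Ptr    : constant C_String_Ptr :=
     To_C_String_Ptr (Source (Source'First)'Address);
   Length : Natural := 0;

begin
   --  Walk the characters until the terminator, never past it.
   for I in Ptr'Range loop
      exit when Ptr (I) = ASCII.NUL;
      Length := Length + 1;
   end loop;

   pragma Assert (Length = 5);
   Ada.Text_IO.Put_Line (Ptr (1 .. Length));
end Scan_C_String;
```

Only the slice up to the terminator is ever touched; reading beyond it
would run off the end of the real storage.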

defaria@hpclapd.HP.COM (Andy DeFaria) (06/08/90)

>/ hpclapd:comp.lang.ada / emery@D74SUN.MITRE.ORG (David Emery) / 10:21 am  Jun  6, 1990 /
>>What is the best way to represent C_Strings in Ada?  
>
>The right solution is to keep all strings in the Ada program as Ada
>strings, and change them to/from C strings at the point of interface
>to C.  (i.e.  within the body of Ada_routine_calling_c).  This
>preserves the Ada abstractions, and makes all the Ada things work just
>fine (e.g. 'IMAGE, which returns an Ada string).  The only other
>alternative (that makes sense to me) is to make C_STRING a private
>type derived from System.address, but you won't be able to do anything
>useful in the debugger with this.
>
>How you do this conversion is very machine dependent, but here's my
>tricks that work on Verdix (Sun 3) and Meridian (IBM PC):
>
>	procedure to_c (str : string) 
>	is
>	  procedure c_routine (addr : system.address);
>          pragma interface (C, c_routine);	-- (void) c_routine(s)
>						-- char * s;
>	  c_string : constant string := str & Ascii.NUL;
>	begin
>	  c_routine(c_string(c_string'first)'address);
>	end to_c;
>
>	function from_c  
>	    return string
>	is
>	    function c_function			
>		return System.address;		
>	    pragma interface (C, c_function);	-- char * c_function
>	    function strlen (addr : System.address)
>		return integer;
>	    pragma interface (C, strlen);	-- the C function of
>						-- the same name
>	    addr_from_c : System.address;
>	 begin
>	    addr_from_c := c_function;
>	    declare
>	      answer : string(1..strlen(addr_from_c));
>	      for answer use at addr_from_c;
>	    begin
>	      return answer;
>	    end;
>  	 end from_c;
>
>Hopefully what's going here is obvious, if not let me know and I'll
>send more information.
>
>				dave emery
>				emery@aries.mitre.org
>----------

I kinda agree with you that the "right" way to do this would be to keep Ada
strings as  Ada strings and do the  conversion  in the package but you have not
addressed one of my critical concerns: What about  the overhead incurred by
having  to call the  heap manager to allocate a  new string just  to append
ASCII.NUL so  that the C routine I'm   calling  will be  able to handle  it
properly.   I anticipate  that this overhead will have  a severe impact on
performance and Ada already has a bad rap for being a dog.  I don't want to
add to that reputation.

Another problem that you haven't addressed  is that  of the representation
of a string in a record.  Should it be  an  Ada access to string or should
it be the string itself.  Or should it be a C string (char *)?

blakemor@software.org (Alex Blakemore) (06/11/90)

In article <920025@hpclapd.HP.COM> defaria@hpclapd.HP.COM (Andy DeFaria) writes:
>What is the best way to represent C_Strings in Ada?  

Dave Emery presented the following solution for some compilers.

$$	procedure to_c (str : string) 
$$	is
$$	  procedure c_routine (addr : system.address);
$$          pragma interface (C, c_routine);	-- (void) c_routine(s)
$$						-- char * s;
$$	  c_string : constant string := str & Ascii.NUL;
$$	begin
$$	  c_routine(c_string(c_string'first)'address);
$$	end to_c;
$$
$$	function from_c  
$$	    return string
$$	is
$$	    function c_function			
$$		return System.address;		
$$	    pragma interface (C, c_function);	-- char * c_function
$$	    function strlen (addr : System.address)
$$		return integer;
$$	    pragma interface (C, strlen);	-- the C function of
$$						-- the same name
$$	    addr_from_c : System.address;
$$	 begin
$$	    addr_from_c := c_function;
$$	    declare
$$	      answer : string(1..strlen(addr_from_c));
$$	      for answer use at addr_from_c;
$$	    begin
$$	      return answer;
$$	    end;
$$  	 end from_c;

Mr. DeFaria replied
> I kinda agree with you that the "right" way to do this would be to keep Ada
> strings as  Ada strings and do the  conversion  in the package but you have not
> addressed one of my critical concerns: What about  the overhead incurred by
> having  to call the  heap manager to allocate a  new string just  to append
> ASCII.NUL so  that the C routine I'm   calling  will be  able to handle  it

I don't think this solution *requires* heap allocation.  A good compiler would
be free to use stack space for the C strings in these cases, both for
the local constant c_string and for the return value.  Of course, there is 
the overhead of copying the strings but the allocation can be relatively
cheap.

-----------------------------------------------------------------------
Alex Blakemore                       Internet: blakemore@software.org
Software Productivity Consortium     UUNET:    ...!uunet!software!blakemore
2214 Rock Hill Rd, Herndon VA 22070  Bellnet:  (703) 742-7125

adoyle@bbn.com (Allan Doyle) (06/11/90)

Whew, the nesting is getting too deep. I sliced out the parts I want to
comment on...

>>What is the best way to represent C_Strings in Ada?  

>$$	procedure to_c (str : string) 
>$$	is
>$$	  procedure c_routine (addr : system.address);
>$$          pragma interface (C, c_routine);	-- (void) c_routine(s)
>$$						-- char * s;
>$$	  c_string : constant string := str & Ascii.NUL;
>$$	begin
>$$	  c_routine(c_string(c_string'first)'address);
>$$	end to_c;

>I don't think this solution *requires* heap allocation.  A good compiler would
>be free to use stack space for the C strings in these cases, both for
>the local constant c_string and for the return value.  Of course, there is 
>the overhead of copying the strings but the allocation can be relatively
>cheap.

Judging from the replies about heap space, I'm getting a little worried.
Do you guys mean to tell me that it's compiler dependent how to allocate
the 'c_string' constant? This is just the sort of thing that makes me
mistrust Ada for a realtime application that has to run for a long time
without any memory leaks. If I take the worst case assumption that c_string
is being allocated from the heap and add it to the worst case assumption
that my Ada run-time will not be doing garbage collection (perfectly legal,
I understand), then how many of these little beasties can I convert from
Ada to C before I run out of memory?



Allan Doyle                                        adoyle@bbn.com
BBN Systems and Technologies Corporation           (617) 873-3398
70 Fawcett Street,   Cambridge, MA 02138

kassover@minerva.crd.ge.com (David Kassover) (06/11/90)

In article <57245@bbn.BBN.COM> adoyle@vax.bbn.com (Allan Doyle) writes:
...
>
>Judging from the replies about heap space, I'm getting a little worried.
>Do you guys mean to tell me that it's compiler dependent how to allocate
>the 'c_string' constant? This is just the sort of thing that makes me
>mistrust Ada for a realtime application that has to run for a long time
>without any memory leaks. If I take the worst case assumption that c_string
>is being allocated from the heap and add it to the worst case assumption
>that my Ada run-time will not be doing garbage collection (perfectly legal,
>I understand), then how many of these little beasties can I convert from
>Ada to C before I run out of memory?

Perhaps I'm naive.  But isn't management of things like heaps and
stacks and stuff the purview of the underlying hardware and
operating system (if any?)

Under VMS, regardless of language used (well, take my absolutes
as approximately equal to .9944  8-), heap, stack, vm, etc
management can be affected by several parameters pertaining to
the operating system, the user account structure, and the current
hardware configuration.

On the other hand, if you *have* no operating system (your
program lives in an automated teller machine, e.g.) then when you
run out of memory depends on how much you have and how busy your
program was.  If this is a problem, then you must write a garbage
collector yourself (and find room for it in memory  8-), or get
the powers that be to plug in a bigger memory board, or poll the
ATM more frequently.

And furthermore, if what you are writing *is* an operating system
(in the sense of, say, OS 360, MS-DOS, VAX-VMS, VM/CMS, Primos,
or bhog-knows-what), then you *also* might consider writing a
garbage collector, and publishing instructions to writers of
compilers on how to take advantage of it.

But that has nothing, IMHO, to do with Ada.

Or have I missed a point here?
--
David Kassover             "Proper technique helps protect you against
kassover@ra.crd.ge.com	    sharp weapons and dull judges."
kassover@crd.ge.com			F. Collins

emery@linus.mitre.org (David Emery) (06/12/90)

Allan Doyle writes
>If I take the worst case assumption that c_string
>is being allocated from the heap and add it to the worst case assumption
>that my Ada run-time will not be doing garbage collection (perfectly legal,
>I understand), then how many of these little beasties can I convert from
>Ada to C before I run out of memory?

It's not clear to me that you can guarantee with a C compiler that
there is not some compiler-generated heap, either.  Does the C
LANGUAGE guarantee non-heap allocation for a local variable???    Or
does "everyone knows that's how it's done?"

 				dave emery
				emery@aries.mitre.org

adoyle@bbn.com (Allan Doyle) (06/12/90)

In article <8446@crdgw1.crd.ge.com>
 kassover@minerva.crd.ge.com (David Kassover) writes:
>In article <57245@bbn.BBN.COM> adoyle@vax.bbn.com (Allan Doyle) writes:
>...
>>
>>Judging from the replies about heap space, I'm getting a little worried.
>>Do you guys mean to tell me that it's compiler dependent how to allocate
>>the 'c_string' constant? This is just the sort of thing that makes me
>>mistrust Ada for a realtime application that has to run for a long time
>>without any memory leaks. If I take the worst case assumption that c_string
>>is being allocated from the heap and add it to the worst case assumption
>>that my Ada run-time will not be doing garbage collection (perfectly legal,
>>I understand), then how many of these little beasties can I convert from
>>Ada to C before I run out of memory?
>
>Perhaps I'm naive.  But isn't management of things like heaps and
>stacks and stuff the purview of the underlying hardware and
>operating system (if any?)
>
>Under VMS, regardless of language used (well, take my absolutes
>as approximately equal to .9944  8-), heap, stack, vm, etc
>management can be affected by several parameters pertaining to
>the operating system, the user account structure, and the current
>hardware configuration.

Well, maybe the OS is responsible for managing memory at a macro level, but
certainly not at the level of a few bytes here and there for the kind
of strings we're talking about. The Ada runtime will probably do malloc()
for some hunk of memory when it needs more heap space, and will probably
do a free() call of some sort when it thinks it's done with a large hunk
of memory (but the LRM says it does not have to do so, or at least does
not mention it at all). But if the runtime does not do garbage collection
when my program is done with a few bytes here and there, how will the memory
ever get marked for a free call? 

The OS does not know when the runtime is done with a piece of memory, no
matter how large or small. My point is that the runtime does not necessarily
even bother to do the space reclamation. So if I write a program that
runs under UNIX in Ada that requires heap space sometimes, I want to be able
to reclaim that heap space. Either explicitly or by knowing that the Ada
runtime will do it for me.

When I am using the Ada "new" keyword, I know full well that I'm creating
potential for garbage. I can control when I call it. I can write my own
memory manager around it. I don't get that control when I need a construct
like

	declare
		c_string : constant string := "foo" & ascii.nul;
	begin
		<stuff>
	end;

If I can't even trust Ada to reclaim the 4 or so bytes used by foo, what
can I trust Ada to do????

Allan Doyle                                        adoyle@bbn.com
BBN Systems and Technologies Corporation           (617) 873-3398
70 Fawcett Street,   Cambridge, MA 02138

adoyle@bbn.com (Allan Doyle) (06/12/90)

In article <EMERY.90Jun11135320@scorpio.linus.mitre.org>
 emery@linus.mitre.org (David Emery) writes:
>Allan Doyle writes
>>If I take the worst case assumption that c_string
>>is being allocated from the heap and add it to the worst case assumption
>>that my Ada run-time will not be doing garbage collection (perfectly legal,
>>I understand), then how many of these little beasties can I convert from
>>Ada to C before I run out of memory?
>
>It's not clear to me that you can guarantee with a C compiler that
>there is not some compiler-generated heap, either.  Does the C
>LANGUAGE guarantee non-heap allocation for a local variable???    Or
>does "everyone knows that's how it's done?"


Touche' - I just looked in K&R and found no statement concerning
automatic variables and the memory they occupy. It only states that
the variables "disappear" from scope.

By the way, the example I gave in the previous post:

	declare
		c_string : constant string := "foo" & ascii.nul;

would take up heap in C as well. I really meant 

	proc(foo: string) is
		c_string : constant string := foo & ascii.nul;
	...

which everybody "knows" will go on the stack in C...(sigh)





Allan Doyle                                        adoyle@bbn.com
BBN Systems and Technologies Corporation           (617) 873-3398
70 Fawcett Street,   Cambridge, MA 02138

kassover@minerva.crd.ge.com (David Kassover) (06/12/90)

In article <57288@bbn.BBN.COM> adoyle@vax.bbn.com (Allan Doyle) writes:
...
>When I am using the Ada "new" keyword, I know full well that I'm creating
>potential for garbage. I can control when I call it. I can write my own
>memory manager around it. I don't get that control when I need a construct
>like
>
>	declare
>		c_string : constant string := "foo" & ascii.nul;
>	begin
>		<stuff>
>	end;

I'll ask for some clarification from the language lawyers (and
others who have experience with specific Ada environments other
than VAX Ada)

but I don't think garbage collection in this instance was ever an
issue.

That is, I have used similar constructs (and rather large ones,
too: potential to store 90000 g_floats) in VAX Ada.  It, so far,
has appeared that when the declared variables go out of scope,
the memory is returned to "the system".

On the other hand, stuff that is created with NEW is often
intended *not* to go out of scope, so unchecked_deallocation is
needed to get rid of it (presumably, *after* the program has
verified that "no one else" still points to it)

And furthermore, I remember having a discussion about this about
a year ago, before I implemented the monster data structure (we
use it to pass potentially large arrays to and from a FORTRAN
program, especially in instances where it is hard for the Ada
code to determine the dimensions of the array).  I was assured
that exhaustion of "heap space" would not be a problem, given
that there was enough heap available for any one scope.  But
again, that may be a "feature" of VAX Ada's way of dealing with it.

--
David Kassover             "Proper technique helps protect you against
kassover@ra.crd.ge.com	    sharp weapons and dull judges."
kassover@crd.ge.com			F. Collins

murphy@mips.COM (Mike Murphy) (06/12/90)

In article <57288@bbn.BBN.COM> adoyle@vax.bbn.com (Allan Doyle) writes:
>When I am using the Ada "new" keyword, I know full well that I'm creating
>potential for garbage. I can control when I call it. I can write my own
>memory manager around it. I don't get that control when I need a construct
>like
>
>	declare
>		c_string : constant string := "foo" & ascii.nul;
>	begin
>		<stuff>
>	end;
>
>If I can't even trust Ada to reclaim the 4 or so bytes used by foo, what
>can I trust Ada to do????

Most Ada compilers will reclaim the space used by an object when it 
exits its scope.  I say "most" because this is not required by the 
semantics of the language, but any good compiler will have this feature, 
as it is needed for large software systems, and it is not too difficult
to implement.  Basically, you emit some code at the end of any blocks 
or statements that use temporary space, to reclaim that space.
This might mean a "free", or it could be something even cheaper 
(e.g. some compilers will put these kind of block-structured temporaries 
on a mark-release stack).  You can code up a little example like the above
to test what your compiler does.  This is not considered to be "real"
garbage-collection, which no Ada compiler I know of does, which requires
searching through memory for dangling pointers.  Instead this is just
an important optimization for a common situation in Ada.  The situation
is more common in Ada than in languages like C because we can have
function calls that return dynamic-sized objects, such as strings,
and these need temporary space.  The reclaiming of the space is helped
by the block-structured nature of its scope (as opposed to the
full-blown garbage collection problem).

-- Mike Murphy
-- UUCP: sun!decwrl!mips!murphy or murphy@mips.com

defaria@hpclapd.HP.COM (Andy DeFaria) (06/12/90)

One of my concerns about converting Ada Strings to C Strings  was the amount
of overhead involved in allocating heap space.  Dave  Emery stated that the
compiler should  not allocate any  heap space  for a  local variable -  it
should be placed in the stack:

declare
   C_STRING : constant STRING := STRING_PARAM & ASCII.NUL;
begin
   -- call C routine
end;

My understanding is that the compiler would allocate space on the stack for
this variable if the size of STRING_PARAM can be determined at compile time
not at run time.  Since STRING_PARAM is being passed into my routines it is
not  known, at compile time,  how big   the  STRING_PARAM will be (it could
theoretically (sp?) be INTEGER'LAST bytes long!!  Assuming a 32 bit integer
size  this could be gigantic  and with  some  architectures  stack space is
limited and heap space is more plentiful) so  the compiler might determine
that it is better to put this variable on the heap  instead.  This does not
mean  that the compiler   is  faulty.   This also  might   help  facilitate
procedure cleanup.

Also, no one has  addressed  my other  concern about  how to represent  these
strings in structures such as:

struct data_record {
   char *path;
   int  path_length;
   char *name;
   int  name_length;
};

emery@linus.mitre.org (David Emery) (06/12/90)

defaria@hpclapd.hp.com writes
>My understanding is that the compiler would allocate space on the stack for
>this variable if the size of STRING_PARAM can be determined at compile time
>not at run time.  Since STRING_PARAM is being passed into my routines it is
>not  known, at compile time,  how big   the  STRING_PARAM will be (it could
>theoretically (sp?) be INTEGER'LAST bytes long!!  

There is no requirement (in Ada or any other language) that only
objects whose size is static at compile-time be allocated off the stack.
The requirement, instead, is that the size of the object be STATIC at
runtime.  The size of STRING_PARAM is known when the stack frame for
the block is built.  For instance, if STRING_PARAM is 7 characters
long, then a string of length 7 is allocated off the stack.  Almost
any block-structured language can do this (including C and Pascal).
To ensure that the object being declared (C_STRING) remains static, it
is declared as a constant.  

Ada provides substantial support for types and subtypes whose size is
not known until runtime.  There are some things you can't do with
them, but in general storage allocation is not a problem.  That is
because all objects in Ada are required to be constrained, either with
a fixed size, or with a maximum size.  In the case of variant records
with defaults, the maximum size is determinable based on the range of
values of the discriminant.  

There is always the problem with overflowing either stack or heap.  If
you have a string where 'LENGTH = INTEGER'LAST, I doubt that it will fit
anywhere in main memory, and you'll get STORAGE_ERROR long before you
try to append an ASCII.NUL to it and pass it to C.

				dave emery
				emery@aries.mitre.org
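
A small sketch of the point about run-time sizes: the bounds of C_String
below are fixed when its declaration is elaborated, from whatever actual
parameter was passed, and not at compile time.  Runtime_Sized and
Terminated_Length are hypothetical names.  The program cannot show where
the compiler places the object; it only shows that the size follows the
argument.

```ada
with Ada.Text_IO;

procedure Runtime_Sized is

   --  Returns the length of a local copy whose bounds are computed
   --  from the actual parameter on each call.
   function Terminated_Length (String_Param : String) return Natural is
      C_String : constant String := String_Param & ASCII.NUL;
   begin
      return C_String'Length;
   end Terminated_Length;

begin
   --  A 7-character argument gives an 8-character local object.
   pragma Assert (Terminated_Length ("seven77") = 8);
   pragma Assert (Terminated_Length ("") = 1);
   Ada.Text_IO.Put_Line
     (Natural'Image (Terminated_Length ("seven77")));
end Runtime_Sized;
```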

murphy@mips.COM (Mike Murphy) (06/13/90)

In article <920026@hpclapd.HP.COM> defaria@hpclapd.HP.COM (Andy DeFaria) writes:
>Also, no one has  addressed  my other  concern about  how to represent  these
>strings in structures such as:
>
>struct data_record {
>   char *path;
>   int  path_length;
>   char *name;
>   int  name_length;
>};

It seems to me that the simple solution is to just define a c_string
type which is an access to a constrained string, and then some routines
to convert strings to c_strings and vice-versa.  Then you define your
records to match the C structures:
	type data_record is record
		path 		: c_string;
		path_length 	: integer;
		name 		: c_string;
		name_length 	: integer;
	end record;
The cost of converting strings to c_strings is probably not as bad as you 
imagine, depending on your application (e.g. chances are you will only
convert string to c_string once, then do many calls to C routines with
the c_string structures).

BTW, Verdix uses such an approach in their UNIX interfaces (there is a
c_strings package in the standard library).

-- Mike Murphy
-- UUCP: sun!decwrl!mips!murphy or murphy@mips.com
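
One way to sketch a record that matches the C structure, using
facilities standardized after this discussion: the chars_ptr type and
the New_String, Value, Strlen and Free operations of
Interfaces.C.Strings.  This is not the Verdix c_strings package
mentioned above, whose interface is not shown in the thread, and
Record_Demo and the sample path and name are made up.

```ada
with Ada.Text_IO;
with Interfaces.C;
with Interfaces.C.Strings; use Interfaces.C.Strings;

procedure Record_Demo is
   use type Interfaces.C.int;

   --  Mirrors struct data_record; chars_ptr corresponds to char *.
   type Data_Record is record
      Path        : chars_ptr;
      Path_Length : Interfaces.C.int;
      Name        : chars_ptr;
      Name_Length : Interfaces.C.int;
   end record;
   pragma Convention (C, Data_Record);

   R : Data_Record;

begin
   --  Convert once; New_String allocates a NUL-terminated copy.
   R.Path        := New_String ("/usr/lib");
   R.Path_Length := Interfaces.C.int (Strlen (R.Path));
   R.Name        := New_String ("libc.a");
   R.Name_Length := Interfaces.C.int (Strlen (R.Name));

   declare
      --  Convert back to ordinary Ada strings for use on the Ada side.
      Path : constant String := Value (R.Path);
      Name : constant String := Value (R.Name);
   begin
      pragma Assert (Path = "/usr/lib");
      pragma Assert (R.Name_Length = 6);
      Ada.Text_IO.Put_Line (Path & "/" & Name);
   end;

   --  Release the copies explicitly; nothing does it automatically.
   Free (R.Path);
   Free (R.Name);
end Record_Demo;
```

The conversion cost is paid once per string, after which R can be passed
to any number of C routines that expect the structure.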

stt@inmet.inmet.com (06/13/90)

Re: C Strings in Ada
Please listen to Mike Murphy.  He knows of what he speaks!

In particular, many compilers do *not* use the heap for
objects of dynamic size.  Instead, they use a mark/release discipline
on a stack, either the primary stack or a secondary stack.

Secondly, despite David Emery's appropriate warning, *most* Ada
compilers will treat access-to-*constrained*-string as a simple
address pointing directly at the characters.  You will still
have to worry about the null terminator, since Ada is presuming
the pointed-to string is quite long.

S. Tucker Taft
Intermetrics, Inc.
Cambridge, MA  02138

falis@ajpo.sei.cmu.edu (Edward Falis) (06/14/90)

You're right: it depends on the compiler and rte implementation. In fairness,
though, several of the more prominent Ada vendors do perform automatic
reclamation (though not garbage collection).  The typical scheme is
to deallocate all objects of the collection type when the access type
goes out of scope.  For compiler-generated temporaries, reclamation 
will occur when the scope creating the tempo is exited, or when the
tempo is no longer required, depending on the thoroughness of the
implementation.  - Ed Falis, Alsys

jb@rti.rti.org (Jeff Bartlett) (06/15/90)

In article <920025@hpclapd.HP.COM>, defaria@hpclapd.HP.COM (Andy DeFaria) writes:
> >/ hpclapd:comp.lang.ada / emery@D74SUN.MITRE.ORG (David Emery) / 10:21 am  Jun  6, 1990 /
> >>What is the best way to represent C_Strings in Ada?  
> >
> >	procedure to_c (str : string) 
> >	is
> >	  procedure c_routine (addr : system.address);
> >          pragma interface (C, c_routine);	-- (void) c_routine(s)
> >						-- char * s;
> >	  c_string : constant string := str & Ascii.NUL;
> >	begin
> >	  c_routine(c_string(c_string'first)'address);
> >	end to_c;

> ... my critical concerns: What about  the overhead incurred by
> having  to call the  heap manager to allocate a  new string just  to append
> ASCII.NUL so  that the C routine I'm   calling  will be  able to handle  it
> properly. ...

A good compiler should grow the stack by (str'length + 1 + string_descriptor)
then copy 'str' and the NUL into the area.  It would also fill in the
new string_descriptor for 'c_string' from the information in 'str's descriptor
and the additional byte.  (A smart compiler may not even allocate space for
the new descriptor, keeping the information in registers).

Constant strings are a good 'trick'. Such as:

    ...
	str : string(1..10);
	N   : natural;
    begin
	...

	-- a subroutineless call like above
	declare
           s : constant string := str & ascii.nul;
        begin
           c_routine( s(s'first)'address );
        end;

	-- remove the leading space supplied by 'image.
	declare
	   s : constant string := integer'image(N);
        begin
           text_io.put_line("Activity on PE" & s(s'first+1..s'last) );
        end;

	-- figuring out how much of a buffer to overwrite
        declare
           s : constant string := function_that_returns_a_string(j);
        begin
           line(line'first..line'first+s'length-1) := s;  -- result used twice
        end;

> Another, problem that you haven't addressed  is that  of the representation
> of a string in a record.  Should it be  an  Ada access to string or should
> it be the string itself.  Or should it be a C string (char *)?

It depends on what operations need to be performed on it.  I'll leave this for
someone else.  Don't forget that you can:

	type s_ptr is access string;
        p : s_ptr;
    begin
        ...
	-- this will call the function, allocate space for the result, and
        -- initialize the area with the result.
        p := new string'( ada_function_that_returns_a_string );
        ...
	text_io.put_line("answer was '" & p.all & "'");

If you are paranoid about the Ada run-time heap manager forgetting to free
the space pointed to by 'p', you can instantiate a new UNCHECKED_DEALLOCATION.
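
A minimal sketch of that instantiation, with the access type from the
fragment above.  Free_Demo, Free and Greeting are names chosen for the
example, and the generic is referred to by its later library name,
Ada.Unchecked_Deallocation.

```ada
with Ada.Text_IO;
with Ada.Unchecked_Deallocation;

procedure Free_Demo is

   type S_Ptr is access String;

   --  Explicit release for objects designated by S_Ptr values.
   procedure Free is new Ada.Unchecked_Deallocation (String, S_Ptr);

   --  Stand-in for ada_function_that_returns_a_string.
   function Greeting return String is
   begin
      return "hello";
   end Greeting;

   P : S_Ptr;

begin
   P := new String'(Greeting);
   Ada.Text_IO.Put_Line ("answer was '" & P.all & "'");

   --  Free releases the storage and sets P to null.
   Free (P);
   pragma Assert (P = null);
end Free_Demo;
```

The caller remains responsible for making sure no other access value
still designates the string when Free is called.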

Hope this helps,

Jeff Bartlett
Research Engineer, Center for Digital Systems Research
Research Triangle Institute, RTP NC.
jb@rti.rti.org     mcnc!rti!jb     rti!jb@mcnc.org     (919)-541-6945

PS: The code above was only compiled with a chemical computer. ;-)