[comp.society.futures] Reply to: Musings on the future of computing ...

djy@INEL.GOV (Daniel J Yurman) (02/04/91)

Mail to - info-futures@world.std.com
 
This long posting is a response to a request for thoughts and
expansion on the theme of "two competing models of computing ...
current polarizing development directions."  This response
addresses workstation and server environments in the context of
applications addressing environmental problems, e.g., hazardous
waste.
 
I think the original post by Barry Shein - bzs@world.std.com - of
2/1/91 was very thoughtful and to the point.  This response is
not meant to be comprehensive, but hopefully will add to the
discussions and perhaps stimulate others to contribute their
views.
 
*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%*%%*%*
 
Proposed Workstation Configuration for a Decision Support System
for Cleanup of Hazardous Waste Sites Using Geographic Information
Systems and Expert Systems.
 
Dan Yurman
Idaho National Engineering Laboratory
PO Box 1625, Idaho Falls, ID 83415    43N, 112W -7GMT
 
(208) 526-8591
 
* Internet       djy@inel.gov
* Bitnet         djy%inel.gov@cunyvm.cuny.edu
 
ABSTRACT
 
The complexity of identifying the extent of contamination at
hazardous waste sites requires the use of computer tools
distributed across multiple workstations.  Geographic information
systems (GIS) coupled with graphics and expert systems can help
shape solutions for analysts.  This is an account of the design
for workstations to meet the requirement for decision support
systems (DSS) in this area.
 
Scope
 
       I will address two topics.  These are (1) systems
architecture, and (2) resolving ill-defined problems.
 
1. System Architecture
 
       These comments discuss the requirements of a "site
characterization" workstation to support DSS.  By "site" I am
referring to a discrete location which is known to contain
uncontrolled hazardous waste either at the surface, below the
surface, or both.  By "characterization" I am referring to a
process by which the volume, concentration, and spatial
distribution of the uncontrolled hazardous waste are quantified,
and which can serve as the basis for specifying a cleanup action.
 
       The heart of the design is a concept labeled the "logical
workstation."  This concept is based on the idea that not all
functions for a DSS need reside on a single host, but can be
spread among processors and platforms across a network.  This is
almost a necessity because of the distribution of human and
technical resources needed to clean up large areas of
hazardous waste contamination.  Examples include the four
Superfund sites contaminating 150 square miles of the Clark Fork
watershed area in Montana.
 
       People and computers needed to do the work are not
necessarily co-located in the same office nor even in the same
city.  In the case of the Clark Fork, computer resources to
support the GIS are located in Helena, MT, Denver, CO, and Las
Vegas, NV.  The main host is in Montana, as is the plotter;
workstations for some of the input and output of tabular data are
in Denver; and workstations for development of applications,
including Arc/Info AMLs, are at EPA's GIS Laboratory in Las
Vegas.
 
       Workstation Processes
 
       The "logical workstation" concept states that there are
functions that are inter-related and that form logically
integrated work processes across a network.  Each "logical
workstation" provides a standard user interface for packaged
application functions, and it allows the user to descend to the
level of the operating system for advanced programming functions.
Thus the Logical Workstation is both a physical reality and a
system concept.  Each of the work processes (Input, Output,
Analysis) supporting user requirements may be thought of as being
serviced by a workstation.  The logical workstations are related
through the common data shared by each of the applications.  The
applications are those supporting site characterization and
environmental restoration work.
 
       The workstation is composed of three logical workstations
corresponding to Input, Analysis, and Output functions.  Each
logical workstation is integrated with a Support module that
provides data management and system guidance functions.  The
integration through Support functions provides the "data
drawbridge" that links and integrates the applications residing on
logical workstations with each other.  The Support workstation
also takes advantage of surplus computing cycles available on
workstations, and allocates their use for numerically intensive
operations on the Analysis workstation.  Finally, the Support
workstation can function as a database server either directly on
the main host, or as a gateway to multiple hosts.
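
       The separation between logical roles and physical machines
can be sketched in a few lines of Python.  This is only an
illustration under my own assumptions: the class names and the host
names are hypothetical, and the placement of roles loosely follows
the Clark Fork arrangement described earlier.

```python
from dataclasses import dataclass, field

# The four roles named in the text.
ROLES = ("input", "analysis", "output", "support")

@dataclass
class LogicalWorkstation:
    role: str   # one of ROLES
    host: str   # physical machine currently providing the role

    def __post_init__(self):
        if self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role}")

@dataclass
class Network:
    stations: list = field(default_factory=list)

    def add(self, role, host):
        self.stations.append(LogicalWorkstation(role, host))

    def hosts_for(self, role):
        """Return every physical host that provides the given role."""
        return [s.host for s in self.stations if s.role == role]

    def roles_on(self, host):
        """Return every role that a single physical host provides."""
        return sorted(s.role for s in self.stations if s.host == host)

# Hypothetical host names, for illustration only.
net = Network()
net.add("input", "pc-denver-1")
net.add("input", "pc-denver-2")
net.add("analysis", "ws-lasvegas")
net.add("output", "host-helena")
net.add("support", "host-helena")
```

One role can be served by several machines (two Input
workstations here), and one machine can serve several roles (the
main host provides both Output and Support).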
 
       The logical workstation concept improves the ability of a
user community to match hardware and software resources in direct
proportion to processing and utilization requirements.  For
example, the Analysis Workstation may require a high speed
processor, but the other workstations may not share this need. On
the other hand, the heavy utilization of the input functions may
dictate implementing multiple Input Workstations.  The resulting
configuration would use multiple, low cost, lower power CPUs for
processing input functions and higher power, more expensive, CPUs
for analysis functions.  The point is that organizations
designing a DSS need not purchase a single target architecture
nor even a single application software package in an effort to be
all things to all users.  This point might seem obvious, but
please bear with me in the exposition of the idea of the logical
workstation.
 
       Input Workstation
 
       The Input Workstation would capture tabular data, chemical
sampling and analysis results, attribute information, digitized
map information, etc., for a database.  The data in the DBMS would
support a geographic information system (GIS), or graphics to
support scientific visualization and engineering perspectives.
Even sampling and analysis results from a laboratory or test data
from an instrument have to go into a database before being
accessed for visualization routines.  Data can also be accepted
from a digitizer, from keyboard input, and from other sources
already in electronic form, e.g., TIGER files from the Census
Bureau. The
Input Workstation must provide interactive editing of spatial and
attribute data.
 
       Typically, the Input Workstation does not have significant
local disk storage, but ships its data to a larger host across a
network.  It is assumed that the graphics and GIS functions are
interfaces to the database or distributed databases located on
one or several hosts across a network, but they do not operate on
the Input Workstation.
 
       There may be more than one kind of Input Workstation
depending on the kind of data being entered in the system. For
instance, an ASCII or X-Windowing terminal may be sufficient for
entry of data into forms, but a digitizer may be required for map
information.  A standard '286 or '386-based PC may be used to
write code to manipulate data already in electronic form on a
larger host.
 
       Analysis Workstation
 
       The Analysis Workstation would provide tools to support
hydrogeological, spatial, and visual analyses of site data.  It
would support validation of chemical sampling and analysis
results.  Additional tools would include geophysical tools, such
as cross section and fence diagrams, as well as management of
lithologic and well construction data.  The Analysis Workstation
performs statistical analyses of chemical concentrations, and
supports the use of analytical models. In terms of a geographic
information system, the Analysis Workstation is host to the GIS
software using, in turn, a larger host for storage of attribute
data.  All work processes to develop and visually display GIS
coverages on the screen are carried out on the Analysis
Workstation. Memory and local disk storage are sufficient to hold
all programs and data needed for the current session.  These data
are acquired using SQL procedures from the networked host
supported by a client/server architecture.
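
       As a rough illustration of pulling only the current
session's data from the host, here is a sketch using Python's
built-in sqlite3 module.  An in-memory database stands in for the
networked DBMS, and the table, column names, and sample values are
all made up for the example.

```python
import sqlite3

# An in-memory SQLite database stands in for the networked host's DBMS.
# Table name, column names, and values are hypothetical.
host = sqlite3.connect(":memory:")
host.execute(
    "CREATE TABLE samples (site TEXT, well TEXT, analyte TEXT, conc_ug_l REAL)"
)
host.executemany(
    "INSERT INTO samples VALUES (?, ?, ?, ?)",
    [
        ("site_a", "MW-1", "arsenic", 12.0),
        ("site_a", "MW-2", "arsenic", 48.5),
        ("site_a", "MW-2", "copper", 130.0),
        ("site_b", "MW-9", "arsenic", 3.1),
    ],
)

def load_session(conn, site, analyte):
    """Pull only the rows the current session needs onto the workstation."""
    cur = conn.execute(
        "SELECT well, conc_ug_l FROM samples "
        "WHERE site = ? AND analyte = ? ORDER BY well",
        (site, analyte),
    )
    return cur.fetchall()

# Local analysis then runs on the extracted subset, not on the full table.
session_rows = load_session(host, "site_a", "arsenic")
mean_conc = sum(conc for _, conc in session_rows) / len(session_rows)
```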
 
       There may be more than one kind of Analysis Workstation
depending on the analytic procedures required to visualize
information, and some configurations may have more than one
function.  For instance, a large-screen color monitor, e.g.,
19 inches, hooked to a processing unit with significant memory
and storage, e.g., 16 Mb of RAM and 700 Mb of disk, may be needed
for GIS work, but a '386-based PC with 4 Mb of RAM and 100 Mb of
storage may be sufficient for statistical analyses of tabular
information.
Also, the PC may be used as an input platform to write code in a
high level language to run a job on the larger workstation using
a network host as a server between the two.
 
       Groundwater Applications:
 
       The Analysis workstation must have the capability to support
specific types of applications.  The following examples in the
area of groundwater analyses, which are fundamental to cleaning
up hazardous waste, explain some of the functions.  The
workstation does not place sampling wells nor does it do other
kinds of field work.  It is a platform upon which numeric
analyses and modeling exercises are conducted which aid in the
implementation of the field activities.
 
       Types of Applications to be Supported
 
       (a) Monitoring, (b) Source Control, (c) Fate and Transport,
(d) Technology Transfer, (e) Treatment Technology, and (f)
integration of surface and groundwater models.
 
       Explanation of Application Types
 
       (a) Monitoring
 
       ... involves the placement and spacing of wells together
       with acceptable procedures for sample collection, quality
       assurance, and quality control.  This activity is a
       fundamental requirement for credible decisions in
       groundwater protection.
 
       (b) Source Control
 
       ... assesses technological and operational strategies to
       reduce the risk of contamination posed by improper disposal
       of hazardous wastes.
 
       (c) Transport & Fate
 
       ... involves predicting the behavior of contaminants below
       ground.  It is one of the most difficult and most important
       tasks for groundwater protection.  Analyses of fate and
       transport problems involve predicting the physical movement
       of groundwater and contaminants in saturated and unsaturated
       zones.  Also, it involves changes in the quality of
       groundwater through natural or manmade degradation or
       differential separation of constituents.
 
       (d) Technology Transfer
 
       ... imports information on comparative methods for attaining
       groundwater protection and cost data on available
       technologies.  Field personnel deal with an extremely broad
       scope of data and information, and often completion of
       groundwater analyses is aided by consultations among field
       offices.
 
       (e) Treatment Technology
 
       ... focuses on the removal of volatile and non-volatile
       organics, inorganics, metals, and microbes.
 
       (f) Integration of Surface & Groundwater Models
 
       ... supports the practical assessment and prediction of
       occurrences in the following:

       - aquifer flow simulation under unconfined flow
       - aquifer flow simulation under pumping conditions
       - time-series predictions under pumping conditions
       - solute transport delineations and predictions
       - storage of multiple contours
 
       Output Workstation
 
       The Output Workstation would support a full range of
presentation media for representing site data and the results of
analytic operations.  It would support desktop publishing
applications to integrate text and graphics to meet formal
presentation requirements for management.  It controls
development of plotter jobs, visually displays them prior to
execution, and translates them to graphic files, e.g. CGM, for
use in other applications.  This is an example of "task
pipelining", a concept which will be discussed in detail later in
these comments.
 
       There may be more than one kind of Output Workstation
depending on presentation media.  Examples of output media
include standard D or E size plots, tabular printout, PostScript
laser printout, or television production quality video tape. The
preparation of plots of graphs and GIS coverages is done here,
and can utilize the services of a network host, or server, to
spool the job to an output device thus freeing up this machine
for new work.
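
       The spooling idea can be sketched with a queue and a
background worker from the Python standard library.  The job names
are hypothetical, and appending to a list stands in for actually
driving a plotter.

```python
import queue
import threading

def spool_worker(jobs, printed):
    """Drain plot jobs in the background, as a server-side spooler would."""
    while True:
        job = jobs.get()
        if job is None:              # sentinel: no more jobs
            jobs.task_done()
            break
        printed.append(f"plotted {job}")   # stand-in for driving the plotter
        jobs.task_done()

jobs = queue.Queue()
printed = []
worker = threading.Thread(target=spool_worker, args=(jobs, printed))
worker.start()

# The workstation hands off its jobs and is immediately free for new work.
for name in ["site_map.cgm", "contours.cgm"]:
    jobs.put(name)
jobs.put(None)

# Waiting here is only for the demonstration, so results can be inspected.
jobs.join()
worker.join()
```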
 
       Memory and disk storage on this workstation are sufficient
to hold all programs and  data needed for the current session,
but completed files are shipped back to a larger host across a
network for storage between jobs.  This workstation may include
specialized hardware to achieve almost real-time decompression of
graphics files.  It may contain hardware to display graphics on a
television monitor or feed directly into transmission of TV
images over microwave to a remote site.
 
       System Support
 
       The Support workstation provides data management and user
guidance functions integrated with the other Logical
Workstations.  Each of the Logical Workstations requires
interaction with the Support workstation to execute its
functions.  System
configuration management for network services is included here.
Data administration functions include support for data base
management, query functions, data extract, and reformatting
services.  The guidance functions include on-line help for
packaged applications, system and application tutorials, and
expert system guidance to shape solutions regarding what data is
to be acquired for each session and what tools are appropriate to
use.  The expert system would be capable of examining its own
reasoning and explaining its operation.
 
       A data "drawbridge" functions on the Support workstation
linking the GIS and DBMS functions with models, statistics,
contouring, and other specialized functions. The drawbridge is
linked to a problem solving function which aids the user in
several ways.  First, it aids in specifying the nature of the
problem.  Second, it assists in structuring a solution not only
in terms of conceptual approaches, but also in identifying
relevant data types and data sets which can be used to test
alternative solutions, e.g., to use a metaphor, in spreadsheet
form.  Third, it aids the user in selection of appropriate tools
by evaluating the types of data and the means of computer
assisted analysis which can most effectively represent the data
in a solution.
 
       The system support workstation is the host to the DBMS as
well as databases of spatial information.  This workstation has
the greatest amount of mass storage.  It is also the home for
data retrieval and output tools for reports, maps, queries, and
final output for maps, plots, slides, and text.
 
       Additional ideas on the Support workstation are included
later in these comments under the heading of "Resolving Ill-
Defined Problems."  In the remainder of this section I would like
to make a few comments on how to migrate from present platforms
to the design suggestions for the Logical Workstation.
 
       Workstation Configurations
 
       There are many ways to configure workstations to support
site characterization analyses and integration with GIS systems.
Each configuration has its strengths and limitations, and should
be considered in light of performance, cost, future requirements,
and migration path. The new generation of workstations can give a
user community new capabilities at lower cost: faster processing,
high resolution graphics, and multitasking for a single user.  A
suggested migration path would allow a user organization to
start with a single workstation, and expand to a network of
stations running a number of applications.  The present standards
within the user community for hardware, software, data resources,
and networks will influence workstation configurations and the
migration path.
 
       Four hardware configurations that are feasible for different
stages of the migration path are discussed below. These are not
intended to be exhaustive descriptions of all combinations.  They
provide a basis for alternative combinations.
 
       Low cost single user workstation: This configuration is
recommended for work groups with only a few users, no local area
networks, and relatively low usage of current workstations to
support data entry functions.  These workstations may also be
used for other purposes including networked office automation
functions.  These low-cost, single user workstations would in the
present era have 32-bit memory and processors, and offer
considerably enhanced speed and capacity over current '286-based
processors or dedicated terminals.  These workstations can
process data faster than terminals, and are a good choice for the
database interface to a distributed system.
 
       High performance single user workstation: This machine is
recommended for work groups requiring more processing power,
increased local storage, and a multiuser, multitasking
environment.  This machine can be used for software development
and to provide technical support to end users.  It allows users
to interface with data through packaged software from third party
vendors, custom applications, or through high level programming
languages which interact with third-party software or custom
applications.  These functions can be migrated up the power curve
of workstation configurations.  These machines can be linked with
low cost, single user workstations and with larger hosts across a
network.
 
       Multiuser local area network configuration: The next step in
the migration path is a configuration with workstations on a
local area network.  This configuration is recommended for
organizations with existing networking capabilities or those
planning to have networking capabilities in the near future.  The
single-user, high-performance workstation can be networked with
other workstations with varying degrees of capability running a
number of applications as the demand for more computing power and
use of expensive peripherals in their computing facilities grows.
At this stage in the migration path, distributed processing can
take place capitalizing on surplus cycles of individual machines
to support numerically intensive operations.
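
       One simple way to think about capitalizing on surplus
cycles is a greedy assignment: hand the largest pending job to
whichever machine has the most spare capacity left.  The sketch
below assumes made-up job costs and capacities in arbitrary units.
It is one possible policy, not a description of any particular
product.

```python
import heapq

def allocate(jobs, idle_capacity):
    """Greedy assignment of jobs to machines with spare capacity.

    jobs: dict of job name -> estimated cost (arbitrary units)
    idle_capacity: dict of machine name -> spare capacity (same units)
    Returns a dict of job name -> machine name.  A job that does not
    fit on the machine with the most spare capacity is left out.
    """
    # heapq is a min-heap, so capacities are stored negated.
    heap = [(-cap, name) for name, cap in idle_capacity.items()]
    heapq.heapify(heap)
    plan = {}
    # Consider the most expensive jobs first.
    for job, cost in sorted(jobs.items(), key=lambda kv: -kv[1]):
        neg_cap, machine = heapq.heappop(heap)
        if -neg_cap >= cost:
            plan[job] = machine
            neg_cap += cost          # less spare capacity remains
        heapq.heappush(heap, (neg_cap, machine))
    return plan

# Hypothetical jobs and machines.
plan = allocate(
    {"kriging": 6, "flow_model": 4, "stats": 1},
    {"ws_a": 7, "ws_b": 5, "pc_c": 2},
)
```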
 
       Client / Server Architecture: This type of configuration is
recommended for those sites with an increased demand for data
sharing, computing, and network resources.  The file server is
connected to a large disk that holds the network operating
system, application software, and data files.  This configuration
could include a capability to be upgraded to support multiple
processors, at least 64 Mb of RAM, and storage in excess of 1 Gb
connected via ethernet to workstations for the GIS and graphics,
and other workstations for data entry, retrieval, and integration
into desktop presentation or publishing tools.
 
       Summary
 
       In the first set of comments I have presented a proposed
system architecture which could potentially support the kind of
applications described in the second set of comments.  In the
second set I will provide some comments for ways in which these
design ideas could address the process of resolving ill-defined
problems.
 
 
2. Resolving Ill-Defined Problems
 
       The Support Workstation described above is host to expert
systems, a data drawbridge between applications, and a "solution
shaper" function.  I would like to discuss the spreadsheet
metaphor as applied to groundwater analyses and geographic
information systems.  I reason that a comparison can be made, by
analogy, between proposals for DSS and the quantum leap offered
to financial analysts by the advent of the electronic
spreadsheet.
 
       The first part of the analogy is drawn from the idea of the
data drawbridge.  Its role is the dynamic manipulation and
selective mapping of output streams from a tool to the formalized
data input structures of another tool or model.  These data
linkages would allow task pipelining. The first sentence clearly
relates to the idea of a spreadsheet with its named ranges
(selective mapping of output streams), and the second relates to
the reverse which is the use of spreadsheets to manipulate data
downloaded from larger hosts.  With regard to "data linkages
allowing task pipelining" this has been part of the vision of DSS
for some time.  These types of products have achieved the
objectives of this vision.  The system architecture noted earlier is
moving in the direction of developing an integrated toolbox for
groundwater analyses and spatial analyses which transcend the
ordinary limitations of stand-alone models.
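
       The drawbridge and task pipelining ideas can be sketched
as follows.  The two "tools" here (a summary statistics step and a
contour level step) and the field names are hypothetical stand-ins.
The point is only that an adapter selects and renames one tool's
outputs into the input structure the next tool expects.

```python
def drawbridge(mapping):
    """Build an adapter that renames selected output fields to the names
    the next tool expects, dropping everything else (selective mapping)."""
    def adapt(record):
        return {dst: record[src] for src, dst in mapping.items()}
    return adapt

def pipeline(*stages):
    """Chain stages so each one consumes the previous stage's output."""
    def run(value):
        for stage in stages:
            value = stage(value)
        return value
    return run

# Hypothetical tool 1: summary statistics over sample records.
def summarize(samples):
    values = [s["conc"] for s in samples]
    return {"mean": sum(values) / len(values), "max": max(values), "n": len(values)}

# Hypothetical tool 2: evenly spaced contour levels up to a ceiling.
def contour_levels(params):
    step = params["ceiling"] / params["count"]
    return [step * i for i in range(1, params["count"] + 1)]

run = pipeline(
    summarize,
    drawbridge({"max": "ceiling", "n": "count"}),   # "mean" is not passed on
    contour_levels,
)
levels = run([{"conc": 10.0}, {"conc": 20.0}, {"conc": 40.0}, {"conc": 30.0}])
```

Neither tool knows about the other.  Only the mapping handed to
the adapter has to change when a different tool is placed
downstream.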
 
       Solution Structures
 
       Solution structures incorporate the methodological knowledge
that an experienced problem solver uses in the site
characterization process to analyze a problem and structure the
solution.  The problem is - where is the contamination and how
much is there?  The solution is - what is the most
technologically and cost effective method to remove threats to
human health and the environment?  The analyst needs attribute
information about the extent of contamination and also technology
information about cleanup methods.  These are two distinct
databases, and under the Logical Workstation concept, need not be
on the same host.
 
       In the logical workstation application this concept would
rely on the data integration capability provided by a global data
element dictionary used by all applications. Specifically, the
solution
structures would assist the user in identifying and locating
within the workstation databases or across a network the types of
data required to solve a particular type of problem.  Finally,
once the solution process has been structured and data needs and
availability identified, assistance in selecting the most
appropriate tools for analysis and display of the results would
be provided in the order in which they were needed by the user.
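
       A global data element dictionary can be pictured as a
lookup from element names to where the data lives.  In this sketch
the element names, host names, and problem types are all
hypothetical.  It shows how a solution structure could report which
required data can be located across the network and which cannot.

```python
# Hypothetical dictionary: element name -> (host, database, table).
DATA_DICTIONARY = {
    "well_location":     ("host-a", "spatial",    "wells"),
    "analyte_conc":      ("host-a", "attributes", "samples"),
    "cleanup_tech_cost": ("host-b", "technology", "costs"),
}

# Hypothetical problem types and the data elements each one needs.
PROBLEM_NEEDS = {
    "extent_of_contamination": ["well_location", "analyte_conc"],
    "remedy_selection": ["analyte_conc", "cleanup_tech_cost", "risk_threshold"],
}

def locate(problem):
    """Split a problem's data needs into elements the dictionary can
    locate and elements that are not registered anywhere."""
    found, missing = {}, []
    for element in PROBLEM_NEEDS[problem]:
        if element in DATA_DICTIONARY:
            found[element] = DATA_DICTIONARY[element]
        else:
            missing.append(element)
    return found, missing

found, missing = locate("remedy_selection")
# The located elements sit on two different hosts.
hosts = sorted({host for host, _, _ in found.values()})
```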
 
       Guidance Tool
 
       The user must have the option to access a guidance tool for
assistance in choosing the right application and analytical
method for site characterization analysis.  This option would be
served by an expert system, and use of this capability would
result in more effective use of site characterization analyses
and their integration into GIS. The expert system must be able to
extract information from the database for knowledge
representation.
 
       The expert system of choice should integrate with the
existing databases, user interface, and tools on the workstation.
The expert system functions could include:
 
       * Help:
 
       The expert system shell can be used as the help facility in
       the workstation.  It could work as an integral part of the
       object-oriented user interface in a multitasking, windowing
       environment.
 
       * Tutorial:
 
       The guidance tool can also be used as a tutorial for new
       users.  The expert system would be integrated with the
       workstation shell and site data to get practical knowledge
       on using the workstation.
 
       * Problem Solver:
 
       The expert systems can be used as a check list for steps to
       be employed and selection of an appropriate model for site
       analysis.  These steps include:
 
       - Problem Description: assistance in identifying and
       ordering tasks that must be completed to solve a particular
       problem.  An example might include setting appropriate data
       quality objectives.
 
       - Solution Structuring: the methodological knowledge can be
       used to analyze a problem and structure the solution
       process.
 
       - Tool Identification Assistance: a knowledge-based system
       that would select the best algorithm for spatial data
       interpolation based upon the statistical characteristics and
       completeness of the data.
 
       * Self Knowledge:
 
       The expert system must have an explanation facility. This
explains how the system arrived at its answers.  The explanations
must display the inference chains and explain the rationale
behind each rule used in the chain.  The users will have more
faith in the results and more confidence in the system.
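
       To make the tool identification and explanation ideas
concrete, here is a toy forward-chaining sketch.  The rules, the
threshold of 10 samples, and the suggested interpolation methods
are invented for illustration only and are not recommendations.
What matters is that every conclusion is recorded along with the
rule and rationale that produced it.

```python
# Each rule: (name, condition on facts, conclusion, rationale).
# Rules and thresholds are invented for illustration only.
RULES = [
    ("R1", lambda f: f["n_samples"] < 10,
     ("density", "sparse"),
     "fewer than 10 samples is treated as sparse"),
    ("R2", lambda f: f["n_samples"] >= 10,
     ("density", "adequate"),
     "10 or more samples is treated as adequate"),
    ("R3", lambda f: f.get("density") == "sparse",
     ("method", "nearest neighbor"),
     "sparse data cannot support a fitted model"),
    ("R4", lambda f: f.get("density") == "adequate" and f["spatially_correlated"],
     ("method", "kriging"),
     "correlated, adequately sampled data suits kriging"),
    ("R5", lambda f: f.get("density") == "adequate" and not f["spatially_correlated"],
     ("method", "inverse distance weighting"),
     "there is no correlation structure to model"),
]

def infer(facts):
    """Forward-chain over RULES until nothing new fires; record each step."""
    facts = dict(facts)      # work on a copy of the caller's facts
    chain = []               # the inference chain shown to the user
    fired = set()
    changed = True
    while changed:
        changed = False
        for name, cond, (key, value), why in RULES:
            if name not in fired and cond(facts):
                facts[key] = value
                fired.add(name)
                chain.append(f"{name}: {key} = {value} because {why}")
                changed = True
    return facts, chain

result, explanation = infer({"n_samples": 25, "spatially_correlated": True})
```

Printing the explanation list line by line gives the user the
chain of rules that fired and the rationale behind each one.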
 
       General Considerations
 
       Expert systems offer a means to overcome barriers to more
effective use of groundwater analyses and their integration into
geographic information systems.  From an engineering perspective,
expert systems can be used to support computational techniques to
find acceptable answers to problems which are normally solvable
by conventional means, but which require iterative, complex
procedures to resolve.
 
       Expert systems can be used to address computational problems
involving special data structures together with domain specific
knowledge engineering techniques.  As a body of techniques and
methods, expert systems cut across other components of the
requirements analysis and have multiple potential applications.
 
       Use of Expert Systems
 
       There are many barriers involving both computational and
conceptual issues which relate to problem solving for
groundwater analyses and development of GIS.  Expert systems have
the potential for intensifying the speed and degree of analysis
as well as addressing the more complex problems of spatial data
structures, resulting spatial data relationships, and
visualization of mapped data.  In general, expert systems may
hold a key to more effectively managing the complexity of
groundwater data.
 
       Applications which might be built include, but are not
limited to the following examples.  These examples are drawn from
the work of the National Center for Geographic Information and
Analysis, Santa Barbara, CA:
 
       ... the statistical analysis of spatial data, including the
       selection of statistical tests to be employed, and warnings
       in case of inappropriate data management or use;
 
       ... modeling of spatial phenomena or a large area with many
       interrelated attributes;
 
       ... image interpretation such as detecting changes over
       time;
 
       ... cartographic design including label placement and color
       selection; enforcement of map design standards for a large
       user community.
 
        Summary
 
       Ill-defined problems require an "intelligent assistant" to
scope an environmental problem against such variables as sources
and volume of pollutants, toxicity, transport & fate, etc., and
to allow the application of appropriate tools supported by task
pipelining.
 
 
Conclusions
 
       These workstation configurations could be assembled today
from commercial sources.  The greatest area of custom programming
would be in the development of "intelligent assistants" and a
common user interface for all workstations to shape solutions for
cleanup of hazardous waste.
 
 
REFERENCES
 
[1] "Workstations for Groundwater Analyses at US EPA," Yurman,
D., in Proceedings of the Second Joint Conference on Integrating
Data, Session of Microcomputers and Public Policy Analysis,
National Governors' Association, Washington, DC, December 14,
1988.
 
[2] "A Conceptual Structure for the Groundwater Workstation,"
Horsey, H., Center for Advanced Decision Support Water
Environmental Systems (CADSWES), Boulder, Colorado, in
Proceedings of the Groundwater Workstation Advisory Committee,
Office of Solid Waste & Emergency Response, U.S. Environmental
Protection Agency, Washington, DC, May 1988.
 
[3] Proceedings of the Groundwater Workstation Advisory
Committee, Conference Report, Horsey, H., CADSWES, and Yurman, D.,
Office of Solid Waste & Emergency Response, U.S. Environmental
Protection Agency, Washington, DC, May 1988.
 
[4] Management Plan for Geographic Information Systems,
Information Management Staff, Office of Solid Waste & Emergency
Response, U.S. Environmental Protection Agency, Washington, DC,
November 1988.  Contractor assistance provided by Selden, D.,
American Management Systems, Rosslyn, Virginia.
 
[5] "Old Southington Landfill - Use of Historical Aerial
Photography and GIS," Proceedings of the Remote Sensing
Coordinators Conference, Environmental Photo Interpretation
Center (EPIC), U.S. Environmental Protection Agency, Vint Hill
Farms, Virginia, April 1988.
 
[6] "Management Review of the Clark Fork, Montana, Geographic
Information System," Hazardous Waste Division, U. S.
Environmental Protection Agency, Region VIII, Denver, Colorado,
July 1988.
 
[7] "Success Factors for Expert Systems," Yurman, D., In
Proceedings of the American Chemical Society, Symposium Series
No. 431, Washington, DC, September 1989.

*                 Standard disclaimer included by reference
* ------------------------------------------------------------
*  Dan Yurman     Idaho National Engineering Laboratory
*  djy@inel.gov   PO Box 1625, Idaho Falls, ID 83415-3900
*                 phone: (208) 526-8591  fax: (208)-526-6852
* ------------------------------------------------------------
* 43N 112W        Red Shift is not a brand of chewing tobacco.