[bionet.molbio.genome-program] Mouse genome report

JP2@CU.NIH.GOV (02/22/90)
In response to the query as to the status of sequencing the mouse genome,
a meeting was held of representatives of the US mouse community
last Nov. at Princeton.  The Human Genome Center was about to
post the report for that meeting, as well as reports for several other meetings
the NCHGR has sponsored over the past few months.
  I will post the report for the mouse meeting now with the others
to follow.   Participants in the meeting were:


Dr. Neal Copeland--NIH, Frederick MD
Dr. Verne Chapman-- Roswell Park Memorial Inst. Buffalo, NY
Dr. Peter D'Eustachio, NYU Medical Ctr, New York, NY
Dr. William Dove-- Univ of Wisconsin, Madison, WI
Dr. Marshall Edgell, U. of N. Carolina, Chapel Hill, NC
Dr. Eva Eicher, Jackson Laboratories, Bar Harbor, ME
Dr. Rosemary Elliott, Roswell Park
Dr. Jeff Friedman, Rockefeller Univ.  New York, NY
Dr. Lee Hood, Cal Tech, Pasadena, CA
Dr. Nancy Jenkins, NIH, Frederick, MD
Dr. Jan Klein, Max Planck Institute for Biology, Tübingen, FRG
Dr. Eric Lander, MIT, Cambridge, MA
Dr. Terry Magnuson, Case Western Reserve Univ., Cleveland
Dr. Joe Nadeau, Jackson Labs, Bar Harbor, ME
Dr. Ken Paigen, Jackson Labs
Dr. Eugene Rinchik, Oak Ridge National Laboratory, Oak Ridge, TN
Dr. Liane Russell, Oak Ridge
Dr. Shirley Tilghman, Princeton Univ, Princeton, NJ
Dr. Lee Silver, Princeton Univ

Any questions about the report can be directed to Shirley
Tilghman who organized the meeting.  Questions about funding
for mouse genome work through the Human Genome Center can be
directed to the NCHGR staff-- Jane Peterson--jp2@nihcu.bitnet,
Mark Guyer--gy4@nihcu.bitnet or Bettie Graham--b2g@nihcu.bitnet

------------------------------------------------------------------
                     INTRODUCTION

     The commitment of the National Institutes of
Health to map and sequence the human genome includes
an equal commitment toward understanding the
information which will accumulate.  This will require
the acquisition of equivalent amounts of information
for a variety of model organisms.  Far from being
redundant data gathering, the comparative approach is
essential to the success of the Genome Project.  Many
studies have shown that important genes are conserved
among organisms, providing a crucial criterion for
discriminating among genes.  However, sequence alone
provides minimal information about the specific role
of a gene in biology.  That requires experimental
manipulation, which is impossible in humans.

     The mouse became the preeminent mammalian model
organism over the last fifty years because of its
convenient size, fertility, short gestation period,
ease of maintenance and well defined genetics.  The
recent advances in transgenic DNA technology and the
ability to mutate genes in the mouse germline via
homologous recombination in embryonic stem (ES) cells
have only served to strengthen this important role for
the mouse.  In fact, the mouse is the only mammal in
which one can experimentally test the biological role
of sequences of unknown function conserved in mammals.
The power of the interplay between scientific
discovery in the human and the mouse has been amply
demonstrated in the last several years.  First, the
function of cloned human genes, e.g. the human Zfy
gene, whose role in male sex determination is
currently being questioned, can be specifically tested
only in mice.  Second, mouse models for autoimmunity,
such as the MRL/lpr and moth-eaten (me) mice, mimic
accurately the symptoms of arthritis, vasculitis and
nephritis associated with rheumatoid arthritis and
systemic lupus erythematosus in humans.  Third, the mouse
mutants offer unique access to genes that are
complementary to those identified in human populations
as "disease genes".  Human recessive lethals, if there
is no visible heterozygous phenotype, will never be
observed.  Yet these represent a group of important
essential genes in mammalian biology.  Finally the
discovery of significant synteny between the mouse and
human genomes has led to the exchange of DNA markers
between mouse and human geneticists that has
contributed to the physical mapping of the loci for
Duchenne muscular dystrophy, neurofibromatosis and
Wilms' tumour.

     Within the last few years new advances in mapping
of the mouse genome, which take advantage of
interspecies crosses,  have made it possible to
develop high resolution multilocus genetic and
physical linkage maps of the entire genome.  The
development of these maps will virtually revolutionize
the study of the mammalian genome.

     This report grew out of a meeting held at
Princeton University on November 17, 1989.  In
attendance were 19 scientists (see Appendix) who have
extensive expertise in mouse genetics and molecular
biology.  The goal of the meeting was to draw up a
reasonable blueprint for generating a genetic and
physical map, and ultimately the DNA sequence, of the
mouse genome.  Goals were set for these endeavours
that we hope will be useful to the National Institutes
of Health as it sets its priorities for the Genome
Project.  The report is divided into five sections
which cover genetic mapping, physical mapping, DNA
sequencing, informatics and animal resources.

SECTION I.  GENETIC MAPPING

     In the past, chromosomal gene assignments in the
mouse relied on meiotic mapping involving the use of
classical two- and three-point crosses between
genetically nonidentical inbred strains or recombinant
inbred (RI) strains established from crosses of inbred
laboratory mouse progenitors.  Both of these
approaches are, however, limited by the difficulty in
identifying allelic differences among inbred strains.
RI strain analysis also suffers from the additional
disadvantage that only close linkages can be
identified.

      These two disadvantages have been overcome by
using interspecific crosses, which exploit the
differences between genetically diverse Mus species
and standard inbred strains.  The resulting
interspecific hybrids are multiply heterozygous across
every chromosome and, because the inbred strains have
been extensively characterized, the phase of linkage
is known for every combination of genes that is known.
Through the power of interspecific backcross mapping
it is now possible to develop high resolution
multilocus linkage maps for the mouse genome.  By
eliminating the time-consuming search for extensive
polymorphism, a chronic problem in human gene mapping,
it is likely that the mouse genetic map will proceed
far more rapidly than the human map.  Once generated,
the mouse genetic map will allow the complete
identification of regions of synteny and linkage
conservation between mouse and man, thereby
facilitating the human mapping efforts that will be
ongoing concurrently.  The other major advantage of
multi-locus mapping is that, for each cross the map is
cumulative; that is, each new gene is mapped relative
to every other gene, irrespective of the laboratory in
which it is performed.

     The long-term goal is to develop a genetic map of
the mouse with molecularly identified genes mapped to
a resolution of 0.5 to 1.0 cM.  A map of this density
will serve as an important tool for the rapid and
efficient molecular analysis of mutant loci, and it
can serve as the base for the physical mapping of the
mouse genome and eventually the sequence analysis of
specific regions.  The workshop identified a two-step
process for building the mouse genetic map toward the
long-term objective.  First is the development of an
efficient method for localizing additional markers on
the linkage map with respect to a series of anchor
loci.  Second is the elaboration of mapping resources
that define the order of genes on chromosomes at a
resolution of 1.0 cM or less.  These steps are
discussed in more detail below.


1.  A set of approximately 320 anchor loci which will
serve to colinearize genetic maps generated at
different sites and using different methods.

     The immediate goal is to develop backcross
gene-mapping resources that locate a new locus on the
genetic map within 5 cM distance of an existing anchor
gene.  By choosing 5 cM as the optimal distance
between anchor loci,  approximately 320 anchor loci
(1,600 cM mouse genome/5 cM between anchors) will be
needed to cover the entire mouse genome.  This density
of anchor loci should provide sufficient genetic
resolution for future high resolution genetic mapping
studies.
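
     The anchor count above is simple division, and the
same arithmetic can be rerun for other spacings.  The
sketch below is only an illustration; the function name
is hypothetical, and the 1,600 cM map length and 5 cM
spacing are the figures quoted in this section.

```python
import math

def anchor_loci_needed(genome_length_cm: float, spacing_cm: float) -> int:
    """Number of evenly spaced anchor loci needed to span a genetic map.

    Assumes anchors can be placed at perfectly regular intervals, which
    real markers will only approximate.
    """
    return math.ceil(genome_length_cm / spacing_cm)

# Figures from the report: 1,600 cM genome, 5 cM between anchors.
print(anchor_loci_needed(1600, 5))   # 320
```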

      The 320 loci become important anchor references
for further orientation of the genetic map and they
will become sequence tagged sites (STSs) for the
elaboration of physical maps.  For a locus to be
chosen as an anchor it must  hybridize well to mouse
DNA and provide a STS that is polymorphic in
interspecific mouse backcrosses.  Ideally, the anchors
should also define the ends of conserved regions
between mouse and human genomes, facilitating cross
species comparisons, and be polymorphic among inbred
laboratory strains.

     The development of an STS based anchor map of the
mouse genome will enable any interested investigator
to participate in mouse genome mapping.  As long as
each investigator includes the appropriate anchor loci
in his or her map, all maps can readily be aligned in
the future.  It is anticipated that the majority of
laboratories will employ interspecific backcrosses for
mapping purposes.  However, many other types of
crosses (intraspecific, RI, etc.) need to be continued
as well.  These will be important in assessing the
generalities of the interspecies map, as at least one
chromosomal inversion between M. domesticus and M.
spretus has been identified.  Such rearrangements
would have profound effects on recombinational
mapping.

2.  A genetic map of 1 cM resolution.

     The optimal resolution of the mouse genetic
linkage map was set at 1 cM (where a 1 cM genetic map
is defined as a map where greater than 95% of the
recombination breakpoints lie within 1 cM of each
other).  The workshop members arrived at that level of
genetic resolution based on the conviction that this
is sufficient to guarantee a meaningful integration of
the genetic and physical maps.

     To achieve the 1 cM goal, an additional
3,000-4,000 probes (200-400 markers per chromosome)
will need to be placed on the genetic map.  A
backcross panel of 300 progeny provides a 95%
probability of recombining genes that are 1.0 cM apart
and a 460 progeny panel provides a 99% probability of
recombination in that distance.  This is well within
the means of many laboratories to produce.  We
anticipate that a 1 cM mouse linkage map can be in
place within the next 5 years.
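
     The panel sizes quoted above follow from a simple
probability argument.  If two loci are 1 cM apart, each
backcross progeny is recombinant between them with
probability of about 0.01, so the chance that a panel of
N progeny contains at least one recombinant is
1 - (0.99)^N.  The sketch below works this through; the
function names are hypothetical, and treating 1 cM as a
1% recombination fraction is an approximation that holds
only for short intervals.

```python
import math

def prob_at_least_one_recombinant(distance_cm: float, n_progeny: int) -> float:
    """Chance that a backcross panel separates two loci at least once."""
    r = distance_cm / 100.0          # assumption: 1 cM ~ 1% recombination
    return 1.0 - (1.0 - r) ** n_progeny

def progeny_needed(distance_cm: float, target_prob: float) -> int:
    """Smallest panel size that reaches the target probability."""
    r = distance_cm / 100.0
    return math.ceil(math.log(1.0 - target_prob) / math.log(1.0 - r))

for n in (300, 460):
    print(n, round(prob_at_least_one_recombinant(1.0, n), 3))

for target in (0.95, 0.99):
    print(target, progeny_needed(1.0, target))
```

     Under these assumptions the minimum panel sizes come
out at 299 and 459 progeny, in line with the rounded
figures of 300 and 460 given in the report.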

     Since cloned genes will primarily be mapped in
the future by interspecific backcross panels,  DNA
sequence information should be available to convert
these probes into STS's for the mouse.  The meeting
endorsed the use of STS's for mouse genome mapping.
To facilitate the overall conversion to an STS-based
map in the future, we will strongly encourage the
prior identification and description of the STS as a
prerequisite for inclusion of any new marker into the
composite map database.

SECTION II.  THE PHYSICAL MAP

     As an ultimate goal, the physical map of the
mouse should consist of an ordered set of STS's spaced
at intervals of 100 kilobase pairs of DNA (kb)
throughout the mouse genome.  Once the ordered STS's
are in hand, it is hoped that it will not be necessary
to maintain any particular reference library of
clones.  Rather the information in STS's will provide
the probes for isolating clones from any region of the
mouse genome.  Without endorsing a particular
strategy, which would be premature at this time, there
was unanimous sentiment that a physical map of the
mouse genome is a very high priority which could be
achieved within the next five to eight years.
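
     To give a sense of scale, the number of STSs implied
by a 100 kb spacing can be estimated directly.  The
sketch below assumes the genome size of 3 x 10^9 bp
quoted later in this section and perfectly even spacing;
the function name is hypothetical.

```python
def sts_count(genome_bp: int, spacing_bp: int) -> int:
    """Number of evenly spaced STSs needed at a given spacing."""
    return genome_bp // spacing_bp

# 3 x 10^9 bp genome, one STS every 100 kb
print(sts_count(3_000_000_000, 100_000))   # 30000
```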

     The workshop strongly endorsed physical mapping
at two levels of  complexity: regional and genome
wide.  These will be considered separately, although
it is clear that coordination, possibly through the
use of a common yeast artificial chromosome (YAC)
library, will enhance both kinds of efforts.  It was
the strong conviction of the attendees that both were
essential and should begin immediately.  Unless
genome-wide mapping is tackled, closure of the
physical map will be many years away.

1.  Regional mapping

     The workshop recommended that current
efforts at regional physical mapping be intensified
immediately.  Techniques for making chromosome
specific or chromosome enriched libraries, such as
chromosome microdissection, somatic cell hybrids
segregating mouse chromosomes, and flow sorted
(Robertsonian) chromosomes are currently available.
Others are likely to be developed in the next few
years.

     This endeavour will almost certainly take place
in regions that are well characterized with respect to
genetics and biological function.  Although
interpretation of this criterion will certainly vary
from investigator to investigator, evaluation of what
is currently available in the mouse-mutation resource
does suggest several regions on which to concentrate.
Clearly, the proximal region of chromosome 17 is
destined for such model status; the years of
investigation into the nature and function of both the
t complex and the major histocompatibility complex,
the vast storehouse of mouse "reagents" such as MHC
congenic strains, recombinant t haplotypes,
ethylnitrosourea-induced lethal mutations, spontaneous
chromosomal rearrangements involving t chromosomes,
and region-specific chromosome-dissection microclones,
which provide many nucleation points for the
development of long-range physical maps of the region,
all combine to suggest that intensive study of this
region should continue.

     Other regions to be considered are those covered
by existing complexes of overlapping radiation- and
chlorambucil-induced germline deletions, or those
regions such as the heavy and light chain
immunoglobulin loci and T cell receptor loci which are
dense with genetic information.  The length of genome
associated just with the 6 to 11 cM deletion complex
surrounding the albino (c) locus on chromosome 7, for
example, could approach 0.3% to 0.7% of the total
genome.  Similar arguments can be made for the
dilute-short ears complex on chromosome 9, the Agouti
complex on chromosome 2, and the brown (b) complex on
chromosome 4.  These animal resources are unique in
mammalian biology, and provide powerful tools for
physical mapping.  Molecular cloning of
breakpoint-junction fragments from independent members
of a deletion complex affords an efficacious way of
chromosome "jumping" back and forth within a defined
length of genome.
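
     The genome fraction quoted above for the albino
deletion complex can be checked against the 1,600 cM map
length used in Section I.  This is a rough sketch; it
assumes that physical length is proportional to genetic
length, which is only approximately true, and the
function name is hypothetical.

```python
def genome_fraction_percent(interval_cm: float, genome_cm: float = 1600.0) -> float:
    """Percentage of the genetic map covered by an interval."""
    return 100.0 * interval_cm / genome_cm

# Albino (c) deletion complex: 6 to 11 cM
low = genome_fraction_percent(6)
high = genome_fraction_percent(11)
print(f"{low:.2f}% to {high:.2f}%")
```

     The result, a little under 0.4% to a little under
0.7%, is consistent with the approximate range given in
the text.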

     The existing mutation resource, as well as that
which can be newly developed, thus provides the
genetic tools for intensive physical and correlated
functional mapping in selected regions of the mouse
genome.  Because of extensive mouse-human
genome-segment homologies, exploitation of mouse
mutations, and the maps to which they contribute, may
make it possible to assign functions to human DNA
sequences that might otherwise be characterized only
at the DNA sequence level.

2.  Genome-Wide Physical Mapping

     Although there is strong support for regional
mapping efforts, it is recognized that regional
mapping will not lead to a complete physical map.  For
this, it will be necessary to generate an overlapping
contig map of the entire mouse genome.  The workshop
endorsed immediate initiation of projects aimed at
genome-wide physical mapping.  The paradigms for such
an endeavour  are the successful projects undertaken
for E. coli, yeast, and C. elegans.  Although there
are a variety of approaches which can be used to
assemble an overlapping contig map, certain basic
considerations apply across the board.  As shown by
theory and practice, it is impractical to achieve
closure of overlapping contig maps, at least by random
addition of clones.  For the mouse genome, it seems
practical to aim at assembling clones into perhaps
2000-4000 islands of physically overlapping clones
which must then be ordered relative to each other in
the genome.  Once the islands are ordered, the gaps
can be systematically closed.
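
     The relationship between clone coverage and the
number of islands can be explored with the
Lander-Waterman (1988) model of random clone
fingerprinting, in which the expected number of islands
is N * exp(-c * (1 - theta)), where N is the number of
clones, c = N*L/G is the coverage, and theta is the
fraction of a clone that must overlap for the overlap to
be detected.  The sketch below is illustrative only: the
clone size, coverage and theta values are assumptions
chosen for the example, not figures from the workshop.

```python
import math

def expected_islands(genome_bp: float, clone_bp: float,
                     n_clones: int, theta: float) -> float:
    """Expected number of islands (including single-clone islands)
    under the Lander-Waterman model of random clone mapping."""
    coverage = n_clones * clone_bp / genome_bp
    return n_clones * math.exp(-coverage * (1.0 - theta))

GENOME_BP = 3e9        # genome size quoted in this section
CLONE_BP = 300e3       # assumed YAC insert size (hypothetical)
THETA = 0.25           # assumed minimum detectable overlap (hypothetical)

for coverage in (1, 2, 3, 5):
    n = int(coverage * GENOME_BP / CLONE_BP)
    islands = expected_islands(GENOME_BP, CLONE_BP, n, THETA)
    print(f"{coverage}x coverage, {n} clones: about {islands:.0f} islands")
```

     With these assumed values, threefold coverage leaves
a few thousand islands, and each further increment of
random clones removes fewer gaps than the last, which is
why closure by random addition alone is impractical.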

     Projects to generate contigs in a genome of 3 x
10^9 bp should be strongly encouraged.  Clone
fingerprinting in a mammalian genome may require new
approaches, for example, hybridization of restriction
enzyme digested DNA from YAC clones with probes
detecting frequent repeats in the mouse genome to
produce distinctive patterns of fragment sizes. Robust
methods for band detection will be essential to a
project of this size.  The development of additional
moderately repetitive probes is also highly
encouraged.  Moderately repetitive probes have many
important uses aside from fingerprinting,  including
the rapid mapping of uncloned mouse mutations, the
establishment of skeleton linkage maps for the mouse
genome, and the mapping of polygenic traits.

     The blueprint set for the genetic map in Section
I. clearly is important in considering strategies for
genome-wide physical mapping.  The assignment of
contigs to the genetic map will be greatly facilitated
by interspecific cross mapping; e.g., taking probes
containing genetically-mapped polymorphisms and
identifying the contigs that contain them.  By
contrast, it is worth noting that the greater
difficulty of genetic mapping in the human might tend
to favor high resolution in-situ hybridization as a
mapping strategy.

      Ultimately the physical contig map  will be
converted into STS's, in a variety of ways.  For
example, one might (i) make STSs that detect genetic
polymorphisms, placing them on the genetic map and a
clone collection simultaneously; (ii) make STSs from
the ends of clones, using the STS to detect small
overlaps between clones; (iii) make STSs from genes of
interest; or (iv) use a combination of these
approaches.

SECTION III.   DNA SEQUENCING

     In the end the physical and genetic maps are
preludes to complete DNA sequence analysis.  It is
important that the mouse community take immediate (and
selective) advantage of the maps to generate regional
sequence.  The workshop supported the idea of several
regional DNA sequencing projects where a 0.5-2 Mb
focus is
chosen based on interesting genetics and biology.
This is the size of a large gene or many multigene
loci (e.g., cystic fibrosis, MHC, T-cell receptor and
antibody loci).  Ideally, each sequencing project
should be done in parallel with the same region in the
human genome,  and in intimate coordination with
investigations on the biology and genetics of the
locus.  At this time a group of 12-15 people with
appropriate instrumentation and computational support
could sequence perhaps one Mb per year.  The number
and scope of regional sequencing projects could
increase as the sequencing technologies become more
mature.

     It is important to stress again why sequence
information for the mouse is not redundant with that
for the human.  First, the evolutionary distance between
mouse and human is sufficient to allow the
identification of important features of mammalian
genomes, including coding, regulatory and structural
elements.  For example, the first enhancer was
identified by virtue of its homology between these two
species.  Second, given that the
mouse will be the primary experimental organism in
which to explore the function of human genes, then it
will be necessary to have intimate knowledge of the
homologous mouse genes.  The complete DNA sequence
will provide powerful new tools to explore the
developmental and molecular biology of these genes
(e.g., knowledge of all restriction sites, PCR probes,
antibodies to coding regions, etc.).  These tools will
revolutionize the manner in which the corresponding
loci can be analyzed.

SECTION IV.  INFORMATICS

     Informatics is a key aspect of the mouse portion
of the Genome Project.  Not only is it essential for
managing and exploiting the substantial and complex
data that will be generated, but it will also
represent a key connection to the other species in the
Project and to all subsequent studies of organismal
biology.  We strongly endorse the notion that this
effort be international in scope, especially as groups
in Europe and Japan are already (Green) contributing
significantly to the Mouse Genome Effort.  The
International Mouse Genetics Community has
well-established organs of communication that serve as
important background resources.  The Mouse Newsletter,
and Green's Genetic Variants and Strains of the
Laboratory Mouse are two such examples.  It is in the
best interests of the entire field that databases be
assembled with the international community in mind.

     There are two primary uses of genome-related
databases: as repositories of data from which
particular bits of information can be extracted and as
a source of information from which data can be
retrieved for analyses of genome organization and
biological function.  These uses dictate the manner in
which these data should be stored, displayed, analyzed
and distributed.  The following is based on the
premise that informatics needs should be addressed
with small, focused, sophisticated databases that any
user can access in a simple, uniform manner,
preferably one based on a graphical, point-and-click
environment.  The important issues which need to be
addressed are outlined below.

1.  Database structure

     There are two alternative philosophies of
database composition.  One is to create a monolithic
database of all possible mouse genome-related
information.  The advantages of this approach are
several.  Information is concentrated at a particular
site and users immediately know where and how
information can be obtained.  Moreover, users need to
learn only one user environment and they can become
intimately familiar with the subtleties of the
system.

     The alternative philosophy involves marginal
databases consisting of a series of small specialized
databases each containing specific types of genome
information.  Again there are several advantages to
this approach.  Each database is maintained by a
specialist in that particular area.  Each specialist
can be relied upon to keep the information in a better
manner than the most well-intentioned person who is
less intimately related with the data.  Databases can
be built and maintained in ways that address the
specific needs of the specialty area.  Changing
interests and needs can be tracked efficiently.  The
principal disadvantage of this approach is that users
must learn the idiosyncrasies of each system.
Projects are underway, however, to address this
problem by providing a single user environment
("front-end") in which access to diverse databases can
be obtained transparently.

2.  Quality of the Databases

     It goes without saying that the data must be of
high quality.  Every effort should be made to enter
data in its most robust form, thereby enabling the
widest variety of uses.  An example is the observed
numbers of progeny of each haplotype class in
multi-locus crosses, rather than summarized measures
of recombination frequencies.  Summarized data by
their nature limit the range of analyses that are
possible.
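
     The point about storing raw haplotype counts can be
made concrete.  If the counts for each haplotype class
are kept, recombination fractions between any pair of
loci can be recomputed at will, whereas stored pairwise
fractions cannot be turned back into counts.  The sketch
below uses an invented three-locus backcross; the data,
the allele codes ("B" for the inbred strain, "S" for M.
spretus) and the function name are all hypothetical.

```python
from typing import Dict, Tuple

Haplotype = Tuple[str, ...]   # one allele code per locus, in map order

def recombination_fraction(counts: Dict[Haplotype, int], i: int, j: int) -> float:
    """Fraction of progeny whose alleles at loci i and j differ in origin."""
    total = sum(counts.values())
    recombinant = sum(n for hap, n in counts.items() if hap[i] != hap[j])
    return recombinant / total

# Invented counts for 100 backcross progeny typed at three loci.
counts = {
    ("B", "B", "B"): 45, ("S", "S", "S"): 47,   # non-recombinant
    ("B", "S", "S"): 2,  ("S", "B", "B"): 1,    # crossover in interval 1
    ("B", "B", "S"): 3,  ("S", "S", "B"): 2,    # crossover in interval 2
}

for i, j in ((0, 1), (1, 2), (0, 2)):
    print(i, j, recombination_fraction(counts, i, j))
```

     The same table also supports analyses that pairwise
summaries cannot, such as counting double crossovers or
testing alternative gene orders.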

     An important feature that database keepers should
incorporate is a coding of the data according to their
status, e.g. published versus personal communication,
confirmed versus provisional.  This feature would
enable users to identify strong and weak data and to
make their own decisions about which data should be
included in their analysis.  Another important feature
should be comments on special or idiosyncratic aspects
of design, methods, results, or any other relevant
part of each study.
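
     One way such status coding might look in practice is
sketched below.  The record layout, field names and
example entries are hypothetical and are meant only to
show how users could filter on data status and read
free-text comments; no existing database is being
described.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Set

class Source(Enum):
    PUBLISHED = "published"
    PERSONAL_COMMUNICATION = "personal communication"

class Status(Enum):
    CONFIRMED = "confirmed"
    PROVISIONAL = "provisional"

@dataclass
class MappingRecord:
    locus: str            # hypothetical locus name
    chromosome: str
    source: Source
    status: Status
    comments: str = ""    # idiosyncrasies of design, methods or results

def select(records: Iterable[MappingRecord],
           sources: Set[Source], statuses: Set[Status]) -> List[MappingRecord]:
    """Keep only records whose source and status the user accepts."""
    return [r for r in records if r.source in sources and r.status in statuses]

records = [
    MappingRecord("LocusA", "7", Source.PUBLISHED, Status.CONFIRMED),
    MappingRecord("LocusB", "7", Source.PERSONAL_COMMUNICATION,
                  Status.PROVISIONAL, "typed on a partial panel"),
    MappingRecord("LocusC", "17", Source.PUBLISHED, Status.PROVISIONAL),
]

strong = select(records, {Source.PUBLISHED}, {Status.CONFIRMED})
print([r.locus for r in strong])   # ['LocusA']
```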

3.  Connections between databases

     It is essential that means be developed for
accessing and analyzing diverse databases
simultaneously.  These connections represent one of
the most important consequences of the Genome Project
- the utilization of genetic and physical mapping data
to understand organismal biology.  A simple example is
comparison of the physical map with the genetic map of
phenotypic deviants with simultaneous access to a
description of the phenotype, the genetic and physical
maps of the homologous portions of the human genome,
and a description of genetic diseases that map to that
region.

     Excellent communication exists not only between
the various groups working on Mouse Informatics, but
also with the Human Informatics community.  (Lest this
comment appear gratuitous - several meetings between
the two Informatics groups at The Jackson Laboratory
have led to a firm commitment by both groups to work
together to build comprehensive databases for the
mouse.)  Efforts are being made to coordinate database
development in order to minimize unintentional
redundancy.  Plans are being developed for a workshop
to be held as soon as possible to discuss Mouse
Informatics.  This workshop will provide a valuable
forum for discussing informatics needs, alternative
solutions, and means of implementation.

4.  Database needs and existing databases

     A wide variety of reasonably sophisticated
databases of mouse genetic information already exist.
Each of these databases is being enhanced to improve
performance, user compatibility, and flexibility.  It
is probable that other kinds of databases will be
developed.  For example, important new databases
dealing with screening physical mapping libraries with
defined clones and probes and the assembly of the
physical map from overlapping segments of the genome
are needed.  Databases that deal with important areas
of biological information about the mouse and other
species of mammals also exist.  A simple example of
the utility of these databases involves MUTCAT, M.C.
Green's description of heritable mutations.  Many of
the databases, analytical methods, and multi-level
connections between databases can only be described
very generally; specific descriptions of future
database needs are premature.

     Listed below are examples of the databases that
already exist or that need to be developed.

Database needs                              Exists    To be developed

Directly genome related

Genetic maps
     multilocus                             In part
     mutation (incl. the historical map)    Yes
     chromosome maps                        Yes
Physical maps
     contig - ordered clones                          Yes
     hit-map (hyb. of clones to library)              Yes
Cytogenetic map
     banding patterns                       Yes
     chr rearrangements and fragile sites   Yes
     in situ hybridization                  Yes
     deletions                              In part
RFLP map                                    Yes
Comparative genetic maps                    Yes
Comparative physical maps                             Yes

Relevant databases of biological information

Protein sequence and structure              Yes
Protein function                            ??
Mutant phenotype                            Yes
Disease loci                                Yes
MIM                                         Yes
MUTCAT                                      Yes
LONDON DYSMORPHOLOGY                        Yes

Resources

Clones and probes                           Yes
Lane list of strains and mutations          Yes
JAX DNA resource                            Yes

     It is premature to define hardware and software
standards.  Software needs are not sufficiently
well-defined that specific programs can be developed
and generally supported.  Instead, effort should focus
in the short term on research on the construction and
use of database structures and mechanisms to integrate
the various efforts to develop useful databases.  The
hardware available for these problems is rapidly
advancing.  The software and databases should be as
independent of particular hardware architectures as
possible.  The goal should be portability,
flexibility, and independence. Dialogue between
research groups involved in studying these problems is
essential.

SECTION V.  MOUSE STRAIN AND MUTATION RESOURCES

     One of the major reasons that the mouse has
become the preeminent mammalian model organism is the
rich genetic resources that exist worldwide.  The
roles that the various Mus species collected
worldwide, the standard inbred strains, with their
genetic homogeneity, and the large number of mutant
strains will play in genetic and physical mapping have
already been described.  However, it is equally
important to consider the important role these
mutations, and the ones that will be generated in the
future, will play in correlating DNA sequence with
biological function.  The generation and study of
mutations in the mouse will contribute in a
significant way to understanding the functional makeup
of the mammalian genome by providing a direct,
experimentally malleable system for associating
biologically significant phenotypes first with regions
of the genome, and then with specific DNA sequences
themselves.

     That the availability of heritable genetic
alterations, and their subsequent use as biological
"reagents", are essential components in the
molecular-genetic analysis of any organism is an
undisputed fact.  Recent developments in both
classical- and molecular-mutagenesis techniques have
provided means to induce particular types of mouse
mutations, each with its own applications, advantages,
and disadvantages.  Indeed, the exciting new technique
of homologous recombination in embryonic stem-cell
(ES) lines is a powerful method for introducing
specific, defined mutations within a gene of choice.
Assuming that current technical difficulties with this
procedure will be ironed out in the near future, its
only limitations are that one must first start with a
cloned sequence (no problem at the end of the genome
initiative, but difficult initially), and that genes
must be mutated one at a time.  It was generally
agreed at the Workshop that the technique of
homologous recombination takes its place as an
extremely important and useful tool in the future of
biology, but will not (and should not) be the only
source of new mutations to complement the genome
initiative.

     Insertional mutagenesis, either by naturally
occurring endogenous elements, or by exogenously added
DNA (as in the formation of transgenic mice), likewise
provides a means of inducing mutations that will also
be immediately accessible at the molecular level.  In
contrast to homologous-recombination-induced
mutations, insertional mutations may prove to be quite
random, occurring at just about any locus throughout
the genome.

     Spontaneous and, importantly, agent-induced
chromosomal rearrangements have proved, in many
systems, to be valuable genetic tools for gaining
initial molecular access either to a specific
(uncloned) gene or to an entire  chromosomal region.
These rearrangements --- specifically, deletions,
translocations, and inversions --- have also proved
essential in providing both cytogenetic and molecular
landmarks or reference points in the development of
detailed physical maps of entire chromosomal regions;
in some cases, they provide the actual definition of
when one has indeed arrived at a specific locus at the
molecular level.  Because of these attributes,
chromosomal rearrangements are, at this point in time,
highly desirable research tools for beginning the
molecular analysis of a genomic region or locus.

     Several regions of the mouse genome (perhaps two
to three percent, in aggregate) are currently covered
by extensive "complexes" of overlapping radiation- and
chlorambucil-induced deletions.  Unfortunately, these
types of rearrangements are currently not available
for the major portion of the mouse genome.  Several
laboratories are now developing the use of
chlorambucil mutagenesis of male germ cells as a
high-efficiency method for creating chromosomal
rearrangements (and particularly deletions) at
numerous additional regions throughout the genome, and
this should be encouraged.

     Intragenic mutations (existing or newly induced),
while often providing suitable mammalian models for
the study of developmental and clinical genetics,
identify loci that are difficult to localize precisely
on genetic and physical maps with current
technologies.  The localization of such loci into
intervals of the 1-cM cloned-marker
interspecific-backcross map will present a major
challenge, especially for those mutations associated
with phenotypes such as lethality or reproductive
failure.  On the other hand, such intragenic mutations
have considerable utility for refining maps of entire,
specific, genomic regions.  Recently developed
procedures for high-efficiency germline mutagenesis
with agents, such as N-ethyl-N-nitrosourea, that
induce primarily intragenic mutations, along with
appropriate breeding protocols, now provide the best
means for determining the functional-locus makeup of
specific stretches of chromosome, for estimating the
numbers and kinds of genes that one might expect to
find within the complete DNA sequence, and for
determining the gene densities that might be
encountered in different "structural" regions of
chromosomes (e.g., G-light versus G-dark bands).

     Particularly cogent in this context would be the
exploitation of high-efficiency mutagenesis protocols
in mouse regions that share significant homologies
with certain human regions.  In this way, aspects of
the functional composition of certain regions of the
human genome might be ascertained early on in the
genome initiative.

     In summary, the role of the mouse-mutation
resource in the genome initiative is currently
substantial, and it should become even more important
and relevant as the initiative evolves.  Indeed, the
major strength of coordinating the molecular aspects
of the human and mouse genome initiatives with the
mouse-mutation resource is that the outcomes need not
stand alone as a sterile catalog; they can readily be
associated with functional information derived from
genetic investigations, and with experimental studies
that can delve into the earliest development of
abnormal phenotypes.