THE BLUEPRINT OF MAN THE HUMAN GENOME
COMPILED AND DESIGNED BY JOSE VILLICANA
THE BLUEPRINT OF MAN THE HUMAN GENOME
COMPILED AND DESIGNED BY JOSE VILLICANA
THE BLUEPRINT OF MAN THE HUMAN GENOME
BOOK DESIGN © 2011 BY JOSE VILLICANA ACADEMY OF ART UNIVERSITY SPRING 2011 GR 330 OL1: TYPOGRAPHY 3: COMPLEX HIERARCHY TAUGHT BY: CAROLINA DE BARTOLO PRINTED AT CHUM’S DESIGN & PRINT BOUND AT THE KEY PRINTING AND BINDING ALL RIGHTS RESERVED
TABLE OF CONTENTS
INTRODUCTION
01
OUR GENES
21
DNA
35
RNA
47
CHROMOSOMES
57
MUTATIONS
67
THE FUTURE OF GENETICS
79
INDEX
89
INTRODUCTION
01 Life is specified by genomes. Every organism, including humans, has a genome that contains all of the biological information needed to build and maintain a living example of that organism. The biological information contained in a genome is encoded in its deoxyribonucleic acid (DNA) and is divided into discrete units called genes. Genes code for proteins that attach to the genome at the appropriate positions and switch on a series of reactions called gene expression. Inside each of our cells lies a nucleus, a membranebounded region that provides a sanctuary for genetic information. The nucleus contains long strands of DNA that encode this genetic information. A DNA chain is made up of four chemical bases: adenine (A) and guanine (G), which are called purines, and cytosine (C) and thymine (T), referred to as pyrimidines. Each base has a slightly different composition, or combination of oxygen, carbon, nitrogen, and hydrogen. In a DNA chain, every base is attached to a sugar molecule (deoxyribose) and a phosphate molecule, resulting in a nucleic acid or nucleotide. Individual nucleotides are linked through the phosphate group, and it is the precise order, or sequence, of nucleotides that determines the product made from that gene. The DNA that constitutes a gene is a double-stranded molecule consisting of two chains running in opposite directions.
01.
NH2
N
A
Adenine
C
Cytosine
G
Guanine
C C
HC
N
C
HC
CH N
N
NH2
C CH
N
C
CH
O
N
O
N
C
C
C
NH
C
HC N
C N
NH2
T
O
CH
C HN
C
C O
CH N
02. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Thymine
The chemical nature of the bases in double-stranded DNA creates a slight twisting force that gives DNA its characteristic gently coiled structure, known as the double helix. The two strands are connected to each other by chemical pairing of each base on one strand to a specific partner on the other strand. Adenine (A) pairs with thymine (T), and guanine (G) pairs with cytosine (C). Thus, A-T and G-C base pairs are said to be complementary. This complementary base pairing is what makes DNA a suitable molecule for carrying our genetic information—one strand of DNA can act as a template to direct the synthesis of a complementary strand. In this way, the information in a DNA sequence is readily copied and passed on to the next generation of cells though not all genetic information is found in nuclear DNA. Both plants and animals have an organelle—a “little organ” within the cell— called the mitochondrion. Each mitochondrion has its own set of genes. Plants also have a second organelle, the chloroplast, which also has its own DNA. Cells often have multiple mitochondria, particularly cells requiring lots of energy, such as active muscle cells. This is because mitochondria are responsible for converting the energy stored in macromolecules into a form usable by the cell, namely, the adenosine triphosphate (ATP) molecule. Thus, they are often referred to as the power generators of the cell.Unlike nuclear DNA. The DNA (the DNA found within the nucleus of a cell), half of which comes from our mother and half from our father, mitochondrial DNA is only inherited from our mother. This is because mitochondria are only found in the female gametes or “eggs” of sexually reproducing animals, not in the male gamete, or sperm.
INTRODUCTION 03.
Mitochondrial DNA also does not recombine; there is no shuffling of genes from one generation to the other, as there is with nuclear genes. Large numbers of mitochondria are found in the tail of sperm, providing them with an engine that generates the energy needed for swim ming toward the egg. However, when the sperm enters the egg during fertilization, the tail falls off, taking away the father’s mitochondria. The energy-conversion process that takes place in the mitochondria takes place aerobically, in the presence of oxygen. Other energy conversion processes in the cell take place anaerobically, or without oxygen. The independent aerobic function of these organelles is thought to have evolved from bacteria that lived inside of other simple organisms in a mutually beneficial, or symbiotic relationship, providing them with a type of aerobic capacity. Through the process of evolution, these tiny organisms became incorporated into the cell, and their genetic systems and cellular functions became integrated to form a single functioning cellular unit. Because mitochondria have their own DNA, RNA, and ribosomes, this scenario is quite possible. This theory is also supported
by
the
existence
of
a
eukaryotic
organism,
called the amoeba, which lacks mitochondria. Therefore, amoeba must always have a symbiotic relationship with an aerobic bacterium. There are many diseases caused by mutations in mitochondrial DNA (mtDNA). Because the mitochondria produce energy in cells, symptoms of mitochondrial diseases often involve degeneration or functional failure of tissue. For example, mtDNA mutations have been identified in some forms of diabetes, deafness, and certain inherited heart diseases.
04. THE BLUEPRINT OF MAN: THE HUMAN GENOME
In addition, mutations in mtDNA are able to accumulate throughout an individual’s lifetime. This is different from mutations in nuclear DNA, which has sophisticated repair mechanisms to limit the accumulation of mutations. Mitochondrial DNA mutations can also concentrate in the mitochondria of specific tissues. A variety of deadly diseases are attributable to a large number of accumulated mutations in mitochondria. There is even a theory, the Mitochondrial Theory of Aging, that suggests that accumulation of mutations in mitochondria contributes to, or drives, the aging process. These defects are associated with Parkinson’s and Alzheimer’s disease, although it is not known whether the defects actually cause or are a direct result of the diseases. However, evidence suggests that the mutations contribute to the progression of both diseases. In
addition
to
the
critical
cellular
energy-related
functions, mitochondrial genes are useful to evolutionary biologists because of their maternal inheritance and high rate of mutation. By studying patterns of mutations, scientists are able to reconstruct patterns of migration and evolution within and between species. For example, mtDNA analysis has been used to trace the migration of people from Asia across the Bering Strait to North and South America. It has also been used to identify an ancient maternal lineage from which modern man evolved. Just like DNA, ribonucleic acid (RNA) is a chain, or polymer, of nucleotides with the same 5’ to 3’ direction of its strands. However, the ribose sugar component of RNA is slightly different chemically than that of DNA. RNA has a 2’ oxygen atom that is not present in DNA. Other fundamental structural differences exist.
INTRODUCTION 05.
For example, uracil takes the place of the thymine nucleotide found in DNA, and RNA is, for the most part, a single-stranded molecule. DNA directs the synthesis of a variety of RNA molecules, each with a unique role in cellular function. For example, all genes that code for proteins are first made into an RNA strand in the nucleus called a messenger RNA (mRNA). The mRNA carries the information encoded in DNA out of the nucleus to the protein assembly machinery, called the ribosome, in the cytoplasm. The ribosome complex uses mRNA as a template to synthesize the exact protein that is
coded
for by the human gene. Although DNA is the carrier of genetic information in a cell, proteins do the bulk of the work. Proteins are long chains containing as many as 20 different kinds of amino acids. Each cell contains thousands of different proteins: enzymes that make new molecules and catalyze nearly all chemical processes in cells; structural components that give cells their shape and help them move; hormones that transmit signals throughout the body; antibodies that recognize foreign molecules; and transport molecules that carry oxygen. The genetic code carried by DNA is what specifies the order and number of amino acids and, therefore, the shape and function of the protein. The “Central Dogma”—a fundamental principle of molecular biology—states that genetic information flows from DNA to RNA to protein. Ultimately, however, the genetic code resides in DNA because only DNA is passed from generation to generation. Yet, in the process of making a protein, the encoded information must be faithfully transmitted first to RNA then to protein. Transferring the code from DNA to RNA is a fairly straightforward process called transcription.
06. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Deciphering the code in the resulting mRNA is a little more complex. It first requires that the mRNA leave the nucleus and associate with a large complex of specialized RNAs and proteins that, collectively, are called the ribosome. Here the mRNA is translated into protein by decoding the mRNA sequence in blocks of three RNA bases, called codons, where each codon specifies a particular amino acid. In this way, the ribosomal complex builds a protein one amino acid at a time, with the order of amino acids determined precisely by the order of the codons in the mRNA. Genes make up about 1 percent of the total DNA in our genome. In the human genome, the coding portions of a gene, called exons, are interrupted by intervening sequences, called introns. In addition, a eukaryotic gene does not code for a protein in one continuous stretch
of
DNA.
Both
exons
and
introns
are
“tran-
scribed” into mRNA, but before it is transported to the ribosome, the primary mRNA transcript is edited. This editing process removes the introns, joins the exons together, and adds unique features to each end of the transcript to make a “mature” mRNA. One might then ask what the purpose of an intron is if it is spliced out after it is transcribed? It is still unclear what all the functions of introns are, but scientists believe that some serve as the site for recombination,
INTRODUCTION 07.
U
C
A
G
UUU Phenylalanine
UCU Serine
UGU Cysteine
UUC Phenylalanine
UCC Serine
UGC Cysteine
UUA Leucine
UCA Serine
UGA Stop
UUG Leucine
UCG Serine
UGG Tryptophan
CUU Leucine
CCU Proline
CGU Arginine
CUC Leucine
CCC Proline
CGC Arginine
CUA Leucine
CCA Proline
CGA Arginine
CUG Leucine
CCG Proline
CGG Arginine
AUU Isoleucine
ACU Threonine
AGU Serine
AUC Isoleucine
ACC Threonine
AGC Serine
AUA Isoleucine
ACA Threonine
AGA Arginine
AUG Methionine
ACG Threonine
AGG Arginine
GUU Valine
GCU Alanine
GGU Glycine
GUC Valine
GCC Alanine
GGC Glycine
GUA Valine
GCA Alanine
GGA Glycine
GUG Valine
GCG Alanine
GGG Glycine
Table 1. RNA triplet codons and their corresponding amino acids.
08. THE BLUEPRINT OF MAN: THE HUMAN GENOME
When the complete mRNA sequence for a gene is known, computer programs are used to align the mRNA sequence with the appropriate region of the genomic DNA sequence. This provides a reliable indication of the beginning and end of the coding region for that gene. In the absence of a complete mRNA sequence, the boundaries can be estimated by ever-improving, but still inexact, gene prediction software. The problem is the lack of a single sequence pattern that indicates the beginning or end of a eukaryotic gene. Fortunately, the middle of a gene, referred to as the core gene sequence, this would have enough consistent features that would
allow
for more reliable predictions.
Close up of the DNA strand DNA bases pair up with each other, A with T and C with G, to form units called base pairs. Each base is also attached to a sugar molecule and a phosphate molecule. Together, a base, sugar, and phosphate are called a nucleotide. Nucleotides are arranged in two long strands that form a spiral called a double helix. The structure of the double helix is somewhat like a ladder, with the base pairs forming the ladder’s rungs and the sugar and phosphate molecules forming the vertical sidepieces of the ladder.
INTRODUCTION 09.
Transcription, the synthesis of an RNA copy from a sequence of DNA, is carried out by an enzyme called RNA polymerase. This molecule has the job of recognizing the DNA sequence where transcription is initiated, called the promoter site. In general, there are two “promoter” sequences upstream from the beginning of every gene. The location and base sequence of each promoter site vary for prokaryotes (bacteria) and eukaryotes (higher organisms), but they are both recognized by RNA polymerase, which can then grab hold of the sequence and drive the production of an mRNA. Eukaryotic
cells
have
three
different
RNA
polymer-
ases, each recognizing three classes of genes. RNA polymerase II is responsible for synthesis of mRNAs from protein-coding genes. This polymerase requires a sequence resembling TATAA, commonly referred to as the TATA box, which is found 25-30 nucleotides upstream of the beginning of the gene, referred to as the initiator sequence. Transcription terminates when the polymerase stumbles upon a termination, or stop signal. In eukaryotes, this process is not fully understood. Prokaryotes, however, tend to have a short region composed of G’s and C’s that is able to fold in on itself and form complementary base pairs, creating a stem in the new mRNA. This stem then causes the polymerase to trip and release the nascent, or newly formed, mRNA.
10. THE BLUEPRINT OF MAN: THE HUMAN GENOME
The
beginning
the
genetic
of
code
translation, carried
by
the
process
in
mRNA
directs
the
which syn-
thesis of proteins from amino acids, differs slightly for
prokaryotes
and
eukaryotes,
although
both
pro-
cesses always initiate at a codon for methionine. For prokaryotes, the ribosome recognizes and attaches at the sequence AGGAGGU on the mRNA, called the ShineDelgarno sequence, that appears just upstream from the methionine (AUG) codon. Curiously, eukaryotes lack this recognition sequence and simply initiate translation at the amino acid methionine, usually coded for by the bases AUG, but sometimes GUG. Translation is terminated for both prokaryotes and eukaryotes when the ribosome reaches one of the three stop codons. Sequences that code for proteins are called structural genes. Although it is true that proteins are the major components of structural elements in a cell, proteins are also the real workhorses of the cell. They perform such functions as transporting nutrients into the cell; synthesizing new DNA, RNA, and protein molecules; and transmitting chemical signals from outside to inside the cell, as well as throughout the cell—both critical to the process of making proteins.
INTRODUCTION 11.
A class of sequences called regulatory sequences makes up a numerically insignificant fraction of the genome but provides critical functions. For example, certain sequences indicate the beginning and end of genes, sites for initiating replication and recombination, or provide landing sites for proteins that turn genes on and off. Like structural genes, regulatory sequences are inherited; however, they are not com monly referred to as genes. Forty to forty-five percent of our genome is made up of short sequences that are repeated, sometimes hundreds of times. There are numerous forms of this “repetitive DNA”, and a few have known functions, such as stabilizing the chromosome structure or inactivating one of the two X chromosomes in developing females, a process called X-inactivation. The most highly repeated sequences found so far in mammals are called “satellite DNA” because their unusual composition allows them to be easily separated from other DNA. These sequences are associated with chromosome structure and are found at the centromeres (or centers) and telomeres (ends) of chromosomes.
12. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Although they do not play a role in the coding of proteins, they do play a significant role in chromosome structure, duplication, and cell division. In February 2001, two largely independent draft versions of the human genome were published. Both studies estimated that there are 30,000 to 40,000 genes in the human genome, roughly one-third the number of previous estimates. More recently scientists estimated that there are less than 30,000 human genes. However, we still have to make guesses at the actual number of genes, because not all of the human genome sequence is annotated
and
not
all
of
the
known
sequence
has
been
assigned a particular position in the genome. Only the
a
small
human
percentage
genome
of
becomes
the an
3
billion
expressed
bases
gene
in
prod-
uct. However, of the approximately 1 percent of our genome that is expressed, 40 percent is alternatively spliced
to
produce
multiple
proteins
from
a
single
gene. Alternative splicing refers to the cutting and pasting of the primary mRNA transcript into various combinations of mature mRNA. Therefore the one gene– one protein theory, originally framed as “one gene–one enzyme”, does not precisely hold.
INTRODUCTION 13.
With so much DNA in the genome, why restrict transcription to a tiny portion, and why make that tiny portion work overtime to produce many alternate transcripts? This process may have evolved as a way to limit the deleterious effects of mutations. Genetic mutations occur randomly, and the effect of a small number of mutations on a single gene may be minimal. However,
an
individual
having
many
genes
each
with
small changes could weaken the individual, and thus the species. On the other hand, if a single mutation affects several alternate transcripts at once, it is more likely that the effect will be devastating the individual may not survive to contribute to the next generation. Thus, alternate transcripts from a single gene could reduce the of
a mutated gene
TELOMERES
CENTROMERES
TELOMERES
14. THE BLUEPRINT OF MAN: THE HUMAN GENOME
PARENTAL DNA MOLECULES
A T
C G
G C
T A
A T
C G
G C
T A
A T
C G
G C
T A
A
C
G
T
A
C
G
T
A
C
G
T
T
G
C
A
T
G
C
A
T
G
C
A
RECOMBINATION INTERMEDIATE
A T
C G
G C
T A
A T
C G
G C
T A
A T
C G
G C
T A
A
C
G
T
A
C
G
T
A
C
G
T
T
G
C
A
T
G
C
A
T
G
C
A
RECOMBINATS
A T
C G
G C
T A
A T
C G
G C
T A
A T
C G
G C
T A
A
C
G
T
A
C
G
T
A
C
G
T
T
G
C
A
T
G
C
A
T
G
C
A
Recombination involves pairing between complementary strands of two parental duplex DNAs (top and middle panel). This process creates a stretch of hybrid DNA (bottom panel) in which the single strand of one duplex is paired with its complement from the other duplex.
INTRODUCTION 15.
The estimated number of genes for humans, less than 30,000, is not so different from the 25,300 known genes of Arabidopsis thaliana, commonly called mustard grass. Yet, we appear, at least at first glance, to be a far more complex organism. A person may wonder how this increased complexity is achieved. One answer lies in the regulatory system that turns genes on and off. This system also precisely controls the amount of a gene product that is produced and can further modify the product after it is made. This exquisite control requires multiple regulatory input points. One very efficient point occurs at transcription, such that an mRNA is produced only when a gene product is needed. Cells also regulate gene expression by posttranscriptional modification; by allowing only a subset of the mRNAs to go on to translation; or by restricting translation of specific mRNAs to only when the product is needed. At other levels, cells regulate gene expression through DNA folding, chemical modification of the nucleotide bases, and intricate “feedback mechanisms” in which some of the gene’s own protein product directs the cell to cease further protein production. Transcription is the process whereby RNA is made from DNA. It is initiated when an enzyme, RNA polymerase, binds to a site on the DNA called a promoter sequence. In most cases, the polymerase is aided by a group of proteins called “transcription factors” that perform specialized functions, such as DNA sequence recognition and regulation of the polymerase’s enzyme activity. Other regulatory sequences include activators, repressors, and enhancers. These sequences can be cis-acting (affecting genes that are adjacent to the sequence) or trans-acting (affecting expression of the gene from a distant site), even on another chromosome.
16. THE BLUEPRINT OF MAN: THE HUMAN GENOME
An example of transcriptional control occurs in the family of genes responsible for the production of globin. Globin is the protein that complexes with the ironcontaining heme molecule to make hemoglobin. Hemoglobin transports oxygen to our tissues via red blood cells. In the adult, red blood cells do not contain DNA for making new globin; they are ready-made with all of the hemoglobin they will need. During the first few weeks of life, embryonic globin is expressed in the yolk sac of the egg. By week five of gestation, globin is expressed in early liver cells. By birth, red blood cells are being produced, and globin is expressed in the bone marrow. Yet, the globin found in the yolk is not produced from the same gene as is the globin found in the liver or bone marrow stem cells. In fact, at each stage of development, different globin genes are turned on and off through a process of transcriptional regulation called “switching�. To further complicate matters, globin is made from two different
protein
chains:
an
alpha-like
chain
coded
for on chromosome 16; and a beta-like chain coded for on chromosome 11. Each chromosome has the embryonic, fetal, and adult form lined up on the chromosome in a sequential order for developmental expression. The developmentally regulated transcription of globin is controlled by a number of cis-acting DNA sequences, and although there remains a lot to be learned about the interaction of these sequences, one known control sequence is an enhancer called the Locus Control Region (LCR). The LCR sits far upstream on the sequence and controls the alpha genes on chromosome 16.
INTRODUCTION 17.
It may also interact with other factors to determine which alpha gene is turned on. Thalassemias are a group of diseases characterized by the absence or decreased production of normal globin, and thus hemoglobin, leading to decreased oxygen in the system. There are alpha and beta thalassemias, defined by the defective gene, and there are variations of each of these, depending on whether the embryonic, fetal, or adult forms are affected and/or expressed. Although there is no known cure for the thalassemias, there are medical treatments that have been developed based on our current understanding of both gene regulation and cell differentiation. Treatments include blood transfusions, iron chelators, and bone marrow transplants. With continuing research in the areas of gene regulation and cell differentiation, new and more effective treatments may soon be on the horizon, such as the advent of gene transfer therapies. Sequences that are important in regulating transcription do not necessarily code for transcription factors or other proteins. Transcription can also be regulated by subtle variations in DNA structure and by chemical changes in the bases to which transcription factors bind. As stated previously, the chemical properties of the four DNA bases differ slightly, providing each base
with
unique
opportunities
to
chemically
react
with other molecules. One chemical modification of DNA, called methylation, involves the addition of a methyl group or (-CH3) for short.
18. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Methylation
frequently
occurs
at
cytosine
residues
that are preceded by guanine bases, oftentimes in the vicinity of promoter sequences. The methylation status of
DNA
ity,
often
where
correlates
inactive
with
genes
its
tend
functional
to
be
more
activheavily
methylated. This is because the methyl group serves to inhibit transcription by attracting a protein that binds specifically to methylated DNA, thereby interfering with polymerase binding. Methylation also plays an important role in genomic imprinting, which occurs when both maternal and paternal alleles are present but only one allele is expressed while the other remains inactive. Another way to think of genomic imprinting is as “parent of origin differences�
in
Considerable
the
expression
intrigue
of
surrounds
inherited
the
effects
traits. of
DNA
methylation, and many researchers are still working to this day to to unlock the mystery behind this concept.
INTRODUCTION 19.
OUR GENES
02 We’ve all heard the axiom “birds of a feather flock together,” and the saying seems to be true for humans: we tend to associate with people that we resemble. Obviously,
this
kind
of
similarity
can
result
from
social influences, but can it also extend to our genes? A group of scientists attempts to answer this question in a new study in PNAS, but their findings have been met with a healthy dose of criticism. Two long-term datasets were used as sources of data for this study, the National Longitudinal Study of Adolescent Health (known as Add Health), and the Framingham Heart Study Social Network (abbreviated as FHS-Net). These studies investigate some of the causes and risk factors associated with health, including social influences. Therefore, each of these studies asks respondents for information about some of their friends. During their time in the study, Add Health participants and their friends donated DNA samples in the form of saliva. The researchers had information about six genes that are known to influence social behavior in humans: DRD2, DRD4, CYP2A6, MAOA, SLC6A3, and SLC6A4.
These
genes are related to traits such as risk-taking, openness, and aggressiveness.
For the purpose of the PNAS
study, only unrelated friends were included.
21.
Analysis
showed
a
significant
positive
correlation
between friends’ genotypes for DRD2, and a significant negative
correlation
for
CYP2A6.
In
layman’s
terms,
this means that friends are likely to associate with people who are similar to them in terms of their DRD2 genotypes, and different when it comes to their CYP2A6 genotypes. The researchers then checked these findings with data from FHS-Net, and found similar results. In the discussion, the authors proposed an interesting possible mechanism for DRD2 clusters in social networks. Previous studies have linked DRD2 to alcoholism, and it is possible that drinkers gravitate toward social scenes where booze is present, whereas nondrinkers are more likely to congregate in places where alcohol isn’t abundant. CYP2A6 has been associated with openness, and a mechanism driving the negative correlation between CYP2A6 genotypes is less obvious. The
authors
suggest
“metagenomic,”
since
that our
humans
may
friends’
be
considered
genotypes
likely
affect many aspects of our own lives, from what we consume
to
how
we
communicate.
Additionally,
these
genetic correlations have interesting implications for the
recent
field
of
“indirect
genetic
effects,”
in
which one individual’s phenotype is influenced by the genes of another individual. If this study’s findings are accurate, the social environment could be just as important a source of selection as the physical environment.
22. THE BLUEPRINT OF MAN: THE HUMAN GENOME
However, this publication has been met with some significant criticisms. First, the authors picked just six genes for which to study variation, instead of doing an unbiased genome wide association study. Additionally, not as much genetic data was available in FHS-Net compared to Add Health, so their verification of the Add Health correlations wasn’t truly a “replication,” as the authors claim. Instead, the researchers “imputed” the genotypes, or inferred them from the available data. It’s not that critics don’t believe that these correlations exist; they’re just not sure that this study found enough solid evidence for the broad claims the authors make.
Some of the discussion sections reads
like pure speculation, which is compounded by the lessthan-convincing methods. To make matters worse, the media has blown this finding way out of proportion, making all sorts of exaggerated claims about the power of genes to make or break friendships.
Expression
of
μOR
in
vitro
and
in
vivo.
His-peptide
(top)
and
μOR
(Center) labels in cultured DRG neurons. Cells were examined under a confocal microscope at great magnification
OUR GENES 23.
The D4 dopamine receptor (DRD4) locus may be a model system
for
understanding
the
relationship
between
genetic variation and human cultural diversity. It has been the subject of intense interest in psychiatry, because bearers of one variant are at increased risk for attention deficit hyperactivity disorder (ADHD). A survey of world frequencies of DRD4 alleles has shown striking differences among populations, with population differences greater than those of most neutral markers. In this issue of PNAS Ding et al
provide a
detailed molecular portrait of world diversity at the DRD4 locus. They show that the allele associated with ADHD has increased a lot in frequency within the last few thousands to tens of thousands of years, although it has probably been present in our ancestors for hundreds of thousands or even millions of years. There is a repeat polymorphism in exon III of the gene at which a variable number of 48-bp motifs can occur. The common and probable ancestral allele has four repeats (4R), whereas the allele associated with novelty seeking and ADHD has seven (7R). Ding et al. (3) show that there are essentially two allele classes at the locus: most variants are easily derived from 4R by simple recombinations or mutations, whereas 7R differs from 4R by a minimum of seven events. This means either that the coalescence of the 4R and 7R allele classes is ancient or that 7R was generated by an extremely improbable mutation and recombination event more recently. At any rate the allelic diversity within the 4R class and the lack of linkage disequilibrium with markers around it show that it has been the common variant in our species for a long time. 7R alleles, with little diversity within the class and high linkage disequilibrium with neighboring markers,
24. THE BLUEPRINT OF MAN: THE HUMAN GENOME
must have a recent common ancestor. If the 7R class has been rare for a long time, coalescence within the class could be recent, whereas coalescence between the 7R and 4R classes could be ancient. This curious pattern, an allele that has been in the population for a very long time at a very low frequency, suggests that some kind of balancing selection has been maintaining 7R, but preventing it from becoming com mon until recently. An alternative is that 7R was incorporated from another hominid species during the expansion of modern humans. It is clear from association studies with childhood ADHD that the allele affects behavior, but it is not so clear what it does. The recent positive selection for 7R denies that it causes pathology in any evolutionary sense. Adaptive genetic variants cannot be the cause of fitness-reducing mental illness unless that illness in some way is a product of recent environmental change or is compensated by other increases in fitness as in classical balanced polymorphisms. Natural selection would have purged com mon alleles that induce crippling disorders such as schizophrenia in premodern conditions. On the other hand, it is entirely possible that some psychological traits are adaptive yet, because they are irritating or undesirable, are called mental illness. There is an important distinction to be made between fitness-reducing mental illnesses, behaviors that can never have been adaptive, and psychological syndromes that we happen not to like. Another hint that 7R does not cause pathology in an evolutionary sense is the finding that children diagnosed with ADHD often show specific neurological deficits, whereas children bearing 7R and diagnosed with ADHD do not show such deficits.
OUR GENES 25.
Neurotransmitter receptors are especially good candidates for genetic variation that modulates behavior. The polymorphisms we see in dopamine metabolism all seem to be in receptor and transporter molecules rather than in the dopamine molecule itself. Dopamine is involved in a wide spectrum of functions, acting through five different receptors. Any change that affected all of those systems at once would almost certainly be maladaptive. A mutation that affects a smaller range of systems, by changing a single dopamine receptor or transporter, has a better chance. The same principle may apply to other neurotransmitters, which typically have a number of different receptors. We know that serotonin, for example, has 14. Evidence of adaptive genetic variation affecting human psychology should be of interest to evolutionary psychologists, particularly because they have argued that it cannot exist. For example Tooby and Cosmides claim that there are only two kinds of human nature, male and female, and that apparent variation in personality is either facultative response to environmental cues or nonadaptive. They argue that complex adaptations require the coordinated action of many genes, and that if individuals of a sexually reproducing species differed in the genes required to build these adaptations, sexual reproduction would disrupt the necessary gene complexes. They also argue that there has been insufficient time since the advent of modern humanity for the development of significant novel mental adaptations.
26. THE BLUEPRINT OF MAN: THE HUMAN GENOME
As has been pointed out by D. Wilson, their arguments are unconvincing. They imply that no sexual species should have heritable adaptive morphs, whereas in fact many examples are known, such as the male morphs of the side-blotched lizard Uta stansburiana that act out a scissors–paper–rock game, in which morph A beats morph B, morph B beats morph C, and morph C beats morph A. The “insufficient time” argument is probably just incorrect;
considering
the
measured
heritability
of
psychological traits and the expected response to mild selection over hundreds of generations, it is instead surprising that we seem to have changed little over historical time. Even if 40 or 50 thousand years were too short a time for the evolutionary development of a truly new and highly complex mental adaptation, which is by no means certain, it is certainly long enough for some groups to lose such an adaptation, for some groups to develop a highly exaggerated version of an adaptation, or for changes in the triggers or timing of that adaptation to evolve. That is what we see in domesticated dogs, for example, who have entirely lost certain key behavioral adaptations of wolves such as paternal investment. Other wolf behaviors have been exaggerated or distorted. A border collie’s herding is recognizably derived from wolf behaviors, as is a terrier’s aggressiveness, but this hardly means that collies, wolves, and terriers are all the same. Paternal investment may be particularly fragile and easily lost in mam mals, because parental investment via internal gestation and lactation is engineered into females but not males.
OUR GENES 27.
If, as seems likely, the basis for selection at DRD4 has been the effect of the 7R allele on behavior, the selective advantage of 7R is probably frequency-dependent rather than absolute. Even a small absolute advantage would eventually make this allele go to fixation or nearly so. In the case of simple directional selection the new advantageous allele takes a long time to reach an appreciable frequency, then approaches fixation in a much smaller number of generations. There would be only a relatively small time interval in which one would see both new and old alleles at intermediate frequency, and the new allele would be most common near its place of origin. What we see today looks much more likely to be the consequence of some kind of frequency-dependent selection, in which the frequency of the new allele increases up to a certain point and then stabilizes. These selective forces must not be the same in all populations,
because
the
7R
allele
is
quite
com mon
in some populations (South American Indians), exists at intermediate frequencies in others (Europeans and Africans), and is rare to nonexistent in yet others (East Asia, !Kung Bushmen). When the advantage of an allele is frequency-dependent, two or more different alleles can persist indefinitely at polymorphic frequencies. In evolutionary game theory, such a stable mix is known as an evolutionary stable strategy or ESS. The most famous example is the hawk窶電ove game, in which doves always yield to hawks, whereas aggressive hawks incur fitness costs from fighting other hawks, costs that increase as hawks become more com mon. If the cost of fighting is greater than the gains from intimidation, hawks and doves coexist. Consider, for example, the game of chicken.
28. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Because the prominent phenotypic effects of 7R are in males, we need to ask what is the niche in human societies for males who are energetic, impulsive (i.e., unpredictable),
and
noncompliant?
Whereas
tests
of
hypotheses ought to be careful and conservative, generation of hypotheses ought to be speculative and freeranging. There is a tradition of caution approaching self-censorship
in
discussions
of
human
biological
diversity, but we will break that tradition in what follows. There are at least two hypotheses to explain the world distribution of 7R. The first, due to Chen et al. , is that it is a dispersal morph. They argue that the allele increases the likelihood that its bearers migrate. As modern humans colonized the earth, bearers of 7R were more likely to be movers so that populations far away from their ancient places of origin have, in effect, concentrated 7R. The world high frequencies in South America reflect the great distance of South America from
the
original
human
homeland:
similarly
migra-
tions from China led to the presence of the allele in southeast Asian and Pacific populations, whereas none remained in China. This hypothesis does not account for the apparent long persistence at low frequency of 7R in human ancestors before the population movements occurred that were responsible for population frequency differences. The second hypothesis is that 7R bearers enjoy a reproductive advantage in male-competitive societies, either in competition for food as children or in face-toface and local group male competition. Societies in which this advantage would be present were rare before the spread of agriculture, but com mon after it. This hypothesis requires a brief review of human ecological history. We acknowledge our abuse of detail and of ethnographic diversity in the sum mary that follows.
OUR GENES 29.
Among most hunting and gathering people both sexes work to provision offspring; in particular males allocate much of their reproductive effort to parental effort. These “dad” societies contrast with “cad” societies in which males allocate reproductive effort to mating effort, that is to competition with other males for access to females. The general theory of the “war between the sexes” is described by Dawkins, whereas the human version of it is described in a landmark paper by Whiting and Whiting and
elaborated
by
others.
In
general,
in
societies
where males are dads, men and women live together with their offspring; they eat and sleep together; the males are not particularly gaudy; and they do not make fancy weapons and art. Pair bonds are durable, divorce rates are low; and nuclear families are the primary context for care of children.
Cells examined under a confocal microscope 2 weeks after they were infected with crude rAAV-μOR. The soma and processes of the neuron in the center were intensely labeled with antibodies.
30. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Most foraging people are dad societies, the exceptions being cases where there are periodic rich resource streams like salmon runs on the North American northwest coast. There is some controversy in the literature about whether apparently parental males in dad societies are really parental or whether they are instead engaged in many subtle forms of male competition and mate guarding. At any rate the end result is that men work and provide food to children. Among low density gardeners, on the other hand, the typical pattern is that most of the gardening work is done by women, freeing men from subsistence responsibilities. Boserup
calls these “female farming sys-
tems,� a euphemism for societies where men live off women.
Freed
from
domestic
responsibility,
men
can
occupy their time decorating themselves and planning the next raid. Widespread systems of such societies, as in highland New Guinea or lowland South America, seem to be stable as the chronic raiding and warfare suppress population growth. But powerful polities can break the stability, suppressing local male coalitional violence, leading to population growth, agricultural intensification, and ultimately to males again working at farming to provision their families. Thus there are drab males working at subsistence among foragers and, at the other end of the density continuum, among intensive agriculturalists and peasants. In between we see decorative competitive males engaged in local male coalitional violence. There is an unsettling parallel with the dad males of the working class in contemporary industrial societies and the cad males of the underclass.
OUR GENES 31.
There are several ways in which female farming systems might have opened up new behavioral niches or greatly expanded
existing
ones.
Females
in
these
societies
often have children by more than one male, because pair bonds are fragile. Hence siblings are more likely to be half rather than full siblings. This increases the extent to which individuals in the same family have genetic conflicts of interest; it exacerbates such conflicts spring,
between
and
parents,
between
between
siblings.
parent
Children
and
that
off-
are
in
some sense “difficult” may have a survival advantage in these conditions. One study found that a sample of difficult infants survived a drought at a higher rate than easy infants. Further,
whenever
paternal
investment
decreases,
paternal genetic quality becomes relatively more important. The 7R allele may facilitate risky “show-off” behaviors that exhibit good genes: cad societies are characterized by male posturing and machismo, “protest masculinity” in an older terminology. Most obviously, energetic unpredictable adult males may enjoy a competitive advantage in the face-to-face male competition and violence that characterize these groups. Chagnon has shown that there is a fitness payoff to warriors and those with such a reputation among the Yanomamo of Amazonia.
32. THE BLUEPRINT OF MAN: THE HUMAN GENOME
In a preagricultural world of hunters and gatherers, males with an advantage in face-to-face competition would have enjoyed a limited advantage, perhaps being favored only in those niches where periodic resource abundance allowed male competition to flourish elsewhere new ecological circumstances led to transient plenty as on the Great Plains after the introduction of the horse. Otherwise they would have been disadvantaged, especially because there were more of them. But the advent of gardening technology freed males from subsistence work and, in the new social environment of display, aggression, and male coalitional violence bearers of 7R may have flourished, accounting for the evidence given by Ding et al.of the recent increase in frequency of the allele. Finally, as polities emerged that suppressed the system of local anarchy, population growth occurred, land became scarce; agriculture became more labor intensive; and men were again forced into work to provision their families. The archetypes in the literature of anthropology of dad hunter-gatherers are !Kung Bushmen of southern Africa; the archetypes of labor intensive farmers are east Asians; and the archetypes of local anarchy are Indians of lowland South America like the Yanomamo. It is probably no accident that two of the best known ethnographies of the twentieth century are titled “The Harmless People,” about the !Kung who have few or no 7R alleles, and “The Fierce People,” about the Yanomamo with a high frequency of 7R.
OUR GENES 33.
DNA
03 Life is specified by genomes. Every organism, including humans, has a genome that contains all of the biological information needed to build and maintain a living example of that organism. The biological information contained in a genome is encoded in its deoxyribonucleic acid (DNA) and is divided into discrete units called genes. Genes code for proteins that attach to the genome at the appropriate positions and switch on a series of reactions called gene expression. Inside each of our cells lies a nucleus, a membranebounded region that provides a sanctuary for genetic information. The nucleus contains long strands of DNA that encode this genetic information. A DNA chain is made up of four chemical bases: adenine (A) and guanine (G), which are called purines, and cytosine (C) and thymine (T), referred to as pyrimidines. Each base has a slightly different composition, or combination of oxygen, carbon, nitrogen, and hydrogen. In a DNA chain, every base is attached to a sugar molecule (deoxyribose) and a phosphate molecule, resulting in a nucleic acid or nucleotide. Individual nucleotides are linked through the phosphate group, and it is the precise order, or sequence, of nucleotides that determines the product made from that gene. The DNA that constitutes a gene is a double-stranded molecule consisting of two chains running in opposite directions.
35.
DNA
consists
of
two
long
polymers
of
simple
units
called nucleotides, with backbones made of sugars and phosphate
groups
joined
by
ester
bonds.
These
two
strands run in opposite directions to each other and are therefore anti-parallel. Attached to each sugar is one of four types of molecules called nucleobases (informally, bases). It is the sequence of these four nucleobases along the backbone that encodes information. This information is read using the genetic code, which specifies the sequence of the amino acids within proteins. The code is read by copying stretches of DNA into the related nucleic acid RNA in a process called transcription. Within cells, DNA is organized into long structures called chromosomes. During cell division these chromosomes are duplicated in the process of DNA replication, providing each cell its own complete set of chromosomes. Eukaryotic organisms (animals, plants, fungi, and protists) store most of their DNA inside the cell nucleus and some of their DNA in organelles, such as mitochondria or chloroplasts. In contrast, prokaryotes (bacteria and archaea) store their DNA only in the cytoplasm. Within the chromosomes, chromatin proteins such as histones compact and organize DNA. These compact structures guide the interactions between DNA and other proteins, helping control which parts of the DNA are transcribed
36. THE BLUEPRINT OF MAN: THE HUMAN GENOME
DNA is a long polymer made from repeating units called nucleotides. As first discovered by James D. Watson and Francis Crick, the structure of DNA of all species comprises two helical chains each coiled round the same axis, and each with a pitch of 34 Ångströms (3.4 nanometres) and a radius of 10 Ångströms (1.0 nanometres). According to another study, when measured in a particular solution, the DNA chain measured 22 to 26 Ångströms wide (2.2 to 2.6 nanometres), and one nucleotide unit measured 3.3 Å (0.33 nm) long. Although each individual repeating unit is very small, DNA polymers can be very large molecules containing millions of nucleotides. For instance,
the
largest
human
chromosome,
chromosome
number 1, is approximately 220 million base pairs long. In living organisms DNA does not usually exist as a single molecule, but instead as a pair of molecules that are held tightly together. These two long strands entwine like vines, in the shape of a double helix. The nucleotide repeats contain both the segment of the backbone of the molecule, which holds the chain together, and a nucleobase, which interacts with the other DNA strand in the helix. A nucleobase linked to a sugar is called a nucleoside and a base linked to a sugar and one or more phosphate groups is called a nucleotide. Polymers comprising multiple linked nucleotides (as in DNA) is called a polynucleotide.
DNA 37.
The DNA double helix is stabilized primarily by two forces: hydrogen bonds between nucleotides and basestacking interactions among the aromatic nucleobases. In the aqueous environment of the cell, the conjugated π bonds of nucleotide bases align perpendicular to the axis of the DNA molecule, minimizing their interaction with the solvation shell and therefore, the Gibbs free energy. The four bases found in DNA are adenine (abbreviated A), cytosine (C), guanine (G) and thymine (T). These four bases are attached to the sugar/phosphate to form the complete nucleotide. Twin helical strands form the DNA backbone. Another double helix may be found by tracing the spaces, or grooves, between the strands. These voids are adjacent to the base pairs and may provide a binding site. As the strands are not directly opposite each other, the grooves are unequally sized. One groove, the major groove, is 22 Å wide and the other, the minor groove, is 12 Å wide. The narrowness of the minor groove means that the edges of the bases are more accessible in the major grooves.
The pairing of the DNA and RNA. A strand of RNA is produced from a strand of DNA much the same as during DNA replication but in this case it is catalysed by RNA polymerase using rNTPs (ribonucleotide triphosphates). No primer is required. The synthesis occurs in the same direction as for DNA replication
and pyrophosphates are still released when the ribo-
nucleotide triphosphates bind to the backbone.
38. THE BLUEPRINT OF MAN: THE HUMAN GENOME
As a result, proteins like transcription factors that can bind to specific sequences in double-stranded DNA usually make contacts to the sides of the bases exposed in the major groove. This situation varies in unusual conformations of DNA within the cell, but the major and minor grooves are always named to reflect the differences in size that would be seen if the DNA is twisted back into the ordinary B form. In a DNA double helix, each type of nucleobase on one strand normally interacts with just one type of nucleobase on the other strand. This is called complementary base pairing. Here, purines form hydrogen bonds to pyrimidines, with A bonding only to T, and C bonding only to G. This arrangement of two nucleotides binding together across the double helix is called a base pair. As hydrogen bonds are not covalent, they can be broken and rejoined relatively easily. The two strands of DNA in a double helix can therefore be pulled apart like a zipper, either by a mechanical force or high temperature.As a result of this complementarity, all the information in the double-stranded sequence of a DNA helix is duplicated on each strand, which is vital in DNA replication. Indeed, this reversible and specific interaction between complementary base pairs is critical for all the functions of DNA in living organisms.
DNA 39.
The two types of base pairs form different numbers of hydrogen bonds, AT forming two hydrogen bonds, and GC forming three hydrogen bonds (see figures, left). DNA with high GC-content is more stable than DNA with low GC-content. Although it is often stated that this is due to the added stability of an additional hydrogen bond, this is incorrect. DNA with high GC-content is more stable due to intra-strand base stacking interactions. As noted above, most DNA molecules are actually two polymer
strands,
bound
together
in
a
helical
fash-
ion by noncovalent bonds; this double stranded structure (dsDNA) is maintained largely by the intrastrand base
stacking
interactions,
which
are
stongest
for
G,C stacks. The two strands can come apart - a process known as melting - to form two ss DNA molecules. Melting occurs when conditions favor ssDNA; such conditions are high temperature, low salt and high pH (low pH also melts DNA, but since DNA is unstable due to acid depurination, low pH is rarely used). The stability of the dsDNA form depends not only on the GC-content ( G,C basepairs) but also on sequence (since stacking is sequence specfic) and also length (longer molecules are more stable). The stability can be measured in various ways; a common way is the “melting temperature�, which is the temperature at which 50% of the ds molecules are converted to ss molecules; melting temperature is dependent on ionic strength and the concentration of DNA.
40. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Long DNA helices with a high GC-content have stronger-interacting strands, while short helices with high AT content have weaker-interacting strands.In biology, parts of the DNA double helix that need to separate easily, such as the TATAAT Pribnow box in some promoters, tend to have a high AT content, making the strands easier to pull apart. In the laboratory, the strength of this interaction can be measured by finding the temperature required to break the hydrogen bonds, their melting temperature (also called Tm value). When all the base pairs in a DNA double helix melt, the strands separate and exist in solution as two entirely independent molecules. These single-stranded DNA molecules (ssDNA) have no single common shape, but some conformations are more stable.
The close up of genes being multiplied and divided.
DNA 41.
A DNA sequence is called “sense” if its sequence is the same as that of a messenger RNA copy that is translated into protein.The sequence on the opposite strand is called the “antisense” sequence. Both sense and antisense sequences can exist on different parts of the same strand of DNA (i.e. both strands contain both sense and antisense sequences). In both prokaryotes and eukaryotes, antisense RNA sequences are produced, but the functions of these RNAs are not entirely clear.One proposal is that antisense RNAs are involved in regulating gene expression through RNA-RNA base pairing. A few DNA sequences in prokaryotes and eukaryotes, and more in plasmids and viruses, blur the distinction between sense and antisense strands by having overlapping genes.In these cases, some DNA sequences do double duty, encoding one protein when read along one strand, and a second protein when read in the opposite direction along the other strand. In bacteria, this overlap may be involved in the regulation of gene transcription,while in viruses, overlapping genes increase the amount of information that can be encoded within the small viral genome.
42. THE BLUEPRINT OF MAN: THE HUMAN GENOME
DNA exists in many possible conformations that include A-DNA, B-DNA, and Z-DNA forms, although, only B-DNA and Z-DNA have been directly observed in functional organisms. The conformation that DNA adopts depends on the hydration level, DNA sequence, the amount and direction of supercoiling, chemical modifications of the bases, the type and concentration of metal ions, as well as the presence of polyamines in solution. The first published reports of A-DNA X-ray diffraction patterns— and also B-DNA used analyses based on Patterson
transforms
that
provided
only
a
limited
amount of structural information for oriented fibers of DNA. An alternate analysis was then proposed by Wilkins et al in 1953, for the in vivo B-DNA X-ray diffraction/scattering
patterns
of
highly
hydrated
DNA
fibers in terms of squares of Bessel functions. In the same journal, James D. Watson and Francis Crick presented their molecular modeling analysis of the DNA X-ray diffraction patterns to suggest that the structure was a double-helix. Although the `B-DNA form’ is most com mon under the conditions found in cells, it is not a well-defined conformation but a family of related DNA conformations that occur at the high hydration levels present in living cells. Their corresponding X-ray diffraction and scattering patterns are characteristic of molecular paracrystals with a significant degree of disorder.
DNA 43.
A DNA helix usually does not interact with other segments of DNA, and in human cells the different chromosomes even occupy separate areas in the nucleus called “chromosome territories”. This physical separation of different chromosomes is important for the ability of DNA to function as a stable repository for information, as one of the few times chromosomes interact is during chromosomal crossover when they recombine. Chromosomal crossover is when two DNA helices break, swap a section and then rejoin. Recombination allows chromosomes to exchange genetic information and produces new combinations of genes, which increases the efficiency of natural selection and can be important in the rapid evolution of new proteins. Genetic recombination can also be involved in DNA repair, particularly in the cell’s response to double-strand breaks. The
most
common
form
of
chromosomal
crossover
is
homologous recombination, where the two chromosomes involved share very similar sequences. Non-homologous recombination can be damaging to cells, as it can produce
chromosomal
translocations
and
genetic
abnor-
malities. The recombination reaction is catalyzed by enzymes known as recombinases, such as RAD51.
44. THE BLUEPRINT OF MAN: THE HUMAN GENOME
DNA contains the genetic information that allows all modern living things to function, grow and reproduce. However, it is unclear how long in the 4-billion-year history of life DNA has performed this function, as it has been proposed that the earliest forms of life may have used RNA as their genetic material.[99][111] RNA may have acted as the central part of early cell metabolism as it can both transmit genetic information and carry out catalysis as part of ribozymes. This ancient RNA world where nucleic acid would have been used for both catalysis and genetics may have influenced the evolution of the current genetic code based on four nucleotide bases. This would occur, since the number of different bases in such an organism is a trade-off between a small number of bases increasing replication accuracy and a large number of bases increasing the catalytic efficiency of ribozymes. However, there is no direct evidence of ancient genetic systems, as recovery of DNA from most fossils is impossible. This is because DNA will survive in the environment for less than one million years and slowly degrades into short fragments in solution. Claims for older DNA have been made, most notably a report of the isolation of a viable bacterium from a salt crystal 250 million years old,[ but these claims are controversial.
DNA 45.
RNA
04 Ribonucleic acid or RNA, is one of the three major macromolecules that are essential for all known forms of life. Like DNA, RNA is made up of a long chain of components called nucleotides. Each nucleotide consists of a nucleobase (sometimes called a nitrogenous base), a ribose sugar, and a phosphate group. The sequence of nucleotides allows RNA to encode genetic information. All cellular organisms use messenger RNA to carry the genetic information that directs the synthesis of proteins. In addition, some viruses use RNA instead of DNA as their genetic material; perhaps a reflection of the suggested key role of RNA in the evolutionary history of life on Earth. Like proteins, some RNA molecules play an active role in cells by catalyzing biological reactions, controlling gene expression, or sensing and com municating responses to cellular signals. One of these active processes is protein synthesis, a universal function whereby mRNA molecules direct the assembly of proteins on ribosomes. This process uses transfer RNA (tRNA) molecules to deliver amino acids to the ribosome, where ribosomal RNA links amino acids together to form proteins.
47.
The chemical structure of RNA is very similar to that of DNA, with two differences: (a) RNA contains the sugar ribose, while DNA contains the slightly different sugar deoxyribose (a type of ribose that lacks one oxygen atom), and (b) RNA has the nucleobase uracil while DNA contains thymine. Uracil and thymine have similar base-pairing properties. Unlike DNA, most RNA molecules are single-stranded. Single-stranded RNA molecules adopt very complex threedimensional structures, since they are not restricted to the repetitive double-helical form of double-stranded DNA. RNA is made within living cells by RNA polymerases, enzymes that act to copy a DNA or RNA template into a new RNA strand through processes known as transcription or RNA replication, respectively. RNA and DNA are both nucleic acids, but differ in three main ways. First, unlike DNA, which is, in general, double-stranded, RNA is a single-stranded molecule in many of its biological roles and has a much shorter chain of nucleotides. Second, while DNA contains deoxyribose, RNA contains ribose (in deoxyribose there is no hydroxyl group attached to the pentose ring in the 2’ position). These hydroxyl groups make RNA less stable than DNA because it is more prone to hydrolysis.
This is a Structure of a ham merhead ribozyme, a ribozyme that cuts strants of RNA, damaging it.
48. THE BLUEPRINT OF MAN: THE HUMAN GENOME
the
Like DNA, most biologically active RNAs, including mRNA, tRNA, rRNA, snRNAs, and other non-coding RNAs, contain self-complementary sequences that allow parts of the RNA to fold and pair with itself to form double helices. Analysis of these RNAs has revealed that they are highly structured. Unlike DNA, their structures do not consist of long double helices but rather collections of short helices packed together into structures akin to proteins. In this fashion, RNAs can achieve chemical catalysis, like enzymes.[4] For instance, determination of the structure of the ribosome—an enzyme that catalyzes peptide bond formation—revealed that its active site is composed entirely of RNA. Each nucleotide in RNA contains a ribose sugar, with carbons numbered 1’ through 5’. A base is attached to the 1’ position, in general, adenine (A), cytosine (C), guanine (G), or uracil (U). Adenine and guanine are purines, cytosine, and uracil are pyrimidines. A phosphate group is attached to the 3’ position of one ribose and the 5’ position of the next. The phosphate groups have a negative charge each at physiological pH, making RNA a charged molecule. The bases may form hydrogen bonds between cytosine and guanine, between adenine and uracil and between guanine and uracil. However, other interactions are possible, such as a group of adenine bases binding to each other in a bulge, or the GNRA tetraloop that has a guanine–adenine base-pair. An important structural feature of RNA that distinguishes it from DNA is the presence of a hydroxyl group at the 2’ position of the ribose sugar. The presence of this functional group causes the helix to adopt the A-form geometry rather than the B-form most commonly observed in DNA.
RNA 49.
RNA is transcribed with only four bases (adenine, cytosine, guanine and uracil), but these bases and attached sugars can be modified in numerous ways as the RNAs mature. Pseudouridine, in which the linkage between uracil and ribose is changed from a C–N bond to a C–C bond, and ribothymidine (T) are found in various places (the most notable ones being in the TC loop of tRNA). Another notable modified base is hypoxanthine, a deaminated adenine base whose nucleoside is called inosine (I). Inosine plays a key role in the wobble hypothesis of the genetic code. There are nearly 100 other naturally occurring modified nucleosides,
of
which
pseudouridine
and
nucleosides
with 2’-O-methylribose are the most com mon. The specific roles of many of these modifications in RNA are not fully understood. However, it is notable that, in ribosomal RNA, many of the post-transcriptional modifications occur in highly functional regions, such as the peptidyl transferase center and the subunit interface, implying that they are important for normal function. The functional form of single stranded RNA molecules, just like proteins, frequently requires a specific tertiary structure. The scaffold for this structure is provided
by
secondary
structural
elements
that
are
hydrogen bonds within the molecule. This leads to several recognizable “domains” of secondary structure like hairpin loops, bulges, and internal loops. Since RNA is charged, metal ions such as Mg2+ are needed to stabilise many secondary and tertiary structures.
50. THE BLUEPRINT OF MAN: THE HUMAN GENOME
2
3
C5
C5
B
B
3
C5
2
2
C5
3
B
B
2
3
C5
C5
2
3
B
B
3
2
C5
C5
B
B
S CONFORMATIONS
N CONFORMATIONS
The five-membered ring of the ribose moiety of ribonucleotides can adopt various puckering forms, several of which are shown here in New man projections. The conformations can be classified into envelope forms, in which one atom lies out of the plane of the other four, and twist forms, in which two consecutive atoms lie on opposite faces of plane of the other three.
RNA 51.
Synthesis of RNA is usually catalyzed by an enzyme—RNA polymerase—using DNA as a template, a process known as transcription. Initiation of transcription begins with the binding of the enzyme to a promoter sequence in the DNA (usually found “upstream” of a gene). The DNA double helix is unwound by the helicase activity of the enzyme. The enzyme then progresses along the template strand in the 3’ to 5’ direction, synthesizing a complementary RNA molecule with elongation occurring in the 5’ to 3’ direction. The DNA sequence also dictates where termination of RNA synthesis will occur. RNAs are often modified by enzymes after transcription. For example, a poly(A) tail and a 5’ cap are added to eukaryotic pre-mRNA and introns are removed by the spliceosome.There are also a number of RNA-dependent RNA polymerases that use RNA as their template for synthesis of a new strand of RNA. For instance, a number of RNA viruses (such as poliovirus) use this type of enzyme to replicate their genetic material.
Close up of a single stand of Rna. Double stranded RNA molecules containing a single stranded, capped 5’ end and a 3’ poly tail were circularized by the poly(A) tail binding protein, the cap binding protein, and a fragment of the adapter protein translation initiation factor eIF4G. The large white particles connecting the two ends of the R N A strand contains these three proteins.
52. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Messenger RNA (mRNA) is the RNA that carries information from DNA to the ribosome, the sites of protein synthesis (translation) in the cell. The coding sequence of the mRNA determines the amino acid sequence in the protein that is produced.[22] Many RNAs do not code for protein however (about 97% of the transcriptional output is non-protein-coding in eukaryotes. These
so-called
non-coding
RNAs
(“ncRNA”)
can
be
encoded by their own genes (RNA genes), but can also derive from mRNA introns. The most prominent examples of non-coding RNAs are transfer RNA (tRNA) and ribosomal RNA (rRNA), both of which are involved in the process of translation. There are also non-coding RNAs involved in gene regulation, RNA processing and other roles. Certain RNAs are able to catalyse chemical reactions such as cutting and ligating other RNA molecules, and the catalysis of peptide bond formation in the ribosome; these are known as ribozymes. Messenger RNA (mRNA) carries information about a protein sequence to the ribosomes, the protein synthesis factories in the cell. It is coded so that every three nucleotides (a codon) correspond to one amino acid. In eukaryotic cells, once precursor mRNA (pre-mRNA) has been transcribed from DNA, it is processed to mature mRNA.
This
removes
its
introns—non-coding
sections
of the pre-mRNA. The mRNA is then exported from the nucleus to the cytoplasm, where it is bound to ribosomes and translated into its corresponding protein form with the help of tRNA. In prokaryotic cells, which do not have nucleus and cytoplasm compartments, mRNA can bind to ribosomes while it is being transcribed from DNA. After a certain amount of time the message degrades into its component nucleotides with the assistance of ribonucleases.
RNA 53.
Many RNAs are involved in modifying other RNAs. Introns are spliced out of pre-mRNA by spliceosomes, which contain several small nuclear RNAs (snRNA), or the introns can be ribozymes that are spliced by themselves. RNA can also be altered by having its nucleotides modified to other nucleotides than A, C, G and U. In eukaryotes, modifications of RNA nucleotides are generally directed by small nucleolar RNAs (snoRNA; 60-300 nt), found in the nucleolus and cajal bodies. snoRNAs associate with enzymes and guide them to a spot on an RNA by basepairing to that RNA. These enzymes then perform the nucleotide modification. rRNAs and tRNAs are extensively modified, but snRNAs and mRNAs can also be the target of base modification. Like DNA, RNA can carry genetic information. RNA viruses have genomes composed of RNA, and a variety of proteins encoded by that genome. The viral genome is replicated by some of those proteins, while other proteins protect the genome as the virus particle moves to a new host cell. Viroids are another group of pathogens, but they consist only of RNA, do not encode any protein and are replicated by a host plant cell’s polymerase. Reverse transcribing viruses replicate their genomes by reverse transcribing DNA copies from their RNA; these DNA copies are then transcribed to new RNA. Retrotransposons also spread by copying DNA and RNA from one another, and telomerase contains an RNA that is used as a template for building the ends of eukaryotic chromosomes.
54. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Research on RNA has led to many important biological discoveries and numerous Nobel Prizes. Nucleic acids were
discovered
in
1868
by
Friedrich
Miescher,
who
called the material ‘nuclein’ since it was found in the nucleus. It was later discovered that prokaryotic cells, which do not have a nucleus, also contain nucleic acids. The role of RNA in protein synthesis was suspected already in 1939. Severo Ochoa won the 1959 Nobel Prize in Medicine (shared with Arthur Kornberg) after he discovered an enzyme that can synthesize RNA in the laboratory. Ironically, the enzyme discovered by Ochoa (polynucleotide phosphorylase) was later shown to be responsible for RNA degradation, not RNA synthesis. In 1977, introns and RNA splicing were discovered in both mammalian viruses and in cellular genes, resulting in a 1993 Nobel to Philip Sharp and Richard Roberts. Catalytic RNA molecules (ribozymes) were discovered in the early 1980s, leading to a 1989 Nobel award to Thomas Cech and Sidney Altman. In 1990 it was found in petunia that introduced genes can silence similar genes of the plant’s own, now known to be a result of RNA interference. At about the same time, 22 nt long RNAs, now called microRNAs, were found to have a role in the development of C. elegans. Studies on RNA interference gleaned a Nobel Prize for Andrew Fire and Craig Mello in 2006, and another Nobel was awarded for studies on transcription of RNA to Roger Kornberg in the same year. The discovery of gene regulatory RNAs has led to attempts to develop drugs made of RNA to silence genes.
RNA 55.
CHROMOSOMES
05 A chromosome is an organized structure of DNA and protein found in cells. It is a single piece of coiled DNA
containing
many
genes,
regulatory
elements
and
other nucleotide sequences. Chromosomes also contain DNA-bound proteins, which serve to package the DNA and
control
its
functions.
Chromosomes
vary
widely
between different organisms. The DNA molecule may be circular or linear, and can be composed of 100,000 to
10,000,000,000[1]
Typically,
eukaryotic
nucleotides cells
in
(cells
a
with
long
chain.
nuclei)
have
large linear chromosomes and prokaryotic cells (cells without defined nuclei) have smaller circular chromosomes, although there are many exceptions to this rule. In
eukaryotes,
nuclear
chromosomes
are
packaged
by
proteins into a condensed structure called chromatin. This allows the very long DNA molecules to fit into the cell nucleus. The structure of chromosomes and chromatin varies through the cell cycle. Chromosomes are the essential unit for cellular division and must be replicated, divided, and passed successfully to their daughter cells so as to ensure the genetic diversity and survival of their progeny. Chromosomes may exist as either duplicated or unduplicated. Unduplicated chromosomes are single linear strands, whereas duplicated chromosomes (copied during synthesis phase) contain two copies joined by a centromere.
CHROMOSOMES 57.
Compaction of the duplicated chromosomes during mitosis and meiosis results in the classic four-arm structure (pictured to the right). Chromosomal recombination plays a vital role in genetic diversity. If these structures are manipulated incorrectly, through processes known
as
chromosomal
instability
and
translocation,
the cell may undergo mitotic catastrophe and die, or it may unexpectedly evade apoptosis leading to the progression of cancer. In practice “chromosome� is a rather loosely defined term. In prokaryotes and viruses, the term genophore is
more
appropriate
when
no
chromatin
is
present.
However, a large body of work uses the term chromosome regardless of chromatin content. In prokaryotes, DNA is usually arranged as a circle, which is tightly coiled in on itself, sometimes accompanied by one or more smaller, circular DNA molecules called plasmids.
02 - 20m
SHORT ARM CENTROMERE
LONG FOOT
CHROMATID
This is a Diagram of a replicated and condensed metaphase eukaryotic chromosome. The Chromatid is one of the two identical parts of the chromosome after S phase. The Centromere is the point where the two chromatids touch, and where the microtubules attach. The Short Arm is on the top and the Long foot is on the bottom.
68. THE BLUEPRINT OF MAN: THE HUMAN GENOME
In a series of experiments beginning in the mid-1880s, Theodor Boveri gave the definitive demonstration that chromosomes are the vectors of heredity. His two principles were the continuity of chromosomes and the individuality of chromosomes. It is the second of these principles that was so original. Wilhelm Roux suggested that each chromosome carries a different genetic load. Boveri was able to test and confirm this hypothesis. Aided by the rediscovery at the start of the 1900s of Gregor Mendel’s earlier work, Boveri was able to point out the connection between the rules of inheritance and the behaviour of the chromosomes. Boveri influenced two generations of American cytologists: Edmund Beecher Wilson, Walter Sutton and Theophilus Painter were all influenced by Boveri (Wilson and Painter actually worked with him). In his famous textbook The Cell in Development and Heredity, Wilson linked together the independent work of Boveri and Sutton (both around 1902) by naming the chromosome theory of inheritance the “Sutton-Boveri Theory” (the names are sometimes reversed).Ernst Mayr remarks that the theory was hotly contested by some famous geneticists: William Bateson, Wilhelm Johannsen, Richard Goldschmidt and T.H. Morgan, all of a rather dogmatic turn-of-mind. Eventually, complete proof came from chromosome maps in Morgan’s own lab.
CHROMOSOMES 59.
Eukaryotes (cells with nuclei such as those found in plants, yeast, and animals) possess multiple large linear chromosomes contained in the cell’s nucleus. Each chromosome has one centromere, with one or two arms projecting from the centromere, although, under most circumstances, these arms are not visible as such. In addition, most eukaryotes have a small circular mitochondrial genome, and some eukaryotes may have additional small circular or linear cytoplasmic chromosomes. In the nuclear chromosomes of eukaryotes, the uncondensed DNA exists in a semi-ordered structure, where it is wrapped around histones (structural proteins), forming a composite material called chromatin. Chromatin is the complex of DNA and protein found in the eukaryotic nucleus, of
which
chromatin
packages
varies
chromosomes.
significantly
The
structure
between
different
stages of the cell cycle, according to the requirements of the DNA. During interphase (the period of the cell cycle where the
cell
is
not
dividing),
two
types
of
chromatin
can be distinguished: Euchromatin, which consists of DNA that is active, e.g., being expressed as protein. Heterochromatin,
which
consists
of
mostly
inactive
DNA. It seems to serve structural purposes during the chromosomal
stages.
Heterochromatin
can
be
further
distinguished into two types: o Constitutive heterochromatin, around
the
which is never expressed. centromere
and
usually
is
located
contains
It
repeti-
tive sequences. o Facultative heterochromatin, which is sometimes expressed.
60. THE BLUEPRINT OF MAN: THE HUMAN GENOME
In the early stages of mitosis or meiosis (cell division), the chromatin strands become more and more condensed. They cease to function as accessible genetic material (transcription stops) and become a compact transportable form. This compact form makes the individual chromosomes visible, and they form the classic four arm structure, a pair of sister chromatids attached to each other at the centromere. The shorter arms are called p arms (from the French petit, small) and the longer arms are called q arms (q follows p in the Latin alphabet; q-g “grande�). This is the only natural context in which individual chromosomes are visible with an optical microscope. During divisions, long microtubules attach to the centromere and the two opposite ends of the cell. The microtubules then pull the chromatids apart, so that each daughter cell inherits one set of chromatids. Once the cells have divided, the chromatids are uncoiled and can function again as chromatin. In spite of their appearance, chromosomes are structurally highly condensed, which enables these giant DNA structures to be contained within a cell nucleus. The self-assembled microtubules form the spindle, which attaches
to
chromosomes
at
specialized
structures
called kinetochores, one of which is present on each sister chromatid. A special DNA base sequence in the region of the kinetochores provides, along with special proteins, longer-lasting attachment in this region.
CHROMOSOMES 61.
1
2
3
6
7
8
13
14
15
19
20
4
9
10
16
21
22
5
11
12
17
18
X
Y
The abnormality seen by Nowell and Hungerford on chromosome 22, now known as the Philadelphia Chromosome
62. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Chromosome 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X-chromosome Y-chromosome
Total
Genes
Total bases
Sequenced bases
4,220
247,199,719
224,999,719
1,491
242,751,149
237,712,649
1,550
199,446,827
194,704,827
446
191,263,063
187,297,063
609
180,837,866
177,702,766
2,281
170,896,993
167,273,993
2,135
158,821,424
154,952,424
1,106
146,274,826
142,612,826
1,920
140,442,298
120,312,298
1,793
135,374,737
131,624,737
379
134,452,384
131,130,853
1,430
132,289,534
130,303,534
924
114,127,980
95,559,980
1,347
106,360,585
88,290,585
921
100,338,915
81,341,915
909
88,822,254
78,884,754
1,672
78,654,742
77,800,220
519
76,117,153
74,656,155
1,555
63,806,651
55,785,651
1,008
62,435,965
59,505,254
578
46,944,323
34,171,998
1,092
49,528,953
34,893,953
1,846
154,913,754
151,058,754
454
57,741,652
25,121,652
32,185
3,079,843,747
2,857,698,560
A chart of the chromosomes and how many genes each chromosome has, as well as the number of bases per chromosome and the sequenced bases.
CHROMOSOMES 63.
Chromosomes can be divided into two types—autosomes, and sex chromosomes. Certain genetic traits are linked to a person’s sex, and are passed on through the sex chromosomes. The autosomes contain the rest of the genetic hereditary information. All act in the same way during cell division. Human cells have 23 pairs of large linear nuclear chromosomes (22 pairs of autosomes and one pair of sex chromosomes), giving a total of 46 per cell. In addition to these, Human cells have many hundreds of copies of the mitochondrial genome. Sequencing of the human genome has provided a great deal of information about each of the chromosomes. Below is a table compiling statistics for the chromosomes, based on the Sanger Institute’s human genome information in the Vertebrate Genome Annotation (VEGA) database.
Number of genes is an estimate as it
is in part based on gene predictions. Total chromosome length is an estimate as well, based on the estimated size of unsequenced heterochromatin regions. The prokaryotes – bacteria and archaea – typically have a single circular chromosome, but many variations do exist. Most bacteria have a single circular chromosome that can range in size from only 160,000 base pairs in the endosymbiotic bacterium Candidatus Carsonella ruddii, to 12,200,000 base pairs in the soil-dwelling bacterium Sorangium cellulosum.
Spirochaetes of the genus
Borrelia are a notable exception to this arrangement, with bacteria such as Borrelia burgdorferi, the cause of Lyme disease, containing a single linear chromosome.
64. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Prokaryotes do not possess nuclei. Instead, their DNA is organized into a structure called the nucleoid.The nucleoid is a distinct structure and occupies a defined region of the bacterial cell. This structure is, however, dynamic and is maintained and remodeled by the actions
of
a
range
of
histone-like
proteins,
which
associate with the bacterial chromosome. Bacterial chromosomes tend to be tethered to the plasma membrane of the bacteria. In molecular biology application, this allows for its isolation from plasmid DNA by centrifugation of lysed bacteria and pelleting of the membranes (and the attached DNA). Prokaryotic chromosomes and plasmids are, like eukaryotic DNA, generally supercoiled. The DNA must first be released into its relaxed state for access for transcription, regulation, and replication.
1
2
3
A diagram of The three major single chromosome mutations; Deletion (1), Duplication (2) and Inversion (3).
CHROMOSOMES 65.
MUTATIONS
06 In molecular biology and genetics, mutations are changes in a genomic sequence: the DNA sequence of a cell’s genome or the DNA or RNA sequence of a virus. They can be defined as sudden and spontaneous changes in the cell. Mutations are caused by radiation, viruses, transposons and mutagenic chemicals, as well as errors that occur during meiosis or DNA replication. They can also be induced by the organism itself, by cellular processes such as hypermutation. Mutation can result in several different types of change in sequences; (DNA) these can either have no effect, alter
the
product
of
a
gene,
or
prevent
the
gene
from functioning properly or completely. Studies in the fly Drosophila melanogaster suggest that if a mutation changes a protein produced by a gene, this will probably be harmful, with about 70 percent of these mutations having damaging effects, and the remainder being either neutral or weakly beneficial. Due to the damaging effects that mutations can have on genes, organisms have mechanisms such as DNA repair to remove mutations. Viruses that use RNA as their genetic material have rapid mutation rates,which can be an advantage since these viruses will evolve constantly and rapidly, and thus evade the defensive responses of e.g. the human immune system.
67.
Mutations can involve large sections of DNA becoming duplicated, usually through genetic recombination. These duplications are a major source of raw material for evolving new genes, with tens to hundreds of genes duplicated in animal genomes every million years. Most genes belong to larger families of genes of shared ancestry. Novel genes are produced by several methods, com monly through the duplication and mutation of an ancestral gene, or by recombining parts of different genes to form new combinations with new functions. Domains act as modules, each with a particular and independent function, that can be mixed together to produce genes encoding new proteins with novel properties. For example, the human eye uses four genes to make structures that sense light: three for color vision and one for night vision; all four arose from a single ancestral gene. Another advantage of duplicating a gene (or even an entire genome) is that this increases redundancy; this allows one gene in the pair to acquire a new function while the other copy performs the original function. Other types of mutation occasionally create new genes from previously noncoding DNA. Changes in chromosome number may involve even larger mutations, where segments of the DNA within chromosomes break and then rearrange. For example, two chromosomes in the Homo genus fused to produce human chromosome 2; this fusion did not occur in the lineage of the other apes, and they retain these separate chromosomes. In evolution, the most important role of such chromosomal rearrangements may be to accelerate the divergence of a population into new species by making populations less likely to interbreed, and thereby preserving genetic differences between these populations.
68. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Sequences
of
DNA
that
can
move
about
the
genome,
such as transposons, make up a major fraction of the genetic material of plants and animals, and may have been important in the evolution of genomes. For example, more than a million copies of the Alu sequence are present in the human genome, and these sequences have now been recruited to perform functions such as regulating gene expression. Another effect of these mobile DNA sequences is that when they move within a genome, they can mutate or delete existing genes and thereby produce genetic diversity. Nonlethal mutations accumulate within the gene pool and increase the amount of genetic variation. The abundance of some genetic changes within the gene pool can be reduced by natural selection, while other “more favorable” mutations may accumulate and result in adaptive changes. For example, a butterfly may produce offspring with new mutations. The majority of these mutations will have no effect; but one might change the color of one of the butterfly’s offspring, making it harder (or easier) for predators to see.
MUTATIONS 69.
If this color change is advantageous, the chance of this butterfly surviving and producing its own offspring are a little better, and over time the number of butterflies with this mutation may form a larger percentage of the population. Neutral mutations are defined as mutations whose effects do not influence the fitness of an individual. These can accumulate over time due to genetic drift. It is believed that the overwhelming majority of mutations have no significant effect on an organism’s fitness. Also, DNA repair mechanisms are able to mend most changes before they become permanent mutations, and many organisms have mechanisms for eliminating otherwise permanently mutated somatic cells. Mutation is generally accepted by biologists as the mechanism by which natural selection acts, generating advantageous new traits that survive and multiply in offspring as well as disadvantageous traits, in less fit offspring, that tend to die out.
70. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Two
classes
of
mutations
are
spontaneous
mutations
(molecular decay) and induced mutations caused by mutagens. Spontaneous mutations on the molecular level can be caused by: Tautomerism – A base is changed by the repositioning of a hydrogen atom, altering the hydrogen bonding pattern of that base resulting in incorrect base pairing during replication. Depurination – Loss of a purine base (A or G) to form an apurinic site (AP site). Deamination – Hydrolysis changes a normal base to an atypical base containing a keto group in place of the original amine group. Examples include C → U and A → HX (hypoxanthine), which can be corrected by DNA repair mechanisms; and 5MeC (5-methylcytosine) → T, which is less likely to be detected as a mutation because thymine is a normal DNA base. Induced mutations on the molecular level can be caused by:
Hydroxylamine
NH2OH,
Base
analogs,
Alkylating
agents, These agents can mutate both replicating and non-replicating DNA. In contrast, a base analog can only mutate the DNA when the analog is incorporated in replicating the DNA. Each of these classes of chemical mutagens has certain effects that then lead to transitions, transversions, or deletions.Agents that form DNA adducts. DNA intercalating agents, DNA crosslinkers, Oxidative damage, Nitrous acid converts amine groups on A and C to diazo groups, altering their hydrogen bonding patterns which leads to incorrect base pairing during replication, and Radiation.
MUTATIONS 71.
Mutation rates also vary across species. Evolutionary biologists have theorized that higher mutation rates are beneficial in some situations, because they allow organisms to evolve and therefore adapt more quickly to their environments. For example, repeated exposure of bacteria to antibiotics, and selection of resistant mutants, can result in the selection of bacteria that have a much higher mutation rate than the original population (mutator strains). The sequence of a gene can be altered in a number of ways. Gene mutations have varying effects on health depending on where they occur and whether they alter the function of essential proteins. Mutations in the structure of genes can be classified as: Small-scale mutations, such as those affecting a small gene in one or a few nucleotides, including:
Point mutations,
often caused by chemicals or malfunction of DNA replication, exchange a single nucleotide for another. These changes are classified as transitions or transversions. [30] Most common is the transition that exchanges a purine for a purine (A ↔ G) or a pyrimidine for a pyrimidine, (C ↔ T). A transition can be caused by nitrous acid, base mis-pairing, or mutagenic base analogs such as 5-bromo-2-deoxyuridine (BrdU).
72. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Mutations whose effect is to juxtapose previously separate pieces of DNA, potentially bringing together separate genes to form functionally distinct fusion genes (e.g. bcr-abl). These include: Chromosomal translocations: interchange of genetic parts from nonhomologous chromosomes. Interstitial deletions: an intra-chromosomal deletion that removes a segment of DNA from a single chromosome, thereby apposing previously distant genes. For example, cells isolated from a human astrocytoma, a type of brain tumor, were found to have a chromosomal deletion removing sequences between the “fused in glioblastoma”
gene and the receptor tyrosine
kinase “ros”, producing a fusion protein. The abnormal FIG-ROS fusion protein has constitutively active kinase activity that causes oncogenic transformation (a transformation from normal cells to cancer cells).Chromosomal inversions: reversing the orientation of a chromosomal segment. Loss of heterozygosity: loss of one allele, either by a deletion or recombination event, in an organism that previously had two different alleles.
MUTATIONS 73.
CELL DIVISION
HEALTHY CELL
SUICIDE CELL
MUTATIONS
TUMOUR
Cancer usually results from defects or damage in one or more of the genes involved in cell division. If these genes become damaged, the defective cells can multiply to form abnormal tissue (tumor).
There are
four main types of gene directly linked to cell division, most tumors, have damaged copies of more than one gene.
74. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Loss-of-function mutations are the result of gene product having less or no function. When the allele has a complete loss of function (null allele) it is often called an amorphic mutation. Phenotypes associated with such mutations are most often recessive. Exceptions are when the organism is haploid, or when the reduced dosage of a normal gene product is not enough for a normal phenotype (this is called haploinsufficiency). Gain-offunction mutations change the gene product such that it gains a new and abnormal function. These mutations usually have dominant phenotypes. Often called a neomorphic mutation. Dominant negative mutations (also called antimorphic mutations)
have
an
altered
gene
product
that
acts
antagonistically to the wild-type allele. These mutations usually result in an altered molecular function (often inactive) and are characterised by a dominant or semi-dominant phenotype. In humans, Marfan syndrome is an example of a dominant negative mutation occurring in an autosomal dominant disease. In this condition, the defective glycoprotein product of the fibrillin gene (FBN1) antagonizes the product of the normal allele. Lethal mutations are mutations that lead to the death of the organisms which carry the mutations. A back mutation or reversion is a point mutation that restores the original sequence and hence the original phenotype.
MUTATIONS 75.
A frameshift mutation is a mutation caused by insertion or deletion of a number of nucleotides that is not evenly divisible by three from a DNA sequence. Due to the triplet nature of gene expression by codons, the insertion or deletion can disrupt the reading frame, or the grouping of the codons, resulting in a completely different translation from the original. The earlier in the sequence the deletion or insertion occurs, the more altered the protein produced is. A nonsense mutation is a point mutation in a sequence of DNA that results in a premature stop codon, or a nonsense codon in the transcribed mRNA, and possibly a
truncated,
and
often
nonfunctional
protein
prod-
uct. Missense mutations or nonsynonymous mutations are types of point mutations where a single nucleotide is changed to cause substitution of a different amino acid. This in turn can render the resulting protein nonfunctional. Such mutations are responsible for diseases such as Epidermolysis bullosa, sickle-cell disease, and SOD1 mediated ALS.
76. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Although most mutations that change protein sequences are neutral or harmful, some mutations have a positive effect on an organism. In this case, the mutation may enable the mutant organism to withstand particular environmental stresses better than wild-type organisms, or reproduce more quickly. In these cases a mutation will tend to become more com mon in a population through natural selection. For example, a specific 32 base pair deletion in human CCR5 (CCR5-Δ32) confers HIV resistance to homozygotes and delays AIDS onset in heterozygotes. The CCR5 mutation is more common in those of European descent. One possible explanation of the etiology of the relatively high frequency of CCR5-Δ32 in the European population is that it conferred resistance to the bubonic plague in mid-14th century Europe. People with this mutation were more likely to survive infection; thus its frequency in the population increased. This theory could explain why this mutation is not found in southern Africa, where the bubonic plague never reached. A newer theory suggests that the selective pressure on the CCR5 Delta 32 mutation was caused by smallpox instead of the bubonic plague. Another example is Sickle cell disease, a blood disorder in which the body produces an abnormal type of the oxygen-carrying substance hemoglobin in the red blood cells. One-third of all indigenous inhabitants of Sub-Saharan Africa carry the gene,because in areas where malaria is common, there is a survival value in carrying only a single sickle-cell gene (sickle cell trait). Those with only one of the two alleles of the sickle-cell disease are more resistant to malaria, since the infestation of the malaria plasmodium is halted by the sickling of the cells which it infests
MUTATIONS 77.
THE FUTURE OF GENETICS
07
With the human genome decoded, researchers have the daunting
task
of
sifting
through
the
newly-discov-
ered genes in search of those that lead to disease. These efforts will change the way we view diseases and receive
medical
care.
What
Are
Genetic
Researchers
Really Trying to Figure out? The Human Genome Project recently announced that they had sequenced all of the human chromosomes — about three billion bases long. In itself, this is a huge achievement, but knowing the sequence doesn’t tell researchers what the 30,000 to 50,000 genes actually do in the body. Their next step is to figure out what the genes do, and what role these genes may play in disease. Once researchers know which genes are involved in a disease, they can develop a test to screen people who are at risk and also start looking for a cure. A diagnostic test by itself will not cure the disease, but it can help identify high-risk people who may require more intensive screening or preventative action. Knowing what genes cause a given disease can also help researchers understand what goes wrong in that disease, which can help drive the search for drugs that counteract the problem.
79.
When newspapers announce that a gene has been discovered
for
a
certain
disease,
often
times
it
can
mean a number of things. Finding the gene: Sometimes, researchers
have
identified
a
gene
that
definitely
causes a disease, such as the discovery of the gene for hemophilia or cystic fibrosis. Such a finding does not necessarily mean that researchers can cure the disease, or that a genetic test is im mediately available, but it does mean that the medical com munity may be closer to a possible cure. Finding one of many genes: Other times, researchers have discovered a gene that plays a role in a small subset of people who get a common disease, such as the genes BRCA1 and BRCA2, which are the cause for some people’s breast cancer. Again, finding these genes puts researchers one step closer to a cure or genetic test that can help some people with the disease. Finding a gene in animals: One way to understand gene function in in humans is to find and manipulate a gene that causes an animal — such as the mouse or fruit fly — to show symptoms similar to a human disease. Animals have genes that are very similar to our own, so these discoveries help point researchers to the biological function of a human disease gene. However, it is a long path from finding a gene in flies or mice to finding a genetic treatment for the human disease.
80. THE BLUEPRINT OF MAN: THE HUMAN GENOME
As researchers discover the role genes play in disease, there will be more genetic tests available to help doctors make diagnoses and pinpoint the cause of the disease. For example, heart disease can be caused either by a mutation in certain genes, or by environmental factors such as diet or exercise. Doctors can easily diagnose a person with heart disease once they have symptoms. However, doctors can’t easily tell what the cause for the heart disease is in each person. Thus, all people receive the same treatment regardless of underlying cause of the disease. In the future, a panel of genetic tests for heart disease might reveal the specific genetic factors that are involved in a given person. People with a specific mutation may be able to receive treatment that is targeted to that mutation, thereby treating the cause of the disease rather than just the symptoms. Also, people with genetic risk may be able to receive preventitive treatment before symptoms become apparent.
THE FUTURE OF GENETICS 81.
Currently, doctors take a brief family medical history for their patients. This helps primary care physicians tell if there are genetic factors that may influence a patient’s health. But a family history does not provide a doctor with the complete picture of a person’s risk. Some people may be at risk for an early-onset disease, even if they do not have a family history of that disease. Similarly, not every person with a strong family history of a disease will inherit the risk factors. Doctors are just beginning to be able to screen people for genetic risk factors present in their family. Some of these predictive tests include testing for predisposition to certain types of hereditary colon and breast cancer. In the future, it may be possible for doctors to screen their patients for many genetic-based risks — those that run in their family and those that don’t — all at the same time. Individual risk assessments will be created for each patient based on that patient’s set of genes. By
knowing
which
genes
are
involved
in
disease,
researchers can develop better medical treatments and prevention strategies that are specific for those gene defects. Knowing their personal risk makes it possible for individuals to decrease their chance developing the disease through lifestyle changes, more aggressive disease surveillance, and preventative medical treatments. In certain instances, we already have more personalized medicine because of advances in genetics; in the future these medical benefits will only expand.
82. THE BLUEPRINT OF MAN: THE HUMAN GENOME
In human inheritance, errors may occur, and in these situations genetic diseases appear. They are caused by recessive genes, and here are some examples: Turner Syndrome, cancer, autism, Cri-du-Chat syndrome and Klinefelter syndrome. Cri-du-chat is caused by the shorter arm of the 5th chromosome, and this results in mental retardation, problems in behavior and a cat like sound. This syndrome occurs at one case in 50,000 births. Klinefelter affects only men and it is caused by the presence of an extra X chromosome. The genotype of these individuals has on the 47th position a XXY type and not the normal XY type. The symptoms are small testes, sterility and enlarged breasts.
THE FUTURE OF GENETICS 83.
Gene therapy was actually tried in 1970 and again in 1980 without success, but the knowledge and techniques were primitive by today’s standards. The first attempt in what might be called the modern era of gene therapy began in September 1990 at the National Institutes of Health (NIH), when doctors treated a 4-year-old girl. The
child
suffered
from
a
grave
im mune
deficiency
because she lacked the enzyme adenosine deaminase. The doctors took her own white blood cells, altered them by adding the gene for the missing enzyme and transplanted the altered cells back into her. Next on the NIH agenda was a substantially different strategy introducing a cancer-fighting substance, tumor
necrosis
factor,
into
the
genetic
repertoire
of melanoma patients’ own cancer-fighting white blood cells. Ultimately the same approach may be applied to other types of cancer.
84. THE BLUEPRINT OF MAN: THE HUMAN GENOME
Philip Sharp suggests one possibility that might be tried as soon as techniques are sufficiently refined. Instead of treating an AIDS patient for the rest of his or her life with a drug to protect the im mune system against the HIV virus, doctors might use gene transplants to render the patient’s im mune system permanently resistant to the virus. W. French Anderson of the NIH, one of the architects of the new attempts at gene therapy, sees a bright future. By the early years of the next century, he predicts, gene therapy will have become a highly sophisticated drug delivery system. Doctors will give the patient one, or perhaps several, transplants of his or her own cells that have been genetically engineered to manufacture a drug. In many cases this might replace the conventional practice of injecting drugs at regular intervals. How far in the future is this new application of genetic medicine? Five to ten years for the essential techniques, he estimates, somewhat longer to achieve a high degree of sophistication. The first gene therapy attempts at NIH used the patient’s white blood cells as the target for gene insertion. In the future, scientists hope to perfect techniques for using bone marrow cells. Several research centers are making progress in animal experiments using liver cells and endothelial cells, such as those that line blood vessels, to deliver valuable genes to the tissues where they would be useful. Another strategy that would have seemed sheer fantasy a few years ago is being discussed by serious scientists today. That is the idea of using inhalant spray to deliver copies of a good gene to airway tissues of cystic fibrosis patients.
THE FUTURE OF GENETICS 85.
The transplantation and manipulation of genes in other species has already proved valuable in genetics research and
will
probably
play
an
even
larger
role
in
the
future. Mario Capecchi and his team at the University of Utah have recently used the method of gene manipulation known as homologous recombination to discover the function of a mouse gene. The gene first attracted notice because it produced breast cancer in the animals when it became activated abnormally. By developing mice in which that gene, and only that gene, had been knocked out, the scientists showed that the gene’s normal function is crucial to the development of two regions of the animals’ brain: the midbrain and cerebellum. The discovery opens an important door to studies of brain development and brain function.
86. THE BLUEPRINT OF MAN: THE HUMAN GENOME
To use homologous recombination, scientists must be able to identify and grow embryonic stem (ES) cells, the unspecialized precursors of all other cells in an organism. In Capecchi’s mouse experiments, ES cells are modified to alter the specific gene under study and then implanted in a very early mouse embryo and used to breed animals that have the desired trait or flaw. Some experts consider this technique among the most exciting recent advances in genetics research. But the excitement in genetics is general and pervasive. “Having been part of genetics research for 30 years, I find it almost stupefying that it is every bit as exciting and maybe even more so than it has seemed in the past,” says Leon Rosenberg, dean of the Yale University School of Medicine. “I continue to be dazzled by the pace and surprise of new information in the field.” Studies of microbes, plants, animals, and many normal human beings are all contributing to the explosion of new knowledge. In recent years, molecular genetics has given important insights into the origin of life and its evolution, the emergence of humans, and our intimate relatedness to every other species on Earth. We can expect many more advances as geneticists continue to explore the wonder of life.
THE FUTURE OF GENETICS 87.
INDEX
A
B
ADAPTATION 12, 18
C
CELL 03, 05, 10
ALLELE 15, 25
CELLULAR 36
AMINO ACID 30, 35
CELLULAR FUSION 57,
APERIODIC 05. 10, 36
CHROMATIC 15, 21
ATOM 45
CHROMOSOMES 03, 15
AXON 48, 55
CODE 35, 41, 45
BACTERIA 15, 18, 22
COLOR 25, 33
BIODEGRADABLE 36
CREDIT 13, 25, 40
BIOELECTRICITY 45, 57
CRYPTIC 36, 35, 66
BIONIC 66
D
DEAD CELL 57, 59,
BIOPHARMING 13, 28
DEPOLARIZATION 15
BIOTECHNOLOGY 36, 40
DIE 53, 55
BLOOD 42, 53
DIFFUSION 36, 37, 40
BODY 66, 67
DNA 05, 11, 55
BODY FUNCTION 13, 15
DONOR 25, 27, 36
BULGE 74, 78
DRIVING FORCE 35
THE FUTURE OF GENETICS 89.
E
F
ELECTRON 36, 38, 40
J
JIG 14
EPIGENETICS 10, 44
JOULE 63, 64
EXON 36. 38
JUNCTION 12, 13, 15
FATTY ACIDS 55, 60
JUMP 80
FLUID MOSAIC 25 FRACTAL 47, 48
K
KINETICS 76, 77
GENES 15, 16, 22, 35 GENETIC INFO 33 GENETIC CODE 05, 08
kRYPT 18, 19 L
HOMEOSTASIS 77, 80 I
LIPID 03, 04
HEMI-CELLULOSE 44 HISTONE 08, 14, 16
LECITHIN 05. 07, 14 LIFE 15
GENOME 44, 46, 55, 60 H
KARYOTYPE 05, 15, 16 KILL 08, 12
FUNCTION 55, 56 G
JAGGED 52, 55, 60
ENZYME 05, 07, 08
LIPOPROTEIN 19 M
MARCOMOLECULES 48
HUMAN GENOME 75, 77,
MEMBRANE 14, 38, 44
IMMUNOLOGY 56, 57
MOLECULE 36
IONS 22, 25, 30
MUTATIONS, 53, 54,
INORGANIC 50 INTRON 36, 39
90. THE BLUEPRINT OF MAN: THE HUMAN GENOME
N
NANOTECHNOLOGY 68
SEQUENCE 05, 25, 38
NUCLEUS 11, 18, 33
SUBUNIT 47 T
TRANSCRIPT 23, 68
ORGANIC 65, 66
TRANSPORTER 69, 78 U
ORTHOLOG 36 PARALOG 38
V
PROTEIN 11, 45 QUALIA 80 QUANTUM 79
VISUAL CORTEX 09 W
WHOLE 77, 78 WILD TYPE (WY) 36
X
X-CHROMOSOME 38, 47 XENOBIOTICS 25, 26
QUATERNARY 74
R
VIRAL 08, 12, 17 VIRUS 54, 58, 74
PHARMACOGEN 03, 07 PHENOTYPE 48
URACIL 09, 15 UV RADIATION 47
PARTITION K 11
Q
THEORY 38, 41
ORGANELLE 28
ORGANISM 05, 18, 37
P
SECONDARY 66
NUCLEIC ACID 23, 36
NUTRACEUTICALS 59 O
S
QUINONE 18, 20
Y
Y-CHROMOSOME
RECESSIVE TRAIT 58
Z
ZYGOTE 05
47
REPLICATION 47, 48 RIBOSOME 58, 63 RNA 47, 50
THE FUTURE OF GENETICS 91.
THE BLUEPRINT OF MAN THE HUMAN GENOME There are estimated to be between 20,000 and 25,000 human protein-coding genes. The estimate of the number of human genes has been repeatedly revised down as genome sequence quality and gene finding methods have improved. Earlier predictions estimated that human cells had as many as 2,000,000 genes. The human genome is the genome of Homo sapiens or humans, which is stored on 23 chromosome pairs plus the small mitochondrial DNA. 22 of the 23 chromosomes are autosomal chromosome pairs, while the remaining pair is sex-determining. The haploid human genome occupies a total of just over 3 billion DNA base pairs. The Human Genome Project (HGP) produced a reference sequence of the euchromatic human genome, which is used worldwide in biomedical sciences.