Amino acids by Zain Koofi

Amino Acids: An Introduction to Their Structure, Functions and Biochemical Properties

Introduction

Any time one deals with anything in Biology, one must also contend with amino acids and proteins: the products of gene activation. To understand proteins, it is necessary to understand amino acids, to learn their structures and to learn a few of the functions and essentiality of the amino acids. There are 20 amino acids and 1 imino acid we will study:

glycine (gly), alanine (ala), valine (val), leucine (leu), isoleucine (ile or ileu), proline (pro; the imino acid), phenylalanine (phe), tyrosine (tyr), tryptophan (trp), serine (ser), cysteine (cys), cystine (cys-cys), threonine (thr), methionine (met), aspartic acid (aspartate; asp), asparagine (asn), glutamic acid (glutamate; glu), glutamine (gln), histidine (his), lysine (lys), arginine (arg)

The simplest amino acid is glycine (gly). It consists of two carbon atoms covalently bonded to each other (Image at right). To one carbon atom, two oxygen atoms are bonded; to the other carbon atom, an amino group (NH3+) and 2 hydrogen atoms are bonded. The carbon that is directly attached to the CO2- is called the α-carbon. It is this carbon that makes all amino acids used by man the α-amino acids. When the carboxyl group is deprotonated to form a carboxylate group and the amine is protonated to form an ammonium group, the resulting form is called a zwitterion: a double ion. This ionization is due to water and/or buffer solvation/ionization. Both positive (+) and negative (-) charges exist per molecule, which is normal under physiological conditions. Gly (stick-n-ball model at right: remember that black = C; red = O; white = H; blue = N) is typically found in proteins where there are turns in the amino acid sequence, as it is very small and has a small "R" group (a hydrogen). R groups are radical groups, representative groups or reactive groups. In this case, and for the case of all the amino acids, we will use the second definition of R group to mean the rest of the amino acid molecule beyond the second carbon in the backbone of the amino acid. Gly is an amino acid with an uncharged polar R group. Another way to look at glycine is illustrated in the graphic at top right. This method uses lines to represent bonds and either illustrates each atom or utilizes a condensed structure form. The same manner may be used with the other 19 amino acids.

Amino Acids with Hydrophobic R Groups

The next simplest amino acid is alanine. The difference between ala and gly is that the H in gly has been replaced by a CH3 in ala (Images at left and right). Ala is a small amino acid, especially suited for diffusing from muscle cells into the blood to be transported by the blood to the liver for utilization in gluconeogenesis.

Valine (val), leucine (leu) and isoleucine (ile or ileu) are the next three simplest amino acids (line structures and stick-n-ball models, respectively, bottom of above page). These three amino acids are called branched chain amino acids (BCAA's) and are utilized for the synthesis of substrates for gluconeogenesis and for ketogenesis. Leu is one of only two purely ketogenic amino acids (lysine is the other). Ketones are usually associated with someone who has diabetes mellitus and who is in diabetic coma. It is the ketones, or ketone bodies, that give the patient the sweet, fruity smelling breath of diabetic coma.

Proline is actually an imino acid (Images left and right). Note that it is a closed ring amino acid. Pro, like gly, is usually found in proteins where turns are required. A derivative of proline, hydroxyproline, is found in connective tissue and helps make the tissue stronger.

Phenylalanine is alanine with a benzene ring (a phenyl group, C6H5) attached to it (Images left and right). Phenylalanine is necessary for the synthesis of the catecholamines dopa, dopamine, norepinephrine and epinephrine. Some people are born lacking an enzyme that regulates the catabolism of phe. When this happens, a metabolite of phe, phenylpyruvic acid, builds up in nervous tissue and causes severe intellectual disability. This condition is known as phenylketonuria, or PKU. People who have PKU are generally blonde, blue-eyed and fair-complexioned. The reason for this is that phe is also necessary for the synthesis of a pigment called melanin that contributes to eye, hair and skin color. People who have PKU must eat a diet low in phe for the rest of their lives. Since phe is required by the body to initiate the synthesis of the catecholamines for neurotransmitter and hormonal functions, people who have PKU must add tyrosine (left and right images at bottom of previous page) to their diet: it is the product of the hydroxylation of phe, which does not occur in PKU.

Methionine (Images left and right top this page) is a sulfur containing amino acid. Its role is to provide the methyl group (CH3) to acceptor molecules in one-carbon metabolism. One-carbon metabolism is important in the production of red blood cells, white blood cells and platelets.

Tryptophan (trp; images immediately left and right) is the last of the amino acids with hydrophobic R groups. Trp is the precursor for the synthesis of serotonin (aka “nature's downer”). Serotonin from the health food store will NOT cross the blood brain barrier; trp is required for this to occur. Turkey and milk have high levels of trp. There seems to be some controversy as to whether or not there is enough trp in milk (especially warm milk) to render a person drowsy so that they will fall asleep when it is difficult for them to do so without assistance. In recent times, selective serotonin reuptake inhibitors (e.g., Prozac, Celexa) have seen use in depression, eating disorders and obsessive compulsive disorder, to name a few.

Amino Acids with Uncharged Polar R Groups

At physiologic pH, the R groups are not ionized as are the amino and carboxyl groups of the amino acids. Glycine has already been mentioned. It was the first amino acid we examined. Since it’s the simplest, that’s where we started.

Serine (ser) is alanine with an -OH group replacing an -H (Images at bottom left and right of previous page). As a rule, ser has a function similar to that of threonine (thr; images left and right), another hydroxylated amino acid: it serves as an activation site in enzymes, i.e., when it is phosphorylated or dephosphorylated, the enzyme is turned on or off. The last hydroxylated amino acid is tyrosine (tyr; see the phe discussion). It is, simply, p-hydroxy phenylalanine, with the -OH group straight across the benzene ring from the alanine moiety.

Cysteine (cys; images left and right) is a sulfur containing amino acid. It is found in most connective tissues. The most often thought about site of cys, though, is the hair. Hair maintains its shape by the presence of disulfide bonds (-S-S-). The disulfide bonds come from the loss of -H from the -SH group of two cys molecules in the hair, which then bond to form cystIne (cys-cys; Images bottom left and right; remember yellow = S) and hold the hair in its appropriate shape. Cosmetologists and beauticians utilize this property every day when they give perms. They first reduce the natural disulfide bonds in hair, then place the hair in the shape the customer asks, then finish the job with an oxidizing agent that forces the formation of the disulfide bonds and, voila!, a new style comes out from under the curlers, dryer, etc.

Asparagine (asn; top images, left and right) and glutamine (gln; images beneath the asn images, this page) are 4 and 5 carbons in length, respectively. They are derivatives of the dicarboxylic amino acids aspartate and glutamate (coming up below). Note that each has an extra NH2 group on the carbon double bonded to an oxygen farthest from the α-carbon. These two molecules serve as ammonia transporters to the liver and kidney for urea synthesis. Urea (Image centered immediately below) is a small, nontoxic compound (compared to ammonia's effects on the cell) that is excreted via the urine.

Amino Acids with Negatively Charged R Groups at Physiological pH

The next two amino acids under study are the acids aspartate (asp; above left and center top, above) and glutamate (glu; above right and center bottom, above), the precursor amino acids of asn and gln, respectively. Both are dicarboxylic amino acids, i.e., there is a COOH group on each end of the molecules.

Amino Acids with Positively Charged R Groups at Physiological pH

Arginine (arg; below left and top center) and lysine (lys; below right and bottom center) have positively charged R groups at physiological pH.

Lysine is heavily involved in connective tissue biosynthesis. Children with low levels of arginine tend to be intellectually disabled (hypoargininemia). Arginine is the last product of the urea cycle from which urea is clipped for excretion.

Histidine (his; left and right images) is positively charged at a pH of approximately 6 or below. His is the precursor molecule to histamine, the compound that causes many allergic reactions and which may be blocked by the use of anti-histamines. Histamine synthesis may be stimulated by the influence of norepinephrine or psychological stress. Because of this, many people who have itching-related health problems may be prescribed a drug like doxepin, which has both histamine antagonistic properties and anxiolytic properties. Both combat the health problem by reducing the anxiety felt by the patient, which reduces the itching, which reduces the anxiety, which reduces the itching, ad nauseam.

Of these 20 amino acids, 8 are essential (humans require them in their diets as humans lack the enzymes to synthesize them from scratch) and 2 are semi-essential (required for growth by the young human). The essential amino acids are phe, val, trp, thr, ile, met, lys, leu. The semi-essential amino acids are his and arg. A helpful mnemonic to remember these is PVT TIM HALL, where the first letter of each amino acid makes up the mnemonic.

Acid-Base Titrations: Amino Acid Applications

Mono-protic Acid Dissociation

The dissociation of a mono-protic acid follows the general reactions below:

HA ⇌ H+ + A-
or
HA + MOH → HOH + MA

As you learned in your pre-requisite courses, the acid does not dissociate all at once, rather the deprotonation occurs slowly and sequentially … even with a mono-protic acid (See titration/first derivative curve in image at upper right to refresh your memory). The image below is actual data for the titration of HOAc performed by one of your predecessors at WNC in Spring 2008.

[Figure: Potentiometric Titration of HOAc -- Trial 1. Vertical axis: dpH/dV; horizontal axis: mL 0.1 N NaOH Added.]

Di-protic Acid Dissociation

The dissociation of a di-protic acid follows the general reactions below:

H2A ⇌ 2H+ + A2-
or
H2A + 2MOH → 2HOH + M2A

As you learned in your pre-requisite courses, the acid does not dissociate all at once, rather the deprotonation occurs slowly and sequentially (See titration/first derivative curve in image at upper left to refresh your memory).

Tri-protic Acid Dissociation

The dissociation of a tri-protic acid follows the general reactions below:

H3A ⇌ 3H+ + A3-
or
H3A + 3MOH → 3HOH + M3A

As you learned in your pre-requisite courses, the acid does not dissociate all at once, rather the deprotonation occurs slowly and sequentially (See titration/first derivative curve in image at upper right to refresh your memory).

Amino acids are also titratable. For example, the titration of alanine follows the di-protic acid titration pattern. The carboxyl group is deprotonated first, which leaves the zwitterionic form with its carboxylate group and ammonium group; eventually, the ammonium group is deprotonated as well. The titration of glutamate follows the tri-protic acid titration pattern. The carboxyl group bound to the α-carbon is deprotonated first. The R group (another carboxyl group, as you’ll recall) is deprotonated next. Lastly, the ammonium group is deprotonated.
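The titration behavior of alanine described above can be sketched numerically with the Henderson-Hasselbalch relationship. This is only an illustrative sketch: the pKa values (about 2.34 for the carboxyl group and 9.69 for the ammonium group) are commonly quoted textbook figures assumed here, not data from the titrations in these notes, and the function names are made up for the example.

```python
def fraction_deprotonated(pH, pKa):
    """Henderson-Hasselbalch: fraction of a group that has lost its proton."""
    return 1.0 / (1.0 + 10 ** (pKa - pH))


def alanine_net_charge(pH, pKa_carboxyl=2.34, pKa_amino=9.69):
    """Average net charge of alanine at a given pH (pKa values are assumed)."""
    carboxylate = fraction_deprotonated(pH, pKa_carboxyl)    # each contributes -1
    ammonium = 1.0 - fraction_deprotonated(pH, pKa_amino)    # each contributes +1
    return ammonium - carboxylate


def isoelectric_point(pKa_carboxyl=2.34, pKa_amino=9.69):
    """pH at which a simple di-protic amino acid carries no net charge."""
    return (pKa_carboxyl + pKa_amino) / 2.0


if __name__ == "__main__":
    for pH in (1.0, isoelectric_point(), 12.0):
        print(f"pH {pH:5.2f}: net charge {alanine_net_charge(pH):+.3f}")
```

At low pH the molecule is mostly the fully protonated cation, near the midpoint of the two pKa values it is mostly the zwitterion with no net charge, and at high pH it is mostly the anion.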

[Figure: Histidine Titration. pH and dpH/dV plotted against 0.1 N NaOH Added (mL).]

Another of your predecessors at WNC titrated histidine. Histidine's carboxyl group is deprotonated first. The imidazole ring is deprotonated second. Lastly, the ammonium group is deprotonated. Each endpoint is clearly identifiable not only on the first derivative curve in the graphic above, but on the pH curve, as well.

So, what is the value of being able to titrate amino acids? We all know that structure gives function. At specific pH's, each R group in a protein takes on specific charges. Not only do the charges give structure to the protein, we can also take advantage of that property of amino acids and proteins and separate them for identification by a process known as electrophoresis. Electrophoresis is defined as the separation of (in this case) amino acids or proteins in a gel by an electrical charge. The graphic at lower right illustrates electrophoresis. The top portion of the graphic is a top view of a gel (usually polyacrylamide) that has three wells in it; each well has a specific sample in it (an amino acid OR a protein). The gel is placed in a tank on a support and the samples are loaded into the wells. A buffer (determined by earlier research) is poured all around the gel and the electrophoresis tank is covered.

Don’t let “tank” fool you – these are usually small. Some tanks can be as small as 4” by 4”. A power supply is attached to the electrophoresis tank and turned on. The current runs for a pre-determined amount of time, then it is turned off. Once the gel is removed from the tank it is stained and the bands, as they are called, are visualized. The bottom part of the electrophoresis graphic shows that two of the samples migrated towards opposite ends of the gel. One remained at the origin. The top sample migrated to the positively charged electrode, which means that the amino acid or protein had an overall charge that was negative (which amino acids would give a negative charge?). The “blue sample” migrated to the negative electrode. This means that it had an overall charge that was positive (what amino acids will give overall positive charges?). So, how are amino acids “put together” to make proteins?

Peptides and Peptide Bond

Amino acids are the building blocks of proteins. In order for the amino acids to link together to form the numerous proteins necessary to keep a human functioning, they form a special bond between each other: the peptide bond (highlighted in the image at right). The peptide bond is formed between the carboxyl group of the first amino acid and the amino group of the second amino acid to form a dipeptide. The peptide bond is unique in that it appears to be a single bond, but has the characteristic of a double bond, i.e., it is a rigid bond. This kind of bond only occurs between amino acids. As the amino acid chain increases, the next amino acid adds onto the previous carboxyl group by its amino group.

Peptide Bond and Peptides

By convention, the leftmost amino acid is always the #1 amino acid; it is the free amino end, or the N-terminus (see image top of following page). The farthest amino acid residue to the right is the amino acid in the protein that has the highest number and, as a general rule, is the free carboxyl end, or the C-terminus. In some cases, the -OH may be replaced with an NH2, making it an amide.
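The numbering convention above can be captured in a few lines. This is a minimal sketch, not part of the original notes: the function name and the dictionary keys are hypothetical, and the sequence is simply a left-to-right list of three-letter codes. It relies on the fact that a linear chain of n residues is joined by n - 1 peptide bonds.

```python
def describe_peptide(residues):
    """Summarize a linear peptide written left to right (N-terminus first).

    `residues` is a list of three-letter codes, e.g. ["gly", "ala", "ser"].
    """
    if not residues:
        raise ValueError("a peptide needs at least one residue")
    return {
        "n_terminus": residues[0],           # residue #1, free amino end
        "c_terminus": residues[-1],          # highest-numbered residue, free carboxyl end
        "peptide_bonds": len(residues) - 1,  # one fewer bond than residues
    }


if __name__ == "__main__":
    print(describe_peptide(["gly", "ala", "ser"]))
```

For the tripeptide in the example, residue #1 (gly) is the N-terminus, ser is the C-terminus, and there are two peptide bonds.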

When dealing with peptides, there is always one LESS peptide bond than there are amino acid residues in the protein, i.e., a tripeptide has two peptide bonds and three amino acids; a hexapeptide has five peptide bonds and six amino acids, ad nauseam. At right is an image of a stick-n-ball model of a dipeptide; the peptide bond is marked by the red box, as it is in the image above.

Primary Structure of Proteins

The sequence[s] of the amino acids held together by peptide bonds ONLY is called the primary structure of a protein.

Secondary Structure of Proteins

The secondary structure of proteins is determined by how the amino acid sequence (primary structure) folds upon itself and bonds with hydrogen bonds, i.e., non-covalent attractive forces. There are, for this course, 3 secondary structures: α-helix; β-pleated sheet; thermodynamic random coil.

α-helix

A protein that coils on itself in a right handed turn is called an α-helix. The α-helix permits tissues to stretch a bit, like hair (or a coiled spring). Note that the H bonds are between the carbonyl oxygen and the amino hydrogen. Only a PORTION of a protein is in α-helix, NOT the whole protein. This is illustrated at right. Green = N; black = C; red = O; blue = H bonds, ca 3.6 amino acid residues apart. The H bonds are what give this structure its shape and stability.

When one looks down an α-helix, it’s much like looking down the bore of a piece of hose (image at right). Black = C; red = O; green = N; orange-ish = R groups; no H bonds are shown, as it would be hideous to look at.

β-pleated sheet

The second of the secondary structures of proteins is called the pleated sheet or, sometimes, the β-pleated sheet (Image, center right of this page). The pleated sheet shown is in the antiparallel organization, i.e., the peptide chains making up the sheet run in opposite directions to each other. Pleated sheets tend to make proteins that do not "give", e.g., silk; the structure doesn't seem to be of great importance to other proteins. Colors are as before. When thinking of a pleated sheet do as Linus Pauling did: he figured it out after surgery in the hospital by folding the bed sheet into a “fan shape” … like we all did with construction paper in kindergarten. That was Nobel Prize #1 for Dr. Pauling.

Thermodynamic Random Coil

The last secondary structure about which we have interest is the thermodynamic random coil (Image at right). Although we call this a random coil, nature tells us that there is a reason for every structure. We call it random as we have not worked out the "code" of this structure. In addition, if we denature this structure, the protein loses its function.

Tertiary Structure

The tertiary structure of a protein is, for all intents and purposes, the three dimensional shape of the protein brought about by ionic interactions, hydrophobic interactions and covalent disulfide links within the one protein chain. Tertiary structure, put another way, is the manner in which the R groups assist the protein, already in its secondary structure, to fold, twist, bend and kink, AGAIN, upon itself. Water soluble proteins fold so that hydrophobic R groups are tucked inside the protein and hydrophilic R groups are on the outside of the protein. WHY? This way, the protein may interact with the solvent (water) and not precipitate or otherwise be inactivated.

Water insoluble proteins fold so that hydrophilic R groups are tucked inside the protein and hydrophobic R groups are on the outside of the protein. WHY? This is so that a protein, e.g., an ion channel in a cell membrane, may insert itself in a non-polar environment so that polar particles may be transported into or out of regions compartmentalized from each other. Ionic interactions also stabilize tertiary structures. Where, though, are the ionic groups? They are the R groups (Image at right)! The carboxyl groups on asp and glu; the ε-amino group on lys; the guanidino group on arg; the imidazole ring on his. R groups cross-link to form “salt links” (image at immediate left; black and purple lines represent remainder of protein chains – 2 chains linked by R groups; lower middle left image is an intrachain linkage). Disulfide bonds assist in tertiary structure by allowing the protein chain to interconnect itself and introduce a hair-pin into its structure (image at right), just like how straight hair is curled and curly hair is straightened out. Two tertiary structure examples are a single chain of hemoglobin and the myoglobin molecule (Image below of hemoglobin).

Quaternary Structure

The last structure of proteins in which we have interest is called the quaternary structure: the organization of two or more protein chains that bind together in such a manner as to give the group of proteins a single function, e.g., the tetramer of hemoglobin. The 4 proteins are held together by salt links, hydrophobic and hydrophilic interactions. In hemoglobin (also in image at lower left), disruption of

these forces (to form deoxyhemoglobin) causes the hemoglobin molecule to become smaller than oxyhemoglobin.

Protein Denaturation

The denaturation of proteins includes anything that disrupts secondary, tertiary and/or quaternary structure of proteins: heat, alcohol, salts, heavy metals, freeze/thaw, acids/bases. All cause inactivation of proteins.

Groups of Proteins

Fibrous proteins include:

• Collagens: connective tissue; after it's boiled, the soluble part is called gelatin (Bill Cosby sells this as JELL-O™)
• Elastins: in stretching tissues
• Keratins: water-proofing proteins
• Myosins: in muscle
• Fibrin: blood clotting protein

Globular proteins include:

• Albumins: water soluble; transporters and increase blood osmotic pressure
• Globulins: saline soluble; transporters and antibodies
• Enzymes: biological reaction catalysts

Enzymes

Of significance, of course, is the fact that the shape of the enzyme gives it its function (the shape of a protein gives it its function). Enzymes speed up the reaction rate in biological systems 100,000 - 1,000,000 fold! Some are known to increase the reaction rate > 10^20-fold! Enzymes have specific substrates (the chemical group upon which the enzyme works), but can work on limited kinds of substrates.

Enzymes Have Specific Functions

Enzymes are categorized into one of 6 biological activities according to the Enzyme Commission (E.C.):

– Oxidoreductases: catalyze redox reactions -- involve NAD and FAD (E.C. 1.X.X.X)
– Transferases: catalyze group transfers (E.C. 2.X.X.X)
– Hydrolases: use water to lyse bonds (E.C. 3.X.X.X)
– Lyases: catalyze nonhydrolytic and non-oxidative group removal (E.C. 4.X.X.X)
– Isomerases: catalyze isomerization reactions (E.C. 5.X.X.X)
– Ligases: catalyze reactions requiring ATP hydrolysis (E.C. 6.X.X.X)

Enzyme “Add-On’s” – Terminology

• Active site: 3-dimensional cleft in the enzyme caused by/coded by the primary structure of the protein; complementary to the shape (geometry) of the substrate
• Apoenzyme: active enzyme minus the cofactor; catalytically inactive
• Coenzyme: a carbon-based molecule required by an enzyme for complete catalytic capacity, e.g., NAD+, FAD, vitamins – bound loosely to the apoenzyme
• Cofactors: a molecule or ion of a non-protein nature that is required by an enzyme for complete catalytic capacity, e.g., Mn2+, Zn2+, Fe2+, Cu2+, Ca2+, Mg2+, Mo2+
• Constitutive enzymes: always in the cell without regard to the availability of substrate
• E.C. nomenclature: comes from the Enzyme Commission. The Commission (a subcommittee of the International Union of Biochemistry) mandated a standardized name and numbering system for enzymes to make it easier for everyone to know which of their favorite enzymes they discuss. The numerical system is a 4 number system – each of the 4 numbers is separated by a dot (“.”). Only the first number is important for this course. “E.C. 1” = oxidoreductases; “E.C. 2” = transferases; “E.C. 3” = hydrolases; “E.C. 4” = lyases; “E.C. 5” = isomerases; “E.C. 6” = ligases.
• Holoenzyme: apoenzyme plus prosthetic group
• Induced enzyme: present in the cell ONLY when substrate activates gene mechanisms causing intracellular release of active enzyme
• Prosthetic group: non-protein moiety tightly bound to apoenzyme
• Specificity characteristics: due to the active site; crevice allows binding of 1) only one substrate or 2) 1 kind of R group
• Zymogens: immature enzymes that need “clipping” for activation – more later in course

Enzymes are globular proteins. The exception to this rule is a class of RNA molecules that possess enzyme activity: ribozymes. Without enzymes, cellular reactions go too slowly to be conducive to life. Most enzyme names end in “ase”.

Efficiency of Enzymes

Enzymes increase the rate of reaction without being consumed themselves. Enzymes lower the Ea and have no effect on Keq; enzymes permit reactions to reach equilibrium quicker; enzymes have pH and temperature requirements; enzymes cause reactions to go within seconds, as opposed to lab reactions that may take years; enzymes are an absolute necessity to/for life. E.g., CO2 + H2O → H2CO3 is catalyzed by carbonic anhydrase at a rate of 6 × 10^5 molecules of CO2 condensed per second!

Specificity of Enzymes

Enzyme specificity occurs in either the reaction types catalyzed or in the substance involved in the reaction (substrate; S):

• Absolute specificity: catalyzes reaction with only one S
• Relative specificity: catalyzes reaction of substrates with similar structures, e.g., functional groups
• Stereochemical specificity: D vs L – this has to do with “handedness”: right (D) or left (L) “handed”
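The statement under Efficiency of Enzymes that enzymes lower the Ea can be tied to the quoted rate increases with a rough calculation. The sketch below assumes the Arrhenius relationship (rate proportional to exp(-Ea/RT)) with an unchanged pre-exponential factor, which is a simplification not stated in these notes; the function names and the choice of 310 K (body temperature) are likewise assumptions for illustration.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def rate_enhancement(ea_drop_j_per_mol, temperature_k=310.0):
    """Fold increase in rate when Ea is lowered by the given amount.

    Assumes Arrhenius behavior with the same pre-exponential factor.
    """
    return math.exp(ea_drop_j_per_mol / (R * temperature_k))


def activation_energy_drop(fold, temperature_k=310.0):
    """Ea reduction (J/mol) needed for a given fold increase in rate."""
    return R * temperature_k * math.log(fold)


if __name__ == "__main__":
    for fold in (1e5, 1e6):
        drop_kj = activation_energy_drop(fold) / 1000.0
        print(f"{fold:.0e}-fold faster needs Ea lowered by about {drop_kj:.1f} kJ/mol")
```

Under these assumptions, a 1,000,000-fold rate increase corresponds to lowering Ea by roughly 36 kJ/mol at body temperature, which shows how a modest change in Ea produces a very large change in rate.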

Enzyme Regulation

The cell regulates which enzymes function and when, i.e., not ALL enzymes are working at the same time. Some enzymes catalyze uni-directionally; some catalyze bi-directionally. Enzyme activity is defined as the catalytic capacity of an enzyme to increase reaction rate; the turnover number is defined as the number of S molecules acted upon by ONE enzyme molecule per minute; enzyme assays measure enzyme activity. Enzyme activity is measured in International Units (IU): 1 IU is the amount of enzyme that catalyzes the conversion of 1 µmol of substrate to product per minute at a given pH, T and [S]. It measures the amount of enzyme present; therefore, an enzyme level of 150 IU = an enzyme concentration 150 times greater than the standard, which is useful in diagnosing diseases.
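The two definitions above reduce to simple arithmetic. The following is a small sketch using the standard definition of the International Unit (1 IU = 1 µmol of substrate converted per minute); the function names and the sample numbers are hypothetical and are not assay data from this course.

```python
def activity_in_iu(umol_substrate_converted, minutes):
    """Enzyme activity in International Units (µmol converted per minute)."""
    if minutes <= 0:
        raise ValueError("assay time must be positive")
    return umol_substrate_converted / minutes


def turnover_number(activity_iu, umol_enzyme):
    """Substrate molecules converted per enzyme molecule per minute."""
    if umol_enzyme <= 0:
        raise ValueError("enzyme amount must be positive")
    return activity_iu / umol_enzyme


if __name__ == "__main__":
    iu = activity_in_iu(umol_substrate_converted=300.0, minutes=2.0)
    print(f"activity = {iu:.0f} IU")
    print(f"turnover number = {turnover_number(iu, umol_enzyme=0.5):.0f} per minute")
```

Because both substrate and enzyme are expressed in µmol, the units cancel in the turnover number, leaving molecules of S per molecule of enzyme per minute.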

Enzyme Models

There are two generally accepted models for the functioning of enzymes: the lock and key model and the induced fit model. We will address the lock and key model first.

Model #1: Lock-n-Key

In this model (see graphic, above, top of page), the substrate (S) is complementary to the binding/active site in the enzyme (E). This is likened to the lock and key, where the lock is complementary to the key. As the E and S bind, they form the Enzyme-Substrate complex (ES). The enzyme acts as a sort of scaffold, holding the substrate so that one specific reaction may occur. ES is an intermediate in the reaction that will cause S to be changed into a product (P) to form the enzyme-product complex (EP). In this case, a bond (or bonds) is (are) broken as the enzyme changes its shape ever so slightly, causing the substrate to break exactly where it's supposed to, releasing the new products and the enzyme for use, again. Once EP is formed, it's a matter of time for P to be released and E to start the process all over. Remember that the active sites of the enzyme are complementary to the SHAPE of the substrate.

Model #2: Induced Fit

The second model is called the induced fit model. This means that as the S gets closer to the E, the E actually undergoes a conformational change (shape change) to fit the S, i.e., its shape is INDUCED to change by the presence of the substrate (Image below). Note that as S gets closer to E, the active sites change shape to match the complementary site on S. As S continues to get even closer, the next site shifts its shape, as does the last site when S is all but bound to the enzyme. Once ES is formed, this model conforms

to the remainder of the lock and key theory of enzyme-substrate binding, i.e., it goes through ES to get to EP to form E plus P.

Enzyme Inhibition: Descriptive Introduction

The upper left graphic represents the normal ES complex for comparison with the three types of inhibition patterns. Note the green “pellet-shaped” region in the top center of the graphic – this is S for all future comparisons. The remainder of the graphic represents the enzyme (E).

The upper right graphic represents competitive inhibition of an enzyme, i.e., an inhibitor specific to this enzyme COMPETES with the substrate for the active site of this enzyme. It is reversible; it will block S from binding. One example of this sort of inhibition is carbamoyl choline, which competitively inhibits acetylcholinesterase. Note the green-outlined red region in the center with the white region above it. This is the inhibitor – it “looks” in part like the S – it’s just different enough, however, to plug the crevice and inhibit the enzyme.

The bottom left graphic represents uncompetitive inhibition. This sort of inhibition involves a covalently bound inhibitor and inactivates the enzyme irreversibly. The “pitchfork” represents the inhibitor – it looks nothing like the S. Two examples of this sort of inhibitor are nerve gas and organophosphates, which inhibit acetylcholinesterase. Organophosphate poisoning may be reversed by injecting a drug called 2-PAM. Valium and atropine are useful to treat muscle spasms and breathing difficulties, as well.

The bottom right graphic represents noncompetitive inhibition. Note that the inhibitor does NOT bind to the active site of the enzyme, rather it has its own unique binding site (the red bar). When a noncompetitive inhibitor binds to an enzyme, it causes the enzyme to change shape (see the white gap in the active site on the right side of the “pellet-shaped” S?) and shuts off its activity

reversibly by not allowing S to bind completely. This sort of inhibition is also referred to as allosteric inhibition and plays major roles in metabolic regulation.

An example of mixed inhibitor types is aspirin (ASA) and ibuprofen (IBU). ASA and IBU inhibit cyclooxygenase variants; cyclooxygenase is the main enzyme in prostaglandin biosynthesis. Prostaglandins mediate pain, inflammation, blood pressure, gastric mucus secretion, blood clotting, labor and delivery, and dysmenorrhea, to name a few. ASA is an UNcompetitive inhibitor of COX-1 (CycloOXygenase type 1). Its inhibition is IRreversible, unlike that of other NSAIDs (Non-Steroidal Anti-Inflammatory Drugs). ASA acetylates COX-1 to inhibit it. The half life (t½) of ASA varies by dosage: 250 mg dose t½ = 2-4.5 hrs; 1 g dose t½ = 5 hrs; 2 g dose t½ = 9 hours; > 4 g t½ = 15-30 hrs. ASA CHANGES COX-2 activity to produce anti-inflammatory lipoxins (“LX’s”; derived from ω-3 fatty acids (EPA) as well as ω-6 fatty acids such as 20:4 Δ5,8,11,14), see image at right.

IBU is a NONcompetitive inhibitor of COX-2. It is reversible in its inhibition. IBU works primarily through COX-2 (like Vioxx and Celebrex) by reducing PGI2. This permits “normal” TX (A2 or B2) production, which increases the incidence of blood clots (IBU has the lowest incidence of GI/hematological Sx of the NSAIDs, by the way). The half life for IBU is unique in that it’s about 1.8-2 hrs while its duration of action is about 24X the t½. The problem with these two medications is that IBU binds to COX-2, inhibiting the production of PGI2 –

the natural titrant of TX’s. What does this mean? Blood clots, potentially. If one is a cardiac patient and is taking low dose po ASA to prevent blood clot formation, yet needs IBU for pain control, what is one to do? Per http://www.fda.gov/Drugs/DrugSafety/PostmarketDrugSafetyInformationforPatientsandProviders/ucm125222.htm, the patient needs to take their ASA first and wait 30 minutes to take their IBU (top of graphic at bottom of previous page) or take their IBU 8 hours before their ASA dose. If IBU is taken first or is not taken long enough before the ASA dose, IBU not only binds to COX-1 and COX-2, it also blocks the binding of ASA to them (bottom graphic at bottom of previous page). Should this occur, the potential for a fatal MI due to thrombosis of [a] coronary arter[y]ies is elevated.

Isoenzymes/Isozymes

Multiple forms of enzymes in different tissues with the same activity are called isoenzymes or isozymes. They possess identical cofactors but slightly different apoenzymes. Two best examples: LDH (or LD) and CPK (for those of us old enough to remember it this way; nowadays, it’s CK).

LDH – Lactate Dehydrogenase

LDH is a cellular enzyme that generally is activated during anaerobic glycolysis. There are at least five (5) variants (isozymes). These variants are summarized in the table below.

LDH isozyme    Sub-units    Tissue activity
LD1            H4           Heart
LD2            H3M          Heart (↑); Brain = Kidney
LD3            H2M2         Brain = Lung
LD4            HM3          Lung (↑); Skeletal muscle
LD5            M4           Liver = Skeletal Muscle

H = heart sub-units; M = muscle sub-units

LD1 was first discovered in the heart and has four (4) identical protein sub-units. Since they are identical and from the heart, each sub-unit is called an “H” sub-unit. There are four (4) H sub-units in LD1. LD5 was identified in muscle and it, too, has four (4) identical sub-units, called “M” for muscle. LD2-4 were identified afterwards and their sub-units are combinations of the H and M sub-units as indicated in the table. LD2 is highest in the heart; brain and kidney activities are about the same. LD4 has the greatest activity in the lungs, although there is activity in skeletal muscle.

CPK or CK – Creatine phosphokinase or creatine kinase

CK was first isolated from skeletal muscle, followed by brain tissue. In the late 1970’s, we were assaying total CPK levels and obtaining astronomical values … that didn’t fit the MI damage. It turned out there were 3 variants (at least – some sources indicate there may be as many as 3 sub-variants of CK-MB, for example). CK-MB was specific for the heart. In the early 1980’s, this assay was not STAT – it was “when you can get it”. The assay, itself, took almost 3 hours to run. Nowadays, these are run at the bedside.

C[P]K   Primary (1°) tissue
BB      Brain
MM      Skeletal muscle
MB      Heart

Medical Uses of Enzymes and Enzyme Assays

When cells die or are injured, they dump some or all of their enzymes (E’s) into the blood. Assays are used to make diagnoses, e.g.,
– C[P]K, LDH 2° MI (myoglobins and troponins are being used, as well)
– GPT (ALT) 2° liver problems
– GOT (AST) 2° MI or liver problems
– Ratios
   • GPT:GOT – normal = 0.75; viral hepatitis = 1.6
   • LD1:LD2 – normally < 1; 48 hours after MI, > 1 and is called the LD1-LD2 “flip”

Calcium ion channel blockers – block calcium ion influx via integral protein membrane channel which leads to reduced calcium ion being taken up inside the cell which leads to reduced muscle contraction. This reduction in contraction of the heart muscle makes it easier for the heart to beat to reduce the risk of MI or death after MI. More on calcium ion channel blockers in a later monograph (The Cell).

Physiological Enzymology

Pepsin

Pepsin hydrolyzes proteins at the C-terminus of:

Trp, Phe, Tyr   The aromatic amino acids
Met             Sulfur-containing amino acid
Leu             BCAA (branched-chain amino acid)

Pepsin Activity – Image at middle right illustrates some of pepsin’s proteolytic activity. Pepsin is an enzyme from the stomach that initiates protein digestion in humans. Note the disulfide bond; note the pepsin cleavage sites; note the products.

Pepsin Activity – Image at lower right is an additional example of pepsin’s proteolytic action. Again, note the disulfide bond (pepsin won’t cleave this bond; it requires a desulfhydratase); note the cleavage sites; note the products.

Pepsin can only do so much in the stomach. More enzymes are needed throughout the GI system to assist in protein digestion.

Proteases from Small Bowel

Aminopeptidase – removes the N-terminal amino acid from a peptide:
Asp-Gly-Pro-Lys-Arg-Cys-Phe + aminopeptidase yields Asp + Gly-Pro-Lys-Arg-Cys-Phe
If repeated one AA at a time, this would disassemble the peptide interminably slowly.

Dipeptidase – the final protease; it hydrolyzes dipeptides to free amino acids:
Pro-Met + dipeptidase yields Pro + Met

Pancreatic Proteases

Trypsin            Cleaves at the C-terminus of Arg and Lys
Chymotrypsin       Cleaves at the C-terminus of Phe, Trp, Tyr
Carboxypeptidase   Removes the C-terminal amino acid, one at a time
Elastase           Cleaves at the C-terminus of Ser, Thr, Tyr, Asn, Gln and Cys
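The cleavage rules listed for pepsin and the pancreatic proteases can be tried out on a peptide written in three-letter code. The following is only a rough sketch: the names `CLEAVES_AFTER`, `cleave`, `aminopeptidase` and `carboxypeptidase` are made up for illustration, and the code assumes each enzyme cuts after every listed residue, which ignores the influence of neighbouring residues, pH and protein folding on real proteases.

```python
# Illustrative sketch only: the specificities below are copied from the
# tables in these notes; real proteases are less tidy than this rule.
CLEAVES_AFTER = {
    "pepsin": {"Trp", "Phe", "Tyr", "Met", "Leu"},
    "trypsin": {"Arg", "Lys"},
    "chymotrypsin": {"Phe", "Trp", "Tyr"},
    "elastase": {"Ser", "Thr", "Tyr", "Asn", "Gln", "Cys"},
}


def cleave(peptide, enzyme):
    """Split a peptide such as 'Asp-Gly-Lys-Phe' after each target residue."""
    residues = peptide.split("-")
    targets = CLEAVES_AFTER[enzyme]
    fragments, current = [], []
    for position, residue in enumerate(residues):
        current.append(residue)
        is_last = position == len(residues) - 1
        # Cutting at the C-terminus of the final residue releases nothing new.
        if residue in targets and not is_last:
            fragments.append("-".join(current))
            current = []
    if current:
        fragments.append("-".join(current))
    return fragments


def aminopeptidase(peptide):
    """Remove the N-terminal residue; returns (freed residue, remaining peptide)."""
    first, _, rest = peptide.partition("-")
    return first, rest


def carboxypeptidase(peptide):
    """Remove the C-terminal residue; returns (remaining peptide, freed residue)."""
    rest, _, last = peptide.rpartition("-")
    return rest, last


if __name__ == "__main__":
    print(cleave("Asp-Gly-Pro-Lys-Arg-Cys-Phe", "trypsin"))
    print(aminopeptidase("Asp-Gly-Pro-Lys-Arg-Cys-Phe"))
    print(carboxypeptidase("Cys-Pro-Leu-Arg-Gly-Lys"))
```

Running the endopeptidases and exopeptidases on the same peptide shows why working in parallel is faster than trimming one residue at a time from either end.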

Note that chymotrypsin has some of the same activity as pepsin. Pepsin, though, functions BEST at a pH of around 2. It STILL has SOME function at higher pH's, although it "prefers" the acidic conditions of the stomach for its "pH optimum". The pH optimum is the pH at which optimal activity is attained. Chymotrypsin functions best at an alkaline pH, as is found in the small bowel.

Trypsin

Trypsin is secreted as trypsinogen and activated by enterokinase. Trypsin cleaves peptides/proteins at the C-terminus of Arg and Lys – the positively charged amino acids (image above right).

Cystic Fibrosis

A quick and dirty lab test to detect the potential for a newborn patient to develop cystic fibrosis (CF) tests for fecal trypsin activity. Although we usually think of CF as a pulmonary disease, it has multiple ramifications, including bowel disorders. This disorder comes about because the pancreas gets plugged by this disease in its process, rendering digestion difficult, to say the least. Since the pancreas gets plugged, it cannot secrete digestive enzymes like trypsin.

To perform this easy screening procedure, you need to do the following. Place the bottom of a Petri dish flat on a lab surface, break an applicator stick in two and place the halves in the dish on top of a moistened paper towel. With another applicator stick, smear a little baby poop on a piece of x-ray film and mix it with normal saline. Place the film on the applicator sticks and cover with the top of the Petri dish. Incubate at 37 °C. After incubation, rinse off the film and examine it. If the surface of the film doesn’t look “chewed up” or “roughened”, this means that there is no trypsin in the poop. The pancreas is plugged and this infant needs to be tested further for CF. If there’s a rough surface or a sort of “hole” in the film, this means that there is trypsin in the poop. The pancreas is ok and no further tests for CF are necessary.

Chymotrypsin[ogen]

Chymotrypsinogen is activated by trypsin. Chymotrypsin cleaves at the C-terminus of Trp, Phe, Tyr – the aromatic amino acids (image top right, this page).

Elastase

Elastase cleaves at the C-terminus of the neutral amino acids: Ser, Thr, Tyr, Cys, Asn, Gln (image middle right, this page).

Elastase Aside

Elastase is present in high quantities in lung tissue. Elastase activity is inhibited under normal conditions by α1-PI – alpha one-protease inhibitor. This allows lungs to remain pliable and “stretchy-able”. Smoking inhibits α1-PI. Elastase is then no longer held in check, the lungs lose pliability and the patient is working on “getting” COPD.

Carboxypeptidase

Carboxypeptidase removes the C-terminal amino acid from a peptide, e.g.,
Cys-Pro-Leu-Arg-Gly-Lys + carboxypeptidase yields Cys-Pro-Leu-Arg-Gly + Lys

Proteases

If each enzyme were to work one at a time, this process would take forever. Therefore, all of these enzymes (small bowel and pancreas) work at the same time to disassemble proteins. Pepsin, of course, only works, optimally, in the stomach. Note in the graphic, above, the backup enzymes and specificity. This “assembly line” approach renders proteolysis incredibly efficient and effective.

Experimental: Qualitative Amino Acid and Protein Methods

Char test: Place about a pea-sized bit of casein in an evaporating dish and ignite it with the Bunsen burner. What does the odor remind you of? What do you think it smells like?

Xanthoproteic reaction: Add 10 gtts concentrated nitric acid to a third of a pea-sized amount of egg albumin in 30 drops of water. Heat this tube to boiling in your hot water bath. Did it turn yellow?

Now add 12 gtts 6 M NaOH to your mixture and examine the surface of the mixture in the tube. Did it turn orange?

If it turned orange (as in the graphic at right), this is a positive test. This test tests for the presence of aromatic (contains a benzene ring for our purposes) amino acids. Draw the 3 aromatic amino acids in the space below:

Biuret test: Obtain 5 test tubes. Label them 1-5. Leave the #1 tube empty for now. Into each of the next 4 tubes add solid egg albumin in the following manner:

Approximate Sample Size of Egg Albumin
Tube 2   About a quarter the size of a small pea
Tube 3   About a third the size of a small pea
Tube 4   About a half the size of a small pea
Tube 5   About two-thirds the size of a small pea

Now, into each of the 5 tubes, add 20 gtts water and vortex to mix. Add 20 gtts 6 M NaOH to each of the five tubes and re-vortex. Add 3 gtts of the CuSO4 solution to each of the 5 tubes and re-vortex. Record your observations (look at the color[s] and intensities):

Biuret Test Observations
Tube 1:
Tube 2:
Tube 3:
Tube 4:
Tube 5:

Note (and remember) that your first tube (tube #1) has NO albumin in it. As a general rule, the more protein present, the more peptide bonds are present. Remember that the oxygen in the peptide bond has 2 unbonded pairs of electrons. The nitrogen has one pair of unbonded electrons. It is through coordinate covalent bonding that the copper(II) ion reacts with the protein to deepen the color. Are your results consistent with your observations and the graphic at right? Why or why not?

Urea hydrolysis: Place about a half cm of urea in the bottom of an ignition tube (hold the tube with a three fingered clamp to the ring stand – “aim” it away from people) and place a piece of moistened RED litmus paper folded in a “V” shape in the neck of the tube (as in the image at right). Heat it gently with your Bunsen burner (the urea will bubble). CAREFULLY waft the odor towards you. What is the gas that is emitted?

What color did the litmus paper turn?

Define “thermolysis” in the space below.

Experimental: Quantitative Amino Acid and Protein Methods Potentiometric Titration of an Amino Acid In previous course-work, you explored the potentiometric titration of a weak acid (HOAc). In this experiment, you will explore the titration of an amino acid. The information you will obtain from this experiment will demonstrate the acidity of the carboxyl group (COOH), the alkalinity of the amino group (-NH2) and the acid-base characteristics of the R-group -- if any on the amino acid, at all. Remember from lecture that amino acids under physiological conditions undergo double ionization and are called "zwitterions" -- twin ions. This term comes about from the deprotonation of the carboxyl group to the carboxylate ion (COO -) and the protonation of the amino group to the ammonium ion (-NH3+). R-groups, such as the imidazole ring, carboxyl group, amino group, guanidino group, will protonate or deprotonate depending upon their chemical characteristics and the pH of the solution in which they are solvated. The table, below, lists the pK values for some of the amino acids and the isoelectric point (pI) for these amino acids, as well:

Name of Amino Acid     pKCOOH   pKNH2          pKR     pI
Alanine (ala)          2.34     9.69           –       6.02
Arginine (arg)         2.17     9.04           12.48   10.76
Aspartic acid (asp)    2.09     9.82           3.86    2.98
Cysteine (cys)         1.71     10.78          8.33    5.02
Glutamic acid (glu)    2.19     9.67           4.25    3.22
Glycine (gly)          2.34     9.6            –       5.97
Histidine (his)        1.82     9.17           6.0     7.59
Isoleucine (ile)       2.32     9.75           –       6.04
Leucine (leu)          2.36     9.60           –       5.98
Lysine (lys)           2.18     8.95           10.53   9.74
Methionine (met)       2.20     9.05           –       5.63
Proline (pro)          2.00     10.6 (imine)   –       6.3
Serine (ser)           2.21     9.15           –       5.68
Threonine (thr)        2.63     10.43          –       6.53
Tryptophan (trp)       2.35     9.33           –       5.84
Tyrosine (tyr)         2.20     9.11           10.07   5.66
Valine (val)           2.29     9.72           –       6.01
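The pI of an amino acid can be estimated from its pK values by averaging the two pK values that flank the net-neutral form. The sketch below is one way to express that; the function name `isoelectric_point` and the split into "acidic" and "basic" groups are assumptions made for illustration. A group counts as basic if it carries a positive charge when protonated (ammonium, imidazolium, guanidinium) and as acidic if it is neutral when protonated (carboxyl, thiol, phenol).

```python
def isoelectric_point(acidic_pks, basic_pks):
    """Average the two pK values on either side of the net-neutral form.

    acidic_pks: pK values of groups that are neutral when protonated.
    basic_pks:  pK values of groups that are +1 when protonated.
    Assumes at least one acidic and one basic group.
    """
    pks = sorted(list(acidic_pks) + list(basic_pks))
    n_basic = len(basic_pks)
    # The fully protonated molecule carries a charge of +n_basic. Each
    # deprotonation, taken in order of rising pK, lowers the charge by one,
    # so the neutral form exists between pK number n_basic and the next one.
    return (pks[n_basic - 1] + pks[n_basic]) / 2


if __name__ == "__main__":
    # pK values taken from the table in this handout.
    print("gly", isoelectric_point([2.34], [9.6]))
    print("asp", isoelectric_point([2.09, 3.86], [9.82]))
    print("lys", isoelectric_point([2.18], [8.95, 10.53]))
```

For an amino acid with no ionizable R group this reduces to the average of pKCOOH and pKNH2.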

Remember that the pI is calculated by taking the two pK values that lie on either side of the electrically neutral (zwitterionic) form, adding them and dividing by 2. For an amino acid with no ionizable R group, these are pKCOOH and pKNH2; when the R group is acidic (asp, glu, cys, tyr), they are the two lowest pK values; when the R group is basic (his, lys, arg), they are the two highest. The pI is the pH at which the amino acid is electrically neutral. There are some caveats to remember when titrating an amino acid: while this is an accepted technique among some analytical chemists, among biochemists it's not believed to be as accurate as something like HPLC or 2-D paper chromatography for amino acid identification. It is also important to remember that pK's within about 2 pK units of each other will probably NOT be resolved by this method, e.g., the 2 carboxyl groups on asp or glu or the 2 amines on lys may not be picked up separately. The purpose of this portion of the experiment is for the student to observe the amphiprotic characteristics of amino acids and use a “pre-fabb’d” Excel spreadsheet to demonstrate the zwitterionic characteristics of an amino acid. The techniques for determining the titration and first derivative curves are as per the potentiometric titration of a weak acid experiment that you performed in previous courses, i.e., carry that knowledge into this experiment.

Materials and Methods

Chemicals
1 M HCl
0.1 N NaOH
Amino acid (assigned to each student)
pH 4 and 7 buffers

Equipment
Weighing boats
Spatula
Electronic pan balance
Buret and clamp
1-calibrated pH Checker
Ring stand
2-125 mL Erlenmeyer flasks

Method Mass out 2 samples of your amino acid such that they are not more than 0.1000 g, apiece. Record the amino acid and the masses in the table, below:

Name of Amino Acid: Sample #1 mass (g)

Sample #2 mass (g)

Pour each sample into each of the pre-labeled Erlenmeyer flasks and dissolve the amino acid in about 25 mL of water.

Obtain your pH checker and turn it on. Calibrate it according to instructions. Once the pH checker is satisfactorily calibrated, insert it into the amino acid solution and allow it to stabilize. Once it has stabilized, add enough 1 M HCl to it to adjust the pH of the solution to a pH of about 1 to 1.5. Record that pH and the volume of HCl added to get to that pH in the box below:

Sample 1 pH after ______ mL HCl added pH =

Sample 2 pH after ______ mL HCl added pH =

Obtain your buret and titrating supplies. Clean and prepare the buret as in the past, remembering to put the 0.1 N NaOH in the buret. Remember, too, to read your buret from top down. Once you have this assembled, you are ready to begin titrating your sample. Read the following table carefully and record your pH readings at the volumes indicated. Note that this is very painstaking and that readings are characteristically taken after the addition of every 0.5 mL 0.1 N NaOH. This is necessary for good first (and sometimes, second) derivative curves. You will perform this titration in duplicate.

TRIAL 1

Vol 0.1 N NaOH (mL) 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0 11.5 12.0 12.5 13.0 13.5 14.0 14.5

Vol 0.1 N NaOH (mL) 15.0 15.5 16.0 16.5 17.0 17.5 18.0 18.5 19.0 19.5 20.0 20.5 21.0 21.5 22.0 22.5 23.0 23.5 24.0 24.5 25.0 25.5 26.0 26.5 27.0 27.5 28.0 28.5 29.0 29.5

Vol 0.1 N NaOH (mL) 30.0 30.5 31.0 31.5 32.0 32.5 33.0 33.5 34.0 34.5 35.0 35.5 36.0 36.5 37.0 37.5 38.0 38.5 39.0 39.5 40.0 40.5 41.0 41.5 42.0 42.5 43.0 43.5 44.0 44.5

TRIAL 2

Vol 0.1 N NaOH (mL) 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0 10.5 11.0 11.5 12.0 12.5 13.0 13.5 14.0 14.5

Vol 0.1 N NaOH (mL) 15.0 15.5 16.0 16.5 17.0 17.5 18.0 18.5 19.0 19.5 20.0 20.5 21.0 21.5 22.0 22.5 23.0 23.5 24.0 24.5 25.0 25.5 26.0 26.5 27.0 27.5 28.0 28.5 29.0 29.5

Vol 0.1 N NaOH (mL) 30.0 30.5 31.0 31.5 32.0 32.5 33.0 33.5 34.0 34.5 35.0 35.5 36.0 36.5 37.0 37.5 38.0 38.5 39.0 39.5 40.0 40.5 41.0 41.5 42.0 42.5 43.0 43.5 44.0 44.5

Calculations Once you have completed your work and disposed of your waste as instructed, get your computer, download the Excel file (http://www.drcarman.info/kem121lb/hoac.xls) onto your desktop and set up the titration curves and first-derivative curves as in the previous lab. For pKCOOH: find the first endpoint volume (from your first derivative curve) and divide that in half. Back determine the pH at that new (half) volume from your titration curve. Record both experimental pKCOOH's here: Sample 1 pKCOOH

Sample 2 pKCOOH

For pKNH2: find the first and second endpoint volumes from your first derivative curve. Find the halfway volume between the two. Back determine the pKNH2 at that volume from your titration curve. Record both experimental pKNH2's here: Sample 1 pKNH2

Sample 2 pKNH2

For those of you with three titratable groups -- pKR: find the 2nd and 3rd endpoint volumes from your first derivative curve. Find the halfway volume between the two. Back determine the pKR at that volume from your titration curve. Record both experimental pKR's here: Sample 1 pKR

Sample 2 pKR
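The calculation steps above (locate the endpoints on the first-derivative curve, then read the pH at the half-way volumes) can also be sketched in a few lines of Python instead of the spreadsheet. This is only an illustration of the procedure as described in this handout; the function names are hypothetical, the endpoints are taken to be the two largest local maxima of ΔpH/ΔV, and the readings in the example are invented, not real data.

```python
def first_derivative(volumes, ph_values):
    """Return (midpoint volumes, slopes dpH/dV) between consecutive readings."""
    mids, slopes = [], []
    for i in range(len(volumes) - 1):
        dv = volumes[i + 1] - volumes[i]
        mids.append((volumes[i] + volumes[i + 1]) / 2)
        slopes.append((ph_values[i + 1] - ph_values[i]) / dv)
    return mids, slopes


def endpoint_volumes(volumes, ph_values, n_endpoints):
    """Volumes of the n largest local maxima of the first-derivative curve."""
    mids, slopes = first_derivative(volumes, ph_values)
    peaks = [
        (slopes[i], mids[i])
        for i in range(1, len(slopes) - 1)
        if slopes[i] > slopes[i - 1] and slopes[i] >= slopes[i + 1]
    ]
    strongest = sorted(peaks, reverse=True)[:n_endpoints]
    return sorted(volume for _, volume in strongest)


def ph_at(volumes, ph_values, volume):
    """Read the pH off the titration curve by linear interpolation."""
    for i in range(len(volumes) - 1):
        v0, v1 = volumes[i], volumes[i + 1]
        if v0 <= volume <= v1:
            fraction = (volume - v0) / (v1 - v0)
            return ph_values[i] + fraction * (ph_values[i + 1] - ph_values[i])
    raise ValueError("volume lies outside the titrated range")


def estimate_pks(volumes, ph_values):
    """pKCOOH at half the first endpoint; pKNH2 midway between endpoints 1 and 2."""
    first, second = endpoint_volumes(volumes, ph_values, 2)
    pk_cooh = ph_at(volumes, ph_values, first / 2)
    pk_nh2 = ph_at(volumes, ph_values, (first + second) / 2)
    return pk_cooh, pk_nh2


if __name__ == "__main__":
    # Invented readings (mL of NaOH, pH) purely to exercise the functions.
    vols = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    phs = [1.5, 2.0, 2.3, 2.6, 6.0, 9.0, 9.6, 10.0, 12.0, 12.3]
    print(estimate_pks(vols, phs))
```

With real data taken every 0.5 mL the derivative curve is noisier, so it is worth checking the detected endpoints against the plotted curve before trusting the pK values.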

Summarize the average pK values in the table below:

pKCOOH

pKNH2

pKR

Text pK Values Experimental pK Values How do the values compare?

Questions

You have plenty to do without the added aggravation of questions. Attach your graphs to the lab for turn-in.

Sources

Harris: Exploring Chemical Analysis, Second Edition. (W.H. Freeman and Co.: NY) © 2001.
Lehninger: Principles of Biochemistry. (Worth Publishers: NY, NY) © 1982.

Amino Acids 101

What is an amino acid?
• Amino acids, or alpha-amino acids, are the “building blocks of peptides and proteins”
• They are composed of amine and carboxylic acid groups, separated by the alpha carbon, but the side chains on the alpha carbon vary with the acid
• Amino acids are the subunits of proteins: amino acids make peptide chains, peptide chains make polypeptides, polypeptides make proteins!

How can we tell them apart?

The amino acids differ in the properties of their side chains.

Hydrophobic, nonacidic side chains (the side chain does not give up an H+ to water)
Leucine (Leu)
Methionine (Met)
Alanine (Ala)
Phenylalanine (Phe)
Valine (Val)
Tryptophan (Trp)
Proline (Pro) ** Secondary amine (HNR2)
Isoleucine (Ile)
Glycine (Gly)

Hydrophobic, acidic side chains (the side chain is more acidic than water; the pKa of water is 15.7)
Tyrosine (Tyr) (HO is acidic)
Cysteine (Cys) (HS is acidic)

Hydrophilic, nonacidic side chains
Serine (Ser)
Threonine (Thr)
Asparagine (Asn)
Glutamine (Gln)

Hydrophilic, acidic side chains
Glutamic acid (Glu)
Aspartic acid (Asp)

Hydrophilic, basic side chains (lone pairs on nitrogen accept a proton)
Histidine (His)
Arginine (Arg)
Lysine (Lys)

What do these all have in common?
• Side chain
• Amine
• Carboxylic acid

So how do they make peptides? By peptide bonding
• Covalent bond between amino acids
• Carboxyl group reacts with amino group, releases H2O

What is the difference between a standard and nonstandard amino acid?
• DNA codes for 20 different amino acids in humans. A standard amino acid is one of these.
• A nonstandard amino acid isn’t coded by DNA; it is chemically modified from a standard amino acid.

How do I put amino acids together?

When making a peptide chain, think like this:
1. Start with the amine (H2N) on the left
   a. (this is assuming you are drawing the peptide from N-terminus to C-terminus)
2. Then say, “alpha carbon, carbonyl… nitrogen, alpha carbon, carbonyl… nitrogen, alpha carbon, carbonyl”
   a. You’ll notice that the carbonyls alternate going “up” and “down”
3. Do this until you have drawn enough generic amino acids for your chain
4. Then put your OH at the end for the rest of the carboxylic acid group
5. Draw in wedges and dashes on the alpha carbons
   a. Start with a wedge; the next will be a dash
6. Draw in hydrogens on the nitrogens
7. Draw in side chains on the alpha carbons depending on the name of the amino acid

Check out this example. “Draw Ser-Leu-Ala-Thr-Asp”
• Amine on the left, then alpha carbon, carbonyl… nitrogen, alpha carbon, carbonyl… keep repeating the pattern
• Count the number of alpha carbons; it should equal the number of amino acids in your peptide chain
• Put OH on the end (part of the carboxylic acid group)
• Draw in dashes and wedges on the alpha carbons, starting with a wedge
• Draw in hydrogens on the nitrogens; they should also alternate up and down
• Draw in side chains according to the amino acids present in the peptide chain
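The drawing recipe can be mimicked as a list of backbone tokens, which makes the "count the alpha carbons" check easy to automate. This is a hypothetical sketch: the token spellings (`H2N`, `CH(Ser)`, `C=O`, `NH`, `OH`) and both function names are invented here, and the output is a bookkeeping aid rather than a chemical drawing.

```python
def backbone_tokens(sequence):
    """List backbone pieces from N-terminus to C-terminus for e.g. 'Ser-Leu-Ala'."""
    residues = sequence.split("-")
    tokens = ["H2N"]  # step 1: amine on the left
    for position, residue in enumerate(residues):
        if position > 0:
            tokens.append("NH")  # nitrogen (with its hydrogen) of the next residue
        tokens.append(f"CH({residue})")  # alpha carbon carrying this side chain
        tokens.append("C=O")  # carbonyl
    tokens.append("OH")  # step 4: completes the C-terminal carboxylic acid
    return tokens


def wedge_dash_pattern(sequence):
    """Alternate wedge and dash on the alpha carbons, starting with a wedge."""
    residues = sequence.split("-")
    return [
        (residue, "wedge" if position % 2 == 0 else "dash")
        for position, residue in enumerate(residues)
    ]


if __name__ == "__main__":
    tokens = backbone_tokens("Ser-Leu-Ala-Thr-Asp")
    print("-".join(tokens))
    print("alpha carbons:", sum(t.startswith("CH(") for t in tokens))
    print("peptide bonds:", tokens.count("NH"))
    print(wedge_dash_pattern("Ser-Leu-Ala-Thr-Asp"))
```

The number of peptide bonds is always one less than the number of residues, since each bond joins a carbonyl to the nitrogen of the next amino acid.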

There you have it! Ser-Leu-Ala-Thr-Asp!

Also, consider the electrostatic interactions. The oxygens and hydrogens could interact with another peptide chain through hydrogen bonding.

Other important things to know about amino acids:
• Cysteine is an important amino acid because it can form disulfide bridges. It is not hydrophilic.
• Disulfide bridges link two cysteine residues in a peptide

Why is this even important?!?

Proteins built from amino acids make up much of your body's dry mass! They make bodily functions happen, allow chemical reactions to happen, and keep you healthy. Ten of the twenty amino acids coded by DNA can be made by the body, and ten “essential” amino acids must be ingested regularly through food.

Works Cited:
Hardinger, Steven. Chemistry 14C: Lecture Supplement. 5th ed. Plymouth, MI: Hayden-McNeil Pub., 2012. Print.
Hardinger, Steven. Chemistry 14C: Thinkbook. 9th ed. Plymouth, MI: Hayden-McNeil Pub., 2012. Print.
Peptide Bond Image from http://www.phschool.com/science/biology_place/biocoach/images/translation/peptbond.gif
"Amino acid - Wikipedia, the free encyclopedia." Wikipedia, the free encyclopedia. N.p., n.d. Web. 10 June 2012. <http://en.wikipedia.org/wiki/Amino_acid>.
"Dr. Hardinger's Organic Chemistry Page - UCLA." UCLA Chemistry and Biochemistry. N.p., n.d. Web. 10 June 2012. <http://www.chem.ucla.edu/harding/index.html>. All images of amino acids from Dr. Hardinger’s Chem 14C website.
"What are Amino Acids?." wiseGEEK: clear answers for common questions. N.p., n.d. Web. 10 June 2012. <http://www.wisegeek.com/what-are-amino-acids.htm>.

WADEMC24_1153-1199hr.qxp

16-12-2008

14:15

Page 1153

CHAPTER 24

AMINO ACIDS, PEPTIDES, AND PROTEINS

24-1 Introduction

Proteins are the most abundant organic molecules in animals, playing important roles in all aspects of cell structure and function. Proteins are biopolymers of α-amino acids, so named because the amino group is bonded to the α carbon atom, next to the carbonyl group. The physical and chemical properties of a protein are determined by its constituent amino acids. The individual amino acid subunits are joined by amide linkages called peptide bonds. Figure 24-1 shows the general structure of an α-amino acid and a protein.

[FIGURE 24-1: Structure of a general protein and its constituent amino acids. The amino acids are joined by amide linkages called peptide bonds. Panels: an α-amino acid (α carbon atom, α-amino group, and side chain labeled); several individual amino acids (alanine, serine, glycine, cysteine, valine); a short section of a protein, with the peptide bonds marked.]


TABLE 24-1 Examples of Protein Functions

Class of Protein       Example             Function of Example
structural proteins    collagen, keratin   strengthen tendons, skin, hair, nails
enzymes                DNA polymerase      replicates and repairs DNA
transport proteins     hemoglobin          transports O2 to the cells
contractile proteins   actin, myosin       cause contraction of muscles
protective proteins    antibodies          complex with foreign proteins
hormones               insulin             regulates glucose metabolism
toxins                 snake venoms        incapacitate prey

Proteins have an amazing range of structural and catalytic properties as a result of their varying amino acid composition. Because of this versatility, proteins serve an astonishing variety of functions in living organisms. Some of the functions of the major classes of proteins are outlined in Table 24-1. The study of proteins is one of the major branches of biochemistry, and there is no clear division between the organic chemistry of proteins and their biochemistry. In this chapter, we begin the study of proteins by learning about their constituents, the amino acids. We also discuss how amino acid monomers are linked into the protein polymer, and how the properties of a protein depend on those of its constituent amino acids. These concepts are needed for the further study of protein structure and function in a biochemistry course.

24-2 Structure and Stereochemistry of the α-Amino Acids

The term amino acid might mean any molecule containing both an amino group and any type of acid group; however, the term is almost always used to refer to an α-amino carboxylic acid. The simplest α-amino acid is aminoacetic acid, called glycine. Other common amino acids have side chains (symbolized by R) substituted on the α carbon atom. For example, alanine is the amino acid with a methyl side chain.

H2N–CH2–COOH        glycine
H2N–CHR–COOH        a substituted amino acid
H2N–CH(CH3)–COOH    alanine (R = CH3)

Except for glycine, the α-amino acids are all chiral. In all of the chiral amino acids, the chirality center is the asymmetric α carbon atom. Nearly all the naturally occurring amino acids are found to have the (S) configuration at the α carbon atom. Figure 24-2 shows a Fischer projection of the (S) enantiomer of alanine, with the carbon chain along the vertical and the carbonyl carbon at the top. Notice that the configuration of (S)-alanine is similar to that of L-(–)-glyceraldehyde, with the amino group on the left in the Fischer projection. Because their stereochemistry is similar to that of L-(–)-glyceraldehyde, the naturally occurring (S)-amino acids are classified as L-amino acids.

[FIGURE 24-2: Almost all the naturally occurring amino acids have the (S) configuration. They are called L-amino acids because their stereochemistry resembles that of L-(–)-glyceraldehyde. Structures shown: L-alanine, (S)-alanine; L-(–)-glyceraldehyde, (S)-glyceraldehyde; an L-amino acid, (S) configuration.]

Although D-amino acids are occasionally found in nature, we usually assume the amino acids under discussion are the common L-amino acids. Remember once again that the D and L nomenclature, like the R and S designation, gives the configuration of the asymmetric carbon atom. It does not imply the sign of the optical rotation, (+) or (–), which must be determined experimentally.

Amino acids combine many of the properties and reactions of both amines and carboxylic acids. The combination of a basic amino group and an acidic carboxyl group in the same molecule also results in some unique properties and reactions. The side chains of some amino acids have additional functional groups that lend interesting properties and undergo reactions of their own.

Bacteria require specific enzymes, called racemases, to interconvert D and L amino acids. Mammals do not use D amino acids, so compounds that block racemases do not affect mammals and show promise as antibiotics.

24-2A The Standard Amino Acids of Proteins

The standard amino acids are 20 common α-amino acids that are found in nearly all proteins. The standard amino acids differ from each other in the structure of the side chains bonded to their α carbon atoms. All the standard amino acids are L-amino acids. Table 24-2 shows the 20 standard amino acids, grouped according to the chemical properties of their side chains. Each amino acid is given a three-letter abbreviation and a one-letter symbol for use in writing protein structures.

TABLE 24-2 The Standard Amino Acids

Name             Symbol   Abbreviation   Functional Group in Side Chain   Isoelectric Point

side chain is nonpolar, H or alkyl
glycine          G        Gly            none                             6.0
alanine          A        Ala            alkyl group                      6.0
*valine          V        Val            alkyl group                      6.0
*leucine         L        Leu            alkyl group                      6.0
*isoleucine      I        Ile            alkyl group                      6.0
*phenylalanine   F        Phe            aromatic group                   5.5
proline          P        Pro            rigid cyclic structure           6.3

side chain contains an –OH
serine           S        Ser            hydroxyl group                   5.7
*threonine       T        Thr            hydroxyl group                   5.6
tyrosine         Y        Tyr            phenolic –OH group               5.7

side chain contains sulfur
cysteine         C        Cys            thiol                            5.0
*methionine      M        Met            sulfide                          5.7

side chain contains nonbasic nitrogen
asparagine       N        Asn            amide                            5.4
glutamine        Q        Gln            amide                            5.7
*tryptophan      W        Trp            indole                           5.9

side chain is acidic
aspartic acid    D        Asp            carboxylic acid                  2.8
glutamic acid    E        Glu            carboxylic acid                  3.2

side chain is basic
*lysine          K        Lys            amino group                      9.7
*arginine        R        Arg            guanidino group                  10.8
*histidine       H        His            imidazole ring                   7.6

*essential amino acid

Notice in Table 24-2 how proline is different from the other standard amino acids. Its amino group is fixed in a ring with its α carbon atom. This cyclic structure lends additional strength and rigidity to proline-containing peptides.

[Structure of proline, with the α carbon and the α-amino group labeled.]

PROBLEM 24-1 Draw three-dimensional representations of the following amino acids. (a) L-phenylalanine (b) L-histidine (c) D-serine (d) L-tryptophan

PROBLEM 24-2 Most naturally occurring amino acids have chirality centers (the asymmetric α carbon atoms) that are named (S) by the Cahn–Ingold–Prelog convention (Section 5-3). The common naturally occurring form of cysteine has a chirality center that is named (R), however. (a) What is the relationship between (R)-cysteine and (S)-alanine? Do they have the opposite three-dimensional configuration (as the names might suggest) or the same configuration? (b) (S)-alanine is an L-amino acid (Figure 24-2). Is (R)-cysteine a D-amino acid or an L-amino acid?

24-2B Essential Amino Acids

Humans can synthesize about half of the amino acids needed to make proteins. Other amino acids, called the essential amino acids, must be provided in the diet. The ten essential amino acids, starred (*) in Table 24-2, are the following:

arginine (Arg)        valine (Val)
threonine (Thr)       phenylalanine (Phe)
lysine (Lys)          tryptophan (Trp)
methionine (Met)      histidine (His)
leucine (Leu)         isoleucine (Ile)

Proteins that provide all the essential amino acids in about the right proportions for human nutrition are called complete proteins. Examples of complete proteins are those in meat, fish, milk, and eggs. About 50 g of complete protein per day is adequate for adult humans.

Proteins that are severely deficient in one or more of the essential amino acids are called incomplete proteins. If the protein in a person’s diet comes mostly from one incomplete source, the amount of human protein that can be synthesized is limited by the amounts of the deficient amino acids. Plant proteins are generally incomplete. Rice, corn, and wheat are all deficient in lysine. Rice also lacks threonine, and corn also lacks tryptophan. Beans, peas, and other legumes have the most complete proteins among the common plants, but they are deficient in methionine.

Vegetarians can achieve an adequate intake of the essential amino acids if they eat many different plant foods. Plant proteins can be chosen to be complementary, with some foods supplying amino acids that others lack. An alternative is to supplement the vegetarian diet with a rich source of complete protein such as milk or eggs.

PROBLEM 24-3 The herbicide glyphosate (Roundup®) kills plants by inhibiting an enzyme needed for synthesis of phenylalanine. Deprived of phenylalanine, the plant cannot make the proteins it needs, and it gradually weakens and dies. Although a small amount of glyphosate is deadly to a plant, its human toxicity is quite low. Suggest why this powerful herbicide has little effect on humans.

Gelatin is made from collagen, which is a structural protein composed primarily of glycine, proline, and hydroxyproline. As a result, gelatin has low nutritional value because it lacks many of the essential amino acids.
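The idea of complementary plant proteins can be expressed as a simple set calculation: two foods complement each other when no essential amino acid is deficient in both. The sketch below uses only the deficiencies named in this section; the dictionary `LIMITING` and the function names are made up for illustration, and the model is qualitative (it says nothing about how much of each food would be needed).

```python
# Essential amino acids that each plant source is described as lacking
# in this section (three-letter abbreviations).
LIMITING = {
    "rice": {"Lys", "Thr"},
    "corn": {"Lys", "Trp"},
    "wheat": {"Lys"},
    "legumes": {"Met"},
}


def shared_deficiencies(*foods):
    """Amino acids that every listed food is deficient in."""
    return set.intersection(*(LIMITING[food] for food in foods))


def is_complementary(*foods):
    """True when no amino acid is deficient in all of the foods at once."""
    return not shared_deficiencies(*foods)


if __name__ == "__main__":
    print(shared_deficiencies("rice", "corn"))
    print(is_complementary("rice", "legumes"))
    print(is_complementary("rice", "wheat"))
```

Under these assumptions a grain paired with a legume covers the gaps that either has alone, while two grains still share the lysine shortfall.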


24-2C Rare and Unusual Amino Acids

In addition to the standard amino acids, other amino acids are found in protein in smaller quantities. For example, 4-hydroxyproline and 5-hydroxylysine are hydroxylated versions of standard amino acids. These are called rare amino acids, even though they are commonly found in collagen.

[Structures: 4-hydroxyproline (proline with an OH group on C4 of the ring) and 5-hydroxylysine, H2N–CH2–CH(OH)–CH2–CH2–CH(NH2)–COOH]

Some of the less common D enantiomers of amino acids are also found in nature. For example, D-glutamic acid is found in the cell walls of many bacteria, and D-serine is found in earthworms. Some naturally occurring amino acids are not α-amino acids: γ-aminobutyric acid (GABA) is one of the neurotransmitters in the brain, and β-alanine is a constituent of the vitamin pantothenic acid.

D-glutamic acid: HOOC–CH2CH2–CH(NH2)–COOH (D configuration at the α carbon)
D-serine: HOCH2–CH(NH2)–COOH (D configuration at the α carbon)
γ-aminobutyric acid: H2N–CH2–CH2–CH2–COOH
β-alanine: H2N–CH2–CH2–COOH

24-3 Acid–Base Properties of Amino Acids

Although we commonly write amino acids with an intact carboxyl (–COOH) group and amino (–NH2) group, their actual structure is ionic and depends on the pH. The carboxyl group loses a proton, giving a carboxylate ion, and the amino group is protonated to an ammonium ion. This structure is called a dipolar ion or a zwitterion (German for “dipolar ion”).

H2N–CH(R)–COOH: uncharged structure (minor component)
+H3N–CH(R)–COO−: dipolar ion, or zwitterion (major component)

The dipolar nature of amino acids gives them some unusual properties:

1. Amino acids have high melting points, generally over 200 °C.

   +H3N–CH2–COO− (glycine), mp 262 °C

2. Amino acids are more soluble in water than they are in ether, dichloromethane, and other common organic solvents.

3. Amino acids have much larger dipole moments (μ) than simple amines or simple acids.

   +H3N–CH2–COO− (glycine), μ = 14 D
   CH3–CH2–CH2–NH2 (propylamine), μ = 1.4 D
   CH3–CH2–COOH (propionic acid), μ = 1.7 D

4. Amino acids are less acidic than most carboxylic acids and less basic than most amines. In fact, the acidic part of the amino acid molecule is the –NH3+ group, not a –COOH group. The basic part is the –COO− group, and not a free –NH2 group.

   R–COOH: pKa = 5
   R–NH2: pKb = 4
   +H3N–CH(R)–COO−: pKa = 10, pKb = 12

Because amino acids contain both acidic (–NH3+) and basic (–COO−) groups, they are amphoteric (having both acidic and basic properties). The predominant form of the amino acid depends on the pH of the solution. In an acidic solution, the –COO− group is protonated to a free –COOH group, and the molecule has an overall positive charge. As the pH is raised, the –COOH loses its proton at about pH 2. This point is called pKa1, the first acid-dissociation constant. As the pH is raised further, the –NH3+ group loses its proton at about pH 9 or 10. This point is called pKa2, the second acid-dissociation constant. Above this pH, the molecule has an overall negative charge.

cationic in acid: +H3N–CH(R)–COOH
   (−OH removes a proton, H+ restores it; pKa1 ≈ 2)
neutral: +H3N–CH(R)–COO−
   (−OH removes a proton, H+ restores it; pKa2 ≈ 9–10)
anionic in base: H2N–CH(R)–COO−

Figure 24-3 shows a titration curve for glycine. The curve starts at the bottom left, where glycine is entirely in its cationic form. Base is slowly added, and the pH is recorded. At pH 2.3, half of the cationic form has been converted to the zwitterionic form. At pH 6.0, essentially all the glycine is in the zwitterionic form. At pH 9.6, half of the zwitterionic form has been converted to the anionic form. From this graph, we can see that glycine is mostly in the cationic form at pH values below 2.3, mostly in the zwitterionic form at pH values between 2.3 and 9.6, and mostly in the anionic form at pH values above 9.6. By varying the pH of the solution, we can control the charge on the molecule. This ability to control the charge of an amino acid is useful for separating and identifying amino acids by electrophoresis, as described in Section 24-4.

[Figure: titration curve for glycine, plotting pH against equivalents of −OH added. Labeled points: pKa1 = 2.3 (cationic below pH 2.3), isoelectric point = 6.0 (zwitterionic near the isoelectric point), pKa2 = 9.6 (anionic above pH 9.6).]

FIGURE 24-3
A titration curve for glycine. The pH controls the charge on glycine: cationic below pH 2.3; zwitterionic between pH 2.3 and 9.6; and anionic above pH 9.6. The isoelectric pH is 6.0.
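The populations read off the glycine titration curve can be estimated numerically. The following is a minimal sketch, assuming glycine behaves as an ideal diprotic acid with the pKa values quoted in the text (2.3 and 9.6); the function names are hypothetical and activity effects are ignored.

```python
def species_fractions(pH, pKa1=2.3, pKa2=9.6):
    """Return the fractions (cation, zwitterion, anion) at a given pH."""
    h = 10.0 ** (-pH)      # hydrogen ion concentration
    k1 = 10.0 ** (-pKa1)   # first acid-dissociation constant (COOH)
    k2 = 10.0 ** (-pKa2)   # second acid-dissociation constant (NH3+)
    # Relative populations of the three protonation states.
    cation = h * h
    zwitterion = k1 * h
    anion = k1 * k2
    total = cation + zwitterion + anion
    return cation / total, zwitterion / total, anion / total

def net_charge(pH, pKa1=2.3, pKa2=9.6):
    """Average charge per molecule: +1 for the cation, -1 for the anion."""
    cation, _, anion = species_fractions(pH, pKa1, pKa2)
    return cation - anion

for pH in (1.0, 2.3, 5.95, 9.6, 12.0):
    c, z, a = species_fractions(pH)
    print(f"pH {pH:5.2f}: cation {c:.3f}  zwitterion {z:.3f}  "
          f"anion {a:.3f}  net charge {net_charge(pH):+.3f}")
```

In this model the cationic and zwitterionic fractions are equal at pH 2.3, and the net charge passes through zero at the average of the two pKa values, (2.3 + 9.6)/2 = 5.95, which agrees with the isoelectric pH of about 6.0 quoted for glycine.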


24-4 Isoelectric Points and Electrophoresis

An amino acid bears a positive charge in acidic solution (low pH) and a negative charge in basic solution (high pH). There must be an intermediate pH where the amino acid is evenly balanced between the two forms, as the dipolar zwitterion with a net charge of zero. This pH is called the isoelectric pH or the isoelectric point.

low pH (cationic in acid): +H3N–CH(R)–COOH
isoelectric pH (neutral): +H3N–CH(R)–COO−
high pH (anionic in base): H2N–CH(R)–COO−
(Adding −OH converts each form to the next one down the list; adding H+ reverses each step.)

The isoelectric points of the standard amino acids are given in Table 24-2. Notice that the isoelectric pH depends on the amino acid structure in a predictable way.

acidic amino acids: aspartic acid (2.8), glutamic acid (3.2)
neutral amino acids: (5.0 to 6.3)
basic amino acids: lysine (9.7), arginine (10.8), histidine (7.6)

The side chains of aspartic acid and glutamic acid contain acidic carboxyl groups. These amino acids have acidic isoelectric points around pH 3. An acidic solution is needed to prevent deprotonation of the second carboxylic acid group and to keep the amino acid in its neutral isoelectric state.

Basic amino acids (histidine, lysine, and arginine) have isoelectric points at pH values of 7.6, 9.7, and 10.8, respectively. These values reflect the weak basicity of the imidazole ring, the intermediate basicity of an amino group, and the strong basicity of the guanidino group. A basic solution is needed in each case to prevent protonation of the basic side chain to keep the amino acid electrically neutral.

The other amino acids are considered neutral, with no strongly acidic or basic side chains. Their isoelectric points are slightly acidic (from about 5 to 6) because the –NH3+ group is slightly more acidic than the –COO− group is basic.

PROBLEM 24-4 Draw the structure of the predominant form of
(a) isoleucine at pH 11
(b) proline at pH 2
(c) arginine at pH 7
(d) glutamic acid at pH 7
(e) a mixture of alanine, lysine, and aspartic acid at (i) pH 6; (ii) pH 11; (iii) pH 2

problem-solving Hint
At its isoelectric point (IEP), an amino acid has a net charge of zero, with NH3+ and COO− balancing each other. In more acidic solution (lower pH), the carboxyl group becomes protonated and the net charge is positive. In more basic solution (higher pH), the amino group loses its proton and the net charge is negative.

PROBLEM 24-5 Draw the resonance forms of a protonated guanidino group, and explain why arginine has such a strongly basic isoelectric point.

PROBLEM 24-6 Although tryptophan contains a heterocyclic amine, it is considered a neutral amino acid. Explain why the indole nitrogen of tryptophan is more weakly basic than one of the imidazole nitrogens of histidine.

Electrophoresis uses differences in isoelectric points to separate mixtures of amino acids (Figure 24-4). A streak of the amino acid mixture is placed in the center of a layer of acrylamide gel or a piece of filter paper wet with a buffer solution. Two electrodes are placed in contact with the edges of the gel or paper, and a potential of several thousand volts is applied across the electrodes. Positively charged (cationic) amino acids are attracted to the negative electrode (the cathode), and negatively charged (anionic) amino acids are attracted to the positive electrode (the anode). An amino acid at its isoelectric point has no net charge, so it does not move.

As an example, consider a mixture of alanine, lysine, and aspartic acid in a buffer solution at pH 6. Alanine is at its isoelectric point, in its dipolar zwitterionic form with a net charge of zero. A pH of 6 is more acidic than the isoelectric pH for lysine (9.7), so lysine is in the cationic form. Aspartic acid has an isoelectric pH of 2.8, so it is in the anionic form.

Structure at pH 6:
alanine (charge 0): +H3N–CH(CH3)–COO−
lysine (charge +1): +H3N–CH[(CH2)4–NH3+]–COO−
aspartic acid (charge −1): +H3N–CH(CH2–COO−)–COO−

[Figure: a streak containing Ala, Lys, and Asp on a support wet with pH 6 buffer solution, placed between a cathode (−) and an anode (+) connected to a power supply, shown at the beginning and at the end of the run. Asp− moves toward the positive charge, Ala does not move, and Lys+ moves toward the negative charge.]

FIGURE 24-4
A simplified picture of the electrophoretic separation of alanine, lysine, and aspartic acid at pH 6. Cationic lysine is attracted to the cathode; anionic aspartic acid is attracted to the anode. Alanine is at its isoelectric point, so it does not move.
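The comparison between the buffer pH and each isoelectric point can be sketched as a small Python function. This is an illustration only: the isoelectric points are those quoted in the text (alanine taken as 6.0 because the text places it at its isoelectric point in pH 6 buffer), while the function name and the `tolerance` parameter are hypothetical.

```python
# Isoelectric points quoted in the text.
ISOELECTRIC_POINTS = {"Ala": 6.0, "Lys": 9.7, "Asp": 2.8}

def migration(amino_acid, buffer_pH, tolerance=0.1):
    """Predict the direction of travel during electrophoresis.

    tolerance is an assumed cutoff (in pH units) for "at the isoelectric point".
    """
    pI = ISOELECTRIC_POINTS[amino_acid]
    if abs(buffer_pH - pI) <= tolerance:
        return "net charge 0, does not move"
    if buffer_pH < pI:
        # Buffer is more acidic than the isoelectric pH: cationic form.
        return "cationic, moves toward the cathode (-)"
    # Buffer is more basic than the isoelectric pH: anionic form.
    return "anionic, moves toward the anode (+)"

for name in ("Ala", "Lys", "Asp"):
    print(name, "at pH 6:", migration(name, 6.0))
```

The rule predicts only the direction of migration; how far each band travels also depends on factors the sketch ignores, such as the size of the net charge and the medium.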

When a voltage is applied to a mixture of alanine, lysine, and aspartic acid at pH 6, alanine does not move. Lysine moves toward the negatively charged cathode, and aspartic acid moves toward the positively charged anode (Figure 24-4). After a period of time, the separated amino acids are recovered by cutting the paper or scraping the bands out of the gel. If electrophoresis is being used as an analytical technique (to determine the amino acids present in the mixture), the paper or gel is treated with a reagent such as ninhydrin (Section 24-9) to make the bands visible. Then the amino acids are identified by comparing their positions with those of standards.

PROBLEM 24-7 Draw the electrophoretic separation of Ala, Lys, and Asp at pH 9.7.

PROBLEM 24-8 Draw the electrophoretic separation of Trp, Cys, and His at pH 6.0.

24-5 Synthesis of Amino Acids

Naturally occurring amino acids can be obtained by hydrolyzing proteins and separating the amino acid mixture. Even so, it is often less expensive to synthesize the pure amino acid. In some cases, an unusual amino acid or an unnatural enantiomer is needed, and it must be synthesized. In this chapter, we consider four methods for making amino acids. All these methods are extensions of reactions we have already studied.


24-5A Reductive Amination

Reductive amination of ketones and aldehydes is one of the best methods for synthesizing amines (Section 19-19). It also forms amino acids. When an α-ketoacid is treated with ammonia, the ketone reacts to form an imine. The imine is reduced to an amine by hydrogen and a palladium catalyst. Under these conditions, the carboxylic acid is not reduced.

R–C(=O)–COOH (α-ketoacid) + NH3 (excess) → R–C(=NH)–COO− NH4+ (imine)
R–C(=NH)–COO− NH4+ (imine) + H2 (Pd) → R–CH(NH2)–COO− (α-amino acid)

This entire synthesis is accomplished in one step by treating the α-ketoacid with ammonia and hydrogen in the presence of a palladium catalyst. The product is a racemic α-amino acid. The following reaction shows the synthesis of racemic phenylalanine from 3-phenyl-2-oxopropanoic acid.

Ph–CH2–C(=O)–COOH (3-phenyl-2-oxopropanoic acid) + NH3 + H2 (Pd) → Ph–CH2–CH(NH2)–COO− NH4+ [(D,L)-phenylalanine, ammonium salt] (30%)

We call reductive amination a biomimetic (“mimicking the biological process”) synthesis because it resembles the biological synthesis of amino acids. The biosynthesis begins with reductive amination of α-ketoglutaric acid (an intermediate in the metabolism of carbohydrates), using ammonium ion as the aminating agent and NADH as the reducing agent. The product of this enzyme-catalyzed reaction is the pure L enantiomer of glutamic acid.

HOOC–CH2CH2–C(=O)–COO− (α-ketoglutaric acid) + NH4+ + NADH + H+ → HOOC–CH2CH2–CH(NH3+)–COO− (L-glutamic acid) + NAD+ + H2O   (enzyme-catalyzed)

Biosynthesis of other amino acids uses L-glutamic acid as the source of the amino group. Such a reaction, moving an amino group from one molecule to another, is called a transamination, and the enzymes that catalyze these reactions are called transaminases. For example, the following reaction shows the biosynthesis of aspartic acid using glutamic acid as the nitrogen source. Once again, the enzyme-catalyzed biosynthesis gives the pure L enantiomer of the product.

HOOC–CH2CH2–CH(NH3+)–COO− (L-glutamic acid) + HOOC–CH2–C(=O)–COO− (oxaloacetic acid) → HOOC–CH2CH2–C(=O)–COO− (α-ketoglutaric acid) + HOOC–CH2–CH(NH3+)–COO− (L-aspartic acid)   (transaminase)


PROBLEM 24-9 Show how the following amino acids might be formed in the laboratory by reductive amination of the appropriate α-ketoacid. (a) alanine (b) leucine (c) serine (d) glutamine

24-5B Amination of an α-Halo Acid

The Hell–Volhard–Zelinsky reaction (Section 22-4) is an effective method for introducing bromine at the α position of a carboxylic acid. The racemic α-bromo acid is converted to a racemic α-amino acid by direct amination, using a large excess of ammonia.

R–CH2–C(=O)–OH (carboxylic acid) + (1) Br2/PBr3, (2) H2O → R–CHBr–C(=O)–OH (α-bromo acid)
R–CHBr–C(=O)–OH (α-bromo acid) + NH3 (large excess) → R–CH(NH2)–C(=O)–O− NH4+ [(D,L)-α-amino acid, ammonium salt]

In Section 19-19, we saw that direct alkylation is often a poor synthesis of amines, giving large amounts of overalkylated products. In this case, however, the reaction gives acceptable yields because a large excess of ammonia is used, making ammonia the nucleophile that is most likely to displace bromine. Also, the adjacent carboxylate ion in the product reduces the nucleophilicity of the amino group. The following sequence shows bromination of 3-phenylpropanoic acid, followed by displacement of bromide ion, to form the ammonium salt of racemic phenylalanine.

Ph–CH2–CH2–COOH (3-phenylpropanoic acid) + (1) Br2/PBr3, (2) H2O → Ph–CH2–CHBr–COOH
Ph–CH2–CHBr–COOH + NH3 (excess) → Ph–CH2–CH(NH2)–COO− NH4+ [(D,L)-phenylalanine, salt] (30–50%)

PROBLEM 24-10 Show how you would use bromination followed by amination to synthesize the following amino acids. (a) glycine (b) leucine (c) glutamic acid

24-5C The Gabriel–Malonic Ester Synthesis

One of the best methods of amino acid synthesis is a combination of the Gabriel synthesis of amines (Section 19-21) with the malonic ester synthesis of carboxylic acids (Section 22-16). The conventional malonic ester synthesis involves alkylation of diethyl malonate, followed by hydrolysis and decarboxylation to give an alkylated acetic acid.

CH2(COOEt)2 (malonic ester; one COOEt is the temporary ester group) + (1) −OEt, (2) RX → R–CH(COOEt)2
R–CH(COOEt)2 + H3O+ (heat) → R–CH2–COOH (alkylated acetic acid) + CO2↑

To adapt this synthesis to making amino acids, we begin with a malonic ester that contains an α-amino group. The amino group is protected as a non-nucleophilic amide to prevent it from attacking the alkylating agent (RX).


The Gabriel–malonic ester synthesis begins with N-phthalimidomalonic ester. Think of N-phthalimidomalonic ester as a molecule of glycine (aminoacetic acid) with the amino group protected as an amide (a phthalimide in this case) to keep it from acting as a nucleophile. The acid is protected as an ethyl ester, and the α position is further activated by the additional (temporary) ester group of diethyl malonate.

[Structure: N-phthalimidomalonic ester drawn beside glycine. Labels: protected amine (the phthalimide nitrogen), protected acid (one COOEt group), and temporary ester group (the other COOEt group).]

Just as the malonic ester synthesis gives substituted acetic acids, the N-phthalimidomalonic ester synthesis gives substituted aminoacetic acids: α-amino acids. N-Phthalimidomalonic ester is alkylated in the same way as malonic ester. When the alkylated N-phthalimidomalonic ester is hydrolyzed, the phthalimido group is hydrolyzed along with the ester groups. The product is an alkylated aminomalonic acid. Decarboxylation gives a racemic α-amino acid.

The Gabriel–malonic ester synthesis (PhthN = phthalimido group; one COOEt is the temporary ester group)

PhthN–CH(COOEt)2 (N-phthalimidomalonic ester) + (1) base, (2) R–X → PhthN–C(R)(COOEt)2 (alkylated)
PhthN–C(R)(COOEt)2 + H3O+ (heat) → +H3N–C(R)(COOH)2 (hydrolyzed)
+H3N–C(R)(COOH)2 (heat) → +H3N–CH(R)–COOH (α-amino acid) + CO2↑

The Gabriel–malonic ester synthesis is used to make many amino acids that cannot be formed by direct amination of haloacids. The following example shows the synthesis of methionine, which is formed in very poor yield by direct amination.

(PhthN = phthalimido group)
PhthN–CH(COOEt)2 + (1) NaOEt, (2) Cl–CH2CH2SCH3 → PhthN–C(CH2CH2SCH3)(COOEt)2
PhthN–C(CH2CH2SCH3)(COOEt)2 + H3O+ (heat) → +H3N–CH(CH2CH2SCH3)–COOH [(D,L)-methionine] (50%)

PROBLEM 24-11 Show how the Gabriel–malonic ester synthesis could be used to make (a) valine (b) phenylalanine (c) glutamic acid (d) leucine

*PROBLEM 24-12 The Gabriel–malonic ester synthesis uses an aminomalonic ester with the amino group protected as a phthalimide. A variation has the amino group protected as an acetamido group. Propose how you might use an acetamidomalonic ester synthesis to make phenylalanine.

[Structure: acetamidomalonic ester, CH3–C(=O)–NH–CH(COOEt)2]

24-5D The Strecker Synthesis

The first known synthesis of an amino acid occurred in 1850 in the laboratory of Adolph Strecker in Tübingen, Germany. Strecker added acetaldehyde to an aqueous solution of ammonia and HCN. The product was α-amino propionitrile, which Strecker hydrolyzed to racemic alanine.

The Strecker synthesis of alanine

CH3–CHO (acetaldehyde) + NH3 + HCN → CH3–CH(NH2)–C≡N (α-amino propionitrile)   (in H2O)
CH3–CH(NH2)–C≡N + H3O+ → CH3–CH(NH3+)–COOH [(D,L)-alanine] (60%)

The Strecker synthesis can form a large number of amino acids from appropriate aldehydes. The mechanism is shown next. First, the aldehyde reacts with ammonia to give an imine. The imine is a nitrogen analogue of a carbonyl group, and it is electrophilic when protonated. Attack of cyanide ion on the protonated imine gives the α-amino nitrile. This mechanism is similar to that for formation of a cyanohydrin (Section 18-15), except that in the Strecker synthesis cyanide ion attacks an imine rather than the aldehyde itself.

Step 1: The aldehyde reacts with ammonia to form the imine (mechanism in Section 18-16).

R–CHO (aldehyde) + NH3 ⇌ R–CH=NH (imine) + H2O   (H+ catalyst)

Step 2: Cyanide ion attacks the imine.

R–CH=NH (imine) + H–CN → R–CH=NH2+ + −CN → R–CH(NH2)–C≡N (α-amino nitrile)

In a separate step, hydrolysis of the α-amino nitrile (Section 21-7D) gives an α-amino acid.

H2N–CH(R)–C≡N (α-amino nitrile) + H3O+ → +H3N–CH(R)–COOH [α-amino acid (acidic form)]


SOLVED PROBLEM 24-1 Show how you would use a Strecker synthesis to make isoleucine.

SOLUTION Isoleucine has a sec-butyl group for its side chain. Remember that CH3–CHO undergoes Strecker synthesis to give alanine, with CH3 as the side chain. Therefore, sec-butyl–CHO should give isoleucine.

CH3CH2CH(CH3)–CHO [sec-butyl–CHO (2-methylbutanal)] + NH3 + HCN → CH3CH2CH(CH3)–CH(NH2)–C≡N   (in H2O)
CH3CH2CH(CH3)–CH(NH2)–C≡N + H3O+ → CH3CH2CH(CH3)–CH(NH3+)–COOH [(D,L)-isoleucine]

problem-solving Hint
In the malonic ester synthesis, use the side chain of the desired amino acid (must be a good SN2 substrate) to alkylate the ester. In the Strecker synthesis, the aldehyde carbon becomes the α carbon of the amino acid: Begin with [side chain]–CHO.

PROBLEM 24-13
(a) Show how you would use a Strecker synthesis to make phenylalanine.
(b) Propose a mechanism for each step in the synthesis in part (a).

PROBLEM 24-14 Show how you would use a Strecker synthesis to make (a) leucine (b) valine (c) aspartic acid

SUMMARY Syntheses of Amino Acids

1. Reductive amination (Section 24-5A)

R–C(=O)–COOH (α-ketoacid) + NH3 (excess) → R–C(=NH)–COO− NH4+ (imine)
R–C(=NH)–COO− NH4+ (imine) + H2 (Pd) → R–CH(NH2)–COO− (α-amino acid)

2. Amination of an α-haloacid (Section 24-5B)

R–CH2–C(=O)–OH (carboxylic acid) + (1) Br2/PBr3, (2) H2O → R–CHBr–C(=O)–OH (α-bromo acid)
R–CHBr–C(=O)–OH (α-bromo acid) + NH3 (large excess) → R–CH(NH2)–C(=O)–O− NH4+ [(D,L)-α-amino acid, ammonium salt]

3. The Gabriel–malonic ester synthesis (Section 24-5C)

(PhthN = phthalimido group; one COOEt is the temporary ester group)
PhthN–CH(COOEt)2 (N-phthalimidomalonic ester) + (1) base, (2) R–X → PhthN–C(R)(COOEt)2 (alkylated)
PhthN–C(R)(COOEt)2 + H3O+ (heat) → +H3N–C(R)(COOH)2 (hydrolyzed)
+H3N–C(R)(COOH)2 (heat) → +H3N–CH(R)–COOH (α-amino acid) + CO2↑

4. The Strecker synthesis (Section 24-5D)

R–CHO (aldehyde) + NH3 + HCN → R–CH(NH2)–C≡N (α-amino nitrile)   (in H2O)
R–CH(NH2)–C≡N (α-amino nitrile) + H3O+ → R–CH(NH3+)–COOH (α-amino acid)
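The four syntheses differ mainly in where the side chain R comes from. As a rough planning aid, the mapping can be written out in Python. This is a sketch only: the function name and the string notation (condensed formulas with plain hyphens) are hypothetical, and the code does no chemistry beyond pasting R into the starting-material pattern given in the summary.

```python
def starting_materials(side_chain):
    """Map a side chain R (condensed formula) to the precursor each route needs."""
    r = side_chain
    return {
        # The alpha-ketoacid carbonyl carbon becomes the alpha carbon.
        "reductive amination": f"{r}-C(=O)-COOH",
        # Hell-Volhard-Zelinsky bromination needs an alpha CH2 group.
        "bromination, then amination": f"{r}-CH2-COOH",
        # The halide alkylates N-phthalimidomalonic ester; it must be a good SN2 substrate.
        "Gabriel-malonic ester": f"{r}-X",
        # The aldehyde carbon becomes the alpha carbon.
        "Strecker": f"{r}-CHO",
    }

def racemic_product(side_chain):
    """All four laboratory routes give the racemic alpha-amino acid."""
    return f"H2N-CH({side_chain})-COOH"

for route, precursor in starting_materials("PhCH2").items():
    print(f"{route}: {precursor} -> {racemic_product('PhCH2')}")
```

For the benzyl side chain of phenylalanine this reproduces the starting materials used in the text: the α-ketoacid 3-phenyl-2-oxopropanoic acid for reductive amination and 3-phenylpropanoic acid for bromination followed by amination.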

24-6 Resolution of Amino Acids

All the laboratory syntheses of amino acids described in Section 24-5 produce racemic products. In most cases, only the L enantiomers are biologically active. The D enantiomers may even be toxic. Pure L enantiomers are needed for peptide synthesis if the product is to have the activity of the natural material. Therefore, we must be able to resolve a racemic amino acid into its enantiomers.

In many cases, amino acids can be resolved by the methods we have already discussed (Section 5-16). If a racemic amino acid is converted to a salt with an optically pure chiral acid or base, two diastereomeric salts are formed. These salts can be separated by physical means such as selective crystallization or chromatography. Pure enantiomers are then regenerated from the separated diastereomeric salts. Strychnine and brucine are naturally occurring optically active bases, and tartaric acid is used as an optically active acid for resolving racemic mixtures.

Enzymatic resolution is also used to separate the enantiomers of amino acids. Enzymes are chiral molecules with specific catalytic activities. For example, when an acylated amino acid is treated with an enzyme like hog kidney acylase or carboxypeptidase, the enzyme cleaves the acyl group from just the molecules having the natural (L) configuration. The enzyme does not recognize D-amino acids, so they are unaffected. The resulting mixture of acylated D-amino acid and deacylated L-amino acid is easily separated. Figure 24-5 shows how this selective enzymatic deacylation is accomplished.

[Figure: a racemic amino acid is acylated with acetic anhydride, (CH3CO)2O. On treatment with acylase, the acylated L-amino acid is deacylated and the acylated D-amino acid is unaffected, giving an easily separated mixture.]

FIGURE 24-5
Selective enzymatic deacylation. An acylase enzyme (such as hog kidney acylase or carboxypeptidase) deacylates only the natural L-amino acid.

PROBLEM 24-15 Suggest how you would separate the free L-amino acid from its acylated enantiomer in Figure 24-5.

24-7 Reactions of Amino Acids

Amino acids undergo many of the standard reactions of both amines and carboxylic acids. Conditions for some of these reactions must be carefully selected, however, so that the amino group does not interfere with a carboxyl group reaction, and vice versa. We will consider two of the most useful reactions, esterification of the carboxyl group and acylation of the amino group. These reactions are often used to protect either the carboxyl group or the amino group while the other group is being modified or coupled to another amino acid. Amino acids also undergo reactions that are specific to the α-amino acid structure. One of these unique amino acid reactions is the formation of a colored product on treatment with ninhydrin, discussed in Section 24-7C.


24-7A Esterification of the Carboxyl Group

Like monofunctional carboxylic acids, amino acids are esterified by treatment with a large excess of an alcohol and an acidic catalyst (often gaseous HCl). Under these acidic conditions, the amino group is present in its protonated (–NH3+) form, so it does not interfere with esterification. The following example illustrates esterification of an amino acid.

proline (zwitterion) + Ph–CH2–OH + HCl → proline benzyl ester (ammonium form, with Cl− counterion) (90%)

Esters of amino acids are often used as protected derivatives to prevent the carboxyl group from reacting in some undesired manner. Methyl, ethyl, and benzyl esters are the most common protecting groups. Aqueous acid hydrolyzes the ester and regenerates the free amino acid.

+H3N–CH(CH2–Ph)–C(=O)–OCH2CH3 (phenylalanine ethyl ester) + H3O+ → +H3N–CH(CH2–Ph)–C(=O)–OH (phenylalanine) + CH3CH2–OH

Benzyl esters are particularly useful as protecting groups because they can be removed either by acidic hydrolysis or by neutral hydrogenolysis (“breaking apart by addition of hydrogen”). Catalytic hydrogenation cleaves the benzyl ester, converting the benzyl group to toluene and leaving the deprotected amino acid. Although the mechanism of this hydrogenolysis is not well known, it apparently hinges on the ease of formation of benzylic intermediates.

+H3N–CH(CH2–Ph)–C(=O)–OCH2–Ph (phenylalanine benzyl ester) + H2 (Pd) → +H3N–CH(CH2–Ph)–C(=O)–O− (phenylalanine) + Ph–CH3 (toluene)

PROBLEM 24-16 Propose a mechanism for the acid-catalyzed hydrolysis of phenylalanine ethyl ester.

Decarboxylation is an important reaction of amino acids in many biological processes. Histamine, which causes runny noses and itchy eyes, is synthesized in the body by decarboxylation of histidine. The enzyme that catalyzes this reaction is called histidine decarboxylase.

[Structure: histamine, an imidazole ring bearing a CH2CH2NH2 side chain]

PROBLEM 24-17 Give equations for the formation and hydrogenolysis of glutamine benzyl ester.

24-7B Acylation of the Amino Group: Formation of Amides

Just as an alcohol esterifies the carboxyl group of an amino acid, an acylating agent converts the amino group to an amide. Acylation of the amino group is often done to protect it from unwanted nucleophilic reactions. A wide variety of acid chlorides and anhydrides are used for acylation. Benzyl chloroformate acylates the amino group to give a benzyloxycarbonyl derivative, often used as a protecting group in peptide synthesis (Section 24-10).

histidine + (CH3CO)2O (acetic anhydride) → N-acetylhistidine
H2N–CH[CH2CH(CH3)2]–COOH (leucine) + PhCH2O–C(=O)–Cl (benzyl chloroformate) → PhCH2O–C(=O)–NH–CH[CH2CH(CH3)2]–COOH (N-benzyloxycarbonyl leucine) (90%)

The amino group of the N-benzyloxycarbonyl derivative is protected as the amide half of a carbamate ester (a urethane, Section 21-16), which is more easily hydrolyzed than most other amides. In addition, the ester half of this urethane is a benzyl ester that undergoes hydrogenolysis. Catalytic hydrogenolysis of the N-benzyloxycarbonyl amino acid gives an unstable carbamic acid that quickly decarboxylates to give the deprotected amino acid.

PhCH2O–C(=O)–NH–CH[CH2CH(CH3)2]–COOH (N-benzyloxycarbonyl leucine) + H2 (Pd) → Ph–CH3 (toluene) + HO–C(=O)–NH–CH[CH2CH(CH3)2]–COOH (a carbamic acid)
HO–C(=O)–NH–CH[CH2CH(CH3)2]–COOH (a carbamic acid) → CO2↑ + H2N–CH[CH2CH(CH3)2]–COOH (leucine)

PROBLEM 24-18 Give equations for the formation and hydrogenolysis of N-benzyloxycarbonyl methionine.

24-7C Reaction with Ninhydrin

Ninhydrin is a common reagent for visualizing spots or bands of amino acids that have been separated by chromatography or electrophoresis. When ninhydrin reacts with an amino acid, one of the products is a deep violet, resonance-stabilized anion called Ruhemann’s purple. Ninhydrin produces this same purple dye regardless of the structure of the original amino acid. The side chain of the amino acid is lost as an aldehyde.

Reaction of an amino acid with ninhydrin

H2N–CH(R)–COOH (amino acid) + 2 ninhydrin → Ruhemann’s purple + CO2 + R–CHO   (in pyridine)

The reaction of amino acids with ninhydrin can detect amino acids on a wide variety of substrates. For example, if a kidnapper touches a ransom note with his fingers, the dermal ridges on his fingers leave traces of amino acids from skin secretions. Treatment of the paper with ninhydrin and pyridine causes these secretions to turn purple, forming a visible fingerprint.

PROBLEM 24-19 Use resonance forms to show delocalization of the negative charge in the Ruhemann’s purple anion.

SUMMARY Reactions of Amino Acids

1. Esterification of the carboxyl group (Section 24-7A)

+H3N–CH(R)–C(=O)–O− (amino acid) + R′–OH (alcohol) → +H3N–CH(R)–C(=O)–O–R′ (amino ester) + H2O   (H+ catalyst)

2. Acylation of the amino group: formation of amides (Section 24-7B)

H2N–CH(R)–C(=O)–OH (amino acid) + R′–C(=O)–X (acylating agent) → R′–C(=O)–NH–CH(R)–C(=O)–OH (acylated amino acid) + H–X

3. Reaction with ninhydrin (Section 24-7C)

H2N–CH(R)–COOH (amino acid) + 2 ninhydrin → Ruhemann’s purple + CO2 + R–CHO   (in pyridine)

4. Formation of peptide bonds (Sections 24-10 and 24-11)

+H3N–CH(R1)–C(=O)–O−  +  +H3N–CH(R2)–C(=O)–O−  →  +H3N–CH(R1)–C(=O)–NH–CH(R2)–C(=O)–O−   (loss of H2O; the new C(=O)–NH link is the peptide bond)

Amino acids also undergo many other common reactions of amines and acids.

24-8 Structure and Nomenclature of Peptides and Proteins

24-8A Peptide Structure

The most important reaction of amino acids is the formation of peptide bonds. Amines and acids can condense, with the loss of water, to form amides. Industrial processes often make amides simply by mixing the acid and the amine, then heating the mixture to drive off water.

R–C(=O)–OH (acid) + H2N–R′ (amine) → R–C(=O)–O− +H3N–R′ (salt)
R–C(=O)–O− +H3N–R′ (salt), heat → R–C(=O)–NH–R′ (amide) + H2O

Recall from Section 21-13 that amides are the most stable acid derivatives. This stability is partly due to the strong resonance interaction between the nonbonding electrons on nitrogen and the carbonyl group. The amide nitrogen is no longer a strong base, and the C–N bond has restricted rotation because of its partial double-bond character. Figure 24-6 shows the resonance forms we use to explain the partial double-bond character and restricted rotation of an amide bond. In a peptide, this partial double-bond character results in six atoms being held rather rigidly in a plane.

[Figure: resonance forms of an amide, with the peptide bond and the amide plane labeled.]

FIGURE 24-6
Resonance stabilization of an amide accounts for its enhanced stability, the weak basicity of the nitrogen atom, and the restricted rotation of the C–N bond. In a peptide, the amide bond is called a peptide bond. It holds six atoms in a plane: the C and O of the carbonyl, the N and its H, and the two associated α carbon atoms.

Having both an amino group and a carboxyl group, an amino acid is ideally suited to form an amide linkage. Under the proper conditions, the amino group of one molecule condenses with the carboxyl group of another. The product is an amide called a dipeptide because it consists of two amino acids. The amide linkage between the amino acids is called a peptide bond. Although it has a special name, a peptide bond is just like other amide bonds we have studied.

+H3N–CH(R1)–C(=O)–O−  +  +H3N–CH(R2)–C(=O)–O−  →  +H3N–CH(R1)–C(=O)–NH–CH(R2)–C(=O)–O−   (loss of H2O; the new C(=O)–NH link is the peptide bond)

In this manner, any number of amino acids can be bonded in a continuous chain. A peptide is a compound containing two or more amino acids linked by amide bonds between the amino group of each amino acid and the carboxyl group of the neighboring amino acid. Each amino acid unit in the peptide is called a residue. A polypeptide is a peptide containing many amino acid residues but usually having a molecular weight of less than about 5000. Proteins contain more amino acid units, with molecular weights ranging from about 6000 to about 40,000,000. The term oligopeptide is occasionally used for peptides containing about four to ten amino acid residues. Figure 24-7 shows the structure of the nonapeptide bradykinin, a human hormone that helps to control blood pressure.

FIGURE 24-7

The human hormone bradykinin is a nonapeptide with a free –NH3+ at its N terminus and a free –COO− at its C terminus. [Structure: Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg, drawn from the N terminus at the left to the C terminus at the right.]


The end of the peptide with the free amino group (–NH3+) is called the N-terminal end or the N terminus, and the end with the free carboxyl group (–COO−) is called the C-terminal end or the C terminus. Peptide structures are generally drawn with the N terminus at the left and the C terminus at the right, as bradykinin is drawn in Figure 24-7.

24-8B Peptide Nomenclature

The names of peptides reflect the names of the amino acid residues involved in the amide linkages, beginning at the N terminus. All except the last are given the -yl suffix of acyl groups. For example, the following dipeptide is named alanylserine. The alanine residue has the -yl suffix because it has acylated the nitrogen of serine.

H3N+–CH(CH3)–CO–NH–CH(CH2OH)–COO−
[Labels: alanyl (left residue), serine (right residue); Ala-Ser]

Bradykinin (Figure 24-7) is named as follows (without any spaces):

arginylprolylprolylglycylphenylalanylserylprolylphenylalanylarginine

This is a cumbersome and awkward name. A shorthand system is more convenient, representing each amino acid by its three-letter abbreviation. These abbreviations, given in Table 24-2, are generally the first three letters of the name. Once again, the amino acids are arranged from the N terminus at the left to the C terminus at the right. Bradykinin has the following abbreviated name:

Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg

Single-letter symbols (also given in Table 24-2) are becoming widely used as well. Using single letters, we symbolize bradykinin by

RPPGFSPFR

PROBLEM 24-20 Draw the complete structures of the following peptides:
(a) Thr-Phe-Met
(b) serylarginylglycylphenylalanine
(d) ELVIS
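The naming conventions of this section can be sketched in a few lines of Python. This is only an illustration: the `RESIDUES` table is a hand-written assumption that covers just the six amino acids found in bradykinin and alanylserine (the full set of abbreviations is in Table 24-2), and the function names are hypothetical.

```python
# Three-letter code -> (amino acid name, acyl "-yl" name, one-letter symbol).
# Hand-written for illustration; covers only the residues used in this section.
RESIDUES = {
    "Ala": ("alanine", "alanyl", "A"),
    "Arg": ("arginine", "arginyl", "R"),
    "Gly": ("glycine", "glycyl", "G"),
    "Phe": ("phenylalanine", "phenylalanyl", "F"),
    "Pro": ("proline", "prolyl", "P"),
    "Ser": ("serine", "seryl", "S"),
}


def full_name(abbreviated):
    """Build the systematic name from an N-to-C string such as 'Ala-Ser'."""
    codes = abbreviated.split("-")
    # Every residue except the last takes the -yl (acyl) suffix.
    acyl_parts = [RESIDUES[code][1] for code in codes[:-1]]
    return "".join(acyl_parts) + RESIDUES[codes[-1]][0]


def one_letter(abbreviated):
    """Convert three-letter abbreviations to single-letter symbols."""
    return "".join(RESIDUES[code][2] for code in abbreviated.split("-"))


BRADYKININ = "Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg"

print(full_name("Ala-Ser"))    # alanylserine
print(full_name(BRADYKININ))
print(one_letter(BRADYKININ))  # RPPGFSPFR
```

An explicit table is used instead of a spelling rule because the -yl names of some amino acids are not formed by a simple suffix swap.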

24-8C Disulfide Linkages

Amide linkages (peptide bonds) form the backbone of the amino acid chains we call peptides and proteins. A second kind of covalent bond is possible between any cysteine residues present. Cysteine residues can form disulfide bridges (also called disulfide linkages) which can join two chains or link a single chain into a ring. Mild oxidation joins two molecules of a thiol into a disulfide, forming a disulfide linkage between the two thiol molecules. This reaction is reversible, and a mild reduction cleaves the disulfide.

R–SH + HS–R (two molecules of thiol) ⇌ R–S–S–R (disulfide) + H2O   ([oxidation] forward, [reduction] reverse)

Similarly, two cysteine sulfhydryl (–SH) groups are oxidized to give a disulfide-linked pair of amino acids. This disulfide-linked dimer of cysteine is called cystine. Figure 24-8 shows formation of a cystine disulfide bridge linking two peptide chains.


FIGURE 24-8

Cystine, a dimer of cysteine, results when two cysteine residues are oxidized to form a disulfide bridge. [Scheme: two cysteine residues (–CH2–SH side chains on separate peptide chains) ⇌ cystine disulfide bridge (–CH2–S–S–CH2–) + H2O; [O] (oxidize) forward, [H] (reduce) reverse.]

Two cysteine residues may form a disulfide bridge within a single peptide chain, making a ring. Figure 24-9 shows the structure of human oxytocin, a peptide hormone that causes contraction of uterine smooth muscle and induces labor. Oxytocin is a nonapeptide with two cysteine residues (at positions 1 and 6) linking part of the molecule in a large ring. In drawing the structure of a complicated peptide, arrows are often used to connect the amino acids, showing the direction from N terminus to C terminus. Notice that the C terminus of oxytocin is a primary amide (Gly·NH2) rather than a free carboxyl group.

FIGURE 24-9

Structure of human oxytocin. A disulfide linkage holds part of the molecule in a large ring. [Sequence from the N terminus to the C terminus (amide form): Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly·NH2, with a cystine disulfide bridge between the two Cys residues.]


FIGURE 24-10

Structure of insulin. Two chains are joined at two positions by disulfide bridges, and a third disulfide bond holds the A chain in a ring. [Diagram: residue-by-residue sequences of the A chain and the B chain, each labeled with its N terminus and C terminus, with the disulfide bridges drawn between Cys residues.]

Orexin A (from the Greek orexis, “appetite”) is a 33 amino acid neuropeptide connected by two disulfide bridges. Orexin A is a powerful stimulant for food intake and gastric juice secretion. Scientists are studying orexin A to learn more about the regulation of appetite and eating, hoping to learn more about causes and potential treatments for anorexia nervosa.

24-9 Peptide Structure Determination

Figure 24-10 shows the structure of insulin, a more complex peptide hormone that regulates glucose metabolism. Insulin is composed of two separate peptide chains, the A chain, containing 21 amino acid residues, and the B chain, containing 30. The A and B chains are joined at two positions by disulfide bridges, and the A chain has an additional disulfide bond that holds six amino acid residues in a ring. The C-terminal amino acids of both chains occur as primary amides.

Disulfide bridges are commonly manipulated in the process of giving hair a permanent wave. Hair is composed of protein, which is made rigid and tough partly by disulfide bonds. When hair is treated with a solution of a thiol such as 2-mercaptoethanol (HS–CH2–CH2–OH), the disulfide bridges are reduced and cleaved. The hair is wrapped around curlers, and the disulfide bonds are allowed to re-form, either by air oxidation or by application of a neutralizer. The disulfide bonds re-form in new positions, holding the hair in the bent conformation enforced by the curlers.

Insulin is a relatively simple protein, yet it is a complicated organic structure. How is it possible to determine the complete structure of a protein with hundreds of amino acid residues and a molecular weight of many thousands? Chemists have developed clever ways to determine the exact sequence of amino acids in a protein. We will consider some of the most common methods.

24-9A Cleavage of Disulfide Linkages

The first step in structure determination is to break all the disulfide bonds, opening any disulfide-linked rings and separating the individual peptide chains. The individual peptide chains are then purified and analyzed separately. Cystine bridges are easily cleaved by reducing them to the thiol (cysteine) form. These reduced cysteine residues have a tendency to reoxidize and re-form disulfide bridges, however. A more permanent cleavage involves oxidizing the disulfide linkages with peroxyformic acid (Figure 24-11). This oxidation converts the disulfide bridges to sulfonic acid (–SO3H) groups. The oxidized cysteine units are called cysteic acid residues.


FIGURE 24-11

Oxidation of a protein by peroxyformic acid cleaves all the disulfide linkages by oxidizing cystine to cysteic acid. [Scheme: each –CH2–S–S–CH2– bridge, treated with peroxyformic acid (H–CO–OOH), gives two cysteic acid residues bearing –CH2–SO3H groups.]

24-9B Determination of the Amino Acid Composition

Once the disulfide bridges have been broken and the individual peptide chains have been separated and purified, the structure of each chain must be determined. The first step is to determine which amino acids are present and in what proportions. To analyze the amino acid composition, the peptide chain is completely hydrolyzed by boiling it for 24 hours in 6 M HCl. The resulting mixture of amino acids (the hydrolysate) is placed on the column of an amino acid analyzer, diagrammed in Figure 24-12.

FIGURE 24-12

In an amino acid analyzer, the hydrolysate passes through an ion-exchange column. The solution emerging from the column is treated with ninhydrin, and its absorbance is recorded as a function of time. Each amino acid is identified by the retention time required to pass through the column. [Diagram labels: buffer solution, hydrolysate, ion-exchange resin (different amino acids move at different speeds), ninhydrin solution, light, photocell, waste, recorder (intensity of absorption versus time).]


FIGURE 24-13

Use of an amino acid analyzer to determine the composition of human bradykinin. The bradykinin peaks for Pro, Arg, and Phe are larger than those in the standard equimolar mixture because bradykinin has three Pro residues, two Arg residues, and two Phe residues. [Traces: absorption versus time for the standard equimolar mixture and for the bradykinin hydrolysate.]
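What the analyzer reports for bradykinin can be mimicked with a short Python sketch. The first function simply counts residues, which is all that survives complete hydrolysis. The second turns peak areas into relative amounts by comparison with an equimolar standard; it assumes area is exactly proportional to amount, and the peak areas in the usage example are invented for illustration, not measured data.

```python
from collections import Counter

BRADYKININ = "Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg"


def composition(peptide):
    """Count each residue; hydrolysis keeps the amounts but loses the order."""
    return Counter(peptide.split("-"))


def relative_amounts(sample_areas, standard_areas):
    """Estimate mole ratios from peak areas (assumes area is proportional to amount).

    Each sample peak is divided by the peak of the same amino acid in the
    equimolar standard, then scaled so the least abundant amino acid is 1.
    """
    ratios = {aa: sample_areas[aa] / standard_areas[aa] for aa in sample_areas}
    smallest = min(ratios.values())
    return {aa: round(ratio / smallest, 2) for aa, ratio in ratios.items()}


print(composition(BRADYKININ))

# Hypothetical peak areas in arbitrary units, for illustration only.
sample = {"Pro": 2.4, "Arg": 2.0, "Gly": 1.1}
standard = {"Pro": 0.8, "Arg": 1.0, "Gly": 1.1}
print(relative_amounts(sample, standard))
```

Dividing by the standard first corrects for amino acids that give a stronger or weaker response at equal concentration.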

In the amino acid analyzer, the components of the hydrolysate are dissolved in an aqueous buffer solution and separated by passing them down an ion-exchange column. The solution emerging from the column is mixed with ninhydrin, which reacts with amino acids to give the purple ninhydrin color. The absorption of light is recorded and printed out as a function of time.

The time required for each amino acid to pass through the column (its retention time) depends on how strongly that amino acid interacts with the ion-exchange resin. The retention time of each amino acid is known from standardization with pure amino acids. The amino acids present in the sample are identified by comparing their retention times with the known values. The area under each peak is nearly proportional to the amount of the amino acid producing that peak, so we can determine the relative amounts of amino acids present. Figure 24-13 shows a standard trace of an equimolar mixture of amino acids, followed by the trace produced by the hydrolysate from human bradykinin (Arg-Pro-Pro-Gly-Phe-Ser-Pro-Phe-Arg).

Sequencing the Peptide: Terminal Residue Analysis

The amino acid analyzer determines the amino acids present in a peptide, but it does not reveal their sequence: the order in which they are linked together. The peptide sequence is destroyed in the hydrolysis step. To determine the amino acid sequence, we must cleave just one amino acid from the chain and leave the rest of the chain intact. The cleaved amino acid can be separated and identified, and the process can be repeated on the rest of the chain. The amino acid may be cleaved from either end of the peptide (either the N terminus or the C terminus), and we will consider one method used for each end. This general method for peptide sequencing is called terminal residue analysis.

24-9C Sequencing from the N Terminus: The Edman Degradation

The most efficient method for sequencing peptides is the Edman degradation. A peptide is treated with phenyl isothiocyanate, followed by acid hydrolysis. The products are the shortened peptide chain and a heterocyclic derivative of the N-terminal amino acid called a phenylthiohydantoin. This reaction takes place in three stages. First, the free amino group of the N-terminal amino acid reacts with phenyl isothiocyanate to form a phenylthiourea.


Second, the phenylthiourea cyclizes to a thiazolinone and expels the shortened peptide chain. Third, the thiazolinone isomerizes to the more stable phenylthiohydantoin.

Step 1: Nucleophilic attack by the free amino group on phenyl isothiocyanate, followed by a proton transfer, gives a phenylthiourea.

[Scheme: Ph–N=C=S + H2N–CHR1–CO–NH–peptide → Ph–NH–C(=S)–NH–CHR1–CO–NH–peptide (a phenylthiourea)]

Step 2: Treatment with HCl induces cyclization to a thiazolinone and expulsion of the shortened peptide chain.

[Scheme: protonated phenylthiourea → a thiazolinone + H2N–peptide]

Step 3: In acid, the thiazolinone isomerizes to the more stable phenylthiohydantoin.

[Scheme: thiazolinone, HCl → a phenylthiohydantoin]

The phenylthiohydantoin derivative is identified by chromatography, by comparing it with phenylthiohydantoin derivatives of the standard amino acids. This gives the identity of the original N-terminal amino acid. The rest of the peptide is cleaved intact, and further Edman degradations are used to identify additional amino acids in the chain. This process is well suited to automation, and several types of automatic sequencers have been developed.

Figure 24-14 shows the first two steps in the sequencing of oxytocin. Before sequencing, the oxytocin sample is treated with peroxyformic acid to convert the disulfide bridge to cysteic acid residues.

In theory, Edman degradations could sequence a peptide of any length. In practice, however, the repeated cycles of degradation cause some internal hydrolysis of the peptide, with loss of sample and accumulation of by-products. After about 30 cycles of degradation, further accurate analysis becomes impossible. A small peptide such as bradykinin can be completely determined by Edman degradation, but larger proteins must be broken into smaller fragments (Section 24-9E) before they can be completely sequenced.

PROBLEM 24-21 Draw the structure of the phenylthiohydantoin derivatives of
(a) alanine (b) tryptophan (c) lysine (d) proline


Step 1: Cleavage and determination of the N-terminal amino acid

[Scheme: the oxidized peptide, whose N-terminal residue is cysteic acid (side chain –CH2SO3H) followed by Tyr-Ile-Gln-peptide, is treated with (1) Ph–N=C=S and (2) H3O+ to give cysteic acid phenylthiohydantoin + H2N-Tyr-Ile-Gln-peptide.]

Step 2: Cleavage and determination of the second amino acid (the new N-terminal amino acid)

[Scheme: H2N-Tyr-Ile-Gln-peptide is treated with (1) Ph–N=C=S and (2) H3O+ to give tyrosine phenylthiohydantoin + H2N-Ile-Gln-peptide.]

FIGURE 24-14

The first two steps in sequencing oxytocin. Each Edman degradation cleaves the N-terminal amino acid and forms its phenylthiohydantoin derivative. The shortened peptide is available for the next step.
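The bookkeeping of repeated Edman cycles can be sketched in Python. The sketch models only which residue is identified in each cycle and what remains; it does not model the chemistry. The label `Cya` for cysteic acid, the token `Gly(NH2)` for the C-terminal amide, and the function names are assumptions made for this illustration; the 30-cycle cap reflects the practical limit mentioned in the text.

```python
MAX_RELIABLE_CYCLES = 30  # the text: accuracy is lost after about 30 cycles

# Oxytocin after peroxyformic acid oxidation. "Cya" (cysteic acid) and
# "Gly(NH2)" (C-terminal amide) are labels chosen for this sketch.
OXYTOCIN_OXIDIZED = ["Cya", "Tyr", "Ile", "Gln", "Asn", "Cya", "Pro", "Leu", "Gly(NH2)"]


def edman_cycle(residues):
    """One cycle: return the N-terminal residue and the shortened peptide."""
    if not residues:
        raise ValueError("no residues left to degrade")
    return residues[0], residues[1:]


def edman_sequence(peptide, max_cycles=MAX_RELIABLE_CYCLES):
    """Run cycles until the peptide is used up or the cycle limit is reached.

    Returns (residues identified in order, residues left unsequenced).
    """
    remaining = list(peptide)
    identified = []
    while remaining and len(identified) < max_cycles:
        first, remaining = edman_cycle(remaining)
        identified.append(first)
    return identified, remaining


first_two, rest = edman_sequence(OXYTOCIN_OXIDIZED, max_cycles=2)
print(first_two)  # ['Cya', 'Tyr']
print(rest)
```

A chain longer than the cycle limit returns a nonempty remainder, which is the situation that calls for cleaving the protein into shorter fragments first.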

PROBLEM 24-22 Show the third and fourth steps in the sequencing of oxytocin. Use Figure 24-14 as a guide.

PROBLEM 24-23 The Sanger method for N-terminus determination is a less common alternative to the Edman degradation. In the Sanger method, the peptide is treated with the Sanger reagent, 2,4-dinitrofluorobenzene, and then hydrolyzed by reaction with 6 M aqueous HCl. The N-terminal amino acid is recovered as its 2,4-dinitrophenyl derivative and identified.

The Sanger method

[Scheme: 2,4-dinitrofluorobenzene (Sanger reagent) + H2N–CHR1–CO–NH–peptide → (2,4-dinitrophenyl)–NH–CHR1–CO–NH–peptide (derivative); then 6 M HCl, heat → (2,4-dinitrophenyl)–NH–CHR1–COOH (2,4-dinitrophenyl derivative) + amino acids]

(a) Propose a mechanism for the reaction of the N terminus of the peptide with 2,4-dinitrofluorobenzene.
(b) Explain why the Edman degradation is usually preferred over the Sanger method.


24-9D C-Terminal Residue Analysis

There is no efficient method for sequencing several amino acids of a peptide starting from the C terminus. In many cases, however, the C-terminal amino acid can be identified using the enzyme carboxypeptidase, which cleaves the C-terminal peptide bond. The products are the free C-terminal amino acid and a shortened peptide. Further reaction cleaves the second amino acid that has now become the new C terminus of the shortened peptide. Eventually, the entire peptide is hydrolyzed to its individual amino acids.

[Scheme: peptide–NH–CH(Rn−1)–CO–NH–CH(Rn)–COOH + H2O, carboxypeptidase → peptide–NH–CH(Rn−1)–COOH + H2N–CH(Rn)–COOH (free amino acid) → (further cleavage)]

The selective enzymatic cleavage of proteins is critical to many biological processes. For example, the clotting of blood depends on the enzyme thrombin cleaving fibrinogen at specific points to produce fibrin, the protein that forms a clot.

A peptide is incubated with the carboxypeptidase enzyme, and the appearance of free amino acids is monitored. In theory, the amino acid whose concentration increases first should be the C terminus, and the next amino acid to appear should be the second residue from the end. In practice, different amino acids are cleaved at different rates, so it is difficult to identify any residues beyond the C-terminal amino acid and, occasionally, the second residue from the end.

24-9E Breaking the Peptide into Shorter Chains: Partial Hydrolysis

Before a large protein can be sequenced, it must be broken into smaller chains, not longer than about 30 amino acids. Each of these shortened chains is sequenced, and then the entire structure of the protein is deduced by fitting the short chains together like pieces of a jigsaw puzzle.

Partial cleavage can be accomplished either by using dilute acid with a shortened reaction time or by using enzymes, such as trypsin and chymotrypsin, that break bonds between specific amino acids. The acid-catalyzed cleavage is not very selective, leading to a mixture of short fragments resulting from cleavage at various positions. Enzymes are more selective, giving cleavage at predictable points in the chain.

TRYPSIN: Cleaves the chain at the carboxyl groups of the basic amino acids lysine and arginine.
CHYMOTRYPSIN: Cleaves the chain at the carboxyl groups of the aromatic amino acids phenylalanine, tyrosine, and tryptophan.

Let’s use oxytocin (Figure 24-9) as an example to illustrate the use of partial hydrolysis. Oxytocin could be sequenced directly by C-terminal analysis and a series of Edman degradations, but it provides a simple example of how a structure can be pieced together from fragments. Acid-catalyzed partial hydrolysis of oxytocin (after cleavage of the disulfide bridge) gives a mixture that includes the following peptides:

Ile-Gln-Asn-Cys
Gln-Asn-Cys-Pro
Pro-Leu-Gly·NH2
Cys-Tyr-Ile-Gln-Asn
Cys-Pro-Leu-Gly

When we match the overlapping regions of these fragments, the complete sequence of oxytocin appears:

Cys-Tyr-Ile-Gln-Asn
        Ile-Gln-Asn-Cys
            Gln-Asn-Cys-Pro
                    Cys-Pro-Leu-Gly
                        Pro-Leu-Gly·NH2

Complete structure: Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly·NH2
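The jigsaw-style matching of overlapping fragments can also be done mechanically. The Python sketch below uses a greedy strategy (repeatedly join the two pieces that share the longest end-to-start overlap), which is an assumption of this example and not the procedure described in the text; it works for the oxytocin fragments but can be misled when a peptide contains repeated stretches. For simplicity the C-terminal amide Gly·NH2 is written as plain `Gly`.

```python
def overlap(a, b):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def assemble(fragments):
    """Greedily merge fragments (strings like 'Ile-Gln-Asn-Cys') by overlap."""
    pieces = [fragment.split("-") for fragment in fragments]
    while len(pieces) > 1:
        best = (0, None, None)
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    k = overlap(a, b)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # nothing overlaps; the pieces cannot be joined further
        merged = pieces[i] + pieces[j][k:]
        pieces = [p for n, p in enumerate(pieces) if n not in (i, j)] + [merged]
    return ["-".join(piece) for piece in pieces]


# Fragments from the partial hydrolysis of oxytocin (amide written as Gly).
FRAGMENTS = [
    "Ile-Gln-Asn-Cys",
    "Gln-Asn-Cys-Pro",
    "Pro-Leu-Gly",
    "Cys-Tyr-Ile-Gln-Asn",
    "Cys-Pro-Leu-Gly",
]

print(assemble(FRAGMENTS))  # ['Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly']
```

Taking the longest overlap first matters here because Cys occurs twice in oxytocin, so one-residue overlaps alone would be ambiguous.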

Proteolytic (protein-cleaving) enzymes also have applications in consumer products. For example, papain (from papaya extract) serves as a meat tenderizer. It cleaves the fibrous proteins, making the meat less tough.
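The trypsin and chymotrypsin specificities given in Section 24-9E can be expressed as a small Python sketch. It applies only the simple rule stated there (cleave on the carboxyl side of the listed residues) and ignores any further exceptions that real enzymes show. The example peptide is hypothetical and was chosen only to contain a cleavage site of each kind.

```python
# Residues after which each enzyme cleaves, as stated in Section 24-9E.
TRYPSIN_SITES = {"Lys", "Arg"}
CHYMOTRYPSIN_SITES = {"Phe", "Tyr", "Trp"}


def cleave(peptide, sites):
    """Split an N-to-C peptide string after every residue found in `sites`."""
    fragments, current = [], []
    for residue in peptide.split("-"):
        current.append(residue)
        if residue in sites:
            # Cleavage at this residue's carboxyl group ends the fragment.
            fragments.append("-".join(current))
            current = []
    if current:
        fragments.append("-".join(current))
    return fragments


EXAMPLE = "Ala-Lys-Gly-Phe-Ser-Arg-Trp-Val"  # hypothetical peptide

print(cleave(EXAMPLE, TRYPSIN_SITES))
# ['Ala-Lys', 'Gly-Phe-Ser-Arg', 'Trp-Val']
print(cleave(EXAMPLE, CHYMOTRYPSIN_SITES))
# ['Ala-Lys-Gly-Phe', 'Ser-Arg-Trp', 'Val']
```

Because the two enzymes cut at different residues, their fragment sets overlap, which is what allows the fragments to be ordered.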


The two Cys residues in oxytocin may be involved in disulfide bridges, either linking two of these peptide units or forming a ring. By measuring the molecular weight of oxytocin, we can show that it contains just one of these peptide units; therefore, the Cys residues must link the molecule in a ring.

PROBLEM 24-24 Show where trypsin and chymotrypsin would cleave the following peptide.
Tyr-Ile-Gln-Arg-Leu-Gly-Phe-Lys-Asn-Trp-Phe-Gly-Ala-Lys-Gly-Gln-Gln·NH2

PROBLEM 24-25 After treatment with peroxyformic acid, the peptide hormone vasopressin is partially hydrolyzed. The following fragments are recovered. Propose a structure for vasopressin.
Phe-Gln-Asn    Asn-Cys-Pro-Arg    Pro-Arg-Gly·NH2    Tyr-Phe-Gln-Asn    Cys-Tyr-Phe

24-10 Solution-Phase Peptide Synthesis

24-10A Introduction

Total synthesis of peptides is rarely an economical method for their commercial production. Important peptides are usually derived from biological sources. For example, insulin for diabetics was originally taken from pork pancreas. Now, recombinant DNA techniques have improved the quality and availability of peptide pharmaceuticals. It is possible to extract the piece of DNA that contains the code for a particular protein, insert it into a bacterium, and induce the bacterium to produce the protein. Strains of Escherichia coli have been developed to produce human insulin that avoids dangerous reactions in people who are allergic to pork products.

Laboratory peptide synthesis is still an important area of chemistry, however, for two reasons: If the synthetic peptide is the same as the natural peptide, it proves the structure is correct; and the synthesis provides a larger amount of the material for further biological testing. Also, synthetic peptides can be made with altered amino acid sequences to compare their biological activity with the natural peptides. These comparisons can point out the critical areas of the peptide, which may suggest causes and treatments for genetic diseases involving similar abnormal peptides.

Peptide synthesis requires the formation of amide bonds between the proper amino acids in the proper sequence. With simple acids and amines, we would form an amide bond simply by converting the acid to an activated derivative (such as an acyl halide or anhydride) and adding the amine.

R–CO–X + H2N–R′ → R–CO–NH–R′ + H–X
(X is a good leaving group, preferably electron-withdrawing)

Amide formation is not so easy with amino acids, however. Each amino acid has both an amino group and a carboxyl group. If we activate the carboxyl group, it reacts with its own amino group. If we mix some amino acids and add a reagent to make them couple, they form every conceivable sequence. Also, some amino acids have side chains that might interfere with peptide formation. For example, glutamic acid has an extra carboxyl group, and lysine has an extra amino group. As a result, peptide synthesis always involves both activating reagents to form the correct peptide bonds and protecting groups to block formation of incorrect bonds. Chemists have developed many ways of synthesizing peptides, falling into two major groups. The solution-phase method involves adding reagents to solutions


of growing peptide chains and purifying the products as needed. The solid-phase method involves adding reagents to growing peptide chains bonded to solid polymer particles. Many different reagents are available for each of these methods, but we will consider only one set of reagents for the solution-phase method and one set for the solid-phase method. The general principles are the same regardless of the specific reagents.

24-10B Solution-Phase Method

Consider the structure of alanylvalylphenylalanine, a simple tripeptide:

H2N–CH(CH3)–CO–NH–CH(CH(CH3)2)–CO–NH–CH(CH2Ph)–COOH
[Labels: alanyl, valyl, phenylalanine; Ala-Val-Phe]

Solution-phase peptide synthesis begins at the N terminus and ends at the C terminus, or left to right as we draw the peptide. The first major step is to couple the carboxyl group of alanine to the amino group of valine. This cannot be done simply by activating the carboxyl group of alanine and adding valine. If we activated the carboxyl group of alanine, it would react with another molecule of alanine.

To prevent side reactions, the amino group of alanine must be protected to make it nonnucleophilic. In Section 24-7B, we saw that an amino acid reacts with benzyl chloroformate (also called benzyloxycarbonyl chloride) to form a urethane, or carbamate ester, that is easily removed at the end of the synthesis. This protecting group has been used for many years, and it has acquired several names. It is called the benzyloxycarbonyl group, the carbobenzoxy group (Cbz), or simply the Z group (abbreviated Z).

Preliminary step: Protect the amino group with Z.

[Scheme: PhCH2–O–CO–Cl (benzyl chloroformate, Z-Cl) + H2N–CH(CH3)–COOH (alanine, Ala), Et3N → PhCH2–O–CO–NH–CH(CH3)–COOH (benzyloxycarbonyl alanine, Z-Ala) + HCl; the PhCH2–O–CO– unit is the Z group]

The amino group in Z-Ala is protected as the nonnucleophilic amide half of a carbamate ester. The carboxyl group can be activated without reacting with the protected amino group. Treatment with ethyl chloroformate converts the carboxyl group to a mixed anhydride of the amino acid and carbonic acid. It is strongly activated toward nucleophilic attack.

Step 1: Activate the carboxyl group with ethyl chloroformate.

[Scheme: Z–NH–CH(CH3)–COOH (protected alanine) + Cl–CO–OCH2CH3 (ethyl chloroformate) → Z–NH–CH(CH3)–CO–O–CO–OCH2CH3 (mixed anhydride; the –O–CO–OCH2CH3 part is the anhydride of carbonic acid) + HCl]


When the second amino acid (valine) is added to the protected, activated alanine, the nucleophilic amino group of valine attacks the activated carbonyl of alanine, displacing the anhydride and forming a peptide bond. (Some procedures use an ester of the new amino acid to avoid competing reactions from its carboxylate group.)

Step 2: Form an amide bond to couple the next amino acid.

[Scheme: Z–NH–CH(CH3)–CO–O–CO–OCH2CH3 (protected, activated alanine) + H2N–CH(CH(CH3)2)–COOH (valine) → Z–NH–CH(CH3)–CO–NH–CH(CH(CH3)2)–COOH (Z-Ala-Val) + CO2 + CH3CH2OH]

PROBLEM 24-26 Give complete mechanisms for the formation of Z-Ala, its activation by ethyl chloroformate, and the coupling with valine.

At this point, we have the N-protected dipeptide Z-Ala-Val. Phenylalanine must be added to the C terminus to complete the Ala-Val-Phe tripeptide. Activation of the valine carboxyl group, followed by addition of phenylalanine, gives the protected tripeptide.

Step 1: Activate the carboxyl group with ethyl chloroformate.

[Scheme: Z-Ala-Val, drawn as Z–NH–CH(CH3)–CO–NH–CH(CH(CH3)2)–COOH, + Cl–CO–OEt → Z–NH–CH(CH3)–CO–NH–CH(CH(CH3)2)–CO–O–CO–OEt + HCl]

Step 2: Form an amide bond to couple the next amino acid.

[Scheme: activated Z-Ala-Val (…–CO–O–CO–OEt) + H2N–CH(CH2Ph)–COOH (phenylalanine) → Z-Ala-Val-Phe + CO2 + EtOH]

To make a larger peptide, repeat these two steps for the addition of each amino acid residue:

1. Activate the C terminus of the growing peptide by reaction with ethyl chloroformate.
2. Couple the next amino acid.

The final step in the solution-phase synthesis is to deprotect the N terminus of the completed peptide. The N-terminal amide bond must be cleaved without breaking any of the peptide bonds in the product. Fortunately, the benzyloxycarbonyl group is partly an amide and partly a benzyl ester, and hydrogenolysis of the benzyl ester takes place under mild conditions that do not cleave the peptide bonds. This mild cleavage is the reason for using the benzyloxycarbonyl group (as opposed to some other acyl group) to protect the N terminus.

Final step: Remove the protecting group.

[Scheme: Z-Ala-Val-Phe, H2, Pd → Ala-Val-Phe + CO2 + PhCH3]
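The protect, activate, couple, and deprotect sequence described above can be written out for any short peptide with a small Python helper. This is only a bookkeeping sketch using the reagents named in this section; the function name is hypothetical, and the sketch assumes that no side chain needs its own protecting group.

```python
def solution_phase_steps(peptide):
    """List the solution-phase steps (N terminus to C terminus) for a peptide.

    `peptide` is an N-to-C string such as 'Ala-Val-Phe'. Side-chain
    protection is not modeled (an assumption of this sketch).
    """
    residues = peptide.split("-")
    chain = f"Z-{residues[0]}"
    steps = [
        f"Protect the amino group of {residues[0]} with benzyl chloroformate (Z-Cl) to give {chain}"
    ]
    for next_residue in residues[1:]:
        steps.append(f"Activate the C terminus of {chain} with ethyl chloroformate")
        chain = f"{chain}-{next_residue}"
        steps.append(f"Couple {next_residue} to give {chain}")
    steps.append(f"Remove the Z group by hydrogenolysis (H2, Pd) to give {peptide}")
    return steps


for number, step in enumerate(solution_phase_steps("Ala-Val-Phe"), start=1):
    print(f"{number}. {step}")
```

For a peptide of n residues the list has 2n steps: one protection, n − 1 activations, n − 1 couplings, and one deprotection, not counting the purifications in between.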



PROBLEM 24-27 Show how you would synthesize Ala-Val-Phe-Gly-Leu starting with Z-Ala-Val-Phe.

PROBLEM 24-28 Show how the solution-phase synthesis would be used to synthesize Ile-Gly-Asn.

The solution-phase method works well for small peptides, and many peptides have been synthesized by this process. A large number of chemical reactions and purifications are required even for a small peptide, however. Although the individual yields are excellent, with a large peptide, the overall yield becomes so small as to be unusable, and several months (or years) are required to complete so many steps. The large amounts of time required and the low overall yields are due largely to the purification steps. For larger peptides and proteins, solid-phase peptide synthesis is usually preferred.
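The way excellent individual yields shrink to an unusable overall yield follows from simple multiplication, as the Python sketch below shows. It assumes every step (reaction plus purification) has the same fractional yield, which is a simplification, and the numbers in the usage lines are illustrative, not data from any actual synthesis.

```python
def overall_yield(step_yield, n_steps):
    """Overall fractional yield after n_steps, each with the same step_yield."""
    return step_yield ** n_steps


def required_step_yield(target_overall, n_steps):
    """Average per-step yield needed to reach target_overall in n_steps."""
    return target_overall ** (1.0 / n_steps)


# Illustrative numbers only: a 95% yield per step, over more and more steps.
for steps in (5, 20, 100):
    print(steps, round(overall_yield(0.95, steps), 4))

# Per-step yield needed to keep half the material after 20 steps.
print(round(required_step_yield(0.50, 20), 4))
```

The inverse function makes the same point from the other side: keeping a usable overall yield over many steps requires nearly quantitative yields at every step.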

In 1962, Robert Bruce Merrifield of Rockefeller University developed a method for synthesizing peptides without having to purify the intermediates. He did this by attaching the growing peptide chains to solid polystyrene beads. After each amino acid is added, the excess reagents are washed away by rinsing the beads with solvent. This ingenious method lends itself to automation, and Merrifield built a machine that can add several amino acid units while running unattended. Using this machine, Merrifield synthesized ribonuclease (124 amino acids) in just six weeks, obtaining an overall yield of 17%. Merrifield's work in solid-phase peptide synthesis won the Nobel Prize in Chemistry in 1984.

24-11 Solid-Phase Peptide Synthesis

24-11A The Individual Reactions

Three reactions are crucial for solid-phase peptide synthesis. These reactions attach the first amino acid to the solid support, protect each amino group until it is time for it to react, and form the peptide bonds between the amino acids.

Attaching the Peptide to the Solid Support

The greatest difference between solution-phase and solid-phase peptide synthesis is that solid-phase synthesis is done in the opposite direction: starting with the C terminus and going toward the N terminus, right to left as we write the peptide. The first step is to attach the last amino acid (the C terminus) to the solid support. The solid support is a special polystyrene bead in which some of the aromatic rings have chloromethyl groups. This polymer, often called the Merrifield resin, is made by copolymerizing styrene with a few percent of p-(chloromethyl)styrene.

Formation of the Merrifield resin

[Scheme: styrene + p-(chloromethyl)styrene → polystyrene chain (polymer) in which some of the rings carry CH2Cl groups; abbreviation: polymer–CH2Cl]

Problem-solving hint

Remember that classical (solution-phase) peptide synthesis:
1. Goes N → C. Protect the N terminus (Z group) first, deprotect last.
2. Couple each amino acid by activating the C terminus (ethyl chloroformate), then adding the new amino acid.


Like other benzyl halides, the chloromethyl groups on the polymer are reactive toward SN2 attack. The carboxyl group of an N-protected amino acid displaces chloride, giving an amino acid ester of the polymer. In effect, the polymer serves as the alcohol part of an ester protecting group for the carboxyl end of the C-terminal amino acid. The amino group must be protected, or it would attack the chloromethyl groups.

Attachment of the C-terminal amino acid

[Scheme: (protecting group)–NH–CHR–COO− + Cl–CH2–polymer → (protecting group)–NH–CHR–CO–O–CH2–polymer + Cl−]

Once the C-terminal amino acid is fixed to the polymer, the chain is built on the amino group of this amino acid.

Using the tert-Butyloxycarbonyl (Boc) Protecting Group

The benzyloxycarbonyl group (the Z group) cannot be used with the solid-phase process because the Z group is removed by hydrogenolysis in contact with a solid catalyst. A polymer-bound peptide cannot achieve the intimate contact with a solid catalyst required for hydrogenolysis. The N-protecting group used in the Merrifield procedure is the tert-butyloxycarbonyl group, abbreviated Boc or t-Boc. The Boc group is similar to the Z group, except that it has a tert-butyl group in place of the benzyl group. Like other tert-butyl esters, the Boc protecting group is easily removed under acidic conditions.

The acid chloride of the Boc group is unstable, so we use the anhydride, di-tert-butyldicarbonate, to attach the group to the amino acid.

Protection of the amino group as its Boc derivative

[Scheme: (CH3)3C–O–CO–O–CO–O–C(CH3)3 (di-tert-butyldicarbonate) + H2N–CHR–COOH (amino acid) → (CH3)3C–O–CO–NH–CHR–COOH (Boc-amino acid) + CO2 + (CH3)3C–OH]

The Boc group is easily cleaved by brief treatment with trifluoroacetic acid (TFA), CF3COOH. Loss of a relatively stable tert-butyl cation from the protonated ester gives an unstable carbamic acid. Decarboxylation of the carbamic acid gives the deprotected amino group of the amino acid. Loss of a proton from the tert-butyl cation gives isobutylene.


(CH3)3C-O-CO-NH-CHR-COOH (Boc-amino acid) + CF3COOH → protonated Boc-amino acid
protonated Boc-amino acid → (CH3)3C⁺ + HO-CO-NH-CHR-COOH (a carbamic acid)
HO-CO-NH-CHR-COOH + H⁺ → H3N⁺-CHR-COOH (free amino acid) + CO2
(CH3)3C⁺ → CH2=C(CH3)2 (isobutylene) + H⁺

People who synthesize peptides generally do not make their own Boc-protected amino acids. Because they use all their amino acids in protected form, they buy and use commercially available Boc amino acids.

Use of DCC as a Peptide Coupling Agent

The final reaction needed for the Merrifield procedure is the peptide bond-forming condensation. When a mixture of an amine and an acid is treated with N,N′-dicyclohexylcarbodiimide (abbreviated DCC), the amine and the acid couple to form an amide. The molecule of water lost in this condensation converts DCC to N,N′-dicyclohexyl urea (DCU).

R-COO⁻ (acid) + H3N⁺-R′ (amine) + C6H11-N=C=N-C6H11 (DCC) → R-CO-NH-R′ (amide) + C6H11-NH-CO-NH-C6H11 (DCU)
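The bookkeeping of this condensation can be checked numerically. The following is a minimal Python sketch, not part of the original text: the helper names (`molar_mass`, `peptide_mass`) are hypothetical, the atomic masses are rounded standard values, and the molecular formulas are written out by hand. It checks that DCU differs from DCC by exactly one water, and that a peptide weighs the sum of its free amino acids minus one water per amide bond.

```python
# Rounded standard atomic masses in g/mol (assumed values for illustration).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(formula):
    """Molar mass from an element-count dict such as {"H": 2, "O": 1}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


WATER = {"H": 2, "O": 1}
DCC = {"C": 13, "H": 22, "N": 2}          # C6H11-N=C=N-C6H11
DCU = {"C": 13, "H": 24, "N": 2, "O": 1}  # C6H11-NH-CO-NH-C6H11

# Free (uncoupled) amino acids of the tripeptide Ala-Val-Phe.
AMINO_ACIDS = {
    "Ala": {"C": 3, "H": 7, "N": 1, "O": 2},
    "Val": {"C": 5, "H": 11, "N": 1, "O": 2},
    "Phe": {"C": 9, "H": 11, "N": 1, "O": 2},
}


def peptide_mass(residues):
    """Sum of the free amino acids minus one water per amide bond formed."""
    total = sum(molar_mass(AMINO_ACIDS[name]) for name in residues)
    return total - (len(residues) - 1) * molar_mass(WATER)


if __name__ == "__main__":
    # Mass gained by DCC on becoming DCU: one molecule of water.
    print(round(molar_mass(DCU) - molar_mass(DCC), 3))
    # Two amide bonds are formed in a tripeptide, so two waters are lost.
    print(round(peptide_mass(["Ala", "Val", "Phe"]), 2))
```

With these rounded masses the tripeptide comes out near 335.4 g/mol, which matches the molecular formula C17H25N3O4.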

The mechanism for DCC coupling is not as complicated as it may seem. The carboxylate ion adds to the strongly electrophilic carbon of the diimide, giving an activated acyl derivative of the acid. This activated derivative reacts readily with the amine to give the amide. In the final step, DCU serves as an excellent leaving group. In the schemes that follow, each cyclohexane ring is abbreviated as C6H11.

Formation of an activated acyl derivative

R-COO⁻ + C6H11-N=C=N-C6H11 → R-CO-O-C(=N-C6H11)-N⁻-C6H11
R-CO-O-C(=N-C6H11)-N⁻-C6H11 + H3N⁺-R′ → R-CO-O-C(=N-C6H11)-NH-C6H11 (activated) + H2N-R′

Coupling with the amine and loss of DCU

R-CO-O-C(=N-C6H11)-NH-C6H11 + H2N-R′ → R-C(O⁻)(N⁺H2R′)-O-C(=N-C6H11)-NH-C6H11 (tetrahedral intermediate)
tetrahedral intermediate → R-CO-NHR′ (amide) + C6H11-NH-CO-NH-C6H11 (DCU)


At the completion of the synthesis, the ester bond to the polymer is cleaved by anhydrous HF. Because this is an ester bond, it is more easily cleaved than the amide bonds of the peptide.

Cleavage of the finished peptide

peptide-CO-O-CH2-Ⓟ + HF → peptide-COOH + F-CH2-Ⓟ

PROBLEM 24-29 Propose a mechanism for the coupling of acetic acid and aniline using DCC as a coupling agent.

Now we consider an example to illustrate how these procedures are combined in the Merrifield solid-phase peptide synthesis.

Problem-solving Hint
Remember that solid-phase peptide synthesis:
1. Goes C → N. Attach the Boc-protected C terminus to the bead first.
2. Couple each amino acid by removing (TFA) the Boc group from the N terminus, then add the next Boc-protected amino acid with DCC.
3. Cleave (HF) the finished peptide from the bead.

24-11B An Example of Solid-Phase Peptide Synthesis

For easy comparison of the solution-phase and solid-phase methods, we will consider the synthesis of the same tripeptide we made using the solution-phase method.

Ala-Val-Phe

The solid-phase synthesis is carried out in the direction opposite that of the solution-phase synthesis. The first step is attachment of the N-protected C-terminal amino acid (Boc-phenylalanine) to the polymer.

Me3C-O-CO-NH-CH(CH2Ph)-COO⁻ (Boc-Phe) + Cl-CH2-Ⓟ → Me3C-O-CO-NH-CH(CH2Ph)-CO-O-CH2-Ⓟ (Boc-Phe-Ⓟ) + Cl⁻

Trifluoroacetic acid (TFA) cleaves the Boc protecting group of phenylalanine so that its amino group can be coupled with the next amino acid.

Boc-NH-CH(CH2Ph)-CO-O-CH2-Ⓟ (Boc-Phe-Ⓟ) + CF3COOH (TFA) → H3N⁺-CH(CH2Ph)-CO-O-CH2-Ⓟ (Phe-Ⓟ) + CH2=C(CH3)2 + CO2


The second amino acid (valine) is added in its N-protected Boc form so that it cannot couple with itself. Addition of DCC couples the valine carboxyl group with the free -NH2 group of phenylalanine.

Boc-NH-CH[CH(CH3)2]-COO⁻ (Boc-Val) + H3N⁺-CH(CH2Ph)-CO-O-CH2-Ⓟ (Phe-Ⓟ) + DCC → Boc-NH-CH[CH(CH3)2]-CO-NH-CH(CH2Ph)-CO-O-CH2-Ⓟ (Boc-Val-Phe-Ⓟ) + DCU

To couple the final amino acid (alanine), the chain is first deprotected by treatment with trifluoroacetic acid. Then the N-protected Boc-alanine and DCC are added.

Step 1: Deprotection

Boc-NH-CH[CH(CH3)2]-CO-NH-CH(CH2Ph)-CO-O-CH2-Ⓟ (Boc-Val-Phe-Ⓟ) + CF3COOH (TFA) → H3N⁺-CH[CH(CH3)2]-CO-NH-CH(CH2Ph)-CO-O-CH2-Ⓟ (Val-Phe-Ⓟ) + CH2=C(CH3)2 + CO2

Step 2: Coupling

Boc-NH-CH(CH3)-COO⁻ + H3N⁺-CH[CH(CH3)2]-CO-NH-CH(CH2Ph)-CO-O-CH2-Ⓟ (Val-Phe-Ⓟ) + DCC → Boc-NH-CH(CH3)-CO-NH-CH[CH(CH3)2]-CO-NH-CH(CH2Ph)-CO-O-CH2-Ⓟ (Boc-Ala-Val-Phe-Ⓟ) + DCU

If we were making a longer peptide, the addition of each subsequent amino acid would require the repetition of two steps:
1. Use trifluoroacetic acid to deprotect the amino group at the end of the growing chain.
2. Add the next Boc-amino acid, using DCC as a coupling agent.

Once the peptide is completed, the final Boc protecting group must be removed, and the peptide must be cleaved from the polymer. Anhydrous HF cleaves the ester linkage that bonds the peptide to the polymer, and it also removes the Boc protecting group. In our example, the following reaction occurs:

Boc-NH-CH(CH3)-CO-NH-CH[CH(CH3)2]-CO-NH-CH(CH2Ph)-CO-O-CH2-Ⓟ (Boc-Ala-Val-Phe-Ⓟ) + HF → H3N⁺-CH(CH3)-CO-NH-CH[CH(CH3)2]-CO-NH-CH(CH2Ph)-COOH (Ala-Val-Phe) + CH2=C(CH3)2 + CO2 + F-CH2-Ⓟ
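The repeating cycle described in this section (attach, then deprotect and couple for each further residue, then cleave) lends itself to a short planning routine. The following Python sketch is illustrative only: the function name `merrifield_plan` is hypothetical, the peptide is assumed to be written N terminus first with hyphen-separated three-letter codes, and side-chain protecting groups, which some amino acids would also need, are ignored.

```python
def merrifield_plan(peptide):
    """List the steps for a solid-phase synthesis of `peptide`.

    `peptide` is written N terminus first (for example "Ala-Val-Phe").
    Side-chain protection is deliberately ignored in this sketch.
    """
    residues = peptide.split("-")
    c_to_n = residues[::-1]  # the synthesis runs from the C terminus to the N terminus

    steps = [f"Attach Boc-{c_to_n[0]} to the chloromethylated polymer"]
    for amino_acid in c_to_n[1:]:
        steps.append("Deprotect the N terminus with CF3COOH (TFA)")
        steps.append(f"Couple Boc-{amino_acid} using DCC")
    steps.append("Cleave with anhydrous HF (also removes the final Boc group)")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(merrifield_plan("Ala-Val-Phe"), start=1):
        print(f"{number}. {step}")
```

For the tripeptide Ala-Val-Phe this yields six steps, beginning with Boc-Phe on the polymer and ending with the HF cleavage, in the same order as the worked example.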


PROBLEM 24-30 Show how you would synthesize Leu-Gly-Ala-Val-Phe starting with Boc-Ala-Val-Phe-Ⓟ.

PROBLEM 24-31 Show how solid-phase peptide synthesis would be used to make Ile-Gly-Asn.

24-12 Classification of Proteins

Proteins may be classified according to their chemical composition, their shape, or their function. Protein composition and function are treated in detail in a biochemistry course. For now, we briefly survey the types of proteins and their general classifications.

Proteins are grouped into simple and conjugated proteins according to their chemical composition. Simple proteins are those that hydrolyze to give only amino acids. All the protein structures we have considered so far are simple proteins. Examples are insulin, ribonuclease, oxytocin, and bradykinin. Conjugated proteins are bonded to a nonprotein prosthetic group such as a sugar, a nucleic acid, a lipid, or some other group. Table 24-3 lists some examples of conjugated proteins.

TABLE 24-3 Classes of Conjugated Proteins

Class             Prosthetic Group     Examples
glycoproteins     carbohydrates        γ-globulin, interferon
nucleoproteins    nucleic acids        ribosomes, viruses
lipoproteins      fats, cholesterol    high-density lipoprotein
metalloproteins   a complexed metal    hemoglobin, cytochromes

Proteins are classified as fibrous or globular depending on whether they form long filaments or coil up on themselves. Fibrous proteins are stringy, tough, and usually insoluble in water. They function primarily as structural parts of the organism. Examples of fibrous proteins are α-keratin in hooves and fingernails, and collagen in tendons.

Globular proteins are folded into roughly spherical shapes. They usually function as enzymes, hormones, or transport proteins. Enzymes are protein-containing biological catalysts; an example is ribonuclease, which cleaves RNA. Hormones help to regulate processes in the body. An example is insulin, which regulates glucose levels in the blood and its uptake by cells. Transport proteins bind to specific molecules and transport them in the blood or through the cell membrane. An example is hemoglobin, which transports oxygen in the blood from the lungs to the tissues.

24-13 Levels of Protein Structure

24-13A Primary Structure

Up to now, we have discussed the primary structure of proteins. The primary structure is the covalently bonded structure of the molecule. This definition includes the sequence of amino acids, together with any disulfide bridges. All the properties of the protein are determined, directly or indirectly, by the primary structure. Any folding, hydrogen bonding, or catalytic activity depends on the proper primary structure.

24-13B Secondary Structure

Although we often think of peptide chains as linear structures, they tend to form orderly hydrogen-bonded arrangements. In particular, the carbonyl oxygen atoms form hydrogen bonds with the amide (N-H) hydrogens. This tendency leads to orderly patterns of hydrogen bonding: the α helix and the pleated sheet. These hydrogen-bonded arrangements, if present, are called the secondary structure of the protein.

When a peptide chain winds into a helical coil, each carbonyl oxygen can hydrogen-bond with an N-H hydrogen on the next turn of the coil. Many proteins wind into an α helix (a helix that looks like the thread on a right-handed screw) with the side chains positioned on the outside of the helix. For example, the fibrous protein α-keratin is arranged in the α-helical structure, and most globular proteins contain segments of α helix. Figure 24-15 shows the α-helical arrangement.

FIGURE 24-15 The α-helical arrangement. The peptide chain curls into a helix so that each peptide carbonyl group is hydrogen-bonded to an N-H hydrogen on the next turn of the helix. Side chains are symbolized by green atoms in the space-filling structure. (Color key: C = gray, N = blue, O = red, R = green.)

Segments of peptides can also form orderly arrangements of hydrogen bonds by lining up side-by-side. In this arrangement, each carbonyl group on one chain forms a hydrogen bond with an N-H hydrogen on an adjacent chain. This arrangement may involve many peptide molecules lined up side-by-side, resulting in a two-dimensional sheet. The bond angles between amino acid units are such that the sheet is pleated (creased), with the amino acid side chains arranged on alternating sides of the sheet. Silk fibroin, the principal fibrous protein in the silks of insects and arachnids, has a pleated sheet secondary structure. Figure 24-16 shows the pleated sheet structure.
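The helical hydrogen-bonding pattern can be made concrete with a short enumeration. The sketch below rests on an assumption that this section does not state: in the α helix, the carbonyl of residue i is usually paired with the N-H of residue i + 4, a standard textbook value that corresponds to about 3.6 residues per turn. The function name `alpha_helix_hbonds` and the `offset` parameter are hypothetical.

```python
def alpha_helix_hbonds(n_residues, offset=4):
    """Return (i, j) pairs: the C=O of residue i accepts a hydrogen bond
    from the N-H of residue j = i + offset (residues numbered from 1).

    offset=4 is an assumed textbook spacing for the alpha helix,
    not a value taken from this section.
    """
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]


if __name__ == "__main__":
    # A ten-residue helical segment: residues near the C-terminal end
    # have no partner one turn further along.
    for carbonyl, amide in alpha_helix_hbonds(10):
        print(f"C=O of residue {carbonyl} ... H-N of residue {amide}")
```

Because every pair reaches one turn ahead, the last few carbonyl groups of a helical segment are left without an N-H partner within the helix, and a segment shorter than five residues forms no such bonds at all.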

Spider web is composed mostly of fibroin, a protein with pleated-sheet secondary structure. The pleated-sheet arrangement allows for multiple hydrogen bonds between molecules, conferring great strength.

FIGURE 24-16 The pleated sheet arrangement. Each peptide carbonyl group is hydrogen-bonded to an N-H hydrogen on an adjacent peptide chain.


A protein may or may not have the same secondary structure throughout its length. Some parts may be curled into an α helix, while other parts are lined up in a pleated sheet. Parts of the chain may have no orderly secondary structure at all. Such a structureless region is called a random coil. Most globular proteins, for example, contain segments of α helix or pleated sheet separated by kinks of random coil, allowing the molecule to fold into its globular shape.

24-13C Tertiary Structure

Tertiary structures of proteins are determined by X-ray crystallography. A single crystal of the protein is bombarded with X rays, whose wavelengths are appropriate to be diffracted by the regular atomic spacings in the crystal. A computer then determines the locations of the atoms in the crystal.

The tertiary structure of a protein is its complete three-dimensional conformation. Think of the secondary structure as a spatial pattern in a local region of the molecule. Parts of the protein may have the α-helical structure, while other parts may have the pleated-sheet structure, and still other parts may be random coils. The tertiary structure includes all the secondary structure and all the kinks and folds in between. The tertiary structure of a typical globular protein is represented in Figure 24-17.

Coiling of an enzyme can give three-dimensional shapes that produce important catalytic effects. Polar, hydrophilic (water-loving) side chains are oriented toward the outside of the globule. Nonpolar, hydrophobic (water-hating) groups are arranged toward the interior. Coiling in the proper conformation creates an enzyme’s active site, the region that binds the substrate and catalyzes the reaction. A reaction taking place at the active site in the interior of an enzyme may occur under essentially anhydrous, nonpolar conditions, while the whole system is dissolved in water!

24-13D Quaternary Structure

Quaternary structure refers to the association of two or more peptide chains in the complete protein. Not all proteins have quaternary structure. The ones that do are those that associate together in their active form. For example, hemoglobin, the oxygen carrier in mammalian blood, consists of four peptide chains fitted together to form a globular protein. Figure 24-18 summarizes the four levels of protein structure.

FIGURE 24-17 The tertiary structure of a typical globular protein includes segments of α helix with segments of random coil at the points where the helix is folded. (Labels in the figure: α helix, random coil, N terminus with ⁺NH3, C terminus with COO⁻.)


FIGURE 24-18 A schematic comparison of the levels of protein structure. Primary structure is the covalently bonded structure, including the amino acid sequence and any disulfide bridges. Secondary structure refers to the areas of α helix, pleated sheet, or random coil. Tertiary structure refers to the overall conformation of the molecule. Quaternary structure refers to the association of two or more peptide chains in the active protein.

24-14 Protein Denaturation

For a protein to be biologically active, it must have the correct structure at all levels. The sequence of amino acids must be right, with the correct disulfide bridges linking the cysteines on the chains. The secondary and tertiary structures are important, as well. The protein must be folded into its natural conformation, with the appropriate areas of α helix and pleated sheet. For an enzyme, the active site must have the right conformation, with the necessary side-chain functional groups in the correct positions. Conjugated proteins must have the right prosthetic groups, and multichain proteins must have the right combination of individual peptides.

With the exception of the covalent primary structure, all these levels of structure are maintained by weak solvation and hydrogen-bonding forces. Small changes in the environment can cause a chemical or conformational change resulting in denaturation: disruption of the normal structure and loss of biological activity. Many factors can cause denaturation, but the most common ones are heat and pH.

24-14A Reversible and Irreversible Denaturation The cooking of egg white is an example of protein denaturation by high temperature. Egg white contains soluble globular proteins called albumins. When egg white is heated, the albumins unfold and coagulate to produce a solid rubbery mass. Different proteins have different abilities to resist the denaturing effect of heat. Egg albumin is quite sensitive to heat, but bacteria that live in geothermal hot springs have developed proteins that retain their activity in boiling water. When a protein is subjected to an acidic pH, some of the side-chain carboxyl groups become protonated and lose their ionic charge. Conformational changes result, leading to denaturation. In a basic solution, amino groups become deprotonated, similarly losing their ionic charge, causing conformational changes and denaturation.
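The loss of ionic charge at extreme pH can be estimated with the Henderson-Hasselbalch relationship. The following Python sketch is only an illustration: the two pKa values are assumed round numbers for a side-chain carboxyl group and a side-chain ammonium group, not data from this chapter, and the function names are hypothetical.

```python
# Assumed, rounded pKa values for illustration only.
CARBOXYL_PKA = 4.0    # side-chain -COOH (aspartic or glutamic acid type)
AMMONIUM_PKA = 10.5   # side-chain -NH3+ (lysine type)


def fraction_deprotonated(pH, pKa):
    """Henderson-Hasselbalch: fraction of an acid present as its conjugate base."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


def charged_fractions(pH):
    """Return (fraction of -COO-, fraction of -NH3+) at the given pH."""
    carboxylate = fraction_deprotonated(pH, CARBOXYL_PKA)
    ammonium = 1.0 - fraction_deprotonated(pH, AMMONIUM_PKA)
    return carboxylate, ammonium


if __name__ == "__main__":
    for pH in (1.0, 7.0, 13.0):
        carboxylate, ammonium = charged_fractions(pH)
        print(f"pH {pH:4.1f}: -COO- {carboxylate:.3f}, -NH3+ {ammonium:.3f}")
```

Under these assumptions both groups are almost fully charged near pH 7, the carboxylate charge is nearly gone at pH 1, and the ammonium charge is nearly gone at pH 13, which is the pattern the paragraph above describes.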

Irreversible denaturation of egg albumin. The egg white does not become clear and runny again when it cools.


Milk turns sour because of the bacterial conversion of carbohydrates to lactic acid. When the pH becomes strongly acidic, soluble proteins in milk are denatured and precipitate. This process is called curdling. Some proteins are more resistant to acidic and basic conditions than others. For example, the digestive enzyme pepsin remains active under acidic conditions in the stomach, even at a pH of about 1.

In many cases, denaturation is irreversible. When cooked egg white is cooled, it does not become uncooked. Curdled milk does not uncurdle when it is neutralized. Denaturation may be reversible, however, if the protein has undergone only mild denaturing conditions. For example, a protein can be salted out of solution by a high salt concentration, which denatures and precipitates the protein. When the precipitated protein is redissolved in a solution with a lower salt concentration, it usually regains its activity together with its natural conformation.

24-14B Prion Diseases

Micrograph of normal human brain tissue. The nuclei of neurons appear as dark spots.

Brain tissue of a patient infected with vCJD. Note the formation of (white) vacuole spaces and (dark, irregular) plaques of prion protein. (Magnification 200X)

Up through 1980, people thought that all infectious diseases were caused by microbes of some sort. They knew about diseases caused by viruses, bacteria, protozoa, and fungi. There were some strange diseases, however, for which no one had isolated and cultured the pathogen. Creutzfeldt–Jakob Disease (CJD) in humans, scrapie in sheep, and transmissible encephalopathy in mink (TME) all involved a slow, gradual loss of mental function and eventual death. The brains of the victims all showed unusual plaques of amyloid protein surrounded by spongelike tissue.

Workers studying these diseases thought there was an infectious agent involved (as opposed to genetic or environmental causes) because they knew that scrapie and TME could be spread by feeding healthy animals the ground-up remains of sick animals. They had also studied kuru, a disease much like CJD among tribes where family members showed their respect for the dead by eating their brains. These diseases were generally attributed to “slow viruses” that were yet to be isolated.

In the 1980s, neurologist Stanley B. Prusiner (of the University of California at San Francisco) made a homogenate of scrapie-infected sheep brains and systematically separated out all the cell fragments, bacteria, and viruses, and found that the remaining material was still infectious. He separated out the proteins and found a protein fraction that was still infectious. He suggested that scrapie (and presumably similar diseases) is caused by a protein infectious agent that he called prion protein. This conclusion contradicted the established principle that contagious diseases require a living pathogen. Many skeptical workers repeated Prusiner’s work in hopes of finding viral contaminants in the infectious fractions, and most of them finally came to the same conclusion. Prusiner received the 1997 Nobel Prize in Physiology or Medicine for this work.
Since Prusiner’s work, prion diseases have become more important because of their threat to humans. Beginning in 1996, some cows in the United Kingdom developed “mad cow disease” and would threaten other animals, wave their heads, fall down, and eventually die. The disease, called bovine spongiform encephalopathy, or BSE, was probably transmitted to cattle by feeding them the remains of scrapie-infected sheep. The most frightening aspect of the BSE outbreak was that people could contract a fatal disease, called new-variant Creutzfeldt–Jakob Disease (vCJD) from eating the infected meat. Since that time, a similar disease, called chronic wasting disease, or CWD, has been found in wild deer and elk in the Rocky Mountains. All of these (presumed) prion diseases are now classified as transmissible spongiform encephalopathies, or TSEs. The most widely accepted theory of prion diseases suggests that the infectious prion protein has the same primary structure as a normal protein found in nerve


cells, but it differs in its tertiary structure. In effect, it is a misfolded, denatured version of a normal protein that polymerizes to form the amyloid protein plaques seen in the brains of infected animals. When an animal ingests infected food, the polymerized protein resists digestion. Because it is simply a misfolded version of a normal protein, the infectious prion does not provoke the host’s immune system to attack the pathogen.

When the abnormal prion interacts with the normal version of the protein on the membranes of nerve cells, the abnormal protein somehow induces the normal molecules to change their shape. This is the part of the process we know the least about. (We might think of it like crystallization, in which a seed crystal induces other molecules to crystallize in the same conformation and crystal form.) These newly misfolded protein molecules then induce more molecules to change shape. The polymerized abnormal protein cannot be broken down by the usual protease enzymes, so it builds up in the brain and causes the plaques and spongy tissue associated with TSEs.

We once thought that a protein with the correct primary structure, placed in the right physiological solution, would naturally fold into the correct tertiary structure and stay that way. We were wrong. We now know that protein folding is a carefully controlled process in which enzymes and chaperone proteins promote correct folding as the protein is synthesized. Prion diseases have shown that there are many factors that cause proteins to fold into natural or unnatural conformations, and that the folding of the protein can have major effects on its biological properties within an organism.

Glossary

active site The region of an enzyme that binds the substrate and catalyzes the reaction. (p. 1190)
amino acid Literally, any molecule containing both an amino group (-NH2) and a carboxyl group (-COOH). The term usually means an α-amino acid, with the amino group on the carbon atom next to the carboxyl group. (p. 1154)
biomimetic synthesis A laboratory synthesis that is patterned after a biological synthesis. For example, the synthesis of amino acids by reductive amination resembles the biosynthesis of glutamic acid. (p. 1162)
complete proteins Proteins that provide all the essential amino acids in about the right proportions for human nutrition. Examples include those in meat, fish, milk, and eggs. Incomplete proteins are severely deficient in one or more of the essential amino acids. Most plant proteins are incomplete. (p. 1157)
conjugated protein A protein that contains a nonprotein prosthetic group such as a sugar, nucleic acid, lipid, or metal ion. (p. 1188)
C terminus (C-terminal end) The end of the peptide chain with a free or derivatized carboxyl group. As the peptide is written, the C terminus is usually on the right. The amino group of the C-terminal amino acid links it to the rest of the peptide. (p. 1172)
denaturation An unnatural alteration of the conformation or the ionic state of a protein. Denaturation generally results in precipitation of the protein and loss of its biological activity. Denaturation may be reversible, as in salting out a protein, or irreversible, as in cooking an egg. (p. 1191)
disulfide linkage (disulfide bridge) A bond between two cysteine residues formed by mild oxidation of their thiol groups to a disulfide. (p. 1172)
Edman degradation A method for removing and identifying the N-terminal amino acid from a peptide without destroying the rest of the peptide chain. The peptide is treated with phenylisothiocyanate, followed by a mild acid hydrolysis to convert the N-terminal amino acid to its phenylthiohydantoin derivative. The Edman degradation can be used repeatedly to determine the sequence of many residues beginning at the N terminus. (p. 1176)
electrophoresis A procedure for separating charged molecules by their migration in a strong electric field. The direction and rate of migration are governed largely by the average charge on the molecules. (p. 1160)


enzymatic resolution The use of enzymes to separate enantiomers. For example, the enantiomers of an amino acid can be acylated and then treated with hog kidney acylase. The enzyme hydrolyzes the acyl group from the natural L-amino acid, but it does not react with the D-amino acid. The resulting mixture of the free L-amino acid and the acylated D-amino acid is easily separated. (p. 1167)
enzyme A protein-containing biological catalyst. Many enzymes also include prosthetic groups, nonprotein constituents that are essential to the enzyme’s catalytic activity. (p. 1188)
essential amino acids Ten standard amino acids that are not biosynthesized by humans and must be provided in the diet. (p. 1157)
fibrous proteins A class of proteins that are stringy, tough, threadlike, and usually insoluble in water. (p. 1188)
globular proteins A class of proteins that are relatively spherical in shape. Globular proteins generally have lower molecular weights and are more soluble in water than fibrous proteins. (p. 1188)
α helix A helical peptide conformation in which the carbonyl groups on one turn of the helix are hydrogen-bonded to N-H hydrogens on the next turn. Extensive hydrogen bonding stabilizes this helical arrangement. (p. 1189)
hydrogenolysis Cleavage of a bond by the addition of hydrogen. For example, catalytic hydrogenolysis cleaves benzyl esters. (p. 1168)

R-CO-O-CH2-Ph (benzyl ester) + H2, Pd → R-CO-OH (acid) + CH3-Ph (toluene)

isoelectric point (isoelectric pH) The pH at which an amino acid (or protein) does not move under electrophoresis. This is the pH where the average charge on its molecules is zero, with most of the molecules in their zwitterionic form. (p. 1160)
L-amino acid An amino acid having a stereochemical configuration similar to that of L-(–)-glyceraldehyde. Most naturally occurring amino acids have the L configuration. (p. 1155)

Fischer projections (top, left, right, bottom):
L-alanine, (S)-alanine: COOH at top, H2N on the left, H on the right, CH3 at the bottom
L-(–)-glyceraldehyde, (S)-glyceraldehyde: CHO at top, HO on the left, H on the right, CH2OH at the bottom
an L-amino acid, (S) configuration: COOH at top, H2N on the left, H on the right, R at the bottom

N terminus (N-terminal end) The end of the peptide chain with a free or derivatized amino group. As the peptide is written, the N terminus is usually on the left. The carboxyl group of the N-terminal amino acid links it to the rest of the peptide. (p. 1172)
oligopeptide A small polypeptide, containing about four to ten amino acid residues. (p. 1171)
peptide Any polymer of amino acids linked by amide bonds between the amino group of each amino acid and the carboxyl group of the neighboring amino acid. The terms dipeptide, tripeptide, etc. may specify the number of amino acids in the peptide. (p. 1171)
peptide bonds Amide linkages between amino acids. (pp. 1153, 1171)
pleated sheet A two-dimensional peptide conformation with the peptide chains lined up side by side. The carbonyl groups on each peptide chain are hydrogen-bonded to N-H hydrogens on the adjacent chain, and the side chains are arranged on alternating sides of the sheet. (p. 1189)
polypeptide A peptide containing many amino acid residues. Although proteins are polypeptides, the term polypeptide is commonly used for molecules with lower molecular weights than proteins. (p. 1171)


primary structure The covalently bonded structure of a protein; the sequence of amino acids, together with any disulfide bridges. (p. 1188)
prion protein A protein infectious agent that is thought to promote misfolding and polymerization of normal protein molecules, leading to amyloid plaques and destruction of nerve tissue. (p. 1192)
prosthetic group The nonprotein part of a conjugated protein. Examples of prosthetic groups are sugars, lipids, nucleic acids, and metal complexes. (p. 1188)
protein A biopolymer of amino acids. Proteins are polypeptides with molecular weights higher than about 6000 amu. (p. 1171)
quaternary structure The association of two or more peptide chains into a composite protein. (p. 1190)
random coil A type of protein secondary structure where the chain is neither curled into an α helix nor lined up in a pleated sheet. In a globular protein, the kinks that fold the molecule into its globular shape are usually segments of random coil. (p. 1190)
residue An amino acid unit of a peptide. (p. 1171)
Sanger method A method for determining the N-terminal amino acid of a peptide. The peptide is treated with 2,4-dinitrofluorobenzene (Sanger’s reagent), then completely hydrolyzed. The derivatized amino acid is easily identified, but the rest of the peptide is destroyed in the hydrolysis. (p. 1178)
secondary structure The local hydrogen-bonded arrangement of a protein. The secondary structure is generally the α helix, pleated sheet, or random coil. (p. 1189)
sequence As a noun, the order in which amino acids are linked together in a peptide. As a verb, to determine the sequence of a peptide. (p. 1176)
simple proteins Proteins composed of only amino acids (having no prosthetic groups). (p. 1188)
solid-phase peptide synthesis A method in which the C-terminal amino acid is attached to a solid support (polystyrene beads) and the peptide is synthesized in the C → N direction by successive coupling of protected amino acids. When the peptide is complete, it is cleaved from the solid support. (p. 1183)
solution-phase peptide synthesis (classical peptide synthesis) Any of several methods in which protected amino acids are coupled in solution in the correct sequence to give a desired peptide. Most of these methods proceed in the N → C direction. (p. 1181)
standard amino acids The 20 α-amino acids found in nearly all naturally occurring proteins. (p. 1155)
Strecker synthesis Synthesis of α-amino acids by reaction of an aldehyde with ammonia and cyanide ion, followed by hydrolysis of the intermediate α-amino nitrile. (p. 1165)

R-CHO (aldehyde) + NH3 + HCN → (H2O) R-CH(NH2)-C≡N (α-amino nitrile) → (H3O⁺) R-CH(NH3⁺)-COOH (α-amino acid)

terminal residue analysis Sequencing a peptide by removing and identifying the residue at the N terminus or at the C terminus. (p. 1176)
tertiary structure The complete three-dimensional conformation of a protein. (p. 1190)
transamination Transfer of an amino group from one molecule to another. Transamination is a common method for the biosynthesis of amino acids, often involving glutamic acid as the source of the amino group. (p. 1162)
zwitterion (dipolar ion) A structure with an overall charge of zero but having a positively charged substituent and a negatively charged substituent. Most amino acids exist in zwitterionic forms. (p. 1158)

H2N-CHR-CO-OH (uncharged structure, minor component) ⇌ H3N⁺-CHR-CO-O⁻ (dipolar ion, or zwitterion, major component)


Essential Problem-Solving Skills in Chapter 24

1. Correctly name amino acids and peptides, and draw the structures from their names.
2. Use perspective drawings and Fischer projections to show the stereochemistry of D- and L-amino acids.
3. Explain which amino acids are acidic, which are basic, and which are neutral. Use the isoelectric point to predict whether a given amino acid will be positively charged, negatively charged, or neutral at a given pH.
4. Show how one of the following syntheses might be used to make a given amino acid:
   reductive amination
   HVZ followed by ammonia
   Gabriel–malonic ester synthesis
   Strecker synthesis
5. Predict products of the following reactions of amino acids: esterification, acylation, reaction with ninhydrin.
6. Use information from terminal residue analysis and partial hydrolysis to determine the structure of an unknown peptide.
7. Show how solution-phase peptide synthesis or solid-phase peptide synthesis would be used to make a given peptide. Use appropriate protecting groups to prevent unwanted couplings.
8. Discuss and identify the four levels of protein structure (primary, secondary, tertiary, quaternary). Explain how the structure of a protein affects its properties and how denaturation changes the structure.
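Skill 3 reduces to a comparison between the solution pH and the isoelectric point: below its isoelectric point an amino acid carries a net positive charge, above it a net negative charge. The short Python sketch below illustrates that rule. The function name, the `tolerance` window, and the approximate isoelectric points (alanine 6.0, aspartic acid 2.8, lysine 9.7) are assumptions introduced here for illustration, not values quoted from this section.

```python
# Approximate isoelectric points, assumed here for illustration.
ISOELECTRIC_POINT = {"alanine": 6.0, "aspartic acid": 2.8, "lysine": 9.7}


def net_charge_sign(pI, pH, tolerance=0.1):
    """Predict the sign of the average charge on an amino acid at a given pH.

    Within `tolerance` pH units of the isoelectric point the molecule is
    treated as neutral (mostly zwitterion).
    """
    if abs(pH - pI) <= tolerance:
        return "neutral"
    return "positive" if pH < pI else "negative"


if __name__ == "__main__":
    for name, pI in ISOELECTRIC_POINT.items():
        print(f"{name} at pH 6.0: {net_charge_sign(pI, 6.0)}")
```

At pH 6.0 this predicts that alanine is essentially neutral, aspartic acid is negative, and lysine is positive, which is also the basis for separating such a mixture by electrophoresis.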

Study Problems

24-32 Define each term and give an example.
(a) α-amino acid (b) L-amino acid (c) essential amino acid (d) dipolar ion
(e) isoelectric point (f) Strecker synthesis (g) electrophoresis (h) transamination
(i) peptide bond (j) hydrogenolysis (k) enzymatic resolution (l) zwitterion
(m) peptide (n) protein (o) primary structure (p) secondary structure
(q) tertiary structure (r) quaternary structure (s) pleated sheet (t) α helix
(u) conjugated protein (v) protein denaturation (w) disulfide bridge (x) Edman degradation
(y) prosthetic group (z) solid-phase peptide synthesis (aa) oligopeptide (bb) prion protein

24-33 Draw the complete structure of the following peptide. Ser-Gln-Met·NH2

24-34 Predict the products of the following reactions.
(a) Ile + [reagent drawn as a structure with two C=O groups and two OH groups, consistent with ninhydrin], pyridine, heat
(b) Ph–CH2–O–C(=O)–NH–CH(CH3)–COOH + H2, Pd
(d) (D,L)-proline; (1) excess Ac2O; (2) hog kidney acylase, H2O
(e) CH3CH2–CH(CH3)–CHO + NH3, HCN, H2O
(f) product from part (e) + H3O+
(g) 4-methylpentanoic acid + Br2/PBr3
(h) product from part (g) + excess NH3

24-35 Show how you would synthesize any of the standard amino acids from each starting material. You may use any necessary reagents.
(a) (CH3)2CH–C(=O)–COOH
(b) CH3–CH(CH2CH3)–CH2–COOH
(d) [structure containing a CH2Br group]

24-36 Show how you would convert alanine to the following derivatives. Show the structure of the product in each case.
(a) alanine isopropyl ester (b) N-benzoylalanine (c) N-benzyloxycarbonyl alanine (d) tert-butyloxycarbonyl alanine

24-37 Suggest a method for the synthesis of the unnatural D enantiomer of alanine from the readily available L enantiomer of lactic acid. CH3–CHOH–COOH (lactic acid)

24-38 Show how you would use the Gabriel–malonic ester synthesis to make histidine. What stereochemistry would you expect in your synthetic product?

24-39 Show how you would use the Strecker synthesis to make tryptophan. What stereochemistry would you expect in your synthetic product?

24-40 Write the complete structures for the following peptides. Tell whether each peptide is acidic, basic, or neutral.
(a) methionylthreonine (b) threonylmethionine (c) arginylaspartyllysine (d) Glu-Cys-Gln

24-41 The following structure is drawn in an unconventional manner.

CH3CH2–CH(CH3)–CH(CONH2)–NH–C(=O)–CH(NH–CO–CH2NH2)–CH2CH2–C(=O)–NH2

(a) Label the N terminus and the C terminus. (b) Label the peptide bonds. (c) Identify and label each amino acid present. (d) Give the full name and the abbreviated name.

24-42 Aspartame (Nutrasweet®) is a remarkably sweet-tasting dipeptide ester. Complete hydrolysis of aspartame gives phenylalanine, aspartic acid, and methanol. Mild incubation with carboxypeptidase has no effect on aspartame. Treatment of aspartame with phenyl isothiocyanate, followed by mild hydrolysis, gives the phenylthiohydantoin of aspartic acid. Propose a structure for aspartame.

24-43 A molecular weight determination has shown that an unknown peptide is a pentapeptide, and an amino acid analysis shows that it contains the following residues: one Gly, two Ala, one Met, one Phe. Treatment of the original pentapeptide with carboxypeptidase gives alanine as the first free amino acid released. Sequential treatment of the pentapeptide with phenyl isothiocyanate followed by mild hydrolysis gives the following derivatives:

[Structures: the phenylthiohydantoin derivatives isolated the first, second, and third time; the side chains legible in the extraction are CH2Ph and CH3]

Propose a structure for the unknown pentapeptide.

24-44 Show the steps and intermediates in the synthesis of Leu-Ala-Phe (a) by the solution-phase process. (b) by the solid-phase process.

24-45 Using classical solution-phase techniques, show how you would synthesize Ala-Val and then combine it with Ile-Leu-Phe to give Ile-Leu-Phe-Ala-Val.

24-46 Peptides often have functional groups other than free amino groups at the N terminus and other than carboxyl groups at the C terminus.
(a) A tetrapeptide is hydrolyzed by heating with 6 M HCl, and the hydrolysate is found to contain Ala, Phe, Val, and Glu. When the hydrolysate is neutralized, the odor of ammonia is detected. Explain where this ammonia might have been incorporated in the original peptide.
(b) The tripeptide thyrotropic hormone releasing factor (TRF) has the full name pyroglutamylhistidylprolinamide. The structure appears here. Explain the functional groups at the N terminus and at the C terminus.

[Structure: TRF, pyroglutamylhistidylprolinamide]

(c) On acidic hydrolysis, an unknown pentapeptide gives glycine, alanine, valine, leucine, and isoleucine. No odor of ammonia is detected when the hydrolysate is neutralized. Reaction with phenyl isothiocyanate followed by mild hydrolysis gives no phenylthiohydantoin derivative. Incubation with carboxypeptidase has no effect. Explain these findings.

24-47 Lipoic acid is often found near the active sites of enzymes, usually bound to the peptide by a long, flexible amide linkage with a lysine residue.

[Structures: lipoic acid (sulfur-containing ring with a COOH-terminated chain), free and bound to a lysine residue through an amide N–H]

(a) Is lipoic acid a mild oxidizing agent or a mild reducing agent? Draw it in both its oxidized and reduced forms.
(b) Show how lipoic acid might react with two Cys residues to form a disulfide bridge.
(c) Give a balanced equation for the hypothetical oxidation or reduction, as you predicted in part (a), of an aldehyde by lipoic acid.

[Equation to complete: R–CHO + lipoic acid + H2O]

24-48 Histidine is an important catalytic residue found at the active sites of many enzymes. In many cases, histidine appears to remove protons or to transfer protons from one location to another.
(a) Show which nitrogen atom of the histidine heterocycle is basic and which is not.
(b) Use resonance forms to show why the protonated form of histidine is a particularly stable cation.
(c) Show the structure that results when histidine accepts a proton on the basic nitrogen of the heterocycle and then is deprotonated on the other heterocyclic nitrogen. Explain how histidine might function as a pipeline to transfer protons between sites within an enzyme and its substrate.

24-49 Metabolism of arginine produces urea and the rare amino acid ornithine. Ornithine has an isoelectric point close to 10. Propose a structure for ornithine.

*24-50 Glutathione (GSH) is a tripeptide that serves as a mild reducing agent to detoxify peroxides and maintain the cysteine residues of hemoglobin and other red blood cell proteins in the reduced state. Complete hydrolysis of glutathione gives Gly, Glu, and Cys. Treatment of glutathione with carboxypeptidase gives glycine as the first free amino acid released. Treatment of glutathione with 2,4-dinitrofluorobenzene (Sanger reagent, page 1178), followed by complete hydrolysis, gives the 2,4-dinitrophenyl derivative of glutamic acid. Treatment of glutathione with phenyl isothiocyanate does not give a recognizable phenylthiohydantoin, however.
(a) Propose a structure for glutathione consistent with this information. Why would glutathione fail to give a normal product from Edman degradation, even though it gives a normal product from the Sanger reagent followed by hydrolysis?
(b) Oxidation of glutathione forms glutathione disulfide (GSSG). Propose a structure for glutathione disulfide, and write a balanced equation for the reaction of glutathione with hydrogen peroxide.

24-51 Complete hydrolysis of an unknown basic decapeptide gives Gly, Ala, Leu, Ile, Phe, Tyr, Glu, Arg, Lys, and Ser. Terminal residue analysis shows that the N terminus is Ala and the C terminus is Ile. Incubation of the decapeptide with chymotrypsin gives two tripeptides, A and B, and a tetrapeptide, C. Amino acid analysis shows that peptide A contains Gly, Glu, Tyr, and NH3; peptide B contains Ala, Phe, and Lys; and peptide C contains Leu, Ile, Ser, and Arg. Terminal residue analysis gives the following results.

     N terminus   C terminus
A    Gln          Tyr
B    Ala          Phe
C    Arg          Ile

Incubation of the decapeptide with trypsin gives a dipeptide D, a pentapeptide E, and a tripeptide F. Terminal residue analysis of F shows that the N terminus is Ser, and the C terminus is Ile. Propose a structure for the decapeptide and for fragments A through F.

24-52 There are many methods for activating a carboxylic acid in preparation for coupling with an amine. The following method converts the acid to an N-hydroxysuccinimide (NHS) ester.

[Scheme: R–COOH + an F3C-containing NHS reagent, Et3N, gives the NHS ester of the acid + an F3C-containing by-product]

(a) Explain why an NHS ester is much more reactive than a simple alkyl ester.
(b) Propose a mechanism for the reaction shown.
(c) Propose a mechanism for the reaction of the NHS ester with an amine, R–NH2.

24-53 Sometimes chemists need the unnatural D enantiomer of an amino acid, often as part of a drug or an insecticide. Most L-amino acids are isolated from proteins, but the D-amino acids are rarely found in natural proteins. D-amino acids can be synthesized from the corresponding L-amino acids. The following synthetic scheme is one of the possible methods.

[Scheme: L-amino acid (L configuration); NaNO2, HCl gives intermediate 1; NaN3 gives intermediate 2; H2, Pd gives the D-amino acid (D configuration)]

(a) Draw the structures of intermediates 1 and 2 in this scheme.
(b) How do we know that the product is entirely the unnatural D configuration?

Chapter 27: Amino Acids, Peptides, and Proteins.

Monomer unit: α-amino acids, H2N–CH(R)–CO2H (R = sidechain)

Biopolymer: the monomeric amino acids are linked through an amide bond (the carboxylic acid of one AA with the α-amino group of a second), with loss of H2O.

[Figure: two amino acids (R1, R2) condense, - H2O, and the chain extends to a polypeptide with sidechains R1, R2, ... drawn from the N-terminus (left) to the C-terminus (right)]

Peptide or protein (polypeptide): peptide (< 50 amino acids), protein (> 50 amino acids)

27.1: Classification of Amino Acids. AA’s are classified according to the location of the amino group.

α-amino acid (2-amino carboxylic acid): H2N–CH2–CO2H
β-amino acid (3-amino carboxylic acid): H2N–CH2–CH2–CO2H
γ-amino acid (4-amino carboxylic acid): H2N–CH2–CH2–CH2–CO2H

There are 20 genetically encoded α-amino acids found in peptides and proteins.
19 are primary amines, 1 (proline) is a secondary amine.
19 are “chiral”, 1 (glycine) is achiral; the natural configuration of the α-carbon is L.

[Figure: Fischer projections of D- and L-glyceraldehyde, D- and L-erythrose, a generic L-amino acid (sidechain R), L-alanine, L-threonine (2S,3R), and L-isoleucine (2S,3S)]

α-Amino acids are classified by the properties of their sidechains. (pKa values refer to the sidechain.)

Nonpolar: Glycine (Gly, G); (S)-(+)-Alanine (Ala, A); (S)-(+)-Valine (Val, V); (S)-(–)-Leucine (Leu, L); (2S,3S)-(+)-Isoleucine (Ile, I); (S)-(–)-Methionine (Met, M); (S)-(–)-Proline (Pro, P); (S)-(–)-Phenylalanine (Phe, F); (S)-(–)-Tryptophan (Trp, W)

Polar but non-ionizable: (S)-(–)-Serine (Ser, S); (2S,3R)-(–)-Threonine (Thr, T), pKa ~ 13; (S)-(–)-Tyrosine (Tyr, Y), pKa ~ 10.1; (R)-(–)-Cysteine (Cys, C), pKa ~ 8.2; (S)-(–)-Asparagine (Asn, N); (S)-(+)-Glutamine (Gln, Q)

Acidic: (S)-(+)-Aspartic Acid (Asp, D), pKa ~ 3.6; (S)-(+)-Glutamic Acid (Glu, E), pKa ~ 4.2

Basic: (S)-(+)-Lysine (Lys, K), pKa ~ 10.5; (S)-(–)-Histidine (His, H), pKa ~ 6.0; (S)-(+)-Arginine (Arg, R), pKa ~ 12.5

27.2: Stereochemistry of Amino Acids: The natural configuration of the α-carbon is L. D-Amino acids are found in the cell walls of bacteria. The D-amino acids are not genetically encoded, but derived from the epimerization of L-isomers


27.3: Acid-Base Behavior of Amino Acids. Amino acids exist as a zwitterion: a dipolar ion having both a formal positive and formal negative charge (overall charge neutral).

[Figure: zwitterion H3N⁺–CH(R)–CO2⁻ and uncharged form H2N–CH(R)–CO2H, annotated pKa ~ 5 and pKa ~ 9]

Amino acids are amphoteric: they can react as either an acid or a base. The ammonium ion acts as an acid, the carboxylate as a base.

Isoelectric point (pI): the pH at which the amino acid exists largely in a neutral, zwitterionic form (influenced by the nature of the sidechain).

low pH: H3N⁺–CH(R)–CO2H ⇌ (pKa1) H3N⁺–CH(R)–CO2⁻ ⇌ (pKa2) H2N–CH(R)–CO2⁻ :high pH

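Table 27.2 (p. 1115) & 27.2 (p. 1116)

$\mathrm{pI} = \dfrac{\mathrm{p}K_{ax} + \mathrm{p}K_{ay}}{2}$

Alanine (sidechain CH3), low pH to high pH: H3N⁺–CH(CH3)–CO2H; pKa1 (2.3); H3N⁺–CH(CH3)–CO2⁻; pKa2 (9.7); H2N–CH(CH3)–CO2⁻

Aspartic acid (sidechain CH2CO2H), low pH to high pH: fully protonated cation; pKa1 (1.9); zwitterion with the sidechain CO2H intact; pKa3 (3.6); sidechain CO2⁻; pKa2 (9.6); H2N form

Lysine (sidechain (CH2)4NH3⁺), low pH to high pH: fully protonated dication; pKa1 (2.2); CO2⁻ form; pKa2 (9.0); α-NH2 form with the sidechain NH3⁺ intact; pKa3 (10.5); sidechain NH2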
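The pI formula can be checked against the three titration schemes above. The sketch below is a minimal illustration, not a general titration model: it assumes the pKa values quoted on this slide, and the function name `isoelectric_point` and its `charge_fully_protonated` argument are made up for the example. The idea is that the neutral species sits between the two pKa values that bracket it, so pI is their average.

```python
def isoelectric_point(pkas, charge_fully_protonated):
    """Average the two pKa values that bracket the net-neutral species.

    pkas: all pKa values of the amino acid (any order).
    charge_fully_protonated: net charge at very low pH (+1 for Ala and Asp,
    +2 for Lys). That many deprotonations lead to the neutral form.
    """
    ordered = sorted(pkas)
    n = charge_fully_protonated
    if not 1 <= n < len(ordered):
        raise ValueError("no neutral species between two listed pKa values")
    # The neutral form appears after the n-th pKa and disappears after the (n+1)-th.
    return (ordered[n - 1] + ordered[n]) / 2


# pKa values as quoted on the slide (assumed here, not re-measured).
examples = {
    "alanine": ([2.3, 9.7], 1),
    "aspartic acid": ([1.9, 3.6, 9.6], 1),
    "lysine": ([2.2, 9.0, 10.5], 2),
}

for name, (pkas, charge) in examples.items():
    print(f"{name}: pI = {isoelectric_point(pkas, charge):.2f}")
```

With these inputs the acidic amino acid averages its two lowest pKa values and the basic one its two highest, which is why the sidechain shifts pI so strongly.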


Electrophoresis: separation of polar compounds based on their mobility through a solid support. The separation is based on charge (pI) or molecular mass.

[Figure: electrophoresis apparatus with negative (–) and positive (+) electrodes]
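To connect charge with migration direction, the net charge of an amino acid at a given pH can be estimated from the Henderson-Hasselbalch relation. This is a rough sketch under stated assumptions: each group ionizes independently, the pKa values are the ones quoted on the earlier slide, and the names `net_charge` and `migration` plus the 0.05 tolerance are invented for the illustration.

```python
def net_charge(ph, acidic_pkas, basic_pkas):
    """Estimate net charge from independent Henderson-Hasselbalch equilibria.

    acidic_pkas: groups that are neutral when protonated (CO2H -> CO2-).
    basic_pkas: groups that are cationic when protonated (NH3+ -> NH2).
    """
    negative = sum(1 / (1 + 10 ** (pka - ph)) for pka in acidic_pkas)
    positive = sum(1 / (1 + 10 ** (ph - pka)) for pka in basic_pkas)
    return positive - negative


def migration(charge, tolerance=0.05):
    """Cations move toward the negative electrode, anions toward the positive one."""
    if charge > tolerance:
        return "toward the cathode (-)"
    if charge < -tolerance:
        return "toward the anode (+)"
    return "stays near the origin"


# (acidic pKa values, basic pKa values), taken from the slide.
amino_acids = {
    "alanine": ([2.3], [9.7]),
    "aspartic acid": ([1.9, 3.6], [9.6]),
    "lysine": ([2.2], [9.0, 10.5]),
}

for name, (acidic, basic) in amino_acids.items():
    q = net_charge(6.0, acidic, basic)
    print(f"{name} at pH 6.0: net charge {q:+.2f}, {migration(q)}")
```

At pH 6.0, close to the pI of alanine, this predicts that alanine barely moves while aspartic acid and lysine separate in opposite directions.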


27.4: Synthesis of Amino Acids:

From α-bromo acids (Ch. 19.16): R–CH2–CO2H; Br2, PBr3; then NH3 gives R–CH(NH2)–CO2H

Strecker Synthesis: recall reductive amination
Reductive amination: R–C(=O)–CO2H + NH3; NaB(CN)H3 gives R–CH(NH2)–CO2H
Strecker: R–CHO + NH3, NaC≡N gives R–CH(NH2)–C≡N; H3O+ or NaOH, H2O gives R–CH(NH2)–CO2H

Amidomalonate Synthesis: recall the malonic acid synthesis
(acyl)HN–CH(CO2Et)2; EtO⁻, RCH2X gives (acyl)HN–C(CH2R)(CO2Et)2; H3O+, - CO2 gives RCH2–CH(NH2)–CO2H

27.5: Reactions of Amino Acids. Amino acids will undergo reactions characteristic of the amino (amide formation) and carboxylic acid (ester formation) groups.

[Scheme: an amino acid with HOCH2CH3, H+ gives the ethyl ester (CO2CH2CH3) as its ammonium salt; an amino acid with acetic anhydride, base gives the N-acetyl amino acid]

27.6: Some Biochemical Reactions of Amino Acids. Many enzymes involved in amino acid biosynthesis, metabolism and catabolism are pyridoxal phosphate (vitamin B6) dependent (please read)

[Figure: pyridoxal phosphate (PLP) and the PLP-dependent transformations of an L-amino acid: decarboxylase; transaminase; racemase, epimerase (giving the D-amino acid)]

27.7: Peptides. Proteins and peptides are polymers made up of amino acid units (residues) that are linked together through the formation of amide bonds (peptide bonds) from the amino group of one residue and the carboxylate of a second residue.

[Scheme: Serine + Alanine, - H2O, gives either Ser - Ala (S - A) or Ala - Ser (A - S); each dipeptide has an N-terminus and a C-terminus]

By convention, peptide sequences are written left to right from the N-terminus to the C-terminus.

[Figure: polypeptide backbone with sidechains R2, R4, R6, N-terminus on the left and C-terminus on the right]
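Since every peptide bond forms with loss of one H2O, the molar mass of a peptide follows from the masses of its free amino acids. The sketch below illustrates that bookkeeping only; the molar masses are standard rounded values supplied for the example (they are not given in these notes), and `peptide_mass` is a made-up helper name.

```python
WATER = 18.015  # g/mol, lost once per peptide bond

# Rounded molar masses (g/mol) of the free amino acids; supplied for this
# example, not taken from the notes.
FREE_AMINO_ACID_MASS = {
    "G": 75.07,   # glycine
    "A": 89.09,   # alanine
    "S": 105.09,  # serine
    "V": 117.15,  # valine
    "F": 165.19,  # phenylalanine
}


def peptide_mass(sequence):
    """Molar mass of a linear peptide written N-terminus to C-terminus."""
    total = sum(FREE_AMINO_ACID_MASS[residue] for residue in sequence)
    peptide_bonds = len(sequence) - 1
    return total - WATER * peptide_bonds


for seq in ("SA", "AS", "FAV"):
    print(f"{'-'.join(seq)}: {peptide_mass(seq):.2f} g/mol")
```

Ser-Ala and Ala-Ser come out with the same mass, a reminder that composition alone cannot distinguish the two sequences.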


The amide (peptide) bond has C=N double bond character due to resonance, resulting in a planar geometry. This restricts rotation and makes the amide bond resistant to hydrolysis.

The N-H bond of one amide linkage can form a hydrogen bond with the C=O of another. N-O distance 2.85 - 3.20 Å; optimal N-H-O angle is 180°.

Disulfide bonds: the thiol groups of cysteine can be oxidized to form disulfides (Cys-S-S-Cys)

[Scheme: two cysteine SH groups + 1/2 O2 give a Cys-S-S-Cys crosslink + H2O, either between two peptide chains or within one chain]

Epidermal Growth Factor (EGF): the miracle of mother’s spit. 53 amino acids, 3 disulfide linkages

1986 Nobel Prize in Medicine: Stanley Cohen, Rita Levi-Montalcini

27.8: Introduction to Peptide Structure Determination.

Protein Structure:
primary (1°) structure: the amino acid sequence
secondary (2°): frequently occurring substructures or folds
tertiary (3°): three-dimensional arrangement of all atoms in a single polypeptide chain
quaternary (4°): overall organization of non-covalently linked subunits of a functional protein.

1. Determine the amino acids present and their relative ratios
2. Cleave the peptide or protein into smaller peptide fragments and determine their sequences
3. Cleave the peptide or protein by another method and determine their sequences. Align the sequences of the peptide fragments from the two methods

Fragments from the first cleavage: E-A-Y-L-V-C-G-E-R; F-V-N-Q-H-L-F-S-H-L-K; G-C-F-L-P-K; L-G-A

Fragments from the second cleavage: F-V-N-Q-H-L-F; S-H-L-K-E-A-Y; L-V-C-G-E-R-G-C-F; L-P-K-L-G-A

Aligned fragments: F-V-N-Q-H-L-F; F-V-N-Q-H-L-F-S-H-L-K; S-H-L-K-E-A-Y; E-A-Y-L-V-C-G-E-R; L-V-C-G-E-R-G-C-F; G-C-F-L-P-K; L-P-K-L-G-A; L-G-A

Full sequence: F-V-N-Q-H-L-F-S-H-L-K-E-A-Y-L-V-C-G-E-R-G-C-F-L-P-K-L-G-A
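The alignment of overlapping fragments can be mimicked with a small greedy merge. This is only a sketch of the idea, assuming error-free fragments and unambiguous overlaps; a greedy merge can fail on repetitive sequences, and the function names are invented for the example. The fragments are the two sets listed above, written in one-letter code without hyphens.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def drop_contained(fragments):
    """Remove duplicates and fragments that lie entirely inside another one."""
    unique = list(dict.fromkeys(fragments))
    return [f for f in unique if not any(f != g and f in g for g in unique)]


def assemble(fragments):
    """Greedily merge the pair with the largest overlap until one piece is left."""
    pool = drop_contained(fragments)
    while len(pool) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(pool):
            for j, b in enumerate(pool):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            raise ValueError("fragments do not overlap; sequence is ambiguous")
        merged = pool[best_i] + pool[best_j][best_k:]
        rest = [f for n, f in enumerate(pool) if n not in (best_i, best_j)]
        pool = drop_contained(rest + [merged])
    return pool[0]


first_cleavage = ["EAYLVCGER", "FVNQHLFSHLK", "GCFLPK", "LGA"]
second_cleavage = ["FVNQHLF", "SHLKEAY", "LVCGERGCF", "LPKLGA"]
print(assemble(first_cleavage + second_cleavage))
```

Neither fragment set alone fixes the order, because fragments within one set do not overlap; the second cleavage supplies the bridging overlaps.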


27.9: Amino Acid Analysis. Automated method to determine the amino acid content of a peptide or protein.

Reaction of primary amines with ninhydrin: [Scheme: amino acid + ninhydrin gives RCHO + CO2 and a colored ninhydrin-derived product]

Workflow: peptide or protein; [H] (reduce any disulfide bonds); enzymatic digestion or H3O+, Δ; individual amino acids; liquid chromatography; derivatize w/ ninhydrin; detected w/ UV-vis

Different amino acids have different chromatographic mobilities (retention times)

1972 Nobel Prize in Chemistry: William Stein, Stanford Moore

27.10: Partial Hydrolysis of Peptides. Acidic hydrolysis of peptides cleaves the amide bonds indiscriminately. Proteases (peptidases): enzymes that catalyze the hydrolysis of the amide bonds of peptides and proteins. Enzymatic cleavage of peptides and proteins at defined sites:

• trypsin: cleaves at the C-terminal side of basic residues, Arg, Lys but not His

[Scheme: trypsin, H2O cleaves the amide bond on the C-terminal side of a residue with a basic (NH3⁺) sidechain, giving two shorter peptides]

• chymotrypsin: cleaves at the C-terminal side of aromatic residues Phe, Tyr, Trp

[Scheme: chymotrypsin, H2O cleaves the amide bond on the C-terminal side of an aromatic residue, giving two shorter peptides]
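The two cleavage rules can be written as a short routine. This is a simplified sketch, assuming ideal specificity: real proteases show exceptions (for example around proline) that are not modeled, and the names `CLEAVAGE_SITES` and `digest` are hypothetical. The test peptide is the octapeptide Asp-Arg-Val-Tyr-Ile-His-Pro-Phe, used here only as a convenient string.

```python
# Residues (one-letter code) after which each enzyme cuts, per the rules above.
CLEAVAGE_SITES = {
    "trypsin": "KR",        # Lys, Arg (not His)
    "chymotrypsin": "FYW",  # Phe, Tyr, Trp
}


def digest(sequence, protease):
    """Split a peptide on the C-terminal side of each recognized residue."""
    sites = CLEAVAGE_SITES[protease]
    fragments, start = [], 0
    # The last residue is skipped: cutting after it would give an empty fragment.
    for i, residue in enumerate(sequence[:-1]):
        if residue in sites:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


peptide = "DRVYIHPF"
print("trypsin:", digest(peptide, "trypsin"))
print("chymotrypsin:", digest(peptide, "chymotrypsin"))
```

Because the two enzymes cut at different residues, their fragment sets overlap, which is what makes the alignment strategy of 27.8 possible.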


Trypsin and chymotrypsin are endopeptidases.

Carboxypeptidase: cleaves the amide bond of the C-terminal amino acid (exopeptidase)

27.11: End Group Analysis. The C-terminal AA is identified by treating the peptide with carboxypeptidase, then analyzing by liquid chromatography (AA Analysis). N-labeling: The peptide is first treated with 1-fluoro-2,4-dinitrobenzene (Sanger’s reagent), which selectively reacts with the N-terminal amino group. The peptide is then hydrolyzed to its amino acids and the N-terminal amino acid is identified as its N-(2,4-dinitrophenyl) derivative (DNP).

[Scheme: nucleophilic aromatic substitution of Sanger’s reagent by the N-terminal amino group; enzymatic digestion or H3O+, Δ then gives the DNP-labeled N-terminal amino acid plus other unlabeled amino acids]

27.12: Insulin. (please read) Insulin has two peptide chains (the A chain has 21 amino acids and the B chain has 30 amino acids) held together by two disulfide linkages.

Pepsin: cleaves at the C-terminal side of Phe, Tyr, Leu; but not at Val or Ala

[Figure legend: pepsin cleavage; trypsin cleavage; H3O+ cleavage]

27.13: The Edman Degradation and Automated Peptide Sequencing. Chemical method for the sequential cleavage and identification of the amino acids of a peptide, one at a time, starting from the N-terminus. Reagent: Ph-N=C=S, phenylisothiocyanate (PITC)

[Scheme: PITC reacts with the N-terminal amino group at pH 9.0; acid (H+) then releases the N-terminal residue as an N-phenylthiohydantoin, leaving a peptide one residue shorter (-1) with a new N-terminal amino acid (repeat degradation cycle)]

N-phenylthiohydantoin: separated by liquid chromatography (based on the R group) and detected by UV-vis

Peptide sequencing by Edman degradation:
• Cycle the pH to control the cleavage of the N-terminal amino acid by PITC.
• Monitor the appearance of the new N-phenylthiohydantoin for each cycle.
• Good for peptides up to ~ 25 amino acids long.
• Longer peptides and proteins must be cut into smaller fragments before Edman sequencing.

Tandem mass spectrometry has largely replaced Edman degradation for peptide sequencing

27.14: The Strategy for Peptide Synthesis: Chemical synthesis of peptides: 1. Solution-phase synthesis 2. Solid-phase synthesis

[Scheme: coupling unprotected Val and Ala (- H2O) can give either Val - Ala (V - A) or Ala - Val (A - V)]

The need for protecting groups

[Scheme: Ala carrying an N-protecting group (Pn) and Val carrying a carboxyl protecting group (OPc) undergo peptide coupling (- H2O) to give protected Ala - Val (A - V); Pn is selectively removed, the free amine is coupled with N-protected Phe (F), and the cycle is repeated to give Phe - Ala - Val (F - A - V)]

Orthogonal protecting group strategy: the carboxylate protecting group must be stable to the reaction conditions for the removal of the α-amino protecting group (and vice versa)

27.15: Amino Group Protection. The α-amino group is protected as a carbamate.

tert-butoxycarbonyl (t-BOC): removed with mild acid
benzyloxycarbonyl (cBz): removed with mild acid or by hydrogenolysis
fluorenylmethyloxycarbonyl (FMOC): removed with mild base (piperidine)

27.16: Carboxyl Group Protection. Protected as a benzyl ester; removed by hydrogenolysis

[Scheme: an amino acid benzyl ester undergoes peptide coupling (- H2O) with an N-protected amino acid; mild acid removes the N-protecting group; a second peptide coupling (- H2O) extends the chain; H2, Pd/C removes the benzyl groups to give the free peptide]

27.17: Peptide Bond Formation. Amide formation from the reaction of an amine with a carboxylic acid is slow. Amide bond formation (peptide coupling) can be accelerated if the carboxylic acid is activated. Reagent: dicyclohexylcarbodiimide (DCC)

[Mechanism: the carboxylic acid adds to DCC (C6H11–N=C=N–C6H11) to give an "activated acid"; attack by the amine R'-NH2 gives the amide plus the urea by-product DCU]

[Scheme: DCC-mediated peptide couplings of cBz- and OBn-protected amino acids (Ala, Val, Phe), with selective removal of the N-protecting group between couplings, leading to Phe - Ala - Val (F - A - V); reagents shown include DCC, CF3CO2H, and H2, Pd/C]

• In order to practically synthesize peptides and proteins, time-consuming purification steps must be avoided until the very end of the synthesis.
• Large excesses of reagents are used to drive reactions forward and accelerate the rate of reactions.
• How are the excess reagents and by-products from the reaction, which will interfere with subsequent coupling steps, removed without a purification step?

27.18: Solid-Phase Peptide Synthesis: The Merrifield Method. Peptides and proteins up to ~ 100 residues long are synthesized on a solid, insoluble, polymer support. Purification is conveniently accomplished after each step by a simple wash and filtration.

The solid support (Merrifield resin): polystyrene polymer

[Scheme: styrene + divinylbenzene (crosslinker, ~1 %), initiator, polymerization; H3COCH2Cl, ZnCl2 then installs CH2Cl groups on the phenyl rings; a BOC-protected amino acid carboxylate is attached to the resin, and CF3CO2H removes the BOC group to expose the NH2]

Solid-phase peptide synthesis

[Scheme: FMOC-protected amino acids are coupled with DCC to the resin-bound amine; each peptide coupling and each removal of the N-protecting group is followed by "purify: wash & filter"; after the last residue, Phe (F), is added, HF removes the N-protecting group and cleaves the peptide from the solid support; the product is purified by liquid chromatography or electrophoresis]

Ribonuclease A: 124 amino acids, catalyzes the hydrolysis of RNA.
Solid-phase synthesis of RNase A: synthetic RNase A had 78 % activity; 0.4 mg was synthesized; 2.9 % overall yield; average yield ~ 97% per coupling step.

His-119 A

LYS GLN SER LYS LYS LEU LYS ASN ILE LYS GLN GLU ASP

GLU HIS SER SER PRO ALA ASN CYS THR TYR ALA GLY ALA

THR MET SER ARG VAL ASP VAL TYR ASP PRO ASN ASN SER

ALA ASP ASN ASN ASN VAL ALA GLN CYS ASN LYS PRO VAL

ALA SER TYR LEU THR GLN CYS SER ARG CYS HIS TYR

ALA SER CYS THR PHE ALA LYS TYR GLU ALA ILE VAL

LYS THR ASN LYS VAL VAL ASN SER THR TYR ILE PRO

PHE SER GLN ASP HIS CYS GLY THR GLY LYS VAL VAL

GLU ALA MET ARG GLU SER GLN MET SER THR ALA HIS

ARG ALA MET CYS SER GLN THR SER SER THR CYS PHE

His-12 A

His-12 B His-119 B

pdb code: 1AFL

R. Bruce Merrifield, Rockefeller University, 1984 Nobel Prize in Chemistry: “for his development of methodology for chemical synthesis on a solid matrix.”


27.19: Secondary Structures of Peptides and Proteins. β-sheet: two or more extended peptide chains, in which the amide backbones are associated by hydrogen bonds.

[Figure: anti-parallel β-sheet (an N→C strand paired with a C←N strand through a loop or turn) and parallel β-sheet (two N→C strands joined by a crossover), with N-H···O=C hydrogen bonds between the strands]

α-helix: 3.6 amino acids per coil; each coil spans 5.4 Å along the helix axis
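The two helix parameters imply a rise per residue of 5.4 Å / 3.6 = 1.5 Å, which gives a quick estimate of how long a helical segment is. A minimal sketch, assuming an ideal, straight α-helix; the constant and function names are illustrative only.

```python
RESIDUES_PER_TURN = 3.6  # amino acids per coil
PITCH_ANGSTROM = 5.4     # length of one coil along the helix axis


def rise_per_residue():
    """Axial distance covered by one residue, in angstroms."""
    return PITCH_ANGSTROM / RESIDUES_PER_TURN


def helix_length(n_residues):
    """Approximate axial length (angstroms) of an ideal helix of n residues."""
    return n_residues * rise_per_residue()


print(f"rise per residue: {rise_per_residue():.2f} Å")
print(f"36-residue helix: {helix_length(36):.1f} Å (10 coils)")
```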


myoglobin, pdb code: 1WLA

Bacteriorhodopsin, pdb code: 1AP9

Parallel β-sheets of carbonic anhydrase, pdb code: 1QRM

Anti-parallel β-sheets of lectin, pdb code: 2LAL

27.20: Tertiary Structure of Polypeptides and Proteins.
Fibrous. Polypeptide strands that “bundle” to form elongated fibrous assemblies; insoluble.
Globular. Proteins that fold into a “spherical” conformation.
Hydrophobic effect. Proteins will fold so that hydrophobic amino acids are on the inside (shielded from water) and hydrophilic amino acids are on the outside (exposed to water).

Pro • Ile • Lys • Tyr • Leu • Glu • Phe • Ile • Ser • Asp • Ala • Ile • Ile • His • Val • His • Ser • Lys

Enzymes: proteins that catalyze biochemical reactions
• by bringing the reactive atoms together in the optimal geometry for the reaction.
• by lowering the activation energy (ΔG‡) by stabilizing the transition state and/or high energy intermediate.
• many enzymes use the functional groups of the amino acid sidechains to carry out the reactions

Proteases (peptidases): catalyze the hydrolysis of peptide bonds

[Scheme: protease, H2O cleaves one amide bond of a peptide, giving one fragment with a new C-terminal CO2⁻ and one with a new N-terminal H3N⁺]

Four classes of proteases:
Serine (trypsin): aspartate-histidine-serine
Aspartyl (HIV protease, renin): two aspartates
Cysteine (papain, caspase): histidine-cysteine
Metallo (Zn2+) (carboxypeptidase, ACE): glutamate

Mechanism of carboxypeptidase, a metalloprotease (p. 1151)

Mechanism of a serine protease (trypsin, chymotrypsin):

[Mechanism figure: catalytic triad Asp102, His57, Ser195 and the oxy-anion hole; Ser195 attacks the substrate amide, R2-NH2 leaves to give the acyl-enzyme intermediate, and water then releases RCO2H and regenerates the enzyme]

27.21: Coenzymes. Some reactions require additional organic molecules or metal ions. These are referred to as cofactors or coenzymes.

[Structures shown: Pyridoxal Phosphate (vitamin B6); Thiamin Diphosphate (vitamin B1); Folic Acid (vitamin B9); Heme; Biotin (vitamin B7); Vitamin B12 (cyanocobalamin); Flavin Adenine Dinucleotide (FAD) (vitamin B2)]

27.22: Protein Quaternary Structure. (please read)

Amino Acids, Peptides and Proteins

1. α-Amino Acids: H2N–CH(R)–COOH, (S) or L amino acids
a) dipolar nature (isoelectric points)
b) synthesis (racemic)
i) from α-bromoacids
ii) Strecker synthesis from aldehydes
iii) reductive amination of α-ketoacids
iv) amidomalonate synthesis

2. Peptides (up to 50 amino acids)
[Structure: tripeptide backbone with sidechains R, R’, R", N-terminal on the left and C-terminal on the right]
a) amino acid analysis
b) sequencing
i) Edman degradation (N-terminal)
ii) carboxypeptidase (C-terminal)
c) peptide synthesis (step-by-step and solid-phase)

3. Proteins (large peptides, occasionally with something else attached)
a) structure (primary, secondary, tertiary and quaternary)
b) classifications

α-Amino Acids

H2N–CH(R)–COOH, R = side chain; ~ 500 known in nature; 20 in humans; 10 of them essential

Examples:
valine: neutral, bulky, hydrophobic
proline: 2° cyclic, bending in the peptide chain
glycine: smallest, not chiral
lysine: basic, nucleophilic, used in catalysis
aspartic acid: acidic, carboxylate available, used in catalysis
cysteine: crosslinking, catalysis
histidine: basic, catalytic side chain

Peptide Structure Determination

1. Amino acid analysis:
a) hydrolysis with HCl/H2O
b) column chromatography
c) detection with ninhydrin

[Scheme: amino acid + ninhydrin, NaOH/H2O, gives a purple product + RCHO + CO2]

Peptide Structure Determination

2. The Edman degradation:
a) treatment with phenyl isothiocyanate (Ph-N=C=S)
b) mild acid hydrolysis
c) the resulting phenylthiohydantoin is identified by chromatography

[Scheme: the N-terminal residue (sidechain R) of the peptide reacts with Ph-N=C=S; H+ releases the phenylthiohydantoin and a peptide one residue shorter, which enters the next cycle]
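The cycle logic of the Edman degradation (remove the N-terminal residue, identify it, repeat on the shortened peptide) can be mocked up in a few lines. This is a bookkeeping sketch only: it assumes every cycle goes to completion and ignores the chemistry, and `edman_cycles` is a hypothetical name. The input is the angiotensin II sequence Asp-Arg-Val-Tyr-Ile-His-Pro-Phe in one-letter code.

```python
def edman_cycles(peptide, max_cycles=None):
    """Yield (cycle number, residue released, remaining peptide) per cycle.

    peptide: one-letter sequence written from the N-terminus to the C-terminus.
    max_cycles: optional cap on the number of cycles to run.
    """
    cycles = len(peptide) if max_cycles is None else min(max_cycles, len(peptide))
    remaining = peptide
    for cycle in range(1, cycles + 1):
        # The residue at the N-terminus is the one released in this cycle.
        released, remaining = remaining[0], remaining[1:]
        yield cycle, released, remaining


for cycle, released, remaining in edman_cycles("DRVYIHPF", max_cycles=3):
    print(f"cycle {cycle}: released {released}, remaining peptide {remaining}")
```

Reading the released residues in cycle order gives the sequence from the N-terminus.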

Peptide Structure Determination

3. C-terminal residue determination
a) enzyme (carboxypeptidase) used to hydrolyze one amino acid at the C-terminus
b) identification of the amino acid
c) further hydrolysis

4. Putting together peptide (protein) structure from fragments

Fragments: Asp-Arg-Val; Arg-Val-Tyr; Val-Tyr-Ile; Ile-His-Pro; Pro-Phe
Assembled sequence: Asp-Arg-Val-Tyr-Ile-His-Pro-Phe (angiotensin II)

(Di)Peptide Synthesis

a) protect the amino group of amino acid 1:
H2N–CH(R1)–COOH gives BOC–NH–CH(R1)–COOH

b) protect the carboxyl group of amino acid 2:
H2N–CH(R2)–COOH, PhCH2OH/HCl, gives H2N–CH(R2)–COOCH2Ph

c) couple the two amino acids using DCC:
BOC–NH–CH(R1)–COOH + H2N–CH(R2)–COOCH2Ph, DCC, gives BOC–NH–CH(R1)–CO–NH–CH(R2)–COOCH2Ph

d) remove the protective groups:
CF3COOH removes BOC, giving H2N–CH(R1)–CO–NH–CH(R2)–COOCH2Ph; H2/Pd or NaOH then removes the benzyl ester, giving H2N–CH(R1)–CO–NH–CH(R2)–COOH

Note: side-groups of some amino acids require extra protection-deprotection!

Peptide Synthesis: Solid-Phase Technique

1. BOC-protected amino acid is linked to the polystyrene beads (SN2 ester bond formation)
2. The beads are washed (to remove excess reagents) and treated with CF3COOH to remove the BOC group
3. A second BOC-protected amino acid is coupled to the first one using DCC. The beads are washed.
4. The cycle of deprotection, coupling and washing is repeated as many times as desired to add amino acid units to the growing chain.
5. After the desired peptide has been made, treatment with anhydrous HF removes the final BOC group and cleaves the ester bond to the polymer
6. The peptide is purified

The yields of the reactions are critical! For a 20-residue peptide (20 aa), the synthesis requires 40 chemical steps (not counting special treatment for some side groups).
If the yield is 90% per step, the overall yield is only 0.9^40 ≈ 1.5%.
If the yield is 99% per step, the overall yield is 0.99^40 ≈ 67%.
If the yield is 99.9% per step, the overall yield is 0.999^40 ≈ 96%.
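The overall yield of a linear synthesis is the per-step yield raised to the number of steps. The short sketch below simply reproduces the three cases above; it assumes every step has the same yield, and `overall_yield` is an illustrative name.

```python
def overall_yield(per_step_yield, steps):
    """Overall yield of a linear sequence of steps with equal yields (as fractions)."""
    return per_step_yield ** steps


STEPS = 40  # 20 residues x (coupling + deprotection)

for per_step in (0.90, 0.99, 0.999):
    total = overall_yield(per_step, STEPS)
    print(f"{per_step:.1%} per step over {STEPS} steps: {total:.1%} overall")
```

The steep dependence on per-step yield is the reason solid-phase methods push each coupling toward completion with excess reagents.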

Protein Structure and Function

1. Primary structure: sequence of amino acids
2. Secondary structure: three-dimensional structure of segments (α-helical, β-pleated sheets)
3. Tertiary structure: three-dimensional arrangements of segments
4. Quaternary structure: three-dimensional shape of several proteins in a protein complex

Protein denaturation

Fibrous proteins (insoluble)

Globular proteins (soluble)

Simple proteins

Conjugated proteins (carbohydrates, nucleic acids)

Enzymes: holoenzyme = apoenzyme + cofactor (vitamin)

7.7 Reaction pathway for Amide Formation

The reaction pathway for amide formation is very similar to that of ester formation, with the amine substituting for the alcohol in the reaction pathway. Try the following problems showing the reaction pathway at both low temperature and high temperature:

7.8 Amino Acids and proteins. Alpha amino acids (often just referred to as amino acids) are particularly important examples of amines and have the structure:

They contain both amine groups and carboxylic acids. There are 20 different common R groups on the primary amino acids. In most cases the R group is different from the other three groups, and hence the alpha C has 4 different groups bonded to it and is a chiral center. The structures of some common amino acids are:

Notice that the central C has 4 different groups and hence is a chiral center in all of the above structures except glycine. Also note the common but confusing COOH notation for the carboxylic acid functional group. Two amino acids can link together head-to-tail to form an amide bond between them. The amide bond between two amino acids is called a peptide bond, and we call the resulting molecule a dipeptide.

The resulting dipeptide still has an amine group on one end and a carboxylic acid on the other end, so additional amino acids can be covalently bonded to either end of the molecule. A molecule containing three amino acids is called a tripeptide; one with four amino acids is called a tetrapeptide, and so forth. The end with a free amine group is called the N-terminal end, and the end with the free carboxylic acid group is called the C-terminal end of the peptide.

Tripeptide

Tetrapeptide

Chains of fewer than 50 amino acids are referred to as polypeptides. When 50 or more amino acids are linked together, the molecule is commonly referred to as a protein. In the presence of acid catalysis and water, amino acids linked by amide (peptide) bonds can hydrolyze back to separate amino acids:

peptide + H2O → amino acids

This is in fact what happens when one eats proteins. They are hydrolyzed back to short peptides and amino acids with the help of HCl and the protease pepsin in the stomach and an additional collection of protease enzymes in the small intestine. The amino acids are absorbed and then rebuilt into proteins by your body.

CHAPTER

05-McKee-Chap05.qxd:05-McKee-Chap05

1/13/11

6:55 PM

Page 1

Amino Acids, Peptides, and Proteins

OUTLINE
SPIDER SILK: A BIOSTEEL PROTEIN
5.1 AMINO ACIDS
Amino Acid Classes
Biologically Active Amino Acids
Modified Amino Acids in Proteins
Amino Acid Stereoisomers
Titration of Amino Acids
Amino Acid Reactions
5.2 PEPTIDES
5.3 PROTEINS
Protein Structure
The Folding Problem
Fibrous Proteins
Globular Proteins

5.4 MOLECULAR MACHINES

BIOCHEMISTRY IN PERSPECTIVE Spider Silk and Biomimetics

BIOCHEMISTRY IN THE LAB Protein Technology Available Online

BIOCHEMISTRY IN PERSPECTIVE Protein Poisons

BIOCHEMISTRY IN PERSPECTIVE Lead Poisoning

BIOCHEMISTRY IN PERSPECTIVE Protein Folding and Human Disease

BIOCHEMISTRY IN PERSPECTIVE Myosin: A Molecular Machine

BIOCHEMISTRY IN THE LAB Protein Sequence Analysis: The Edman Degradation

A Spider’s Web Constructed with Silk Fiber The amino acid sequence of spider silk protein and the spider’s silk fiber spinning process combine to make spider silk, one of the strongest materials on earth.


Spider Silk: A Biosteel Protein

Spiders have evolved over 400 million years into exceptionally successful predators. These invertebrate animals are a class of arthropods, called the arachnids. They have an exoskeleton, a segmented body, and jointed appendages. Although spiders possess an efficient venom injection system, their most impressive feature is the production of silk, a multiuse protein fiber. Silk, which is spun through spinnerets at the end of the spider’s abdomen, is used in locomotion, mating, and offspring protection. The most prominent use of spider silk, however, is prey capture. The most sophisticated method of prey capture is the spiral, wheel-shaped orb web, which is oriented vertically to intercept fast-moving flying prey. Spider silk’s mechanical properties ensure that the web readily absorbs impact energy so that prey is retained until the spider can subdue it. Orb webs (and the species that produce them) have fascinated humans for many thousands of years because of their dramatic visual impact. Ancient Greeks and Romans, for example, explained the occurrence of spiders and orb webs with the myth of Arachne, in which the mortal woman Arachne, an extraordinarily gifted weaver, offended Minerva (Athena in the Greek version), the goddess of weaving and other crafts, with her arrogant acceptance of a challenge to a weaving contest with the goddess. When confronted with Arachne’s flawless work, an enraged Minerva transformed her into a spider, doomed to forever weave webs. Humans have also long appreciated spider webs for their physical properties. Examples range from the ancient Greeks, who used spider webs to treat wounds, to the Australian aborigines who

used spider silk to make fishing lines. In modern times spider silk has served as crosshairs in scientific equipment and gun sights. In the past several decades, spider silk and orb webs have attracted the attention of life scientists, bioengineers, and material scientists as they began to appreciate the unique mechanical properties of this remarkable protein. There are eight different types of spider silk, although no spider makes all of them. Dragline silk, a very strong fiber, is used for frame and radial lines in orb webs and as a safety line (to break a fall or escape other predators). Capture silk, an elastic and sticky fiber, is used in the spiral of webs. Spider silk is a lightweight fiber with impressive mechanical properties. Toughness, a combination of stiffness and strength, is a measure of how much energy is needed to rupture a fiber. Spider silk is about five times as tough as high-grade steel wire of the same weight and about twice as tough as synthetic fibers such as Kevlar (used in body armor). Spider silk’s tensile strength, the resistance of a material to breaking when stretched, is as great as that of Kevlar and greater than that of high-grade steel wire. Torsional resistance, the capacity of a fiber to resist twisting (an absolute requirement for draglines used as safety lines), is higher for spider silk than for all textile fibers, including Kevlar. It also has superior elasticity and resilience, the capacity of a material when it is deformed elastically to absorb and then release energy. Scientists estimate that a 2.54 cm (1 in)–thick rope made of spider silk could be substituted for the flexible steel arresting wires used on aircraft carriers to rapidly stop a jet plane as it lands.

Overview Proteins are molecular tools that perform an astonishing variety of functions. In addition to serving as structural materials in all living organisms (e.g., actin and myosin in animal muscle cells), proteins are involved in such diverse functions as catalysis, metabolic regulation, transport, and defense. Proteins are composed of one or more polypeptides, unbranched polymers of 20 different amino acids. The genomes of most organisms specify the amino acid sequences of thousands or tens of thousands of proteins.


5.1 Amino Acids 3

Proteins are a diverse group of macromolecules (Figure 5.1). This diversity is directly related to the combinatorial possibilities of the 20 amino acid monomers. Amino acids can be theoretically linked to form protein molecules in any imaginable size or sequence. Consider, for example, a hypothetical protein composed of 100 amino acids. The total possible number of combinations for such a molecule is an astronomical 20^100. However, of the trillions of possible protein sequences, only a small fraction (possibly no more than 2 million) is actually produced in all living organisms. An important reason for this remarkable discrepancy is demonstrated by the complex set of structural and functional properties of naturally occurring proteins that have evolved over billions of years in response to selection pressure. Among these are (1) structural features that make protein folding a relatively rapid and successful process, (2) the presence of binding sites that are specific for one or a small group of molecules, (3) an appropriate balance of structural flexibility and rigidity so that function is maintained, (4) surface structure that is appropriate for a protein’s immediate environment (i.e., hydrophobic in membranes and hydrophilic in cytoplasm), and (5) vulnerability of proteins to degradation reactions when they become damaged or no longer useful.

Proteins can be distinguished based on their number of amino acids (called amino acid residues), their overall amino acid composition, and their amino acid sequence. Selected examples of the diversity of proteins are illustrated in Figure 5.1. Molecules with molecular weights ranging from several thousand to several million daltons are called polypeptides. Those with low molecular weights, typically consisting of fewer than 50 amino acids, are called peptides. The term protein describes molecules with more than 50 amino acids. Each protein consists of one or more polypeptide chains. This chapter begins with a review of the structures and chemical properties of the amino acids. This is followed by descriptions of the structural and functional features of peptides and proteins and the protein folding process. The emphasis throughout is on the intimate relationship between the structure and function of polypeptides. In Chapter 6 the functioning of the enzymes, an especially important group of proteins, is discussed. Protein synthesis is covered in Chapter 19.

FIGURE 5.1 Protein Diversity Proteins occur in an enormous diversity of sizes and shapes. Proteins shown (scale bar: 5 nm): phosphocarrier protein HPr, lysozyme, myoglobin, catalase, hemoglobin, deoxyribonuclease, cytochrome c, porin, collagen, chymotrypsin, calmodulin, alcohol dehydrogenase, aspartate transcarbamoylase, and insulin.
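To get a feel for how large the sequence space for a 100-residue protein is, the count can be evaluated exactly with Python's arbitrary-precision integers. This is a minimal sketch; the function name `count_sequences` is an illustrative choice.

```python
def count_sequences(length: int, alphabet_size: int = 20) -> int:
    """Number of distinct linear sequences of `length` residues
    drawn from `alphabet_size` different amino acids."""
    return alphabet_size ** length


n = count_sequences(100)
# Report the order of magnitude rather than the full integer.
print(f"20^100 has {len(str(n))} decimal digits")
```

The result is a number on the order of 10^130, which is why only a vanishingly small fraction of possible sequences can ever have been sampled by living organisms.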

5.1 AMINO ACIDS The hydrolysis of each polypeptide yields a set of amino acids, referred to as the molecule’s amino acid composition. The structures of the 20 amino acids that are commonly found in naturally occurring polypeptides are shown in Figure 5.2.


CHAPTER FIVE Amino Acids, Peptides, and Proteins

FIGURE 5.2 The Standard Amino Acids The ionization state of the amino acid molecules in this illustration represents the dominant species that occur at a pH of 7. The side chains are indicated by shaded boxes. Structures shown: Glycine (Gly), Alanine (Ala), Valine (Val), Leucine (Leu), Isoleucine (Ile), Phenylalanine (Phe), Tryptophan (Trp), Methionine (Met), Serine (Ser), Threonine (Thr), Tyrosine (Tyr), Aspartate (Asp), Glutamate (Glu), Lysine (Lys), Cysteine (Cys), Asparagine (Asn), Arginine (Arg), Proline (Pro), Glutamine (Gln), Histidine (His).


TABLE 5.1 Names and Abbreviations of the Standard Amino Acids

Amino Acid       Three-Letter Abbreviation   One-Letter Abbreviation
Alanine          Ala                         A
Arginine         Arg                         R
Asparagine       Asn                         N
Aspartic acid    Asp                         D
Cysteine         Cys                         C
Glutamic acid    Glu                         E
Glutamine        Gln                         Q
Glycine          Gly                         G
Histidine        His                         H
Isoleucine       Ile                         I
Leucine          Leu                         L
Lysine           Lys                         K
Methionine       Met                         M
Phenylalanine    Phe                         F
Proline          Pro                         P
Serine           Ser                         S
Threonine        Thr                         T
Tryptophan       Trp                         W
Tyrosine         Tyr                         Y
Valine           Val                         V
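The abbreviations in Table 5.1 lend themselves to a simple lookup table. The sketch below converts a peptide written with three-letter codes into the one-letter form; the names `THREE_TO_ONE` and `to_one_letter` are illustrative and not part of the text.

```python
# Three-letter to one-letter codes, copied from Table 5.1.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Glu": "E", "Gln": "Q", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues):
    """Convert a list of three-letter codes (N-terminus first)
    into a one-letter sequence string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


print(to_one_letter(["Tyr", "Ala", "Cys", "Gly"]))
```

An unknown code raises a `KeyError`, which is usually preferable to silently skipping a residue.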

These amino acids are referred to as standard amino acids. Common abbreviations for the standard amino acids are listed in Table 5.1. Note that 19 of the standard amino acids have the same general structure (Figure 5.3). These molecules contain a central carbon atom (the α-carbon) to which an amino group, a carboxylate group, a hydrogen atom, and an R (side chain) group are attached. The exception, proline, differs from the other standard amino acids in that its amino group is secondary, formed by ring closure between the R group and the amino nitrogen. Proline confers rigidity to the peptide chain because rotation about the α-carbon is not possible. This structural feature has significant implications in the structure and, therefore, the function of proteins with a high proline content. Nonstandard amino acids consist of amino acid residues that have been chemically modified after incorporation into a polypeptide or amino acids that occur in living organisms but are not found in proteins. Nonstandard amino acids found in proteins are usually the result of posttranslational modifications (chemical changes that follow protein synthesis). Selenocysteine, an exception to this rule, is discussed in Chapter 19. At a pH of 7, the carboxyl group of an amino acid is in its conjugate base form (-COO−), and the amino group is in its conjugate acid form (-NH3+). Thus each amino acid can behave as either an acid or a base. The term amphoteric is used to describe this property. Molecules that bear both positive and negative charges are called zwitterions. The R group gives each amino acid its unique properties.

Amino Acid Classes Because the sequence of amino acids determines the final three-dimensional configuration of each protein, their structures are examined carefully in the next four subsections. Amino acids are classified according to their capacity to interact with water. By using this criterion, four classes may be distinguished: (1) nonpolar, (2) polar, (3) acidic, and (4) basic.

FIGURE 5.3 General Structure of the α-Amino Acids


NONPOLAR AMINO ACIDS The nonpolar amino acids contain mostly hydrocarbon R groups that do not bear positive or negative charges. Nonpolar (i.e., hydrophobic) amino acids play an important role in maintaining the three-dimensional structures of proteins, because they interact poorly with water. Two types of hydrocarbon side chains are found in this group: aromatic and aliphatic. Aromatic hydrocarbons contain cyclic structures that constitute a class of unsaturated hydrocarbons with planar conjugated π electron clouds. Benzene is one of the simplest aromatic hydrocarbons (Figure 5.4). The term aliphatic refers to nonaromatic hydrocarbons such as methane and cyclohexane. Phenylalanine and tryptophan contain aromatic ring structures. Glycine, alanine, valine, leucine, isoleucine, and proline have aliphatic R groups. A sulfur atom appears in the aliphatic side chains of methionine and cysteine. Methionine contains a thioether group (-S-CH3) in its side chain. Its derivative S-adenosyl methionine (SAM) is an important metabolite that serves as a methyl donor in numerous biochemical reactions.

FIGURE 5.4 Benzene

POLAR AMINO ACIDS Because polar amino acids have functional groups capable of hydrogen bonding, they easily interact with water. (Polar amino acids are described as hydrophilic, or “water-loving.”) Serine, threonine, tyrosine, asparagine, and glutamine belong to this category. Serine, threonine, and tyrosine contain a polar hydroxyl group, which enables them to participate in hydrogen bonding, an important factor in protein structure. The hydroxyl groups serve other functions in proteins. For example, the formation of the phosphate ester of tyrosine is a common regulatory mechanism. Additionally, the -OH groups of serine and threonine are points for attaching carbohydrates. Asparagine and glutamine are amide derivatives of the acidic amino acids aspartic acid and glutamic acid, respectively. Because the amide functional group is highly polar, the hydrogen-bonding capability of asparagine and glutamine has a significant effect on protein stability. The sulfhydryl group (-SH) of cysteine is highly reactive and is an important component of many enzymes. It also binds metals (e.g., iron and copper ions) in proteins. Additionally, the sulfhydryl groups of two cysteine molecules oxidize easily in the extracellular compartment to form a disulfide compound called cystine. (See p. 136 for a discussion of this reaction.)

ACIDIC AMINO ACIDS Two standard amino acids have side chains with carboxylate groups. Because the side chains of aspartic acid and glutamic acid are negatively charged at physiological pH, they are often referred to as aspartate and glutamate.

QUESTION 5.1 Classify these standard amino acids according to whether their structures are nonpolar, polar, acidic, or basic. (Structures (a) to (d) appear in the original figure.)


BASIC AMINO ACIDS Basic amino acids bear a positive charge at physiological pH. They can therefore form ionic bonds with acidic amino acids. Lysine, which has a side chain amino group, accepts a proton from water to form the conjugate acid (-NH3+). When lysine side chains in collagen fibrils, a vital structural component of ligaments and tendons, are oxidized and subsequently condensed, strong intramolecular and intermolecular cross-linkages are formed. Because the guanidino group of arginine has a pKa range of 11.5 to 12.5 in proteins, it is permanently protonated at physiological pH and, therefore, does not function in acid-base reactions. The imidazole side chain of histidine, on the other hand, is a weak base: its pKa is approximately 6, so it is only partially ionized at pH 7. Its capacity under physiological conditions to accept or donate protons in response to small changes in pH plays an important role in the catalytic activity of numerous enzymes.

KEY CONCEPT Amino acids are classified according to their capacity to interact with water. This criterion may be used to distinguish four classes: nonpolar, polar, acidic, and basic.

Biologically Active Amino Acids In addition to their primary function as components of protein, amino acids have several other biological roles.
1. Several α-amino acids or their derivatives act as chemical messengers (Figure 5.5). For example, glycine, glutamate, γ-aminobutyric acid (GABA, a derivative of glutamate), and serotonin and melatonin (derivatives of tryptophan) are neurotransmitters, substances released from one nerve cell that influence the function of a second nerve cell or a muscle cell. Thyroxine (a tyrosine derivative produced in the thyroid gland of animals) and indole acetic acid (a tryptophan derivative found in plants) are hormones, chemical signal molecules produced in one cell that regulate the function of other cells.
2. Amino acids are precursors of a variety of complex nitrogen-containing molecules. Examples include the nitrogenous base components of nucleotides and the nucleic acids, heme (the iron-containing organic group required for the biological activity of several important proteins), and chlorophyll (a pigment of critical importance in photosynthesis).

FIGURE 5.5 Some Derivatives of Amino Acids Structures shown: GABA, thyroxine, serotonin, indole acetic acid, and melatonin.


FIGURE 5.6 Citrulline and Ornithine

3. Several standard and nonstandard amino acids act as metabolic intermediates. For example, arginine (Figure 5.2), citrulline, and ornithine (Figure 5.6) are components of the urea cycle (Chapter 15). The synthesis of urea, a molecule formed in vertebrate livers, is the principal mechanism for the disposal of nitrogenous waste.

Modified Amino Acids in Proteins Several proteins contain amino acid derivatives that are formed after a polypeptide chain has been synthesized. Among these modified amino acids is γ-carboxyglutamic acid (Figure 5.7), a calcium-binding amino acid residue found in the blood-clotting protein prothrombin. Both 4-hydroxyproline and 5-hydroxylysine are important structural components of collagen, the most abundant protein in connective tissue. Phosphorylation of the hydroxyl-containing amino acids serine, threonine, and tyrosine is often used to regulate the activity of proteins. For example, the synthesis of glycogen is significantly curtailed when the enzyme glycogen synthase is phosphorylated. Two other modified amino acids, selenocysteine and pyrrolysine, are discussed in Chapter 19.

FIGURE 5.7 Some Modified Amino Acid Residues Found in Polypeptides Structures shown: γ-carboxyglutamate, 4-hydroxyproline, 5-hydroxylysine, and o-phosphoserine.

FIGURE 5.8 Two Enantiomers L-Alanine and D-alanine are mirror images of each other. (Nitrogen = large red ball; Hydrogen = small red ball; Carbon = black ball; Oxygen = blue balls)

Amino Acid Stereoisomers Because the α-carbons of 19 of the 20 standard amino acids are attached to four different groups (i.e., a hydrogen, a carboxyl group, an amino group, and an R group), they are referred to as asymmetric, or chiral, carbons. Glycine is a symmetrical molecule because its α-carbon is attached to two hydrogens. Molecules with chiral carbons can exist as stereoisomers, molecules that differ only in the spatial arrangement of their atoms. Three-dimensional representations of amino acid stereoisomers are illustrated in Figure 5.8. Notice in the figure that the atoms of the two isomers are bonded together in the same pattern except for the position of the ammonium group and the hydrogen atom. These two isomers are mirror images of each other. Such molecules, called enantiomers, cannot be superimposed on each other. Enantiomers have identical physical properties except that they rotate plane-polarized light in opposite directions. Plane-polarized light is produced by passing unpolarized light through a special filter; the light waves vibrate in only one plane. Molecules that possess this property are called optical isomers. Glyceraldehyde is the reference compound for optical isomers (Figure 5.9). One glyceraldehyde isomer rotates the light beam in a clockwise direction and is said to be dextrorotatory (designated by +). The other glyceraldehyde isomer, referred to as levorotatory (designated by −), rotates the beam in the opposite direction to an equal degree. Optical isomers


are often designated as D or L (e.g., D-glucose, L-alanine) to indicate the similarity of the arrangement of atoms around a molecule’s asymmetric carbon to the asymmetric carbon in either of the glyceraldehyde isomers. Most biomolecules have more than one chiral carbon. As a result, the letters D and L refer only to a molecule’s structural relationship to either of the glyceraldehyde isomers, not to the direction in which it rotates plane-polarized light. Most asymmetric molecules found in living organisms occur in only one stereoisomeric form, either D or L. For example, with few exceptions, only L-amino acids are found in proteins. Chirality has had a profound effect on the structural and functional properties of biomolecules. For example, the right-handed helices observed in proteins result from the exclusive presence of L-amino acids. Polypeptides synthesized in the laboratory from a mixture of both D- and L-amino acids do not form helices. In addition, because the enzymes are chiral molecules, most bind substrate (reactant) molecules in only one enantiomeric form. Proteases, enzymes that degrade proteins by hydrolyzing peptide bonds, cannot degrade artificial polypeptides composed of D-amino acids.

QUESTION 5.2 Certain bacterial species have outer layers composed of polymers made of D-amino acids. Immune system cells, whose task is to attack and destroy foreign cells, cannot destroy these bacteria. Suggest a reason for this phenomenon.

Titration of Amino Acids Amino acids contain ionizable groups (Table 5.2). The predominant ionic form of these molecules in solution therefore depends on the pH. Titration of an amino acid illustrates the effect of pH on amino acid structure (Figure 5.10a). Titration is also a useful tool in determining the reactivity of amino acid side chains. Consider alanine, a simple amino acid, which has two titratable groups. During titration with a strong base such as NaOH, alanine loses two protons in stepwise fashion. In a strongly acidic solution (e.g., at pH 0), alanine is present mainly in the form in which the carboxyl group is uncharged. Under this circumstance the molecule’s net charge is +1 because the ammonium group is protonated. If the H+ concentration is lowered, the carboxyl group loses its proton to become a negatively charged carboxylate group. (In a polyprotic acid, the protons are first lost from the group with the lowest pKa.) Once the carboxyl group has lost its proton, alanine has no net charge and is electrically neutral. The pH at which this occurs is called the isoelectric point (pI). The isoelectric point for alanine may be calculated as follows:

pI = (pK1 + pK2)/2

The pK1 and pK2 values for alanine are 2.34 and 9.69, respectively (see Table 5.2). The pI value for alanine is therefore

pI = (2.34 + 9.69)/2 = 6.02

As the titration continues, the ammonium group loses its proton, leaving an uncharged amino group. The molecule then has a net negative charge because of the carboxylate group.
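The alanine calculation can be written as a small Python sketch. The function name `isoelectric_point` is an illustrative choice; the pKa values are those the text uses for alanine (Table 5.2), and the sketch applies only to amino acids without an ionizable side chain.

```python
def isoelectric_point(pk_carboxyl: float, pk_amino: float) -> float:
    """pI of an amino acid with no ionizable side chain:
    the average of the two pKa values that bracket the zwitterion."""
    return (pk_carboxyl + pk_amino) / 2


# Alanine: pK1 = 2.34 (-COOH), pK2 = 9.69 (-NH3+), from Table 5.2.
pI_alanine = isoelectric_point(2.34, 9.69)
print(f"pI of alanine is about {pI_alanine:.3f}")
```

For amino acids with ionizable side chains, the two pKa values to average must be chosen case by case.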

FIGURE 5.9 D- and L-Glyceraldehyde These molecules are mirror images of each other.

KEY CONCEPTS • Molecules with an asymmetric or chiral carbon atom differ only in the spatial arrangement of the atoms attached to the carbon. • The mirror-image forms of a molecule are called enantiomers. • Most asymmetric molecules in living organisms occur in only one stereoisomeric form.


TABLE 5.2 pKa Values for the Ionizing Groups of the Amino Acids

Amino Acid       pK1 (-COOH)   pK2 (-NH3+)   pKR
Glycine          2.34          9.60
Alanine          2.34          9.69
Valine           2.32          9.62
Leucine          2.36          9.60
Isoleucine       2.36          9.60
Serine           2.21          9.15
Threonine        2.63          10.43
Methionine       2.28          9.21
Phenylalanine    1.83          9.13
Tryptophan       2.83          9.39
Asparagine       2.02          8.80
Glutamine        2.17          9.13
Proline          1.99          10.60
Cysteine         1.71          10.78         8.33
Histidine        1.82          9.17          6.00
Aspartic acid    2.09          9.82          3.86
Glutamic acid    2.19          9.67          4.25
Tyrosine         2.20          9.11          10.07
Lysine           2.18          8.95          10.79
Arginine         2.17          9.04          12.48

FIGURE 5.10 Titration of Two Amino Acids (a) Alanine and (b) Glutamic Acid. The ionized forms of glutamic acid are illustrated on p. xxx.


Amino acids with ionizable side chains have more complex titration curves. Glutamic acid, for example, has a carboxyl side chain group (Figure 5.10b). At low pH, glutamic acid has a net charge of +1. As base is added, the α-carboxyl group loses a proton to become a carboxylate group. Glutamate now has no net charge.

As more base is added, the second carboxyl group loses a proton, and the molecule has a −1 charge. Adding additional base results in the ammonium ion losing its proton. At this point, glutamate has a net charge of −2. The pI value for glutamate is the pH halfway between the pKa values for the two carboxyl groups (i.e., the pKa values that bracket the zwitterion):

pI = (2.19 + 4.25)/2 = 3.22
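The charge states described for glutamic acid can be checked numerically. The sketch below is not from the text: it assumes each group ionizes independently according to the Henderson-Hasselbalch relationship, and the function name `net_charge` is illustrative. The pKa values are those listed for glutamic acid in Table 5.2.

```python
def net_charge(pH, acidic_pkas, basic_pkas):
    """Average net charge of a molecule at a given pH, assuming each
    group ionizes independently (Henderson-Hasselbalch)."""
    charge = 0.0
    for pka in acidic_pkas:
        # -COOH groups: neutral when protonated, -1 when deprotonated.
        charge -= 1.0 / (1.0 + 10 ** (pka - pH))
    for pka in basic_pkas:
        # -NH3+ groups: +1 when protonated, neutral when deprotonated.
        charge += 1.0 / (1.0 + 10 ** (pH - pka))
    return charge


# Glutamic acid: alpha-carboxyl 2.19, side-chain carboxyl 4.25, amino 9.67.
GLU_ACIDIC = [2.19, 4.25]
GLU_BASIC = [9.67]

for pH in (1.0, 3.22, 7.0, 12.0):
    print(f"pH {pH:5.2f}: net charge {net_charge(pH, GLU_ACIDIC, GLU_BASIC):+.2f}")
```

Under these assumptions the net charge is close to +1 at pH 1, essentially zero at pH 3.22, close to −1 at pH 7, and close to −2 at pH 12, in line with the titration sequence described above.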

Problems 5.1 to 5.3 are sample titration problems. When amino acids are incorporated in polypeptides, the α-amino and α-carboxyl groups lose their charges. Consequently, except for the α-amino and α-carboxyl groups of the amino acid residues at the beginning and end, respectively, of a polypeptide chain, all the ionizable groups of proteins are the side chain groups of seven amino acids: histidine, lysine, arginine, aspartate, glutamate, cysteine, and tyrosine. It should be noted that the pKa values of these groups can differ from those of free amino acids. The pKa values of individual R groups are affected by their positions within protein microenvironments. For example, when the side chain groups of two aspartate residues are in close proximity, the pKa of one of the carboxylate groups is raised. The significance of this phenomenon will become apparent in the discussion of enzyme catalytic mechanisms (Section 6.4).

QUESTION 5.3 Calculate the isoelectric point of the following tripeptide: (structure shown in the original figure)

Assume that the pKa values listed for the amino acids in Table 5.2 are applicable to this problem.


WORKED PROBLEM 5.1 Consider the following amino acid and its pKa values: (structure: an amino acid with a side chain amino group)

pKa1 = 2.18
pKa2 = 8.95
pKaR = 10.79

a. Draw the structure of the amino acid as the pH of the solution changes from highly acidic to strongly basic.

Solution (a) As the pH rises, the amino acid passes through four ionic forms:
1. -COOH, α-NH3+, side chain -NH3+ (net charge +2)
2. -COO−, α-NH3+, side chain -NH3+ (net charge +1)
3. -COO−, α-NH2, side chain -NH3+ (net charge 0)
4. -COO−, α-NH2, side chain -NH2 (net charge −1)
The ionizable hydrogens are lost in order of acidity, with the most acidic ionizing first.

b. Which form of the amino acid is present at the isoelectric point?

Solution (b) The form present at the isoelectric point is electrically neutral: the carboxylate group and the side chain ammonium group are charged, and the α-amino group is uncharged.

c. Calculate the isoelectric point.

Solution (c) The isoelectric point is the average of the two pKas bracketing the zwitterion.

pI = (pK2 + pKR)/2 = (8.95 + 10.79)/2 = 9.87

KEY CONCEPTS • Titration is useful in determining the relative ionization potential of acidic and basic groups in an amino acid or peptide. • The pH at which an amino acid has no net charge is called its isoelectric point.


WORKED PROBLEM 5.2 a. Sketch the titration curve for the amino acid lysine. Solution (a)

Plateaus appear at the pKa values and are centered about 0.5 equivalent (Eq), 1.5 Eq, and 2.5 Eq of base. There is a sharp rise at 1 Eq, 2 Eq, and 3 Eq. The isoelectric point is midway on the sharp rise between pKa2 and pKaR. b. In what direction does the amino acid move when placed in an electric field at the following pH values: 1, 3, 5, 7, 9, and 12? Choice 1: does not move; Choice 2: toward the cathode (negative electrode); Choice 3: toward the anode (positive electrode). Solution (b)

At pH values below the pI (in this case 9.87), the amino acid is positively charged and moves to the cathode. Therefore, the amino acid in this problem will move to the cathode at the pH values of 1, 3, 5, 7, and 9. The amino acid will be negatively charged at a pH value of 12. Under this condition, the amino acid will move to the anode.
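The reasoning in Solution (b) reduces to comparing the pH with the pI. A minimal Python sketch of that rule follows; the function name `migration_direction` is an illustrative choice, and the pI is computed from the lysine values used in Worked Problem 5.1.

```python
import math


def migration_direction(pH: float, pI: float) -> str:
    """Direction in which a molecule moves in an electric field.
    Below the pI it is positively charged (moves to the cathode);
    above the pI it is negatively charged (moves to the anode)."""
    if math.isclose(pH, pI):
        return "does not move"
    return "cathode" if pH < pI else "anode"


# pI of lysine: average of pK2 (8.95) and pKR (10.79).
PI_LYSINE = (8.95 + 10.79) / 2

for pH in (1, 3, 5, 7, 9, 12):
    print(f"pH {pH:2d}: {migration_direction(pH, PI_LYSINE)}")
```

The same function applies to any amino acid or peptide once its pI is known.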

WORKED PROBLEM 5.3 Consider the following dipeptide:

+H3N-Gly-Phe-COO−

a. What is its isoelectric point? Solution (a)

The isoelectric point is the average of the pKas of the amino group of glycine and the carboxyl group of phenylalanine (obtained from Table 5.2). pI = (9.60 + 1.83)/2 = 5.72 b. In which direction will the dipeptide move at pH 1, 3, 5, 7, 9, and 12? Solution (b)

At pH values below that of the pI the dipeptide will move to the cathode (i.e., 1, 3, and 5). At pH values above the pI the dipeptide will move to the anode. These are 7, 9, and 12.

Amino Acid Reactions The functional groups of organic molecules determine which reactions they may undergo. Amino acids with their carboxyl groups, amino groups, and various R groups can undergo numerous chemical reactions. Peptide bond and disulfide bridge formation, however, are of special interest because of their effect on protein structure. Schiff base formation is another important reaction.

PEPTIDE BOND FORMATION Polypeptides are linear polymers composed of amino acids linked together by peptide bonds. Peptide bonds (Figure 5.11) are amide linkages formed when the unshared electron pair of the α-amino nitrogen atom of one amino acid attacks the α-carboxyl carbon of another in a


nucleophilic acyl substitution reaction. A generalized acyl substitution reaction is shown:

R-C(=O)-X + Y− → R-C(=O)-Y + X−

FIGURE 5.11 Formation of a Dipeptide (a) A peptide bond forms when the α-carboxyl group of one amino acid reacts with the amino group of another. (b) A water molecule is formed in the reaction.

The linked amino acids in a polypeptide are referred to as amino acid residues because peptide bond formation is a dehydration reaction (i.e., a water molecule is removed). When two amino acid molecules are linked, the product is called a dipeptide. For example, glycine and serine can form the dipeptides glycylserine or serylglycine. As amino acids are added and the chain lengthens, the prefix reflects the number of residues: a tripeptide contains three amino acid residues, a tetrapeptide four, and so on. By convention, the amino acid residue with the free amino group is called the N-terminal residue and is written to the left. The free carboxyl group on the C-terminal residue appears on the right. Peptides are named by using their amino acid sequences, beginning from their N-terminal residue. For example, H3N+—Tyr—Ala—Cys—Gly—COO– is a tetrapeptide named tyrosylalanylcysteinylglycine.
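The naming convention can be sketched as a small Python routine: every residue except the C-terminal one takes its "-yl" form, and the C-terminal residue keeps its full name. The function and table names below are illustrative assumptions, and the "-yl" spellings are the standard ones rather than a list taken from this chapter.

```python
# "-yl" (acyl) forms, used for every residue except the C-terminal one.
ACYL_NAMES = {
    "Gly": "glycyl", "Ala": "alanyl", "Val": "valyl", "Leu": "leucyl",
    "Ile": "isoleucyl", "Pro": "prolyl", "Phe": "phenylalanyl",
    "Tyr": "tyrosyl", "Trp": "tryptophyl", "Ser": "seryl",
    "Thr": "threonyl", "Cys": "cysteinyl", "Met": "methionyl",
    "Asp": "aspartyl", "Asn": "asparaginyl", "Glu": "glutamyl",
    "Gln": "glutaminyl", "His": "histidyl", "Lys": "lysyl",
    "Arg": "arginyl",
}

# Full names, used for the C-terminal residue.
FULL_NAMES = {
    "Gly": "glycine", "Ala": "alanine", "Val": "valine", "Leu": "leucine",
    "Ile": "isoleucine", "Pro": "proline", "Phe": "phenylalanine",
    "Tyr": "tyrosine", "Trp": "tryptophan", "Ser": "serine",
    "Thr": "threonine", "Cys": "cysteine", "Met": "methionine",
    "Asp": "aspartic acid", "Asn": "asparagine", "Glu": "glutamic acid",
    "Gln": "glutamine", "His": "histidine", "Lys": "lysine",
    "Arg": "arginine",
}

SIZE_PREFIXES = {2: "di", 3: "tri", 4: "tetra"}


def peptide_name(residues):
    """Name a peptide from its residues, listed N-terminal to C-terminal."""
    if len(residues) < 2:
        raise ValueError("a peptide needs at least two residues")
    *leading, c_terminal = residues
    return "".join(ACYL_NAMES[r] for r in leading) + FULL_NAMES[c_terminal]


def peptide_class(residues):
    """Return 'dipeptide', 'tripeptide', or 'tetrapeptide'."""
    return SIZE_PREFIXES[len(residues)] + "peptide"


print(peptide_name(["Tyr", "Ala", "Cys", "Gly"]))
print(peptide_class(["Tyr", "Ala", "Cys", "Gly"]))
```

Because the name is built from left to right, reversing the input order gives a different peptide, which is why glycylserine and serylglycine are distinct dipeptides.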


Large polypeptides usually have well-defined, three-dimensional structures. This structure, referred to as the molecule’s native conformation, is a direct consequence of its amino acid sequence (the order in which the amino acids are linked together). Because all the linkages connecting the amino acid residues consist of single bonds, each polypeptide might be expected to undergo constant conformational changes caused by rotation around the single bonds. However, most polypeptides spontaneously fold into a single biologically active form. In the early 1950s, Linus Pauling (1901–1994, 1954 Nobel Prize in Chemistry) and his colleagues proposed an explanation. Using X-ray diffraction studies, they characterized the peptide bond (1.33 Å) as rigid and planar (flat) (Figure 5.12). Having discovered that the C—N bonds joining each two amino acids are shorter than other types of C—N bonds (1.45 Å), Pauling deduced that peptide bonds have a partial double-bond character. (This indicates that peptide bonds are resonance hybrids.) Because of the rigidity of the peptide bond, fully one-third of the bonds in a polypeptide backbone chain cannot rotate freely. Consequently, there are limits to the number of conformational possibilities. CYSTEINE OXIDATION The sulfhydryl group of cysteine is highly reactive. The most common reaction of this group is a reversible oxidation that forms a disulfide. Oxidation of two molecules of cysteine forms cystine, a molecule that contains a disulfide bond (Figure 5.13). When two cysteine residues form such a bond, it is referred to as a disulfide bridge. This bond can occur in a single chain to form a ring or between two separate chains to form an intermolecular bridge. Disulfide bridges help stabilize many polypeptides and proteins.

FIGURE 5.12 The Peptide Bond (a) Resonance forms of the peptide bond. (b) Dimensions of a dipeptide. Because peptide bonds are rigid, the conformational degrees of freedom of a polypeptide chain are limited to rotations around the Cα—C and Cα—N bonds. The corresponding rotations are represented by ψ and φ, respectively.
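The statement that one-third of the backbone bonds cannot rotate freely follows from simple counting: each residue contributes one N—Cα bond and one Cα—C bond, and each pair of neighboring residues shares one rigid peptide bond. The sketch below is a simplified count under that assumption (it ignores special cases such as proline), and the function name is illustrative.

```python
def backbone_bond_counts(n_residues: int) -> dict:
    """Count backbone bonds in a linear polypeptide of n_residues."""
    if n_residues < 1:
        raise ValueError("need at least one residue")
    n_calpha = n_residues           # N-Calpha bonds (rotation angle phi)
    calpha_c = n_residues           # Calpha-C bonds (rotation angle psi)
    peptide = n_residues - 1        # rigid C-N peptide bonds between residues
    total = n_calpha + calpha_c + peptide
    return {
        "rotatable": n_calpha + calpha_c,
        "peptide": peptide,
        "total": total,
        "rigid_fraction": peptide / total,
    }


for n in (2, 10, 100):
    print(n, backbone_bond_counts(n))
```

As the chain lengthens, the rigid fraction approaches one-third, which matches the figure quoted for a polypeptide backbone.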

FIGURE 5.13 Oxidation of Two Cysteine Molecules to Form Cystine The oxidation releases 2H+ + 2e–. The disulfide bond in a polypeptide is called a disulfide bridge.

FIGURE 5.14 Structure of Penicillamine HS–C(CH3)2–CH(NH2)–COOH

QUESTION 5.4

Cystinuria

In extracellular fluids such as blood (pH 7.2–7.4) and urine (pH 6.5), the sulfhydryl groups of cysteine (pKa 8.1) are subject to oxidation to form cystine. In peptides and proteins thiol groups are used to advantage in stabilizing protein structure and in thiol transfer reactions, but the free amino acid in tissue fluids can be problematic because of the low solubility of cystine. In a genetic disorder known as cystinuria, defective membrane transport of cystine results in excessive excretion of cystine into the urine. Crystallization of the amino acid results in formation of calculi (stones) in the kidney, ureter, or urinary bladder. The stones may cause pain, infection, and blood in the urine. Cystine concentration in the kidney is reduced by massively increasing fluid intake and administering D-penicillamine. It is believed that penicillamine (Figure 5.14) is effective because penicillamine–cysteine disulfide, which is substantially more soluble than cystine, is formed. What is the structure of the penicillamine–cysteine disulfide?
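The pH and pKa values quoted in this passage can be related through the Henderson–Hasselbalch equation. The Python sketch below estimates what fraction of cysteine sulfhydryl groups is deprotonated at each pH, treating the thiol as an isolated monoprotic acid; that simplification and the function name are assumptions made for illustration.

```python
THIOL_PKA = 8.1  # pKa of the cysteine sulfhydryl group, as quoted in the text


def fraction_deprotonated(ph: float, pka: float = THIOL_PKA) -> float:
    """Fraction of an acid group in its deprotonated form at a given pH.

    From Henderson-Hasselbalch: pH = pKa + log10([A-]/[HA]).
    """
    return 1.0 / (1.0 + 10 ** (pka - ph))


for label, ph in (("blood, pH 7.2", 7.2), ("blood, pH 7.4", 7.4), ("urine, pH 6.5", 6.5)):
    print(f"{label}: {fraction_deprotonated(ph):.1%} deprotonated")
```

Under these assumptions only a minority of the sulfhydryl groups is ionized in either fluid, roughly one in six at pH 7.4 and a few percent at pH 6.5.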

SCHIFF BASE FORMATION Molecules such as amino acids that possess primary amine groups can reversibly react with carbonyl groups. The imine products of this reaction are often referred to as Schiff bases. In a nucleophilic addition reaction, an amine nitrogen attacks the electrophilic carbon of a carbonyl group to form an alkoxide product. The transfer of a proton from the amine group to the oxygen to form a carbinolamine, followed by the transfer of another proton from an acid catalyst, converts the oxygen into a good leaving group (OH2+). The subsequent elimination of a water molecule followed by loss of a proton from the nitrogen yields the imine product. The most important examples of Schiff base formation in biochemistry occur in amino acid metabolism. Schiff bases, referred to as aldimines, formed by the reversible reaction of an amino group with an aldehyde group, are intermediates (species formed during a reaction) in transamination reactions (pp. xxx–xxx).

KEY CONCEPTS
• Polypeptides are polymers composed of amino acids linked by peptide bonds. The order of the amino acids in a polypeptide is called the amino acid sequence.
• Disulfide bridges, formed by the oxidation of cysteine residues, are an important structural element in polypeptides and proteins.
• Schiff bases are imines that form when amine groups react reversibly with carbonyl groups.


Reaction scheme: amine + carbonyl compound → alkoxide → (proton transfer) carbinolamine → (H+, loss of H2O) aldimine (Schiff base)

5.2 PEPTIDES Although less structurally complex than the larger protein molecules, peptides have significant biological activities. The structure and function of several interesting examples, presented in Table 5.3, are now discussed.

TABLE 5.3 Selected Biologically Important Peptides (columns: Name; Amino Acid Sequence)

The tripeptide glutathione (γ-glutamyl-L-cysteinylglycine) contains an unusual γ-amide bond. (Note that the γ-carboxyl group of the glutamic acid residue, not the α-carboxyl group, contributes to the peptide bond.) Found in almost all organisms, glutathione (GSH) is involved in protein and DNA synthesis, drug and environmental toxin metabolism, amino acid transport, and other important biological processes. One group of glutathione’s functions exploits its effectiveness as a reducing agent. Glutathione protects cells from the destructive effects of oxidation by reacting with substances such as peroxides (R–O–O–R), by-products of O2 metabolism. For example, in red blood cells, hydrogen peroxide (H2O2) oxidizes the iron of hemoglobin to its ferric form (Fe3+). Methemoglobin, the product of this reaction, is incapable of binding O2. Glutathione protects against the formation of methemoglobin by reducing H2O2 in a reaction catalyzed by the enzyme glutathione peroxidase. In the oxidized product GSSG, two tripeptides are linked by a disulfide bond:

2 GSH + H2O2 → GSSG + 2 H2O
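One quick way to check the glutathione peroxidase equation is to confirm that atoms balance on both sides. The sketch below does this in Python; the molecular formulas for GSH (C10H17N3O6S) and GSSG (C20H32N6O12S2) are standard values supplied here as assumptions, since the text does not state them, and the helper name is illustrative.

```python
from collections import Counter

# Molecular formulas (standard values, not given in the text).
GSH = Counter({"C": 10, "H": 17, "N": 3, "O": 6, "S": 1})
GSSG = Counter({"C": 20, "H": 32, "N": 6, "O": 12, "S": 2})
H2O2 = Counter({"H": 2, "O": 2})
H2O = Counter({"H": 2, "O": 1})


def total_atoms(side):
    """Sum atoms over (coefficient, formula) pairs on one side of an equation."""
    total = Counter()
    for coefficient, formula in side:
        for element, count in formula.items():
            total[element] += coefficient * count
    return total


reactants = [(2, GSH), (1, H2O2)]   # 2 GSH + H2O2
products = [(1, GSSG), (2, H2O)]    # GSSG + 2 H2O

print("reactants:", dict(total_atoms(reactants)))
print("products: ", dict(total_atoms(products)))
print("balanced: ", total_atoms(reactants) == total_atoms(products))
```

The two hydrogen atoms lost from the pair of —SH groups when the disulfide forms end up in the two water molecules.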

Vasopressin and Oxytocin

Because of the high GSH:GSSG ratio normally present in cells, glutathione is an important intracellular antioxidant. The abbreviation GSH is used because the reducing component of the molecule is the —SH group of the cysteine residue.

Peptides are one class of signal molecules that multicellular organisms use to regulate their complex activities. The dynamic interplay between opposing processes maintains a stable internal environment, a condition called homeostasis. Peptide molecules with opposing functions are now known to affect numerous processes (e.g., blood pressure regulation). The roles of selected peptides in each of these processes are briefly described.

Blood pressure, the force exerted by blood against the walls of blood vessels, is influenced by several factors such as blood volume and viscosity. Two peptides known to affect blood volume are vasopressin and atrial natriuretic factor. Vasopressin, also called antidiuretic hormone, contains nine amino acid residues. It is synthesized in the hypothalamus, a small structure in the brain that regulates a wide variety of functions including water balance, appetite, body temperature, and sleep. In response to low blood pressure or a high blood Na+ concentration, osmoreceptors in the hypothalamus trigger vasopressin secretion. Vasopressin stimulates water reabsorption in the kidneys by initiating a signal transduction mechanism that inserts aquaporins (water channels) into the kidney tubule membrane. Blood pressure rises as water then flows down its concentration gradient through the tubule cells and back into the blood.

The structure of vasopressin is remarkably similar to that of another peptide produced in the hypothalamus called oxytocin, the signal molecule that stimulates the ejection of milk by mammary glands during lactation. Oxytocin produced in the uterus stimulates the contraction of uterine muscle during childbirth. Because vasopressin and oxytocin have similar structures, it is not surprising that the functions of the two molecules overlap. Oxytocin has mild antidiuretic activity and vasopressin has some oxytocin-like activity.

Atrial natriuretic factor (ANF), a peptide produced by specialized cells in the heart in response to stretching and in the nervous system, stimulates the production of a dilute urine, an effect opposite to that of vasopressin. ANF exerts its effect, in part, by increasing the excretion of Na+, a process that causes increased excretion of water, and by inhibiting the secretion of renin by the kidney. (Renin is an enzyme that catalyzes the formation of angiotensin, a hormone that constricts blood vessels.)

QUESTION 5.5 Write out the complete structure of oxytocin. What would be the net charge on this molecule at the average physiological pH of 7.3? At pH 4? At pH 9? Indicate which atoms in oxytocin can potentially form hydrogen bonds with water molecules.

QUESTION 5.6 The structural features of vasopressin that allow binding to vasopressin receptors are the rigid hexapeptide ring and the amino acid residues at positions 3 (Phe) and 8 (Arg). The aromatic phenylalanine side chain, which fits into a hydrophobic pocket in the receptor, and the large positively charged arginine side chain are especially important structural features. Compare the structures of vasopressin and oxytocin and explain why their functions overlap. Can you suggest what will happen to the binding properties of vasopressin if the arginine at position 8 is replaced by lysine?

KEY CONCEPT Although small in comparison to larger protein molecules, peptides have significant biological activity. They are involved in a variety of signal transduction processes.


5.3 PROTEINS Of all the molecules encountered in living organisms, proteins have the most diverse functions, as the following list suggests.

1. Catalysis. Catalytic proteins called enzymes accelerate thousands of biochemical reactions in such processes as digestion, energy capture, and biosynthesis. These molecules have remarkable properties. For example, enzymes can increase reaction rates by factors of between 10^6 and 10^12. They can perform this feat under mild conditions of pH and temperature because they can induce or stabilize strained reaction intermediates. For example, ribulose bisphosphate carboxylase is an important enzyme in photosynthesis, and the protein complex nitrogenase is responsible for nitrogen fixation.
2. Structure. Structural proteins often have very specialized properties. For example, collagen (the major component of connective tissues) and fibroin (silkworm protein) have significant mechanical strength. Elastin, the rubberlike protein found in elastic fibers, is found in blood vessels and skin that must be elastic to function properly.
3. Movement. Proteins are involved in all cell movements. Actin, tubulin, and other proteins comprise the cytoskeleton. Cytoskeletal proteins are active in cell division, endocytosis, exocytosis, and the ameboid movement of white blood cells.
4. Defense. A wide variety of proteins are protective. In vertebrates, keratin, a protein found in skin cells, aids in protecting the organism against mechanical and chemical injury. The blood-clotting proteins fibrinogen and thrombin prevent blood loss when blood vessels are damaged. The immunoglobulins (or antibodies) are produced by lymphocytes when foreign organisms such as bacteria invade an organism. Binding antibodies to an invading organism is the first step in its destruction.
5. Regulation. Binding a hormone molecule or a growth factor to cognate receptors on its target cell changes cellular function. For example, insulin and glucagon are peptide hormones that regulate blood glucose levels. Growth hormone stimulates cell growth and division. Growth factors are polypeptides that control animal cell division and differentiation. Examples include platelet-derived growth factor (PDGF) and epidermal growth factor (EGF).
6. Transport. Many proteins function as carriers of molecules or ions across membranes or between cells. Examples of membrane transport proteins include the enzyme Na+-K+ ATPase and the glucose transporter. Other transport proteins include hemoglobin, which carries O2 to the tissues from the lungs, and the lipoproteins LDL and HDL, which transport water-insoluble lipids in the blood from the liver. Transferrin and ceruloplasmin are serum proteins that transport iron and copper, respectively.
7. Storage. Certain proteins serve as a reservoir of essential nutrients. For example, ovalbumin in bird eggs and casein in mammalian milk are rich sources of organic nitrogen during development. Plant proteins such as zein perform a similar role in germinating seeds.
8. Stress response. The capacity of living organisms to survive a variety of abiotic stresses is mediated by certain proteins. Examples include cytochrome P450, a diverse group of enzymes found in animals and plants that usually convert a variety of toxic organic contaminants into less toxic derivatives, and metallothionein, a cysteine-rich intracellular protein found in virtually all mammalian cells that binds to and sequesters toxic metals such as cadmium, mercury, and silver. Excessively high temperatures and other stresses result in the synthesis of a class of proteins called the heat shock proteins (hsps) that promote the correct refolding of damaged proteins. If such proteins are severely damaged, hsps promote their degradation. (Certain hsps function in the normal process of protein folding.) Cells are protected from radiation by DNA repair enzymes.

Visit the companion website at www.oup.com/us/mckee to read the Biochemistry in Perspective essay on protein poisons.

Protein research efforts in recent years have revealed that numerous proteins have multiple and often unrelated functions. Once thought to be a rare phenomenon, multifunction proteins (sometimes referred to as moonlighting proteins) are a diverse class of molecules. Glyceraldehyde-3-phosphate dehydrogenase (GAPD) is a prominent example. As the name suggests, GAPD (p. 273) is an enzyme that catalyzes the oxidation of glyceraldehyde-3-phosphate, an intermediate in glucose catabolism. The GAPD protein is now known to have roles in such diverse processes as DNA replication and repair, endocytosis, and membrane fusion events.

In addition to their functional classifications, proteins are categorized on the basis of amino acid sequence similarities and overall three-dimensional shape. Protein families are composed of protein molecules that are related by amino acid sequence similarity. Such proteins share an obvious common ancestry. Classic protein families include the hemoglobins (blood oxygen transport proteins, pp. 168–171) and the immunoglobulins, the antibody proteins produced by the immune system in response to antigens (foreign substances). Proteins more distantly related are often classified into superfamilies. For example, the globin superfamily includes a variety of heme-containing proteins that serve in the binding and/or transport of oxygen. In addition to the hemoglobins and myoglobins (oxygen-binding proteins in muscle cells), the globin superfamily includes neuroglobin and cytoglobin (oxygen-binding proteins in brain and other tissues, respectively) and the leghemoglobins (oxygen-sequestering proteins in the root nodules of leguminous plants).
Because of their diversity, proteins are often classified in two additional ways: shape and composition. Proteins are classified into two major groups based on their shape. As the name suggests, fibrous proteins are long, rod-shaped molecules that are insoluble in water and physically tough. Fibrous proteins, such as the keratins found in skin, hair, and nails, have structural and protective functions. Globular proteins are compact spherical molecules that are usually water-soluble. Typically, globular proteins have dynamic functions. For example, nearly all enzymes have globular structures. Other examples include the immunoglobulins and the transport proteins hemoglobin and albumin (a carrier of fatty acids in blood). On the basis of composition, proteins are classified as simple or conjugated. Simple proteins, such as serum albumin and keratin, contain only amino acids. In contrast, each conjugated protein consists of a simple protein combined with a nonprotein component. The nonprotein component is called a prosthetic group. (A protein without its prosthetic group is called an apoprotein. A protein molecule combined with its prosthetic group is referred to as a holoprotein.) Prosthetic groups typically play an important, even crucial, role in the function of proteins. Conjugated proteins are classified according to the nature of their prosthetic groups. For example, glycoproteins contain a carbohydrate component, lipoproteins contain lipid molecules, and metalloproteins contain metal ions. Similarly, phosphoproteins contain phosphate groups, and hemoproteins possess heme groups (p. xx).

Protein Structure Proteins are extraordinarily complex molecules. Complete models depicting even the smallest of the polypeptide chains are almost impossible to comprehend. Simpler images that highlight specific features of a molecule are useful. Two methods of conveying structural information about proteins are presented in Figure 5.15. Another structural representation, referred to as a ball-and-stick model, is presented later (Figures 5.37 and 5.39). Biochemists have distinguished several levels of the structural organization of proteins. Primary structure, the amino acid sequence, is specified by genetic information. As the polypeptide chain folds, it forms certain localized arrangements of adjacent (but not necessarily contiguous) amino acids that constitute secondary structure. The overall three-dimensional shape that a polypeptide assumes is called the tertiary structure. Proteins that consist of two or more polypeptide chains (or subunits) are said to have a quaternary structure.

FIGURE 5.15 The Enzyme Adenylate Kinase (a) This space-filling model illustrates the volume occupied by molecular components and overall shape. (b) In a ribbon model β-pleated segments are represented by flat arrows. The α-helices appear as spiral ribbons.


PRIMARY STRUCTURE Every polypeptide has a specific amino acid sequence. The interactions between amino acid residues determine the protein’s three-dimensional structure and its functional role and relationship to other proteins. Polypeptides that have similar amino acid sequences and have arisen from the same ancestral gene are said to be homologous. Sequence comparisons among homologous polypeptides have been used to trace the genetic relationships of different species. For example, the sequence homologies of the mitochondrial redox protein cytochrome c have been used extensively in the study of evolution of species. Sequence comparisons of cytochrome c, an essential molecule in energy production, among numerous species reveal a significant amount of sequence conservation. The amino acid residues that are identical in all homologues of a protein, referred to as invariant, are presumed to be essential for the protein’s function. (In cytochrome c the invariant residues interact with heme, a prosthetic group, or certain other proteins involved in energy generation.)

KEY CONCEPTS
• The primary structure of a polypeptide is its amino acid sequence. The amino acids are connected by peptide bonds.
• Amino acid residues that are essential for the molecule’s function are referred to as invariant.
• Proteins with similar amino acid sequences and functions and a common origin are said to be homologous.

PRIMARY STRUCTURE, EVOLUTION, AND MOLECULAR DISEASES Over time, as the result of evolutionary processes, the amino acid sequences of polypeptides change. These modifications are caused by random and spontaneous alterations in DNA sequences called mutations. A significant number of primary sequence changes do not affect a polypeptide’s function. Some of these substitutions are said to be conservative because an amino acid with a chemically similar side chain is substituted. For example, at certain sequence positions leucine and isoleucine, which both contain hydrophobic side chains, may be substituted for each other without affecting function. Some sequence positions are significantly less stringent. These residues, referred to as variable, apparently perform nonspecific roles in the polypeptide’s function. Substitutions at conservative and variable sites have been used to trace evolutionary relationships. These studies assume that the longer the time since two species diverged from each other, the larger the number of differences in a certain polypeptide’s primary structure. For example, humans and chimpanzees are believed to have diverged relatively recently (perhaps only 4 million years ago). This presumption, based principally on fossil and anatomical evidence, is supported by cytochrome c primary sequence data indicating that the protein is identical in both species. Kangaroos, whales, and sheep, whose cytochrome c molecules each differ by 10 residues from the human protein, are believed to have evolved from a common ancestor that lived over 50 million years ago. It is interesting to note that quite often the overall three-dimensional structure does not change despite numerous amino acid sequence changes. The shape of proteins coded for by genes that diverged millions of years ago may show a remarkable resemblance to each other.

Mutations, however, can also be deleterious. The effects of such random changes in gene sequence can range from moderate to severe.
Individual organisms with nonconservative amino acid substitutions at the invariant residues of cytochrome c, for example, are not viable. Mutations can also have a profound effect without being immediately lethal. Sickle-cell anemia, which is caused by mutant hemoglobin, is a classic example of a group of maladies that Linus Pauling and his colleagues referred to as molecular diseases. (Dr. Pauling first demonstrated that sickle-cell patients have a mutant hemoglobin through the use of electrophoresis.) Human adult hemoglobin (HbA) is composed of two identical α-chains and two identical β-chains. Sickle-cell anemia results from a single amino acid substitution in the β-chain of HbA. Analysis of the hemoglobin molecules of sickle-cell patients reveals that the only difference between HbA and sickle-cell hemoglobin (HbS) is at amino acid residue 6 in the β-chain (Figure 5.16). Because of the substitution of a hydrophobic valine for a negatively charged glutamic acid, HbS molecules aggregate to form rigid rodlike structures in the oxygen-free state (Figure 5.17). The patient’s red blood cells become sickle shaped and are susceptible to hemolysis, resulting in severe anemia. These red blood cells have an abnormally low oxygen-binding capacity. Intermittent clogging of capillaries by sickled cells also causes tissues to be deprived of oxygen. Sickle-cell anemia is characterized by excruciating pain, eventual organ damage, and early death.

Until recently, because of the debilitating nature of sickle-cell disease, affected individuals rarely survived beyond childhood. Thus one might predict that the deleterious mutational change that causes this affliction would be rapidly eliminated from human populations. However, the sickle-cell gene is not as rare as would be expected. Sickle-cell disease occurs only in individuals who have inherited two copies of the sickle-cell gene. Such individuals, who are said to be homozygous, inherit one copy of the defective gene from each parent. Each of the parents is said to have the sickle-cell trait. Such people, referred to as heterozygous because they have one normal HbA gene and one defective HbS gene, are relatively symptom-free, even though about 40% of their hemoglobin is HbS. The incidence of sickle-cell trait is especially high in some regions of Africa. In these areas malaria, caused by the Anopheles mosquito–borne parasite Plasmodium, is a serious health problem. Individuals with the sickle-cell trait are less vulnerable to malaria because their red blood cells are a less favorable environment for the growth of the parasite than are normal cells. Because sickle-cell trait carriers are more likely to survive malaria than normal individuals, the incidence of the sickle-cell gene has remained high. (In some areas, the sickle-cell trait is present in as much as 40% of the native population.)

FIGURE 5.16 Segments of β-Chain in HbA and HbS Individuals possessing the gene for sickle-cell hemoglobin produce β-chains with valine instead of glutamic acid at residue 6.
HbA: Val—His—Leu—Thr—Pro—Glu—Glu—Lys
HbS: Val—His—Leu—Thr—Pro—Val—Glu—Lys

FIGURE 5.17 Sickle Cell Hemoglobin HbS molecules aggregate into rodlike filaments because the hydrophobic side chain of valine, the substituted amino acid in the β-chain, interacts with a hydrophobic pocket in a second hemoglobin molecule. (Residues labeled in the figure: Val 6, Phe 85, Leu 88.)
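The comparison of HbA and HbS, and the distinction between conservative and nonconservative substitutions, can be sketched in a few lines of Python. The sequences below are the first eight β-chain residues shown in Figure 5.16, with HbA written as Glu at both positions 6 and 7 (an assumption based on the known sequence). The side-chain class table is a small illustrative subset, and the function names are hypothetical.

```python
# First eight residues of the beta-chain (Figure 5.16).
HBA = ["Val", "His", "Leu", "Thr", "Pro", "Glu", "Glu", "Lys"]
HBS = ["Val", "His", "Leu", "Thr", "Pro", "Val", "Glu", "Lys"]

# Illustrative subset of side-chain classes; not a complete table.
SIDE_CHAIN_CLASS = {
    "Val": "nonpolar", "Leu": "nonpolar", "Ile": "nonpolar", "Pro": "nonpolar",
    "Thr": "polar", "Glu": "acidic", "His": "basic", "Lys": "basic",
}


def substitutions(reference, variant):
    """List (position, reference residue, variant residue), numbered from 1."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        (position, ref, var)
        for position, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


def is_conservative(ref, var):
    """A substitution is treated as conservative if the class is unchanged."""
    return SIDE_CHAIN_CLASS[ref] == SIDE_CHAIN_CLASS[var]


for position, ref, var in substitutions(HBA, HBS):
    kind = "conservative" if is_conservative(ref, var) else "nonconservative"
    print(f"residue {position}: {ref} -> {var} ({kind})")
```

The same `substitutions` count, applied to full-length homologues such as cytochrome c from two species, gives the number of differences used to estimate how long ago the species diverged.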


QUESTION 5.7 A genetic disease called glucose-6-phosphate dehydrogenase deficiency is inherited in a manner similar to that of sickle-cell anemia. The defective enzyme cannot keep erythrocytes supplied with sufficient amounts of the antioxidant molecule NADPH (Chapter 8). NADPH protects cell membranes and other cellular structures from oxidation. Describe in general terms the inheritance pattern of this molecular disease. Why do you think that the antimalarial drug primaquine, which stimulates peroxide formation, results in devastating cases of hemolytic anemia in carriers of the defective gene? Does it surprise you that this genetic anomaly is commonly found in African and Mediterranean populations?

SECONDARY STRUCTURE The secondary structure of polypeptides consists of several repeating patterns. The most commonly observed types of secondary structure are the α-helix and the β-pleated sheet. Both α-helix and β-pleated sheet patterns are stabilized by localized hydrogen bonding between the carbonyl and N—H groups in the polypeptide’s backbone. Because peptide bonds are rigid, the α-carbons are swivel points for the polypeptide chain. Several properties of the R groups (e.g., size and charge, if any) attached to the α-carbon influence the φ and ψ angles. Certain amino acids foster or inhibit specific secondary structural patterns. Many fibrous proteins are composed almost entirely of secondary structural patterns.

The α-helix is a rigid, rodlike structure that forms when a polypeptide chain twists into a right-handed helical conformation (Figure 5.18). Hydrogen bonds form between the N—H group of each amino acid and the carbonyl group of the amino acid four residues away. There are 3.6 amino acid residues per turn of the helix, and the pitch (the distance between corresponding points per turn) is 0.54 nm. Amino acid R groups extend outward from the helix. Because of several structural constraints (i.e., the rigidity of peptide bonds and the allowed limits on the values of the φ and ψ angles), certain amino acids do not foster α-helical formation. For example, glycine’s R group (a hydrogen atom) is so small that the polypeptide chain may be too flexible. Proline, on the other hand, contains a rigid ring that prevents the N—Cα bond from rotating. In addition, proline has no N—H group available to form the intrachain hydrogen bonds that are crucial in α-helix structure. Amino acid sequences with large numbers of charged amino acids (e.g., glutamate and aspartate) and bulky R groups (e.g., tryptophan) are also incompatible with α-helix structures.

FIGURE 5.18 The α-Helix Hydrogen bonds form between carbonyl and N—H groups along the long axis of the α-helix. Note that there are 3.6 residues per turn of the helix, which has a pitch of 0.54 nm.

β-Pleated sheets form when two or more polypeptide chain segments line up side by side (Figure 5.19). Each individual segment is referred to as a β-strand.
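For comparison with the extended β-strand, the α-helix figures quoted above (3.6 residues per turn, a pitch of 0.54 nm, and hydrogen bonds to the residue four positions away) are enough to work out a few derived quantities. The Python sketch below treats the helix as ideal and uniform, which is an assumption, and the function names are illustrative.

```python
RESIDUES_PER_TURN = 3.6   # from the text
PITCH_NM = 0.54           # from the text: axial distance covered by one turn


def rise_per_residue_nm() -> float:
    """Axial distance advanced per residue along the helix axis."""
    return PITCH_NM / RESIDUES_PER_TURN


def rotation_per_residue_deg() -> float:
    """Angle turned about the helix axis per residue."""
    return 360.0 / RESIDUES_PER_TURN


def helix_length_nm(n_residues: int) -> float:
    """Approximate axial length of an ideal helix of n_residues."""
    return n_residues * rise_per_residue_nm()


def backbone_hbond_pairs(n_residues: int):
    """Pairs (i, i + 4): carbonyl of residue i bonded to N-H of residue i + 4."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


print(f"rise per residue: {rise_per_residue_nm():.2f} nm")
print(f"rotation per residue: {rotation_per_residue_deg():.0f} degrees")
print(f"length of a 20-residue helix: {helix_length_nm(20):.1f} nm")
print("hydrogen-bond pairs in a 10-residue helix:", backbone_hbond_pairs(10))
```

The pair list also shows why the first and last few residues of a helix cannot form a full set of intrachain hydrogen bonds.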
Rather than being coiled, each ␤-strand is fully extended. ␤-Pleated sheets are stabilized by hydrogen bonds that form between the polypeptide backbone N—H and carbonyl groups of adjacent chains. ␤-Pleated sheets are either parallel or antiparallel. In parallel ␤-pleated sheet structures, the hydrogen bonds in the polypeptide chains are arranged in the same direction; in antiparallel chains these bonds are arranged in opposite directions. Occasionally, mixed parallelantiparallel ␤-sheets are observed. Many globular proteins contain combinations of ␣-helix and ␤-pleated sheet secondary structures (Figure 5.20). These patterns are called supersecondary structures or motifs. In the ␤␣␤ unit, two parallel ␤-pleated sheets are connected by an ␣-helix segment. The structure of ␤␣␤ units is stabilized by hydrophobic interactions between nonpolar side chains projecting from the interacting surfaces of the ␤-strands and the ␣-helix. Abrupt changes in direction of a polypeptide involve structural elements called loops. The ␤-turn, a commonly observed type of loop, is a 180o turn involving four residues. The carbonyl oxygen of the first residue in the loop forms a hydrogen bond with the amide hydrogen of the fourth residue. Glycine and proline residues often occur in ␤-turns. Glycine’s lack of

05-McKee-Chap05.qxd:05-McKee-Chap05

1/13/11

6:57 PM

Page 25

5.3 Proteins 25

HC-R

R-CH C

C H

HC-R

HC-R H

C H

Antiparallel

13.0 Å

HC-R

R-CH

C C

R-CH

HC-R (a)

HC-R

C H

R-CH

HC-R 14.0 Å

R-CH

Parallel

FIGURE 5.19 ␤-Pleated Sheet (a) Two forms of ␤-pleated sheet: antiparallel and parallel. Hydrogen bonds are represented by dotted lines. (b) A more detailed view of antiparallel ␤pleated sheet.

(b)

(a)

(b)

(c)

(d)

FIGURE 5.20 Selected Supersecondary Structures (a) ␤␣␤ units, (b) ␤-meander, (c) ␣␣ unit, (d) ␤-barrel, and (e) Greek key. Note that ␤-strands are depicted as arrows. Arrow tips point toward the C-terminus.

(e)

05-McKee-Chap05.qxd:05-McKee-Chap05

1/13/11

6:57 PM

Page 26

CHAPTER FIVE Amino Acids, Peptides, and Proteins

an organic side group permits a contiguous proline to assume a cis orientation (same side of the peptide plane), and a tight turn can form in a polypeptide strand. Proline is a helix-breaking residue that alters the direction of the polypeptide chain. The β-turn is common in proteins rich in α-helical segments. In the β-meander pattern, two antiparallel β-sheets are connected by polar amino acids and glycines to effect a more abrupt change in direction called a reverse or hairpin turn. In αα units (or helix-loop-helix units), two α-helical regions separated by a nonhelical loop become aligned in a defined way because of interacting side chains. Several β-barrel arrangements are formed when various β-sheet configurations fold back on themselves. When an antiparallel β-sheet doubles back on itself in a pattern that resembles a common Greek pottery design, the motif is called the Greek key.

TERTIARY STRUCTURE Although globular proteins often contain significant numbers of secondary structural elements, several other factors contribute to their structure. The term tertiary structure refers to the unique three-dimensional conformations that globular proteins assume as they fold into their native (biologically active) structures and prosthetic groups, if any, are inserted. Protein folding, a process in which an unorganized, nascent (newly synthesized) molecule acquires a highly organized structure, occurs as a consequence of the interactions between the side chains in their primary structure. Tertiary structure has several important features:

1. Many polypeptides fold in such a fashion that amino acid residues that are distant from each other in the primary structure come into close proximity.
2. Globular proteins are compact because of efficient packing as the polypeptide folds. During this process, most water molecules are excluded from the protein’s interior, making interactions between both polar and nonpolar groups possible.
3. Large globular proteins (i.e., those with more than 200 amino acid residues) often contain several compact units called domains. Domains (Figure 5.21) are typically structurally independent segments that have specific functions (e.g., binding an ion or small molecule). The core three-dimensional structure of a domain is called a fold. Well-known examples of folds include the nucleotide-binding Rossmann fold and the globin fold. Domains are classified on the basis of their core motif structure. Examples include α, β, α/β, and α + β. α-Domains are composed exclusively of α-helices, and β-domains consist of antiparallel β-strands. α/β-Domains contain various combinations of an α-helix alternating with β-strands (βαβ motifs). α + β Domains are primarily β-sheets with one or more outlying α-helices. Most proteins contain two or more domains.
4. A number of eukaryotic proteins, referred to as modular or mosaic proteins, contain numerous duplicate or imperfect copies of one or more domains that are linked in series. Fibronectin (Figure 5.22) contains three repeating domains: F1, F2, and F3. All three domains, which are found in a variety of extracellular matrix (ECM) proteins, contain binding sites for other ECM molecules such as collagen (p. xxx) and heparan sulfate (p. xxx), as well as certain cell surface receptors. Domain modules are coded for by genetic sequences created by gene duplications (extra gene copies that arise from errors in DNA replication). Such sequences are used by living organisms to construct new proteins. For example, the immunoglobulin structural domain is found not only in antibodies, but also in a variety of cell surface proteins.

Several types of interactions stabilize tertiary structure (Figure 5.23):

1. Hydrophobic interactions. As a polypeptide folds, hydrophobic R groups are brought into close proximity because they are excluded from water. Then


FIGURE 5.21 Selected Domains Found in Large Numbers of Proteins (a) The EF hand, a helix-loop-helix that binds specifically to Ca²⁺, and (b) the leucine zipper, a DNA-binding domain, are two examples of α-domains. (c) Human retinol-binding protein, a type of β-barrel domain (retinol, a visual pigment molecule, is shown in yellow). (d) The ATP-binding domain of hexokinase, a type of α/β-domain. (e) The α/β zinc-binding motif, a core feature of numerous DNA-binding domains.


FIGURE 5.22 Fibronectin Structure Fibronectin is a mosaic protein that is composed of multiple copies of F1, F2, and F3 modules.

FIGURE 5.23 Interactions That Maintain Tertiary Structure

the highly ordered water molecules in solvation shells are released from the interior, increasing the disorder (entropy) of the water molecules. The favorable entropy change is a major driving force in protein folding. It should be noted that a few water molecules remain within the core of folded proteins, where each forms as many as four hydrogen bonds with the polypeptide backbone. The stabilization contributed by small “structural” water molecules may free the polypeptide from some of its internal interactions. The resulting increased flexibility of the polypeptide chain is believed to play a critical role in the binding of molecules called ligands to specific sites. Ligand binding is an important protein function. 2. Electrostatic interactions. The strongest electrostatic interaction in proteins occurs between ionic groups of opposite charge. Referred to as salt bridges, these noncovalent bonds are significant only in regions of the protein where water is excluded because of the energy required to remove water molecules from ionic groups near the surface. Salt bridges have been observed


to contribute to the interactions between adjacent subunits in complex proteins. The same is true for the weaker electrostatic interactions (ion-dipole, dipole-dipole, van der Waals). They are significant in the interior of the folded protein and between subunits or in protein-ligand interactions. (In proteins that consist of more than one polypeptide chain, each polypeptide is called a subunit.) Ligand-binding pockets are water-depleted regions of the protein. 3. Hydrogen bonds. A significant number of hydrogen bonds form within a protein’s interior and on its surface. In addition to forming hydrogen bonds with one another, the polar amino acid side chains may interact with water or with the polypeptide backbone. Again, the presence of water precludes the formation of hydrogen bonds with other species. 4. Covalent bonds. Covalent linkages are created by chemical reactions that alter a polypeptide’s structure during or after its synthesis. (Examples of these reactions, referred to as posttranslational modifications, are described in Section 19.2.) The most prominent covalent bonds in tertiary structure are the disulfide bridges found in many extracellular proteins. In extracellular environments, these strong linkages partly protect protein structure from adverse changes in pH or salt concentrations. Intracellular proteins do not contain disulfide bridges because of high cytoplasmic concentrations of reducing agents. 5. Hydration. As described previously (p. xx) structured water is an important stabilizing feature of protein structure. The dynamic hydration shell that forms around a protein (Figure 5.24) also contributes to the flexibility required for biological activity.
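As a rough illustration of the interaction types listed above, the following Python sketch guesses the dominant noncovalent interaction between two side chains. The residue groupings and the function name `likely_interaction` are assumptions made for this example, and the rules are deliberately simplified: real interactions also depend on geometry, pH, and how much water is excluded from the site.

```python
# Simplified side chain classes near physiological pH. This grouping is an
# illustrative assumption (histidine, for example, is only partly protonated).
POSITIVE = {"Lys", "Arg", "His"}
NEGATIVE = {"Asp", "Glu"}
POLAR = {"Ser", "Thr", "Asn", "Gln", "Tyr", "Cys"}
NONPOLAR = {"Ala", "Val", "Leu", "Ile", "Met", "Phe", "Trp", "Pro"}
KNOWN = POSITIVE | NEGATIVE | POLAR | NONPOLAR


def likely_interaction(res_a: str, res_b: str) -> str:
    """Return a first guess at the interaction between two side chains."""
    for res in (res_a, res_b):
        if res not in KNOWN:
            raise ValueError(f"unclassified residue: {res}")

    # Opposite charges: the strongest electrostatic interaction (salt bridge).
    if (res_a in POSITIVE and res_b in NEGATIVE) or (
        res_a in NEGATIVE and res_b in POSITIVE
    ):
        return "salt bridge"
    # Like charges repel rather than stabilize.
    if (res_a in POSITIVE and res_b in POSITIVE) or (
        res_a in NEGATIVE and res_b in NEGATIVE
    ):
        return "electrostatic repulsion"
    # Two nonpolar groups cluster together away from water.
    if res_a in NONPOLAR and res_b in NONPOLAR:
        return "hydrophobic interaction"
    # One nonpolar partner: only weak contact is expected.
    if res_a in NONPOLAR or res_b in NONPOLAR:
        return "van der Waals contact"
    # Remaining cases pair polar and/or charged groups.
    return "hydrogen bond"


if __name__ == "__main__":
    for pair in [("Arg", "Asp"), ("Thr", "Ser"), ("Phe", "Trp")]:
        print(pair, "->", likely_interaction(*pair))
```

A lookup of this kind is only a starting point for reasoning about a folded structure; it ignores disulfide bridges and structured water altogether.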

FIGURE 5.24 Hydration of a Protein Three layers of structured water molecules surround a space-filling model of the enzyme hexokinase, before and after binding the sugar glucose. Hexokinase (p. xxx) is an enzyme that catalyzes the nucleophilic attack of the carbon–6 hydroxyl group of glucose on the phosphorus in the terminal phosphate of ATP. As the hydrated glucose molecule enters its binding site in a cleft in the enzyme, it sheds its water molecules and displaces water molecules occupying the binding site. The water exclusion process promotes the conformation change that moves the domains together to create the catalytic site. Water exclusion from this site also prevents the unproductive hydrolysis of ATP.


The precise nature of the forces that promote the folding of proteins (described on pp. 159–162) has not been completely resolved. It is clear, however, that protein folding is a thermodynamically favorable process with an overall negative free energy change. According to the free energy equation

$\Delta G = \Delta H - T\Delta S$

a negative free energy change in a process is the result of a balance between favorable and unfavorable enthalpy and entropy changes (pp. xx–xx). As a polypeptide folds, favorable (negative) ΔH values are the result in part of the sequestration of hydrophobic side chains within the interior of the molecule and the optimization of other noncovalent interactions. Opposing these factors is the unfavorable decrease in entropy that occurs as the disorganized polypeptide folds into its highly organized native state. The change in entropy of the water that surrounds the protein is positive because of the decreased organization of the water in going from the unfolded to the folded state of the protein. For most polypeptide molecules the net free energy change between the folded and unfolded state is relatively modest (the energy equivalent of several hydrogen bonds). The precarious balance between favorable and unfavorable forces allows proteins the flexibility they require for biological function.
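To make this balance concrete, the Python sketch below evaluates the free energy equation for a folding reaction, splitting the entropy term into a chain contribution (unfavorable) and a water contribution (favorable). All numerical values are hypothetical, chosen only to show how large opposing terms can leave a modest net ΔG; they are not measurements for any real protein.

```python
def gibbs_free_energy(delta_h: float, temperature: float, delta_s: float) -> float:
    """Return dG = dH - T*dS (dH in kJ/mol, T in K, dS in kJ/(mol*K))."""
    return delta_h - temperature * delta_s


# Hypothetical illustrative values (assumptions, not experimental data).
delta_h = -200.0        # kJ/mol: favorable noncovalent interactions on folding
delta_s_chain = -1.0    # kJ/(mol*K): polypeptide becomes more ordered
delta_s_water = 0.5     # kJ/(mol*K): released water becomes less ordered
temperature = 310.0     # K, roughly body temperature

delta_s_total = delta_s_chain + delta_s_water
delta_g = gibbs_free_energy(delta_h, temperature, delta_s_total)

print(f"-T*dS = {-temperature * delta_s_total:.1f} kJ/mol (unfavorable)")
print(f"dG    = {delta_g:.1f} kJ/mol")
```

With these assumed inputs the enthalpy term of -200 kJ/mol is offset by an entropy penalty of +155 kJ/mol, leaving a net ΔG of -45 kJ/mol, which is small compared with either term on its own.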

FIGURE 5.25 Structure of Immunoglobulin G IgG is an antibody molecule composed of two heavy chains (H) and two light chains (L) that together form a Y-shaped molecule. Each of the heavy and light chains contains constant (C) and variable (V) β-barrel domains (the classic immunoglobulin fold). The chains are held together by disulfide bridges (yellow lines) and noncovalent interactions. The variable domains of the H and L chains form the site that binds to antigens (foreign molecules). Many antigenic proteins bind to the external surface of these sites. Note that disulfide bridges are also a structural feature within each constant domain.


QUATERNARY STRUCTURE Many proteins, especially those with high

molecular weights, are composed of several polypeptide chains. As mentioned, each polypeptide component is called a subunit. Subunits in a protein complex may be identical or quite different. Multisubunit proteins in which some or all subunits are identical are referred to as oligomers. Oligomers are composed of protomers, which may consist of one or more subunits. A large number of oligomeric proteins contain two or four subunit protomers, referred to as dimers and tetramers, respectively. There appear to be several reasons for the common occurrence of multisubunit proteins: 1. Synthesis of separate subunits may be more efficient than substantially increasing the length of a single polypeptide chain. 2. In supramolecular complexes such as collagen fibers, replacement of smaller worn-out or damaged components can be managed more effectively. 3. The complex interactions of multiple subunits help regulate a protein’s biological function. Polypeptide subunits assemble and are held together by noncovalent interactions such as the hydrophobic and electrostatic interactions, and hydrogen bonds, as well as covalent cross-links. As with protein folding, the hydrophobic effect is clearly the most important because the structures of the complementary interfacing surfaces between subunits are similar to those observed in the interior of globular protein domains. Although they are less numerous, covalent cross-links significantly stabilize certain multisubunit proteins. Prominent examples include the disulfide bridges in the immunoglobulins (Figure 5.25), and the desmosine and lysinonorleucine linkages in certain connective tissue proteins. Desmosine (Figure 5.26) cross-links connect four polypeptide chains in the rubberlike connective tissue protein elastin. They are formed as a result of a series of reactions involving the oxidation and condensation of lysine side chains. 
A similar process results in the formation of lysinonorleucine, a cross-linking structure that is found in elastin and collagen. Quite often the interactions between subunits are affected by the binding of ligands. In allostery, which is the control of protein function through ligand binding, binding a ligand to a specific site in a protein triggers a conformational change that alters its affinity for other ligands. Ligand-induced conformational changes in such proteins are called allosteric transitions, and the ligands that trigger them are called effectors or modulators. Allosteric effects can be positive or negative, depending on whether effector binding increases or decreases the protein’s affinity for other ligands. One of the best understood examples of allosteric effects, the reversible binding of O2 and other ligands to hemoglobin, is described on pp. 169–171. (Because allosteric enzymes play a key role in the control of metabolic processes, allostery is discussed further in Sections 6.3 and 6.5.) UNSTRUCTURED PROTEINS In the traditional view of proteins, a polypeptide’s function is determined by its specific and relatively stable three-dimensional structure. In recent years, however, as a result of new genomic methodologies and new applications of various forms of spectroscopy, it has become apparent that many proteins are in fact partially or completely unstructured. Unstructured proteins are referred to as IUPs (intrinsically unstructured proteins). If there is a complete lack of ordered structure, the term natively unfolded proteins is used. Most IUPs are eukaryotic. Amazingly, over 30% of eukaryotic proteins are partially or completely disordered, whereas only about 2 and 4% of archaean and bacterial proteins, respectively, can be described as unstructured. The folding of IUPs into stable three-dimensional conformations is prevented by biased

FIGURE 5.26 Desmosine and Lysinonorleucine Linkages


QUESTION 5.8 Review the following illustrations of globular proteins. Identify examples of secondary and supersecondary structure.

[Illustrations of globular proteins with numbered structural elements and labeled N- and C-termini.]

QUESTION 5.9 Illustrate the noncovalent interactions that can occur between the following side chain groups in folded polypeptides: (a) serine and glutamate, (b) arginine and aspartate, (c) threonine and serine, (d) glutamine and aspartate, (e) phenylalanine and tryptophan.


FIGURE 5.27 Disordered Protein Binding The intrinsically disordered phosphorylated KID domain (pKID) (left) of the transcription regulatory protein CREB searches out and binds to the KIX domain of the transcription coactivator protein CBP (right). As pKID binds to KIX, it folds into a pair of helices.

amino acid sequences that contain high percentages of polar and charged amino acids (e.g., Ser, Gln, Lys, and Glu) and low quantities of hydrophobic amino acids (e.g., Leu, Val, Phe, and Trp). IUPs have a diversity of functions. Many are involved in the regulation of such processes as signal transduction, transcription, translation, and cell proliferation. Highly extended and malleable disordered segments enable the molecule to “search” for binding partners. For example, CREB, a transcription regulatory protein discussed later (Chapter 18), binds to CRE, one type of DNA sequence called a response element. When the KID (kinase-inducible domain) of CREB is phosphorylated by a kinase (an enzyme that attaches phosphate groups to specific amino acid side chains), it becomes unstructured. The unstructured phosphorylated domain (pKID) is then able to search out and bind to a domain of CREB-binding protein (CBP) called KIX (KID-binding domain) (Figure 5.27). As often happens with IUPs, the disordered pKID domain transitions into a more ordered conformation as it binds to the KIX domain of CBP. As a result of CREB-CBP binding, CREB forms a dimer that alters the expression of certain genes when it binds to its response element.

LOSS OF PROTEIN STRUCTURE Considering the small differences in the free energy of folded and unfolded proteins, it is not surprising that protein structure is especially sensitive to environmental factors. Many physical and chemical agents can disrupt a protein’s native conformation. The process of structure disruption, which may or may not involve protein unfolding, is called denaturation. (Denaturation is not usually considered to include the breaking of peptide bonds.) Depending on the degree of denaturation, the molecule may partially or completely lose its biological activity. Denaturation often results in easily observable changes in the physical properties of proteins.
For example, soluble and transparent egg albumin (egg white) becomes insoluble and opaque upon heating. Like many denaturations, cooking eggs is an irreversible process. The following example of a reversible denaturation was demonstrated in the 1950s by Christian Anfinsen, who shared the Nobel Prize in Chemistry in 1972. Bovine pancreatic ribonuclease (a digestive enzyme from cattle that degrades RNA) is denatured when treated with β-mercaptoethanol and 8 M urea (Figure 5.28). During this process, ribonuclease, composed of a single polypeptide with four disulfide bridges, completely unfolds and loses all biological activity. Careful removal of the denaturing agents with dialysis results in a spontaneous and correct refolding of the polypeptide and re-formation of the disulfide bonds. Anfinsen’s experimental treatment resulting in a full restoration of the enzyme’s catalytic activity provided an important early insight into the roles of different


forces and primary structure in protein folding. However, most proteins treated similarly do not renature. Denaturing conditions include the following:

KEY CONCEPTS
• Biochemists distinguish four levels of the structural organization of proteins.
• In primary structure, the amino acid residues are connected by peptide bonds.
• The secondary structure of polypeptides is stabilized by hydrogen bonds. Prominent examples of secondary structure are α-helices and β-pleated sheets.
• Tertiary structure is the unique three-dimensional conformation that a protein assumes because of the interactions between amino acid side chains. Several types of interaction stabilize tertiary structure: the hydrophobic effect, electrostatic interactions, hydrogen bonds, and certain covalent bonds.
• Proteins that consist of several separate polypeptide subunits exhibit quaternary structure.
• Both noncovalent and covalent bonds hold the subunits together. Some proteins are partially or completely unstructured.

Visit the companion website at www.oup.com/us/mckee to read the Biochemistry in Perspective essay on lead poisoning.

1. Strong acids or bases. Changes in pH alter the protonation state of certain protein side chain groups, which in turn alters hydrogen bonding and salt bridge patterns. As a protein approaches its isoelectric point, it becomes less soluble and may precipitate from solution.
2. Organic solvents. Water-soluble organic solvents such as ethanol interfere with hydrophobic interactions because they interact with nonpolar R groups and form hydrogen bonds with water and polar protein groups. Nonpolar solvents also disrupt hydrophobic interactions.
3. Detergents. Detergents are substances that disrupt hydrophobic interactions, causing proteins to unfold into extended polypeptide chains. These molecules are called amphipathic because they contain both hydrophobic and hydrophilic components.
4. Reducing agents. In the presence of reagents such as urea, reducing agents (e.g., β-mercaptoethanol) convert disulfide bridges to sulfhydryl groups. Urea disrupts hydrogen bonds and hydrophobic interactions.
5. Salt concentration. When there is an increase in the salt concentration of an aqueous solution of protein, some of the water molecules that interact with the protein’s ionizable groups are attracted to the salt ions. As the number of solvent molecules available to interact with these groups decreases, protein-protein interactions increase. If the salt concentration is high enough, there are so few water molecules available to interact with ionizable groups that the solvation spheres surrounding the protein’s ionized groups are removed. The protein molecules aggregate and then precipitate. This process is referred to as salting out. Because salting out is usually reversible and different proteins salt out at different salt concentrations, it is often used as an early step in protein purification.
6. Heavy metal ions. Heavy metals such as mercury (Hg²⁺) and lead (Pb²⁺) affect protein structure in several ways. They may disrupt salt bridges by forming ionic bonds with negatively charged groups. Heavy metals also bond with sulfhydryl groups, a process that may result in significant changes in protein structure and function. For example, Pb²⁺ binds to sulfhydryl groups in two enzymes in the heme synthetic pathway (Chapter 14). The resultant decrease in hemoglobin synthesis causes severe anemia. (In anemia the number of red blood cells or the hemoglobin concentration is lower than normal.) Anemia is one of the most easily measured symptoms of lead poisoning.
7. Temperature changes. As the temperature increases, the rate of molecular vibration increases. Eventually, weak interactions such as hydrogen bonds are disrupted and the protein unfolds. Some proteins are more resistant to heat denaturation, and this fact can be used in purification procedures.
8. Mechanical stress. Stirring and grinding actions disrupt the delicate balance of forces that maintain protein structure. For example, the foam formed when egg white is beaten vigorously contains denatured protein.
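The effect of pH on side chain charge (item 1 in the list) can be sketched with the Henderson-Hasselbalch relationship. In the Python example below, the pKa values are approximate figures for free amino acid side chains and are assumptions for illustration; values inside a folded protein can shift considerably. The function name `fraction_protonated` is likewise hypothetical.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of an ionizable group in its protonated form at a given pH.

    Follows from Henderson-Hasselbalch: pH = pKa + log10([base]/[acid]).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Approximate side chain pKa values (illustrative assumptions).
APPROX_PKA = {"Asp": 3.9, "Glu": 4.3, "His": 6.0, "Lys": 10.5}

if __name__ == "__main__":
    for ph in (2.0, 7.0, 12.0):
        row = ", ".join(
            f"{res}: {fraction_protonated(ph, pka):.2f}"
            for res, pka in APPROX_PKA.items()
        )
        print(f"pH {ph:>4}: {row}")
```

At strongly acidic pH the carboxylate groups of Asp and Glu are mostly protonated and uncharged, and at strongly basic pH Lys loses its positive charge. Either change removes one partner of a potential salt bridge, which is one way that extremes of pH disturb the native conformation.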

The Folding Problem

The direct relationship between a protein’s primary sequence and its final three-dimensional conformation, and by extension its biological activity, is among the most important assumptions of modern biochemistry. One of the principal underpinnings of this paradigm has already been mentioned: the series of experiments


FIGURE 5.28 The Anfinsen Experiment Ribonuclease denatured by 8 M urea and a mercaptan (RSH, a reagent that reduces disulfides to sulfhydryl groups) can be renatured by removing the urea and RSH and air-oxidizing the reduced disulfides.

reported by Christian Anfinsen in the late 1950s. Working with bovine pancreatic RNase, Anfinsen demonstrated that under favorable conditions a denatured protein could refold into its native and biologically active state (Figure 5.28). This discovery suggested that the three-dimensional structure of any protein could be predicted if the physical and chemical properties of the amino acids and the forces that drive the folding process (e.g., bond rotations, free energy considerations, and the behavior of amino acids in aqueous environments) were understood. Unfortunately, several decades of painstaking research with the most sophisticated tools available (e.g., X-ray crystallography and NMR in combination with site-directed mutagenesis and computer-based mathematical modeling) resulted in only limited progress. However, this work did reveal that protein folding is a stepwise process in which secondary structure formation (i.e., α-helix and β-pleated sheet) is an early feature. Hydrophobic interactions are an important force in folding. In addition, amino acid substitutions experimentally introduced into certain proteins reveal that changes in surface amino acids rarely affect the protein’s structure. In contrast, substitutions of amino acids within the hydrophobic core often lead to serious structural changes in conformation. In recent years important advances have been made by biochemists in protein-folding research. Protein-folding researchers have determined that the process does not consist, as was originally thought, of a single pathway. Instead, there are numerous routes that a polypeptide can take to fold into its native state.
As illustrated in Figure 5.29a an energy landscape with a funnel shape appears to best describe how an unfolded polypeptide with its own unique set of constraints (e.g., its amino acid sequence and posttranslational modifications, and environmental features within the cell such as temperature, pH, and molecular crowding) negotiates its way to a low-energy folded state. Depending largely on its size, a polypeptide may or may not form intermediates (species existing long enough to be detected) that are momentarily trapped in local energy wells (Figure 5.29b). Small molecules (fewer than 100 residues) often fold without intermediate formation (Figure 5.30a). As these molecules begin emerging from the ribosome, a rapid and cooperative folding process begins in which side chain interactions

Visit the companion website at www.oup.com/us/mckee to read the Biochemistry in Perspective essay on protein folding and human health.


FIGURE 5.29 The Energy Landscape for Protein Folding (a) Color is used to indicate the entropy level of the folding polypeptide. As folding progresses the polypeptide moves from a disordered state (high entropy, red) toward a progressively more ordered conformation until its unique biologically active conformation is achieved (lower entropy, blue). (b) A depiction of the conformational state of a polypeptide during folding: polypeptides can fold into their native states by several different pathways. Many molecules form transient intermediates, whereas others may become trapped in a misfolded state.

facilitate the formation and alignment of secondary structures. The folding of larger polypeptides typically involves the formation of several intermediates (Figure 5.30b, c). In many of these molecules or the domains within a molecule, the hydrophobically collapsed shape of the intermediate is referred to as a molten globule. The term molten globule refers to a partially organized globular state of a folding polypeptide that resembles the molecule’s native state. Within the interior of a molten globule, tertiary interactions among amino acid side chains are fluctuating; that is, they have not yet stabilized. It has also become increasingly clear that the folding and targeting of many proteins in living cells are aided by a group of molecules now referred to as the molecular chaperones. These molecules, most of which appear to be heat shock proteins (hsps), apparently occur in all organisms. Several classes of molecular chaperones have been found in organisms, ranging from bacteria to the higher animals and plants. They are found in several eukaryotic organelles, such as mitochondria, chloroplasts, and ER. There is a high degree of sequence homology among the molecular chaperones of all species so far investigated. The properties of several of these important molecules are described next.

MOLECULAR CHAPERONES Molecular chaperones apparently assist unfolded

proteins in two ways. First, during a finite time between synthesis and folding,


FIGURE 5.30 Protein Folding (a) In many small proteins, folding is cooperative with no intermediates formed. (b) In some larger proteins, folding involves the initial formation of a molten globule followed by rearrangement into the native conformation. (c) Large proteins with multiple domains follow a more complex pathway, with each domain folding separately before the entire molecule progresses to its native conformation.

proteins must be protected from inappropriate protein-protein interactions. For example, certain mitochondrial and chloroplast proteins must remain unfolded until they are inserted in an organelle membrane. Second, proteins must fold rapidly and precisely into their correct conformations. Some must be assembled into multisubunit complexes. Investigations of protein folding in a variety of organisms reveal the existence of two major molecular chaperone classes in protein folding. 1. Hsp70s. The hsp70s are a family of molecular chaperones that bind to and stabilize proteins during the early stages of folding. Numerous hsp70 monomers bind to short hydrophobic segments in unfolded polypeptides, thereby preventing molten globule formation. Each type of hsp70 possesses two binding sites, one for an unfolded protein segment and another for ATP. Release of a polypeptide from an hsp70 involves ATP hydrolysis. Mitochondrial and ER-localized hsp70s are required for transmembrane translocation of some polypeptides. 2. Hsp60s. Once an unfolded polypeptide has been released by hsp70, it is passed on to a member of a family of molecular chaperones referred to as the hsp60s (also called the chaperonins or Cpn60s), which mediate protein folding. The hsp60s form a large structure composed of two stacked seven-subunit rings. The unfolded protein enters the hydrophobic cavity of the hsp60 complex (Figure 5.31). The hsp60 system, which consists of two


FIGURE 5.31 Space-Filling Model of the E. coli Chaperonin Called the GroES-GroEL Complex GroES (a co-chaperonin, or hsp10) is a seven-subunit ring that sits on top of GroEL. GroEL (a chaperonin, or hsp60) is composed of two stacked, seven-subunit rings with a cavity in which ATP-dependent protein folding takes place.

KEY CONCEPTS
• All the information required for each newly synthesized polypeptide to fold into its biologically active conformation is encoded in the molecule’s primary sequence.
• Some relatively simple polypeptides fold spontaneously into their native conformations.
• Other larger molecules require the assistance of proteins called molecular chaperones to ensure correct folding.


identical rings and cavities, increases the speed and efficiency of the folding process. ATP hydrolysis converts a cavity into a hydrophilic microenvironment that facilitates the collapse of the hydrophobic core of the folding protein into the molten globule form. It takes 15–20 s for all seven ATPs to hydrolyze in the ring subunits and to complete the folding process. In the (ADP)7 state, the hydrophobic character of the cavity returns, the chamber opens, and the folded protein or domain is released. A new unfolded protein can now bind to repeat the cycle. Protein folding proceeds with two cycles occurring in an overlapping fashion depending on the ATP/ADP binding status of the two cavities. In addition to promoting the folding of nascent protein, molecular chaperones direct the refolding of protein that was partially unfolded as a consequence of stressful conditions. If refolding is not possible, molecular chaperones promote protein degradation. A diagrammatic view of protein folding is presented in Figure 5.32. The effects of protein misfolding on human health can be considerable. Both Alzheimer’s and Huntington’s diseases are neurodegenerative diseases caused by accumulations of insoluble protein aggregates. (See the online Biochemistry in Perspective essay Protein Folding and Human Disease.)

Fibrous Proteins

Fibrous proteins typically contain high proportions of regular secondary structures, such as α-helices or β-pleated sheets. As a consequence of their rodlike or sheetlike shapes, many fibrous proteins have structural rather than dynamic roles. Keratin (Figure 5.33) is a fibrous protein composed of bundles of α-helices, whereas the polypeptide chains of the silkworm silk protein fibroin (Figure 5.34) are arranged in antiparallel β-pleated sheets. The structural features of collagen, the most abundant protein in vertebrates, are described in some detail.

COLLAGEN Collagen is synthesized by connective tissue cells and then secreted into the extracellular space to become part of the connective tissue matrix. The 20 major families of collagen molecules include many closely related proteins that have diverse functions. The genetically distinct collagen molecules in skin, bones, tendons, blood vessels, and corneas impart to these structures many of their special properties (e.g., the tensile strength of tendons and the transparency of corneas).

[Figure 5.32 diagram labels: ribosome; hsp70; denatured (unfolded) protein; folding protein; folded protein; GroES; GroEL rings (cis ring and trans ring); ATP, ADP, and Pi; 180° flip.]

FIGURE 5.32 Molecular Chaperone-Assisted Protein Folding Molecular chaperones bind transiently to both nascent proteins and unfolded proteins (i.e., those denatured by stressful conditions). The members of the hsp70 family stabilize nascent proteins and reactivate some denatured proteins. Many proteins also require hsp60 protein complexes to achieve their final conformations. In E. coli, cellular proteins that do not fold spontaneously require processing by the GroEL-GroES complex. [Note that during a folding cycle the GroEL ring capped by GroES is referred to as the cis ring. The other attached ring, the one that has not yet initiated a protein folding cycle, is called the trans ring.] At the beginning of a folding cycle, an unfolded protein (or protein domain) is loosely bound via hydrophobic interactions to the cavity entrance of one of the GroEL-(ADP)7 rings. ADP/ATP exchange converts the cavity to a hydrophobic, expanded microenvironment that then traps the protein substrate under a GroES lid. Subsequently, sequential hydrolysis of the seven ATPs converts the cavity to a hydrophilic microenvironment, driving both the formation of the molten globule state of the protein substrate and the progression of the folding process. When all seven ATPs have been hydrolyzed, the hydrophobic surface of the cavity is reestablished and GroES and the newly folded protein leave the GroEL ring. Meanwhile, the trans-GroEL-(ADP)7 ring is already beginning the loading, trapping, and folding process for another unfolded protein or domain.


FIGURE 5.33 α-Keratin The α-helical rodlike domains of two keratin polypeptides form a coiled coil. Two staggered antiparallel rows of these dimers form a supercoiled protofilament. Hydrogen bonds and disulfide bridges are the principal interactions between subunits. Hundreds of filaments, each containing four protofilaments, form a macrofibril. Each hair cell, also called a fiber, contains several macrofibrils. Each strand of hair consists of numerous dead cells packed with keratin molecules. In addition to hair, the keratins are also found in wool, skin, horns, and fingernails. [Diagram labels: α-helix; coiled coil of two α-helices; protofilament (pair of coiled coils); filament (four right-hand twisted protofilaments).]

FIGURE 5.34 Molecular Model of Silk Fibroin In fibroin, the silk fibrous protein produced by silkworms, the polypeptide chains are arranged in fully extended antiparallel β-pleated sheet conformations. Note that the R groups of alanine on one side of each β-pleated sheet interdigitate with similar residues on the adjacent sheet. Silk fibers (fibroin embedded in an amorphous matrix) are flexible because the pleated sheets are loosely bonded to each other (primarily with weak van der Waals forces) and slide over each other easily.

Collagen is composed of three left-handed polypeptide helices that are twisted around each other to form a right-handed triple helix (Figure 5.35). Type I collagen molecules, found in teeth, bone, skin, and tendons, are about 300 nm long and approximately 1.5 nm wide. Approximately 90% of the collagen found in humans is type I. The amino acid composition of collagen is distinctive. Glycine constitutes approximately one-third of the amino acid residues. Proline and 4-hydroxyproline may account for as much as 30% of a collagen molecule’s amino acid composition. Small amounts of 3-hydroxyproline and 5-hydroxylysine also occur. (Specific proline and lysine residues in collagen’s primary sequence are hydroxylated within the rough ER after the polypeptides have been synthesized. These reactions, which are discussed in Chapter 19, require ascorbic acid (p. xxx).) Collagen’s amino acid sequence primarily consists of large numbers of repeating triplets with the sequence Gly-X-Y, in which X and Y are often proline and hydroxyproline. Hydroxylysine is also found in the Y position. Simple carbohydrate groups are often attached to the hydroxyl group of hydroxylysine residues. It has been suggested that collagen’s carbohydrate components are required for fibrillogenesis, the assembly of collagen fibers in their extracellular locations, such as tendons and bone.
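The Gly-X-Y repeat pattern described above can be checked computationally. The following Python sketch is illustrative only: the function names and the 15-residue fragment are hypothetical, and the letter O is used here as an informal stand-in for 4-hydroxyproline (an assumption made for this example, not a standard one-letter code for that residue).

```python
def split_triplets(sequence: str) -> list[str]:
    """Split a one-letter amino acid sequence into consecutive, complete triplets."""
    usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial triplet
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def fraction_gly_first(sequence: str) -> float:
    """Fraction of triplets whose first residue is glycine (G)."""
    triplets = split_triplets(sequence)
    if not triplets:
        return 0.0
    return sum(t[0] == "G" for t in triplets) / len(triplets)


# Hypothetical collagen-like fragment (P = proline, O = 4-hydroxyproline,
# A = alanine, K = lysine); not taken from any real collagen sequence.
fragment = "GPOGPAGPOGKOGPO"

print("Triplets:", split_triplets(fragment))
print("Fraction starting with Gly:", fraction_gly_first(fragment))
print("Overall glycine fraction:", fragment.count("G") / len(fragment))
```

In a perfect Gly-X-Y repeat every triplet begins with glycine, so glycine makes up one-third of the residues, which agrees with the composition stated in the text.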


FIGURE 5.35 Collagen Fibrils The bands are formed by staggered collagen molecules. Cross-striations are about 680 Å apart. Each collagen molecule is about 3000 Å long. [Diagram labels: collagen molecule; packing of molecules; hole zone; overlap zone.]

The enzyme lysyl oxidase converts some of the lysine and hydroxylysine side groups to aldehydes through oxidative deamination, and this facilitates the spontaneous nonenzymatic formation of strengthening aldimine and aldol cross-links. (An aldol cross-link is formed in a reaction, called an aldol condensation, in which two aldehydes form an α,β-unsaturated aldehyde linkage. In condensation reactions, a small molecule, in this case H2O, is removed.) Cross-linkages also occur between hydroxylysine-linked carbohydrates and the amino group of other lysine and hydroxylysine residues on adjacent molecules. Increased cross-linking with age leads to the brittleness and breakage of the collagen fibers that occur in older organisms. Glycine is prominent in collagen sequences because the triple helix is formed by interchain hydrogen bonding involving the glycine residues. Therefore every third residue is in close contact with the other two chains. Glycine is the only amino acid with an R group sufficiently small for the space available. Larger R groups would destabilize the superhelix structure. The triple helix is further strengthened by hydrogen bonding between the polypeptides (caused principally by the large number of hydroxyproline residues) and lysinonorleucine linkages that stabilize the orderly arrays of triple helices in the final collagen fibril.

Globular Proteins The biological functions of globular proteins usually involve the precise binding of small ligands or large macromolecules such as nucleic acids or other proteins. Each protein possesses one or more unique cavities or clefts whose structure is complementary to a specific ligand. After ligand binding, a conformational change occurs in the protein that is linked to a biochemical event. For example, the binding of ATP to myosin in muscle cells is a critical event in muscle contraction. The oxygen-binding proteins myoglobin and hemoglobin are interesting and well-researched examples of globular proteins. They are both members of the hemoproteins, a specialized group of proteins that contain the prosthetic group heme. Although the heme group (Figure 5.36) in both proteins is responsible

FIGURE 5.36 Heme Heme consists of a porphyrin ring (composed of four pyrroles) with Fe2+ in the center.

QUESTION 5.10 Covalent cross-links contribute to the strength of collagen. The first reaction in cross-link formation is catalyzed by the copper-containing enzyme lysyl oxidase, which converts lysine residues to the aldehyde allysine:

Lysine residue, side chain –(CH2)4–NH2 → (lysyl oxidase) → allysine residue, side chain –(CH2)3–CHO

Allysine then reacts with other side chain aldehyde or amino groups to form cross-linkages. For example, two allysine residues react to form an aldol cross-linked product:

Allysine residue, –(CH2)3–CHO + allysine residue, OHC–(CH2)3– → aldol cross-link, –(CH2)2–C(CHO)=CH–(CH2)3–

In a disease called lathyrism, which occurs in humans and several other animals, a toxin (β-aminopropionitrile) found in sweet peas (Lathyrus odoratus) inactivates lysyl oxidase. Consider the abundance of collagen in animal bodies and suggest some likely symptoms of this malady.

for the reversible binding of molecular oxygen, the physiological roles of myoglobin and hemoglobin are significantly different. The chemical properties of heme are dependent on the Fe2+ ion in the center of the prosthetic group. Fe2+, which forms six coordinate bonds, is bound to the four nitrogens in the center of the protoporphyrin ring. Two other coordinate bonds are available, one on each side of the planar heme structure. In myoglobin and hemoglobin, the fifth coordination bond is to the nitrogen atom in a histidine residue, and the sixth coordination bond is available for binding oxygen. In addition to serving as a reservoir for oxygen within muscle cells, myoglobin also facilitates the intracellular diffusion of oxygen. The role of hemoglobin, the primary protein of red blood cells, is to deliver oxygen to cells throughout the body. A comparison of the structures of these two proteins illustrates several important principles of protein structure, function, and regulation. MYOGLOBIN Myoglobin, found in high concentration in skeletal and cardiac muscle, gives these tissues their characteristic red color. The muscles of diving mammals such as whales, which remain submerged for long periods, have high myoglobin concentrations. Because of the extremely high concentrations of

[Figure 5.37 diagram labels: heme; iron atom; two His side chains; Ile; amino end of chain; carboxyl end; α-carbon positions numbered 1 to 153.]

myoglobin, such muscles are typically brown. The protein component of myoglobin, called globin, is a single polypeptide chain that contains eight segments of α-helix (Figure 5.37). The folded globin chain forms a crevice that almost completely encloses a heme group. Free heme [Fe2+] has a high affinity for O2 and is irreversibly oxidized to form hematin [Fe3+]. Hematin cannot bind O2. Noncovalent interactions between amino acid side chains and the nonpolar porphyrin ring within myoglobin’s oxygen-binding crevice decrease heme’s affinity for O2. The decreased affinity protects Fe2+ from oxidation and allows for the reversible binding of O2. All of the heme-interacting amino acids are nonpolar except for two histidines, one of which (the proximal histidine) binds directly to the heme iron atom (Figure 5.38). The other (the distal histidine) stabilizes the oxygen-binding site.

FIGURE 5.37 Myoglobin With the exception of the side chain groups of two histidine residues, only the α-carbon atoms of the globin polypeptide are shown. Myoglobin’s eight helices are designated A through H. The heme group has an iron atom that binds reversibly with oxygen. To improve clarity one of heme’s propionic acid side chains has been displaced.

HEMOGLOBIN Hemoglobin is a roughly spherical molecule found in red blood cells, where its primary function is to transport oxygen from the lungs to every tissue in the body. Recall that HbA is composed of two α-chains and two β-chains (Figure 5.39). The HbA molecule is commonly designated α2β2. [There is another type of adult hemoglobin: approximately 2% of human hemoglobin is HbA2, which contains δ (delta)-chains instead of β-chains.] Before birth, several additional hemoglobin polypeptides are synthesized. The ε (epsilon)-chain, which appears in early embryonic life, and the γ-chain, found in the fetus, closely resemble the β-chain. Because both α2ε2 and α2γ2 hemoglobins have a greater affinity for oxygen than does α2β2 (HbA), the fetus can preferentially absorb oxygen from the maternal bloodstream. Although the three-dimensional configurations of myoglobin and the α- and β-chains of hemoglobin are very similar, their amino acid sequences have many differences. Comparison of these molecules from dozens of species has revealed nine invariant amino acid residues. Several invariant residues directly

FIGURE 5.38 The Oxygen-Binding Site of Heme Created by a Folded Globin Chain [Diagram labels: distal histidine; bound O2; heme; proximal histidine.]


FIGURE 5.39 Hemoglobin The protein contains four subunits, designated α and β. Each subunit contains a heme group that binds reversibly with oxygen.

affect the oxygen-binding site, whereas others stabilize the α-helical peptide segments. The remaining residues may vary considerably. However, most substitutions are conservative. For example, each polypeptide’s interior remains nonpolar. The four chains of hemoglobin are arranged in two identical dimers, designated as α1β1 and α2β2. Each globin polypeptide has a heme-binding unit similar to that described for myoglobin. Although both myoglobin and hemoglobin bind oxygen reversibly, the latter molecule has a complex structure and more complicated binding properties. The numerous noncovalent interactions (mostly hydrophobic) between the subunits in each αβ-dimer remain largely unchanged when hemoglobin interconverts between its oxygenated and deoxygenated forms. In contrast, the relatively small number of interactions between the two dimers change substantially during this transition. When hemoglobin is oxygenated, the salt bridges and certain hydrogen bonds are ruptured as the α1β1 and α2β2 dimers slide by each other and rotate 15° (Figure 5.40). The deoxygenated conformation of hemoglobin (deoxyHb) is often referred to as the T (taut) state and oxygenated hemoglobin (oxyHb) is said to be in the R (relaxed) state. The oxygen-induced readjustments in the interdimer contacts are almost simultaneous. In other words, a conformational change in one subunit is rapidly propagated to the other subunits. Consequently, hemoglobin alternates between two stable conformations, the T and R states. Because of subunit interactions, the oxygen dissociation curve of hemoglobin has a sigmoidal shape (Figure 5.41). As the first O2 binds to hemoglobin, the binding of additional O2 to the same molecule is enhanced. This binding pattern, called cooperative binding, results from changes in hemoglobin’s three-dimensional structure that are initiated when the first O2 binds. The binding of the first O2 facilitates the binding of the remaining three O2 molecules to the tetrameric hemoglobin molecules. In the lungs, where O2 tension is high, hemoglobin is quickly saturated (converted to the R state). In tissues depleted of O2, hemoglobin gives up about half its oxygen. In contrast to hemoglobin, myoglobin’s oxygen dissociation curve is hyperbolic. This simpler binding pattern, a consequence of myoglobin’s simpler structure, reflects several aspects of this protein’s role in oxygen storage. Because its dissociation curve is well to the

FIGURE 5.40 The Hemoglobin Allosteric Transition When hemoglobin is oxygenated, the α1β1 and α2β2 dimers slide by each other and rotate 15°. [Panels: (a) deoxyhemoglobin; (b) oxyhemoglobin.]

FIGURE 5.41 Dissociation Curves Measure the Affinity of Hemoglobin and Myoglobin for Oxygen. Arterial blood, enriched in O2, delivers it to the tissues. Venous blood, which drains from tissues, is O2 depleted. [Plot: O2 saturation (%) versus partial pressure of oxygen (torr) for myoglobin and hemoglobin, with venous and arterial pressures marked.]

left of the hemoglobin curve, myoglobin gives up oxygen only when the muscle cell’s oxygen concentration is very low (i.e., during strenuous exercise). In addition, because myoglobin has a greater affinity for oxygen than does hemoglobin, oxygen moves from blood to muscle. The binding of ligands other than oxygen affects hemoglobin’s oxygen-binding properties. For example, the dissociation of oxygen from hemoglobin is enhanced if pH decreases. By this mechanism, called the Bohr effect, oxygen is delivered to cells in proportion to their needs. Metabolically active cells, which require large amounts of oxygen for energy generation, also produce large amounts of the waste product CO2. As CO2 diffuses into blood, it reacts with water to form HCO3– and H+. (The bicarbonate buffer was discussed on p. xx.) The subsequent binding of H+ to several ionizable groups on hemoglobin molecules enhances the dissociation of O2 by converting hemoglobin to its T state. (Hydrogen ions bind preferentially to deoxyHb. Any increase in H+ concentration stabilizes the deoxy conformation of the protein and therefore shifts the equilibrium distribution between the T and R states.) When a small number of CO2 molecules bind to terminal amino groups on hemoglobin (forming carbamate or –NHCOO– groups) the deoxy form (T state) of the protein is additionally stabilized. 2,3-Bisphosphoglycerate (BPG) (also called glycerate-2,3-bisphosphate) is also an important regulator of hemoglobin function. Although most cells contain only trace amounts of BPG, red blood cells contain a considerable amount. BPG is derived from glycerate-1,3-bisphosphate, an intermediate in the breakdown of the energy-rich compound glucose. In the absence of BPG,

KEY CONCEPTS • Globular protein function usually involves binding to small ligands or to other macromolecules. • The oxygen-binding properties of myoglobin and hemoglobin are determined in part by the number of subunits they contain.

FIGURE 5.42 The Effect of 2,3-Bisphosphoglycerate (BPG) on the Affinity Between Oxygen and Hemoglobin In the absence of BPG (–BPG), hemoglobin has a high affinity for O2; where BPG is present and binds to hemoglobin (+BPG), its affinity for O2 decreases. [Plot: O2 saturation (%) versus partial pressure of oxygen (torr), with curves for –BPG and +BPG.]

hemoglobin has a very high affinity for oxygen (Figure 5.42). As with H+ and CO2, binding BPG stabilizes deoxyHb. A negatively charged BPG molecule binds in a central cavity within hemoglobin that is lined with positively charged amino acids. In the lungs the process is reversed. A high oxygen concentration drives the conversion from the deoxyHb configuration to that of oxyHb. The change in the protein’s three-dimensional structure initiated by the binding of the first oxygen molecule releases bound CO2, H+, and BPG. The H+ recombines with HCO3– to form carbonic acid, which then dissociates to form CO2 and H2O. Afterward, CO2 diffuses from the blood into the alveoli.
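The contrast between hemoglobin's sigmoidal curve and myoglobin's hyperbolic curve, and the way H+, CO2, and BPG lower hemoglobin's oxygen affinity, can be illustrated with the Hill equation, an empirical approximation that is not part of the discussion above. The following Python sketch assumes approximate parameter values that the text does not give: a half-saturation pressure (P50) of about 2.8 torr with a Hill coefficient n = 1 for myoglobin, and a P50 of about 26 torr with n of about 2.8 for hemoglobin. The function name and constants are illustrative.

```python
def hill_saturation(po2_torr: float, p50_torr: float, n: float) -> float:
    """Fractional O2 saturation from the Hill equation.

    Y = pO2^n / (P50^n + pO2^n); n = 1 gives a hyperbolic curve,
    n > 1 gives a sigmoidal (cooperative) curve.
    """
    return po2_torr ** n / (p50_torr ** n + po2_torr ** n)


# Assumed, approximate parameters (not stated in the text).
MYOGLOBIN = {"p50_torr": 2.8, "n": 1.0}
HEMOGLOBIN = {"p50_torr": 26.0, "n": 2.8}

# Compare the two proteins at a low, a venous-like, and an arterial-like pO2.
for po2 in (20, 40, 100):
    mb = hill_saturation(po2, **MYOGLOBIN)
    hb = hill_saturation(po2, **HEMOGLOBIN)
    print(f"pO2 = {po2:3d} torr: myoglobin {mb:.2f}, hemoglobin {hb:.2f}")
```

Raising `p50_torr` in the hemoglobin parameters mimics the stabilization of deoxyHb by H+, CO2, or BPG: at a given oxygen pressure the computed saturation falls, so more O2 is released to the tissues.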

QUESTION 5.11 Fetal hemoglobin (HbF) binds to BPG to a lesser extent than does HbA. Why do you think HbF has a greater affinity for oxygen than does maternal hemoglobin?

QUESTION 5.12 Myoglobin stores O2 in muscle tissue to be used by the mitochondria only when the cell is in oxygen debt, while hemoglobin can effectively transport O2 from the lungs and deliver it discriminately to cells in need of O2. Describe the structural features that allow these two proteins to accomplish separate functions.

5.4 MOLECULAR MACHINES Purposeful movement is the hallmark of living organisms. This behavior takes myriad forms that range from the record-setting 110 km/h chasing sprint of the cheetah to more subtle movements such as the migration of white blood cells in the animal body, cytoplasmic streaming in plant cells, intracellular transport of organelles, and the enzyme-catalyzed unwinding of DNA. The multisubunit proteins responsible for these phenomena (e.g., the muscle sarcomere and various other types of cytoskeletal components, and DNA polymerase) function as biological machines. Machines are defined as mechanical devices with moving parts that perform work (the product of force and distance). When machines are used correctly, they permit the accomplishment of tasks that would often be impossible without them. Although biological machines are composed of relatively fragile


proteins that cannot withstand the physical conditions associated with human-made machines (e.g., heat and friction), the two types do share important features. In addition to having moving parts, all machines require energy-transducing mechanisms; that is, they convert energy into directed motion. Despite the wide diversity of motion types in living organisms, in all cases, energy-driven changes in protein conformations result in the accomplishment of work. Protein conformation changes occur when a ligand is bound. When a specific ligand binds to one subunit of a multisubunit protein complex, the change in its conformation will affect the shapes of adjacent subunits. These changes are reversible; that is, ligand dissociation from a protein causes it to revert to its previous conformation. The work performed by complex biological machines requires that the conformational and, therefore, functional changes occur in an orderly and directed manner. In other words, an energy source (usually provided by the hydrolysis of ATP or GTP) drives a sequence of conformational changes of adjacent subunits in one functional direction. The directed functioning of biological machines is possible because nucleotide hydrolysis is irreversible under physiological conditions.

Motor Proteins Despite their functional diversity, all biological machines possess one or more protein components that bind nucleoside triphosphates (NTP). These subunits, called NTPases, function as mechanical transducers or motor proteins. The NTP hydrolysis–driven changes in the conformation of a motor protein trigger ordered conformational changes in adjacent subunits in the molecular machine. NTP-binding proteins perform a wide variety of functions in eukaryotes, most of which occur in one or more of the following categories.
1. Classical motors. Classical motor proteins are ATPases that move a load along a protein filament, as shown earlier (Figure 2.4). The best-known examples include the myosins, which move along actin filaments, and the kinesins and dyneins, which move vesicles and organelles along microtubules. Kinesins walk along the microtubules toward the (+) end, away from the centrosome (the microtubule organizing center). Dyneins walk along the microtubules toward the (–) end, toward the centrosome.
2. Timing devices. The function of certain NTP-binding proteins is to provide a delay period during a complex process that ensures accuracy. The prokaryotic protein synthesis protein EF-Tu (Biochemistry in Perspective Box EF-TU: A Motor Protein, Ch. 19) is a well-known example. The relatively slow rate of GTP hydrolysis by EF-Tu when it is bound to an aminoacyl-tRNA allows sufficient time for the dissociation of the complex from the ribosome if the tRNA-mRNA base sequence binding is not correct.
3. Microprocessing switching devices. A variety of GTP-binding proteins act as on-off molecular switches in signal transduction pathways. Examples include the α-subunits of the trimeric G proteins. Numerous intracellular signal control mechanisms are regulated by G proteins.
4. Assembly and disassembly factors. Numerous cellular processes require the rapid and reversible assembly of protein subunits into larger molecular complexes. Among the most dramatic examples of protein subunit polymerization are the assembly of tubulin and actin into microtubules and microfilaments, respectively. The slow hydrolysis of GTP by tubulin and ATP by actin monomers, after the incorporation of these molecules into their respective polymeric filaments, promotes subtle conformational changes that later allow disassembly.
The best-characterized motor protein is myosin. A brief overview of the structure and function of myosin in the molecular events in muscle contraction is provided online in the Biochemistry in Perspective essay Myosin: A Molecular Machine.

Visit the companion website at www.oup.com/us/mckee to read the Biochemistry in Perspective essay on myosin: a molecular machine.


Biochemistry IN PERSPECTIVE Spider Silk and Biomimetics What properties of spider silk have made it the subject of research worth hundreds of millions of dollars? The female golden orb-web spider of Madagascar (Nephila madagascariensis) (Figure 5A) is a large spider (length = 12.7 cm or 5 in) that is so named because of its bright yellow silk. In a recent and astonishing effort two Madagascar businessmen oversaw the creation of a hand-woven 11-foot brocaded spider silk textile (Figure 5B). The large amount of silk required in this endeavor was obtained by literally harnessing thousands of orb-web spiders. (After gentle hand pulling of the dragline silk, the spiders were then released.) The silk fiber used in the weaving process was twisted into 96-ply thread. Amazingly, when the Madagascar textile was on display in a New York museum, the owners challenged an onlooker to break a thread in one of the tassels. Unable to do so, he compared its strength with that of a chain in a bicycle lock. In addition to textiles, biodegradable and lightweight spider silk is preferable to artificial fibers for a variety of applications, such as artificial tendon and ligament components, surgical thread (e.g., eye sutures), and lightweight armor (e.g., bulletproof vests and helmets). This fiber is desirable not only because of its remarkable mechanical properties, but also because spiders produce it at ambient temperature and pressure with water as the solvent. In contrast, Kevlar, an aramid (aromatic amide) polymer derived from petroleum, is synthesized by forcing an almost boiling mixture of monomers dissolved in sulfuric acid through the small holes of an industrial spinneret (a multipored device used to convert a polymer into individual fibers). However, despite enormous effort and millions of dollars of investment, most applications of spider silk have not materialized for the simple reason that there is no adequate source. The obvious solution would be spider farming, similar to the over 5000-year-old practice (originating in China) of cultivating the domesticated silkworm moth (Bombyx mori). (Silkworm silk is less tough and elastic than spider silk.) Unfortunately, spider silk farming is not commercially feasible, because spiders are cannibals. (In close quarters these aggressive organisms proceed to eat one another.) Handling each of thousands of spiders separately is also unworkable because of its expense. (The creation of the Madagascar textile cost $500,000.) An alternative strategy is the biomimetic industrial synthesis of artificial spider silk. Biomimetics solves engineering problems by emulating biological processes such as the spinning of spider silk. Success, unfortunately, has been limited. For example, attempts to use recombinant DNA technology to synthesize spider silk by inserting silk genes into bacteria and yeast and then recovering silk proteins have been disappointing. There has been limited success with transgenic goats, animals into which a spider silk protein gene has been inserted. Efforts at spinning the goat silk protein, purified from the animal’s milk, resulted in fibers with inferior mechanical properties with diameters (10–60 μm) that were considerably thicker than natural spider silk (2.5–5 μm). However, the research efforts continue because the goal of industrial engineers will certainly be worth the investment: biodegradable, environmentally safe artificial spider silk will offer an alternative to petroleum-based fibers. Throughout this effort, scientists will further probe both spider silk structure and the biological spinning process.

FIGURE 5A The Golden Orb-Web Spider Nephila madagascariensis. Spiders have relatively poor eyesight. Usually sitting in the center of the web, a spider senses the arrival of prey by monitoring vibrations in the web threads with sensitive hairs that cover various parts of its body.

FIGURE 5B Detail from the Madagascar Spider Silk Textile. The design is a classic Malagascan weaving pattern.

Spider Silk Structure Dragline silk is composed of two proteins, spidroin 1 and spidroin 2, that have molecular masses that range from 200 to 350 kDa. Spidroin amino acid composition is distinctive because the majority of residues are glycine (42%) and alanine (25%), with smaller amounts of amino acids with bulkier side chains (Arg, Tyr, Gln, Ser, Leu, and Pro). Both spidroin proteins contain two major types of repeating units: polyalanine sequences (5 to 10 residues) and glycine-enriched motifs such as polyGlyAla, GlyProGlyGlyX (where X is often Gln), and GlyGlyX (where X can be various amino acids). In mature silk protein the polyalanine and polyGlyAla sequences form antiparallel β-pleated sheets, the microcrystalline structures that give silk its tensile strength (Figure 5C). The β-pleated sheets are connected by glycine-enriched sequences that form random coil, β-spirals (similar to β-turns), and GlyGlyX helical structures that together constitute an amorphous and elastic matrix.
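The compositional features described above (a glycine- and alanine-rich sequence with polyalanine blocks of 5 to 10 residues) can be examined with a few lines of code. The following Python sketch is illustrative only: the function names and the 26-residue fragment are hypothetical and are not taken from a real spidroin sequence.

```python
import re
from collections import Counter


def composition_percent(sequence: str) -> dict[str, float]:
    """Percentage of each residue in a one-letter amino acid sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def polyalanine_runs(sequence: str, min_len: int = 5) -> list[tuple[int, int]]:
    """Return (start index, length) for each run of at least min_len alanines."""
    pattern = r"A{%d,}" % min_len
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, sequence)]


# Hypothetical spidroin-like fragment: a GlyProGlyGlyX-type region,
# a polyalanine block, and a GlyGlyX-type region.
fragment = "GPGGQGPGGYAAAAAAAGGAGGAGGL"

composition = composition_percent(fragment)
for residue, percent in sorted(composition.items(), key=lambda item: -item[1]):
    print(f"{residue}: {percent:.1f}%")
print("Polyalanine runs (start, length):", polyalanine_runs(fragment))
```

Applied to a full spidroin sequence, the same two functions would report the overall glycine and alanine content and locate the polyalanine segments that form the β-pleated sheet crystallites.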

Spider Silk Fiber Assembly

FIGURE 5C Diagrammatic View of a Spider Silk Filament Segment. The structure of spider silk is not known with precision. It is known that two types of β-pleated sheet (highly ordered and less ordered) are responsible for spider silk strength. They are linked to each other by polypeptide sequences that form the random coil, left-handed helices, and β-spirals that provide elasticity. Silks differ in their β-pleated sheet and random coil content. For example, dragline silk has a higher content of β-pleated sheet than does capture silk.


Dragline spider silk fiber production provides a rare opportunity to observe protein folding as it occurs in a living organism. The spinning process (Figure 5D) begins in the ampullate gland in the spider’s abdomen where epithelial cells secrete the spidroin into the gland’s lumen. Silk protein, referred to as the silk feedstock or dope, is highly concentrated (as high as 50%). At this stage spidroin’s globular conformation (about 30% α–helices) ensures its solubility in water and prevents aggregation. The silk dope is squeezed through the ampullate gland and the narrow funnel that connects to the spinning duct. Here the flowing dope begins to assume the properties of a liquid crystal as long axes of the protein molecules are forced into parallel orientation. The tapered S-shape spinning duct has three segments. As the protein moves through the segments,

05-McKee-Chap05.qxd:05-McKee-Chap05

1/13/11

6:58 PM

Page 50

CHAPTER FIVE Amino Acids, Peptides, and Proteins

Biochemistry

IN PERSPECTIVE cont

Major Ampullate Gland Funnel

Spinning Duct Na+

Cl-

PO42K+ Spigots

Spidroin Secretion

H+

H2O

H 2O

Valve

Dragline Silk Fibre

H2O Spinneret

FIGURE 5D Processing of Spider Dragline Silk. After the spidroins are secreted into the lumen of the major ampullate gland they move toward the funnel, where they exit into the beginning of the spinning duct. As a result of shear stress and other forces (e.g., squeezing of the wall of the ampullate gland and the pulling of the silk fiber out of the spinneret by the spider), the spidroins in the silk dope are compressed and forced to align along their long axes. As the silk polymer progresses down the tapering duct, biochemical changes (e.g., Na+, K+, and H+) cause the conversion of α-helices into hydrophobic β-pleated sheets that expel H2O. After passing through the valve the polymer is forced through one of several spigots. Several emerging filaments are twisted together to form a silk fiber that is pulled out of the spinneret by the spider.

nascent silk polymer forms as a result of increasing shear stress (force applied by the parallel duct wall) and several changes in the biochemical environment. Within the duct, Na+ and Cl- are extracted and phosphate and K+ are pumped in. An increase in the K+/Na+ ratio, combined with the secretion of phosphate and H+, is believed to cause the conversion of α-helical conformations to β-pleated sheets. At first randomly oriented, the β-pleated sheets are eventually forced into parallel alignment with the long axis of the filament. In the third segment of the duct, large amounts of water, released from the silk protein as hydrophobic interactions increase, are pumped out by epithelial cells. The valve at the end of the duct is believed to act both as a clamp that grips the silk and as a means of restarting the spinning process if the silk breaks. The silk polymer then enters one of numerous spigots within a spinneret (Figure 5E). As the silk filament emerges and the remaining water evaporates, it solidifies. The filaments from numerous spigots wrap around each other to form a cable-like fiber. The diameter and strength of the fiber depend on the muscular tension within the spinneret valve and how fast the spider draws it out.

FIGURE 5E Illustration of the Silk Spinning Spigots of a Spider Spinneret. Note that emerging filaments are twisting together to form a fiber.

SUMMARY: Biodegradable, lightweight, strong spider silk has an enormous number of potential applications. Intense, and as yet unsuccessful, research efforts have focused on duplicating the natural process by which spiders produce this remarkable fiber.


Biochemistry IN THE LAB

Protein Technology

Purification Protein analysis begins with isolation and purification. Extraction of a protein requires cell disruption and homogenization (see Biochemistry in the Lab, Cell Technology, Chapter 2). This process is often followed by differential centrifugation and, if the protein is a component of an organelle, by density gradient centrifugation. After the protein-containing fraction has been obtained, several relatively crude methods may be used to enhance purification. In salting out, high concentrations of salts such as ammonium sulfate [(NH4)2SO4] are used to precipitate proteins. Because each protein has a characteristic salting-out point, this technique removes many impurities. (Unwanted proteins that remain in solution are discarded when the liquid is decanted.) When proteins are tightly bound to membranes, organic solvents or detergents often aid in their extraction. Dialysis (Figure 5F) is routinely used to remove low-molecular-weight impurities such as salts, solvents, and detergents. As a protein sample becomes progressively more pure, more sophisticated methods are used to achieve further purification. Among the most commonly used techniques are chromatography and electrophoresis.
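The effect of renewing the buffer during dialysis can be illustrated with a rough calculation. The sketch below assumes that, at each buffer change, small solutes equilibrate completely between the bag and the surrounding buffer while the protein stays inside the bag. The function name and the volumes are hypothetical and serve only as an illustration.

```python
def residual_fraction(bag_volume_ml, buffer_volume_ml, n_changes):
    """Fraction of the original small-solute concentration left in the bag.

    Assumes complete equilibration across the membrane at every buffer change.
    """
    if bag_volume_ml <= 0 or buffer_volume_ml < 0 or n_changes < 0:
        raise ValueError("bag volume must be positive; other values non-negative")
    # At equilibrium the solute is spread evenly over bag volume + buffer volume.
    dilution_per_change = bag_volume_ml / (bag_volume_ml + buffer_volume_ml)
    return dilution_per_change ** n_changes


if __name__ == "__main__":
    # Hypothetical example: a 10 mL sample dialyzed against 990 mL of fresh buffer.
    for n in range(4):
        print(n, residual_fraction(10, 990, n))
```

Under these assumptions, each buffer change lowers the impurity concentration by the same factor, which is why several changes of a moderate volume can be more effective than one very large volume.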

Chromatography Originally devised to separate low-molecular-weight substances such as sugars and amino acids, chromatography has become an invaluable tool in protein purification. A wide variety of chromatographic techniques are used to separate protein mixtures on the basis of molecular properties such as size, shape, and weight,

[Figure 5F labels: protein molecule; small solute molecule; water in; water out]

FIGURE 5F Dialysis. Proteins are routinely separated from low-molecular-weight impurities by dialysis. When a dialysis bag (an artificial semipermeable membrane) containing a cell extract is suspended in water or a buffered solution, small molecules pass out through the membrane’s pores. If the solvent outside the bag is continually renewed, all low-molecular-weight impurities are removed from the inside.

or certain binding affinities. Often several techniques must be used sequentially to obtain a demonstrably pure protein. In all chromatographic methods the protein mixture is dissolved in a liquid known as the mobile phase. As the protein molecules pass across the stationary phase (a solid matrix), they separate from each other because they are differently distributed between the two phases. The relative movement of each molecule results from its capacity to remain associated with the stationary phase while the mobile phase continues to flow. Three chromatographic methods commonly used in protein purification are gel-filtration chromatography, ion-exchange chromatography, and affinity chromatography. Gel-filtration chromatography (Figure 5G) is a form of size-exclusion chromatography in which particles in an aqueous solution flow through a column (a hollow tube) filled with gel and are separated according to size. Molecules that are larger than the gel pores are excluded and therefore move through the column quickly. Molecules that are smaller than the gel pores diffuse in and out of the pores, so their movement through the column is retarded. Differences in the rates of particle movement separate the protein mixture into bands, which are then collected separately. Ion-exchange chromatography separates proteins on the basis of their charge. Anion-exchange resins, which consist of positively charged materials, bind reversibly with a protein’s negatively charged groups. Similarly, cation-exchange resins bind positively charged groups. After proteins that do not bind to the resin have been removed, the protein of interest is recovered by


Living organisms produce a stunning variety of proteins. Consequently, it is not surprising that considerable time, effort, and funding have been devoted to investigating their properties. Since the amino acid sequence of bovine insulin was determined by Frederick Sanger in 1953, the structures of several thousand proteins have been elucidated. In contrast to the 10 years required for insulin, current technologies allow protein sequence determination within a few days by mass spectrometry. The amino acid sequence of a protein can be generated from its DNA or mRNA sequence if this information is available. After a brief review of protein purification methods, mass spectrometry is described. An older means of determining the primary sequence of polypeptides, the Edman degradation method, is described in an online Biochemistry in the Lab box Protein Sequencing: The Edman Degradation Method. Note that all the techniques for isolating, purifying, and characterizing proteins exploit differences in charge, molecular weight, and binding affinities. Many of these technologies apply to the investigation of other biomolecules.


[Figure 5G labels: sample; buffer; solvent flow; column of stationary porous beads; small molecules can penetrate beads (passage is retarded); large molecules move between beads; absorbent material; later time]

FIGURE 5G Gel-Filtration Chromatography In gel-filtration chromatography the stationary phase is a gelatinous polymer with pore sizes selected by the experimenter to separate molecules according to their sizes. The sample is applied to the top of the column and is eluted with buffer (the mobile phase). As elution proceeds, larger molecules travel faster through the gel than smaller molecules, whose progress is slowed because they can enter the pores. If fractions are collected, the larger molecules appear in the earlier fractions and later fractions contain smaller molecules.
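The elution behavior described in the caption can be sketched in a few lines. The example assumes that all proteins fall within the fractionation range of the gel and that size tracks molecular weight, so that the largest protein elutes first. The function name is hypothetical; the molecular weights are those of three of the Mr standards listed in Figure 5H.

```python
def elution_order(proteins):
    """Return protein names ordered from first eluted to last eluted.

    proteins: dict mapping protein name -> molecular weight (Mr).
    Assumes size tracks molecular weight, so larger proteins elute earlier.
    """
    ranked = sorted(proteins.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked]


if __name__ == "__main__":
    sample = {"lysozyme": 14_400, "myosin": 200_000, "ovalbumin": 45_000}
    print(elution_order(sample))
```

Real separations also depend on molecular shape and on the pore sizes chosen by the experimenter, which this sketch ignores.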

an appropriate change in the solvent pH and/or salt concentration. (A change in pH alters the protein’s net charge.) Affinity chromatography takes advantage of the unique biological properties of proteins. That is, it uses a special noncovalent binding affinity between the protein and a special molecule (the ligand). The ligand is covalently bound to an insoluble matrix, which is placed in a column. After nonbinding protein molecules have passed through the column, the protein of interest is removed by altering the conditions that affect binding (i.e., pH or salt concentration).
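Because a change in pH alters a protein's net charge, the choice of ion-exchange resin depends on the buffer pH relative to the protein's isoelectric point (pI, the pH at which the protein has no net charge). The following sketch encodes only that rule of thumb: above its pI a protein carries a net negative charge, and below its pI a net positive charge. The function names and the example pI value are hypothetical.

```python
def net_charge_sign(pi, buffer_ph):
    """Return -1, 0, or +1 for the sign of the protein's net charge."""
    if buffer_ph > pi:
        return -1  # deprotonation dominates: net negative charge
    if buffer_ph < pi:
        return 1   # protonation dominates: net positive charge
    return 0       # at the isoelectric point there is no net charge


def suitable_resin(pi, buffer_ph):
    """Suggest which type of ion-exchange resin could bind the protein."""
    sign = net_charge_sign(pi, buffer_ph)
    if sign < 0:
        return "anion-exchange"   # positively charged resin binds negative groups
    if sign > 0:
        return "cation-exchange"  # negatively charged resin binds positive groups
    return "none (no net charge)"


if __name__ == "__main__":
    # Hypothetical protein with pI 5.0 in buffers of different pH.
    for ph in (4.0, 5.0, 7.5):
        print(ph, suitable_resin(5.0, ph))
```

Shifting the buffer pH toward the pI weakens binding, which is one way the bound protein can be recovered from the resin.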

Electrophoresis Because proteins are electrically charged, they move in an electric field. In this process, called electrophoresis, molecules separate from each other because of differences in their net charge. For example, molecules with a positive net charge migrate toward the negatively charged electrode (cathode). Molecules with a net negative charge will move toward the positively charged electrode (anode). Molecules with no net charge will not move at all. Electrophoresis, one of the most widely used techniques in biochemistry, is nearly always carried out by using gels such as polyacrylamide or agarose. The gel, functioning much as it does in gel-filtration chromatography, also acts to separate proteins on the basis of their molecular weight and shape. Consequently, gel electrophoresis is highly effective at separating complex mixtures of proteins or other molecules. Bands resulting from a gel electrophoretic separation may be treated in several ways. Specific bands may be excised from the gel after visualization with ultraviolet light. Each protein-containing slice is then eluted with buffer and prepared for further analysis. Because of its high resolving power, gel electrophoresis is also used to assess the purity of protein samples. Staining gels with a dye such as Coomassie Brilliant Blue is a common method for quickly assessing the success of a purification step. SDS–polyacrylamide gel electrophoresis (SDS-PAGE) is a widely used variation of electrophoresis that can be used to determine molecular weight (Figure 5H). SDS, a negatively charged detergent, binds to the hydrophobic regions of protein molecules, causing the proteins to denature and assume rodlike shapes. Because most molecules bind SDS in a ratio roughly proportional to their molecular weights, during electrophoresis SDS-treated proteins migrate toward the anode (+ pole) only in relation to their molecular weight.
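In practice, a molecular weight is commonly estimated from an SDS-PAGE gel by fitting a straight line to log10(Mr) versus relative mobility for a set of standards and then reading off the unknown. The sketch below is one minimal way to do this. The Mr values are those of the standards in Figure 5H, but the relative mobilities are invented for illustration, and the function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mr(standards, unknown_mobility):
    """Estimate Mr from a list of (Mr, relative mobility) standards."""
    mobilities = [mobility for _, mobility in standards]
    log_mrs = [math.log10(mr) for mr, _ in standards]
    slope, intercept = fit_line(mobilities, log_mrs)
    return 10 ** (slope * unknown_mobility + intercept)


# Mr values from the Figure 5H standards; mobilities are hypothetical.
STANDARDS = [
    (200_000, 0.10), (116_250, 0.22), (97_400, 0.27), (66_200, 0.38),
    (45_000, 0.50), (31_000, 0.63), (21_500, 0.76), (14_400, 0.90),
]

if __name__ == "__main__":
    print(round(estimate_mr(STANDARDS, 0.55)))
```

The estimate is only as good as the linearity of the standard curve, so unknowns that migrate outside the range of the standards should be treated with caution.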

Mass Spectrometry Mass spectrometry (MS) is a powerful and sensitive technique for separating, identifying, and determining the mass of molecules. It exploits differences in their mass-to-charge (m/z) ratios. In a mass spectrometer, ionized molecules flow through a magnetic field (Figure 5I). The magnetic field force deflects the ions depending on their m/z ratios with lighter ions being more deflected from a straight-line path than heavier ions. A detector measures the deflection of each ion. In addition to protein identity and mass determinations, MS is also used to detect bound cofactors and protein modifications. Because MS analysis involves the ionization and vaporization of the substances to

[Figure 5H labels: direction of migration from − to +; lanes: Mr standards and unknown protein. Mr standards: myosin, 200,000; β-galactosidase, 116,250; glycogen phosphorylase b, 97,400; bovine serum albumin, 66,200; ovalbumin, 45,000; carbonic anhydrase, 31,000; soybean trypsin inhibitor, 21,500; lysozyme, 14,400]

FIGURE 5H Gel Electrophoresis. (a) Gel apparatus. The samples are loaded into wells. After an electric field is applied, the proteins move into the gel. (b) Molecules separate and move in the gel as a function of molecular weight and shape.


be investigated, its use in the analysis of thermally unstable macromolecules such as proteins and nucleic acids did not become feasible until methods such as electrospray ionization and matrix-assisted laser desorption ionization (MALDI) had been developed. In electrospray ionization a solution containing the protein of interest is sprayed in the presence of a strong electrical field into a port in the spectrometer. As the protein droplets exit the injection device, typically an ultrafine glass tube, the protein molecules become charged. In MALDI, a laser pulse vaporizes the protein, which is embedded in a solid matrix. Once the sample has been ionized, its molecules, now in the gas phase, are separated according to their individual m/z ratios. A detector within the mass spectrometer produces a peak for each ion. In a computer-assisted process, information concerning each ion’s mass is compared against data for ions of known structure and used to determine the sample’s molecular identity. Protein sequencing analysis makes use of tandem MS (two mass spectrometers linked in series, MS/MS). A protein of interest, often extracted from a band in a gel, is digested by a proteolytic enzyme. Subsequently, the enzyme digest is injected into the first mass spectrometer, which separates the oligopeptides according to their m/z ratios. One by one, each oligopeptide ion is directed into a collision chamber, where it is fragmented by collisions with hot inert gas molecules. Product ions, peptides that differ from each other in size by one amino acid residue, are then sequentially directed into the second mass spectrometer. A computer identifies each peak and automatically determines the amino acid sequence of the peptides. The process is then repeated with oligopeptides derived from digestion with

[Figure 5I labels: (a) glass capillary; sample solution; high voltage; mass analyzer; detector; signal processor; vacuum system. (b) Relative intensity (%) versus m/z (900–1600), with peaks labeled 40+, 50+, and 60+. (c) Relative intensity (%) versus Mr (57,000–58,000), with a peak at 57,712]

FIGURE 5I Mass Spectrometry (a) The principal steps in electrospray ionization. The sample (a protein dissolved in a solvent) is injected via a glass capillary into the ionization chamber. The voltage difference between the electrospray needle and the injection port results in the creation of protein ions. The solvent evaporates during this phase. The ions enter the mass spectrometer, which then measures their m/z ratios. (b) An electrospray mass spectrum showing the m/z ratios for several peaks. (c) A computer analysis of the data showing the molecular mass of the sample protein (Mr = molecular weight).
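The relationship between panels (b) and (c) of Figure 5I can be made concrete with a short calculation. The sketch below assumes positive-ion electrospray in which each charge comes from one added proton, so that m/z = (M + z × 1.00728)/z. It uses the molecular mass (57,712) and charge states (40+, 50+, 60+) shown in the figure; the function names are hypothetical.

```python
PROTON_MASS = 1.00728  # approximate mass of a proton in daltons


def mz(mass, charge):
    """m/z of a molecule of neutral mass `mass` carrying `charge` added protons."""
    return (mass + charge * PROTON_MASS) / charge


def mass_from_adjacent_peaks(mz_low, mz_high):
    """Recover (charge, mass) from two neighboring peaks of the same protein.

    mz_high is the peak with charge z; mz_low is the next peak at lower m/z,
    which carries charge z + 1.
    """
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    return z, z * (mz_high - PROTON_MASS)


if __name__ == "__main__":
    for charge in (40, 50, 60):
        print(charge, round(mz(57_712, charge), 2))
    print(mass_from_adjacent_peaks(mz(57_712, 41), mz(57_712, 40)))
```

The second function shows, in simplified form, why a series of multiply charged peaks can be collapsed into a single molecular mass: any two adjacent peaks determine both the charge and the mass.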



FIGURE 5J Schematic Diagram of X-Ray Crystallography. X-rays are useful in the analysis of biomolecules because their wavelengths are similar in magnitude to the lengths of chemical bonds. Consequently, the resolving power of X-ray crystallography is on the order of interatomic distances.

another enzyme. The computer uses the sequence information derived from both digests to determine the amino acid sequence of the original polypeptide.
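Because successive product ions differ by one amino acid residue, the mass difference between neighboring peaks identifies the residue. The sketch below illustrates the idea with monoisotopic residue masses for a subset of amino acids. It is a simplified illustration, not the algorithm used by any particular instrument's software: it ignores which fragment ion series the ladder represents (and therefore the reading direction), and leucine and isoleucine, which have identical masses, are both reported as "L".

```python
# Monoisotopic residue masses in daltons (subset; "L" also stands for isoleucine).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "D": 115.02694,
    "K": 128.09496, "E": 129.04259, "F": 147.06841, "R": 156.10111,
}


def read_ladder(fragment_masses, tolerance=0.01):
    """Translate a ladder of fragment masses into a residue string.

    Each step between neighboring masses is matched to the closest residue
    mass; steps that match nothing within `tolerance` are reported as "?".
    """
    masses = sorted(fragment_masses)
    residues = []
    for lighter, heavier in zip(masses, masses[1:]):
        step = heavier - lighter
        best = min(RESIDUE_MASS, key=lambda aa: abs(RESIDUE_MASS[aa] - step))
        residues.append(best if abs(RESIDUE_MASS[best] - step) <= tolerance else "?")
    return "".join(residues)


if __name__ == "__main__":
    # Hypothetical ladder built from an arbitrary starting mass of 100.0.
    ladder = [100.0]
    for residue in "GAVLK":
        ladder.append(ladder[-1] + RESIDUE_MASS[residue])
    print(read_ladder(ladder))
```

A gap marked "?" in such a readout is one reason a second digest with a different enzyme is useful: overlapping peptides can cover the positions that a single ladder leaves unresolved.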

Protein Sequence-based Function Prediction Once a polypeptide has been isolated, purified, and sequenced, the next logical step is to determine its function. This endeavor usually begins with a database search of known protein sequences. BLAST (Basic Local Alignment Search Tool) is a computer program (www.ncbi.nlm.nih.gov/blast) that allows fast searches of known sequences for matches to the unknown protein sequence (the query sequence). Protein sequence databases (e.g., UniProt [Universal Protein resource] www.uniprot.org) are sufficiently large that about 50% of sequence comparison queries yield matched sequences that are close enough to infer function.
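The idea of scoring a local match between a query and a database sequence can be illustrated with a small example. The sketch below is not BLAST, which uses a faster heuristic search and amino acid substitution matrices. It instead computes a Smith-Waterman local alignment score with an arbitrary, simplified scoring scheme (match, mismatch, and gap values chosen only for illustration).

```python
def local_alignment_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between two sequences."""
    previous = [0] * (len(subject) + 1)
    best = 0
    for q in query:
        current = [0]
        for j, s in enumerate(subject, start=1):
            diagonal = previous[j - 1] + (match if q == s else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(score)
            best = max(best, score)
        previous = current
    return best


if __name__ == "__main__":
    # Hypothetical sequences sharing the segment GAVLK.
    print(local_alignment_score("MSGAVLKTT", "PPGAVLKPP"))
```

A higher score indicates a longer or better-conserved shared segment; real database searches additionally estimate how likely such a score is to arise by chance.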

X-Ray Crystallography Much of the three-dimensional structural information about proteins was obtained by X-ray crystallography. Because the bond distances in proteins are approximately 0.15 nm, the electromagnetic radiation used to resolve protein structure must have a short wavelength. Visible light (wavelength λ = 400–700 nm) clearly does not have sufficient resolving power for biomolecules. X-rays, however, have very short wavelengths (0.07–0.25 nm). In X-ray crystallography, highly ordered crystalline specimens are exposed to an X-ray beam (Figure 5J). As the X-rays hit the crystal, they are scattered by the atoms in the crystal. The diffraction pattern that results is recorded on charge-coupled device (CCD) detectors. The diffraction patterns are used to construct an electron density map. Because there is no objective lens to recombine the scattered X-rays, the three-dimensional image is reconstructed mathematically. Computer programs now perform these extremely complex and laborious computations. The three-dimensional structure of a polypeptide can also be determined using homology modeling, a method based on the observation that three-dimensional protein structure is more conserved than protein sequence. A structural model is constructed from X-ray diffraction data of one or more homologous proteins in the Protein Data Bank (www.pdb.org).


Chapter Summary
1. Polypeptides are amino acid polymers. Proteins may consist of one or more polypeptide chains.
2. Each amino acid contains a central carbon atom (the α-carbon) to which an amino group, a carboxylate group, a hydrogen atom, and an R group are attached. In addition to serving as the building blocks of proteins, amino acids have several other biological roles. According to their capacity to interact with water, amino acids may be separated into four classes: nonpolar, polar, acidic, and basic.
3. Titration of amino acids and peptides illustrates the effect of pH on their structures. The pH at which a molecule has no net charge is called its isoelectric point.
4. Amino acids undergo several chemical reactions. Two reactions are especially important: peptide bond formation and cysteine oxidation.
5. Proteins have a vast array of functions in living organisms. In addition to serving as structural materials, proteins are involved in metabolic regulation, transport, defense, and catalysis. Some proteins are multifunctional; that is, they have two or more seemingly unrelated functions. Proteins can also be classified into families and superfamilies, according to their sequence similarities as well as their shapes and composition. Fibrous proteins (e.g., collagen) are long, rod-shaped molecules that are insoluble in water and physically tough. Globular proteins (e.g., hemoglobin) are compact, spherical molecules that are usually soluble in water.
6. Biochemists have distinguished four levels of protein structure. Primary structure, the amino acid sequence, is specified by genetic information. As the polypeptide chain folds, local folding patterns constitute the protein’s secondary structure. The overall three-dimensional shape that a polypeptide assumes is called the tertiary structure. Proteins that consist of two or more polypeptides have quaternary structure. Numerous proteins, especially those that participate in eukaryotic regulatory processes, are partially or completely unstructured. Many physical and chemical conditions disrupt protein structure. Denaturing agents include strong acids or bases, reducing agents, organic solvents, detergents, high salt concentrations, heavy metals, temperature changes, and mechanical stress.
7. One of the most important aspects of protein synthesis is the folding of polypeptides into their biologically active conformations. Despite decades of investigation into the physical and chemical properties of polypeptide chains, the mechanism by which a primary sequence dictates the molecule’s final conformation is unresolved. Many proteins require molecular chaperones to fold into their final three-dimensional conformations. Protein misfolding is now known to be an important feature of several human diseases, including Alzheimer’s disease and Huntington’s disease.
8. Fibrous proteins (e.g., α-keratin and collagen), which contain high proportions of α-helices or β-pleated sheets, have structural rather than dynamic roles. Despite their varied functions, most globular proteins have features that allow them to bind to specific ligands or sites on certain macromolecules. These binding events involve conformational changes in the globular protein’s structure.
9. The biological activity of complex multisubunit proteins is often regulated by allosteric interactions in which small ligands bind to the protein. Any change in the protein’s activity is due to changes in the interactions among the protein’s subunits. Effectors can increase or decrease the function of a protein.


