Theory, Computations and Applications in Statistics
Third Edition
Springer Texts in Statistics
Series Editors
G. Allen, Department of Statistics, Rice University, Houston, TX, USA
R. De Veaux, Department of Mathematics and Statistics, Williams College, Williamstown, MA, USA
R. Nugent, Department of Statistics, Carnegie Mellon University, Pittsburgh, PA, USA
Springer Texts in Statistics (STS) includes advanced textbooks from 3rd- to 4th-year undergraduate levels to 1st- to 2nd-year graduate levels. Exercise sets should be included. The series editors are currently Genevera I. Allen, Richard D. De Veaux, and Rebecca Nugent. Stephen Fienberg, George Casella, and Ingram Olkin were editors of the series for many years.
James E. Gentle
Matrix Algebra
Theory, Computations and Applications in Statistics
Third Edition
James E. Gentle
Fairfax, VA, USA
ISSN 1431-875X
ISSN 2197-4136 (electronic)
Springer Texts in Statistics
ISBN 978-3-031-42143-3
ISBN 978-3-031-42144-0 (eBook)
https://doi.org/10.1007/978-3-031-42144-0
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Paper in this product is recyclable.
To María
Preface to Third Edition
This book is different from the several other books on the general topic of “matrix algebra and statistics” or “linear algebra and statistics” in its more extensive coverage of the applications to statistical linear models (mostly in Part II, especially Chap. 9) and the discussions of numerical computations (mostly in Part III). This book also includes numerous examples of R in matrix computations.
The lengths of the chapters vary; I emphasized unity of topics rather than consistency of numbers of pages. Some topics receive attention at multiple places in the book. The repetition is intentional because of different motivations or points of view. The book has extensive cross-references and a large index to facilitate finding discussions in other parts of the book.
The book can serve different audiences. It can be used as a general reference book for an applied mathematician, a self-learner, or someone who just needs a refresher. The extensive index should be useful for such persons.
The book, especially Part I, the first seven chapters, can serve as a text for a mathematically oriented course in linear algebra for advanced undergraduates or graduates.
It could also serve as a textbook for a course in linear models. The primary emphasis would be on Part II, with extensive cross-references to the relevant theory in other chapters.
A course in statistical computing or numerical analysis could be based on the nitty-gritty of computer arithmetic, the computational issues, and general design of algorithms in Part III. The emphasis would be on numerical linear algebra, and there would likely be extensive cross-references to the underlying theory in other chapters.
There are separate sections in several chapters that discuss the R programming system, starting from the basics and going through some of the more advanced features of R. There are also many examples and numerous exercises using R. This could serve as a quick course or a refresher course in R. R is also used for illustration in various places in the text.
As in the revisions for the second edition, in this third edition I have corrected all known remaining typos and other errors; I have (it is hoped) clarified certain passages; I have added some additional material; and I have enhanced the Index. I have also added exercises in some of the chapters.
The overall organization of chapters has been preserved, but some sections have been changed. The two chapters that have been changed most are the original Chap. 4, which is now Chap. 7 and has more coverage of multivariate probability distributions, and Chap. 9, with more material on linear models.
In this edition, I discuss the R software system much more frequently. It is likely that most readers know R, but I give a brief introduction to R in Chap. 1. The most commonly used objects in this book are of class matrix, but I use data.frame for the linear models of Chap. 9. I do not use any of the objects, functions, or operators in the set of Tidy packages, which are very popular nowadays.
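For readers unfamiliar with these classes, the distinction is easy to see at the prompt; this trivial sketch is mine, not an example from the book:

```r
m <- matrix(1:6, nrow = 2)                       # a matrix: one element type, two dimensions
d <- data.frame(x = 1:3, y = c("a", "b", "c"))   # a data.frame: columns may differ in type

class(m)   # "matrix" "array"  (in R 4.0 and later)
class(d)   # "data.frame"
```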
I require the use of R in several exercises and assume the reader/user has, or will develop, at least a moderate level of competence in the use of R. R, like any software system, is learned through usage. Some of the exercises, especially in Part III, also require competence in Fortran or C.
The notation and terms that I use are “standard”; that is, they are (I think) among the most commonly used ones in discussions of matrices and linear algebra, especially by statisticians. Before delving into the book, the reader may want to take a quick look at Appendix A, and then refer to it whenever it is necessary to refresh the recognition of a symbol or term.
In previous editions, I had included answers to selected exercises. In this edition, I have moved the solutions online.
I thank the readers of the first and second editions who informed me of errors, or who made suggestions for improvement. I particularly thank Professor M. Yatov for extensive comments on notation and definitions, as well as for noting several errata and gaps in logic. I also thank Shijia Jin for many general comments and suggestions. Any remaining typos, omissions, and so on are entirely my own responsibility.
I thank John Chambers, Robert Gentleman, and Ross Ihaka for their foundational work on R. I thank the R Core Team and the many package developers and those who maintain the packages that make R more useful.
Again, I thank my wife, María, to whom this book is dedicated, for everything.
I would appreciate receiving suggestions for improvement and notification of errors. Notes on this book, hints and solutions to exercises, and errata are available at
https://mason.gmu.edu/~jgentle/books/matbk/
Fairfax County, VA, USA
James E. Gentle
June 30, 2023
Preface to Second Edition
In this second edition, I have corrected all known typos and other errors; I have (it is hoped) clarified certain passages; I have added some additional material; and I have enhanced the Index.
I have added a few more comments about vectors and matrices with complex elements, although, as before, unless stated otherwise, all vectors and matrices in this book are assumed to have real elements. I have begun to use “det(A)” rather than “|A|” to represent the determinant of A, except in a few cases. I have also expressed some derivatives as the transposes of the expressions I used formerly.
I have put more conscious emphasis on “user-friendliness” in this edition. In a book, user-friendliness is primarily a function of references, both internal and external, and of the index. As an old software designer, I’ve always thought that user-friendliness is very important. To the extent that internal references were present in the first edition, the positive feedback I received from users of that edition about the friendliness of those internal references (“I liked the fact that you said ‘equation (x.xx) on page yy’, instead of just ‘equation (x.xx)’.”) encouraged me to try to make the internal references even more useful. It’s only when you’re “eating your own dogfood” that you become aware of where details matter, and in using the first edition, I realized that the choice of entries in the Index was suboptimal. I have spent significant time in organizing it, and I hope that the user will find the Index to this edition to be very useful. I think that it has been vastly improved over the index in the first edition.
The overall organization of chapters has been preserved, but some sections have been changed. The two chapters that have been changed most are Chaps. 3 and 12. Chapter 3, on the basics of matrices, got about 30 pages longer. It is by far the longest chapter in the book, but I just didn’t see any reasonable way to break it up. In Chap. 12 of the first edition, “Software for Numerical Linear Algebra”, I discussed four software systems or languages: C/C++, Fortran, MATLAB, and R, and did not express any preference for one over another. In this edition, although I occasionally mention various languages and systems, I now limit most of my discussion to Fortran and R.
There are many reasons for my preference for these two systems. R is oriented toward statistical applications. It is open source and freely distributed. As for Fortran versus C/C++, Python, or other programming languages, I agree with the statement by Hanson and Hopkins (2013, page ix), “...Fortran is currently the best computer language for numerical software.” Many people, however, still think of Fortran as the language their elders (or they themselves) used in the 1970s. (On a personal note, Richard Hanson, who passed away recently, was a member of my team that designed the IMSL C Libraries in the mid-1980s. Not only was C much cooler than Fortran at the time, but the ANSI committee working on updating the Fortran language was so fractured by competing interests that approval of the revision was repeatedly delayed. Many numerical analysts who were not concerned with coolness turned to C because it provided dynamic storage allocation and allowed flexible argument lists, and the Fortran constructs could not be agreed upon.)
Language preferences are personal, of course, and there is a strong “coolness factor” in choice of a language. Python is currently one of the coolest languages, but I personally don’t like the language for most of the stuff I do.
Although this book has separate parts on applications in statistics and computational issues as before, statistical applications have informed the choices I made throughout the book, and computational considerations have given direction to most discussions.
I thank the readers of the first edition who informed me of errors. Two people in particular made several meaningful comments and suggestions. Clark Fitzgerald not only identified several typos, he made several broad suggestions about organization and coverage that resulted in an improved text (I think). Andreas Eckner found, in addition to typos, some gaps in my logic, and also suggested better lines of reasoning at some places. (Although I don’t follow an itemized “theorem-proof” format, I try to give reasons for any non-obvious statements I make.) I thank Clark and Andreas especially for their comments. Any remaining typos, omissions, gaps in logic, and so on are entirely my responsibility.
Again, I thank my wife, María, to whom this book is dedicated, for everything.
I used TeX via LaTeX 2ε to write the book. I did all of the typing, programming, etc., myself, so all misteaks (mistakes!) are mine. I would appreciate receiving suggestions for improvement and notification of errors. Notes on this book, including errata, are available at https://mason.gmu.edu/~jgentle/books/matbk/
Fairfax County, VA, USA
James E. Gentle
July 14, 2017
Preface to First Edition
I began this book as an update of Numerical Linear Algebra for Applications in Statistics, published by Springer in 1998. There was a modest amount of new material to add, but I also wanted to supply more of the reasoning behind the facts about vectors and matrices. I had used material from that text in some courses, and I had spent a considerable amount of class time proving assertions made but not proved in that book. As I embarked on this project, the character of the book began to change markedly. In the previous book, I apologized for spending 30 pages on the theory and basic facts of linear algebra before getting on to the main interest: numerical linear algebra. In the present book, discussion of those basic facts takes up over half of the book.
The orientation and perspective of this book remains numerical linear algebra for applications in statistics. Computational considerations inform the narrative. There is an emphasis on the areas of matrix analysis that are important for statisticians, and the kinds of matrices encountered in statistical applications receive special attention.
This book is divided into three parts plus a set of appendices. The three parts correspond generally to the three areas of the book’s subtitle—theory, computations, and applications—although the parts are in a different order, and there is no firm separation of the topics.
Part I, consisting of Chaps. 1 through 6, covers most of the material in linear algebra needed by statisticians. (The word “matrix” in the title of the present book may suggest a somewhat more limited domain than “linear algebra”; but I use the former term only because it seems to be more commonly used by statisticians and is used more or less synonymously with the latter term.)
The first four chapters cover the basics of vectors and matrices, concentrating on topics that are particularly relevant for statistical applications. In Chap. 7, it is assumed that the reader is generally familiar with the basics of partial differentiation of scalar functions. Chapters 4 through 6 begin to take on more of an applications flavor, as well as beginning to give more consideration to computational methods. Although the details of the computations
are not covered in those chapters, the topics addressed are oriented more toward computational algorithms. Chapter 4 covers methods for decomposing matrices into useful factors.
Chapter 5 addresses applications of matrices in setting up and solving linear systems, including overdetermined systems. We should not confuse statistical inference with fitting equations to data, although the latter task is a component of the former activity. In Chap. 5, we address the more mechanical aspects of the problem of fitting equations to data. Applications in statistical data analysis are discussed in Chap. 9. In those applications, we need to make statements (that is, assumptions) about relevant probability distributions.
Chapter 6 discusses methods for extracting eigenvalues and eigenvectors. There are many important details of algorithms for eigenanalysis, but they are beyond the scope of this book. As with other chapters in Part I, Chap. 6 makes some reference to statistical applications, but it focuses on the mathematical and mechanical aspects of the problem.
Although the first part is on “theory”, the presentation is informal; neither definitions nor facts are highlighted by such words as “Definition”, “Theorem”, “Lemma”, and so forth. It is assumed that the reader follows the natural development. Most of the facts have simple proofs, and most proofs are given naturally in the text. No “Proof” and “Q.E.D.” or “∎” appear to indicate beginning and end; again, it is assumed that the reader is engaged in the development. For example, on page 378:
If A is nonsingular and symmetric, then A^{-1} is also symmetric because (A^{-1})^T = (A^T)^{-1} = A^{-1}.
The first part of that sentence could have been stated as a theorem and given a number, and the last part of the sentence could have been introduced as the proof, with reference to some previous theorem that the inverse and transposition operations can be interchanged. (This had already been shown before page 378, in an unnumbered theorem of course!)
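The fact in the example is easy to check numerically in R; the following sketch, with an arbitrarily chosen matrix, is my own illustration, not an example from the book:

```r
# A symmetric, nonsingular matrix, chosen arbitrarily for illustration
A <- matrix(c(4, 1, 1, 3), nrow = 2)
stopifnot(isSymmetric(A))

Ainv <- solve(A)                     # computes A^{-1}
stopifnot(all.equal(t(Ainv), Ainv))  # (A^{-1})^T = A^{-1}: the inverse is symmetric
```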
None of the proofs are original (at least, I don’t think they are), but in most cases I do not know the original source, or even the source where I first saw them. I would guess that many go back to C. F. Gauss. Most, whether they are as old as Gauss or not, have appeared somewhere in the work of C. R. Rao. Some lengthier proofs are only given in outline, but references are given for the details. Very useful sources of details of the proofs are Harville (1997), especially for facts relating to applications in linear models, and Horn and Johnson (1991) for more general topics, especially those relating to stochastic matrices. The older books by Gantmacher (1959) provide extensive coverage and often rather novel proofs. These two volumes have been brought back into print by the American Mathematical Society.
I also sometimes make simple assumptions without stating them explicitly. For example, I may write “for all i” when i is used as an index to a vector. I hope it is clear that “for all i” means only “for i that correspond to indices of the vector”. Also, my use of an expression generally implies existence. For
example, if “AB” is used to represent a matrix product, it implies that “A and B are conformable for the multiplication AB”. Occasionally I remind the reader that I am taking such shortcuts.
The material in Part I, as in the entire book, was built up recursively. In the first pass, I began with some definitions and followed those with some facts that are useful in applications. In the second pass, I went back and added definitions and additional facts that lead to the results stated in the first pass. The supporting material was added as close to the point where it was needed as practical and as necessary to form a logical flow. Facts motivated by additional applications were also included in the second pass. In subsequent passes, I continued to add supporting material as necessary and to address the linear algebra for additional areas of application. I sought a bare-bones presentation that gets across what I considered to be the theory necessary for most applications in the data sciences. The material chosen for inclusion is motivated by applications.
Throughout the book, some attention is given to numerical methods for computing the various quantities discussed. This is in keeping with my belief that statistical computing should be dispersed throughout the statistics curriculum and statistical literature generally. Thus, unlike in other books on matrix “theory”, I describe the “modified” Gram-Schmidt method, rather than just the “classical” GS. (I put “modified” and “classical” in quotes because, to me, GS is MGS. History is interesting, but in computational matters, I do not care to dwell on the methods of the past.) Also, condition numbers of matrices are introduced in the “theory” part of the book, rather than just in the “computational” part. Condition numbers also relate to fundamental properties of the model and the data.
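To make the method concrete, here is a minimal sketch of the modified Gram-Schmidt idea in R; the function name and the test matrix are my own choices for illustration, not code from the book:

```r
# Modified Gram-Schmidt: orthonormalize the columns of X.
# After normalizing column k, the remaining columns are immediately
# orthogonalized against it, one vector at a time; this is numerically
# more stable than the classical method of projecting each column
# against all previous ones at once.
mgs <- function(X) {
  n <- ncol(X)
  Q <- X
  for (k in 1:n) {
    Q[, k] <- Q[, k] / sqrt(sum(Q[, k]^2))               # normalize column k
    if (k < n) {
      for (j in (k + 1):n) {
        Q[, j] <- Q[, j] - sum(Q[, k] * Q[, j]) * Q[, k] # remove the q_k component
      }
    }
  }
  Q
}

X <- matrix(c(1, 1, 0, 1, 0, 1, 0, 1, 1), nrow = 3)  # a full-rank test matrix
Q <- mgs(X)
round(t(Q) %*% Q, 10)   # (close to) the 3 x 3 identity
```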
The difference between an expression and a computing method is emphasized. For example, often we may write the solution to the linear system Ax = b as A^{-1}b. Although this is the solution (so long as A is square and of full rank), solving the linear system does not involve computing A^{-1}. We may write A^{-1}b, but we know we can compute the solution without inverting the matrix.
“This is an instance of a principle that we will encounter repeatedly: the form of a mathematical expression and the way the expression should be evaluated in actual practice may be quite different.” (The statement in quotes appears word for word in several places in the book.)
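In R, for instance, the two forms of the computation are solve(A, b), which solves the system directly, and solve(A) %*% b, which forms the inverse explicitly. This small sketch (my own, with an arbitrary matrix) shows that they agree as expressions while differing as computations:

```r
set.seed(1)
A <- matrix(rnorm(9), nrow = 3)   # an arbitrary (almost surely nonsingular) matrix
b <- c(1, 2, 3)

x1 <- solve(A, b)       # solves Ax = b by factorization; no inverse is ever formed
x2 <- solve(A) %*% b    # computes A^{-1} and then multiplies: the same answer,
                        # but more work and generally less accuracy

stopifnot(all.equal(x1, drop(x2)))
```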
Standard textbooks on “matrices for statistical applications” emphasize their uses in the analysis of traditional linear models. This is a large and important field in which real matrices are of interest, and the important kinds of real matrices include symmetric, positive definite, projection, and generalized inverse matrices. This area of application also motivates much of the discussion in this book. In other areas of statistics, however, there are different matrices of interest, including similarity and dissimilarity matrices, stochastic matrices, rotation matrices, and matrices arising from graph-theoretic approaches
to data analysis. These matrices have applications in clustering, data mining, stochastic processes, and graphics; therefore, I describe these matrices and their special properties. I also discuss the geometry of matrix algebra. This provides a better intuition of the operations. Homogeneous coordinates and special operations in IR^3 are covered because of their geometrical applications in statistical graphics.
Part II addresses selected applications in data analysis. Applications are referred to frequently in Part I, and of course, the choice of topics for coverage was motivated by applications. The difference in Part II is in its orientation.
Only “selected” applications in data analysis are addressed; there are applications of matrix algebra in almost all areas of statistics, including the theory of estimation, which is touched upon in Chap. 7 of Part I. Certain types of matrices are more common in statistics, and Chap. 8 discusses in more detail some of the important types of matrices that arise in data analysis and statistical modeling. Chapter 9 addresses selected applications in data analysis. The material of Chap. 9 has no obvious definition that could be covered in a single chapter (or a single part, or even a single book), so I have chosen to discuss briefly a wide range of areas. Most of the sections and even subsections of Chap. 9 are on topics to which entire books are devoted; however, I do not believe that any single book addresses all of them.
Part III covers some of the important details of numerical computations, with an emphasis on those for linear algebra. I believe these topics constitute the most important material for an introductory course in numerical analysis for statisticians and should be covered in every such course.
Except for specific computational techniques for optimization, random number generation, and perhaps symbolic computation, Part III provides the basic material for a course in statistical computing. All statisticians should have a passing familiarity with the principles.
Chapter 10 provides some basic information on how data are stored and manipulated in a computer. Some of this material is rather tedious, but it is important to have a general understanding of computer arithmetic before considering computations for linear algebra. Some readers may skip or just skim Chap. 10, but the reader should be aware that the way the computer stores numbers and performs computations has far-reaching consequences. Computer arithmetic differs from ordinary arithmetic in many ways; for example, computer arithmetic lacks associativity of addition and multiplication, and series often converge even when they are not supposed to. (On the computer, a straightforward evaluation of Σ_{x=1}^∞ 1/x converges!)
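The lack of associativity of addition is easy to demonstrate in R (double-precision arithmetic); this one-line sketch is my own illustration:

```r
# In exact arithmetic both expressions equal 1; in double-precision
# floating-point, -1e16 + 1 rounds back to -1e16 (the spacing between
# adjacent doubles near 1e16 is 2), so the second sum is 0.
x <- 1e16; y <- -1e16; z <- 1
(x + y) + z   # 1
x + (y + z)   # 0
```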
I emphasize the differences between the abstract number system IR, called the reals, and the computer number system IF, the floating-point numbers unfortunately also often called “real”. Table 10.3 on page 557 summarizes some of these differences. All statisticians should be aware of the effects of these differences. I also discuss the differences between ZZ, the abstract number system called the integers, and the computer number system II, the fixed-
point numbers. (Appendix A provides definitions for this and other notation that I use.)
Chapter 10 also covers some of the fundamentals of algorithms, such as iterations, recursion, and convergence. It also discusses software development. Software issues are revisited in Chap. 12.
While Chap. 10 deals with general issues in numerical analysis, Chap. 11 addresses specific issues in numerical methods for computations in linear algebra.
Chapter 12 provides a brief introduction to software available for computations with linear systems. Some specific systems mentioned include the IMSL™ libraries for Fortran and C, Octave or MATLAB® (or Matlab®), and R or S-PLUS® (or S-Plus®). All of these systems are easy to use, and the best way to learn them is to begin using them for simple problems. I do not use any particular software system in the book, but in some exercises, and particularly in Part III, I do assume the ability to program in either Fortran or C and the availability of either R or S-Plus, Octave or MATLAB, and Maple® or Mathematica®. My own preferences for software systems are Fortran and R, and occasionally these preferences manifest themselves in the text.
Appendix A collects the notation used in this book. It is generally “standard” notation, but one thing the reader must become accustomed to is the lack of notational distinction between a vector and a scalar. All vectors are “column” vectors, although I usually write them as horizontal lists of their elements. (Whether vectors are “row” vectors or “column” vectors is generally only relevant for how we write expressions involving vector/matrix multiplication or partitions of matrices.)
I write algorithms in various ways, sometimes in a form that looks similar to Fortran or C and sometimes as a list of numbered steps. I believe all of the descriptions used are straightforward and unambiguous.
This book could serve as a basic reference either for courses in statistical computing or for courses in linear models or multivariate analysis. When the book is used as a reference, rather than looking for “Definition” or “Theorem”, the user should look for items set off with bullets or look for numbered equations, or else should use the Index or Appendix A, beginning on page 653.
The prerequisites for this text are minimal. Obviously some background in mathematics is necessary. Some background in statistics or data analysis and some level of scientific computer literacy are also required. References to rather advanced mathematical topics are made in a number of places in the text. To some extent this is because many sections evolved from class notes that I developed for various courses that I have taught. All of these courses were at the graduate level in the computational and statistical sciences, but they have had wide ranges in mathematical level. I have carefully reread the sections that refer to groups, fields, measure theory, and so on, and am convinced that if the reader does not know much about these topics, the material is still understandable, but if the reader is familiar with these topics, the references add to that reader’s appreciation of the material. In many places, I refer to
computer programming, and some of the exercises require some programming. A careful coverage of Part III requires background in numerical programming.
In regard to the use of the book as a text, most of the book evolved in one way or another for my own use in the classroom. I must quickly admit, however, that I have never used this whole book as a text for any single course. I have used Part III in the form of printed notes as the primary text for a course in the “foundations of computational science” taken by graduate students in the natural sciences (including a few statistics students, but dominated by physics students). I have provided several sections from Parts I and II in online PDF files as supplementary material for a two-semester course in mathematical statistics at the “baby measure theory” level (using Shao, 2003). Likewise, for my courses in computational statistics and statistical visualization, I have provided many sections, either as supplementary material or as the primary text, in online PDF files or printed notes. I have not taught a regular “applied statistics” course in almost 30 years, but if I did, I am sure that I would draw heavily from Parts I and II for courses in regression or multivariate analysis. If I ever taught a course in “matrices for statistics” (I don’t even know if such courses exist), this book would be my primary text because I think it covers most of the things statisticians need to know about matrix theory and computations.
Some exercises are Monte Carlo studies. I do not discuss Monte Carlo methods in this text, so the reader lacking background in that area may need to consult another reference in order to work those exercises. The exercises should be considered an integral part of the book. For some exercises, the required software can be obtained from netlib. Exercises in any of the chapters, not just in Part III, may require computations or computer programming.
Penultimately, I must make some statement about the relationship of this book to some other books on similar topics. Much important statistical theory and many methods make use of matrix theory, and many statisticians have contributed to the advancement of matrix theory from its very early days. Widely used books with derivatives of the words “statistics” and “matrices/linear-algebra” in their titles include Basilevsky (1983), Graybill (1983), Harville (1997), Schott (2004), and Searle (1982). All of these are useful books. The computational orientation of this book is probably the main difference between it and these other books. Also, some of these other books only address topics of use in linear models, whereas this book also discusses matrices useful in graph theory, stochastic processes, and other areas of application. (If the applications are only in linear models, most matrices of interest are symmetric, and all eigenvalues can be considered to be real.) Other differences among all of these books, of course, involve the authors’ choices of secondary topics and the ordering of the presentation.
Acknowledgments
I thank John Kimmel of Springer for his encouragement and advice on this book and other books on which he has worked with me. I especially thank Ken Berk for his extensive and insightful comments on a draft of this book. I thank my student Li Li for reading through various drafts of some of the chapters and pointing out typos or making helpful suggestions. I thank the anonymous reviewers of this edition for their comments and suggestions. I also thank the many readers of my previous book on numerical linear algebra who informed me of errors and who otherwise provided comments or suggestions for improving the exposition. Whatever strengths this book may have can be attributed in large part to these people, named or otherwise. The weaknesses can only be attributed to my own ignorance or hardheadedness.
I thank my wife, María, to whom this book is dedicated, for everything.
I used TeX via LaTeX 2ε to write the book. I did all of the typing, programming, etc., myself, so all misteaks are mine. I would appreciate receiving suggestions for improvement and notification of errors.
Fairfax County, VA, USA
James E. Gentle
June 12, 2007
3.6.2 Null Space: The Orthogonal Complement
3.7 Generalized Inverses
3.7.2 The Moore–Penrose Inverse
3.7.3 Generalized Inverses of Products and Sums of Matrices
The Project Gutenberg eBook of Stars and atoms
Title: Stars and atoms
Author: Sir Arthur Stanley Eddington
Release date: April 8, 2024 [eBook #73362]
Language: English
Original publication: New Haven: Yale University Press, 1927
Credits: Laura Natal Rodrigues (Images generously made available by Hathi Trust Digital Library.)
STARS AND ATOMS
Fig. 1. THE SUN. Hydrogen photograph
A. S. EDDINGTON
M.A., D.Sc., LL.D., F.R.S., Plumian Professor of Astronomy in the University of Cambridge
NEW HAVEN: YALE UNIVERSITY PRESS
LONDON: HUMPHREY MILFORD, OXFORD UNIVERSITY PRESS 1927
Ich häufe ungeheure Zahlen, Gebürge Millionen auf, Ich setze Zeit auf Zeit und Welt auf Welt zu Hauf. (I heap up immense numbers, pile mountains of millions; I set time upon time and world upon world in a heap.)
A. VON HALLER.
PREFACE
‘STARS and Atoms’ was the title of an Evening Discourse given at the meeting of the British Association in Oxford in August 1926. In adapting it for publication the restrictions of a time limit are removed, and accordingly it appears in this book as three lectures. Earlier in the year I had given a course of three lectures in King’s College, London, on the same topics; these have been combined with the Oxford lecture and are the origin of most of the additions.
A full account of the subject, including the mathematical theory, is given in my larger book, The Internal Constitution of the Stars (Camb. Univ. Press, 1926). Here I only aim at exposition of some of the leading ideas and results.
The advance in our knowledge of atoms and radiation has led to many interesting developments in astronomy; and reciprocally the study of matter in the extreme conditions prevailing in stars and nebulae has played no mean part in the progress of atomic physics. This is the general theme of the lectures. Selection has been made of the advances and discoveries which admit of comparatively elementary exposition; but it is often necessary to demand from the reader a concentration of thought which, it is hoped, will be repaid by the fascination of the subject. The treatment was meant to be discursive rather than systematic; but habits of mind refuse to be suppressed entirely and a certain amount of system has crept in. In these problems where our thought fluctuates continually from the excessively great to the excessively small, from the star to the atom and back to the star, the story of progress is rich in variety; if it has not lost too much in the telling, it should convey in full measure the delights—and the troubles—of scientific investigation in all its phases.
Temperatures are expressed throughout in degrees Centigrade. The English billion, trillion, &c. (10¹², 10¹⁸, &c.) are used.
A. S. E.
Further Remarks on the Companion of Sirius
LIST OF ILLUSTRATIONS
FIG.
1. The Sun. Hydrogen Spectroheliogram. (J. Evershed)
2. Solar Vortices. Hydrogen Spectroheliogram. (Mount Wilson Observatory)
3. Tracks of Alpha Particles (helium atoms). (C. T. R. Wilson)
4. Tracks of Beta Particles (electrons). (C. T. R. Wilson)
5. Ionization by X-rays. (C. T. R. Wilson)
6. Ions produced by Collision of a Beta particle. (C. T. R. Wilson)
7. The Mass-luminosity Curve.
8. The Ring Nebula in Lyra. Slitless Spectrogram. (W. H. Wright)
9. Flash Spectrum of Chromosphere showing Head of the Balmer Series. (British Eclipse Expedition, 14 Jan. 1926)
10. Solar Prominence. (British Eclipse Expedition, 29 May 1919)
11. Star Cluster ω Centauri. (Cape Observatory)
LECTURE I THE INTERIOR OF A STAR
THE sun belongs to a system containing some 3,000 million stars. The stars are globes comparable in size with the sun, that is to say, of the order of a million miles in diameter. The space for their accommodation is on the most lavish scale. Imagine thirty cricket balls roaming the whole interior of the earth; the stars roaming the heavens are just as little crowded and run as little risk of collision as the cricket balls. We marvel at the grandeur of the stellar system. But this probably is not the limit. Evidence is growing that the spiral nebulae are ‘island universes’ outside our own stellar system. It may well be that our survey covers only one unit of a vaster organization.
A drop of water contains several thousand million million million atoms. Each atom is about one hundred-millionth of an inch in diameter. Here we marvel at the minute delicacy of the workmanship. But this is not the limit. Within the atom are the much smaller electrons pursuing orbits, like planets round the sun, in a space which relatively to their size is no less roomy than the solar system.
Nearly midway in scale between the atom and the star there is another structure no less marvellous—the human body. Man is slightly nearer to the atom than to the star. About 10²⁷ atoms build his body; about 10²⁸ human bodies constitute enough material to build a star.
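Eddington's two exponents can be checked against rough modern values. The figures below (a 70 kg body, a mean atomic mass of about 7 atomic mass units, the sun as a typical star) are my own illustrative assumptions, not numbers from the lecture; the point is only that the orders of magnitude come out as he states.

```python
import math

# Rough modern constants (illustrative assumptions, not Eddington's figures)
ATOMIC_MASS_UNIT_KG = 1.66e-27  # one atomic mass unit in kilograms
MEAN_ATOM_MASS_U = 7.0          # rough mean atomic mass in a human body (mostly H, O, C)
HUMAN_MASS_KG = 70.0            # a typical adult
SUN_MASS_KG = 1.99e30           # mass of the sun, taken as a typical star

# Atoms per human body: total mass divided by mean mass per atom
atoms_per_body = HUMAN_MASS_KG / (MEAN_ATOM_MASS_U * ATOMIC_MASS_UNIT_KG)

# Human bodies' worth of material in one star
bodies_per_star = SUN_MASS_KG / HUMAN_MASS_KG

print(f"atoms per body  ~ 10^{math.log10(atoms_per_body):.1f}")
print(f"bodies per star ~ 10^{math.log10(bodies_per_star):.1f}")
# Both land near Eddington's 10^27 and 10^28, and the first exponent is
# the smaller of the two: man is indeed slightly nearer to the atom.
```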
From his central position man can survey the grandest works of Nature with the astronomer, or the minutest works with the physicist. To-night I ask you to look both ways. For the road to a knowledge of the stars leads through the atom; and important knowledge of the atom has been reached through the stars.
The star most familiar to us is the sun. Astronomically speaking, it is close at hand. We can measure its size, weigh it, take its temperature, and so on, more easily than the other stars. We can take photographs of its surface, whereas the other stars are so distant that the largest telescope in the world does not magnify them into anything more than points of light. Figs. 1 and 2[1] show recent pictures of the sun’s surface. No doubt the stars in general would show similar features if they were near enough to be examined.
I must first explain that these are not the ordinary photographs. Simple photographs show very well the dark blotches called sunspots, but otherwise they are rather flat and uninteresting. The pictures here shown were taken with a spectroheliograph, an instrument which looks out for light of just one variety (wave-length) and ignores all the rest. The ultimate effect of this selection is that the instrument sorts out the different levels in the sun’s atmosphere and shows what is going on at one level, instead of giving a single blurred impression of all levels superposed. Fig. 2, which refers to a high level, gives a wonderful picture of whirlwinds and commotion. I think that the solar meteorologists would be likely to describe these vortices in terms not unfamiliar to us—‘A deep depression with secondaries is approaching, and a renewal of unsettled conditions is probable.’ However that may be, there is always one safe weather forecast on the sun; cyclone or anticyclone, the temperature will be very warm—about 6,000° in fact.
But just now I do not wish to linger over the surface layers or atmosphere of the sun. A great many new and interesting discoveries have recently been made in this region, and much of the new knowledge is very germane to my subject of ‘Stars and Atoms’. But personally I am more at home underneath the surface, and I am in a hurry to dive below. Therefore with this brief glance at the scenery that we pass we shall plunge into the deep interior—where the eye cannot penetrate, but where it is yet possible by scientific reasoning to learn a great deal about the conditions.
Fig. 2. THE SUN. Hydrogen photograph
Temperature in the Interior
By mathematical methods it is possible to work out how fast the pressure increases as we go down into the sun, and how fast the temperature must increase to withstand the pressure. The architect can work out the stresses inside the piers of his building; he does not need to bore holes in them. Likewise the astronomer can work out the stress or pressure at points inside the sun without boring a hole. Perhaps it is more surprising that the temperature can be found by pure calculation. It is natural that you should feel rather sceptical about our claim that we know how hot it is in the very middle of a star —and you may be still more sceptical when I divulge the actual figures! Therefore I had better describe the method as far as I can. I shall not attempt to go into detail, but I hope to show you that there is a clue which might be followed up by appropriate mathematical methods.
I must premise that the heat of a gas is chiefly the energy of motion of its particles hastening in all directions and tending to scatter apart. It is this which gives a gas its elasticity or expansive force; the elasticity of a gas is well known to every one through its practical application in a pneumatic tyre. Now imagine yourself at some point deep down in the star where you can look upwards towards the surface or downwards towards the centre. Wherever you are, a certain condition of balance must be reached; on the one hand there is the weight of all the layers above you pressing downwards and trying to squeeze closer the gas beneath; on the other hand there is the elasticity of the gas below you trying to expand and force the superincumbent layers outwards. Since neither one thing nor the other happens and the star remains practically unchanged for hundreds of years, we must infer that the two tendencies just balance. At each point the elasticity of the gas must be just enough to balance the weight of the layers above; and since it is the heat which furnishes the elasticity, this requirement settles how much heat the gas must have. And so we find the degree of heat or temperature at each point.
The same thing can be expressed a little differently. As before, fix attention on a certain point in a star and consider how the matter above it is supported. If it were not supported it would fall to the centre under the attractive force of gravitation. The support is given by a succession of minute blows delivered by the particles underneath; we have seen that their heat energy causes them to move in all directions, and they keep on striking the matter above. Each blow gives a slight boost upwards, and the whole succession of blows supports the upper material in shuttlecock fashion. (This process is not confined to the stars; for instance, it is in this way that a motor car is supported by its tyres.) An increase of temperature would mean an increase of activity of the particles, and therefore an increase in the rapidity and strength of the blows. Evidently we have to assign a temperature such that the sum total of the blows is neither too great nor too small to keep the upper material steadily supported. That in principle is our method of calculating the temperature.
One obvious difficulty arises. The whole supporting force will depend not only on the activity of the particles (temperature) but also on the number of them (density). Initially we do not know the density of the matter at an arbitrary point deep within the sun. It is in this connexion that the ingenuity of the mathematician is required. He has a definite amount of matter to play with, viz. the known mass of the sun; so the more he uses in one part of the globe the less he will have to spare for other parts. He might say to himself, ‘I do not want to exaggerate the temperature, so I will see if I can manage without going beyond 10,000,000°.’ That sets a limit to the activity to be ascribed to each particle; therefore when the mathematician reaches a great depth in the sun and accordingly has a heavy weight of upper material to sustain, his only resource is to use large numbers of particles to give the required total impulse. He will then find that he has used up all his particles too fast, and has nothing left to fill up the centre. Of course his structure, supported on nothing, would come tumbling down into the hollow. In that way we can prove that it is impossible to build up a permanent star of the dimensions of the sun without introducing an activity or temperature exceeding 10,000,000°. The mathematician can go a step beyond this; instead of merely finding a lower limit, he can ascertain what must be nearly the true temperature distribution by taking into account the fact that the temperature must not be ‘patchy’. Heat flows from one place to another, and any patchiness would soon be evened out in an actual star. I will leave the mathematician to deal more thoroughly with these considerations, which belong to the following up of the clue; I am content if I have shown you that there is an opening for an attack on the problem.
This kind of investigation was started more than fifty years ago. It has been gradually developed and corrected, until now we believe that the results must be nearly right—that we really know how hot it is inside a star.
I mentioned just now a temperature of 6,000°; that was the temperature near the surface—the region which we actually see. There is no serious difficulty in determining this surface temperature by observation; in fact the same method is often used commercially for finding the temperature of a furnace from the outside. It is for the deep regions out of sight that the highly theoretical method of calculation is required. This 6,000° is only the marginal heat of the great solar furnace giving no idea of the terrific intensity within. Going down into the interior the temperature rises rapidly to above a million degrees, and goes on increasing until at the sun’s centre it is about 40,000,000°.
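The balance argument sketched above can be compressed into a one-line dimensional estimate: set the thermal energy kT of a particle against the gravitational energy GMm/R it must withstand for the sun as a whole. This is only a back-of-envelope sketch of the reasoning, not Eddington's detailed integration, and the constants are rough modern values I have supplied.

```python
# Dimensional estimate of the sun's interior temperature from hydrostatic
# balance: k * T ~ G * M * m / R for a hydrogen particle.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.381e-23      # Boltzmann constant, J/K
M_SUN = 1.99e30      # solar mass, kg
R_SUN = 6.96e8       # solar radius, m
M_PROTON = 1.67e-27  # mass of a hydrogen particle, kg

T_interior = G * M_SUN * M_PROTON / (K_B * R_SUN)
print(f"characteristic interior temperature ~ {T_interior:.1e} K")
# The estimate lands in the tens of millions of degrees, the same order
# as the figure quoted in the lecture (Eddington's full calculation gave
# about 40,000,000 degrees; modern solar models give about 15,000,000).
```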
Do not imagine that 40,000,000° is a degree of heat so extreme that temperature has become meaningless. These stellar temperatures are to be taken quite literally. Heat is the energy of motion of the atoms or molecules of a substance, and temperature which indicates the degree of heat is a way of stating how fast these atoms or molecules are moving. For example, at the temperature of this room the molecules of air are rushing about with an average speed of 500 yards a second; if we heated it up to 40,000,000° the speed would be just over 100 miles a second. That is nothing to be alarmed about; the astronomer is quite accustomed to speeds like that. The velocities of the stars, or of the meteors entering the earth’s atmosphere, are usually between 10 and 100 miles a second. The velocity of the earth travelling round the sun is 20 miles a second. So that for an astronomer this is the most ordinary degree of speed that could be suggested, and he naturally considers 40,000,000° a very comfortable sort of condition to deal with. And if the astronomer is not frightened by a speed of 100 miles a second, the experimental physicist is quite contemptuous of it; for he is used to handling atoms shot off from radium and similar substances with speeds of 10,000 miles a second. Accustomed as he is to watching these express atoms and testing what they are capable of doing, the physicist considers the jog-trot atoms of the stars very commonplace.
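The quoted speeds follow from kinetic theory: a particle's mean energy of motion is (3/2)kT, so a typical speed scales as the square root of 3kT/m. The sketch below uses the root-mean-square speed (Eddington's "average speed" is slightly lower) and a mean air-molecule mass I have assumed.

```python
import math

K_B = 1.381e-23            # Boltzmann constant, J/K
AIR_MOLECULE_KG = 4.8e-26  # assumed mean mass of an air molecule (~29 atomic mass units)

def rms_speed(temp_kelvin, mass_kg):
    """Root-mean-square speed from (1/2) m v^2 = (3/2) k T."""
    return math.sqrt(3 * K_B * temp_kelvin / mass_kg)

v_room = rms_speed(293, AIR_MOLECULE_KG)    # room temperature
v_star = rms_speed(4.0e7, AIR_MOLECULE_KG)  # heated to 40,000,000 degrees

# v_room comes out near 500 m/s, close to the lecture's '500 yards a
# second'; v_star is a little over 100 miles a second, as stated.
print(f"room temperature: {v_room:.0f} m/s")
print(f"40,000,000 deg:   {v_star / 1609:.0f} miles/s")
```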
Besides the atoms rushing to and fro in all directions we have in the interior of a star great quantities of ether waves also rushing in all directions. Ether waves are called by different names according to their wave-length. The longest are the Hertzian waves used in broadcasting; then come the infra-red heat waves; next come waves of ordinary visible light; then ultra-violet photographic or chemical rays; then X-rays; then Gamma rays emitted by radio-active substances. Probably the shortest of all are the rays constituting the very penetrating radiation found in our atmosphere, which according to the investigations of Kohlhörster and Millikan are believed to reach us from interstellar space. These are all fundamentally the same but correspond to different octaves. The eye is attuned to only one octave, so that most of them are invisible; but essentially they are of the same nature as visible light.
The ether waves inside a star belong to the division called X-rays. They are the same as the X-rays produced artificially in an X-ray tube. On the average they are ‘softer’ (i.e. longer) than the X-rays used in hospitals, but not softer than some of those used in laboratory experiments. Thus we have in the interior of a star something familiar and extensively studied in the laboratory.
Besides the atoms and ether waves there is a third population to join in the dance. There are multitudes of free electrons. The electron is the lightest thing known, weighing no more than ¹⁄₁₈₄₀ of the lightest atom. It is simply a charge of negative electricity wandering about alone. An atom consists of a heavy nucleus which is usually surrounded by a girdle of electrons. It is often compared to a miniature solar system, and the comparison gives a proper idea of the emptiness of an atom. The nucleus is compared to the sun, and the electrons to the planets. Each kind of atom—each chemical element—has a different quorum of planet electrons. Our own solar system with eight planets might be compared especially with the atom of oxygen which has eight circulating electrons. In terrestrial physics we usually regard the girdle or crinoline of electrons as an essential part of the atom because we rarely meet with atoms incompletely dressed; when we do meet with an atom which has lost one or two electrons from its system, we call it an ‘ion’. But in the interior of a star, owing to the great commotion going on, it would be absurd to exact such a meticulous standard of attire. All our atoms have lost a considerable proportion of their planet electrons and are therefore ions according to the strict nomenclature.