THE GALAXY
Notes for Lecture Courses ASTM002 and MAS430
Queen Mary University of London
Bryn Jones Prasenjit Saha
January – March 2006
Chapter 1
Introducing Galaxies

Galaxies are a slightly difficult topic in astronomy at present. Although a great amount is known about them, galaxies are much less well understood than, say, stars. There remain many problems relating to their formation and evolution, and even to aspects of their structure. Fortunately, extragalactic research is currently very active and our understanding is changing, and improving, noticeably each year.

One reason that galaxies are difficult to understand is that they are made of three very different entities: stars, an interstellar medium (gas with some dust, often abbreviated as ISM), and dark matter. There is an interplay between the stars and gas, with stars forming out of the gas and with gas being ejected back into the interstellar medium from evolved stars. The dark matter affects the other material through its strong gravitational potential, but very little is known about it directly. We shall study each of these in this course, and to a small extent how they influence each other.

An additional factor complicating an understanding of galaxies is that their evolution is strongly affected by their environment. The gravitational effects of other galaxies can be important. Galaxies can sometimes interact and even merge. Intergalactic gas, for example that found in clusters of galaxies, can be important. These various factors, and the feedback from one to another, mean that a number of important problems remain to be solved in extragalactic science, but this gives the subject its current vigour.

One important issue of terminology needs to be clarified at the outset. The word “Galaxy” with a capital ‘G’ refers to our own Galaxy, i.e. the Milky Way Galaxy, as does the word “Galactic”. In contrast, “galaxy”, “galaxies” and “galactic” with a lower-case ‘g’ refer to other galaxies and to galaxies in general.
1.1
Galaxy Types
Some galaxies (more of them in earlier epochs) have active nuclei which can vastly outshine the starlight. We shall not go into that here: we shall confine ourselves to normal galaxies and ignore active galaxies. There are three broad categories of normal galaxies:
• elliptical galaxies (denoted E);
• disc galaxies, i.e. spiral (S) and lenticular (S0) galaxies;
• irregular galaxies (I or Irr).
Classifications often include peculiar galaxies which have unusual shapes. These are mostly the result of interactions and mergers. These classifications are based on the shapes and structures of galaxies, i.e. on the morphology. They are therefore known as morphological types.
1.2
Disc Galaxies
Disc galaxies have prominent flattened discs. They have masses of $10^{9}\,M_\odot$ to $10^{12}\,M_\odot$. They include spiral galaxies and S0 (or lenticular) galaxies. Spirals are gas rich and this gas takes part in star formation. S0s on the other hand have very little gas and no star formation, while their discs are more diffuse than those of spirals.
1.2.1
Spiral galaxies
Spiral galaxies have much gas within their discs, plus some embedded dust, which amounts to 1-20% of their visible mass (the rest of the visible mass is stars). This gas shows active star formation. The discs contain stars having a range of ages as a result of this continuing star formation. Spiral arms are apparent in the discs, defined by young, luminous stars and by H II regions.

Spiral galaxies have a central bulge component containing mostly old stars. These bulges superficially resemble small elliptical galaxies. The disc and bulge are embedded in a fainter halo component composed of stars and globular clusters. The stars of this stellar halo are very old and metal-poor (deficient in chemical elements other than hydrogen and helium).

Some spirals have bars within their discs. These are called barred spirals and are designated type SB. Non-barred, or normal, spirals are designated type S or SA.

The spectra are dominated by F- and G-type stars but also show prominent emission lines from the gas: the spectra show the absorption lines from the stars with the emission lines superimposed.

The spiral disc is highly flattened and the gas is concentrated close to the plane. The spiral arms are associated with regions of enhanced gas density that are usually caused by density waves. Discs rotate, with the stars and gas in near-circular orbits close to the plane, having circular velocities $\sim 200$ to $250$ km s$^{-1}$. In contrast, the halo stars have randomly oriented orbits with speeds $\sim 250$ km s$^{-1}$.

Spiral galaxies are plentiful away from regions of high galaxy density (away from the cores of galaxy clusters).

All disc galaxies seem to be embedded in much larger dark haloes; the ratio of total mass to visible stellar mass is $\simeq 5$, but we do not really have a good mass estimate for any disc galaxy.
The disc’s surface brightness $I$ tends to follow a roughly exponential decline with radial distance $R$ from the centre, i.e.,
\[ I(R) = I_0 \exp(-R/R_0) , \]
where $I_0 \sim 10^{2}\,L_\odot\,{\rm pc}^{-2}$ is the central surface brightness, and $R_0$ is a scale length for the decline in brightness. The scale length $R_0$ is $\simeq 3.5$ kpc for the Milky Way.

There are clear trends in the properties of spirals. Those containing the smallest amounts of gas are called subtype Sa and tend to have large, bright bulges compared to the discs, tightly wound spiral arms, and relatively red colours. More gas-rich spirals, such as subtype Sc, have small bulges, open spiral arms and blue colours. There is a gradual variation in these properties from subtype Sa through Sab, Sb, Sbc, to Sc. Some classification schemes include more extreme subtypes Sd and Sm. Subtypes of barred spirals are denoted SBa, SBab, SBb, SBbc, SBc, ...

Because the spiral arms mark regions of recent star formation, the stars in the arms are young and blue in colour. Therefore, spiral arms are most prominent when a galaxy is observed in blue or ultraviolet light, and less prominent when observed in the red or infrared. Figure 1.5 shows three images of a spiral galaxy recorded through blue, red and infrared filters only. The spiral pattern is strong in the blue image, but weak in the infrared.

Observations show that there is a correlation between luminosity $L$ (the total power output of galaxies due to emitted light, infrared radiation, ultraviolet etc.) and the maximum rotational velocity $v_{\rm rot}$ of the disc for spiral galaxies. The relationship is close to
\[ L \propto v_{\rm rot}^{4} . \]
Figure 1.1: the sequence of morphological types of elliptical galaxies. Elliptical galaxies are classified according to their shape. Panels: E0 (NGC 1407), E2 (NGC 1395), E4 (NGC 584), E6 (NGC 4033). [Created with blue-band data from the SuperCOSMOS Sky Survey.]
Figure 1.2: the sequence of normal (non-barred) spiral types. Spiral galaxies are classified according to how tightly wound their arms are, the prominence of the central bulge, and the quantity of interstellar gas. Panels: Sa (ESO286-G10), Sb (NGC 3223), Sc (M74), Sd (NGC 300), Sm (NGC 4395). [Created with blue-band data from the SuperCOSMOS Sky Survey.]
Figure 1.3: the sequence of barred spiral types. Panels: SBa (NGC 4440), SBb (NGC 1097), SBc (NGC 1073), SBd (NGC 1313), SBm (NGC 4597). [Created with blue-band data from the SuperCOSMOS Sky Survey.]
Figure 1.4: examples of irregular galaxies. Panels: NGC 1569, NGC 4214, NGC 4449, NGC 7292. [Created with blue-band data from the SuperCOSMOS Sky Survey.]
Figure 1.5: the spiral galaxy NGC 2997 in blue, red and infrared light (panels: Blue, Red, Infrared), showing that the prominence of the spiral arms varies with the observed wavelength of light. The blue spiral arms are most evident in the blue image, while the old (red) stellar population is seen more clearly in the infrared image. [The picture was produced using data from the SuperCOSMOS Sky Survey of the Royal Observatory Edinburgh, based on photography from the United Kingdom Schmidt Telescope.]

This is known as the Tully-Fisher relationship. It is important because it allows the luminosity $L$ to be estimated from the rotation velocity $v_{\rm rot}$, which can be measured using optical or radio spectroscopy. A comparison of the luminosity and the observed brightness then gives the distance to the galaxy.
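The distance estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the calibrator galaxy (luminosity `l_ref` at rotation velocity `v_ref`) and the observed flux are hypothetical numbers chosen for the example, and the exponent 4 is the approximate value quoted in the text. The function names are not from the notes.

```python
import math

L_SUN_W = 3.828e26   # nominal solar luminosity in watts
PC_M = 3.0857e16     # one parsec in metres

def tully_fisher_luminosity(v_rot, v_ref, l_ref):
    """Scale a calibrator luminosity l_ref (measured at v_ref) using L ∝ v_rot^4."""
    return l_ref * (v_rot / v_ref) ** 4

def distance_from_flux(luminosity_w, flux_w_m2):
    """Invert the inverse-square law f = L / (4 pi d^2); returns d in metres."""
    return math.sqrt(luminosity_w / (4.0 * math.pi * flux_w_m2))

# Hypothetical calibrator: 2e10 L_sun at v_rot = 200 km/s (illustrative only).
lum = tully_fisher_luminosity(v_rot=250.0, v_ref=200.0, l_ref=2.0e10 * L_SUN_W)

# Hypothetical observed flux of the target galaxy in W m^-2.
d_m = distance_from_flux(lum, flux_w_m2=1.0e-12)
d_mpc = d_m / PC_M / 1.0e6
print(f"Estimated distance: {d_mpc:.1f} Mpc")
```

Because the luminosity scales as the fourth power of the velocity, a 10% error in $v_{\rm rot}$ becomes roughly a 40% error in $L$ and hence about a 20% error in the distance.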
1.2.2
S0 galaxies
S0 galaxies, sometimes also known as lenticular galaxies, are flattened disc systems like spirals but have very little gas or dust. They therefore contain only older stars. They have probably been formed from spirals that have lost or exhausted their gas.
1.3
Elliptical Galaxies
These have masses from $10^{10}\,M_\odot$ to $10^{13}\,M_\odot$ (not including dwarf ellipticals, which have lower masses). They have elliptical shapes, but little other structure. They contain very little gas, so almost all of the visible component is in the form of stars (there is very little dust). With so little gas, there is no appreciable star formation, with the result that elliptical galaxies contain almost only old stars. Their colours are therefore red. K-type giant stars dominate the visible light, and their optical spectra are broadly similar to K-type stars with no emission lines from an interstellar medium: the spectra have absorption lines only.

Dark matter is important, and there is probably an extensive dark matter halo with a similar proportion of dark to visible matter as spirals.

Ellipticals are classified by their observed shapes. They are given a type E$n$ where $n$ is an integer describing the apparent ellipticity defined as
\[ n = {\rm int}[10(1 - b/a)] \]
where $b/a$ is the axis ratio (the ratio of the semi-minor to semi-major axes) as seen on the sky. In practice we observe only types E0 (circular) to E7 (most flattened). We never see ellipticals flatter than about E7. The reason (as indicated by simulations and normal mode analyses) seems to be that a stellar system any flatter is unstable to buckling, and will eventually settle into something rounder.

Note that these subtypes reflect the observed shapes, not the three-dimensional shapes: a very elongated galaxy seen end-on would be classified as type E0.

Luminous ellipticals have very little net rotation. The orbits of the stars inside them are randomly oriented. The motions of the stars are characterised by a velocity dispersion $\sigma$ along the line of sight, most commonly the velocity dispersion at the centre, $\sigma_0$. These ellipticals are usually triaxial in shape and have different velocity dispersions in the directions of the different axes. Less luminous ellipticals can have some net rotation.

Their surface brightness distributions are more centrally concentrated than those of spirals. There are various functional forms around for fitting the surface brightness, of which the best known is the de Vaucouleurs model,
\[ I(R) = I_0 \exp\left[ -\left( \frac{R}{R_0} \right)^{1/4} \right] , \]
with $I_0 \sim 10^{5}\,L_\odot\,{\rm pc}^{-2}$ for giant ellipticals. (To fit to observations, one typically un-squashes the ellipses to circles first. Also, the functional forms are only fitted to observations over the restricted range in which $I(R)$ is measurable. So don’t be surprised to see very different looking functional forms being fitted to the same data.)

Ellipticals have large numbers of globular clusters. These are visible as faint starlike images superimposed on the galaxies. These globular clusters have masses from $10^{4}\,M_\odot$ to a few $\times 10^{6}\,M_\odot$.

Ellipticals are plentiful in environments where the density of galaxies is high, such as in galaxy clusters. Isolated ellipticals are rare.

Observations show that there is a correlation between three important observational parameters for elliptical galaxies. These quantities are the scale size $R_0$, the central surface brightness $I_0$, and the central velocity dispersion $\sigma_0$. The observations show that
\[ R_0 \, I_0^{0.9} \, \sigma_0^{-1.4} \simeq {\rm constant} . \]
This relation is known as the fundamental plane for elliptical galaxies. (The relation is also often expressed in terms of the radius $R_e$ containing half the light of the galaxy and the surface brightness $I_e$ at this radius.)

An older, and cruder, relation is that between luminosity $L$ and the central velocity dispersion:
\[ L \propto \sigma_0^{4} . \]
This is known as the Faber-Jackson relation.
The Faber-Jackson relation, and particularly the fundamental plane, are very useful in estimating the distances to elliptical galaxies: the observational parameters I0 and σ0 give an estimate of R0 , which in turn with I0 gives the total luminosity of the galaxy, which can then be used with the observed brightness to derive a distance.
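As a small illustration of two formulae from this section, the E$n$ classification and the de Vaucouleurs profile can be written directly as Python functions. This is a minimal sketch: the function names and the sample axis lengths are chosen for the example and are not part of the notes.

```python
import math

def hubble_e_type(a, b):
    """Return n for type En, using n = int[10 (1 - b/a)].

    a and b are the apparent semi-major and semi-minor axes (same units).
    Axis ratios that fall exactly on a class boundary are sensitive to
    floating-point rounding, so treat borderline cases with care.
    """
    if not (a >= b > 0):
        raise ValueError("need semi-major axis a >= semi-minor axis b > 0")
    return int(10.0 * (1.0 - b / a))

def de_vaucouleurs(R, I0, R0):
    """Surface brightness I(R) = I0 exp[-(R/R0)^(1/4)]."""
    return I0 * math.exp(-((R / R0) ** 0.25))

# A circular image is E0; an image twice as long as it is wide is E5.
print(hubble_e_type(1.0, 1.0), hubble_e_type(2.0, 1.0))

# The profile falls by a factor e between R = 0 and R = R0.
print(de_vaucouleurs(0.0, 1.0e5, 1.0) / de_vaucouleurs(1.0, 1.0e5, 1.0))
```

The slow $R^{1/4}$ dependence means the brightness drops steeply near the centre but declines only gradually at large radii, which is why the fitted parameters depend on the radial range over which $I(R)$ can be measured.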
1.4
Irregular Galaxies
Irregular galaxies have irregular, patchy morphologies. They are gas-rich, showing strong star formation with many young stars. Ionised gas, particularly H II regions, is prominent around the regions of star formation. They tend to have strong emission lines from the interstellar gas, and their starlight is dominated by B, A and F types. As a result their colours are blue and their spectra show strong emission lines from the interstellar gas superimposed on the stellar absorption-line spectrum. Their internal motions are relatively chaotic. They are denoted type I or Irr.
1.5
Other Types of Galaxy
As already noted, some galaxies have unusual, disturbed morphologies and are called peculiar. These are mostly the result of interactions and mergers between galaxies. They are particularly numerous among distant galaxies.
Figure 1.6: examples of the optical spectra of elliptical, spiral and irregular galaxies. The elliptical spectrum shows only absorption lines produced by the stars in the galaxy. The spiral galaxy has absorption lines from its stars and some emission lines from its interstellar gas. In contrast, the irregular galaxy has very strong emission lines on a weaker stellar continuum. [Produced with data from the 2dF Galaxy Redshift Survey.]
Figure 1.7: The tuning fork diagram of Hubble types. The galaxies on the left are known as early types, and those on the right as late types.

Clusters of galaxies often have a very luminous, dominant elliptical at their cores, of a type called a cD galaxy. These have extensive outer envelopes of stars.

Low-luminosity galaxies are called dwarf galaxies. They have masses of $10^{6}$ to $10^{9}\,M_\odot$. Common subclasses are dwarf irregulars, which have a large fraction of gas and active star formation, and dwarf ellipticals, which are gas poor and have no star formation. Dwarf spheroidal galaxies are very low luminosity, very low surface brightness systems, essentially extreme versions of dwarf ellipticals. Our Galaxy has several dwarf spheroidal satellites.
1.6
The Hubble Sequence
On the whole, galaxy classification probably should not be taken as seriously as stellar classification, because there are not (yet) precise physical interpretations of what the gradations mean. But some physical properties do clearly correlate with the so-called Hubble types. The basic system of classification described above was defined in detail by Edwin Hubble, although a number of extensions to his system are available. Figure 1.7 shows the Hubble tuning fork diagram which places the various types in a sequence based on their shapes. Ellipticals go on the left, arranged in a sequence based on their ellipticities. Then come the lenticulars or disc galaxies without spiral arms: S0 and SB0. Then spirals with increasingly spaced arms, Sa etc. if unbarred, SBa etc. if barred. The left-hand galaxies are called early types, and the right-hand ones late types. People once thought this represented an evolutionary sequence, but that has long been obsolete. (Our current understanding is that, if anything, galaxies tend to evolve towards early types.) But the old names early and late are still used. Note that for spirals, bulges get smaller as spiral arms get more widely spaced. The theory behind spiral density waves predicts that the spacing between arms is proportional to the disc’s mass density. Several galaxy properties vary in a sequence from ellipticals to irregulars. However, the precise shapes of ellipticals are not important in this: all ellipticals lie in the same position in the sequence:
E (all Es)    S0    Sa    Sb    Sc    Sd    Irr

Early type                      Late type
Old stars                       Young stars
Red colour                      Blue colour
Gas poor                        Gas rich
Absorption-line spectra         Strong emission lines in spectra
Some evolution along this sequence from right to left (late to early) can occur if gas is used up in star formation or gas is taken out of the galaxies.
1.7
A Description of Galaxy Dynamics
Interactions between distributions of matter can be very important over the lifetime of a galaxy, be these the interactions of stars, the interactions between clouds of gas, or the interactions of a galaxy with a near neighbour. An important distinction between interactions is whether they are collisional or collisionless. Encounters between bodies of matter are:
• collisional if interactions between individual particles substantially affect their motions;
• collisionless if interactions between individual particles do not substantially affect their motions.

Gas is collisional. If two gas clouds collide, even with the low densities found in astronomy, individual atoms/molecules interact. These interactions on the atomic scale strongly influence the motions of the two gas clouds.

Stars are collisionless on the galactic scale. If two stellar systems collide, the interactions between individual stars have little effect on their motions. The ‘particles’ are the stars in this case. The motions of the stars are mostly affected by the gravitational potentials of the two stellar systems. Interactions between individual stars are rare on the scale of galaxies. Stars are so compact on the scale of a galaxy that a stellar system behaves like a collisionless fluid (except in the cores of galaxies and globular clusters), resembling a plasma in some respects.

This distinction between stars and gas leads to two very important differences between stellar and gas dynamics in a galaxy.
1. Gas will tend to settle into rotating discs within galaxies. Stars will not settle in this way.
2. Star orbits can cross each other, but in equilibrium gas must follow closed paths which do not cross (and in the same sense). Two streams of stars can go through each other and hardly notice, but two streams of gas will shock (and probably form stars). You could have a disc of stars with no net rotation (just reverse the directions of motion of some stars), but not so with a disc of gas.
The terms ‘rotational support’ and ‘pressure support’ are used to describe how material in galaxies balances its self-gravity. Stars and gas move in roughly circular orbits in the discs of spiral galaxies, where they achieve a stable equilibrium because they are rotationally supported against gravity. The stars and gas in the spiral discs have rotational velocities of $\simeq 250$ km s$^{-1}$, while the dispersion of the gas velocities locally around this net motion is only $\simeq 10$ km s$^{-1}$.

In contrast, stars in luminous elliptical galaxies maintain a stable equilibrium because they are moving in randomly oriented orbits. Drawing a parallel with atoms/molecules in a gas, they are said to be pressure supported against gravity. Random velocities of $\simeq 300$ km s$^{-1}$ are typical. The velocities for pressure support need not be isotropically distributed. It is also possible to have a mixture of rotational and pressure support for a system of stars, with some appreciable net rotation.
1.8
A Brief Overview of Galaxy Evolution Processes
Galaxies formed early in the history of the Universe, but exactly when is not known with certainty. They were probably formed by the coalescence of a number of separate clumps of dark matter that also contained gas and stars, rather than by the collapse of single large bodies of dark matter and gas. Galaxies have evolved with time to give us the galaxy populations seen today. Some of the processes driving the evolution of galaxies are briefly stated in this section. Exactly how these processes affect galaxies is not understood precisely at present.

Gas in a galaxy will fall into a rotationally supported disc if it has angular momentum. Subsequent star formation in the gaseous disc will form the stellar disc of a spiral galaxy. Stars formed in gas clouds falling radially inwards during the formation of a galaxy can contribute to the stellar halo of a spiral galaxy. Stars in smaller galaxies falling radially inwards in a merging process can also contribute to the stellar halo of a spiral galaxy.

Differential rotation in a spiral’s disc will generate spiral density waves in the disc, leading to spiral arms. Spiral discs without a bulge can be unstable, and can buckle and thicken, with mass being redistributed into a bulge, giving the bulge some rotation in the process. Continuing star formation in the disc gives rise to a range of ages for stars in the disc, while bulges will tend to be older. If a spiral uses up most of its gas in star formation, it will have a stellar disc but no spiral arms.

Mergers and interactions between galaxies can be important. In a merger two galaxies fuse together. In an interaction, however, one galaxy affects another through its gravitational influence. One or both galaxies may survive an interaction, but may be altered in the process. If two spiral galaxies merge, or one spiral is disrupted by a close encounter with another galaxy, the immediate result can be an irregular or peculiar galaxy with strong star formation.
This can produce an elliptical galaxy if and when the gas is exhausted. If a merger between two galaxies produces an elliptical with no overall angular momentum, it will be a pressure-supported system. If a merger produces an elliptical with some net angular momentum, the elliptical will have an element of rotational support. Ellipticals, although conventionally gas-poor, can shed gas from their stars through mass loss and supernova remnants, which can settle into gas discs and form stars in turn. Almost all galaxies have some very old stars. Most galaxies appear to have been formed fairly early on (> 10 Gyr ago) but some have been strongly influenced by mergers and interactions since then. These processes are not well understood at present. Understanding the evolution of galaxies is currently a subject of much active research, from both a theoretical and observational perspective. Then there is dark matter ...
1.9
The Galaxy: an Overview
1.9.1
The Structure of the Galaxy
We live in a spiral galaxy. It is relatively difficult to measure its morphology from inside, but observations show that it is almost certainly a barred spiral. The best assessment of its morphological type puts it as type SBbc (intermediate between SBb and SBc). The Sun lies close to the plane of the Galaxy, at a distance of $8.0 \pm 0.4$ kpc from the Galactic Centre. It is displaced slightly to the north of the plane. The overall diameter of the disc is $\simeq 40$ kpc.

The structure of the Galaxy can be broken into various distinct components. These are: the disc, the central bulge, the bar, the stellar halo, and the dark matter halo. These are illustrated in Figure 1.8.

Figure 1.8: A sketch of the Galaxy seen edge on, illustrating the various components.

Figure 1.9: The Galaxy showing its geometry.

The disc consists of stars and of gas and dust. The gas and dust are concentrated more closely on the plane than the stars, while younger stars are more concentrated around the plane than old stars. The disc is rotationally supported against gravity.

The bulge and bar are found within the central few kpc. They consist mostly of old stars, some metal-poor (deficient in heavy elements). The Galactic Centre has a compact nucleus, probably with a black hole at its core.

The stellar halo contains many isolated stars and about 150 globular clusters. These are very old and very metal-poor. The stellar halo is pressure supported against gravity.

The entire visible system is embedded in an extensive dark matter halo which probably extends out to $> 100$ kpc. Its properties are not well defined and the nature of the dark matter is still uncertain.

We shall return to discuss the structure of the Galaxy and its individual components in detail later in the course.
1.9.2
Stellar Populations
It was realised in the early 20th century that the Galaxy could be divided into the disc and into a spheroid (which consists of the bulge and stellar halo). The concept of stellar populations was introduced by Walter Baade in 1944. In his picture, the stars in the discs of spiral galaxies, which contain many young stars, were called Population I. This included the disc of our own Galaxy. In contrast, elliptical galaxies and the spheroids of spiral galaxies contain many old stars, which he called Population II. Population I systems consisted of young and moderately young stars which had chemical compositions similar to the Sun. Meanwhile, Population II systems were made of old stars which were deficient in heavy elements compared to the Sun. Population I systems were blue in colour, Population II were red. Spiral discs and irregular galaxies contained Population I stars, while ellipticals and the haloes and bulges of disc galaxies were Population II.

This picture was found to be rather simplistic. The population concept was later refined for our Galaxy, with a number of subtypes replacing the original two classes. Today, it is more common to refer to the stars of individual components of the Galaxy separately. For example, we might speak of the halo population or the bulge population. The disc population is often split into the young disc population and the old disc population.
1.9.3
Galactic Coordinates
A galactic coordinate system is frequently used to specify the positions of objects within the Galaxy, and the positions of other galaxies on the sky. In this system, two angles are used to specify the direction of objects as seen from the Earth, called galactic longitude and galactic latitude (denoted $l$ and $b$ respectively). The system works in a way that is very similar to latitude and longitude on the Earth’s surface: galactic longitude measures an angle in the plane of the galactic equator (which is defined to be in the plane of the Galaxy), and galactic latitude measures the angle from the equator. Note that the galactic coordinate system is centred on the Earth.

Galactic longitude is expressed as an angle $l$ between $0^\circ$ and $360^\circ$. Galactic latitude is expressed as an angle $b$ between $-90^\circ$ and $+90^\circ$. Zero longitude is defined to be the direction of the Galactic Centre. Therefore the Galactic Centre is at $(l, b) = (0^\circ, 0^\circ)$. The direction on the sky opposite to the Galactic Centre is known as the Galactic Anticentre and has coordinates $(l, b) = (180^\circ, 0^\circ)$. The North Galactic Pole is at $b = +90^\circ$, and the South Galactic Pole is at $b = -90^\circ$. Note again that these are positions on the sky relative to the Earth, and not relative to the Galactic Centre.
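To make the geometry concrete, the sketch below converts a pair $(l, b)$ into a unit direction vector and computes the angle between two directions on the sky. The axis convention is an assumption made for this example (a common choice, but not stated in the notes): $x$ points towards the Galactic Centre, $y$ towards $l = 90^\circ$ in the plane, and $z$ towards the North Galactic Pole.

```python
import math

def galactic_unit_vector(l_deg, b_deg):
    """Unit vector for galactic longitude l and latitude b (degrees).

    Assumed axes: x towards (l, b) = (0, 0), y towards (90, 0),
    z towards the North Galactic Pole (b = +90).
    """
    l = math.radians(l_deg)
    b = math.radians(b_deg)
    return (math.cos(b) * math.cos(l),
            math.cos(b) * math.sin(l),
            math.sin(b))

def angular_separation(l1, b1, l2, b2):
    """Angle in degrees between two directions given in galactic coordinates."""
    u = galactic_unit_vector(l1, b1)
    v = galactic_unit_vector(l2, b2)
    dot = sum(p * q for p, q in zip(u, v))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(dot))

# The Galactic Centre and Anticentre are opposite points on the sky,
# and the North Galactic Pole is a right angle away from both.
print(angular_separation(0.0, 0.0, 180.0, 0.0))
print(angular_separation(0.0, 0.0, 0.0, 90.0))
```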
1.10
Density profiles versus surface brightness profiles
The observed surface brightness profiles of galaxies are the result of the projection of three-dimensional density distributions of stars. For example, the observed surface brightness of an elliptical galaxy is well fitted by the de Vaucouleurs $R^{1/4}$ law, as was discussed earlier. This surface brightness is the projection into two dimensions on the sky of the three-dimensional density distribution of stars in space.

Let us consider an example which demonstrates this. An observer on the Earth observes a spherically symmetric galaxy, such as an E0-type elliptical galaxy. The mean density of stars in space at a radial distance $r$ from the centre of the galaxy is $\rho(r)$ (measured in $M_\odot\,{\rm pc}^{-3}$ or kg m$^{-3}$, and smoothed out over space). Consider a sight line that passes a tangential distance $R$ from the galaxy’s centre.
Consider an element of the path length $dl$ at a distance $l$ from the point B on the sight line closest to the nucleus. Let the contribution to the surface brightness from the element be
\[ dI(R) = k \, \rho(r) \, dl \tag{1.1} \]
where $k$ is a constant of proportionality. Let $\theta$ be the angle, measured at the galaxy’s centre, between the direction to B and the direction to the element, so that $l = r \sin\theta = \sqrt{r^2 - R^2}$. Then
\[
\begin{aligned}
dI(R) &= k \, \rho(r) \, \frac{dr}{\sin\theta} && \text{on substituting } dl = \frac{dr}{\sin\theta} \\
      &= k \, \rho(r) \, \frac{r \, dr}{\sqrt{r^2 - R^2}} && \text{on substituting } \sin\theta = \frac{\sqrt{r^2 - R^2}}{r} .
\end{aligned}
\]
Integrating along the line of sight from B ($r = R$) to infinity ($r \to \infty$),
\[ I_{B\infty} = \int_R^\infty k \, \rho(r) \, \frac{r}{\sqrt{r^2 - R^2}} \, dr = k \int_R^\infty \frac{r \, \rho(r)}{\sqrt{r^2 - R^2}} \, dr . \]
This has neglected the near side of the galaxy. From symmetry, the total surface brightness along the line of sight is $I(R) = 2 \, I_{B\infty}(R)$. Therefore
\[ I(R) = 2k \int_R^\infty \frac{r \, \rho(r) \, dr}{\sqrt{r^2 - R^2}} . \tag{1.2} \]

In general, the density profile of a galaxy will not be spherical and we need to take account of the profile $\rho(\mathbf{r})$ as a function of position vector $\mathbf{r}$. The calculation of the surface brightness profile from the density profile is straightforward numerically, if not always analytically. The inverse problem, converting from an observed surface brightness profile to a density profile, often has to be done numerically.
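The projection integral of Equation 1.2 is easy to evaluate numerically. The sketch below is one possible approach, not a prescribed method: it integrates along the path length $l$ rather than over $r$ (using $r = \sqrt{R^2 + l^2}$, which avoids the integrable singularity at $r = R$) and maps the infinite range onto a finite one with the substitution $l = s \tan u$. As a check it uses a Plummer density profile, which is not discussed in these notes but has a known analytic projection; the function names, the length scale `scale` and the step count are all choices made for this example.

```python
import math

def project_density(rho, R, scale=1.0, k=1.0, n=2000):
    """Approximate I(R) = 2 k * integral_0^inf rho(sqrt(R^2 + l^2)) dl.

    Uses the substitution l = scale * tan(u), 0 < u < pi/2, and the
    midpoint rule with n steps, so the end point u = pi/2 is never evaluated.
    """
    du = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        l = scale * math.tan(u)
        r = math.hypot(R, l)
        # dl = scale * sec^2(u) du
        total += rho(r) * scale / math.cos(u) ** 2
    return 2.0 * k * total * du

def plummer_density(r, mass=1.0, a=1.0):
    """Plummer sphere: rho = 3M / (4 pi a^3) * (1 + r^2/a^2)^(-5/2)."""
    return 3.0 * mass / (4.0 * math.pi * a ** 3) * (1.0 + (r / a) ** 2) ** -2.5

def plummer_projected(R, mass=1.0, a=1.0):
    """Analytic projection: M / (pi a^2) * (1 + R^2/a^2)^(-2)."""
    return mass / (math.pi * a ** 2) * (1.0 + (R / a) ** 2) ** -2

for R in (0.0, 0.5, 2.0):
    print(R, project_density(plummer_density, R), plummer_projected(R))
```

With $k = 1$ the result is a projected mass surface density; multiplying by a suitable $k$ converts it to a surface brightness.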
Chapter 2
Stellar Dynamics in Galaxies
2.1
Introduction
A system of stars behaves like a fluid, but one with unusual properties. In a normal fluid two-body interactions are crucial in the dynamics, but in contrast star-star encounters are very rare. Instead stellar dynamics is mostly governed by the interaction of individual stars with the mean gravitational field of all the other stars combined. This has profound consequences for how the dynamics of the stars within galaxies are described mathematically, allowing for some considerable simplifications.

This chapter establishes some basic results relating to the motions of stars within galaxies. The virial theorem provides a very simple relation between the total potential and kinetic energies of stars within a galaxy, or other system of stars, that has settled down into a steady state. The virial theorem is derived formally here. The timescale for stars to cross a system of stars, known as the crossing time, is a simple but important measure of the motions of stars. The relaxation time measures how long it takes for two-body encounters to influence the dynamics of a galaxy, or other system of stars. An expression for the relaxation time is derived here, which is then used to show that encounters between stars are so rare within galaxies that they have had little effect over the lifetime of the Universe.

The motions of stars within galaxies can be described by the collisionless Boltzmann equation, which allows the numbers of stars to be calculated as a function of position and velocity in the galaxy. The equation is derived from first principles here. Similarly, the Jeans Equations relate the densities of stars to position, velocity, velocity dispersion and gravitational potential.
2.2
The Virial Theorem
2.2.1
The basic result
Before going into the main material on stellar dynamics, it is worth stating, and deriving, a basic principle known as the virial theorem. It states that for any system of particles bound by an inverse-square force law, the time-averaged kinetic energy $\langle T \rangle$ and the time-averaged potential energy $\langle V \rangle$ satisfy
\[ 2 \, \langle T \rangle + \langle V \rangle = 0 , \tag{2.1} \]
for a steady equilibrium state. $\langle T \rangle$ will be a very large positive quantity and $\langle V \rangle$ a very large negative quantity. Of course, for a galaxy to hold together, the total energy $\langle T \rangle + \langle V \rangle < 0$; the virial theorem provides a much tighter constraint than this alone. Typically, $\langle T \rangle$ and $-\langle V \rangle \sim 10^{50}$ to $10^{54}$ J for galaxies.

In practice, many systems of stars are not in a perfect final steady state and the virial theorem does not apply exactly. Despite this, it does give important, approximate results for many astronomical systems.
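Equation 2.1 can be checked numerically on the simplest bound system. The sketch below is an illustration only, under assumptions that are not from the notes: a single test star orbits a fixed point mass on an eccentric Kepler orbit, in units with $GM = 1$ and unit stellar mass, and the orbit is integrated with a leapfrog (kick-drift-kick) scheme. The kinetic and potential energies are averaged over exactly one orbital period. The function name and the initial conditions are hypothetical choices.

```python
import math

def orbit_averages(x, y, vx, vy, period, gm=1.0, m=1.0, n_steps=20000):
    """Leapfrog a test star around a fixed point mass at the origin and
    return the time-averaged kinetic and potential energies over `period`."""
    dt = period / n_steps
    sum_t = 0.0
    sum_v = 0.0
    for _ in range(n_steps):
        r = math.hypot(x, y)
        sum_t += 0.5 * m * (vx * vx + vy * vy)
        sum_v += -gm * m / r
        # Half kick using the acceleration at the current position.
        vx += 0.5 * dt * (-gm * x / r ** 3)
        vy += 0.5 * dt * (-gm * y / r ** 3)
        # Drift to the new position.
        x += dt * vx
        y += dt * vy
        # Half kick using the acceleration at the new position.
        r = math.hypot(x, y)
        vx += 0.5 * dt * (-gm * x / r ** 3)
        vy += 0.5 * dt * (-gm * y / r ** 3)
    return sum_t / n_steps, sum_v / n_steps

# Start at apocentre with less than the circular speed (eccentric orbit).
x0, v0 = 1.0, 0.8
energy = 0.5 * v0 ** 2 - 1.0 / x0           # specific orbital energy, GM = 1
semi_major = -1.0 / (2.0 * energy)          # a = -GM / (2E)
period = 2.0 * math.pi * semi_major ** 1.5  # Kepler's third law with GM = 1

mean_t, mean_v = orbit_averages(x0, 0.0, 0.0, v0, period)
virial_residual = 2.0 * mean_t + mean_v
print(mean_t, mean_v, virial_residual)
```

At any single instant $2T + V$ is not zero on an eccentric orbit; it is only the time averages that satisfy the theorem, which is why the averaging interval matters.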
2.2.2
Deriving the virial theorem from first principles
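To prove the virial theorem, consider a system of N stars. Let the ith star have a mass $m_i$ and a position vector $\mathbf{x}_i$. The velocity of the ith star is $\dot{\mathbf{x}}_i \equiv d\mathbf{x}_i/dt$, where t is the time. Consider a parameter
$$ F \equiv \sum_{i=1}^{N} m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i . \tag{2.2} $$
(F is related to the moment of inertia of the system, as we shall see below.) Differentiating with respect to time t,
$$ \begin{aligned}
\frac{dF}{dt} &= \frac{d}{dt} \Big( \sum_i m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i \Big) = \sum_i \frac{d}{dt} \big( m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i \big) \\
&= \sum_i m_i \frac{d}{dt} \big( \dot{\mathbf{x}}_i \cdot \mathbf{x}_i \big) && \text{assuming that the masses } m_i \text{ do not change} \\
&= \sum_i m_i \big( \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i + \dot{\mathbf{x}}_i \cdot \dot{\mathbf{x}}_i \big) && \text{from the product rule} \\
&= \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i + \sum_i m_i \, \dot{x}_i^2 .
\end{aligned} \tag{2.3} $$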
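The kinetic energy of the ith particle is $\frac{1}{2} m_i \dot{x}_i^2$. Therefore the total kinetic energy of the entire system of stars is
$$ T = \sum_i \tfrac{1}{2} m_i \, \dot{x}_i^2 \qquad \therefore \quad \sum_i m_i \, \dot{x}_i^2 = 2T . $$
Substituting this into Equation 2.3,
$$ \frac{dF}{dt} = 2T + \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i , \tag{2.4} $$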
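at any time t. We now need to remember that the average of any parameter y(t) over time t = 0 to τ is
$$ \langle y \rangle = \frac{1}{\tau} \int_0^{\tau} y(t) \, dt . $$
Consider the average value of dF/dt over a time interval t = 0 to τ:
$$ \begin{aligned}
\left\langle \frac{dF}{dt} \right\rangle &= \frac{1}{\tau} \int_0^{\tau} \Big( 2T + \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \Big) dt \\
&= \frac{2}{\tau} \int_0^{\tau} T \, dt + \frac{1}{\tau} \int_0^{\tau} \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \, dt \\
&= 2 \langle T \rangle + \sum_i \frac{m_i}{\tau} \int_0^{\tau} \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \, dt && \text{assuming } m_i \text{ is constant over time} \\
&= 2 \langle T \rangle + \sum_i m_i \, \langle \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \rangle .
\end{aligned} \tag{2.5} $$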
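We can define a moment of inertia of the system of particles about the origin as
$$ I \equiv \sum_i m_i \, \mathbf{x}_i \cdot \mathbf{x}_i . $$
(Note that this definition of moment of inertia is different from the moment of inertia about a particular axis that is commonly used to study the rotation of bodies.) Differentiating with respect to time,
$$ \frac{dI}{dt} = \frac{d}{dt} \sum_i m_i \, \mathbf{x}_i \cdot \mathbf{x}_i = \sum_i m_i \frac{d}{dt} \big( \mathbf{x}_i \cdot \mathbf{x}_i \big) = \sum_i m_i \big( \dot{\mathbf{x}}_i \cdot \mathbf{x}_i + \mathbf{x}_i \cdot \dot{\mathbf{x}}_i \big) = 2 \sum_i m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i . $$
Substituting for $F = \sum_i m_i \, \dot{\mathbf{x}}_i \cdot \mathbf{x}_i$ from Equation 2.2,
$$ F = \frac{1}{2} \frac{dI}{dt} . $$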
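When the system of stars eventually reaches equilibrium, the moment of inertia will be constant. Therefore, F = 0 at all times after equilibrium has been reached. So, $\langle dF/dt \rangle = 0$. (An alternative way of visualising this is by considering that F will be bounded in any physical system. Therefore the long-time average $\langle dF/dt \rangle$ will vanish as τ becomes large, i.e.
$$ \lim_{\tau \to \infty} \left\langle \frac{dF}{dt} \right\rangle = \lim_{\tau \to \infty} \frac{1}{\tau} \int_0^{\tau} \frac{dF}{dt} \, dt = \lim_{\tau \to \infty} \frac{F(\tau) - F(0)}{\tau} = 0 .) $$
Substituting $\langle dF/dt \rangle = 0$ into Equation 2.5,
$$ 2 \langle T \rangle + \sum_i m_i \, \langle \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \rangle = 0 . \tag{2.6} $$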
The term $\sum_i m_i \, \langle \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \rangle$ is related to the gravitational potential. We next need to show how.
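Newton’s Second Law of Motion gives, for the ith star,
$$ m_i \, \ddot{\mathbf{x}}_i = \sum_{j \neq i} \mathbf{F}_j , $$
where $\mathbf{F}_j$ is the force exerted on the ith star by the jth star. Using the law of universal gravitation,
$$ m_i \, \ddot{\mathbf{x}}_i = - \sum_{j \neq i} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) . $$
Taking the scalar product (dot product) with $\mathbf{x}_i$,
$$ m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = - \sum_{j \neq i} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i . $$
Summing over all i,
$$ \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = - \sum_i \sum_{j \neq i} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i = - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i . \tag{2.7} $$
Switching i and j, we have
$$ \sum_j m_j \, \ddot{\mathbf{x}}_j \cdot \mathbf{x}_j = - \sum_{\substack{j,i \\ i \neq j}} \frac{G m_j m_i}{|\mathbf{x}_j - \mathbf{x}_i|^3} \, (\mathbf{x}_j - \mathbf{x}_i) \cdot \mathbf{x}_j . \tag{2.8} $$
Adding Equations 2.7 and 2.8,
$$ \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i + \sum_j m_j \, \ddot{\mathbf{x}}_j \cdot \mathbf{x}_j = - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_j m_i}{|\mathbf{x}_j - \mathbf{x}_i|^3} \, (\mathbf{x}_j - \mathbf{x}_i) \cdot \mathbf{x}_j $$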
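$$ \therefore \quad 2 \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \Big( (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i + (\mathbf{x}_j - \mathbf{x}_i) \cdot \mathbf{x}_j \Big) . $$
But
$$ \begin{aligned}
(\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i + (\mathbf{x}_j - \mathbf{x}_i) \cdot \mathbf{x}_j &= (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_i - (\mathbf{x}_i - \mathbf{x}_j) \cdot \mathbf{x}_j \\
&= (\mathbf{x}_i - \mathbf{x}_j) \cdot (\mathbf{x}_i - \mathbf{x}_j) && \text{(factorising)} \\
&= |\mathbf{x}_i - \mathbf{x}_j|^2 .
\end{aligned} $$
$$ \therefore \quad 2 \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = - \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|^3} \, |\mathbf{x}_i - \mathbf{x}_j|^2 $$
$$ \therefore \quad \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = - \frac{1}{2} \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|} . \tag{2.9} $$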
We now need to find the total potential energy of the system. The gravitational potential at star i due to star j is
$$ \Phi_{ij} = - \frac{G m_j}{|\mathbf{x}_i - \mathbf{x}_j|} . $$
Therefore the gravitational potential at star i due to all other stars is
$$ \Phi_i = \sum_{j \neq i} \Phi_{ij} = - \sum_{j \neq i} \frac{G m_j}{|\mathbf{x}_i - \mathbf{x}_j|} . $$
Therefore the gravitational potential energy of star i due to all the other stars is
$$ V_i = m_i \Phi_i = - m_i \sum_{j \neq i} \frac{G m_j}{|\mathbf{x}_i - \mathbf{x}_j|} . $$
The total potential energy of the system is therefore
$$ V = \frac{1}{2} \sum_i V_i = - \frac{1}{2} \sum_i m_i \sum_{j \neq i} \frac{G m_j}{|\mathbf{x}_i - \mathbf{x}_j|} . $$
The factor $\frac{1}{2}$ ensures that we only count each pair of stars once (otherwise we would count each pair twice and would get a result twice as large as we should). Therefore,
$$ V = - \frac{1}{2} \sum_{\substack{i,j \\ i \neq j}} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|} . $$
Substituting for the total potential energy into Equation 2.9,
$$ \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = V . $$
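Equation 2.6 uses time-averaged quantities. So, averaging over time t = 0 to τ,
$$ \frac{1}{\tau} \int_0^{\tau} \sum_i m_i \, \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \, dt = \langle V \rangle $$
$$ \therefore \quad \sum_i m_i \, \frac{1}{\tau} \int_0^{\tau} \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \, dt = \langle V \rangle $$
$$ \therefore \quad \sum_i m_i \, \langle \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i \rangle = \langle V \rangle . $$
Substituting this into Equation 2.6,
$$ 2 \langle T \rangle + \langle V \rangle = 0 . $$
This is Equation 2.1, the Virial Theorem.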
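The key algebraic step of the derivation, the identity $\sum_i m_i \ddot{\mathbf{x}}_i \cdot \mathbf{x}_i = V$, holds at every instant for any configuration of point masses, so it can be checked numerically. The following is a minimal sketch of such a check, not part of the original notes: the 20 random stars, their masses and the 1 pc box are arbitrary illustrative assumptions, and the function names are hypothetical.

```python
import math
import random

G = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 2.0e30   # solar mass in kg (rounded)
PC = 3.086e16    # one parsec in m

def accelerations(masses, positions):
    """Newtonian acceleration of each star due to all the other stars."""
    acc = []
    for i, xi in enumerate(positions):
        a = [0.0, 0.0, 0.0]
        for j, xj in enumerate(positions):
            if i == j:
                continue
            d = [xi[k] - xj[k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            for k in range(3):
                a[k] -= G * masses[j] * d[k] / r**3
        acc.append(a)
    return acc

def virial_sum(masses, positions):
    """Left-hand side of the identity: sum_i m_i (xddot_i . x_i)."""
    acc = accelerations(masses, positions)
    return sum(m * sum(a[k] * x[k] for k in range(3))
               for m, a, x in zip(masses, acc, positions))

def potential_energy(masses, positions):
    """V = -(1/2) * sum over ordered pairs i != j of G m_i m_j / |x_i - x_j|."""
    v = 0.0
    for i in range(len(masses)):
        for j in range(len(masses)):
            if i != j:
                r = math.dist(positions[i], positions[j])
                v -= 0.5 * G * masses[i] * masses[j] / r
    return v

# Arbitrary test configuration (an assumption made only for this check).
random.seed(1)
masses = [random.uniform(0.5, 2.0) * M_SUN for _ in range(20)]
positions = [[random.uniform(-1.0, 1.0) * PC for _ in range(3)] for _ in range(20)]

lhs = virial_sum(masses, positions)
rhs = potential_energy(masses, positions)
print(f"sum m a.x = {lhs:.6e} J,  V = {rhs:.6e} J")
```

The two printed numbers should agree to rounding error. The check covers only the instantaneous identity; the virial theorem itself additionally requires the time average of dF/dt to vanish.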
2.2.3
Using the Virial Theorem
The virial theorem applies to systems of stars that have reached a steady equilibrium state. It can be used for many galaxies, but can also be used for other systems such as some star clusters. However, we need to be careful that we use the theorem only for equilibrium systems. The theorem can be applied, for example, to:
• elliptical galaxies
• evolved star clusters, e.g. globular clusters
• evolved clusters of galaxies (with the galaxies acting as the particles, not the individual stars)
Examples of places where the virial theorem cannot be used are:
• merging galaxies
• newly formed star clusters
• clusters of galaxies that are still forming/still have infalling galaxies

The virial theorem provides an easy way to make rough estimates of masses, because velocity measurements can give $\langle T \rangle$. To do this we need to measure the observed velocity dispersion of stars (the dispersion along the line of sight using radial velocities obtained from spectroscopy). The theorem then gives the total gravitational potential energy, which can provide the total mass. This mass, of course, is important because it includes dark matter. Virial masses are particularly important for some galaxy clusters (using galaxies or atoms in X-ray emitting gas as the particles). But it is prudent to consider virial mass estimates as order-of-magnitude only, because (i) generally one can measure only line-of-sight velocities, and getting $T = \frac{1}{2} \sum_i m_i \dot{x}_i^2$ from these requires more assumptions (e.g. isotropy of the velocity distribution); and (ii) the systems involved may not be in a steady state, in which case of course the virial theorem does not apply; clusters of galaxies are particularly likely to be quite far from a steady state.

Note that for galaxies beyond our own, we cannot measure three-dimensional velocities of stars directly (although some projects are attempting to do this for some Local Group galaxies). We have to use radial velocities (the component of the velocity along the line of sight to the galaxy) only, obtained from spectroscopy through the Doppler shift of spectral lines. Beyond nearby galaxies, radial velocities of individual stars become difficult to obtain. It becomes necessary to measure velocity dispersions along the line of sight from the observed widths of spectral lines in the combined light of millions of stars.
2.2.4
Deriving masses from the Virial Theorem: a naive example
Consider a spherical elliptical galaxy of radius R that has uniform density and which consists of N stars each of mass m having typical velocities v. From the virial theorem,
$$ 2 \langle T \rangle + \langle V \rangle = 0 , $$
where $\langle T \rangle$ is the time-averaged total kinetic energy and $\langle V \rangle$ is the average total potential energy. We have
$$ T = \sum_{i=1}^{N} \tfrac{1}{2} m v^2 = \tfrac{1}{2} N m v^2 $$
and averaging over time, $\langle T \rangle = \frac{1}{2} N m v^2$ also. The total gravitational potential energy of a uniform sphere of mass M and radius R (a standard result) is
$$ V = - \frac{3}{5} \frac{G M^2}{R} , $$
where G is the universal gravitational constant. So the time-averaged potential energy of the galaxy is
$$ \langle V \rangle = - \frac{3}{5} \frac{G M^2}{R} , $$
where M is the total mass. Substituting this into the virial theorem equation,
$$ 2 \left( \tfrac{1}{2} N m v^2 \right) - \frac{3}{5} \frac{G M^2}{R} = 0 . $$
But the total mass is M = N m.
$$ \therefore \quad v^2 = \frac{3}{5} \frac{G M}{R} = \frac{3}{5} \frac{N G m}{R} . $$
The calculation is only approximate, so we shall use
$$ v^2 \simeq \frac{N G m}{R} \simeq \frac{G M}{R} . \tag{2.10} $$
This gives the mass to be
$$ M \simeq \frac{v^2 R}{G} . \tag{2.11} $$
So an elliptical galaxy having a typical velocity $v = 350$ km s$^{-1}$ $= 3.5 \times 10^5$ m s$^{-1}$, and a radius $R = 10$ kpc $= 3.1 \times 10^{20}$ m, will have a mass $M \sim 6 \times 10^{41}$ kg $\sim 3 \times 10^{11} \, M_\odot$.
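The arithmetic of this example can be reproduced with a few lines of code. This is a minimal sketch using the rounded constants quoted in the text; the function name `virial_mass` is a hypothetical label introduced here, not something from the notes.

```python
G = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 2.0e30   # solar mass in kg (rounded, as in the text)
KPC = 3.1e19     # one kiloparsec in m (rounded, as in the text)

def virial_mass(v, radius):
    """Order-of-magnitude mass M ~ v^2 R / G (Equation 2.11), in SI units."""
    return v * v * radius / G

# Elliptical galaxy of the example: v = 350 km/s, R = 10 kpc.
mass_kg = virial_mass(3.5e5, 10 * KPC)
mass_msun = mass_kg / M_SUN
print(f"M ~ {mass_kg:.1e} kg ~ {mass_msun:.1e} solar masses")
```

Because the mass scales as $v^2 R$, a 10 per cent error in the adopted velocity changes the result by about 20 per cent, which is one reason to treat such estimates as order-of-magnitude only.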
2.2.5
Example: the fundamental plane for elliptical galaxies
We can derive a relationship between scale size, central surface brightness and central velocity dispersion for elliptical galaxies that is rather similar to the fundamental plane, using only assumptions about a constant mass-to-light ratio and a constant functional form for the surface brightness profile. We shall assume here that:
• the mass-to-light ratio is constant for ellipticals (all E galaxies have the same M/L regardless of size or mass), and
• elliptical galaxies have the same functional form for the mass distribution, only scalable.

Let $I_0$ be the central surface brightness and $R_0$ be a scale size of a galaxy (in this case, different galaxies will have different values of $I_0$ and $R_0$). The total luminosity will be $L \propto I_0 R_0^2$, because $I_0$ is the light per unit projected area. Since the mass-to-light ratio is a constant for all galaxies, the mass of the galaxy is $M \propto L$.
$$ \therefore \quad M \propto I_0 R_0^2 . $$
From the virial theorem, if v is a typical velocity of the stars in the galaxy,
$$ v^2 \simeq \frac{G M}{R_0} . $$
The observed velocity dispersion along the line of sight, $\sigma_0$, will be related to the typical velocity v by $\sigma_0 \propto v$ (because v is a three-dimensional space velocity). So
$$ \sigma_0^2 \propto \frac{M}{R_0} \qquad \therefore \quad M \propto \sigma_0^2 R_0 . $$
Equating this with $M \propto I_0 R_0^2$ from above,
$$ \sigma_0^2 R_0 \propto I_0 R_0^2 \qquad \therefore \quad R_0 \, I_0 \, \sigma_0^{-2} \simeq \text{constant} . $$
This is close to, but not the same as, the observed fundamental plane result $R_0 \, I_0^{0.9} \, \sigma_0^{-1.4} \simeq$ constant. The deviation from this virial prediction presumably has something to do with a varying mass-to-light ratio, but why it is a very good correlation in practice is not understood in detail.
2.3
The Crossing Time, Tcross
The crossing time is a simple, but important, parameter that measures the timescale for stars to move significantly within a system of stars. It is sometimes called the dynamical timescale. It is defined as
$$ T_{\rm cross} \equiv \frac{R}{v} , \tag{2.12} $$
where R is the size of the system and v is a typical velocity of the stars.

As a simple example, consider a stellar system of radius R (and therefore an overall size 2R), having N stars each of mass m; the stars are distributed roughly homogeneously, with v being a typical velocity, and the system is in dynamical equilibrium. Then from the virial theorem,
$$ v^2 \simeq \frac{N G m}{R} . $$
The crossing time is then
$$ T_{\rm cross} \equiv \frac{2R}{v} \simeq \frac{2R}{\sqrt{N G m / R}} \simeq 2 \sqrt{\frac{R^3}{N G m}} . \tag{2.13} $$
But the mass density is
$$ \rho = \frac{N m}{\frac{4}{3} \pi R^3} = \frac{3 N m}{4 \pi R^3} \qquad \therefore \quad \frac{R^3}{N m} = \frac{3}{4 \pi \rho} . $$
$$ \therefore \quad T_{\rm cross} = 2 \sqrt{\frac{3}{4 \pi G \rho}} . $$
So approximately,
$$ T_{\rm cross} \sim \frac{1}{\sqrt{G \rho}} . \tag{2.14} $$
Although this equation has been derived for a particular case, that of a homogeneous sphere, it is an important result and can be used for order of magnitude estimates in other situations. (Note that ρ here is the mass density of the system, averaged over a volume of space, and not the density of individual stars.)

Example: an elliptical galaxy of $10^{11}$ stars, radius 10 kpc.
$R \simeq 10$ kpc $\simeq 3.1 \times 10^{20}$ m
$N = 10^{11}$
$m \simeq 1 \, M_\odot \simeq 2 \times 10^{30}$ kg
$$ T_{\rm cross} \simeq 2 \sqrt{\frac{R^3}{N G m}} \quad \text{gives} \quad T_{\rm cross} \simeq 3 \times 10^{15} \ \text{s} \simeq 10^{8} \ \text{yr} . $$
The Universe is 14 Gyr old. So if a galaxy is $\simeq 14$ Gyr old, there have been of order 100 crossing times in a galaxy’s lifetime so far.
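The crossing-time example can be evaluated directly, both from Equation 2.13 and from the mean-density form that leads to Equation 2.14. This is a minimal sketch with the rounded input values of the example; the function names and the adopted year length are illustrative assumptions.

```python
import math

G = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2
YEAR = 3.156e7   # one year in seconds

def crossing_time(n_stars, m_star, radius):
    """T_cross ~ 2 sqrt(R^3 / (N G m)) for a homogeneous sphere (Equation 2.13)."""
    return 2.0 * math.sqrt(radius**3 / (n_stars * G * m_star))

def crossing_time_from_density(rho):
    """The same quantity written in terms of the mean density: 2 sqrt(3 / (4 pi G rho))."""
    return 2.0 * math.sqrt(3.0 / (4.0 * math.pi * G * rho))

# Elliptical galaxy of the example: 1e11 stars of 1 solar mass within 10 kpc.
N, M_STAR, R = 1e11, 2.0e30, 3.1e20
t_cross = crossing_time(N, M_STAR, R)
rho = N * M_STAR / (4.0 / 3.0 * math.pi * R**3)
t_cross_rho = crossing_time_from_density(rho)
t_cross_yr = t_cross / YEAR
crossings_so_far = 14e9 / t_cross_yr
print(f"T_cross ~ {t_cross:.1e} s ~ {t_cross_yr:.1e} yr; "
      f"~{crossings_so_far:.0f} crossings in 14 Gyr")
```

The two routes give the same number by construction, which is a useful consistency check on the algebra between Equations 2.13 and 2.14.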
2.4
The Relaxation Time, Trelax
The relaxation time is the time taken for a star’s velocity v to be changed significantly by two-body interactions. It is defined as the time needed for a change $\Delta v^2$ in $v^2$ to be the same as $v^2$, i.e. the time for
$$ \Delta v^2 = v^2 . \tag{2.15} $$
To estimate the relaxation time we need to consider the nature of encounters between stars in some detail.
2.5
Star-Star Encounters
2.5.1
Types of encounters
We might expect that stars, as they move around inside a galaxy or other system of stars, will experience close encounters with other stars. The gravitational effects of one star on another would change their velocities and these velocity perturbations would have a profound effect on the overall dynamics of the galaxy. The dynamics of the galaxy might evolve with time, as a result only of the internal encounters between stars.

The truth, however, is rather different. Close star-star encounters are extremely rare and even the effects of distant encounters are so slight that it takes an extremely long time for the dynamics of galaxies to change substantially. We can consider two different types of star-star encounters:
• strong encounters – a close encounter that strongly changes a star’s velocity – these are very rare in practice
• weak encounters – occur at a distance – they produce only very small changes in a star’s velocity, but are much more common
2.5.2
Strong encounters
A strong encounter between two stars is defined as one in which, at the closest approach, the change in the potential energy is larger than or equal to the initial kinetic energy. For two stars of mass m that approach to a distance $r_0$, this condition is
$$ \frac{G m^2}{r_0} \geq \frac{1}{2} m v^2 , $$
where v is the initial velocity of one star relative to the other.
$$ \therefore \quad r_0 \leq \frac{2 G m}{v^2} . $$
So we define a strong encounter radius
$$ r_S \equiv \frac{2 G m}{v^2} . \tag{2.16} $$
A strong encounter occurs if two stars approach to within a distance $r_S \equiv 2Gm/v^2$.

For an elliptical galaxy, $v \simeq 300$ km s$^{-1}$. Using $m = 1 \, M_\odot$, we find that $r_S \simeq 3 \times 10^{9}$ m $\simeq 0.02$ AU. This is a very small figure on the scale of a galaxy. The typical separation between stars is $\sim 1$ pc $\simeq 200\,000$ AU.

For stars in the Galactic disc in the solar neighbourhood, we can use a velocity dispersion of $v = 30$ km s$^{-1}$ and $m = 1 \, M_\odot$. This gives $r_S \simeq 3 \times 10^{11}$ m $\simeq 2$ AU. This again is very small on the scale of the Galaxy.

So strong encounters are very rare. The mean time between them in the Galactic disc is $\sim 10^{15}$ yr, while the age of the Galaxy is $\simeq 13 \times 10^{9}$ yr. In practice, we can ignore their effect on the dynamics of stars.
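The two strong-encounter radii quoted above follow directly from Equation 2.16. This is a minimal sketch of the calculation with rounded constants; the function name is a hypothetical label introduced here.

```python
G = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 2.0e30   # solar mass in kg (rounded)
AU = 1.496e11    # astronomical unit in m

def strong_encounter_radius(m, v):
    """r_S = 2 G m / v^2 (Equation 2.16), in SI units."""
    return 2.0 * G * m / (v * v)

r_elliptical = strong_encounter_radius(M_SUN, 3.0e5)   # v = 300 km/s
r_disc = strong_encounter_radius(M_SUN, 3.0e4)         # v = 30 km/s
print(f"elliptical galaxy: r_S ~ {r_elliptical:.1e} m ~ {r_elliptical / AU:.2f} AU")
print(f"Galactic disc:     r_S ~ {r_disc:.1e} m ~ {r_disc / AU:.1f} AU")
```

Since $r_S \propto v^{-2}$, reducing the velocity by a factor of 10 increases the strong encounter radius by a factor of 100, which accounts for the difference between the two cases.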
2.5.3
Distant weak encounters between stars
A star experiences a weak encounter if it approaches another to a minimum distance $r_0$ when
$$ r_0 > r_S \equiv \frac{2 G m}{v^2} , \tag{2.17} $$
where v is the relative velocity before the encounter and m is the mass of the perturbing star. Weak encounters in general provide only a tiny perturbation to the motions of stars in a stellar system, but they are so much more numerous than strong encounters that they are more important than strong encounters in practice.

We shall now derive a formula that expresses the change δv in the velocity v during a weak encounter (Equation 2.19 below). This result will later be used to derive an expression for the square of the velocity change caused by a large number of weak encounters, which will then be used to obtain an estimate of the relaxation time in a system of stars.

Consider a star of mass $m_s$ approaching a perturbing star of mass m with an impact parameter b. Because the encounter is weak, the change in the direction of motion will be small and the change in velocity will be perpendicular to the initial direction of motion. At any time t when the separation is r, the component of the gravitational force perpendicular to the direction of motion will be
$$ F_{\rm perp} = \frac{G m_s m}{r^2} \cos\phi , $$
where φ is the angle at the perturbing mass between the point of closest approach and the perturbed star. Let the component of velocity perpendicular to the initial direction of motion be $v_{\rm perp}$ and let the final value be $v_{{\rm perp}\,f}$.
Making the approximation that the speed along the trajectory is constant, $r \simeq \sqrt{b^2 + v^2 t^2}$ at time t if t = 0 at the point of closest approach. Using $\cos\phi = b/r \simeq b/\sqrt{b^2 + v^2 t^2}$ and applying F = ma perpendicular to the direction of motion we obtain
$$ \frac{dv_{\rm perp}}{dt} = \frac{G m b}{(b^2 + v^2 t^2)^{3/2}} , $$
where $v_{\rm perp}$ is the component at time t of the velocity perpendicular to the initial direction of motion. Integrating from time $t = -\infty$ to $\infty$,
$$ \Big[ v_{\rm perp} \Big]_0^{v_{{\rm perp}\,f}} = G m b \int_{-\infty}^{\infty} \frac{dt}{(b^2 + v^2 t^2)^{3/2}} . $$
We have the standard integral $\int_{-\infty}^{\infty} (1 + s^2)^{-3/2} \, ds = 2$ (which can be shown using the substitution $s = \tan x$). Using this standard integral with $s = vt/b$, the final component of the velocity perpendicular to the initial direction of motion is
$$ v_{{\rm perp}\,f} = \frac{2 G m}{b v} . \tag{2.18} $$
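The standard integral used above can be confirmed numerically. The sketch below applies the same substitution $s = \tan x$ as the text and a simple midpoint rule; the number of steps and the example encounter (a solar-mass perturber, impact parameter 1 pc, relative speed 300 km s$^{-1}$) are illustrative assumptions rather than values taken from the notes.

```python
import math

G = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2

def standard_integral(n_steps=100_000):
    """Midpoint-rule estimate of the integral of (1 + s^2)^(-3/2) over all s.

    The substitution s = tan(x) maps the infinite range onto (-pi/2, pi/2),
    where the integrand times ds/dx reduces to cos(x).
    """
    h = math.pi / n_steps
    total = 0.0
    for k in range(n_steps):
        x = -math.pi / 2.0 + (k + 0.5) * h
        s = math.tan(x)
        ds_dx = 1.0 / math.cos(x) ** 2
        total += (1.0 + s * s) ** -1.5 * ds_dx * h
    return total

def velocity_kick(m, b, v):
    """Perpendicular velocity change 2 G m / (b v) of Equation 2.18, SI units."""
    return 2.0 * G * m / (b * v)

integral = standard_integral()
kick = velocity_kick(2.0e30, 3.086e16, 3.0e5)   # 1 M_sun, b = 1 pc, v = 300 km/s
print(f"integral ~ {integral:.8f}")
print(f"kick ~ {kick:.3f} m/s on a star moving at 3e5 m/s")
```

The resulting kick is a tiny fraction of the star's speed, which is the sense in which such an encounter is weak.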
Because the deflection is small, the change of velocity is $\delta v \equiv |\delta\mathbf{v}| = v_{{\rm perp}\,f}$. Therefore the change in the velocity v is given by
$$ \delta v = \frac{2 G m}{b v} , \tag{2.19} $$
where G is the constant of gravitation, b is the impact parameter and m is the mass of the perturbing star.

As a star moves through space, it will experience a number of perturbations caused by weak encounters. Many of these velocity changes will cancel, but some net change will occur over time. As a result, the sum over all δv will remain small, but the sum of the squares $\delta v^2$ will build up with time. It is this change in $v^2$ that we need to consider in the definition of the relaxation time (Equation 2.15). Because the change in velocity $\delta\mathbf{v}$ is perpendicular to the initial velocity $\mathbf{v}$ in a weak encounter, the change in $v^2$ is therefore
$$ \begin{aligned}
\delta v^2 \equiv v_f^2 - v^2 &= |\mathbf{v} + \delta\mathbf{v}|^2 - v^2 = (\mathbf{v} + \delta\mathbf{v}) \cdot (\mathbf{v} + \delta\mathbf{v}) - v^2 \\
&= \mathbf{v} \cdot \mathbf{v} + 2 \mathbf{v} \cdot \delta\mathbf{v} + \delta\mathbf{v} \cdot \delta\mathbf{v} - v^2 \\
&= 2 \mathbf{v} \cdot \delta\mathbf{v} + (\delta v)^2 = (\delta v)^2 ,
\end{aligned} $$
where $v_f$ is the final velocity of the star, and the term $2 \mathbf{v} \cdot \delta\mathbf{v}$ vanishes because $\delta\mathbf{v}$ is perpendicular to $\mathbf{v}$. The change in $v^2$ resulting from a single encounter that we need to consider is
$$ \delta v^2 = \left( \frac{2 G m}{b v} \right)^2 . \tag{2.20} $$
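The statement that the individual kicks largely cancel while their squares accumulate is the familiar behaviour of a random walk. The toy sketch below illustrates it under strong simplifying assumptions that are not in the notes: the kicks are one-dimensional, all of equal size, and of random sign. The function name and the seed are arbitrary.

```python
import math
import random

def accumulate_kicks(n_encounters, kick=1.0, seed=0):
    """Sum n equal-sized kicks of random sign; return (net kick, sum of squared kicks)."""
    rng = random.Random(seed)
    net = 0.0
    sum_sq = 0.0
    for _ in range(n_encounters):
        dv = kick if rng.random() < 0.5 else -kick
        net += dv
        sum_sq += dv * dv
    return net, sum_sq

n = 10_000
net, sum_sq = accumulate_kicks(n)
print(f"after {n} kicks: |net| = {abs(net):.0f} (of order sqrt(n) = {math.sqrt(n):.0f}), "
      f"sum of squares = {sum_sq:.0f}")
```

The sum of squares grows in proportion to the number of encounters, and hence to the elapsed time, whereas the net kick grows only as its square root. This is why the relaxation time is defined through $\Delta v^2$ rather than through the net change in velocity.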
Consider all weak encounters occurring in a time period t that have impact parameters in the range b to b + db within a uniform spherical system of N stars and radius R. The volume swept out by impact parameters b to b + db in time t is $2 \pi b \, db \, v \, t$. Therefore the number of stars encountered with impact parameters between b and b + db in time t is
$$ (\text{volume swept out}) \, (\text{number density of stars}) = 2 \pi b \, db \, v \, t \, \frac{N}{\frac{4}{3} \pi R^3} = \frac{3 \, b \, v \, t \, N \, db}{2 R^3} . $$
The total change in $v^2$ caused by all encounters in time t with impact parameters in the range b to b + db will be
$$ \Delta v^2 = \left( \frac{2 G m}{b v} \right)^2 \left( \frac{3 \, b \, v \, t \, N \, db}{2 R^3} \right) . $$
Integrating over b, the total change in a time t from all impact parameters from $b_{\min}$ to $b_{\max}$ is
$$ \Delta v^2 (t) = \int_{b_{\min}}^{b_{\max}} \left( \frac{2 G m}{b v} \right)^2 \left( \frac{3 \, b \, v \, t \, N}{2 R^3} \right) db = \frac{3}{2} \left( \frac{2 G m}{v} \right)^2 \frac{v \, t \, N}{R^3} \int_{b_{\min}}^{b_{\max}} \frac{db}{b} $$
$$ \therefore \quad \Delta v^2 (t) = 6 \left( \frac{G m}{v} \right)^2 \frac{v \, t \, N}{R^3} \, \ln \left( \frac{b_{\max}}{b_{\min}} \right) . \tag{2.21} $$
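It is sometimes useful to have an expression for the change in $v^2$ that occurs in one crossing time. In one crossing time $T_{\rm cross} = 2R/v$, the change in $v^2$ is
$$ \begin{aligned}
\Delta v^2 (T_{\rm cross}) &= 6 \left( \frac{G m}{v} \right)^2 \frac{v}{R^3} \left( \frac{2R}{v} \right) N \, \ln \left( \frac{b_{\max}}{b_{\min}} \right) \\
&= 12 \, N \left( \frac{G m}{R v} \right)^2 \ln \left( \frac{b_{\max}}{b_{\min}} \right) .
\end{aligned} \tag{2.22} $$
The maximum scale over which weak encounters will occur corresponds to the size of the system of stars. So we shall use $b_{\max} \simeq R$.
$$ \Delta v^2 (T_{\rm cross}) = 12 \, N \left( \frac{G m}{R v} \right)^2 \ln \left( \frac{R}{b_{\min}} \right) . \tag{2.23} $$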
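We are more interested here in the relaxation time $T_{\rm relax}$. The relaxation time is defined as the time taken for $\Delta v^2 = v^2$. Substituting for $\Delta v^2$ from Equation 2.21 we get,
$$ 6 \left( \frac{G m}{v} \right)^2 \frac{v \, T_{\rm relax} \, N}{R^3} \, \ln \left( \frac{b_{\max}}{b_{\min}} \right) = v^2 . $$
$$ \therefore \quad T_{\rm relax} = \frac{1}{6 N \ln \left( \dfrac{b_{\max}}{b_{\min}} \right)} \, \frac{(R v)^3}{(G m)^2} , \tag{2.24} $$
or putting $b_{\max} \simeq R$,
$$ T_{\rm relax} = \frac{1}{6 N \ln \left( \dfrac{R}{b_{\min}} \right)} \, \frac{(R v)^3}{(G m)^2} . \tag{2.25} $$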
Equation 2.25 enables us to estimate the relaxation time for a system of stars, such as a galaxy or a globular cluster. Different derivations can have slightly different numerical constants because of the different assumptions made. In practice, $b_{\min}$ is often set to the scale on which strong encounters begin to operate, so $b_{\min} \simeq 1$ AU. The precise values of $b_{\max}$ and $b_{\min}$ have relatively little effect on the estimation of the relaxation time because of the log dependence.

As an example of the calculation of the relaxation time, consider an elliptical galaxy. This has: $v \simeq 300$ km s$^{-1}$ $= 3.0 \times 10^{5}$ m s$^{-1}$, $N \simeq 10^{11}$, $R \simeq 10$ kpc $\simeq 3.1 \times 10^{20}$ m and $m \simeq 1 \, M_\odot \simeq 2.0 \times 10^{30}$ kg. So, $\ln(R/b_{\min}) \simeq 21$ and $T_{\rm relax} \sim 10^{24}$ s $\sim 10^{17}$ yr. The Universe is $14 \times 10^{9}$ yr old, which means that the relaxation time is $\sim 10^{7}$ times the age of the Universe. So star-star encounters are of no significance for galaxies.

For a large globular cluster, we have: $v \simeq 10$ km s$^{-1}$ $= 10^{4}$ m s$^{-1}$, $N \simeq 500\,000$, $R \simeq 5$ pc $\simeq 1.6 \times 10^{17}$ m and $m \simeq 1 \, M_\odot \simeq 2.0 \times 10^{30}$ kg. So, $\ln(R/b_{\min}) \simeq 14$ and $T_{\rm relax} \sim 5 \times 10^{15}$ s $\sim 2 \times 10^{8}$ yr. This is a small fraction ($\sim 10^{-2}$) of the age of the Galaxy. Two-body interactions are therefore significant in globular clusters.

The importance of the relaxation time calculation is that it enables us to decide whether we need to allow for star-star interactions when modelling the dynamics of a system of stars. Modelling becomes much easier if two-body encounters can be ignored. A system where these interactions are not important is called a collisionless system. This is why stars on the scale of galaxies were described as being collisionless in Chapter 1. Fortunately, we can ignore these star-star interactions when modelling galaxies and this makes possible the use of a result called the collisionless Boltzmann equation later.
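The two relaxation-time examples can be evaluated directly from Equation 2.25. The following is a minimal sketch using the input values quoted in the text, with $b_{\min} = 1$ AU; the function name and the adopted year length are illustrative assumptions.

```python
import math

G = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 2.0e30   # solar mass in kg (rounded)
AU = 1.496e11    # astronomical unit in m
YEAR = 3.156e7   # one year in seconds

def relaxation_time(n_stars, radius, v, m_star, b_min=AU):
    """T_relax = (R v)^3 / (6 N ln(R / b_min) (G m)^2), Equation 2.25, SI units."""
    coulomb_log = math.log(radius / b_min)
    return (radius * v) ** 3 / (6.0 * n_stars * coulomb_log * (G * m_star) ** 2)

# Elliptical galaxy: N = 1e11, R = 10 kpc, v = 300 km/s.
t_galaxy = relaxation_time(1e11, 3.1e20, 3.0e5, M_SUN)
# Large globular cluster: N = 5e5, R = 5 pc, v = 10 km/s.
t_cluster = relaxation_time(5e5, 1.6e17, 1.0e4, M_SUN)

for name, t in (("elliptical galaxy", t_galaxy), ("globular cluster", t_cluster)):
    print(f"{name}: T_relax ~ {t:.1e} s ~ {t / YEAR:.1e} yr")
```

Because $T_{\rm relax} \propto (Rv)^3 / N$, the very different sizes and velocities of the two systems dominate the result; the logarithmic factor changes only from about 21 to about 14 between them.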
2.6
The Ratio of the Relaxation Time to the Crossing Time
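An approximate expression for the ratio of the relaxation time to the crossing time can be calculated easily. Dividing the expressions for the relaxation and crossing times (Equations 2.25 and 2.13, the latter with $T_{\rm cross} = 2R/v$),
$$ \frac{T_{\rm relax}}{T_{\rm cross}} = \frac{1}{12 N \ln \left( \dfrac{R}{b_{\min}} \right)} \, \frac{R^2 v^4}{(G m)^2} . $$
For a uniform sphere, from the virial theorem (Equation 2.10),
$$ v^2 \simeq \frac{N G m}{R} , $$
and setting $b_{\min}$ equal to the strong encounter radius $r_S = 2Gm/v^2$ (Equation 2.16), we get
$$ \frac{T_{\rm relax}}{T_{\rm cross}} = \frac{1}{12 N \ln \left( \dfrac{R v^2}{2 G m} \right)} \, \frac{R^2 v^4}{(G m)^2} \simeq \frac{N^2}{12 N \ln N} , $$
because $R^2 v^4 / (Gm)^2 \simeq N^2$ and $R v^2 / (2Gm) \simeq N/2$, with $\ln(N/2) \simeq \ln N$ for large N.
$$ \therefore \quad \frac{T_{\rm relax}}{T_{\rm cross}} \simeq \frac{N}{12 \ln N} . \tag{2.26} $$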
For a galaxy, $N \sim 10^{11}$. Therefore $T_{\rm relax}/T_{\rm cross} \sim 3 \times 10^{8}$. For a globular cluster, $N \sim 10^{5}$ and $T_{\rm relax}/T_{\rm cross} \sim 10^{3}$.
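Equation 2.26 depends only on the number of stars, so the two cases above take one line each to evaluate. This is a minimal sketch; the function name is a hypothetical label introduced here.

```python
import math

def relax_to_cross_ratio(n_stars):
    """T_relax / T_cross ~ N / (12 ln N), Equation 2.26."""
    return n_stars / (12.0 * math.log(n_stars))

ratio_galaxy = relax_to_cross_ratio(1e11)
ratio_cluster = relax_to_cross_ratio(1e5)
print(f"galaxy (N = 1e11):           T_relax / T_cross ~ {ratio_galaxy:.1e}")
print(f"globular cluster (N = 1e5):  T_relax / T_cross ~ {ratio_cluster:.1e}")
```

The ratio rises almost linearly with N, so larger systems are progressively closer to being collisionless.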
2.7
The Nature of the Gravitational Potential in a Galaxy
The gravitational potential in a galaxy can be represented as essentially having two components. The first of these is the broad, smooth, underlying potential due to the entire galaxy. This is the sum of the potentials of all the stars, and also of the dark matter and the interstellar medium. The second component is the localised deeper potentials due to individual stars. We can effectively regard the potential as being made of a smooth component with very localised deep potentials superimposed on it. This is illustrated figuratively in Figure 2.1.
Figure 2.1: A sketch of the gravitational potential of a galaxy, showing the broad potential of the galaxy as a whole, and the deeper, localised potentials of individual stars.
Interactions between individual stars are rare, as we have seen, and therefore it is the broad distribution that determines the motions of stars. Therefore, we can represent the dynamics of a system of stars using only the smooth underlying component of the gravitational potential Φ(x, t), where x is the position vector of a point and t is the time. If the galaxy has reached a steady state, Φ is Φ(x) only. We shall neglect the effect of the localised potentials of stars in the following sections, which is an acceptable approximation as we have shown.
2.8
Gravitational potentials, density distributions and masses
2.8.1
General principles
The distribution of mass in a galaxy – including both the visible and dark matter – determines the gravitational potential. The potential Φ at any point is related to the local density ρ by Poisson’s Equation,
$$ \nabla^2 \Phi \ (\equiv \nabla \cdot \nabla \Phi) = 4 \pi G \rho . $$
This means that if we know the density $\rho(\mathbf{x})$ as a function of position across a galaxy, we can calculate the potential Φ, either analytically or numerically, by integration. Alternatively, if we know $\Phi(\mathbf{x})$, we can calculate the density profile $\rho(\mathbf{x})$ by differentiation. In addition, because the acceleration due to gravity $\mathbf{g}$ is related to the potential by $\mathbf{g} = -\nabla \Phi$, we can compute $\mathbf{g}(\mathbf{x})$ from $\Phi(\mathbf{x})$ and vice-versa. Similarly, substituting for $\mathbf{g} = -\nabla \Phi$ in the Poisson Equation gives $\nabla \cdot \mathbf{g} = -4 \pi G \rho$.

These computations are often done for some example theoretical representations of the potential or density. A number of convenient analytical functions are encountered in the literature, depending on the type of galaxy being modelled and particular circumstances.
The issue of determining actual density profiles and potentials from observations of galaxies is much more challenging, however. Observations readily give the projected density distributions of stars on the sky, and we can attempt to derive the three-dimensional distribution of stars from this. However, it is the total density $\rho(\mathbf{x})$, including dark matter $\rho_{\rm DM}(\mathbf{x})$, that is relevant gravitationally, with $\rho(\mathbf{x}) = \rho_{\rm DM}(\mathbf{x}) + \rho_{\rm VIS}(\mathbf{x})$. The dark matter distribution can only be inferred from the dynamics of visible matter (or to a limited extent from gravitational lensing of background objects). In practice, therefore, the three-dimensional density distribution $\rho(\mathbf{x})$ and the gravitational potential $\Phi(\mathbf{x})$ are poorly known.
2.8.2
Spherical symmetry
Calculating the relationship between density and potential is much simpler if we are dealing with spherically symmetric distributions, which are appropriate in some circumstances such as spherical elliptical galaxies. Under spherical symmetry, ρ and Φ are functions only of the radial distance r from the centre of the distribution. Therefore,
$$ \nabla^2 \Phi = \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Phi}{dr} \right) = 4 \pi G \rho $$
because Φ is independent of the angles θ and φ in a spherical coordinate system (see Appendix B).

Another useful parameter for spherically symmetric distributions is the mass M(r) that lies inside a radius r. We can relate this to the density ρ(r) by considering a thin spherical shell of radius r and thickness dr centred on the distribution. The mass of this shell is $dM(r) = \rho(r) \times \text{surface area} \times \text{thickness} = 4 \pi r^2 \rho(r) \, dr$. This gives us the differential equation
$$ \frac{dM}{dr} = 4 \pi r^2 \rho , \tag{2.27} $$
often known as the equation of continuity of mass. The total mass is $M_{\rm tot} = \lim_{r \to \infty} M(r)$.

The gravitational acceleration $\mathbf{g}$ in a spherical distribution has an absolute value $|\mathbf{g}|$ of
$$ g = \frac{G M(r)}{r^2} , \tag{2.28} $$
at a distance r from the centre (a result derived in Appendix B), where G is the constant of gravitation, and is directed towards the centre of the distribution.
2.8.3
Two examples of spherical potentials
The Plummer Potential
A function that is often used for the theoretical modelling of spherically-symmetric galaxies is the Plummer potential. This has a gravitational potential Φ at a radial distance r from the centre that is given by
$$ \Phi(r) = - \frac{G M_{\rm tot}}{\sqrt{r^2 + a^2}} , \tag{2.29} $$
where $M_{\rm tot}$ is the total mass of the galaxy and a is a constant. The constant a serves to flatten the potential in the core. For this potential the density ρ at a radial distance r is
$$ \rho(r) = \frac{3 M_{\rm tot}}{4 \pi} \, \frac{a^2}{(r^2 + a^2)^{5/2}} , \tag{2.30} $$
which can be derived from the expression for Φ using the Poisson equation $\nabla^2 \Phi = 4 \pi G \rho$. This density scales with radius as $\rho \sim r^{-5}$ at large radii.

The mass interior to a point M(r) can be computed from the density ρ using $dM/dr = 4 \pi r^2 \rho$, or from the potential Φ using Gauss’s Law in the form $\int_S \nabla \Phi \cdot d\mathbf{S} = 4 \pi G M(r)$ for a spherical surface of radius r. The result is
$$ M(r) = \frac{M_{\rm tot} \, r^3}{(r^2 + a^2)^{3/2}} . \tag{2.31} $$
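The three Plummer expressions (2.29)–(2.31) are mutually consistent, and this can be checked numerically with finite differences. The sketch below does so at one radius; the total mass, the scale length a = 1 kpc, the test radius and the step size are arbitrary illustrative assumptions, and the function names are hypothetical.

```python
import math

G = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2

def plummer_potential(r, m_tot, a):
    """Phi(r) = -G M_tot / sqrt(r^2 + a^2), Equation 2.29."""
    return -G * m_tot / math.sqrt(r * r + a * a)

def plummer_density(r, m_tot, a):
    """rho(r) = 3 M_tot a^2 / (4 pi (r^2 + a^2)^(5/2)), Equation 2.30."""
    return 3.0 * m_tot * a * a / (4.0 * math.pi * (r * r + a * a) ** 2.5)

def plummer_mass(r, m_tot, a):
    """M(r) = M_tot r^3 / (r^2 + a^2)^(3/2), Equation 2.31."""
    return m_tot * r**3 / (r * r + a * a) ** 1.5

def derivative(func, r, h):
    """Central finite-difference estimate of d(func)/dr."""
    return (func(r + h) - func(r - h)) / (2.0 * h)

M_TOT = 1e11 * 2.0e30   # assumed total mass: 1e11 solar masses, in kg
A = 3.086e19            # assumed scale length: 1 kpc, in m
r = 2.0 * A             # test radius
h = 1e-4 * A            # finite-difference step

# Equation 2.27: dM/dr should equal 4 pi r^2 rho.
dM_dr = derivative(lambda s: plummer_mass(s, M_TOT, A), r, h)
shell = 4.0 * math.pi * r * r * plummer_density(r, M_TOT, A)

# Equation 2.28: |g| = dPhi/dr should equal G M(r) / r^2.
dPhi_dr = derivative(lambda s: plummer_potential(s, M_TOT, A), r, h)
g = G * plummer_mass(r, M_TOT, A) / (r * r)

print(f"dM/dr = {dM_dr:.6e},  4 pi r^2 rho = {shell:.6e}")
print(f"dPhi/dr = {dPhi_dr:.6e},  G M(r)/r^2 = {g:.6e}")
print(f"M(1000 a) / M_tot = {plummer_mass(1000 * A, M_TOT, A) / M_TOT:.6f}")
```

The last line illustrates that M(r) tends to $M_{\rm tot}$ at large radii, so that the total mass of the Plummer model is finite.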
The Plummer potential was first used in 1911 by H. C. K. Plummer (1875–1946) to describe globular clusters. Because of the simple functional forms, the Plummer model is sometimes useful for approximate analytical modelling of galaxies, but the $r^{-5}$ density profile is much steeper than elliptical galaxies are observed to have.

The Isothermal Sphere
The density distribution known as the isothermal sphere is a spherical model of a galaxy that is identical to the distribution that would be followed by a stable cloud of gas having the same temperature everywhere.

A spherically-symmetric cloud of gas having a single temperature T throughout would have a gas pressure P(r) at a radius r from its centre that is related to T by the ideal gas law as $P(r) = n_p(r) \, k_B T$, where $n_p(r)$ is the number density of gas particles (atoms or molecules) at radius r and $k_B$ is the Boltzmann constant. The cloud will be supported by hydrostatic equilibrium, so
$$ \frac{dP}{dr} = - \frac{G M(r)}{r^2} \, \rho(r) , \tag{2.32} $$
where M(r) is the mass enclosed within a radius r. The gradient in the mass is $dM/dr = 4 \pi r^2 \rho(r)$. These equations have a solution
$$ \rho(r) = \frac{\sigma^2}{2 \pi G \, r^2} \quad \text{and} \quad M(r) = \frac{2 \sigma^2 r}{G} , \quad \text{where} \quad \sigma^2 \equiv \frac{k_B T}{m_p} , \tag{2.33} $$
where $m_p$ is the mass of each gas particle. The parameter σ is the root-mean-square velocity in any direction.

The isothermal sphere model for a system of stars is defined to be a model that has the same density distribution as the isothermal gas cloud. Therefore, an isothermal galaxy would also have a density ρ(r) and mass M(r) interior to a radius r given by
$$ \rho(r) = \frac{\sigma^2}{2 \pi G \, r^2} \quad \text{and} \quad M(r) = \frac{2 \sigma^2 r}{G} , \tag{2.34} $$
where σ is the root-mean-square velocity of the stars along any direction.

The isothermal sphere model is sometimes used for the analytical modelling of galaxies. While it has some advantages of simplicity, it does suffer from the disadvantage of being unrealistic in some important respects. Most significantly, the model fails totally at large radii: formally the limit of M(r) as $r \to \infty$ is infinite.
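That the quoted density and mass do satisfy both the mass continuity equation and hydrostatic equilibrium can be confirmed by direct substitution. The short check below is a sketch added for illustration; it assumes, as the definition of σ implies, that the gas mass density is $\rho = n_p m_p$, so that the ideal gas law can be written $P = \rho \sigma^2$.

```latex
% Mass continuity, dM/dr = 4 pi r^2 rho:
\frac{dM}{dr} = \frac{d}{dr}\!\left(\frac{2\sigma^2 r}{G}\right) = \frac{2\sigma^2}{G},
\qquad
4\pi r^2 \rho = 4\pi r^2 \,\frac{\sigma^2}{2\pi G r^2} = \frac{2\sigma^2}{G}.

% Hydrostatic equilibrium, with P = n_p k_B T = (\rho/m_p)\,k_B T = \rho\,\sigma^2:
\frac{dP}{dr} = \sigma^2 \frac{d\rho}{dr}
             = \sigma^2 \left(-\frac{2\sigma^2}{2\pi G r^3}\right)
             = -\frac{\sigma^4}{\pi G r^3},
\qquad
-\frac{G M(r)}{r^2}\,\rho(r)
             = -\frac{G}{r^2}\,\frac{2\sigma^2 r}{G}\,\frac{\sigma^2}{2\pi G r^2}
             = -\frac{\sigma^4}{\pi G r^3}.
```

Both pairs of expressions agree, so the $r^{-2}$ density profile is a solution at every radius. The same substitution shows directly why the enclosed mass grows linearly with r and never converges.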
2.9
Phase Space and the Distribution Function f (x, v, t)
To describe the dynamics of a galaxy, we could use:
• the positions of each star, $\mathbf{x}_i$
• the velocities of each star, $\mathbf{v}_i$
where i = 1 to N, with $N \sim 10^{6}$ to $10^{12}$. However, this would be impractical numerically. If we tried to store these data on a computer as 4-byte numbers for every star in a galaxy having $N \sim 10^{12}$ stars, we would need $6 \times 4 \times 10^{12}$ bytes $\sim 2 \times 10^{13}$ bytes $\sim 20\,000$ Gbyte. This is such a large data size that the storage requirements are prohibitive. If we needed to simulate a galaxy theoretically, we would need to follow the galaxy over time using a large number of time steps. Storing the complete set of data for, say, $10^{3}$ to $10^{6}$ time steps would be impossible. Observationally, meanwhile, it is impossible to determine the positions and motions of every star in any galaxy, even our own.

In practice, therefore, people represent the stars in a galaxy using the distribution function $f(\mathbf{x}, \mathbf{v}, t)$ over position $\mathbf{x}$ and velocity $\mathbf{v}$, at a time t. This is the probability density in the 6-dimensional phase space of position and velocity at a given time. It is also known as the “phase space density”. It requires only modest data resources to store the function numerically for a model of a galaxy, while f can also be modelled analytically.

The number of stars in a rectangular box between x and x + dx, y and y + dy, z and z + dz, with velocity components between $v_x$ and $v_x + dv_x$, $v_y$ and $v_y + dv_y$, $v_z$ and $v_z + dv_z$, is $f(\mathbf{x}, \mathbf{v}, t) \, dx \, dy \, dz \, dv_x \, dv_y \, dv_z \equiv f(\mathbf{x}, \mathbf{v}, t) \, d^3\mathbf{x} \, d^3\mathbf{v}$. The number density $n(\mathbf{x}, t)$ of stars in space can be obtained from the distribution function f by integrating over the velocity components,
$$ n(\mathbf{x}, t) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\mathbf{x}, \mathbf{v}, t) \, dv_x \, dv_y \, dv_z = \int f(\mathbf{x}, \mathbf{v}, t) \, d^3\mathbf{v} . \tag{2.35} $$
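The storage estimate made at the start of this section is simple arithmetic, reproduced in the sketch below. The function name is a hypothetical label, and the choice of $10^{3}$ stored time steps is just the lower end of the range quoted in the text.

```python
def snapshot_bytes(n_stars, bytes_per_number=4, coords_per_star=6):
    """Bytes needed to store 3 position and 3 velocity components for every star."""
    return n_stars * coords_per_star * bytes_per_number

one_snapshot = snapshot_bytes(10**12)       # a galaxy of 1e12 stars
gbytes = one_snapshot / 1e9                 # decimal gigabytes
thousand_steps = 1000 * one_snapshot        # 1e3 stored time steps
print(f"one snapshot: {one_snapshot:.1e} bytes = {gbytes:,.0f} Gbyte")
print(f"1000 time steps: {thousand_steps:.1e} bytes")
```

The exact figure for one snapshot is 24 000 Gbyte, consistent with the rounded value of about 20 000 Gbyte given in the text.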
2.10
The Continuity Equation
We shall assume here that stars are conserved: for the purpose of modelling galaxies we shall assume that the number of stars does not change. This means ignoring star formation and the deaths of stars, but it is acceptable for the present purposes.

The assumption that stars are conserved results in the continuity equation. This relates the rate of change of the distribution function f with time to its rates of change with position and velocity. The equation becomes an important starting point in deriving other equations that relate f to the gravitational potential and to observational quantities.

Consider the $x$–$v_x$ plane within the 6-dimensional phase space $(x, y, z, v_x, v_y, v_z)$ in Cartesian coordinates. Consider a rectangular box in the plane extending from x to x + ∆x and $v_x$ to $v_x + \Delta v_x$. But the velocity $v_x$ means that stars move in x ($v_x \equiv dx/dt$). So there is a flow of stars through the box in both the x and the $v_x$ directions.
We can represent the flow of stars by the continuity equation: µ ¶ µ ¶ µ ¶ ¶ µ ∂f ∂ dx ∂ dy ∂ dz ∂ dvx + f + f + f + + f ∂t ∂x dt ∂y dt ∂z dt ∂vx dt µ µ ¶ ¶ dvy ∂ dvz ∂ f f + = 0 . ∂vy dt ∂vz dt
(2.36)
This can be abbreviated as µ µ ¶ ¶¶ 3 µ X ∂f dxi dvi ∂ ∂ f f + + = 0, ∂t ∂xi dt ∂vi dt
(2.37)
i=1
where x1 ≡ x, x2 ≡ y, x3 ≡ z, v1 ≡ vx , v2 ≡ vy , and v3 ≡ vz . It is sometimes also abbreviated as µ µ ¶ ¶ ∂ dx dv ∂f ∂ + · f · f + = 0 , (2.38) ∂t ∂x dt ∂v dt where, in this notation, for any vectors a and b with components (a1 , a2 , a3 ) and (b1 , b2 , b3 ), 3 X ∂bi ∂ . ·b ≡ ∂a ∂ai
(2.39)
i=1
(Note that it does not mean a direct differentiation by a vector). It is also possible to simplify the notation further by introducing a combined phase space coordinate system w = (x, v) with components (w1 , w2 , w3 , w4 , w5 , w6 ) = (x, y, z, vx , vy , vz ). In this case the continuity equation becomes 6 X ∂f ∂ (f w˙ i ) = 0 . + ∂t ∂wi
(2.40)
i=1
The equation of continuity can also be expressed in terms of the momentum p = mv, where m is mass of an element of gas, as µ ¶ µ ¶ ∂f ∂ dx ∂ dp + · f + · f = 0 . (2.41) ∂t ∂x dt ∂p dt
2.11 The Collisionless Boltzmann Equation
2.11.1 The importance of the Collisionless Boltzmann Equation
Equation 2.25 showed that the relaxation time for galaxies is very long, significantly longer than the age of the Universe: galaxies are collisionless systems. This, fortunately, simplifies the analysis of the dynamics of stars in galaxies. It is possible to derive an equation from the continuity equation that more explicitly states the relation between the distribution function f , position x, velocity v and time t. This is the collisionless Boltzmann equation (C.B.E.), which takes its name from a similar equation in statistical physics derived by Boltzmann to describe particles in a gas.
2.11.2 A derivation of the Collisionless Boltzmann Equation
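The continuity equation (2.37) states that

$$ \frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( \frac{\partial}{\partial x_i}\left(f \frac{dx_i}{dt}\right) + \frac{\partial}{\partial v_i}\left(f \frac{dv_i}{dt}\right) \right) = 0 , $$

where $f$ is the distribution function in the Cartesian phase space $(x_1, x_2, x_3, v_1, v_2, v_3)$. But the acceleration of a star is given by the gradient of the gravitational potential $\Phi$:

$$ \frac{dv_i}{dt} = - \frac{\partial \Phi}{\partial x_i} $$

in each direction (i.e. for each value of $i$ for $i = 1, 2, 3$). (This is simply $d\mathbf{v}/dt = \mathbf{g} = -\nabla\Phi$ resolved into each dimension.) We also have $dx_i/dt = v_i$, so,

$$ \frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( \frac{\partial}{\partial x_i}\left(f v_i\right) + \frac{\partial}{\partial v_i}\left(-f \frac{\partial \Phi}{\partial x_i}\right) \right) = 0 . $$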
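But $v_i$ is a coordinate, not a value associated with a particular star: we are using the continuous function $f$ rather than considering individual stars. Therefore $v_i$ is independent of $x_i$. So,

$$ \frac{\partial}{\partial x_i}\left(f v_i\right) = v_i \frac{\partial f}{\partial x_i} . $$

The potential $\Phi \equiv \Phi(\mathbf{x}, t)$ does not depend on $v_i$: $\Phi$ is independent of velocity.

$$ \therefore \quad \frac{\partial}{\partial v_i}\left(f \frac{\partial \Phi}{\partial x_i}\right) = \frac{\partial \Phi}{\partial x_i} \frac{\partial f}{\partial v_i} $$

$$ \therefore \quad \frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( v_i \frac{\partial f}{\partial x_i} - \frac{\partial \Phi}{\partial x_i} \frac{\partial f}{\partial v_i} \right) = 0 . $$

But $dv_i/dt = -\partial\Phi/\partial x_i$, so,

$$ \frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( v_i \frac{\partial f}{\partial x_i} + \frac{dv_i}{dt} \frac{\partial f}{\partial v_i} \right) = 0 . \tag{2.42} $$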
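This is the collisionless Boltzmann equation. It can also be written as

$$ \frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( \frac{dx_i}{dt} \frac{\partial f}{\partial x_i} + \frac{dv_i}{dt} \frac{\partial f}{\partial v_i} \right) = 0 . \tag{2.43} $$

Alternatively it can be expressed as

$$ \frac{\partial f}{\partial t} + \sum_{i=1}^{6} \dot{w}_i \frac{\partial f}{\partial w_i} = 0 , \tag{2.44} $$

where $\mathbf{w} = (\mathbf{x}, \mathbf{v})$ is a 6-dimensional coordinate system, and also as

$$ \frac{\partial f}{\partial t} + \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} + \frac{d\mathbf{v}}{dt} \cdot \frac{\partial f}{\partial \mathbf{v}} = 0 , \tag{2.45} $$

and as

$$ \frac{\partial f}{\partial t} + \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} + \frac{d\mathbf{p}}{dt} \cdot \frac{\partial f}{\partial \mathbf{p}} = 0 . \tag{2.46} $$

Note the use here of the notation

$$ \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} \equiv \sum_{i=1}^{3} \frac{dx_i}{dt} \frac{\partial f}{\partial x_i} , \quad \text{etc.} \tag{2.47} $$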
2.11.3 Deriving the Collisionless Boltzmann Equation using Hamiltonian Mechanics
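The collisionless Boltzmann equation can also be derived from the continuity equation using Hamiltonian mechanics. This derivation is given here. It has the advantage of being neat. However, do not worry if you are not familiar with Hamiltonian mechanics: this is given as an alternative to Section 2.11.2. Hamilton's Equations relate the differentials of the position vector $\mathbf{x}$ and of the (generalised) momentum $\mathbf{p}$ to the differential of the Hamiltonian $H$:

$$ \frac{d\mathbf{x}}{dt} = \frac{\partial H}{\partial \mathbf{p}} , \qquad \frac{d\mathbf{p}}{dt} = - \frac{\partial H}{\partial \mathbf{x}} . \tag{2.48} $$

(In this notation this means

$$ \frac{dx_i}{dt} = \frac{\partial H}{\partial p_i} \quad \text{and} \quad \frac{dp_i}{dt} = - \frac{\partial H}{\partial x_i} \quad \text{for } i = 1 \text{ to } 3, \tag{2.49} $$

where $x_i$ and $p_i$ are the components of $\mathbf{x}$ and $\mathbf{p}$.) Substituting for $d\mathbf{x}/dt$ and $d\mathbf{p}/dt$ into the continuity equation,

$$ \frac{\partial f}{\partial t} + \frac{\partial}{\partial \mathbf{x}} \cdot \left(f \frac{\partial H}{\partial \mathbf{p}}\right) + \frac{\partial}{\partial \mathbf{p}} \cdot \left(-f \frac{\partial H}{\partial \mathbf{x}}\right) = 0 . $$

For a star moving in a gravitational potential $\Phi$, the Hamiltonian is

$$ H = \frac{p^2}{2m} + m\,\Phi(\mathbf{x}) = \frac{\mathbf{p} \cdot \mathbf{p}}{2m} + m\,\Phi(\mathbf{x}) , \tag{2.50} $$

where $\mathbf{p}$ is its momentum and $m$ is its mass. Differentiating,

$$ \begin{aligned} \frac{\partial H}{\partial \mathbf{p}} &= \frac{d}{d\mathbf{p}} \left( \frac{\mathbf{p} \cdot \mathbf{p}}{2m} \right) + \frac{d}{d\mathbf{p}} \left( m\Phi \right) \\ &= \frac{\mathbf{p}}{m} + 0 \qquad \text{because } \Phi(\mathbf{x}, t) \text{ is independent of } \mathbf{p} \\ &= \frac{\mathbf{p}}{m} \end{aligned} $$

and

$$ \begin{aligned} \frac{\partial H}{\partial \mathbf{x}} &= \frac{\partial}{\partial \mathbf{x}} \left( \frac{p^2}{2m} \right) + m \frac{\partial \Phi}{\partial \mathbf{x}} \\ &= 0 + m \frac{\partial \Phi}{\partial \mathbf{x}} \qquad \text{because } p^2 = \mathbf{p} \cdot \mathbf{p} \text{ is independent of } \mathbf{x} \\ &= m \frac{\partial \Phi}{\partial \mathbf{x}} . \end{aligned} $$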
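Substituting for $\partial H/\partial \mathbf{p}$ and $\partial H/\partial \mathbf{x}$,

$$ \frac{\partial f}{\partial t} + \frac{\partial}{\partial \mathbf{x}} \cdot \left( f \frac{\mathbf{p}}{m} \right) - \frac{\partial}{\partial \mathbf{p}} \cdot \left( f m \frac{\partial \Phi}{\partial \mathbf{x}} \right) = 0 $$

$$ \therefore \quad \frac{\partial f}{\partial t} + \frac{\mathbf{p}}{m} \cdot \frac{\partial f}{\partial \mathbf{x}} - m \frac{\partial \Phi}{\partial \mathbf{x}} \cdot \frac{\partial f}{\partial \mathbf{p}} = 0 $$

because $\mathbf{p}$ is independent of $\mathbf{x}$, and because $\partial\Phi/\partial\mathbf{x}$ is independent of $\mathbf{p}$ since $\Phi \equiv \Phi(\mathbf{x}, t)$. But the momentum is $\mathbf{p} = m\, d\mathbf{x}/dt$ and the acceleration is $\frac{1}{m}\, d\mathbf{p}/dt = -\partial\Phi/\partial\mathbf{x}$ (the gradient of the potential).

$$ \therefore \quad \frac{\partial \Phi}{\partial \mathbf{x}} = - \frac{1}{m} \frac{d\mathbf{p}}{dt} . $$

So,

$$ \frac{\partial f}{\partial t} + \frac{m}{m} \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} - m \left( - \frac{1}{m} \frac{d\mathbf{p}}{dt} \right) \cdot \frac{\partial f}{\partial \mathbf{p}} = 0 $$

$$ \therefore \quad \frac{\partial f}{\partial t} + \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} + \frac{d\mathbf{p}}{dt} \cdot \frac{\partial f}{\partial \mathbf{p}} = 0 . $$

The left-hand side is the total time derivative $df/dt$. So,

$$ \frac{df}{dt} \equiv \frac{\partial f}{\partial t} + \frac{d\mathbf{x}}{dt} \cdot \frac{\partial f}{\partial \mathbf{x}} + \frac{d\mathbf{p}}{dt} \cdot \frac{\partial f}{\partial \mathbf{p}} = 0 . \tag{2.51} $$

This is the collisionless Boltzmann equation. While this equation is called the collisionless Boltzmann equation (or CBE) in stellar dynamics, in Hamiltonian dynamics it is known as Liouville's theorem.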
2.12 The implications of the Collisionless Boltzmann Equation
The collisionless Boltzmann equation tells us that $df/dt = 0$. This means that the density in phase space, $f$, does not change with time for a test particle. Therefore if we follow a star in orbit, the density $f$ in 6-dimensional phase space around the star is constant.

This simple result has important implications. If a star moves inwards in a galaxy as it follows its orbit, the density of stars in space increases (because the density of stars in each of the components of the galaxy is greater closer to the centre). $df/dt = 0$ then tells us that the spread of stellar velocities around the star will increase to keep $f$ constant. Therefore the velocity dispersion around the star increases as the star moves inwards. The velocity dispersion is larger in regions of the galaxy where the density of stars is greater. Conversely, if a star moves out from the centre, the density of stars around it will decrease and the velocity dispersion will decrease to keep $f$ constant.

The collisionless Boltzmann equation and the Poisson equation (which is the gravitational analogue of Gauss's law in electrostatics) together constitute the basic equations of stellar dynamics:

$$ \frac{df}{dt} = 0 , \qquad \nabla^2 \Phi(\mathbf{x}) = 4\pi G \rho(\mathbf{x}) , \tag{2.52} $$

where $f$ is the distribution function, $t$ is time, $\Phi(\mathbf{x}, t)$ is the gravitational potential at point $\mathbf{x}$, $\rho(\mathbf{x}, t)$ is the mass density at point $\mathbf{x}$, and $G$ is the constant of gravitation. The collisionless Boltzmann equation applies because star-star encounters do not change the motions of stars significantly over the lifetime of a galaxy, as was shown in Section 2.5. Were this not the case, and the system collisional, the CBE would have to be modified by adding a "collisional term" on the right-hand side.
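The constancy of $f$ along an orbit can be illustrated numerically. The following is a minimal sketch, not part of the original notes: it follows three neighbouring stars in a one-dimensional phase space $(x, v)$ and checks that the small area they enclose is unchanged by the motion, so the density of stars in that patch stays the same. The potential $\Phi = x^2/2 + x^4/4$, the leapfrog integrator, the step size and the starting points are all assumptions chosen for illustration.

```python
def acceleration(x):
    """Return -dPhi/dx for the illustrative potential Phi(x) = x**2/2 + x**4/4."""
    return -(x + x**3)


def leapfrog(x, v, dt, n_steps):
    """Advance one phase-space point (x, v) with kick-drift-kick leapfrog steps."""
    for _ in range(n_steps):
        v += 0.5 * dt * acceleration(x)
        x += dt * v
        v += 0.5 * dt * acceleration(x)
    return x, v


def triangle_area(p, q, r):
    """Area of the triangle with vertices p, q, r in the (x, v) plane."""
    return 0.5 * abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))


# Three neighbouring "stars" marking out a tiny patch of phase space.
eps = 1.0e-6
corners = [(1.0, 0.0), (1.0 + eps, 0.0), (1.0, eps)]
area_start = triangle_area(*corners)

# Follow all three along their orbits for the same length of time.
evolved = [leapfrog(x, v, dt=0.01, n_steps=500) for (x, v) in corners]
area_end = triangle_area(*evolved)

# A fixed number of stars in an unchanged area means an unchanged density f.
area_ratio = area_end / area_start
print("phase-space area ratio after evolution:", area_ratio)
```

The patch is sheared as it moves, so its extent in $x$ and its extent in $v$ both change, but their product does not. This is the trade-off between spatial density and velocity spread described above.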
Though $f$ is a density in phase space, the full form of the collisionless Boltzmann equation does not necessarily have to be written in terms of $\mathbf{x}$ and $\mathbf{p}$. We can express $df/dt = 0$ in any set of six variables in phase space. You should remember that $f$ is always taken to be a density in six-dimensional phase space, even in situations where it is a function of fewer variables. For example, if $f$ happens to be a function of energy alone, it is not the same as the density in energy space.
2.13 The Collisionless Boltzmann Equation in Cylindrical Coordinates
So far we have considered Cartesian coordinates $(x, y, z, v_x, v_y, v_z)$. However, the form

$$ \frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( \frac{dx_i}{dt} \frac{\partial f}{\partial x_i} + \frac{dv_i}{dt} \frac{\partial f}{\partial v_i} \right) = 0 , $$

for the collisionless Boltzmann equation of Equation 2.43 applies to any coordinate system. For a galaxy, it is often more convenient to use cylindrical coordinates with the centre of the galaxy as the origin.
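The coordinates of a star are $(R, \phi, z)$. A cylindrical system is particularly useful for spiral galaxies like our own where the $z = 0$ plane is set to the Galactic plane. (Note the use of a lower-case $\phi$ as a coordinate angle, whereas elsewhere we have used a capital $\Phi$ to denote the gravitational potential.) The collisionless Boltzmann equation in this system is

$$ \frac{df}{dt} = \frac{\partial f}{\partial t} + \frac{dR}{dt} \frac{\partial f}{\partial R} + \frac{d\phi}{dt} \frac{\partial f}{\partial \phi} + \frac{dz}{dt} \frac{\partial f}{\partial z} + \frac{dv_R}{dt} \frac{\partial f}{\partial v_R} + \frac{dv_\phi}{dt} \frac{\partial f}{\partial v_\phi} + \frac{dv_z}{dt} \frac{\partial f}{\partial v_z} = 0 , \tag{2.53} $$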
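where $v_R$, $v_\phi$, and $v_z$ are the components of the velocity in the $R$, $\phi$, $z$ directions. We need to replace the differentials of the velocity components with more convenient terms. $dv_R/dt$, $dv_\phi/dt$ and $dv_z/dt$ are related to the acceleration $\mathbf{a}$ (but are not actually the components of the acceleration for the $R$ and $\phi$ directions). The velocity and acceleration in terms of these differentials in a cylindrical coordinate system are

$$ \begin{aligned} \mathbf{v} &= \frac{d\mathbf{r}}{dt} = \frac{dR}{dt}\, \hat{\mathbf{e}}_R + R \frac{d\phi}{dt}\, \hat{\mathbf{e}}_\phi + \frac{dz}{dt}\, \hat{\mathbf{e}}_z , \\ \mathbf{a} &= \frac{d\mathbf{v}}{dt} = \left( \frac{d^2R}{dt^2} - R \left( \frac{d\phi}{dt} \right)^2 \right) \hat{\mathbf{e}}_R + \left( 2 \frac{dR}{dt} \frac{d\phi}{dt} + R \frac{d^2\phi}{dt^2} \right) \hat{\mathbf{e}}_\phi + \frac{d^2z}{dt^2}\, \hat{\mathbf{e}}_z , \end{aligned} \tag{2.54} $$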
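where $\hat{\mathbf{e}}_R$, $\hat{\mathbf{e}}_\phi$ and $\hat{\mathbf{e}}_z$ are unit vectors in the $R$, $\phi$ and $z$ directions (a standard result for any cylindrical coordinate system, and for any velocity, acceleration or force). Representing the velocity as $\mathbf{v} = v_R \hat{\mathbf{e}}_R + v_\phi \hat{\mathbf{e}}_\phi + v_z \hat{\mathbf{e}}_z$ and equating coefficients of the unit vectors,

$$ \frac{dR}{dt} = v_R , \qquad \frac{d\phi}{dt} = \frac{v_\phi}{R} , \qquad \frac{dz}{dt} = v_z . \tag{2.55} $$

The acceleration can be related to the gravitational potential $\Phi$ with $\mathbf{a} = -\nabla\Phi$ (because the only forces acting on the star are those of gravity). In a cylindrical coordinate system,

$$ \nabla \equiv \hat{\mathbf{e}}_R \frac{\partial}{\partial R} + \hat{\mathbf{e}}_\phi \frac{1}{R} \frac{\partial}{\partial \phi} + \hat{\mathbf{e}}_z \frac{\partial}{\partial z} . \tag{2.56} $$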
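Using this result and equating coefficients, we obtain,

$$ \frac{d^2R}{dt^2} - R \left( \frac{d\phi}{dt} \right)^2 = - \frac{\partial \Phi}{\partial R} , \qquad 2 \frac{dR}{dt} \frac{d\phi}{dt} + R \frac{d^2\phi}{dt^2} = - \frac{1}{R} \frac{\partial \Phi}{\partial \phi} , \qquad \frac{d^2z}{dt^2} = - \frac{\partial \Phi}{\partial z} . $$

Rearranging these and substituting for $dR/dt$, $d\phi/dt$ and $dz/dt$ from Equation 2.55, we obtain,

$$ \frac{dv_R}{dt} = - \frac{\partial \Phi}{\partial R} + \frac{v_\phi^2}{R} , \qquad \frac{dv_z}{dt} = - \frac{\partial \Phi}{\partial z} , $$

and with some more manipulation,

$$ \begin{aligned} \frac{dv_\phi}{dt} &= \frac{d}{dt} \left( R \frac{d\phi}{dt} \right) = \frac{dR}{dt} \frac{d\phi}{dt} + R \frac{d^2\phi}{dt^2} = v_R \frac{v_\phi}{R} + \left( - \frac{1}{R} \frac{\partial \Phi}{\partial \phi} - 2 \frac{dR}{dt} \frac{d\phi}{dt} \right) \\ &= \frac{v_R v_\phi}{R} - \frac{1}{R} \frac{\partial \Phi}{\partial \phi} - 2 v_R \frac{v_\phi}{R} = - \frac{1}{R} \frac{\partial \Phi}{\partial \phi} - \frac{v_R v_\phi}{R} . \end{aligned} \tag{2.57} $$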
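Substituting these into Equation 2.53, we obtain,

$$ \frac{df}{dt} = \frac{\partial f}{\partial t} + v_R \frac{\partial f}{\partial R} + \frac{v_\phi}{R} \frac{\partial f}{\partial \phi} + v_z \frac{\partial f}{\partial z} + \left( \frac{v_\phi^2}{R} - \frac{\partial \Phi}{\partial R} \right) \frac{\partial f}{\partial v_R} - \frac{1}{R} \left( v_R v_\phi + \frac{\partial \Phi}{\partial \phi} \right) \frac{\partial f}{\partial v_\phi} - \frac{\partial \Phi}{\partial z} \frac{\partial f}{\partial v_z} = 0 . \tag{2.58} $$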
This is the collisionless Boltzmann equation in cylindrical coordinates. This form relates f to observable parameters (R, φ, z, vR , vφ , vz ) and the potential Φ. In many practical cases, particularly spiral galaxies, Φ will be independent of φ, so ∂Φ/∂φ = 0 (but not if we include spiral arms where the potential will be slightly deeper).
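The expressions for $dv_R/dt$ and $dv_\phi/dt$ in Equation 2.57 can be checked against a direct calculation in Cartesian coordinates. The sketch below is an illustration under stated assumptions: it uses a Plummer potential (for which $\partial\Phi/\partial\phi = 0$), units with $G = M = a = 1$, and an arbitrary starting state. It moves a star a short time forwards and backwards along its orbit, differences $v_R$ and $v_\phi$ numerically, and compares the result with the formulae.

```python
import math

G, M, A = 1.0, 1.0, 1.0  # illustrative units and Plummer scale length (assumed)


def grad_phi(x, y, z):
    """Cartesian gradient of the Plummer potential Phi = -G M / sqrt(r^2 + A^2)."""
    s = (x * x + y * y + z * z + A * A) ** 1.5
    return G * M * x / s, G * M * y / s, G * M * z / s


def cylindrical_velocities(state):
    """Return (R, v_R, v_phi) for a Cartesian state (x, y, z, vx, vy, vz)."""
    x, y, _, vx, vy, _ = state
    R = math.hypot(x, y)
    return R, (x * vx + y * vy) / R, (x * vy - y * vx) / R


def taylor_step(state, h):
    """Move the star a short time h along its orbit (second-order Taylor step)."""
    x, y, z, vx, vy, vz = state
    gx, gy, gz = grad_phi(x, y, z)
    return (x + vx * h - 0.5 * gx * h * h,
            y + vy * h - 0.5 * gy * h * h,
            z + vz * h - 0.5 * gz * h * h,
            vx - gx * h, vy - gy * h, vz - gz * h)


state = (0.8, 0.3, 0.2, -0.1, 0.4, 0.15)  # arbitrary example star
h = 1.0e-4

# Numerical time derivatives of v_R and v_phi from a central difference.
_, vR_plus, vphi_plus = cylindrical_velocities(taylor_step(state, h))
_, vR_minus, vphi_minus = cylindrical_velocities(taylor_step(state, -h))
dvR_numeric = (vR_plus - vR_minus) / (2.0 * h)
dvphi_numeric = (vphi_plus - vphi_minus) / (2.0 * h)

# The same derivatives from Equation 2.57, with dPhi/dphi = 0 by axisymmetry.
R, v_R, v_phi = cylindrical_velocities(state)
gx, gy, _ = grad_phi(*state[:3])
dPhi_dR = (state[0] * gx + state[1] * gy) / R
dvR_formula = -dPhi_dR + v_phi ** 2 / R
dvphi_formula = -v_R * v_phi / R

print("dv_R/dt  :", dvR_numeric, dvR_formula)
print("dv_phi/dt:", dvphi_numeric, dvphi_formula)
```

The extra terms $v_\phi^2/R$ and $-v_R v_\phi/R$ appear even though the force has no such components: they come from the rotation of the unit vectors as the star moves.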
2.14 Orbits of Stars in Galaxies
2.14.1 The character of orbits
The term orbit is used to describe the trajectories of stars within galaxies, even though they are very different to Keplerian orbits such as those of planets in the Solar System. The orbits of stars in a galaxy are usually not closed paths and in general they are three dimensional (they do not lie in a plane). They are often complex. In general they are highly chaotic, even if the galaxy is in equilibrium. The orbit of a star in a spherical potential, to consider the simplest example, is confined to a plane perpendicular to the angular momentum vector of the star. It is, however, not a closed path and has an appearance that is usually described as a rosette. In axisymmetric potentials (e.g. an oblate elliptical galaxy) the orbit is confined to a plane that precesses. This plane is inclined to the axis of symmetry and rotates about the axis. The orbit within the plane is similar to that in a spherical potential. Triaxial potentials can have orbits that are much more complex. Triaxial potentials often have the tendency to tumble about one axis, which leads to chaotic star orbits.
Figure 2.2: An example of the orbit of a star in a spherical potential. An example star has been put into an orbit in the x − y plane. Its orbit follows a “rosette” pattern, but it remains in the x − y plane. [These diagrams were plotted using data generated assuming a Plummer potential: the potential lacks a deep central cusp.]
Figure 2.3: The orbit of a star in a flattened (oblate) potential. An example star has been put into an orbit inclined to the x − y plane. The galaxy is flattened in the z direction with an axis ratio of 0.7. The orbit follows a “rosette” pattern, but the plane of the orbit precesses. This illustrates the trajectory of a star in an oblate elliptical galaxy, for example.
2.14.2 The chaotic nature of many orbits
In chaotic systems, stars that initially move along similar paths will diverge, eventually moving along very different orbits. The divergence in their paths is exponential in time, which is the technical definition of chaos in dynamical systems. Their motion shows a stretching and folding in phase space. This can be so even if there is no collective motion of stars at all (f in equilibrium).
Figure 2.4: The orbit of a star in a triaxial potential. An example star has been put into an orbit inclined to the x − y plane. The galaxy has different dimensions in each of the x, y and z directions. The orbit is complex and it maps out a region of space. This illustrates the trajectory of a star in a triaxial elliptical galaxy, for example. (This simulation extends over a longer time period than those of Figures 2.2 and 2.3.)

This stretching and folding in phase space can be appreciated using an analogy. When making bread, a baker's dough behaves essentially as a fluid. Dough is incompressible, but that does not prevent the baker stretching it in one direction and shrinking it in others, and then folding it back. So while the dough keeps much the same overall shape, particles initially nearby within it can be dispersed to widely different parts of it, through the repeated stretching and folding. The same stretching and folding operation can take place for stars in phase space. In fact it appears that phase space is typically riddled with regions where $f$ gets stretched in one direction while being shrunk in others. Thus nearby orbits tend to diverge, and the divergence is exponential in time. Simulations show that the timescale for divergence (the e-folding time) is $T_{\rm diverge} \sim T_{\rm cross}$, the crossing time, and gets shorter for higher star densities.

However, in some special cases, there is no chaos. These systems are said to be integrable. If the dynamics is confined to one real-space dimension (hence two phase-space dimensions) then no stretching-and-folding can happen, and orbits are regular. So in a spherical system all orbits are regular. In addition, there are certain potentials (usually referred to as Stäckel potentials) where the dynamics decouples into three effectively one-dimensional systems; so if some equilibrium $f$ generates a Stäckel potential, the orbits will stay chaos-free. Also, small perturbations of non-chaotic systems tend to produce only small regions of chaos (see footnote 1), and orbits may be well described through perturbation theory.

Footnote 1: If you ever come across the 'KAM theorem', that's basically it.

2.14.3 Integrals of the motion

To solve the collisionless Boltzmann equation for stars in a galaxy, we need further constraints on the position and velocity. This can be done using integrals of the motion. These are simply functions of the star's position $\mathbf{x}$ and velocity $\mathbf{v}$ that are constant along its orbit. They are useful in potentials $\Phi(\mathbf{x})$ that are constant over time. The distribution function $f$ is also constant along the orbit and can be written as a function of integrals of the motion.
Examples of integrals of the motion are:
• The total energy. The energy $E$ of a particular star in a potential is constant over time, so $E(\mathbf{x}, \mathbf{v}) = \frac{1}{2} m v^2 + m\Phi(\mathbf{x})$ is an integral of the motion. Because this is dependent on the mass of the star, it is more normal to work with the energy per unit mass, which will be written as $E_m$ here. So $E_m = \frac{1}{2} v^2 + \Phi$ is a constant.
• In an axisymmetric potential (e.g. our Galaxy), the z-component of the angular momentum, $L_z$, is conserved. Therefore $L_z$ is an integral of the motion in such a potential.
• In a spherical potential, the total angular momentum $\mathbf{L}$ is constant. Therefore $\mathbf{L}$ is an integral of the motion in this potential, and the x, y and z components of $\mathbf{L}$ are each integrals of the motion.
An orbit is said to be regular if it has as many isolating integrals that can define the orbit unambiguously as there are spatial dimensions.
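These integrals can be seen to stay constant in a simple numerical orbit. The following sketch is illustrative only: it integrates one star in a spherical Plummer potential with a leapfrog scheme, and compares the energy per unit mass $E_m$ and the angular momentum per unit mass at the start and end. The units ($G = M = a = 1$), the initial conditions and the step size are assumptions made for the example.

```python
import math

G, M, A = 1.0, 1.0, 1.0  # illustrative units and Plummer scale length (assumed)


def potential(pos):
    """Plummer potential per unit mass at Cartesian position pos."""
    return -G * M / math.sqrt(sum(c * c for c in pos) + A * A)


def acceleration(pos):
    """Acceleration -grad(Phi) at Cartesian position pos."""
    s = (sum(c * c for c in pos) + A * A) ** 1.5
    return [-G * M * c / s for c in pos]


def energy_per_unit_mass(pos, vel):
    """E_m = v^2 / 2 + Phi."""
    return 0.5 * sum(c * c for c in vel) + potential(pos)


def angular_momentum(pos, vel):
    """Angular momentum per unit mass, L = x cross v."""
    x, y, z = pos
    vx, vy, vz = vel
    return [y * vz - z * vy, z * vx - x * vz, x * vy - y * vx]


def integrate_orbit(pos, vel, dt, n_steps):
    """Kick-drift-kick leapfrog integration of a single orbit."""
    pos, vel = list(pos), list(vel)
    for _ in range(n_steps):
        acc = acceleration(pos)
        vel = [v + 0.5 * dt * a for v, a in zip(vel, acc)]
        pos = [p + dt * v for p, v in zip(pos, vel)]
        acc = acceleration(pos)
        vel = [v + 0.5 * dt * a for v, a in zip(vel, acc)]
    return pos, vel


pos0, vel0 = [1.0, 0.0, 0.0], [0.0, 0.5, 0.2]  # a bound example orbit
pos1, vel1 = integrate_orbit(pos0, vel0, dt=0.01, n_steps=5000)

E0, E1 = energy_per_unit_mass(pos0, vel0), energy_per_unit_mass(pos1, vel1)
L0, L1 = angular_momentum(pos0, vel0), angular_momentum(pos1, vel1)

energy_drift = abs(E1 - E0)
angular_momentum_drift = max(abs(a - b) for a, b in zip(L0, L1))
print("E_m start, end:", E0, E1)
print("L start, end  :", L0, L1)
```

Because all three components of $\mathbf{L}$ are conserved, the orbit stays in the plane perpendicular to $\mathbf{L}$, as described in Section 2.14.1. In a flattened potential only $L_z$ would survive this test.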
2.14.4 Isolating integrals and integrable systems
The collisionless Boltzmann equation tells us that $df/dt = 0$ (Section 2.12). As was discussed earlier, if we move with a star in its orbit, $f$ is constant locally as the star passes through phase space at that instant in time. But if the system is in a steady state (the potential is constant over time), $f$ is constant along the star's path at all times. This means that the orbits of stars map out constant values of $f$.

An integral of the motion for a star (e.g. the energy per unit mass, $E_m$) is constant (by definition). It therefore defines a 5-dimensional hypersurface in 6-dimensional phase space. The motion of a star is confined to that 5-dimensional surface in phase space. Therefore $f$ is constant over that hypersurface. A different value of the isolating integral (e.g. a different value of $E_m$) will define a different hypersurface. In turn, $f$ will be different on this surface. So $f$ is a function of the isolating integral, i.e. $f(x, y, z, v_x, v_y, v_z) = \mathrm{fn}(I_1)$ where $I_1$ is an integral of the motion. $I_1$ here "isolates" a hypersurface. Therefore the integral of the motion is known as an isolating integral. Integrals that fail to confine orbits are called "non-isolating" integrals.

A system is integrable if we can define isolating integrals that enable the orbit to be determined. In integrable systems there are significant simplifications. Each orbit is (i) confined to a three-dimensional toroidal subspace of six-dimensional phase space, and (ii) fills its torus evenly (see footnote 2). Phase space itself is filled by nested orbit-carrying tori; they have to be nested, since orbits can't cross in phase space. Therefore the time-average of each orbit is completely specified once we have specified which torus it is on; this takes three numbers for each orbit, and these are called 'isolating integrals'. They are constants for each orbit of course.

Think of the isolating integrals as a coordinate system that parameterises orbital tori; a transformation to a different set of isolating integrals is like a coordinate transformation. If isolating integrals exist, then any $f$ that depends only on them will automatically satisfy the collisionless Boltzmann equation. Conversely, since orbits fill their tori evenly, any equilibrium $f$ cannot depend on location on the tori, it can only depend on the tori themselves, i.e., on the isolating integrals. This result is known as Jeans' theorem.

Footnote 2: These two statements are important results from Hamiltonian dynamical systems which we won't try to prove here. But the statements that follow in this section are straightforward consequences of (i) and (ii).

2.14.5 The Jeans Theorem

The Jeans Theorem is an important result in stellar dynamics that states the importance of integrals of the motion in solving the collisionless Boltzmann equation for gravitational potentials that do not change with time. It was named after its discoverer, the English astronomer, physicist and mathematician Sir James Hopwood Jeans (1877–1946). It states that any steady-state solution of the collisionless Boltzmann equation depends on the phase-space coordinates only through integrals of the motion in the galaxy's potential, and any function of the integrals yields a steady-state solution of the collisionless Boltzmann equation.

This means that in a potential that does not change with time, we can express the collisionless Boltzmann equation in terms of integrals of motion, and then solve for the distribution function $f$ in terms of those integrals of motion. We can then convert the solution of $f$ in terms of the integrals to a solution for $f$ in terms of the space and velocity coordinates. For example, if the energy per unit mass $E_m$ and the angular momentum components $L_x$ and $L_y$ are constant for each star in some potential, then we can solve for $f$ uniquely as a function of $E_m$, $L_x$ and $L_y$. Then we can convert from $E_m$, $L_x$ and $L_y$ to give $f$ as a function of $(x, y, z, v_x, v_y, v_z)$.

You should be wary of Jeans' theorem, especially when people tacitly assume it, because as we saw, it assumes that the system is integrable, which is in general not the case.
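One half of the theorem, that any function of the integrals is a steady-state solution, is easy to test numerically. The sketch below is an illustration, not part of the original notes. It evaluates the steady-state form of Equation 2.44, $\sum_i \dot{w}_i\, \partial f/\partial w_i$, by finite differences at one phase-space point in a Plummer potential. It does this for a function of $E_m$ alone and, for contrast, for a function that is not built from integrals. Both trial functions, the units ($G = M = a = 1$) and the sample point are assumptions chosen for the example.

```python
import math

G, M, A = 1.0, 1.0, 1.0  # illustrative units and Plummer scale length (assumed)


def potential(x, y, z):
    """Plummer potential per unit mass."""
    return -G * M / math.sqrt(x * x + y * y + z * z + A * A)


def grad_potential(x, y, z):
    """Gradient of the Plummer potential."""
    s = (x * x + y * y + z * z + A * A) ** 1.5
    return [G * M * x / s, G * M * y / s, G * M * z / s]


def f_of_energy(w):
    """A function of the integral E_m only: f = (-E_m)**3.5 for bound stars."""
    x, y, z, vx, vy, vz = w
    energy = 0.5 * (vx * vx + vy * vy + vz * vz) + potential(x, y, z)
    return (-energy) ** 3.5 if energy < 0.0 else 0.0


def f_not_integral(w):
    """A function of phase-space position that is not built from integrals."""
    return math.exp(-sum(c * c for c in w))


def steady_cbe_residual(f, w, h=1.0e-5):
    """Sum over i of w_i-dot * df/dw_i, with df/dw_i from central differences."""
    x, y, z, vx, vy, vz = w
    gx, gy, gz = grad_potential(x, y, z)
    w_dot = [vx, vy, vz, -gx, -gy, -gz]
    total = 0.0
    for i in range(6):
        up, down = list(w), list(w)
        up[i] += h
        down[i] -= h
        total += w_dot[i] * (f(up) - f(down)) / (2.0 * h)
    return total


w = [0.5, 0.3, -0.2, 0.1, -0.2, 0.3]  # arbitrary bound phase-space point
residual_integral = steady_cbe_residual(f_of_energy, w)
residual_other = steady_cbe_residual(f_not_integral, w)
print("residual for f(E_m):        ", residual_integral)
print("residual for other function:", residual_other)
```

The first residual vanishes to within the finite-difference error, while the second does not: a distribution of that second form would evolve in time in this potential.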
2.15 Spherical Systems
2.15.1 Solving for f in spherical galaxies
The Jeans Theorem does apply in spherical systems of stars, such as spherical elliptical galaxies. As a consequence, $f$ can depend on (at most) three integrals of motion in a spherical system. The simplest case is for $f$ to be a function of the energy of the stars only. (Since we are considering bound systems, $f = 0$ for $E > 0$ always.) To find an equilibrium solution, we only have to satisfy Poisson's equation. The total energy of a star of mass $m$ moving with a velocity $v$ is $E = \frac{1}{2} m v^2 + m\Phi$, where $\Phi$ is the gravitational potential at the point where the star is situated. Here it is more convenient to use the energy per unit mass $E_m = \frac{1}{2} v^2 + \Phi$.

A spherical galaxy can be described very simply by a spherical polar coordinate system $(r, \theta, \phi)$ with the origin at the centre. Poisson's equation relates the Laplacian of the gravitational potential $\Phi$ at a point to the local mass density $\rho$ as $\nabla^2\Phi = 4\pi G\rho$. In a spherical polar coordinate system the Laplacian of any scalar function $A(r, \theta, \phi)$ is

$$ \nabla^2 A \equiv \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial A}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta \frac{\partial A}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2 A}{\partial \phi^2} \tag{2.59} $$

(a standard result from vector calculus: see Appendix B). In a spherically symmetric galaxy that does not change with time, the potential is a function of the radial distance $r$ from the centre only. So $\partial\Phi/\partial\theta = 0$ and $\partial\Phi/\partial\phi = 0$. Therefore,

$$ \nabla^2 \Phi = \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Phi}{dr} \right) . \tag{2.60} $$

Substituting this into the Poisson equation,

$$ \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Phi}{dr} \right) = 4\pi G \rho . $$

The distribution function $f$ is related to the number density $n$ of stars by

$$ n = \int f \, d^3v \tag{2.61} $$
(from Equation 2.35), and in this case $f$ is a function of energy per unit mass: $f = f(E_m)$. We can relate this to the density $\rho$ using $\rho = m\,n$ where $m$ is the mean mass of a star, giving,

$$ \rho = m \int f \, d^3v . \tag{2.62} $$

This integral is over all velocities. We can convert from $d^3v$ to $dv$ by considering a thin spherical shell in a space defined by the three velocity components, which gives $d^3v = 4\pi v^2\, dv$. So

$$ \rho = 4\pi m \int f \, v^2 \, dv . \tag{2.63} $$

Note that this integration can be performed over velocity at each and every point in the galaxy, so this $\rho$ is $\rho(r)$. We must determine the limits on this integral. For any particular point in the galaxy (i.e. any value of $r$), the minimum possible velocity is 0, which occurs when a star moving on a radial orbit reaches its maximum distance from the centre at that point. The maximum velocity occurs when a star has the greatest possible energy ($E_m = 0$, which would allow a star to move out from the point to arbitrary distance). Therefore the maximum velocity is $v = \sqrt{-2\Phi(r)}$. So the integration is from velocity $v = 0$ to $\sqrt{-2\Phi(r)}$. So,

$$ \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Phi}{dr} \right) = (4\pi)^2 G\, m \int_0^{\sqrt{-2\Phi(r)}} f \, v^2 \, dv . \tag{2.64} $$

We can convert this integral to an integral over energy per unit mass. $E_m = \frac{1}{2} v^2 + \Phi$ gives $dE_m = v\, dv$ at a fixed position (and hence for a constant $\Phi$). The maximum possible energy per unit mass is 0, while the minimum possible value at a radius $r$ would be given by a star that is stationary at that point: $E_m = \Phi(r)$ (which is of course negative). So, at any radius $r$,

$$ \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Phi}{dr} \right) = (4\pi)^2 \sqrt{2}\, G\, m \int_{\Phi(r)}^{0} \sqrt{E_m - \Phi(r)} \; f(E_m) \, dE_m , \tag{2.65} $$

on substituting $v = \sqrt{2(E_m - \Phi)}$. It is usual in Equation 2.64 to take $f(v)$ as given and to try to solve for $\Phi(r)$ and hence $\rho(r)$; this is a nonlinear differential equation. In Equation 2.65 we would normally take $\Phi$ as given, and try to solve for $f(E_m)$; this is a linear integral equation. There are $f(E_m)$ models in the literature, and you can always concoct a new one by picking some $\rho(r)$, computing $\Phi(r)$ and then solving Equation 2.65 numerically. Note that the velocity distribution is isotropic for any $f(E_m)$. If $f$ depends on other integrals of motion, say angular momentum $L$ or its z component, or both (thus $f(E_m, L^2, L_z)$), then the velocity distribution will be anisotropic, and there are many examples of these around too.
2.15.2 Example of a spherical, isotropic distribution function: the Plummer potential
As discussed earlier, the Plummer model has a gravitational potential $\Phi$ and a mass density $\rho$ at a radial distance $r$ from the centre that are given by

$$ \Phi(r) = - \frac{G M_{\rm tot}}{\sqrt{r^2 + a^2}} , \qquad \rho(r) = \frac{3 M_{\rm tot}}{4\pi} \frac{a^2}{(r^2 + a^2)^{5/2}} , \tag{2.29, 2.30} $$

where $M_{\rm tot}$ is the total mass of the galaxy and $a$ is a constant. The distribution function for the Plummer model is related to the density by Equation 2.63. It can be shown that these $\Phi(r)$ and $\rho(r)$ forms give a solution,

$$ f(E_m) = \frac{24\sqrt{2}}{7\pi^3} \frac{a^2}{G^5 M_{\rm tot}^4\, m} \left( -E_m \right)^{7/2} . \tag{2.66} $$

This can be verified by inserting in Equation 2.64, although this is not trivial to do. This result gives the distribution function $f$ as a function only of the energy per unit mass $E_m$. To calculate $f$ for any point $(x, y, z, v_x, v_y, v_z)$ in phase space, we need only to calculate $E_m$ from these coordinates and then calculate the value of $f$ associated with that $E_m$.
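A numerical check is much easier than the analytic one. The sketch below is an illustration under stated assumptions: it integrates Equation 2.63 over velocity with a simple midpoint rule, using the distribution function of Equation 2.66, and compares the result with the Plummer density of Equation 2.30 at a few radii. Units with $G = M_{\rm tot} = a = m = 1$, the choice of radii and the number of integration steps are assumptions for the example.

```python
import math

G, M_TOT, A = 1.0, 1.0, 1.0  # illustrative units and scale length (assumed)
MEAN_MASS = 1.0              # mean stellar mass m; it cancels in the density


def potential(r):
    """Plummer potential, Equation 2.29."""
    return -G * M_TOT / math.sqrt(r * r + A * A)


def density_analytic(r):
    """Plummer density, Equation 2.30."""
    return 3.0 * M_TOT * A * A / (4.0 * math.pi * (r * r + A * A) ** 2.5)


def distribution_function(energy):
    """Equation 2.66; zero for unbound stars (E_m >= 0)."""
    if energy >= 0.0:
        return 0.0
    prefactor = (24.0 * math.sqrt(2.0) * A * A
                 / (7.0 * math.pi ** 3 * G ** 5 * M_TOT ** 4 * MEAN_MASS))
    return prefactor * (-energy) ** 3.5


def density_from_f(r, n_steps=2000):
    """Equation 2.63 by the midpoint rule, from v = 0 to sqrt(-2 Phi(r))."""
    phi = potential(r)
    v_max = math.sqrt(-2.0 * phi)
    dv = v_max / n_steps
    total = 0.0
    for k in range(n_steps):
        v = (k + 0.5) * dv
        total += distribution_function(0.5 * v * v + phi) * v * v
    return 4.0 * math.pi * MEAN_MASS * total * dv


radii = [0.0, 0.5, 1.0, 2.0, 5.0]
relative_errors = [abs(density_from_f(r) / density_analytic(r) - 1.0) for r in radii]
max_relative_error = max(relative_errors)
print("largest relative difference:", max_relative_error)
```

The same loop, with a different `distribution_function`, gives $\rho(r)$ for any other $f(E_m)$ once $\Phi(r)$ is known.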
2.15.3 Example of a spherical, isotropic distribution function: the isothermal sphere
The isothermal sphere was introduced in Section 2.8.3. The density profile was given in Equation 2.34. The isothermal sphere is defined by analogy with a Maxwell-Boltzmann gas, and therefore the distribution function as a function of the energy per unit mass $E_m$ is given by,

$$ f(E_m) = \frac{n_0}{(2\pi\sigma^2)^{3/2}} \exp\left( - \frac{E_m}{\sigma^2} \right) = \frac{n_0}{(2\pi\sigma^2)^{3/2}} \exp\left( - \frac{\frac{1}{2} v^2 + \Phi}{\sigma^2} \right) , \tag{2.67} $$

where $\sigma^2$ is a velocity dispersion and acts in this distribution like a temperature does in a gas. $n_0$ is a constant. Integrating over velocities gives

$$ \begin{aligned} n(r) &= \int f \, d^3v = \int_0^\infty f \, 4\pi v^2 \, dv = \frac{4\pi\, n_0}{(2\pi\sigma^2)^{3/2}} \exp\left( - \frac{\Phi}{\sigma^2} \right) \int_0^\infty v^2 \exp\left( - \frac{v^2}{2\sigma^2} \right) dv \\ &= n_0 \exp\left( - \frac{\Phi(r)}{\sigma^2} \right) , \end{aligned} \tag{2.68} $$

using the standard integral $\int_0^\infty e^{-ax^2}\, dx = \sqrt{\pi}/(2\sqrt{a})$, which on differentiating with respect to $a$ gives $\int_0^\infty x^2 e^{-ax^2}\, dx = \sqrt{\pi}/(4 a^{3/2})$. Converting this to density $\rho(r)$ using $\rho = m\,n$, where $m$ is the mean mass of the stars, we get,

$$ \rho(r) = \rho_0 \exp\left( - \frac{\Phi(r)}{\sigma^2} \right) , \quad \text{and equivalently,} \quad \Phi(r) = - \sigma^2 \ln\left( \frac{\rho(r)}{\rho_0} \right) , \tag{2.69} $$

where $\rho_0$ is a constant. Using this, Poisson's equation ($\nabla^2\Phi = 4\pi G\rho$) in a spherically symmetric potential becomes on substituting for $d\Phi/dr$,

$$ \frac{d}{dr} \left( r^2 \frac{d \ln \rho}{dr} \right) = - \frac{4\pi G}{\sigma^2}\, r^2 \rho , \tag{2.70} $$

for which the solution is

$$ \rho(r) = \frac{\sigma^2}{2\pi G r^2} \tag{2.71} $$

(see Equation 2.34). As already commented in Section 2.8.3, the isothermal sphere has infinite mass! (A side effect of this is that the boundary condition $\Phi(\infty) = 0$ cannot be used, which is why we needed the redundant-looking constant $\rho_0$ in Equations 2.67 and 2.68.) Nevertheless, it is often used as a model, with some large-$r$ truncation assumed, for the dark haloes of disc galaxies. The same $\rho(r)$ can be produced by many different $f$, all having different velocity distributions.
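Two steps of this argument can be checked numerically. The sketch below is illustrative only: the first function confirms that the velocity integral in Equation 2.68 leaves exactly the factor $n_0 \exp(-\Phi/\sigma^2)$, and the other two confirm that the density of Equation 2.71 satisfies Equation 2.70. Units with $G = \sigma = 1$, the finite-difference step and the test radii are assumptions made for the example.

```python
import math

G, SIGMA = 1.0, 1.0  # illustrative units (assumed)


def velocity_normalisation(n_steps=20000, v_max=12.0):
    """Midpoint-rule value of 4 pi (2 pi sigma^2)^(-3/2) * int v^2 exp(-v^2 / 2 sigma^2) dv.

    Equation 2.68 requires this to equal 1. The upper limit v_max (in units of
    sigma) stands in for infinity.
    """
    dv = v_max * SIGMA / n_steps
    total = 0.0
    for k in range(n_steps):
        v = (k + 0.5) * dv
        total += v * v * math.exp(-v * v / (2.0 * SIGMA ** 2))
    return 4.0 * math.pi * total * dv / (2.0 * math.pi * SIGMA ** 2) ** 1.5


def log_density(r):
    """ln rho for the isothermal sphere of Equation 2.71."""
    return math.log(SIGMA ** 2 / (2.0 * math.pi * G * r * r))


def lhs_2_70(r, h=1.0e-3):
    """d/dr ( r^2 d ln(rho)/dr ) from nested central differences."""
    def inner(s):
        return s * s * (log_density(s + h) - log_density(s - h)) / (2.0 * h)
    return (inner(r + h) - inner(r - h)) / (2.0 * h)


def rhs_2_70(r):
    """-(4 pi G / sigma^2) r^2 rho."""
    return -4.0 * math.pi * G / SIGMA ** 2 * r * r * math.exp(log_density(r))


normalisation = velocity_normalisation()
mismatches = [abs(lhs_2_70(r) - rhs_2_70(r)) for r in (0.5, 1.0, 3.0, 10.0)]
print("velocity integral normalisation:", normalisation)
print("largest mismatch in Equation 2.70:", max(mismatches))
```

Both sides of Equation 2.70 are independent of radius for this solution, which reflects the fact that the $r^{-2}$ profile has no characteristic scale.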
2.16 Observable and Measurable Quantities
The phase space distribution $f$ is usually very difficult to measure observationally, because of the challenges of measuring the distribution of stars over space and particularly over velocity. Velocity components along the line of sight can be measured spectroscopically from a Doppler shift. However, transverse velocity components cannot be measured directly for galaxies beyond our own (or at least beyond the Local Group). As a function of seven variables (six of the phase space, plus time), the function $f$ can be awkward to compute theoretically. It is therefore more convenient to use quantities related to $f$.

The number density $n$ of stars in space can be measured observationally by counting more luminous stars for nearby galaxies, or from the observed intensity of light for more distant galaxies. Star counts combined with estimates of the distances of individual stars can provide $n$ as a function of position within our Galaxy. For a distant galaxy, converting the intensity along the line of sight of the integrated light from large numbers of stars into number densities (a process known as deprojection) requires assumptions about the stellar populations and their three-dimensional distribution. Nevertheless, reasonable attempts can be made in many instances. Spectroscopy provides mean velocities $\langle v_r \rangle$ along the line of sight through a galaxy, and the widths of absorption lines provide velocity dispersions $\sigma_r$ along the line of sight. These mean velocities will be weighted according to the numbers of stars.

It is therefore much more convenient to calculate quantities involving number densities $n$, mean velocities and velocity dispersions from $f$. These quantities can then be compared with observations more directly. A series of equations called the Jeans Equations allow this to be done.
2.17 The Jeans Equations
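The Jeans Equations relate number densities, mean velocities, velocity dispersions and the gravitational potential. They were first used in stellar dynamics by Sir James Jeans in 1919. It is useful to derive equations for the quantities

$$ \begin{aligned} n &= \int f \, d^3v , \\ n \langle v_i \rangle &= \int v_i \, f \, d^3v , \\ n \sigma_{ij}^2 &= \int \left( v_i - \langle v_i \rangle \right) \left( v_j - \langle v_j \rangle \right) f \, d^3v , \end{aligned} \tag{2.72} $$

by taking moments of the collisionless Boltzmann equation (expressed in the Cartesian variables $x_i$ and $v_i$). $\sigma_{ij}$ is a velocity dispersion tensor: it is discussed in more detail below. The collisionless Boltzmann equation gives (Equation 2.43)

$$ \frac{\partial f}{\partial t} + \sum_{i=1}^{3} \left( \frac{dx_i}{dt} \frac{\partial f}{\partial x_i} + \frac{dv_i}{dt} \frac{\partial f}{\partial v_i} \right) = 0 , $$

or equivalently,

$$ \frac{\partial f}{\partial t} + \sum_{i=1}^{3} v_i \frac{\partial f}{\partial x_i} - \sum_{i=1}^{3} \frac{\partial \Phi}{\partial x_i} \frac{\partial f}{\partial v_i} = 0 , $$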
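on substituting for the components of acceleration from $d\mathbf{v}/dt = -\nabla\Phi$. To derive the first of the Jeans Equations, we shall consider the zeroth moment by integrating this equation over all velocities.

$$ \int \left( \frac{\partial f}{\partial t} + \sum_{i=1}^{3} v_i \frac{\partial f}{\partial x_i} - \sum_{i=1}^{3} \frac{\partial \Phi}{\partial x_i} \frac{\partial f}{\partial v_i} \right) d^3v = \int 0 \; d^3v . \tag{2.73} $$

$$ \therefore \quad \int \frac{\partial f}{\partial t} \, d^3v + \sum_{i=1}^{3} \int v_i \frac{\partial f}{\partial x_i} \, d^3v - \sum_{i=1}^{3} \int \frac{\partial \Phi}{\partial x_i} \frac{\partial f}{\partial v_i} \, d^3v = 0 , $$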
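(with the right-hand side being zero because it is a definite integral). Some of these terms can be simplified, particularly by noting the integration is performed over all velocities at each position and time.

$$ \begin{aligned} \text{But} \quad \int \frac{\partial f}{\partial t} \, d^3v &= \frac{\partial}{\partial t} \int f \, d^3v && \text{because } t \text{ and the } v_i \text{ are independent} \\ &= \frac{\partial n}{\partial t} && \text{because } n = \textstyle\int f \, d^3v , \end{aligned} $$

$$ \begin{aligned} \text{and} \quad \int v_i \frac{\partial f}{\partial x_i} \, d^3v &= \int \frac{\partial (v_i f)}{\partial x_i} \, d^3v && \text{because the } v_i \text{ and } x_i \text{ are independent} \\ &= \frac{\partial}{\partial x_i} \int v_i f \, d^3v && \text{because the } x_i \text{ and } v_i \text{ are independent} \\ &= \frac{\partial}{\partial x_i} \left( n \langle v_i \rangle \right) && \text{on substituting } n \langle v_i \rangle = \textstyle\int v_i f \, d^3v , \end{aligned} $$

$$ \begin{aligned} \text{and} \quad \int \frac{\partial \Phi}{\partial x_i} \frac{\partial f}{\partial v_i} \, d^3v &= \frac{\partial \Phi}{\partial x_i} \int \frac{\partial f}{\partial v_i} \, d^3v && \text{because the } x_i \text{ and } \Phi \text{ are independent of the } v_i \\ &= \frac{\partial \Phi}{\partial x_i} \, (0) && \text{because } f \to 0 \text{ as } |v_i| \to \infty \text{ (by analogy with the divergence theorem)} \\ &= 0 . \end{aligned} $$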
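Substituting for these terms,

$$ \frac{\partial n}{\partial t} + \sum_{i=1}^{3} \frac{\partial \left( n \langle v_i \rangle \right)}{\partial x_i} = 0 , \tag{2.74} $$

which is a continuity equation. This is the first of the Jeans Equations. To derive the second of the Jeans Equations, we consider the first moment of the collisionless Boltzmann equation by multiplying by $v_i$ and integrating over all velocities. Multiplying the C.B.E. throughout by $v_i$, we obtain,

$$ v_i \frac{\partial f}{\partial t} + \sum_{j=1}^{3} v_i v_j \frac{\partial f}{\partial x_j} - \sum_{j=1}^{3} v_i \frac{\partial \Phi}{\partial x_j} \frac{\partial f}{\partial v_j} = 0 , \tag{2.75} $$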
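where the summation is performed over an integer $j$ because we have introduced a velocity component $v_i$. Note that the use of $v_i$ means that we are considering one particular velocity component only at this stage, i.e. one value of $i$ from $i = 1, 2, 3$. Integrating this over all velocities,

$$ \int \left( v_i \frac{\partial f}{\partial t} + \sum_{j=1}^{3} v_i v_j \frac{\partial f}{\partial x_j} - \sum_{j=1}^{3} v_i \frac{\partial \Phi}{\partial x_j} \frac{\partial f}{\partial v_j} \right) d^3v = \int 0 \; d^3v . \tag{2.76} $$

$$ \therefore \quad \int v_i \frac{\partial f}{\partial t} \, d^3v + \sum_{j=1}^{3} \int v_i v_j \frac{\partial f}{\partial x_j} \, d^3v - \sum_{j=1}^{3} \int v_i \frac{\partial \Phi}{\partial x_j} \frac{\partial f}{\partial v_j} \, d^3v = 0 . $$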
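$$ \begin{aligned} \text{But} \quad \int v_i \frac{\partial f}{\partial t} \, d^3v &= \int \frac{\partial (v_i f)}{\partial t} \, d^3v && \text{because } v_i \text{ and } t \text{ are independent} \\ &= \frac{\partial}{\partial t} \int v_i f \, d^3v \\ &= \frac{\partial}{\partial t} \left( n \langle v_i \rangle \right) && \text{because } n \langle v_i \rangle = \textstyle\int v_i f \, d^3v , \end{aligned} $$

$$ \begin{aligned} \text{and} \quad \int v_i v_j \frac{\partial f}{\partial x_j} \, d^3v &= \int \frac{\partial (v_i v_j f)}{\partial x_j} \, d^3v && \text{because } v_i \text{ and } v_j \text{ are independent of } x_j \\ &= \frac{\partial}{\partial x_j} \int v_i v_j f \, d^3v && \text{because the } x_i \text{ and } v_i \text{ are independent} \\ &= \frac{\partial}{\partial x_j} \left( n \langle v_i v_j \rangle \right) && \text{on substituting } n \langle v_i v_j \rangle = \textstyle\int v_i v_j f \, d^3v , \end{aligned} $$

$$ \text{and} \quad \int v_i \frac{\partial \Phi}{\partial x_j} \frac{\partial f}{\partial v_j} \, d^3v = \frac{\partial \Phi}{\partial x_j} \int v_i \frac{\partial f}{\partial v_j} \, d^3v \qquad \text{because the } x_j \text{ and } \Phi \text{ are independent of the } v_i . $$

$$ \begin{aligned} \text{But} \quad \frac{\partial (v_i f)}{\partial v_j} &= v_i \frac{\partial f}{\partial v_j} + f \frac{\partial v_i}{\partial v_j} \\ \therefore \quad v_i \frac{\partial f}{\partial v_j} &= \frac{\partial (v_i f)}{\partial v_j} - f \frac{\partial v_i}{\partial v_j} , \end{aligned} $$

$$ \text{and} \quad \frac{\partial v_i}{\partial v_j} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \; = \; \delta_{ij} \qquad \text{because } v_i \text{ and } v_j \text{ are independent if } i \neq j . $$

$$ \therefore \quad v_i \frac{\partial f}{\partial v_j} = \frac{\partial (v_i f)}{\partial v_j} - \delta_{ij} f . $$

$$ \begin{aligned} \text{So} \quad \int v_i \frac{\partial \Phi}{\partial x_j} \frac{\partial f}{\partial v_j} \, d^3v &= \frac{\partial \Phi}{\partial x_j} \int \left( \frac{\partial (v_i f)}{\partial v_j} - \delta_{ij} f \right) d^3v \\ &= \frac{\partial \Phi}{\partial x_j} \left( \int \frac{\partial (v_i f)}{\partial v_j} \, d^3v - \delta_{ij} \int f \, d^3v \right) \\ &= \frac{\partial \Phi}{\partial x_j} \left( 0 - \delta_{ij}\, n \right) && \text{because } v_i f \to 0 \text{ as } |v_i| \to \infty \\ &= - \frac{\partial \Phi}{\partial x_j} \, \delta_{ij}\, n . \end{aligned} $$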
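Substituting for these terms,

$$ \frac{\partial \left( n \langle v_i \rangle \right)}{\partial t} + \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \langle v_i v_j \rangle \right) - \sum_{j=1}^{3} \left( - \frac{\partial \Phi}{\partial x_j} \, \delta_{ij}\, n \right) = 0 . $$

So,

$$ \frac{\partial \left( n \langle v_i \rangle \right)}{\partial t} + \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \langle v_i v_j \rangle \right) = - n \frac{\partial \Phi}{\partial x_i} , \tag{2.77} $$

for each of $i = 1, 2, 3$. This is the second of the Jeans Equations. We need to introduce a tensor velocity dispersion $\sigma_{ij}$ defined so that

$$ n \sigma_{ij}^2 \equiv \int \left( v_i - \langle v_i \rangle \right) \left( v_j - \langle v_j \rangle \right) f \, d^3v \tag{2.78} $$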
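for $i, j = 1, 2, 3$ (see Equation 2.72 above). This is used to represent the spread of velocities in each direction. It is a symmetric tensor and we can choose some coordinate system in which it is diagonal (i.e. $\sigma_{11} \neq 0$, $\sigma_{22} \neq 0$, $\sigma_{33} \neq 0$, but all the other elements are zero). This is known as the velocity ellipsoid. For example, in a cylindrical coordinate system, we might use elements such as $\sigma_{RR}$, $\sigma_{\phi\phi}$ and $\sigma_{zz}$. If the velocity dispersion is isotropic, $\sigma_{11} = \sigma_{22} = \sigma_{33}$, which we might simplify by writing as $\sigma$ only.

Rearranging Equation 2.78 and multiplying out,

\[
\begin{aligned}
\sigma_{ij}^2
 &= \frac{1}{n} \int \left( v_i - \langle v_i \rangle \right) \left( v_j - \langle v_j \rangle \right) f \, d^3v \\
 &= \frac{1}{n} \int \left( v_i v_j - v_i \langle v_j \rangle - \langle v_i \rangle v_j + \langle v_i \rangle \langle v_j \rangle \right) f \, d^3v \\
 &= \frac{1}{n} \int v_i v_j f \, d^3v - \frac{1}{n} \int v_i \langle v_j \rangle f \, d^3v
   - \frac{1}{n} \int \langle v_i \rangle v_j f \, d^3v + \frac{1}{n} \int \langle v_i \rangle \langle v_j \rangle f \, d^3v \\
 &= \frac{1}{n} \int v_i v_j f \, d^3v - \langle v_j \rangle \frac{1}{n} \int v_i f \, d^3v
   - \langle v_i \rangle \frac{1}{n} \int v_j f \, d^3v + \langle v_i \rangle \langle v_j \rangle \frac{1}{n} \int f \, d^3v \\
 &= \langle v_i v_j \rangle - \langle v_j \rangle \langle v_i \rangle - \langle v_i \rangle \langle v_j \rangle + \langle v_i \rangle \langle v_j \rangle ,
\end{aligned}
\]

where the fourth line follows because $\langle v_i \rangle$ and $\langle v_j \rangle$ are constants as far as the velocity integration is concerned, and the last line follows from Equation 2.72. So,

\[
\sigma_{ij}^2 = \langle v_i v_j \rangle - \langle v_i \rangle \langle v_j \rangle .
\tag{2.79}
\]

This can be used to find $\langle v_i v_j \rangle$ using

\[
\langle v_i v_j \rangle = \sigma_{ij}^2 + \langle v_i \rangle \langle v_j \rangle .
\]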
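The identity in Equation 2.79 can be checked numerically by treating a set of sampled velocities as a discrete stand-in for the distribution function. The following is only a minimal sketch: the Gaussian velocity samples, their means and dispersions, and the function name `dispersion_tensor` are illustrative assumptions, not values or notation from these notes.

```python
import random


def dispersion_tensor(samples):
    """Return sigma_ij^2 computed two ways from a list of (v1, v2, v3) samples.

    direct      : average of (v_i - <v_i>)(v_j - <v_j>), as in Equation 2.78
    via_moments : <v_i v_j> - <v_i><v_j>, as in Equation 2.79
    """
    n = len(samples)
    mean = [sum(v[i] for v in samples) / n for i in range(3)]
    direct = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in samples) / n
               for j in range(3)] for i in range(3)]
    via_moments = [[sum(v[i] * v[j] for v in samples) / n - mean[i] * mean[j]
                    for j in range(3)] for i in range(3)]
    return direct, via_moments


# Hypothetical velocities in km/s, drawn with a fixed seed for reproducibility.
rng = random.Random(1)
samples = [(rng.gauss(10.0, 30.0), rng.gauss(200.0, 20.0), rng.gauss(0.0, 15.0))
           for _ in range(2000)]

direct, via_moments = dispersion_tensor(samples)
max_diff = max(abs(direct[i][j] - via_moments[i][j])
               for i in range(3) for j in range(3))
print(f"largest difference between the two forms: {max_diff:.2e}")
```

The two forms should agree to within floating-point rounding, and the tensor is symmetric by construction.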
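Substituting for $\langle v_i v_j \rangle$ into the second of the Jeans Equations (Equation 2.77),

\[
\frac{\partial (n \langle v_i \rangle)}{\partial t}
 + \sum_{j=1}^{3} \left[ \frac{\partial}{\partial x_j} \left( n \sigma_{ij}^2 \right)
 + \frac{\partial}{\partial x_j} \left( n \langle v_i \rangle \langle v_j \rangle \right) \right]
 = - n \frac{\partial \Phi}{\partial x_i} ,
\]

for each of $i = 1, 2$ and $3$. Therefore,

\[
\langle v_i \rangle \frac{\partial n}{\partial t}
 + n \frac{\partial \langle v_i \rangle}{\partial t}
 + \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \sigma_{ij}^2 \right)
 + \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \langle v_i \rangle \langle v_j \rangle \right)
 = - n \frac{\partial \Phi}{\partial x_i} .
\tag{2.80}
\]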
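We can eliminate the 1st and 4th terms using the first of the Jeans Equations (Equation 2.74). Multiplying that equation throughout by $\langle v_i \rangle$,

\[
\langle v_i \rangle \frac{\partial n}{\partial t}
 + \langle v_i \rangle \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \langle v_j \rangle \right) = 0
\qquad \therefore \qquad
\langle v_i \rangle \frac{\partial n}{\partial t}
 + \sum_{j=1}^{3} \langle v_i \rangle \frac{\partial}{\partial x_j} \left( n \langle v_j \rangle \right) = 0 .
\]

But

\[
\frac{\partial}{\partial x_j} \left( n \langle v_i \rangle \langle v_j \rangle \right)
 = \langle v_i \rangle \frac{\partial}{\partial x_j} \left( n \langle v_j \rangle \right)
 + n \langle v_j \rangle \frac{\partial \langle v_i \rangle}{\partial x_j} .
\]

Substituting for $\langle v_i \rangle \, \partial (n \langle v_j \rangle) / \partial x_j$,

\[
\langle v_i \rangle \frac{\partial n}{\partial t}
 + \sum_{j=1}^{3} \left( \frac{\partial}{\partial x_j} \left( n \langle v_i \rangle \langle v_j \rangle \right)
 - n \langle v_j \rangle \frac{\partial \langle v_i \rangle}{\partial x_j} \right) = 0
\]

\[
\therefore \qquad
\langle v_i \rangle \frac{\partial n}{\partial t}
 + \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \langle v_i \rangle \langle v_j \rangle \right)
 = n \sum_{j=1}^{3} \langle v_j \rangle \frac{\partial \langle v_i \rangle}{\partial x_j} .
\tag{2.81}
\]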
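Substituting this into Equation 2.80, we obtain

\[
n \frac{\partial \langle v_i \rangle}{\partial t}
 + n \sum_{j=1}^{3} \langle v_j \rangle \frac{\partial \langle v_i \rangle}{\partial x_j}
 = - n \frac{\partial \Phi}{\partial x_i}
 - \sum_{j=1}^{3} \frac{\partial}{\partial x_j} \left( n \sigma_{ij}^2 \right) ,
\tag{2.82}
\]

where $i$ can be any of 1, 2 or 3. This is a third Jeans Equation. This can also be expressed as

\[
\frac{d \langle \mathbf{v} \rangle}{dt} = - \nabla \Phi - \frac{1}{n} \nabla \cdot \left( n \sigma^2 \right) ,
\tag{2.83}
\]

where $\langle \mathbf{v} \rangle$ is the mean velocity vector, $t$ is the time, $\Phi$ is the potential, $n$ is the number density of stars and $\sigma^2$ represents the tensor $\sigma_{ij}^2$. Note that here $d/dt$ is not $\partial / \partial t$, but

\[
\frac{d \mathbf{v}}{dt} \left( \equiv \frac{D \mathbf{v}}{Dt} \right)
 = \frac{\partial \mathbf{v}}{\partial t} + \mathbf{v} \cdot \nabla \mathbf{v} ,
\tag{2.84}
\]

which is sometimes called the convective derivative; it is also sometimes written as $D/Dt$ to emphasise that it is not simply $\partial / \partial t$.

This is similar to the Euler equation in fluid dynamics. An ordinary fluid has

\[
\frac{d \langle \mathbf{v} \rangle}{dt} = - \nabla \Phi - \frac{\nabla p}{\rho} + \text{viscous terms} ,
\tag{2.85}
\]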
where the pressure $p$ arises because of the high rate of molecular encounters, which also leads to the equation of state, and $p$ is isotropic. In stellar dynamics, the stars behave like a fluid in which $n \sigma^2$ behaves like a pressure (its divergence $\nabla \cdot (n \sigma^2)$ plays the role of $\nabla p$), but it is anisotropic. Indeed, this anisotropy is the reason that it is represented by a tensor, whereas in an ordinary fluid the pressure is represented by a scalar. A related fact is that in the flow of an ordinary fluid the particle paths and streamlines coincide, whereas stellar orbits and the streamlines of $\langle \mathbf{v} \rangle$ do not generally coincide.

The Jeans Equations have been represented here in terms of the number density $n$ of stars. However, it is possible to work with the mean mass density in space $\rho$ instead of $n$. The Jeans Equations can be used for all stars in a galaxy, but sometimes they are used for subpopulations in our Galaxy (e.g. G dwarfs, K giants). If they are used for subpopulations, $\Phi$ remains the total gravitational potential of all matter (including dark matter), but the velocities and number densities refer to the subpopulations.
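2.18 The Jeans Equations in an Axisymmetric System, e.g. the Galaxy

Using cylindrical coordinates $(R, \phi, z)$ and assuming axisymmetry (so $\partial / \partial \phi = 0$), the second Jeans Equation is

\[
\begin{aligned}
\frac{\partial}{\partial t} \left( n \langle v_R \rangle \right)
 + \frac{\partial}{\partial R} \left( n \langle v_R^2 \rangle \right)
 + \frac{\partial}{\partial z} \left( n \langle v_R v_z \rangle \right)
 + \frac{n}{R} \left( \langle v_R^2 \rangle - \langle v_\phi^2 \rangle \right)
 &= - n \frac{\partial \Phi}{\partial R}
 && \text{for the } R \text{ direction} \\
\frac{\partial}{\partial t} \left( n \langle v_\phi \rangle \right)
 + \frac{\partial}{\partial R} \left( n \langle v_R v_\phi \rangle \right)
 + \frac{\partial}{\partial z} \left( n \langle v_\phi v_z \rangle \right)
 + \frac{2 n}{R} \langle v_R v_\phi \rangle
 &= 0
 && \text{for the } \phi \text{ direction} \\
\frac{\partial}{\partial t} \left( n \langle v_z \rangle \right)
 + \frac{\partial}{\partial R} \left( n \langle v_R v_z \rangle \right)
 + \frac{\partial}{\partial z} \left( n \langle v_z^2 \rangle \right)
 + \frac{n \langle v_R v_z \rangle}{R}
 &= - n \frac{\partial \Phi}{\partial z}
 && \text{for the } z \text{ direction.}
\end{aligned}
\tag{2.86}
\]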
In a steady state, where the potential does not change with time, we can use $\partial / \partial t = 0$. This axisymmetric form of the second Jeans Equation is useful in spiral galaxies, such as our own Galaxy, provided that we neglect any change in the potential in the $\phi$ direction (although there might be a $\phi$ dependence if the potential is deeper in the spiral arms).

Meanwhile, the first of the Jeans Equations in a cylindrical coordinate system with axisymmetry ($\partial / \partial \phi = 0$) is

\[
\frac{\partial n}{\partial t}
 + \frac{1}{R} \frac{\partial}{\partial R} \left( R\, n \langle v_R \rangle \right)
 + \frac{\partial}{\partial z} \left( n \langle v_z \rangle \right) = 0 .
\tag{2.87}
\]

2.19 The Jeans Equations in a Spherically Symmetric System
The second Jeans Equation in a steady-state ($\partial / \partial t = 0$), spherically symmetric ($\partial / \partial \theta = \partial / \partial \phi = 0$) galaxy in a spherical polar coordinate system $(r, \theta, \phi)$ is

\[
\frac{d}{dr} \left( n \langle v_r^2 \rangle \right)
 + \frac{n}{r} \left[ 2 \langle v_r^2 \rangle - \langle v_\theta^2 \rangle - \langle v_\phi^2 \rangle \right]
 = - n \frac{d \Phi}{dr} .
\tag{2.88}
\]

This might be used, for example, for a spherical elliptical galaxy. We can calculate the gradient in the potential in this spherical case very simply. Using the general result that the acceleration due to gravity is $\mathbf{g} = - \nabla \Phi$, that $g = - G M(r) / r^2$ in a spherically symmetric system where $M(r)$ is the mass interior to the radius $r$, and that $\nabla \Phi$ reduces to $d\Phi / dr$ in a spherical system, we get $d\Phi / dr = G M(r) / r^2$.

As a simple test to see whether this really does work, let us make a crude model of our Galaxy's stellar halo. We shall assume that the halo is spherical, assume a logarithmic potential of the form $\Phi(r) = v_0^2 \ln r$ where $v_0$ is a constant, assume there are isotropic velocity components (i.e. $\langle v_r^2 \rangle = \langle v_\theta^2 \rangle = \langle v_\phi^2 \rangle = \sigma^2$, where $\sigma$ is a constant), and assume that the star number density of the halo can be approximated by $n(r) \propto r^{-l}$ where $l$ is a constant. Equation 2.88 becomes

\[
\frac{d}{dr} \left( n \sigma^2 \right) + \frac{n}{r} (0) = - n \frac{d \Phi}{dr} ,
\]

on substituting for the velocity terms. Substituting for $\Phi(r)$ and $n(r)$, the left-hand side is $- l \sigma^2 n / r$ and the right-hand side is $- n v_0^2 / r$, so on cancelling $n/r$ we obtain $\sigma = v_0 / \sqrt{l}$. For the Milky Way's halo, observations show that $n \propto r^{-3.5}$ (i.e. $l = 3.5$), while $v_0$ as measured from gas on circular orbits is 220 km s$^{-1}$, and rotation is negligible (which is a requirement for $\langle v_\theta^2 \rangle = \sigma^2$ etc.). So we expect $\sigma \simeq 220 \text{ km s}^{-1} / \sqrt{3.5} \simeq 120$ km s$^{-1}$. And it is.
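The halo estimate above can be reproduced in a few lines. This is a sketch under the same assumptions as the crude model (logarithmic potential, isotropic constant dispersion, power-law density); the function names and the finite-difference check are illustrative choices rather than part of the notes.

```python
import math


def isotropic_halo_dispersion(v0_kms, l):
    """Velocity dispersion sigma = v0 / sqrt(l) for n ~ r^-l in Phi = v0^2 ln r."""
    if l <= 0:
        raise ValueError("the density slope l must be positive")
    return v0_kms / math.sqrt(l)


def jeans_residual(r, sigma, v0, l, rel_step=1e-4):
    """Fractional residual of Equation 2.88 for the isotropic power-law halo.

    Evaluates d(n sigma^2)/dr + n dPhi/dr with n = r^-l and dPhi/dr = v0^2 / r,
    using a central difference, then divides by n v0^2 / r.
    """
    h = rel_step * r
    density = lambda radius: radius ** (-l)
    d_pressure = (density(r + h) - density(r - h)) * sigma ** 2 / (2.0 * h)
    gravity_term = density(r) * v0 ** 2 / r
    return (d_pressure + gravity_term) / gravity_term


sigma_halo = isotropic_halo_dispersion(220.0, 3.5)
residual = jeans_residual(r=8.0, sigma=sigma_halo, v0=220.0, l=3.5)
print(f"sigma = {sigma_halo:.1f} km/s, fractional Jeans residual = {residual:.1e}")
```

The residual should vanish to within the accuracy of the finite difference at any radius, since the model has no preferred length scale.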
2.20 Example of the Use of the Jeans Equations: the Surface Mass Density of the Galactic Disc

The Jeans Equations can be applied to our Galaxy to measure the surface mass density of the Galactic disc at the solar distance from the centre, using observations of the line-of-sight velocities of stars lying some distance above or below the Galactic plane. The surface mass density is the mass per unit area of the disc when viewed from a great distance. It is expressed in units of kg m$^{-2}$, or more commonly solar masses per square parsec ($M_\odot$ pc$^{-2}$). This analysis is important because it allows the quantity of dark matter in the disc to be estimated. Determining whether there is dark matter in the Galactic disc is a very important constraint on the nature of dark matter.
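The second Jeans Equation in a cylindrical coordinate system $(R, \phi, z)$ centred on the Galaxy, with $z = 0$ in the plane and $R = 0$ at the Galactic Centre, states for the $z$ direction that

\[
\frac{\partial ( n \langle v_z \rangle )}{\partial t}
 + \frac{\partial ( n \langle v_R v_z \rangle )}{\partial R}
 + \frac{\partial ( n \langle v_z^2 \rangle )}{\partial z}
 + \frac{n \langle v_R v_z \rangle}{R}
 = - n \frac{\partial \Phi}{\partial z}
\]

(Equation 2.86), where $n$ is the star number density, $v_R$ and $v_z$ are the velocity components in the $R$ and $z$ directions, $\Phi(R, z, t)$ is the Galactic gravitational potential and $t$ is time. The Galaxy is in a steady state, so $n$ does not change with time. Therefore the first term $\partial (n \langle v_z \rangle) / \partial t = 0$. Observations show that

\[
\frac{\partial ( n \langle v_R v_z \rangle )}{\partial R} \simeq 0
\qquad \text{and} \qquad
\frac{n \langle v_R v_z \rangle}{R} \simeq 0 ,
\]

as is to be expected because of the cancelling of positive and negative terms of the $z$-components of the velocity. Therefore,

\[
\frac{\partial ( n \langle v_z^2 \rangle )}{\partial z} = - n \frac{\partial \Phi}{\partial z} .
\tag{2.89}
\]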
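$\langle v_z^2 \rangle$ is the mean square velocity in the direction perpendicular to the Galactic plane. Poisson's equation gives $\nabla^2 \Phi = 4 \pi G \rho$, where $\rho$ is the mass density at a point. In cylindrical coordinates the Laplacian is

\[
\nabla^2 \Phi = \frac{1}{R} \frac{\partial}{\partial R} \left( R \frac{\partial \Phi}{\partial R} \right)
 + \frac{1}{R^2} \frac{\partial^2 \Phi}{\partial \phi^2}
 + \frac{\partial^2 \Phi}{\partial z^2} .
\]

If we observe stars above and below the Galactic plane, all at the same Galactocentric radius $R$, we can neglect the $\partial \Phi / \partial R$ and $\partial^2 \Phi / \partial \phi^2$ terms.

\[
\therefore \qquad \frac{\partial^2 \Phi}{\partial z^2} = 4 \pi G \rho ,
\]

and so, using Equation 2.89,

\[
\frac{\partial}{\partial z} \left( - \frac{1}{n} \frac{\partial}{\partial z} \left( n \langle v_z^2 \rangle \right) \right) = 4 \pi G \rho .
\]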
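Integrating perpendicular to the Galactic plane from $-z$ to $z$, the surface mass density within a distance $z$ of the plane at a Galactocentric radius $R$ is

\[
\begin{aligned}
\Sigma(R, z) = \int_{-z}^{z} \rho \, dz'
 &= - \frac{1}{4 \pi G} \int_{-z}^{z} \frac{\partial}{\partial z'} \left( \frac{1}{n} \frac{\partial}{\partial z'} \left( n \langle v_z^2 \rangle \right) \right) dz' \\
 &= - \frac{1}{4 \pi G} \left[ \frac{1}{n} \frac{\partial}{\partial z'} \left( n \langle v_z^2 \rangle \right) \right]_{z' = -z}^{z} \\
 &= - \frac{1}{2 \pi G n} \left. \frac{\partial}{\partial z} \left( n \langle v_z^2 \rangle \right) \right|_{z} ,
\end{aligned}
\]

assuming symmetry about $z = 0$. Therefore the surface mass density within a distance $z$ of the plane at the solar Galactocentric radius $R_0$ is

\[
\Sigma(R_0, z) = - \frac{1}{2 \pi G n} \left. \frac{\partial}{\partial z} \left( n \langle v_z^2 \rangle \right) \right|_{z} .
\tag{2.90}
\]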
If the star densities $n$ can be measured as a function of height $z$ from the plane and if the $z$-components of the velocities $v_z$ can be measured as spectroscopic radial velocities, we can solve for $\Sigma(R_0, z)$ as a function of $z$. This gives, after modelling the contribution from the dark matter halo, the surface mass density of the Galactic disc.
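Equation 2.90 can be evaluated numerically once $n(z)$ and $\langle v_z^2 \rangle(z)$ are specified. The sketch below uses a hypothetical tracer population, with an exponential density profile of scale height 300 pc and a constant vertical dispersion of 20 km s$^{-1}$; both numbers and all function names are assumptions chosen for illustration, not measurements quoted in these notes. The value of $G$ is expressed in units of pc (km s$^{-1}$)$^2$ $M_\odot^{-1}$.

```python
import math

G_PC_KMS2_PER_MSUN = 4.30091e-3  # gravitational constant in pc (km/s)^2 / M_sun


def surface_density(z_pc, density, mean_square_vz, dz_pc=1.0):
    """Sigma(R0, z) in M_sun / pc^2 from Equation 2.90.

    density(z) is the tracer number density (any normalisation) and
    mean_square_vz(z) is <v_z^2> in (km/s)^2. The z-derivative of
    n <v_z^2> is approximated by a central difference of half-width dz_pc.
    """
    def pressure(z):
        return density(z) * mean_square_vz(z)

    gradient = (pressure(z_pc + dz_pc) - pressure(z_pc - dz_pc)) / (2.0 * dz_pc)
    return -gradient / (2.0 * math.pi * G_PC_KMS2_PER_MSUN * density(z_pc))


# Hypothetical tracer population (illustrative numbers only).
SCALE_HEIGHT_PC = 300.0
SIGMA_Z_KMS = 20.0


def tracer_density(z):
    return math.exp(-abs(z) / SCALE_HEIGHT_PC)


def tracer_vz2(z):
    return SIGMA_Z_KMS ** 2


sigma_500 = surface_density(500.0, tracer_density, tracer_vz2)
# For this toy model the derivative can be done by hand:
analytic = SIGMA_Z_KMS ** 2 / (2.0 * math.pi * G_PC_KMS2_PER_MSUN * SCALE_HEIGHT_PC)
print(f"numerical: {sigma_500:.2f}  analytic: {analytic:.2f}  (M_sun / pc^2)")
```

With real data the derivative is taken of noisy observed quantities, so in practice some smoothing or model fitting of $n(z)$ and $\langle v_z^2 \rangle(z)$ would be needed before differentiating.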
The analysis can be performed on some subclass of stars, such as G giants or K giants. In this case the number density $n$ of stars in space is that of the subclass. Number counts of stars towards the Galactic poles, combined with estimates of the distances to individual stars, give $n$. Spectroscopic observations give radial velocities (the velocity components along the line of sight) through the Doppler effect. By observing towards the Galactic poles, the radial velocities are the same as the $v_z$ components.

This analysis gives $\Sigma(R_0, z)$ as a function of $z$. The value increases with $z$ as a greater proportion of the stars of the disc are included, until all the disc matter is included. $\Sigma(R_0, z)$ will still increase slowly with $z$ beyond this as an increasing amount of mass from the dark matter halo is included. Indeed, it is necessary to determine the contribution $\Sigma_d(R_0)$ that the disc alone makes to the observed data. An additional complication is that in measuring $\partial (n \langle v_z^2 \rangle) / \partial z$ as a function of $z$, we are dealing with the differential of observed quantities. This means that the effects of observational errors can be considerable.

The first measurement of the surface density of the Galactic disc was carried out by Oort in 1932. More modern attempts were carried out in the 1980s by Bahcall and by Kuijken and Gilmore. There has been considerable debate about the interpretation of results. Early studies claimed evidence of dark matter in the Galactic disc, but more recently some consensus has developed that there is little dark matter in the disc itself, apart from the contribution from the dark matter halo that extends into the disc. A modern value is $\Sigma_d(R_0) = 50 \pm 10 \, M_\odot$ pc$^{-2}$. The absence of significant dark matter in the disc indicates that dark matter does not follow baryonic matter closely on a small scale.
2.21 N-body Simulations

An alternative approach that can be adopted to study the dynamics of stars in galaxies is to use N-body simulations. In these analyses, the system of stars is represented by a large number of particles and computer modelling is used to trace the dynamics of these particles under their mutual gravitational attractions. These simulations usually determine the positions of the test particles at each of a series of time steps, calculating the changes in their positions between each step. It is possible to add further particles to trace gas and dark matter, although the gas must be made collisional.

The individual particles in a galaxy simulation, however, do not correspond to stars. It is impossible to represent every star in a galaxy in N-body simulations. In practice, the limits on computational power allow only $\sim 10^5$ to $10^8$ particles, whereas there may be as many as $10^{12}$ stars in the galaxies being modelled. The appropriate interpretation of simulation particles is as Monte-Carlo samplers of the distribution function $f$.

In Section 2.6 we found that the ratio of the relaxation time to the crossing time was $T_{\rm relax} / T_{\rm cross} \sim N / (12 \ln N)$ for a system of $N$ particles. It follows that a system that is modelled computationally by too few particles will have a relaxation time that is too short, and may experience the effects of two-body encounters. As a consequence, the particles in N-body simulations have to be made collisionless artificially. The standard way of doing this is to replace the $1/r$ gravitational potential by $(r^2 + a^2)^{-1/2}$, which amounts to smearing out the mass on the 'softening length' scale $a$.

Early N-body computer codes performed calculations for each time step that took a time that depended on the number of particles as $N^2$. Modern codes perform faster computations by treating distant particles differently to nearby particles. "Tree codes" combine the effects of a number of distant particles together. This increases their efficiency and the computation times scale as only $N \ln N$.

N-body simulations are widely used now to study the evolution of galaxies, and an active research area at present is to incorporate gas dynamics in them. In contrast to standard N-body methods, smoothed particle hydrodynamics (SPH) is often used to study the gas in galaxies. Modern simulations include the effects of dark matter alongside stars and gas. They can follow the collapse of clumps of dark matter in the early Universe that lead to the formation of galaxies. Simulations can also follow the growth of structure in the Universe as gravitational attraction produced the clustering of galaxies observed today. N-body simulations can model the effects of large changes in gravitational potentials, whereas analytical methods can find these more challenging.
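The two quantitative ideas in this section, the relaxation-to-crossing time ratio and the softened potential, can be illustrated with a short direct-summation sketch. This is the simple $N^2$ approach rather than a tree code; the function names, the choice $G = 1$ and the two-particle example are assumptions made for illustration.

```python
import math


def relaxation_to_crossing_ratio(n_particles):
    """T_relax / T_cross ~ N / (12 ln N) for a system of N particles."""
    return n_particles / (12.0 * math.log(n_particles))


def softened_accelerations(positions, masses, softening, G=1.0):
    """Accelerations from the softened potential -G m / sqrt(r^2 + a^2).

    Direct summation over all pairs, so the cost scales as N^2.
    positions is a list of (x, y, z) tuples; softening is the length a.
    """
    n = len(positions)
    accelerations = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            separation = [positions[j][k] - positions[i][k] for k in range(3)]
            r2_soft = sum(c * c for c in separation) + softening ** 2
            inv_r3 = r2_soft ** -1.5
            for k in range(3):
                accelerations[i][k] += G * masses[j] * separation[k] * inv_r3
    return accelerations


# A simulation with 1e5 particles relaxes far faster than a real galaxy of 1e12 stars.
ratio_simulation = relaxation_to_crossing_ratio(1e5)
ratio_galaxy = relaxation_to_crossing_ratio(1e12)

# Two unit masses one length unit apart, with and without softening.
pair = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
acc_newtonian = softened_accelerations(pair, [1.0, 1.0], softening=0.0)
acc_softened = softened_accelerations(pair, [1.0, 1.0], softening=1.0)
print(ratio_simulation, ratio_galaxy, acc_newtonian[0][0], acc_softened[0][0])
```

Softening weakens the force at separations comparable to $a$, which suppresses the close two-body encounters that would otherwise drive artificial relaxation.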
Chapter 3
The Interstellar Medium

3.1 An Introduction to the Interstellar Medium
The interstellar medium (ISM) of a galaxy consists of the gas and dust distributed between the stars. The mass in the gas is much larger than that in dust, with the mass of dust in the disc of our Galaxy $\simeq 0.1$ of the mass of gas. Generally, the interstellar medium amounts to only a small fraction of a galaxy's luminous mass, but this fraction is strongly correlated with the galaxy's morphological type. This fraction is $\simeq 0$ % for an elliptical galaxy, 1–25 % for a spiral (the figure varies smoothly from type Sa to Sd), and 15–50 % for an irregular galaxy.

The interstellar gas is very diffuse: in the plane of our Galaxy, where the Galactic gas is at its densest, the particle number density is $\simeq 10^3$ to $10^9$ atomic nuclei m$^{-3}$. Some of this gas is in the form of single neutral atoms, some is in the form of simple molecules, some exists as ions. Whether gas is found as atoms, molecules or ions depends on its temperature, density and the presence of radiation fields, primarily the presence of ultraviolet radiation from nearby stars. Note that the density of the gas is usually expressed as the number of atoms, ions and molecules per unit volume; here they will be expressed in S.I. units of m$^{-3}$, but textbooks, reviews and research papers still often use cm$^{-3}$ (with 1 cm$^{-3} \equiv 10^6$ m$^{-3}$, of course).

The interstellar medium in a galaxy is a mixture of gas remaining from the formation of the galaxy, gas ejected by stars, and gas accreted from outside (such as infalling diffuse gas or the interstellar medium of other galaxies that have been accreted). The ISM is very important to the evolution of a galaxy, primarily because it forms stars in denser regions. It is important observationally. It enables us, for example, to observe the dynamics of the gas, such as rotation curves, because spectroscopic emission lines from the gas are prominent.

The chemical composition is about 90 % hydrogen, 9 % helium plus a trace of heavy elements (expressed by numbers of nuclei). The heavy elements in the gas can be depleted into dust grains. Dust consists preferentially of particles of heavy elements. Individual clouds of gas and dust are given the generic term nebulae. However, the interstellar gas is found to have a diverse range of physical conditions, having very different temperatures, densities and ionisation states. There are several distinct types of nebula, as is described below.
3.2 Background to the Spectroscopy of Interstellar Gas

The gas in the interstellar medium readily emits detectable radiation and can be studied relatively easily. The gas is of very low density compared to conditions on the Earth, even compared to many vacuums created in laboratories. Therefore spectral lines are observed from the interstellar gas that are not normally observed in the laboratory. These are referred to as forbidden lines, whereas those that are readily observed under laboratory conditions are called permitted lines. Under laboratory conditions, spectral lines with low transition probabilities are 'forbidden' because the excited states get collisionally de-excited before they can radiate. In the ISM, collisional times are typically much longer than the lifetimes of those excited states that only have forbidden transitions. So forbidden lines are observable from the ISM, and in fact they can dominate the spectrum.

Astronomy uses a particular notation to denote the atoms and ions that is seldom encountered in other sciences. Atoms and ions are written with symbols such as H I, H II, He I and He II. In this notation, I denotes a neutral atom, II denotes a singly ionised positively charged ion, III denotes a doubly ionised positive ion, etc. So, H I is H$^0$, H II is H$^+$, He I is He$^0$, He II is He$^+$, He III is He$^{2+}$, Li I is Li$^0$, etc. A negatively charged ion, such as H$^-$, is indicated only as H$^-$, although few of these are encountered in astrophysics. Square brackets around the species responsible for a spectral line indicate a forbidden line, for example [O II].
3.3 Cold Gas: the 21 cm Line of Neutral Hydrogen

Cold gas emits only in the radio and the microwave region, because collisions between atoms and the radiation field (e.g. from stars) are too weak to excite the electrons to energy levels that can produce optical emission. The most important ISM line from cold gas is the 21 cm line of atomic hydrogen (H I). It comes from the hyperfine splitting of the ground state of the hydrogen atom (split because of the coupling of the nuclear and electron spins). The energy difference between the two states is $\Delta E = 9.4 \times 10^{-25}$ J $= 5.9 \times 10^{-6}$ eV. This produces emission with a rest wavelength $\lambda_0 = hc / \Delta E = 21.1061$ cm and a rest frequency $\nu_0 = \Delta E / h = 1420.41$ MHz. In this process, hydrogen atoms are excited into the upper state through collisions (collisional excitation), but these collisions are rare in the low densities encountered in the cold interstellar medium. The transition probability is $A = 2.87 \times 10^{-15}$ s$^{-1}$. The lifetime of an excited state is $\simeq 1/A = 11$ million years. This is still much shorter than times between collisions, which allows de-excitation to occur through spontaneous emission rather than through collisional de-excitation.

The 21 cm spin flip transition itself cannot be observed in a laboratory because of the extremely low transition probabilities, but the split ground state shows up in the laboratory through the hyperfine splitting of the Lyman lines in the ultraviolet. In the ISM, the 21 cm line is observed primarily in emission, but can also be observed in absorption against a background radio continuum source. H I observations have many uses. One critically important application is to measure the orbital motions of gas to determine rotation curves in our own Galaxy and in other galaxies. H I observations can map the distribution of gas in and around galaxies.
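The numbers quoted for the 21 cm line are mutually consistent, as the short check below illustrates. It starts from the rest frequency and transition probability given in the text and uses the exact SI values of $h$, $c$ and the electron volt; the variable names and the choice of a 365.25-day year are assumptions of this sketch.

```python
PLANCK_H = 6.62607015e-34          # J s (exact SI value)
SPEED_OF_LIGHT = 2.99792458e8      # m / s (exact SI value)
ELECTRON_VOLT = 1.602176634e-19    # J (exact SI value)
SECONDS_PER_YEAR = 365.25 * 86400.0

REST_FREQUENCY_HZ = 1420.41e6      # rest frequency quoted in the text
EINSTEIN_A_PER_S = 2.87e-15        # transition probability quoted in the text

rest_wavelength_cm = 100.0 * SPEED_OF_LIGHT / REST_FREQUENCY_HZ
delta_e_joule = PLANCK_H * REST_FREQUENCY_HZ
delta_e_ev = delta_e_joule / ELECTRON_VOLT
lifetime_years = 1.0 / (EINSTEIN_A_PER_S * SECONDS_PER_YEAR)

print(f"wavelength = {rest_wavelength_cm:.4f} cm")
print(f"energy gap = {delta_e_joule:.2e} J = {delta_e_ev:.2e} eV")
print(f"lifetime   = {lifetime_years:.2e} yr")
```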
3.4 Cold Gas: Molecules

Molecular hydrogen (H$_2$) is very difficult to detect directly. It has no radio lines, which is unfortunate, since it prevents the coldest and densest parts of the ISM being observed directly. There is H$_2$ band absorption in the far ultraviolet, but this can only be observed from above the Earth's atmosphere. What saves the situation somewhat is that other molecules do emit in the radio/microwave region.

Molecular energy transitions can be due to changes in the electron energy levels, and also to changes in vibrational and rotational energies of the molecules. All three types of energy are quantised. Transitions between the electron states are in general the most energetic, and can produce spectral lines in the optical, ultraviolet and infrared. Transitions between the vibrational states can produce lines in the infrared. Transitions between the rotational states are in general the least energetic and produce lines in the radio/microwave region.

CO has strong lines at 1.3 mm and 2.6 mm from transitions between rotational states. CO is particularly useful as a tracer of H$_2$ molecules on the assumption that the densities of the two are proportional. Mapping CO density is therefore used to determine the distribution of cold gas in the ISM.

Cold, dense molecular gas is concentrated into clouds. These are relatively small in size ($\simeq 2$ to 40 pc across). Temperatures are $T \sim 10$ K and densities $\sim 10^8$ to $10^{11}$ m$^{-3}$. Molecular clouds fill only a very small volume of the ISM but contain substantial mass. Regions of gas in molecular clouds can collapse under their own gravitation to form stars. The newly formed hot stars in turn irradiate the gas with ultraviolet light, ionising and heating the gas.

Giant molecular clouds are larger regions of cold, molecular gas. Their masses are large enough (up to $10^6 \, M_\odot$) that they can have perturbing gravitational effects on stars in the disc of the Galaxy. Within our Galaxy they are mostly found in the spiral arms.
3.5 Hot Gas: H II Regions

Hot gas is readily observed in the optical region of the spectrum. Gas that is largely ionised produces emission lines from electronic transitions as ions and some atoms revert to lower energy states. The gas is therefore observed as nebulae.

An important kind of object is H II regions, which are regions of partially ionised hydrogen surrounding very hot young stars of O or B spectral types. Hot stars produce a large flux of ultraviolet photons, and any Lyman continuum photons (i.e., photons with wavelengths $\lambda < 912$ Å) will photoionise hydrogen producing a region of H$^+$, i.e. H II ions. These wavelengths correspond to energies $> 13.6$ eV, the ionisation energy from the ground state of hydrogen. The ionised hydrogen then recombines with electrons to form neutral atoms. But the hydrogen does not have to recombine into the ground state; it can recombine into an excited state and then radiatively decay after that. Electrons will cascade down energy levels, emitting photons as they do, in a process known as radiative decay.

This process produces a huge variety of observable emission lines and continua in the ultraviolet, optical, infrared and radio parts of the spectrum. Free-bound transitions (involving free electrons combining with ions to become bound in atoms) produce continuum radiation. Bound-bound transitions (involving electrons inside an atom moving to a different energy level in the same atom) produce spectral lines. Transitions in hydrogen atoms down to the first excited level ($n = 2$) produce Balmer lines, which lie in the optical. These are prominent in nebulae. Transitions down to the ground ($n = 1$) state produce Lyman lines in the ultraviolet. In each series, the individual lines are labelled α, β, γ, δ, ..., in order of decreasing wavelength. The transitions from $n$ to $n - 1$ levels are the strongest. Therefore the α line of any series is the strongest.

[Figure: The optical spectrum of the Orion Nebula. The spectrum shows very strong emission lines from species such as H I, [O II] and [O III].]

Lyman series lines of hydrogen are:
Lyα    λ = 1216 Å    (in ultraviolet)
Lyβ    λ = 1026 Å    (in ultraviolet)
Lyγ    λ = 973 Å     (in ultraviolet)

Balmer series lines of hydrogen are:
Hα     λ = 6563 Å    (in optical)
Hβ     λ = 4861 Å    (in optical)
Hγ     λ = 4340 Å    (in optical)
Hδ     λ = 4102 Å    (in optical)
Hε     λ = 3970 Å    (in optical)
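The wavelengths listed above follow from the Rydberg formula, $1/\lambda = R_{\rm H} (1/n_{\rm lower}^2 - 1/n_{\rm upper}^2)$, which is not stated in these notes but is the standard result for hydrogen. The sketch below assumes the Rydberg constant for hydrogen, $R_{\rm H} \approx 1.0967758 \times 10^7$ m$^{-1}$; the function name is hypothetical. The formula gives vacuum wavelengths, so the Balmer values come out between 1 and 2 Å longer than the air wavelengths tabulated above.

```python
RYDBERG_H_PER_M = 1.0967758e7  # Rydberg constant for hydrogen, in m^-1


def hydrogen_wavelength_angstrom(n_lower, n_upper):
    """Vacuum wavelength in Angstrom of the hydrogen transition n_upper -> n_lower."""
    if n_upper <= n_lower or n_lower < 1:
        raise ValueError("need n_upper > n_lower >= 1")
    wavenumber = RYDBERG_H_PER_M * (1.0 / n_lower ** 2 - 1.0 / n_upper ** 2)
    return 1.0e10 / wavenumber


# Lyman series ends on n = 1, Balmer series ends on n = 2.
lyman = {n: hydrogen_wavelength_angstrom(1, n) for n in (2, 3, 4)}
balmer = {n: hydrogen_wavelength_angstrom(2, n) for n in (3, 4, 5, 6, 7)}

# Series limit of the Lyman lines (n_upper -> infinity): the Lyman continuum edge.
lyman_limit = 1.0e10 / RYDBERG_H_PER_M

for n, wavelength in lyman.items():
    print(f"Lyman  {n} -> 1: {wavelength:7.1f} A")
for n, wavelength in balmer.items():
    print(f"Balmer {n} -> 2: {wavelength:7.1f} A")
print(f"Lyman limit: {lyman_limit:.1f} A")
```

The Lyman limit computed this way corresponds to the 912 Å threshold for photoionisation from the ground state.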
Atoms in H II regions can also be collisionally excited. Atomic hydrogen has no levels accessible at collision energies characteristic of H II regions ($T \sim 10^4$ K) but N II, O II, S II, O III, Ne III all do. The [O III] lines at 4959 Å and 5007 Å are particularly prominent. Some of the most prominent lines in the optical spectra of H II regions, other than the hydrogen lines listed above, are:
[O II]     3727 Å
[Ne III]   3869 Å
[O III]    4959 Å
[O III]    5007 Å
He I       5876 Å
[N II]     6548 Å
[N II]     6584 Å
Colour optical images of H II regions of the kind used in popular astronomy books show strong red and green colours: the red is produced mainly by the Hα line, while the green is produced by [O III] and Hβ. H II regions are seen prominently in images of spiral and irregular galaxies. Their emission lines dominate the spectra of late-type galaxies and are valuable for use in measuring redshifts.

The photoionisation and recombination process in H II regions and planetary nebulae produces, by a convenient accident, one Balmer photon for each Lyman continuum photon from the hot star, so the ultraviolet flux from the star can be measured by observing an optical spectrum of the H II region surrounding the hot star. The reason is basically that the gas is opaque to Lyman photons and transparent to other photons, since almost all the H atoms are in the ground state. A Lyman continuum photon initially from the star will get absorbed by a hydrogen atom, producing a free electron. This electron will then be captured into some bound state. If it gets captured to the ground state we are back where we started (with a ground state atom and a Lyman continuum photon), so consider the case where the electron is captured to some $n > 1$ state. Such a capture releases a free-bound continuum photon which then escapes, and leaves an excited state which wants to decay to $n = 1$. If it decays to $n = 1$ bypassing $n = 2$, it will just produce a Lyman photon which will almost certainly be absorbed again. Only if it decays to some $n > 1$ will a photon escape. In other words, if the decay bypasses $n = 2$ it almost always gets another chance to decay to $n = 2$ and produce a Balmer photon that escapes. The Lyα photons produced by the final decay from $n = 2$ to $n = 1$ random-walk through the gas as they get absorbed and re-emitted again and again. The total Balmer photon flux thus equals the Lyman continuum photon flux. One can then place the source star in an optical-ultraviolet colour-magnitude diagram, and determine a colour temperature which is called the Zanstra temperature in this context.

H II regions and planetary nebulae also produce thermal continuum radiation. The process that produces this is free-free emission: free electrons in the H II can interact with protons without recombination, and the acceleration of the electrons in this process produces radiation. (Electrons can interact with other electrons in similar fashion as well, but this produces no radiation because the net electric dipole moment does not change.) The resulting spectrum is not blackbody because the gas is transparent to free-free photons: there is no redistribution of the energy of the free-free photons. In fact the spectrum is quite flat at radio frequencies; this is the same thing as saying that the time scale for free-free encounters is $\ll 1/\nu$ for radio frequency $\nu$.

Detailed analyses of the relative strengths of the emission lines from H II regions can provide measurements of the temperatures, densities and chemical composition of the interstellar gas. This is possible for nebulae in our Galaxy and for emission lines in other galaxies. Measuring the chemical compositions of gas in galaxies is important in understanding how the abundances of chemical elements vary from one place in the Universe to another.
3.6 Hot Gas: Planetary Nebulae

A planetary nebula is like a compact H II region, except that it surrounds the exposed core of a hot, highly evolved star rather than a hot young star. The gas is ejected from the star through mass loss over time. Ultraviolet photons from the star ionise the gas in a manner similar to H II regions, and the gas emits photons like an H II region. Emission processes are similar to those in H II regions, but the density, temperature and ionisation state of the gas in a planetary nebula can be somewhat different to those in an H II region.

Planetary nebulae are relatively luminous and have prominent emission lines. As such they can be observed in other galaxies and can be detected at greater distances than many individual ordinary stars. They can be used to trace the distribution and kinematics of stars in other galaxies.
3.7 Hot Gas: Supernova Remnants

Supernovae eject material at very high velocities into the interstellar medium. This gas shocks, heats and disrupts the ISM. Low density components of the ISM can be significantly affected, but dense molecular clouds are less strongly affected. Hot gas from supernovae can even be ejected out of the Galactic disc into the halo of the Galaxy. Supernova remnants have strong line emission. They expand into and mix with the ISM.
3.8 Hot Gas: Masers

In the highest density H II regions ($\sim 10^{14}$ m$^{-3}$), either very near a young star, or in a planetary-nebula-like system near an evolved star, population inversion between certain states becomes possible. The overpopulated excited state then decays by stimulated emission, i.e., it becomes a maser. An artificial maser or laser uses a cavity with reflecting walls to mimic an enormous system, but in an astrophysical maser the enormous system is available for free; so an astrophysical maser is not directed perpendicular to some mirrors but shines in all directions. But as in an artificial maser, the emission is coherent (hence polarised), with very narrow lines and high intensity. Masers from OH and H$_2$O are known. Their high intensity and relatively small size make masers very useful as kinematic tracers.
3.9 Hot Gas: Synchrotron Radiation

Finally, we'll just briefly mention synchrotron radiation. It is a broad-band non-thermal radiation emitted by electrons gyrating relativistically in a magnetic field, and can be observed in both optical and radio. The photons are emitted in the instantaneous direction of electron motion and polarised perpendicular to the magnetic field. The really spectacular sources of synchrotron emission are systems with jets (young stellar objects with bipolar outflows, or active galactic nuclei). It is synchrotron emission that lights up the great lobes of radio galaxies.
3.10 Absorption-Line Spectra from the Interstellar Gas

If interstellar gas is seen in front of a continuum background light source, light from the source is found to be absorbed at particular wavelengths. A number of interstellar lines and molecular bands are seen in absorption. This process requires a relatively bright background source in practice.

The molecular absorption can be very complex. Electron transitions in the molecules do not produce single strong spectral lines, but a set of bands. This is because of the effect of the various vibrational and rotational energy states of the molecules in the gas, combined with the stronger electronic transitions.

Some of the interstellar absorption features are not well understood. One particular problem is the diffuse interstellar bands in the infrared. These are probably caused by carbon molecules, possibly by polycyclic aromatic hydrocarbons (PAHs).

An interesting example of the importance of interstellar absorption lines concerns measurements of the temperatures of cold interstellar CN molecules. Like most heteronuclear molecules, CN has rotational modes which produce radio lines. The radio lines can be observed directly, but more interesting are the optical lines that have been split because of these rotational modes. Observations of cold CN against background stars reveal, through the relative widths of the split optical lines, the relative populations of the rotational modes, and hence the temperature of the CN. The temperature turns out to be 2.7 K, i.e., these cold clouds are in thermal equilibrium with the microwave background. The temperature of interstellar space was first estimated to be $\simeq 3$ K in 1941, well before the Big Bang predictions of 1948 and later, but nobody made the connection at the time.
3.11 The Components of the Interstellar Medium

It is sometimes convenient to divide the diffuse gas in the interstellar medium into distinct components, also called phases:

• the cold neutral medium, consisting of neutral hydrogen (H I) and molecules at temperatures $T \sim 10 - 100$ K and relatively high densities;
• the warm neutral medium, consisting of H I but at temperatures $T \sim 10^3 - 10^4$ K and lower densities;
• the warm ionised medium, consisting of ionised gas (H II) at temperatures $T \sim 10^4$ K and lower densities;
• the hot ionised medium, consisting of ionised gas (H II) at very high temperatures $T \sim 10^5 - 10^6$ K but very low densities.

These phases are pressure-confined and are stable in the long term. Ionisation by supernova remnants is an important mechanism in producing the hot ionised medium. The cold neutral medium makes up a significant fraction ($\sim 50$ %) of the ISM's mass, but occupies only a very small fraction by volume. Some other individual structures in the ISM, such as supernova remnants, planetary nebulae, H II regions and giant molecular clouds, are not included in these phases because they are not in pressure equilibrium with these phases.
3.12 Interstellar Dust
Interstellar dust consists of particles of silicates or carbon compounds. They are relatively small, but have a broad range in size. The largest are ≃ 0.5 µm with $\sim 10^4$ atoms, but some appear to have $\lesssim 10^2$ atoms and thus are not significantly different from large molecules.

Dust has a profound observational effect – it absorbs and scatters light. Dust diminishes the light of background sources, a process known as interstellar extinction. Examples of this are dark nebulae and the zone of avoidance for galaxies at low galactic latitudes.

Consider light of wavelength $\lambda$ with a specific intensity $I_\lambda$ passing through interstellar space. (Specific intensity of light here means the energy transmitted in a direction per unit time through unit area perpendicular to that direction into unit solid angle per unit interval in wavelength.) If the light passes through an element of length in the interstellar medium, it will experience a change $dI_\lambda$ in the intensity $I_\lambda$ due to absorption and scattering by dust. This is related to the change $d\tau_\lambda$ in the optical depth $\tau_\lambda$ at the wavelength $\lambda$ that the light experiences along its journey by
$$ \frac{dI_\lambda}{I_\lambda} = -\, d\tau_\lambda . $$
Integrating over the line of sight from a light source to an observer, the observed intensity is
$$ I_\lambda = I_{\lambda 0}\, e^{-\tau_\lambda} , $$
where $I_{\lambda 0}$ is the light intensity at the source and $\tau_\lambda$ is the total optical depth along the line of sight.

This can be related to the loss of light in magnitudes. The magnitude $m$ in some photometric band is related to the flux $F$ in that band by $m = C - 2.5 \log_{10}(F)$, where $C$ is a calibration constant. So, the change in magnitude caused by an optical depth $\tau$ in the band is
$$ A = -2.5 \log_{10}\left(e^{-\tau}\right) = +1.086\, \tau . $$
The observed magnitude $m$ is related to the intrinsic magnitude $m_0$ by $m = m_0 + A$, where $A$ is the extinction in magnitudes (the intrinsic magnitude $m_0$ is the magnitude that the star would have if there were no interstellar extinction). $A$ depends on the photometric band.
For example, for the V (visual) band (corresponding to yellow-green colours and centred at 5500 Å), $V = V_0 + A_V$, while for the B (blue) band (centred at 4400 Å), $B = B_0 + A_B$.

For sight lines at the Galactic poles, $A_V \simeq 0.00$ to 0.05 mag, while at intermediate galactic latitudes, $A_V \simeq 0.05$ to 0.2 mag. However, in the Galactic plane, the extinction can be many magnitudes, and towards the Galactic Centre it is $A_V \gg 20$ mag. $A_\lambda$ is a strong function of wavelength and it scales as $A_\lambda \sim 1/\lambda$ (a weaker dependence than the $\sim 1/\lambda^4$ relation of the Rayleigh law). There is therefore much stronger absorption in the blue than in the red.

The interstellar extinction law. The extinction caused by dust is plotted against wavelength and extends from the ultraviolet through to the near-infrared. [Based on data from Savage & Mathis, Ann. Rev. Astron. Astrophys., 1979.]

This selective absorption produces an interstellar reddening by dust: as light consisting of a range of wavelengths passes through the interstellar medium, it is reddened by the selective loss of short wavelength light compared to long wavelengths. Colour indices are reddened, e.g. $B - V$ is reddened so that the observed value is
$$ (B - V) = (B - V)_0 + E_{B-V} , $$
where $(B - V)_0$ is the intrinsic value (what would be observed in the absence of reddening) and $E_{B-V}$ is known as the colour excess. The colour excess measures how reddened a source is. The colour excess is therefore the difference in the extinctions in the two magnitudes, e.g. $E_{B-V} = A_B - A_V$. If the intrinsic colour can be predicted, i.e. we can predict $(B - V)_0$ (e.g. from a spectrum), it is possible to calculate $E_{B-V}$ from $E_{B-V} = (B - V) - (B - V)_0$. $E_{B-V}$ data can then be used to map the dust distribution in space. It is found from observations that the extinction in the V band is $A_V \simeq 3.1\, E_{B-V}$.

Extinction gets less severe for $\lambda \gtrsim 1$ µm as the wavelength gets much longer than the grains. Grains are transparent to X-rays. The extinction becomes very strong for long sight lines through the disc of the Galaxy. The Galactic Centre is completely opaque to optical observations. There is some patchiness in the distribution of the dust. A few areas of lower dust extinction towards the bulge of our Galaxy, such as Baade's Window, enable the stars in the bulge to be studied.

We can model the extinction caused by dust in the Galaxy to predict how the extinction in magnitudes towards distant galaxies will depend on their galactic latitude $b$. The optical depth caused by dust extinction when light of wavelength $\lambda$ travels a short distance $ds$ through the interstellar medium is given by $d\tau_\lambda = \kappa_\lambda\, \rho_d\, ds$, where $\rho_d$ is the density of dust at that point in space and $\kappa_\lambda$ is the mass extinction coefficient at that point for the wavelength $\lambda$.
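As a brief numerical aside before the latitude model is developed further, the magnitude and reddening relations of this section can be sketched in Python. This is only an illustration: the function names are invented here, and the star's colours and magnitude are hypothetical values, not measurements. The only numbers taken from the notes are the factor 1.086 (recovered from the logarithm) and the ratio $A_V \simeq 3.1\, E_{B-V}$.

```python
import math

R_V = 3.1  # ratio A_V / E(B-V) quoted in the text


def extinction_from_optical_depth(tau):
    """Extinction in magnitudes, A = -2.5 log10(exp(-tau)) = 1.086 tau."""
    return -2.5 * math.log10(math.exp(-tau))


def transmitted_fraction(a_mag):
    """Fraction I / I0 of the light surviving an extinction of a_mag magnitudes."""
    return 10.0 ** (-0.4 * a_mag)


def colour_excess(bv_observed, bv_intrinsic):
    """E(B-V) = (B-V) - (B-V)_0."""
    return bv_observed - bv_intrinsic


def visual_extinction(e_bv, r_v=R_V):
    """A_V estimated from the colour excess using A_V = R_V * E(B-V)."""
    return r_v * e_bv


# Hypothetical star: observed colour, intrinsic colour (e.g. inferred from
# its spectrum) and observed V magnitude. These values are made up.
bv_obs, bv_0, v_obs = 0.95, 0.65, 12.00

e_bv = colour_excess(bv_obs, bv_0)
a_v = visual_extinction(e_bv)
v_0 = v_obs - a_v  # intrinsic (dereddened) V magnitude

print(f"E(B-V) = {e_bv:.2f} mag, A_V = {a_v:.2f} mag, V_0 = {v_0:.2f} mag")
print(f"fraction of V-band light transmitted = {transmitted_fraction(a_v):.2f}")
```

Keeping the optical-depth and magnitude forms as separate functions makes it easy to move between the physical description (optical depth) and the observational one (magnitudes).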
Observations of star numbers show that their number density declines exponentially with distance from the Galactic plane. We shall adopt a similar behaviour for the density of dust, and therefore assume that the density of dust varies with distance $z$ above the Galactic plane as
$$ \rho_d(z) = \rho_{d0}\, e^{-|z|/h} , $$
where $\rho_{d0}$ and $h$ are constants. The optical depth when travelling a distance $ds$ along the line of sight at galactic latitude $b$ is
$$ d\tau_\lambda = \kappa_\lambda\, \rho_d(z)\, ds = \kappa_\lambda\, \rho_d(z)\, \frac{dz}{\sin|b|} , $$
where $z$ is the distance north of the Galactic plane. Integrating along the line of sight,
$$ \int_0^{\tau_\lambda} d\tau_\lambda' = \int_0^\infty \frac{\kappa_\lambda\, \rho_d(z)}{\sin|b|}\, dz = \int_0^\infty \frac{\kappa_\lambda\, \rho_{d0}\, e^{-|z|/h}}{\sin|b|}\, dz = \frac{\kappa_\lambda\, \rho_{d0}}{\sin|b|} \int_0^\infty e^{-|z|/h}\, dz , $$
assuming that the opacity $\kappa_\lambda$ does not vary with distance from the Galactic plane. Therefore,
$$ \tau_\lambda = \frac{\kappa_\lambda\, \rho_{d0}\, h}{\sin|b|} = \kappa_\lambda\, \rho_{d0}\, h\, \mathrm{cosec}|b| . $$
Since the extinction in magnitudes is Aλ = 1.086τλ , we get Aλ = 1.086 κλ ρd0 h cosec|b| . Therefore the extinction in magnitudes towards an extragalactic object is predicted to vary with galactic latitude b as Aλ ∝ cosec|b| in this particular model. By coincidence, the same cosec|b| dependence of Aλ on galactic latitude b is obtained using a simplistic model in which the dust is found in a slab of uniform density centred around the Galactic plane.
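The cosec|b| result can be checked numerically. The following Python sketch integrates the exponential dust layer along the line of sight with a simple midpoint rule and compares it with the analytic optical depth. The function names, the step count and the truncation height are choices made here for illustration, and the dust parameters are set to 1 in arbitrary units because only the dependence on latitude matters.

```python
import math


def tau_analytic(kappa, rho0, h, b_deg):
    """Optical depth of the exponential layer: kappa * rho0 * h * cosec|b|."""
    return kappa * rho0 * h / math.sin(math.radians(abs(b_deg)))


def tau_numeric(kappa, rho0, h, b_deg, z_max_in_h=40.0, n_steps=20000):
    """Midpoint-rule integral of kappa * rho_d(z) / sin|b| from z = 0 to z_max.

    The upper limit is truncated at z_max_in_h scale heights, where the
    exponential dust density is negligible.
    """
    sin_b = math.sin(math.radians(abs(b_deg)))
    dz = z_max_in_h * h / n_steps
    total = 0.0
    for i in range(n_steps):
        z = (i + 0.5) * dz
        total += kappa * rho0 * math.exp(-z / h) / sin_b * dz
    return total


def extinction_mag(tau):
    """Extinction in magnitudes from the optical depth, A = 1.086 tau."""
    return 1.086 * tau


# Arbitrary units: only the variation with latitude is of interest here.
kappa, rho0, h = 1.0, 1.0, 1.0
a_pole = extinction_mag(tau_analytic(kappa, rho0, h, 90.0))

for b in (90.0, 60.0, 30.0, 10.0):
    a_exact = extinction_mag(tau_analytic(kappa, rho0, h, b))
    a_quad = extinction_mag(tau_numeric(kappa, rho0, h, b))
    print(f"b = {b:4.0f} deg: A/A_pole = {a_exact / a_pole:6.3f} "
          f"(numerical {a_quad / a_pole:6.3f})")
```

The ratio to the polar extinction is simply cosec|b|, so a sight line at b = 30° passes through twice the dust column of one towards the pole.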
3.13 Interstellar Dust: Polarisation by Dust
However, extinction by dust does one very useful thing for optical astronomers. Dust grains are not spherical and tend to have some elongation. Spinning dust grains tend to align with their long axes perpendicular to the local magnetic field. They thus preferentially block light perpendicular to the magnetic field: extinction produces polarised light. The observed polarisation will tend to be parallel to the magnetic field. Hence polarisation measurements of starlight reveal the direction of the magnetic field (or at least the sky-projection of the direction). Dust also reflects light, with some polarisation. This is observable as reflection nebulae, where faint diffuse starlight can be seen reflected by dust.
3.14 Interstellar Dust: Radiation by Dust
Light absorbed by dust will be reradiated as a black-body spectrum (or close to a black-body spectrum). The Wien displacement law states that the maximum of the Planck function $B_\lambda$ over wavelength of a black-body at a temperature $T$ is found at a wavelength
$$ \lambda_{\max} = \frac{2.898 \times 10^{-3}\ \mathrm{K\,m}}{T} . $$
This predicts that the peak of the black-body spectrum from dust at a temperature of $T = 10$ K will be at a wavelength $\lambda_{\max} = 290$ µm, from dust at $T = 100$ K will be at a wavelength $\lambda_{\max} = 29$ µm, and for $T = 1000$ K will be at $\lambda_{\max} = 2.9$ µm.

The radiation emitted by dust will be found in the infrared, mostly in the mid-infrared, given the expected temperatures of dust. This can be observed, for example using observations from space (such as those made by the IRAS satellite), as diffuse emission superimposed on a reflected starlight spectrum. However, the associated temperature of the black-body is surprisingly high – $T \sim 10^3$ K – which is much hotter than most of the dust. The interpretation of this is that some dust grains are so small (< 100 atoms) that a single ultraviolet photon packs enough energy to heat them to $\sim 10^3$ K, after which these 'stochastically heated' grains cool again by radiating, mostly in the infrared.

This process may be part of the explanation for the correlation between infrared and radio continuum luminosities of galaxies (e.g., at 0.1 mm and 6 cm), which seems to be independent of galaxy type. The idea is that ultraviolet photons from the formation of massive stars cause stochastic heating of dust grains, which then reradiate the absorbed energy to give the infrared luminosity. The supernovae resulting from the same stellar populations produce relativistic electrons which produce the radio continuum as synchrotron emission.
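The Wien displacement law quoted earlier in this section is simple to evaluate. As a small sketch (the function name is chosen here, not taken from any library), the following Python reproduces the peak wavelengths for the three dust temperatures mentioned in the text.

```python
WIEN_CONSTANT = 2.898e-3  # K m, the value quoted in the text


def wien_peak_wavelength_m(temperature_k):
    """Wavelength (in metres) of the peak of B_lambda for a black body at temperature_k."""
    return WIEN_CONSTANT / temperature_k


for temperature in (10.0, 100.0, 1000.0):
    peak_micron = wien_peak_wavelength_m(temperature) * 1.0e6
    print(f"T = {temperature:6.0f} K  ->  lambda_max = {peak_micron:6.1f} micron")
```

The printed values round to the 290 µm, 29 µm and 2.9 µm given above, which shows why the thermal emission from dust at typical temperatures falls in the infrared.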
3.15 Star formation
Stars form by the collapse of dense regions of the interstellar medium under their own gravity. This occurs in the cores of molecular clouds, where the gas is cold (∼ 10 K) and densities can exceed $10^{10}$ molecules m$^{-3}$. A region of cold gas will collapse when its gravitational self-attraction is greater than the hydrostatic pressure support. This gravitational instability is often described by the Jeans length and Jeans mass. For gas of uniform density $\rho$, the Jeans length $\lambda_J$ is the diameter of a region of the gas that is just large enough for the gravitational force to exceed the pressure support. It is given by
$$ \lambda_J = c_s \sqrt{\frac{\pi}{G \rho}} , \tag{3.1} $$
where $c_s$ is the speed of sound in the gas and $G$ is the constant of gravitation. The Jeans mass is the mass of a region that has a diameter equal to the Jeans length and is therefore
$$ M_J = \frac{\pi}{6}\, \rho\, \lambda_J^3 . \tag{3.2} $$
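To give a feel for the scales involved, Equations 3.1 and 3.2 can be evaluated for the molecular cloud core conditions quoted above (T ∼ 10 K, $10^{10}$ molecules m$^{-3}$). The Python sketch below makes two assumptions that are not in the notes: the gas is taken to be pure molecular hydrogen, and the sound speed is taken to be the isothermal value $\sqrt{kT/m}$. The results should therefore be read as order-of-magnitude estimates only.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.381e-23    # Boltzmann constant, J K^-1
M_H = 1.673e-27    # mass of a hydrogen atom, kg
PARSEC = 3.086e16  # m
M_SUN = 1.989e30   # kg


def sound_speed(temperature_k, particle_mass_kg):
    """Isothermal sound speed sqrt(kT/m). Assumption: not specified in the text."""
    return math.sqrt(K_B * temperature_k / particle_mass_kg)


def jeans_length(c_s, rho):
    """Equation 3.1: lambda_J = c_s * sqrt(pi / (G rho))."""
    return c_s * math.sqrt(math.pi / (G * rho))


def jeans_mass(rho, lambda_j):
    """Equation 3.2: mass of a sphere of diameter lambda_J, (pi / 6) rho lambda_J^3."""
    return math.pi / 6.0 * rho * lambda_j ** 3


# Core conditions from the text; pure H2 is an assumption made here.
temperature = 10.0        # K
number_density = 1.0e10   # molecules m^-3
m_h2 = 2.0 * M_H          # mass of one H2 molecule, kg

rho = number_density * m_h2
c_s = sound_speed(temperature, m_h2)
lambda_j = jeans_length(c_s, rho)
m_j = jeans_mass(rho, lambda_j)

print(f"sound speed  = {c_s:.0f} m/s")
print(f"Jeans length = {lambda_j / PARSEC:.2f} pc")
print(f"Jeans mass   = {m_j / M_SUN:.1f} solar masses")
```

Under these assumptions the Jeans length is a fraction of a parsec and the Jeans mass is a few solar masses, comparable to the masses of individual stars.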
Star formation can be self-propagating. Stars will form, heat up and ionise the cold molecular gas, with the resulting outwards flow of gas compressing gas ahead of it. This compression causes instabilities that result in local collapse to form new stars. The enhanced density in the spiral arms of our Galaxy and in other spiral galaxies means that star formation occurs preferentially in the arms.
Chapter 4
Galactic Chemical Evolution
4.1 Introduction
Chemical evolution is the term used for the changes in the abundances of the chemical elements in the Universe over time, since the earliest times to the present day. The study of these changes is an important field in astronomy, for both our Galaxy and for other galaxies. This includes the study of the elemental abundances in stars and in the interstellar medium. A fundamental objective of studies of chemical evolution is to develop a complete understanding of how the elemental abundances correlate with parameters like time, location within a galaxy, and stellar velocities. The term 'chemical' here refers to the chemical elements. It does not refer to chemistry in the broader sense: the study of interactions between molecules in the Universe is a different, distinct field called astrochemistry.
4.2 Chemical Abundances
The relative abundances of the chemical elements can be measured in a number of astronomical objects, most importantly using spectroscopic techniques. The observed strengths of spectral lines depend on a variety of factors among which are the chemical abundances of the elements producing the spectral lines. Abundances can be measured in stellar photospheres from the strengths of absorption lines. The observed strengths of lines in stellar spectra depend on the abundance of the element responsible, on the effective temperature of the star, on the acceleration due to gravity at its surface and on small-scale turbulence in the atmosphere of the star. All these parameters can be solved for if there are sufficient spectroscopic observations, while the analysis becomes simpler if the temperature can be determined in advance from photometric observations. Equally, abundances can be determined from the strengths of emission lines from interstellar gas, most notably from H II regions.

It is important to define some terms and parameters that are used for the study of the abundances of chemical elements. In astronomy, for historical reasons, the term metals is used for elements other than H and He; the term heavy elements is also used for these. Under this definition, even elements such as carbon, oxygen, nitrogen and sulphur are called metals. The term metallicity is used for the fraction of heavy elements, usually expressed as a fraction by mass. It is convenient to define the fractions by mass of hydrogen $X$, of helium $Y$, and of heavy elements $Z$. Therefore, $Z$ = (mass of heavy elements)/(total mass of all nuclei) in some object, objects or region of space. We therefore have $X + Y + Z \equiv 1$.

In some other applications, the abundances by number of nuclei are used. These are usually expressed relative to hydrogen. So $N(\mathrm{He})/N(\mathrm{H})$ is the ratio of the abundance of helium to hydrogen by number, and $N(\mathrm{Fe})/N(\mathrm{H})$ the iron-to-hydrogen ratio.
For convenience, chemical abundances in the Universe are often compared to the values in the Sun. Solar abundances give $X = 0.70$, $Y = 0.28$, $Z = 0.02$ by mass. An object, such as a star, that has a heavy element fraction significantly lower than the Sun is said to be metal poor, while one that has a larger heavy element fraction is said to be metal rich. Abundance ratios by number are expressed relative to the Sun using a parameter [A/B], where A and B are the chemical symbols of two elements, defined as
$$ [\mathrm{A/B}] = \log_{10}\left(\frac{N(\mathrm{A})}{N(\mathrm{B})}\right) - \log_{10}\left(\frac{N(\mathrm{A})}{N(\mathrm{B})}\right)_\odot , \tag{4.1} $$
where the subscript $\odot$ denotes the abundance ratio in the Sun. So the ratio of iron to hydrogen in a star relative to the Sun is written as
$$ [\mathrm{Fe/H}] = \log_{10}\left(\frac{N(\mathrm{Fe})}{N(\mathrm{H})}\right) - \log_{10}\left(\frac{N(\mathrm{Fe})}{N(\mathrm{H})}\right)_\odot . \tag{4.2} $$
The [Fe/H] parameter for the Sun is, by definition, 0. A mildly metal-poor star in the Galaxy might have [Fe/H] ≃ −0.3, while a very metal-poor star in the halo of the Galaxy might have [Fe/H] ≃ −1.5 to −2. A metal-rich star in the Galaxy might have [Fe/H] ≃ +0.3. The interstellar gas in the Galaxy has a near-solar metallicity with [Fe/H] ∼ 0. The iron-to-hydrogen number ratio is encountered more often in research papers than other ratios because iron produces large numbers of absorption lines in the spectra of late-type stars like our Sun, making iron abundances relatively straightforward to measure.
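The bracket notation of Equations 4.1 and 4.2 is just a difference of logarithms, which the short Python sketch below makes explicit. The function names are invented for illustration, and the solar iron-to-hydrogen number ratio used here is a hypothetical placeholder rather than a measured value; only ratios relative to it matter in the example.

```python
import math


def bracket_ratio(ratio_star, ratio_sun):
    """[A/B] from Equation 4.1, given N(A)/N(B) in the star and in the Sun."""
    return math.log10(ratio_star) - math.log10(ratio_sun)


def ratio_relative_to_sun(bracket_value):
    """Invert Equation 4.1: the star's N(A)/N(B) as a multiple of the solar ratio."""
    return 10.0 ** bracket_value


# Hypothetical placeholder for the solar N(Fe)/N(H); not a measured value.
SOLAR_FE_H = 3.0e-5

# A star with one tenth of the solar iron-to-hydrogen ratio has [Fe/H] = -1.
fe_h = bracket_ratio(0.1 * SOLAR_FE_H, SOLAR_FE_H)
print(f"[Fe/H] = {fe_h:+.2f}")

# The illustrative [Fe/H] values from the text, as multiples of the solar ratio.
for value in (-2.0, -1.5, -0.3, 0.0, 0.3):
    print(f"[Fe/H] = {value:+.1f}  ->  {ratio_relative_to_sun(value):.3f} x solar")
```

Because the scale is logarithmic, [Fe/H] = −0.3 corresponds to about half the solar iron-to-hydrogen ratio, and [Fe/H] = +0.3 to about twice it.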
4.3 The Chemical Enrichment of Galaxies
Current cosmological models show that the Big Bang produced primordial gas having a chemical composition that was 91 % H, 9 % He by number, plus a trace of Li$^7$. This corresponds to a composition by mass that was 77 % H and 23 % He: i.e. $X = 0.77$, $Y = 0.23$, $Z = 0.00$. The baryonic material produced by the Big Bang was therefore almost pure hydrogen and helium, with ten hydrogen nuclei for every one of helium. This production of the chemical elements in the Big Bang is known as primordial nucleosynthesis.

The material we find around us in the Universe today contains significant quantities of heavy elements, although these are still only minor contributors to the total mass of baryonic matter. These heavy elements have been synthesised in nuclear reactions in stars, a process known as nucleosynthesis. These nuclear reactions in general occur in the cores of stars, producing enriched material in stellar interiors. Enriched material can be ejected into the interstellar medium in the later stages of stellar evolution, through mass loss and in supernovae.

Star formation therefore produces stars which, after a time delay, eject heavy elements into the interstellar medium, including heavy elements newly synthesised in the stellar interiors. Star formation from this enriched material in turn results in stars with enhanced abundances of heavy elements. This process occurs repeatedly over time, with the continual recycling of gas, leading to a gradual increase in the metallicity of the interstellar medium with time.

Supernovae are important to chemical enrichment. They can eject large quantities of enriched material into interstellar space and can themselves generate heavy elements in nucleosynthesis. Type II supernovae are produced by massive stars ($M \gtrsim 8 M_\odot$). They eject enriched material into the interstellar medium $\sim 10^7$ yr after formation. This material is rich in C, N and O. Type Ia supernovae are probably caused by explosive fusion reactions in binary systems and eject enriched material $\gtrsim 10^8$ yr after the initial star formation. This material is rich in iron.

The main sequence lifetime $T_{\rm MS}$ of a star is a very strong function of the star's mass $M$. Stars with masses $M \lesssim 0.8 M_\odot$ have $T_{\rm MS}$ > age of the Universe. So mass that goes into low mass stars is lost from the recycling process. Usefully, samples of low mass stars preserve abundances of the interstellar gas from which they formed if enriched material is not dredged up from the stellar interiors. This is true for G dwarfs for example, where the gas in the photosphere is almost unchanged in chemical composition from the gas from which they formed. So a sample of G dwarfs provides samples of chemical abundances through the history of the Galaxy. In contrast, observations of the interstellar gas in a galaxy provide information on present-day abundances and on the current state of chemical evolution.
4.4 The Simple Model of Galactic Chemical Evolution
The Simple Model of chemical evolution simulates the build up of the metallicity $Z$ in a volume of space. The Simple Model makes some simplifying assumptions:
• the volume initially contains only unenriched gas – initially there are no stars and no heavy elements;
• the volume of space where the evolution takes place is a 'closed box' – no gas enters or leaves the volume;
• the gas in the volume is well mixed – it has the same chemical composition throughout;
• instantaneous recycling occurs – following star formation, all newly created heavy elements that enter the ISM do so immediately;
• the fraction of newly-synthesised heavy elements ejected into the ISM after material forms stars is constant.

These assumptions, although slightly naive, allow some important predictions about the variation of the metallicity in the interstellar gas with the amount of star formation that has taken place.

Consider a volume within a galaxy, small enough to be fairly homogeneous, but large enough to contain a good sample of stars. The total mass of chemical elements in this volume at time $t$ is $M_{\rm total}$, made up of $M_{\rm stars}$ in stars and $M_{\rm gas}$ in gas. So
$$ M_{\rm total} = M_{\rm stars} + M_{\rm gas} . \tag{4.3} $$
Here, $M_{\rm stars} = M_{\rm stars}(t)$ and $M_{\rm gas} = M_{\rm gas}(t)$. The assumption that the volume is a closed box means that $M_{\rm total}$ = constant at all times. Initially, at time $t = 0$, $M_{\rm stars} = 0$ (the volume contains pure gas initially).

Let $M_{\rm metals}$ be the mass of heavy elements in the gas within the volume at time $t$. Therefore the heavy element mass fraction of the gas is
$$ Z \equiv \frac{M_{\rm metals}}{M_{\rm gas}} . \tag{4.4} $$
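Consider a time interval from $t$ to $t + \delta t$. Star formation will occur in this time, with gas forming stars. Let the change in $M_{\rm stars}$ and $M_{\rm gas}$ in this time be $\delta M_{\rm stars}$ and $\delta M_{\rm gas}$. Some stars will eject enriched gas back into the interstellar medium (through supernovae and mass loss).

We firstly need to express the change $\delta Z$ in the metallicity of the interstellar gas in terms of $\delta M_{\rm stars}$ and $\delta M_{\rm gas}$. From Equation 4.4, we have $Z = \mathrm{fn}(M_{\rm metals}, M_{\rm gas})$, so the differential of $Z$ with respect to time is
$$ \frac{dZ}{dt} = \frac{\partial Z}{\partial M_{\rm metals}} \frac{dM_{\rm metals}}{dt} + \frac{\partial Z}{\partial M_{\rm gas}} \frac{dM_{\rm gas}}{dt} . $$
Differentiating Equation 4.4,
$$ \frac{\partial Z}{\partial M_{\rm metals}} = \frac{1}{M_{\rm gas}} \qquad \mathrm{and} \qquad \frac{\partial Z}{\partial M_{\rm gas}} = - \frac{M_{\rm metals}}{M_{\rm gas}^2} , $$
which gives,
$$ \frac{dZ}{dt} = \frac{1}{M_{\rm gas}} \frac{dM_{\rm metals}}{dt} - \frac{M_{\rm metals}}{M_{\rm gas}^2} \frac{dM_{\rm gas}}{dt} . $$
For a small time interval $\delta t$, we have,
$$ \delta Z = \frac{\delta M_{\rm metals}}{M_{\rm gas}} - \frac{M_{\rm metals}}{M_{\rm gas}^2}\, \delta M_{\rm gas} , $$
which gives from the definition of $Z$ in Equation 4.4,
$$ \delta Z = \frac{\delta M_{\rm metals}}{M_{\rm gas}} - Z\, \frac{\delta M_{\rm gas}}{M_{\rm gas}} . \tag{4.5} $$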
We need to distinguish between the total mass in stars $M_{\rm stars}$ at time $t$ and the total mass that has taken part in star formation $M_{\rm SF}$ over all periods up to time $t$. When a mass $\delta M_{\rm SF}$ goes into stars during star formation, the total mass in stars will change by an amount less than this, because material from the new stars is ejected back into the interstellar gas. So, $\delta M_{\rm SF} > \delta M_{\rm stars}$, and $M_{\rm SF} > M_{\rm stars}$. Let $\alpha$ be the fraction of mass participating in star formation that remains locked up in long-lived stars and stellar remnants. So,
$$ \delta M_{\rm stars} = \alpha\, \delta M_{\rm SF} . \tag{4.6} $$
The mass of newly synthesised heavy elements ejected back into the ISM is proportional to the mass that goes into stars (from the Simple Model assumptions listed above). Let the mass of newly synthesised heavy elements ejected into the ISM be $p\, \delta M_{\rm stars}$, where $p$ is a parameter known as the yield, with $p$ set to be a constant here.

The change $\delta M_{\rm metals}$ in the mass of heavy elements in the gas in a time $\delta t$ will be caused by the loss of heavy elements in the gas that goes into star formation, by the gain of old heavy elements that have gone into star formation and are then ejected back into the gas, and by the gain of newly synthesised heavy elements from stars that are ejected into the gas. The contribution to $\delta M_{\rm metals}$ from the loss of heavy elements in the gas going into star formation will be $-\, \delta M_{\rm SF}\, M_{\rm metals}/M_{\rm gas} = -\, Z\, \delta M_{\rm SF}$. The contribution from old heavy elements in the gas that have gone into star formation and are then ejected back unchanged into the gas will be $Z\, \delta M_{\rm SF}\, \times$ (fraction of mass going into stars that is ejected back) $= Z\, (1 - \alpha)\, \delta M_{\rm SF}$. The contribution from the newly synthesised heavy elements that are ejected into the gas will be $p\, \delta M_{\rm stars}$ from the definition of the yield $p$ above. Therefore, in time $\delta t$,
$$ \delta M_{\rm metals} = -\, Z\, \delta M_{\rm SF} + Z\, (1 - \alpha)\, \delta M_{\rm SF} + p\, \delta M_{\rm stars} , $$
which gives on expanding and cancelling,
$$ \delta M_{\rm metals} = -\, Z\, \alpha\, \delta M_{\rm SF} + p\, \delta M_{\rm stars} = -\, Z\, \delta M_{\rm stars} + p\, \delta M_{\rm stars} , \tag{4.7} $$
on substituting for $\alpha\, \delta M_{\rm SF} = \delta M_{\rm stars}$. Dividing by $M_{\rm gas}$,
$$ \frac{\delta M_{\rm metals}}{M_{\rm gas}} = -\, Z\, \frac{\delta M_{\rm stars}}{M_{\rm gas}} + p\, \frac{\delta M_{\rm stars}}{M_{\rm gas}} . $$
But from Equation 4.3, the changes in masses are related by $\delta M_{\rm total} = \delta M_{\rm stars} + \delta M_{\rm gas} = 0$ for a closed box. Therefore,
$$ \delta M_{\rm stars} = -\, \delta M_{\rm gas} , \tag{4.8} $$
$$ \therefore \quad \frac{\delta M_{\rm metals}}{M_{\rm gas}} = Z\, \frac{\delta M_{\rm gas}}{M_{\rm gas}} - p\, \frac{\delta M_{\rm gas}}{M_{\rm gas}} . $$
Substituting this into Equation 4.5,
$$ \delta Z = Z\, \frac{\delta M_{\rm gas}}{M_{\rm gas}} - p\, \frac{\delta M_{\rm gas}}{M_{\rm gas}} - Z\, \frac{\delta M_{\rm gas}}{M_{\rm gas}} $$
$$ \therefore \quad \delta Z = -\, p\, \frac{\delta M_{\rm gas}}{M_{\rm gas}} . \tag{4.9} $$
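Converting this to a differential and integrating from time 0 to $t$,
$$ \int_0^{Z(t)} dZ' = -\, p \int_{M_{\rm gas}(0)}^{M_{\rm gas}(t)} \frac{dM_{\rm gas}'}{M_{\rm gas}'} $$
$$ \therefore \quad Z(t) - 0 = -\, p\, \Big[ \ln M_{\rm gas}' \Big]_{M_{\rm gas}(0)}^{M_{\rm gas}(t)} . $$
This gives,
$$ Z(t) = -\, p\, \ln\left( \frac{M_{\rm gas}(t)}{M_{\rm gas}(0)} \right) . \tag{4.10} $$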
Since $M_{\rm gas}(0) = M_{\rm total}$, which is constant for all $t$ (because we have a closed box that initially contained only gas), we can rewrite this equation using the gas fraction $\mu \equiv M_{\rm gas}(t)/M_{\rm total}$ as
$$ Z(t) = -\, p\, \ln \mu . \tag{4.11} $$
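The result of Equation 4.11 can be checked by stepping the bookkeeping of Equations 4.6 to 4.8 forward numerically. The Python sketch below does this with a simple forward-Euler march in small increments of star formation. The function name, the step size and the parameter values ($p = 0.01$, $\alpha = 0.8$ or $0.5$) are illustrative assumptions made here, not values from the derivation.

```python
import math


def simulate_closed_box(p, alpha, final_gas_fraction, d_m_sf=1.0e-5):
    """Step the closed-box model forward until the gas fraction drops to the target.

    Masses are in units of the constant total mass, so M_gas starts at 1.
    Returns (gas fraction, gas metallicity Z) at the end of the run.
    """
    m_gas, m_stars, m_metals = 1.0, 0.0, 0.0
    while m_gas > final_gas_fraction:
        z = m_metals / m_gas                            # Equation 4.4
        d_m_stars = alpha * d_m_sf                      # Equation 4.6
        m_metals += -z * d_m_stars + p * d_m_stars      # Equation 4.7
        m_stars += d_m_stars
        m_gas -= d_m_stars                              # Equation 4.8
    return m_gas / (m_gas + m_stars), m_metals / m_gas


p = 0.01  # illustrative yield
for alpha in (0.8, 0.5):
    mu, z_numeric = simulate_closed_box(p, alpha, final_gas_fraction=0.1)
    z_analytic = -p * math.log(mu)                      # Equation 4.11
    print(f"alpha = {alpha}: mu = {mu:.4f}, "
          f"Z (numerical) = {z_numeric:.5f}, Z (Eq. 4.11) = {z_analytic:.5f}")
```

The numerical and analytic metallicities agree closely, and the result is the same for both values of $\alpha$: the locked-up fraction changes how much star formation is needed to reach a given gas fraction, but not the metallicity at that gas fraction.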
Both $Z$ and $\mu$ can in principle be measured with appropriate observations, and Equation 4.11 does not depend explicitly on the time $t$ or on the star formation rate. This is an important prediction from the theory that is discussed more in Section 4.5, where comparisons with observations are considered.

We now need to consider how the mass in stars depends on metallicity. Equation 4.9 gives the change in metallicity $Z$ in time $\delta t$. But from Equation 4.3, at any time, $M_{\rm total} = M_{\rm stars} + M_{\rm gas}$, and at time 0, $M_{\rm total} = M_{\rm gas}(0)$. So, $M_{\rm gas}(t) = M_{\rm gas}(0) - M_{\rm stars}(t)$. From Equation 4.8, $\delta M_{\rm stars} = -\, \delta M_{\rm gas}$. Substituting for $M_{\rm gas}(t)$ and $\delta M_{\rm stars}$ into Equation 4.9,
$$ \delta Z = p\, \frac{\delta M_{\rm stars}}{M_{\rm gas}(0) - M_{\rm stars}(t)} . $$
Integrating from time 0 to $t$,
$$ \int_0^{Z} dZ' = p \int_0^{M_{\rm stars}} \frac{dM_{\rm stars}'}{M_{\rm gas}(0) - M_{\rm stars}'} $$
$$ \therefore \quad Z - 0 = -\, p\, \Big[ \ln\left( M_{\rm gas}(0) - M_{\rm stars}' \right) \Big]_0^{M_{\rm stars}} $$
$$ \therefore \quad Z = -\, p\, \Big[ \ln\left( M_{\rm gas}(0) - M_{\rm stars}(t) \right) - \ln\left( M_{\rm gas}(0) \right) \Big] = -\, p\, \ln\left( \frac{M_{\rm gas}(0) - M_{\rm stars}(t)}{M_{\rm gas}(0)} \right) , $$
which rearranges to
$$ \frac{M_{\rm stars}(t)}{M_{\rm gas}(0)} = 1 - e^{-Z(t)/p} . \tag{4.12} $$
This is a prediction of how the fraction of the mass of the volume that is in stars varies with metallicity. Again we have a neat prediction of the Simple Model that involves time only through $Z(t)$ and $M_{\rm stars}(t)$. This equation does not involve the star formation rate as a function of time explicitly, which means that the predictions are much simpler to compare with observations.

Today, at time $t_1$, we have a metallicity $Z_1$ and a mass in stars $M_{\rm stars\,1}$. Therefore we have
$$ \frac{M_{\rm stars\,1}}{M_{\rm gas}(0)} = 1 - e^{-Z_1/p} $$
at the present time. Dividing Equation 4.12 by this,
$$ \frac{M_{\rm stars}(t)}{M_{\rm stars\,1}} = \frac{1 - e^{-Z(t)/p}}{1 - e^{-Z_1/p}} . \tag{4.13} $$
This is a prediction of how the mass in stars at any time varies with the metallicity. If we observe some subsample of long-lived stars of similar mass, the number $N(Z)$ of these stars having a metallicity $Z$ or less will be related to the mass by $N(Z) \propto M_{\rm stars}(t)$. So,
$$ \frac{N(Z)}{N_1} = \frac{M_{\rm stars}(t)}{M_{\rm stars\,1}} , $$
where $N_1$ is the value of $N(Z)$ today. This then gives,
$$ \frac{N(Z)}{N_1} = \frac{1 - e^{-Z(t)/p}}{1 - e^{-Z_1/p}} . \tag{4.14} $$
This gives a specific prediction of the number of stars as a function of metallicity. In practice it is often easier to work with the differential distribution dN/dZ, which expresses the number of stars with metallicity Z against Z.
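Equation 4.14 and its derivative are easy to evaluate. The Python sketch below computes the cumulative fraction $N(Z)/N_1$ and the differential distribution obtained by differentiating Equation 4.14 with respect to $Z$. The function names are invented here, and the parameter values $p = 0.010$ and $Z_1 = 0.017$ are those quoted with the figures in Section 4.6; the choice of $Z_1/4$ as a 'metal-poor' threshold is arbitrary and purely illustrative.

```python
import math


def cumulative_fraction(z, p, z1):
    """N(Z)/N_1 from Equation 4.14: fraction of stars with metallicity <= z."""
    return (1.0 - math.exp(-z / p)) / (1.0 - math.exp(-z1 / p))


def differential_distribution(z, p, z1):
    """d(N/N_1)/dZ, obtained by differentiating Equation 4.14 with respect to Z."""
    return math.exp(-z / p) / (p * (1.0 - math.exp(-z1 / p)))


p, z1 = 0.010, 0.017   # values quoted with the figures in Section 4.6

for fraction_of_z1 in (0.1, 0.25, 0.5, 1.0):
    z = fraction_of_z1 * z1
    print(f"Z = {fraction_of_z1:4.2f} Z_1: N(Z)/N_1 = {cumulative_fraction(z, p, z1):.3f}, "
          f"dN/dZ = {differential_distribution(z, p, z1):6.1f} N_1")
```

With these parameters the Simple Model places roughly four tenths of the long-lived stars below a quarter of the present-day metallicity, and the differential distribution is largest at $Z = 0$, falling monotonically with increasing $Z$.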
4.5 Comparing the Simple Model with Observations: Abundances in the Interstellar Gas of Galaxies
Equation 4.11 provided a simple expression for the metallicity in the Simple Model: $Z = -\, p\, \ln$ (gas fraction). This was derived by considering the chemical evolution in a single closed box. However, this is a general result for any closed box and it can be compared with the observed metallicities and gas fractions in a number of galaxies, provided that the yield is the same in all places. Magellanic irregular galaxies are found to fit this relation reasonably well, and $p$ is estimated to be ≃ 0.0025 from observations. In spiral galaxies, the gas fraction in the disc increases as we go outwards, and $Z$ is indeed observed to decrease, though perhaps more steeply than this crude model predicts.
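In practice the comparison works in both directions: a yield can be used to predict the metallicity for a given gas fraction, or a measured metallicity and gas fraction can be inverted to give the yield they imply. A minimal Python sketch of both operations follows. The function names are chosen here, the gas fractions are hypothetical, and only the yield $p = 0.0025$ is taken from the text.

```python
import math


def metallicity_from_gas_fraction(mu, p):
    """Equation 4.11: Z = -p ln(mu) for gas fraction mu."""
    return -p * math.log(mu)


def yield_from_observations(z, mu):
    """Yield implied by an observed metallicity z and gas fraction mu (Eq. 4.11 inverted)."""
    return -z / math.log(mu)


p = 0.0025  # yield estimated for Magellanic irregulars, as quoted in the text

# Hypothetical gas fractions, chosen only to show the trend.
for mu in (0.9, 0.5, 0.2, 0.05):
    z = metallicity_from_gas_fraction(mu, p)
    print(f"gas fraction = {mu:4.2f}  ->  Z = {z:.5f}, "
          f"implied yield = {yield_from_observations(z, mu):.4f}")
```

The trend matches the qualitative statement above: regions with a higher gas fraction are predicted to have a lower metallicity.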
4.6 Comparing the Simple Model with Observations: Stellar Abundances in the Galaxy and the G-Dwarf Problem
Equation 4.14 gives a prediction of the number of stars $N(Z)$ having a metallicity $\leq Z$ as a function of $Z$ for a sample of long-lived stars, based on the Simple Model assumptions in Section 4.4. These predictions can be compared with observations. These comparisons require metallicity data for relatively large numbers of stars to ensure adequate statistics. Therefore, metallicity estimates are often made using photometric techniques, rather than using more precise spectroscopic measurements. These photometric estimates tend to measure the abundances of the elements that produce strong absorption in the light of the stars, rather than the overall metallicity. Therefore iron abundances [Fe/H] are generally quoted for studies of the metallicity distribution.

G- or K-type main sequence stars can conveniently be used in these studies because their lifetimes are sufficiently long that they will have survived from the earliest times to the present: these are known as G and K dwarfs. G dwarfs have generally been used because of the advantage of their greater luminosity and because the techniques of estimating metallicities have been better calibrated.

The metallicity distribution observed for metal-poor globular clusters (where the abundance of each cluster is used instead of individual stars) gives a tolerably good fit to the Simple Model prediction. However, matters are very different for the stars in the solar neighbourhood, within the disc of the Galaxy. The figure below gives a comparison of the predicted metallicity distribution of Equation 4.14 with observations of long-lived stars in the solar neighbourhood. The Simple Model prediction is found to be very different to the observed distribution. The Simple Model predicts a far larger proportion of metal-poor stars than are actually found. This has become known as the G dwarf problem.
The observed cumulative metallicity distribution for stars in the solar neighbourhood, compared with the Simple Model prediction for $p = 0.010$ and $Z_1 = Z_\odot = 0.017$. [The observed distribution uses data from Kotoneva et al., M.N.R.A.S., 336, 879, 2002, for stars in the Hipparcos Catalogue.]
The differential metallicity distribution, representing simply a histogram of star numbers as a function of [Fe/H], also shows the failure of the Simple Model prediction. This is shown in the figure below.
The observed differential metallicity distribution for stars in the solar neighbourhood, compared with the Simple Model prediction for $p = 0.010$ and $Z_1 = Z_\odot = 0.017$. [The observed distribution uses data from Kotoneva et al., M.N.R.A.S., 336, 879, 2002, for stars in the Hipparcos Catalogue.]
Determining the metallicity distribution in galaxies outside our own is very difficult observationally. G dwarf stars are very faint in even the nearest Local Group galaxies.
4.7 Solutions to the G-Dwarf Problem
The G-dwarf problem indicates that the Simple Model is an oversimplification in the solar neighbourhood: one or more of the assumptions in Section 4.4 must be wrong. This is a very important result, but precisely which of the assumptions are wrong is difficult to say. A better fit to the observed data can be had by relaxing any of a number of the Simple Model assumptions. These include:
• the gas was not initially of zero metallicity;
• there has been an inflow of very metal-poor gas (this can help, but the value of the yield $p$ must be adjusted);
• there has been a variable initial mass function (which could result in a change in the fraction $\alpha$ of mass that remains locked up in long-lived stars, or in a change in the yield $p$);
• the samples of stars used to test the Simple Model are biased against low-metallicity stars (but considerable care is taken by observers to correct for these effects).

(The initial mass function is the number $N$ of stars as a function of star mass $M$ immediately following star formation. There is virtually no evidence that the initial mass function varied with time.)

One possible change to the Simple Model is to allow for a loss of gas from the volume. This is plausible, because supernovae following star formation could drive gas out of the region of the galaxy being studied. One possible prescription would be to set the outflow rate to be proportional to the rate of star formation. Therefore, the loss of gas from the volume in a time $\delta t$ is $c\, \delta M_{\rm stars}$, where $c$ is a constant. In this case the total mass $M_{\rm total}$ in the volume varies with time, unlike in the Simple Model. However, it is found that a loss of enriched gas would make the G dwarf problem even worse: it reduces the quantity of enriched gas to make stars.

One modification to the Simple Model that can achieve a better fit between theoretical prediction and observations is to allow for the inflow of gas. This gas could be unprocessed, primordial gas. Analytic models of this type often set the inflow rate to be proportional to the star formation rate, so the inflow of gas in a time $\delta t$ is $c\, \delta M_{\rm stars}$, where $c$ is a constant. This can produce a better fit between models and observations, provided the yield $p$ is chosen appropriately.
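The effect of inflow can be explored with a small numerical experiment. The Python sketch below extends the bookkeeping of Equations 4.7 and 4.8 by adding metal-free gas at a rate $c\, \delta M_{\rm stars}$, and measures the fraction of stellar mass formed below a chosen metallicity by the time the gas reaches $Z_1$. The update rules for the inflow case are an assumption made here (inflowing gas adds mass but no heavy elements), not a derivation given in these notes. The yield $p = 0.02$ used for the inflow run is hypothetical, chosen only so that the gas can reach $Z_1$, and the threshold $Z_1/4$ is arbitrary.

```python
def metal_poor_fraction(p, c, z1, z_cut, d_m_stars=1.0e-5, max_steps=10_000_000):
    """Fraction of stellar mass formed with Z <= z_cut before the gas reaches z1.

    c is the inflow rate of metal-free gas in units of the star formation
    rate; c = 0 recovers the closed-box Simple Model. Masses are in units
    of the initial gas mass. Assumes z1 > 0.
    """
    m_gas, m_metals = 1.0, 0.0
    m_stars, m_stars_below_cut = 0.0, 0.0
    for _ in range(max_steps):
        z = m_metals / m_gas
        if z >= z1:
            return m_stars_below_cut / m_stars
        if m_gas <= d_m_stars:
            break
        if z <= z_cut:
            m_stars_below_cut += d_m_stars
        m_metals += (p - z) * d_m_stars   # Equation 4.7; inflow adds no metals
        m_gas += (c - 1.0) * d_m_stars    # gas lost to stars, gained from inflow
        m_stars += d_m_stars
    raise ValueError("gas never reached z1 with these parameters")


z1 = 0.017
z_cut = z1 / 4.0

closed_box = metal_poor_fraction(p=0.010, c=0.0, z1=z1, z_cut=z_cut)
with_inflow = metal_poor_fraction(p=0.020, c=1.0, z1=z1, z_cut=z_cut)

print(f"closed box (c = 0, p = 0.010): fraction below Z_1/4 = {closed_box:.3f}")
print(f"inflow     (c = 1, p = 0.020): fraction below Z_1/4 = {with_inflow:.3f}")
```

Under these assumptions the inflow run forms a much smaller fraction of its stars at low metallicity than the closed box, which is the sense of the change needed to ease the G dwarf problem. The comparison also illustrates the remark above that the yield has to be re-chosen when inflow is included.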
4.8 Nucleosynthesis
The processes by which chemical elements are created are called nucleosynthesis. The elements produced in the hot, dense conditions in the Big Bang (hydrogen, helium and Li 7 ) were created in primordial nucleosynthesis. Other isotopes and elements were produced by nucleosynthesis inside stars and in supernova explosions (including some additional helium). The nuclear reactions responsible are complex and varied. The proto-proton chain is a series of reactions that fuse protons to form helium. In summary these reactions involve 4 H1 → He4 with the addional production of positrons, neutrinos and gamma rays. The carbon-nitrogen cycle and the carbon-nitrogen-oxygen bicycle also fuse protons to form He 4 using pre-existing C12 in these reactions, creating N and O as (mostly) temporary byproducts which ultimately return to C12 . However, incomplete reactions in this series can leave behind some N, O and F. Helium burning occurs in evolved red giant stars. In summary, 3He4 → C12 through an intermediate stage involving Be8 . Reactions of this type can continue, with carbon burning and oxygen burning producing elements such as Mg and Si. Another important process involves the capture of He nuclei by other nuclei, known as α-capture (because the He nucleus is an α particle). For example, O16 + He4 → Ne20 . Elements that are mainly in form of isotopes consisting of multiples of α particles are known as α elements and are relatively abundant. A number of isotopes are built up by neutron capture. This process can occur slowly inside stars over long periods of time, when it is called the s-process. Under extreme circumstances in supernova explosions, neutron capture can occur rapidly and it is called the r-process. Some isotopes are particularly stable, while others are fragile in the high temperatures in stellar interiors (and participate readily in reactions with other nuclei, producing other isotopes). 
This stability/fragility affects the abundances of the elements produced during nucleosynthesis. When the amount of an element produced by nucleosynthesis in stars does not depend on the abundances of elements in the gas that formed the stars, that element is said to be a primary element. On the other hand, if the amount of an element produced by nucleosynthesis does depend on the abundances in the gas that went into the star, the element is said to be a secondary element. For example, the amount of the isotope N14 produced in stars depends on the abundance of carbon in the gas that formed the stars: the more carbon, the
more N14 can be formed as a minor byproduct of the CNO bicycle.
4.9
Element Ratios
The sections on chemical evolution above considered the changes in the overall metallicity Z. However, the abundances of individual elements provide very important additional information. The ratios of the abundances of individual elements, for example oxygen to iron or carbon to oxygen, have been determined by the details of nucleosynthesis and the enrichment of the interstellar medium over time, including how the relative quantities of these elements ejected into the interstellar gas have varied with time.
The [O/Fe] element abundance ratio plotted as a function of [Fe/H]. [Based on data from Edvardsson et al., Astron. Astrophys., 275, 101, 1993, supplemented with data from Zhang & Zhao, M.N.R.A.S., 364, 712, 2005.]
Chemical abundances can be measured from the spectra of stars. Correlations are often found between abundance ratios. An important example is how the oxygen-to-iron ratio, [O/Fe], varies with the iron-to-hydrogen ratio, [Fe/H]. Stars with metallicities similar to that of the Sun have, unsurprisingly, [O/Fe] values similar to that of the Sun. Metal-poor stars have [O/Fe] values that are larger than that of the Sun, with the [O/Fe] values increasing with decreasing [Fe/H] until a near-constant value is reached. The conventional interpretation of this is that the heavy-element enrichment that produced the material that went into very metal-poor stars was caused predominantly by type II supernovae. Type II supernovae occur soon after a burst of star formation and produce large quantities of oxygen relative to iron. Therefore the material in very metal-poor stars had high values of [O/Fe]. Later, type Ia supernovae produced larger quantities of iron compared with oxygen, reducing the oxygen-to-iron ratio in the interstellar gas. Later stars were therefore less metal-poor and had [O/Fe] values closer to that of the Sun.
Chapter 5
Rotation Curves
5.1
Circular Velocities and Rotation Curves
The circular velocity $v_{\rm circ}$ is the velocity that a star in a galaxy must have to maintain a circular orbit at a specified distance from the centre, on the assumption that the gravitational potential is symmetric about the centre of the orbit. In the case of the disc of a spiral galaxy (which has an axisymmetric potential), the circular velocity is the orbital velocity of a star moving in a circular path in the plane of the disc. If the absolute value of the acceleration is $g$, for circular velocity we have $g = v_{\rm circ}^2/R$, where $R$ is the radius of the orbit (with $R$ a constant for the circular orbit). Therefore, $\partial\Phi/\partial R = v_{\rm circ}^2/R$, assuming symmetry.
The rotation curve is the function $v_{\rm circ}(R)$ for a galaxy. If $v_{\rm circ}(R)$ can be measured over a range of $R$, it will provide very important information about the gravitational potential. This in turn gives fundamental information about the mass distribution in the galaxy, including dark matter.
We can go further in cases of spherical symmetry. Spherical symmetry means that the gravitational acceleration at a distance $R$ from the centre of the galaxy is simply $GM(R)/R^2$, where $M(R)$ is the mass interior to the radius $R$. In this case,
$$ \frac{v_{\rm circ}^2}{R} = \frac{GM(R)}{R^2} \quad\text{and therefore}\quad v_{\rm circ} = \sqrt{\frac{GM(R)}{R}} . \tag{5.1} $$
If we can assume spherical symmetry, we can estimate the mass inside a radial distance $R$ by inverting Equation 5.1 to give
$$ M(R) = \frac{v_{\rm circ}^2 R}{G} , \tag{5.2} $$
and can do so as a function of radius. This is a very powerful result which is capable of telling us important information about mass distribution in galaxies, provided that we have spherical symmetry. However, we must use a more sophisticated analysis for the general case where we do not have spherical symmetry. The more general case of axisymmetry is considered in Section 5.3.
5.2
Observations
Gas and young stars in the disc of a spiral galaxy will move on nearly closed orbits, and if the underlying potential is axisymmetric these will be nearly circular. Therefore if the bulk velocity $v$ of gas or young stars can be measured, it provides $v_{\rm circ}^2 = R\,\partial\Phi/\partial R$. Old stars should be avoided: old stars have a greater velocity dispersion around their mean orbital motion and their bulk rotational velocity will be slightly smaller than the circular velocity.
Spectroscopic radial velocities can be used to determine the rotational velocities of spiral galaxies provided that the galaxies are inclined to the line of sight. The analysis is impossible for face-on spiral discs, but inclined spirals can be used readily. The circular velocity $v_{\rm circ}$ is related to the velocity $v_r$ along the line of sight (corrected for the bulk motion of the galaxy) by $v_r = v_{\rm circ} \cos i$, where $i$ is the inclination angle of the disc of the galaxy to the line of sight (defined so that $i = 90^\circ$ for a face-on disc). Placing a spectroscopic slit along the major axis of the elongated image of the disc on the sky provides the rotation curve from optical observations. Radio observations of the 21 cm line of neutral hydrogen at a number of positions on the disc of the galaxy can also provide rotation curves. For example, in our Galaxy the circular velocity at the solar distance from the Galactic Centre is 220 km s$^{-1}$ (at $R_0 = 8.0$ kpc from the centre).
When people first started measuring rotation curves (c. 1970), it quickly became clear that the mass in disc galaxies does not follow the visible disc. It was found that disc galaxies generically have rotation curves that are fairly flat to as far out as they could be measured (out to several scale lengths). This is very different to the behaviour that would be expected were the visible mass – the mass of the stars and gas – the only matter in the galaxies. This is interpreted as strong evidence for the existence of dark matter in galaxies. The simplest interpretation of a flat rotation curve is based on the assumption that the dark matter is spheroidally distributed in a ‘dark halo’. For a spherical distribution of mass, $v_{\rm circ}$ = constant implies that the enclosed mass $M(r) \propto r$, and so $\rho(r) \propto 1/r^2$.
Rotation curves determined from optical spectra are generally limited to $\simeq$ a few scale lengths (assuming an exponential density profile). These do provide important evidence of flat rotation curves.
However, 21 cm radio observations can be followed out to significantly greater distances from the centres of spiral galaxies, using the emission from the atomic hydrogen gas. These H I observations provide powerful evidence of a constant circular velocity with radius, out to radial distances where the density of stars has declined to very low levels, providing strong evidence for the existence of extensive dark matter haloes. As yet it is not clear exactly how far dark matter haloes extend. Neither is there a good estimate of the total mass of any disc galaxy. This is what makes disc rotation curves very important.
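As a rough illustration of Equation 5.2, the following sketch estimates the mass enclosed within the solar orbit from the values quoted above (220 km s$^{-1}$ at 8.0 kpc), and shows that a flat rotation curve implies an enclosed mass growing in proportion to radius. It assumes spherical symmetry, which is only an approximation for the Galaxy; the function name `enclosed_mass` and the rounded physical constants are choices made for this example.

```python
# Physical constants (rounded values, adequate for an order-of-magnitude estimate)
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
KPC = 3.0857e19    # one kiloparsec in metres
M_SUN = 1.989e30   # one solar mass in kg


def enclosed_mass(v_circ_kms, radius_kpc):
    """Mass in solar masses inside radius_kpc, from Equation 5.2: M = v^2 R / G.

    Assumes a spherically symmetric mass distribution.
    """
    v = v_circ_kms * 1.0e3        # km/s -> m/s
    r = radius_kpc * KPC          # kpc -> m
    return v * v * r / G / M_SUN


# Values quoted in the text for the solar neighbourhood
m_solar_orbit = enclosed_mass(220.0, 8.0)

# If the rotation curve stays flat, doubling the radius doubles the enclosed mass
m_double_radius = enclosed_mass(220.0, 16.0)

print(f"M(< 8 kpc)  ~ {m_solar_orbit:.1e} solar masses")
print(f"M(< 16 kpc) ~ {m_double_radius:.1e} solar masses")
```

The first number comes out at roughly $10^{11}$ solar masses, and the linear growth with radius is the $M(r) \propto r$ behaviour described above.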
Figure 5.1: The spiral galaxy NGC 2841 and its H I 21 cm radio rotation curve. The figure on the left presents an optical (blue light) image of the galaxy, while that on the right gives the rotation curve in the form of the circular velocity plotted against radial distance. The optical image covers the same area of the galaxy as the radio observations: the 21 cm radio emission from the atomic hydrogen gas is detected over a much larger area than the galaxy covers in the optical image. [The optical image was created using Digitized Sky Survey II blue data from the Palomar Observatory Sky Survey. The rotation curve was plotted using data by A. Bosma (Astron. J., 86, 1791, 1981) taken from S. M. Kent (AJ, 93, 816, 1987).]
5.3
Theoretical Interpretation
However, one needs to be careful about interpreting flat rotation curves. The existence of dark matter haloes is a very important subject and caution is appropriate before accepting evidence that has profound significance to our understanding of matter in the Universe. For this reason, attempts were made to model observed rotation curves using as little mass in the dark matter haloes as possible. These ‘maximal disc models’ attempted to fit the observed data by assuming that the stars in the galactic discs had as much mass as could still be consistent with our understanding of stellar populations. They still, however, required a contribution from a dark matter halo at large radii when H I observations were taken into account.
Importantly, the maximum contribution to the rotation curve from an $e^{-R/R_0}$ disc is not (as we might naively expect) around $R_0$ but around $2.5R_0$. Adding the effect of a bulge can easily give a fairly flat rotation curve to $4R_0$ without a dark halo. To be confident about the dark halo, one needs to have the rotation curve for $\gtrsim 5R_0$. In practice, that means H I measurements; optical rotation curves do not go out far enough to say anything conclusive about dark haloes.
The rest of this section is a more detailed working out of the previous paragraph. It follows an elegant derivation and explanation due to A. J. Kalnajs. Consider an axisymmetric disc galaxy. Consider the rotation curve produced by the disc matter only (at this stage we shall not consider the contribution from the bulge or from the dark matter halo). This analysis will use a cylindrical coordinate system $(R, \phi, z)$ with the disc at $z = 0$ and $R = 0$ at the centre of the galaxy. Let the surface mass density of the disc be $\Sigma(R)$. The gravitational potential in the plane of the disc at the point $(R, \phi, 0)$ is
$$ \Phi(R) = -G \int_0^\infty R'\,\Sigma(R')\,dR' \int_0^{2\pi} \frac{d\phi}{\sqrt{R^2 + R'^2 - 2RR'\cos\phi}} , \tag{5.3} $$
found by integrating the contribution from volume elements over the whole disc. To make this tractable, let us first define a function $L(u)$ so that
$$ L(u) \equiv \frac{1}{2\pi} \int_0^{2\pi} \frac{d\phi}{\sqrt{1 + u^2 - 2u\cos\phi}} . \tag{5.4} $$
(The function within the integral can be expanded into terms called Laplace coefficients, which are explained in many old celestial mechanics books.) This can be expanded as
$$ L(u) = 1 + \frac{u^2}{4} + \frac{9}{64}u^4 + \frac{25}{256}u^6 + \frac{1225}{16384}u^8 + O(u^{10}) \quad\text{for } u < 1 , \tag{5.5} $$
either using a binomial expansion of the function in $u$ or by expressing it as Laplace coefficients (which uses Legendre polynomials), and then integrating each term in the expansion. The integration over $\phi$ in Equation 5.3 can be expressed in terms of $L(u)$ as
$$ \int_0^{2\pi} \frac{d\phi}{\sqrt{R^2 + R'^2 - 2RR'\cos\phi}} = \begin{cases} \dfrac{2\pi}{R}\, L\!\left(\dfrac{R'}{R}\right) & \text{for } R' < R \\[2ex] \dfrac{2\pi}{R'}\, L\!\left(\dfrac{R}{R'}\right) & \text{for } R' > R , \end{cases} \tag{5.6} $$
because the expansion of $L(u)$ assumed that $u < 1$. Splitting the integration in Equation 5.3 into two parts (for $R' = 0$ to $R$, and for $R' = R$ to $\infty$) and substituting for $L(R'/R)$ and $L(R/R')$, we obtain
$$ \Phi(R) = -2\pi G \int_0^R \frac{R'}{R}\, \Sigma(R')\, L\!\left(\frac{R'}{R}\right) dR' \;-\; 2\pi G \int_R^\infty \Sigma(R')\, L\!\left(\frac{R}{R'}\right) dR' . \tag{5.7} $$
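The series in Equation 5.5 can be checked against a direct numerical evaluation of the integral in Equation 5.4. The sketch below does this with a simple midpoint rule in pure Python; the function names `L_numeric` and `L_series` and the number of integration points are choices made for this illustration, not part of the original derivation.

```python
import math


def L_numeric(u, n=2000):
    """Equation 5.4 evaluated with the midpoint rule over phi in [0, 2*pi]."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * h
        total += 1.0 / math.sqrt(1.0 + u * u - 2.0 * u * math.cos(phi))
    # Multiply by the step and divide by 2*pi, as in the definition of L(u)
    return total * h / (2.0 * math.pi)


def L_series(u):
    """Equation 5.5 truncated after the u^8 term (valid for u < 1)."""
    u2 = u * u
    return (1.0 + u2 / 4.0 + 9.0 * u2**2 / 64.0
            + 25.0 * u2**3 / 256.0 + 1225.0 * u2**4 / 16384.0)


for u in (0.1, 0.3, 0.5):
    print(f"u = {u}: integral = {L_numeric(u):.8f}, series = {L_series(u):.8f}")
```

The two agree closely for small $u$, and the truncated series falls progressively below the integral as $u$ approaches 1, where the neglected $O(u^{10})$ terms matter and $L(u)$ itself grows without bound.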
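Consider a star in a circular orbit in the disc at radius $R$, having a velocity $v$. The radial component of the acceleration is
$$ \frac{v^2}{R} = \frac{\partial\Phi}{\partial R} , $$
and hence
$$ v^2(R) = R\,\frac{\partial\Phi}{\partial R} = -2\pi G R\, \frac{d}{dR} \int_0^R \frac{R'}{R}\, \Sigma(R')\, L\!\left(\frac{R'}{R}\right) dR' \;-\; 2\pi G R\, \frac{d}{dR} \int_R^\infty \Sigma(R')\, L\!\left(\frac{R}{R'}\right) dR' . $$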
These two differentials of integrals can be simplified by using a result known as Leibniz’s Integral Rule, or Leibniz’s Theorem for the differential of an integral. This states that, for a function $f$ of two variables,
$$ \frac{d}{dc} \int_{a(c)}^{b(c)} f(x, c)\, dx = \int_{a(c)}^{b(c)} \frac{\partial}{\partial c} f(x, c)\, dx + f(b, c)\, \frac{db}{dc} - f(a, c)\, \frac{da}{dc} . \tag{5.8} $$
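This gives
$$ \frac{d}{dR} \int_0^R \frac{R'}{R}\, \Sigma(R')\, L\!\left(\frac{R'}{R}\right) dR' = \int_0^R \frac{\partial}{\partial R}\!\left[ \frac{R'}{R}\, L\!\left(\frac{R'}{R}\right) \right] \Sigma(R')\, dR' + \Sigma(R)\, L(1) $$
and
$$ \frac{d}{dR} \int_R^\infty \Sigma(R')\, L\!\left(\frac{R}{R'}\right) dR' = \int_R^\infty \frac{\partial}{\partial R}\!\left[ L\!\left(\frac{R}{R'}\right) \right] \Sigma(R')\, dR' - \Sigma(R)\, L(1) . $$
Therefore we get
$$ v^2(R) = -2\pi G R \int_0^R \frac{d}{dR}\!\left( \frac{R'}{R}\, L\!\left(\frac{R'}{R}\right) \right) \Sigma(R')\, dR' \;-\; 2\pi G R \int_R^\infty \frac{d}{dR}\, L\!\left(\frac{R}{R'}\right) \Sigma(R')\, dR' . $$
But,
$$ \begin{aligned} \frac{d}{dR}\!\left( \frac{R'}{R}\, L\!\left(\frac{R'}{R}\right) \right) &= -\frac{R'}{R^2}\, L\!\left(\frac{R'}{R}\right) + \frac{R'}{R}\, \frac{d}{dR} L\!\left(\frac{R'}{R}\right) \\ &= -\frac{R'}{R^2}\, L\!\left(\frac{R'}{R}\right) + \frac{R'}{R}\, \frac{dL(R'/R)}{d(R'/R)}\, \frac{d(R'/R)}{dR} \\ &= -\frac{R'}{R^2}\, L\!\left(\frac{R'}{R}\right) - \frac{R'^2}{R^3}\, L'\!\left(\frac{R'}{R}\right) , \end{aligned} $$
writing $L'(u) \equiv dL(u)/du$. Therefore,
$$ v^2 = 2\pi G \int_0^R \left[ \frac{R'}{R}\, L\!\left(\frac{R'}{R}\right) + \left(\frac{R'}{R}\right)^{2} L'\!\left(\frac{R'}{R}\right) \right] \Sigma(R')\, dR' \;-\; 2\pi G \int_R^\infty \frac{R}{R'}\, L'\!\left(\frac{R}{R'}\right) \Sigma(R')\, dR' . \tag{5.9} $$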
This can be quite messy and it can be abbreviated as
$$ v^2(R) = 2\pi G \int_0^\infty K\!\left(\frac{R}{R'}\right) \Sigma(R')\, dR' , \tag{5.10} $$
where the function $K(R/R')$ represents the function over both the $R' = 0$ to $R$ and the $R' = R$ to $\infty$ domains. Changing variables to $x \equiv \ln R$, $y \equiv \ln R'$, we can write this as a convolution
$$ v^2(R) = 2\pi G \int_{-\infty}^{\infty} K(e^{x-y})\, R'\, \Sigma(R')\, dy . \tag{5.11} $$
The kernel $K(R/R')$ is shown in Figure 5.2.
Figure 5.2: The kernel $K(R/R')$. Observe that the $R > R'$ part tends to have higher absolute value than the $R < R'$ part.
Figure 5.3 shows $R\,\Sigma(R)$ and $v^2$ for an exponential disc, but the general shapes are not very sensitive to whether $\Sigma(R)$ is precisely exponential. The important qualitative fact is that whatever $R\,\Sigma(R)$ does, $v^2$ does roughly the same, but expanded by a factor of $\simeq e$. The distinctive shape of the $v^2(\ln R)$ curve for realistic discs makes it very easy to recognise non-disc mass. Figure 5.4, following Kalnajs, shows the rotation curves you get by adding either a bulge or a dark halo. (Actually this figure fakes the bulge/halo contribution by adding a smaller/larger disc; but if you properly add spherical mass distributions for bulge/halo, the result is very similar.) Kalnajs’s point is that a bulge+disc rotation curve has a similar shape to a disc+halo rotation curve – only the scale is different. So when examining a flat(-ish) rotation curve, you must ask what the disc scale radius is.
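For reference, the kernel can be written out explicitly. The following is a sketch obtained by comparing the integrands of Equations 5.9 and 5.10 term by term, with $L'(u) \equiv dL/du$; it is a restatement of those equations rather than an independent result:

```latex
K\!\left(\frac{R}{R'}\right) =
\begin{cases}
\dfrac{R'}{R}\, L\!\left(\dfrac{R'}{R}\right)
  + \left(\dfrac{R'}{R}\right)^{2} L'\!\left(\dfrac{R'}{R}\right)
  & \text{for } R' < R , \\[2ex]
-\,\dfrac{R}{R'}\, L'\!\left(\dfrac{R}{R'}\right)
  & \text{for } R' > R .
\end{cases}
```

Since the series for $L(u)$ has only positive coefficients, $L'(u) > 0$ for $0 < u < 1$, so the kernel is positive for mass interior to $R$ and negative for mass exterior to $R$: in a disc, unlike in a spherical system, exterior mass reduces $v^2$.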
5.4
Representing Dark Matter Distributions
The dark matter within spiral galaxies does not appear to be confined to the discs and it is probably distributed approximately spheroidally. A popular density profile that has been adopted for modelling dark matter haloes has the form
$$ \rho(r) = \frac{\rho_0}{1 + (r/a)^2} , \tag{5.12} $$
where $r$ is the radial distance from the centre of the galaxy, $\rho_0$ is the central dark matter density, and $a$ is a constant. This form does reproduce the observed rotation curves of spiral galaxies adequately: it gives a circular velocity that is $v_{\rm circ} = 0$ at $R = 0$, that rises rapidly with the radial distance $R$ in the plane of the disc, and then becomes flat ($v_{\rm circ}$ = constant) for $R \gg a$. This profile, however, has the problem that its mass is infinite. Therefore a more practical functional form is
$$ \rho(r) = \frac{\rho_0}{1 + (r/a)^n} , \tag{5.13} $$
where $a$ and $n$ are constants, with $n > 3$ giving a finite mass (the mass integral $\int \rho\, r^2\, dr$ converges at large $r$ only if $\rho$ falls off faster than $r^{-3}$).
Figure 5.3: The dashed curve is $R\,\Sigma(R)$ for an exponential disc with $\Sigma \propto e^{-R}$ and the solid curve is $v^2(R)$. Note that $R$ is measured in disc scale lengths, but the vertical scales are arbitrary.
Some numerical N-body simulations of galaxy formation have predicted that dark matter haloes will have density profiles of the form
$$ \rho(r) = \frac{k}{r\,(a + r)^2} , \tag{5.14} $$
where $a$ and $k$ are constants. This is known as the Navarro-Frenk-White profile after the scientists who first described it. It fits the densities of collections of particles representing dark matter haloes in numerical simulations, and does so adequately over broad ranges in masses and sizes. It is therefore often used to represent the dark matter haloes of galaxies and also of clusters of galaxies. The profiles above are spherical: the density depends only on the radial distance $r$ from the centre. These functional forms for $\rho$ can be modified to allow for flattened systems.
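The behaviour claimed for the profile of Equation 5.12 (a circular velocity rising from zero and flattening for $R \gg a$) can be illustrated numerically. The sketch below integrates the density to get the enclosed mass, compares it with the closed form $M(r) = 4\pi\rho_0 a^3\,[r/a - \arctan(r/a)]$ (a standard integral, not quoted in the text above), and evaluates $v_{\rm circ} = \sqrt{GM(r)/r}$ as in Equation 5.1. The values of $\rho_0$ and $a$ are hypothetical and chosen only for illustration.

```python
import math

G = 4.301e-6   # gravitational constant in kpc (km/s)^2 / M_sun

RHO0 = 1.0e8   # central density in M_sun / kpc^3 (hypothetical value)
A = 2.0        # core radius in kpc (hypothetical value)


def mass_numeric(r, rho0=RHO0, a=A, n=20000):
    """Enclosed mass: midpoint-rule integral of 4*pi*x^2*rho(x) from 0 to r."""
    h = r / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += 4.0 * math.pi * x * x * rho0 / (1.0 + (x / a) ** 2)
    return total * h


def mass_analytic(r, rho0=RHO0, a=A):
    """Closed-form enclosed mass for the profile of Equation 5.12."""
    return 4.0 * math.pi * rho0 * a**3 * (r / a - math.atan(r / a))


def v_circ(r, rho0=RHO0, a=A):
    """Circular velocity in km/s from Equation 5.1, assuming spherical symmetry."""
    return math.sqrt(G * mass_analytic(r, rho0, a) / r)


# Asymptotic (flat) velocity reached for r >> a
v_flat = math.sqrt(4.0 * math.pi * G * RHO0) * A

for r in (0.5, 2.0, 10.0, 50.0, 200.0):
    print(f"r = {r:6.1f} kpc  v_circ = {v_circ(r):7.2f} km/s  (flat value {v_flat:.2f})")
```

The enclosed mass grows linearly with $r$ at large radii, which is why the total mass of this profile is infinite even though the rotation curve it produces is well behaved.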
Figure 5.4: Plots of $v^2$ against $\ln R$ (upper panel) or $v$ against $R$ (lower panel). For one curve in each panel, a second exponential disc with mass and scale radius both scaled down by $e^2 \simeq 7.39$ has been added (to mimic a bulge); for the other curve a second exponential disc with mass and scale radius both scaled up by $e^2 \simeq 7.39$ has been added (to mimic a dark halo).
Chapter 6
Gravitational Lensing and Dark Matter in the Galactic Halo
6.1
Introduction
Gravitational lensing is the process that causes the appearance of distant bright objects to be altered by the gravity of foreground mass. Being a purely gravitational effect makes lensing astrophysically important as a probe of mass, including dark matter as well as visible matter. Examples of gravitational lensing that have been observed include
• microlensing by stars, brown dwarfs etc. in the Galactic halo;
• deflection of light and radio waves by the Sun;
• lensing by distant galaxies; and
• lensing by galaxy clusters.
We shall begin this Chapter with a detailed review of gravitational lensing. The purpose of this is to explain the background to gravitational lensing before using these principles to understand how the light of distant stars can be lensed by objects within our Galaxy. It is the sections on microlensing in the Milky Way that are really syllabus material. The rest you should consider as relevant background material, plus information of general interest.
6.2
Gravitational Deflection of Light by a Point Mass
Photons are affected by a gravitational field, but not in the same way as massive particles are. For the details we need general relativity, but fortunately, for astrophysical applications we only need to take over a few simple results. The most important is that if a light ray passes by a mass $M$ with impact parameter $R$ ($\gg GM/c^2$ and $\gg$ the size of the mass), it gets deflected by an angular amount
$$ \alpha = \frac{4GM}{c^2 R} . \tag{6.1} $$
Figure 6.1: The deflection of a ray of light by a point mass. The deflection angle is $\alpha$.
In contrast, a massive body at high speed $v$ gets deflected by $\alpha = 2GM/(v^2 R)$ (which was proved in the discussion of relaxation time in stellar dynamics). In most practical applications, the gravitational deflection of light is very small. For example, the deflection of a ray of light skimming the surface of the Sun is only $8.5 \times 10^{-6}$ rad = 1.75 arcsec. This was first measured using observations of a solar eclipse in 1919 by a team led by Arthur Eddington. If lensing takes place over a short enough distance that the deflection can be taken to be sudden, the lens is said to be geometrically thin. Otherwise it is a thick lens.
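The solar-limb figure quoted above follows directly from Equation 6.1. A minimal sketch, assuming rounded values for the physical constants and taking the impact parameter equal to the solar radius (the function name `deflection_angle` is chosen for this example):

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg
R_SUN = 6.957e8    # solar radius, m


def deflection_angle(mass_kg, impact_parameter_m):
    """Light deflection angle in radians from Equation 6.1: alpha = 4GM / (c^2 R)."""
    return 4.0 * G * mass_kg / (C**2 * impact_parameter_m)


# Ray of light grazing the surface of the Sun
alpha_rad = deflection_angle(M_SUN, R_SUN)
alpha_arcsec = math.degrees(alpha_rad) * 3600.0

print(f"alpha = {alpha_rad:.2e} rad = {alpha_arcsec:.2f} arcsec")
```

Because $\alpha \propto 1/R$, a ray passing at twice the solar radius is deflected by half this amount.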
6.3
The Lensing Equation
To make Equation 6.1 useful we need two approximations, both very good in almost all astrophysical situations: (i) The deflector is much smaller than the distances to the observer and the object being viewed (the ‘source’); (ii) The deflections are always very small, so we can freely use sin α = α, and also we can get the total deflection from a mass distribution by integrating Equation 6.1.
Figure 6.2: The definitions of the quantities $D_L$, $D_S$, $D_{LS}$, $\boldsymbol\theta$, $\boldsymbol\theta_S$ and $\boldsymbol\alpha$.
Accordingly, let us consider a situation as in Figure 6.2: the observer is viewing a source at distance $D_S$, with a lens (a mass screen) intervening at distance $D_L$; $D_{LS}$ is the distance from the lens to the source. On galactic scales $D_L$, $D_S$, $D_{LS}$ are ordinary distances, but on cosmological scales they must be understood as angular diameter distances, and $D_S \neq D_L + D_{LS}$. The reason for this complication is that the universe will have expanded substantially over the light travel time. We shall ignore these cosmological effects in this analysis because our objective is to understand lensing within the neighbourhood of our Galaxy.
We can use angular coordinates to describe the transverse positions.¹ Let $\boldsymbol\theta_S$ be the position of the source, and $\boldsymbol\theta$ be its observed position after being deflected. Note that these are two-dimensional angular positions, and they are therefore represented as vectors. Let $\boldsymbol\alpha(\boldsymbol\theta)$ be the deflection angle. $\boldsymbol\theta_S$, $\boldsymbol\theta$ and $\boldsymbol\alpha$ will be measured in radians. Let $\Sigma(\boldsymbol\theta)$ be the lens’s surface mass density (expressed as the mass per unit solid angle, in units such as kg sr$^{-1}$, solar masses per steradian, or solar masses per square arcsecond). Then, comparing vectors in the source plane, we get
$$ D_S\, \boldsymbol\theta = D_S\, \boldsymbol\theta_S + D_{LS}\, \boldsymbol\alpha . \tag{6.2} $$
(By convention,² $\boldsymbol\alpha$ is directed outwards from the deflecting mass rather than towards it.) Using Equation 6.1 to get $\boldsymbol\alpha$ in terms of $\Sigma$, we get
$$ \boldsymbol\theta = \boldsymbol\theta_S + \frac{D_{LS}}{D_S}\, \boldsymbol\alpha(\boldsymbol\theta) , \qquad \boldsymbol\alpha(\boldsymbol\theta) = \frac{4G}{c^2 D_L} \int \frac{\Sigma(\boldsymbol\theta')\, (\boldsymbol\theta - \boldsymbol\theta')}{|\boldsymbol\theta - \boldsymbol\theta'|^2}\, d^2\theta' . \tag{6.3} $$
This is known as the lens equation. It gives $\boldsymbol\theta_S$ as an explicit function of $\boldsymbol\theta$, but $\boldsymbol\theta$ as an implicit function of $\boldsymbol\theta_S$. Moreover, $\boldsymbol\theta(\boldsymbol\theta_S)$ need not be single-valued, so sources can be multiply imaged.
6.4
Time Delays
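The deflected light experiences a time delay because of:
• the increased geometric light travel time $T_{\rm geom} = \frac{1}{2} T_0\, (\boldsymbol\theta - \boldsymbol\theta_S)^2$, where $T_0 \equiv D_L D_S / (c\, D_{LS})$;
• the delay within the gravitational potential, $-\Psi(\boldsymbol\theta)$ (the Shapiro time delay).
The total time delay is therefore
$$ T = \frac{1}{2}\, T_0\, (\boldsymbol\theta - \boldsymbol\theta_S)^2 - \Psi(\boldsymbol\theta) . \tag{6.4} $$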
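The potential time delay, $\Psi$, is a scalar function with the dimensions of time. We denote it here by the capital letter Psi (a related quantity called the lensing potential will be introduced later and will be denoted by a lower-case psi, $\psi$: take care not to confuse them). It is given by
$$ \Psi(\boldsymbol\theta) = \frac{4G}{c^3} \int \Sigma(\boldsymbol\theta')\, \ln|\boldsymbol\theta - \boldsymbol\theta'|\, d^2\theta' . \tag{6.5} $$
The total time delay $T$ has the property that, at the observed image positions,
$$ \nabla_\theta T = 0 , $$
where $\nabla_\theta$ represents the gradient in the lens plane, i.e.
$$ \nabla_\theta \equiv \hat{\mathbf e}_x \frac{\partial}{\partial\theta_x} + \hat{\mathbf e}_y \frac{\partial}{\partial\theta_y} . $$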
¹ Later on, we’ll use $\theta_r$, $\theta_x$, $\theta_y$ as coordinates rather than $r$, $x$, $y$, to remind us that these are angles on the sky, not distances.
² The astrophysical convention being that you first think how a rational person would do it, and then you change the sign.
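This is merely Fermat’s Principle, a standard result in optics. It states that the path taken by the light makes the travel time stationary (in the simplest case, a minimum) given the particular lens-observer-source configuration. The four equations
$$ \nabla T = 0 , \qquad T = \frac{1}{2}\, T_0\, (\boldsymbol\theta - \boldsymbol\theta_S)^2 - \Psi(\boldsymbol\theta) , \qquad \Psi(\boldsymbol\theta) = \frac{4G}{c^3} \int \Sigma(\boldsymbol\theta')\, \ln|\boldsymbol\theta - \boldsymbol\theta'|\, d^2\theta' , \qquad T_0 = \frac{D_L D_S}{c\, D_{LS}} , \tag{6.6} $$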
represent a reformulation of the results of Equation 6.3. Although it is possible to work entirely with Equation 6.3, Equations 6.6 are much more intuitive. In the cosmological situation, both terms for T need to be multiplied by (1 + zL ), where zL is the redshift of the lensing object. The gravitational time delay Ψ can be derived directly from general relativity, independently of Equation 6.1, and is known as the Shapiro time delay. Radio astronomers can measure it directly within the Solar System.
6.5
The Einstein Radius and Einstein Rings
Figure 6.3: The Einstein ring produced by symmetrical lensing of a point object by a point mass.
Consider a point mass $M$, which happens to be precisely between us and a point source. In other words $\boldsymbol\theta_S = 0$ and $\Sigma(\boldsymbol\theta) = M\,\delta(\boldsymbol\theta)$. From the symmetry, we expect to observe a ring. The lens equation (6.3) is solved by $|\boldsymbol\theta| = \theta_E$, with
$$ \theta_E^2 = \frac{4GM}{c^2}\, \frac{D_{LS}}{D_L D_S} . \tag{6.7} $$
The interpretation of this is that this perfectly aligned lens will produce an image that is a ring with an angular radius $\theta_E$ given (in radians) by
$$ \theta_E = \sqrt{\frac{4GM}{c^2}\, \frac{D_{LS}}{D_L D_S}} . \tag{6.8} $$
This circular image is called an Einstein ring. The physical length $R_E$ corresponding to the angle $\theta_E$ is called the Einstein radius and is given by $R_E = \theta_E D_L$. The Einstein radius is therefore
$$ R_E = \sqrt{\frac{4GM}{c^2}\, \frac{D_L D_{LS}}{D_S}} . \tag{6.9} $$
By a Gauss’s-law type argument, for any circular mass distribution $\Sigma(\theta_r)$, $\Psi(\theta_r)$ and $\boldsymbol\alpha(\boldsymbol\theta)$ will be influenced only by interior mass. So we’ll get the same images for any circular distribution of the mass $M$, provided it fits within an Einstein radius. Bodies that fit within their own Einstein radius are said to be ‘compact’. But the Einstein radius depends on where the source and observer are:
$$ R_E \sim (\text{Schwarzschild radius} \times D_L)^{1/2} . $$
This effectively means that the further away you look, the easier it gets to see examples of gravitational lensing. This is a surprising fact at first, but it’s really just the gravitational analogue of a familiar fact about glass lenses – to get the maximum effect from a lens you have to be near the focal plane; if you’re too near, the lens doesn’t have much effect.
6.6
The Critical Surface Density Σcrit
For given values of $D_L$ and $D_S$, for a lens to be a compact object you have to pack a mass $M$ (in projection) into a circle of radius $\theta_E$. However, the area of this circle is proportional to the mass (through the dependence of $\theta_E$ on the mass). So clearly there has to be a critical density, say $\Sigma_{\rm crit}$, such that if $\Sigma \geq \Sigma_{\rm crit}$ somewhere then there is a compact object (or sub-object). $\Sigma_{\rm crit}$ corresponds to a mass surface density of $M/\pi\theta_E^2$. Substituting for $\theta_E$ from Equation 6.8, we find that
$$ \Sigma_{\rm crit} = \frac{D_L D_S}{D_{LS}}\, \frac{c^2}{4\pi G} . \tag{6.10} $$
It has units of mass per unit solid angle (e.g. kg sr$^{-1}$, $M_\odot$ sr$^{-1}$ or $M_\odot$ arcsec$^{-2}$). Note that $\Sigma_{\rm crit}$ depends on the distances $D_L$, $D_S$ and $D_{LS}$. If $\Sigma > \Sigma_{\rm crit}$, the lensing is compact. The fact there is a critical density, and that it depends on distances, has important astrophysical consequences. For example, a galaxy as a whole (a smooth distribution of $\sim 10^{12}\, M_\odot$ on a scale of $\sim 10^5$ pc) is not compact to lensing for $D_L \lesssim 10^9$ pc – cosmological distances. But clumps within the galaxy may be compact at much smaller distances. In particular, a star is compact to lensing at distances of even $\lesssim 1$ pc.
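To get a feel for the numbers in a Galactic microlensing situation, Equations 6.8 to 6.10 can be evaluated directly. The sketch below assumes a solar-mass lens at 4 kpc and a source at 8 kpc, with ordinary (non-cosmological) distances so that $D_{LS} = D_S - D_L$; these distances, the function names and the rounded constants are assumptions made purely for illustration.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg
KPC = 3.0857e19    # kiloparsec, m
AU = 1.496e11      # astronomical unit, m


def einstein_angle(mass_kg, d_l, d_s):
    """Angular Einstein radius in radians (Equation 6.8), with D_LS = D_S - D_L."""
    d_ls = d_s - d_l
    return math.sqrt(4.0 * G * mass_kg / C**2 * d_ls / (d_l * d_s))


def sigma_crit(d_l, d_s):
    """Critical surface density in kg per steradian (Equation 6.10)."""
    d_ls = d_s - d_l
    return d_l * d_s / d_ls * C**2 / (4.0 * math.pi * G)


# Hypothetical configuration: solar-mass lens halfway to a source at 8 kpc
D_L = 4.0 * KPC
D_S = 8.0 * KPC

theta_e = einstein_angle(M_SUN, D_L, D_S)
theta_e_mas = math.degrees(theta_e) * 3600.0 * 1000.0   # milliarcseconds
r_e_au = theta_e * D_L / AU                             # Einstein radius, Eq. 6.9

# Consistency check: Sigma_crit is the mass M spread over the solid angle pi*theta_E^2
mass_from_sigma = sigma_crit(D_L, D_S) * math.pi * theta_e**2

print(f"theta_E ~ {theta_e_mas:.2f} milliarcsec, R_E ~ {r_e_au:.1f} AU")
```

An angular Einstein radius of about a milliarcsecond is far too small for the ring or the separate images to be resolved, which is why such events are studied through the change in brightness of the source rather than through image positions.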
6.7
The Lensing Potential ψ
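It is possible to define a two-dimensional function $\psi(\boldsymbol\theta)$ so that $\boldsymbol\alpha$ is related to the gradient of $\psi$ in the lens plane as
$$ \nabla_\theta \psi \equiv \boldsymbol\theta - \boldsymbol\theta_S = \frac{D_{LS}}{D_S}\, \boldsymbol\alpha(\boldsymbol\theta) , \tag{6.11} $$
from the lensing equation. $\psi$ is a dimensionless scalar function known as the lensing potential. It is denoted by a lower-case letter $\psi$ and care should be taken to avoid confusing it with the time delay $\Psi$ caused by the gravitational potential. $\Psi$ and $\psi$ are related by
$$ \Psi = T_0\, \psi = \frac{D_L D_S}{c\, D_{LS}}\, \psi . \tag{6.12} $$
Using this, we can write Equation 6.6 more concisely as
$$ \begin{aligned} \nabla T &= 0 , \qquad T = T_0 \left[ \tfrac{1}{2} (\boldsymbol\theta - \boldsymbol\theta_S)^2 - \psi(\boldsymbol\theta) \right] , \\ \psi(\boldsymbol\theta) &= \frac{1}{\pi} \int \kappa(\boldsymbol\theta')\, \ln|\boldsymbol\theta - \boldsymbol\theta'|\, d^2\theta' , \end{aligned} \tag{6.13} $$
where $\kappa$ is the projected mass density in units of the critical density (i.e. $\kappa \equiv \Sigma/\Sigma_{\rm crit}$). From the second line of Equation 6.13 it should be evident that $\psi$ satisfies a two-dimensional Poisson equation
$$ \nabla^2 \psi = 2\kappa . \tag{6.14} $$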
6.8
The Arrival Time Surface
The surface $T(\boldsymbol\theta)$ is known as the time delay surface or the arrival time surface. Wherever the arrival time is stationary (i.e., the surface has a maximum, minimum, or saddle point) there will be constructive interference, and an image. This is Fermat’s principle. Furthermore, the less the curvature of the surface at the images, the more magnified the image will be. This is formalised in the next section. Try to visualise the arrival time surface. The geometrical part is a parabola with a minimum at $\boldsymbol\theta_S$. Having mass in the lens pushes up the surface variously. If $\kappa(\boldsymbol\theta) > 1$ anywhere, there will be a maximum somewhere near there, hence another image. There must be a third image too, because to have a minimum and a maximum in a surface you must have a saddle point somewhere. In fact
$$ \text{maxima} + \text{minima} = \text{saddle points} + 1 . \tag{6.15} $$
This is really a statement about geometry that should be intuitively clear, though a formal proof is difficult. A good way of gaining some intuition about the arrival time surface is to take a transparency with a blank piece of paper behind it and look at the reflections of a light bulb. Notice how images merge and split, and how you get grotesquely stretched images just as they do. Deep images of rich clusters of galaxies show just these effects!
6.9
Magnification
By magnification we mean: how much does the image move when we move the source? It should be clear that this magnification can’t be a scalar, because an image doesn’t in general move in the same direction as the source. In fact the magnification is a tensor. We’ll denote it by $M$ (the letter $A$ for ‘amplification’ is also used). Formalising our definition, we have
$$ M^{-1} = \frac{\partial\boldsymbol\theta_S}{\partial\boldsymbol\theta} = \frac{\partial^2}{\partial\boldsymbol\theta^2}\, T(\boldsymbol\theta) , \tag{6.16} $$
with $T$ measured in units of $T_0$. In Cartesian coordinates
$$ M^{-1} = \begin{pmatrix} 1 - \dfrac{\partial^2\psi}{\partial\theta_x^2} & -\dfrac{\partial^2\psi}{\partial\theta_x\,\partial\theta_y} \\[2ex] -\dfrac{\partial^2\psi}{\partial\theta_y\,\partial\theta_x} & 1 - \dfrac{\partial^2\psi}{\partial\theta_y^2} \end{pmatrix} . $$
Notice that $M^{-1}$ is basically taking the curvature of the arrival time surface. It is helpful to write $M^{-1}$ in terms of its eigenvalues, and the usual form is
$$ M^{-1} = (1 - \kappa) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - \gamma \begin{pmatrix} \cos 2\phi & \sin 2\phi \\ \sin 2\phi & -\cos 2\phi \end{pmatrix} . \tag{6.17} $$
The eigenvalues are of course $1 - \kappa \pm \gamma$. The first term in Equation 6.17 is the trace part – and comparing Equations 6.17 and 6.14 shows that it must be $\kappa$ – while the second term is traceless. The term with $\kappa$ produces an isotropic expansion or contraction, while the $\gamma$ term produces a stretching in the $\phi$ direction and a shrinking in the perpendicular direction; $\kappa$ is known as ‘convergence’ and $\gamma$ as ‘shear’. The determinant of $M$ can be thought of as a scalar magnification. Since the determinant of $M^{-1}$ is the product of its eigenvalues,
$$ |M| = \left[ (1 - \kappa)^2 - \gamma^2 \right]^{-1} . \tag{6.18} $$
The areas of images on the sky are increased by a factor $|M|$. The places where one of the eigenvalues of $M^{-1}$ becomes zero (and in consequence $|M|$ is infinite) are in general curves and are known as critical curves. When critical curves are mapped onto the source plane through the lens equation, they give caustics; a source lying on a caustic gets infinitely magnified.
6.10
Examples of the Magnification: Lensing by a Point-Mass and by an Isothermal Sphere
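For a point mass, the lens equation is
$$ \theta_{Sx} = \theta_x - \frac{\theta_x}{\theta_r^2}\, \theta_E^2 , \qquad \theta_{Sy} = \theta_y - \frac{\theta_y}{\theta_r^2}\, \theta_E^2 , $$
and this gives
$$ M^{-1} = \begin{pmatrix} 1 - \left( \dfrac{1}{\theta_r^2} - \dfrac{2\theta_x^2}{\theta_r^4} \right) \theta_E^2 & \dfrac{2\theta_x\theta_y}{\theta_r^4}\, \theta_E^2 \\[2ex] \dfrac{2\theta_x\theta_y}{\theta_r^4}\, \theta_E^2 & 1 - \left( \dfrac{1}{\theta_r^2} - \dfrac{2\theta_y^2}{\theta_r^4} \right) \theta_E^2 \end{pmatrix} . \tag{6.19} $$
Taking the determinant and simplifying, we get
$$ |M|^{-1} = 1 - \frac{\theta_E^4}{\theta_r^4} . \tag{6.20} $$
We shall now consider a circular mass distribution $\Sigma \propto \theta_r^{-1}$. This is known as the ‘isothermal lens’, because it is the $\rho \propto 1/r^2$ isothermal sphere in projection. The lens equation for the isothermal lens is
$$ \theta_{Sx} = \theta_x - \frac{\theta_x}{\theta_r}\, \theta_E , \qquad \theta_{Sy} = \theta_y - \frac{\theta_y}{\theta_r}\, \theta_E , $$
and gives
$$ M^{-1} = \begin{pmatrix} 1 - \left( \dfrac{1}{\theta_r} - \dfrac{\theta_x^2}{\theta_r^3} \right) \theta_E & \dfrac{\theta_x\theta_y}{\theta_r^3}\, \theta_E \\[2ex] \dfrac{\theta_x\theta_y}{\theta_r^3}\, \theta_E & 1 - \left( \dfrac{1}{\theta_r} - \dfrac{\theta_y^2}{\theta_r^3} \right) \theta_E \end{pmatrix} . $$
And from this we get
$$ |M|^{-1} = 1 - \frac{\theta_E}{\theta_r} . \tag{6.21} $$
This is shorter in polar coordinates, but tensor components in polar coordinates can get confusing.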
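The determinants in Equations 6.20 and 6.21 can be checked numerically without doing any algebra, by differentiating the lens mapping $\boldsymbol\theta \to \boldsymbol\theta_S$ with finite differences. The sketch below does this in units where $\theta_E = 1$. It assumes the point-mass deflection $\theta_E^2\,\boldsymbol\theta/\theta_r^2$ and an isothermal deflection of constant magnitude $\theta_E$ directed along $\boldsymbol\theta$; the function names, test position and step size are choices made for this illustration.

```python
import math


def source_point_mass(tx, ty, theta_e=1.0):
    """Point-mass lens equation: theta_S = theta - theta_E^2 * theta / theta_r^2."""
    r2 = tx * tx + ty * ty
    return tx - theta_e**2 * tx / r2, ty - theta_e**2 * ty / r2


def source_isothermal(tx, ty, theta_e=1.0):
    """Isothermal lens equation: theta_S = theta - theta_E * theta / theta_r."""
    r = math.hypot(tx, ty)
    return tx - theta_e * tx / r, ty - theta_e * ty / r


def inverse_magnification(lens, tx, ty, h=1.0e-6):
    """Determinant of d(theta_S)/d(theta), using central finite differences."""
    sx_px, sy_px = lens(tx + h, ty)
    sx_mx, sy_mx = lens(tx - h, ty)
    sx_py, sy_py = lens(tx, ty + h)
    sx_my, sy_my = lens(tx, ty - h)
    a = (sx_px - sx_mx) / (2.0 * h)   # d theta_Sx / d theta_x
    b = (sx_py - sx_my) / (2.0 * h)   # d theta_Sx / d theta_y
    c = (sy_px - sy_mx) / (2.0 * h)   # d theta_Sy / d theta_x
    d = (sy_py - sy_my) / (2.0 * h)   # d theta_Sy / d theta_y
    return a * d - b * c


# Arbitrary image position, in units of the Einstein angle
tx, ty = 1.3, 0.7
theta_r = math.hypot(tx, ty)

det_point = inverse_magnification(source_point_mass, tx, ty)
det_iso = inverse_magnification(source_isothermal, tx, ty)

print(f"point mass: numerical {det_point:.6f}, Eq. 6.20 gives {1.0 - 1.0 / theta_r**4:.6f}")
print(f"isothermal: numerical {det_iso:.6f}, Eq. 6.21 gives {1.0 - 1.0 / theta_r:.6f}")
```

In both cases the determinant vanishes at $\theta_r = \theta_E$, so the Einstein ring is the critical curve on which the scalar magnification formally diverges.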
6.11
The Conservation of Surface Brightness
Magnification in lensing conserves surface brightness. We can prove this in a rather interesting way. Let us consider the axial direction as a formal time variable $t$; then light rays can be thought of as trajectories. Now allow observers to be at arbitrary transverse position $\mathbf w$ (a two-dimensional vector) and arbitrary $t$. Then $\boldsymbol\theta$ as observed at $(\mathbf w, t)$ is just the local $d\mathbf w/dt$ for the corresponding light ray, up to a constant factor. This means we can make a formal analogy with the Hamiltonian formulation of stellar dynamics, with $\boldsymbol\theta$ (up to a constant) playing the role of the momentum, $\mathbf w$ playing the role of the coordinates, and $\psi(\mathbf w, t)$ replacing the Newtonian potential. The phase space density $f$ is the density of photons in $(\mathbf w, \boldsymbol\theta)$ space, or the number of photons per unit solid angle on the sky per unit telescope area, i.e., the surface brightness. The collisionless Boltzmann equation applies (as it does for any Hamiltonian system) and it tells us that surface brightness is conserved along trajectories! Surface brightness must be conserved by the act of placing the lens there too – think of surface brightness before and after going through the lens. QED. We must be careful, though, to understand ‘along the trajectories’ correctly. It means we must always be looking at photons from the same source, so if the image is moved in the sky by lensing we must follow it when we measure surface brightness. This means that lensing changes the apparent sizes (and shapes) of objects, but does not alter their surface brightness. Lensing a source will change its apparent brightness because its angular area is changed, not because the surface brightness changes.
6.12
The Preservation of Spectroscopic and Colour Information
The gravitational lensing of light does not depend on the wavelength: the wavelength of light, or equivalently the energy of photons, does not appear in the equations describing lensing. Therefore, the spectrum of a source is not changed if the source is lensed by a foreground object. Equally, the colour of the source is unchanged. This provides an important test for the effects of lensing. If two images on the sky are suspected to be caused by lensing of a single object, we expect that their spectra will be the same. They should show the same spectral features, and have the same redshift. One qualification needs to be made. An extended source, such as a galaxy, may emit light from different regions that have different spectra and different colours. This can produce some practical changes in observed spectra on account of lensing in some instances, caused by the different magnification of subregions.
6.13
Multiple-image QSOs
Multiple imaging of a quasi-stellar object (QSO) happens when a foreground galaxy lies $\lesssim \theta_E$ (in projection) from a QSO, and produces two or four images with separations of order an arcsecond. Two-image systems have a minimum and a saddle point, while four-image systems have two minima and two saddle points. In both cases there is a maximum as well, at the bottom of the galaxy's potential well; but since that is also generally the densest part of the galaxy, $\kappa$ is very high and $|M|$ nearly vanishes, so these central images are too faint to detect.

Multiple-image QSOs are of great astrophysical interest, and two things make them so. The first is that since QSOs are often very time-variable and the different images have different arrival times, the images will show the same time-variability, but with offsets. These offsets are simply the differences in $T(\theta)$ between different images. (So far they have been explicitly measured for several lenses.) Provided we know (or can model) $\kappa(\theta)$, the measured time offsets tell us $T_0$, and hence $H_0$. Basically it's this: normally we can only measure dimensionless quantities (image separations, relative magnifications) in lens systems; but if we succeed in measuring a quantity that has a scale (the time delays), it can tell us the scale of the Universe ($H_0$). In practice, there is considerable uncertainty about the distribution of mass in the lensing galaxies, and this translates into an uncertainty in the inferred $H_0$ that is much larger than the errors in the time delays. Maybe this problem will be overcome, maybe not.

The second thing has to do with the extremely small size of QSOs in the optical continuum. Now the $\kappa(\theta)$ of a galaxy isn't perfectly smooth; it becomes granular on the scale of individual stars. This produces a very complicated network of critical lines (in the lens plane), and a correspondingly complicated network of caustics in the source plane (like the pattern at the bottom of a swimming pool). The optical continuum emitting regions of QSOs are small enough to fit between the caustics, but the line emitting regions straddle several caustics. As proper motions move the caustic network, the continuum region will sometimes cross a caustic and show a sudden change in brightness; the time taken for the brightness to change is the time it takes to cross the caustic. This is the phenomenon of QSO microlensing: the continuum shows it but the lines don't. (It's just the gravitational version of stars twinkling and planets not twinkling.) This has been observed, and modelling the caustic network and putting in plausible values for the proper motion leads to an estimate of the intrinsic size of the continuum regions of QSOs. It is very small: $\sim 100$ AU.
Figure 6.4: Examples of gravitational lensing of quasars and distant galaxies by foreground galaxies. The pictures show images of candidate lenses recorded with the Hubble Space Telescope. [From NASA HST press release 1999-18, produced by Kavan Ratnatunga (Carnegie Mellon Univ.), NASA and the Space Telescope Science Institute.]
6.14
Galaxy Clusters
Galaxy clusters are generally not in dynamical equilibrium (there haven't been enough crossing times since they formed). Their mass distributions and potentials $\psi$ are thus warped in more complicated ways than for single galaxies. They are also much bigger on the sky and thus have many more background objects (faint blue galaxies) to lens. The transparency with a paper behind it and several light bulbs overhead is a good analogy for lensing by a cluster. Rich clusters show many highly stretched images of background galaxies, and these are known as arcs. A deep HST image of Abell 2218 shows over a hundred arcs, including seven multiple image systems. An arc is close to a zero eigenvalue of $M^{-1}$, and is stretched along the corresponding eigenvector. Thus each arc provides some sort of constraint on the $\psi$ of the cluster.

Clusters also show weak lensing. That's when the eigenvalues $1 - \kappa \pm \gamma$ are too close to unity to show up as arcs, but if many galaxies in the same region are examined then statistically a stretching is measurable. The statistical stretching measures the ratio of the two eigenvalues, and thus $\gamma/(1 - \kappa)$. Several groups have been reconstructing cluster mass profiles from information provided by multiple images, arcs, and weak lensing.
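As a small illustration of the weak-lensing statement, the sketch below evaluates the two eigenvalues $1 - \kappa \pm \gamma$ and shows that their ratio depends only on $\gamma/(1 - \kappa)$. The function names and the values of $\kappa$ and $\gamma$ are illustrative assumptions, not taken from any survey.

```python
def magnification_eigenvalues(kappa, gamma):
    """Eigenvalues 1 - kappa - gamma and 1 - kappa + gamma of the inverse magnification matrix."""
    return 1.0 - kappa - gamma, 1.0 - kappa + gamma


def reduced_shear(kappa, gamma):
    """The combination gamma / (1 - kappa) that weak lensing measures."""
    return gamma / (1.0 - kappa)


def eigenvalue_ratio(kappa, gamma):
    """Ratio of the two eigenvalues; for weak lensing this is the mean image axis ratio."""
    lam_minus, lam_plus = magnification_eigenvalues(kappa, gamma)
    return lam_minus / lam_plus


# Illustrative (assumed) weak-lensing values: both eigenvalues stay close to unity.
kappa, gamma = 0.05, 0.03
g = reduced_shear(kappa, gamma)
q = eigenvalue_ratio(kappa, gamma)
print(f"g = {g:.4f}, eigenvalue ratio = {q:.4f}, (1 - g)/(1 + g) = {(1 - g) / (1 + g):.4f}")
```

The ratio equals $(1 - g)/(1 + g)$ with $g = \gamma/(1 - \kappa)$, which is why a statistical measurement of the stretching constrains that combination rather than $\kappa$ and $\gamma$ separately.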
Figure 6.5: Gravitational lensing by a cluster of galaxies. This Hubble Space Telescope image of the cluster Abell 2218 shows many arcs caused by the lensing of distant background galaxies by the mass distribution in the cluster. [NASA image recorded with the Hubble Space Telescope by Andrew Fruchter and the ERO Team and released by the Space Telescope Science Institute (as STScI-2000-07).]
6.15
Microlensing in the Milky Way
The exact nature of dark matter in the Universe is still unclear, despite dedicated research over many years. The available evidence indicates that a majority of the dark matter is dynamically cold and that it is collisionless. It could be in the form of subatomic particles, such as weakly-interacting massive particles (WIMPs), or could be astronomical objects that emit little or no radiation, known as massive astrophysical compact halo objects (MACHOs). One possibility is that the dark matter in the Milky Way halo is MACHOs in the form of brown dwarfs, compact objects below the hydrogen burning threshold of $0.08\,M_\odot$. Such objects would act as point lenses. Indeed, the gravitational lensing of background stars by dark astronomical objects in the halo would be one way of detecting individual dark matter objects, and therefore of identifying the nature of dark matter.

A point lens has two images, at

$$ \theta = \frac{1}{2} \left( \theta_S \pm \sqrt{\theta_S^2 + 4\theta_E^2} \right) . \qquad (6.22) $$

(There is formally a third image at $\theta = 0$, i.e., at the lens itself, but for a point mass this image has zero magnification.) The image separation for a $\sim M_\odot$ lens at distances of $\sim 10$ kpc is $< 1$ mas, far too small to resolve. What will be observed is a brightening equal to the combined magnification of both images. Using the result Equation 6.20 for $|M|$ for a point lens, and adding the values of $|M|$ at the two image positions, we get

$$ M_{\rm tot} = \frac{u^2 + 2}{u \, (u^2 + 4)^{1/2}} , \qquad u = \frac{\theta_S}{\theta_E} . \qquad (6.23) $$
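Equations 6.22 and 6.23 are easy to evaluate directly. The following is a minimal sketch; the function names are illustrative choices, and angles are in arbitrary but consistent units.

```python
import math


def image_positions(theta_s, theta_e):
    """The two image positions of a point lens (Equation 6.22)."""
    root = math.sqrt(theta_s**2 + 4.0 * theta_e**2)
    return 0.5 * (theta_s + root), 0.5 * (theta_s - root)


def total_magnification(u):
    """Combined magnification of both images (Equation 6.23), with u = theta_S / theta_E."""
    return (u**2 + 2.0) / (u * math.sqrt(u**2 + 4.0))


# Source exactly one Einstein radius from the lens (u = 1), in units of theta_E.
theta_plus, theta_minus = image_positions(1.0, 1.0)
m_at_u1 = total_magnification(1.0)
print(f"images at {theta_plus:.3f} and {theta_minus:.3f} theta_E, M_tot = {m_at_u1:.3f}")
```

The two images lie on opposite sides of the lens, one outside and one inside the Einstein ring, and at $u = 1$ the total magnification is $3/\sqrt{5} \simeq 1.34$. For $u \gg 1$ the magnification tends to unity, as it must for a source far from the lens.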
Now because of stellar motions, $\theta_S$ will change by an amount $\theta_E$ over times of order a month, so microlensing in the Milky Way can be observed by monitoring light curves. If the background source star has impact parameter $b$ and velocity $v$ (projected onto the lens plane) with respect to the lens, then

$$ u = \frac{(b^2 + v^2 t^2)^{1/2}}{D_L \theta_E} . \qquad (6.24) $$

Figure 6.6: Predicted light curves for impact parameters of $R_E$ (lowest), $0.5 R_E$ and $0.2 R_E$ (top). The unit of time is how long it takes the source to move a distance $R_E$.
Inserting Equation 6.24 into 6.23 gives us $M_{\rm tot}(t)$, i.e. the light curve. This is plotted for three different $b$ in Figure 6.6. The height of a measured light curve immediately gives $R_E/b$, and the width gives $R_E/v$. Though trying to resolve the images in microlensing seems hopeless with foreseeable technology, there are some prospects for tracking the moving double image indirectly. By combining the positions and magnifications of the two images, we have for the centroid

$$ \theta_{\rm cen} = \frac{u (3 + u^2)}{2 + u^2} \, \theta_E . \qquad (6.25) $$
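The light curves of Figure 6.6 can be regenerated from Equations 6.23 and 6.24. The sketch below works in the figure's units, with $b$ in units of $R_E = D_L \theta_E$ and $t$ in units of $R_E/v$; the function names and the time grid are illustrative assumptions.

```python
import math


def total_magnification(u):
    """Combined point-lens magnification (Equation 6.23)."""
    return (u**2 + 2.0) / (u * math.sqrt(u**2 + 4.0))


def light_curve(t, b):
    """M_tot(t) from Equations 6.23 and 6.24, with b in units of R_E and t in units of R_E / v."""
    u = math.sqrt(b**2 + t**2)
    return total_magnification(u)


def centroid(u):
    """Image centroid in units of theta_E (Equation 6.25)."""
    return u * (3.0 + u**2) / (2.0 + u**2)


impact_parameters = (1.0, 0.5, 0.2)            # the three cases of Figure 6.6
times = [-2.0 + 0.5 * i for i in range(9)]     # -2 ... +2 in units of R_E / v
curves = {b: [light_curve(t, b) for t in times] for b in impact_parameters}
peaks = {b: light_curve(0.0, b) for b in impact_parameters}

for b in impact_parameters:
    print(f"b = {b:.1f} R_E: peak magnification {peaks[b]:.2f}")
```

The peak magnifications come out at roughly 1.34, 2.18 and 5.07 for $b = R_E$, $0.5 R_E$ and $0.2 R_E$: a smaller impact parameter gives a taller and more sharply peaked curve, and each curve is symmetric about the time of closest approach.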
Such microlensing events are rare, because $\theta_S$ has to be $\lesssim \theta_E$ for significant magnification. People speak of an optical depth $\tau$ to microlensing in a field. This is the probability of a star being (in projection) within $\theta_E$ of a foreground lens, at any given time. From Equation 6.23 it amounts to the probability of $M_{\rm tot} \geq 3/\sqrt{5} = 1.34$. It's just the covering factor of discs of radius $\theta_E$ (Einstein rings) from all lenses between us and the stars in the field. The source stars might be bright stars in the Large Magellanic Cloud (LMC) and the lenses very faint stars or brown dwarfs in the Milky Way halo. Note that the term optical depth has a very different meaning here to the use of the term in radiation physics, as was used for example in the discussion of extinction by dust in the interstellar medium.

Figure 6.7: The observed light curve of microlensing event BUL SC3 91382 from the OGLE survey. The graph shows the observed brightness of a star in the Galactic Bulge, expressed as the infrared magnitude, plotted against time. The star was observed to brighten and fade over a period of several weeks. The curve is a fit to the data points using the expression for $M_{\rm tot}$ from Equation 6.23 and $u$ from Equation 6.24, after choosing the time of maximum brightness, the magnitude $m_0$ before/after the lensing event, and the quantities $b/D_L\theta_E$ and $v/D_L\theta_E$ to achieve the best fit. The magnitude at any time is fitted with $m(t) = m_0 - 2.5 \log M_{\rm tot}(t)$. [Plotted with data provided by the OGLE project.]

We can derive an expression giving the optical depth towards a source plane as a function of the density of microlensing objects in space between the source plane and an observer. Consider a field on the sky subtending a solid angle $\Omega$. Microlensing occurs due to objects of mass $M_L$ between the observer and the source plane. Consider a thin shell over this solid angle between distances $D_L$ and $D_L + dD_L$. The fraction of the field covered by the Einstein discs of lensing objects in this thin shell is

$$ d\tau = \pi \theta_E^2 \, dN / \Omega , $$

where $dN$ is the number of lenses in this thin shell. If $n$ is the number density of lenses, $dN = n D_L^2 \, dD_L \, \Omega$. The mass density is $\rho = n M_L$. Therefore,

$$ d\tau = \frac{\pi \theta_E^2 \, \rho \, D_L^2 \, dD_L}{M_L} . $$
Substituting for the angular Einstein radius and integrating over lens distance $D_L$ from the observer to the source plane, we obtain

$$ \tau = \frac{4\pi G}{c^2 D_S} \int_0^{D_S} D_L D_{LS} \, \rho(D_L) \, dD_L . \qquad (6.26) $$
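The substitution step can be written out explicitly. This sketch assumes the standard point-lens expression for the angular Einstein radius, $\theta_E^2 = 4 G M_L D_{LS} / (c^2 D_L D_S)$, which is taken here to be the definition given earlier in the chapter.

```latex
\begin{align*}
\theta_E^2 &= \frac{4 G M_L}{c^2} \, \frac{D_{LS}}{D_L D_S} , \\
d\tau &= \frac{\pi \theta_E^2 \, \rho \, D_L^2 \, dD_L}{M_L}
       = \frac{4 \pi G}{c^2 D_S} \, D_L D_{LS} \, \rho(D_L) \, dD_L .
\end{align*}
```

The lens mass $M_L$ cancels, which is why the integrated optical depth depends only on the mass density $\rho$ along the line of sight and not on how that mass is divided among individual lenses.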
The really nice thing about formula 6.26 is that it doesn't depend on the mass distribution of the lenses, as long as each mass fits within its own Einstein radius (diffuse gas clouds don't count, nor does any kind of diffuse dark matter). So $\tau$ estimated from light curve monitoring could be used to make inferences about $\rho$. How large is $\tau$ through the Galactic halo? To estimate that, we need an estimate for $\rho$. Now the Milky Way rotation curve suggests an isothermal halo, $\rho = \sigma^2/(2\pi G r^2)$, with $\sigma \sim 200$ km s$^{-1}$. If we then say that $r$ will be of order the $D$ factors in Equation 6.26, we get

$$ \tau \sim \frac{\sigma^2}{c^2} , \quad \mbox{or} \quad \tau \sim 10^{-7} \mbox{ to } 10^{-6} . $$
This is a very low figure, showing that the probability of detecting a microlensing event when monitoring a single star is negligible. However, monitoring very large numbers ($\sim 10^6$ to $10^7$) of stars simultaneously can make this feasible.

With some more care, we can estimate the lensing optical depth more accurately. As an example, consider a study of stars in a (hypothetical) dwarf galaxy companion to our own Galaxy that lies at a distance $S$ from the Sun in the direction of the South Galactic Pole. The survey aims to detect the brightening of stars in the dwarf galaxy caused by microlensing by MACHOs in the Galactic halo along the sight line to the dwarf galaxy, as a test of whether the dark matter halo of our Galaxy is made out of MACHOs. We shall assume that the dark matter halo can be represented by an isothermal sphere model in which the density profile is given by

$$ \rho(r) = \frac{\sigma^2}{2\pi G r^2} , \qquad (6.27) $$

where $r$ is the radial distance from the Galactic Centre, $\sigma$ is the velocity dispersion of particles moving in this potential, and $G$ is the constant of gravitation. This isothermal sphere model is likely to be a reasonable representation of the real density profile because it implies a flat rotation curve, provided we do not use it close to the Galactic Centre (because it implies infinite density as $r \to 0$) or at very large distances (because it implies a flat rotation curve out to infinite distance).

In this example, the source of light is a star in the companion galaxy at a distance $D_S = S$. The lensing object is a MACHO at a distance $D_L$ from the Sun. The distance of the lensing object from the Galactic Centre is $r = \sqrt{R_0^2 + D_L^2}$, where $R_0$ is the distance of the Sun from the Galactic Centre. The distance between the lensing object and the source is $D_{LS} = D_S - D_L = S - D_L$. Using Equation 6.26,

$$
\begin{aligned}
\tau &= \frac{4\pi G}{c^2 D_S} \int_0^{D_S} D_L D_{LS} \, \rho(r) \, dD_L
      = \frac{4\pi G}{c^2 D_S} \int_0^{D_S} D_L (D_S - D_L) \, \frac{\sigma^2}{2\pi G r^2} \, dD_L \\
     &= \frac{4\pi G}{c^2 S} \int_0^{S} D_L (S - D_L) \, \frac{\sigma^2}{2\pi G (R_0^2 + D_L^2)} \, dD_L
      = \frac{2\sigma^2}{c^2 S} \int_0^{S} \frac{D_L S - D_L^2}{R_0^2 + D_L^2} \, dD_L \\
     &= \frac{2\sigma^2}{c^2 S} \left( S \int_0^{S} \frac{D_L \, dD_L}{R_0^2 + D_L^2} - \int_0^{S} \frac{D_L^2 \, dD_L}{R_0^2 + D_L^2} \right) .
\end{aligned}
$$
This can be solved using the standard integrals

$$ \int \frac{x \, dx}{a^2 + x^2} = \frac{1}{2} \ln \left| a^2 + x^2 \right| + \mbox{constant} , \qquad \int \frac{x^2 \, dx}{a^2 + x^2} = x - a \tan^{-1} \left( \frac{x}{a} \right) + \mbox{constant} . $$
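Using these standard integrals, we get

$$
\begin{aligned}
\tau &= \frac{2\sigma^2}{c^2 S} \left( \frac{S}{2} \Big[ \ln \left( R_0^2 + D_L^2 \right) \Big]_{D_L=0}^{S} + \left[ - D_L + R_0 \tan^{-1} \left( \frac{D_L}{R_0} \right) \right]_{D_L=0}^{S} \right) \\
     &= \frac{2\sigma^2}{c^2 S} \left( \frac{S}{2} \ln \left( R_0^2 + S^2 \right) - \frac{S}{2} \ln \left( R_0^2 \right) - S + R_0 \tan^{-1} \left( \frac{S}{R_0} \right) + 0 - R_0 \tan^{-1} 0 \right) \\
     &= \frac{2\sigma^2}{c^2 S} \left( \frac{S}{2} \ln \left( 1 + \frac{S^2}{R_0^2} \right) - S + R_0 \tan^{-1} \left( \frac{S}{R_0} \right) \right) ,
\end{aligned}
$$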
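which simplifies to

$$ \tau = \frac{\sigma^2}{c^2} \left( \ln \left( 1 + \frac{S^2}{R_0^2} \right) - 2 + 2 \, \frac{R_0}{S} \tan^{-1} \left( \frac{S}{R_0} \right) \right) . \qquad (6.28) $$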
The distance of the Sun from the Galactic Centre is $R_0 = 8.0$ kpc. Putting $S = 100$ kpc as the distance to the companion galaxy (a reasonable figure), we get $S/R_0 = 12.5$. Therefore,

$$ \tau = 3.3 \times \frac{\sigma^2}{c^2} . $$
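As a check on Equation 6.28, the sketch below evaluates the closed form for $S/R_0 = 12.5$ and compares it with a direct numerical integration of Equation 6.26 for the isothermal halo. The function names, the midpoint-rule integrator and the choice $\sigma = 200$ km s$^{-1}$ (the halo value quoted earlier in this section) are illustrative assumptions.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def tau_factor(s_over_r0):
    """Bracket of Equation 6.28, i.e. tau in units of sigma^2 / c^2."""
    x = s_over_r0
    return math.log(1.0 + x**2) - 2.0 + (2.0 / x) * math.atan(x)


def tau_factor_numeric(s_over_r0, steps=100000):
    """Midpoint-rule integration of Equation 6.26 for the isothermal halo.

    With y = D_L / R_0 and x = S / R_0 the integral becomes
    tau = (sigma^2 / c^2) * (2 / x) * int_0^x (x*y - y^2) / (1 + y^2) dy.
    """
    x = s_over_r0
    h = x / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += (x * y - y * y) / (1.0 + y * y)
    return (2.0 / x) * total * h


factor = tau_factor(100.0 / 8.0)            # S = 100 kpc, R_0 = 8.0 kpc
factor_check = tau_factor_numeric(100.0 / 8.0)
sigma_km_s = 200.0                          # assumed halo velocity dispersion
tau = factor * (sigma_km_s / C_KM_S) ** 2
print(f"factor = {factor:.3f} (numerical {factor_check:.3f}), tau = {tau:.2e}")
```

The closed form and the numerical integral agree, giving a factor of about 3.3 and hence $\tau \simeq 1.5 \times 10^{-6}$ for $\sigma = 200$ km s$^{-1}$.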
Observations of stars in the stellar halo of the Galaxy find $\sigma = 200$ km s$^{-1}$. Using this figure for the hypothetical MACHOs gives $\tau = 1.5 \times 10^{-6}$. That is to say, were the dark matter halo made of compact objects such as low mass stars, brown dwarfs, planet-size bodies or stellar mass black holes, we would expect microlensing events to occur with a frequency defined by $\tau = 1.5 \times 10^{-6}$. So to have any hope of detecting such microlensing events, it is necessary to monitor the light curves of millions of stars. A number of surveys have been undertaken in the last decade, observing fields in the LMC and the Milky Way bulge among others. (The bulge surveys go through the Milky Way disc, of course, but do also probe that part of the dark matter halo that extends through the disc.)³
6.16
Results of Microlensing Surveys
Several surveys of large numbers of stars have been conducted over the past several years to identify microlensing events. These have observed stars in the Galactic Bulge, in the Large Magellanic Cloud and in the Andromeda Galaxy M31. These surveys have monitored many millions of stars regularly for years to search for increases in brightness consistent with microlensing, whether by possible MACHOs or by ordinary stars. A considerable number of microlensing events have been observed to date. The current measurements of $\tau$ from observations are $\sim 10^{-7}$ towards the LMC and $\simeq 3 \times 10^{-6}$ towards the Bulge. However, this frequency is substantially lower than would be expected were the dark matter halo of the Galaxy made entirely of MACHOs: too few lensing events have been found to explain the dark matter in the halo. Many of the lensing events can be explained as being caused by main-sequence stars or white dwarfs. How much of the lensing mass is in brown dwarfs as distinct from faint stars is not entirely clear, but the available evidence suggests that compact astronomical objects can make up $\lesssim 20\,\%$ of the dark matter halo of the Galaxy. Meanwhile, the huge number of variable stars discovered by these surveys is revolutionising that field of study.

³ An estimate of $\tau$ from a survey will include a correction for the detection efficiency. Surveys have to be very wary of spurious detections; hence any light curve possibly contaminated by stellar variability has to be discarded for microlensing purposes. Detection efficiencies are of order 30%.
Chapter 7
The Galaxy: Its Structure and Content
7.1
Introduction
Chapter 1 included a brief overview of our Galaxy, while later chapters discussed important processes affecting the Galaxy and other galaxies. Here we bring these concepts together to develop an understanding of the Galaxy itself. This will include a consideration of each of the components (disc, bulge, halo, etc.) in detail.

The Milky Way Galaxy is, as far as we know, a typical disc galaxy. Figure 1.8 was a cartoon to remind you of its different components. The luminous parts are mostly a disc of population I stars and a bulge of older population II stars. We live in the disc, with the Sun at a distance $R_0 = 8.0$ kpc from the centre. Apart from stars, the disc also has clusters of young stars and H II regions, and gas and dust; the gas is mostly observed as an H I layer which flares at large radii. There is some evidence that there are two or three spiral arms in the disc (the dust makes it hard to tell). The bulge is accompanied by a bar, though its dimensions are unclear. There are some very old stars (and globular clusters of very old stars) in the stellar halo. But the most massive part is the dark matter halo, which is made of dark matter of unknown composition.

That is not all: there are also the small companion galaxies. The best known of these are the Large and Small Magellanic Clouds (LMC and SMC), which are $\simeq 50$ kpc away; these are associated with a trail of debris, mostly H I gas, known as the Magellanic Stream. Then there is the Sagittarius Dwarf Galaxy, which appears to be merging with the Milky Way now.
7.2
The Mass of the Galaxy
The mass of the Galaxy enclosed within different radii can be determined using a variety of methods. Observations of the rotation curve provide measurements out to $\simeq 12 - 15$ kpc. Measurements of the dynamics of globular clusters can constrain the enclosed mass to a greater distance. However, while there are good estimates of the enclosed mass of the Milky Way within different radii, it is not known where the halo of the Milky Way finally fades out (or even if the size of the halo is a very meaningful concept). So the only way to get at the total mass of the Milky Way is to observe its effect on other galaxies. The simplest but most robust of these comes from an analysis of the mutual dynamics of the Milky Way and M31 (the Great Andromeda Galaxy): it is known as the timing argument. This is discussed in the next section.
7.3
The Mass of the Galaxy from Dynamical Timing Arguments
The dynamical timing argument relies on modelling the dynamics of the Galaxy and nearby galaxies. The Local Group contains two substantial spiral galaxies, the Galaxy and M31 (it does also contain one less massive spiral, M33, several irregular galaxies of modest mass, and numerous low mass dwarfs). We shall first consider the mass constraint that can be obtained from the dynamics of M31 and the Galaxy, ignoring the other Local Group galaxies.

The observational inputs are (i) M31 is 750 kpc away, and (ii) the Milky Way and M31 are approaching at 121 km s$^{-1}$. (The transverse velocity of M31 is poorly determined at present.) A simple approximation for their dynamics is to suppose that they started out at the same point moving apart with initial velocities from the Big Bang, and have since turned around because of mutual gravity. This is not strictly true of course, because galaxies had not already formed at the Big Bang; however it is thought that galaxies (at least galaxies like these) formed early in the history of the Universe, so the approximation may be acceptable. Writing $l$ for the distance of M31 from the Galaxy, and $M$ for the combined mass of both systems, the equation of motion for the reduced Keplerian one-body problem is

$$ \frac{d^2 l}{dt^2} = - \frac{GM}{l^2} . \qquad (7.1) $$
Here we shall count time from the Big Bang, so that $t = 0$ refers to the Big Bang. The current time and separation are $t_0$ and $l_0$. In considering a Keplerian problem without perturbation we are, of course, assuming that the gravity from Local Group dwarfs and the cosmological tidal field is negligible; but as there are no other large galaxies within a few Mpc this seems a fair approximation.

It is not obvious how to solve this nonlinear equation, but fortunately the solutions are well known and easy to verify. There are actually three solutions, depending on the precise circumstances. One solution applies to the case where the combined mass $M$ is too small to halt the expansion and the two galaxies drift further apart for ever: this is not the case we have here. A second solution applies to the limiting case where the mass is just insufficient to stop the motion apart (so $dl/dt \to 0$ and $l \to \infty$ as $t \to \infty$). The third solution applies to the case where the mass is great enough to halt the drift apart and the galaxies fall back toward each other: it is this case that we have here, where the two galaxies are already falling towards each other. This solution is most conveniently expressed in parametric form, as

$$ t = \tau_0 \, (\eta - \sin \eta) , \qquad l = \left( G M \tau_0^2 \right)^{1/3} (1 - \cos \eta) . \qquad (7.2) $$
Here $\tau_0$ is an integration constant. The other integration constant has been eliminated by the boundary condition that the two galaxies (or at least the material from which they formed) were at the same position immediately after the Big Bang: i.e. $l = 0$ at $t = 0$. It is easy to show that these equations for $t$ and $l$ are a solution to Equation 7.1 by differentiating them to get $d^2 l/dt^2$ and substituting them into Equation 7.1. This can be done using

$$ \frac{d^2 l}{dt^2} = \frac{d}{dt} \left( \frac{dl}{dt} \right) = \left( \frac{dt}{d\eta} \right)^{-1} \frac{d}{d\eta} \left( \frac{dl}{dt} \right) = \left( \frac{dt}{d\eta} \right)^{-1} \frac{d}{d\eta} \left( \frac{dl}{d\eta} \left( \frac{dt}{d\eta} \right)^{-1} \right) . $$

To determine the total mass $M$, we first consider the dimensionless quantity

$$ \frac{t_0}{l_0} \left( \frac{dl}{dt} \right)_{t_0} = \frac{\sin \eta_0 \, (\eta_0 - \sin \eta_0)}{(1 - \cos \eta_0)^2} , \qquad (7.3) $$

where the subscripts in $t_0$ and so on refer to the current time, as is conventional in cosmology. The quantity on the left-hand side can be calculated directly from observational data. Inserting the observed values of $l_0 = 750$ kpc and $(dl/dt)_{t_0} = -121$ km s$^{-1}$ and a plausible value of 14 Gyr for $t_0$ (the age of the Universe), we get

$$ \frac{\sin \eta_0 \, (\eta_0 - \sin \eta_0)}{(1 - \cos \eta_0)^2} = -2.32 . $$

Figure 7.1: The change in the separation $l$ of the Galaxy and M31 with time $t$ since the Big Bang in the model used for the timing argument.
This can be solved numerically to give $\eta_0 = 4.28$. Inserting these values into $t_0 = \tau_0 (\eta_0 - \sin \eta_0)$ from Equation 7.2, we get $\tau_0 = 2.70$ Gyr. Then $l_0 = (GM\tau_0^2)^{1/3} (1 - \cos \eta_0)$ gives $(GM\tau_0^2)^{1/3}$, and using the value we found for $\tau_0$ gives¹

$$ M \simeq 4.4 \times 10^{12} \, M_\odot . \qquad (7.4) $$
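The numerical solution can be reproduced in a few lines. The sketch below is one possible implementation and not necessarily the procedure used to obtain the quoted figures: it solves Equation 7.3 for $\eta_0$ by bisection on the infalling branch $\pi < \eta < 2\pi$. The function names, the bracketing interval and the unit-conversion constants (1 km s$^{-1}$ $\simeq$ 1.0227 kpc Gyr$^{-1}$, $G \simeq 4.301 \times 10^{-6}$ kpc (km s$^{-1}$)$^2$ $M_\odot^{-1}$) are assumptions introduced for the example.

```python
import math

KM_S_IN_KPC_PER_GYR = 1.02271      # 1 km/s expressed in kpc/Gyr
G_KPC_KMS2_PER_MSUN = 4.30091e-6   # G in kpc (km/s)^2 / Msun


def timing_ratio(eta):
    """Right-hand side of Equation 7.3."""
    return math.sin(eta) * (eta - math.sin(eta)) / (1.0 - math.cos(eta)) ** 2


def solve_eta(target, lo=math.pi + 1e-6, hi=2.0 * math.pi - 1e-3, tol=1e-12):
    """Bisection for timing_ratio(eta) = target on the infalling branch pi < eta < 2*pi."""
    f_lo = timing_ratio(lo) - target
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = timing_ratio(mid) - target
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


l0_kpc = 750.0      # present separation of M31
v0_km_s = -121.0    # present radial velocity (negative: approaching)
t0_gyr = 14.0       # adopted age of the Universe

target = t0_gyr * v0_km_s * KM_S_IN_KPC_PER_GYR / l0_kpc   # left-hand side of Eq. 7.3
eta0 = solve_eta(target)
tau0_gyr = t0_gyr / (eta0 - math.sin(eta0))                 # from t0 in Eq. 7.2
scale_kpc = l0_kpc / (1.0 - math.cos(eta0))                 # (G M tau0^2)^(1/3)
gm = scale_kpc**3 / tau0_gyr**2                             # G M in kpc^3 / Gyr^2
g_kpc3_per_msun_gyr2 = G_KPC_KMS2_PER_MSUN * KM_S_IN_KPC_PER_GYR**2
mass_msun = gm / g_kpc3_per_msun_gyr2

print(f"eta0 = {eta0:.2f}, tau0 = {tau0_gyr:.2f} Gyr, M = {mass_msun:.2e} Msun")
```

With these inputs the sketch should land close to the values quoted in the text ($\eta_0 \simeq 4.3$, $\tau_0 \simeq 2.7$ Gyr, $M \simeq 4.4 \times 10^{12}\,M_\odot$); small differences in the last digit reflect rounding of the unit conversions.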
¹ It is useful to remember $G$ in astrophysical units as $4.50 \times 10^{-15}\,M_\odot^{-1}$ pc$^3$ yr$^{-2}$.

From its luminosity and rotation curve, M31 appears to have approximately twice the mass of the Milky Way, i.e. $M_{\rm M31} \simeq 2 M_{\rm Galaxy}$. Using $M = M_{\rm M31} + M_{\rm Galaxy}$, this implies that the mass of the Milky Way exceeds $10^{12}\,M_\odot$. Estimates for the mass of the luminous part of the Milky Way range from $0.05 \times 10^{12}$ to $0.12 \times 10^{12}\,M_\odot$, which confirms that the majority of the mass of the Galaxy is unseen (it is dark matter).

It should be noted that this analysis predicts that the Galaxy and M31 will collide, and consequently merge, about 3.0 Gyr in the future. However, it fails to take account of the component of the velocity of M31 tangential to our line of sight. The relative motion of M31 and the Galaxy may have a tangential component, and the pair may therefore have enough angular momentum that they do not actually come together.

Figure 7.2: Distances and velocities of six Local Group dwarf galaxies, and predictions for different values of $GM/\tau_0$ (by Alan Whiting).

The timing argument can be applied not only to the Andromeda Galaxy, but also to Local Group dwarf galaxies (which have much less mass and behave just as tracers). Figure 7.2 shows plots of $l$ against $dl/dt$ for some Local Group dwarfs, along with the predictions of the timing argument for different values of $GM/\tau_0$. This uses

$$ l = \left( G M \tau_0^2 \right)^{1/3} (1 - \cos \eta) \quad \mbox{and} \quad \frac{dl}{dt} = \left( \frac{GM}{\tau_0} \right)^{1/3} \frac{\sin \eta}{1 - \cos \eta} . \qquad (7.5) $$
This model assumes that the dwarfs have been moving on radial trajectories since the Big Bang.
7.4
Kinematics in the Solar Neighbourhood
The Milky Way is a differentially rotating system. The local standard of rest (LSR) is a system located at the Sun and moving with the local circular velocity (which is $\simeq 220$ km s$^{-1}$). The Sun has its own peculiar motion of $\simeq 13$ km s$^{-1}$ with respect to the LSR. The rotation velocity and its derivative at the solar position are traditionally expressed in terms of Oort's constants:

$$
\begin{aligned}
A &\equiv \frac{1}{2} \left( \frac{v_\phi}{R} - \frac{\partial v_\phi}{\partial R} \right) \quad \mbox{at } R = R_0 , \\
B &\equiv - \frac{1}{2} \left( \frac{v_\phi}{R} + \frac{\partial v_\phi}{\partial R} \right) \quad \mbox{at } R = R_0 . \qquad (7.6)
\end{aligned}
$$

Observations show that $A = +14.4 \pm 1.2$ km s$^{-1}$ kpc$^{-1}$ and $B = -12.0 \pm 2.8$ km s$^{-1}$ kpc$^{-1}$. One reason that these parameters are useful is that $A$ vanishes for solid body rotation (i.e. $A = 0$ when the angular velocity $\Omega(R) =$ constant, because then $v_\phi/R = \partial v_\phi/\partial R = \Omega$). Another useful property is that the gradient of the rotational velocity is $\partial v_\phi/\partial R = -(A + B)$ at $R = R_0$, which means that $A + B = 0$ if the rotation curve is flat (because $v_\phi$ is the same as $v_{\rm circ}$, and therefore we have $\partial v_\phi/\partial R = \partial v_{\rm circ}/\partial R = 0$ for $v_{\rm circ} =$ constant). Therefore calculating $A + B$ is a test of whether the Galaxy has a flat rotation curve close to the Sun's distance from the Galactic Centre. Similarly, the angular velocity in the solar neighbourhood is $\Omega_0 = (v_\phi/R)_{R=R_0} = A - B$.
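The properties of Oort's constants described above can be checked numerically. The sketch below evaluates Equation 7.6 by a central finite difference for two idealised rotation curves and then combines the observed values of $A$ and $B$. The function name, the step size, the test rotation curves and the assumption of uncorrelated errors on $A$ and $B$ are all illustrative choices.

```python
import math


def oort_constants(v_phi, radius, dr=1e-3):
    """Oort's A and B (Equation 7.6) for a rotation curve v_phi(R), using a central difference."""
    dv_dr = (v_phi(radius + dr) - v_phi(radius - dr)) / (2.0 * dr)
    omega = v_phi(radius) / radius
    return 0.5 * (omega - dv_dr), -0.5 * (omega + dv_dr)


R0 = 8.0  # kpc, the Sun's distance from the Galactic Centre

# Idealised (assumed) rotation curves, in km/s with R in kpc.
A_flat, B_flat = oort_constants(lambda r: 220.0, R0)        # flat: A + B = 0
A_solid, B_solid = oort_constants(lambda r: 27.5 * r, R0)   # solid body: A = 0, B = -Omega

# Observed values quoted in the text, in km/s/kpc.
A_OBS, A_ERR = 14.4, 1.2
B_OBS, B_ERR = -12.0, 2.8

gradient = -(A_OBS + B_OBS)                  # d v_phi / dR at R0
gradient_err = math.hypot(A_ERR, B_ERR)      # assumes uncorrelated errors
omega0 = A_OBS - B_OBS                       # angular velocity at R0
v_phi_0 = omega0 * R0                        # implied rotation speed at R0

print(f"dv/dR = {gradient:.1f} +/- {gradient_err:.1f} km/s/kpc")
print(f"Omega0 = {omega0:.1f} km/s/kpc, v_phi(R0) = {v_phi_0:.0f} km/s")
```

With the quoted values the velocity gradient is $-2.4 \pm 3.0$ km s$^{-1}$ kpc$^{-1}$, consistent with a flat rotation curve at the Sun's radius, and $\Omega_0 R_0 \simeq 211$ km s$^{-1}$, close to the local circular velocity quoted above.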
The radial and tangential components of the velocity of stars or gas in circular orbit, $v_r$ and $v_t$, can be written as functions of the galactic longitude $l$ as

$$ v_r \simeq A \, d \sin(2l) , \qquad v_t \simeq A \, d \cos(2l) + B \, d \qquad (7.7) $$
locally (within about 1 kpc), where $d$ is the distance from the Sun. The advantage of the Oort constants $A$ and $B$ is that they describe the motions of stars around the Sun in the Galaxy, and they can be measured from simple velocity and distance data. But now that we have accurate proper motions from the Hipparcos satellite mission, and hence (combining with ground-based line-of-sight velocities) three-dimensional stellar velocities in the solar neighbourhood, $A$ and $B$ are less important.

If you take the average (three-dimensional) velocity and dispersions of any class of stars in the solar neighbourhood, then $\langle v_R \rangle$ and $\langle v_z \rangle$ turn out to be nearly zero, while $\langle v_\phi \rangle$ is such that $\langle v_\phi \rangle - v_{\rm LSR}$ is negative and $\propto \sigma_{RR}$. This is known as the asymmetric drift and essentially expresses the degree of rotational support versus pressure support. Young stars are almost entirely supported by $\langle v_\phi \rangle$, like the gas that produced them. The asymmetric drift for young stars is therefore nearly zero, because $\langle v_\phi \rangle \simeq v_{\rm LSR}$. Older stars pick up increasing amounts of pressure support in the form of $\sigma_{RR}$; they then need less $v_\phi$ to support them, and thus tend to lag behind the LSR. The linear relation can be derived from the Jeans equations, but we won't go through that because you've probably had enough of Jeans equations for now...

When examined in detail using proper motions from the Hipparcos astrometry satellite, the velocity structure in the solar neighbourhood is more complicated than anyone expected. Figure 7.3 shows a reconstruction of the stellar $(u, v)$ (i.e., radial and tangential velocity) distribution in the solar neighbourhood for stars in different ranges of the main sequence.² Notice the clumps in the velocity distribution which appear for stars of all ages. (And these are clumps only in velocity space, not in real space.)
The idea that there are groups of stars at similar velocities is itself not new—it actually dates from the early proper motion measurements of nearly a century ago. But these ‘streams’ have generally been interpreted as groups of stars which formed in the same complex and were later stretched in real space over several galactic orbits. The surprising new finding is that the ‘streams’ are seen for stars of all ages, which indicates a dynamical origin; they seem to be wanting to tell us something interesting about Milky Way dynamics, but as yet we don’t know what.
7.5
Dynamics of the Galactic Disc
The orbits of stars in galaxies are, in general, not closed paths, as we saw in Chapter 2. This is equally true of stars in the disc of our Galaxy. The tangential (vφ ) component dominates for disc stars (vφ is much larger than the vR and vz components). We can therefore break the motion of disc stars into two parts: a uniform motion about the Galactic Centre, plus the motion relative to this uniform rotation. This second part is called an epicycle. The epicycle is very nearly an ellipse, but the period of the motion around the epicycle is not the same as the period of the uniform component of the motion. (The term epicycle was used historically for the complicated system of cycles that was used to fit the motion of the planets before Kepler explained the elliptical orbits about the Sun.)
7.6
The Disc (or Thin Disc)
The disc of the Galaxy contains mostly stars, with some gas. The stars are distributed with an exponential density profile in both the $R$ and $z$ directions. The density $\rho(R, z)$ is therefore described by

$$ \rho(R, z) = \rho_0 \, e^{-R/h_R} \, e^{-|z|/h_z} , $$

where $h_R$ and $h_z$ are the scale lengths in the $R$ and $z$ directions, and $\rho_0$ is a constant. Observations show that $h_R \simeq 3.5$ kpc. The vertical scale height $h_z$ is different for stars of different ages (young stars have smaller scale heights), but $h_z = 250$ pc is a typical value.

The disc is rotationally supported, with a circular velocity $v_{\rm circ} \simeq 220$ km s$^{-1}$ at the Sun's distance from the centre. There is a small velocity dispersion around this of 15 km s$^{-1}$ for young stars and 40 km s$^{-1}$ for old ones. Young stars form in the gas and naturally have a small velocity dispersion about the mean rotation. The greater velocity dispersion of older stars has probably been caused by perturbations of the stars during encounters with giant molecular clouds. We learnt in Chapter 2 that stars are collisionless in galaxies. However, encounters between stars and giant molecular clouds can perturb stellar velocities in galactic discs to a limited degree over the lifetime of a spiral galaxy.

Heavy element abundances in disc stars are close to the solar values. Typical metallicities are [Fe/H] $= -0.4$ to $+0.2$. The gas and its associated dust are concentrated close to the Galactic plane. The gas moves in circular orbits. The H I gas layer flares and warps at large radius. The main disc is often called the thin disc to distinguish it from the thick disc, described below.

² The Schwarzschild ellipsoid and its vertex deviation that you may find in textbooks should now be considered obsolete; they are essentially the result of washing out the structure in Figure 7.3.

Figure 7.3: Distribution of radial ($u$) and tangential ($v$) velocities of main sequence stars in the solar neighbourhood, recently reconstructed from Hipparcos proper motions by Walter Dehnen (1998). The upper left panel is for the youngest (and bluest) stars; these are estimated to be $< 0.4$ Gyr old. The upper right panel is for stars younger than 2 Gyr, and the lower left panel is for stars younger than 8 Gyr. The lower right panel shows the combined distribution for all main sequence stars. The Sun is at (0, 0) and the LSR is marked by a triangle.

Figure 7.4: Epicyclic orbits. The first diagram shows the orbit of a star in an elliptical orbit in the disc of a spiral galaxy as viewed in an inertial frame. It moves in a "rosette" pattern. The middle diagram shows the same orbit as viewed in a rotating frame, with the frame rotating with uniform motion and with a period equal to the orbital period of the star. The third diagram shows the orbit of the star in the centre diagram in greater detail, showing the epicyclic motion.
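The double-exponential thin-disc profile of this section can be sketched in a few lines. In the example below the scale lengths default to the values quoted in the text ($h_R = 3.5$ kpc, $h_z = 250$ pc), the normalisation $\rho_0 = 1$ is an arbitrary assumption, and the function names are illustrative. Integrating the profile over $z$ gives the surface density $\Sigma(R) = 2 \rho_0 h_z e^{-R/h_R}$, which the sketch checks numerically.

```python
import math


def disc_density(R, z, rho0=1.0, h_R=3.5, h_z=0.25):
    """Double-exponential disc density; R, z and the scale lengths are in kpc."""
    return rho0 * math.exp(-R / h_R) * math.exp(-abs(z) / h_z)


def surface_density(R, rho0=1.0, h_R=3.5, h_z=0.25):
    """Analytic integral of the density over all z: 2 * rho0 * h_z * exp(-R / h_R)."""
    return 2.0 * rho0 * h_z * math.exp(-R / h_R)


def surface_density_numeric(R, z_max=10.0, steps=40000):
    """Midpoint-rule integral of disc_density over -z_max < z < z_max, as a check."""
    dz = 2.0 * z_max / steps
    return sum(disc_density(R, -z_max + (i + 0.5) * dz) for i in range(steps)) * dz


R_sun = 8.0  # kpc
drop_per_scale_height = disc_density(R_sun, 0.25) / disc_density(R_sun, 0.0)
sigma_analytic = surface_density(R_sun)
sigma_numeric = surface_density_numeric(R_sun)
print(f"density ratio at z = h_z: {drop_per_scale_height:.3f}")
print(f"surface density at R0: {sigma_analytic:.5f} (numerical {sigma_numeric:.5f})")
```

The density falls by a factor $e$ for each scale height above or below the plane, and by the same factor for each 3.5 kpc outwards in radius.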
7.7
The Thick Disc
The term thick disc is usually given to a distribution of stars that is more extended in the vertical direction (perpendicular to the plane) than the main Galactic disc (the thin disc). The term is associated with stars, not gas. It consists of moderately metal-poor, older stars, with [Fe/H] close to $-0.6$. The system is rotationally supported, but $v_{\rm circ}$ is slightly smaller than for the thin disc, a consequence of the stars showing a velocity dispersion that is larger than that of thin disc stars. The asymmetric drift is $30 - 50$ km s$^{-1}$. Only about 2% of the stars in the solar neighbourhood belong to the thick disc. The density distribution, like that of the thin disc, is a double exponential function, probably with a radial scale length comparable to that of the thin disc, but the vertical scale height is $h_z \simeq 1.3$ kpc. There has been some controversy over whether it is a distinct component of the Galaxy in its own right, or is made merely out of a small number of disc stars with extreme metallicities and kinematics.
7.8
The Bulge
The bulge is a spheroidally distributed, but flattened, system in the central regions of the Galaxy, confined to the inner 2 kpc. Its stars are old. They show a very wide range in metallicity, ranging from [Fe/H]= −1 to +1. It is largely pressure supported.
7.9
The Bar
There is little doubt now that the distribution of stars in the region of the Milky Way bulge is triaxial – there is a (rotating) bar with the positive l side nearer to us and moving away. The evidence for this was at first indirect, and took the following form. Consider gas in the ring, which must move on closed orbits. If it moved on circular orbits in the disc, and we measured its Galactic longitude l and line of sight velocity v, then all the gas at positive l (i.e. on one side of the Galactic Centre) would have one sign for v and similarly all the gas at negative l (on the other side) would have the opposite sign for v. In fact gas at positive l is seen with both signs for v, and likewise at negative l. So the gas orbits must be non-circular, and hence the gravitational potential must be non-circular in the disc. This suggests a bar and indeed the observed gas kinematics is well fitted by a bar.
Figure 7.5: Schematic of the bar in the Milky Way Bulge, viewed from the North Galactic pole (left), and from the Sun (right). (From Blitz and Spergel, ApJ, 1991. The right panel uses minus the usual convention for l.)
The features of a bar can in fact be seen in an infrared map of the bulge, if you know what to look for. Figure 7.5 shows a bar in the plane, and its effect on an (l, b) map.
1. The side nearer to us is brighter. Contours of constant surface brightness are further apart in both l and b on the nearer side.
2. Very near the centre, the further side appears brighter, so the brightest spot is slightly to the further side of l = 0. The reason is that on the further side our line of sight passes through a greater depth of bar material, which more than compensates for it being slightly further away.
Feature 1 can be discerned in many different data sets; feature 2 is harder to find, and only just shows up in the COBE maps of the bulge.
7.10
The Galactic Centre
Observing the centre of the Galaxy is extremely difficult in the optical because the extinction caused by dust in the Galactic plane is approximately 30 magnitudes in the V-band. Things are not so extreme in the infrared, and in the K band (2.2 µm) the extinction is a more moderate 3–4 mag. The available data show that there is a very dense star cluster at the Galactic centre, with a compact radio source, Sagittarius A$^*$ (abbreviated Sgr A$^*$), at its centre. There is a ring or disc of gas around the centre, about 5 pc across, detected by its molecular emission. Orbital velocities immediately around Sgr A$^*$ are very high. The available evidence suggests that there is a compact massive object at the centre of Sgr A$^*$. This is probably a central black hole with a mass of $(1 - 3) \times 10^{6}\,M_\odot$.
7.11
The Stellar Halo
The stellar halo is the spheroidally distributed, slightly flattened, system that extends far from the disc. It includes diffuse field stars and globular clusters. The stars are very old (13 Gyr) and are very metal-poor, having [Fe/H] $\simeq$ −1 to −2.5. It makes only a very small contribution to the total mass of the Galaxy. It is a pressure-supported system. The velocity dispersion is $\sigma = 200$ km s$^{-1}$. The asymmetric drift is $\simeq 190$ km s$^{-1}$. These two figures mean that the kinematics of halo stars are very different from those of the disc. Only about 1/1000-th of the stars in the solar neighbourhood belong to the halo. Examples of halo stars in the solar neighbourhood, of value for example in studying chemical abundances in very metal-poor stars, have often been found from their high proper motions: relatively nearby halo stars usually have large motions across the sky compared with typical (disc) stars.
7.12
Globular Clusters
The Galaxy contains about 150 globular clusters. They are compact systems of $\sim 10^{5}$ stars. Many of these are very metal-poor, are distributed spheroidally and have randomly orientated orbits: they appear to be associated with the stellar halo. However, some globular clusters are only moderately metal-poor. Those globular clusters having [Fe/H] > −0.8 form a more flattened system. They may be associated with the thick disc.
7.13
The Dark Matter Halo
The dark matter halo appears to extend out to large radii, to $\gtrsim 100$ kpc, as is shown by studies of the dynamics of companion dwarf galaxies, for example. It dominates the mass of the Galaxy. It appears to be spheroidally distributed: it does not appear to be concentrated in the Galactic disc, as we saw in Section 2.20. The nature of dark matter is unknown, but it appears not to be in the form of compact objects of roughly stellar mass, such as white dwarfs or brown dwarfs, as microlensing surveys have shown (at most a small proportion of the dark matter can be in stellar mass compact objects). Neither is it likely to be in the form of dark compact objects having masses $\gtrsim 1000\,M_\odot$, such as massive black holes. Objects of this type would perturb the dynamics of disc stars, thickening the disc, which is not observed on a significant scale. Constraints from primordial nucleosynthesis imply that baryonic matter contributes only 4% of the closure density of the Universe. Cosmological results indicate that matter contributes 27% of the closure density. Therefore we expect that most of the dark matter in the Universe is not in the form of baryonic matter, if the cosmological models are correct. This is consistent with the results of Galactic microlensing surveys. The dark matter must be composed of individual particles, be they subatomic particles or astronomical bodies. To support a spheroidal distribution within the Galaxy's potential, these particles must be moving on mostly randomly orientated trajectories with a velocity dispersion of 200 − 400 km s$^{-1}$. The dark matter particles do not dissipate significant energy in interactions with each other (or with the luminous matter), otherwise the dark matter would settle down to a rotating disc or a single mass at the centre of the Galaxy's potential, which it has not done.
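The arithmetic behind the non-baryonic argument can be written out using only the two percentages quoted above. The symbols $\Omega_{\rm b}$ and $\Omega_{\rm m}$ (baryonic and total matter density in units of the closure density) are notation introduced here for convenience.

```latex
\[
  \Omega_{\rm m} - \Omega_{\rm b} \simeq 0.27 - 0.04 = 0.23 ,
  \qquad
  \frac{\Omega_{\rm m} - \Omega_{\rm b}}{\Omega_{\rm m}}
  \simeq \frac{0.23}{0.27} \simeq 0.85 .
\]
```

On these figures roughly 85% of the matter in the Universe would be non-baryonic, subject to the same caveat about the cosmological models.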
7.14
The Local Group
The Galaxy lies in a system of more than 40 galaxies called the Local Group, about 1 Mpc across. There are two large spiral galaxies – the Galaxy and the Great Andromeda Galaxy (M31) – and one spiral (M33) of slightly lower mass. There are a few irregular galaxies, most notably the Large Magellanic Cloud and the Small Magellanic Cloud, but these are not particularly massive. All the other members are dwarf galaxies, either dwarf irregulars, dwarf ellipticals or dwarf spheroidals.
A large majority of the galaxies are companions of either the Galaxy or M31. For example, the Magellanic Clouds are situated 50 − 60 kpc from the Galaxy. Several of the dwarf spheroidal galaxies lie within 200 kpc.
7.15
The Formation of the Galaxy
A fundamental question relating to the Galaxy is how it was formed. Some stars in the Galaxy, such as those in the thick disc and bulge, and particularly in the halo, are old. The oldest stars were formed relatively early in the history of the Universe, indicating that the Galaxy had a relatively early origin. Star formation has continued in the disc, at least, throughout its history. There are two main scenarios for the formation of the Galaxy:
• the monolithic collapse model, and
• the merging of subunits.
The monolithic collapse model was developed in the 1960s, particularly by Eggen, Lynden-Bell and Sandage. In this picture, the Galaxy formed by the collapse of a protogalactic cloud of gas that had some net angular momentum. The gas initially had a very low metallicity. The collapse occurred mostly in the radial direction and some modest star formation occurred during this time. This produced very metal-poor stars with randomly-oriented elongated orbits, which today are observed as the stellar halo. The gas settled into a broad rotating disc, which was moderately metal-poor by this time as a result of the enrichment of the gas by the heavy elements created by halo stars. The rotation was the result of the net angular momentum of the protogalactic cloud. Star formation in this cooling, settling disc produced thick disc stars which are rotationally supported but have an appreciable velocity dispersion. The gas continued to settle into a thinner, stable, rotating disc. Residual gas that fell to the central regions formed the bulge stars. Star formation continued in the gas disc at a gradual rate, building up the stars of the thin disc. This model predicted the main features of the Galaxy, and did so very neatly. It explained the fast rotation of disc stars with their near-solar metallicities, and the randomly orientated orbits of halo stars with their very low metallicities and great ages.
The merging scenario instead maintained that the Galaxy was built up by the merging and accretion of subunits. It was first developed by Searle and Zinn in 1977 for the stellar halo. The merging model is strongly supported by detailed computer simulations of galaxy formation. These simulations predict that the primordial material from the Big Bang clumped into large numbers of dark matter haloes that also contained gas. These small haloes then merged through their mutual gravitational attraction, building up larger haloes in the process. The gas formed some stars in these dark matter clumps. A large number of these subunits produced our Galaxy, with the gas settling into a rotating disc as a natural consequence of the dissipative, collisional nature of the gas. Star formation in the disc then formed the disc stars. The stars of the Galactic stellar halo may have come from the accretion of subunits that had already formed some stars. This process of building galaxies by the merging of clumps to form successively larger and larger units is known as hierarchical galaxy formation. Mergers certainly played an important role in the formation and subsequent evolution of the Galaxy. Detailed computer modelling of galaxy formation provides powerful evidence in favour of this picture. Indeed the merging process may well be occurring today.
7.16
The Sagittarius Dwarf
We shall end our discussion of our Galaxy with the Sagittarius Dwarf. Although it was an independent galaxy in the past, it is today plunging into our Galaxy and appears to be in the process of being pulled apart by the gravitational influence of our Galaxy. It may seem amazing that this fairly substantial companion galaxy of the Milky Way remained undiscovered till 1993; the reason is that it is located behind the bulge, and thus has the densest part of the Milky Way in the foreground as camouflage.
Figure 7.6: A partial map of the Sagittarius dwarf galaxy, from RR Lyrae variables (horizontal axis: X in kpc). We are at (0, 0), the ellipses around (8.5, 0) represent the bulge, and the four circles indicate the four microlensing survey fields where the RR Lyraes were found. (From Minniti et al. 1997.)
We do not know in detail how large the Sagittarius Dwarf is, because its stars are difficult to distinguish from the foreground stars. A lower limit on its size comes indirectly from microlensing surveys, because they detect RR Lyrae variables in their fields. Figure 7.6 shows its rough extent. The Sagittarius Dwarf is a highly elongated body. It includes some globular clusters, among them M54, and is associated with a very faint star stream. It contains mostly moderately old or very old stars. It is almost certainly being tidally stretched as it passes through the Milky Way halo: that would explain its long, thin structure. It probably will be totally disrupted over the following $10^{8}$–$10^{9}$ years, with its stars being lost into the Galaxy's stellar halo. The Sagittarius Dwarf provides evidence that the Galaxy does accrete small companion galaxies. It may well have consumed many such galaxies in the past. Indeed, a recent study of infrared observations has found debris from a dwarf galaxy, which has been called the Canis Major Dwarf, situated within the Galaxy about 13 kpc from the Galactic centre. This provides evidence that merging is a significant process in the evolution of galaxies, and possibly in their formation.
Appendix A
Revision of Astronomical Quantities
A1. Astronomical Units
At a research and academic level in astronomy, large distances are expressed in parsecs (pc), while at a popular level they tend to be given in light years (ly).
\[ 1~{\rm pc} = 3.0857 \times 10^{16}~{\rm m} = 3.2616~{\rm ly} \]
Distances on the scale of galaxies are often expressed in kiloparsecs (kpc), with 1 kpc ≡ 1000 pc. Distances between galaxies and on the cosmological scale are usually expressed in megaparsecs (Mpc), with 1 Mpc ≡ $10^{6}$ pc. Distances on the scale of the Solar System are measured in terms of the semi-major axis of the Earth's orbit, the astronomical unit (AU), with 1 AU = $1.4960 \times 10^{11}$ m.
Masses are often measured in terms of the mass of the Sun, the solar mass $M_\odot$, with
\[ 1~M_\odot = 1.989 \times 10^{30}~{\rm kg} \]
Luminosities, defined as the total power output of radiation (in the form of visible light, infrared, ultraviolet etc.), are often expressed in terms of the luminosity of the Sun, the solar luminosity $L_\odot$, with
\[ 1~L_\odot = 3.826 \times 10^{26}~{\rm W} \]
Angular separations on the sky are measured in degrees (deg or °), minutes of arc (arcmin or ′) and seconds of arc (arcsec or ″). The abbreviations arcmin and arcsec are used in preference to min and sec to distinguish them from the minutes and seconds of time that are used when expressing coordinates of right ascension on the sky.
Wavelengths of light are sometimes expressed in Ångström units (Å), with 1 Å ≡ $10^{-10}$ m, in preference to the nanometre (nm, with 1 nm ≡ $10^{-9}$ m ≡ 10 Å). Wavelengths of infrared radiation are often expressed in micrometres (µm), with 1 µm ≡ $10^{-6}$ m. The micrometre is often called the micron.
Time is often expressed in years, with 1 yr = $3.1557 \times 10^{7}$ s. Long time spans are often measured in Gigayears, with 1 Gyr ≡ $10^{9}$ yr = $3.1557 \times 10^{16}$ s.
In all other instances, S.I. units should be used. Unfortunately, some older systems, such as cgs units, still persist in research articles and textbooks.
A2. Astronomical Magnitudes
The brightnesses of astronomical objects in the optical, near infrared and near ultraviolet regions of the spectrum are expressed on a logarithmic scale called magnitudes. A magnitude is the brightness integrated over some range of wavelength, and consequently any particular magnitude applies to a certain region of the spectrum. This region of the spectrum is conventionally selected by passing the light through a coloured filter, and that region of the spectrum is called the filter's waveband, passband or photometric band. Commonly used wavebands are the U band in the near ultraviolet (around 360 nm wavelength), the B band in the blue (around 440 nm), the V band in the green/yellow (around 550 nm), the R band in the red (around 640 nm) and the I band in the near infrared (around 790 nm). It is always necessary to specify which waveband is being used when magnitudes are quoted (and precisely which definition of passband is being used).
The apparent magnitude $m_F$ of an object in some waveband F is related to the flux of radiation $F_F$ in that band at the top of the Earth's atmosphere by
\[ m_F = C_F - 2.5 \log_{10}(F_F) , \]
where $C_F$ is a calibration constant for that band. The constant 2.5 has been chosen to maintain consistency with historical definitions of magnitudes. A fundamental consequence of this definition is that brighter objects have smaller magnitudes. For example, a magnitude 16.3 star is brighter than a magnitude 19.7 star. Two objects that have fluxes $F_1$ and $F_2$ in some band will therefore have apparent magnitudes $m_1$ and $m_2$ in that band that are related by
\[ m_1 - m_2 = -2.5 \log_{10}\left(\frac{F_1}{F_2}\right) \qquad \mbox{and equivalently,} \qquad \frac{F_1}{F_2} = 10^{-\frac{2}{5}(m_1 - m_2)} . \]
The absolute magnitude is the magnitude that an object would have if it were observed at a distance of precisely 10 pc. The absolute magnitude therefore measures the luminosity, or total power output, in the photometric band. Absolute magnitudes are denoted by a capital M with a subscript indicating the photometric band, such as $M_V$ for the V-band absolute magnitude. The apparent magnitude $m_F$ and the absolute magnitude $M_F$ of some object through some filter F are related by
\[ m_F - M_F = 5 \log_{10}(D/{\rm pc}) - 5 + A_F , \]
where D is the distance (here expressed in parsecs) and $A_F$ is the loss of light due to extinction by intervening material (usually interstellar dust). This equation has to be modified for distant galaxies, for which cosmological effects are important, by using
\[ m_F - M_F = 5 \log_{10}(D_L/{\rm pc}) - 5 + A_F + k_F , \]
where $D_L$ is the luminosity distance (again expressed here in parsecs), and $k_F$ is known as the k-correction (it expresses the effect of redshift on the passband). Apparent magnitudes are often denoted simply by the name of the waveband, rather than by a letter m followed by a subscript indicating the band.
For example, V as well as mV denotes the V-band apparent magnitude, and B as well as mB denotes the B-band apparent magnitude. The difference between magnitudes in different wavebands is known as a colour index. The colour index is an excellent measure of the colour of an object. For example, the (B −V ) colour index measures the relative brightness of an object in the blue and yellow parts of the spectrum. The calibration constants CF for different photometric bands are usually defined so that a star of spectral type A0 V (a relatively hot main sequence star) has zero colour indices. So the bright star Vega, which happens to be of type A0 V, has (B − V ) = 0.00 and (V − R) = 0.00. As an example of the use of magnitudes, if a star is observed to have a B-band apparent magnitude of B = 17.85 mag and a V-band apparent magnitude of V = 17.05 mag, its (B − V ) colour index will be (B − V ) = 17.85 − 17.05 = 0.80 mag. If it lies at a distance of D = 2000 pc and there is negligible interstellar extinction between us and the star (i.e. AB = AV = 0.00), the absolute magnitudes will be MB = B − 5 log10 (D/pc) + 5 − AB = +6.34 mag and MV = V − 5 log10 (D/pc) + 5 − AV = +5.54 mag.
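The worked example above can be reproduced in a few lines. This is a minimal sketch; the function names are chosen here for illustration and the numbers are those of the example (B = 17.85, V = 17.05, D = 2000 pc, no extinction).

```python
import math

def absolute_magnitude(apparent_mag, distance_pc, extinction_mag=0.0):
    """M_F = m_F - 5 log10(D/pc) + 5 - A_F (no k-correction)."""
    return apparent_mag - 5.0 * math.log10(distance_pc) + 5.0 - extinction_mag

def flux_ratio(m1, m2):
    """F1/F2 = 10^(-(2/5)(m1 - m2)) for two magnitudes in the same band."""
    return 10.0 ** (-0.4 * (m1 - m2))

B_MAG, V_MAG, DISTANCE_PC = 17.85, 17.05, 2000.0

colour_b_minus_v = B_MAG - V_MAG
abs_mag_b = absolute_magnitude(B_MAG, DISTANCE_PC)
abs_mag_v = absolute_magnitude(V_MAG, DISTANCE_PC)

print(f"(B - V) = {colour_b_minus_v:.2f} mag")
print(f"M_B = {abs_mag_b:+.2f} mag, M_V = {abs_mag_v:+.2f} mag")
# The magnitude 16.3 star is brighter than the 19.7 one: ratio > 1.
print(f"F(16.3)/F(19.7) = {flux_ratio(16.3, 19.7):.1f}")
```

Because the extinction is taken to be zero in both bands, the colour index is the same whether it is formed from apparent or absolute magnitudes.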
Appendix B
Revision of Newtonian Gravitation
B1. Summary
This appendix summarises some basic results relating to gravitation from Newtonian Mechanics. This information covers most of the basic principles from physics about gravitation that are needed for the course.
B2. The Gravitational Field from a Point Mass
The attractive force between two particles of mass $m_1$ and $m_2$ a distance r apart is
\[ F_{\rm grav} = \frac{G m_1 m_2}{r^2} , \]
where G is the universal constant of gravitation, with $G = 6.673 \times 10^{-11}$ m$^{3}$ kg$^{-1}$ s$^{-2}$. Using Newton's Second Law, the acceleration due to gravity at a distance r from a point mass m is
\[ g = \frac{G m}{r^2} , \]
directed towards the point mass. The acceleration due to gravity is the gravitational field strength. The gravitational potential a distance r from a point mass m is
\[ \Phi = - \frac{G m}{r} . \]
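As a quick numerical illustration of the point-mass formulae, the sketch below evaluates the field and potential of one solar mass at a distance of 1 AU, using the constants quoted in these appendices. The choice of the Sun and 1 AU is simply a convenient example, and the function names are introduced here.

```python
G = 6.673e-11        # m^3 kg^-1 s^-2, constant of gravitation
M_SUN = 1.989e30     # kg, solar mass (Appendix A1)
AU = 1.4960e11       # m, astronomical unit (Appendix A1)

def point_mass_field(mass_kg, r_m):
    """Magnitude of g = G m / r^2, directed towards the point mass."""
    return G * mass_kg / r_m**2

def point_mass_potential(mass_kg, r_m):
    """Phi = -G m / r, zero at infinity and negative elsewhere."""
    return -G * mass_kg / r_m

g_at_1au = point_mass_field(M_SUN, AU)
phi_at_1au = point_mass_potential(M_SUN, AU)

print(f"g   = {g_at_1au:.3e} m s^-2")
print(f"Phi = {phi_at_1au:.3e} J kg^-1")
```

For a point mass the two quantities are linked by Φ = −g r, which gives a simple consistency check on any pair of computed values.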
B3. General Results about Gravitational Fields
The acceleration due to gravity g at any point is related to the gradient of the gravitational potential Φ by
\[ \mathbf{g} = - \nabla \Phi , \]
in any gravitational field. The potential is negative at all times, tending to zero at infinite distance. The potential energy of a particle of mass m at a point in a gravitational field is
\[ V = m \Phi , \]
where Φ is the potential at the point. This definition means that the potential energy is negative.
Gauss's Law relates the integral of the gravitational acceleration over a closed surface to the mass lying inside that surface. If g is the acceleration due to gravity and dS is an element of the surface S, then
\[ \int_S \mathbf{g} \cdot d\mathbf{S} = - 4\pi G M_S \]
for any closed surface S, where $M_S$ is the total mass contained within the surface. This is the direct equivalent of Gauss's Law for electrostatics ($\int_S \mathbf{E} \cdot d\mathbf{S} = Q_S/\epsilon_0$). Substituting for $\mathbf{g} = - \nabla\Phi$, we also have
\[ \int_S \nabla \Phi \cdot d\mathbf{S} = + 4\pi G M_S . \]
The Poisson Equation relates the Laplacian of the potential at a point to the mass density. It states that
\[ \nabla^2 \Phi = 4\pi G \rho , \]
where Φ is the potential at the point and ρ is the density.
B4. The Gravitational Field within a Spherically Symmetric Distribution of Mass
The acceleration due to gravity at a distance r from the centre of a spherically symmetric distribution of mass is
\[ g = \frac{G M(r)}{r^2} , \]
where M(r) is the mass interior to a radius r, and is directed towards the centre of the distribution. This result does not depend on how the mass is distributed, other than that it is spherically symmetric. Mass outside the radius r does not affect the gravitational field at r in this spherically symmetric case. This is the same acceleration as would be given by a point mass M(r) at the centre of the distribution. This result can be derived very easily using Gauss's Law.
Consider a spherical surface S of radius r centred on the distribution. The acceleration due to gravity at a point on the surface is g. The magnitude of the acceleration everywhere on the surface is $|\mathbf{g}| \equiv g$, from symmetry.
From Gauss's Law,
\[ \int_S \mathbf{g} \cdot d\mathbf{S} = - 4\pi G M(r) , \]
where M(r) is the mass inside the surface and G is the universal constant of gravitation. But an element of the surface area dS is anti-parallel to the acceleration due to gravity g, so $\mathbf{g} \cdot d\mathbf{S} = - |\mathbf{g}|\,|d\mathbf{S}| = - g\, dS$. Since g is constant over the spherical surface,
\[ - g \int_S dS = - 4\pi G M(r) \qquad \therefore \qquad - g \, (4\pi r^2) = - 4\pi G M(r) , \]
which gives
\[ g = \frac{G M(r)}{r^2} . \]
This analysis is possible because of the spherical symmetry.
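The result g = G M(r)/r² can be illustrated with the simplest possible case, a sphere of uniform density. The uniform sphere and the numerical values below are assumptions chosen purely for illustration; they are not a model of any galaxy discussed in the notes.

```python
G = 6.673e-11  # m^3 kg^-1 s^-2, value quoted in Appendix B2

def enclosed_mass_uniform(r, radius, total_mass):
    """M(r) for a uniform-density sphere (illustrative assumption)."""
    if r >= radius:
        return total_mass                    # all the mass is interior
    return total_mass * (r / radius) ** 3    # mass scales with volume

def gravity(r, radius, total_mass):
    """g = G M(r) / r^2, directed towards the centre."""
    return G * enclosed_mass_uniform(r, radius, total_mass) / r**2

RADIUS = 1.0e20       # hypothetical sphere radius in metres
TOTAL_MASS = 1.0e40   # hypothetical total mass in kilograms

g_surface = gravity(RADIUS, RADIUS, TOTAL_MASS)
g_half_radius = gravity(0.5 * RADIUS, RADIUS, TOTAL_MASS)
g_twice_radius = gravity(2.0 * RADIUS, RADIUS, TOTAL_MASS)

print("g(R/2) / g(R) =", g_half_radius / g_surface)
print("g(2R)  / g(R) =", g_twice_radius / g_surface)
```

Inside the sphere only the interior mass counts, so g grows linearly with r; outside, the sphere acts as a point mass and g falls as 1/r².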
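B5. Gradient Operators
The results presented above use various operators from vector calculus. It may be useful to quote mathematical expressions for these explicitly for some important coordinate systems.
In a Cartesian coordinate system (x, y, z) with unit vectors $\hat{\imath}$, $\hat{\jmath}$ and $\hat{k}$, the gradient of any function A(x, y, z) is
\[ \nabla A \equiv \hat{\imath}\, \frac{\partial A}{\partial x} + \hat{\jmath}\, \frac{\partial A}{\partial y} + \hat{k}\, \frac{\partial A}{\partial z} . \]
The Laplacian operator in the Cartesian system is
\[ \nabla^2 A \equiv \nabla \cdot (\nabla A) \equiv \frac{\partial^2 A}{\partial x^2} + \frac{\partial^2 A}{\partial y^2} + \frac{\partial^2 A}{\partial z^2} . \]
In a spherical polar coordinate system (r, θ, φ) with unit vectors $\hat{e}_r$, $\hat{e}_\theta$ and $\hat{e}_\phi$, these operators are
\[ \nabla A \equiv \hat{e}_r\, \frac{\partial A}{\partial r} + \hat{e}_\theta\, \frac{1}{r} \frac{\partial A}{\partial \theta} + \hat{e}_\phi\, \frac{1}{r \sin\theta} \frac{\partial A}{\partial \phi} , \]
\[ \nabla^2 A \equiv \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial A}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta\, \frac{\partial A}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2 A}{\partial \phi^2} , \]
for any scalar function A(r, θ, φ).
In a cylindrical coordinate system (R, φ, z) with unit vectors $\hat{e}_R$, $\hat{e}_\phi$ and $\hat{e}_z$, these operators are
\[ \nabla A \equiv \hat{e}_R\, \frac{\partial A}{\partial R} + \hat{e}_\phi\, \frac{1}{R} \frac{\partial A}{\partial \phi} + \hat{e}_z\, \frac{\partial A}{\partial z} , \]
\[ \nabla^2 A \equiv \frac{1}{R} \frac{\partial}{\partial R} \left( R\, \frac{\partial A}{\partial R} \right) + \frac{1}{R^2} \frac{\partial^2 A}{\partial \phi^2} + \frac{\partial^2 A}{\partial z^2} , \]
for any scalar function A(R, φ, z).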
B6. Distributions of point masses
In this section we shall consider the gravitational effects of a series of point masses $m_i$ which are located at positions $\mathbf{r}_i$, for i = 1, ..., N. The gravitational potential at some position r caused by the distribution is
\[ \Phi(\mathbf{r}) = - G \sum_{i=1}^{N} \frac{m_i}{|\mathbf{r} - \mathbf{r}_i|} , \]
where G is the constant of gravitation. The acceleration due to gravity at the point r is
\[ \mathbf{g} = - G \sum_{i=1}^{N} \frac{m_i \, (\mathbf{r} - \mathbf{r}_i)}{|\mathbf{r} - \mathbf{r}_i|^3} . \]
The internal gravitational potential energy of the distribution of point masses is
\[ V = - \frac{1}{2}\, G \sum_{\substack{i,j \\ i \neq j}} \frac{m_i m_j}{|\mathbf{r}_i - \mathbf{r}_j|} . \]
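The three sums for point masses translate directly into code. The following is a minimal direct-summation sketch, with function names introduced here for illustration. It makes no attempt at the softening or efficiency tricks used in real N-body codes, and the two-particle configuration at the end is a made-up test case.

```python
import math

G = 6.673e-11  # m^3 kg^-1 s^-2

def potential(r, masses, positions):
    """Phi(r) = -G sum_i m_i / |r - r_i|."""
    return -G * sum(m / math.dist(r, ri) for m, ri in zip(masses, positions))

def acceleration(r, masses, positions):
    """g(r) = -G sum_i m_i (r - r_i) / |r - r_i|^3, as a 3-component list."""
    g = [0.0, 0.0, 0.0]
    for m, ri in zip(masses, positions):
        separation = math.dist(r, ri)
        for k in range(3):
            g[k] -= G * m * (r[k] - ri[k]) / separation**3
    return g

def potential_energy(masses, positions):
    """V = -(1/2) G sum over i != j of m_i m_j / |r_i - r_j|."""
    total = 0.0
    for i in range(len(masses)):
        for j in range(len(masses)):
            if i != j:
                total += masses[i] * masses[j] / math.dist(positions[i], positions[j])
    return -0.5 * G * total

# Hypothetical test case: two equal masses (1 kg) placed 2 m apart.
masses = [1.0, 1.0]
positions = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
midpoint = (0.0, 0.0, 0.0)

phi_mid = potential(midpoint, masses, positions)
g_mid = acceleration(midpoint, masses, positions)
v_pair = potential_energy(masses, positions)
```

At the midpoint the two attractions cancel, so g vanishes while the potential is −2G. The factor 1/2 in V compensates for each pair being counted twice in the double sum, leaving −G/2 for this pair.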
B7. Continuous distributions of mass
The gravitational potential at a position r in a continuous distribution of mass enclosed in a volume V is given by
\[ \Phi(\mathbf{r}) = - G \int_V \frac{\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \, dV' , \]
where $\mathbf{r}'$ is the position vector of the volume element $dV'$, $\rho(\mathbf{r}')$ is the mass density at the position $\mathbf{r}'$, and G is the constant of gravitation. The gradient of this is
\[ \nabla \Phi(\mathbf{r}) = G \int_V \frac{\rho(\mathbf{r}') \, (\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} \, dV' . \]
So the acceleration due to gravity at the point r is
\[ \mathbf{g} = - \nabla \Phi(\mathbf{r}) = - G \int_V \frac{\rho(\mathbf{r}') \, (\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} \, dV' . \]
The internal gravitational potential energy of some distribution of mass is
\[ V = - \frac{1}{2}\, G \int_V \int_V \frac{\rho(\mathbf{r}) \, \rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} \, dV \, dV' , \]
where $\mathbf{r}$ is the position vector of the volume element $dV$ and $\mathbf{r}'$ is the position vector of the volume element $dV'$.
Appendix C: Example Problems
Problem 1: Question
Suppose some category of galaxies has an observed surface brightness profile $I(R) = I_0 \, f(R/R_0)$, with all galaxies having the same $I_0$ and function f but different galaxies having different $R_0$. If the mass-to-light ratio is constant everywhere then show that
\[ L \propto v^4 , \]
where L is the total luminosity and v is a characteristic velocity.
Problem 2: Question
The Plummer potential has a gravitational potential Φ(r) at a distance r from the centre of a spherically-symmetric mass distribution that is given by
\[ \Phi(r) = - \frac{G M_{\rm tot}}{\sqrt{r^2 + a^2}} , \]
where $M_{\rm tot}$ is the total mass, G is the constant of gravitation, and a is a constant. Derive from this an expression for the mass M(r) interior to a radius r, and show that the density ρ(r) at a radius r is
\[ \rho(r) = \frac{3 M_{\rm tot}}{4\pi} \, \frac{a^2}{(r^2 + a^2)^{5/2}} . \]
Problem 3: Question
A family of radial density profiles ρ(r) that has been popular for the theoretical modelling of spherically symmetric galaxies is the Dehnen models. These are defined so that the density profiles are
\[ \rho(r) = M_{\rm tot} \, \frac{q\, a}{4\pi} \, \frac{r^q}{r^3 \, (r + a)^{q+1}} , \]
where r is the radial distance from the centre of the galaxy, q is an adjustable parameter, a is a scaling constant (determining the size of the galaxy), and $M_{\rm tot}$ is the total mass. The special case of q = 1, which is called the Jaffe model, is particularly important because it is found to fit the observed I(R) of ellipticals at least as well as the de Vaucouleurs $R^{1/4}$ profile.
What is the mass M(r) interior to a radius r for any value of q? What is the gravitational potential of a mass distribution having a Jaffe ρ(r)? The Dehnen models have an interesting limit as q → 0. What is it? You may use the standard integral
\[ \int \frac{r^{q-1}}{(r + a)^{q+1}} \, dr = \frac{1}{q\, a} \, \frac{r^q}{(r + a)^q} + {\rm constant} . \]
Problem 1: Answer
Integrating over the surface brightness gives the luminosity of a galaxy to be $L \propto I_0 R_0^2$. Because $I_0$ is constant for all galaxies of this type, $L \propto R_0^2$ for all. The virial theorem implies $M/R_0 \propto v^2$, where M is the mass of a galaxy and v is a typical velocity of stars in a galaxy. Eliminating $R_0$ gives $L \propto M^2 v^{-4}$. Because the mass-to-light ratio is constant, M/L = constant, so $M \propto L$. Substituting for M in $L \propto M^2 v^{-4}$ gives $L \propto L^2 v^{-4}$, which in turn gives
\[ L \propto v^4 , \]
the required result. This is the same as the Tully-Fisher relation for spiral galaxies, or the Faber-Jackson relation for elliptical galaxies (and observed samples of both types of galaxies do tend to have only a limited range in $I_0$ and standard I(R) profiles).
Problem 2: Answer
Gauss's Law gives $\int_S \nabla\Phi \cdot d\mathbf{S} = 4\pi G M_S$ for any closed surface S, where Φ is the potential at a point on the surface and $M_S$ is the total mass enclosed within that surface. Consider the surface S to be a spherical surface of radius r centred on the mass distribution. Therefore Φ = Φ(r) is constant over the surface of radius r. Using a spherical polar coordinate system (r, θ, φ) centred on the mass distribution with unit vectors $\hat{e}_r$, $\hat{e}_\theta$ and $\hat{e}_\phi$,
\[ \nabla\Phi \equiv \hat{e}_r\, \frac{\partial \Phi}{\partial r} + \hat{e}_\theta\, \frac{1}{r} \frac{\partial \Phi}{\partial \theta} + \hat{e}_\phi\, \frac{1}{r \sin\theta} \frac{\partial \Phi}{\partial \phi} = \hat{e}_r\, \frac{d\Phi}{dr} \]
in this case, because the ∂Φ/∂θ and ∂Φ/∂φ terms are zero on account of the spherical symmetry. So ∇Φ is directed radially outwards and $|\nabla\Phi| = d\Phi/dr$. So ∇Φ and dS are parallel over the whole surface. Therefore $\nabla\Phi \cdot d\mathbf{S} = |\nabla\Phi|\,|d\mathbf{S}| \cos 0 = |\nabla\Phi|\, dS$, which gives in Gauss's Law,
\[ |\nabla\Phi| \int_S dS = 4\pi G M(r) , \]
using the fact that |∇Φ| is constant over the surface. Substituting for $|\nabla\Phi| = d\Phi/dr$ we get
\[ \frac{d\Phi}{dr} \, (4\pi r^2) = 4\pi G M(r) \qquad \therefore \qquad M(r) = \frac{r^2}{G} \frac{d\Phi}{dr} . \]
Differentiating the expression for Φ in the question,
\[ \frac{d\Phi}{dr} = \frac{G M_{\rm tot} \, r}{(r^2 + a^2)^{3/2}} , \]
which gives
\[ M(r) = \frac{M_{\rm tot} \, r^3}{(r^2 + a^2)^{3/2}} , \]
the result given in the lectures.
To determine the density ρ, we can consider a thin spherical shell of radius r and thickness dr centred on the mass distribution. The volume of the shell is $4\pi r^2 dr$ and its mass is $dM = 4\pi r^2 \rho(r)\, dr$, where ρ(r) is the density at a radius r. So,
\[ \rho(r) = \frac{1}{4\pi r^2} \frac{dM}{dr} . \]
Differentiating the expression for M(r) derived above using the product rule,
\[ \frac{dM}{dr} = \frac{3 M_{\rm tot} \, r^2}{(r^2 + a^2)^{3/2}} - \frac{3 M_{\rm tot} \, r^4}{(r^2 + a^2)^{5/2}} = \frac{3 M_{\rm tot} \, a^2 r^2}{(r^2 + a^2)^{5/2}} \]
\[ \therefore \qquad \rho(r) = \frac{3 M_{\rm tot}}{4\pi} \, \frac{a^2}{(r^2 + a^2)^{5/2}} , \]
the result we had to prove.
As an alternative method, we could use Poisson's equation $\nabla^2\Phi = 4\pi G \rho$, which in this case of spherical symmetry gives
\[ \rho(r) = \frac{1}{4\pi G} \nabla^2 \Phi = \frac{1}{4\pi G} \, \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Phi}{dr} \right) . \]
Substituting for the expression for dΦ/dr from above and differentiating would give the required result.
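The two relations used in this answer, M(r) = (r²/G) dΦ/dr and ρ(r) = (1/4πr²) dM/dr, can be checked numerically against the closed-form Plummer expressions. The sketch below does this with central finite differences. The units with G = M_tot = a = 1, the step size and the sample radii are arbitrary choices made for the illustration.

```python
import math

# Illustrative units in which G = M_tot = a = 1 (an assumption).
G = 1.0
M_TOT = 1.0
A = 1.0

def plummer_potential(r):
    return -G * M_TOT / math.sqrt(r * r + A * A)

def plummer_mass(r):
    return M_TOT * r**3 / (r * r + A * A) ** 1.5

def plummer_density(r):
    return 3.0 * M_TOT * A * A / (4.0 * math.pi * (r * r + A * A) ** 2.5)

def central_difference(func, x, step=1.0e-5):
    """Second-order finite-difference estimate of d(func)/dx."""
    return (func(x + step) - func(x - step)) / (2.0 * step)

max_mass_error = 0.0
max_density_error = 0.0
for r in (0.5, 1.0, 3.0):
    # Gauss's Law on a sphere: M(r) = (r^2 / G) dPhi/dr
    mass_from_gauss = r * r * central_difference(plummer_potential, r) / G
    # Thin-shell argument: rho(r) = (1 / 4 pi r^2) dM/dr
    density_from_shell = central_difference(plummer_mass, r) / (4.0 * math.pi * r * r)
    max_mass_error = max(max_mass_error,
                         abs(mass_from_gauss / plummer_mass(r) - 1.0))
    max_density_error = max(max_density_error,
                            abs(density_from_shell / plummer_density(r) - 1.0))

print("largest fractional error in M(r):  ", max_mass_error)
print("largest fractional error in rho(r):", max_density_error)
```

Both fractional errors should be limited only by the finite-difference step, which gives some reassurance that the algebra above is self-consistent.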
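Problem 3: Answer
Consider a thin spherical shell of radius $r'$ and thickness $dr'$ concentric with the galaxy. The mass in the shell will be
\[ dM' = 4\pi r'^2 \, dr' \, \rho(r') . \]
Integrating from the centre of the galaxy ($r' = 0$) to radial distance r,
\[ \int_0^{M(r)} dM' = \int_0^r 4\pi r'^2 \, \rho(r') \, dr' = \int_0^r 4\pi r'^2 \, M_{\rm tot} \, \frac{q\, a}{4\pi} \, \frac{r'^q}{r'^3 \, (r' + a)^{q+1}} \, dr' \]
\[ \therefore \qquad M(r) = q\, a\, M_{\rm tot} \int_0^r \frac{r'^{q-1}}{(r' + a)^{q+1}} \, dr' = M_{\rm tot} \left[ \frac{r'^q}{(r' + a)^q} \right]_{r'=0}^{r} \]
on using the standard integral provided. This gives
\[ M(r) = M_{\rm tot} \left( \frac{r^q}{(r + a)^q} - \frac{0^q}{(0 + a)^q} \right) = M_{\rm tot} \, \frac{r^q}{(r + a)^q} , \]
the required result (for all q ≠ 0).
The gravitational potential Φ is related to the mass $M_S$ inside a surface S by Gauss's Law, $\int_S \nabla\Phi \cdot d\mathbf{S} = 4\pi G M_S$. If S is a sphere of radius r centred on the galaxy, $\int_S \nabla\Phi \cdot d\mathbf{S} = 4\pi G M(r)$, which in this spherically symmetric case becomes
\[ \int_S \frac{d\Phi}{dr} \, dS = 4\pi G M(r) \qquad \therefore \qquad \frac{d\Phi}{dr} \int_S dS = 4\pi G M_{\rm tot} \, \frac{r^q}{(r + a)^q} \]
\[ \therefore \qquad 4\pi r^2 \, \frac{d\Phi}{dr} = 4\pi G M_{\rm tot} \, \frac{r}{(r + a)} \qquad \mbox{for the Jaffe model } (q = 1) \]
\[ \therefore \qquad \frac{d\Phi}{dr} = G M_{\rm tot} \, \frac{1}{r \, (r + a)} . \]
Integrating from a radius r to infinity (with Φ(∞) = 0),
\[ \int_{\Phi(r)}^{0} d\Phi' = G M_{\rm tot} \int_r^\infty \frac{1}{r' \, (r' + a)} \, dr' . \]
This can be solved using partial fractions:
\[ 0 - \Phi(r) = \frac{G M_{\rm tot}}{a} \int_r^\infty \left( \frac{1}{r'} - \frac{1}{r' + a} \right) dr' = \frac{G M_{\rm tot}}{a} \Big[ \ln r' - \ln(r' + a) \Big]_{r'=r}^{\infty} = \frac{G M_{\rm tot}}{a} \left[ \ln\left( \frac{1}{1 + a/r'} \right) \right]_{r}^{\infty} = - \frac{G M_{\rm tot}}{a} \ln\left( \frac{r}{r + a} \right) , \]
so that
\[ \Phi(r) = \frac{G M_{\rm tot}}{a} \ln\left( \frac{r}{r + a} \right) = - \frac{G M_{\rm tot}}{a} \ln\left( \frac{r + a}{r} \right) \]
for the potential at a radius r in the Jaffe (q = 1) model. The potential is negative everywhere and tends to zero at infinity, as it must.
Alternatively, we could approach the problem from a more physical perspective and consider the potential energy released when a particle of mass m is brought from infinity to a radius r in the presence of the gravitational force $F = -G M(r)\, m / r^2$. The potential energy at a distance r from the centre is then $V_p(r) = m \Phi(r)$, from which we could calculate Φ(r). This would give the same result as the method above.
If q → 0, the density profile gives ρ = 0 for r > 0. However,
\[ \rho(0) = \frac{a\, M_{\rm tot}}{4\pi} \lim_{q,\, r \to 0} \frac{q \, r^q}{r^3 \, (r + a)^{q+1}} . \]
Equivalently, $M(r) = M_{\rm tot}\, r^q/(r + a)^q \to M_{\rm tot}$ for every r > 0 as q → 0. So q → 0 implies that all the mass $M_{\rm tot}$ is concentrated at the centre: it corresponds to a point mass.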
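A short numerical check of the Jaffe results may be helpful. The sketch below takes the potential in the form Φ(r) = −(G M_tot/a) ln[(r + a)/r], which is negative as Appendix B3 requires, and confirms by finite differences that dΦ/dr = G M(r)/r² with the q = 1 enclosed mass. It also shows the enclosed mass approaching M_tot at small radius when q is small. Units with G = M_tot = a = 1 and the particular radii are assumptions for illustration.

```python
import math

# Illustrative units in which G = M_tot = a = 1 (an assumption).
G = 1.0
M_TOT = 1.0
A = 1.0

def dehnen_mass(r, q):
    """Enclosed mass M(r) = M_tot r^q / (r + a)^q for the Dehnen family."""
    return M_TOT * (r / (r + A)) ** q

def jaffe_potential(r):
    """Phi(r) = -(G M_tot / a) ln((r + a) / r) for the q = 1 model."""
    return -(G * M_TOT / A) * math.log((r + A) / r)

def central_difference(func, x, step=1.0e-5):
    """Second-order finite-difference estimate of d(func)/dx."""
    return (func(x + step) - func(x - step)) / (2.0 * step)

r_test = 2.0
gradient_numeric = central_difference(jaffe_potential, r_test)
gradient_expected = G * dehnen_mass(r_test, 1.0) / r_test**2

# Small q: nearly all the mass already lies inside r = 0.01 a.
mass_fraction_small_q = dehnen_mass(0.01 * A, 1.0e-6) / M_TOT

print("dPhi/dr (numerical):", gradient_numeric)
print("G M(r) / r^2       :", gradient_expected)
print("M(0.01 a) / M_tot for q = 1e-6:", mass_fraction_small_q)
```

The last number illustrates the point-mass limit: for very small q the enclosed mass is essentially M_tot even at radii far inside a.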
Problem 4: Question
The distribution function f in a spherically-symmetric galaxy is related to the mass density ρ(r) at a radial distance r from the centre by
\[ \rho(r) = 4\pi \sqrt{2}\, m \int_{\Phi(r)}^{0} \sqrt{E_m - \Phi(r)} \; f(E_m) \, dE_m , \]
where $E_m$ is the energy per unit mass for a star, Φ(r) is the gravitational potential at a radius r, and m is the mean mass per star. Show that a functional form $f(E_m) = b \, (-E_m)^{7/2}$ is a solution to this equation for a Plummer potential, where b is a constant, using the potential and density given in Question 2. Express b in terms of G, $M_{\rm tot}$ and a using the result of Question 2. The substitution $E_m = \Phi \cos^2\theta$ and the standard result
\[ \int_0^{\pi/2} \sin^2\theta \, \cos^8\theta \, d\theta = \frac{7\pi}{512} \]
may prove useful.
Assuming $m = 0.70\,M_\odot$, what is the value of the distribution function f for $(x, y, z, v_x, v_y, v_z)$ = (10 kpc, 0, 0, 0, 0, 200 km s$^{-1}$) in a galaxy having a Plummer potential with a softening parameter a = 1.70 kpc and a total mass of $2.0 \times 10^{12}\,M_\odot$? Note that x = y = z = 0 corresponds to the centre of the galaxy in this coordinate system.
[ $1\,M_\odot = 1.989 \times 10^{30}$ kg, 1 kpc = $3.0857 \times 10^{19}$ m, and $G = 6.673 \times 10^{-11}$ m$^{3}$ kg$^{-1}$ s$^{-2}$. ]
Problem 4: Answer

Try $f = b\,(-E_m)^{7/2}$, where $E_m$ is the energy per unit mass and $b$ is a constant. The density becomes
$$\rho(r) = 4\pi\sqrt{2}\; m\, b \int_{\Phi}^{0} \sqrt{E_m - \Phi}\;\, (-E_m)^{7/2}\, dE_m .$$
Note that $E_m$ and $\Phi$ are both negative, so $-E_m$ and $-\Phi$ are both positive. Use the substitution $E_m = \Phi \cos^2\theta$. Differentiating, $dE_m = -2\Phi \sin\theta \cos\theta\, d\theta$. The limits of the integral are:

when $E_m = \Phi$, $\cos^2\theta = 1$, so $\cos\theta = \pm 1$; take $\theta = 0$;
when $E_m = 0$, $\cos^2\theta = 0$, so $\cos\theta = 0$; take $\theta = \pi/2$.

Using this substitution, the density becomes
$$\begin{aligned}
\rho(r) &= 4\pi\sqrt{2}\; m\, b \int_0^{\pi/2} \sqrt{\Phi\cos^2\theta - \Phi}\;\, (-\Phi\cos^2\theta)^{7/2}\, (-2\Phi \sin\theta\cos\theta\, d\theta) \\
&= 4\pi\sqrt{2}\; m\, b \int_0^{\pi/2} \sqrt{-\Phi}\, \sin\theta\; (-\Phi)^{7/2} \cos^7\theta\; (2)(-\Phi) \sin\theta\cos\theta\, d\theta \\
&= 8\pi\sqrt{2}\; m\, b\, (-\Phi)^5 \int_0^{\pi/2} \sin^2\theta\, \cos^8\theta\, d\theta .
\end{aligned}$$
Using the standard integral, we get
$$\rho(r) = 8\pi\sqrt{2}\; m\, b\, (-\Phi)^5 \left( \frac{7\pi}{512} \right) = \frac{7\sqrt{2}\,\pi^2}{64}\; m\, b\, (-\Phi)^5 .$$
Substituting for the Plummer potential from Question 1, we obtain
$$\rho(r) = \frac{7\sqrt{2}\,\pi^2}{64}\; m\, b\; \frac{G^5 M_{tot}^5}{(r^2 + a^2)^{5/2}} .$$
This is the same as the expression for the density of the Plummer potential in Question 1 if
$$b = \frac{24\sqrt{2}}{7\pi^3}\, \frac{a^2}{m\, G^5 M_{tot}^4} .$$
So $f(E_m) = b\,(-E_m)^{7/2}$ is a solution to the equation relating density and the distribution function in Question 1 if the constant $b$ has this value.

The energy per unit mass for the position and velocity in the question is
$$E_m = -\frac{G M_{tot}}{\sqrt{r^2 + a^2}} + \frac{1}{2} v^2 = -8.48 \times 10^{11} + 2.00 \times 10^{10}\ \mathrm{J\,kg^{-1}} = -8.28 \times 10^{11}\ \mathrm{J\,kg^{-1}} ,$$
using a radial distance from the centre of the galaxy of $r = \sqrt{x^2 + y^2 + z^2} = 10\ \mathrm{kpc} = 3.09 \times 10^{20}$ m.

But $f = b\,(-E_m)^{7/2}$ with
$$\begin{aligned}
b &= \frac{24\sqrt{2}}{7\pi^3}\, \frac{a^2}{m\, G^5 M_{tot}^4} \\
&= 0.1564 \times \frac{(1.70 \times 3.0857 \times 10^{19})^2}{(0.70 \times 1.989 \times 10^{30})\,(6.673 \times 10^{-11})^5\,(2 \times 10^{12} \times 1.989 \times 10^{30})^4}\ \mathrm{m^{-13}\,s^{10}} \\
&= 0.1564 \times \frac{(1.70 \times 3.0857)^2 \times 10^{38}}{(0.70 \times 1.989)\,(6.673)^5\,(2 \times 1.989)^4 \times 10^{143}}\ \mathrm{m^{-13}\,s^{10}} \\
&= 9.33 \times 10^{-112}\ \mathrm{m^{-13}\,s^{10}} .
\end{aligned}$$
So $f = b\,(-E_m)^{7/2}$ gives
$$f = 9.33 \times 10^{-112} \times (8.28 \times 10^{11})^{7/2}\ \mathrm{m^{-3}\,(m\,s^{-1})^{-3}} = 4.82 \times 10^{-70}\ \mathrm{m^{-3}\,(m\,s^{-1})^{-3}} .$$

[Part of this question appeared in the May 2005 examination.]
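The arithmetic in this answer can be checked numerically. The following Python is a minimal sketch, not part of the original notes: the function names `trig_integral` and `plummer_df` are illustrative, and the midpoint rule is just one convenient way to confirm the standard integral. It uses the constants quoted in the question.

```python
import math

# Physical constants as quoted in the question (SI units).
G = 6.673e-11        # m^3 kg^-1 s^-2
M_SUN = 1.989e30     # kg
KPC = 3.0857e19      # m


def trig_integral(n_steps=20000):
    """Midpoint-rule estimate of the integral of sin^2(t) cos^8(t) on [0, pi/2]."""
    h = (math.pi / 2) / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * h
        total += math.sin(t) ** 2 * math.cos(t) ** 8
    return total * h


def plummer_df(r, v, m_star, m_tot, a):
    """Return (E_m, b, f) for f = b (-E_m)^(7/2) in a Plummer potential."""
    # Energy per unit mass: Plummer potential plus kinetic term.
    e_m = -G * m_tot / math.sqrt(r * r + a * a) + 0.5 * v * v
    # Normalisation constant b derived in the answer above.
    b = 24 * math.sqrt(2) * a ** 2 / (7 * math.pi ** 3 * m_star * G ** 5 * m_tot ** 4)
    return e_m, b, b * (-e_m) ** 3.5


integral = trig_integral()
e_m, b, f = plummer_df(10 * KPC, 200e3, 0.70 * M_SUN, 2.0e12 * M_SUN, 1.70 * KPC)
print("integral:", integral, " 7*pi/512:", 7 * math.pi / 512)
print("E_m [J/kg]:", e_m)
print("b [m^-13 s^10]:", b)
print("f [m^-3 (m/s)^-3]:", f)
```

The printed values can be compared directly with the figures quoted in the worked answer.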
Problem 5: Question

A spherical elliptical galaxy has a total density distribution
$$\rho_{tot}(r) = \frac{\rho_0}{1 + r^2/a^2}$$
as a function of radial distance $r$ from its centre, where $\rho_0$ and $a$ are constants. Show that the mass $M(r)$ interior to a radius $r$ has the form $M(r) \propto r^3$ for $r \ll a$ and $M(r) \propto r$ for $r \gg a$.

Consider a population of massless test particles in the potential of this galaxy. Assume that this population is spherical, non-rotating, isothermal and isotropic, with velocity dispersion $\sigma$ in each velocity component. What is the radial density distribution $\rho_p(r)$ of this test particle population, expressed in terms of $M(r)$ and $r$?

Solve for $\rho_p(r)$ in terms of $r$ explicitly for large radii (i.e. for regions where $r \gg a$) to show that the density has a power law dependence on radius. What is the index of this power law? Give a physical interpretation of this index. What is the condition for the density distributions of the test particle population and the galaxy itself to have similar forms at large $r$?

Problem 6: Question

Many of the researchers who perform $N$-body simulations do so to study the dynamics of galaxies, but some others use $N$-body techniques to study the dynamics of globular clusters. Naively, we might expect that the latter group of people would have an easier job, because they can easily afford as many particles in their simulations as there are stars in the real objects, and they do not need to worry about gas dynamics. We might therefore expect that globular cluster dynamics would be a well-understood subject by now. However, many problems have not been solved fully and plenty of difficult research remains to be done. This problem is to work out why.

Consider a globular cluster and a galaxy, both $\sim 10^{10}$ yr old. The globular cluster has a size $\sim 20$ pc across and contains $10^5$ stars moving with a typical velocity $15\ \mathrm{km\,s^{-1}}$. The galaxy is $\sim 20$ kpc across and contains $\sim 10^{11}$ stars with a typical velocity $200\ \mathrm{km\,s^{-1}}$. Both of these are simulated using $10^5$ particles. Give two reasons why the globular cluster simulation will be more difficult.
Problem 5: Answer

For a thin spherical shell we have $dM(r) = 4\pi r^2 \rho(r)\, dr$. Integrating from the centre ($r = 0$) to a radial distance $r$, we obtain
$$M(r) = 4\pi\rho_0 \int_0^r \frac{r'^2\, dr'}{1 + r'^2/a^2} = 4\pi\rho_0 a^2 \left( r - a \tan^{-1}(r/a) \right)$$
(using the substitution $r' = a \tan\theta$ to solve the integral).

When $r \ll a$, the expansion $\tan^{-1} x = x - x^3/3 + \ldots$ gives
$$M(r) = 4\pi\rho_0 a^2 \left( r - r + \frac{r^3}{3a^2} - O(r^5) \right) = \frac{4\pi\rho_0\, r^3}{3} - O(r^5) .$$
Therefore, $M(r) \propto r^3$ when $r \ll a$.

When $r \gg a$, $\tan^{-1}(r/a) \simeq \pi/2$, so $r - a\tan^{-1}(r/a) \simeq r$, which gives
$$M(r) = 4\pi\rho_0 a^2\, r .$$
Therefore, $M(r) \propto r$ when $r \gg a$. (This is an important result because $M(r) \propto r$ gives a circular velocity $v_{circ} =$ constant, as is observed in the rotation curves of spiral galaxies.) Actually the asymptotic forms are obvious from $\rho \sim \rho_0$ (small $r$) and $\rho \sim \rho_0 a^2/r^2$ (large $r$).

For a spherically-symmetric potential, the second Jeans equation gives
$$\frac{\partial}{\partial r}\left( n \langle v_r^2 \rangle \right) + \frac{n}{r} \left[ 2\langle v_r^2 \rangle - \langle v_\theta^2 \rangle - \langle v_\phi^2 \rangle \right] = -n\, \frac{\partial \Phi}{\partial r} ,$$
where $n$ is the number density of some system of particles or stars, $\Phi$ is the gravitational potential, while $\langle v_r^2 \rangle$, $\langle v_\theta^2 \rangle$ and $\langle v_\phi^2 \rangle$ are the mean values of the squares of the velocity in the $r$, $\theta$ and $\phi$ directions (from Section 2.18 of the course notes). For an isotropic distribution with no net rotation, $\langle v_r^2 \rangle = \langle v_\theta^2 \rangle = \langle v_\phi^2 \rangle = \sigma^2$, so the term in square brackets vanishes and
$$\frac{d}{dr}\left( n \sigma^2 \right) = -n\, \frac{\partial \Phi}{\partial r} .$$
To find $d\Phi/dr$, use the fact that the acceleration due to gravity is $\mathbf{g} = -\nabla\Phi$ and that its magnitude is $g = GM(r)/r^2$ for a spherical distribution of mass, so $d\Phi/dr = GM(r)/r^2$. Since the particles are isothermal, $\sigma$ is independent of $r$. Therefore,
$$\sigma^2\, \frac{dn}{dr} = -n\, \frac{G M(r)}{r^2}$$
in terms of $M(r)$ and $r$. Using $\rho_p \propto n$ and integrating,
$$\sigma^2 \int \frac{d\rho_p}{\rho_p} = -G \int \frac{M(r)}{r^2}\, dr ,$$
which gives
$$\ln \rho_p = -\frac{G}{\sigma^2} \int \frac{M(r)}{r^2}\, dr .$$
This is the density of the test particles as a function of $r$ and $M(r)$.

For large $r$, we have $M(r) = 4\pi\rho_0 a^2\, r$, which gives
$$\ln \rho_p = -\frac{4\pi G \rho_0 a^2}{\sigma^2} \int \frac{dr}{r} = -\frac{4\pi G \rho_0 a^2}{\sigma^2}\, \ln r + k_1 ,$$
where $k_1$ is a constant. Rearranging,
$$\rho_p = k_3\; r^{-4\pi G \rho_0 a^2/\sigma^2} ,$$
where $k_3$ is a constant. This is a power law of the form $\rho_p \propto r^{l}$ where the index is $l = -4\pi G \rho_0 a^2/\sigma^2$. So the density of the population of test particles will have a power law dependence on distance at large radii.

At large radii the circular speed is given by $v_{circ}^2 = GM(r)/r = 4\pi G \rho_0 a^2$, so $l = -v_{circ}^2/\sigma^2$. Here $\sqrt{-l}$ can be interpreted as the ratio of circular speed to dispersion. The tracer population will have the same large-$r$ density law as the massive population if $l = -2$ (i.e. $\rho \propto r^{-2}$ and $\rho_p \propto r^{-2}$ at large $r$ if $l = -2$).

Problem 6: Answer

The crossing time of the globular cluster will be $T_{cross} \sim 20\ \mathrm{pc} / 15\ \mathrm{km\,s^{-1}} \sim 1.3 \times 10^6$ yr, while that of the galaxy will be $\sim 20\ \mathrm{kpc} / 200\ \mathrm{km\,s^{-1}} \sim 1.0 \times 10^8$ yr. So the globular cluster will be $\sim 10^4\, T_{cross}$ old, whereas the galaxy will be $\sim 10^2\, T_{cross}$ old. $N$-body modelling of the dynamics of the two objects will use a series of time steps, with the positions of the particles being computed at each of these steps. However, the globular cluster simulation will need time steps $\sim 10^2$ times smaller than those for the galaxy if the steps are to represent the same fraction of the crossing time in both systems.

Using $T_{relax}/T_{cross} \simeq N/(12 \ln N)$, the globular cluster will have $T_{relax} \simeq 700\, T_{cross} \sim 10^9$ yr, so the globular cluster will be $\sim 10\, T_{relax}$ old. In contrast, the age of the galaxy will be $\ll T_{relax}$. The globular cluster simulation will therefore have to consider two-body relaxation, while the galaxy simulation can ignore it.

Both these considerations make the globular cluster simulation more difficult.
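The timescale estimates in the answer to Problem 6 can be reproduced with a few lines of Python. This is only a sketch: the helper names `crossing_time_yr` and `relaxation_over_crossing` are made up for illustration, and the conversion factors for the parsec and the year are approximate values assumed here rather than taken from the notes.

```python
import math

PC = 3.0857e16     # metres per parsec (assumed value)
YR = 3.156e7       # seconds per year (approximate, assumed value)
AGE_YR = 1.0e10    # age of both systems, from the question


def crossing_time_yr(size_pc, speed_km_s):
    """Crossing time (size / typical speed) in years."""
    return size_pc * PC / (speed_km_s * 1.0e3) / YR


def relaxation_over_crossing(n_stars):
    """T_relax / T_cross ~ N / (12 ln N), as used in the answer."""
    return n_stars / (12.0 * math.log(n_stars))


# Globular cluster: 20 pc across, 10^5 stars, 15 km/s.
t_cross_gc = crossing_time_yr(20.0, 15.0)
t_relax_gc = relaxation_over_crossing(1.0e5) * t_cross_gc

# Galaxy: 20 kpc across, 10^11 stars, 200 km/s.
t_cross_gal = crossing_time_yr(20.0e3, 200.0)
t_relax_gal = relaxation_over_crossing(1.0e11) * t_cross_gal

for name, t_cross, t_relax in [
    ("globular cluster", t_cross_gc, t_relax_gc),
    ("galaxy", t_cross_gal, t_relax_gal),
]:
    print(name)
    print("  T_cross [yr]:", t_cross)
    print("  age / T_cross:", AGE_YR / t_cross)
    print("  age / T_relax:", AGE_YR / t_relax)
```

Note that the relaxation time depends on the number of stars in the real system, not on the number of particles used in the simulation, which is why the galaxy is evaluated with $10^{11}$ here.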
Problem 7: Question

A star lying in the Galactic plane is observed to have a visual magnitude of $V = 13.60$ mag and a colour index $(B - V) = 0.98$ mag. Its spectrum shows it to be a dwarf star of spectral type G6 with a solar composition. Stars of this type are known to have an intrinsic colour of $(B - V)_0 = 0.76$ mag and an absolute visual magnitude of $M_V = +5.20$. What is the extinction by the interstellar medium in the $V$ band between us and the star? What is the distance of the star? What is the mean extinction per unit distance in the direction of the star expressed in $\mathrm{mag\,kpc^{-1}}$ for the $V$ band? Will this extinction per unit distance be the same for other stars in the sky? What would you expect the extinction to be towards the star in the $I$ and $K$ photometric bands (which have central wavelengths of 790 nm and 2.2 µm respectively)?

Problem 8: Question

Observations of a part of the interstellar medium of the Galaxy show that a region of hot ionised gas (with a temperature 500 000 K, number density of ions $6000\ \mathrm{m^{-3}}$), a region of cold neutral gas (temperature 50 K, number density of molecules $2 \times 10^{7}\ \mathrm{m^{-3}}$), and a region of warm neutral gas (temperature 10 000 K, number density of atoms $1 \times 10^{5}\ \mathrm{m^{-3}}$) are in contact with each other. Which, if any, of these are in pressure equilibrium with the others?

Problem 9: Question

The mean density in the form of stars in the disc of the Galaxy is observed to vary with the distance $z$ from the Galactic plane as $\rho_s(z) = \rho_{so}\, e^{-|z|/h_s}$ close to the Sun, where $\rho_{so}$ is the density of stars in space in the plane, and $h_s$ is a scale height ($\rho_{so}$ and $h_s$ are therefore constants at the distance of the Sun from the Galactic Centre). The density of the interstellar gas $\rho_g$ is also found to vary exponentially with height, with $\rho_g(z) = \rho_{go}\, e^{-|z|/h_g}$, where $\rho_{go}$ and $h_g$ are constants. Observations show that $h_s = 250$ pc, $h_g = 150$ pc and $\rho_{so} = 6\,\rho_{go}$.

What is the ratio of the surface density of stars, $\Sigma_s$, to that of gas, $\Sigma_g$, at the Sun's distance from the Galactic Centre? How do you expect the surface density of the dust, $\Sigma_d$, to compare with $\Sigma_s$?
Problem 7: Answer

The $(B-V)$ colour excess of the star is $E_{(B-V)} = (B-V) - (B-V)_0 = 0.98 - 0.76 = 0.22$ mag. The mean interstellar extinction curve has the relation $A_V = 3.3\, E_{(B-V)}$. Therefore, we expect $A_V = 3.3 \times 0.22 = 0.73$ mag.

The relation between apparent and absolute magnitude for the $V$ band is $V - M_V = 5 \log_{10}(D/\mathrm{pc}) - 5 + A_V$. Therefore, the distance is $D = 10^{(V - M_V + 5 - A_V)/5}$ pc $= 342$ pc. The distance to the star is about 340 pc.

The mean extinction per unit distance is $0.73\ \mathrm{mag} / 342\ \mathrm{pc} = 2.1\ \mathrm{mag\,kpc^{-1}}$. [Note the linear relation. This comes from the extinction $A_V = 1.086\, \tau_V$, where $\tau_V$ is the optical depth in the $V$ band. Put $\tau_V = \rho_d\, \kappa_V\, D$, where $\rho_d$ is the density of dust in space, $\kappa_V$ is a coefficient expressing how strongly dust absorbs light in the $V$ band (a constant for the $V$ band unless the type of dust particles varies substantially), and $D$ is the distance to the star. So $A_V = 1.086\, \rho_d\, \kappa_V\, D$ and therefore $A_V$ varies linearly with distance $D$.]

This figure depends on the density of dust in space. Therefore it will be large (as in this case) in the direction of the Milky Way, and small away from the plane of the Galaxy. It will be large along sight lines that pass through dense gas (such as cold, neutral gas) and smaller along sight lines through lower density gas (such as hot ionised gas). Therefore the mean extinction per unit distance varies strongly across the sky. [In the $A_V = 1.086\, \rho_d\, \kappa_V\, D$ representation used above, $A_V/D = 1.086\, \rho_d\, \kappa_V$. The parameter $\kappa_V$ will vary only slightly, but $\rho_d$ varies greatly. Therefore, $A_V/D$ varies greatly.]

From the mean interstellar extinction law graph (on page 57 of the course notes), $A_I/E_{(B-V)} = 2.0$, and $A_K/E_{(B-V)} = 0.4$ by extrapolation. So $A_I = 2.0\, E_{(B-V)} = 2.0 \times 0.22 = 0.44$ mag, and $A_K = 0.4\, E_{(B-V)} = 0.4 \times 0.22 = 0.09$ mag. So the extinction in the $I$ and $K$ bands will be 0.44 and 0.09 mag respectively.
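The chain of steps in this answer (colour excess, extinction, distance modulus, extinction per unit distance) is short enough to check with a small script. The sketch below is illustrative only: the function name `extinction_and_distance` and its parameter names are hypothetical, and the ratios 3.3, 2.0 and 0.4 are the values quoted in the answer.

```python
def extinction_and_distance(v_mag, b_minus_v, b_minus_v0, abs_mag, r_v=3.3):
    """Return (E(B-V), A_V, distance in pc) from the observed photometry."""
    e_bv = b_minus_v - b_minus_v0          # colour excess
    a_v = r_v * e_bv                       # V-band extinction
    # Distance modulus with extinction: V - M_V = 5 log10(D/pc) - 5 + A_V
    d_pc = 10 ** ((v_mag - abs_mag + 5.0 - a_v) / 5.0)
    return e_bv, a_v, d_pc


e_bv, a_v, d_pc = extinction_and_distance(13.60, 0.98, 0.76, 5.20)
a_v_per_kpc = a_v / (d_pc / 1000.0)        # mean extinction per kpc
a_i = 2.0 * e_bv                           # I band, ratio from the answer
a_k = 0.4 * e_bv                           # K band, ratio from the answer

print("E(B-V) [mag]:", e_bv)
print("A_V [mag]:", a_v)
print("D [pc]:", d_pc)
print("A_V / D [mag/kpc]:", a_v_per_kpc)
print("A_I, A_K [mag]:", a_i, a_k)
```

The script carries the unrounded $A_V$ through the distance calculation, so the last digit of the distance can differ slightly from a hand calculation that rounds $A_V$ to 0.73 mag first.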
Problem 8: Answer

The ideal gas law relates the pressure $P$ in a gas to the number density $n$ of particles and the absolute temperature $T$ by $P = n\, k_B\, T$, where $k_B$ is the Boltzmann constant. This gives the pressure in the hot ionised gas as $P_{hot} = 6000\ \mathrm{m^{-3}} \times k_B \times 500\,000\ \mathrm{K} = 3 \times 10^{9}\, k_B\ \mathrm{K\,m^{-3}}$ (in this problem, it is easier to leave the pressure in terms of the constant $k_B$ than to calculate the result explicitly). For the cold neutral gas we find that the pressure is $P_{cold} = 1 \times 10^{9}\, k_B\ \mathrm{K\,m^{-3}}$. For the warm neutral gas the pressure is $P_{warm} = 1 \times 10^{9}\, k_B\ \mathrm{K\,m^{-3}}$.

We can see that $P_{cold} \simeq P_{warm} \neq P_{hot}$. Therefore the cold neutral region and the warm neutral region are in pressure equilibrium. The hot ionised region is not in equilibrium with the other two regions. [Because the hot ionised region has a higher pressure than the others, it will expand by compressing them, although the higher densities in the others mean that their inertias slow the process significantly.]

[In practice, magnetic fields and cosmic rays can provide contributions to the pressure in addition to the gas pressure considered here. They will increase the pressures, particularly for the hot ionised gas which will have been produced by supernovae (supernovae will produce cosmic rays and the neutron stars they leave behind can add to the strength of the magnetic field). These extra effects are ignored in this case.]
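The comparison of pressures can be laid out as a small table in code. This is a sketch under stated assumptions: pressures are kept as $P/k_B = nT$ as in the answer, the 10 per cent tolerance used to decide whether two regions are "in equilibrium" is an arbitrary choice made here, and the names `regions` and `in_equilibrium` are illustrative.

```python
import math

# (number density in m^-3, temperature in K) for each region, from the question.
regions = {
    "hot ionised": (6000.0, 5.0e5),
    "cold neutral": (2.0e7, 50.0),
    "warm neutral": (1.0e5, 1.0e4),
}

# Pressure divided by the Boltzmann constant, P / k_B = n T, in K m^-3.
pressure_over_kb = {name: n * t for name, (n, t) in regions.items()}


def in_equilibrium(p1, p2, rel_tol=0.1):
    """Treat two pressures as equal if they agree to within rel_tol (assumed)."""
    return math.isclose(p1, p2, rel_tol=rel_tol)


names = list(regions)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        balanced = in_equilibrium(pressure_over_kb[first], pressure_over_kb[second])
        print(first, "vs", second, "->", "equilibrium" if balanced else "not in equilibrium")
```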
Problem 9: Answer

The surface mass density of stars can be obtained by integrating the density over height $z$:
$$\begin{aligned}
\Sigma_s &= \int_{-\infty}^{\infty} \rho_s(z)\, dz = \int_{-\infty}^{\infty} \rho_{so}\, e^{-|z|/h_s}\, dz = 2 \int_{0}^{\infty} \rho_{so}\, e^{-z/h_s}\, dz \quad \text{(from the symmetry)} \\
&= 2\, \rho_{so} \int_{0}^{\infty} e^{-z/h_s}\, dz = 2\, \rho_{so} \left[ -h_s\, e^{-z/h_s} \right]_{z=0}^{\infty} = 2\, \rho_{so}\, h_s .
\end{aligned}$$
Similarly, for the gas, $\Sigma_g = 2\, \rho_{go}\, h_g$. Therefore,
$$\frac{\Sigma_s}{\Sigma_g} = \frac{2\, \rho_{so}\, h_s}{2\, \rho_{go}\, h_g} = \frac{\rho_{so}}{\rho_{go}}\, \frac{h_s}{h_g} = 6 \times \frac{250}{150} = 10 .$$
So $\Sigma_s = 10\, \Sigma_g$ at the Sun's distance from the Galactic Centre.

Dust density $\rho_d$ closely follows that of gas and observations show that $\rho_d/\rho_g \simeq 0.1$. Therefore we expect $\Sigma_s \simeq 100\, \Sigma_d$ at the Sun's distance from the Galactic Centre.

[In reality, the density of the interstellar medium is highly variable from place to place. This exponential law is a mean representation of the decline in density from the Galactic plane.]
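As a cross-check on the analytic integration, the surface densities can also be obtained by direct numerical quadrature of the exponential profiles. The sketch below is an illustration rather than part of the notes: `surface_density` is a hypothetical helper, densities are measured in units of $\rho_{go}$, heights in parsecs, and the integral is truncated at 40 scale heights, where the remaining contribution is negligible.

```python
import math


def surface_density(rho0, h, n_scale_heights=40.0, n_steps=200000):
    """Midpoint-rule integral of rho0 * exp(-|z|/h) over |z| < n_scale_heights * h."""
    z_max = n_scale_heights * h
    dz = 2.0 * z_max / n_steps
    total = 0.0
    for i in range(n_steps):
        z = -z_max + (i + 0.5) * dz
        total += rho0 * math.exp(-abs(z) / h)
    return total * dz


# Densities in units of rho_go; scale heights in pc (values from the question).
sigma_stars = surface_density(6.0, 250.0)
sigma_gas = surface_density(1.0, 150.0)

print("Sigma_s (numerical, analytic):", sigma_stars, 2 * 6.0 * 250.0)
print("Sigma_g (numerical, analytic):", sigma_gas, 2 * 1.0 * 150.0)
print("Sigma_s / Sigma_g:", sigma_stars / sigma_gas)
```

Using an even number of steps places $z = 0$ on a cell boundary, so the kink in $e^{-|z|/h}$ at the plane does not degrade the midpoint rule.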
Problem 10: Question

One variant on the Simple Model of galactic chemical evolution is the 'leaky-box' model. This simulates the effect of shocks from supernovae and winds from young massive stars by allowing gas to leave the box at a rate proportional to the star formation rate. Therefore the change $\delta M_{total}$ in the total mass $M_{total}$ in the box is $\delta M_{total} = -c\, \delta M_{stars}$, where $\delta M_{stars}$ is the change in the mass in stars, and $c$ is a constant of proportionality. Use this to derive an expression for the mass in gas $M_{gas}(t)$ at time $t$ in terms of $M_{total}(0)$ and $M_{stars}(t)$. Now modify the closed-box relation between $\delta M_{metals}$ and $\delta M_{stars}$ by adding an appropriate leaking term. Use these two expressions to derive
$$\delta Z = \frac{p\; \delta M_{stars}}{M_{total}(0) - (1 + c)\, M_{stars}} .$$
This expression shows that the leaky-box model will not solve the G-dwarf problem. Why?

Problem 11: Question

For gravitational lensing, for very distant sources (i.e. $D_S \gg D_L$), we can write the expression for the Einstein angular radius as
$$\theta_E = k\, \sqrt{M / D_L} ,$$
where $k$ is a constant. Find the value of $k$ in arcsec if $M$ is measured in solar masses and $D_L$ in parsecs.
Problem 10: Answer

We can apply the same analysis to this problem as was used for the Simple Model, but the total mass $M_{total}$ is now a function of time $t$. Since $\delta M_{total} = -c\, \delta M_{stars}$, the mass of gas at time $t$ is
$$M_{gas}(t) = M_{total}(0) - (1 + c)\, M_{stars}(t) .$$
Some results from the Simple Model still apply, such as the expression for the change $\delta Z$ in the metallicity $Z$ of the gas in time $\delta t$, and the relation between the changes in the mass in stars and the total mass $M_{SF}$ that has participated in star formation up to time $t$:
$$\delta Z = \frac{\delta M_{metals}}{M_{gas}} - Z\, \frac{\delta M_{gas}}{M_{gas}} \qquad \text{and} \qquad \delta M_{stars} = \alpha\, \delta M_{SF} ,$$
where $\alpha$ is the fraction of the mass participating in star formation that remains in long-lived stars and stellar remnants. With outflow we have
$$\begin{aligned}
\delta M_{metals} &= -Z\, \delta M_{SF} + Z\, (1 - \alpha)\, \delta M_{SF} + p\, \delta M_{stars} - c\, Z\, \delta M_{stars} \\
&= p\, \delta M_{stars} - Z\, \delta M_{stars} - c\, Z\, \delta M_{stars} .
\end{aligned}$$
Substituting this, together with $\delta M_{gas} = -(1 + c)\, \delta M_{stars}$, into the expression for $\delta Z$ gives the required result
$$\delta Z = \frac{p\; \delta M_{stars}}{M_{total}(0) - (1 + c)\, M_{stars}} .$$
But this is just the closed-box result with $p$ replaced by $p/(1 + c)$, and $M_{total} = M_{gas}(0)$ replaced by $M_{total}(0)/(1 + c)$. So the leaky-box model just changes the effective value of $p$ and does not change the form of the distribution of stellar metallicities.

Problem 11: Answer

The angular size corresponding to the Einstein radius is (from the course notes)
$$\theta_E = \sqrt{\frac{4 G M}{c^2}\, \frac{D_{LS}}{D_L\, D_S}} ,$$
where $M$ is the mass of the lensing object, $D_L$ is the distance to the lensing object, $D_S$ is the distance to the source, $D_{LS}$ is the distance between the lens and source, $G$ is the constant of gravitation and $c$ is the speed of light. For $D_S \gg D_L$, we have $D_{LS} \simeq D_S$. Therefore,
$$\begin{aligned}
\theta_E &= \frac{2\sqrt{G}}{c}\, \sqrt{\frac{M}{D_L}}\ \mathrm{rad} = \frac{2\sqrt{6.673 \times 10^{-11}}}{2.998 \times 10^{8}}\, \sqrt{\frac{M}{D_L}}\ \mathrm{rad\,kg^{-1/2}\,m^{1/2}} \\
&= 5.45 \times 10^{-14}\, \sqrt{\frac{M}{D_L}}\ \mathrm{rad\,kg^{-1/2}\,m^{1/2}} \\
&= 5.45 \times 10^{-14} \left( \frac{180 \times 60 \times 60}{\pi} \right) \left( \frac{1.989 \times 10^{30}}{3.086 \times 10^{16}} \right)^{1/2} \sqrt{\frac{M}{D_L}}\ \mathrm{arcsec}\ M_\odot^{-1/2}\, \mathrm{pc^{1/2}} \\
&= 0.090\, \sqrt{\frac{M}{D_L}}\ \mathrm{arcsec}\ M_\odot^{-1/2}\, \mathrm{pc^{1/2}} .
\end{aligned}$$
So the constant is $k = 0.090\ \mathrm{arcsec}\ M_\odot^{-1/2}\, \mathrm{pc^{1/2}}$. This means that a lens of one solar mass at a distance of 1 pc (about the distance of the nearest star) imaging an extragalactic source will produce an Einstein angular radius of about 0.1 arcsec.
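The unit conversion for $k$ in Problem 11 is easy to get wrong by hand, so it is worth evaluating directly. The following Python is a minimal sketch using the constants quoted in the answer; the variable names are illustrative only.

```python
import math

G = 6.673e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m s^-1
M_SUN = 1.989e30     # kg
PC = 3.086e16        # m

# theta_E = (2 sqrt(G) / c) sqrt(M / D_L) in radians, for M in kg and D_L in m.
k_si = 2.0 * math.sqrt(G) / C

# Convert radians to arcseconds, and (kg / m) to (solar masses / parsec).
rad_to_arcsec = 180.0 * 3600.0 / math.pi
k_arcsec = k_si * rad_to_arcsec * math.sqrt(M_SUN / PC)

print("k in rad kg^-1/2 m^1/2:", k_si)
print("k in arcsec Msun^-1/2 pc^1/2:", k_arcsec)
```

With this constant, the Einstein angular radius for a lens of mass $M$ (in solar masses) at distance $D_L$ (in parsecs) is `k_arcsec * math.sqrt(M / D_L)`, valid only in the limit $D_S \gg D_L$.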
Problem 12: Question

An optical microlensing survey images a star field in the Galactic bulge close to the Galactic centre. Assuming that the dark matter halo is made from compact objects with approximately stellar masses and has a density distribution
$$\rho(r) = \frac{\rho_0\, b^2}{r^2 + b^2} ,$$
where $r$ is the radial distance from the Galactic centre, $\rho_0$ is the central dark matter density and $b$ is a constant, derive an expression for the optical depth $\tau$ to microlensing towards the field in terms of $\rho_0$, $b$ and the distance $R_0$ of the Sun from the Galactic centre. You may assume that the star field is not significantly affected by dust extinction for this calculation. Calculate $\tau$ if $R_0 = 8.0$ kpc, $b = 2.0$ kpc and $\rho_0 = 2.0 \times 10^{-20}\ \mathrm{kg\,m^{-3}}$.

Problem 13: Question

A weakly-interacting massive particle (WIMP) with a mass of $1000\, m_p$, where $m_p = 1.6726 \times 10^{-27}$ kg is the mass of the proton, lenses the light of a star in the Large Magellanic Cloud, which is situated 50 kpc from the Earth. Calculate the Einstein angular radius of the WIMP if it lies at a distance 20 kpc from the Earth. How does this figure compare with the angular radius of the star if it has the same radius, $6.96 \times 10^{8}$ m, as the Sun? Will the microlensing effect of the WIMP be noticeable? Are dark matter microlensing surveys sensitive to the lensing of stars by WIMPs? What is the Einstein angular radius of a brown dwarf with a mass of $0.05\, M_\odot$ at the same location as the WIMP? Will the lensing effect of the brown dwarf on the background star be noticeable if there is a suitable alignment?

Problem 14: Question

The separation $l$ between the Galaxy and M31 is expected to obey the equation
$$\frac{d^2 l}{dt^2} = -\frac{G M}{l^2}$$
over time $t$, where $M$ is their combined mass, on the assumption that they move only under their mutual attraction. Show that, in addition to the parametric solutions involving sin and cos discussed in the lectures, the expressions

(i) $l = (G M \tau_0^2)^{1/3}\, (\cosh\eta - 1)$, $t = \tau_0\, (\sinh\eta - \eta)$, where $\eta$ is a parameter, and
(ii) $l = (9 G M / 2)^{1/3}\, t^{2/3}$

are also solutions. What do these two cases represent physically? What observational constraints show that these solutions are inappropriate to the real Galaxy–M31 system?
Problem 12: Answer

Use the equation
$$\tau = \frac{4\pi G}{c^2 D_S} \int_0^{D_S} D_L\, D_{LS}\; \rho(D_L)\; dD_L ,$$
where $D_S$ is the distance from the observer to the source, $D_L$ is the distance from the observer to the lens, and $D_{LS}$ is the distance from the lens to the source. Since the star field lies close to the Galactic centre, the geometry gives $D_S = R_0$, $D_{LS} = R_0 - D_L$ and $D_{LS} = r$. Therefore, $r = R_0 - D_L$, so $D_L = R_0 - r$ and $dD_L = -dr$. The optical depth is then
$$\begin{aligned}
\tau &= \frac{4\pi G}{c^2 R_0} \int_0^{R_0} (R_0 - r)\; r\; \frac{\rho_0\, b^2}{r^2 + b^2}\; dr = \frac{4\pi G \rho_0 b^2}{c^2 R_0} \int_0^{R_0} \left( \frac{R_0\, r}{r^2 + b^2} - \frac{r^2}{r^2 + b^2} \right) dr \\
&= \frac{4\pi G \rho_0 b^2}{c^2 R_0} \left[ \frac{R_0}{2} \ln(r^2 + b^2) - r + b \tan^{-1}\!\left( \frac{r}{b} \right) \right]_0^{R_0} ,
\end{aligned}$$
using the result $\int r^2/(r^2 + b^2)\, dr = r - b \tan^{-1}(r/b) + \text{constant}$, from the solution to problem 3. This gives
$$\tau = \frac{2\pi G \rho_0 b^2}{c^2} \left( \ln\!\left( 1 + \frac{R_0^2}{b^2} \right) - 2 + \frac{2b}{R_0} \tan^{-1}\!\left( \frac{R_0}{b} \right) \right) .$$
Substituting $\rho_0 = 2.0 \times 10^{-20}\ \mathrm{kg\,m^{-3}}$, $b = 2.0 \times 10^{3}\ \mathrm{pc} = 2.0 \times 10^{3} \times 3.086 \times 10^{16}$ m and $R_0 = 8.0 \times 10^{3}\ \mathrm{pc} = 8.0 \times 10^{3} \times 3.086 \times 10^{16}$ m gives $\tau = 5.3 \times 10^{-7}$.

Problem 13: Answer

The expression for the Einstein angular radius is
$$\theta_E = \sqrt{\frac{4 G M}{c^2}\, \frac{D_{LS}}{D_L\, D_S}}\ \mathrm{rad} ,$$
where the mass of the lens $M$ is $1000\, m_p = 1.673 \times 10^{-24}$ kg, the distance between the observer and lens is $D_L = 20\ \mathrm{kpc} = 6.17 \times 10^{20}$ m, the distance between the observer and source is $D_S = 50\ \mathrm{kpc} = 1.54 \times 10^{21}$ m, and the distance between the lens and source is $D_{LS} = 30\ \mathrm{kpc} = 9.26 \times 10^{20}$ m. For the WIMP we have $\theta_E = 2.2 \times 10^{-36}$ rad.

The star, with a radius $6.96 \times 10^{8}$ m at a distance of $50\ \mathrm{kpc} = 1.54 \times 10^{21}$ m, subtends an angular radius of $6.96 \times 10^{8} / 1.54 \times 10^{21}$ rad $= 4.5 \times 10^{-13}$ rad. So the angular radius of the star is $2 \times 10^{23}$ times the Einstein angular radius of the WIMP. The lensing effect of the WIMP will take place on a scale that is $\sim 10^{-23}$ times the scale of the star image. It will be completely undetectable, so dark matter microlensing surveys are not sensitive to lensing by WIMPs. [This question assumes that WIMPs do exist. Despite dedicated experiments to search for them, little firm evidence has been found that they exist in any significant numbers.]

An $M = 0.05\, M_\odot$ brown dwarf will have $\theta_E = 5.4 \times 10^{-10}$ rad $= 1.1 \times 10^{-4}$ arcsec. The Einstein angular radius of the brown dwarf is about 1200 times larger than the angular radius of the background star. Lensing effects will be significant if there is a suitable alignment.
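The optical depth integral of Problem 12 and the Einstein radii of Problem 13 can both be evaluated numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the kiloparsec is taken as $3.086 \times 10^{19}$ m, and the midpoint rule is used only as an independent check on the closed-form expression for $\tau$.

```python
import math

G = 6.673e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m s^-1
KPC = 3.086e19       # m
M_SUN = 1.989e30     # kg
M_P = 1.6726e-27     # kg


def optical_depth_closed(rho0, b, r0):
    """Closed-form optical depth for rho = rho0 b^2 / (r^2 + b^2), source at the centre."""
    x = r0 / b
    bracket = math.log(1.0 + x * x) - 2.0 + (2.0 / x) * math.atan(x)
    return 2.0 * math.pi * G * rho0 * b * b / C ** 2 * bracket


def optical_depth_numeric(rho0, b, r0, n_steps=100000):
    """Midpoint-rule evaluation of (4 pi G / c^2 D_S) * int D_L D_LS rho dD_L."""
    h = r0 / n_steps
    total = 0.0
    for i in range(n_steps):
        d_l = (i + 0.5) * h
        r = r0 - d_l                     # galactocentric radius equals D_LS here
        total += d_l * r * rho0 * b * b / (r * r + b * b)
    return 4.0 * math.pi * G / (C ** 2 * r0) * total * h


def einstein_radius(mass, d_l, d_s):
    """Einstein angular radius in radians for a point lens."""
    return math.sqrt(4.0 * G * mass * (d_s - d_l) / (C ** 2 * d_l * d_s))


tau_closed = optical_depth_closed(2.0e-20, 2.0 * KPC, 8.0 * KPC)
tau_numeric = optical_depth_numeric(2.0e-20, 2.0 * KPC, 8.0 * KPC)
theta_wimp = einstein_radius(1000.0 * M_P, 20.0 * KPC, 50.0 * KPC)
theta_bd = einstein_radius(0.05 * M_SUN, 20.0 * KPC, 50.0 * KPC)

print("tau (closed form, numerical):", tau_closed, tau_numeric)
print("theta_E WIMP [rad]:", theta_wimp)
print("theta_E brown dwarf [rad]:", theta_bd)
```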
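Problem 14: Answer

The formulae can be shown to be solutions by differentiation and appropriate substitution into the differential equation.

To prove (i) is a solution, differentiating $l = (G M \tau_0^2)^{1/3} (\cosh\eta - 1)$ and $t = \tau_0 (\sinh\eta - \eta)$, we get
$$\frac{dl}{d\eta} = (G M \tau_0^2)^{1/3} \sinh\eta \qquad \text{and} \qquad \frac{dt}{d\eta} = \tau_0 (\cosh\eta - 1) .$$
Therefore,
$$\frac{dl}{dt} = \frac{dl}{d\eta} \left( \frac{dt}{d\eta} \right)^{-1} = \frac{(G M \tau_0^2)^{1/3} \sinh\eta}{\tau_0 (\cosh\eta - 1)} .$$
Differentiating again,
$$\begin{aligned}
\frac{d}{d\eta} \left( \frac{dl}{dt} \right) &= \frac{(G M \tau_0^2)^{1/3} \cosh\eta\; \tau_0 (\cosh\eta - 1) - (G M \tau_0^2)^{1/3} \sinh\eta\; \tau_0 \sinh\eta}{\tau_0^2 (\cosh\eta - 1)^2} \\
&= -\frac{(G M \tau_0^2)^{1/3}}{\tau_0 (\cosh\eta - 1)} \qquad \text{using } \cosh^2 x - \sinh^2 x \equiv 1 .
\end{aligned}$$
Therefore,
$$\frac{d^2 l}{dt^2} = \frac{d}{d\eta} \left( \frac{dl}{dt} \right) \left( \frac{dt}{d\eta} \right)^{-1} = -\frac{(G M \tau_0^2)^{1/3}}{\tau_0 (\cosh\eta - 1)}\, \frac{1}{\tau_0 (\cosh\eta - 1)} = -\frac{(G M \tau_0^2)^{1/3}}{\tau_0^2 (\cosh\eta - 1)^2} .$$
Therefore,
$$\begin{aligned}
\frac{d^2 l}{dt^2} + \frac{G M}{l^2} &= -\frac{(G M \tau_0^2)^{1/3}}{\tau_0^2 (\cosh\eta - 1)^2} + \frac{G M}{\left[ (G M \tau_0^2)^{1/3} (\cosh\eta - 1) \right]^2} \\
&= -\frac{(G M)^{1/3}}{\tau_0^{4/3} (\cosh\eta - 1)^2} + \frac{(G M)^{1/3}}{\tau_0^{4/3} (\cosh\eta - 1)^2} = 0 .
\end{aligned}$$
So,
$$\frac{d^2 l}{dt^2} = -\frac{G M}{l^2} ,$$
the original equation of motion. Therefore the parametric equations are solutions to the equation of motion.

To prove (ii) is a solution, differentiating $l = (9 G M / 2)^{1/3}\, t^{2/3}$ we get
$$\frac{dl}{dt} = \frac{2}{3} \left( \frac{9 G M}{2} \right)^{1/3} \frac{1}{t^{1/3}} \qquad \text{and} \qquad \frac{d^2 l}{dt^2} = -\frac{2}{9} \left( \frac{9 G M}{2} \right)^{1/3} \frac{1}{t^{4/3}} .$$
Therefore,
$$\begin{aligned}
\frac{d^2 l}{dt^2} + \frac{G M}{l^2} &= -\frac{2}{9} \left( \frac{9 G M}{2} \right)^{1/3} \frac{1}{t^{4/3}} + \frac{G M}{(9 G M / 2)^{2/3}\; t^{4/3}} \\
&= -\frac{2}{9} \left( \frac{9 G M}{2} \right)^{1/3} \frac{1}{t^{4/3}} + \frac{2}{9} \left( \frac{9 G M}{2} \right)^{1/3} \frac{1}{t^{4/3}} = 0 .
\end{aligned}$$
So,
$$\frac{d^2 l}{dt^2} = -\frac{G M}{l^2} ,$$
the original equation of motion. Therefore $l = (9 G M / 2)^{1/3}\, t^{2/3}$ is a solution to the equation of motion.

Case (i) represents the situation where the mass of the Galaxy–M31 system is insufficient to reverse their initial movement apart (the system is unbound). They continue to move apart at all times, with $dl/dt > 0$ even in the limit $t \to \infty$. The separation $l$ is still increasing with time at the present day: this solution predicts that M31 will still be moving away from the Galaxy today, which is inconsistent with radial velocity measurements, which show that M31 is approaching the Galaxy.

Case (ii) represents the situation where the mass of the Galaxy–M31 system is only just insufficient to reverse their initial movement apart (the system is marginally bound). The separation $l$ continues to increase with time up to the present day, but $dl/dt \to 0$ in the limit $t \to \infty$. This solution also predicts that M31 will still be moving away from the Galaxy today, which is again inconsistent with radial velocity measurements.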
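Both solutions can also be checked numerically by evaluating the residual $d^2 l/dt^2 + GM/l^2$ with finite differences. The sketch below assumes units in which $GM = 1$ and $\tau_0 = 1$ (a choice made here purely for convenience), and all function names are illustrative. For the parametric solution, the second derivative with respect to $t$ is built from $\eta$-derivatives using $d^2 l/dt^2 = (l'' t' - l' t'')/t'^3$, where primes denote $d/d\eta$.

```python
import math

GM = 1.0      # assumed units with G*M = 1
TAU0 = 1.0    # assumed time-scale tau_0 = 1


def l_param(eta):
    """Separation for solution (i) as a function of the parameter eta."""
    return (GM * TAU0 ** 2) ** (1.0 / 3.0) * (math.cosh(eta) - 1.0)


def t_param(eta):
    """Time for solution (i) as a function of the parameter eta."""
    return TAU0 * (math.sinh(eta) - eta)


def l_power(t):
    """Separation for solution (ii)."""
    return (4.5 * GM) ** (1.0 / 3.0) * t ** (2.0 / 3.0)


def first_derivative(func, x, h):
    return (func(x + h) - func(x - h)) / (2.0 * h)


def second_derivative(func, x, h):
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)


def residual_parametric(eta, h=1.0e-4):
    """d^2l/dt^2 + GM/l^2 for solution (i), via the chain rule in eta."""
    lp = first_derivative(l_param, eta, h)
    lpp = second_derivative(l_param, eta, h)
    tp = first_derivative(t_param, eta, h)
    tpp = second_derivative(t_param, eta, h)
    accel = (lpp * tp - lp * tpp) / tp ** 3
    return accel + GM / l_param(eta) ** 2


def residual_power(t, h=1.0e-4):
    """d^2l/dt^2 + GM/l^2 for solution (ii)."""
    return second_derivative(l_power, t, h) + GM / l_power(t) ** 2


print("residual for (i) at eta = 1.5:", residual_parametric(1.5))
print("residual for (ii) at t = 2.0:", residual_power(2.0))
```

Both residuals should be zero to within the finite-difference error, which is small compared with the individual terms (of order a few tenths in these units).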