Workbook 2013 1 submitted by Victor Carrera Cangalaya

MATH1051 CALCULUS AND LINEAR ALGEBRA I Semester 1, 2013 Lecture Workbook

How to use this workbook This book should be taken to lectures, tutorials and lab sessions. In the lectures, you will be expected to fill in the blanks in any incomplete definitions, theorems etc, and any examples which appear. The lecturer will make it clear when they are covering something that you need to complete. The completed workbook will act as a study guide designed to assist you in working through assignments and preparing for the mid-semester and final exams. For this reason, it is very important to attend lectures. The text for the calculus part of the course is â&#x20AC;&#x153;Calculusâ&#x20AC;? by James Stewart, 7th edition. We often refer to the text in this workbook, and many of the definitions, theorems, and examples come from the text. (References are also provided for the 6th edition). It is strongly advised that you purchase a copy. Although there is no set text for the linear algebra part of the course, there are several books in the library which cover the material well (see the course website). c Mathematics, School of Physical Sciences, The University of Queensland, Brisbane QLD 4072, Australia

Edited by Joseph Grotowski, Phil Isaac, Michael Jennings, Birgit Loch, Barbara Maenhaut, Victor Scharaschkin, Mary Waterhouse. February 2013.

CONTENTS

Contents 0 MATLAB

0.1

Introduction to MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

0.2

Plotting Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

0.3

Vectors & Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

0.4

Systems of Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

0.5

Eigenvalues & Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

0.6

Plotting points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

0.7

Introduction to M-files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

0.8

Writing Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

0.9

Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

0.10 Useful Commands for MATH1051

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

0.11 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1 Numbers

1.1

Number systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.2

Real number line and ordering on R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.3

Definition: Intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.4

Absolute value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2 Functions

2.1

Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2.2

Definition: Function, domain, range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2.3

Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2.4

Convention (domain) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2.5

Caution (non-functions) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2.6

Vertical line test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2.7

Exponential functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2.8

Trigonometric functions (sin, cos, tan) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2.9

Composition of functions

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2.10 One-to-one (1-1) functions - Stewart, 6ed. pp. 385-388; 5ed. pp. 413-417 . . . . . . . . . .

2.11 Inverse Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2.12 Logarithms - Stewart, 6ed. p. 405; 5ed. p. 434

. . . . . . . . . . . . . . . . . . . . . . . .

2.13 Natural logarithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2.14 Inverse trigonometric functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

CONTENTS

3 Limits

3.1

Definition: Limit - Stewart, 6ed. p. 66; 5ed. p. 71 . . . . . . . . . . . . . . . . . . . . . . .

3.2

One-sided limits

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

3.3

Theorem: Squeeze principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

3.4

Limits as x approaches infinity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

3.5

Some important limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

4 Sequences

4.1

Formal Definition: Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

4.2

Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

4.3

Application: Rabbit population - Stewart, 6ed. p. 722 ex. 71 . . . . . . . . . . . . . . . .

4.4

Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

4.5

Theorem: Limit laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

4.6

Theorem: Squeeze . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

4.7

Useful sequences to remember . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

4.8

Application: Logistic sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5 Continuity

5.1

Definition: Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5.2

Definition: Continuity on intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5.3

Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5.4

Intermediate Value Theorem (IVT) - Stewart, 6ed. p. 104 . . . . . . . . . . . . . . . . . .

5.5

Application of IVT (bisection method) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6 Derivatives

6.1

Tangents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6.2

Definition: Derivative, differentiability . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6.3

Theorem: Differentiability implies continuity . . . . . . . . . . . . . . . . . . . . . . . . .

6.4

Higher derivatives

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6.5

Rates of change - Stewart, 6ed. p. 117 . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6.6

Rules for differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6.7

The chain rule - Stewart, 6ed. p. 156 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6.8

Derivative of inverse function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6.9

L’Hˆ opital’s rule - Stewart, 6ed. p. 472; 5ed. p. 494 . . . . . . . . . . . . . . . . . . . . . .

6.10 Theorem: Continuous extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6.11 The Mean Value Theorem (MVT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

6.12 Definition: Increasing/Decreasing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

CONTENTS

6.13 Increasing/Decreasing test - Stewart, 6ed. p. 221; 5ed. p. 240 . . . . . . . . . . . . . . . .

6.14 Definition: Local maxima and minima . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 6.15 First derivative test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 6.16 Second derivative test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 6.17 Extreme value theorem - Stewart, 6ed. p. 206 . . . . . . . . . . . . . . . . . . . . . . . . . 106 6.18 Optimization problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 7 Series

117

7.1

Infinite sums (notation) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

7.2

Motivation

7.3

Definition: Convergence - Stewart, 6ed. p. 724; 5ed. p. 750 . . . . . . . . . . . . . . . . . 118

7.4

Theorem: p test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

7.5

Theorem: nth term test - Stewart, 6ed. p. 728; 5ed. p. 754 . . . . . . . . . . . . . . . . . 120

7.6

Test for divergence (nth term test) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

7.7

Proof that the Harmonic Series Diverges . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

7.8

Example: Geometric series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

7.9

Application: Bouncing ball - Stewart, 6ed. p. 731 qn 58; 5ed. p. 756 qn 52 . . . . . . . . . 125

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

7.10 Convergence theorem and limit laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 7.11 Comparison tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 7.12 Theorem: Comparison test - Stewart, 6ed. p. 741; 5ed. p. 767 . . . . . . . . . . . . . . . . 129 7.13 Alternating series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 7.14 Theorem: Alternating series test - Stewart, 6ed. p. 746; 5ed. p. 772 . . . . . . . . . . . . . 131 7.15 Definition: Absolute/conditional convergence . . . . . . . . . . . . . . . . . . . . . . . . . 133 7.16 Theorem: Ratio test - Stewart, 6ed. p. 752; 5ed. p. 778 . . . . . . . . . . . . . . . . . . . 135 8 Power series and Taylor series

139

8.1

Definition: Power series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

8.2

Theorem: Defines radius of convergence - Stewart, 6ed. p. 752; 5ed. p. 787 . . . . . . . . 141

8.3

Taylor series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

8.4

Theorem: Formula for Taylor series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

8.5

Binomial series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

9 Integration

153

9.1

Definition: Anti-derivative - Stewart, 6ed. p. 274; 5ed. p. 300 . . . . . . . . . . . . . . . . 153

9.2

Definition: Indefinite integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

9.3

Area under a curve - Stewart, 6ed. pp. 289-296; 5ed. p. 315-322 . . . . . . . . . . . . . . . 154

9.4

Fundamental Theorem of Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

CONTENTS

9.5

Approximate integration (trapezoidal rule)

. . . . . . . . . . . . . . . . . . . . . . . . . . 161

9.6

Improper integrals - Stewart, 6ed. pp. 544-551; 5ed. pp. 566-573 . . . . . . . . . . . . . . 162

9.7

Techniques of integration - Stewart, 6ed. ch. 8; 5ed. ch. 8 . . . . . . . . . . . . . . . . . . 164

9.8

Integrals involving ln function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

9.9

Integration of rational functions by partial fractions . . . . . . . . . . . . . . . . . . . . . 169

9.10 Volumes of revolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176 10 Complex Numbers

180

10.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 10.2 Complex Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 10.3 Polar form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 10.4 Eulerâ&#x20AC;&#x2122;s formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 11 Vectors

186

11.1 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186 11.2 Position Vectors

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187

11.3 Definition: Norm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 11.4 Vector Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 11.5 Scalar multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 11.6 Unit Vectors

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188

11.7 Vectors in 3 dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 11.8 Row and column vectors 11.9 Dot Product

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

12 Matrices and Linear Transformations

198

12.1 2 Ă&#x2014; 2 matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198 12.2 Transformations of the plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198 12.3 Translations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 12.4 Linear transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 12.5 Visualizing linear transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201 12.6 Scalings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 12.7 Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 12.8 Reflections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205 12.9 Composing Transformations & Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . 206 12.10Vectors in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 12.11Properties of Dot Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209 12.12Definition: Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

CONTENTS

12.13Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210 12.14Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 12.15Scalar Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 12.16Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212 12.17Transposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213 13 Gaussian elimination

217

13.1 Simultaneous equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217 13.2 Gaussian elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217 13.3 The general solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227 13.4 Gauss-Jordan elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229 14 Inverses

232

14.1 The Identity Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232 14.2 Definition: Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233 14.3 Inverses via Gaussâ&#x20AC;&#x201C;Jordan. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238 15 Determinants

243

15.1 Definition: Determinant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 15.2 Properties of Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244 15.3 Connection with inverses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249 16 Vector Products in 3-Space

252

16.1 Definition: Cross product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252 16.2 Application: area of a triangle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255 17 Eigenvalues and eigenvectors

258

17.1 Ranking Webpages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258 17.2 Geometry of eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260 17.3 Eigenvalues & Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262 17.4 How to find eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263 18 Vector Spaces

267

18.1 Linear combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267 18.2 Linear independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269 18.3 How to test for linear independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269 18.4 Invertible matrices and linear independence . . . . . . . . . . . . . . . . . . . . . . . . . . 271 18.5 Vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273 18.6 Eigenspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278

CONTENTS

18.7 The span of a set of vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279 18.8 Bases

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282

18.9 Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283 18.10Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284 18.11Further properties of bases

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285

19 Appendix - Practice Problems

288

19.1 Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288 19.2 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288 19.3 Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289 19.4 Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290 19.5 Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290 19.6 Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292 19.7 Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294 19.8 Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294 19.9 Power series and Taylor series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296 19.10Vectors

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297

19.11Matrix Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298 20 Appendix - Mathematical notation

302

21 Appendix - Basics

303

21.1 Powers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303 21.2 Multiplication/Addition of real numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304 21.3 Fractions

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304

21.4 Solving quadratic equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305 21.5 Surds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307

0.1 Introduction to MATLAB

MATLAB

0.1 0.1.1

Introduction to MATLAB Getting Started

• Log in using your library username and password. Ask your tutor if you have a problem. • Open MATLAB just like any other Windows program. Start→Programs →MATLAB→MATLAB R2007a • Save all files to your student directory on H:/. Files saved onto individual computers or S:/ will be deleted at the end of semester. 0.1.2

The MATLAB Program Window

The MATLAB program window consists of the command window, command history, current directory and a menu bar for opening and editing M-files. The command window is the main window we will use. Simple calculations can be performed in the command window. Executing functions, defining variables, viewing answers and information about errors which may have occurred also all happen in the command window. A history of the commands entered into the command window can be found in the command history. Selecting a command from this window will re-execute the command in the command window. The current directory contains a list of files that MATLAB can utilize. This is also the folder in which MATLAB will save your work. The current directory should be set to your student account on H:/. 0.1.3

Getting Help

There are a number of resources available should you need assistance when using MATLAB. The most easily accessible is MATLAB’s own built-in Help system.

0.1 Introduction to MATLAB

• Clicking the blue question mark button in the toolbar will open the Help window, as will typing doc in the command window. You can search for the topic in question or navigate using the panel on the left. • If you know the name of the function or operation you need help with, typing help <function name> in the command window will display a short description of the function and how to use it. Typing doc <function name> will open the Help window and navigate to the page dedicated to that function or operation. If you have trouble finding what you need with MATLAB’s Help system, there is a great deal of documentation and useful articles available online. The Mathworks website http://www.mathworks.com/products/matlab/ has a lot of helpful descriptions and solutions which go beyond what is available in the MATLAB Help, as well as m-files written and uploaded by users. A Google search often turns up results as well. Of course, your tutors and lecturers will be able to help you out during class or consultation times, and the First Year Learning Centre is open from 2–4pm in 67–443 every day. 0.1.4

Variables & Basic Commands

Variables in MATLAB are names for mathematical objects (eg. numbers, vectors, matrices) which are defined by the user, stored in MATLAB’s memory (for the current session), and can be used in calculations. Variables can be defined (assigned values) by using = . For example, the commands >> x = 3 x = 3 >> y = 1009.781 y = 1009.781 assign the value 3 to the variable x and the value 1009.781 to the variable y. Variable names can be letters or words, and can contain numbers, so long as they begin with a letter (eg. a1 is a valid variable name, but 1a is not). Variable names are case sensitive (eg. MATLAB treats x and X as different variables). Using the command clear followed by the name of a variable will remove that variable from MATLAB’s memory. Using clear all will clear all variables at once. Note that it is not necessary to clear a variable before redefining it. However, it is often a good idea to clear a variable after it is used to avoid mistakenly using an incorrect value in future calculations. Tip 1: Entering a variable name as a command will return the value of that variable. Tip 2: To remove a variable from MATLAB’s memory, use the command clear followed by the variable name. To clear all variables at once, use clear all. Tip 3: For help with any MATLAB command simply enter help followed by the command name (eg. help acos). Tip 4: By default MATLAB will display numbers to 4 decimal places. The command format long changes this to 15 decimal places. To revert back to 4 decimal places, use format short.

0.1 Introduction to MATLAB

MATLAB can be used in much the same way as a normal calculator for evaluating mathematical expressions and functions. Operations can be performed on numbers or variables (as long as the variables have already been defined). Some operations, functions and constants are saved permanently in MATLAB. We demonstrate a number of these here; a more thorough list can be found in the Appendix. As with any computer language, commands in MATLAB must follow certain syntax rules. If you get an error message after trying to execute a command, check that you have not made a mistake when typing the command (eg. brackets, apostrophes, commas). The MATLAB commands for basic mathematical operations are exactly what one would expect. Addition, subtraction, multiplication and division are all demonstrated below. >> x = 10 x = 10 >> y = 5; >> x+y ans = 15 >> x-y ans = 5 >> x*y ans = 50 >> x/y ans = 2 Note that after entering the command x = 10 MATLAB provided feedback, but after the command y = 5; no feedback was produced. Following a command with a semicolon will stop MATLAB from displaying any output, though the command itself will still be executed. Some other common operations (given below) include exponentiation, square root, log and sin. Note that log is the natural logarithm (base e), not base 10. >> x = 10 x = 10 >> y = 5 y = 5 >> x^y ans = 100000 >> sqrt(x) ans = 3.1623 >> log(x) ans = 2.3026 Tip 5: Following any command with a semicolon will stop MATLAB from displaying any output (though the operation will still be performed).

0.2 Plotting Functions

>> sin(x) ans = -0.5440 >> exp(x) ans = 2.2026e+004 The last answer above demonstrates MATLAB’s use of scientific notation: it is the number 2.2026 × 104 . Do not confuse the ‘e’ with Euler’s number e = 2.718 . . .; to obtain this number, use the command exp(1). MATLAB has a number of other constants built-in, such as π and i. For example, the commands >> x = 3*pi x = 9.4248 >> y = log(x) y = 2.2433 >> clear all assign the value 3π to the variable x, set y to be ln(x), and then clear MATLAB’s memory of all variable values. Check the Appendix for a more complete list of commonly used functions and constants. Division of a non-zero number by 0 will return the result Inf or -Inf (representative of +∞ and −∞ respectively). Any other mathematically undefined operation will give the result NaN, which stands for Not-a-Number. For example, >> 0/0 ans = NaN In normal circumstances, Inf and NaN should be considered errors.

0.2 0.2.1

Plotting Functions General ezplot Commands

MATLAB has a number of built-in commands to make plotting functions very easy. MATLAB plots functions in a separate Figure window which will automatically open. To plot f (x) = sin(x) over MATLAB’s default domain −2π < x < 2π, use the command Tip 6: Pressing the up arrow in the command window will scroll up through previously executed commands. Tip 7: The result of the most recently executed command is stored as the variable ans. Tip 8: The command who gives a list of all currently defined variables. Tip 9: Some operations that might normally be considered undefined (eg. log(−1) or arccos(2)) will return complex answers in MATLAB. The reasons for this are not covered in MATH1051 (see courses on complex analysis for details).

0.2 Plotting Functions

>> ezplot(’sin(x)’) In general, to plot an arbitrary function f (x) type >> ezplot(’f(x)’) Note that the ’ ’ symbols cannot be left out here. MATLAB will endeavour to automatically assign a sensible range to the dependent variable if it is not specified. The domain and range can, however, both be controlled by the user. The first command below assigns the domain a ≤ x ≤ b, and the second assigns the same domain as well as range c ≤ f (x) ≤ d. >> ezplot(’f(x)’,[a,b]) >> ezplot(’f(x)’,[a,b,c,d])

0.2.2

Labelling & Editing Figures

After a figure has been generated in MATLAB, it can be labelled and edited through the command window (check MATLAB’s help file for instructions on how to do this) or through the menus in the figure window itself. Axis labels, titles and legends can be added and modified through the Insert menu. To edit the plots themselves, use View → Property Editor.

Tip 10: To plot multiple graphs on the same axes, use the command hold on before plotting the first one, and then hold off when finished. Tip 11: A set of grid lines can be toggled on and off using the command grid, or through the Property Editor. Tip 12: The name given to the independent variable in ezplot does not matter. ezplot(’sin(x)’) and ezplot(’sin(t)’) will produce the same output.

For example,

0.3 Vectors & Matrices

0.3 0.3.1

Vectors & Matrices Basic Vector & Matrix Operations

One of MATLAB’s strengths is its support for vectors and matrices. Performing operations with such objects is very easy in MATLAB, which has a wide range of built-in functions for manipulating them. The following commands demonstrate how to define the matrices and vectors   4 2 4   and A= a= 1 3 4 , b= 0 , 0 1 7 >> a = [1,3,4] a = 1 3 4 >> b = [4;0;7] b = 4 0 7 >> A = [2,4;0,7] A = 2 4 0 7 Note that commas separate entries in a row and semicolons separate rows. The following MATLAB segment details the commands which can be used to take the transpose of matrix a; print the (1, 2) entry of matrix A; the inverse of A; and the product of A with itself. >> a’ ans = 1 3 4 >> A(1,2) ans = 4 >> inv(A) ans = 0.5000 0 >> A*A ans = 4 36 0 49

-0.2857 0.1429

Some functions which are normally applied to single numbers can be applied to all the elements of a matrix at once. For example, >> log(a) ans = 0 1.0986

1.3863

0.4 Systems of Linear Equations

Here, applying the log function to a vector gives the natural logarithm of each element of the vector. Some functions require a full stop be inserted into the expression to operate in this way; for example, you can square each element of a matrix as follows: >> a.^2 ans = 1

In general, functions which have special definitions when applied to matrices (for example, multiplication or exponentiation) will require the full stop, while functions which can only be applied to single numbers (like logarithms) will not. Consult MATLAB’s documentation to determine whether a function can operate on matrices, and whether it requires the full stop operator. See the Appendix for a more thorough list of MATLAB operations which can be performed on matrices and vectors, as well as the commands for some special matrices.

0.4

Systems of Linear Equations

You have seen how to create and manipulate matrices in MATLAB. Systems of linear equations can be written as matrix equations, so we can use MATLAB to easily solve such problems. Consider the system of linear equations 2x + x +

You know from lectures that finding Ax = b, where  2 1 A = 1 1 0 2

y − z = 1 y − 2z = −1 2y − z = 3

the solution to this is equivalent to solving the matrix equation  −1 −2 , −1



 1 b = −1 3

  x and x = y  z

Remember that the commands >> A = [2 1 -1; 1 1 -2; 0 2 -1] A = 2 1 -1 1 1 -2 0 2 -1 >> b = [1; -1; 3] b = 1 -1 3 will enter the values of A and b into MATLAB. Tip 13: When entering a matrix, commas separate entries within a row and semicolons start new rows. Commas can be replaced with spaces (eg. matrix A in the table could be entered as A=[2 4;0 1]). Tip 14: In MATLAB, the column and row indices of vectors and matrices start at 1, not 0. So the value in the upper left corner of a matrix A is A(1, 1), not A(0, 0).

0.5 Eigenvalues & Eigenvectors

0.4.1

Solution by Inversion

One method for solving a matrix equation Ax = b is by noting that x = A−1 Ax = A−1 b, as long as A−1 exists. So to solve the above system, use the command >> x = inv(A)*b x = 0.2000 2.4000 1.8000 after entering the values of A and b. 0.4.2

Solution by Gaussian Elimination

Another method for solving systems of linear equations is Gaussian elimination. The function rref performs Gaussian elimination to find the reduced row echelon form of a matrix. After finding the reduced row echelon form of an augmented matrix (A|b), the solution to Ax = b can just be read off. To augment two matrices in MATLAB (after first making sure that the dimensions match up correctly), simply put them in square brackets separated by a space or comma. For example, to augment A and b above and then solve Ax = b, use the commands >> AugA = [A b] AugA = 2 1 -1 1 1 -2 0 2 -1 >> rref(AugA) ans = 1 0 0 0 1 0 0 0 1

1 -1 3

0.2000 2.4000 1.8000

Another way to solve a system using Gaussian elimination is to use MATLAB’s backslash operator. To solve Ax = b use the command >> x = A \ b x = 0.2000 2.4000 1.8000

0.5

Eigenvalues & Eigenvectors

MATLAB has a built in function called eig to calculate the eigenvalues of a matrix and give corresponding eigenvectors. For example, to calculate the eigenvalues and eigenvectors of the matrix 1 3 A= , 4 2 Tip 15: The command AugA = [A b] concatenates (joins) matrices A and b horizontally, while AugA = [A; b] concatenates A and b vertically. In both cases, the lengths of the sides being joined together must be the same.

0.6 Plotting points

use the following commands. >> A = [1 3; 4 2]; >> [V,D] = eig(A) V = -0.7071 -0.6000 0.7071 -0.8000 D = -2 0 0 5 Here, the values on the diagonal of D are the eigenvalues of A, and the columns of V are the corresponding eigenvectors. So A has an eigenvalue λ = −2 with associated eigenvector x = [−0.7071, 0.7071]T and another eigenvalue λ = 5 with associated eigenvector x = [−0.6, −0.8]T .

0.6

Plotting points

You have seen how to use the ezplot command to plot functions of one variable. There are many situations where you may instead wish to plot a set of discrete points on a cartesian plane. MATLAB’s plot command can be used to do this. To plot a set of points {(x1 , y1 ), (x2 , y2 ), . . . , (xn , yn )}, use the command plot(x,y), where x = [x1 , x2 , . . . , xn ] and y = [y1 , y2 , . . . , yn ]. For example, to plot the points (0, 0), (1, 1), (2, 3), (4, 6), use the commands >> x=[0,1,2,3]; >> y=[0,1,3,6]; >> plot(x,y)

Tip 16: Be careful when using the command [V,D] = eig(A). No matter what you call V and D, the one which comes first will be the matrix of eigenvectors, and the other will be the matrix of eigenvalues. Tip 17: You can use hold on to simultaneously plot continuous functions with ezplot and discrete points with plot.

0.7 Introduction to M-files

There is an optional third argument for plot which specifies the colour, type of point and type of line which will be used to plot the points. For example, >> plot(x,y,’ro’) will plot the points as red circles with no line joining them, and >> plot(x,y,’mx-’) will plot the points as magenta crosses joined by a solid line. Use the command help plot for a full explanation. As with ezplot, the properties of a figure (eg. title, axis labels, colour) can be edited using the menus in the figure window.

0.7

Introduction to M-files

M-files are text files which contain sequences of operations which MATLAB will execute when the files are run. In comparison to most programming languages, m-files are very easy to write, and are usually built around only a few core commands. 0.7.1

Creating & Opening M-files

To create a new m-file; either • Click on the New M-File button in the menu bar; or • Select File → New → M-File.

0.7 Introduction to M-files

You should save m-files you wish to keep in your H:/ directory. To open a saved m-file: • Use the drop-down menu at the top of the screen to navigate to the correct folder, and then double-click the file in the Current Directory panel; • Click the Open file button in the menu bar; or • Select File → Open. 0.7.2

Writing & Running M-files

All of the commands which we have thus far covered will work as part of an m-file. For example, create a new m-file, write the commands x = sqrt(5); y = exp(x) and save the file as example.m. In the main MATLAB window, use the drop-down menu to navigate to the folder where the file is saved, and then enter the command >> example y = 9.3565 in the command window. This illustrates the basic idea of using m-files: you write a sequence of commands, save them all together into an m-file, and then execute them all in order by calling the name of the m-file as a command. Of course, the above example is very simple, and would take less time to just evaluate in the command window. M-files are far more useful when they are intended to be used over and over again, because the commands only need to be typed out once. For example, say we have a collection of numbers (which we can store as a vector), and we wish to perform some statistical analysis by calculating the size of the set, the mean, median, mode, range and standard deviation, and we also wish to sort the numbers into ascending order. MATLAB has built-in commands for all these operations. If we only have a single data set then we may as well just type all the commands into the command window; however, if we have many sets then we would be typing the same commands over again for each set. We can save time by putting all the commands into an m-file. Put the following into an m-file and save it as statanalysis.m. % For a vector x of real numbers, outputs the size of the set x, % the mean, median, mode, range and standard deviation of the data, % and sorts the data into numerical order. setsize = length(x) meanvalue = mean(x) medianvalue = median(x) modevalue = mode(x) range = max(x) - min(x) standarddev = std(x) sortedx = sort(x) Now we can perform all the operations at once by entering the vector x and then running statanalysis. For example, >> x = [1,4,10,-2,19,0,6,3,4,7]; >> statanalysis

0.7 Introduction to M-files

setsize = 10 meanvalue = 5.2000 medianvalue = 4 modevalue = 4 range = 21 standarddev = 5.9777 sortedx = -2 0

If we now have another data set we wish to analyze, we only need to redefine x to be that set, and run statanlysis again. 0.7.3

If Statements

While m-files can be used to save time by collecting together operations which would otherwise take time to execute in the command window, there are a number of special commands which make m-files considerably more powerful. The first one we will look at is the if statement. It has the basic form if condition commands end where condition is a true/false statement and commands is a set of operations for MATLAB to perform. These commands will be executed if condition is true; if they are false, MATLAB will do nothing. For example, a valid if statement might be if x>=0 y=sqrt(x) end Here, if x is greater than or equal to 0, then y will be the square root of x. If x is negative then MATLAB does nothing. If we wanted MATLAB to do something when x is negative, we could add a second if statement. However, an else command can be inserted into the if statement to accommodate this: if condition 1st commands else 2nd commands end For example, letâ&#x20AC;&#x2122;s write a program to calculate the absolute value of a number. Create a new m-file and insert the following code. if x < 0 y=-x else y=x end Save the file as absval.m. Set the current directory to the folder containing the file, and enter the commands

0.7 Introduction to M-files

>> x = -5; >> absval y = 5 If your conditions have more than two possibilities (eg. a piecewise function which evaluates differently over different intervals), then you can also make use of elseif statements. if 1st condition 1st commands elseif 2nd condition 2nd commands else 3rd commands end You can use as many elseif statements as you like. As soon as a condition which is satisfied is found, the corresponding command will be executed, and no further conditions will be checked. For example, consider the following m-file (save it as checksign.m) if x < 0 ‘negative’ elseif x > 0 ‘positive’ else ‘zero’ end This will check whether x is negative, positive or zero, and will output a string (the phrase inside the ‘ ’) saying as much. For example, >> x = -10; >> checksign ans = negative The condition in an if statement must be a boolean (true/false) expression. MATLAB has a number of these, the simplest of which are detailed in the table below. == ∽= < <= > >=

equal to not equal to less than less than or equal to greater than greater than or equal to

Tip 18: In MATLAB a word or phrase inside inverted commas (‘ ’) is known as a string. Strings can be used to display messages or label figures and axes, among other things. Variables can take string values, though of course they cannot then be used in mathematical operations.

0.8 Writing Functions

0.7.4

Good Habits

It is often the case that MATLAB code makes perfect sense to the author at the time of writing, but is incomprehensible to anyone else or even the author themselves later on. Here are a few guidelines to ensure that your code can be understood by anyone reading it. • Use intuitive and descriptive names for m-files, functions (see the next section) and variables. • Indent the commands inside specific statements (see the if statements above) and loops (discussed in a later section). • Add thorough comments to your code. Anything written after a % character on a line is a comment and will be ignored by MATLAB. Describe what an m-file does, the meaning of all its variables, and what is achieved by any statements or loops.

0.8

Writing Functions

The example described in the previous section (absval.m) is fine for calculating the absolute value of a number, but every time we want to change numbers we have to redefine x, which may be undesirable. We can avoid this by converting the program into a function, so that we can input any value we like without having to change any variables. Open absval.m and change the first line so that the file reads as follows, and save it. function y = absval(x) if x< 0 y=-x; else y=x; end The first line of a function is very important. It gives the name of the function (here, absval; this must match the name of the m-file itself), the name of any input variables (here, x) and the name of the output variable (here, y). This function can now be executed like any other MATLAB function (eg. sin or log). To calculate the absolute value of −8, use the command >> absval(-8) ans = 8 The value −8 will be assigned to the input variable x within the function, and the value of the output variable y will be returned by the function. Let’s look at another example. Say we want to write a function which takes two n × n matrices A and B as input, and calculates (A + 2B)2 . The m-file would look as follows. function y = matrixsumsquare(A,B) y = (A+2*B)^2; 2 5 1 3 , use the commands and B = To evaluate the function on matrices A = 1 1 4 0

Tip 19: When saving a m-file function, make sure you the m-file has exactly the same name as the name of the function (ie. the name given in the first line of the m-file).

0.9 Loops

>> A = [1 3; 4 0]; >> B = [2 5; 1 1]; >> matrixsumsquare(A,B) ans = 103 91 42 82 Note that as with any variables, A and B must be defined before matrixsumsquare can operate on them. The only variables MATLAB can access while evaluating a function are those given as input to the function, or those which are defined within the function. For example, consider the function function output = example(y) output = x + y; If you now try to evaluate this function by running the commands >> x = 5; >> example(10) You will get an error, because the variable x is neither given as input for the function example, nor defined within example.m. The reverse is also true – variables which are solely defined within functions will not be accessible outside of the function, with the exception of the output variable (which will be output as ans). Note that these caveats only apply to functions; m-files which are not functions can access any variables defined in the command window, and vice versa.

0.9 0.9.1

Loops

While Loops

In a previous section you were introduced to the if statement, where MATLAB executes a command only when a condition holds. A while loop has the same structure as an if statement: while condition commands end The difference is that while condition holds, commands is executed, and then the loop is restarted and condition is checked again. The first time that condition does not hold MATLAB exits the loop and moves on to the next line after end. Consider the following example, which uses a while loop to calculate the factorial of a natural number n, written n!. Recall that n! = n(n − 1)(n − 2) · · · 2 · 1. MATLAB already has a built-in factorial function (called factorial), but we will write our own. Note that we should not call the function factorial, since a function with that name is already defined. We instead use the name fact. function output = fact(x) % temp is a temporary working variable temp=1; while x>1 temp=temp*x; x=x-1; end output=temp;

0.9 Loops

Here, temp acts as a partial product: it starts out at 1, we multiply it by x, and then call that value temp. We then decrease the value of x by 1, and then multiply temp by x again. This is repeated until x takes the value 1, and we no longer need to multiply. The function’s output is then the final value of temp. To use our function, we simply call it with a positive integer as the argument: >> fact(6) ans = 720 Note that as long as condition holds and commands does not alter any of the values used in condition , then the loop will continue running forever and MATLAB will crash. When you write a while loop, ensure that there will definitely be some point at which condition is false. If you do happen to execute code containing an infinite loop, use the command Ctrl+c to stop it. 0.9.2

For Loops

So far you have seen if statements and while loops, which execute commands when certain conditions hold. Sometimes, however, we simply want the commands to be executed a fixed number of times without depending on any conditions. This situation is handled by a for loop, which has the following form: for variable = vector of values commands end For each of the values in the vector, variable is assigned that value and the commands are executed. This is best illustrated with an example. Consider the for loop for j = [1,2,3,4,5] t(j) = j^2; end Note that as this code is executed, we evaluate t(1), t(2), t(3), t(4), t(5), so the output t is a 1 × 5 vector. In the first iteration of the loop, j takes the value 1, and then t(1) (that is, the first entry of the vector t) takes the value 12 . In the next iteration, j takes the value 2, and t(2) takes the value 22 . This continues on until j=5, after which the loop is exited. We can use a for loop to write a factorial function, instead of a while loop as seen in the previous section. Consider the following function: function output = fact2(x) % temp is a temporary working variable temp=1; for j=2:x temp=temp*j; end output=temp;

Tip 20: The command Ctrl+c will stop MATLAB’s current operation. Tip 21: There is an easier way to create the row vector [1, 2, 3, 4, 5], namely the command 1:5. In general, the command x:n:y will create the vector [x, x + n, x + 2n, x + 3n, . . . , y]; that is, the numbers from x to y in increments of n. If the n part is left out (eg. with the command 1:5), then MATLAB assumes n = 1.

0.10 Useful Commands for MATH1051

Here, we defined the vector of values which j takes as 2:x. This is the vector [2, 3, . . . , x]; so the loop will be evaluated for j = 2, j = 3, . . . , j = x. Again, temp is acting as a partial product. In the first iteration of the for loop, j takes the value 2, and temp is redefined as temp*2. In the next iteration, j takes the value 3, and temp is redefined as temp*3. This is repeated until temp has been multiplied by all the integers 2, 3, . . . , x, and it is this final value of temp which becomes the output. Note that if either fact or fact2 had been evaluated with a negative number as input, they would both return the value 1. This is because in both cases no iterations of the loop would be executed: in fact, the statement x > 1 would be false from the beginning, so the while loop is never entered; and in fact2, the vector 2:x would be empty, so the for loop is never entered. Hence the value of temp is never altered from its initial value of 1.

0.10

Useful Commands for MATH1051

In this section we mention a few MATLAB commands which may be useful when completing assignments and/or other tasks in MATH1051. 0.10.1

The trapz Function

The trapz function uses the trapezoid method to evaluate the integral of a function. Recall that the trapezoid method involves dividing up the region under a function into vertical strips, finding the values of the function where the strips meet, and using those values to calculate the area of the region. The trapz function takes as its argument a vector, which it interprets as the set of values of a function at the points where the vertical strips meet. By default, MATLAB assumes the strips are one unit wide. For example, to find the area under the function f (x) = ln(x) between x = 1 and x = 100, use the commands >> x=1:100; >> y=log(x); >> trapz(y) ans = 361.4368 To use trapz when the width of the strips is not 1, simply multiply the final answer by the width of the strips. For example, to compute the same integral as above but with a strip width of 0.1, use the commands >> x=1:0.1:100; >> y=log(x); >> 0.1*trapz(y) ans = 361.5162 0.10.2

fminsear h and fminbnd

The MATLAB functions fminsearch and fminbnd are used to find the minimum value of a function in some region. fminsearch takes as input the function itself and a starting point x0. For example, to find where the minimum value of f (x) = sin(x) near the value x0 = 0 occurs, use the command >> fminsearch(â&#x20AC;&#x2122;sin(x)â&#x20AC;&#x2122;,0) ans = -1.5708

0.11 Appendix

The answer -1.5708 is of course approximately −π/2, which is where a minimum of sin(x) occurs. To find the value of the function itself at the minimum, you can evaluate the function with a separate command (eg. sin(-1.5708)), or use the command >> [xvalue,fvalue]=fminsearch(’sin(x)’,0) xvalue = -1.5708 fvalue = -1.0000 The function fminbnd performs a similar task to fminsearch, except instead of starting at an initial guess x0, we provide an interval [a,b] inside which MATLAB will search for a minimum value. For example, to find where the minimum value of f (x) = x3 + 10x2 in the interval [−4, 4] occurs, use the command >> fminbnd(’x^3+10*x^2’,-4,4) ans = -1.2543e-005 The answer −1.2543 × 10−5 is very close to 0, and it easily be verified analytically that the minimum value of this function on the interval [−4, 4] does occur at 0 (where the function takes the value 0). As with fminsearch, to find the value of the function at this minimum, we can use the command >> [xvalue,fvalue]=fminbnd(’x^3+10*x^2’,-4,4) xvalue = -1.2543e-005 fvalue = 1.5732e-009

0.11 0.11.1

Appendix Common Numerical Operations & Constants

The following table lists the MATLAB commands for a number of commonly used functions which operate on numbers, as well as some constants. Use MATLAB’s help command for more information. Operation x+y x−y xy x/y xn √ x ln(x) log10 (x) x × 10n 0.11.2

MATLAB Code x+y x-y x*y x/y x^n sqrt(x) log(x) log10(x) xen

Operation ex |x| sin(x) cos(x) tan(x) arcsin(x) arccos(x) arctan(x) n!

MATLAB Code exp(x) abs(x) sin(x) cos(x) tan(x) asin(x) acos(x) atan(x) factorial(n)

Constant π √ −1 e ∞ −∞

MATLAB Code pi i exp(1) Inf -Inf

Vector & Matrix Operations

The following table lists the MATLAB commands for a number of commonly used operations which can be performed on vectors and/or matrices.

0.11 Appendix

Operation Entering a row vector: a = 1  3  4 4 Entering a column vector: b = 0 7 2 4 Entering a matrix: A = 0 1 Second component of vector b: Entry (1, 2) of a matrix A: Transpose: c = aT Adding vectors or matrices: b + c Dot product: b · c Cross product: b × c Norm: kak Multiplying matrices: AB Matrix determinant: det(A) Inverse: A−1 Vector dimension: Matrix dimensions: Minimum value in vector b: Maximum value in vector b: 0.11.3

MATLAB Code a=[1,3,4] b=[4;0;7] A=[2,4;0,1] b(2) A(1,2) c=a’ b+c dot(b,c) cross(b,c) norm(a) A*B det(A) inv(A) length(a) size(A) min(b) max(b)

Special Matrices

MATLAB has a number of special matrices built-in, and these can be produced with certain commands, given in the table below. Matrix n × n identity matrix m × n matrix of ones m × n matrix of zeros

MATLAB Code eye(n) ones(m,n) zeros(m,n)

1.3 Definition: Intervals

Numbers

Mathematics uses structures at a fundamental level. These basic structures include numbers, sets (e.g. intervals), shapes, vectors, complex numbers, quaternions and many more strange and exotic things. Numbers are very important in mathematics as building blocks as they lead to equations and inequalities which we can use as a tool to solve many abstract and applied problems.

1.1

Number systems R:

The following are commonly used subsets of R:

N: Z: Q: Irrational √ √ √numbers are real numbers which cannot be represented as a ratio of integers. For example, 2, 3, 5, π = 3.14159 . . ., e = 2.71828 . . . are all irrational. Note: proving rationality or irrationality of a given number can be quite subtle!

1.2

Real number line and ordering on R

The real number system can be visualised by imagining each real number as a point on a line, with the positive direction being to the right, and an arbitrary origin being chosen to represent 0 (see Figure 1). −1.5 −3

−2

e −1

Figure 1: The real number line. The real numbers are ordered, i.e. given any 2 real numbers a and b there holds precisely one of the following: a > b, a < b or a = b.√This means we can use the symbols ‘<’, ‘>’, ‘≤’ and ‘≥’ to write statements such as 1 ≤ 2 and 3 > 3. Geometrically, a < b means that a lies to the left of b on the real number line. Note also that a ≤ b means that either a < b or a = b.

1.3

Definition: Intervals

An interval is a set of real numbers that can be thought of as a segment of the real number line. For a < b, the open interval from a to b is given by (a, b) = {x ∈ R| a < x < b}. This notation is interpreted as

1.3 Definition: Intervals

Note here that the end points are not included. If we wanted to include the endpoints, we would have a closed interval from a to b, denoted

In this notation it is important to distinguish between the round brackets (a, b) for an open interval (excluding the end points) and the square brackets [a, b] for a closed interval (including the end points). We also have half-open intervals (or half-closed depending on how you feel), denoted

1.4 Absolute value

1.4

Absolute value

We define |x| = 1.4.1

x, if x ≥ 0 −x, if x < 0

Examples

|3| =

| − 4| =

|1 − π| = 1.4.2

Properties of absolute value

It is worth emphasising that in general |a + b| = 6 |a| + |b|, but the triangle inequality does hold: |a + b| ≤ |a| + |b|. For example, set a = 1 and b = −2. We have l.h.s = |1 + (−2)| = | − 1| = 1, r.h.s = |1| + | − 2| = 1 + 2 = 3.

1.4.3

Convention for

√

√ √ For a > 0, a always denotes the positive solution of x2 = a. Thus 4 = 2 and so on. This means that √ we can now solve x2 = a. For a > 0, solutions to x2 = a are x = ± a.

1.4 Absolute value

Notes

1.4 Absolute value

Notes

2.2 Definition: Function, domain, range

Functions

2.1

Notation

Let X and Y be subsets of R. The notation f : X → Y means “f maps from X into Y ”.

2.2

Definition: Function, domain, range

A function f : X → Y is a rule which assigns to every element x ∈ X exactly one element f (x) ∈ Y called the value of f at x. X is called the domain of f and f (X) = {f (x)| x ∈ X} is called the range of f . The range of f , f (X), is a subset of Y . In other words, you can take any x value from X, plug it into f , and return the value f (x), which is in Y . The range is the set of all possible values of f (x) as x varies throughout the domain. Note that f (X) is not necessarily equal to all of Y . 2.2.1

Example

The function f : R → R, such that f (x) = x2 is a function with domain R. The range is given by f (R) = {f (x) | x ∈ R} = = 2.2.2

Example

The following is a perfectly good example of a function, g : (−6, 7) → R   −5 , −6 < x < 0 −π , x = 0 g(x) =  x , 0<x<7

Piecewise defined functions are discussed in Stewart, 6ed. p. 17; 5ed. p. 18.

2.3 Graphs

Clearly the domain of g is the open interval (â&#x2C6;&#x2019;6, 7), but what about the range? Looking at the algebraic expression for g given above, the range can be expressed as

2.3

Graphs

We can represent a function by drawing its graph which is the set of all points (x, y) in a plane where y = f (x). 2.3.1

Example

Letâ&#x20AC;&#x2122;s plot the previous two functions: y

Figure 2: Graph of f (x) = x2 from Example 2.2.1

Figure 3: Graph of g(x) from Example 2.2.2.

2.5 Caution (non-functions)

2.3.2

Example

Graph the function f (x) = x2 − 4x + 3 on the axes in Figure 4.

Figure 4: f (x) = x2 − 4x + 3 = (x − 1)(x − 3)

2.4

Convention (domain)

√ √ An expression like “the function y = 1 − x2 ” means the function f with y = f (x) = 1 − x2 . When the domain is not specified it is taken to be the largest set on which the rule is defined. In this example, the domain would be [−1, 1].

2.5

Caution (non-functions)

Sometimes we see equations for curves which are not functions. Consider the equation of a circle of radius 1, centred at the origin: x2 + y 2 = 1. The crucial function property states that for each value x in the domain there must correspond exactly one value y in the range. If there are any more y values, then it is not a function.

2.6 Vertical line test

2.6

Vertical line test

In the graph of a function, the vertical line x =constant must cut the graph in at most one point. The graph for the circle x2 + y 2 = 1 is given in Figure 5. We can clearly see √ that the vertical line intersects the circle at two points. In this case the two y values are given by y = ± 1 − x2 . Therefore x2 + y 2 = 1 does not give rise to a function on any domain intersecting (−1, 1). y

Figure 5: The circle x2 + y 2 = 1 with a vertical line intersecting at two points. This shows that the equation for a circle does not give rise to a function.

2.6.1

Example

√ √ Both y = 1 − x2 (the top half of the circle) and y = − 1 − x2 (the bottom half of the circle) are functions of x. What are the domains and ranges of these two functions?

2.7 Exponential functions

2.6.2

Example

Look at the graphs in Figures 6(a) and 6(b). Do either of these graphs represent functions? y

(a)

(b)

Figure 6: Do either of these two graphs represent functions?

2.7

Exponential functions

An exponential function is one of the form f (x) = ax , where the base a is a positive constant, and x is said to be the exponent or power. One very common exponential function which we shall see often in this course is given by f (x) = ex . It has the graph shown in Figure 7. Notice that it cuts the y-axis (the line x = 0) at y = 1. This graph was made using Matlab with the command ezplot(’exp(x)’,[-1.75,1.75,-4,6]). exp(x) 6 5 4 3 2 1 0 −1 −2 −3 −4

−1.5

−1

−0.5

0 x

0.5

1.5

Figure 7: The function f (x) = ex .

2.7 Exponential functions

2.7.1

Properties of exponents

Exponential functions are very useful for modelling many natural phenomena such as population growth (base a > 1) and radioactive decay (base 0 < a < 1). 2.7.2

Example: Stewart, 5ed. p. 426

The half-life of the isotope strontium-90, 90 Sr, is 25 years. This means that half of any quantity of 90 Sr will disintegrate in 25 years. Say the initial mass of a sample is 24mg, write an expression for the mass remaining after t years.

In general, if the rate of change of a function is proportional to the value of the function, that is df (t) = kf (t), dt then f (t) = f (0)ekt , where f (0) is the initial value of the function and k is the rate of exponential growth or decay.

2.8 Trigonometric functions (sin, cos, tan)

2.8

Trigonometric functions (sin, cos, tan)

In calculus, we measure angles in radians. One radian is defined as the angle subtended at the centre of a circle of radius 1 by a segment of arc length 1.

1 rad 1

Figure 8: Angle in radians. By definition, 2π radians = 360◦ . To convert between the two measures use the following formulae: For x angle in radians, θ angle in degrees, x×

360◦ = θ, 2π

θ×

2π = x. 360◦

The point on the unit circle x2 + y 2 = 1 making an angle θ (in radians) anti-clockwise from the x-axis has coordinates (cos θ, sin θ), so cos2 θ + sin2 θ = 1. This defines the cos and sin functions. This relationship is shown in Figure 9. y

1 (cos θ, sin θ)

sin θ θ

−1

x cos θ

−1

Figure 9: The relationship between the cosine and sine functions.

2.8 Trigonometric functions (sin, cos, tan)

The graphs of the functions cos θ and sin θ are shown below in Figure 10. Note how the two functions have the same behaviour, but are shifted (i.e. phase shifted)) by π2 . This leads to the relationship π π cos θ = sin . − θ = sin θ + 2 2 cos(x), sin(x) cos(x) sin(x)

0.5

−0.5

−1 −6

−4

−2

0 x

Figure 10: The relationship between the cosine and sine functions. In both cases in Figure 10 the range is [−1, 1]. There are some unmarked vertical dotted lines. What values do they represent? (This graph was made using Matlab. To get both curves on the same axes, we used the hold on command.) The tan function is related to the sine and cosine functions by

It is defined for cos θ 6= 0. It is therefore not defined at values of θ which are odd integer multiples of 5π (eg ± π2 , ± 3π 2 , ± 2 etc).

π 2

The graph of tan θ is given in Figure 11 below. Mark the x values in the graph where the function is not defined. tan(x) 6 4 2 0 −2 −4 −6 −6

−4

−2

0 x

Figure 11: Graph of tan θ.

2.8 Trigonometric functions (sin, cos, tan)

Figure 12 shows another common function, cos x . sin x

cot x = cot (x) 6

−2

−4

−6 −6

−4

−2

Figure 12: Graph of cot θ. The periodic nature of all the trigonometric functions we have seen makes them ideal for modelling repetitive phenomena such as tides, vibrating strings, and various types of natural wave-like behaviour. Both the sine and cosine functions have period 2π. Hence sin(x + 2π) = sin x, and cos(x + 2π) = cos x. Furthermore, both sin(x) and cos(x) have an amplitude of one. We can“speed up” or“slow down” these functions by tinkering with their periodicity. For example, x compare the functions sin(x) and sin 2 in Figure 13. Compare also the functions sin(x) and sin(2x) in Figure 14. sinx, sin(x/2) sin(x) sin(x/2)

1 0.8 0.6 0.4 0.2 0 −0.2 −0.4 −0.6 −0.8 −1 −8

−6

−4

−2

0 x

Figure 13: Graph of sin(x) and sin

x 2

2.8 Trigonometric functions (sin, cos, tan)

34 sinx, sin(2 x) sin(x) sin(2x)

1 0.8 0.6 0.4 0.2 0 −0.2 −0.4 −0.6 −0.8 −1 −8

−6

−4

−2

0 x

Figure 14: Graph of sin(x) and sin(2x). In addition, we can stretch or shrink trigonometric functions by multiplying these functions by constants other than one. For example, Figure 15 shows the functions 21 cos(x) and 3 cos(x). What amplitudes do these have? 1/2cos(x), 3 cos(x) 3

1/2 cos(x) 3cos(x)

−1

−2

−3 −8

−6

−4

−2

Figure 15: Graph of

0 x

1 2

cos x and 3 cos(x).

Naturally, we can change both the period and amplitude of trigonometric functions simultaneously. Figure x 16 show the graphs of 4 cos 3 and 2 sin(5x). 4cos(x/3) and 2sin(5x)

5 4cos(x/3) 2sin(5x)

4 3 2 1 0 −1 −2 −3 −4 −5 −20

−15

−10

−5

0 x

Figure 16: Graphs of 4 cos

x 3

and 2 sin(5x).

2.8 Trigonometric functions (sin, cos, tan)

2.8.1

Example

Suppose that the temperature in Brisbane is periodic, peaking at 32 degrees Celsius on January 1, and sinking to 8 degrees Celsius in June; see Figure 17. Use the cosine function to model temperature as a function of time (in days). 35

temperature (T)

100

150

200 time, (t)

250

300

350

Figure 17: Graph of temperature fluctuations in Brisbane.

2.10 One-to-one (1-1) functions - Stewart, 6ed. pp. 385-388; 5ed. pp. 413-417

2.9

Composition of functions

Let f and g be two functions. Then the composition of f and g, denoted f ◦ g, is the function defined by

2.9.1

Example

f (x) = x2 + 1, g(x) =

1 . Their compositions f ◦ g and g ◦ f are given by x

Notice how these two composite functions are not equal. In general, f ◦ g 6= g ◦ f .

2.10

One-to-one (1-1) functions - Stewart, 6ed. pp. 385-388; 5ed. pp. 413-417

A function f : X → Y is said to be one-to-one (usually written as 1-1) if, for all x1 , x2 ∈ X,

On the graph y = f (x), 1-1 means that any horizontal line y =constant cuts through the curve in at most one place; see Figure 18.

2.10 One-to-one (1-1) functions - Stewart, 6ed. pp. 385-388; 5ed. pp. 413-417 y

37 y

(a)

(b)

Figure 18: Are either of these functions 1-1?

2.10.1

Example

Show from the definition that the function f (x) = e2xâ&#x2C6;&#x2019;5 is 1-1 on its domain.

2.11 Inverse Functions

2.11

Inverse Functions

Let f : X → Y be a 1-1 function. Then we can define the inverse function, f −1 : f (X) → X by

By definition,

Note that y = f (x) ⇔ x = f −1 (y), and that (f −1 (y))−1 = f (x). The function f must be 1-1 in order that f −1 satisfy the crucial function property. In order to obtain the graph of f −1 (x), we reflect the graph of f (x) about the line y = x; for example, see Figure 19 below. y

y=f

y=x

−1

(x)

y=f(x) x

Figure 19: The dashed line represents the graph of the inverse function, having beenreflected about the line y = x.

2.11 Inverse Functions

2.11.1

Example

Draw the graph of the inverse function of f (x) = x3 (the cube root function f −1 (x) = x1/3 ) on the same axes given below in Figure 20. x3 2 1.5 1 0.5 0 −0.5 −1 −1.5 −2 −2

−1.5

−1

−0.5

0 x

0.5

1.5

Figure 20: f (x) = x3 . Draw in the graph of f −1 (x) = x1/3 .

2.11.2

Example

f : R → R, f (x) = x2 is not 1-1 and therefore has no inverse. However, the positive half (x ≥ 0) gives a 1-1 function f : [0, ∞) → R, f (x) = x2 with range [0, ∞). The inverse of this function is then √ f −1 : [0, ∞) → [0, ∞), f −1 (x) = x. Similarly the negative half of the function f (x) = x2 is 1-1, with √ inverse f −1 : [0, ∞) → (−∞, 0], f −1 (x) = − x. This technique is often used when the function is not 1-1 over its entire domain: just take a part where it is 1-1 and determine the inverse for that part. For further discussion on how to find the inverse function of a 1-1 function, see Stewart, 6ed. p. 388 box 5; 5ed. p. 416 box 5.

2.13 Natural logarithm

2.12

Logarithms - Stewart, 6ed. p. 405; 5ed. p. 434

Logarithms are the inverse functions of the exponential functions. x

a 10

8 7 6 5

4 3 2 1 0 −1 −4

−3

−2

−1

0 x

Figure 21: Three plots of y = ax with a = 2, 3, 4. From the graph of y = ax (a 6= 1 a positive constant), we see that it is 1-1 and thus has an inverse, denoted loga x. From this definition we have the following facts: loga (ax ) = x

∀x ∈ R,

aloga x = x

∀x > 0.

What is the domain and range of f (x) = loga (x)?

2.13

Natural logarithm

Now we set a = e (Euler’s number = 2.71828. . .). The inverse function of f (x) = ex is loge (x) ≡ ln x.

2.13 Natural logarithm

6 exp(x) 5

3 log(x) 2

−1

−2

−3 −3

−2

−1

Figure 22: Graphs of f (x) = ex and g(x) = ln x.

2.13.1

Properties

Using exponent laws, together with the fact that ln x is the inverse function of ex , we can prove the following.

2.13 Natural logarithm

2.13.2

Example: Bacteria population - Stewart, 6ed. p. 440 qn 49; 5ed. p. 440 qn 49

If a bacteria population starts with 100 bacteria and doubles every 3 hours, then the number of bacteria n after t hours is given by the formula n = f (t) = 100 路 2t/3 . (a) Find the inverse of this function and explain its meaning. (b) When will the population reach 50000?

2.13 Natural logarithm

2.13.3

Example

Suppose that there are 1000 rabbits on January 1. Thirty days later there are 1500 rabbits. Assuming that the population grows exponentially, find an expression for the rabbit population at time t.

2.14 Inverse trigonometric functions

2.14

Inverse trigonometric functions

The function y = sin x is 1-1 if we just define it over the interval [−π/2, π/2]; see Figure 10. The inverse function for this part of sin x is denoted arcsin x. Thus arcsin x is defined on the interval [−1, 1] and takes values in the range [−π/2, π/2]. The graph can easily be obtained by reflecting the graph of sin x about the line y = x over the appropriate interval; see Figure 23. sin x, arcsin x 1.5

0.5

−0.5

sin x −1

arcsin x −1.5 −1.5

−1

−0.5

0.5

1.5

Figure 23: f (x) = sin x defined on [−π/2, π/2] reflected about the line y = x to give f −1 (x) = arcsin x. Similarly y = cos x is 1-1 on the interval [0, π] and its inverse function is denoted arccos x. The function arccos x is defined on [−1, 1] and takes values in the range [0, π]. Figure 24 below shows the graphs of f (x) = cos x before and after reflection about the line y = x. This gives the graph of f −1 (x) = arccos x. cos x, acos x 3 2.5

arccos x 2 1.5 1 0.5 0

cos x

−0.5 −1 −1

−0.5

0.5

1 x

1.5

2.5

Figure 24: f (x) = cos x defined on [0, π] reflected about the line y = x to give f −1 (x) = arccos x (written in MATLAB as acos(x)).

2.14 Inverse trigonometric functions

Also, tan x is 1-1 on the open interval (−π/2, π/2) with inverse function denoted by arctan x. Hence arctan has the domain (−∞, ∞) with values in the range (−π/2, π/2). tan x, atan(x) 10

tan x

8 6 4 2 0

arctan x

−2 −4 −6 −8 −10 −10

−5

Figure 25: f (x) = tan x.

2.14 Inverse trigonometric functions

Notes

2.14 Inverse trigonometric functions

Notes

2.14 Inverse trigonometric functions

Notes

2.14 Inverse trigonometric functions

Notes

3.1 Definition: Limit - Stewart, 6ed. p. 66; 5ed. p. 71

Limits

Limits arise when we want to find the tangent to a curve or the velocity of an object, for example. Once we understand limits, we can proceed to studying continuity and calculus in general. You should be aware that limits are a fundamental notion to calculus, so it is important to understand them clearly.

3.1

Definition: Limit - Stewart, 6ed. p. 66; 5ed. p. 71

Let f (x) be a function and ℓ ∈ R. We say f (x) approaches the limit ℓ (or converges to the limit ℓ) as x approaches a if we can make the value of f (x) arbitrarily close to ℓ (as close to ℓ as we like) by taking x to be sufficiently close to a (on either side of a) but not equal to a. We write lim f (x) = ℓ.

x→a

Roughly speaking, f (x) is close to ℓ for all x values sufficiently close to a, with x 6= a. The limit “predicts” what should happen at x = a by looking at x values close to but not equal to a. 3.1.1

Precise Definition

Let f be a function defined on some open interval that contains the number a, except possibly a itself. Then we write lim f (x) = ℓ x→a

if for every number ε > 0 there is a number δ > 0 such that |f (x) − ℓ| < ε whenever 0 < |x − a| < δ. Detailed discussion on the precise definition of a limit can be found in Stewart, 6ed. p. 88; 5ed. p. 93. We can sometimes determine the limit of a function by looking at its graph. Figure 26 shows two functions with the same limit at 5.

3.1 Definition: Limit - Stewart, 6ed. p. 66; 5ed. p. 71

Function 1

Function 2

Figure 26: Two functions with limit equal to 10 as x → 5. Function 1 is described by

2x −2x + 20

for 0 ≤ x ≤ 5, for 5 < x ≤ 10,

  2x y= 2  −2x + 20

for 0 ≤ x < 5, for x = 5, for 5 < x ≤ 10.

y= while function 2 is described by

Each function has a limit of 10 as x approaches 5. It does not matter that the value of the second function is 2 when x equals 5 since, when dealing with limits, we are only interested in the behaviour of the function as x approaches 5.

3.1 Definition: Limit - Stewart, 6ed. p. 66; 5ed. p. 71

3.1.2

Properties

Suppose that c is a constant and the limits ℓ = lim f (x) and m = lim g(x) exist for some fixed a ∈ R. x→a x→a Then

3.1.3

Example

Find the value of lim (x2 + 1). x→1

3.1 Definition: Limit - Stewart, 6ed. p. 66; 5ed. p. 71

3.1.4

Example: Stewart, 6ed. p. 67 ex. 1; 5ed. p. 71 ex. 1

Determine the value of lim

x→1

x−1 . x2 − 1

3.2 One-sided limits

3.1.5

Definition: Infinite Limits

Let f be a function defined on both sides of a, except possibly at a itself. Then lim f (x) = ∞

x→a

means that the values of f (x) can be made arbitrarily large by taking x sufficiently close to a, but not equal to a. Similarly, lim f (x) = −∞ x→a

means that the values of f (x) can be made arbitrarily large negatively by taking x sufficiently close to a, but not equal to a. In these cases, we say that f (x) diverges to ±∞. Note that the limit properties in section 3.1.2 do not necessarily apply if the limits diverge.

3.2

One-sided limits

Consider the piecewise function f (x) =

1, x ≥ 0 −2, x < 0.

This function has the graph depicted in Figure 27. y

Figure 27: The limit of this function as x → 0 does not exist. Notice that

lim

x→0,x>0

f (x) = 1, but

lim

x→0,x<0

f (x) = −2. Therefore, the limit as x → 0 does not exist. We

can, however, talk about the one-sided limits.

In the above example, we say that the limit as x → 0 from above (or from the right) equals 1 and we write lim f (x) = 1. x→0+

3.2 One-sided limits

Similarly, we say that the limit as x → 0 from below (or from the left) equals −2 and we write lim f (x) = −2.

x→0−

In general, for lim f (x) = ℓ, just consider x with x > a and similarly for lim f (x) = ℓ, consider only x→a+ x→a− x < a. 3.2.1

Example

Consider both lim

x→2+

3.2.2

√ √ √ x − 2 and lim x − 2. Based on your conclusion, does the lim x − 2 exist? x→2−

Theorem

lim f (x) = ℓ if and only if

x→a

3.2.3

Example

Find lim f (x) where f (x) = x→1

x2 , x ≥ 1, 2 − x, x < 1.

x→2

3.3 Theorem: Squeeze principle

3.3

Theorem: Squeeze principle

Suppose lim g(x) = ℓ = lim h(x)

x→a

and, for x close to a (x 6= a)

x→a

h(x) ≤ f (x) ≤ g(x).

Then

See the graph in Figure 28. y

g(x)

f(x) l

h(x) a

Figure 28: The squeeze principle

3.4 Limits as x approaches infinity

3.3.1

Example

Prove that lim x2 sin x→0

3.3.2

1 = 0. x

Example: Stewart, 6ed. p. 69

π Investigate lim sin . x→0 x

3.4

Limits as x approaches infinity

We say that f (x) approaches ℓ as x → ∞ if f (x) is arbitrarily close to ℓ for all large enough values of x. Formally, we write:

That is, f (x) can be made arbitrarily close to ℓ by taking x sufficiently large. Similarly, we write

if f (x) approaches ℓ as x becomes more and more negative.

3.4 Limits as x approaches infinity

Note: Since lim

x→0+

1 1 = ∞ and lim = −∞, − x x→0 x

1 then lim doesn’t exist. Furthermore, it is irrelevant that f (x) is not actually defined at the point x→0 x x = 0. 1/x 10 8 6 4 2 0 −2 −4 −6 −8 −10 −10

−5

0 x

Figure 29: Graph of f (x) = 1/x.

3.4.1

Example

Determine lim sin x, or show that it does not exist. x→∞

3.4.2

Example: ratio of two polynomials, numerator degree = denominator degree 2x2 + 3 . x→∞ 3x2 + x

Find lim

3.4 Limits as x approaches infinity

3.4.3

Example 8x3 + 6x2 . x→∞ 4x3 − x + 12

Find lim

3.5 Some important limits

3.4.4

Example: ratio of two polynomials, numerator degree > denominator degree x2 + 5 . x→∞ x + 1

Find lim

3.4.5

Example: ratio of two polynomials, numerator degree < denominator degree x+1 x→∞ x2 + 1

Find lim

The general method in the four previous examples has been to: 1. divide the numerator and denominator by the highest power of x in the denominator; and then 2. determine the limits of the numerator and denominator separately.

3.5

Some important limits

The following limits are fundamental. Combined with the properties given in 3.1.2 and the Squeeze Principle given in 3.3, these will enable you to compute a range of other limits. Here a ∈ R is arbitrary. (1) (4)

lim 1 = 1

(2)

1 − cos(x) 1 = 2 x→0 x 2

(5)

x→a

lim

lim x = a

x→a

ex − 1 = 1. x→0 x lim

(3)

sin x =1 x→0 x lim

3.5 Some important limits

Notes

3.5 Some important limits

Notes

3.5 Some important limits

Notes

4.2 Representations

Sequences

A sequence {an } is a list of numbers with a definite order: a0 , a 1 , a 2 , a 3 , . . . , an , . . . In fact, sequences can be finite or infinite. Further, a sequence does not have to consist of numbers, but might contain any sort of mathematical structure, or indeed any other objects.

4.1

Formal Definition: Sequence

More formally, a sequence is a function, with domain being N0 = {0, 1, 2, 3, . . . }. We can also take the domain as N = {1, 2, 3, ...} and start the sequence at a1 rather than a0 . 4.1.1

Motivation

Infinite sequences of numbers are useful in many applications. The sequence might represent approximations to the solution of a problem such as the bisection method. Or the sequence might represent a time series, where the numbers represent the population each year. Mathematically, we are interested in how the sequence behaves as the number of terms becomes large: does it converge to a solution, or how fast does the population grow? Sequences can also appear from the partial sums in an infinite series, which comprises our next major topic.

4.2

Representations

There are two main ways to represent the nth term in a sequence. Firstly, there is a direct (or closed form or functional) representation. Secondly, there is a recursive (or indirect) representation.

4.3 Application: Rabbit population - Stewart, 6ed. p. 722 ex. 71

The recursive form can have a simpler structure. It might be that an = f (an−1 , an−2 ) or even an = f (an−1 ). Depending on the problem, it can be natural to express the formula in one or the other way. 4.2.1

Example

List the first terms of the sequence an =

4.2.2

n . n+1

Example

List the first terms of the recursive sequence an+1 =

1 with a1 = 2. 3 − an

Note that for the recursive form, you must express one or more initial conditions or starting values. The recursive form is actually a difference equation, and there are similarities with differential equations that are studied in MATH1052. 4.2.3

Example: Fibonacci sequence

The Fibonacci sequence is defined through

an = f (an−1 , an−2 ) = an−1 + an−2 with a0 = 0, a1 = 1. List the first 9 terms.

4.3

Application: Rabbit population - Stewart, 6ed. p. 722 ex. 71

Suppose that rabbits live forever and that every month each pair produces a new pair which becomes productive at age 2 months. Starting with one newborn pair, how many pairs of rabbits will be there in the nth month?

4.3 Application: Rabbit population - Stewart, 6ed. p. 722 ex. 71

4.4 Limits

4.4

Limits

Let {an }∞ n=0 be a sequence. Then lim an = ℓ

n→∞

ℓ ∈ R,

means :

an approaches ℓ as n gets larger and larger, ie. an is close to ℓ for n sufficiently large. This is similar to lim f (x) = ℓ, except here an is defined only for natural numbers n. x→∞

4.4.1

Convention

∞ If a sequence {an }∞ n=0 has limit ℓ ∈ R, we say that an converges to ℓ and that the sequence {an }n=0 is convergent. Otherwise the sequence is divergent.

4.4.2

Examples

Determine if the sequences {an }, with an as given below, are convergent and if so find their limits. 1. an =

1 n

2. an =

1 an−1 , a0 = −1. 2

4.5 Theorem: Limit laws

3. an = (â&#x2C6;&#x2019;1)n

4. an = r n , for r =

4.5

1 , r = 1, r = 2 2

Theorem: Limit laws

The following limit laws apply provided that the separate limits exist (that is {an } and {bn } are convergent):

4.7 Useful sequences to remember

4.6

Theorem: Squeeze

If an ≤ bn ≤ cn for n ≥ n0 for some n0 ∈ N and lim an = lim cn = ℓ, then n→∞

4.6.1

Example

Use the squeeze theorem on {an }, where an =

4.7

1 sin(n). n

Useful sequences to remember n

(1) For constant c, lim c = n→∞

{cn }∞ n=0

0, 1,

if |c| < 1 if c = 1.

is divergent if c = −1 or |c| > 1. √ (2) For constant c > 0, lim c1/n = lim n c = 1. Sequence

n→∞

(3) Even lim n1/n = lim n→∞

n→∞

√ n

n→∞

n = 1.

1 = 0. n→∞ n!

(4) lim

cn = 0. n→∞ n!

(5) For constant c, lim

1 n = e. n→∞ n a n (7) lim 1 + = ea . n→∞ n (6) lim

n→∞

4.8 Application: Logistic sequence

4.8

Application: Logistic sequence

Sequences which are bounded, but not convergent can be very interesting. The study of chaos appears in these sequences. A famous recursive sequence is the logistic sequence from population dynamics:

For certain values of the coefficient k, the behaviour is regular: convergent, or periodic. But other values of k yield chaotic sequences which cannot be predicted. You will explore the logistic equation in the Matlab Tutorials. Figure 30 has some examples of the logistic sequence when k = 2.9, 3.5 and 3.9.

4.8 Application: Logistic sequence

Figure 30: Logistic sequence, a0 = 21 , k = 2.9, 3.5, 3.9

4.8 Application: Logistic sequence

Notes

4.8 Application: Logistic sequence

Notes

5.1 Definition: Continuity

Continuity

5.1

Definition: Continuity

We say that a function f is continuous at a if

If f is not continuous at a, we say that f is discontinuous at a, or f has a discontinuity at a.

y y = f(x) f(x) approaches f(a)

f(a)

a as x approaches

x a

Figure 31: Graphical representation of continuity at x = a.

A function may not be continuous at x = a for a number of reasons. 5.1.1

Example

Let f (x) =

x, x 6= 0 1, x = 0.

Then f (x) has a discontinuity at x = 0. This is because f (0) = 1 while lim f (x) = 0. Hence Condition xâ&#x2020;&#x2019;0

(iii) of Definition 5.1 doesnâ&#x20AC;&#x2122;t hold. See Figure 32.

5.1 Definition: Continuity

75 y

Figure 32: An example of a discontinuous function with discontinuity at x = 0.

5.1.2

Example

The function f (x) =

1 (see Figure 33) is not continuous at x = 0, since x2

1/(x2) 5

f(x)

0 −2

−1

0 x

Figure 33: The function f (x) = 1/x2 has a discontinuity at x = 0.

5.1.3

Example

f (x) = is not continuous at x = 0 since

x + 1, x ≥ 0 x2 , x < 0

5.2 Definition: Continuity on intervals

5.2

Definition: Continuity on intervals

If f is continuous on the open interval (a, b), then

If f is continuous on the closed interval [a, b], then f is continuous on (a, b) and

You could think of a continuous function being one that on an interval can be drawn without lifting your pen. 5.2.1

Examples

• Any polynomial in x is continuous on R. For example, f (x) = ax2 + bx + c is continuous on R. • ex , |x|, sin x, cos x and arctan x are continuous on R. • f (x) = ln x is continuous on (0, ∞).

5.3 Properties

5.3

Properties

If f (x) and g(x) are continuous at x = a and c is a constant, then

5.3.1

Theorem: Limit of a composite function - Stewart, 6ed. p. 103 Thm 8; 5ed. p. 107 Thm 8

If f is continuous at b and lim g(x) = b, then lim f (g(x)) = f (b). In other words, x→a

x→a

Similarly if lim g(x) = b and f is continuous at x = b, then x→∞

lim f (g(x)) = f (b).

x→∞

5.3.2

Corollary: Stewart, 6ed. p. 103 Thm 9; 5ed. p. 108 Thm 9

If g is continuous at a and f is continuous at g(a) then f ◦ g is continuous at a.

5.5 Application of IVT (bisection method)

5.4

Intermediate Value Theorem (IVT) - Stewart, 6ed. p. 104

Suppose that f is continuous on the closed interval [a, b] and let N be any number between f (a) and f (b), where f (a) 6= f (b). Then there exists a number c ∈ (a, b) such that f (c) = N . y

f(b)

N f(a) x a

Figure 34: The IVT states that a continuous function takes on every intermediate value between the function values f (a) and f (b). That is, we can choose any N between f (a) and f (b) which corresponds to a value c between a and b. Note that this is not necessarily true if f is discontinuous.

5.4.1

Example

Suppose that a function f is continuous everywhere and that f (−2) = 3, f (−1) = −1, f (0) = −4, f (1) = 1, and f (2) = 5. Does the Intermediate-Value Theorem guarantee that f has a root on the following intervals? a) [−2, −1]

5.5

b) [−1, 0]

c) [−1, 1]

d) [0, 2]

e) [1, 3]

Application of IVT (bisection method)

The bisection method is a procedure for approximating the zeros of a continuous function. It first cuts the interval [a, b] in half (say, at a point c), and then decides in which of the smaller intervals ([a, c] or [c, b]) the zero lies. This process is repeated until the interval is small enough to give a significant approximation to the zero itself. We can present this bisection method as an algorithm: (1) Given [a, b] such that f (a)f (b) < 0, let c = (2) If f (c) = 0 then quit; c is a zero of f .

a+b . 2

5.5 Application of IVT (bisection method)

(3) If f (c) 6= 0 then: (a) If f (a)f (c) < 0, a zero lies in the interval [a, c). So replace b by c. (b) If f (a)f (c) > 0, replace a by c. (4) If the interval [a, b] is small enough to give a precise enough approximation then quit. Otherwise, go to step (1). y

f(b)

a b

f(a)

Figure 35: The bisection method is an application of the IVT. Note that: f (a)f (c) < 0 f (a)f (c) > 0

⇐⇒

f (a) < 0 and f (c) > 0

f (a) > 0 and f (c) < 0

⇐⇒

f (a) > 0 and f (c) > 0

f (a) < 0 and f (c) < 0

We will investigate the bisection method in more detail in the lab sessions with MATLAB.

5.5 Application of IVT (bisection method)

Notes

5.5 Application of IVT (bisection method)

Notes

6.1 Tangents

Derivatives

Finding the instantaneous velocity of a moving object and other problems involving rates of change are situations where derivatives can be used as a powerful tool. All rates of change can be interpreted as slopes of appropriate tangents. Therefore we shall consider the tangent problem and how it leads to a precise definition of the derivative.

6.1

Tangents y y=f(x)

B=(b, f(b))

f(b) − f(a)

A=(a, f(a))

h x a

b=a+h

Figure 36: We want to determine the tangent line at the point A. Consider the graph in Figure 36. We want to determine the tangent line at the point A (x = a) on the graph of y = f (x), where f (x) is ‘nice enough’. We approximate the tangent at A by the chord AB where B is a point on the curve (close to A) with x-value a + h, where h is small. By looking at the graph, we can see that the slope m of the chord AB is given by

As the point B gets closer to A, m will become closer to the slope at the point A. To obtain this value, we take the limit as h → 0:

6.2 Definition: Derivative, differentiability

6.2

Definition: Derivative, differentiability

The derivative of f at x is defined by

We say that f is differentiable at some point x if this limit exists. Further, we say that f is differentiable on an open interval if it is differentiable at every point in the interval. Note that f ′ (a) is the slope of the tangent line to the graph of y = f (x) at x = a. We have thus defined a new function f ′ , called the derivative of f . Sometimes we use the Leibniz notation dy df or in place of f ′ (x). dx dx Note that if f is differentiable at a, there holds: f ′ (a) = lim

x→a

6.2.1

f (x) − f (a) . x−a

Example

Using the definition of the derivative (i.e., calculating ‘by first principles’), find the derivative of f (x) = x2 + x.

6.2 Definition: Derivative, differentiability

6.2.2

Example

Find the derivative from first principles of f (x) = ex .

6.2.3

Some useful derivatives

Other useful derivatives are

6.4 Higher derivatives

6.3

Theorem: Differentiability implies continuity

If f is differentiable at a, then

The converse, however, is false. That is, if f is continuous at a, then it is not necessarily differentiable. (See the following example.) 6.3.1

6.4

Example

Higher derivatives

Differentiating a function y = f (x) n times, if possible, gives the nth derivative of f , usually denoted dn y dn f or . f (n) (x), dxn dxn Therefore acceleration can be represented as

Sometimes when we deal with time derivatives we write v(t) = s(t) Ë&#x2122; or a(t) = sÂ¨(t) and so on.

6.5 Rates of change - Stewart, 6ed. p. 117

6.5

Rates of change - Stewart, 6ed. p. 117

Recall that the slope of the tangent line to y = f (x) at x = a measures the rate of change of y with respect to x at the point x = a. 6.5.1

Example: Velocity

Velocity v is the rate at which displacement s changes with respect to time t, so

Similarly acceleration a is the rate at which velocity v changes with respect to time t, so

6.5.2

Example (blood flow in an artery) - Stewart, 6ed. p. 176 ex. 7

Blood flowing through a blood vessel can be described by the law of laminar flow. We model the artery or vein as a cylinder. The closer the blood is to the artery wall, the slower it will flow due to friction. Conversely, the velocity of the blood is greatest at the centre of the blood vessel (cylinder). In mathematical terms, this law can be stated v=

P (R2 − r 2 ), 4ηl

where η is the viscosity of the blood, P is the pressure difference between the ends of the artery, l is the length of the blood vessel, R is the radius of the vessel and r is the distance from the centre of the vessel (the only variable in this case); see Figure 37. We now seek to find a general expression for the velocity gradient of the blood, that is, an expression for the rate of change of velocity with respect to r:

Figure 37: The closer the blood is to the vein wall, the smaller its velocity.

6.6 Rules for differentiation

This tells us at what rate the velocity of the blood decreases the further it is from the centre of the blood vessel. These types of problems are studied in later courses on fluid mechanics and other related topics.

6.6

Rules for differentiation

Suppose that f (x) and g(x) are differentiable functions, and let c be a constant. Then we have the following rules:

In the Leibniz notation, if f (x) and g(x) are both differentiable functions, and c is a constant, then the rules for differentiation can be expressed as:

6.6 Rules for differentiation

6.7 The chain rule - Stewart, 6ed. p. 156

6.6.1

Example

Use the rules above to find the derivative of f (x) = tan x =

6.7

sin x . cos x

The chain rule - Stewart, 6ed. p. 156

If g and h are both differentiable and f = g ◦ h is the composite function defined by f (x) = g(h(x)), then f is differentiable and f ′ is given by the product

In the Leibniz notation, if y = g(u) and u = h(x) are both differentiable functions, then

6.7.1

Example

Differentiate y =

√

sin x.

6.7 The chain rule - Stewart, 6ed. p. 156

6.7.2

Example: Rocket tracking application

A rocket is launched vertically and tracked by an observation station 5km from the launch pad. The angle of elevation θ is observed to be increasing at 3◦ /s when θ = 60◦ . What is the velocity of the rocket at that instant?

6.8 Derivative of inverse function

6.8

Derivative of inverse function

Suppose y = f −1 (x), where f −1 is the inverse of f . To obtain

dy we use dx

x = f (f −1 (x)) = f (y). Differentiating both sides with respect to x using the chain rule gives

6.8 Derivative of inverse function

6.8.1

Example

Find the derivative of y = ln x.

6.8 Derivative of inverse function

6.8.2

Example

Find the derivative of y = arcsin x

6.8.3

Other useful derivatives

6.9 L’Hˆ opital’s rule - Stewart, 6ed. p. 472; 5ed. p. 494

6.9

L’Hˆ opital’s rule - Stewart, 6ed. p. 472; 5ed. p. 494

Suppose that f and g are differentiable and g ′ (x) 6= 0 near a (except possibly at a). Suppose that lim f (x) = 0 and lim g(x) = 0

x→a

or lim f (x) = ±∞ and lim g(x) = ±∞.

x→a

Then

if the limit on the right exists. 6.9.1

Example

Find lim

x→1

ln x . x−1

x→a

6.10 Theorem: Continuous extension

6.9.2

Example

Find lim x ln x. x→0+

6.10

Theorem: Continuous extension

Sometimes L’Hˆ opital’s rule can be used to evaluate limits of sequences. Let f be a function on the real numbers such that lim f (x) exists. Let f (n) = an for natural numbers n. Then: x→∞

6.10.1

Example

Evaluate lim

n→∞

log n n .

6.11 The Mean Value Theorem (MVT)

6.11

The Mean Value Theorem (MVT)

Let f be continuous on [a, b] and differentiable on (a, b). Then f (b) − f (a) = f ′ (c) b−a for some c, where a < c < b. Note f ′ (c) is the slope of y = f (x) at x = c and to B = (b, f (b)). (Stewart, 6ed. p. 216)

f (b)−f (a) b−a

is the slope of the chord joining A = (a, f (a))

6.11 The Mean Value Theorem (MVT)

6.11.1

Example

Without actually finding them, show that the equation x4 + 4x + 1 = 0 has exactly two real solutions for x.

6.13 Increasing/Decreasing test - Stewart, 6ed. p. 221; 5ed. p. 240

6.12

Definition: Increasing/Decreasing

A function f is called strictly increasing on an interval I if

while f is called strictly decreasing on I if

This leads to the following test.

6.13

Increasing/Decreasing test - Stewart, 6ed. p. 221; 5ed. p. 240

Suppose that f is continuous on [a, b] and differentiable on (a, b).

6.14 Definition: Local maxima and minima

6.13.1

Example

Test to see on what intervals the function f (x) = x3 + x is increasing or decreasing.

6.14

Definition: Local maxima and minima

A function f has a local maximum at a if

Similarly, f has a local minimum at b if

6.14.1

Definition (critical point)

A function f is said to have a critical point at x = a, a â&#x2C6;&#x2C6; dom(f ) if

100

6.14 Definition: Local maxima and minima

6.14.2

101

Example

Figure 38: What do the points x = p, q, r, s, t, u represent? Consider the function in Figure 38. The domain of this function is the interval [p, u]. Which points represent the critical points? Which points represent the global maximum and minimum? Which points represent local maxima and minima?

6.14.3

Definition: Global maximum and minimum

Let f be a function defined on the interval [a, b], and c â&#x2C6;&#x2C6; [a, b]. Then f has a global maximum at c if

Similarly f has a global minimum at c if

6.15 First derivative test

6.14.4

Theorem

If f has a local maximum/minimum at x = c and f ′ (c) exists then f ′ (c) = 0. Proof (for max):

6.15

First derivative test

Let f be continuous on [a, b] and differentiable on (a, b), and let c ∈ (a, b). (a) If f ′ (x) > 0 for a < x < c and f ′ (x) < 0 for c < x < b

(b) If f ′ (x) < 0 for a < x < c and f ′ (x) > 0 for c < x < b

Proof:

102

6.16 Second derivative test

6.16

103

Second derivative test

Suppose f ′′ exists and is continuous near c. (a) If f ′ (c) = 0 and f ′′ (c) > 0, then

(b) If f ′ (c) = 0 and f ′′ (c) < 0, then

(c) If f ′ (c) = 0 and f ′′ (c) = 0, then the test fails and we get no useful information. In this case the first derivative test should be used. 6.16.1

Example

Find local minima and maxima of f (x) = 2x3 + 3x2 − 12x + 4 on R.

6.16 Second derivative test

6.16.2

104

Example

Find local extrema of f (x) = x4 .

6.16.3

Example: Beehive cell - Stewart, 6ed. p. 264 qn 43

In a beehive, each cell is a regular hexagonal prism, open at one end with a trihedral angle at the other end (see Figure 39). It is believed that bees form their cells in such a way as to minimise the surface area for a given volume, thus using the least amount of wax in cell construction. Examination of these cells has shown that the measure of the apex angle θ is amazingly consistent. Based on the geometry of the cell, it can be shown that the surface area S is given by S(θ) = 6sh +

3s2 √ ( 3 − cos θ) 2 sin θ

where s, the length of the sides of the hexagon, and h, the height, are constants.

6.16 Second derivative test

105

Figure 39: The beehive cell. The open front is pictured on the bottom and the closed end with the trihedral angle on top.

(a) Calculate

dS dÎ¸

(b) What angle should the bees prefer? Actual measurements of the angle Î¸ in beehives have been made, and the measures of these angles seldom differ from the calculated value by more than 2â&#x2014;Ś .

6.17 Extreme value theorem - Stewart, 6ed. p. 206

6.17

Extreme value theorem - Stewart, 6ed. p. 206

If f is continuous on a closed interval [a, b], then

106

6.17 Extreme value theorem - Stewart, 6ed. p. 206

107

6.17 Extreme value theorem - Stewart, 6ed. p. 206

6.17.1

Example: Stewart, 6ed. p. 209 ex. 8; 5ed. p. 228 ex. 8

Find the global maximum and minimum of the function f (x) = x3 â&#x2C6;&#x2019; 3x2 + 1 on the interval [â&#x2C6;&#x2019; 12 , 4].

108

6.18 Optimization problems

6.18 6.18.1

109

Optimization problems Example: Optimal rowing

A rowing team launches a boat from point A on a bank of a straight river 3km wide, and wants to reach point B, 8km downstream on the opposite bank as quickly as possible; see Figure 40. They could row their boat directly across the river to a point C and then run to B, or they could row directly to B, or they could row to some point D between C and B and then run to B. If they can row 6km/h and run 8km/h, where should they land to reach B as soon as possible? (Assume that the speed of the water is negligible compared to the speed at which they row.) 8km C

3km

Figure 40: How should the team travel from A to B in the shortest possible time?

6.18 Optimization problems

110

6.18 Optimization problems

6.18.2

111

Example: Optimal farming

Some farmers have a certain amount of fencing to build a rectangular enclosure which is to be subdivided into 3 parts by 2 fences parallel to one of the sides. How should they do it so the area enclosed is maximised?

6.18 Optimization problems

6.18.3

112

Example: Newton’s method

Newton’s method is a numerical procedure to approximate the roots of functions. Most programs which find roots of functions are based on this method. For example, consider the equation 48x(1 + x)60 − (1 + x)60 + 1 = 0. How would you find the roots of this equation? It is impossible to solve analytically (ie, precisely), but Newton’s method gives us a way of approximating the roots to some degree of accuracy. The idea behind the geometry of this method is given in Figure 41. We must have a starting guess (the value x1 ), then we find the tangent line to the curve at that point, and find where the tangent line cuts the x-axis. This gives us the next point. We repeat the process until we have a close enough estimate.

Figure 41: Starting at x1 , this gives you an idea of how Newton’s method iterates to estimate the root p. Mathematically, the slope of the tangent line to the curve at the point y = f (x1 ) is given by f ′ (x1 ) =

f (x1 ) − 0 x1 − x2

⇒

Repeating this procedure leads to the iterative formula

x2 = x1 −

f (x1 ) . f ′ (x1 )

6.18 Optimization problems

113

If the numbers xn become closer and closer to p as n becomes large, we say the sequence converges to p. This method is very sensitive to the type of curve and the choice of starting point.

6.18 Optimization problems

Notes

114

6.18 Optimization problems

Notes

115

6.18 Optimization problems

Notes

116

7.2 Motivation

117

Series

7.1

Infinite sums (notation)

Recall that if we have an infinite sum we write a0 + a1 + a2 + . . . + an + . . . =

∞ X

an .

n=0

Note that the lower bound (n = 0) of the sum may vary.

7.2

Motivation

Series come from many fields. 1. Approximation to problem solutions: • a0 zeroth order approximation

• a0 + a1 (a1 small) first order approximation

• a0 + a1 + a2 (a2 very small) second order approximation

• a0 + a1 + a2 + . . . + an nth order

• a0 + a1 + a2 + . . . + an + . . . exact solution, provided the series converges 2. Current state of a process over infinite time horizon 3. Approximating functions via Taylor/Fourier Series, e.g., 1 1 ex = 1 + x + x2 + x3 + . . . , 2 3! f (t) = c0 + c1 sin t + c2 sin(2t) + c3 sin(3t) + . . . 4. Measure the volume in a fractal process, like the amount of material in a botanic structure (Figure 42).

Figure 42: This is a graphical representation of a botanic fractal. 5. Riemann sums

7.3 Definition: Convergence - Stewart, 6ed. p. 724; 5ed. p. 750

7.3

118

Definition: Convergence - Stewart, 6ed. p. 724; 5ed. p. 750

Given a series

∞ X

an = a0 + a1 + a2 + . . ., let sn denote its nth partial sum:

n=0

If the sequence {sn } is convergent (i.e.

lim sn = s with s ∈ R), then the series

n→∞

convergent and we write lim sn =

n→∞

∞ X

an is said to be

n=0

an = s.

n=0

The number s is called the sum of the series. Otherwise the series is said to be divergent. 7.3.1

Example

Does the series

∞ X

(−1)n converge or diverge?

n=0

7.4 Theorem: p test

7.3.2

Example

Show that the series

7.4

119

∞ X

1 is convergent. (n + 1)(n + 2) n=0

Theorem: p test

For p ∈ R, the p-series

∞ X 1 is np

n=1

Note the above sum is from n = 1. This is just a matter of taste since ∞ ∞ X X 1 1 = p n (n + 1)p

n=1

n=0

7.6 Test for divergence (nth term test)

7.5

Theorem: nth term test - Stewart, 6ed. p. 728; 5ed. p. 754

7.6

Test for divergence (nth term test)

If lim an 6= 0 then the series is divergent. n→∞

7.6.1

Examples

Verify that these series diverge by the nth term test. (i)

∞ X

(−1)n

n=0

(ii)

∞ X

n=1

n2 5n2 + 4

120

7.6 Test for divergence (nth term test)

121

7.8 Example: Geometric series

7.6.2

122

Caution

If lim an = 0 then we cannot say whether it is convergent or divergent. n→∞

For example,

1 =0 n→∞ n lim

but the harmonic series

∞ X 1 1 1 1 = 1 + + + + ... n 2 3 4

n=1

diverges (p-series with p = 1).

7.7

Proof that the Harmonic Series Diverges

Write the partial sums s1 = 1 1 2 1 1 = 1+ + 2 3 1 1 1 = 1+ + + 2 3 4 1 1 1 ≥ 1+ + + 2 |4 {z 4} 1 1 = 1+ + 2 2

s2 = 1 + s3 s4

1 1 1 1 1 1 1 + + + = 1+ + + + 2 3 4 5 6 7 8 1 1 1 1 1 1 1 ≥ 1+ + + + + + + 2 4 4 8 8 8 8 1 1 1 + . = 1+ + 2 2 2

In general s2n ≥ 1 + n/2, so the partial sums approach infinity as n → ∞.

7.8

Example: Geometric series

The series

∞ X

ar n

n=0

is convergent if |r| < 1 with

∞ X

n=0

ar n =

a and divergent if |r| ≥ 1. 1−r

7.8 Example: Geometric series

7.8.1

Proof

123

7.8 Example: Geometric series

124

7.9 Application: Bouncing ball - Stewart, 6ed. p. 731 qn 58; 5ed. p. 756 qn 52

7.8.2

125

Intuition

If all the terms in the series stay â&#x20AC;&#x153;largeâ&#x20AC;?, then the series is divergent. If the individual terms are becoming smaller then the series may converge, but not necessarily.

7.9

Application: Bouncing ball - Stewart, 6ed. p. 731 qn 58; 5ed. p. 756 qn 52

A certain ball has the property that each time it falls from a height h onto a hard, level surface, it rebounds to a height rh, where 0 < r < 1. Suppose that the ball is dropped from an initial height of H metres. (a) Find the total distance that the ball travels.

7.9 Application: Bouncing ball - Stewart, 6ed. p. 731 qn 58; 5ed. p. 756 qn 52

126

(b) Calculate the total time that the ball travels. Use the fact that the ball falls gt2 /2 metres in t seconds (from classical physics).

7.9 Application: Bouncing ball - Stewart, 6ed. p. 731 qn 58; 5ed. p. 756 qn 52

7.9.1

127

Example: Repeated drug dosage

A person with an ear infection is prescribed a treatment of ampicillin (a common antibiotic) to be taken four times a day in 250mg doses. It is known that at the end of a six hour period, about 4% of the drug is still in the body (since the drug is being excreted by the body between doses). What quantity of drug is in the body after the tenth tablet? The fortieth? Predict what quantity of drug will be in the body after long term treatment.

7.10 Convergence theorem and limit laws

7.10 If

∞ X

128

Convergence theorem and limit laws an = ℓ and

∞ X

bn = m are convergent series, then so are the series

n=0

n=0 ∞ X

(an ± bn ) and

n=0

7.10.1

Example

Evaluate

∞ X

n=0

xn (2x + 1), for |x| < 1.

∞ X

n=0

can (where c is a constant),

7.12 Theorem: Comparison test - Stewart, 6ed. p. 741; 5ed. p. 767

7.11

129

Comparison tests

If we are trying to determine whether or not

∞ X

an converges, where an contains complicated terms,

n=0

then it may be possible to bound the series by using simpler terms. If |an | is always smaller than |bn | ∞ ∞ X X an should also. On the other hand, if an is always large - in fact larger |bn | converges, then and n=0

∞ X

than |bn |, and

7.12

n=0

bn diverges, then

an should diverge too.

n=0

Theorem: Comparison test - Stewart, 6ed. p. 741; 5ed. p. 767

Suppose that

∞ X

an and

∞ X

bn are series with all non-negative terms.

n=0

7.12.1

∞ X

Example

Test the series

∞ X ln n

n=1

for convergence.

7.13 Alternating series

7.12.2

Example

Test the series

7.13

130

∞ X

5 for convergence. 2 + 3n n=0

Alternating series

The tests for convergence have so far required terms to be positive. However, if terms alternate in sign +, −, +, −, . . . (or −, +, −, +, . . .) there are some special results. We call such a series an alternating series. Here is quite an interesting result: We have seen that the harmonic series 1+

1 1 1 + + + ... 2 3 4

1−

1 1 1 + − + ... 2 3 4

is divergent. However, the series

is convergent. Why?

7.14 Theorem: Alternating series test - Stewart, 6ed. p. 746; 5ed. p. 772

7.14

Theorem: Alternating series test - Stewart, 6ed. p. 746; 5ed. p. 772

If the alternating series ∞ X

(−1)n bn = b0 − b1 + b2 − b3 + . . . (all bn > 0)

n=0

satisfies

then the series is convergent. 7.14.1

Example ∞ X

(−1)n √

n=0

1 n+1

131

7.14 Theorem: Alternating series test - Stewart, 6ed. p. 746; 5ed. p. 772

7.14.2

Example ∞ X

(−1)n

n=1

7.14.3

3n 4n − 1

Example

1 1 1 1 1 1 1 − − + + − − + ... 2 3 4 5 6 7 8

132

7.15 Definition: Absolute/conditional convergence

7.15

7.15.1

133

Definition: Absolute/conditional convergence

Examples

Are the following series conditionally or absolutely convergent?

(i)

â&#x2C6;&#x17E; X

(â&#x2C6;&#x2019;1)n

n=1

1 n

7.15 Definition: Absolute/conditional convergence

134

(ii)

∞ X

(−1)n

n=1

(iii)

1 n4

∞ X cos n n=1

7.16 Theorem: Ratio test - Stewart, 6ed. p. 752; 5ed. p. 778

7.15.2

135

Theorem: Absolute convergence

If a series is absolutely convergent, then it is convergent. 7.15.3

7.16

Proof

Theorem: Ratio test - Stewart, 6ed. p. 752; 5ed. p. 778

A powerful test for convergence of series is the ratio test. We will use it extensively in the following sections.

∞ X

an+1

= L < 1, then the series

an is absolutely convergent (and therefore convergent). 1. If lim

n→∞ an

n=1

∞ X

an+1

2. If lim

an is divergent. = L > 1 or lim

= ∞ , then the series n→∞ n→∞ an

n=1

an+1

= 1, then the ratio test is inconclusive. 3. If lim

n→∞ an

an+1

is not defined, then the ratio test is inconclusive. 4. If lim

n→∞ an

Note that the ratio test only tests for absolute convergence. The ratio test does not give any estimates of the sum.

7.16 Theorem: Ratio test - Stewart, 6ed. p. 752; 5ed. p. 778

7.16.1

136

Examples

Use the ratio test to determine, if possible, whether the following series are absolutely convergent. ∞ X ar n (i) n=0

(ii)

∞ X

(−1)n

n=1

n3 3n

7.16 Theorem: Ratio test - Stewart, 6ed. p. 752; 5ed. p. 778

(iii)

(iv)

137

∞ X 1 n x n! n=1

∞ X

(2x + 5)n

n=0

7.16 Theorem: Ratio test - Stewart, 6ed. p. 752; 5ed. p. 778

Notes

138

8.1 Definition: Power series

139

Power series and Taylor series

8.1

Definition: Power series

A power series is a series of the form

Such series have many applications, for example, solutions of higher order O.D.E’s (ordinary differential equations) with non-constant coefficients, which in turn are very important to many areas of science and engineering. 8.1.1

Example

Use the ratio test to determine for which values of x the following series converges: ∞ X (x − 3)n

n=1

8.1 Definition: Power series

8.1.2

140

Example: Bessel function

Use the ratio test to determine for which values of x the Bessel function of order zero (denoted J0 (x)) converges, where ∞ X (−1)n x2n . J0 (x) = 22n (n!)2 n=0

Bessel functions first arose when Bessel solved Kepler’s equation for describing planetary motion. Since then, they have been used in applications such as temperature distribution on a circular plate and the shape of a vibrating membrane (such as the human ear). The graph of J0 (t) is given in Figure 43. besselj(0,t) 1.5

0.5

−0.5

−1 0

15 t

Figure 43: The Bessel function J0 (t) is represented as besselj(0,t) in MATLAB.

8.2 Theorem: Defines radius of convergence - Stewart, 6ed. p. 752; 5ed. p. 787

8.2

141

Theorem: Defines radius of convergence - Stewart, 6ed. p. 752; 5ed. p. 787

For a given power series

∞ X

n=0

cn (x − a)n there are only three possibilities:

In case (1), we say that the radius of convergence is 0, while in case (2) we say that the radius of convergence is ∞. The interval of convergence of a power series is the interval that consists of all values of x for which the series converges. In case (1) the interval is a single point, a. In case (2) the interval is (−∞, ∞). In case (3) the interval is either (a − r, a + r), (a − r, a + r], [a − r, a + r) or [a − r, a + r].

8.3 Taylor series

8.2.1

142

Example

What is the radius of convergence for the series

8.3

∞ X (−3)n xn √ ? When does the series converge/diverge? n+1 n=0

Taylor series

Given a function f (x) it is sometimes possible to expand f (x) as a power series f (x) =

∞ X

n=0

cn (x − a)n = c0 + c1 (x − a) + c2 (x − a)2 + c3 (x − a)3 + . . .

How do we find the coefficients cn in general? Let us assume that f and all its derivatives are defined at x = a. First note that f (a) = c0 . Now take the derivative of both sides. Assuming that we are justified in differentiating term by term on the right-hand side, this gives

8.4 Theorem: Formula for Taylor series

143

so that f ′ (a) = c1 . Taking the derivative again and again gives f ′′ (a) = 2c2

⇒

f ′′′ (a) = 6c3

⇒

c2 = f ′′ (a)/2 c3 = f ′′′ (a)/3!

and, in general cn = f (n) (a)/n!, with f (0) (x) ≡ f (x). This leads to

The right-hand side is called the Taylor series of f (x) about x = a. So this result says that if f (x) has a power series representation (expansion) at x = a, then it must be a Taylor series. 8.3.1

Example: MacLaurin series

When a = 0 in the Taylor series we call the series a MacLaurin series. 8.3.2

Note

The function f and all its derivatives must be defined at x = a for the series to exist. In this case we say that f is infinitely differentiable at x = a.

8.4

Theorem: Formula for Taylor series

Suppose f (x) is infinitely differentiable at x = a. Then for f “sufficiently well behaved”, f (x) =

∞ X f (n) (a)

n=0

(x − a)n , for |x − a| < r,

where r is the radius of convergence. (Note that if we define a sequence of numbers bn = max |f (n) (x)|, n ∈ N, x∈[−r,r]

then the “sufficiently well behaved” means bn r n = 0. n→∞ n! lim

Unless otherwise noted, all functions we consider in this section are “sufficiently well behaved”.)

8.4 Theorem: Formula for Taylor series

144

For |x − a| < r, we have approximately f (x) ≃

k X f (n) (a)

n=0

(x − a)n

which is a useful polynomial approximation. The smaller |x − a|, the better the approximation. This approximation can be made accurate to arbitrary order by taking k sufficiently large provided |x − a| < r. This begins to address the question of whether or not f (x) has a power series representation. The details are beyond the scope of this course, but for those interested there is a good discussion about this issue and Taylor’s inequality in Stewart, 6ed. pp. 772-774; 5ed. pp. 798-800. 8.4.1

Example

Find the Taylor series of f (x) = ln x about x = 1. Determine its radius of convergence.

8.4 Theorem: Formula for Taylor series

145

8.4 Theorem: Formula for Taylor series

8.4.2

Example

Find the Taylor series of f (x) = ex about x = a. Determine its radius of convergence.

146

8.4 Theorem: Formula for Taylor series

8.4.3

Example

Find the MacLaurin series for f (x) = sin x. Determine its radius of convergence.

147

8.5 Binomial series

8.5

148

Binomial series

For a non-negative integer m, the binomial series is m

(1 + x)

m X m

n=0

xn ,

m where are the binomial coefficients defined by n m m m(m − 1) . . . (m − n + 1) = 1, = n! 0 n

for

n = 1, 2, ....

Note that in this expression for the binomial coefficient, m can be any real number. In the case that m is a positive integer and 0 ≤ n ≤ m, such as in the above expansion, it can be written in the more familiar form m m! . = (m − n)!n! n Also note that when both m and n are positive integers, with n ≥ m + 1 the binomial coefficient is zero. The above expansion can then be written m

(1 + x)

∞ X m

n=0

xn .

8.5 Binomial series

149

Generalize the binomial theorem to arbitrary real powers by finding the MacLaurin series expansion of the function f (x) = (1 + x)α where α ∈ R. Find also the radius of convergence of the series.

8.5 Binomial series

8.5.1

150

Relationship to geometric series

A familiar series is the geometric series 1 = 1 + x + x2 + · · · , 1−x

|x| < 1.

for

This result follows from the binomial theorem. Taking α = −1 and replacing x with −x we have f (x) = (1 − x)−1 =

1 . 1−x

The binomial coefficients in this case are

−1 (−1)(−2) . . . (−n) = (−1)n , = n! n

for

The binomial theorem then gives, for |x| < 1, ∞

∞

n=1

n=0

X X 1 xn , (−1)n (−x)n = =1+ 1−x which is the geometric series.

n ≥ 1.

8.5 Binomial series

Notes

151

8.5 Binomial series

Notes

152

9.1 Definition: Anti-derivative - Stewart, 6ed. p. 274; 5ed. p. 300

153

Integration

9.1

Definition: Anti-derivative - Stewart, 6ed. p. 274; 5ed. p. 300

A function F is called an anti-derivative of f or, a primitive of f on an interval I if

See the table below for some examples of anti-derivatives. We use the notation F ′ = f , G′ = g and c is a constant. Function Antiderivative Function Antiderivative cf (x)

cF (x)

cos x

sin x

f (x) + g(x)

F (x) + G(x)

sec2 x

tan x

xα , (α 6= −1)

xα+1 α+1

1 x

ln x

sin x

− cos x

The most general anti-derivative can be obtained from those given in the table by adding a constant. Nod d F (x) = f (x) then (F (x) + C) = f (x) is also true where C is any constant (independent tice that if dx dx of x). 9.1.1

Example

Find the most general anti-derivative of the function f (x) = x2 + 3x.

9.3 Area under a curve - Stewart, 6ed. pp. 289-296; 5ed. p. 315-322

9.1.2

Example

If f ′′ (x) = x −

9.2

154

√ x, find f (x).

Definition: Indefinite integral

The indefinite integral

f (x)dx, of a suitable function f (x), is defined by

whereRF (x) is any anti-derivative of f (x) and c is an arbitrary constant called the constant of integration. Thus f (x)dx gives all the anti-derivatives of f (x).

9.3

Area under a curve - Stewart, 6ed. pp. 289-296; 5ed. p. 315-322

Consider the problem of finding the area under a curve given by the graph of a continuous, non-negative function y = f (x) between two points x = a and x = b; see Figure 44.

y=f(x)

Figure 44: We need to define the area of the region S = {(x, y)|a ≤ x ≤ b, 0 ≤ y ≤ f (x)}.

9.3 Area under a curve - Stewart, 6ed. pp. 289-296; 5ed. p. 315-322

155

Although we might have an intuitive notion of what we mean by area, how do we define the area under such a curve? We start by subdividing S into n strips S1 , S2 , . . . , Sn of equal width. The width of the interval [a, b] is b − a so the width of each strip is

These strips divide the interval [a, b] into n subintervals [x0 , x1 ], [x1 , x2 ], . . . , [xn−1 , xn ] where x0 = a and xn = b. If ci is any point in the ith subinterval [xi−1 , xi ], then we can approximate the ith strip Si by a rectangle of width ∆x and height f (ci ), which is the value of f at the point ci (see Figure 45).

y=f(x)

a c1

∆x

Figure 45: A method of approximating the area of the region S. The area of such a rectangle is f (ci )∆x. Therefore we can approximate the area of S by taking the sum of the area of these rectangles. We call this sum Rn :

Now what happens if we increase n? Consider the graph in Figure 46. It seems that as ∆x becomes smaller (when n becomes larger), the approximation gets better. Therefore we define the area A of the region S as follows.

9.3 Area under a curve - Stewart, 6ed. pp. 289-296; 5ed. p. 315-322

156

y=f(x)

Figure 46: The smaller the width of the rectangle, the better the approximation, in general. Notice that in this diagram we chose ci = a + i∆x.

9.3.1

Definition: Area of a region, Riemann sum

The area A of the region S that lies under the graph of the continuous, non-negative function f is

This limit always exists and is independent of the choice of ci . This can be proved using the fact that f is continuous on [a, b]. This sum Rn is called a Riemann sum. Note that lim [f (c1 )∆x + · · · + f (cn )∆x] is called the Riemann integral of f on [a, b] and is denoted n→∞

f (x)dx.

IMPORTANT: this definition can be carried over to functions which are not necessarily non-negative. The intuitive interpretation of A as an area is then no longer valid. The quantity A should then be viewed as a “signed area”, i.e. A = area above x-axis, below graph − area below x-axis, above graph. Stewart, 6ed. p. 301; 5ed. p. 328

9.4 Fundamental Theorem of Calculus

9.4 9.4.1

157

Fundamental Theorem of Calculus Theorem: Fundamental Theorem of Calculus

(a) If f is continuous on [a, b], a ≤ x ≤ b, then A(x) =

f (t)dt

gives an antiderivative of f (x) such that A(a) = 0. (b) If F (x) is any antiderivative of f (x), then F (b) − F (a) =

9.4.2

Proof (outline)

Zb a

f (t)dt.

9.4 Fundamental Theorem of Calculus

9.4.3

158

Notation

We will write F (x)|ba = [F (x)]ba = F (b) − F (a). 9.4.4

Properties of the definite integral

For some c ∈ R, f (x) continuous on [a, b] there holds: •

•

• 9.4.5

f (x)dx = −

f (x)dx a

cdx = c(b − a)

a b

[f (x) ± g(x)]dx =

a b

cf (x)dx = c a c

f (x)dx + a a

f (x)dx ±

f (x)dx a

f (x)dx = c

f (x)dx a

f (x)dx = 0 a

Examples Zx 1

(1) Show that ln x =

1 dt, for x > 0. t

g(x)dx

9.4 Fundamental Theorem of Calculus

(2) Evaluate

(3) Evaluate

1 dt. t

π/2

sin xdx.

(4) If F (u) =

Ru 2

(5) If G(x) =

√ dt , 1+t2

sin Rx 2

find F ′ (u).

√ dt , 1+t2

find G′ (x).

159

9.4 Fundamental Theorem of Calculus

9.4.6

160

Example

Find the area between the x-axis and the curve y = sin x on [0, π].

9.4.7

Example

Find the area bounded by the curves y = x2 and y = x3 on the interval [0, 1].

9.4.8

Corollary

If f is non-negative and continuous on [a, b] then Z

f (x)dx ≥ 0.

If f is non-positive and continuous on [a, b] then Z

f (x)dx = −(area above graph below interval [a, b]) ≤ 0.

9.5 Approximate integration (trapezoidal rule)

9.4.9

161

Example

Find the area enclosed by the curves y = x and y = x3 .

9.5

Approximate integration (trapezoidal rule)

We can approximate the area under a curve by taking the sum of a finite number of rectangles under the graph as outlined in the previous section. If we choose the ci to be the left endpoints of each rectangle, we obtain the approximation n X f (xi−1 )∆x. ℓn = i=1

If we choose the ci to be the right endpoints of each rectangle we obtain the approximation rn =

n X i=1

f (xi )∆x.

9.6 Improper integrals - Stewart, 6ed. pp. 544-551; 5ed. pp. 566-573

162

The trapezoidal rule is the approximation of the area under a curve which takes the average value of â&#x201E;&#x201C;n and rn . The formula for this approximation is therefore given by

The trapezoidal rule will appear in a Matlab module.

9.6

Improper integrals - Stewart, 6ed. pp. 544-551; 5ed. pp. 566-573

9.6 Improper integrals - Stewart, 6ed. pp. 544-551; 5ed. pp. 566-573

The integrals

9.6.1 Find

∞ a

f (x)dx and

163

f (x)dx are called convergent if the limit exists and divergent otherwise.

−∞

Example Z

∞

e−x dx.

9.6.2

Note

It is still possible to define the integral for some classes of functions which are not necessarily continuous: Stewart, 6ed. p. 548; 5ed. p. 570. Note also that there are other types of improper integral: Stewart, 6ed. p. 547; 5ed. p. 569.

9.7 Techniques of integration - Stewart, 6ed. ch. 8; 5ed. ch. 8

9.7

Techniques of integration - Stewart, 6ed. ch. 8; 5ed. ch. 8

9.7.1

Substitution

If u = g(x) is a differentiable function whose range is an interval I and f is continuous on I, then

If gâ&#x20AC;˛ is continuous on [a, b] and f is continuous on the range of u = g(x), then

9.7.2 Find

Example: Stewart, 6ed. p. 334 ex. 1; 5ed. p. 361 ex. 1 Z

x3 cos(x4 + 2)dx.

164

9.7 Techniques of integration - Stewart, 6ed. ch. 8; 5ed. ch. 8

9.7.3

Example: Stewart, 6ed. p. 337 ex. 7; 5ed. p. 364 ex. 7

Evaluate

9.7.4

dx . (5x − 3)2

Example: Trig substitution

Evaluate

dx √ . 2 a − x2

165

9.7 Techniques of integration - Stewart, 6ed. ch. 8; 5ed. ch. 8

9.7.5

Example

Evaluate

9.7.6

cos x sin x dx.

Example

Evaluate

(3x + 1)(3x2 + 2x)3 dx.

166

9.7 Techniques of integration - Stewart, 6ed. ch. 8; 5ed. ch. 8

9.7.7

167

Integration by parts

Given two differentiable functions u(x) and v(x) we have a product rule for differentiation:

Therefore uv is an antiderivative of uv ′ + u′ v. Hence

The aim is to simplify the function that is integrated, i.e. Also note that the antiderivative of v ′ needs to be found! 9.7.8

Example

Evaluate

x3 ln xdx.

u′ vdx should be easier to find than

uv ′ dx.

9.7 Techniques of integration - Stewart, 6ed. ch. 8; 5ed. ch. 8

9.7.9

Example

Evaluate

xex dx.

9.7.10

General rule R • For (polynomial) · eax+b dx, choose u to be the polynomial and v ′ = eax+b . R • For (polynomial) · ln (ax + b)dx, choose u = ln (ax + b) and v ′ to be the polynomial. • Note that sometimes integration by parts needs to be applied more than once.

9.7.11

Example

Evaluate

x sin xdx.

168

9.9 Integration of rational functions by partial fractions

169

9.8

Integrals involving ln function

9.9

Integration of rational functions by partial fractions

A ratio of two polynomials, f (x) =

P (x) Q(x)

is called a rational function. Integration of a rational function is made possible by expressing it as the sum of simpler fractions, called partial fractions. (In the three examples which follow, we will only consider rational functions for which the degree of P (x) is less than the degree of Q(x).) The general process for obtaining the sum of partial fractions is as follows: â&#x20AC;˘ Factor the denominator Q(x) as much as possible; A Ax + B or , (ax + b)i (ax2 + bx + c)j where the denominators of these fractions come from the factorisation of Q(x).

â&#x20AC;˘ Express f (x) as a sum of fractions each of which take either the form

9.9 Integration of rational functions by partial fractions

Consider the following examples. 9.9.1

Example

Evaluate

x+2 dx. x2 + x

170

9.9 Integration of rational functions by partial fractions

9.9.2

Example

Evaluate

dx , a 6= 0. â&#x2C6;&#x2019; a2

171

9.9 Integration of rational functions by partial fractions

9.9.3

Example

Evaluate

2x2 â&#x2C6;&#x2019; x + 4 dx. x3 + 4x

172

9.9 Integration of rational functions by partial fractions

173

9.9 Integration of rational functions by partial fractions

9.9.4

174

Rationalising substitutions

Some nonrational functions can be made rational p through the use of an appropriate substitution. When p the integrand contains an expression of the form n g(x), try using the substitution u = n g(x). 9.9.5

Example

Z â&#x2C6;&#x161; x+4 dx. Evaluate x

9.9 Integration of rational functions by partial fractions

175

9.10 Volumes of revolution

9.10

176

Volumes of revolution

Suppose f (x) is positive and continuous on [a, b]. By rotating the graph of f (x) above [a, b] about the x-axis, we obtain a cylindrical solid. What is the volume of such a solid? This is what we call a volume of revolution.

y=f(x)

(a)

(b)

Figure 47: Rotating the curve in (a) about the x-axis gives the solid in (b).

9.10.1

Formula for volume

Suppose f (x) ≥ 0 and continuous on [a, b]. The volume of revolution of the solid obtained by rotating the graph y = f (x) above [a, b] about the x-axis is

9.10.2

Outline of proof

For x ∈ [a, b], let V (x) be the volume obtained by rotating the graph of f above the interval [a, x] about the x-axis. So V (a) = 0, V (b) = V . Then for h ≥ 0 small V (x + h) − V (x) =

volume obtained by rotating graph of f above [x, x + h] about x-axis

≃ volume of cylinder with radius f (x) and height h = π[f (x)]2 h (becomes exact as h → 0) ⇒

V (x + h) − V (x) ≃ π[f (x)]2 h

9.10 Volumes of revolution

177

Taking the limit h → 0 we obtain

V ′ (x) = π[f (x)]2

so the Fundamental Theorem yields Z 9.10.3

b a

π[f (x)]2 dx = [V (x)]ba = V (b) − V (a) = V (b) = V.

Example

Let a > 0. Find the volume of the solid obtained by rotating y = f (x) =

9.10.4

√

x about the x-axis over [0, a].

Example

Determine the volume of the solid obtained by rotating y = x about the x-axis over [0, 1].

9.10 Volumes of revolution

9.10.5

178

Example

Determine the volume of the solid obtained by rotating y = x2 about the x-axis over [0, 1].

9.10.6

Example

The region enclosed by the curves y = x and y = x2 is rotated about the x-axis. Find the volume of the resultant solid.

9.10 Volumes of revolution

Notes

179

10.2 Complex Numbers

10 10.1

180

Complex Numbers Notation

Let R denote the set of real numbers. The symbol ∈ means “is an element of”. Thus we write x∈R

√ to mean x is an element of R. That is, x is a real number. Eg 2 ∈ R, π ∈ R, − 3 ∈ R etc.

10.2

Complex Numbers

Complex numbers were introduced in the 16th century to obtain roots of polynomial equations. A complex number is of the form z = x + iy where x, y ∈ R and i is (formally) a symbol satisfying i2 = −1. The quantity x is called the real part of z and y is called the imaginary part of z. The set of all complex numbers is denoted C. Eg 3 − 2i ∈ C. 10.2.1

Example

The real part of 3 − 2i is 3 and the imaginary part is −2 (not −2i). Complex numbers can be added and multiplied by replacing i2 everywhere with −1. For example (2i)2 = 4i2 = −4. 10.2.2

Example

Simplify (3 − 2i)(1 + i).

(3 − 2i)(1 + i) = 3 + 3i − 2i − 2i2

= 3+i+2 =5+i

(3 − 2i)(1 + i)

10.2.3

Example

Suppose a, b ∈ R. Simplify (a + bi)(a − bi).

= =

3 + 3i − 2i − 2i

3+i+2= 5+i

10.2 Complex Numbers

181

(a + bi)(a − bi) = a2 − abi + abi − b2 i2 = a 2 + b2

(a + bi)(a − bi)

2 2

a − abi + abi − b i

a +b

If z = a + bi is a complex number, the number a − bi is called the complex conjugate of z, denoted z. Eg the complex conjugate of 3 + 2i is 3 + 2i = 3 − 2i. The previous example shows that z z is always a real number. 10.2.4

Example

Simplify

3 − 2i . 1−i

We multiply top and bottom by the complex conjugate of the denominator. This does not change the value of the fraction, but the new denominator is a real number. 3 − 2i 1−i

3 − 2i (1 + i) × 1−i (1 + i) (3 − 2i)(1 + i) 12 + 12 1 (3 − 2i)(1 + i) 2 1 (5 + i) 2 5 1 + i 2 2

= = = = =

We multiply top and bottom by the complex conjugate of the denominator. This does not change the value of the fraction, but the new denominator is a real number.

3 − 2i 1−i

3 − 2i

(3 − 2i)(1 + i)

= = =

1−i

(1 + i) (1 + i)

12 + 12

1 2 1 2 5 2

(3 − 2i)(1 + i) (5 + i) +

1 2

It is a fact that if we consider complex roots of polynomials (and count them with their correct multiplicity), then a polynomial of degree n always has n roots. For example, every quadratic has two roots.

10.3 Polar form

10.2.5

182

Example

Find the roots of x2 + 2x + 2 = 0.

Use the quadratic formula: √ 1 1 (−2 ± 4 − 8) = (−2 ± 2i) 2 2 = −1 ± i.

x =

Alternatively, complete the square. Take half the coefficient of the linear term 2x, namely 1. So consider (x + 1)2 : x2 + 2x + 2 = (x + 1)2 + 1 = 0 ⇐⇒ (x + 1)2 = −1 = i2 ⇐⇒ x + 1 = ±i

∴ x = −1 ± i are the roots.

Use the quadratic formula:

= =

(−2 ± 2 −1 ± i.

√

4 − 8) =

1 2

(−2 ± 2i)

Alternatively, complete the square. Take half the coefficient of the linear term 2x, namely 1. So consider (x + 1)2 :

x + 2x + 2

⇐⇒ x + 1

⇐⇒ (x + 1)

∴x

10.3

(x + 1) + 1 = 0 2

−1 = i ±i

−1 ± i are the roots.

Polar form

Real numbers are often represented on the real line. A complex number z = x + iy may be represented by a point in the complex plane, where the horizontal axis is the real axis and the vertical axis is the imaginary axis. We can also specify z by giving the length r and the angle θ in figure 48. The quantity r is called the modulus of z, denoted |z|. It measures the distance of z from the origin. The angle θ is called the argument of z. We have:

10.3 Polar form

183 Imaginary

z=x+iy r θ

Real

Figure 48: A complex number can be represented in rectangular or polar form.

x =

r cos θ

y =

r sin θ

⇒ z = x + iy = Also r = |z| = tan θ =

Example

Write z = 1 + i in polar form.

x2 + y 2

y x

if x 6= 0.

r cos θ

r sin θ

⇒ z = x + iy

r(cos θ + i sin θ).

r = |z|

tan θ

Also

10.3.1

r(cos θ + i sin θ).

q y

x2 + y 2 if x 6= 0.

10.4 Euler’s formula

184

First find the modulus: |z| =

√

1+1=

√

y We have for the argument tan θ = , so we can take x y x = arctan 1 π = 4 √ π π . z = 2 cos + i sin 4 4 θ = arctan

∴ First find the modulus:

|z| = We have for the argument tan θ =

y x

, so we can take

= = =

∴

10.4

√ √ 1 + 1 = 2.

arctan

x arctan 1 π 4 √

π π cos . + i sin 4 4

Euler’s formula

Euler’s formula states for any real number θ: cos θ + i sin θ = eiθ . (To make sense of this, one has to define the exponential function for complex arguments. This may be done using a series.) Thus every complex number z = x + iy can be represented in polar form z = reiθ .

10.4 Eulerâ&#x20AC;&#x2122;s formula

Notes

185

11.1 Review

186

Vectors • A vector quantity has both a magnitude and a direction. Force and velocity are examples of vector quantities. • A scalar quantity has only a magnitude (it has no direction). Time, area and temperature are examples of scalar quantities.

11.1

Review

A vector is represented geometrically in the (x, y) plane (or in (x, y, z) space) by a directed line segment (arrow). The direction of the arrow is the direction of the vector, and the length of the arrow is proportional to the magnitude of the vector. Only the length and direction of the arrow are significant: it can be placed anywhere convenient in the (x, y) plane (or (x, y, z) space). y

Figure 49: The vector of unit length at 45◦ to the x-axis has many representations.

(v 1 ,v 2) y=v 2

Q=(xQ ,yQ ) v=<v 1 ,v 2>

PQ yQ −y P = v 2 xQ −x P = v 1 x=v 1

(a)

P=(x P ,y P )

(b)

Figure 50: Geometric representation of a vector. −→

If P , Q are points, P Q denotes the vector from P to Q. −→

A vector v =P Q in the (x, y) plane may be represented by a pair of numbers v1 xQ − xP v= = v2 yQ − yP

11.4 Vector Addition

187 −→

which is the same for all representations P Q of v. We call v1 , v2 the components of the vector v. 0 We call the vector the zero vector. It is denoted by 0. 0 Vectors will be indicated by bold lowercase letters v, w etc. Writing by hand you may use v or ~v or ve.

11.2

Position Vectors

−→

Let P = (xp , yp ) be a point in the (x, y) plane. The vector OP , where O is the origin, is called the position vector of P . Obviously −→ xp OP = . yp

11.3

Definition: Norm −→

norm (or length or magnitude ) of v, written kvk, is the distance between P and For vector v =P Q, the v1 Q. Thus for v = we have v2

kvk =

11.4

v12 + v22

2 + v2 v1 2

Vector Addition

We add vectors by the triangle rule. −→

−→

Consider the triangle P QR with v =P Q, w =QR. Then v + w =P R; see Figure 51. In terms of components, if v1 w1 , w= v= v2 w2 then v+w =

v1 + w1 v2 + w2

It follows from the component description that vector addition satisfies the following properties: v+w =w+v u + (v + w) = (u + v) + w v+0 =0+v =v

(commutative law) (associative law)

11.6 Unit Vectors

188

−→

Figure 51: P Q + QR=P R

αv αv2 v

v2 v1

αv1

Figure 52: We can multiply the vector v by a number α (scalar).

11.5

Scalar multiplication

If α is a real number (called a scalar ), we define αv to be the vector of norm kαvk = |α| · kvk in the same direction as v if α > 0, and opposite direction if α < 0. v1 αv1 Using similar triangles it follows that if v = then αv = . v2 αv2 v1 by zero we obtain the zero vector: If we multiply any vector v = v2 0 0·v = =0 0

11.6

Unit Vectors

A unit vector is a vector of norm 1. If v 6= 0 is a vector, then

11.7 Vectors in 3 dimensions

189

1 v kvk 1 kvk

is a unit vector in the direction of v. In particular i=

1 0

0 1

determine unit vectors along the x and y axes respectively. y

Figure 53: Unit vectors i and j in the x and y directions respectively. For any vector v =

v1 v2

we have v=

v1 0

0 v2

= v1 i + v2 j,

Hence we can decompose v into a vector v1 i along the x-axis and v2 j along the y-axis. The numbers v1 and v2 are called the components of v with respect to i and j.

11.7

Vectors in 3 dimensions −→

Similarly in (x, y, z)-space a vector v =P Q is represented in component form by     v1 xQ − xP v =  v2  =  y Q − y P  v3 zQ − zP −→

which is the same for all representations P Q of v.   v1 −→ For v =OP =  v2  the norm of the vector v is v3

11.7 Vectors in 3 dimensions

190

z P

v3 y v1

kvk =

ON 2 + N P 2 =

kvk =

ON 2 + NP 2 =

v12 + v22 + v32 .

2 + v2 + v2 . v1 2 3

As before, we add vectors component by component. We also    v1 v =  v2  , w= v3 are vectors then



 v1 + w1 v + w =  v2 + w2  v3 + w3   v1 + w1 v + w =  v2 + w2  v3 + w3

define multiplication by a scalar α. So if  w1 w2  w3



 αv1 and αv =  αv2  , for α ∈ R. αv3 and

  αv1 αv =  αv2  , for α ∈ R. αv3

The unit vectors along the x, y, z axes are respectively     1 0    0 1 , , j= i= 0 0 

 v1 Any vector v =  v2  may be expressed as v3

v = v1 i + v2 j + v3 k.

 0 k =  0 . 1 

11.8 Row and column vectors

191

We call v1 , v2 , v3 the components of v in the i, j, k directions respectively. We usually denote 2 and 3 dimensional space by R2 and R3 , respectively. Thus ďŁąďŁŤ ďŁś ďŁź ďŁ˛ x ďŁ˝ x 2 3 ďŁ ďŁ¸ R = | x, y â&#x2C6;&#x2C6; R and R = y | x, y, z â&#x2C6;&#x2C6; R . y ďŁł ďŁž z

11.8

Row and column vectors

We may write vectors using columns eg. vectors in this course.

a b

, or as row vectors

a b . We usually use column

11.8 Row and column vectors

11.8.1

192

Example

An albatross is flying NE at 20 km/h into a 10 km/h wind in direction E60â&#x2014;Ś S. Find the speed and direction of the bird relative to the ground.

20 v 45

Î¸

Figure 54: v gives the velocity vector of the albatross.

11.8 Row and column vectors

193

We have: z = 20 cos 45◦ i + 20 sin 45◦ j √ √ = 10 2i + 10 2j; w = 10 cos 60◦ i − 10 sin 60◦ j √ = 5i − 5 3j.

√ √ √ So v = z + w = (10 2 + 5)i + (10 2 − 5 3)j Therefore the magnitude of v is q √ √ √ (10 2 + 5)2 + (10 2 − 5 3)2 kvk = =

√ √ 500 + 100 2(1 − 3)

≈ 19.91. To calculate the angle θ we use tan θ =

√ √ 10 2 − 5 3 √ 10 2 + 5

⇒ θ ≈ 16◦ So v = 19.9kmh−1 at E16◦ N.

We have:

◦

20 cos 45 i + 20 sin 45 j

√ √ 10 2i + 10 2j;

10 cos 60 i − 10 sin 60 j

√ 5i − 5 3j.

◦

√ √ √ So v = z + w = (10 2 + 5)i + (10 2 − 5 3)j Therefore the magnitude of v is

kvk

q √ √ √ (10 2 + 5)2 + (10 2 − 5 3)2

q √ √ 500 + 100 2(1 − 3)

≈

19.91.

To calculate the angle θ we use

So v = 19.9kmh−1 at E16◦ N.

tan θ

√ √ 10 2 − 5 3 √ 10 2 + 5

⇒ θ

≈

◦

11.9 Dot Product

11.9

194

Dot Product −→

−→

For non zero vectors v =OP , w =OQ the angle between v and w is the angle θ with 0 ≤ θ ≤ π radians −→

−→

between OP and OQ at the origin O; see Figure 55. y

v w

Figure 55: θ is the angle between v and w. The dot (or scalar or inner ) product of vectors v and w, denoted by v · w, is the number given by v·w = where θ is the angle between v and w.

  0, 

if v or w = 0

kvk · kwk cos θ , otherwise

If v, w 6= 0 and v · w = 0 then v and w are said to be orthogonal or perpendicular. If v = v1 i + v2 j + v3 k and w = w1 i + w2 j + w3 k are two vectors, then v · w is given by: v · w = v1 w1 + v2 w2 + v3 w3 , In particular, for v ∈ R3 ,

11.9.1

kvk2 = v · v = v12 + v22 + v32 .

Example

Find the angle θ between the vectors: 

 1 v =  −5  , 4

 3 w= 3  3 

11.9 Dot Product

195

cos θ =

v·w . kvk · kwk

v · w = 1 · 3 − 5 · 3 + 4 · 3 = 0. π So cos θ = 0, so the vectors are perpendicular, i.e. θ = . 2

cos θ =

v·w

kvk · kwk

v · w = 1 · 3 − 5 · 3 + 4 · 3 = 0. So cos θ = 0, so the vectors are perpendicular, i.e. θ =

11.9.2

π 2

Example

If P = (2, 4, −1), Q = (1, 1, 1), R = (−2, 2, 3), find the angle θ = P QR.

11.9 Dot Product

196

Find vectors joining Q to P , and Q to R:

−→

     1 1 2 =  4  −  1  =  3 , −2 1 −1 

     −3 1 −2 QR =  2  −  1  =  1  . 2 1 3 

−→

∴

−→ −→ √ √ QP = QR = 1 + 9 + 4 = 14. 

   1 −3 QP · QR =  3  ·  1  = −3 + 3 − 4 = −4 −2 2 −→

−→

∴

−→

−4 QP · QR ◦ cos θ = −→ −→ = 14 ⇒ θ ≃ 107 . QP · QR QP

−→



     2 1 1 4 − 1 = 3 , −1 1 −2

−→



     −3 1 −2 1 . 2 − 1 = 1 2 3

∴

−→ −→ √ √ QP = QR = 1 + 9 + 4 = 14.

−→

∴

−→

−4 QP · QR ◦ ⇒ θ ≃ 107 . cos θ = −→ −→ = 14 QP · QR

Find vectors joining Q to P , and Q to R:

−→





QP · QR = 

   1 −3 3 · 1  = −3 + 3 − 4 = −4 −2 2

11.9 Dot Product

Notes

197

12.2 Transformations of the plane

12 12.1

198

Matrices and Linear Transformations 2 × 2 matrices

Recall that a 2 × 2 array of numbers A=

a b c d

x is called a 2 × 2 matrix (plural matrices). Suppose v = is a 2 element column vector. Then y the product Av is defined to be the 2 element column vector whose entries are given by taking the dot product of each row of A with the vector v: a b x ax + by = . Av = c d y cx + dy 12.1.1

Example

Let A =

1 2 3 4

, and let v =

5 6

. Calculate Av.

Av =

1 2 3 4

5 6

1·5+2·6 = 3·5+4·6 17 = . 39 Av

12.2

1 3

1·5+2·6 3·5+4·6

17 39

2 4

5 6

Transformations of the plane

A transformation of the (x, y) plane is an operation (function) that maps each point of the P = (x, y) plane to some other point P ′ = (x′ , y ′ ). We call P the original point and P ′ the image point. For particular transformations, we are given, or can determine, a formula for x′ and y ′ in terms of x and y. This has application in computer graphics. We often want to transform a set of points forming a figure on the screen. Eg rescaling the figure, or rotating it about a central point etc. We shall see that many important transformations can be represented by 2 × 2 matrices.

12.3 Translations

12.3

199

Translations

A translation is a transformation that shifts each point of the plane a certain distance in the same direction. The translation in which each point is shifted a units in the x-direction and b units in the y-direction can be described by the equations x′ = x + a and y ′ = y + b. This can be described in terms of vectors. If p is the position vector of the original point P = (x, y), p′ a ′ ′ ′ , called the translation vector, then is the position vector of the image point P = (x , y ), and t = b ′

p = p + t, that is,

x′ y′

x y

a b

A translation shifts each point the same distance in the same direction, so the size and shape of figures is preserved. 12.3.1

Example

Draw the triangle with vertices A = (1, 3), B = (−2, 1) and C = (0, −2). Then draw its image under the 4 . translation having translation vector t = −1

12.4 Linear transformations

200

12.4

Linear transformations

A linear transformation of the plane is a transformation that maps the point (x, y) to the point (x′ , y ′ ) where x′ = ax + by and y ′ = cx + dy, for some real numbers a, b, c and d. Such a transformation can be represented by the 2 × 2 matrix A =

a b c d

as follows:

If p is the position vector of the original point P = (x, y) and p′ is the position vector of the image point P ′ = (x′ , y ′ ), then the linear transformation represented by the matrix A can be described by the equation ′ x a b x ′ = p = Ap, that is, . ′ y c d y It can be shown that the following properties hold. • A linear transformation preserves straight lines. When a linear transformation is applied to a figure, points that are collinear in the original figure map to collinear points in the new figure. • A linear transformation preserves parallelism. When a linear transformation is applied to a figure, lines that are parallel in the original figure map to lines that are parallel in the new figure. • A linear transformation maps the origin to itself, since A0 = 0. Translations are not linear transformations because they do not map 0 to 0. (Exception: translation by 0, the identity transformation.) Examples of linear transformations include rotations, reflections and scalings, as we shall see.

12.5 Visualizing linear transformations

12.4.1

201

Example

Are the following transformations linear? If so, write down the matrix representing the transformation, if not, explain why not. (a) x′ = x + 2y

y′ = y − x

(b) x′ = x + 2

(a) This is linear, represented by the matrix

1 2 −1 1

(b) This is not linear. The origin is not mapped to the origin: A

(a) This is linear, represented by the matrix

1 −1

2 1

0 0

2 −1

0 0

(b) This is not linear. The origin is not mapped to the origin: A

12.5

y′ = y − 1

0 0

2 −1

0 0

Visualizing linear transformations

• The linear transformation represented by the matrix A maps (1, 0) and (0, 1) to points whose position vectors are the first and second columns of A, respectively, since a b 1 a = and Ai = c d 0 c Aj =

a b c d

0 1

b d

Ai is the 1st column of A, and Aj is the 2nd column of A. This will be used again many times in the course. • Let a and b be the position vectors given by the first and second columns of A, respectively. a b and b = . Thus a = c d • The linear transformation represented by A will map the point (x, y) to a b x ax + by a b = =x +y . c d y cx + dy c d • So (x, y) is mapped to the point with position vector xa + yb = xAi + yAj. The linear transformation is determined once we know the image of i and j. These facts, along with the knowledge that linear transformations preserve linearity and parallelism, allow us to visualize the effect of a linear transformation.

12.6 Scalings

12.5.1

202

Example

Consider the matrix A =

2 1 1 2

This acts as a linear transformation, mapping i 7→

A maps i 7→

12.6

2 1

and j 7→

1 2

2 1

and j 7→

1 2

Scalings

• A scaling of the plane is a stretching (or shrinking) of the plane in two directions, horizontally and vertically. • Scaling is a linear transformation. • The amount of stretching/shrinking of a figure under a scaling is determined by two parameters, the scaling factor in the x-direction and the scaling factor in the y-direction. • If one (or both) of the scaling factors is negative, then the image will be reflected in one (or both) of the axes as well as being scaled. 12.6.1

Example

(a) Determine the matrix representing the linear transformation of a scaling that makes a figure twice as wide and three times as high.

We need Ai and Aj: 2 0 ; Aj = Ai = 0 3 So A is represented by the matrix Ai =

2 0

;

Aj =

0 3

2 0 0 3

We need Ai and Aj:

So A is represented by the matrix

2 0

0 3

(b) Use the matrix from part a) to determine the image of the point (4, 5) under this scaling.

12.7 Rotations

203

x′ y′

x a b y c d 2 0 4 8 = = . 0 3 5 15

= A

x′ y′

x y

2 0

0 3

a c 4 5

b d

8 15

x y

We now determine the matrix representing a scaling with factor g in the x-direction and factor h in the y-direction. First we must find the images of the position vectors i and j.

hj j

gi i

The image of i is the position vector

g 0

The image of j is

0 h

Thus the matrix representing a scaling with factors g and h is g 0 Dg,h = . 0 h

12.7

Rotations

• A rotation of the plane about the origin is a linear transformation. A rotation about any other point is not, because the origin will not be mapped to the origin. • A rotation of the plane through an angle θ will mean an anti-clockwise rotation of the plane about the origin through an angle θ.

12.7 Rotations

12.7.1

204

Example

(a) Determine the matrix that represents the linear transformation of anti-clockwise rotation through an angle of π/2 radians about the origin.

Ai = j =

0 1

Aj = −i =

−1 0

Hence the transform is represented by the matrix Hence the transform is represented by the matrix

−1 0

0 1

0 −1 1 0

Ai = j =

0 1

Aj = −i =

−1 0

We now determine the general matrix that represents an anti-clockwise rotation through an angle θ about the origin. We must find the images of the position vectors i and j. Recall the definition of sin θ and cos θ in terms of the unit circle.

y (0,1)

π π ( cos (__ + θ ) , sin (__ + θ ) ) 2

(cos θ , sin θ)

= ( −sin θ , cos θ )

i’ j’

θ i

(1,0)

Figure 56: The image of i and j under rotation by angle θ. The image of i is the position vector

cos θ sin θ

− sin θ . The image of j is . See Figure 56. cos θ

Thus the matrix representing an anti-clockwise rotation through an angle θ about the origin is cos θ − sin θ Rθ = . sin θ cos θ 12.7.2

Example

Determine the matrix representing an anti-clockwise rotation through 2π/3 radians about the origin.

12.8 Reflections

205

The angle θ = 2π/3, so cos θ =−1/2 and√sin θ= 3 1 − 2 −   2 The rotation matrix is R 2π =  √ . 3 3 1 −2 2 

 The rotation matrix is R 2π = 

− 23

√ 3 2

1 −2

12.8

√

−1 2

√

3/2.

The angle θ = 2π/3, so cos θ = −1/2 and sin θ =

√

3/2.



 .

Reflections

• A reflection of the plane in a line through the origin is a linear transformation. • A reflection in a line that does not pass through the origin does not map the origin to itself, so is not a linear transformation. We will only consider reflections in lines through the origin. • When the plane is reflected in a line through the origin, points on the line of reflection remain fixed. A point P not on the line of reflection is mapped to a point P ′ on the opposite side of the line, such that the points P and P ′ are equidistant from the line of reflection and the line through P and P ′ is perpendicular to the line of reflection. 12.8.1

Example

Determine the matrix representing the linear transformation of reflection in the x-axis.

Ai =

1 0

Hence A =

1 0

Aj =

;

1 0 0 −1

0 −1

Ai =

1 0

;

Aj =

0 −1

We now determine the matrix representing a reflection in the line (passing through the origin) which makes an angle θ with the positive x-axis. First we must find the images of the position vectors i and j.

y ( cos 2 θ , sin 2θ )

__ π 2

(0,1)

i’

θ θ

θ i

(1,0)

__ π 2

x j’

π (cos ( __ 2

2θ x

π 2θ ) , −sin ( __

= (sin 2θ , −cos 2 θ )

2θ ) )

12.9 Composing Transformations & Matrix Multiplication

The image of i is the position vector

cos 2θ sin 2θ

206

The image of j is

sin 2θ − cos 2θ

Thus the matrix representing a reflection in the line through the origin, that makes an angle θ with the positive x-axis is cos 2θ sin 2θ Qθ = . sin 2θ − cos 2θ 12.8.2

Example

(a) Determine √ the matrix representing a reflection in the line that passes through the origin and through the point ( 3, 1).

(b) Use the matrix from part a) to calculate the image of the point (−1, 1) under this reflection.

√ √ The line makes an angle of θ = π/6 with the x-axis (since tan(π/6) = 1/ 3). Since sin(2θ) = 3/2 and cos(2θ) = 1/2, the reflection matrix is  √  Hence the point (−1, 1) goes to  √   

3 2

1 2

√

3 2

− 21

 

 Qθ = 

−1 1

 Qθ = 

 

12.9

1 2 √ 3 2

√

3 2

−1 2

  

3 2

− 21

√

=  



3 2





The line makes an angle of θ = π/6 with the x-axis (since tan(π/6) = 1/

Hence the point (−1, 1) goes to

1 2

−1 1



√

 .

√ ( 3−1) 2 √ − ( 3+1) 2



  ≈

3). Since sin(2θ) = √

3 2

1 2 √ 3 2



= 

−1 2



√

0.37 −1.37



.

3/2 and cos(2θ) = 1/2, the reflection matrix is



 .

√ ( 3−1) 2 √ ( 3+1) − 2



  ≈

0.37 −1.37



.

Composing Transformations & Matrix Multiplication

Suppose we perform two linear transformations, one after the other. Suppose the first transformation we s t a b perform is given by the matrix B = , and the second is given by the matrix A = . A u v c d x vector v = is first sent to y Bv =

s t u v

x y

sx + ty ux + vy

12.9 Composing Transformations & Matrix Multiplication

207

The resulting vector Bv is then acted on by A, giving sx + ty a b sx + ty A = ux + vy c d ux + vy a(sx + ty) + b(ux + vy) = c(sx + ty) + d(ux + vy) (as + bu)x + (at + bv)y as + bu = = (cs + du)x + (ct + dv)y cs + du

at + bv ct + dv

x y

We define the product AB to be the 2 × 2 matrix as + bu at + bv s t a b = . AB = u v cs + du ct + dv c d The effect of performing the transformation represented by B and then the transformation represented by A is to perform the transformation represented by AB. In symbols A(Bv) = (AB)v. Note that the matrix AB represents the transformation of first applying B, and then applying A. Compare composition of functions F ◦ G. 12.9.1

Non-commutativity of matrix multiplication

In general AB 6= BA. That is, matrix multiplication is not commutative. 12.9.2

Example

Let A =

0 1 2 −1

AB =

1 2 3 4

0 1 2 −1

1 2 3 4

,B=

but BA =

AB =

0 2

1 −1

. Find AB.

1 2 3 4 1 3

but BA =

2 4

1 3

0·1+1·3 2·1−1·3

0 1 2 −1

2 4

0·1+1·3 2·1−1·3 0 2

1 −1

0·2+1·4 2·2−1·4 =

4 −1 8 −1

0·2+1·4 2·2−1·4 =

4 8

−1 −1

3 −1

4 0

3 4 −1 0

Note that AB 6= BA. Linear transformations in three dimensions can be accomplished using larger matrices.

12.10 Vectors in Rn

12.10

208

Vectors in Rn

We saw that 2 × 2 matrices can be used to represent linear transformations in the plane. In a similar way, 3 × 3 matrices can be used to transform images in 3 dimensional space, by rotating, reflecting etc. We are familiar with vectors in two and three dimensional space, R2 and R3 . These generalize to n dimensional space, denoted Rn . A vector v in Rn is specified by n components:   v1  v2    vi ∈ R. v= .  .  .  vn

We defineaddition  of vectors and multiplication by a scalar component-wise. Thus if α ∈ R is a scalar w1  w2    and w =  .  is another vector then  ..  wn



  v+w = 

v1 + w1 v2 + w2 .. .

vn + wn



   = w + v, 



  αv =  

αv1 αv2 .. .

αvn

    

We define the dot (or scalar ) product between v and w as

v · w = v1 w1 + v2 w2 + · · · · · · + vn wn n X vk wk . = k=1

v·w

= =

v1 w1 + v2 w2 + · · · · · · + vn wn n X

vk wk .

k=1

The length or norm of the vector v is

√ v·v q = v12 + v22 + · · · + vn2

kvk =

kvk

= =

√

v·v q 2 + v2 + · · · + v2 v1 n 2

α ∈ R,

12.11 Properties of Dot Product

209

Every vector v in Rn is expressible in terms of n special vectors e1 , . . . , en called coordinate vectors, where   0  ..   .     0     ei =   1  ← ith position.  0     ..   .  0

Proof:

An arbitrary vector v ∈ Rn can be written as    v1 v1  0  v2      v =  .  = 0  ..  ..   . vn 0





       +     

0 v2 0 .. .

= v1 e1 + v2 e2 + · · · · · · + vn en .





0 0 .. .

       + ······ +      0 vn 0

      

An arbitrary vector v ∈ Rn can be written as  v

12.10.1

    

v1 v2 . . . vn





      =     

v1 0 0 . . . 0

 0   v2     0  +    .   .   . 0 





       + ······ +       

0 0 . . . 0 vn

       

v1 e1 + v2 e2 + · · · · · · + vn en .

Notation

(1) E.g. in R3 we have i = e1 , (2) We have also the zero vector

0 + v = v + 0 = v,

0 · v = 0,



  0= 

j = e2 , 0 0 .. . 0

    

k = e3 .

satisfying

for every vector v.

(3) If v 6= 0, w 6= 0, we say that vectors v, w are perpendicular (or orthogonal) if v · w = 0.

12.11

Properties of Dot Product

If u, v and w ∈ Rn and α ∈ R then

12.13 Equality

210

(1) u · v = v · u. (2) v · (w + u) = v · w + v · u. (3) (αv) · w = α(v · w) = v · (αw). (4) v · v ≥ 0 and v · v = 0 if and only if v = 0.

12.12

Definition: Matrix

A rectangular array of numbers 

  A= 

······ ······

a1n a2n .. .

am1 am2 · · · · · ·

amn

a11 a21 .. .

a12 a22

    

is called an m × n matrix. It is made up of m rows and n columns. Entries of the jth column may be assembled into a vector   a1j  a2j     ..   .  amj

called the jth column vector of A. Similarly the ith row vector of A is (ai1 ai2 · · · · · · ain ).

Note that an m × 1 matrix is a column vector and a 1 × n matrix is a row vector. The number aij in the ith row and j column of A is called the (i, j)th entry of A. For brevity we write A = (aij ). 12.12.1

Example

The 2 × 3 matrix A with entries aij = i − j is

12.12.2

0 −1 −2 1 0 −1 0 1

−1 0

−2 −1

Example

The m × n matrix A whose entries are all zero is called the zero matrix, denoted 0; eg. the zero 3 × 2 matrix is   0 0 0 =  0 0 . 0 0

12.13

Equality

Matrices A, B are said to be equal, written A = B, if they are the same size and aij = bij , for all i, j.

12.15 Scalar Multiplication

12.14

211

Addition

The sum of two m × n matrices A and B is defined to be the m × n matrix A + B with entries (A + B)ij = aij + bij We add matrices element wise, as for vectors. Addition is only defined between matrices of the same size. 12.14.1

Properties

For matrices A, B, C of the same size we have A+B = B+A (A + B) + C = A + (B + C).

12.15

Scalar Multiplication

Let A be an m × n matrix and α ∈ R. We define αA to be the m × n matrix with entries (αA)ij = α · aij

for all i, j.

Write −A for (−1) · A. Define subtraction between matrices of the same size by A − B = A + (−B). 12.15.1

Example

Find A + B and A − B if A=

A+B =

1 5 3 3 2 2

A+B =

12.15.2 If A =

−4 6 3 0 1 2

1 3

5 2

3 2

Example 6 0 2 −1

then A + A =

12 0 4 −2

= 2A.

5 −1 0 3 1 0

A−B = A−B =

−9 −3

−9 7 3 −3 0 2 7 0

3 2

12.16 Matrix Multiplication

12.16

212

Matrix Multiplication

We generalize the multiplication of 2 × 2 matrices. Let A = (aij ) be an m × n and B = (bjk ) an n × r matrix. Then AB is the m × r matrix with ik entry (ab)ik = ai1 b1k + ai2 b2k + · · · · · · + ain bnk : 

a11  ..  .   ai1   ..  . am1

a12 .. .

 a1n  ..  .   ain    ..  . amn

······

ai2 · · · · · · .. . am2 · · · · · · A (m × n)

b11 · · · · b1k · · · · b21 · · · · b2k · · · · .. . .. . bn1 · · · · bnk · · · · B (n × r)

b1r b2r .. . .. . bnr



    = AB   (m × r) 

where (ab)ik = dot product between ith row of A and kth column of B. AB is only defined if the number of columns of A = the number of rows of B. 12.16.1

Example 

 0 −1 • Let A = (−1 0 1) and B =  1 2  1×3 −2 5 3×2 • Then AB is defined, but B A is not defined. 3×2 1×3

Calculate AB.



 0 −1 AB = (−1 0 1)  1 2  −2 5 = (−1 · 0 + 0 · 1 + 1 · −2

− 1 · 1 + 0 · 2 + 1 · 5)

= (−2 4)

0 1 −2

 −1 2  5

(−1 0 1) 

(−1 · 0 + 0 · 1 + 1 · −2

12.16.2



(−2

− 1 · 1 + 0 · 2 + 1 · 5)

Example  2 A = (1 3 9) , B =  1  . 7 

12.17 Transposition

213

Calculate AB and BA.



 2 (1 3 9)  1  = 2 + 3 + 63 = 68 1×3 7 3×1



2  1  7

1 3 9 1×3

3×1

  2 (1 3 9)  1  1×3 7



 2 6 18 =  1 3 9 . 7 21 63

2 + 3 + 63 = 68



3×1



 2  1  7

1×3

3×1

12.16.3

2  1 7

6 3 21

 18 9 . 63

Properties

For matrices of appropriate size (1) (AB)C = A(BC),

Associativity

(2) (A + B)C = AC + BC,

A(B + C) = AB + AC

Distributive Laws

Another unusual property of matrices is that AB = 0 does not imply A = 0 or B = 0. It is possible for the product of two non-zero matrices to be zero: 1 1 −1 1 0 0 = 2 2 1 −1 0 0 Also, if AC = BC, or CA = CB then it is not true in general that A = B. (If AC = BC, then AC − BC = 0 and (A − B)C = 0, but this does not imply A − B = 0 or C = 0.)

12.17

Transposition

The transpose of an m × n matrix A = (aij ) is the n × m matrix AT with entries aji = aTij ie.

    

······ ······

a1n a2n .. .

am1 am2 · · · · · ·

amn

a11 a21 .. .

a12 a22

T

for all i, j 

     =  

······ ······

am1 am2 .. .

a1n a2n · · · · · ·

amn

a11 a12 .. .

a21 a22

So the row vectors of A become column vectors of AT and vice versa.

    

12.17 Transposition

12.17.1 If A =

214

Examples of transpose 5 −8 1 4 0 0

,B=

1 −1 2 0

, C = (7 5 − 2) then find AT , B T , C T .



 5 4 5 −8 1 AT = =  −8 0  4 0 0 1 0 T 1 −1 1 2 T B = = 2 0 −1 0   7 C T = (7 5 − 2)T =  5  . −2

5 4

12.17.2

−8 0 1 2

1 0

−1 0

= (7 5 − 2)

 5 =  −8 1

 4 0  0

1 −1



 7 5 . −2

=

2 0

Properties

For matrices of appropriate size (1) (αA)T = α · AT

α∈R

(2) (A + B)T = AT + B T (3) (AT )T = A (4) (AB)T = B T AT 12.17.3

(not AT B T !).

Dot product expressed as matrix multiplication

A column vector



  v= 

v1 v2 .. . vn



   ∈ Rn 

may be interpreted as an n × 1 matrix. Then the dot product (for two column vectors v and w) may be expressed using matrix multiplication:   w1  w2    v · w = v1 w1 + v2 w2 + · · · + vn wn = (v1 v2 · · · vn ) .  ..   1×n wn n×1 = v T w.

12.17 Transposition

215

Also v 路 v = v12 + v22 + 路路路 + vn2 = kvk2 = v T v.

12.17 Transposition

Notes

216

13.2 Gaussian elimination

13 13.1

217

Gaussian elimination Simultaneous equations

One of the main applications of matrices is solving simultaneous equations of the form a11 x1 + a12 x2 + · · · + a1n xn a21 x1 + a22 x2 + · · · + a2n xn .. .

= b1 = b2 .. .

am1 x1 + am2 x2 + · · · + amn xn = bm . Here x1 , x2 , · · · · · · , xn are the variables (unknowns) and the coefficients aij and bi are given real numbers. This set of equations can be written using matrix multiplication as Ax = b 

  where x =  

x1 x2 .. .



(13.1) 

     is a column vector of unknowns, b =   

xn vector) and A = (aij ) is the m × n matrix of coefficients.

b1 b2 .. . bm



   is a given m × 1 matrix (column 

An efficient way of solving a system is by Gaussian elimination.

13.2

Gaussian elimination

We illustrate the process of Gaussian elimination. Given a set of equations, the idea is to perform operations, called elementary row operations, in order to simplify the equations to the point where the solution may be obtained easily. Solve the following system of equations: 2x1 + 5x2 + 7x3 = 25 x1 + 2x2 + 2x3 = 9 −2x1 + 11x3 = 16

13.2 Gaussian elimination

218

Swap the order of the first two equations: x1 + 2x2 + 2x3 = 9 2x1 + 5x2 + 7x3 = 25 −2x1 + 11x3 = 16 Subtract 2 times first equation from second: x1 + 2x2 + 2x3 = 9 x2 + 3x3 = 7 −2x1 + 11x3 = 16 Add 2 times first equation to third: x1 + 2x2 + 2x3 = 9 x2 + 3x3 = 7 4x2 + 15x3 = 34 Subtract 4 times second equation from third: x1 + 2x2 + 2x3 = 9 x2 + 3x3 = 7 3x3 = 6 Multiply third equation by 1/3: x1 + 2x2 + 2x3 = 9 x2 + 3x3 = 7 x3 = 2 Now we can solve by back substitution:

Third equation gives x3 = 2.

Substitute x3 into second: x2 + 6 = 7 so x2 = 1. Substitute x2 , x3 into first: x1 + 2 + 4 = 9 so x1 = 3.

Swap the order of the first two equations:

x1 2x1 −2x1

+ +

2x2 5x2

+ + +

2x3 7x3 11x3

= = =

9 25 16

2x2 x2

+ + +

2x3 3x3 11x3

= = =

9 7 16

Subtract 2 times first equation from second: −2x1 Add 2 times first equation to third: x1

2x2 x2 4x2

+ + +

2x3 3x3 15x3

= = =

9 7 34

Subtract 4 times second equation from third: x1

2x2 x2

+ +

2x3 3x3 3x3

= = =

9 7 6

2x2 x2

+ +

2x3 3x3 x3

= = =

9 7 2

Multiply third equation by 1/3:

Now we can solve by back substitution:

Third equation gives x3 = 2. Substitute x3 into second: x2 + 6 = 7 so x2 = 1.

Substitute x2 , x3 into first: x1 + 2 + 4 = 9 so x1 = 3.

13.2 Gaussian elimination

219

This process is called Gaussian elimination. We call this simplified set of equations x1 + 2x2 + 2x3 = 9 x2 + 3x3 = 7 x3 = 2 the reduced form of the equations. Once in reduced form the set of equations is easily solved by back substitution. In this example we repeatedly performed the following operations: (1) Interchange the order of two equations: swap the ith and j equations. (2) Multiply an equation through by a non-zero number α. (3) Add α times the jth equation to the ith equation. Performing these operations does not change the solution of the equations. But eventually we obtained such a simple system that we could easily find the solution. 13.2.1

Gaussian elimination using matrices

The above procedure may be described most conveniently in terms of matrices. The information present in a set of simultaneous equations A x = b m×n n×1

m×1

may be summarized by constructing a new matrix (A | b) A with one extra column, b. This is called the augmented matrix. Instead of manipulating equations we manipulate the corresponding augmented matrix. This way we keep writing to a minimum. The augmented matrix for the set of equations in the previous example 13.2 may be written



 25 9  16

2 5 7  1 2 2 −2 0 11  

2 1 −2

5 2 0

7 2 11

 25 9  16

Always keep in mind that each row of the augmented matrix represents an equation. The last column is the RHS. The other columns are the coefficients of the variables. Eg the last row above corresponds to −2x1 + 0x2 + 11x3 = 16. Another advantage of the augmented matrix notation is that if we want to solve two sets of equations with the same coefficient matrix but different RHS: Ax = b,

and Ay = c,

we can do both at once by forming the augmented matrix (A | b | c). See later.

13.2 Gaussian elimination

13.2.2

220

Elementary row operations (EROs)

We rewrite the three allowable operations on equations discussed above, in terms of their effect on the augmented matrix. Call the row vectors of the augmented matrix r 1 , r 2 , . . . etc. Since each row corresponds to an equation, the operations are: (1) Interchange rows i and j, written r i ←→ r j . (2) Replace r i with αr i , α ∈ R, α 6= 0, written r i

αr i .

(3) Replace r i with r i + αr j , α ∈ R, where r j is another row of A, written r i

r i + αr j .

These three operations are known as elementary row operations (EROs). In general we say that m × n matrices A, B are row equivalent, written A ∼ B, if and only if they are related by a series of EROs. To solve a system of equations: • Set up the augmented matrix (A | b) representing the system. • Apply EROs to (A | b) to reduce it to a simpler matrix. • Eventually the system is so simple that the solution may be read off. This is one of the most efficient procedures for solving linear equations. 13.2.3

Example

Redo the previous example 13.2, using row reductions. You should check that we are doing exactly the same calculations as before, just with less writing.

13.2 Gaussian elimination

221

The augmented matrix is:

2 5 7  1 2 2 −2 0 11

Swap the first two rows: r 1 ↔ r 2 .



Subtract 2 times row 1 from row 2: r 2

Add 2 times row 1 to row 3: r 3

1 3 r3 .

1 2 2  ∼ 2 5 7 −2 0 11

r 2 − 2r 1 .  1 2 2 ∼ 0 1 3 −2 0 11

r 3 + 2r 1 . 

Subtract 4 times row 2 from row 3: r 3

Multiply row 3 by 1/3: r3

 25 9  16



1 2 2 ∼ 0 1 3 0 4 15

r 3 − 4r 2 .  1 2 2  0 1 3 ∼ 0 0 3 

1 2 2 ∼ 0 1 3 0 0 1

 9 25  16  9 7  16

 9 7  34  9 7  6  9 7  2

This is called a Gauss reduced form, or row-echelon form. The solution can now be read off directly by back substitution as before. r 3 gives x3 = 2.

Substitute x3 into r 2 : x2 + 6 = 7 so x2 = 1. Substitute x2 , x3 into r 1 : x1 + 2 + 4 = 9 so x1 = 3.   Swap the first two rows: r1 ↔ r 2 .

r2 − 2r 1 .

r 3 + 2r 1 .

Subtract 4 times row 2 from row 3: r3

Multiply row 3 by 1/3: r3

1r . 3 3

r3 − 4r 2 .

7 2 11

 25 9  16

1 2 −2

2 5 0

2 7 11

 9 25  16



1 0 −2

2 1 0

2 3 11

 9 7  16

∼ Add 2 times row 1 to row 3: r 3

5 2 0



∼ Subtract 2 times row 1 from row 2: r2

2 1 −2

The augmented matrix is:

 1 ∼ 0 0

2 1 4

2 3 15

 9 7  34

 1 ∼ 0 0

2 1 0

2 3 3

 9 7  6

 1 ∼ 0 0

2 1 0

2 3 1

 9 7  2

This is called a Gauss reduced form, or row-echelon form. The solution can now be read off directly by back substitution as before. r 3 gives x3 = 2.

13.2 Gaussian elimination

13.2.4

222

The Gaussian algorithm

To solve x = b ,

m×n n×1

m×1

first set up the augmented matrix (A | b) . Then - Find a row with entry α 6= 0 in column 1. (If none go to column 2) - Multiply row by

1 α

to make leading entry 1

- Swap rows to bring the 1 to the top     

1 ∗ ∗ .. .

 • • • • • •  • • • 

- Subtract appropriate multiples of row 1 to clear out the first column   1 • • •  0 • • •    0 • • •   .. . First column is now done.

Repeat on second column: Find a row (not the first) with entry α 6= 0 in column 2. (If none exists, go on to third column). Make the leading entry 1, and clear out all the entries below this 1:   1 • • •  0 1 • •    0  0 • •   .. .

Then go on to the third column etc. 13.2.5

Gauss reduced form or row echelon form

When solving a system Ax = b, we can stop doing EROs on (A | b) once A is “simple enough” to find the solution by back substitution. This means we reduce (A | b) until: • All rows consisting entirely of 0’s are at the bottom. • The first non-zero entry (leading entry or pivot) of each row is 1, and appears in a column to the right of the leading entry of the row above it. • All entries in a column below a leading entry are zeros.

13.2 Gaussian elimination

223

The reduced matrix is called the Gauss reduced form (or row echelon form) of A. For example: 

1 0  0 0

• 1 0 0

• • 0 0

• • 1 0

 •

• •

•   •

•  0 •

The leading entries 1 (pivots) must be 1 and must have zeros below them. Not every column must have a pivot. Above, the third column has no pivot. The leading entries must form a “stair case” from top left to bottom right. (Note that a matrix A may have more than one Gauss reduced form. Eg the entries marked • above are not unique. It doesn’t matter which Gauss reduced form is used.) Tips: • Try to do row reductions that lead to the easiest algebra. • Try to avoid fractions. • Use row swaps to take advantage of existing 1’s in the matrix. 13.2.6

Example

Which of the following matrices are in Gauss reduced form (row echelon form)? 

 1 2 3 4 A =  0 0 1 4 , 0 0 0 1



 0 1 0 2 B =  0 0 1 −2  , 0 0 0 0



 0 1 0 2 C= 1 0 0 0  0 0 1 0

A, B are row-reduced. C is not because the leading 1 in the second column is not to the right of the leading 1 in the 1st column. A, B are row-reduced. C is not because the leading 1 in the second column is not to the right of the leading 1 in the 1st column.

13.2.7

Inconsistent Equations

Not every system of linear equations has a solution. Consider the following example. Solve

x1 + 2x2 − x3 = 4 2x1 + 7x2 + x3 = 14 3x1 + 8x2 − x3 = 17

13.2 Gaussian elimination

224

The augmented matrix is  1 2 −1  2 7 1 3 8 −1

   4 4 1 2 −1 3 − 6  (r 2 r 2 − 2r 1 ) 14  ∼  0 3 17 (r 3 0 2 2 −5 r 3 − 3r 1 )   1 2 −1 4 1  0 1 1 2  (r 2 ∼ 3 r2) 0 2 2 5   1 2 −1 4 ∼  0 1 1 2  (r 3 r 3 − 2r 2 ) 0 0 0 1

This is in row echelon form. The bottom line corresponds to:

0 · x1 + 0 · x2 + 0 · x3 = 1, which is impossible. Therefore there are no solutions to the system. 

1  2 3

2 7 8

−1 1 −1



4 14  17

∼



1  0 0

2 3 2

−1 3 2

4 −6  −5



1  0 0

2 1 2

−1 1 2

 4 2  5



2 1 0

−1 1 0

 4 2  1

1  0 0



(r2 (r3

The augmented matrix is

r 2 − 2r 1 ) r 3 − 3r 1 )

(r 2

1r ) 3 2

(r 3

r 3 − 2r 2 )

This is in row echelon form. The bottom line corresponds to: 0 · x1 + 0 · x2 + 0 · x3 = 1, which is impossible. Therefore there are no solutions to the system.

13.2.8

How to recognize an inconsistent system

The Gauss reduced form of the augmented matrix will have one or more rows of the form (0 0 . . . 0 | a) where a 6= 0. This leads to an impossible equation 0x1 + 0x2 + · · · + 0xn = a 6= 0. Geometrically, a linear equation in R2 represents a line. An equation in R3 represents a plane etc. Solving two equations in R2 thus corresponds to finding the intersection of two lines in R2 . Usually two lines in R2 have a unique point of intersection. But not always: parallel lines do not. In fact parallel lines do not intersect at all, so there are no solutions—unless the lines are right on top of each other, in which case they intersect in infinitely many points. Solving three equations in R3 corresponds to finding the intersection of three planes. Usually three planes have a unique point of intersection, but not always. Similarly for equations in more variables. This underlies the next two examples.

13.2 Gaussian elimination

225

Three planes meeting in a point

Three planes meeting in a line

13.2.9

More than one solution

A system of equations may have more than one solution. Solve the system of equations x1 + 3x2 − 2x3 = 5 −x1 + x2 − 2x3 = −9 2x1 + 4x2 − 2x3 = 12

13.2 Gaussian elimination

226

The augmented matrix is: 

1 3 −2  −1 1 −2 2 4 −2  1 1 r2)  0 4 0

∼ (r2

(Gauss reduced form). The last row is equivalent to

  1 3 5 −9  ∼  0 4 12 0 −2

 −2 5 −4 −4  2 2

  1 3 3 −2 5 1 −1 −1  ∼  0 1 −2 2 2 0 0

0 · x1 + 0 · x2 + 0 · x3 = 0,

(r 2 (r 3

r2 + r1) r 3 − 2r 1 )

 −2 5 −1 −1  (r 3 0 0

r3 +2r2 )

which gives no information.

Therefore we effectively have just 2 equations in 3 unknowns x1 + 3x2 − 2x3

= 5

x2 − x3

= −1

No leading 1 in the third column (non-pivotal). So set x3 = α arbitrary, and back substitute into r 2 and r 1 to give x2 x1

= −1 + x3 = −1 + α ,

= 5 − 3x2 + 2x3 = 8 − 3α + 2α = 8 − α

The augmented matrix is:



1 3 −2  −1 1 −2 2 4 −2 ∼ (r2

 1 1 r2)  0 4 0

(Gauss reduced form). The last row is equivalent to

  5 1 3 −9  ∼  0 4 12 0 −2

 −2 5 −4 −4  2 2

  3 −2 5 1 3 1 −1 −1  ∼  0 1 −2 2 2 0 0

0 · x1 + 0 · x2 + 0 · x3 = 0,

(r 2 (r 3

r2 + r1) r 3 − 2r 1 )

 −2 5 −1 −1  (r 3 0 0

r3 +2r2 )

which gives no information.

Therefore we effectively have just 2 equations in 3 unknowns x1 + 3x2 − 2x3 x2 − x3

= 5 = −1

No leading 1 in the third column (non-pivotal). So set x3 = α arbitrary, and back substitute into r 2 and r 1 to give x2 x1

= −1 + x3 = −1 + α , = 5 − 3x2 + 2x3 = 8 − 3α + 2α = 8 − α

13.3 The general solution

227

This means, for any choice of α ∈ R, a solution is:         x1 8−α 8 −1 x =  x2  =  −1 + α  =  −1  + α  1  . x3 α 0 1



 −8 The last equation is the equation of a line in R3 : x is given by a position vector p =  1  plus α 0   −1 times a vector v =  1  indicating direction. 1 Thus geometrically the three planes intersect along a common line rather than at a single point. Every solution x corresponds to a point on this line obtained by a particular choice of α. This means, for any choice of α ∈ R, a solution is:

        x1 8−α 8 −1 1 . x =  x2  =  −1 + α  =  −1  + α  1 x3 α 0 

   −8 −1 1  plus α times a vector v =  1  indicating direction. 0 1 Thus geometrically the three planes intersect along a common line rather than at a single point. Every solution x corresponds to a point on this line The last equation is the equation of a line in R3 : x is given by a position vector p =  obtained by a particular choice of α.

13.2.10

How to recognize non-unique solutions

The system is consistent, and the Gauss reduced matrix will have one or more columns without a pivot (leading entry). For example, if the second column does not have a leading 1 then there will never be an equation we can use to isolate x2 , so x2 will be a free variable: we will be able to assign it any value (α say), and still solve the system. In a big system we may have several free variables. Any system with more than one solution will have a free variable. Since this variable can take any value, the system will then have infinitely many solutions.

13.3 13.3.1

The general solution General form of solutions

In general a system of equations A

x = b

m×n n×1

m×1

will have one of the following • A unique solution • An infinite number of solutions • No solution. Which behaviour occurs can be seen from the Gauss reduced form of the matrix. We shall show that if there are an infinite number of solutions, they will all be of the form p + y where p is a particular solution and y is the part arising from the free variables. We shall show that y is a solution of Ax = 0. It is useful to introduce some notation.

13.3 The general solution

13.3.2

228

The Nullspace of a matrix

Let A be a matrix. The set of vectors x with Ax = 0 is called the nullspace of A, denoted NS(A). The nullspace always contains the zero vector, since A0 = 0. We shall show later that the nullspace either contains only the zero vector, or else contains infinitely many vectors. The linear system Ax = 0 is called a homogeneous system. (Homogeneous means “all the same”.) Thus the nullspace of A is the set of solutions of the homogeneous system. A linear system Ax = b with b 6= 0 is called inhomogeneous. 13.3.3

Example

   5 1 3 −2 Let A = −1 1 −2 and let b =  −9 . 12 2 4 −2 

(a)

Calculate NS(A).

Solve the system Ax = b.

(b)

For the nullspace we have  1  −1 2 ∼ (r 2

to solve Ax = 0.    3 −2 0 1 3 −2 0 1 −2 0  ∼  0 4 −4 0  r2 + r1) (r 2 4 −2 0 0 −2 2 0 (r 3 r 3 − 2r 1 )     1 3 −2 0 1 3 −2 0 r2  ) 0 1 −1 0  ∼  0 1 −1 0  (r 3 r 3 +2r 2 ) 4 0 −2 2 0 0 0 0 0

(Gauss reduced form). The bottom row gives no information. So x3 can be anything. Say x3 = α. The middle row says x2 − x3 = 0 so x2 = α. The top row gives x1 + 3x2 − 2x3 = 0, so x1 + 3α − 2α = 0 so x1 = −α.     −1   NS(A) = α  1  | α ∈ R .   1 For the nullspace we have to solve Ax = 0. 

1  −1 2

∼ (r2

3 1 4

 1 ) 0 4 0

−2 −2 −2

3 1 −2

  0 1 0  ∼  0 0 0 −2 −1 2

 0 0  0

3 4 −2

∼



 0 0  0

−2 −4 2

1  0 0

3 1 0

−2 −1 0

(r2 (r3

r2 + r 1 ) r3 − 2r 1 )

 0 0  0

(r3

r3 +2r2 )

(Gauss reduced form). The bottom row gives no information. So x3 can be anything. Say x3 = α. The middle row says x2 −x3 = 0 so x2 = α. The top row gives x1 +3x2 −2x3 = 0, so x1 + 3α − 2α = 0 so x1 = −α.     −1   NS(A) = α  1 |α∈R .   1

13.4 Gauss-Jordan elimination

229

Inhomogeneous case Ax = b. We did this in the previous example. We found         x1 8−α 8 −1 x =  x2  =  −1 + α  =  −1  + α  1  = p + y, x3 α 0 1   8  where p = −1  is a particular solution of Ax = b and y is any element in the nullspace NS(A). 0

Inhomogeneous case Ax = b. We did this in the previous example. We found         x1 8−α 8 −1 x =  x2  =  −1 + α  =  −1  + α  1  = p + y, x3 α 0 1   8 where p =  −1  is a particular solution of Ax = b and y is any element in the nullspace NS(A). 0

13.3.4

Relation between homogeneous & inhomogeneous systems

The solution of the homogeneous Ax = 0 and inhomogeneous Ax = b systems are related. Let A be a matrix, and consider the system (∗)

Ax = b.

Let p be a fixed solution of (∗). Then the complete set of solutions of (∗) is {x = p + y | y ∈ NS(A)}. Proof: If p is a solution and y ∈ NS(A) then A(p + y) = Ap + Ay = b + 0 = b. So every x of the form p + y is a solution. Conversely, if x is a solution of (∗) then Ax = b = Ap. Thus A(x − p) = Ax − Ap = b − b = 0. So let y = x − p ∈ NS(A). Then x = p + y has the stated form. 13.3.5

Summary

There are 3 possible behaviours for a system: 1. The system Ax = b may have no solutions. (Inconsistent case.) 2. If a solution to Ax = b exists and NS(A) = {0} then Ax = b has a unique solution. 3. If Ax = b has a solution x = p and NS(A) 6= {0} then there are infinitely many solutions: all x of the form x = p + y with y ∈ NS(A). Note that a homogeneous system is always consistent (case (1) cannot apply) because the 0 vector is always a solution.

13.4

Gauss-Jordan elimination

Gauss-Jordan (or simply Jordan) reduction involves performing more EROs on the Gauss reduced form of a matrix until it has the additional property that each leading 1 is the only non-zero entry in its column. Solve

2x1 − 3x2 − x3 = 1 x1 − 2x2 − 3x3 = 2 −2x1 + 2x2 − 5x3 = 3

13.4 Gauss-Jordan elimination

13.4.1

230

Example of Gauss-Jordan Reduction

This is example 13.2.3 again. The augmented matrix is:    1 −2 −3 2 −3 −1 1  1 −2 −3 2  ∼  0 1 5 −2 2 −5 3 0 0 1 This is the Gauss reduced form. But  1  ∼ 0 0 r1

we can continue doing row   0 7 −4 1 0   1 5 −3 0 1 ∼ 0 1 −1 0 0 r1 + 2r 2

r1 r2

 2 −3  −1

operations.  3 0 2  0 1 −1

r 1 − 7r 3 r 2 − 5r 3

This is the Jordan reduced form. We can read off the solution directly: x1 = 3, x2 = 2, x3 = −1. 13.4.2

The difference between Gauss and Gauss-Jordan reduction

Gauss reduction leads to a matrix with 0’s below the leading 1’s, such as 

1  0   0 0

∗ 1 0 0

∗ ∗ 0 0

 ∗ ∗  1 0

while Gauss-Jordan reduction results in a matrix with 0’s above and below the leading 1’s such as 

1  0   0 0

0 1 0 0

0 0 0 0

 0 0 . 1 0

Jordan form has the advantage that the solution can be read off directly without any back substitution. However Gaussian reduction is usually preferred for solving systems. Gauss-Jordan reduction is better if we want to solve many systems Ax = b, Ax = c etc with the same coefficient matrix A, since it eliminates the need to do multiple back substitution steps. This is useful in the next chapter.

13.4 Gauss-Jordan elimination

Notes

231

14.1 The Identity Matrix

232

Inverses

For the rest of the course, the matrices will be square.

14.1

The Identity Matrix

An n × n matrix (m = n) is called a square matrix of order n. The diagonal containing the entries a11 , a22 , · · · , ann is called the main diagonal (or principal diagonal) of A. If the entries above this diagonal are all zero then A is called lower triangular. If all the entries below the diagonal are zero, A is called upper triangular     a11 · · · · · · a1n a11 0 0 ··· 0    .. ..   0 a22  . a22 0 · · · .  0          . . . . . . . .   0  . . . 0 .        ..  .. .. ..  .. .. ..  .  . . . . . .  0  0 0 ··· 0 ann an1 · · · · · · ann upper triangular lower triangular If elements above and below the principal diagonal are zero, so aij = 0 ,

i 6= j

then A is called a diagonal matrix. Note that if A is diagonal then A = AT . 14.1.1

Example



 1 0 0 A =  0 −3 0  is a diagonal 3 × 3 matrix. 0 0 2 14.1.2

Definition: Identity matrix

The n × n identity matrix I = In , is the diagonal matrix whose entries are all 1:   1 0 0 ··· 0  0 1 0 ··· 0      I =  0 0 1 ··· 0 .  .. . . . . ..   .  0 0 ··· ··· 1 14.1.3

Comments on the identity matrix

Recall from linear transformations of R2 : Ai is the first column of A and Aj is the second column. For example, for the 2 × 2 case we have a11 a12 1 0 a11 a12 AI = = = A = IA. a21 a22 0 1 a21 a22

14.2 Definition: Inverse

233

In general the jth column vector of I is the coordinate vector   0  ..   .     0     ej =   1  ← j.  0     ..   .  0

Thus



  =  

Aej



  =  

a11 a21 .. .

··· ···

an1 · · · a1j a2j .. . anj

    

··· ···

a1n a2n .. .

anj · · ·

ann

a1j a2j .. .



 0 .    ..    0      ←j  1     0    ..   .  0

jth column = vector of A.

So AI = A for all A. Replacing A by AT , AT I = AT also. Transpose both sides: A = (AT )T = (AT I)T = I T (AT )T = IA, so IA = A for all A. We have proved that for all square matrices A IA = AI = A.

14.2

Definition: Inverse

Let I denote the identity matrix. A square matrix A is invertible (or non-singular ) if there exists a matrix B such that AB = BA = I. Then B is called the inverse of A and is denoted A−1 . A matrix that is not invertible is also said to be singular. 14.2.1

Inverse for the 2 × 2 case

Let A be a 2 × 2 matrix A = AB =

a b c d

. Set B = d −b −c a

d −b −c a

. Then

ad − bc 0 0 ad − bc

= BA.

14.2 Definition: Inverse

234

Let ∆ = ad − bc. If ∆ 6= 0, then

1 1 B= ∆ ∆

A−1 =

d −b −c a

We call ∆ the determinant of A. We will consider determinants in more detail later (chapter 15). The problem of obtaining inverses of larger square matrices will also be considered later (section 14.2.5). 14.2.2

Example

Find the inverse matrix of A =

∆ = 1 × 5 − 2 × 3 = −1.

1 2 3 5

−1

⇒A

. Check your answer.

1 = −1

5 −2 −3 1

−5 2 3 −1

Check whether AA−1 = A−1 A = I: Multiplying the two matrices A and A−1 gives 1 2 −5 2 1 0 = , 3 5 3 −1 0 1 −5 2 1 2 1 0 = 3 −1 3 5 0 1 So A is invertible with inverse

∆ = 1 × 5 − 2 × 3 = −1.

−1

⇒A

1 3

2 5

−5 3

1 −1

−5 2 3 −1 −2 1

5 −3

−5 3

2 −1

Check whether AA−1 = A−1 A = I: Multiplying the two matrices A and A−1 gives 2 −1

So A is invertible with inverse

14.2.3

Example

Show that A =

1 2 3 6

−5 3

is not invertible.

2 −1

2 5

1 3

−5 3

2 −1

1 0

0 1

1 0

0 1

14.2 Definition: Inverse

235

Suppose A has inverse

Then

a b . c d

1 2 a b 1 0 = ⇒ 3 6 c d 0 1

 a + 2c = 1    b + 2d = 0 3a + 6c = 0   3b + 6d = 1

(1) (2) . (3) (4)

Hence 3 · (1) − (3) ⇒ 0 = 3. This is impossible, so A can’t have an inverse. Note that ∆ = 6 − 2 · 3 = 0 in this case. Later we shall see that a square matrix is invertible exactly when its determinant is not 0. pose A has inverse 1 Then 3

a 2 c 6

a c

b d

1 = 0

0 1

 a + 2c = 1    b + 2d = 0 3a + 6c = 0   3b + 6d = 1

⇒

(1) (2) . (3) (4)

Hence 3 · (1) − (3) ⇒ 0 = 3. This is impossible, so A can’t have an inverse.

Note that ∆ = 6 − 2 · 3 = 0 in this case.

Later we shall see that a square matrix is invertible exactly when its determinant is not 0.

14.2.4

Example: Inverse of a 3 × 3 Matrix



   2 −3 −1 16 −17 7 Let A =  1 −2 −3  and B =  11 −12 5 . Show that B = A−1 . −2 2 −5 −2 2 −1 Just multiply the two matrices together.      2 −3 −1 16 −17 7 1 0 0 AB =  1 −2 −3   11 −12 5  =  0 1 0 . 0 0 1 −2 2 −5 −2 2 −1

And, similarly

    16 −17 7 2 −3 −1 1 0 0 BA =  11 −12 5   1 −2 −3  =  0 1 0  . 0 0 1 −2 2 −1 −2 2 −5 

Just multiply the two matrices together.



2 1 −2

−3 −2 2



16 11 −2

−17 −12 2

AB = 

 −1 16 −3   11 −5 −2

−17 −12 2

  7 1 5 = 0 −1 0

0 1 0

 0 0 . 1

−3 −2 2

  −1 1 −3  =  0 0 −5

0 1 0

 0 0 . 1

And, similarly BA = 

14.2.5

 7 2 5  1 −1 −2

Inverses using Gauss–Jordan reduction

We can now give a systematic way of finding inverses of matrices, using Gauss–Jordan reduction.

Sup-

14.2 Definition: Inverse

236

Let A be an invertible n × n matrix. We want to find the inverse X of A. We try to solve AX = I. 14.2.6

Example 

   2 −3 1 x1 y1 z1 Above A =  1 −2 −3 . We want a matrix X =  x2 y2 z2  with AX = I. −2 2 5 x3 y3 z3

           x1 y1 z1 1 0 0 Let x =  x2 , y =  y2 , z =  z2 . We need Ax =  0 , Ay =  1 , Az =  0 . 0 0 1 x3 y3 z3 

So Ax = i, Ay = j, Az = k. We can solve these 3 systems by augmenting A with all 3 RHS vectors: (A | i j k). That is, we form (A | I). We use Gauss-Jordan reduction to solve the 3 systems at once and find x, y, z. 

 2 −3 −1 Let A =  1 −2 −3 . Calculate A−1 . (This is example 14.2.4). −2 2 −5

14.2 Definition: Inverse

237

Perform Gauss-Jordan reduction on the augmented matrix (A | I):     1 −2 −3 0 1 0 2 −3 −1 1 0 0  1 −2 −3 0 1 0  ∼  2 −3 −1 1 0 0  −2 2 −5 0 0 1 −2 2 −5 0 0 1 r 1 ←→ r 2  1 0 1 −2 −3 0  0 1 5 1 −2 0  2 1 0 −2 −11 0 r 2 − 2r 1 r2 r3 r 3 + 2r 1 

 2 −3 0 1 0 7  0 1 5 1 −2 0  0 0 1 −2 2 −1 r3 −r3 

So A is invertible, and A−1 Compare this to 14.2.4.



 16 −17 7 =  11 −12 5 . −2 2 −1



 1 0 7 2 −3 0 5 1 −2 0  ∼  0 1 0 0 −1 2 −2 1 r1 r 1 + 2r 2 r3 r 3 + 2r 2 

 16 −17 7 1 0 0 11 −12 5  ∼  0 1 0 0 0 1 −2 2 −1 r1 r 1 − 7r 3 r 2 − 5r 3 r2

Perform Gauss-Jordan reduction on the augmented matrix (A | I):   

1  0 0



1  0 0



So A is invertible, and A−1 = 

Compare this to 14.2.4.

16 11 −2

−3 −2 2

2 1 −2

−17 −12 2

−2 1 −2

0 1 0

−1 −3 −5

1 0 0

0 1 0

0 1 −3 1 −2 5 −11 0 2 r2 − 2r 1 r2 r3 r 3 + 2r 1 7 5 1 r3

2 −3 1 −2 −2 2 −r3

 0 0  1  0 0  1

 0 0  −1



∼ 

0 −2 −3 −3 −1 1 0 2 −5 r 1 ←→ r 2

1 2 −2

 1 ∼  0 0

1 0 0

2 −3 0 7 1 −2 1 5 0 −1 2 −2 r1 r1 + 2r 2 r3 r3 + 2r 2 0 1 0

0 0 1 r1 r2

16 −17 11 −12 −2 2 r1 − 7r 3 r2 − 5r 3

 0 0  1  0 0  1

 7 5  −1

 7 5 . −1

In general, let xj be the jth column of the matrix X, and recall that ej is the jth column of I. Breaking the previous matrix equation AX = I into one equation for each column gives rise to n equations Ax1 = e1 , Ax2 = e2 , . . . , Axn = en .

(14.1)

We can regard each of these as a system to be solved. As we noted when introducing the augmented matrix, we can solve more than one system of equations at once by forming a larger augmented matrix. Eg Ax = b, Ax = c 99K (A | b c) Thus we may solve (14.1) by applying Gauss–Jordan elimination to the augmented matrix (A | e1 e2 , . . . , en ) = (A | I). Reduce until we reach (I | X).

The solution X can now be read off, so X = A−1 is the inverse of A. If this reduction cannot be completed, A has no inverse.

14.3 Inverses via Gauss–Jordan. Summary

14.3

238

Inverses via Gauss–Jordan. Summary

• If A is n × n, to find A−1 : • Set up augmented n × 2n matrix (A | I). • If this can be reduced to (I | X) then X = A−1 . • If it cannot be reduced, then A is not invertible. 14.3.1

Invertibility and the nullspace

Above to find A−1 we found X with AX = I. However for X to be the inverse of A, the definition of the inverse matrix requires XA = I also. We shall show the surprising fact that if AX = I then XA = I automatically, for square matrices. We need a little theory. Let A be an n × n matrix. We shall find several properties equivalent to A being invertible. Consider the following statements: 1. A is invertible. 2. NS(A) = {0}. 3. For every n × 1 vector b the system Ax = b is solvable, with a unique solution x. 4. There exists an n × n matrix B with AB = I. We first show that (1) =⇒ (2) =⇒ (3) =⇒ (4): Proof: (1) =⇒ (2) Suppose x ∈ NS(A). Thus Ax = 0. Multiply both sides by A−1 to see that x = 0, so NS(A) = {0}. (2) =⇒ (3) If NS(A) = {0} then when we solve variables. Row reducing must give  1  0   (A | 0) ∼  0  ..  . 0 with a leading entry in every column.

Ax = 0 the only solution is x = 0, so there are no free ∗ ∗ ··· 1 ∗ ··· 0 1 ··· .. .. .. . . . 0 0 ···

 ∗ 0 ∗ 0   ∗ 0  , .. ..  . .  1 0

Now consider solving Ax = b. The row reductions on A are changes. So we get  1 ∗ ∗ ··· ∗  0 1 ∗ ··· ∗   (A | b) ∼  0 0 1 · · · ∗  .. .. .. .. ..  . . . . . 0 0 0 ··· 1

the same. Only the augmented column • • • .. . •

This will have also have a unique solution x, by back substitution.



   .  

(3) =⇒ (4) We can solve Ab1 = e1 , Ab2 = e2 , . . . where e1 , e2 , . . . are the standard coordinate vectors. Let B be the n × n matrix with columns b1 , b2 , . . . The first column of AB will be Ab1 = e1 . The second column will be Ab2 = e2 etc. Thus AB = I.

14.3 Inverses via Gauss–Jordan. Summary

239

It is not surprising that (1) =⇒ (4): (4) is the statement AB = I, whereas (1) is the statement AB = I and BA = I. Amazingly, once AB = I, we shall show that BA is forced to be I (for square matrices); see below. This means (4) =⇒ (1) also. FACT: If C and D are n × n matrices and CD = I then DC = I also.

Consider the nullspace of D. Let x be any element in NS(D), so Dx = 0. Multiply both sides by C on the left to get (CD)x = C(Dx) = C0 = 0. But CD = I (given), so Ix = 0 so x = 0. That is NS(D) = {0}.

Thus property (2) holds for D. We have seen that (2) =⇒ (4), so (4) holds for D. That is, there exists an n × n matrix B with DB = I. So C = CI = C(DB) = (CD)B = IB = B, that is, B = C. So DC = DB = I. Consider the nullspace of D. Let x

be any element in NS(D), so Dx = 0. Multiply both sides by C on the left to get (CD)x = C(Dx) = C0 = 0. But CD = I (given), so Ix = 0 so x = 0. That is NS(D) = {0}.

Thus property (2) holds for D. We have seen that (2) =⇒ (4), so (4) holds for D. That is, there exists an n × n matrix B with DB = I. So C = CI = C(DB) = (CD)B = IB = B, that is, B = C. So DC = DB = I.

We have proved: 14.3.2

Properties equivalent to invertibility

Let A be an n × n matrix. The following statements are all equivalent: 1. A is invertible.

14.3 Inverses via Gauss–Jordan. Summary

240

2. NS(A) = {0}. 3. For every n × 1 vector b the system Ax = b is solvable, with a unique solution x. 4. There exists an n × n matrix B with AB = I. If any one of these statements holds, the other three also hold. If any one fails the other three also fail. 14.3.3

Special case: n equations in n unknowns

Consider the system Ax = b. In the special case that A is n × n and invertible, we may solve this system by multiplication on the left by A−1 to give −1 −1 −1 |A {z A} x = A b ⇒ unique solution x = A b. I

This is another demonstration that (1) =⇒ (3) above: If A is invertible, the system Ax = b has a unique solution for each b.

This is often inaccurately remembered as “n equations in n unknowns have a unique solution”. This statement is only true if A is invertible. The method of multiplying by A−1 only works if A is square and A is invertible. It is also more work to find A−1 than to solve the system using Gaussian elimination. In general, use Gaussian elimination to solve systems. 14.3.4

Properties of inverses

(1) A has at most one inverse (2) If A, B are invertible so is AB, and (AB)−1 = B −1 A−1

(not A−1 B −1 !)

(3) A is invertible if and only if AT is invertible (4) If A is invertible then (AT )−1 = (A−1 )T . 14.3.5

Proof

(1) Suppose B, C are both inverses of A, so AB = BA = I

AC = CA = I.

Then B = BI = B(AC) = (BA)C = IC = C. Therefore the inverse, if it exists, is unique. (2) Suppose A, B have inverses A−1 , B −1 respectively. We have to show that AB has inverse B −1 A−1 . So we just multiply AB by B −1 A−1 , and check that we get the identity matrix I: (AB)B −1 A−1 = A(BB −1 )A−1 = AIA−1 = AA−1 = I

14.3 Inverses via Gauss–Jordan. Summary

241

and similarly (B −1 A−1 )(AB) = I. Therefore AB is invertible with inverse (AB)−1 = B −1 A−1 . (3) and (4) are an exercise:

We prove (4). Assume that A is invertible. We have to show that the inverse of AT is (A−1 )T . To show that one matrix is the inverse of the other, we just multiply them together and check we get the identity. We will need the fact that (AB)T = B T AT . Setting B = A−1 in this, we have (A−1 )T AT = (AA−1 )T = I T = I. Similarly AT (A−1 )T = (A−1 A)T = I T = I. Therefore if A is invertible then AT is invertible with inverse (AT )−1 = (A−1 )T This also proves half of (3), namely A invertible implies A−1 invertible. To prove the converse, just replace A with AT in the above calculation. This concludes the proof of both (3) and (4). We prove (4). Assume that A is invertible. We have to show that the inverse of AT is (A−1 )T . To show that one matrix is the inverse of the other, we just multiply them together and check we get the identity. We will need the fact that (AB)T = B T AT . Setting B = A−1 in this, we have −1 T

) A

−1 T

= I.

= (AA

)

Similarly T

−1 T

A (A Therefore if A is invertible then AT is invertible with inverse

)

−1

= (A

T −1

(A )

−1 T

= (A

)

This also proves half of (3), namely A invertible implies A−1 invertible. To prove the converse, just replace A with AT in the above calculation. This concludes the proof of both (3) and (4).

14.3 Inverses via Gaussâ&#x20AC;&#x201C;Jordan. Summary

Notes

242

15.1 Definition: Determinant

243

Determinants

15.1

Definition: Determinant

Associated to each square matrix A is a number called the determinant of A and denoted |A| or det(A). It is defined as follows. • If A is a 1 × 1 matrix, say A = (a) then |A| is defined to be a. • If A = (aij ) is 2 × 2 then we define

a a |A| =

11 12 a21 a22

We already encountered this in 14.2.1.

• In general if A = (aij ) is n × n, first set

a11 a12

a21 a22

.. ..

. .

Cij = (−1)i+j

ai−1 1 ai−1

ai+1 1 ai+1

. ..

.. .

an1 an2

= a11 a22 − a12 a21 .

.... a1 .... a2 .. . 2 2

j−1 j−1

.... ai−1 .... ai+1 .. . .... an

j−1 j−1

j−1

a1 a2 .. .

j+1 j+1

ai−1 ai+1 .. . an

j+1 j+1

j+1

.... a1n .... a2n .. . .... ai−1 .... ai+1 .. . .... ann

n n

called the cofactor of aij . The (n − 1) × (n − 1) determinant is obtained by omitting the ith row and jth column from A (indicated by the horizontal and vertical lines in the matrix). We then define |A| = a11 C11 + a12 C12 + · · · · · · + a1n C1n which gives a recursive definition of the determinant.

Observe that

(−1)i+j



  gives the pattern   

 + − + − · · · · − + − + · · · ·   + − + − · · · ·    · ··

Thus for a 3 × 3 matrix A we have

a11 a12 a13

|A| =

a21 a22 a23

a31 a32 a33

where

C11

a a =

22 23 a32 a33

C12

= a11 C11 + a12 C12 + a13 C13

a a = −

21 23 a31 a33

C13

a a =

21 22 a31 a32

15.2 Properties of Determinants

15.1.1

244

Example

Find the cofactors C13 and C23 of the matrix



 3 5 7 1 0 2 . 0 3 0

1+3

C13 = (−1)

2+3

C23 = (−1)

15.1.2

M13

M23

1 0

= 3,

=1·

0 3

3 5

= −9. = −1 ·

0 3

C13 = (−1)

1+3

1 M13 = 1 ·

C23 = (−1)

2+3

3 M23 = −1 ·

= 3, 3

= −9. 3

Example

3 5 7

Calculate the determinant

1 0 2

0 3 0

3 5 7

1 0 2

0 3 0

= (1) · (3) · 0 2

3 0

+ (−1) · (5) · 1 2

0 0

= 3(0 − 6) − 5(0 − 0) + 7(3 − 0)

+ (1) · (7) · 1 0

0 3

= 3.

15.2

5 0 3

7 2 0

0 (1) · (3) ·

1 2

+ (−1) · (5) ·

0 0

3(0 − 6) − 5(0 − 0) + 7(3 − 0)

1 2

+ (1) · (7) ·

Properties of Determinants

Property (1): |A| = AT

Consider for example the 2 × 2 case. Then a11 a12 A= a21 a22

⇒

A =

a11 a21 a12 a22

15.2 Properties of Determinants

245

Thus |A| = a11 a22 − a12 a21 = AT . It can be shown that this holds for any square matrix, not just in the 2 × 2 case.

This means that any results about the rows in a general determinant is also true about the columns (since the rows of AT are the columns of A). In particular, any statement about the effect of row operations on determinants is also true for column operations.

Warning: We used row operations to simplify systems of equations, because they do not change the solution. Column operations may change the solution of linear systems, so we should not use column operations on such systems. Property (2): The determinant may by taking cofactors along any row (not just the first) or down any column. Eg. for a 3 × 3 matrix A |A| = a11 C11 + a12 C12 + a13 C13 = a21 C21 + a22 C22 + a23 C23 = a13 C13 + a23 C23 + a33 C33

(definition, expansion along 1st row) (expansion along 2nd row) (expansion down 3rd column) etc.

This is useful if one row or column contains a larger number of zeros. 15.2.1

Examples

3 5 7

(i) Calculate

1 0 2

by (a) expanding down the third column, (b) expanding along the third row.

0 3 0

Which is more efficient?

3 5 7

1 0 2

0 3 0

= 7 1 0 − 2 3 5 = 21 − 18 = 3

0 3

3 7

= −3(6 − 7) = 3 = −3

1 2

(ii) Find

5 0 3

7 2 0

3 0

= 7

− 2

= 21 − 18 = 3 0 3

0 3

3 7

= −3

= −3(6 − 7) = 3 1 2

3 6 9

0 0 0

2 4 5

exp. down 3rd. col. exp. along r3

exp. down 3rd. col.

exp. along r3

15.2 Properties of Determinants

246

3 6 9

0 0 0

2 4 5

= 0 · C21 + 0 · C22 + 0 · C23

= 0

6 0 4

9 0 5

(expanding along r 2 )

0 · C21 + 0 · C22 + 0 · C23

(expanding along r 2 )

In general the determinant of any matrix with a row or column of zeros is equal to zero. 15.2.2

Effect of Elementary Row Operations on Determinants

The calculation of large determinants from the definition is very tedious. It is much easier to calculate the determinant after row reducing a matrix, because the reduced matrix will contain many zeros. However, performing elementary row operations changes the value of a determinant. Luckily the changes are simple: Property (3): A common factor in all entries of a row (or column) can be taken out as a multiplier of the determinant. 15.2.3

Example

3 6 9

0 1 2

3 4 5

= 3C11 + 6C12 + 9C13

(expanding along r 1 )

= 3[C11 + 2C12 + 3C13 ]

1 2 3

= 3 ·

0 1 2

3 4 5

(check!)

Property (4): If 2 different rows (or columns) of A are interchanged (r i ←→ rj or ci ←→ cj ), |A| changes sign. 15.2.4

Example

3 5 7

We have seen

1 0 2

0 3 0

= 3. Interchanging columns 2 and 3 gives

3 7 5

1 2 0

0 0 3

= 3 3 7 (expanding along r 3 )

1 2

= −3.

Let A be a matrix, and let B be the matrix obtained by interchanging two rows or columns. Then det(A) = − det(B). If A has two equal rows (or columns) and these are interchanged then A = B so det(A) = − det(B) = − det(A) so det(A) = 0. Hence:

15.2 Properties of Determinants

247

If A has two equal rows or columns then det(A) = 0. Property (5): If a multiple of one row (or column) is added to another (r i the determinant is unchanged.

r i +αr j or ci

ci +αcj ),

The above 5 properties give a shortcut for evaluating determinants. We can use a sequence of elementary row operations (and/or column operations) to produce a matrix with a row (or column) with at most one non-zero entry. We then expand along this row (or column) to obtain a smaller determinant, and repeat. Important: row (and column) operations can change the value of the determinant, so we must always keep track of which operations we perform. (Unlike doing Gaussian elimination.) 15.2.5

Example

Calculate the determinant

2 1 3 0

5 2 2 2

6 1

3 −2

. 4 4

4 6

5 2 2 2

6 1

0 1 0 5

(r 1 3 −2

1 2 3 −2

= 4 4

0 −4 −5 10

(r 3 4 6 0 2 4 6

2 1 3 0

5 2 2 2

6 3 4 4

1 −2 4 6

= 1

1 2 −4 2

0 5

= (−1) · −4 −5 10

2 4 6

1 0 0

= (−1) · −4 −5 30

2 4 −4

−5 30 = −

4 −4

5 −2 10 6

r 3 − 3r 2 )

(r1

r1 − 2r 2 )

(r3

r3 − 3r 2 )

(expanding down column 1)

(c3

c3 − 5c1 )

(expanding along r 1 )

−1 6 = (−5) · 4 ·

1 −1

= (−1) ·

−4

= (−1) ·

−4

−5 = −

0 3 −5 4

r 1 − 2r 2 )

0 −5 4 0 −5 4

−4

−1 = (−5) · 4 ·

= −20(1 − 6) = 100.

5 10 6

0 30 −4

(expanding down column 1)

(c3

c3 − 5c1 )

(expanding along r 1 )

= −20(1 − 6) = 100. −1

15.2 Properties of Determinants

248

Using the definition with cofactors, to calculate an n × n determinant, we have to find n cofactors, each a (n − 1) × (n − 1) determinant. For each of these we need to find (n − 1) cofactors of size (n − 2) × (n − 2) etc. The total number of calculations is about n · (n − 1) · (n − 2) · · · 3 · 2 = n! For a 25 × 25, this is about 1025 operations. This would take thousands of years, even on a computer. Doing row reductions, only a few thousand calculations are required, which could be done in a fraction of a second on a computer. 15.2.6

Example

1 1 1

Evaluate

a b c

. This matrix is called Vandermonde’s matrix, and it occurs in applications such

a2 b2 c2

as signal processing, error-correcting codes, and polynomial interpolation.

1 1 1 1

1 1

a b c = 0 b−a r 2 − ar1 ) c − a

(r 2

a2 b2 c2 0 b2 − a2 c2 − a2 (r r 3 − a2 r 1 ) 3

b−a c − a

= 2 (expand down first column) b − a2 c2 − a2

Now we extract a common factor from each of columns 1 and 2 giving

1 1

= (b − a)(c − a)

b+a c+a

= (b − a)(c − a)[c + a − (b + a)] = (b − a)(c − a)(c − b).

1 b b2

= 0

0 c2

1 c

b−a =

2 b − a2

1 b−a b2 − a2

c − a

c2 − a 2

1 c−a c2 − a 2

(r 2

(r 3

(expand down first column)

Now we extract a common factor from each of columns 1 and 2 giving

15.2.7

(b − a)(c − a)



a11

···

  a21 a22 0  . .. Suppose A =   .. .   an1 · · · · · ·

···

0 .. . .. . 0 ann

1 b+a

c+a

(b − a)(c − a)[c + a − (b + a)]

(b − a)(c − a)(c − b).

Determinants of Triangular Matrices        

r 2 − ar1 ) 2

r 3 − a r1 )

15.3 Connection with inverses

249

is lower triangular. Then by repeated expansion along r 1 we obtain

a22

= . . . = a11 a22 · · · · · ann , |A| = a11 .

an2 · · · · · · ann

which is the product of the diagonal entries. The same result holds for upper triangular and diagonal matrices. In particular |I| = 1.

15.3

Connection with inverses

An important property of determinants is the following. 15.3.1

Fact (product of determinants)

Let A, B be n × n matrices. Then

15.3.2

|AB| = |A| · |B| .

Theorem: Invertible matrices

A is invertible ⇐⇒ |A| = 6 0. 15.3.3

Proof

=⇒ If A is invertible, then ⇒ ⇐= Follows from 15.3.4 (see below).

I = AA−1

1 = |I| = AA−1 = |A| · A−1 , So |A| = 6 0. =⇒ If A is invertible, then

⇒

I = AA−1

1 = |I| = AA−1 = |A| · A−1 , So |A| = 6 0.

⇐= Follows from 15.3.4 (see below).

Remark: It follows immediately from the proof that if A is invertible, then |A−1 | =

1 , |A|

i.e., the inverse of the determinant is the determinant of the inverse.

15.3 Connection with inverses

15.3.4

250

A Theoretical Formula

There is an explicit formula for the inverse of A. If Cij is the cofactor of the i, j entry of A, let C be the matrix of cofactors C = (Cij ). Then 1 T C . A−1 = |A| Here C T is called the adjoint matrix of A. This formula is useful for some theoretical problems, but is useless as a tool for calculations because it is so time consuming to calculate all the cofactors. Never use this formula for calculating inverses! 15.3.5

Example

In the 2 × 2 case we have and co-factors

C11 = a22 ∴

a a |A| =

11 12 a21 a22 ,

C12 = −a21

|A| = 6 0 ⇒ A−1 =

1 |A|

in agreement with a previous result, 14.2.1.

= a11 a22 − a12 a21

C21 = −a12

C11 C21 C12 C22

1 |A|

C22 = a11

a22 −a12 −a21 a11

15.3 Connection with inverses

Notes

251

16.1 Definition: Cross product

252

Vector Products in 3-Space

16.1

Definition: Cross product

For vectors v, w in R3 , the cross (or vector) product of v and w is a vector v × w such that (i) kv × wk = kvk · kwk sin θ, where θ is the angle between v and w (ii) v × w is perpendicular to v and w, in right hand direction from v× w (right hand rule).

v xw

Figure 57: The vector v × w is perpendicular to both v and w in R3 16.1.1

Application: angular momentum

Vector products also arise naturally in physics and engineering eg. a particle of unit charge moving with velocity v in a magnetic field B experiences a force F = v × B. The angular momentum of a particle of mass m moving with velocity v is given by L = m(r × v) where r is the position vector of the particle. This is important for central force problems. 16.1.2 (1)

Notes on the cross product

For v, w 6= 0

v × w = 0 ←→ sin θ = 0 ←→ θ = 0 or 180◦ ∴ (2) (3)

v × w = 0 ←→ v and w are parallel (or anti-parallel).

v × w = −w × v since they have the same magnitude but opposite direction v × v = 0, since sin 0 = 0.

(4) If v × w 6= 0, then v × w is perpendicular to the plane determined by v and w. This is useful for obtaining the equation of a plane in R3 (see Math 1052). 16.1.3

Example i×i = j×j = k×k = 0 i×j = k = −j × i j × k = i = −k × j k×i = j = −i × k.

16.1 Definition: Cross product

253

Figure 58: This diagram is an aid in remembering the cross product between unit vectors. 16.1.4

Properties of cross product

It can be shown that

v × (w + u) = v × w + v × u (v + w) × u = v × u + w × u v × (αw) = (αv) × w = α(v × w)

α ∈ R.

We have seen that v × w = −w × v so the commutative law fails. Also we do not have the associative law: ie. in general v × (w × u) 6= (v × w) × u For example, (i × i) × k = 0 × k = 0, but i × (i × k) = i × (−j) = −k. Using the above results we can express the cross product of two vectors in terms of their components. 16.1.5

Theorem: Calculating the cross product 

 v1 Suppose v =  v2  , v3

ie.



 w1 w =  w2 . Then w3

v × w = (v2 w3 − v3 w2 )i + (v3 w1 − v1 w3 )j + (v1 w2 − v2 w1 )k

j k

v1 v3

v v 2 3

j + v1 v2

i−

v × w =

v1 v2 v3

w1 w2

w1 w3 w2 w3

w1 w2 w3

can be evaluated by expanding above determinant along the first row.

16.1 Definition: Cross product

16.1.6

254

Proof

This follows directly by expanding: v × w = (v1 i + v2 j + v3 k) × (w1 i + w2 j + w3 k) = v1 i × (w1 i + w2 j + w3 k)

+ v2 j × (w1 i + w2 j + w3 k)

+ v3 k × (w1 i + w2 j + w3 k) = v1 w2 (i × j) − v1 w3 (k × i)

− v2 w1 (i × j) + v2 w3 (j × k)

+ v3 w1 (k × i) − v3 w2 (j × k) = (v2 w3 − v3 w2 )i − (v1 w3 − v3 w1 )j + (v1 w2 − v2 w1 )k.

This follows directly by expanding: v × w = (v1 i + v2 j + v3 k) × (w1 i + w2 j + w3 k) = v1 i × (w1 i + w2 j + w3 k) + v2 j × (w1 i + w2 j + w3 k) + v3 k × (w1 i + w2 j + w3 k) = v1 w2 (i × j) − v1 w3 (k × i) − v2 w1 (i × j) + v2 w3 (j × k) + v3 w1 (k × i) − v3 w2 (j × k) = (v2 w3 − v3 w2 )i − (v1 w3 − v3 w1 )j + (v1 w2 − v2 w1 )k.

16.1.7

Example

Let v = (1, 2, 3), w = (4, 5, 6). Find a (non-zero) vector orthogonal to both v and w.

16.2 Application: area of a triangle

The vector v × w will do:

255

i j k

v × w =

1 2 3

4 5 6

2 3

1 3

1 2

−j

= i

5 6

4 6

4 5

= i(12 − 15) − j(6 − 12) + k(5 − 8) = −3i + 6j − 3k.

This is orthogonal to both v and w.

The vector v × w will do:

v×w

= = =

i j k

1 2 3

4 5 6

−j 1 i

4 5 6

1 3

+ k

i(12 − 15) − j(6 − 12) + k(5 − 8)

−3i + 6j − 3k.

This is orthogonal to both v and w.

16.2

Application: area of a triangle

Find area of the ∆ABC with vectors v, w along edges as shown in Figure 59.

θ A

Figure 59: Find the area of the triangle ABC

Area = =

Area

1 base · perpendicular height 2 1 1 kvk · kwk sin θ = kv × wk . 2 2 = =

1 2 1 2

base · perpendicular height kvk · kwk sin θ =

1 2

kv × wk .

16.2 Application: area of a triangle

16.2.1

256

Example

Find the area of a triangle with vertices A = (0, 1, 3), B = (1, 2, 1), C = (4, 1, 0) and obtain a vector perpendicular to the plane of ∆ABC.



 1 −→ w = AB =  1  , −2



 4 −→ v = AC =  0  −3

Hence a vector perpendicular to the plane of ∆ABC is

i j k

v × w =

4 0 −3

= 3i + 5j + 4k.

1 1 −2

Area of the triangle

1 3i + 5j + 4k 2 1√ 9 + 25 + 16 2 1√ 50 2√ 5 2 . 2

1 kv × wk = 2 = = = −→



w = AB = 

 1 1 , −2

−→



v = AC = 

 4 0  −3

Hence a vector perpendicular to the plane of ∆ABC is

v × w =

j 0 1

k −3 −2

= 3i + 5j + 4k.

Area of the triangle 1 2

kv × wk

= = = =

1 3i + 5j + 4k 2 1√ 9 + 25 + 16 2 1√ 50 2 √ 5 2 . 2

16.2 Application: area of a triangle

Notes

257

17.1 Ranking Webpages

17 17.1

258

Eigenvalues and eigenvectors Ranking Webpages

Web search engines assign a number to each webpage, called the pagerank. The higher the pagerank, the higher that webpage is listed in websearches. If a webpage links to 3 other pages, it passes one third of its rank to each of those pages. The pagerank of a webpage is calculating by summing all the contributions from in-linking pages. 17.1.1

Example

Four webpages link to each as follows: 1 o

/ 2 O ^â?&#x192;â?&#x192; @ O â?&#x192;â?&#x192; â?&#x192;â?&#x192; â?&#x192;â?&#x192; / 4 3

Let the ranks of the pages be r1 , r2 , r3 , r4 . The only page linking in to 4 is page 3. Page 3 links to 3 pages (1, 2, 4), so it passes 1/3 of its rank on to each. So 1 r4 = r3 . 3 The rank of page 3 is given by:

The only page linking in to 3 is page 1. Page 1 links to 2 pages (2, 3), so it passes 1/2 of its rank on to each. So 1 r3 = r1 . 2 The only page linking in to 3 is page 1. Page 1 links to 2 pages (2, 3), so it passes 1/2 of its rank on to each. So

r3 =

1 2

r1 .

The pages linking in to page 2 are 1, 3 and 4. So

1 1 r2 = r1 + r3 + 2 3 r1 gets weight 1/2 because page 1 links to 2 pages. r3 gets r2 =

1 2

r1 +

1 3

r3 +

1 2

1 r4 . 2 weight 1/3 because page 3 links to 3 pages.

r4 .

r1 gets weight 1/2 because page 1 links to 2 pages. r3 gets weight 1/3 because page 3 links to 3 pages.

Altogether we get the following system of equations:

17.1 Ranking Webpages

259

r1 = r2 =

1 2 r1

r3 =

1 2 r1

r2 +

1 3 r3

1 2 r4

1 3 r3

1 2 r4

1 3 r3

r4 = r1

1r 2 1

1r 3 3

1r 2 4

1r 3 3

1r 2 4

1r 3 3

This can be written as Ar = r where



    A=     

    A=    

0 1

1 3

1 2

1 3

1 2

0 0

      0    1 2

1 3

0 0



1 2

1 3

1 2

1 3

1 2

1 3

1 2 1 2

         

Solving this system (exercise) gives r1 = 12, r2 = 9, r3 = 6, r4 = 2 is a solution, so page 1 is ranked highest, and page 4 lowest.

In general, the pagerank of v is calculated by adding the weight of all the pages w1 , . . . , wn that link in to v: X r(wj ) . r(v) = nj all pages w1 , . . . , wn that link to v

where page wj links to nj pages. Problem: to find r(v) we need to know r(w1 ), . . . , r(wn ). How do we ever get started? Idea: given a set of N webpages w1 , . . . , wN , create the N × N matrix A = (aij ), where ( 1 , if page wj links to wi aij = nj 0, otherwise Let



  r= 

r1 r2 .. . rN



  . 

17.2 Geometry of eigenvectors

260

Then the pagerank equation is Ar = r.

(17.1)

This is an N × N system, but not one that we have seen before because the RHS is not constant. The unknown vector r appears on both sides of the equation. Equation (17.1) is called an eigenvector equation. The vector r is called an eigenvector of the matrix A. Instead of the equation Ax = x, sometimes we often encounter equations like Ax = 2x. This is called an eigenvector problem with eigenvalue 2. Equation (17.1) is a special case with eigenvalue 1.

17.2

Geometry of eigenvectors

Consider the matrix A = 1 . j 7→ 2

A maps i 7→

2 1

and j 7→

2 1 1 2

1 2

. This acts as a linear transformation, mapping i 7→

Draw a vector that is unmoved under the action of A.

2 1

and

17.2 Geometry of eigenvectors

The vector

Check: A

−1 1

261

is unchanged.

2 1 1 2

For most vectors Av 6= v, but

The vector

Check: A

−1 1

is special for the matrix A. It is called an eigenvector of A.

is unchanged.

2 1

1 2

For most vectors Av 6= v, but

−1 1

is special for the matrix A. It is called an eigenvector of A.

Instead of Ax = x we often encounter equations like Ax = 2x. Then 2 is called an eigenvalue, and x a corresponding eigenvector. Above we looked at the special case of eigenvalue 1.

17.3 Eigenvalues & Vectors

17.2.1

262

Other applications

Eigenvalues and eigenvectors have many other applications e.g. in Quantum Mechanics the energy of a system is represented by a matrix whose eigenvalues give the allowed energies. This is why discrete energy levels are observed. Another situation where they are used is in modelling the population dynamics of a predator-prey system in biology. The eigenvalues and eigenvectors determine the stability and dynamics of a population in such models. There are many other applications in Mathematics, Physics, Chemistry and Biology.

17.3

Eigenvalues & Vectors

Let A be an n × n matrix and consider the equation Ax = λx

(17.2)

with λ a number (possibly complex). We are given the matrix A, and need to find λ and x. Obviously x = 0 is always a solution to (17.2) (for any λ). A value λ for which (17.2) has solution x 6= 0 is called an eigenvalue of A. The corresponding solution vector x 6= 0 is called an eigenvector of A corresponding to the eigenvalue λ. Note that λ is allowed to be 0, but x is not allowed to be 0. 17.3.1

Example

1 3 4 2 eigenvector x.

Let A =

, λ = −2 and x =

−1 1

. Show that λ is an eigenvalue for A, with corresponding

We have to check that Ax = λx. −1 2 −1 1 3 = λx. = −2 Ax = = 1 −2 1 4 2 3 . We have to check that Ax = λx. Check that another eigenvalue is λ = 5 with corresponding eigenvector 4 Ax =

1 4

3 2

−1 1

Check that another eigenvalue is λ = 5 with corresponding eigenvector

17.3.2

3 4

2 −2

= −2

−1 1

= λx.

Note

Usually Ax is related to x in a much more complicated way, 1 3 1 7 1 e.g. = 6= λ . 4 2 2 8 2

17.4 How to find eigenvalues

17.4

263

How to find eigenvalues

We have

Ax Ax Ax − λIx (A − λI)x

⇐⇒ ⇐⇒ ⇐⇒

= λx = λ(Ix) = 0 = 0.

We seek a non-zero solution x of this equation.

Recall that if B is an n × n matrix and det B 6= 0 then Bx = 0 has a unique solution, which must be x = 0 (§ 14.3.2 and 15.3.2). Since we want a non-zero solution x above, we must have det(A − λI) = 0. Thus A has eigenvalue λ only if |A − λI| = 0. 17.4.1

(17.3)

Definition: Characteristic polynomial, characteristic equation

The determinant p(λ) = |A − λI|

is a polynomial of degree n in λ called the characteristic polynomial of A. We call equation (17.3) the characteristic equation. Thus the eigenvalues of A are given by the roots of the characteristic equation (17.3). This means that A has at least 1 and at most n distinct eigenvalues (possibly complex). The corresponding eigenvectors may then be obtained for each λ in turn by solving (A − λI)x = 0 using Gaussian elimination. 17.4.2

Example

Find the eigenvalues and eigenvectors of A=

We have A − λI = So

−5 2 2 −2

−5 − λ 2 2 −2 − λ

−5 − λ 2 p(λ) = |A − λI| =

2 −2 − λ

= (λ + 5)(λ + 2) − 4

= λ2 + 7λ + 6 = (λ + 6)(λ + 1).

Solving p(λ) = 0 yields the eigenvalues λ1 = −1, λ2 = −6. A − λI = So

−5 − λ 2

−5 − λ p(λ) = |A − λI| =

2 −2 − λ

We have

= (λ + 5)(λ + 2) − 4 −2 − λ

= λ2 + 7λ + 6 = (λ + 6)(λ + 1).

Solving p(λ) = 0 yields the eigenvalues λ1 = −1, λ2 = −6.

17.4 How to find eigenvalues

264

To get the corresponding eigenvectors, solve (A − λi )x = 0 for i = 1, 2. 1 − 12 0 1 − 21 0 −4 2 0 ∼ ∼ λ1 = −1 : 2 −1 0 2 −1 0 0 0 0 with general solution x=

α 2α

1 1 =α ⇒ is an eigenvector corresponding 2 2 to the eigenvalue λ = −1;

1 2 0 0 0 0 −2α −2 =α . ⇒ general solution x = α 1 −2 is an eigenvector corresponding to eigenvalue λ = −6. 1 λ2 = −6 :

Hence

1 2 0 2 4 0

To get the corresponding eigenvectors, solve (A − λi )x = 0 for i = 1, 2. −4 2 0 1 λ1 = −1 : ∼ 0 2 −1 2

∼

−1 2 −1

0 0

∼

1 0

−1 2 0

0 0

with general solution x=

2α

1 1 is an eigenvector corresponding ⇒ =α 2 2 to the eigenvalue λ = −1;

λ2 = −6 :

1 2

2 4

0 0

⇒ general solution x = Hence

17.4.3

−2 1

∼

−2α α

1 0

2 0

0 0

−2 . =α 1

is an eigenvector corresponding to eigenvalue λ = −6.

Example

Find the eigenvalues and eigenvectors of A =

0 2 −2 0

17.4 How to find eigenvalues

265

−λ 2 p(λ) = |A − λI| =

−2 −λ

= λ2 + 4.

Eigenvalues: p(λ) = 0 ⇒ λ2 + 4 = 0 ⇒ λ = ±2i ⇒ λ1 = 2i; λ2 = −2i. In this case A has two complex λ = ±2i, exactly as before, using to λ1 = 2i is −2i −2 ⇒

and

−i 1

eigenvalues λ = ±2i. To obtain eigenvectors solve (A − λI)x = 0, complex numbers. The augmented matrix (A − λI | 0) corresponding 2 0 −2i 0

∼ ⇒

x1 + ix2 = 0

−2i 2 0 0 0 0

∼

1 i 0 0 0 0

x1 = −ix2 = −iα

x2 = α −iα −i =α is the general solution, ⇒x = α 1

is an eigenvector corresponding to λ1 = 2i.

For λ = −2i:

⇒ ⇒

−λ p(λ) = |A − λI| =

−2

iα α

2i 2 0 −2 2i 0

∼

1 −i 0 0 0 0

x1 − ix2 = 0 ⇒ x1 = ix2 = iα

=α

i 1

x2 = α

is the general solution, and

is an eigenvector for λ = −2i.

i 1

= λ2 + 4. −λ

Eigenvalues: p(λ) = 0 ⇒ λ2 + 4 = 0 ⇒ λ = ±2i ⇒ λ1 = 2i; λ2 = −2i. In this case A has two complex eigenvalues λ = ±2i. To obtain eigenvectors solve (A − λI)x = 0, λ = ±2i, exactly as before, using complex numbers. The augmented matrix (A − λI | 0) corresponding to λ1 = 2i is

−2i −2 ⇒

and

−i 1

2 −2i

0 0

x1 + ix2 = 0

∼

⇒

−2i 0

2 0

0 0

∼

1 0

i 0

0 0

x1 = −ix2 = −iα

x2 = α −i −iα is the general solution, ⇒x = =α 1 α

is an eigenvector corresponding to λ1 = 2i.

For λ = −2i:

2i −2

2 2i

0 0

∼

1 0

−i 0

0 0

⇒

x1 − ix2 = 0 ⇒ x1 = ix2 = iα x2 = α i i iα is the general solution, and =α x= 1 1 α is an eigenvector for λ = −2i.

17.4.4

Check

Solving the system (A − λI)x = 0 to find eigenvectors always leads to at least one row of zeros when A − λI is row reduced. This is because λ is chosen so that (A − λI)x = 0 has a solution x 6= 0. So A − λI is not invertible and so A − λI does not reduce to I.

17.4 How to find eigenvalues

Notes

266

18.1 Linear combinations

267

Vector Spaces

In this chapter we consider sets of vectors with special properties, called vector spaces. Recall that if A and B are sets then A ⊆ B means A is a subset of B. The set with no elements, called the empty set is denoted ∅. This is not the same as the set {0} whose single element is the zero vector.

18.1

Linear combinations

x x = x i + y j. In R3 every can be written using the vectors i and j: y y vector can be written using i, j, k. We generalize this idea.

In R2 , every vector

18.1.1

Definition: Linear combination

If v 1 , v 2 , ..., v m ∈ Rn , a vector of the form v = α1 v 1 + α2 v 2 + · · · + αm v m , is called a linear combination of the vectors v1 , v 2 , ..., v m . 18.1.2

Example

Show that each of the vectors

is a linear combination of

1 0 , w2 = w1 = 0 1 2 3 and v 2 = . v1 = 2 2

αi ∈ R

18.1 Linear combinations

268

Write w1 = α1

2 3 + α2 . 2 2

Equating components: 2α1 + 3α2 = 1, 2α1 + 2α2 = 0 Solving, α2 = 1 and α1 = −1. So w1 =

1 2 3 = (−1) +1 0 2 2

= −v 1 + v 2 Similarly we get

0 3 3 2 = = −1 1 2 2 2 = 2v 1 − v 2 .

Write w1 = α1

2 2

+ α2

3 2

Equating components: 2α1 + 3α2

2α1 + 2α2

= (−1)

Solving, α2 = 1 and α1 = −1. So

−v 1 + v 2

3 +1 2

Similarly we get

18.1.3

2v 1 − v 2 .

3 2

2 2

3 −1 2

Example 

  Every v =  

v1 v2 .. . vn



   ∈ Rn is a linear combination of the coordinate vectors ei :  v = v1 e1 + v2 e2 + · · · + vn en .

(18.1)

18.3 How to test for linear independence

18.2 18.2.1

269

Linear independence Definition: Linear independence

Consider the linear combination Îą1 v 1 + Îą2 v 2 + Âˇ Âˇ Âˇ + Îąm v m with Îą1 = Îą2 = Âˇ Âˇ Âˇ = Îąn = 0. Obviously this gives the 0 vector. A set of vectors S = {v 1 , v 2 , ..., v m } â&#x160;&#x2020; Rn is linearly dependent if there exist scalars Îą1 , ..., Îąm not all zero such that Îą1 v 1 + Îą2 v 2 + Âˇ Âˇ Âˇ + Îąm v m = 0. S is called linearly independent if no such scalars exist, i.e. if the only linear combination adding up to 0 is with all the scalars 0. So S is linearly independent if Îą1 v 1 + Îą2 v 2 + Âˇ Âˇ Âˇ + Îąm v m = 0

18.3

â&#x2021;&#x2019;

Îą1 = Îą2 = Âˇ Âˇ Âˇ = Îąm = 0

How to test for linear independence

Given vectors v 1 , . . . , v m , solve the vector equation Îą1 v 1 + Âˇ Âˇ Âˇ + Îąm v m = 0. If the only solution is Îą1 , . . . , Îąm = 0 then the vectors are linearly independent. If there is any other solution, the vectors are linearly dependent. 18.3.1

Example

Show that in

R2 ,

the vectors

Îą

1 1

Îą

18.3.2

1 â&#x2C6;&#x2019;1

are linearly independent.

1 0 Îą+Î˛ 0 +Î˛ = â&#x2021;&#x2019; = â&#x2C6;&#x2019;1 0 Îąâ&#x2C6;&#x2019;Î˛ 0 Îą+Î˛ =0 â&#x2021;&#x2019; â&#x2021;&#x2019; Îą = Î˛ = 0. Îąâ&#x2C6;&#x2019;Î˛ =0

1 1

+Î˛

0 Îą+Î˛ 0 1 = â&#x2021;&#x2019; = 0 Îąâ&#x2C6;&#x2019;Î˛ 0 â&#x2C6;&#x2019;1 Îą+Î˛ = 0 â&#x2021;&#x2019; Îą = Î˛ = 0. â&#x2021;&#x2019; Îąâ&#x2C6;&#x2019;Î˛ =0

Example

Show that in Rn the coordinate vectors ei are linearly independent.

18.3 How to test for linear independence



  ⇒  

α1 0 .. . 0

270

· · · + α n en =  α1 e1+ α2 e2 +  0 0 0 α1   α2   ..   α2        +  ..  + · · · +  .  =  ..   .   0   . 

αn

⇒ α1 = 0, α2 = 0, . . . , αn = 0.



  ⇒   

α1 0 . . . 0





      +  

α1 e1 + α2 e2 + · · · + αn en    0 0  .  α2      + · · · +  ..  = .    .   0  . αn 0

=0  α1  α2    ..  . αn

    =  

 0  0    = .  .   .  0 





0 0 .. . 0



  . 



   .  

⇒ α1 = 0, α2 = 0, . . . , αn = 0.

18.3.3

Example 

     1 1 0 Show that in R3 , the vectors  3 ,  1 ,  1  are linearly dependent. 1 1 0 

    1 1 0 α1  3  + α2  1  + α3  1 1 1 0   α1 + α2 ⇒ 3α + α2 + α3  1 α1 + α2

 0 = 0  0 



=0 =0 =0

Note that the first and third equations are the same, so we effectively only have 2 equations in 3 unknowns. α2 = −α1 and α3 = −3α1 − α2 = −2α1 . α1 is not determined: it can be anything. For example if we choose α1 = 1 then α2 = −1 and α3 = −2. Check:         1 1 0 0  3  −  1  − 2 1  =  0 . 1 1 0 0 We have found a non-trivial linear combination summing to 0, so the vectors are linearly dependent.         0 0 1 1 α1  3  + α2  1  + α3  1  =  0  0 0 1 1

⇒

  

α1 + α2 3α1 + α2 + α3 α1 + α2

=0 =0 =0

18.3.4

Example

Show that any finite set of vectors containing the zero vector 0 is linearly dependent.

18.4 Invertible matrices and linear independence

271

Write S = {v 1 , ..., v m , 0} ⊆ Rn . Then 0 · v1 + · · · + 0 · vm + 1 · 0 = 0 ⇒

S linearly dependent.

Write S = {v 1 , ..., v m , 0} ⊆ Rn . Then 0 · v1 + · · · + 0 · vm + 1 · 0 = 0

⇒

S linearly dependent.

18.3.5

Special cases of linear dependence

• If two vectors v 1 , v 2 are linearly dependent, then there exist scalars α1 , α2 ∈ R, not both 0, with α1 v 1 + α2 v 2 = 0. 1 If α2 6= 0 then v 2 = −α α2 v 1 , so v 2 is a scalar multiple of v 1 . If α2 = 0 then α1 6= 0 so v 1 = 0 = 0v 2 , and v 1 is a scalar multiple of v 2 .

So two vectors are linearly dependent if and only if one of them is a multiple of the other. (ii) If one vector v is linearly dependent then αv = 0 with α 6= 0, so v = 0. Thus a single vector is linearly dependent if and only if it is the zero vector.

18.4

Invertible matrices and linear independence

There is a relation between the nullspace of A and between linear independence of the columns of A.   b1  b2   Suppose b ∈ NS(A). Let b =   · · · . That is b = b1 e1 + b2 e2 + · · · + bn en . bn Since b ∈ NS(A), Ab = 0 so

0 = Ab = A b1 e1 + b2 e2 + · · · + bn en

= b1 (Ae1 ) + b2 (Ae2 ) + · · · + bn (Aen ).

We saw earlier that Aej is the jth column vector of A. So the right hand side is a linear combination among the columns of A. Thus: if there is a vector b ∈ NS(A) with b 6= 0 then there is a non-trivial linear combination of columns of A adding up to 0, so the columns of A are linearly dependent. Conversely, if the columns of A are dependent there is a non-trivial linear combination of columns of A adding up to 0, say b1 (Ae  1) + b2 (Ae2 ) + · · · + bn (Aen ) = 0. This gives a non-zero vector b = b1  b2   b1 e1 + b2 e2 + · · · + bn en =   · · ·  in NS(A). bn

Thus: NS(A) = {0} if and only if the columns of A are linearly independent. But recall (§14.3.2) that A is invertible exactly when NS(A) = {0}. We have proved:

18.4 Invertible matrices and linear independence

18.4.1

272

Theorem (existence of inverse)

Let A be a square matrix. Then A is invertible if and only if the columns of A are linearly independent.

Notice this means swapping the columns of a matrix does not affect whether or not it is invertible.

Furthermore, A is invertible if and only if the rows of A are linearly independent:

A is invertible ⇐⇒ AT is invertible

⇐⇒ columns of AT are linearly independent

⇐⇒ rows of A are linearly independent. 18.4.2

Example

From example 18.3.3, the columns of

 1 1 0 A= 3 1 1  1 1 0 

are linearly dependent. Therefore A is not invertible. 18.4.3

Example

From example 18.11.1 the columns of

 1 1 1 A= 0 1 1  0 0 1 

are linearly independent, and therefore A is invertible.. 18.4.4

Invertibility and the Nullspace

18.4.5

Summary of conditions equivalent to invertibility

Let A be an n × n matrix. Give a summary of all the conditions we have found that are equivalent to A being invertible.

18.5 Vector spaces

273

1. A is invertible. 2. There exists an n Ă&#x2014; n matrix X with AX = XA = I 3. AT is invertible

[Â§ 14.3.4(3)].

4. The columns of A are linearly independent 5. The rows of A are linearly independent

[Â§ 18.4.1].

6. Ax = b has a unique solution x for any b â&#x2C6;&#x2C6; R 7. N (A) = {0}

[Definition of invertible, Â§ 14.2].

[Â§ 14.3.2 ].

8. The row reduced form of A is I 9. det(A) 6= 0

[Â§ 13.4].

[Theorem 15.3.2].

1. A is invertible. 2. There exists an n Ă&#x2014; n matrix X with AX = XA = I 3. AT is invertible

[Â§ 14.3.4(3)].

4. The columns of A are linearly independent 5. The rows of A are linearly independent

18.5

[Â§ 14.3.2 ].

8. The row reduced form of A is I 9. det(A) 6= 0

[Â§ 18.4.1].

6. Ax = b has a unique solution x for any b â&#x2C6;&#x2C6; R 7. N (A) = {0}

[Definition of invertible, Â§ 14.2].

[Â§ 13.4].

[Theorem 15.3.2].

Vector spaces

We now consider sets of vectors with certain properties, called vector spaces. For example: x â&#x20AC;˘ V = | x + y = 2 . The set V is a set of (infinitely many) vectors in R2 . y â&#x20AC;˘ The set containing just the zero vector {0}. This is a set with just one vector. â&#x20AC;˘ W =

x y

= 0 . Since x2 , y 2 â&#x2030;Ľ 0, the only vector in W is 0, so W = {0}.

â&#x20AC;˘ Another special set is the empty set, not containing any vectors at all(!) Eg Y = There are no vectors in Y . ďŁąďŁŤ ďŁś ďŁź

ďŁ˛ a ďŁ˝

ďŁ ďŁ¸ â&#x20AC;˘ Let V = b

a + b = 2cďŁž . ďŁł c ďŁŤ ďŁś ďŁŤ ďŁś ďŁŤ ďŁś 1 2 1 ďŁ ďŁ ďŁ ďŁ¸ ďŁ¸ â&#x20AC;˘ Is 0 â&#x2C6;&#x2C6; V ? No: 1 + 0 6= 2 Âˇ 3. But v = 1 , w = 4 ďŁ¸ â&#x2C6;&#x2C6; V : 1 3 3 ďŁŤ ďŁś 3 Is v + w â&#x2C6;&#x2C6; V ? Yes: v + w = ďŁ 5 ďŁ¸ and 3 + 5 = 2 Âˇ 4. 4

x y

= â&#x2C6;&#x2019;1 .

1 + 1 = 2 Âˇ 1 and 2 + 4 = 2 Âˇ 3.

18.5 Vector spaces

18.5.1

274

Definition: Vector space, subspace

A subset V ⊆ Rn is called a vector space if it satisfies the following (o) V is non-empty. (i) If v ∈ V and w ∈ V then v + w ∈ V

(closure under vector addition).

(ii) If v ∈ V and α ∈ R then αv ∈ V (closure under scalar multiplication). If V is a vector space, W ⊆ V and W is also a vector space, we call W a subspace of V . Note: V is a subspace of itself. If V ⊆ Rn is a vector space and v ∈ V , then by property (ii) 0 = 0v ∈ V that is, the zero vector belongs to every vector space. 18.5.2

Example

Show that

is a vector space.

    a

 V =  b 

a + b = 2c , a, b, c ∈ R   c

(o) V is non-empty V.  since 0 ∈   a1 a2 Let v =  b1  , w =  b2  be in V . c1 c2   a1 + a2 (i) v + w =  b1 + b2  c1 + c2 v ∈ V ⇒ a1 + b1 = 2c1

w ∈ V ⇒ a2 + b2 = 2c2 .

Is v + w in V ? (a1 + a2 ) + (b1 + b2 ) = (a1 + b1 ) + (a2 + b2 ) = 2c1 + 2c2 = 2(c1 + c2 ) ⇒

v + w ∈ V . So V is closed under addition.

(o) V is non-empty since 0 ∈ V.    a1 a2 Let v =  b1  , w =  b2  be in V . c1 c2   a1 + a2 (i) v + w =  b1 + b2  c1 + c2

v ∈ V ⇒ a1 + b1 = 2c1

w ∈ V ⇒ a2 + b2 = 2c2 . Is v + w in V ? (a1 + a2 ) + (b1 + b2 ) = (a1 + b1 ) + (a2 + b2 ) = 2c1 + 2c2 = 2(c1 + c2 ) ⇒

v + w ∈ V . So V is closed under addition.

18.5 Vector spaces



 a1 Consider v =  b1  , c1

275



 a2 w =  b2  ∈ V . c2   αa1 (ii) If α is any scalar then αv =  αb1  αc1

αa1 + αb1 = α(a1 + b1 ) = α2c1 = 2(αc1 )

⇒ αv ∈ V. So V is closed under scalar multiplication. Hence V is a vector space.

  a1 Consider v =  b1  , c1

  a2 w =  b2  ∈ V . c2

  αa1 (ii) If α is any scalar then αv =  αb1  αc1 αa1 + αb1 = α(a1 + b1 ) = α2c1 = 2(αc1 ) ⇒ αv ∈ V. So V is closed under scalar multiplication. Hence V is a vector space.

18.5.3

Example

Is V = {0} a vector space?

(o) 0 ∈ V ⇒ V is non-empty. (i) If v, w ∈ V then v = w = 0 and v + w = 0 ∈ V . (ii) If α ∈ R, then α0 = 0 ∈ V . (o) 0 ∈ V ⇒ V is non-empty. (i) If v, w ∈ V then v = w = 0 and v + w = 0 ∈ V . (ii) If α ∈ R, then α0 = 0 ∈ V .

18.5.4

Example

Show that V = Rn is a vector space.

18.5 Vector spaces

276

(o) 0 â&#x2C6;&#x2C6; V (i) v, w â&#x2C6;&#x2C6; V â&#x2021;&#x2019; v + w â&#x2C6;&#x2C6; Rn = V (ii) v â&#x2C6;&#x2C6; V â&#x2021;&#x2019; Î±v â&#x2C6;&#x2C6; Rn = V So V is a vector space. (o) 0 â&#x2C6;&#x2C6; V (i) v, w â&#x2C6;&#x2C6; V â&#x2021;&#x2019; v + w â&#x2C6;&#x2C6; Rn = V (ii) v â&#x2C6;&#x2C6; V â&#x2021;&#x2019; Î±v â&#x2C6;&#x2C6; Rn = V So V is a vector space.

18.5.5 Is V =

Example

1 2

a vector space?

No, since 0 â&#x2C6;&#x2C6; / V. No, since 0 â&#x2C6;&#x2C6; / V.

18.5.6 Is V =

Example 1 Î± | Î± â&#x2C6;&#x2C6; R a vector space? 2

18.5 Vector spaces

277

(o) α = 0 ⇒ 0 ∈ V . (i) If v, w ∈ V then v = α

1 2

and w = β

(ii) If γ ∈ R, then γv = γα

1 2

for suitable α, β ∈ R.

β v+w = + 2β α+β = 2(α + β) 1 = (α + β) · ∈ V. 2

Then

α 2α

∈V.

Therefore V is a vector space. (o) α = 0 ⇒ 0 ∈ V . (i) If v, w ∈ V then v = α

1 2

and w = β

1 2

for suitable α, β ∈ R.

Then

v+w

(ii) If γ ∈ R, then γv = γα

1 2

α 2α

β 2β

α+β 2(α + β) 1 ∈ V. (α + β) · 2

∈ V.

Therefore V is a vector space.

Geometrically, all these vectors can be viewed as lying along a single line through the origin. 18.5.7

Example

Show that for v 1 , v 2 ∈ R3 ,

V = {α1 v 1 + α2 v 2 | α1 , α2 ∈ R}

is a vector space.

(o) If we choose α1 = α2 = 0 we see that 0v 1 + 0v 2 = 0 ∈ V , so V is non-empty. (i) Let v, w ∈ V ⇒ there exists α1 , α2 , β1 , β2 such that v = α1 v 1 + α2 v 2 and w = β1 v 1 + β2 v2 . Then v + w = (α1 + β1 )v 1 + (α2 + β2 )v 2 ∈ V . (ii) For any scalar γ ∈ R, γv = γα1 v 1 + γα2 v 2 ∈ V . Therefore V is a vector space. (o) If we choose α1 = α2 = 0 we see that 0v 1 + 0v 2 = 0 ∈ V , so V is non-empty. (i) Let v, w ∈ V ⇒ there exists α1 , α2 , β1 , β2 such that v = α1 v 1 + α2 v 2 and w = β1 v 1 + β2 v 2 . Then v + w = (α1 + β1 )v 1 + (α2 + β2 )v 2 ∈ V . (ii) For any scalar γ ∈ R, γv = γα1 v 1 + γα2 v 2 ∈ V . Therefore V is a vector space.

18.6 Eigenspace

278

If we draw v 1 and v 2 starting from the origin, then all these combinations α1 v1 + α2 v 2 will lie in a single plane in R3 , through the origin. For example, if v 1 = i and v 2 = j then V = {α1 i + α2 j | α1 , α2 ∈ R} is the set of vectors with no k-component i.e. the xy plane. Thus geometrically, a vector space can be a single point {0}, or can be all vectors along a single line through the origin, or all vectors lying in a single plane through the origin, or similarly in higher dimensions. Theorem: If A is a square matrix, then NS(A) is a vector space. Proof:

(o) A0 = 0

⇒ 0 ∈ NS(A)

⇒ NS(A) is non-empty.

(i) v, w ∈ NS(A) ⇒ Av = 0, Aw = 0 ⇒ A(v + w) = Av + Aw = 0 + 0 = 0

⇒ v + w ∈ NS(A)

(ii) If v ∈ NS(A), α ∈ R then Av = 0 ⇒ A(αv) = αAv = α0 = 0 ⇒ αv ∈ NS(A). Hence NS(A) is a vector space. (o) A0 = 0

⇒ 0 ∈ NS(A)

⇒ NS(A) is non-empty.

(i) v, w ∈ NS(A) ⇒ Av = 0, Aw = 0 ⇒ A(v + w) = Av + Aw = 0 + 0 = 0

⇒ v + w ∈ NS(A)

(ii) If v ∈ NS(A), α ∈ R then Av = 0 ⇒ A(αv) = αAv = α0 = 0 ⇒ αv ∈ NS(A). Hence NS(A) is a vector space.

This explains why either NS(A) = {0} or else it is infinite. If there is any solution v of Av = 0, there will be infinitely many solutions. This is why a consistent linear system can have one solution, or infinitely many, depending on the size of NS(A).

18.6 18.6.1

Eigenspace Proposition (eigenspace is a vector space)

Let A be a square matrix. Let λ be an eigenvalue of A. Then the set of solutions of the equation Ax = λx forms a vector space called the eigenspace of A corresponding to eigenvalue λ, denoted Nλ . So Nλ = {x ∈ Rn | Ax = λx}. Special case: if λ = 0 is an eigenvalue of A then N0 = {x ∈ Rn | Ax = 0} = NS(A), the nullspace of A as above.

18.7 The span of a set of vectors

18.6.2

279

Proof

(o) Note that 0 is always a solution to Ax = λx, so 0 ∈ Nλ . Take x, y ∈ Nλ , α ∈ R. (i) Then A(x + y)

Ax + Ay

λx + λy

λ(x + y)

⇒ x + y ∈ Nλ . (ii) A(αx)

αAx

αλx

λ(αx)

⇒ αx ∈ Nλ . Hence the set of solutions to Ax = λx forms a vector space. (o) Note that 0 is always a solution to Ax = λx, so 0 ∈ Nλ . Take x, y ∈ Nλ , α ∈ R. (i) Then A(x + y)

Ax + Ay

λx + λy

λ(x + y)

⇒

x + y ∈ Nλ .

(ii) A(αx)

αAx

αλx

λ(αx)

⇒

αx ∈ Nλ .

Hence the set of solutions to Ax = λx forms a vector space.

This is why when we found eigenvectors previously, we actually found infinitely many eigenvectors for each λ.

18.7

The span of a set of vectors

We would like a compact way to describe all the vectors in a vector space V , and a way of generating a vector space from a given set of vectors. 18.7.1

Definition: Span

The span of vectors v 1 , v 2 , ..., v m ∈ Rn is the set of all possible linear combinations of these vectors: n o span{v 1 , v 2 , ..., v m } = α1 v 1 + α2 v 2 + · · · + αm vm | α1 , α2 , . . . , αm ∈ R .

18.7 The span of a set of vectors

280

Note that the span is a set of vectors. 18.7.2

Example

Show that v =

3 5

is in span

1 1

1 , . â&#x2C6;&#x2019;1

3 1 1 We want =Îą +Î˛ . 5 1 â&#x2C6;&#x2019;1 Solving, Îą = 4, Î˛ = â&#x2C6;&#x2019;1. We want 35 = Îą

1 1

+Î˛

Solving, Îą = 4, Î˛ = â&#x2C6;&#x2019;1.

18.7.3

1 â&#x2C6;&#x2019;1

Example

In fact, every vector in R2 is in the previous spanning set: 1 1 Show that span , = R2 . 1 â&#x2C6;&#x2019;1

Let v =

v1 v2

â&#x2C6;&#x2C6; R2 be arbitrary. We need to find Îą1 and Îą2 â&#x2C6;&#x2C6; R such that

1 v = Îą1 + Îą2 1 So we need: ( v 1 = Îą1 + Îą2 â&#x2021;&#x2019; v 2 = Îą1 â&#x2C6;&#x2019; Îą2

1 Îą1 + Îą2 = . â&#x2C6;&#x2019;1 Îą1 â&#x2C6;&#x2019; Îą2 ďŁą 1 ďŁ´ ďŁ´ ďŁ˛Îą1 = (v1 + v2 ) 2 . ďŁ´ 1 ďŁ´ ďŁłÎą2 = (v1 â&#x2C6;&#x2019; v2 ) 2 Hence 1 1 1 1 v1 v1 + v2 v1 â&#x2C6;&#x2019; v2 + . = v= 1 â&#x2C6;&#x2019;1 v2 2 2 1 1 2 So any vector in R can be written as a combination of and . 1 â&#x2C6;&#x2019;1 3 , so v1 = 3, v2 = 5 so 21 (v1 + v2 ) = 4, The previous example was v = 5 v=

v1 v2

â&#x2C6;&#x2C6; R2 be arbitrary. We need to find Îą1 and Îą2 â&#x2C6;&#x2C6; R such that

1 + Îą2 1 So we need: ďŁą ďŁ˛v 1 = Îą1 + Îą2 â&#x2021;&#x2019; ďŁł v 2 = Îą1 â&#x2C6;&#x2019; Îą2 v = Îą1

Hence

1 â&#x2C6;&#x2019;1

Îą1 + Îą2 Îą1 â&#x2C6;&#x2019; Îą2

ďŁą 1 ďŁ´ ďŁ´ ďŁ´Îą1 = (v1 + v2 ) ďŁ˛ 2 . ďŁ´ ďŁ´ 1 ďŁ´ ďŁłÎą2 = (v1 â&#x2C6;&#x2019; v2 ) 2

1 1 1 v1 1 v1 + v2 v1 â&#x2C6;&#x2019; v2 = + . v2 1 â&#x2C6;&#x2019;1 2 2 1 1 . and So any vector in R2 can be written as a combination of â&#x2C6;&#x2019;1 1 3 1 1 (v â&#x2C6;&#x2019; v ) = â&#x2C6;&#x2019;1. , so v1 = 3, v2 = 5 so 2 (v1 + v2 ) = 4, 2 The previous example was v = 1 2 5 v=

1 2 (v1

â&#x2C6;&#x2019; v2 ) = â&#x2C6;&#x2019;1.

Let

18.7 The span of a set of vectors

18.7.4

281

Example

ï£±ï£« ï£¶ ï£« ï£¶ ï£« ï£¶ ï£¼ 1 ï£½ 1 ï£² 1 Show that span ï£ 0 ï£¸ , ï£ 1 ï£¸ , ï£ 1 ï£¸ = R3 . ï£¾ ï£³ 1 0 0 ï£«

ï£¶ a Let v = ï£ b ï£¸ â&#x2C6;&#x2C6; R3 be arbitrary. We need to find Î±, Î², Î³ â&#x2C6;&#x2C6; R such that c

ï£« ï£¶ ï£« ï£¶ ï£¶ 1 1 1 v = Î± ï£ 0 ï£¸ + Î² ï£ 1 ï£¸ + Î³ ï£ 1 ï£¸. 1 0 0 That is, ï£«

Î± + Î² + Î³ = a + Î² + Î³ = b Î³ = c Back substitution gives Î³ = c, Î² = b â&#x2C6;&#x2019; c, Î± = a â&#x2C6;&#x2019; b. So ï£« ï£¶ ï£« ï£¶ ï£« ï£¶ ï£« ï£¶ 1 1 a 1 v = ï£ b ï£¸ = (a â&#x2C6;&#x2019; b) ï£ 0 ï£¸ + (b â&#x2C6;&#x2019; c) ï£ 1 ï£¸ + c ï£ 1 ï£¸ . 1 0 0 c ï£« ï£¶ a Let v = ï£ b ï£¸ â&#x2C6;&#x2C6; R3 be arbitrary. We need to find Î±, Î², Î³ â&#x2C6;&#x2C6; R such that c

ï£« ï£¶ ï£« ï£¶ ï£« ï£¶ 1 1 1 v = Î± ï£ 0 ï£¸ + Î² ï£ 1 ï£¸ + Î³ ï£ 1 ï£¸. 0 0 1 That is, Î± Î³

+ + =

Î² Î² c

+ +

Î³ Î³

= =

a b

Back substitution gives Î³ = c, Î² = b â&#x2C6;&#x2019; c, Î± = a â&#x2C6;&#x2019; b. So ï£¶ ï£¶ ï£« ï£¶ ï£« ï£« ï£¶ ï£« 1 1 a 1 v = ï£ b ï£¸ = (a â&#x2C6;&#x2019; b) ï£ 0 ï£¸ + (b â&#x2C6;&#x2019; c) ï£ 1 ï£¸ + c ï£ 1 ï£¸ . 1 0 0 c

18.7.5

The Span of vectors is a Vector Space

The span of any non-empty set of vectors is a vector space. We proved this for two vectors in Example 18.5.7. The general proof is similar. If span{v 1 , v 2 , ..., v m } = V then we say that v1 , . . . , v m span V , or that they form a spanning set for V. 18.7.6

Example

1 Find a spanning set for V = Î± |Î±â&#x2C6;&#x2C6;R . 2

18.8 Bases

282

1 1 Every vector in V is a linear combination of , so a spanning set is . 2 2 â&#x2C6;&#x2019;1 This answer is not unique. For example is another spanning set for V . â&#x2C6;&#x2019;2

Every vector in V is a linear

1 . , so a spanning set is 2 â&#x2C6;&#x2019;1 is another spanning set for V . This answer is not unique. For example â&#x2C6;&#x2019;2 combination of

18.7.7

1 2

Example ďŁŤ

ďŁś 0 ďŁŹ .. ďŁˇ ďŁŹ . ďŁˇ ďŁŹ ďŁˇ ďŁŹ 0 ďŁˇ ďŁŹ ďŁˇ ďŁˇ The coordinate vectors ei = ďŁŹ ďŁŹ 1 ďŁˇ â&#x2020;?i ďŁŹ 0 ďŁˇ ďŁŹ ďŁˇ ďŁŹ .. ďŁˇ ďŁ . ďŁ¸

span Rn since every vector v â&#x2C6;&#x2C6; Rn is a linear combination of

0 these vectors (see equation 18.1). 18.7.8

Example

The vector space {0} is spanned by the zero vector 0 â&#x2C6;&#x2C6; Rn is and called the zero vector space.

18.8

Bases

We have seen that we can describe a vector space by giving a spanning set. When we do this, we usually like to choose a spanning set that is as small as possible (contains no redundancies). We ensure this as follows. 18.8.1

Definition: Basis

Let V be a vector space. We say that a set of vectors B = {v 1 , ..., v m } â&#x160;&#x2020; V is a basis for V if: 1. B spans V ; 2. B is linearly independent. To prove that a set of vectors is a basis, we must prove both properties. Sometimes we are just given a vector space V , and asked to find a basis. Problems like this are solved by finding a spanning set for V , and then checking that the set is linearly independent. A single vector space V may have many different bases. 18.8.2

Example

We have seen that the coordinate vectors ei , i = 1, ..., n, are linearly independent (section 18.3.2) and span Rn (section 18.7.7). Therefore they form a basis for Rn , called the standard basis.

18.9 Dimension

18.8.3

283

Example

Show that the vectors

1 1

1 â&#x2C6;&#x2019;1

form a basis for R2 .

We must check: (i) linear independence; (ii) the set spans R2 .

(i) was example 18.3.1. (ii) was example 18.7.3.

We must check: (i) linear independence; (ii) the set spans R2 .

(i) was example 18.3.1.

(ii) was example 18.7.3.

Note that this basis differs from the standard

18.8.4

1 0

0 basis of R2 , but still has two elements. , 1

Example

Find a basis for the vector space of 18.5.2 ďŁź ďŁąďŁŤ ďŁś ďŁ˝ ďŁ˛ a

V = ďŁ b ďŁ¸

a + b = 2c , a, b, c â&#x2C6;&#x2C6; R ďŁž ďŁł c

Any vector in V is of the form ďŁŤ

ďŁś ďŁŤ ďŁś ďŁŤ ďŁś 2c â&#x2C6;&#x2019; b 2c â&#x2C6;&#x2019;b ďŁ¸ = ďŁ 0 ďŁ¸ + ďŁ b ďŁ¸ = cv 1 + bv2 b v=ďŁ c c 0

with c, b â&#x2C6;&#x2C6; R. Thus {v 1 , v 2 } is a spanning set for V . For {v 1 , v 2 } to be a basis, we need to show v 1 and v 2 are linearly independent. If Îą1 v 1 + Îą2 v 2 = 0 then from the second coordinate Îą2 = 0 and from the first Îą1 = 0, so {v 1 , v 2 } is also linearly independent. Any vector in V is of the form ďŁŤ

v=ďŁ

ďŁś ďŁŤ ďŁś ďŁŤ ďŁś 2c â&#x2C6;&#x2019; b 2c â&#x2C6;&#x2019;b ďŁ¸ = ďŁ 0 ďŁ¸ + ďŁ b ďŁ¸ = cv + bv b 1 2 c c 0

18.9

Dimension

Suppose B = {v 1 , ..., v m } is a basis for V . Then it can be shown that every basis for V has m vectors. We call m the dimension of V and write m = dim V . (By convention, the zero vector space {0} has dimension zero.)

18.10 Components

18.9.1

284

Example

We have seen that the coordinate vectors ei , form a basis for Rn , so dim Rn = n. 18.9.2

Example

The vector space

1 | Îą â&#x2C6;&#x2C6; R â&#x160;&#x2020; R2 V = Îą 2 1 is spanned by a single vector (example 18.7.6) which thus constitutes a basis for V . Hence 2 dim V = 1. Geometrically, V consists of vectors along a single line, which is a one dimensional object. 18.9.3

Example

Consider the vector space of 18.5.2 ďŁąďŁŤ ďŁś ďŁź ďŁ˛ a

ďŁ˝ V = ďŁ b ďŁ¸

a + b = 2c , a, b, c â&#x2C6;&#x2C6; R ďŁł ďŁž c

In example 18.8.4 we found a basis with 2 vectors, so dim V = 2. Note that V is a subspace of R3 , but it consists of a plane through the origin, which geometrically is a two dimensional object.

18.10

Components

Suppose {v 1 , . . . , v m } is a basis for V . Then every v â&#x2C6;&#x2C6; V may be written as a linear combination of elements in B v = Îą1 v 1 + Âˇ Âˇ Âˇ Âˇ Âˇ Âˇ + Îąm v m and this representation is unique: if we also have v = Î˛1 v 1 + Âˇ Âˇ Âˇ Âˇ Âˇ Âˇ + Î˛m vm then Îą1 = Î˛1 , . . . , Îąm = Î˛m . Proof:

18.11 Further properties of bases

285

Since B is a basis, it spans V . Hence every v can be written as some linear combination of v 1 , . . . , v m . We need to prove uniqueness. But 0 = v − v = (α1 v 1 + · · · + αm v m ) − (β1 v 1 + · · · + βm v m ) = (α1 − β1 )v 1 + · · · + (αm − βm )v m .

Since B is linearly independent, αi − βi = 0, i = 1, ..., m. So αi = βi meaning that the representation is unique. Since B is a basis, it spans V .

Hence every v can be written as some linear combination of v 1 , . . . , v m . We need to prove uniqueness. But 0

v − v = (α1 v 1 + · · · + αm v m ) − (β1 v 1 + · · · + βm v m )

(α1 − β1 )v 1 + · · · + (αm − βm )v m .

Since B is linearly independent, αi − βi = 0, i = 1, ..., m. So αi = βi meaning that the representation is unique.

If v = α1 v1 + · · · + αm v m we call αi the ith component of v with respect to the basis B.

18.11

Further properties of bases

With more work, one can also prove the following: (i) Every vector space has a basis. (ii) If m = dim V , any m linearly independent vectors in V will form a basis. (iii) If m = dim V , any set of more than m vectors in V will be linearly dependent (and so not be a basis). (iv) If m = dim V , any set of fewer than m vectors in V will not span V (and so not be a basis). (v) If W is a subspace of V , then dim W ≤ dim V and dim W = dim V if and only if V = W . (vi) The only m-dimensional subspace of Rm is Rm itself. 18.11.1

Example

Show that in R3 , the vectors 

 1 v1 =  0  , 0



 1 v2 =  1  0

are linearly independent and hence form a basis for R3 .



 1 v3 =  1  1

18.11 Further properties of bases

286

Setting αv 1 + βv 2 + γv 3 = 0 implies         0 1 1 1        = 0  +γ 1 +β 1 α 0 0 1 0 0     0 α+β+γ    0  = β+γ ⇒ 0 γ ⇒γ=0⇒β=0⇒α=0 Therefore the vectors are linearly independent. Because we already know dim R3 = 3 and we have found 3 linearly independent vectors, they must form a basis, by (vi) above. (Alternatively: we proved they span in example 18.7.4.) Setting αv1 + βv2 + γv3 = 0 implies      1 1 1 α 0 + β 1 + γ 1 0 0 1  α+β+γ  β+γ ⇒ γ



 

⇒γ =0⇒β =0⇒α=0



 0  0  0   0  0  0

Therefore the vectors are linearly independent. Because we already know dim R3 = 3 and we have found 3 linearly independent vectors, they must form a basis, by (vi) above. (Alternatively: we proved they span in example 18.7.4.)

The shortcut in this example is fine if we already know the dimension of V . If we are just given a space like     a



  V = b

a + b = 2c , a, b, c ∈ R  c it may not be obvious what the dimension is.

18.11 Further properties of bases

Notes

287

19.2 Functions

288

Appendix - Practice Problems

You should attempt the practice problems in this section while learning the material in the course. Try to solve all questions. Do at least questions marked with a star (*) before attempting assignment questions. Have a look at the questions before your tutorial. Try to solve them yourself first, and ask your tutor at the tutorial if you need help. Solutions to questions on this sheet are not to be handed in.

19.1

Numbers

1. Solve for x. (a) |x + 1| < 1

(b) |x + 1| > 1

2. Simplify. (a) (b)

3+i 3−i (2+3i)(3+2i) 1+i

3. Write z = 1 − ∗

√

3i in polar form.

5. Find the roots of x2 − 4x + 5 = 0.

19.2

Functions

1. Graph the following functions and give their domain and range. (a) f (x) = x2 + 2 ∗ (b)

f (x) =

∗ (c)

f (x) =

(d) f (x) = ∗ (e)

f (x) =

∗ (f)

f (x) =

1 x−2 1 − x−2 1 x2 +2 x+2 x2 −4

√

x2 − 9 √ (g) f (x) = − x2 + 9

(h) f (x) = ex+1

(i) f (x) = 1 + e2x+3 . 2. Stewart ed 5, p.56 question 2 3. Stewart ed 5, p.23 question 36 4. Stewart ed 5, p.23 question 38 5. Two species of kangaroos inhabit a national park. There are currently 4,500 grey kangaroos, whose population increases at a rate of 4% a year. There are only 3,000 red kangaroos, but they increase at a rate of 6% a year. In how many years will the red kangaroos outnumber the grey kangaroos? Use logarithms to find the answer and draw a graph of the two populations.

19.3 Limits

289

6. Let P (t) be the amount of a radioactive material at time t. Experiments indicate P (t) decays exponentially with time, so P (t) = P0 e−kt where P0 is the amount of material at time t = 0 (initial amount) and k is a positive constant (called the decay constant). The half-life T is the time taken for the material to decay to one half of its initial amount. Obtain T in terms of k. 7. If y = tan(x), 0 ≤ x < π/2, find cos(x) as a function of y. ∗

8. Find the inverse of f (x) =

1 3

cos(2x) and state its domain and range.

9. For f (x) = 3x + 2, g(x) = ln(x2 + 1) and h(x) = 3e2x+1 , find (a) f (g(x)) (b) g(f (x)) (c) f (h(x)) (d) h(f (x)) (e) g(h(x)) (f) f (g(h(x))) (g) h(g(f (x))). 10. Show from the definition that the following two functions are 1-1 on their domains (a graph is not sufficient here). Find the inverse function, and state the domain and range of both f and f −1 . (a) f (x) = e3x+1 ∗

(b) f (x) =

19.3

x+1 x+2 .

Limits

1. Let f (x) =

x−3 x2 − 9

(a) Use the limit laws to find L = limx→3 f (x). (b) Find a small interval around 3 where |f (x) − L| ≤ .01 ∗

2x+1 2. Prove that lim x4 cos 3x 2 +5 = 0. x→0

3. Evaluate the following limits ∗

x2 +2x−8 2 x→2 x −5x+6

(a) lim

(b) lim

x→−1

x2 −x−12 x+3

x+1 2 x→−1 x −x−2

t3 −t 2 t→1 t −1 √ √ 2 lim x+2− x x→0

(d) lim (e)

(Hint: multiply top and bottom by the conjugate surd

4. Show

3 x = x→3 5 5 lim

using the ǫ, δ definition of the limit.

√ √ x + 2 + 2).

19.5 Derivatives

290

5. Prove that

1 lim x sin( ) = 0. x→0 x

6. Evaluate the following limits. 3x2 +2x+1 2 x→∞ x +x+3

(a) lim ∗

√ √ (b) lim ( x + 1 − x) x→∞

√ √ (Hint: multiply top and bottom by the conjugate surd ( x + 1 + x) )

x2 −1 3 x→∞ x +2

x−x3 2 x→∞ x +1 lim x+1 . x→−∞ 1−x

(d) lim (e)

19.4

Continuity

1. Determine whether or not f (x) is continuous at x = 1 and sketch the graph of f (x) for the following choices of f (x): 2 x − 2x + 2 , if x < 1 (a) f (x) = 3−x , if x ≥ 1 x3 , if x < 1 (b) f (x) = (x − 2)2 , if x ≥ 1 x−1 , if x 6= 1 x2 −1 (c) f (x) = 6 , if x = 1. ∗

2. For what value of a is f (x) continuous for all x, where ( 2 x −3x+2 x−1 , x 6= 1 f (x) = a, x=1 Give clear explanations. 3. Is the following function continuous at x = −5? x+5 , x 6= −5 x2 −25 f (x) = 1 , x = −5 − 10

19.5

Derivatives

1. (a) Sketch the graph of f (x) = x|x|. (b) For which values of x is f differentiable? (c) Obtain a formula for f ′ (x). 2. The hyperbolic sine and cosine functions are defined by 1 1 sinh x = (ex − e−x ), cosh x = (ex + e−x ) 2 2 respectively. (a) Prove cosh2 x − sinh2 x = 1.

(b) Prove (c) Find

d dx

sinh x = cosh x,

lim sinh x . x→0 x

d dx

cosh x = sinh x.

19.5 Derivatives

291

3. The Tasmanian Devil often preys on animals larger than itself. Researchers have identified the mathematical relationship L(w) = 2.265w2.543 , where w (in kg) is the weight and L(w) (in mm) is the length of the Tasmanian Devil. Suppose that the weight of a particular Tasmanian Devil can be estimated by the function w(t) = 0.125 + 0.18t, where w(t) is the weight (in kg) and t is the age, in weeks, of a Tasmanian Devil that is less than one year old. How fast is the length of a 30-week-old Tasmanian Devil changing? 4. Show that the equation of the tangent line to a point (x0 , y0 ) on the ellipse

x2 a2

+ yb2 = 1 is given by

x0 x y 0 y + 2 = 1. a2 b ∗

5. Consider the function f (x) = xx . Evaluate f ′ (x) for x > 0. 6. A new car costing P dollars depreciates at an annual rate of r%, so that after t years the car will be worth C dollars where r t ). C(t) = P (1 − 100 (a) Find

dC dt

(b) How much will the car be worth in the long run? ∗

7. Find

θ d . dθ cos(sin θ)

8. A spherical snowball is melting in such a way that its volume is decreasing at a rate of 1 cm3 /min. At what rate is the diameter decreasing when the diameter is 10 cm? 9. Using L’Hˆ opital’s rule evaluate the following limits ∗ (a)

lim

θ→0

1−cos θ θ2

ex −1 x→0 sin x lim sinxx−x 3 x→0 1 ) lim ( 1 − x−1 x→1 ln x

(b) lim (c) ∗ (d)

(e) lim e−x ln x x→∞

(sin θ)n n , θ→0 sin(θ )

(f) lim

1 ≤ n ∈ N.

10. Use the mean value theorem to prove the following (this is the second derivative test). Suppose f : [a, b] → R is continuous, and has continuous first and second derivatives on [a, b]. Then if c ∈ (a, b) is a point with f ′ (c) = 0 and f ′′ (c) > 0 then f has a local minimum at c. 11. Determine where the following functions are increasing/decreasing. (a) f (x) = (x − 1)(x + 2) = x2 + x − 2

(b) f (x) = x|x|.

e−x +1 ex

19.6 Integration

292

12. Prove the following: Hint: Consider f (x) = ex − x − 1.

ex ≥ x + 1, for all x ≥ 0.

13. Prove the following: sin x ≤ x, for all x ≥ 0. 14. Find the absolute maximum and minimum of the function f (x) = x3 − 3x + 2 on the closed interval [0, 2]. 15. Find all relative maxima and minima and thus sketch the graph of the function f (x) = x + ∗

16. Show that the function f (x) =

1 . x

x ln x , x > 0 0 , x=0

is continuous on [0, ∞). What are the absolute maximum and minimum values of f (x) on [0, 2]? 17. To increase the velocity of the air flowing through the trachea when a human coughs, the body contracts the windpipe, producing a more effective cough. Tuchinsky formulated that the velocity of air that is flowing through the trachea during a cough is V = C(R0 − R)R2 , where C > 0 is a constant based on individual body characteristics, R0 is the radius of the windpipe before the cough, and R is the radius of the windpipe during the cough. Find the value of R that maximizes the velocity. (Interestingly, Tuchinsky also states that X-rays indicate that the body naturally contracts the windpipe to this radius during a cough.)

19.6

Integration

1. Evaluate R (a) (x3 − 2x + 1)dx R (b) x−a dx, a > 0. R (c) sin(x + π3 + 2x)dx R (d) (x + x1 )2 dx

2. Let f be continuous on an interval I. Let F1 and F2 be two different anti-derivatives of f . Show that F1 and F2 differ by a constant only. (Hint: Recall that if f ′ (x) = 0 on an interval I then f is constant on this interval.) 3. Estimate the area under the below function using: (a) the right rectangle rule (Riemann sums) with 2 subintervals, then 4 subintervals then 8 subintervals. (b) the trapezoidal rule with 4 subintervals. 4. What is wrong with the following argument. F (x) = −1/(x − 1) is an anti derivative of the function f (x) = 1/(x − 1)2 . Thus by the fundamental theorem, the area under f on the interval [0, 3] is given by

Z 3 −1

3 −1 −1 3 1 2 dx = (x − 1) = 2 − −1 = − 2 . 0 (x − 1) 0

19.6 Integration

293

y = f(x)

1 1

Does this show a positive function can have a negative area? ∗

5. Find the area enclosed by the graphs of the functions f (x) = x2 and g(x) = x. 6. Derive a formula for the area of the ellipse x2 y 2 + 2 = 1. a2 b 7. Find the area under the graph of √ (a) y = 4 − x2 above [0, 1] (b) y = sec x tan x above [0, π3 ].

8. Evaluate each of the following integrals by interpreting it in terms of areas: (a)

(1 + 2x)dx

(b)

9. Determine

(1 +

√

−3

R∞

9 − x2 )dx

(c)

−2

|x|dx

(d)

R2 √

−2

4 − x2 dx.

xe−x dx.

10. By finding a suitable substitution, evaluate the following indefinite integrals: R ex ∗ (a) 1+e2x dx R dx dx (b) x2 +4x+9 R (c) cos x sin3 xdx.

11. Use integration by parts to find: R ∗ (a) x cos xdx R (b) cos2 xdx R (c) ex cos xdx R R (d) ln xdx (e) x sec2 xdx

12. Use partial fractions to find R x ∗ (a) (x+1)(x+2) dx R x+3 dx (b) x(x+2) R 4x+8 (c) x2 +2x−3 dx.

19.8 Series

294

13. Evaluate the following integrals: R R x (b) 1+e (a) x2x−1 dx 1−ex dx.

14. Evaluate the following indefinite integrals R 2) dt (a) sin(1/t t3 R (b) sin(x) cos(cos(x)) dx

Check by explicitly differentating your answer to confirm it is indeed an anti-derivative.

15. Find the volume of the ellipsoid obtained by rotating the ellipse

x2 a2

y2 b2

= 1 about the x-axis.

16. Let a > 0 be constant. Find the volume V of the solid of revolution obtained by rotating y = f (x) above the interval [0, a] about the x-axis, for the following choices of f (x):

∗

(a) y = x2 p (b) y = ln(x + 1).

17. Use Euler’s formula to express cos(2θ) and sin(2θ) in terms of cos(θ) and sin(θ).

19.7

Sequences

1. Use L’Hˆ opital’s rule to find sin(1/n) . n→∞ log(1 + 1/n) lim

2. Evaluate the limit of the sequence {an }∞ n=1 for the following choices of an : ∗ ∗

(a) an = ncn , 0 ≤ c < 1

(b) an =

1+2n 1+3n .

3. Prove that lim n1/n = 1.

n→∞

19.8 ∗

Series

1. It is known that a repeated decimal can always be expressed as a fraction. Consider, for example, the decimal 0.727272 . . . . Use the fact that 0.727272 · · · = 0.72 + 0.0072 + 0.000072 + . . . to express the decimal as a geometric series. Hence show that 0.727272 · · · =

8 . 11

2. Determine whether or not the following series converge: (a) (b)

∞ P

nn n!

n=1 1 1 1·2 + 2·3

3. (a) Prove that

1 3·4 ∞ P

+ ...

n=1

nxn =

x , (1−x)2

1 d ( 1−x )) (Hint: Consider x dx

for |x| < 1.

19.8 Series

295

(b) Hence, or otherwise, evaluate

∞ P

n=1

n 2n .

4. Consider the p = 1 case of the p-series (called the harmonic series): ∞ X 1 1 1 1 1 1 1 1 1 1 1 = 1+( )+( + )+( + + + )+( + + ··· + ) + ... n 2 3 4 5 6 7 8 9 10 16 n=1

and group together its terms as shown. Show that the sum of each group of fractions is more than 1 2 . Hence deduce that the harmonic series diverges. ∗

5. Evaluate the following series: (a) (b)

∞ P

( √1n −

n=1 ∞ P

√1 ) n+1

(−1)n xn .

n=0

6. (a) For p > 0, show that f (x) = xp is strictly increasing on [0, ∞). (b) For 0 < p < 1, show that

1 1 ≥ , n = 1, 2, 3 . . . np n

∞ X 1 np

n=1

diverges for 0 ≤ p ≤ 1.

7. Determine whether or not the following series converge. (a) ∗

(b)

∗

(c)

∞ P

n=1 ∞ P

n=0 ∞ P

n=1

(−1)n n3n 22n (2n)! ln(n) n! .

8. (a) Write P down from the examples in lectures or the text book two series of positive terms & bn satisfying the following conditions:

an+1

= 1 and the

an converges. lim

n→∞ an m

bn+1

= 1 and lim

bn diverges. n→∞ bn

This shows the ratio test gives no information if the limit is 1. You can accept any proofs of convergence in lectures or the text book but prove both the limits above.

9. Find an example of a sequence an with an = f (n)/g(n) for differentiable functions f and g, where limn→∞ an and limn→∞ f ′ (n)/g′ (n) both exist, and f ′ (n) . n→∞ g ′ (n)

lim an 6= lim

n→∞

This shows before using L’Hˆ opital’s rule we must remember to check that either limn→∞ f (n) = 0 = limn→∞ g(n) or limn→∞ f (n) = ±∞ = limn→∞ g(n)!

19.9 Power series and Taylor series

19.9

296

Power series and Taylor series

1. This question is about the MacLaurin series for (1 + x)3/2 . Find the first three terms of this series. If this series is written as Find a formula for cn so an+1 = cn an . ∗

P∞

n=0 an ,

find the formula for an .

2. Obtain the MacLaurin series expansion for arctan x and its radius of convergence. 1 d (arctan x) = 1+x (Hint: Use dx 2 ). √ Using the fact that tan(π/6) = 1/ 3, show that ∞ √ X π=2 3

(−1)n . n (2n + 1)3 n=0

∗

3. Obtain the MacLaurin series expansions and radius of convergence for the following functions: cosh x =

∗

1 x (e + e−x ) , 2

1 sinh x = (ex − e−x ). 2

4. Show that for |x| < 1, (a) √

1+x = 1+

∞ X (−1)n−1 (2n − 2)!

n=1

(b) √

22n−1 n!(n − 1)!

xn .

∞

X (−1)n (2n)! 1 xn . = 1 + x n=1 22n (n!)2

5. The theory of relativity predicts that the mass m of an object when it is moving at a speed v is given by the formula m0 m= p 1 − v 2 /c2

where c is the speed of light and m0 is the mass of the object when it is at rest. Express m as a MacLaurin series in vc and determine for which values of v this series converges. (Hint: Use a result from the previous question.)

6. Show that the functions (a) f (x) = (1 + x2 )e−x

(b) f (x) = x2 − ln(1 + x2 ) both satisfy f ′ (x) = f ′′ (x) = 0 at x = 0 (so the second derivative test fails here). Show, using Taylor series, that (a) has a relative maximum and (b) a relative minimum at x = 0. ∗

7. Find the sum of each of the following convergent series by recognizing it as a Taylor series evaluated at a particular value of x. (a) 1 +

2 1!

4 2!

8 3!

(b) 1 −

1 3!

1 5!

−

1 7!

1 4

(d) 1 −

100 2!

+ ··· + + ··· +

2n n! + . . . (−1)n (2n+1)! +

...

+ ( 14 )2 + ( 14 )3 + · · · + ( 41 )n + . . . +

10000 4!

+ ··· +

(−1)n ·102n (2n)!

+ ...

19.10 Vectors

19.10

297

Vectors

1. A canoe travels at a speed 3km/h in a direction due west across a river flowing due north at 4km/h. Find the actual speed and direction of the canoe. 2. Draw a clock with vectors v 1 , v 2 , . . . , v 12 pointing from the centre to each of 1, 2, . . . , 12. Describe each of the following vectors. (a) v 1 + · · · + v 12 .

(b) v 1 + v 2 + 3v 3 + v 4 + v 5 + v 6 + v 7 + v 8 + 3v 9 + v 10 + v 11 + v 12 . (c) v 1 + v 2 + v 3 + v4 + v 6 + v 7 + v 9 + v 10 + v 12 . 3. Find the lengths of the sides of the triangle ABC and determine whether the triangle is a right triangle. (a) A = (4, 2, 0), B = (6, 6, 8), C = (10, 8, 6) (b) A = (5, 5, 1), B = (6, 3, 2), C = (1, 4, 4). 4. (a) Suppose u and v are orthogonal vectors. Prove that kuk2 + kvk2 = ku + vk2 . What geometric fact have you proved?

(b) Prove that ku + vk2 + ku − vk2 = 2(kuk2 + kvk2 ). What geometric fact have you proved?

∗

(c) Consider a triangle with side lengths a, b, c. Bisect side c, and draw a line connecting the midpoint to the opposite vertex. Let the length of this line be m. Using vectors prove that a2 + b2 = 2(m2 + (c/2)2 ). This is called Apollonius’ Theorem.     1 4 3    5. Find a non-zero vector in R orthogonal to 2 and 5 . 3 6 →

6. Given the point P = (2, −1, 4), find a point Q where P Q has the same direction as v = 7i+6j −3k. ∗

7. For what values of c is the angle between the vectors (1, 2, 1) and (1, 1, c) equal to 60◦ ? For what values of c are they orthogonal? 8. Determine whether or not each of the following sets of vectors are linearly independent.       1 −1 −1 (a)  2  ,  1  ,  2 . 0 1 1 (b) i + j, i + j + k, i − j.        1 1 1  0   2   2         (c)   0 , 0 , 1 , 0 0 0

9. Prove that the vectors

 4 4  . 1  0 

     1 1 1  0 , 0 , 1  1 −1 0 

 2 form a basis for R3 . What are the components of the vector  1  in this basis? 0

19.11 Matrix Algebra

10. Given

298

     1 −2 2 v 1 =  −5  , v 2 =  1  , v 3 =  −9  , 6 −3 h 

for what values of h is v3 in span{v1 , v 2 }, and for what values of h is {v 1 , v 2 , v 3 } linearly independent? Justify your answer. ∗

11. Which of the following subsets of R3 are vector spaces? Find a basis in each case. (a) {(a, b, a + 2b) | a, b ∈ R}.

(b) {(a, b, c)|a − b = 3c + 1}. 12. Suppose S = {v 1 , v 2 , . . . , v m } ⊂ Rn and let V be the set of all linear combinations of the elements of S, so V = {α1 v1 + α2 v2 + · · · + αm v m | α1 , α2 , . . . , αm ∈ R}. Prove that V is a vector space. 13. If {a, b, c} forms a basis for a 3-dimensional vector space, could {a + b, b + c, c + a} also be a basis? Give your reasons. ∗

14. Let u1 = (0, 0, 3), u2 = (−2, −1, 0), u3 = (1, 0, 0). Which of u1 , u2 , u3 are linear combinations of v 1 = (1, 2, 3), v 2 = (4, 5, 6), v3 = (7, 8, 9)? Do {v 1 , v 2 , v 3 } form a basis of R3 ? 15. Determine if the following vectors are a basis for R3 . Justify your answer.       0 5 6  1  ,  −7  ,  3  5 −2 4 16. Let v 1 = (1, 2, 3), v 2 = (4, 5, 6), v 3 = (−2, −1, 0). Do {v 1 , v 2 , v 3 } form a basis of R3 ? Explain.

∗

17. Let v 1 = (1, 4, 6), v 2 = (2, 0, 1), v 3 = (−1, 12, 16). Let V = span{v 1 , v 2 , v 3 }. Find dim V .

19.11

Matrix Algebra

cos θ sin θ 1. Let A = − sin θ cos θ

. Find simple expressions for A2 and A3 . What do you notice?

2. Find 2 × 2 matrices A, B such that (A + B)2 6= A2 + 2AB + B 2 . ∗

3. Prove that (A + B)T = AT + B T for any m × n matrices A and B. 1 2 4. Let A = . Find all 2 × 2 matrices B such that AB = BA. 3 4 5. Find a 2 × 2 matrix A with no entry of A equal to 0 such that A2 = 0. 6. A matrix A is called idempotent if A2 = A. (a) Find x so that A is idempotent:  2/3 0 x A =  √0 1 0  2 0 1/3 3 

19.11 Matrix Algebra

∗

299

(b) What is the smallest (integer) index N so that B N = 0? A matrix with such a property is called nilpotent.   0 1 0 0  0 0 2 1   B=  0 0 0 1 . 0 0 0 0   1 2 1 7. Let C =  1 1 2  . Show that C is invertible, ie show linear independence of a certain set of 1 3 −1 vectors.   4 -1 2 8. Let A =  3 0 x . For which value(s) of x is A not invertible? 8 -3 2

9. Find the inverse matrix for: 1 0 (a) 0 1 1 2 (b) 3 4   1 2 3 (c)  0 1 2  0 0 1

10. Let



A=

1 5 3 5

2 5 1 5

3 5 2 5 1 5





 −1 2 x  and B =  3 −1 y  . 0 0 z

Find real numbers x, y and z such that B = A−1 .

11. A swimming pool can be filled by 3 pipes A,B and C. Pipe A can fill the pool by itself in 8 hours. Pipes A and C used together can fill the pool in 6 hours and pipes B and C used together can fill the pool in 10 hours. How long does it take to fill the pool if all 3 pipes are used? ∗

12. Use Gaussian elimination to solve the following system of equations, and check your answer by substitution. x1 − 2x2 + 3x3 = −6 4x1 + 2x3 = 2

3x1 + x2 + 2x3 = 3 13. Let xp be a particular solution of Ax = b. Show that every solution of this equation is of the form x = xp + h where h is a solution of the homogeneous equation Ax = 0. 14. Show that the set of solutions of Ax = b, with b = 0 form a vector space for A. If b 6= 0, does the set of solutions still form a vector space? 2 1 15. Show that A = satisfies A2 − 4A + 3I = 0. Hence deduce that A is invertible with inverse 1 2 A−1 = 31 (4I − A).

19.11 Matrix Algebra

300

16. Let a ∈ R. Use Gauss-Jordan elimination to find the inverses of the following matrices. Check by multiplication if your inverse is correct.   1 1 0 ∗ (a)  1 2 α  2 1 1−α   0 0 −1 (b)  1 2 1  . 1 0 α 17. Find the inverses of the following matrices.   1 1 1 (a)  1 2 1  . 2 1 1   1 2 3 1  1 3 3 2   (b)   2 4 3 3 . 1 1 1 1 1 1 (c) . 1 2   1 0 5 18. Let A =  2 -1 2 . Calculate the determinant of A. 3 1 2

19. Determine for which values of x the following determinants vanish:

1−x 2

(a)

3 2−x

1 2

(b)

2 x x

1 1 4

∗ (c) 2 x 1

2x x2 x

∗

20. Evaluate the following determinants

cos θ sin θ

(a)

− sin θ cos θ

3 1 0

(b)

0 6 4

0 0 1

−1 1 3 4

3 2 4 −1

(c)

0 3 1 0

4 1 −1 −5

−1 1 2

(d)

2 1 −1

−1 1 −1

21. Use determinants to establish the linear dependence or independence of the following row vectors: (a) (4, 1, 4), (2, 0, 4), (−1, 2, 2)

19.11 Matrix Algebra

(b) (3, −3, −1), (6, −6, 0), (3, −3, 2) (c) (1, 1, 1), (0, 1, 0), (0, 1, −1)

∗

22. For an invertible matrix A, prove that An is also invertible, for all natural numbers n.

301

APPENDIX - MATHEMATICAL NOTATION

302

Appendix - Mathematical notation

> < ≥, ≤ ∈ ∈ / ⊆ {x|...} ∀ ... n! ⇒ ⇔

is strictly greater than is strictly less than is greater/less than or equal to is an element of is not an element of is a subset of set of all x for which a condition is fulfilled for all continue the pattern n factorial, n! = n · (n − 1) · ... · 3 · 2 · 1 implies, if ... then is equivalent to, if and only if

e.g. e.g. e.g. e.g. e.g. e.g. e.g. e.g. e.g. e.g. e.g. e.g.

5>2 2<5 2 ≤ 2, 4 ≥ 3 3.5 ∈ R 3.5 ∈ /N N⊂R {x ∈ R|x > 0} ∀x ∈ R 1, 2, 3, ..., 9, 10 3! = 3 · 2 · 1 = 6 x = 2 ⇒ x2 = 4 x2 = 4 ⇔ x = 2 or x = −2

21.1 Powers

303

Appendix - Basics

This appendix contains some very important basic mathematical rules. Since this is highschool material, you must know and be able to apply all of them to successfully complete MATH1051. Go through the sections and check your knowledge - do some additional exercises from textbooks if you encounter problems, or check the MATH1040/MATH1050 material on the course websites.

21.1 21.1.1

Powers Notation (product of two real numbers)

The product of two real numbers a and b can be written in either of the following ways: a × b = a · b = ab. 21.1.2

Notation (powers)

The product after multiplying the same real number a n-times is aa · · · a} = an . | {z n factors

Note that an = aan−1 = an−1 a. Then a0 = 1 (for a 6= 0) and a1 = a, aa = a2 = aa1 , aaa = a3 = aa2 . We also have 0n = 0 for n 6= 0. 21.1.3

Power rules

Let a, b, c ∈ R. We set

a−b =

1 . ab

Then

We also have

ab ac = ab+c , a > 0 and c ab = abc . p

aq =

√ q p a ,

with p, q ∈ Z and q > 0. These formulae also apply for some other values of a, p and q but care needs to be taken. 1 For example, a = −1, p = 1. If q = 3, then (−1) 3 = −1. 1 But if a = 2, then (−1) 2 = i, which is not a real but a complex number! 21.1.4

Examples

1. (a × a) × (a × a) × (a × a) = a2 2. a2 × a3 = a2+3 = a5 √ 1 3. a 2 = a √ 1 4. a 3 = 3 a

= a2×3 = a6

21.3 Fractions

21.2

304

Multiplication/Addition of real numbers

21.2.1

Laws (multiplication and addition)

1. a + b = b + a and ab = ba

(Commutative Law)

2. (a + b) + c = a + (b + c) and (ab)c = a(bc)

(Associative Law)

3. a(c + d) = ac + ad = (c + d)a = ca + da 21.2.2

(Distributive Law)

Examples (multiplications of sums)

1. (a + b)(c + d) = ac + ad + bc + bd, as (a + b)(c + d) = a(c + d) + b(c + d) = ac + ad + bc + bd. 2. (a + b)2 = a2 + 2ab + b2 as (a + b)2 = (a + b)(a + b) = a(a + b) + b(a + b) = aa + ab + ba + bb = a2 + 2ab + b2 . 3. (a + b)3 = a3 + 3a2 b + 3ab2 + b3 as (a + b)3 = (a + b)(a + b)2 = (a + b)(a2 + 2ab + b2 ) = a(a2 + 2ab + b2 ) + b(a2 + 2ab + b2 ) = a3 + 3a2 b + 3ab2 + b3 . 4. (a − b)2 = a2 − 2ab + b2 5. (a + b)(a − b) = a2 − b2

21.3 21.3.1

Fractions Rule (multiplying and dividing two fractions)

Let a, b, c, d ∈ Z. Then and

21.3.2

ac a c · = b d bd a b c d

ad a d · = . b c bc

Rule (adding two fractions)

Let a, b, c, d ∈ Z. Then

a b a+b + = . c c c If the two fractions have different denominator, a common denominator needs to be found: ad + bc a c + = . b d bd

21.4 Solving quadratic equations

21.3.3

305

Examples

x2 1 1 x2 + 1 = + =x+ . x x x x

x (x + 1) − 1 x+1 1 1 = = − =1− x+1 x+1 x+1 x+1 x+1

3. Simplify 1 x

1 2 − x+1 x+2

2 1(x + 2) − 2(x + 1) −x 1 − = = Solution: First x + 1 x + 2 (x + 1)(x + 2) (x + 1)(x + 2) 1 2 1 −x 1 −1 ⇒ − = = x x+1 x+2 x (x + 1)(x + 2) (x + 1)(x + 2)

21.4

Solving quadratic equations

A quadratic equation is an equation of the form ax2 + bx + c = 0, or x2 + bx + c = 0 if all coefficients are divided by a. Here a, b and c are given. Quite often you will be required to solve equations of this type, which means we try to determine x. There are many different ways of finding the solution. A few are listed here. They are completing the square, using a formula, or factorizing. 21.4.1

Completing the square

We may solve x2 + bx + c = 0 by first writing the quadratic polynomial x2 + bx + c in the form (x + d)2 + e, and then solving (x + d)2 + e = 0. Applying the formula from example 21.2.2.(2), x2 + bx = (x + 2b )2 − ( 2b )2 (check!) ⇒ x2 + bx + c = (x + 2b )2 − ( 2b )2 + c = (x + d)2 + e, with d =

b 2

and e = c − ( 2b )2 = c − d2 .

√ √ √ To solve: (x + d)2 + e = 0 ⇔ (x + d)2 = −e ⇔ x + d = ± −e ⇔ x = −d − −e or x = −d + −e. 21.4.2

Examples

1. Express x2 + 6x + 5 = (x + d)2 + e. Then solve the equation x2 + 6x + 5 = 0. x2 + 6x + 5 = (x + 3)2 − 32 + 5 = (x + 3)2 − 4. We solve x2 + 6x + 5 = (x + 3)2 − 4 = 0 ⇔ (x + 3)2 = 4 ⇔ x + 3 = ±2 ⇔ x = −5 or x = −1. 2. The method of completing the square can also be used to find the turning point of a parabola. Express f (x) = x2 + 4x + 7 = (x + d)2 + e. Find the turning point of this parabola. Now b = 4 so d = 2. Also c = 7 so e = 7 − 4 = 3. Thus x2 + 4x + 7 = (x + 2)2 + 3. The turning point of a parabola f (x) = (x + d)2 + e is at (−d, e). In this case the turning point is a minimum located at (−2, 3).

21.4 Solving quadratic equations

21.4.3

306

Formula (Solution to quadratic equation)

There is also a formula you can use to solve a quadratic equation. Let a, b, c ∈ R. The solution of the equation ax2 + bx + c = 0 is x=

−b ±

√

b2 − 4ac . 2a

The solution of the equation x2 + bx + c = 0 is b x=− ± 2 21.4.4

b2 − c. 4

Factorizing

Another way of solving quadratic (and higher order polynomial) equations is by factorizing. We try to write the polynomial as a product of sums. You can sometimes use the formulas from Examples 21.2.2.(2),(4) or (5) for this. Factorizing is also used to simplify expressions. If you cannot use one of the formulas from Examples 21.2.2.(2),(4) or (5), you can sometimes factorize the following way. We would like to express x2 + bx + c = (x + p)(x + q). Multiplying out we have x2 + bx + c = (x + p)(x + q) = x2 + (p + q) + pq. We now need to find p and q such that b = p + q and c = pq. Have a look at these examples. 21.4.5

Examples

1. Factorize f (x) = x2 + 6x + 5 using the formula from Example 21.2.2.(2). Then find the zeros. x2 + 6x + 5 = (x + 3)2 − 22 = ([x + 3] − 2)([x + 3] + 2) = (x + 1)(x + 5). The zeros can be read directly from this expression: (x + 1)(x + 5) = 0 ⇒ x = −1 or x = −5. 2. Factorize x2 + 3x + 2 without using the formula from Example 21.2.2.(2). x2 + 3x + 2 = x2 + bx + c. We must find p and q such that b = 3 = p + q and c = 2 = pq. A solution is p = 1 and q = 2, so x2 + 3x + 2 = (x + 1)(x + 2). 3. Factorize x2 − 7x + 6. Now −7 = p + q and 6 = pq. A solution is p = −6 and q = −1, so x2 − 7x + 6 = (x − 6)(x − 1). 21.4.6

Factorizing using common factors

The following formula uses common factors to factorize : a(x + d) + b(x + d) = (a + b)(x + d).

21.5 Surds

21.4.7

307

Examples

1. Factorize x(x + 3) + 2(x + 3). x(x + 3) + 2(x + 3) = (x + 2)(x + 3) setting a = x, b = 2, and d = 3 in the above formula. 2. Factorize (x + 4)(x + 3) + 2(x + 3). (x + 4)(x + 3) + 2(x + 3) = ([x + 4] + 2)(x + 3) setting a = [x + 4], b = 2, and d = 3.

21.5

Surds

The following formula is often used to “simplify” surds: √

a−

√

b= √

a−b √ . a+ b

To see this √ √ √ √ √a + b √ √ a− b = a− b √ a+ b √ √ √ √ ( a − b)( a + b) √ = √ a+ b √ √ ( a)2 − ( b)2 a−b √ √ . = =√ √ a+ b a+ b Similarly, √ √ a−b √ a+ b= √ a− b 21.5.1

Example

Simplify

√

2x + 1 − x

√

x+1

, where x 6= 0

First √

2x + 1 −

√

x+1 = =

⇒

√

2x + 1 − x

√

x+1

= =

√

2 2 √ 2x + 1 − x+1 √ √ 2x + 1 + x + 1 x √ √ 2x + 1 + x + 1 1 x √ √ , where x 6= 0 x 2x + 1 + x + 1 1 √ √ , where x 6= 0 2x + 1 + x + 1