The NLCS Jeju Journal of Pure and Applied Mathematics

Volume 2, Issue 1

Mathematics Society and Mathematics Publication CCA
March 2022


Credits

First of all, we would like to express our sincere gratitude to the following contributors to the very first NLCS Jeju Annual Journal of Mathematics.

The Mathematics Society
Chair: Daniel Yoo (12)
Co-Chairs: Aaren Joonseok Kang (11), Lucy Hyeonbi Jee (12)
Publicity Officer: William Yejune Kim (12)
Secretary: James Daewoong Kang (11)

The Mathematics Publication CCA
Program Leaders: Daniel Yoo (12), William Yejune Kim (12)
LaTeX Editors & Managers: Aaren Joonseok Kang (11), Eddie Kyubin Min (11)
Writers: Kyubin Eddie Min (11)
Link Teacher: Mr William Hebbron

Members: Ju Hyeon Park (12), Joy Dongho Yang (11), Eddie Kyubin Min (11), Joshua Jihong Kim (11), Emma Chae Eun Chung (10), Irene Jiyu An (10), Seoyoung Melissa Min (9), Hanbi Lee (9)
Link Teachers: Mr William Hebbron, Ms Duygu Bulut

Also, we would like to thank the following contributors who are not part of the mathematics societies but have assisted us in various ways: the Marketing Department, for helping us print and publicise our journal, and Mr Tamlyn, for coordinating societies in our school.

© Copyright by the NLCS Jeju Journal of Pure and Applied Mathematics. Formatting by William Kim and Daniel Yoo. Edited by Aaren Joonseok Kang & Eddie Kyubin Min.


Editor's Note

I am delighted to introduce our new NLCS Journal of Pure and Applied Mathematics (JPAM), the first issue of this academic year. There have been many difficulties this year, including the COVID pandemic, so I am all the more impressed by the outstanding participation of our Society members, which allowed the journal to be published on schedule. JPAM provides a truly exciting opportunity to consider the interdisciplinary nature of mathematical concepts and to appreciate their beauty. The objective of JPAM is to publish up-to-date, high-quality and original research papers with a high level of academic rigour. As such, the journal aspires to be entertaining, vibrant and accessible, and at the same time deeply integrative and challenging.

This issue of JPAM consists of three sections. The first, Pure Mathematics, discusses theoretical and methodological approaches to provocative mathematical ideas, exploring the abstractness, purity, simplicity, depth and orderliness of mathematics. The second, Applied Mathematics, shows how mathematics is applied in computer science, physics, economics and the natural sciences. The third and final section is a mini IA by Joe Kim, providing an insight into what to expect in the IB curriculum.

There is no doubt that this year's journal has exceeded our expectations, successfully maintaining JPAM's high academic rigour. The authors who have contributed to this issue have scrutinised volumes of scholarly literature to furnish the best possible understanding of their topics. As the editor of this journal, it was my pleasure to work with these astounding articles, and we all feel that this issue of JPAM will stimulate further exploration of this fascinating field.

My many thanks go to each author, especially to our Overleaf programmers Eddie Min and Aaren Kang for their contribution in putting the articles together, and to Joe for providing the IA for open publication. On a final note, we would like to express our sincere thanks to all readers for their interest in our NLCS Journal of Pure and Applied Mathematics.

James Daewoong Kang, Chief editor


Contents

Pure Mathematics

1. Beauty of Mathematics: Euler's Formula and Identity - Lucy Hyeonbi Jee (Y12)
2. Proving the Existence and Applications of the Number e - James Daewoong Kang (Y11)
3. The Incompleteness Theorem - Joshua Jihong Kim (Y11)
4. Introduction to Large Ordinals - Joy Dongho Yang (Y11)
5. Mobius Inverse Formula and its Usage - Joy Dongho Yang (Y11)
6. Non-Euclidean Geometry - Emma Chae Eun Chung (Y10)
7. Central Limit Theorem - Irene Jiyu An (Y10)
8. Euler's Characteristic - Hanbi Lee (Y9)

Applied Mathematics

9. Matrix in Digital Image - Ju Hyeon Park (Y12)
10. Optimisation & Dynamic Programming - Aaren Joonseok Kang (Y11)
11. Investigating Algorithms to Retrieve and Reconstruct Forum Replies from a Database - Eddie Kyubin Min (Y11)
12. Different Number of Bases - Melissa Seoyoung Min (Y9)

Mini IA

13. The Emergence of π during a Simulation of a Series of Elastic Collisions Between Two Objects and a Wall - Moojo (Joe) Kim (Y12)


Beauty of Mathematics: Euler's Formula and Identity

Lucy Hyeonbi Jee
Year 12, Member of the Mathematics Society
Email: lyjee23@pupils.nlcsjeju.kr

Recommended Year Level: KS4

1 Introduction

Euler's formula, $e^{ir} = \cos(r) + i\sin(r)$, was just a formula that I often used to solve problems, without proof. Since I was already highly interested in $e$ and the polar plane, I wondered about the correlation between $e$ and $i$ and how Euler's formula can be derived. Moreover, it was a shock to me that a power of an imaginary number could be expressed with trigonometric functions, which are periodic. After searching for the proof and investigating it, I recognised that Euler's formula can be proved from various perspectives, such as the Taylor series or calculus. I also learned that Euler's formula and Euler's identity, $e^{i\pi} + 1 = 0$, were called 'the most beautiful theorem in mathematics' back in the 1900s, which made me more curious about how mathematicians ended up giving it that name. When I first looked up the proof, I did not know where to start, because I was unfamiliar with the Taylor series or polar coordinates. Therefore, I decided to investigate the derivation of Euler's formula using different methods, namely the Taylor series and calculus, and to seek its relationship with Euler's identity. I will examine why it is called the most beautiful theorem in mathematics, and I will also investigate the influence and current applications of Euler's formula and Euler's identity.

2 Deriving Euler's formula

2.1 Taylor Series

The Taylor series is an expansion of a function into an infinite sum of terms in a power series, calculated from the function's derivatives at a single point. Using the Taylor series, a function $F(r)$ about the point $r_0$ can be represented as follows:

$$F(r) = F(r_0) + F'(r_0)(r - r_0) + \frac{F''(r_0)}{2!}(r - r_0)^2 + \frac{F'''(r_0)}{3!}(r - r_0)^3 + \cdots = \sum_{n=0}^{\infty} \frac{F^{(n)}(r_0)}{n!}(r - r_0)^n \qquad (1)$$

To prove Euler's formula, we need to present $e^{ir}$, $\cos(r)$ and $\sin(r)$ using the Taylor series about the point $r = 0$, which is also called a Maclaurin series. Let

$$P(r) = \cos(r). \qquad (2)$$

Then,

$$P'(r) = -\sin(r). \qquad (3)$$

If we keep differentiating $P(r)$, the Maclaurin series of $P(r)$ can be written as:

$$\cos(r) = \cos(0) - \sin(0)(r - 0) - \frac{\cos(0)}{2!}(r - 0)^2 + \frac{\sin(0)}{3!}(r - 0)^3 + \frac{\cos(0)}{4!}(r - 0)^4 + \cdots = 1 - 0 - \frac{r^2}{2!} + 0 + \frac{r^4}{4!} + \cdots = 1 - \frac{r^2}{2!} + \frac{r^4}{4!} - \cdots \qquad (4)$$


Since $P(r) = P^{(4)}(r)$, the cycle $\{1, 0, -1, 0\}$ of derivative values repeats, and the sign follows the pattern $(-1)^n$ for the $n$th term. Therefore,

$$\cos(r) = \sum_{n=0}^{\infty} (-1)^n \frac{r^{2n}}{(2n)!}. \qquad (5)$$

If we differentiate both sides of equation 5 with respect to $r$, it can be represented as follows:

$$\frac{d}{dr}(\cos(r)) = \frac{d}{dr}\left(1 - \frac{r^2}{2!} + \frac{r^4}{4!} - \frac{r^6}{6!} + \cdots\right). \qquad (6)$$

Then,

$$\frac{d}{dr}\left(1 - \frac{r^2}{2!} + \frac{r^4}{4!} - \frac{r^6}{6!} + \cdots\right) = -\frac{2r}{2!} + \frac{4r^3}{4!} - \frac{6r^5}{6!} + \cdots = -r + \frac{r^3}{3!} - \frac{r^5}{5!} + \cdots = -\sin(r). \qquad (7)$$

Therefore,

$$\sin(r) = r - \frac{r^3}{3!} + \frac{r^5}{5!} - \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{r^{2n+1}}{(2n+1)!}. \qquad (8)$$

As a result, we can conclude that:

$$\cos(r) + i\sin(r) = \left(1 - \frac{r^2}{2!} + \frac{r^4}{4!} - \frac{r^6}{6!} + \cdots\right) + i\left(r - \frac{r^3}{3!} + \frac{r^5}{5!} - \cdots\right). \qquad (9)$$

Now we need to present $e^{ir}$ using the Taylor series to see if it equals $\cos(r) + i\sin(r)$. We will first expand $e^r$ as a power series and then substitute $ir$ into it. Let $Q(r) = e^r$. Then,

$$\frac{d}{dr}(e^r) = e^r, \quad \frac{d^2}{dr^2}(e^r) = e^r, \quad \ldots \qquad (10)$$

So,

$$\frac{d^n}{dr^n}(e^r) = e^r \ \text{for every } n \geq 0,\ n \in \mathbb{N}. \qquad (11)$$

If we present $Q(r)$ using the Maclaurin series, it will be:

$$Q(r) = e^0 + e^0(r - 0) + \frac{e^0}{2!}(r - 0)^2 + \frac{e^0}{3!}(r - 0)^3 + \frac{e^0}{4!}(r - 0)^4 + \cdots = 1 + r + \frac{r^2}{2!} + \frac{r^3}{3!} + \frac{r^4}{4!} + \cdots = \sum_{n=0}^{\infty} \frac{r^n}{n!}. \qquad (12)$$

If we substitute $ir$ into $Q(r)$,

$$Q(ir) = 1 + ir + \frac{(ir)^2}{2!} + \frac{(ir)^3}{3!} + \frac{(ir)^4}{4!} + \cdots = 1 + ir - \frac{r^2}{2!} - \frac{ir^3}{3!} + \frac{r^4}{4!} + \cdots \qquad (13)$$

$$= \left(1 - \frac{r^2}{2!} + \frac{r^4}{4!} - \cdots\right) + i\left(r - \frac{r^3}{3!} + \frac{r^5}{5!} - \cdots\right). \qquad (14)$$

As equation 9 is equal to equation 14, we can derive that:

$$e^{ir} = \cos(r) + i\sin(r).$$
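As a quick numerical sanity check of this derivation (our own addition, not part of the original article), the short Python sketch below sums the first 30 terms of the series in equation 12 at $z = ir$ and compares the result with $\cos(r) + i\sin(r)$:

import math

def q_series(z, terms=30):
    """Partial sum of the Maclaurin series sum_n z^n / n! (equation 12)."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)   # turn z^n/n! into z^(n+1)/(n+1)!
    return total

r = 1.2345                                # an arbitrary test angle
lhs = q_series(1j * r)                    # series evaluated at ir
rhs = complex(math.cos(r), math.sin(r))   # cos(r) + i sin(r)
print(lhs, rhs)                           # the two agree to ~15 decimals
assert abs(lhs - rhs) < 1e-12

Truncating at 30 terms is more than enough here, since the factorial in the denominator shrinks the tail extremely quickly.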

2.2 Calculus

2.2.1 Differentiation

All complex numbers, written in the form $a + bi$ ($a, b \in \mathbb{R}$), can be expressed in polar coordinates $(R, \theta)$, where $R$ represents the radius of a circle and $\theta$ represents the angle from the positive $x$-axis. Moreover, every complex number can be separated into a real part and an imaginary part. Therefore, $e^{ir}$ can be expressed as:

$$e^{ir} = f(r) + ig(r), \qquad (15)$$

where $f(r)$ denotes the real part and $g(r)$ the imaginary part. If we take the derivative of both sides of equation 15, it will be as follows:

$$\frac{d}{dr}(e^{ir}) = \frac{d}{dr}(f(r) + ig(r)). \qquad (16)$$


This can be expressed as:

$$ie^{ir} = f'(r) + ig'(r). \qquad (17)$$

To show the relationship between $f(r) + ig(r)$ and its derivatives, we multiply both sides of equation 15 by $i$. Then it can be presented as:

$$ie^{ir} = if(r) - g(r). \qquad (18)$$

As the left-hand sides of equations 17 and 18 are the same, we can equate them as follows:

$$f'(r) + ig'(r) = if(r) - g(r). \qquad (19)$$

Because we have complex expressions on both sides, we need to equate the real part and the imaginary part in order for these two expressions to be equal. The real part can be expressed as:

$$f'(r) = -g(r). \qquad (20)$$

And the imaginary part can be expressed as:

$$ig'(r) = if(r), \quad g'(r) = f(r). \qquad (21)$$

From equation 21, we can see that if we have a function $g(r)$, its derivative gives us another function, $f(r)$. Then, if we take the derivative of $f(r)$, we get the negative of the original function, $-g(r)$. So, differentiating the original function $g(r)$ twice gives us the same function back but with a minus sign. This behaviour is uncommon, because most functions, polynomials for example, do not return to their original form after two differentiations. The functions that suit this situation are the trigonometric functions $\cos(r)$ and $\sin(r)$. If $f(r) = \cos(r)$ and $g(r) = \sin(r)$, then $f'(r)$ can be presented as:

$$f'(r) = -\sin(r) = -g(r), \qquad (22)$$

and $g'(r)$ as:

$$g'(r) = \cos(r) = f(r). \qquad (23)$$

Therefore, we can conclude that:

$$f(r) + ig(r) = \cos(r) + i\sin(r). \qquad (24)$$

Combining this with equation 15, we can derive that:

$$e^{ir} = \cos(r) + i\sin(r).$$

2.2.2 Integration

To derive Euler's formula using integration, we first need to set out an equation which converts $\cos(r) + i\sin(r)$ into another form that we can later integrate. So, let

$$\cos(r) + i\sin(r) = z. \qquad (25)$$

Then, if we differentiate both sides of equation 25 with respect to $r$, it will be as follows:

$$dz = (-\sin(r) + i\cos(r))\,dr. \qquad (26)$$

To get equation 26, we use implicit differentiation: $z$ is treated as a function of $r$, so differentiating both sides with respect to $r$ and multiplying through by $dr$ gives the differentials above, as in the chain rule. Then, if we convert equation 26, it will be as follows:

$$(-\sin(r) + i\cos(r))\,dr = ((-1)\sin(r) + i\cos(r))\,dr = (i^2\sin(r) + i\cos(r))\,dr = i(i\sin(r) + \cos(r))\,dr = (iz)\,dr = dz. \qquad (27)$$

As

$$dz = (iz)\,dr, \qquad (28)$$

$$\frac{dz}{z} = i\,dr. \qquad (29)$$

If we integrate both sides of equation 29, it will be as follows:

$$\int \frac{dz}{z} = i\int dr. \qquad (30)$$

Then, evaluating both sides gives:

$$\int \frac{dz}{z} = \ln(z) = i\int dr = ir. \qquad (31)$$


Exponentiating both sides of equation 31 gives $z = e^{ir}$, and as $z$ was already defined in equation 25, $z$ can be presented as:

$$z = e^{ir} = \cos(r) + i\sin(r). \qquad (32)$$

3 Relationship between Euler's formula and Euler's identity

If we substitute $r = \pi$ into Euler's formula, it can be expanded as follows:

$$e^{i\pi} = \cos(\pi) + i\sin(\pi) = -1, \qquad (33)$$

$$\therefore\ e^{i\pi} + 1 = 0,$$

which is Euler's identity. Euler's identity is actually a special case of Euler's formula, which showed me that the beauty of Euler's identity originates from the formula itself. Both Euler's identity and Euler's formula contain $e$, the base of the natural logarithm, and $i$, the unit of imaginary numbers. As $e$ and $i$ are among the five fundamental mathematical constants, the final result of 0 from their combination was paradoxical and intriguing, which made me curious how this can be applied to other areas of mathematics and other subjects.
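A small numerical check of our own (not part of the article): evaluating $e^{i\pi} + 1$ in floating point gives a result that is zero up to rounding error.

import cmath, math

# e^{i*pi} + 1 should vanish; floating-point rounding leaves ~1e-16.
print(cmath.exp(1j * math.pi) + 1)   # (0+1.2246467991473532e-16j)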

4 Application and meaning of Euler's formula and identity

Fig. 1: Circle on the polar plane with radius 1 centred at the origin

Suppose we draw $e^{i\Delta r}$ repeatedly, as in Figure 1. As $r$ increases by $\Delta r$, the right triangle on the real axis rotates at right angles on a very small scale. Then, as $\Delta r$ is the length of arc on the circle in Figure 1, we can see that Euler's formula shows that the circle is actually the fundamental shape representing periodic properties. As the circle is fundamental to geometry and to repeated movements, we can infer that $e^{ir}$ is a periodic formula, which explains further why it can be represented using trigonometric functions. Moreover, Euler's formula and identity can be regarded as fundamental equations for kinetic movements in physics, including electromagnetic fields, circuits and springs. This is primarily because Euler's formula combines imaginary and real numbers, serving as a bridge connecting those two different concepts.

5 Conclusion

As a mathematics professor from Stanford University states, "Euler's equation reaches down into the very depths of existence." Euler combined fundamental mathematical constants, creating a relationship between real and imaginary numbers. Euler's formula and identity were just an unknown area whose beauty I could not recognise until starting this investigation. It was challenging to understand the entire process using different areas of mathematics, and I also had to understand the more profound message of the equation, beyond just a combination of other concepts. However, I finally understood the derivation process through this research and wrote a step-by-step guide from different perspectives. I gained knowledge of new ideas like the Taylor series, drew them on the polar plane, and applied them in physics and other areas of mathematics. This independent investigation enabled me to become a stronger mathematician overall.




Proving the Existence and Applications of the Number e

James Daewoong Kang
Year 11 Mulchat, Secretary of the Mathematics Society
Email: dwkang24@pupils.nlcsjeju.kr

1 Introduction

The number e (also known as the natural number, or Euler's number) is one of the most fundamental mathematical constants, serving as the basis of common mathematical objects such as logarithmic functions and complex numbers. Although this constant is very important, constantly appearing in numerous maths textbooks, people are often unaware of its origin and applications, the key factors that make it so significant and prevalent. The aim of this exploration is to derive the constant e using the principles of sequences and series, and to explore its uses in statistics, calculus, and complex numbers.

2 Derivation: Compound Interest

Although named after the Swiss mathematician Leonhard Euler, this constant was first discovered by Jacob Bernoulli while computing compound interest rates. He noticed that when there is a total of 100% interest over a single year, the yearly growth factor converges to a single value as the interest is paid more frequently throughout the year. Assume that the interest is computed b times a year. The growth factor of the year at each given value of b can then be computed using limits, as the value of $(1 + \frac{1}{n})^n$ as n approaches b:

b | lim_{n→b} (1 + 1/n)^n
1 | 2
100 | 2.70481382942...
10000 | 2.71814592682...
1000000 | 2.71828046915...
... | 2.71828...

It is evident from the table that as interest is applied more frequently over the year (i.e. as the value of b tends to infinity), the growth factor converges to a value of 2.71828..., otherwise known as the constant e. Graphically, as x tends to positive infinity, the value of $y = (1 + \frac{1}{x})^x$ tends to the constant e. Also, e has a non-terminating decimal expansion. The binomial expansion of $(1 + \frac{1}{n})^n$ is

$$\left(1 + \frac{1}{n}\right)^n = \sum_{k=0}^{n} \binom{n}{k}\frac{1}{n^k},$$

and as $n \to \infty$ its terms tend to $\frac{1}{k!}$, leaving an infinite series whose tail approaches 0 but never vanishes. This, in other words, means that e's decimal expansion does not terminate; no finite number of digits can define e.
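To reproduce the table (a quick sketch of our own, not the author's code), one can evaluate the growth factor directly:

import math

# Growth factor (1 + 1/n)^n for increasingly frequent compounding.
for n in (1, 100, 10_000, 1_000_000, 100_000_000):
    print(f"{n:>11}: {(1 + 1/n) ** n:.11f}")
print(f"{'e':>11}: {math.e:.11f}")   # converges towards math.e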



3 Derivation: Probability

The constant e also occurs in probability, in the Bernoulli trial. A Bernoulli trial models an experiment where there are only two outcomes: success or failure. For an experiment with n trials, the probability of having a certain number of successes can be calculated using the binomial distribution. A binomial distribution, given as X ~ B(n, p) where n represents the number of trials and p the probability of success, models the probability of succeeding x times out of n trials. Here, let X be the number of successes over n trials, and let the probability of success each time be 1/n. If we were to picture it, it could be a ballot where selected raffles are put back into the ballot box each time. Then, the probability of having 0 successes is:

$$P(X = 0) = \binom{n}{0}\left(\frac{1}{n}\right)^0\left(1 - \frac{1}{n}\right)^{n - 0} = \left(1 - \frac{1}{n}\right)^{n}$$

Consider now the same setup where n is a large number, say n = 10^9. Then, the probability of having 0 successes out of 10^9 trials is equal to:

$$P(X = 0) = \binom{10^9}{0}\left(\frac{1}{10^9}\right)^0\left(1 - \frac{1}{10^9}\right)^{10^9 - 0} = \left(1 - \frac{1}{10^9}\right)^{10^9} \approx \frac{1}{2.71828\ldots} = \frac{1}{e}$$

In other words, when this result is represented graphically, $y = (1 - \frac{1}{x})^x$ tends towards an asymptote of $y = \frac{1}{e}$ as x tends to ∞. Isn't it interesting that the constant e appears both in compound interest and probability, two distinctively different areas of mathematics? The occurrence of the same constant in different areas of mathematics should be more than a coincidence.

4 Derivation: Calculus

Differentiation is a mathematical principle that enables us to find the gradient at a point on the graph of a function by employing the idea of limits. The reason why I investigated calculus is that both the compound interest method and the probability method involve limits. For this method of derivation, an exponential function, y = a^x, is considered. By using the definition of the derivative, the derivative of the function can be represented as:

$$f'(x) = \frac{d}{dx}(a^x) = \lim_{h\to 0}\frac{f(x+h) - f(x)}{h} = \lim_{h\to 0}\frac{a^{x+h} - a^x}{h} = \lim_{h\to 0}\frac{a^x(a^h - 1)}{h} = a^x\lim_{h\to 0}\frac{a^h - 1}{h},$$

where the magnitude of h is infinitesimal. From chapter 2 above, "Derivation: Compound Interest", it was given that $e = \lim_{x\to\infty}\left(1 + \frac{1}{x}\right)^x$. By substituting $x = \frac{1}{h}$, where h is small,

$$e = \lim_{x\to\infty}\left(1 + \frac{1}{x}\right)^x = \lim_{\frac{1}{h}\to\infty}(1 + h)^{\frac{1}{h}}.$$

Since $\frac{1}{h} \to \infty$ as $h \to 0$:

$$e = \lim_{h\to 0}(1 + h)^{\frac{1}{h}}$$

Now, if I consider the case where a = e and make algebraic manipulations, the derivative f'(x) can be given as below:


$$f'(x) = \frac{d}{dx}(e^x) = e^x\lim_{h\to 0}\frac{e^h - 1}{h} = e^x\lim_{h\to 0}\frac{\left((1+h)^{\frac{1}{h}}\right)^h - 1}{h} = e^x\lim_{h\to 0}\frac{1 + h - 1}{h} = e^x$$

Here, I proved that the derivative of e^x is e^x, and since I used the definition of e from chapter 2, this result does not apply to exponential functions with a base other than e. In other words, this is a unique characteristic of e. In fact, there is another interesting point: f(x) = e^x is a function in which the y-value at any given point on the graph is also the gradient of the graph at that point. From this result, I also became aware that the constant e is what enables us to differentiate or integrate exponential functions of other bases. For instance, when y = a^x is to be differentiated, I can take the log to base e on both sides to change the base as follows:

$$\ln(y) = \ln(a^x)$$

Then,

$$y = e^{\ln(a^x)} = e^{x\ln a}.$$

Differentiating using the chain rule gives:

$$\frac{dy}{dx} = \ln a \cdot e^{x\ln a}.$$

Since $a^x = e^{x\ln a}$:

$$\frac{dy}{dx} = \ln a \cdot a^x.$$

Exponential and logarithmic relations and changes commonly occur in nature: radioactive decay, cell division, and capacitor charging and discharging, to name a few. This suggests that the constant e is related not only to mathematics, but also to nature and science.

4.1 Further Exploration

The derivation above also implies that as $\frac{a^h - 1}{h}$ tends to 1, the value of a tends to e. In order to test this result, I set out a series of calculations to derive the value a tends to as $\frac{a^h - 1}{h}$ tends to 1. The table below substitutes adequate values for the variable a to find the value of a for which $\frac{a^h - 1}{h}$ becomes equal to 1, assuming that $h = 10^{-9}$, a negligibly small number, as intended in the use of first principles:

a | (a^h − 1)/h, with h = 10^{−9}
1 | 0, < 1
2 | 0.6931, < 1
3 | 1.0986, > 1
(2 + 3)/2 = 2.5 | 0.9163, < 1
(2.5 + 3)/2 = 2.75 | 1.0116, > 1
(2.5 + 2.75)/2 = 2.625 | 0.9651, < 1
(2.75 + 2.625)/2 = 2.6875 | 0.9886, < 1
(2.75 + 2.6875)/2 = 2.71875 | 1.0002, > 1
(2.6875 + 2.71875)/2 = 2.703125 | 0.9944, < 1
(2.71875 + 2.703125)/2 = 2.7109375 | 0.9973, < 1
(2.71875 + 2.7109375)/2 = 2.71484375 | 0.9987, < 1
(2.71875 + 2.71484375)/2 = 2.716796875 | 0.9995, < 1
(2.71875 + 2.716796875)/2 = 2.7177734375 | 0.9998, < 1
(2.71875 + 2.7177734375)/2 = 2.71826171875 | ≈ 1

Each subsequent value of a to be substituted into $\frac{a^h - 1}{h}$ was determined by averaging the two values of a that gave the closest values of $\frac{a^h - 1}{h}$ on either side of 1. As shown in the table above, I was able to determine that, as the value of $\frac{a^h - 1}{h}$ approached 1, the value of a tended to 2.7183..., which is the value of e accurate to 5 significant figures. The calculations I performed also showed that the value of a approaches the constant e more closely if a smaller number is used for the small change h.
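The bisection procedure in the table is easy to automate. The following Python sketch is our own reconstruction under the same h = 10^{-9} assumption, not the author's code:

def slope(a, h=1e-9):
    """Gradient estimate (a^h - 1)/h, which approaches ln(a) for tiny h."""
    return (a ** h - 1) / h

lo, hi = 2.0, 3.0              # slope(2) < 1 < slope(3), so e is bracketed
for _ in range(40):            # repeated halving of the bracket
    mid = (lo + hi) / 2
    if slope(mid) < 1:
        lo = mid
    else:
        hi = mid
print((lo + hi) / 2)           # ~2.71828..., i.e. e (limited by rounding in slope)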


5 Derivation: Taylor Series

Through further exploration into the number e, I encountered the Taylor series, which can be used to express any function as an infinite series of polynomials. Through studying the Taylor series, I believed that this would lead me to another definition of e by expanding the function f(x) = e^x into the Taylor series. The Taylor series expands a function in the form of an infinite series around a single point on it, using the function's derivatives. The Taylor series of a function f(x) at x = a is given by:

$$f(x) \approx f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \cdots = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n$$

There are other more complicated derivations of the Taylor series, but I have been able to demonstrate it in a simple way, assuming that the function can indeed be expressed as an infinite series. Assuming that $f(x) = \sum_{n=0}^{\infty} c_n(x - a)^n$, which is the default form of the infinite series,

$$f(x) = c_0 + c_1(x - a) + c_2(x - a)^2 + c_3(x - a)^3 + \cdots, \quad f(a) = c_0$$

$$f'(x) = c_1 + 2c_2(x - a) + 3c_3(x - a)^2 + \cdots, \quad f'(a) = c_1$$

$$f''(x) = 2c_2 + 3\cdot 2\,c_3(x - a) + 4\cdot 3\,c_4(x - a)^2 + \cdots, \quad f''(a) = 2c_2$$

$$f'''(x) = 3!\cdot c_3 + 4\cdot 3\cdot 2\,c_4(x - a) + \cdots, \quad f'''(a) = 3!\cdot c_3$$

So we can detect a pattern and say that the nth derivative of f(x) at x = a is

$$f^{(n)}(a) = n!\cdot c_n.$$

Rearranging gives

$$c_n = \frac{f^{(n)}(a)}{n!}.$$

Hence, plugging this back into the original form gives

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \cdots$$

Since we have proved in chapter 4 that the derivative of e^x is e^x, we can use this to find the Taylor series of e^x. It is evident that $f(x) = f'(x) = \cdots = f^{(n)}(x) = e^x$, so the Taylor series of e^x around the point x = 0 is:

$$e^x \approx e^0 + e^0(x - 0) + \frac{e^0}{2!}(x - 0)^2 + \frac{e^0}{3!}(x - 0)^3 + \cdots = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty}\frac{x^n}{n!}$$

This is, in fact, a special case of the Taylor series called the Maclaurin series (when the function is expanded around x = 0). To verify the value of e through the Maclaurin series, I substituted x = 1 into the equation above, and the result is as follows:

$$e = e^1 = \sum_{n=0}^{\infty}\frac{1}{n!}$$

As the purpose of the Taylor series is to 'approximate' the function, I researched how to use this infinite series to approximate the value of e to ten decimal places, and realised I could use the 'remainder term' of the Taylor series. Taylor's formula with a remainder is $f(x) = T_n(x) + R_n(x : a)$, where $T_n(x) = \sum_{k=0}^{n}\frac{f^{(k)}(a)(x - a)^k}{k!}$ is the Taylor polynomial (the Taylor series truncated after power n) and $R_n(x : a) = \frac{f^{(n+1)}(c)(x - a)^{n+1}}{(n+1)!}$ is the remainder term, for some c between a and x. This is a powerful tool that gives bounds to the error and represents the accuracy of approximating with Taylor polynomials. We first need to compute how many terms our Taylor polynomial needs to approximate e^x accurately to ten decimal places. We have already seen the Taylor series of e^x around x = 0:

$$T_n(x) = \sum_{k=0}^{n}\frac{x^k}{k!}, \qquad R_n(x : 0) = \frac{e^c x^{n+1}}{(n+1)!}.$$

We need

$$R_n(x : 0) = \frac{f^{(n+1)}(c)\,x^{n+1}}{(n+1)!} = \frac{e^c x^{n+1}}{(n+1)!} < 10^{-11}, \ \text{where } 0 < c < 1.$$

Since $1 < e^c < e^1 < 3$,

$$\frac{e^c\cdot 1^{n+1}}{(n+1)!} = \frac{e^c}{(n+1)!} < \frac{3}{(n+1)!} < 10^{-11}.$$

Using technology, it can be found that $\frac{3}{(n+1)!} - 10^{-11}$ first becomes negative, as required by the inequality, when n = 14. Hence, we can use the Taylor polynomial of order 14 to approximate e correct to ten decimal places:

$$\sum_{k=0}^{14}\frac{1}{k!} = 1 + 1 + \frac{1}{2} + \cdots + \frac{1}{14!} = 2.7182818285 \ \text{(to 10 decimal places)}$$

This is a correct 10-decimal-place approximation of e according to what mathematicians have discovered so far.
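This bound is easy to confirm numerically; the sketch below (ours, not the author's) sums the Taylor polynomial of order 14:

from math import factorial, e

approx = sum(1 / factorial(k) for k in range(15))   # k = 0..14
print(f"{approx:.10f}")   # 2.7182818285
print(f"{e:.10f}")        # 2.7182818285 -- matches to ten decimal places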

6 Derivation: Poisson Distribution

The Poisson distribution is a probability distribution used to compute the probability of n events occurring in a fixed interval, given that the mean number of events occurring within that interval is m. Let the random variable X represent the number of events occurring, so that X ~ Po(m). Then, the probability P(X = n) can be calculated by:

$$P(X = n) = \frac{e^{-m}m^n}{n!}$$

The sum of the probabilities should always be equal to 1. Due to the link between infinity and the constant e, I expected that summing the probabilities of all possible values of n would lead me to a meaningful outcome. Hence, I attempted to add all the probabilities:

$$P(X = 0) + P(X = 1) + P(X = 2) + \cdots + P(X = n) + \cdots = 1$$

$$e^{-m}\frac{m^0}{0!} + e^{-m}\frac{m^1}{1!} + e^{-m}\frac{m^2}{2!} + \cdots + e^{-m}\frac{m^n}{n!} + \cdots = 1$$

$$e^{-m}\left(\frac{m^0}{0!} + \frac{m^1}{1!} + \cdots + \frac{m^{n-1}}{(n-1)!} + \frac{m^n}{n!} + \cdots\right) = 1$$

$$e^{-m}\cdot\sum_{n=0}^{\infty}\frac{m^n}{n!} = 1$$

This means that

$$\sum_{n=0}^{\infty}\frac{m^n}{n!} = e^m.$$

As a result of this, we reach the derivation for e (substituting m = 1):

$$e^1 = \sum_{n=0}^{\infty}\frac{1^n}{n!} = \sum_{n=0}^{\infty}\frac{1}{n!}$$

The result yielded in this derivation is the same as that from the Taylor series. Since the constant e occurs in both the binomial distribution and the Poisson distribution, there seems to be a connection between the two. Through further exploration into this, I was able to learn that "the Poisson distribution is actually a limiting case of the binomial distribution, where the number of trials, n, tends to infinity while the probability of success, p, is small." The general form used in calculating probabilities in a binomial distribution is:

$$P(X = x) = \binom{n}{x}p^x(1 - p)^{n-x}$$

In this equation, n, x, and p denote the number of trials, the number of successes, and the probability of success, respectively. By utilising the expected value of the distribution, E(X) = np, we can rearrange the value of p into:

$$p = \frac{E(X)}{n}$$

Now, we can rewrite the whole equation for calculating P(X = x) into:

$$P(X = x) = \binom{n}{x}\left(\frac{E(X)}{n}\right)^x\left(1 - \frac{E(X)}{n}\right)^{n-x}$$

By expressing $\binom{n}{x}$ in the standard combinations form, this can be further simplified into:

$$P(x) = \frac{n(n-1)(n-2)\cdots(n-x+1)}{x!}\cdot\frac{(E(X))^x}{n^x}\cdot\left(1 - \frac{E(X)}{n}\right)^{n-x} = \frac{n}{n}\cdot\frac{n-1}{n}\cdot\frac{n-2}{n}\cdots\frac{n-x+1}{n}\cdot\frac{(E(X))^x}{x!}\cdot\left(1 - \frac{E(X)}{n}\right)^{n}\left(1 - \frac{E(X)}{n}\right)^{-x}$$

Applying the principles of limits explained earlier, this equation can be simplified even more as n approaches infinity, because the first x factors and $(1 - \frac{E(X)}{n})^{-x}$ tend to 1, while $(1 - \frac{E(X)}{n})^{n}$ tends to $e^{-E(X)}$. As a result, we can derive the equation for calculating the probabilities in a Poisson distribution:

$$\lim_{n\to\infty} P(x) = \frac{e^{-E(X)}\cdot(E(X))^x}{x!}$$
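As a numerical illustration of this limiting behaviour (our addition; the values of m, n and x below are arbitrary choices), the binomial and Poisson probabilities can be compared directly:

from math import comb, exp, factorial

m, n, x = 4.0, 100_000, 6
p = m / n                      # small success probability, mean np = m

binom = comb(n, x) * p**x * (1 - p)**(n - x)
poisson = exp(-m) * m**x / factorial(x)
print(binom, poisson)          # agree to ~4 significant figures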


7 Application: Complex Numbers

Roots of polynomials without real solutions can be found by using complex numbers, which consist of both real and imaginary parts. These numbers take the form a + ib, and can be geometrically expressed on the Argand diagram as shown below.

The complex number z in the figure has a modulus of r and an argument of θ, and hence can be represented in the modulus-argument form: z = r(cos θ + i sin θ). Also, complex numbers can be represented in polar form using Euler's formula, e^{ix} = cos x + i sin x, where x is the argument measured in radians. This equation can be proved by separately expressing cos x and sin x using the Maclaurin series mentioned in chapter 5:

$$f(x) = \sin x \ \to\ f(0) = 0$$
$$f'(x) = \cos x \ \to\ f'(0) = 1$$
$$f''(x) = -\sin x \ \to\ f''(0) = 0$$
$$f'''(x) = -\cos x \ \to\ f'''(0) = -1$$

Therefore, sin x can be represented as:

$$\sin x = 0 + x + 0 - \frac{x^3}{3!} + \cdots = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + (-1)^{n-1}\frac{x^{2n-1}}{(2n-1)!} + \cdots = \sum_{n=1}^{\infty}(-1)^{n-1}\frac{x^{2n-1}}{(2n-1)!}$$

Applying the same to cos x, it can be represented as:

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots + (-1)^n\frac{x^{2n}}{(2n)!} + \cdots = \sum_{n=0}^{\infty}(-1)^n\frac{x^{2n}}{(2n)!}$$

As a result, complex numbers can be expressed in the Euler form as below:

$$\cos x + i\sin x = \left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots\right) + i\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots\right) = 1 + ix - \frac{x^2}{2!} - i\frac{x^3}{3!} + \frac{x^4}{4!} + i\frac{x^5}{5!} - \frac{x^6}{6!} - i\frac{x^7}{7!} + \cdots$$

$$= 1 + ix + i^2\frac{x^2}{2!} + i^3\frac{x^3}{3!} + i^4\frac{x^4}{4!} + \cdots = \frac{(ix)^0}{0!} + \frac{(ix)^1}{1!} + \frac{(ix)^2}{2!} + \frac{(ix)^3}{3!} + \cdots = \sum_{n=0}^{\infty}\frac{(ix)^n}{n!} = e^{ix}$$

7.1 Euler's Identity

Since the equation e^{ix} = cos x + i sin x can be used to compute manipulations of complex numbers much faster than in the a + bi form or the modulus-argument form, it is considered a powerful tool in mathematics. In the particular case of x = π,

$$e^{i\pi} = \cos\pi + i\sin\pi.$$

Since cos π = −1 and sin π = 0,

$$e^{i\pi} + 1 = 0.$$

This equation is Euler's identity, often referred to as "the most beautiful equation" because it consists of the most important constants in mathematics: e, i, π, 1, 0. How these five fundamental constants of mathematics could be gathered into a single equation is indeed beautiful, but I was also able to see that Euler's identity is a useful tool for quickly converting negative real numbers into polar form.

8 Application: The Constant e in Nature

Although derived artificially, the constant e also often occurs in nature. For example, shells grow in spiral shapes whose description depends on complex numbers.


For example, take z = r cis θ with r > 1. By De Moivre's theorem, z^n = r^n cis(nθ). As the value of n increases, the modulus r^n increases exponentially whereas the argument nθ increases linearly, as shown in the diagram above. The locus of the point as n increases forms the spiralling patterns which occur in many naturally occurring shapes.
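A small sketch of our own showing this growth: successive powers of z = r cis θ have modulus r^n and argument nθ (the chosen r and θ are arbitrary illustrative values).

import cmath

z = cmath.rect(1.1, 0.3)       # r = 1.1, theta = 0.3 rad
for n in range(1, 8):
    w = z ** n
    # modulus grows like 1.1^n, argument grows like 0.3*n
    print(n, round(abs(w), 3), round(cmath.phase(w), 3))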

9 Conclusion

Indeed, the constant e is widely used throughout mathematics, and is one of the most fundamental concepts underlying major fields like calculus and statistics. This exploration into e is intriguing especially because, unlike π, its derivation and applications are not explicitly taught within the school curriculum. The new knowledge of e in this journal will serve as a motivation for investigating other significant mathematical topics such as the derivative of e^x, Euler's identity, or even the 'derangements' or 'hat-check problem' linked to the derivation of e, studied by Jacob Bernoulli and Pierre Raymond de Montmort.


The Incompleteness Theorem

Joshua Jihong Kim
Year 11 Sarah, Member of the Mathematics Society
Email: jkim24@pupils.nlcsjeju.kr

1 Introduction

In mathematics, proof stems from axioms: basic statements that are assumed to be true. The problem lies in the fact that one never knows whether an axiom is actually true, even though it is the basis of all proof in mathematics. This exploration is an attempt to find a system of proof capable of proving axioms, while discussing the three questions threatening the fundamentals of mathematics, including how they were proven or disproven.

2 The Conventional Proof of Mathematics

As said above, a system of proof starts with axioms, for instance, "The whole is greater than the part." Proofs are then constructed from these axioms using rules of inference, methods for using statements to derive new statements. Due to these rules, if the starting statement, or axiom, is true, then the derived sentences will also be true. Here is a simple example: Suppose n is an integer. If n² is even, then n is even. We will prove this theorem by proving its contrapositive. A contrapositive is a proposition or theorem formed by negating both the hypothesis and the conclusion of a conditional statement and exchanging them. For example, for 'if P then Q (P → Q),' the contrapositive is 'if not Q then not P (∼Q → ∼P).' The contrapositive is logically equivalent to the original conditional statement; if the original is true, then so is the contrapositive, and vice versa. Here is a simple example of it: If it is raining, then I wear my coat. If I don't wear my coat, it isn't raining.

The contrapositive of the theorem we are trying to prove is: Suppose n is an integer. If n is odd, then n² is odd. Since n is odd, we can express n as n = 2k + 1 for some integer k. Next, we square n, thus n² = (2k + 1)². Expanding the square of the binomial gives (2k + 1)² = 4k² + 4k + 1. Factoring 2 out of the first two terms of the trinomial, we have 4k² + 4k + 1 = 2(2k² + 2k) + 1. Since k is an integer, 2k² + 2k must also be an integer by the Closure Property of Addition and Multiplication over the set of integers (ℤ). Let r = 2k² + 2k; thus 2(2k² + 2k) + 1 becomes 2r + 1, which is clearly an odd number. Since we have proved the contrapositive to be true, the original statement must also be true. Therefore, we have proved that if n² is even, then n is even.

3 The Formal System of Proof

The proof above has mathematical statements connected with sentences; however, sentences are highly subjective and do not have a fixed format. David Hilbert (1862 – 1943) pursued a different style: a formal system of proof. A formal system of proof consists of a symbolic logical language with a rigid set of manipulation rules for those symbols. Logical and mathematical systems could then be translated into this system. For instance, the statement

If you drop an apple, it will fall.

would be A ⊃ B. A more complicated example,

No human is immortal.

would be ∼∃(x)(H(x) ∧ I(x)).

Hilbert and the formalists wanted to express the axioms of mathematics as symbolic statements in a formal system and set up rules of inference as the system's rules for symbol manipulation. Rules of inference, as said above, are syntactical transformation rules that one can use to infer a conclusion from a premise (hypothesis).


So Bertrand Russell (18 May 1872 – 2 Feb 1970) and Alfred North Whitehead (15 Feb 1861 – 30 Dec 1940) developed a formal system like this in their three-volume Principia Mathematica, published in 1913. The book consists of 2000 pages full of dense mathematical notation, 762 of which are used to prove the statement 1 + 1 = 2. Although the book may seem complicated and even pointless, it is also exact. It leaves no room for errors or fuzzy logic. Most importantly, it allows one to prove statements about the formal system itself.

4 Hilbert's Three Problems

There were three major questions that Hilbert wanted answered about mathematics:

1. Is mathematics complete?
2. Is mathematics consistent?
3. Is mathematics decidable?

The first asks if there is a way to prove every true statement; the second asks if mathematics is free of contradictions (if one could prove both a and ∼a there would be a problem); and the third asks if there is an algorithm that can always determine whether a statement follows from the axioms. Hilbert believed that the answer to all three questions was yes.

5 Gödel's First Incompleteness Theorem

However, it did not take long for the first question to be answered in the negative. In 1930, a logician named Kurt Gödel claimed to have settled the first question, although the only person who listened to Gödel was John von Neumann. The following year Gödel published a research paper with his proof, which everybody took notice of. This is how Gödel's proof works: every basic symbol of the mathematical system is given a number, known as a Gödel number. The symbol for not (∼) is given the Gödel number 1, the symbol for or (∨) is given the Gödel number 2, if...then (⊃) is 3, and so on.

Positive integers (there are no negative integers in this system) are represented through two symbols: 0 and s. 0 is self-explanatory; it has the Gödel number 6. s means "the immediate successor of," and has the Gödel number 7. To express the integer 2, putting two successor symbols in front of the integer 0 does the job: ss0. In the same way, 3 would be written as sss0. Mathematical expressions can also be expressed in Gödel numbers. For instance, 0 = 0 will be converted to 656, given the symbols' respective Gödel numbers. Now this expression itself is given a Gödel number by taking the prime numbers starting at 2 and raising each one to the power of the Gödel number of each successive symbol in the equation. Therefore, the Gödel number of the expression 656 would be 2⁶ × 3⁵ × 5⁶ = 243,000,000. Hence, a unique Gödel number can be written down for every set of symbols imaginable. What is intriguing about Gödel's system is that one can find out exactly which symbols have been used in an expression by prime-factorising its Gödel number.
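To make the encoding concrete, here is a small Python sketch of our own. The codes for '=', '0' and 's' follow the article's example; the ASCII stand-ins '~', 'v' and '>' for ∼, ∨ and ⊃ are our assumption.

CODES = {'~': 1, 'v': 2, '>': 3, '=': 5, '0': 6, 's': 7}
SYMBOLS = {v: k for k, v in CODES.items()}

def primes(n):
    """First n primes by trial division (fine for short formulas)."""
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found

def godel_number(formula):
    """k-th symbol's code becomes the exponent of the k-th prime."""
    g = 1
    for p, symbol in zip(primes(len(formula)), formula):
        g *= p ** CODES[symbol]
    return g

def decode(g):
    """Recover the formula by prime factorisation of its Gödel number."""
    out = []
    for p in primes(64):           # more than enough for toy formulas
        if g == 1:
            break
        e = 0
        while g % p == 0:
            g //= p
            e += 1
        out.append(SYMBOLS[e])
    return ''.join(out)

print(godel_number('0=0'))   # 2^6 * 3^5 * 5^6 = 243000000
print(decode(243000000))     # '0=0'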



Of course, in this infinite deck of Gödel numbers, there are both true and false statements. Again, axioms are used to prove statements, and proofs also have their own Gödel numbers formed in the same way. For example, Peano Axiom 1, which states that no successor of any number equals 0, is true since there are no negative numbers in this system. This axiom is symbolically written as ∼(sx = 0), and its Gödel number is 538616302983699476208784212315216033281250. Then we can substitute 0 for x, which creates the expression ∼(s0 = 0). This means

the successor of 0 does not equal 0,

which is the simplest proof of the statement "1 does not equal 0". This whole proof also receives its own Gödel number in the same way:

2^(Gödel number of ∼(sx=0)) × 3^(Gödel number of ∼(s0=0))

This Gödel number is unimaginably large; it has 73 million digits. Since even the simplest proofs get tremendously large Gödel numbers, Gödel decided to abbreviate them with letters. For instance, the proof above could be substituted by the Gödel number a. In the same way, all other statements receive a letter for themselves.

Here, Gödel runs into a problem. There is a statement:

There is no proof for the statement with Gödel number g.

This statement, formally notated as ∼∃x(x Dem Sub(g, 17, g)), turns out to be exactly the statement whose Gödel number is g. If the statement is true, then it means that it itself is unprovable; there is no proof among the infinite "pool" of expressions represented by Gödel numbers and letters. If the statement is false, then there would be a proof, and a paradox is created, because that proof proves the statement which states "there is no proof". Ultimately, if the statement is false, then there is a contradiction; if the statement is true, then there are true statements that do not have a proof, which implies that the mathematical system is incomplete. This is Gödel's First Incompleteness Theorem: there will always be true statements about mathematics that cannot be proven.

6 Gödel's Second Incompleteness Theorem

Although Gödel's Second Incompleteness Theorem did not find a perfect answer to Hilbert's second question, it proved the following: any consistent formal system of mathematics cannot prove its own consistency. An inconsistent system can prove any statement, even two contradictory ones; therefore a consistent system must have unprovable statements. This can be notated as

∃y ¬∃x Dem(x, y),

which states that y, the Gödel number of a certain statement, exists while x, a Gödel number for a proof of y, does not. According to the First Incompleteness Theorem, if the mathematical system is consistent, the proof for g (mentioned in the previous section) does not exist. However, the meaning of g is that "there is no proof for the statement with Gödel number g." Therefore, if the mathematical system is consistent, g becomes true,


and if the consistency of mathematics were provable, then g would also be provable. However, because of the First Incompleteness Theorem, the proof for g is nonexistent. So, if the mathematical system is consistent, its consistency cannot be proven. Thanks to this theorem, the answer to Hilbert's second question was also revealed to not be a yes. Taken together, Gödel's two Incompleteness Theorems say that the best one can hope for is a consistent yet incomplete system of mathematics. However, such a system cannot prove its own consistency, implying that some time in the future a contradiction could appear and reveal that the mathematical system had been inconsistent the whole time. Now Hilbert's last hope was the third question: "Is mathematics decidable?"

7 The Turing Machine

In 1936, Alan Turing found a way to settle this question, inventing the modern computer in the process. Turing imagined a mechanical computer powerful enough to carry out any calculation imaginable, but simple enough that its operation could be understood. He came up with a machine, later named the Turing Machine, that takes as input an infinitely long tape of square cells containing 0s and 1s. The machine has a read/write head that can read one digit at a time, then perform one of a few tasks:

1. Overwrite a new value
2. Move left
3. Move right
4. Halt
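As an aside, this behaviour is easy to mimic in a few lines of Python. The sketch below is our own toy model; the table-driven program format and the 'flip the bits' example program are our assumptions, not Turing's notation.

def run(tape, program, state="start", max_steps=1000):
    """Simulate a tiny Turing machine.

    program maps (state, symbol) -> (write, move, next_state),
    where move is "L", "R" or "H" (stay put on halt).
    """
    tape = dict(enumerate(tape))          # sparse tape, blanks by default
    head = 0
    for _ in range(max_steps):
        symbol = tape.get(head, " ")
        if (state, symbol) not in program:
            return "never halts (gave up)"
        write, move, state = program[(state, symbol)]
        tape[head] = write
        head += {"L": -1, "R": 1, "H": 0}[move]
        if state == "halt":
            return "".join(tape[i] for i in sorted(tape)).strip()
    return "never halts (gave up)"

flip = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", " "): (" ", "H", "halt"),   # blank cell: stop
}
print(run("10110", flip))   # -> 01001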

Halting simply means the program has run to completion. The program consists of a set of internal instructions, similar to a flowchart, that tells the machine what to do based on its internal state and the digit just read. All the instructions can be exported to another Turing Machine, which would then perform in exactly the same way. Although it sounds simple, the Turing Machine can perform any computational task if given enough time, from the job of a simple calculator to a complex algorithm that predicts the weather, thanks to its virtually unlimited memory. When a Turing Machine halts, the program has finished and the output is the tape. However, sometimes a Turing Machine may never halt, stuck in an infinite loop. Would there be a way to determine beforehand whether the Turing Machine will halt on a given input? This question is very similar to Hilbert's third question: if one could figure out a way to decide whether a Turing Machine will halt, then it would also be possible to decide whether a statement follows from the axioms. This could be done as follows: let k be the theorem that needs to be proven. Starting from an axiom and using the rules of inference, the Turing Machine can

generate all theorems that can be derived in one step from the axiom and check whether k is among them. If k is not there, the Turing Machine constructs all theorems that can be derived in one step from those, and so on. Each time the machine generates a new theorem, it checks whether it is the theorem k. If it is, the machine halts; if it never finds k, it never halts. Turing then supposed there were a machine that could determine whether a Turing Machine would halt or not on a particular input. The machine is named h: it receives an input and a program, simulates its operation, and prints out either "halts" or "never halts". h is considered flawless, free of any mistakes or malfunctions. h can then be modified by adding new components: if the new component receives the output "halts", it immediately goes into an infinite loop; if it receives the output "never halts", it immediately halts. This entire new machine with the additional components is called h+. The entire program of h+ can be exported. What would happen if h+ received its own exported code as both input and program? Now h is simulating what h+ would do given its own input: h has to determine the behaviour of something it is part of. In this situation, if h concludes that h+ never halts, this makes h+ immediately halt. If h thinks h+ will halt, then h+ is forced to loop. Whatever output the halt-determining machine h gives turns out to be wrong; the h+ in "real life" acts in the opposite way to h's prediction, although h's calculations are flawless. There is a contradiction, and the only explanation is that a machine like h cannot exist. There is no way to tell, in general, whether a Turing Machine will halt on a given input. This means mathematics is undecidable: there is no algorithm that determines whether a statement derives from the axioms. Therefore some problems in mathematics may be unsolvable.

8 The Aftermath

Now it is evident that mathematics is fundamentally flawed; there is a hole at the bottom of mathematics: humanity will never know everything with certainty, and there will always be true statements that cannot be proven. Although it may seem like this discovery would have driven mathematicians into panic, it did not. This new way of thinking was rather beneficial: the problem transformed the concept of infinity, changed the course of the Second World War, and even led to the invention of the computer. Mathematics is not as immaculate as everybody expected it to be; however, it still remains the core of everything we witness and experience.



Introduction to Large Ordinals

Joy Dongho Yang
Year 11 Jeoji, Member of the Mathematics Society
Email: dhyang24@pupils.nlcsjeju.kr

1 What Ordinal Numbers Are

Basically, ordinal numbers are numbers you can count with, such as 1, 2, 3, or 1024, and so on. The difference between ordinal numbers and cardinal numbers, in this article's usage, is that ordinal numbers can be counted, while cardinal numbers are numbers that can only be described mathematically. And so, the set of cardinal numbers includes the set of ordinal numbers: natural numbers are both ordinal and cardinal numbers, but infinity, or Ω, is only a cardinal number, as this number can be defined but cannot be calculated or counted.

2 To Infinity

When expressing big numbers, standard form is usually used: a number is expressed as n × 10^m, used to represent very large or very small numbers. However, can we go further? It turns out that infinity, or a variant of it, can be defined in the realm of ordinal numbers. ω is defined as the number of natural numbers. However, it differs from Ω in the sense that ω treats the number of natural numbers as countable. For easy comparison of the magnitudes of different infinite ordinals, in this article the set of natural numbers will be defined as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and thus ω is assumed to be 10. Another huge difference between ω and Ω is that ω can be used just like a normal ordinal number, so counting beyond ω is possible, such as ω + 1 or ω + 2. This is impossible with Ω, the cardinal, as there is no number Ω + 1 which is bigger than Ω; the existence of +1 contradicts the definition of Ω as 'uncountably large'.

3 Larger Infinities

As ω is an ordinal, many larger infinities can be made. For instance, 2 · ω can be defined; in our definition of ω = 10, 2 · ω is 20. Of course, this is bigger than infinity in our definition, as our set of defined natural numbers is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. After 2 · ω, we can have ω · ω, being 100, or even ω^ω, being 10,000,000,000 in our definition. Another prime difference between these infinities is that when ω increases, infinite ordinals of larger magnitude grow much faster: when ω increases to 11, ω increases by 1 compared to when ω is 10, 2 · ω increases by 2, ω · ω increases by 21, and ω^ω increases by 275,311,670,611. However, we can still go further in terms of magnitude. We can take ω^(ω^ω), reaching 10^10,000,000,000 (a number with roughly 10,000,000,000 digits). Then we can use Knuth's up-arrow notation to define ω ↑↑ ω, meaning that ω is put into a power tower of ωs, ω levels tall; in our case, a tower of ten 10s. From now on, let's define ω as 3 for convenience, so that ω ↑↑ ω becomes a rather reasonable number, 3 ↑↑ 3 = 3^(3^3) = 7,625,597,484,987, since with ω = 10 we were already bigger than the volume of the observable universe in cubic metres, 3.566 × 10^80 m³. The next magnitude will be ω ↑↑↑ ω; in our case 3 ↑↑↑ 3, where 3 is put into a power tower of 3s 7,625,597,484,987 levels tall.
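Knuth's up-arrow notation follows a simple recursion, sketched below in Python (our own illustration; only the smallest cases are actually computable):

def up(a, b, k=1):
    """a ↑^k b: one arrow is exponentiation, more arrows iterate the level below."""
    if k == 1:
        return a ** b
    if b == 1:
        return a
    return up(a, up(a, b - 1, k), k - 1)

print(up(3, 3, 1))   # 3^3 = 27
print(up(3, 3, 2))   # 3 ↑↑ 3 = 3^(3^3) = 7625597484987
# up(3, 3, 3) is a power tower of 3s 7625597484987 levels tall --
# far too large to evaluate.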

4

Getting Meaningless As the numbers become absurdly bigger, this idea of magnitude of infinity went all over the place, as it was easy to define an infinite ordinal bigger than another ordinal, as the generating function for a specific infinite ordinal can be repeated infinitely times to reach a greater magnitude. However, there are still few notable big numbers defined not using infinities; for instance the graham number, the generating formula being gn = 3 ↑(gn−1) 3 where g1 = 3 ↑4 3, or the rayo’s number, which is defined as a smallest number bigger than the largest number that is able to be described in googol symbols of first order set theory symbols. However, note that those numbers are merely created for entertainment purposes; those numbers are so large that it is virtually impossible for the numbers to be used in real life. However, this idea of reaching big ordinal numbers by careful definitions is indeed interesting and deserves

22


examination, although big numbers these days only exist in theory, and cannot be effectively calculated using modern day technology.

23


Mobius Inverse Formula and its Usage

Joy Dongho Yang
Year 11 Jeoji, Member of the Mathematics Society
Email: dhyang24@pupils.nlcsjeju.kr

1 Introduction

The Mobius inverse formula is a formula in number theory introduced by the famous mathematician August Ferdinand Mobius. It is used in various places to compute the values of functions with less calculation when the function itself looks discrete, i.e. is hard to express in a closed algebraic form. To understand this formula, prior knowledge of the Mobius function is needed, as it is the key function that the Mobius inverse formula combines with the inclusion-exclusion principle to compute the value of a function faster.

2 Mobius Function

The Mobius function, μ(n), is a multiplicative function used in the Mobius inverse formula, and is defined as:

μ(n) = 1, if n has no square number as a factor and has an even number of prime factors;   (1)

μ(n) = −1, if n has no square number as a factor and has an odd number of prime factors;   (2)

μ(n) = 0, if n has a square number as a factor.   (3)

The function is multiplicative, meaning that μ(a)μ(b) = μ(ab) when a and b are coprime. This can be argued as follows: an integer with a square factor multiplied by any other positive integer still has a square factor, and the parity of the number of prime factors is preserved when multiplying two coprime numbers. The only exception to this multiplicative behaviour is when the two integers have a common prime factor, in which case the product has a square factor and its Mobius function value is 0, regardless of the parity of the number of prime factors of each of the two numbers.
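The definition translates directly into code. The sketch below (our own, not from the article) computes μ(n) by trial-division factorisation:

def mobius(n):
    """mu(n): 0 if a squared prime divides n, else (-1)^(# distinct primes)."""
    count = 0
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            count += 1
        p += 1
    if n > 1:                   # leftover prime factor
        count += 1
    return -1 if count % 2 else 1

print([mobius(n) for n in range(1, 13)])
# [1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0]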

3 Inclusion-Exclusion Principle

The inclusion-exclusion principle is a principle used frequently in combinatorics, as it is used to count the number of elements in unions of multiple sets. The principle uses a simple method based on parity to decide whether the number of elements in a subset should be added to or subtracted from the running total. For instance, to count the elements in the union of sets A, B, and C, the principle gives n(A ∪ B ∪ C) = n(A) + n(B) + n(C) − n(A ∩ B) − n(A ∩ C) − n(B ∩ C) + n(A ∩ B ∩ C). Here, the elements of A, B, and C are all counted separately in the first step; then the portions counted twice are subtracted from the total to balance the double count; then the portions counted three times are added back, as they were subtracted multiple times when the double-counted portions were removed.

4 The Mobius Inverse Formula

Using the Mobius function and the inclusion-exclusion principle, the Mobius inverse formula can be produced; it applies the inclusion-exclusion principle over an integer's divisors. By defining a summatory function of a wanted function f, namely g(n) = Σ_{d|n} f(d), the relation between the summatory function and the original function can be inverted through the Mobius inverse formula: f(n) = Σ_{d|n} μ(d) g(n/d). By using this relation, the value of the original function or of the summatory function can often be determined quickly and with little calculation, compared to the naive approach of adding up values one by one to get the wanted value.

5 Examples Where it is Helpful

An easy application of the Mobius inverse formula is counting the squarefree numbers smaller than a certain number. The naive approach is to check every single number below the given number, which takes at least about N^(5/4) operations even with an optimised squarefree test; although this is fast for smaller N, numbers bigger than about 10^7 become practically uncomputable this way, taking considerable time on conventional computers. Using better methods to count one by one makes this complexity marginally better, but the time complexity cannot be better than O(N), which is a problem. However, using the Mobius inverse formula, the count can be obtained by evaluating the Mobius function only up to √N: the number of squarefree numbers not exceeding N is Σ_{d ≤ √N} μ(d)⌊N/d²⌋. The time complexity is thus reduced to roughly O(N^(1/2)) Mobius evaluations, which is a significant improvement.
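Combining the identity above with the mobius() sketch from the previous section gives an O(√N)-evaluation counter (again our own illustration, not the author's code):

from math import isqrt

def squarefree_count(N):
    """Q(N) = sum over d <= sqrt(N) of mu(d) * floor(N / d^2)."""
    return sum(mobius(d) * (N // (d * d)) for d in range(1, isqrt(N) + 1))

print(squarefree_count(100))    # 61 squarefree numbers up to 100
print(squarefree_count(10**7))  # fast even for large N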



Non-Euclidean Geometry

Emma Chae Eun Chung
Year 10 Noro, Member of the Mathematics Society
Email: cechung25@pupils.nlcsjeju.kr

1 Introduction

Let's start with a classic interview question, particularly a favorite of tech companies (Umoh, 2018): "You're standing on the surface of the earth. You walk one mile south, one mile west, and one mile north. You end up exactly where you started. Where are you?"

Fig. 1: On a Spherical Surface

One possible answer is the North Pole. Under the premise that the Earth is a perfect sphere, one mile south, west, and north could bring you back to your original point. Yes, it turns out the sum of the interior angles of a triangle isn't necessarily 180°. Welcome to non-Euclidean geometry!

2 Definition

There are countless types of geometries, some of the major branches being Euclidean, analytic, projective, differential, non-Euclidean, and topology. Non-Euclidean geometry is simply defined as any geometry that is not Euclidean geometry, while Euclidean geometry refers to the study of planes and solid figures based on the axioms and theorems employed by the Greek mathematician Euclid (Heilbron, 2019). He proposed five axioms in his book Elements, which were as follows:

1. Given two points, there is a straight line that joins them.
2. A straight line segment can be prolonged indefinitely.
3. A circle can be constructed when a point for its centre and a distance for its radius are given.
4. All right angles are equal.
5. If a straight line falling on two straight lines makes the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, will meet on that side on which the angles are less than the two right angles.

Several centuries later, Hilbert refined Euclid's first and fifth axioms:

1. For any two different points, (A) there exists a line containing these two points, and (B) this line is unique.

5. For any line L and point p not on L, (A) there exists a line through p not meeting L, and (B) this line is unique.

The modern version of Euclidean geometry is not limited to two dimensions (Artmann, 2019). Euclidean geometry is often what comes to mind when thinking of 'geometry': it is the most typical, and carries the properties of 'common sense'. Non-Euclidean geometries, by contrast, are uncommon and definitely not typical.

3 Development

Fig. 2: Euclid's Five Postulates

The development of non-Euclidean geometries can be understood by looking at two historical forks in the mathematical world. The first was via an astronomical pursuit: Euclid endeavored to demonstrate the movement of stars and planets in the hemispherical sky in his book Phaenomena (Henderson, 2019). Even before Euclid, the ancients unknowingly studied non-Euclidean geometry in the mapping of large-scale navigations; people spoke of 'great circles/arcs', segments of circles representing the shortest distance on a spherical surface. Karl Friedrich Gauss, a surveyor who mapped out Europe's vast areas, was the one to make it a proper study. He believed that intrinsic geometry - measurements made along the spherical surface - may differ from extrinsic geometry - the way the surface sits in space. From the intrinsic perspective, confined to the surface of a sphere so large that its curvature is not obvious, the sum of a small triangle's interior angles will be 180°, but a triangle that vastly stretches across the globe will measure over 180° (explaining why similar triangles are always congruent in elliptic geometry). From an extrinsic perspective, without drawing a 'triangle', three points on the sphere will appear to form a triangle, though actually connecting them along the surface gives seemingly curved edges. We are only able to visualise this because we live in three-dimensional space, able to leave the surface and take a shortcut through space to connect the points (www.math.brown.edu, n.d.). Gauss's Theorema Egregium states that the curvature of a surface is preserved no matter how the surface is bent (without stretching); therefore a spherical Earth cannot be converted to a flat surface without deformation - bad news for his fellow cartographers (Bhatia, 2014). This analysis of curved surfaces provided a strong foundation for the later understanding of non-Euclidean geometry, eventually developing into 'spherical geometry'.

The second historical thread starts from Euclid's 5th postulate, aka the parallel postulate. Somewhat less intuitive, it stood out distinctly from the others, and Euclid himself was not satisfied with it: he was reluctant to use it, and never did during the first 28 propositions (Weisstein, n.d.). Many attempted to prove the postulate as a theorem - some by deducing conclusions from the other four, some by proof by contradiction - but gave false proofs or arrived at equivalent forms. Proclus produced an equivalent statement, 'Given a line and a point not on the line, it is possible to draw exactly one line through the given point parallel to the line', later known as Playfair's Axiom after John Playfair famously commented on the parallel postulate by replacing it with this axiom. Wallis's 'proof' of 1663, 'To each triangle, there exists a similar triangle of arbitrary magnitude', was also discovered to be an equivalent. However, Girolamo Saccheri's work is considered the most notable among the attempted proofs, bringing him very close to discovering the existence of non-Euclidean geometry. Saccheri expected to derive a contradiction by assuming the parallel postulate was false, and instead derived non-Euclidean theorems that he himself did not fully understand. One of Saccheri's conclusions was that straight lines could be finite - true in elliptic geometry - though he rejected it on the grounds of Euclid's second postulate. Other mathematicians too, like Legendre, who showed another equivalent of the parallel postulate, 'The sum of a triangle's interior angles is equal to two right angles', failed to unveil non-Euclidean geometry because their conclusions were built on the assumption that a straight line is infinite. A breakthrough came in the 19th century, when the mathematician Bolyai, finding that his attempted proof by contradiction produced no contradiction at all, no longer doubted the existence of non-Euclidean geometries (O'Connor and Robertson, 1996). It was later revealed that Lobachevsky had published his work on non-Euclidean geometry earlier, in 1829, yet Bolyai is still credited along with Lobachevsky for their independent studies (Henderson, 2019).

4 Types and Models

It is believed that there are three geometries: Euclidean, spherical, and hyperbolic. Of these, spherical and hyperbolic geometries are non-Euclidean (Beardon, 2001).

4.1 Elliptic

Elliptic geometry is any geometry occurring on the surface of a sphere. The mathematician Riemann contributed greatly to the intrinsic, analytic view of the sphere, developing the field of elliptic geometry; it is thus sometimes given the name 'Riemannian geometry', and the sphere is called the 'Riemann sphere'. Spherical geometry is a special case, and a good example, of elliptic geometry. A difference that distinguishes spherical geometry from other elliptic geometries is the treatment of pairs of antipodal points (points on a sphere diametrically opposite each other): in elliptic geometry, both antipodal points represent the same point, whereas in spherical geometry they are treated as two different points (Henderson, 2019). Therefore, great circles intersect each other once in elliptic geometry, but twice in spherical geometry. Surfaces in elliptic geometry curve outwards; hence they have positive curvature (Lamb, 2017).



Fig. 3: Elliptic Geometry

4.2 Hyperbolic

In non-Euclidean geometry, hyperbolic geometry refers to the non-Euclidean geometries that reject Euclid's parallel postulate (Hyperbolic geometry | mathematics, 2019). Unlike elliptic geometries, it has no fixed shape, except that every point is a 'saddle point' and the curvature is negative. Below is an example of hyperbolic geometry.

Fig. 4: Hyperbolic Geometry

There are three models of hyperbolic geometry that depict the projections of a hyperbolic surface: the Klein-Beltrami model, the Poincaré disk model, and the Poincaré upper half-plane model. They are all depictions of straight lines on a hyperbolic surface, deforming either angle or distance/straightness for a simplified viewpoint. The Klein-Beltrami model preserves straight lines in exchange for distorting angles; both Poincaré models keep the angles but distort distances (Henderson, 2019).

Fig. 5: Models of Hyperbolic Geometry

5 Properties

Non-Euclidean geometries have numerous uncommon properties.

5.1 Parallel Lines

Such a thing as 'parallel' does not exist in elliptic geometries. On the other hand, in hyperbolic geometries there are at least two parallel lines through a given point for a given line, admitting all but the fifth of Euclid's postulates (Hyperbolic geometry | mathematics, 2019).

5.2 Geodesics

A geodesic is the shortest path from one point to another. In Euclidean geometry geodesics are straight, while in non-Euclidean geometries they are curves that are 'intrinsically straight'. Elliptic geometries have geodesics in the shape of circles, referred to as great circle arcs. In hyperbolic geometries, they take the shape of a hyperbola.

Fig. 6: Shape of Geodesics

5.3 Angles of a Triangle

The sum of a triangle's interior angles is not 180° in non-Euclidean geometries. In elliptic geometries the sum is greater than 180°, and in hyperbolic geometries it is less (Liu, 2014).

Fig. 7: Triangles on a Non-Euclidean Surface



5.4 Area of a Triangle

Girard's theorem states that the area of a triangle in spherical geometry is (taking Σx to be the sum of the interior angles of the triangle)

Σx − π   (1)

The proof is as follows. Take the triangle to be a spherical triangle lying in one hemisphere of a unit sphere (radius 1).

Fig. 8: Spherical Triangle Formed by Lines a, b, c

The lines b and c meet in antipodal points A and A′, defining a lune with area 2α.

Fig. 9: Lune Defined by Lines b and c, Shaded

The sphere is divided into 8 pieces, where ∆′ is the antipodal triangle to ∆ and ∆′ ∪ ∆1 is the lune above, etc. The area of ∆ equals the area of ∆′, the area of ∆1 equals that of ∆′1, and so on. Then

∆ + ∆1 = area of the lune = 2α,   ∆ + ∆2 = 2β,   ∆ + ∆3 = 2γ

Also, 2∆ + 2∆1 + 2∆2 + 2∆3 = 4π

∴ 2∆ = 2α + 2β + 2γ − 2π (May, 2012; www-groups.mcs.st-andrews.ac.uk, 2003)

Fig. 10: 2D Visualisation of Fig. 8

In hyperbolic geometry, the formula is (again taking Σx to be the sum of the interior angles)

π − Σx   (2)

The maximum area of a triangle is π, attained when the interior angles all equal 0°. In this circumstance, the edges never intersect, or the edges are so long that the angles they form approach 0 (Bennett, 2005).

5.5 Quadrilaterals

A Lambert quadrilateral has three right angles. The size of the fourth angle depends on the geometry it lies on: obtuse in elliptic, acute in hyperbolic, and right in Euclidean. A Saccheri quadrilateral has two sides of equal length perpendicular to the base and the other two angles ('summit angles') of equal measure; its summit angles carry the same property.

Fig. 11: Saccheri Quadrilateral
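As a quick check of formula (1): the spherical triangle cut out by the equator and two meridians meeting at the North Pole at right angles has three 90° interior angles, so Σx = 3·(π/2) = 3π/2 and the area is 3π/2 − π = π/2 - exactly one eighth of the unit sphere's total surface area 4π.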

6 Applications

The understanding of non-Euclidean geometry has been useful in other areas of knowledge. Influences can be seen in literature such as Lovecraft's fictional sunken city of R'lyeh, in M.C. Escher's geometric artworks, and even in 'hyperbolic handcrafts', especially after the Latvian mathematics professor Daina Taimina crocheted the first model of a hyperbolic plane (Pitici, 2008).

Fig. 12: Crochet of a Hyperbolic Plane

7 Bibliography

1. Artmann, B. (2019). Euclidean geometry. In: Encyclopædia Britannica. [online] Available at: https://www.britannica.com/science/Euclidean-geometry.
2. Beardon, A. (2001). How Many Geometries Are There? [online] nrich.maths.org. Available at: https://nrich.maths.org/1386 [Accessed 27 Dec. 2021].
3. Bennett, A. (2005). Hyperbolic Geometry - Triangles, Angles, and Area | Mathematical Association of America. [online] www.maa.org. Available at: https://www.maa.org/press/periodicals/loci/joma/hyperbolic-geometry-triangles-angles-and-area [Accessed 27 Dec. 2021].
4. Bhatia, A. (2014). How a 19th Century Math Genius Taught Us the Best Way to Hold a Pizza Slice. [online] Wired. Available at: https://www.wired.com/2014/09/curvature-and-strength-empzeal/.
5. Heilbron, J.L. (2019). Geometry | mathematics. In: Encyclopædia Britannica. [online] Available at: https://www.britannica.com/science/geometry.
6. Henderson, D. (2019). Non-Euclidean geometry | mathematics. In: Encyclopædia Britannica. [online] Available at: https://www.britannica.com/science/non-Euclidean-geometry.
7. Hyperbolic geometry | mathematics. (2019). In: Encyclopædia Britannica. [online] Available at: https://www.britannica.com/science/hyperbolic-geometry.
8. Lamb, E. (2017). A Few of My Favorite Spaces: The Pseudosphere. [online] Scientific American Blog Network. Available at: https://blogs.scientificamerican.com/roots-of-unity/a-few-of-my-favorite-spaces-the-pseudosphere/ [Accessed 27 Dec. 2021].
9. Liu, M. (2014). Dr Mark Liu. [online] Available at: http://www.drmarkliu.com/noneuclidean.
10. May, J. (2012). A Brief Survey of Elliptic Geometry. [online] Available at: https://uwf.edu/media/university-of-west-florida/colleges/cse/departments/mathematics-and-statistics/documents/proseminar/Justine-May-Fall-2012-Proseminar-Paper.pdf.
11. O'Connor, J.J. and Robertson, E.F. (1996). Non-Euclidean geometry. [online] Maths History. Available at: https://mathshistory.st-andrews.ac.uk/HistTopics/Non-Euclidean_geometry.
12. Pitici, M. (2008). Non-Euclidean Geometry. [online] pi.math.cornell.edu. Available at: http://pi.math.cornell.edu/~mec/mircea.html.
13. Umoh, R. (2018). Elon Musk asks this tricky interview question that most people can't answer. [online] CNBC. Available at: https://www.cnbc.com/2018/10/09/why-spacex-ceo-elon-musk-asks-this-tricky-interview-question.html [Accessed 2 Nov. 2021].
14. Weisstein, E. (n.d.). Euclid's Postulates. [online] mathworld.wolfram.com. Available at: https://mathworld.wolfram.com/EuclidsPostulates.html.
15. www-groups.mcs.st-andrews.ac.uk. (2003). Elliptic geometry. [online] Available at: http://www-groups.mcs.st-andrews.ac.uk/~john/geometry/Lectures/L25.html [Accessed 28 Dec. 2021].
16. www.math.brown.edu. (n.d.). The Development of Non-Euclidean Geometry. [online] Available at: https://www.math.brown.edu/tbanchof/Beyond3d/chapter9/section03.html [Accessed 26 Dec. 2021].


Central Limit Theorem Irene Jiyu An Year 10 Noro Member of the Mathematics Society Email: jan25@pupils.nlcsjeju.kr

The central limit theorem states, in probability theory, that in many instances, when independent random variables are added together, their correctly normalised sum tends toward a normal distribution, even if the original variables are not themselves normally distributed. Equivalently, as the sample size grows, the sampling distribution of the sample mean approaches a normal distribution, regardless of the shape of the population distribution; this holds especially well for sample sizes greater than 30. All this implies is that as you collect more samples, particularly large ones, the graph of your sample means will begin to resemble a normal distribution.

Figure 1 gives a visual representation of what the central limit theorem says for one of the most basic experiments: rolling a fair die. The more times you roll the die, the more the distribution of the means comes to look like a normal distribution graph. The theorem also implies that the population mean is equal to the average of your sample means: add together all of your samples' means, take the average, and that average estimates the population mean. Similarly, the standard deviation of the sample means shrinks as the samples grow, equalling the population standard deviation divided by the square root of the sample size. It is a practical phenomenon that can help in precisely predicting a population's features.

Abraham de Moivre, a French-born mathematician, formulated the first version of the central limit theorem. In an article published in 1733, de Moivre used the normal distribution to approximate the number of heads from many coin flips. The idea was unpopular at the time and was swiftly forgotten. However, another great French mathematician, Pierre-Simon Laplace, reintroduced the concept in 1812: in his work "Théorie Analytique des Probabilités", Laplace revived the normal distribution, using it to approximate the binomial distribution.

In modern industrial quality control, the central limit theorem is crucial.

Fig. 1: Distribution of sample means when rolling a fair die (Photo credit: cmglee | Wikimedia Commons)

Identifying the key elements that contribute to undesirable deviations is often the first step in enhancing the quality of a product. After that, efforts are undertaken to control these variables. If these attempts are successful, any residual variation is produced by a large number of factors acting relatively independently of one another. In other words, the central limit theorem can describe the remaining small amounts of variation, which will typically resemble a normal distribution. As a result, the normal distribution serves as the foundation for a number of important statistical quality control



processes. The central limit theorem is the subject of many helpful and intriguing examples and applications in the literature. Some examples, according to one source: the probability distribution of the total distance travelled in a random walk (biased or unbiased) will tend toward a normal distribution; and when you flip a lot of coins, the total number of heads (or, equivalently, the total number of tails) will follow a normal distribution. From another perspective, the central limit theorem explains why density estimates applied to real-world data often look like a "bell curve". In circumstances such as electrical noise or examination grades, we can often regard a single measured value as the weighted average of many minor effects. Using generalisations of the central limit theorem, this often (but not always) results in a final distribution that is approximately normal. In general, the more a measurement behaves like the sum of independent factors with equal influence on the outcome, the more normal it is. This explains why this distribution is frequently used to represent the effects of unobserved variables in models like the linear model.
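As a small illustration of the dice example above, here is a Python sketch (the sample sizes and repeat count are arbitrary choices of ours):

import random
import statistics

# Roll a fair die `size` times and record the mean; repeat many times.
# As `size` grows, the distribution of these means approaches a normal
# curve centred on the population mean 3.5.
def sample_means(size, repeats=10_000):
    return [statistics.mean(random.randint(1, 6) for _ in range(size))
            for _ in range(repeats)]

for size in (1, 2, 10, 30):
    means = sample_means(size)
    print(size, round(statistics.mean(means), 3), round(statistics.stdev(means), 3))
# The mean of the sample means stays near 3.5 while their spread
# shrinks roughly like 1/sqrt(size).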



Euler’s Characteristic Hanbi Lee Year 9 Mulchat Member of the Mathematics Society Email: hblee26@pupils.nlcsjeju.kr

Euler's characteristic is a theorem stating that for every polyhedron, with the number of vertices set as V, edges as E, and faces as F, the quantity V − E + F equals 2. Leonhard Euler was born in Basel, Switzerland in 1707, and he contributed to different branches of mathematics such as trigonometry, geometry, calculus, and number theory. One of his most famous accomplishments is Euler's formula; the characteristic formula V − E + F = 2 is another of his accomplishments. In order to verify it, ignore one face of a polyhedron and flatten the polyhedron onto a planar graph, like the diagram shown below. Since one face is removed, V − E + F should equal 1 at the end if the theorem is true.

Then, separate every face of the shape into triangles on the planar graph. This can be done on each face by connecting the remaining vertices to one another, except for the two adjacent vertices. The values of E and F change, but they increase equally during this process, so the value of V − E + F stays at 1. Next, remove the triangles which have one edge on the outside of the shape. By doing this, the value of V stays the same, but the values of E and F each decrease by one, so there is no difference in the final calculation of V − E + F. Then remove the triangles with two edges on the outside of the shape, which takes away 1 from V, 2 from E, and 1 from F - again leaving the formula equal, because removing 1 vertex, 2 edges, and 1 face changes V − E + F by −1 + 2 − 1 = 0.

When these steps are repeated correctly, there will be only one triangle left in the middle. The value of V − E + F stays the same at every step, so the value of V − E + F of the leftover triangle and that of the original flattened shape must be equal. A triangle has 3 vertices, 3 edges, and 1 face; substituting these values, V − E + F equals 3 − 3 + 1 = 1. If the face ignored at the start is added back into the calculation, the whole equation equals 2, because the value of F increases by 1 no matter what value F had. This proves that V − E + F equals 2 for a polyhedron.



However, the method that Euler originally used to prove his characteristic was different from the verification shown above. He thought of a polyhedron as a balloon and imagined it in the form of a sphere. Removing a vertex from the surface of the sphere and deflating it, the new shape has 2 fewer vertices, 3 fewer edges, and 1 fewer face compared to the original polyhedron. Since (V − 2) − (E − 3) + (F − 1) = V − E + F − 2 + 3 − 1 = V − E + F, the change is 0: the value of V − E + F of the new shape is equal to the V − E + F value of the original polyhedron. When the edges have been removed in the sphere state and the shape flattened, a tetrahedron is left, which has 4 vertices, 6 edges, and 4 faces. Substituting these values gives 4 − 6 + 4 = 2, which leads to the conclusion that V − E + F is equal to 2 for every polyhedron. The core of Euler's proof is the image of air being blown into a polyhedron, inflating it into a sphere: the value of V − E + F is determined by the figure made by inflating. A quantity that does not change under such variations is called a topological invariant, and V − E + F is one of the first topological invariants discovered in the history of mathematics. In other words, V − E + F constantly being 2 triggered the question, 'what does not change even though modification is applied?', and the product of the effort to answer it was topology: the study of the properties which are maintained through the deformation, stretching, and twisting of objects.
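A quick numeric check of the theorem for a few well-known polyhedra, in Python (the (V, E, F) triples are the standard values):

# (vertices, edges, faces) for some convex polyhedra
polyhedra = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "dodecahedron": (20, 30, 12),
}

for name, (V, E, F) in polyhedra.items():
    print(name, V - E + F)  # prints 2 for each polyhedron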



Matrix in Digital Image Ju Hyeon Park Year 12 Halla East Co-chair of the Mathematics Society (Applied Mathematics in Natural Sciences) Email: juhpark23@pupils.nlcsjeju.kr

1 Introduction

A digital image is a picture stored on a computer in the form of numbers that computers can understand. It is composed of pixels, the smallest units of information in an image, and it can also be represented as matrices - rectangular arrays of numbers. In this article, the colouring and processing of an image will be explained using different types of matrix.

2 Matrix Colouring

In this section, how monochromatic and colour images are converted into numbers will be explained. The idea of bit depth will also be introduced: it refers to the amount of colour information stored in an image. The higher the bit depth of an image, the more colours it can store.

Fig. 1: Binary Pixel Art

2.1 Monochromatic Images

2.1.1 Binary Image

A binary image contains only pixels whose entries in the corresponding matrix are either "1" or "0" - a logical matrix. These two numbers can only show two colours, without any intermediate shades; the most common colours used are white and black. A binary image is also known as a 1-bit image because it can store 2^1 possible colours, making it the simplest image among the bit depths. Figure 1 shows binary pixel art: in this case, "0" is shown as black, while "1" is shown as white.

2.2 Grayscale

A grayscale image, also known as an intensity image, is a data matrix that shows a range of monochromatic shades from black to white. In a grayscale image, each pixel is represented by one matrix element: an integer from 0 (black) to 255 (white). Therefore, as each image pixel is represented by the corresponding element of the matrix, it can represent 256 different shades. A grayscale image is an 8-bit image, since it can store 2^8 possible colours. Figure 2 is the image that the matrix in figure 3 represents.

Fig. 2: Grayscale Image

2.3 Colour Images

Every colour can be made by combining red, green, and blue light. As in the grayscale matrix, each colour channel is represented by a number from 0 to 255, where 255 is the maximum intensity/brightness. For example, table 1 shows how different values of R, G, and B create different colours.



Fig. 3: Grayscale Matrix

colour   red   green   blue
red      255     0       0
purple   255     0     255
white    255   255     255
black      0     0       0

Table 1: Combination of RGB

When all RGB channels are at their maximum value, white is shown; when all are at their minimum value, black is shown. By specifying the brightness of R, G, and B in this way, any colour can be formed.
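To make Table 1 concrete, here is a tiny Python sketch of our own, representing a 2x2 colour image as a matrix of (R, G, B) triples:

# Each pixel is an (R, G, B) triple with channel values from 0 to 255.
image = [
    [(255, 0, 0),     (255, 0, 255)],  # red,   purple
    [(255, 255, 255), (0, 0, 0)],      # white, black
]

r, g, b = image[0][1]
print(r, g, b)  # 255 0 255 - the purple pixel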

3 Image Processing

The introduction of techniques to break an image into data, or numbers, allowed the photography industry to develop easier and faster editing methods. In this section, different image-processing operations involving matrices will be explained.

3.1 Kernel

A kernel is a small 3x3 or 5x5 matrix used in image processing - specifically, for blurring, sharpening, embossing, edge detection, and more. Kernels are also used in machine learning for 'feature extraction', a technique for detecting and isolating the most important portions of an image. To use a kernel, the original image is first transformed into a matrix so that it consists of numerical data. Then the kernel is applied to the matrix - not by normal matrix multiplication, but by a process called 'convolution'. The kernel is therefore also called a convolution matrix: a matrix of values showing how the neighbouring pixels contribute to the state of the centre pixel in the final image.

Fig. 4: Zoomed Part of an Image

Figure 4 is an extremely zoomed part of an image. When the kernel is used to process the image, a target pixel is specified, and the pixels surrounding the target are used to calculate its new value. After the new value of the target pixel is calculated, the target moves to another pixel, so that every pixel in the image is eventually recalculated. In this article, each pixel will be represented with a letter to identify it easily.

3.2 Blur

3.2.1 Box Blur

(1/9) × | 1 1 1 |
        | 1 1 1 |
        | 1 1 1 |

Box blur is the process in which this kernel is convolved with the image. It blurs the image because, after convolution, each pixel has a value equal to the average of its surrounding pixels.

Fig. 5: Original Image

Fig. 6: Image Pixel

For example, figure 6 shows a 3x3 matrix of pixels taken from the image in figure 5. For the box blur process, each pixel is first converted into numbers containing its colour information, and the matrix is then convolved with the box blur kernel. More specifically, every target pixel is calculated into a new value: for figure 6, E is the target value, and the other pixels are the surrounding pixels used to help calculate the new value of E.

(1/9) × (a×1 + b×1 + c×1 + d×1 + e×1 + f×1 + g×1 + h×1 + i×1) = new value of e

This is the process of calculating the new value of E, the target value: all pixels in figure 6 are multiplied by 1, added together, and multiplied by 1/9, so the result is the average value of all pixels in figure 6. Because the box blur matrix turns each pixel into the average of its 3x3 neighbourhood, the image loses detail.

3.2.2 Gaussian Blur

(1/16) × | 1 2 1 |
         | 2 4 2 |
         | 1 2 1 |

Gaussian blur is another type of blur, used to reduce image noise and detail. It differs from box blur because it calculates an average in which each pixel carries a different weight: the central target pixel gets the highest priority, and the further a pixel is from the centre, the lower its priority. The kernel above is one example of a 3x3 Gaussian blur.

(1/16) × (a×1 + b×2 + c×1 + d×2 + e×4 + f×2 + g×1 + h×2 + i×1) = new value of e

With Gaussian blur, a more natural blur effect is obtained because peripheral pixels are weighted according to their distance from the centre pixel.

3.3 Sharpening

|  0 −1  0 |
| −1  5 −1 |
|  0 −1  0 |

Sharpening is used to find the difference between the centre pixel and its neighbours, and to enhance that difference even further; this process is called differentiation. The matrix above is one example of a 3x3 sharpening kernel.

(a×0 + b×(−1) + c×0 + d×(−1) + e×5 + f×(−1) + g×0 + h×(−1) + i×0) = new value of e

Once the image is convolved with this kernel, contrast is added by accentuating bright and dark areas.
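The arithmetic above can be expressed as a short, self-contained Python sketch. This is our own minimal illustration (the function name and test matrix are invented): it applies a 3x3 kernel to a grayscale matrix, leaving the one-pixel border untouched for simplicity, whereas a real implementation would also pad the edges and clamp results to the 0-255 range.

def convolve(image, kernel, factor=1.0):
    # image: 2D list of grayscale values; kernel: 3x3 list of weights.
    # Each interior pixel becomes the weighted sum of its 3x3 neighbourhood.
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]        # border pixels are kept unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    acc += image[y + dy][x + dx] * kernel[dy + 1][dx + 1]
            out[y][x] = acc * factor
    return out

box_blur = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # apply with factor 1/9
img = [[0, 0, 0, 0],
       [0, 90, 90, 0],
       [0, 90, 90, 0],
       [0, 0, 0, 0]]
print(convolve(img, box_blur, factor=1 / 9))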

4 Conclusion

Other than the box blur, Gaussian blur, and sharpening kernels shown in this journal, there are many more kernel matrices for image processing. It is still an area whose applications are ever increasing, contributing to many other areas such as broadcasting, exploration of the universe, innovation in mobile devices, and more.

A Bibliography

[1] Web.stanford.edu. n.d. Image-1 Introduction to Digital Images. [online] Available at: <https://web.stanford.edu/class/cs101/image-1-introduction.html> [Accessed 13 February 2022].



[2] Dirce, U.P. and Humberto, J.B. Dmuw.zum.de. n.d. Matrices and Digital Images. [online] Available at: <http://dmuw.zum.de/images/6/6d/Matrixen.pdf> [Accessed 19 January 2022].
[3] Etc.usf.edu. n.d. What is bit depth? » Images » Windows » Tech Ease. [online] Available at: <https://etc.usf.edu/techease/win/images/what-is-bit-depth/> [Accessed 13 January 2022].
[4] Vinny, D. Medium.com. 2019. Computer Vision for Busy Developers: Convolutions. [online] Available at: <https://medium.com/hackernoon/cv-for-busy-developers-convolutions-5c984f216e8c> [Accessed 13 January 2022].
[5] Powell, V. (n.d.). Image Kernels. [online] setosa.io. Available at: https://setosa.io/ev/image-kernels/ [Accessed 13 Jan. 2022].
[6] Christensson, Per. Grayscale Definition. TechTerms, 01 April 2011. Web. Available at: <https://techterms.com/definition/grayscale> [Accessed 13 Jan. 2022].
[7] Vučković, V. (2008). IMAGE AND ITS MATRIX, MATRIX AND ITS IMAGE. [online] 12, pp.17-31. Available at: http://elib.mi.sanu.ac.rs/files/journals/ncd/12/ncd12017.pdf [Accessed 10 Dec. 2019].
[8] White, C. (2017). Application of linear Algebra to image processing. [online] gcsu.edu. Department of Mathematics, Georgia College. Available at: https://www.gcsu.edu/sites/files/page-assets/node-808/attachments/white.pdf [Accessed 14 Jan. 2022].



Optimisation & Dynamic Programming Aaren Joonseok Kang Year 11 Geomun Co-chair of the Mathematics Society (Applied Mathematics in Natural Sciences) Chair of the LAN Society Email: jskang24@pupils.nlcsjeju.kr

Reviewers: Eddie Kyubin Min (Member of the Mathematics Society, Chair of the Algorithm Society) and Ms Sarah Roberts (Teacher of Mathematics)

Recommended Year Level: KS4, KS5 Keywords: Optimisation, Dynamic Programming, Bellman Equation, Algorithms

1 Introduction

1.1 Optimisation

There are solutions we consider "elegant" in mathematics, engineering, and computer science. "Elegant" solutions often refer to solutions that reach the answer in the simplest way with the smallest effort. This can be achieved in different ways in different areas; for example, when solving problems in computer science, we can consider a solution elegant if it runs within the most efficient time complexity and space complexity. Optimisation is frequently involved in creating an "elegant" solution; this article aims to present one method of optimisation - dynamic programming.

1.2 Dynamic Programming

Dynamic programming (DP) is a method implemented for optimising a problem in various fields of study: mathematics, computer science, economics, and even linguistics. The technique can be implemented when a problem has two key properties: overlapping subproblems and optimal substructure (explained in detail in 4.1). Dynamic programming was first implemented by the American mathematician Richard Bellman, who is well known for his discoveries around dynamic programming, including the Bellman equation, the Hamilton-Jacobi-Bellman equation of control theory, and the Bellman-Ford algorithm. Bellman's works - especially dynamic programming and the Bellman equation - are frequently applied in modern mathematics and computer science. Beginning with an explanation of the Bellman equation and its application in reinforcement learning, this exploration aims to present the basic techniques and applications of dynamic programming.

2 Bellman Equation

2.1 Reinforcement Learning

The Bellman equation is an equation frequently used in "reinforcement learning" (RL). Reinforcement learning is the training process used in machine learning to find the maximum expected return through a sequence of decisions. Machine learning and reinforcement learning are rooted in discrete mathematics, involving concepts such as Markov chains and the searching of graphs. Before discussing how the Bellman equation is applied to reinforcement learning, understanding the basic concepts of reinforcement learning is crucial.

2.2 Key Terminologies

The following list defines the key terminologies for understanding reinforcement learning.

1. Agent - The agent is the thing/object that observes the environment and finds the optimal action. It interacts with the environment to gain the maximum reward.

2. State - The state is how the agent defines its current situation. For example, if the agent defines itself as




being in a classroom, the state would be the classroom. However, states can be more specific: facing north in a classroom of about 3 m² with 2 tables and 4 chairs. The state at time t is represented as S_t.

3. Action - An action is a decision made depending on the current state of the agent. After the action is done, the state of the agent changes. The action at time t is represented as A_t.

4. Reward - The reward is the result that the agent receives after an action has been done. For example, in the board game of "Go", winning the game would be the reward. Since the agent starts with no prior knowledge of the environment, it learns from the reward it can gain. The reward uses the notation R_s^a, which considers the action a and state s. The equation is modelled as:

R_s^a = E[R_{t+1} | S_t = s, A_t = a]   (1)

5. Discount Factor - The discount factor is used to distinguish two or more ways of reaching the same reward and to find the optimal one. γ represents the discount factor and is a value between 0 and 1. For example, if the reward is expected now, it is weighted by 1 (that is, γ^0); if it is given at the next step, it is weighted by γ, and if it is two steps away, by γ².

6. Return - The return is the sum of all the discounted rewards from time t. G_t is the total discounted reward accumulated from time t. The equation is modelled as:

G_t = R_{t+1} + γR_{t+2} + γ²R_{t+3} + ··· = Σ_{k=0}^{∞} γ^k R_{t+k+1}   (2)

7. Policy - The policy gives the action that the agent takes in a given state. The goal of reinforcement learning is to find the optimal policy - the one with the maximum reward. The policy π is written:

π(a|s) = P[A_t = a | S_t = s]   (3)

It is the probability of taking action A_t in the state S_t.

8. Stochastic environment - Reinforcement learning always happens in a stochastic environment: an environment whose nature is random and cannot be determined in advance.

2.3 Two Key Properties of Reinforcement Learning

There are two key characteristics in reinforcement learning:

1. Trial and Error
2. Delayed Reward

One of the most important concepts used in reinforcement learning is the trial-and-error process the "agent" goes through: reinforcement learning is a process of the "agent" learning the best method to reach the maximum "reward" with the least effort in the given "state". The second characteristic is the delayed reward the "agent" can experience. Reinforcement learning depends heavily on the "state" the "agent" is in: an interference from the "state" can make the "reward" smaller by forcing the "agent" to do something else, but a coincidental interference can also lead to a better reward. Consequently, these factors can lead to a delayed reward.

2.4 Markov Decision Process

The word "Markov" is significant in discrete mathematics. Named after the mathematician Andrey Markov, it is now used as an adjective describing a state which carries all the information of all the previous actions and states. This is modelled by the following equation:

P[R_{t+1} = r, S_{t+1} = s′ | S_0, A_0, R_1, ..., R_t, S_t, A_t] = P[R_{t+1} = r, S_{t+1} = s′ | S_t, A_t]   (4)

The equation states that the probability of the next reward and state, conditioned on all the actions, rewards, and states chosen and done previously, simplifies to the probability conditioned only on the current state and action, which carry all the information of the previous states, actions, and rewards.

The Markov Decision Process relies on five factors: S, the finite set of states; A, a finite set of actions; P, a state transition probability matrix; R, the reward function; and γ, the discount factor. The state transition probability is the probability of the agent arriving at state s′ from state s by doing action a, modelled as:

P_{ss′}^a = P[S_{t+1} = s′ | S_t = s, A_t = a]   (5)

Thus, if the environment is "Markov", the Markov Decision Process can be involved in making decisions. To simplify, a Markov Decision Process is determined by the five factors, represented as the tuple (S, A, P, R, γ).

2.5 Value Functions

There are mainly two types of value functions used in reinforcement learning: the state-value function and the action-value function.



The value function is a function which takes a state s as a parameter, with the notation v(s), and returns the expected return from state s at time t. The equation can be modelled as:

v(s) = E[G_t | S_t = s]   (6)

There is another important factor which alters the agent's decisions and states: the policy. With the policy included, the value function becomes the state-value function. To show clearly that the policy is taken into consideration, the state-value function is written:

v_π(s) = E_π[G_t | S_t = s]   (7)

The equation is the expected return from state s following the policy π. Even though the value function looks very simple, it has a very significant role in the agent choosing the optimal path: a path is considered "good" if its variance is small and if it converges within a relatively short time and in an efficient manner. On the other hand, there is the action-value function, which returns the expected return from state s with action a being taken. With the policy π taken into account as well, the equation is modelled as:

q_π(s, a) = E_π[G_t | S_t = s, A_t = a]   (8)

Action-value functions evaluate each action that can be taken from the current state of the agent and choose the best path.

2.6 Bellman Expectation Equation

Again, as there are two types of value functions - state-value functions and action-value functions - there are two Bellman expectation equations: the Bellman equation for the state-value function and the Bellman equation for the action-value function. To present the Bellman equation for the state-value function first, the state-value function of the current state (S_t = s) is expressed through the state-value function of the next state (S_{t+1}). The derivation from the original state-value function is as follows:

v_π(s) = E[G_t | S_t = s]
       = E_π[R_{t+1} + γR_{t+2} + γ²R_{t+3} + ··· | S_t = s]
       = E_π[R_{t+1} + γ(R_{t+2} + γR_{t+3} + ···) | S_t = s]
       = E_π[R_{t+1} + γG_{t+1} | S_t = s]
       = E_π[R_{t+1} + γv_π(S_{t+1}) | S_t = s]   (9)

Hence, the Bellman equation for the state-value function is:

v_π(s) = E_π[R_{t+1} + γv_π(S_{t+1}) | S_t = s]   (10)

To briefly explain the derivation: first, the return G_t is expanded using the equation mentioned above:

G_t = R_{t+1} + γR_{t+2} + γ²R_{t+3} + ··· = Σ_{k=0}^{∞} γ^k R_{t+k+1}   (11)

Then, excluding R_{t+1}, γ is factored out of every other term, which simplifies the remainder of the expression to:

γ(R_{t+2} + γR_{t+3} + ···) = γG_{t+1}   (12)

With this calculation, we can then substitute v_π(S_{t+1}) for G_{t+1}, which results in the final Bellman equation for the state-value function.

Similar steps can be taken to express the action-value function of the current state S_t = s and action A_t = a through the action-value function of the next state and action, S_{t+1} and A_{t+1}, respectively. The Bellman equation for the action-value function is derived as follows:

q_π(s, a) = E[G_t | S_t = s, A_t = a]
          = E_π[R_{t+1} + γR_{t+2} + γ²R_{t+3} + ··· | S_t = s, A_t = a]
          = E_π[R_{t+1} + γ(R_{t+2} + γR_{t+3} + ···) | S_t = s, A_t = a]
          = E_π[R_{t+1} + γG_{t+1} | S_t = s, A_t = a]
          = E_π[R_{t+1} + γq_π(S_{t+1}, A_{t+1}) | S_t = s, A_t = a]   (13)

Hence, the Bellman equation for the action-value function is:

q_π(s, a) = E_π[R_{t+1} + γq_π(S_{t+1}, A_{t+1}) | S_t = s, A_t = a]   (14)

The derivation is very similar to the one for the state-value function. Firstly, G_t is expanded in the same way, and γ is factored out of every term excluding R_{t+1}, resulting in the expression:

γ(R_{t+2} + γR_{t+3} + ···)   (15)

Again, simplifying R_{t+2} + γR_{t+3} + ···, this expression can be written as γG_{t+1} and hence, inside the expectation, as γq_π(S_{t+1}, A_{t+1}).

Bellman Optimality Equation

Bellman's optimality equation is the equation which finds the policy with the maximum reward. For the optimal state-value function, it finds the case of the state-value function which returns the biggest reward; the optimal action-value function, on the other hand, finds the policy which maximises the return of the particular action-value function. Optimal state-value functions have the form v∗(s), where ∗ represents the optimal policy. The function is modelled as:

v∗(s) = max_π v_π(s)   (16)

The optimal action-value function uses q∗(s, a) to express the function. Hence, the equation can be modelled as:

q∗(s, a) = max_π q_π(s, a)   (17)
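To make equations (16) and (17) concrete, the sketch below runs value iteration - repeatedly applying the Bellman optimality update - on a tiny hand-made MDP. The two states, two actions, and all transition probabilities and rewards are invented purely for illustration.

# Value iteration on a toy MDP (all numbers are made up for illustration).
# P[s][a] lists (next_state, probability) pairs; R[s][a] is the immediate reward.
P = {0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 1.0)]},
     1: {0: [(0, 1.0)],           1: [(0, 0.5), (1, 0.5)]}}
R = {0: {0: 0.0, 1: 1.0},
     1: {0: 2.0, 1: 0.0}}
gamma = 0.9  # discount factor

v = {0: 0.0, 1: 0.0}
for _ in range(200):
    # Bellman optimality update: v(s) <- max_a [ R(s,a) + gamma * E[v(s')] ]
    v = {s: max(R[s][a] + gamma * sum(p * v[s2] for s2, p in P[s][a])
                for a in (0, 1))
         for s in (0, 1)}

print(v)  # converges to the optimal state values v*(s)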

3 Time Complexity

Before explaining the key properties of dynamic programming and the methods of solving a problem with dynamic programming, understanding the basic calculation of time complexity is essential for evaluating and explaining how an algorithm optimises a solution. Time complexity is a tool used to evaluate the efficiency of an algorithm based on the number of operations the program will perform when it runs. This mostly depends on the number of repetitions or loops used throughout the whole program. The notation used for time complexity is the "Big O" notation. It evaluates the algorithm as a function, usually of the variable N, the number of data items; generally, the functions used are logarithmic functions, polynomials, and exponential functions. Moreover, there are two things to always note when using Big O notation: only the highest-order term is kept, without its coefficient, and the time complexity is evaluated for the worst-case scenario.

Sequential search, a naive search algorithm, searches for a target element in a list by checking and comparing each element in turn. Taking the length of the list as N, the time complexity of the algorithm is O(N), because N comparisons are executed if the list does not contain the target element, which is the worst-case scenario. An example written in Python is given below.

Listing 1: Code 1 - Sequential Search

import sys
input = sys.stdin.readline

usr_input_lst = list(map(int, input().split()))
usr_input_wanted = int(input().rstrip())

# scan the list left to right, stopping at the first match
index = -1
for i, x in enumerate(usr_input_lst):
    if x == usr_input_wanted:
        index = i
        break

print(index)

As stated previously, even though the code shown above is said to have a time complexity of O(N), the true time taken, or the number of operations and comparisons executed, would not be exactly N, the number of elements in the list. This is because there are other lines of code that need to be executed; for example, declaring and initialising variables has a time complexity of O(1). However, since time complexity only considers the term with the biggest rate of increase, these lines are omitted. The time complexity O(N) is called "linear time", because the real execution time increases at a linear rate as N increases (refer to Fig. 1 for the graph plotted for linear time).

Fig. 1: Linear Time Graph

As different programs and algorithms have different time complexities, depending on factors such as the loops and functions used, there are various time complexities with conventional names. A table of the most-used ones is given below. Like linear time, each of the other time complexities can be plotted as a graph of N against the number of operations and comparisons executed. The graph below, Fig. 2, shows multiple time complexities plotted, with simple evaluations of the efficiency of each.



O(1)        Constant Time
O(log(N))   Logarithmic Time
O(N)        Linear Time
O(N^a)      Polynomial Time
O(a^N)      Exponential Time

Table 1: Conventional Names of Time Complexity

Fig. 2: Graphs of Different Time Complexities

4 Concepts of Dynamic Programming

Dynamic programming is a very powerful programming technique that is frequently implemented in problem solving because it can turn naive approaches with exponential time complexity into optimised solutions with polynomial time complexity. These two time complexities differ critically in efficiency: as N increases for both, the rate of increase of exponential time becomes incomparably large compared to polynomial time, creating an ever larger difference in efficiency as N gets larger.

4.1 Two Key Properties of Dynamic Programming

There are two key properties that a problem has to have in order for dynamic programming to be successfully implemented: optimal substructure and overlapping subproblems.

Optimal substructure means that the optimal solutions of the subproblems combine into the optimal solution of the whole problem. This property is crucial, as it means that dynamic programming cannot be implemented if the problem doesn't contain an optimal substructure. Moreover, optimal substructure refers back to the Bellman optimality equation: finding the optimal substructure is equivalent to finding the optimal policy in reinforcement learning. Hence, the Bellman optimality equation can be applied to actual dynamic programming as a concept which implies the optimal substructure.

Overlapping subproblems occur when a problem broken down into subproblems contains two or more equal subproblems. This is a key property in dynamic programming because the time complexity of the solution can be greatly reduced by eliminating the overlapping subproblems - storing the calculated values in an array (list). However, by using an extra array (list), space complexity needs to be sacrificed, as there are more values to store.

5 Methods of Solving Problems with Dynamic Programming

There are two main ways of solving a dynamic programming problem: the top-down approach and the bottom-up approach. Either approach is possible for every dynamic programming problem that can be solved; the two approaches are very similar, the only difference being the structure of the solution.

5.1 Top-Down Approach

The top-down approach, also called memoization, usually utilises recursion to solve a problem. This optimisation approach assumes that the subproblems have already been calculated and stored in the cache: it always refers to the list to check whether a value has actually been calculated. If it has, the stored value is returned without recalculating; if not, the value must be computed first.

Recursion is calling the same function again to repeat the same procedure with a smaller or adjusted value. For example, a program which finds n! for a given integer n using recursion would be:

Listing 2: Code 2 - Factorial with Recursion

import sys
input = sys.stdin.readline

n = int(input())

def factorial(n):
    if n == 1:
        return 1
    return n * factorial(n - 1)

print(factorial(n))

The solution given above finds n! by recursion: calling factorial(n − 1). To give a visual representation of how factorial(4) would be calculated, a table is given below to follow the functions called and returned:



factorial(4)
4 · factorial(3)
4 · (3 · factorial(2))
4 · (3 · (2 · factorial(1)))
4 · 3 · 2 · 1

Table 2: Recursion by Factorial

Even though the factorial example above cannot be classified as dynamic programming - it doesn't have any overlapping subproblems - recursion is implemented in the same way in the top-down approach. (Further explanation with the Fibonacci sequence is given in section 6.1.)

5.2 Bottom-Up Approach

The bottom-up approach, also called tabulation, in contrast to the top-down approach, employs loops (for-loops or while-loops). The biggest difference from the top-down approach is that the bottom-up approach starts from the smallest subproblems and builds up to the whole problem. However, this approach is similar to the top-down approach in that it also caches the calculated values in a list. Using the factorial as the example to demonstrate the difference from recursion, the code is given below:

Listing 3: Code 3 - Factorial with Loops

import sys
input = sys.stdin.readline

n = int(input())

factorial = [1]

for x in range(1, n + 1):
    factorial.append(factorial[x - 1] * x)

print(factorial[n])

The major difference between the code above and the recursion code in Code Ref. 2 is that the code above uses a "for loop" instead of recursion. On every loop, an element is appended to the list "factorial", and the last element is printed at the end of the program, which is the value of n!. (The bottom-up approach is explained in detail with the Fibonacci sequence in section 6.1.)

Both solutions, bottom-up and top-down, are valid. Depending on the question, one approach may be preferable, since one can be simpler than the other for a particular problem. Consequently, the next section presents examples of implementations and provides evaluations of the efficiency of particular algorithms.

6 Applications of Dynamic Programming

In programming, there are numerous situations where a naive solution has a problem such as being too slow. For example, in competitive programming, there are problems with strict time limits that require dynamic programming to solve. When programming applications, optimisation of some elements of the application can also be essential; dynamic programming can be used in this situation as well. This section aims to present several specific, frequently encountered implementations of dynamic programming.

6.1 Fibonacci Sequence

The Fibonacci sequence is a typical example of an implementation of dynamic programming. A naive solution has a time complexity of O(2^N), in contrast to the O(N) time complexity of the solution after dynamic programming is implemented. The Fibonacci sequence is a sequence in which the first two terms are 1 and every other term is the sum of the two previous terms. Expressed as equations, the nth term a_n is:

a_n = a_{n−1} + a_{n−2}   (n > 2, n ∈ N)
a_n = 1   (n ≤ 2, n ∈ N)   (18)

Before presenting an optimised solution, the naive solution, which calculates every value through recursion, is given first. A naive solution which finds the nth term of the Fibonacci sequence is given below.

Listing 4: Code 4 - Naive Fibonacci Solution

import sys
input = sys.stdin.readline

n = int(input())

def fib(n):
    if n == 1 or n == 2:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)

print(fib(n))

The solution given above finds the nth Fibonacci number using the function "fib" with the parameter n. The code uses recursion, calling the function again and again, which breaks the problem into smaller subproblems. For example, fib(4) calls fib(3) + fib(2), and fib(3) in turn calls fib(2) + fib(1).

Fig. 3: Tree Diagram for Finding the 7th Fibonacci Number

The tree diagram above shows the recursion process when finding the 7th Fibonacci number: fib(7) calls fib(6) + fib(5), fib(6) calls fib(5) + fib(4), and so on until the n of fib(n) reaches either fib(2) or fib(1). The time complexity of finding fib(n) can be notated as O(2^n), which can be observed easily using a tree diagram like the figure above: n decides the depth of the tree. The tree in the figure is a "binary tree", where each node has two child nodes. Since every new node has two child nodes unless it is fib(2) or fib(1), the number of operations increases exponentially with base 2 and power n; hence the time complexity is O(2^n).

A time complexity of O(2^n) is one of the worst time complexities that a program can have: run with a large value of n such as 100, the program would have to perform approximately 2^100 operations. However, O(2^n) can be optimised to linear time, O(n), using the properties of dynamic programming. Firstly, in the tree diagram (fig. 3), an observation can be made that fib(5) is repeated twice: after executing fib(6) and fib(7). This is very crucial because it shows that there are overlapping subproblems. Additionally, an optimal substructure can be observed as well: the equation fib(n) = fib(n − 1) + fib(n − 2), where the optimal solution depends on both n − 1 and n − 2. Hence, with both key properties of dynamic programming present in this problem, the application of dynamic programming is possible.

Both the top-down method and the bottom-up method can be implemented to solve this Fibonacci problem; both solutions store the calculated Fibonacci values in a list (array). The top-down method, which uses recursion, is implemented in the code below.

Listing 5: Code 5 - Top-Down Fibonacci

import sys
input = sys.stdin.readline

n = int(input())

fib_dp = [0 for _ in range(n + 1)]

def fib(n):
    if fib_dp[n]:                 # already calculated: return the cached value
        return fib_dp[n]
    if n == 1 or n == 2:
        fib_dp[n] = 1             # store the base cases as well
    else:
        fib_dp[n] = fib(n - 1) + fib(n - 2)
    return fib_dp[n]

fib(n)
print(fib_dp[n])

The major difference between the top-down solution and the naive solution is that the top-down solution uses a list to store the calculated Fibonacci numbers so as not to repeat the calculations. For example, when fib(6) calls fib(5) and fib(4), the program calculates fib(5) first - which calls fib(4) and fib(3), and so on - storing each of the values in the list. Once the values of fib(5), fib(4), and fib(3) are stored, the program does not have to calculate fib(4) from the first call (fib(6)) again, which reduces the execution time tremendously. The tree diagram which represents the process of the code is shown below; the figure also shows the other unnecessary calculations, such as fib(3) and fib(2), being eliminated.

Fig. 4: Dynamic Programming Implemented to Fibonacci

The top-down solution provided has a time complexity of O(n), because each of the numbers between 1 and n needs to be calculated only once. The bottom-up approach, instead of breaking the calculation down from the given integer n, starts solving from the 1st Fibonacci term and works up to the nth; it goes through exactly n calculations, which makes its time complexity O(n) as well. To demonstrate the actual process, the code is given below:

Listing 6: Code 6 - Bottom-Up Fibonacci

import sys
input = sys.stdin.readline

n = int(input())

fib = [0, 1, 1]

for x in range(3, n + 1):
    fib.append(fib[x - 1] + fib[x - 2])

print(fib[n])

The list “f ib” in the code above would store the calculated numbers of the Fibonacci sequence. In the

45


“for loop”, it would add a new element to the list by using the formula of optimal substructure of Fibonacci sequence: an = an−1 + an−2 (2 ≤ n)

(19)

By printing the element with the index of n in the list, it would return the nth Fibonacci number. Consequently, only with not repeating the calculations, Fibonacci sequence can be calculated in a linear time O(n). However, there are other types of problems which implement dynamic programming in different ways. The next section presents how knapsack problems can be optimised using dynamic programming. 0/1 Knapsack Problem A Knapsack problem is also one of the most typical implementations of dynamic programming. This problem involves an imaginary backpack and several items each having a unique weight and value; a most common form of knapsack is the 0/1 knapsack problem which assumes that there is only one of each of the items. There are mainly four variables that influence the solution: the weight limit W , the number of items n, the weight of the ith item wi , and the value of the ith item vi . The objective of this problem is to maximise the value v of the backpack (knapsack) within the weight limit W . As well as the Fibonacci problem, the naive solution or the “brute force” solution of this problem would have a very large time complexity. (Brute force solution is a solution which tries every possibility within the conditions. In case of a 0/1 knapsack problem, it would be to try all the possible combinations of the items within the weight limit.) The code for the brute force solution is given below:

Listing 7: Code 7 - Brute Force 0/1 Knapsack

import sys
input = sys.stdin.readline

def knapSack(capacity, weight, value, length):
    # no items left or no capacity left: nothing more can be added
    if length == 0 or capacity == 0:
        return 0
    # the last item cannot fit: skip it
    if weight[length - 1] > capacity:
        return knapSack(capacity, weight, value, length - 1)
    else:
        # take the larger of: picking the item up, or leaving it
        return max(value[length - 1]
                   + knapSack(capacity - weight[length - 1], weight, value, length - 1),
                   knapSack(capacity, weight, value, length - 1))

value = list(map(int, input().split()))
weight = list(map(int, input().split()))
capacity = int(input())
length = len(value)
print(knapSack(capacity, weight, value, length))

The code, when executed, tries all the possible combinations, making a maximum of 2^n combinations; hence, the time complexity is O(2^n). The figure below shows an example with 4 items - 16 possibilities.

Fig. 5: 0/1 Knapsack Brute Force with 4 Items

Applying dynamic programming to this problem would reduce the time complexity greatly. As all implementations of dynamic programming need caching for overlapping subproblems, the program makes use of a 2-dimensional array which stores the results of the calculations to avoid redundant calculations. The code with dynamic programming implemented is given below:

Listing 8: Code 8 - Dynamic Programming approach of 0/1 Knapsack

def knapSack(capacity, weight, value, length):
    # K[x][y]: best value achievable with the first x items and weight limit y
    K = [[0 for x in range(capacity + 1)] for x in range(length + 1)]
    for x in range(length + 1):
        for y in range(capacity + 1):
            if x == 0 or y == 0:
                K[x][y] = 0
            elif weight[x - 1] <= y:
                # item x fits: take the better of picking it up or leaving it
                K[x][y] = max(value[x - 1] + K[x - 1][y - weight[x - 1]], K[x - 1][y])
            else:
                K[x][y] = K[x - 1][y]
    return K[length][capacity]

value = list(map(int, input().split()))
weight = list(map(int, input().split()))
capacity = int(input())
length = len(value)
print(knapSack(capacity, weight, value, length))

The bottom-up approach is used in the code above. The time complexity of the code is O(nW) because one for loop operates n + 1 times and the other for loop operates W + 1 times. However, the method used by dynamic programming to solve this problem is completely different from the previous example, the Fibonacci sequence. For every item, the program has two choices:


1. To pick the item up
2. To not pick the item up

The “if statement” in the code then determines whether the item should be picked up or not. The first “if statement” is only used to initialise K[0][0] to 0. From the next comparison onwards, the program compares each item against every possible weight limit. If the item can fit in the knapsack and the total value of the knapsack with it is bigger than the previous value, the new total value replaces K[x][y] (referring to the for-loop variables x and y). Lastly, by returning K[n][W], the program returns the optimal case. This takes at most n · W calculations, which is incomparably faster than the brute force approach when either n or W gets large.
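As a minimal concrete check (the three-item instance below is my own example, not from the article), the knapSack function from Listing 8 can be called directly:

value = [60, 100, 120]
weight = [10, 20, 30]
# Best choice within weight 50 is the 20 kg and 30 kg items: 100 + 120 = 220
print(knapSack(50, weight, value, len(value)))  # 220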

7 Conclusion

Being efficient is always crucial; however, it becomes even more important in mathematics and computer science, because small changes made to optimise a problem can result in big differences in the solution. Consequently, optimisation is pivotal. This article introduced one way of optimisation — dynamic programming. Even though dynamic programming has the limitation that it can only optimise a problem which has the two key properties (section 4.1), it is very powerful at optimising such problems. Dynamic programming is now being used in various fields, including ones close to our everyday life. For instance, map services like Google Maps use dynamic programming to figure out the shortest path from one place to another. As information technology progresses, dynamic programming will be used in an even wider variety of fields.

8 Bibliography

1. En.wikipedia.org. 2022. Dynamic programming - Wikipedia. [online] Available at: <https://en.wikipedia.org/wiki/Dynamic_programming> [Accessed 20 February 2022].
2. Dnddnjs.gitbooks.io. 2022. MDP · Fundamental of Reinforcement Learning. [online] Available at: <https://dnddnjs.gitbooks.io/rl/content/mdp.html> [Accessed 20 February 2022].
3. Velog.io. 2022. MDP. [online] Available at: <https://velog.io/@jiyoung/--MDP-> [Accessed 20 February 2022].
4. Velog.io. 2022. (5). [online] Available at: <https://velog.io/@suminwooo/---5> [Accessed 20 February 2022].
5. Medium. 2022. Reinforcement Learning: Markov Decision Process (Part 2). [online] Available at: <https://towardsdatascience.com/reinforcement-learning-markov-decision-process-part-2-96837c936ec3> [Accessed 20 February 2022].
6. Medium. 2022. Solving Fibonacci Numbers using Dynamic Programming. [online] Available at: <https://elishevaelbaz.medium.com/solving-fibonacci-numbers-using-dynamic-programming-ee75ea708b7b>.
7. En.wikipedia.org. 2022. Dynamic programming - Wikipedia. [online] Available at: <https://en.wikipedia.org/wiki/Dynamic_programming#Computer_programming> [Accessed 21 February 2022].
8. Neuhaus, R., 2022. What is the difference between bottom-up and top-down?. [online] Stack Overflow. Available at: <https://stackoverflow.com/questions/6164629/what-is-the-difference-between-bottom-up-and-top-down> [Accessed 21 February 2022].



Investigating Algorithms to Retrieve and Reconstruct Forum Replies from a Database
Eddie Kyubin Min
Year 11 Mulchat
Member of the Mathematics Society
Member of the Mathematics Publications CCA
Chair of the Algorithm Society
Email: kbmin@pupils.nlcsjeju.kr

Reviewer Aaren Joonseok Kang∗

Recommended Year Level: KS4, KS5
Keywords: Tree, RDBMS, Time Complexity

1 Introduction

Recently, I was writing a forum software that was to be used in my wiki website. The forum software involved the use of an RDBMS, a piece of software that stores structured sets of data. However, a problem arose when creating a feature for users to write comments. In the forum, the comments needed to form a ‘hierarchical’ structure, where replies can be written to comments, replies can be written to those replies, and so on. One example of this is shown in fig. 1. This structure can be represented as a tree, which is a hierarchical data structure in discrete mathematics. The problem is that traditional RDBMS software can only store data in a two-dimensional manner. Therefore, in this article, I will propose an algorithm to store trees in, and retrieve them from, an RDBMS.

2 Prior Knowledge

2.1 Tree

In discrete mathematics, a tree is defined as an undirected, connected, acyclic graph (Wolfram Mathworld). In simpler, but looser, words, it is a diagram connecting different ‘elements’ by a set of ‘lines’, starting with a single point and branching out like a hierarchy.

Fig. 1: Comments and replies on an Internet forum

Fig. 2: An example of a tree

∗ Co-chair of the Mathematics Society (Applied Mathematics in Natural Sciences), Chair of the LAN Society



Fig. 2 illustrates an example of a tree. It has eight ‘elements’ that are interconnected without a cycle. Another important characteristic of a tree can be observed from fig. 2: there is only one path from any ‘element’ to any other ‘element’. The following list defines some terminology referred to in this article.

1. Nodes are the ‘elements’ of a tree. They represent certain properties or values and are interconnected by edges.
2. Edges are the lines that connect two nodes. They represent hierarchies between the nodes.
3. The Root is the ‘starting point’ of a tree. It has a depth of 1.
4. The depth of a node is the number of nodes that must be visited in order to get to that node from the root node. For instance, in fig. 2, node ‘7’ has a depth of 3 (1 → 3 → 7).
5. The Parent node, of an edge connecting two nodes, is the node with the lower depth. For instance, in fig. 2, ‘2’ is the parent node of ‘4’, ‘5’ and ‘6’.
6. Child nodes, of an edge connecting two nodes, are the nodes with the higher depth. For instance, ‘4’, ‘5’ and ‘6’ are child nodes of ‘2’.

A tree is a good data structure to represent the hierarchical structure of replies, because replies to a comment can be represented with parent and child nodes. Furthermore, when there is more than one comment with depth 1, this representation can simply be extended by using multiple trees (a forest). For instance, the comments in fig. 1 can be represented by the trees shown in fig. 3.
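As a side note, the nested-map representation that this article later uses for reconstruction can already describe fig. 2 directly. The sketch below is one consistent reading of the figure (the text above fixes the positions of nodes 1–7; placing ‘8’ as the second child of ‘3’ is my inference from the drawing):

# Each key is a node; each value maps that node's children to their own subtrees.
tree = {1: {2: {4: {}, 5: {}, 6: {}},
            3: {7: {}, 8: {}}}}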

Fig. 3: Representation of fig. 1 using Trees

id | depth | parent
 1 |   1   |
 2 |   1   |
 3 |   2   |   1
 4 |   3   |   3
 5 |   3   |   3
 6 |   4   |   4

Table 1: Possible representation of fig. 3 in an RDBMS

2.2 Database

A database in Computer Science is ‘any collection of data, or information, that is specially organized for rapid search and retrieval by a computer’ (Encyclopedia Britannica). This article is specifically concerned with an RDBMS (Relational DataBase Management System), such as MariaDB (fig. 4), which is one form of database. In an RDBMS, data is stored in tables, with records and fields, in a similar manner to a spreadsheet.

Fig. 4: MariaDB

A problem arises when storing trees in databases. In most representations of trees, parent nodes contain an array of pointers to their children nodes. However, as this requires an extra dimension, it cannot be done when storing trees in an RDBMS. Instead, as each child node only has one parent node, children nodes can include references to their parent nodes, as shown in table 1. Furthermore, to enable easy identification of root nodes and of the depths of nodes, the depths of nodes can also be recorded.
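To make the layout of table 1 concrete, the sketch below stores and queries it through Python's standard sqlite3 module (SQLite stands in for MariaDB here, and using parent 0 for root comments follows the convention of Appendices B and C):

import sqlite3

# Store the forest of fig. 3 using the columns of table 1.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, depth INTEGER, parent INTEGER)")
db.executemany("INSERT INTO comments VALUES (?, ?, ?)",
               [(1, 1, 0), (2, 1, 0), (3, 2, 1), (4, 3, 3), (5, 3, 3), (6, 4, 4)])

# All direct replies to comment 3:
print(db.execute("SELECT id FROM comments WHERE parent = 3").fetchall())  # [(4,), (5,)]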

3 Modelling Depths of Comments

Before investigating and proposing algorithms to reconstruct trees from comments stored in a database, the distribution of the depths of comments on online forums was investigated, so that its characteristics could be exploited. In order to model the distribution, web crawling was done on ‘Arcalive Counterside Channel’ (https://arca.live/b/counterside), a Korean Internet forum; a program was written to collect the distribution of the depths of comments of 1000 posts. A summary of the data collected is available in Appendix A. As shown in fig. 5, most of the comments were of depth 1, and the number decreased exponentially as the depth increased. Probably due to a limitation of the website, the maximum depth of a comment was 7. Using the data above and Microsoft Excel, the line of best fit was calculated as

f(x) = 7494e^(−1.025x)    (1)


as shown on the graph; the coefficient of determination (R²) was 0.989, showing that the correlation was exceptionally strong.

Fig. 5: Distribution of Number of Comments of 1000 Posts on https://arca.live/b/counterside

This model can then be applied to find the expected number of replies to a comment. Given depth X, the expected number of replies of depth X + 1 can be calculated as:

E(X) = f(X + 1) / f(X)
     = 7494 exp(−1.025(X + 1)) / 7494 exp(−1.025X)
     = exp(−1.025X − 1.025) / exp(−1.025X)
     = e^(−1.025)    (2)

Using this, the number of replies to comments at depth n, for any number a of root nodes, can be modelled using a geometric sequence:

a_n = a exp(−1.025)^(n−1)    (3)

Note that e^(−1.025) is less than 1; this means that it is more likely for a comment not to have a reply than to have one. Therefore, we can write algorithms under the assumption that the number of nodes with large depths will not be large. We can further expand this model to find the number of comments at each depth when we only know the total number of comments, N. Because it is given that the nodes will at most have a depth of 7, N, in terms of a, is:

N = S_7 = a(1 − exp(−1.025)^7) / (1 − exp(−1.025))    (4)

Therefore, the number of comments at depth 1, in terms of the total number of comments N, is

a = N(1 − exp(−1.025)) / (1 − exp(−1.025)^7)    (5)

Thus, because of equations 3 and 5, the number of comments at depth n can be written as:

a_n = (1 − exp(−1.025)) / (1 − exp(−1.025)^7) · N · exp(−1.025)^(n−1)
    = 0.64169 · N · 0.35880^(n−1)    (6)
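Equation (6) is simple enough to tabulate directly. The short sketch below (plain Python, assuming nothing beyond the fitted constants above) prints the expected number of comments at each depth for a thread of N comments in total:

# Expected number of comments at each depth, from equation (6).
N = 1000
for n in range(1, 8):
    print(n, round(0.64169 * N * 0.35880 ** (n - 1)))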

4 Investigation of Algorithms

In order to demonstrate how algorithms can be written to solve this particular problem, I will propose two different algorithms.

4.1 Algorithm 1

Before proposing an efficient algorithm, this section stresses the importance of a well-written algorithm by showing the possible inefficiencies of algorithms that are not written with care. The first algorithm randomly formulates a tree from the list of comments until a correct permutation is found. Specifically, it follows the procedure presented in Algorithm 1. The code is also in Appendix B.

Algorithm 1: Reconstructing Forum Comments by Randomly Shuffling Them
do
    Create a new tree;
    Insert every node of depth 1 as a root node;
    for i ← 2 to 7 do
        for every node of depth i do
            Make the node a child node of a random node at depth i − 1;
        end
    end
while the tree is not correct;

To determine the average time complexity of this algorithm, let us first determine the probability that a random tree generated by this algorithm is correct. Consider a node of depth i, where i > 1. Considering the number of comments at depth i − 1, the probability that that node has the correct parent node is

1 / (Nqp^(i−1))    (7)

where N is the entire number of comments, p is exp(−1.025) (the expected number of replies to a comment), and q is (1 − p)/(1 − p^7). Considering the number of comments at depth i, the probability that all comments at depth i have correct parents is

(1 / (Nqp^(i−1)))^(Nqp^i)    (8)

Thus, the probability that the tree will be entirely correct is

∏_(i=2)^(7) (1 / (Nqp^(i−1)))^(Nqp^i)    (9)

As p and q are constants, this can be simplified, using asymptotic notation, as

1 / N^N    (10)

Given the probability that a random arrangement is correct, the average time complexity can also be calculated. Defining the function f(N) as the expected number of rearrangements needed, the following equation can be written:

∑_(i=0)^(f(N)) (1/N^N)(1 − 1/N^N)^i = 1/2    (11)

Interpreting the left hand side as a geometric series, this equation can also be written as:

(1/N^N) · (1 − (1 − 1/N^N)^f(N)) / (1 − (1 − 1/N^N)) = 1/2    (12)

which can be simplified to

f(N) = −log 2 / log(1 − 1/N^N)    (13)

By inspection, the graph of f(N) is similar to the graph of y = N^N; therefore, the average time complexity can be estimated as O(N^N), showing that the algorithm is very inefficient.

4.2 Algorithm 2

Contrary to the algorithm above, this section suggests an algorithm that is efficient.

Listing 1: Python code showing Algorithm 2

import copy

class comment:
    def __init__(self, id, depth, content, children):
        self.id = id
        self.depth = depth
        self.content = content
        self.children = children

commentsPerLevel = [[], [], [], [], [], [], []]
commentTree = {}
commentCache = {}

# Bucket the comments by depth (depth 1 goes into index 0).
for c in comments:
    commentsPerLevel[c.depth - 1].append(c)

for level in range(7):
    for c in commentsPerLevel[level]:
        cache = []
        cursor = commentTree              # copies the reference
        if level != 0:
            cache = commentCache[c.parent]
        # Walk down the stored path to the parent's children map.
        for nextSelect in cache:
            cursor = cursor[nextSelect].children
        cursor[c.id] = comment(c.id, c.depth, c.content, {})
        newCache = copy.deepcopy(cache)   # copies contents, not the reference
        newCache.append(c.id)
        commentCache[c.id] = newCache

This algorithm constructs a tree of comments, in the form of nested maps, in a variable called commentTree. ‘comments’ is a list of comments: objects having id, depth, parent and content as their properties. Instead of randomly shuffling the tree until the correct permutation is found, this algorithm traverses the original list and, for every depth, finds each node’s parent at that depth, drastically decreasing the time required. Particularly important about this algorithm is that it uses a hashmap (called a ‘Dictionary’ in Python), a data structure containing pairs (composed of ‘keys’ and ‘values’) that enables a value to be found from its key in constant time. By doing so, it ensures that comments can be found, using their ids, in O(1) time complexity. Furthermore, even though copying arrays value by value is usually avoided because it takes a long time, it does not affect the time taken in this particular case, because the maximum number of elements in the arrays

(in newCache) is at most 7 (i.e. the maximum depth). Because each node is accessed once, the time complexity of this algorithm is O(n).
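For illustration, the listing above can be driven end to end as follows (a sketch: the row class and the sample values are hypothetical stand-ins for records fetched from the database, mirroring table 1):

class row:
    def __init__(self, id, depth, parent, content):
        self.id = id
        self.depth = depth
        self.parent = parent
        self.content = content

# The forest of fig. 3: comments 1 and 2 are roots, 3 replies to 1, and so on.
comments = [row(1, 1, 0, "a"), row(2, 1, 0, "b"), row(3, 2, 1, "c"),
            row(4, 3, 3, "d"), row(5, 3, 3, "e"), row(6, 4, 4, "f")]

# Running Listing 1 on this list leaves commentTree holding two nested maps:
# one rooted at id 1 (containing 3, 4, 5 and 6) and one rooted at id 2.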

5 Evaluating Efficacy of the Algorithms

5.1 Generation of Example Data

As it is difficult to find an actual tree of comments for a given size of the tree, especially when N is large, I created an ‘average’ tree using the model from Section 3. A code, available in Appendix C, was written; an outline of the algorithm is as follows.

Algorithm 2: Generation of Example Data
Generate 0.64169N comments at depth 1;
for i ← 2 to 7 do
    Uniformly select 0.64169 · N · 0.35880^(i−1) comments from depth i − 1;
    Create one reply for each comment selected;
end

5.2 Results for Algorithm 1

Fig. 6: Time Taken to Reconstruct the Tree Using Algorithm 1

As shown in fig. 6, Algorithm 1 was highly inefficient. As shown on the graph, the time required increased slowly until N = 15, taking under 1 second. However, the time taken increased exponentially from N = 16. It was possible to test at most 16 comments, which already took 9 seconds to reconstruct. Any data of N = 17 or more was not attempted as it took too long.

5.3 Results for Algorithm 2

Fig. 7: Time Taken to Reconstruct the Tree Using Algorithm 2

It was possible to test Algorithm 2 even with 100 000 comments, due to its efficiency. As shown in fig. 7, the algorithm has a linear time complexity. It also shows that it is very efficient, requiring only 1.27 seconds even when there were 100 000 comments at different depths, on the computer on which it was tested.

6 Conclusion

This article proposed an algorithm that can be used on internet forums to store and retrieve comments and replies. As the algorithm was efficient — not only running for a short time but also running quickly for large data — it is anticipated that the algorithm proposed can be used on websites with varying numbers of users. The algorithm proposed can also be used in general situations where a tree needs to be stored in a database — for example, when storing a list of website bookmarks with folders. However, even though this algorithm can be used with large data, it is also worth noting that it is only suitable when the depths of the nodes are limited to small values, as the algorithm copies an array with the depth of the node as its length, which takes a long time if the array is long. Therefore, in scenarios where there are many nodes with large depths, a modification of this algorithm will be required.

Bibliography

[1] Encyclopedia Britannica. Database. [online] Available at: https://www.britannica.com/technology/database
[2] Wolfram Mathworld. Tree. [online] Available at: https://mathworld.wolfram.com/Tree.html



[3] MariaDB. Official MariaDB Logos. [online] Available at: https://mariadb.com/ko/about-us/logos/



Appendix A: Summary of Data Crawled from https://arca.live/b/counterside

Minimum Number of Comments per Post: 0
Maximum Number of Comments per Post: 45
Mean Number of Comments per Post: 4.60
Median Number of Comments per Post: 3

Table 2: Summary of the Number of Comments

Depth | Number of Comments
  1   | 3587
  2   |  943
  3   |  242
  4   |  120
  5   |   44
  6   |   15
  7   |    7

Table 3: Depths of the comments

Fig. 8: Distribution of the Number of Comments per Post

Appendix B: Implementation of Algorithm 1 Using Python 3.8

import random

# 'comments' is an array storing the list of comments.
class comment:
    def __init__(self, id, depth, content, parent, children):
        self.id = id
        self.depth = depth
        self.content = content
        self.parent = parent
        self.children = children

def isCommentTreeValid(cursor, parentID):
    isGood = True
    for ch in cursor:
        isGood = isGood and ch.parent == parentID
        isGood = isGood and isCommentTreeValid(ch.children, ch.id)
    return isGood

commentTree = []
while True:
    commentTree = []
    for c in comments:
        if c['depth'] == 1:
            commentTree.append(comment(c['id'], c['depth'], c['content'], c['parent'], []))
    for depth in range(2, 8):   # depths 2 through 7, matching Algorithm 1
        for c in comments:
            if c['depth'] != depth:
                continue
            cursor = commentTree
            for i in range(depth - 1):
                randomid = -1
                while randomid == -1:
                    randomid = random.randint(0, len(cursor) - 1)
                    # only descend through nodes that already have children,
                    # except at the last step
                    if i != depth - 2 and len(cursor[randomid].children) == 0:
                        randomid = -1
                cursor = cursor[randomid].children
            cursor.append(comment(c['id'], c['depth'], c['content'], c['parent'], []))
    if isCommentTreeValid(commentTree, 0):
        break

Appendix C: Generation of an Example Tree of Comments

import random

# N: the desired total number of comments (set before running).
lastid = 0
comments = []
lastEnd = -1
for i in range(int(0.64169 * N)):
    comments.append({'id': lastid, 'depth': 1, 'parent': 0, 'content': lastid})
    lastid += 1
lastEnd = 0
for i in range(2, 8):
    a = random.sample(list(range(lastEnd, len(comments))),
                      int(0.5 + 0.64169 * N * 0.35880 ** (i - 1)))
    for j in a:
        comments.append({'id': lastid, 'depth': i, 'parent': j, 'content': lastid})
        lastid += 1
    lastEnd = lastid - len(a)



Different Number Bases
Melissa Seoyoung Min
Year 9 Mulchat
Member of the Mathematics Society
Email: symin26@pupils.nlcsjeju.kr

1 Introduction

In the past, quantities were represented by tally marks or drawings of objects. However, these methods became inefficient as the world developed and the complexity of life increased. Even though civilizations such as the Egyptians and Greeks developed their own ways to represent quantity, adding new symbols to count larger quantities, this was not sufficient. This is where positional notation, also known as the number system, was developed: the same symbols could be reused to express a number, instead of creating ever more complex symbols, because the position of a digit determined its value. Number systems of this kind include the decimal, binary, octal and hexadecimal number systems.

2 Decimal number system

Firstly, there is the decimal system, a base-ten system and the number system mostly used in modern civilization. It is a base-ten system because ‘deci’ stands for one-tenth of a whole (it is associated with 10). The prefix ‘deci’ appears in many everyday measurements, such as the ten decimetres in one metre, or the fact that a decade corresponds to 10 years. In the decimal system, the digits are 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9. The number 10 exists, of course, but it is not a digit: 10 stands for 1 ten and 0 ones, which means that there is no single digit that stands for ten. For instance, in the number 235, five is in the ones place, three is in the tens place and two is in the hundreds place (2 × 100 + 3 × 10 + 5 × 1 = 235). Notice that the place values could also be represented as powers of 10.

3 Binary number system

The second number system is the binary system. Compared with its decimal counterpart, a binary number is much longer. ‘Bi’ is a prefix for two, which means that the binary system is a base-2 system, requiring only the two digits 0 and 1 (which is what makes binary numbers longer): these two digits, instead of the ten used in the decimal system, represent every number that exists. In addition, the place values in the binary system differ from those in the decimal system. For example, take the binary number ‘1010’. The first place value is the same as in decimal (ones), and here that digit is zero. The second place value is the twos place, not the tens place of the decimal system, so the 1 there is multiplied by 2, giving 2. The third place value is 4, and the fourth is 8. The expression is therefore 1 × 8 + 0 × 4 + 1 × 2 + 0 × 1, which gives 10. In the decimal system each place value is a power of 10, but in the binary system each place is a power of 2. The binary system is significant because a computer stores everything in binary, using one bit for each digit, such as “on and off” or “yes and no”: even though it requires longer strings of digits, it can easily distinguish between two possibilities.

4 Octal number system

The third system is the octal system. ‘Octo’ or ‘octa’ can be seen in the shape octagon, a polygon with eight sides: octa has a value of eight, and the octal system is base eight. The digits included are 0, 1, 2, 3, 4, 5, 6 and 7. The octal system follows the same pattern as the other number bases; however, since it only includes eight digits, it has its own place values, as the binary system does. To indicate that a number is written in this system, the subscript 8 is used (for example, 214₈). Comparing octal numbers to decimal numbers, from 0 to 7 they have the same value. However, since the octal system does not include the digits 8 and 9, it moves straight to 10: octal 10 is equivalent to decimal 8, octal 11 is equivalent to decimal 9, and so on. The octal system is a reasonable compromise between the decimal and binary systems, in that it uses more symbols than the binary system but produces shorter strings of digits.

5 Hexadecimal number system

The last number system is the hexadecimal number system. ‘Hexa’ is used in the word hexagon, so it can be predicted to be a prefix for six; combined with ‘decimal’, which corresponds to 10, the hexadecimal system is a base-16 system. The symbols used are 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9 — all ten digits of the decimal system — and, for the remaining six values, the letters A, B, C, D, E and F. A corresponds to 10, B to 11, C to 12, D to 13, E to 14 and F to 15. As in the other number base systems, the hexadecimal system has its own place values: the first place value is 16⁰, the same as in the other bases, and the pattern continues with the second place value being 16¹, the third 16², and so on. For example, take the number B3: since B represents 11 and the second place has a value of 16, B3 is 11 sixteens and 3 ones. 11 sixteens is 11 × 16 = 176, plus the three, so the value of the number is 179.
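As a quick check, Python's built-in int function accepts an explicit base, so the examples above can be verified in one line each:

print(int("B3", 16))   # 179
print(int("1010", 2))  # 10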

6 Converting decimal to binary number

The first thing to do is to take the number, divide it by two (since it is being converted to binary), and record the quotient and the remainder. Starting with the decimal number 348₁₀: dividing by two gives 174 with a remainder of 0; dividing again by two gives 87 with a remainder of 0; dividing again gives 43 with a remainder of 1; and so on, until the quotient reaches 0.

Calculation     Quotient   Remainder
348 ÷ 2    =    174        0
174 ÷ 2    =    87         0
87 ÷ 2     =    43         1
43 ÷ 2     =    21         1
21 ÷ 2     =    10         1
10 ÷ 2     =    5          0
5 ÷ 2      =    2          1
2 ÷ 2      =    1          0
1 ÷ 2      =    0          1

The column of remainders, read from bottom to top, is the binary number equivalent to the original number. In this case, the binary number for 348 is 101011100.

7 Converting decimal to octal number

In order to convert into the octal system, which is base eight, we divide by eight; the rest of the process is the same as when converting decimal numbers into binary.

Calculation     Quotient   Remainder
348 ÷ 8    =    43         4
43 ÷ 8     =    5          3
5 ÷ 8      =    0          5

Therefore, 348₁₀ = 534₈.

8 Converting decimal to hexadecimal number

For the hexadecimal system, divide by 16, following the same steps.

Calculation     Quotient   Remainder
348 ÷ 16   =    21         12
21 ÷ 16    =    1          5
1 ÷ 16     =    0          1

The digits, read from bottom to top, are 1, 5 and 12; since 12 corresponds to the letter C, 348₁₀ is equal to 15C₁₆.

9 Conclusion

In conclusion, a number base is the set of digits, or combination of digits and letters, used in a system of representing numbers. The process of converting between number bases is not complicated; what is harder is understanding the meaning behind it. As the use of computers and computer graphics (which involve a great many numbers) has increased, it has become necessary to understand and use the different number bases: the decimal, binary, octal and hexadecimal systems.
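The repeated-division procedure used in the three sections above is the same for every base, so it can be captured once in code. The following is a minimal Python sketch (the helper name to_base is my own):

def to_base(n, base, digits="0123456789ABCDEF"):
    # Repeatedly divide by the base, collecting remainders;
    # the remainders, read in reverse, form the converted number.
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, base)
        out.append(digits[r])
    return "".join(reversed(out))

print(to_base(348, 2))   # 101011100
print(to_base(348, 8))   # 534
print(to_base(348, 16))  # 15C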



The Emergence of π during a Simulation of a Series of Elastic Collisions Between Two Objects and a Wall
Joe Moojo Kim
Year 12 Halla North
Year 12 Mathematics AA HL Student
Email: mjkim22@pupils.nlcsjeju.kr

Table of Contents

1. Introduction
2. Rationale
3. Creating a simulated environment
   (a) Conditions of the simulations
       i. Elastic collisions
       ii. Collisions with the wall
4. Analysis of data
   (a) The emergence of π
5. The hidden circle and the phase space
6. How collisions are presented in the phase space
7. A change in the question
8. Evaluation of the findings
9. Bibliography

1 Introduction

The constant π is indeed very intriguing – not only because it is a naturally occurring numerical value that is irrational, i.e., composed of a never-ending, non-repeating list of decimal digits, but also because it unexpectedly appears in a lot of natural phenomena that ‘seem’ to have no relationship with the concept of circles – which is, of course, where the concept of π originates. Now, the statement above, when interpreted in a different way, has an alternate meaning: whenever π ‘emerges’ in some natural phenomenon, it must have something to do with the concept of circles. A lot of the time, this ‘something’ is quite straightforward, since circles tend to reveal themselves in natural phenomena directly, in a geometric sense. But this is not always the case.

2 Rationale

I like to spend a lot of my free time uploading videos to my YouTube channel – Joe’s STEM Corner – and one of my major types of content is videos about developing my own physical simulations or models, such as double pendulums, collisions during freefall, three-body problems, et cetera. This time, I wanted to create a two-dimensional simulation of a physical system in which there are two objects and a wall, and the object that is further away from the wall moves towards the object that is closer to the wall. Logically, the heavier the object that is given an initial velocity is relative to the object that is initially still, the greater the number of total collisions occurring in the system, since the object with initial velocity is, colloquially speaking, more ‘relentless’ as its mass increases; greater momentum means a greater tendency to continue its motion. What I wanted to find was any form of mathematical trend or pattern that might emerge from the relationship between the mass of the object that is given an initial velocity and the total number of collisions (both between the two objects and between the object initially still and the wall) that occur until a condition is met whereby no more collisions can happen.

3 Creating a simulated environment

I used the programming software Processing to create my own version of the simulated environment. To eliminate sources of error, I set a few things constant throughout the simulation to isolate the effect of the change in mass on the total number of collisions. As shown in fig. 1, firstly, the initial velocity of object1 is set constant at 1 px s⁻¹ (1 pixel per second). Secondly, the mass of object2 is set constant at 1 kg (a ‘unit mass’). Thirdly, both objects 1 and 2 are square-shaped objects; the initial position of the center of mass of object1 is at

P₁(−500, width_object1 / 2)

and that of object2, similarly, is at

P₂(−450, width_object2 / 2).

Since the width of object1 is 40 pixels and that of object2 is 20 pixels, and the distance between the centers of mass of the two objects is 50 pixels, object1 would travel a distance of 20 pixels before it collides with object2.

Fig. 1: Diagram (all diagrams in this paper were created with GeoGebra) of the simulated environment. The y-axis indicates the wall, and the x-axis indicates the ‘smooth’ (or ‘frictionless’) ground

3.1 Conditions of the simulation

A series of collisions is basically all that happens in the simulation, and there are two types of collisions: firstly, the collision between the two objects, and secondly, the collision between object2 and the wall (see fig. 1). Therefore, mathematically representing the changes that occur during these collisions constitutes the ‘conditions’ of the simulation.

3.1.1 Elastic collisions

The first type of collision, between object1 and object2, is assumed to be perfectly elastic, which means that both momentum and kinetic energy are conserved before and after the collision. The equation of conservation of kinetic energy, in this environment, is

(1/2)m₁v₁B² + (1/2)v₂B² = (1/2)m₁v₁A² + (1/2)v₂A²    (1)

whereby m₁ is the mass of object1, v₁B and v₂B are the velocities of objects 1 and 2 before a collision occurs, and v₁A and v₂A are the velocities of objects 1 and 2 after a collision occurs. Notice that in equation (1) there is no variable referred to as m₂ because, as mentioned in section 3, the mass of object2 is 1 kg (a unit mass). Equation (1) can be sorted in such a way that 2 is multiplied across all terms to eliminate the coefficient 1/2, and then all terms with m₁ are placed on the left side of the equation and those without on the right, giving

m₁v₁B² − m₁v₁A² = v₂A² − v₂B²

This equation can then be converted into a useful form by factorising both sides:

m₁(v₁B + v₁A)(v₁B − v₁A) = (v₂A + v₂B)(v₂A − v₂B)    (2)

The purpose here is to derive two equations, one giving an expression for v₁A and one for v₂A, in terms of known variables and constants (v₁B, v₂B, and m₁). Since there are two unknowns to be expressed, there must be two mathematical relationships. Equation (2) gives one of them, and the other comes from the principle of conservation of linear momentum:

m₁v₁B + v₂B = m₁v₁A + v₂A    (3)

(again, m₂ is not referred to because object2 is a unit mass)

⇒ m₁(v₁B − v₁A) = v₂A − v₂B    (4)

When the whole of equation (2) is divided by equation (4), like terms cancel out and an interesting relationship between the velocities before and after the collision is found:

(2)/(4) ⇒ v₁B + v₁A = v₂A + v₂B    (5)

Notice that by using equation (5), an expression for v₁A in terms of v₂A and the known constants (v₁B, v₂B, and m₁) can be found. Also notice that equation (3), by itself, does the same job. Therefore, by solving the two equations simultaneously, expressions for both v₁A and v₂A that contain neither of them, but are composed only of known constants, can be found. Conducting the process for v₁A first: since v₂A is the variable that must be eliminated, equation (5) should be rearranged in terms of v₂A, and that expression should then be plugged into equation (3).

v₂A = v₁B − v₂B + v₁A

Plugging this value into equation (3),

⇒ m₁v₁B + v₂B = m₁v₁A + [v₁B − v₂B + v₁A]
⇒ m₁v₁B + v₂B = v₁A(m₁ + 1) + v₁B − v₂B

Rearranging in terms of v₁A,

⇒ v₁A = (m₁v₁B − v₁B + 2v₂B) / (m₁ + 1)
∴ v₁A = (v₁B(m₁ − 1) + 2v₂B) / (m₁ + 1)    (6)

A similar process, plugging the expression for v₁A (instead of v₂A) from equation (5) into equation (3), gives an expression for v₂A in terms of known constants.

v₁A = v₂A + v₂B − v₁B

Plugging this value into equation (3),

⇒ m₁v₁B + v₂B = m₁[v₂A + v₂B − v₁B] + v₂A
⇒ m₁v₁B + v₂B = m₁v₂A + m₁(v₂B − v₁B) + v₂A
⇒ m₁v₁B + v₂B = v₂A(m₁ + 1) + m₁(v₂B − v₁B)

Rearranging in terms of v₂A,

⇒ v₂A = (m₁v₁B + v₂B − m₁(v₂B − v₁B)) / (m₁ + 1) = (v₂B − m₁v₂B + 2m₁v₁B) / (m₁ + 1)
∴ v₂A = (2m₁v₁B − v₂B(m₁ − 1)) / (m₁ + 1)    (7)

Equations (6) and (7) formulate what happens to the velocities of the objects when they collide with ‘each other’ during the simulation.

3.1.2 Collisions with the wall

It is evident from fig. 1 that since object1 is initially further away from the wall than object2, object1 will never collide with the wall, object2 being in between object1 and the wall. However, object2 would indeed collide with the wall a few times. In our simulated, theoretical environment, I will assume that when object2 collides with the wall, the magnitude of its velocity remains constant (i.e. no kinetic energy is lost) and only its direction changes to the opposite way. Mathematically,

v₂A = −v₂B

which basically means that the velocity of object2 reverses direction after the collision. The link to the full code of the Processing simulation is in the bibliography below (Moojo Kim, 2021).
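Although the original simulation was written in Processing, the update rules above are compact enough to sketch in a few lines of Python (the function name and structure here are my own; it applies equations (6) and (7) for object–object collisions and v₂A = −v₂B for wall bounces):

def count_collisions(m1):
    # v1, v2: current velocities; positive means moving toward the wall.
    # With only two objects and a wall, the next event is determined by
    # the velocities alone, so positions need not be tracked.
    v1, v2 = 1.0, 0.0
    count = 0
    while True:
        if v1 > v2:
            # object1 catches object2: elastic collision, equations (6) and (7)
            v1, v2 = (((m1 - 1) * v1 + 2 * v2) / (m1 + 1),
                      (2 * m1 * v1 - (m1 - 1) * v2) / (m1 + 1))
        elif v2 > 0:
            # object2 reaches the wall: its velocity reverses
            v2 = -v2
        else:
            break  # both objects moving away from the wall: no more collisions
        count += 1
    return count

print(count_collisions(100))    # 31
print(count_collisions(10000))  # 314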

4 Analysis of data

As mentioned before, what I want to find is the relationship between the mass of object1 and the total number of collisions that occur in the simulation. Fig. 2 shows the 39 data points that I obtained by running the simulation 39 times, ‘only’ changing the mass of object1.

Mass of object 1 /kg | Total number of collisions
    1 |   3
    5 |   7
   10 |  10
   15 |  12
   20 |  14
   25 |  15
   30 |  17
   35 |  18
   40 |  20
   45 |  21
   50 |  22
   55 |  23
   60 |  24
   65 |  25
   70 |  26
   75 |  27
   80 |  28
   85 |  29
   90 |  29
   95 |  30
  100 |  31
  200 |  44
  300 |  54
  400 |  62
  500 |  70
  600 |  76
  700 |  83
  800 |  88
  900 |  94
 1000 |  99
 2000 | 142
 3000 | 172
 4000 | 198
 5000 | 222
 6000 | 243
 7000 | 262
 8000 | 281
 9000 | 298
10000 | 314

Fig. 2: Table (all tables and graphs in this paper were created by myself with Microsoft Excel) that shows the raw data obtained from the simulation
By simply plotting the data points from fig. 2 — mass of object1 against the total number of collisions — the following graph is obtained.

Fig. 3: Number of collisions plotted against mass of object1 using data from fig. 2

The mathematical relationship between the number of collisions, N (plotted on the y-axis in fig. 3), and the mass of object1 (plotted on the x-axis in fig. 3), m₁, was identified by Microsoft Excel as having the equation

N = 3.01717 m₁^0.5025

What is intriguing about this relationship is that the power of m₁ is very close to 0.5, or 1/2. Also, since it makes more sense for the index of some natural relationship to be exactly 1/2 than a number that is close to but not exactly 1/2, a new graph plotting m₁^(1/2) (which is equivalent to √m₁) was made, in the hope of finding a linear relationship.

4.1 The emergence of π

Square root of the mass of object 1 /kg | Total number of collisions
  1.000 |   3
  2.236 |   7
  3.162 |  10
  3.873 |  12
  4.472 |  14
  5.000 |  15
  5.477 |  17
  5.916 |  18
  6.325 |  20
  6.708 |  21
  7.071 |  22
  7.416 |  23
  7.746 |  24
  8.062 |  25
  8.367 |  26
  8.660 |  27
  8.944 |  28
  9.220 |  29
  9.487 |  29
  9.747 |  30
 10.000 |  31
 14.142 |  44
 17.321 |  54
 20.000 |  62
 22.361 |  70
 24.495 |  76
 26.458 |  83
 28.284 |  88
 30.000 |  94
 31.623 |  99
 44.721 | 142
 54.772 | 172
 63.246 | 198
 70.711 | 222
 77.460 | 243
 83.666 | 262
 89.443 | 281
 94.868 | 298
100.000 | 314

Fig. 4: Table that shows the square root of m₁ and the consequent number of collisions that occur

Fig. 5: Number of collisions plotted against the square root of the mass of object1 using data from fig. 4

Indeed, a linear relationship is given. Have a look at the trend line given by Microsoft Excel, which has the equation

N = 3.1413√m₁ − 0.3153

between N (defined as the total number of collisions that occur — section 4) and √m₁. The gradient of the linear relationship – 3.1413 – is indeed a familiar number, since it is ‘very’ close to π. An even more interesting (and interlinked) pattern that can be observed from the data is that the digits of π appear whenever the value of √m₁ is a multiple of 10 (i.e. m₁ is a multiple of 100). For example, when m₁ is 1 kg, a total of 3 collisions occur. When m₁ is 100 kg, a total of 31 collisions occur. When 10,000 kg, 314 collisions. When 1,000,000 kg, 3141 collisions. And so on. Indeed, this is only true in a perfectly theoretical setting, which is somewhat impossible to create. However, the fact that the digits of π appear in some ‘counted’ value – a discrete set of data – is itself a very fascinating idea. So, why is this the case? Clearly, our simulated system seems to have nothing to do with the concept of circles, since, after all, the simulation was simply dealing with the relationship between the mass of the object being pushed and the total number of collisions that occur in an elastic system. Where could π appear?

5 The hidden circle and the phase space

It was mentioned in section 1 that π appears in natural phenomena that deal with the concept of circles. This conveys the idea that even within our simulated environment there is a ‘hidden’ circle. The hidden circle, in this case, is in fact in the equation of the conservation of energy. This approach of using a circular phase space was based on (3Blue1Brown, 2019). The conservation of energy, in our simulation, tells us that the sum of the kinetic energies of the two objects ‘never changes’ throughout the simulation and is set to a certain constant value, giving the following relationship:

(1/2)m₁v₁² + (1/2)v₂² = E    (8)

Again, m₂ is set as 1 in this case, and E is the value of the initial total kinetic energy of the system. When we plot this system on a cartesian plane with the x-axis as v₁ and the y-axis as v₂, the following ‘elliptical’ graph can be constructed.

Fig. 6: Graph of equation (8) that shows the elliptical phase space of the velocities of the system

Since ellipses are basically circles that have been ‘stretched’ in one dimension, undoing the stretch — in other words, adjusting the scale of the horizontal axis (x-axis) from v₁ to √m₁v₁ — creates a ‘circular’ phase space.

Fig. 7: Graph of the circular phase space — modified from fig. 6 by adjusting the horizontal axis from v₁ to √m₁v₁

What the circle in fig. 7, which essentially depicts the principle of conservation of energy, tells us is that whatever values v₁ and v₂ take throughout the simulation, they must remain on the circle, since, regardless of the values of the velocities, the energy must remain constant at the abstract value of E from equation (8). Here, the equation for the phase space is a modification of equation (8) in the form

(1/2)x² + (1/2)y² = E    (9)

whereby x = √m₁v₁ and y = v₂. The initial conditions in fig. 7 make logical sense, since they tell us that v₁ is at its maximum value and v₂ has a value of 0, i.e., when the simulation begins, only object1 moves while object2 remains still until the first collision happens.

6 How collisions are presented in the phase space

The first collision is an elastic collision between object1 and object2. In section 5, I established that the velocities of the objects must remain on the circular phase space. Therefore, knowing that the collisions between the two objects are perfectly elastic, the new coordinates of the velocities in the circular phase space are where the equation of the conservation of momentum intersects with the circular phase space (conservation of energy). We know that the equation of the conservation of momentum is

m₁v₁ + v₂ = P    (10)

and again, using the substitutions x = √m₁v₁ and y = v₂, equation (10) is equivalent to

√m₁ x + y = P  ⇒  y = −√m₁ x + P    (11)

when plotted in the phase space. Since we know that equation (11) defines the ‘conversion’ of velocities in the phase space, it must be the case that equation (11) passes through the coordinate in the phase space that defines the velocities ‘before’ the collision, and the coordinate where this line meets a new point on the circular phase space defines the velocities ‘after’ the collision.

Fig. 8: A new coordinate (C₁) formed in the phase space after the first collision occurs

Fig. 8 shows a new coordinate C₁ that defines the velocities after the first collision occurs, which is — as mentioned before — where the equation of the conservation of energy intersects with the equation of the conservation of momentum (containing the velocities before the collision) in the phase space. In this example, m₁ is given a value of 10. We know that the next thing that happens in the system is object2 bouncing off the wall, causing it to travel with the opposite velocity of the same magnitude. This, in the phase space, can simply be presented as the sign of the y-coordinate of C₁ switching from a positive value to a negative value. This new point, then, will be C₂.


Fig. 9: A third coordinate (C₃) formed in the phase space after the second collision (with the wall) occurs

From now onwards, the same process can be repeated an unknown number of times until there can be no more collisions (when the system has been ‘terminated’). The condition for the termination of the system is simply that both velocities are negative and the magnitude of v₁ is greater than that of v₂, since this indicates that both objects are travelling away from the wall and, v₁ being ‘faster’, they cannot collide any more. Mathematically, the following inequality

x/√m₁ < y < 0    (12)

summarizes the conditions for termination, with v₁ = x/√m₁ and v₂ = y, since the inequalities mean that both are negative and v₁ is ‘smaller’ (greater in magnitude but on the negative side) than v₂. Fig. 10 below is the result of repeating the process until inequalities (12) are met.

Fig. 10: The phase space for m₁ = 10, showing a total of 10 collisions until Cₙ (n = 10) is located in the gray area — inequalities (12)

The graphical result shown in fig. 10 is in accordance with the result of the total number of collisions shown in fig. 2 when m₁ = 10, both indicating that a total of 10 collisions occur. At first glance, a crucial piece of information about the phase space may not be clearly evident – that all arc lengths between adjacent coordinates, except that between the last two collisions (C₉ and C₁₀), are of the same length, or, in other words, the same measure of angle apart from each other. This is because the gradients of all the ‘red lines’ that indicate the change in velocities due to the collisions between the two objects are identical, with a value of −√m₁, evident from equation (11). For example, have a look at the arc formed by C₁, C₂ and C₃ in the phase space. When two radii from C₁ and C₃ are projected to the center of the phase-space circle, then, by the circle theorem which states that the angle at the center is twice the angle at the circumference, the size of the angle between the two radii is equal to twice the angle between C₂C₁ and C₂C₃. The angle between C₂C₁ and C₂C₃, as mentioned before, is determined by the gradient of the red line (more precisely, the negative inverse of the gradient, since the tangent of that angle is the run over the rise, whereas the gradient is the rise over the run, with one of them in the opposite direction):

tan θ = −Δx/Δy = −1/(Δy/Δx) = 1/√m₁
⇒ θ = arctan(1/√m₁),   ∠C₁OC₃ = 2θ = 2 arctan(1/√m₁)    (13)

And since all sectors formed by adjacent coordinates (except C₉ and C₁₀) have the same angle, fig. 11 can be drawn on the phase space.

Fig. 11: The angles formed by adjacent sectors depicted on the phase space, whereby θ is defined in fig. 10

7 A change in the question

Now, an entirely new question can be asked about the number of collisions. When the unknown maximum number of collisions is N, it must be the case that the N sectors (or N arcs) are the most that can be fitted into the circle without any overlap with the starting sector. Mathematically, the following relationship can be constructed from fig. 11:

N × 2θ ≤ 2π    (14)

which basically means that N angles of size 2θ must amount to no more than a full circle, i.e., 2π radians. Substituting equation (13) into equation (14), the following relationship between m₁ and N is constructed:

2N arctan(1/√m₁) ≤ 2π
⇒ N arctan(1/√m₁) ≤ π
⇒ N ≤ π / arctan(1/√m₁)

And since N is defined as the largest integer that satisfies this relationship, the Gauss notation (the floor function) can be used to create a rather complete model for the simulation:

N = ⌊ π / arctan(1/√m₁) ⌋    (15)

Although this form of the equation is the most accurate, a slight modification can be made to bring the equation closer to the form of the trend line shown in section 4.1. The trick involves the small angle approximation, which states that the smaller the value of θ is (the closer it is to 0), the closer the value of tan θ becomes to θ:

θ ≈ tan θ

Since the values I input to m₁ in the simulation are very large (relative to a unit mass), and so the values of θ are very small, equation (15) can be modified to a simpler, less accurate, but still very intuitive form:

N = π / (1/√m₁)
⇒ N = π√m₁

This form intuitively shows how the digits of π appear when m₁ is given a value of 1, 100, 10⁴, 10⁶, and so on. For example, it is clearly evident that when the floor function is taken of π × √10⁴ (= π × 100), it returns the integer formed by the first three digits of π, 314.

8 Evaluation of the findings

The validity of equation (15) can be checked by comparing the raw output N that comes from equation (15) with the actual results from the simulation; fig. 12 shows this.

Mass of object 1 /kg | Number of collisions | N calculated with model | Columns 2 and 3 equal?
    1 |   3 |   4 | Not equal
    5 |   7 |   7 | Equal
   10 |  10 |  10 | Equal
   15 |  12 |  12 | Equal
   20 |  14 |  14 | Equal
   25 |  15 |  15 | Equal
   30 |  17 |  17 | Equal
   35 |  18 |  18 | Equal
   40 |  20 |  20 | Equal
   45 |  21 |  21 | Equal
   50 |  22 |  22 | Equal
   55 |  23 |  23 | Equal
   60 |  24 |  24 | Equal
   65 |  25 |  25 | Equal
   70 |  26 |  26 | Equal
   75 |  27 |  27 | Equal
   80 |  28 |  28 | Equal
   85 |  29 |  29 | Equal
   90 |  29 |  29 | Equal
   95 |  30 |  30 | Equal
  100 |  31 |  31 | Equal
  200 |  44 |  44 | Equal
  300 |  54 |  54 | Equal
  400 |  62 |  62 | Equal
  500 |  70 |  70 | Equal
  600 |  76 |  76 | Equal
  700 |  83 |  83 | Equal
  800 |  88 |  88 | Equal
  900 |  94 |  94 | Equal
 1000 |  99 |  99 | Equal
 2000 | 142 | 142 | Equal
 3000 | 172 | 172 | Equal
 4000 | 198 | 198 | Equal
 5000 | 222 | 222 | Equal
 6000 | 243 | 243 | Equal
 7000 | 262 | 262 | Equal
 8000 | 281 | 281 | Equal
 9000 | 298 | 298 | Equal
10000 | 314 | 314 | Equal

Fig. 12: Comparison of the simulated number of collisions with N calculated from equation (15)
Fig. 12 shows that, except for when m₁ is 1 kg, the model is very accurate. The error with m₁ = 1 is due to the fact that the simulation ends with v₂ = 0, meaning that although, according to equation (15), object2 is expected to collide once more with the wall, in reality it simply stops moving. Therefore, it can be concluded that the mathematical model of equation (15) predicts the total number of collisions that occur in such a simulated system very accurately. Also, returning to the main question of where and how π appears: the constant π appears in the result of the simulation because, firstly, the equation of conservation of energy can be adjusted to form a circular phase space; secondly, the fact that all collisions are elastic ensures that the change in coordinates in the velocity phase space is periodic; and thirdly, at very small values of θ,

θ ≈ tan θ

can be established as true, and so the first few digits of π very accurately predict the total number of collisions that would occur (this prediction gets better as the value of θ gets smaller, since the small angle approximation becomes more accurate).

9 Bibliography

1. 3Blue1Brown, director. Why Do Colliding Blocks Compute Pi?, YouTube, 20 Jan. 2019, www.youtube.com/watch?v=jsYwFizhncE.
2. Hohenwarter, Markus. “Calculator Suite.” GeoGebra, GeoGebra, 2001, www.geogebra.org/calculator.
3. Kim, Moojo. “EmergenceOfPiElasticCollisionsMathAAMockIAMoojo(Joe)Kim.txt.” Google Drive, Google, 15 Jan. 2021, drive.google.com/file/d/1Wcv-UyFEVfpVA98VU1IxmeDwwhe52Ody/view?usp=sharing



