Chalkdust, Issue 13

chalkdust
Issue 13, 2021
A magazine for the mathematically curious

Prize crossnumber + Your numeral system + Dear Dirichlet + Top calculator buttons



In this issue...

4 In conversation with Ulrike Tillmann: Sophie Maclean and David Sheard sit down with the incoming president of the LMS
12 The maths of Mafia: Sophie Maclean has a run-in with the mathia
22 On conditional probability: $P(\text{this article is good} \mid \text{it’s written by Madeleine Hall}) = 0.92$
30 How to be the least popular American president: Francisco Berkemeier is still waiting for results from Georgia
39 Who is the best England manager? Paddy Moore levels the score
48 Surfing on wavelets: Johannes Huber debates whether it’s pronounced JPEG or JPEG
60 Diagrammatic algebra: Aryan Ghobadi gives a maths lecture at a zoo

3 Page 3 model
11 What’s hot and what’s not
21 Which number system are you?
27 The big argument: Is the Einstein summation convention worth it?
28 Dear Dirichlet
35 On the cover: Cellular automata by Matthew Scroggs
46 Puzzles
55 Letters
56 Significant figures: John Conway by Jamie Handitye and Jakob Stein
58 Crossnumber
69 Zoom conference zingo
70 Reviews
72 Top ten: Calculator buttons

spring 2021


Welcome to Chalkdust issue 13: this magazine is officially a teenager. Normally, with age comes maturity and wisdom.

This issue, our significant figure (pp 56–57) is John Conway, and his influence can be felt throughout the magazine: from the cover, generated by cellular automata (pp 35–38), to the team’s favourite and least favourite mathematical games. You can also read about the mathematics of games like Mafia or Among Us (pp 12–20), conditional probability in card games (pp 22–26), and find out about a game (which is very popular apparently) called ‘football’ (pp 39–45). And on pp 30–34 we delve into the greatest game of all time (according to Russian hackers), the US presidential election.

Speaking of presidents, since our last issue a new one has been elected. You can read our interview with Ulrike Tillmann, incoming president of the London Mathematical Society (pp 4–10), where we discuss her work tackling inequality and her research bridging pure and applied mathematics. On top of that, we have two articles discussing the very purest and very applied-est of mathematics: braided categories (pp 60–68), and image compression (pp 48–54).

We know everyone is sick of the endless video calls of the last few months; to relieve the monotony you can play our very own game of Zoom conference bingo (page 69). After all, it’s the twenties and there is time for Zingo!

We at Chalkdust are excited at the prospect of running an in-person event again in the future, seeing old and new faces (edges and vertices), and getting rid of the boxes of magazines that are filling up our spare rooms.

The Chalkdust team

The team
Charlotte Connolly, Jamie Handitye, Ellen Jolley, Sophie Maclean, Matthew Scroggs, Belgin Seymenoğlu, David Sheard, Jakob Stein, Adam Townsend

chalkdustmagazine.com
contact@chalkdustmagazine.com
@chalkdustmag, chalkdustmag, @chalkdustmag@mathstodon.xyz
Chalkdust Magazine, Department of Mathematics, UCL, Gower Street, London WC1E 6BT, UK.

Acknowledgements
We would like to thank: all our authors for writing wonderful content; our sponsors for allowing us to continue making the magazine; Helen Wilson, Helen Higgins, Luciano Rila and everyone else at UCL’s Department of Mathematics; everyone at Achieve Fulfilment for their help with distribution.

ISSN 2059-3805 (Print). ISSN 2059-3813 (Online). Published by Chalkdust Magazine, Dept of Mathematics, UCL, Gower Street, London WC1E 6BT, UK. © Copyright for articles is retained by the original authors, and all other content is copyright Chalkdust Magazine 2021. All rights reserved. If you wish to reproduce any content, please contact us at Chalkdust Magazine, Dept of Mathematics, UCL, Gower Street, London WC1E 6BT, UK or email contact@chalkdustmagazine.com


Have you ‘herd’? The world’s largest cow is over six feet tall and weighs more than 1.3 tonnes. Is a bigger cow possi-bull? Will the future contain infinitely large cows? The steaks have never been higher!

To answer this question, let’s take a look at the cow’s legs. If the main (meaty) bit of the cow has a volume $V$ and density $\rho$ then its weight is $\rho V g$. So each leg supports a load of about

$$N = \frac{\rho V g}{4}.$$

In pursuit of glory, let’s now make the length, height and width of the cow bigger by a factor $a$. The cow’s new volume is $a^3 V$ and so the load on each leg is $a^3 N$: it grows cubically as $a$ increases. Can the legs cope?

If we model the legs as cylinders (since they already ‘lactose’...), we can use a 1757 result from the famous cow enthusiast Euler: if a cylinder has height $L$ and radius $r$, the maximum load it can support standing upright is

$$N_{\max} = \frac{E \pi^3 r^4}{4 L^2}.$$

$E$ here is just a property of the material: its stiffness, or Young’s moo-dulus.

With our scaling, $L$ and $r$ are now $a$ times bigger. Our new maximum load is

$$\frac{E \pi^3 a^4 r^4}{4 a^2 L^2} = a^2 N_{\max}.$$

Uh oh... this only scales as $a^2$: quadratically. So even though $N_{\max}$ starts above $N$ (it has to, given that these cows exist!), there will come a maximum possible $a$, after which there will beef-ar too much cow and its legs will give way... an udder disaster.
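The crossover point can be written down explicitly. The following is only a sketch that combines the two formulas above; the label $a_{\max}$ is introduced here for illustration and does not appear in the original.

```latex
% The legs cope for as long as the scaled load stays below the scaled limit.
\[
  a^3 N \le a^2 N_{\max}
  \quad\Longleftrightarrow\quad
  a \le a_{\max} = \frac{N_{\max}}{N}
  = \frac{E \pi^3 r^4}{4 L^2} \cdot \frac{4}{\rho V g}
  = \frac{E \pi^3 r^4}{\rho V g L^2}.
\]
```

So, within this simple model, the largest possible cow is bigger than today's by exactly the ratio of what a leg can currently hold to what it currently carries.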

This analysis tells us something really important about biology: that there is a natural maximum size for land mammals. But have we reached it for cows? Brody & Lardy’s 1000-page tome Bioenergetics and Growth from 1946 has all the de-tail you need. We’ll leave you to ruminate on the cow-culations.


In conversation with...
Ulrike Tillmann

Sophie Maclean and David Sheard

Image: Ander McIntyre, used with permission from Ulrike Tillmann

They say variety is the spice of life, and to us at Chalkdust maths is life, so it makes sense that maths is made better by variety. A variety of topics, a variety of people, a variety of poorly constructed maths puns. Ulrike Tillmann embodies this ethos with her work bridging the gap between pure and applied maths.

Despite spending most of her academic career in the UK, Ulrike has lived in several other countries. She was born in Germany and then went on to study in the US. She is now a professor of pure mathematics at the University of Oxford and a fellow of the Royal Society, balancing her time between research, teaching, and outreach. She sat down with us to chat about her career and what the future holds, both for her and maths in general.

Taking the reigns

If you’ve been following maths news in the past few months, the name ‘Ulrike Tillmann’ may be particularly familiar to you. It was announced recently that she will be the next president of the London Mathematical Society, one of the UK’s five ‘learned societies’ for mathematics. She will also take up the mantle as director of the Isaac Newton Institute, a research institute at the University of Cambridge, in autumn of this year.

Research institutes are perhaps the least well-known entities in the academic world (as viewed from the outside), often only visited by some of the most senior academics in a field. We asked Ulrike to explain what they are all about. “The Isaac Newton Institute runs mathematical programmes in quite a broad range of areas. These programmes typically run between four and six months and researchers come from all over the world to concentrate on their research.” The programmes are beneficial not only to individual mathematicians, but to the community as a whole. “Being together with your colleagues who are also experts in your area, and who are often completely spread all over the world, is a fantastic thing. It brings the field forward and it can make a big difference to that research area.”

On paper, the role of director will involve overseeing the organisation of these programmes, but she sees it going beyond this, including “making sure that things like equality and diversity are not just observed, but also incorporated.” Diversity matters a lot to Ulrike and she has spent a lot of time thinking about what can be done, so she turns our conversation towards representation in mathematics more generally. “The most important part seems to me to be ensuring that women, and also other minorities, are welcome; and fostering a very open society.”

Ulrike is involved with many events for women in mathematics, both as a speaker and organiser. Indeed, encouraging more women into mathematics was part of her motivation for taking on her new positions. “I think as women we have to occasionally come forward and do these roles, even though sometimes we shy away from them. Being a presence is important.” She hopes that by increasing the visibility of women in mathematics, women will be encouraged to study maths and stay in academia. “We’re always drawn to groups where we see people who are similar to us, where we can identify, where we are obviously welcome. So I think we need to make that part of our culture: to just be open.”

Unfortunately we know all too well that any change like this will take time, and she acknowledges the difficulties. “If I had some solution, I would have implemented it by now. But go back 30 years and there has been a big change. I think that’s encouraging and we just need to make sure that it is pushed in the right direction.”

Ulrike, of course, knows personally the importance of diversity as a woman in mathematics, but she is keen to stress that diversity goes beyond gender and the other underrepresented groups we often talk about. Really she sees the need for diversity of experience, thinking, and background. “In terms of excellence you need a mixture of people, not just the stars; you need a whole mixture of people all striving for excellence in their own ways. This cannot be measured on a simple one-dimensional scale. I think geographic diversity is also another aspect of this which is really important. And we will all be better off if we spread things around a little bit in a sensible way.”

Seeing the shape of data

Some of Ulrike’s most recent research has been in the fledgling field of topological data analysis (TDA). “It’s really trying to capture the shape of data. You can imagine data as a point cloud in some Euclidean space, and when you have such point clouds, what does it mean to be a shape?”

The idea of studying the basic shapes in data is nothing new. “There is clustering, for example: people already understand clustering relatively well. There’s a bunch of points here, a bunch of points there; they seem to be separated and maybe that separation is meaningful for the data. Or in linear regression, you are trying to fit a line to your data, and then that gives you some understanding of the data.”

Topological data analysis seeks to use advanced topological techniques to detect more complicated structures hidden in complex data. “You are looking for holes in dimension one or two and then you can use different techniques to approach the same data from different directions and try to understand a little bit more about the shape. The idea is that, especially for complex data, the shape should be meaningful.”

It might be difficult to imagine how complex topological features can be interpreted meaningfully in the real world, but the approach has many success stories. “There has been a famous study by Gunnar Carlsson and collaborators which looks at different types of breast cancers. The data was effectively Y-shaped, rather than just a line. Understanding that there was a third branch, a new branch, meant they could see that not all cancers were the same. There was actually a ‘good’ version that you didn’t have to treat.” Data scientists rely on TDA as without it “sometimes you just can’t predict what shapes the data has.” This last point is essential: the topological techniques can help you find patterns in your data that you would not even think to look for.

One key topological tool used in TDA is persistent homology. Homology is a technique which uses algebra to count the topological features of a space. For example, the homology of a torus can be summarised by three numbers counting the features in dimensions 0, 1, and 2:

• $\beta_0 = 1$: it has one connected component, so every 0-dimensional point is connected to every other.
• $\beta_1 = 2$: it contains two ‘independent’ 1-dimensional circles (shown in red in the figure).
• $\beta_2 = 1$: its 2-dimensional surface encompasses one interior cavity.
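For readers who like to check such counts by machine, here is a minimal sketch that computes these three numbers for a triangulated torus. It is not taken from the article: the function names, the 3 × 3 grid triangulation and the choice to work with coefficients mod 2 are all assumptions made for illustration (for the torus, the mod 2 counts agree with the numbers quoted above).

```python
from itertools import combinations


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are stored as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue  # a zero row contributes nothing to the rank
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        # Clear that bit from every remaining row (addition mod 2 is XOR).
        rows = [row ^ pivot if row & low else row for row in rows]
    return rank


def betti_numbers(top_simplices):
    """Betti numbers (mod 2) of the complex generated by the given simplices."""
    # Collect every face of every listed simplex, grouped by dimension.
    faces = {}
    for simplex in top_simplices:
        vertices = sorted(simplex)
        for size in range(1, len(vertices) + 1):
            faces.setdefault(size - 1, set()).update(combinations(vertices, size))
    dims = sorted(faces)
    index = {d: {f: i for i, f in enumerate(sorted(faces[d]))} for d in dims}

    # boundary_rank[d] is the rank of the boundary map from d-faces to (d-1)-faces.
    boundary_rank = {0: 0}
    for d in dims[1:]:
        rows = []
        for face in faces[d]:
            mask = 0
            for sub in combinations(face, d):  # the faces one dimension lower
                mask |= 1 << index[d - 1][sub]
            rows.append(mask)
        boundary_rank[d] = rank_gf2(rows)

    return [len(faces[d]) - boundary_rank[d] - boundary_rank.get(d + 1, 0) for d in dims]


def torus_triangles(n=3):
    """Triangles of an n-by-n grid with opposite sides glued (hypothetical helper)."""
    triangles = []
    for i in range(n):
        for j in range(n):
            a, b = (i, j), ((i + 1) % n, j)
            c, d = (i, (j + 1) % n), ((i + 1) % n, (j + 1) % n)
            triangles += [(a, b, d), (a, c, d)]  # split each square along a diagonal
    return triangles


print(betti_numbers(torus_triangles()))  # [1, 2, 1]
```

The same function applied to the four triangular faces of a tetrahedron gives the counts for a sphere: one component, no independent circles, one cavity.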

Persistent homology studies those topological features which can be found persistently in the homology of the data as you vary the scale on which you look at the data points’ interactions (see the accompanying diagrams). In this context, clustering means that your data has several connected components, and so just corresponds to the 0-dimensional homology. TDA focuses on finding higher dimensional structures by looking at the higher-dimensional homology. “Especially looking at biological data, it is generally not so important where exactly the points are; they are just samples anyway. So topology in particular is quite useful because it tries to study the shape in a ‘fuzzy’ way. Topology is just a poor relative of geometry where you forget about angles and distances, so that it can focus on the most important features.”
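The 0-dimensional case is simple enough to sketch in a few lines. The example below is an illustration only, with a made-up point cloud and a hypothetical function name: it joins any two points closer than a chosen scale and counts the connected components that result, which is the clustering picture described above.

```python
from math import dist


def count_components(points, scale):
    """Number of clusters when points at distance <= scale are joined together."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= scale:
                parent[find(i)] = find(j)  # merge the two clusters

    return len({find(i) for i in range(len(points))})


# Made-up data: three points near the origin and two far away.
cloud = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
for scale in (0.5, 1.5, 20):
    print(scale, count_components(cloud, scale))
```

On this data the count drops from five isolated points, to two clusters, to a single blob as the scale grows. The two-cluster answer survives over a wide range of scales, and that persistence is the signal the method looks for.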

An evolving field

Ulrike’s work in this area is formalised through her role as co-director of the Centre for Topological Data Analysis at the University of Oxford. Despite a research background in very pure mathematics, she doesn’t limit herself to the theoretical side of TDA. “Our pitch to the EPSRC [Engineering and Physical Sciences Research Council] was that we would go all the way to the applications. The application should tell us what we want to understand theoretically, and then we work backwards and forwards between pure and applied. I have been involved in a study where immune cells are analysed. How quickly can they infiltrate a cancer? That is a real life study where maybe these topological methods can be used. So you see the whole pipeline going through.”

Since TDA is a new and exciting field, it is tempting to try and speculate about what developments we can expect in the next few years. Ulrike is cautiously optimistic: “I think evolution rather than revolution is probably what we are going to see. A certain amount of new thinking has to be cultivated because topology is not one of your typical areas of applied mathematics: you tend to see more analysis, numerics, and linear algebra.” It takes time for new mathematical ideas from pure topics to infiltrate applied research groups. “I think it needs to be popularised a little bit first because it’s always a two-way dialogue between the mathematician and those people who want to apply it and we just need to fill in that space more. But it is this interaction between topological data analysis and other techniques that will really be important.”

These other techniques are by no means limited to data science: applied mathematics is about pulling together any and every tool which might be helpful. “We are also trying to mix TDA with machine learning methods to make more meaningful and also more interpretable machine learning algorithms.”

Figure: Persistent homology studies the shape data makes as points interact at different length scales.

Igniting a love of maths

You’d be forgiven for thinking Ulrike never had doubts she’d be a mathematician, but this was not the case. “It was somewhat gradual. I went to Stanford for my graduate degree and during my first year I was playing with the idea of doing something in computer science.” Although Ulrike did eventually settle on maths, she does worry that a more rigid degree course would have prevented this. “I did about a third of my undergraduate courses in mathematics. If I had come to Britain at that point I would have completely missed the train.” In fact, Ulrike believes this is a significant flaw in the British system. “I think we force our students into decisions too early. If you like mathematics you shouldn’t have to rely on your decision as a 16-year-old to pick further mathematics A-level.”

Ulrike grew up in a small town in Germany and partially credits this for sparking her joy of mathematics. “There was no kindergarten or anything like that, so I was a bit bored. I asked my mother for problems and she would set me some sums and I liked doing those.” Throughout school all the way to her undergraduate degree, maths was just something that came easily to her, rather than a strong interest. Eventually it was the puzzles that drew her in. “I really wanted to work on these challenging problems that mathematics provides. I think it’s a really deep satisfaction that comes out of solving a problem. That you see connections between things that you haven’t been able to see before, and that maybe nobody else has been able to see before. That is very exciting, to really try to understand something, and sometimes you bring new concepts together.”

Studying surfaces and spaces

The problems which really appeal to Ulrike come from geometry and topology, in particular the so-called moduli spaces of surfaces. These turn out to connect with several areas of maths and physics: “You know what a surface is, and one way to think of moduli spaces is to understand them in families.”

A simpler one-dimensional example might be to think about all possible circles in the plane. A circle is determined by its centre $(x, y)$ and its radius $r > 0$. Therefore choosing a circle in $\mathbb{R}^2$ is the same as choosing a point $(x, y, r)$ in $\mathbb{R}^2 \times \mathbb{R}_{>0} =: M$. This set is the moduli space for circles in the plane. The key idea is that this moduli space is itself a geometric space (not merely an abstract set), and so you can study it using geometric and topological tools, and hence study all possible circles at once. For example, following a path in $M$ corresponds to continuously deforming one circle into another.

Figure: The moduli space $M$ of circles in the plane (left), and a continuous deformation of circles corresponding to a path in $M$ (right).
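To make the idea of a path in the moduli space concrete, here is a small sketch that walks along the straight line between two points of M and lists the circles met on the way. The `Circle` type and the `straight_path` function are hypothetical names chosen for this illustration; a straight line is only one of many possible paths.

```python
from typing import NamedTuple


class Circle(NamedTuple):
    """A point (x, y, r) of the moduli space M: centre (x, y) and radius r > 0."""
    x: float
    y: float
    r: float


def straight_path(start, end, steps):
    """Circles met along the straight segment in M from start to end."""
    if start.r <= 0 or end.r <= 0:
        raise ValueError("radii must be positive to give points of M")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    path = []
    for k in range(steps + 1):
        t = k / steps
        # A weighted average of two positive radii is positive,
        # so every intermediate point is still a genuine circle.
        path.append(Circle(
            (1 - t) * start.x + t * end.x,
            (1 - t) * start.y + t * end.y,
            (1 - t) * start.r + t * end.r,
        ))
    return path


for circle in straight_path(Circle(0, 0, 1), Circle(4, 2, 3), 4):
    print(circle)
```

Each printed triple is one frame of the deformation: the centre slides across the plane while the radius grows, and at no point does the circle collapse.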

Since surfaces are two-dimensional, and geometrically more complex, their moduli spaces are a lot more complicated, but that also makes them more interesting. “Surfaces are one of the foundational objects in mathematics. They appear in geometry, of course, but also dynamics and number theory; they’re all somehow connected to surfaces in one way or another, and the moduli space is of interest to a lot of these subjects.”

In fact, it was through the applications of moduli spaces to physics that Ulrike first became interested in the subject, working with fellow Oxford topologist Graeme Segal. “He was interested in conformal field theories and topological quantum field theories and it is the physics story behind it that made it very interesting for me. I’m still very excited about this physics part of it, because some of the theorems that we were able to prove can be interpreted as classifications of so-called invertible topological quantum field theories, so the story behind it is quite important.” Again the go-to technique for studying moduli spaces for Ulrike is homology.

Figure: Ulrike representing the British Royal Society of Sciences, 2018. Image: Pavlína Jáchimová, CC BY 3.0 CZ.

A mathematician’s apology

People engaged in basic research (research which has no immediate application) are often called upon to justify their work, whether to family and friends, funding bodies, or even policymakers and the general public. Sometimes this may amount to no more than a minor inconvenience. However, in recent months the topic has risen to the fore since the University of Leicester began a consultation on proposals which, as part of broad restructuring across its faculties, include the disbanding of the pure maths research group in favour of a focus on applications of mathematics to artificial intelligence, computational modelling, and data science.

Of course, a consultation is not the same as enacting a proposal, and there are likely to be many factors involving the plans which are not in the public domain; nevertheless as a pure mathematician who works on the interface with applications in TDA, Ulrike is well-placed to comment in general on the idea of doing away with pure mathematics in a research-intensive institution. “Of course I don’t know the precise situation, there are often financial considerations and so on, but I find it a little bit puzzling frankly. I’m a pure mathematician who also moved into data science and it seems to me that a new subject like data science will certainly benefit from pure mathematics, where many of the new ideas are coming from. Turing, for example, was a mathematician first and then the inventor of the Turing machine. It feels to me that research culture, in a research university, especially one that hopes to do something as technical as data science, ought to support foundational mathematics.”

The Centre for Topological Data Analysis stands as a prime example of how pure mathematics works in tandem with its applications, although of course immediate applicability is by no means the sole justification of pure maths.

Undergraduate degrees tend to have curricula designed to ensure a strong basis in pure mathematics. This allows students to develop a sufficient grounding and enables them to specialise in topics ranging from applied to pure. How these pure modules will be taught without pure research mathematicians is clearly a question which must be tackled. “In Leicester’s case in particular, what troubles me is that they still hope to have an undergraduate maths degree, but having teaching-only staff to do the pure modules, and I don’t think that’s a great solution. For me, teaching gets more interesting and is invigorated by research, so the best teaching is often inspired and kept relevant by research.”

Much of the criticism of Leicester’s proposals centres around legitimate concerns for the individual workers who are now facing the prospect of redundancy during a pandemic, but any argument in defence of basic research in mathematics or any other subject needs to stand independent of the present situation. It is no great secret that academia is a very insular field, and one might reasonably try to argue that mathematics needs to modernise, consider new ways of working, and what is the harm if one university opts to focus solely on technology and applications?

“I think actually Britain is generally quite advanced in thinking in terms of impact. Of course it takes some effort to make these connections and they are also not necessarily done by pure mathematicians themselves. But pure mathematics brings the practice and culture of rigorous thinking. That’s really important, and it’s often our students who make the applications. Britain has a science and technology based industry and economy, and we need more people educated in Stem subjects, of which mathematics is a foundational part. I don’t think we can get away from that.”

Sophie Maclean
Sophie Maclean is a recent maths graduate from the University of Cambridge and very much misses her degree. She has no free time: she is a Chalkdust editor.
@sophiethemathmo

David Sheard
David is a final year PhD student at UCL studying geometric group theory. When not doing maths he can usually be found singing or playing the flute.
davidsheard.co.uk, david.sheard.17@ucl.ac.uk, @SheardDavid

My favourite game
Maths is all about playing with ideas to see what happens, and some of the coolest maths comes straight out of playing games. We’ve spread some of our favourite (and least favourite) mathematical games throughout this issue.

Mathsteroids
Matthew Scroggs

Mathsteroids (mscroggs.co.uk/mathsteroids) is a version of the classic arcade game Asteroids that you can play on a selection of interesting surfaces. For the levels on a sphere, you can choose which 2D representation of the sphere you want to see while you play. The Gall–Peters level is very hard.
10/shameless self-promotion


What’s hot and what’s not

Maths is a fickle world. Stay à la mode with our guide to the latest trends.

Agree? Disagree? @chalkdustmag, chalkdustmag

HOT: Typing noise ASMR. So soothing.
NOT: Muting yourself when not talking. There’s a 0% chance you’ll remember to unmute later.

HOT: Financial derivatives. All the cool kids love buying GameStop.
NOT: Partial derivatives. None of the cool kids love these. Stop.

HOT: 6.2 cm. Seems a reasonable height for a person.
NOT: 6′2″. Seems an unreasonable height for a person.

HOT: FFTs. Don’t waste your time on slow Fourier transforms.
NOT: NFTs. Don’t waste your money buying a blockchain picture of a scorpion.

HOT: Writing about an election six months too late. See pages 30–34.
NOT: Being current. See no pages.

More free fashion advice online at chalkdustmagazine.com

Pictures: Background: Flickr user Harmon, CC BY-SA 2.0. GameStop: Mike Mozart, CC BY 2.0.


The maths of Mafia
Sophie Maclean

It was 8pm on a wintry Saturday and I was pleading for my life. “I would never betray you. I promise.” I searched desperately for someone, anyone, to back me up. Of course, they were right. I had been responsible for the murder of many of their friends but I wasn’t about to admit to that. My co-conspirators had gone quiet, well aware that to support me was to put themselves into the firing line. At 8.55pm a vote was held. By 9pm I had been executed.

OK, so that first paragraph may have been a bit misleading. Thankfully I was not actually put to death, and I haven’t killed anyone in real life. This all happened whilst I was playing Mafia. For those unfamiliar, Mafia is a strategy game in which players are (secretly) assigned to be either citizens or mafia. The game is split up into day and night phases (when playing in person, night is simulated by everybody closing their eyes). During the night phase, the mafia are able to communicate with each other and can vote to kill one person. During the day phase, all the residents (both citizens and mafia) discover who died, and then vote to execute one resident. The aim for each group is to eliminate the other.

Some of you may be reading this thinking “Hmmmm this is sus. It sounds very much like Among Us” and you’d be right. The popular online game was inspired by Mafia and is one of many adaptations of the game. The particular version I was playing, when I was outed as a Mafia member, was Harry Potter themed (if there’s one thing you’ll learn about me during this article, it’s that I’m incredibly cool). People found themselves either on Team Hogwarts or Team Death Eater, and there were some special Potter-themed rules, which led to an interesting situation mathematically.

At the beginning of the game, Voldemort was allowed to select a horcrux. This essentially meant that Voldemort could not die until both he and this other character were killed. The crucial part of this power was that the horcrux didn’t know they were the horcrux.

Towards the end of the game, there were four characters left alive: Voldemort, Fred Weasley (who was the horcrux), Fawkes the phoenix, and Ginny Weasley. By this point, everyone knew for certain which character had been assigned to each player. Fred, Fawkes and Ginny had also worked out that one of them was the horcrux but didn’t know who. They had guessed it was Fawkes. That night Voldemort attempted to kill Ginny (though didn’t succeed because magic). The next day, Team Hogwarts were informed of the failed assassination. What should they do next?

The Great (Monty) Hall

On the face of it, there seems to be no reason to alter who they suspect the horcrux is. The probability of Fawkes being the horcrux hasn’t changed, right? Well actually, wrong. When they settled on Fawkes, there was a 1/3 chance that he was the horcrux. In attempting to kill Ginny, Voldemort had shown that she wasn’t the horcrux (Voldemort could equally have attempted to kill Fawkes here). So the probability that Fred is the horcrux is 2/3. Therefore it’s best to kill Fred (sorry Fred!). Though at first glance the situation doesn’t appear to have changed, the addition of new information means it actually has, and in Mafia information is crucial.

Those of you familiar with a certain piece of mathematical lore will now be jumping out of your seats. This is equivalent to the infamous Monty Hall problem. In that problem, a contestant must pick one of three doors as part of a game show. The contestant is told that behind two of the doors is a goat, but behind the third is a car. After they have made their choice, Monty Hall (the presenter) opens one of the other doors to reveal a goat. The contestant is then given the opportunity to swap doors. In our Mafia analogy, the horcrux is the car, with the other two characters being goats. Voldemort is our Monty Hall, and by attempting to kill Ginny he opens the door to a goat. And just like in our Mafia version, the contestant is better off swapping doors.
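If you would rather trust a computer than an argument, the classic Monty Hall problem can be checked by listing every case. This is a sketch under the usual assumptions, which are stated here rather than taken from the game described above: the car is equally likely to be behind each door, the first pick is made uniformly at random, and the host always opens a goat door other than the contestant's, choosing at random when there are two. The function name is made up for this example.

```python
from fractions import Fraction
from itertools import product


def monty_hall_probabilities():
    """Exact chances of winning the car when staying and when swapping."""
    doors = range(3)
    stay = swap = Fraction(0)
    for car, pick in product(doors, repeat=2):
        case = Fraction(1, 9)  # car position and first pick are both uniform
        # The host may open any door that is neither the pick nor the car.
        openable = [d for d in doors if d != pick and d != car]
        for opened in openable:
            weight = case / len(openable)
            # Swapping means taking the one door that is neither picked nor opened.
            other = next(d for d in doors if d not in (pick, opened))
            if pick == car:
                stay += weight
            if other == car:
                swap += weight
    return stay, swap


stay, swap = monty_hall_probabilities()
print(stay, swap)  # 1/3 2/3
```

Using exact fractions rather than random trials means the answer comes out as 1/3 and 2/3 on the nose, with no sampling noise to argue about.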

Harry Potter in the Great Hall during a feast

If this is taking a while to get your head around, consider the following table (now framed in terms of the Monty Hall problem):

Swap?         Initial guess   Final guess
Don’t swap    car             car!
              goat 1          goat 1
              goat 2          goat 2
Swap          car             a goat
              goat 1          car!
              goat 2          car!

Each of the initial scenarios (ie each row) is equally likely to occur. You can now clearly see that when swapping, the contestant wins 2/3 of the time, compared to only 1/3 of the time when staying put. Another way of putting this is that to win when sticking with your first guess, you have to guess correctly first time (which has probability of 1/3) but to win if you swap, you have to be wrong on your first guess (which has probability of 2/3).

This whole scenario got me wondering how else maths could help when playing Mafia. I first learnt to play Mafia at maths camp (I told you I was cool) so I knew it was popular with mathematicians. Could it be that there’s a secret mathematical strategy to guarantee that you win the game?

Rules and Regulus-ions Black In short, no. One of the great things about Mafia is there’s a huge psychological element. When playing with people you know well, you can observe changes in behaviour that indicate they’re lyingβ€”you can spot contradictions in alibis, you can notice voting patterns and compare that with known friendships. Interrogations can be carried out, pressure can be applied; I’ve even known someone to threaten to end a relationship if it transpired her partner was lying to her. But this doesn’t mean that there’s nothing we can say. If we take out the psychological aspect, and assume that murder choices are truly random, we can give a probability that the mafia win. The first step in exploring the maths of any game is to clearly set the rules, and formalise our mathematical model. We will consider a simple version of the game, where everyone is either a mafia member or a citizen and there are no Potter-esque powers. Let’s define the game as follows: There are initially 𝑁 players, all of whom are called residents. There is also one more person, not playing, who coordinates the game (this allows anonymous and simultaneous votes). Before the gameplay begins, 𝑀 of these 𝑁 players are assigned to be mafia. The remaining 𝑁 βˆ’ 𝑀 players are citizens. Every player is told their own identity. The mafia are also told the identities of each other. The citizens only know their own identity. A turn is defined as a day phase, followed by a night phase: β€’ A day phase consists of a debate, where all players can freely discuss strategy. After this, there is a vote where all players simultaneously choose a resident to execute. The resident with the most votes is killed. In the event of a tie, one of the most voted residents is randomly chosen to die. It is then revealed whether this resident was a mafia member or not. β€’ A night phase consists of only the mafia communicating and then voting on which citizen to kill, followed by their death. 
In this model, there is no way for this death to be prevented, and the mafia cannot kill one of their own. We assume no psychological aspect comes into play, and so we assume citizens never have any information on who is and isn’t mafia (obviously this is a vast oversimplification because if, for example, a group of people exactly the same size as the mafia always vote the same way, and never vote for each other, even the least observant player may get suspicious).

The game continues until only one team (citizens or mafia) remains and this team is declared the winner. We will assume every player plays rationally and always makes the decision that maximises their chance of winning. This is perhaps the biggest assumption of all.

Let us write the current state with $n$ players and $m$ mafia as $(n, m)$. Let the probability of the mafia winning when there are $n$ players and $m$ mafia be $w(n, m)$. There are a few things that we can immediately say, without much further calculation. During a single turn, the possible transitions are $(n, m) \to (n - 2, m - 1)$ (when the residents execute a mafia member) or $(n, m) \to (n - 2, m)$ (when the residents execute a citizen). Fans of Chalkdust may recognise this as a Markov chain (which you can read about in issue 12).

[Figure: Regulus Black (Sirius’s late brother) during the day phase (left) and the night phase (right).]

One key thing to note here is that the number of residents decreases by $2$ each turn, therefore the game must end in a finite number of turns. By putting a time limit on the length of each phase, you can also guarantee the game ends in finite time. It would be very unsporting for the last remaining mafia member to refuse to stop talking and allow a vote to occur, thereby ensuring that although the mafia couldn’t win, they also couldn’t lose, but it wouldn’t be unheard of (looking at you, MPs).

Now we want to consider some probabilities. The probability of the mafia winning from each state is independent of what happened before. We can therefore say that

$$w(n, m) = P(\text{mafia executed} \mid n \text{ players}, m \text{ mafia})\, w(n - 2, m - 1) + P(\text{citizen executed} \mid n \text{ players}, m \text{ mafia})\, w(n - 2, m). \tag{f}$$

Here, $P(A \mid B)$ is the probability of $A$ happening if $B$ has already happened (see pages 22–26).

Sorting into houses cases

This is still a fairly general formula, and doesn’t give much insight. In order to say more, we’ll need to look at three separate cases:

• $m > n - m$ (the mafia have a majority)
• $m = n - m$ (there are an equal number of mafia members and citizens)
• $m < n - m$ (the citizens have a majority)

Firstly, let’s look at when the mafia outnumber the citizens. In this case, the mafia are guaranteed to win. This is because during the day phase, they can all vote to kill the same citizen, win the majority vote, and ensure it is a citizen that is killed. This can be organised during the night phase. Therefore $w(n, m) = 1$ when $m > n - m$.

What about if the citizens have a majority (ie $m < n - m$)? In our model, the citizens have no information on who is and isn’t mafia. Therefore their strategy can only be to randomly select a resident to eliminate each day phase. The debate phase can be used to agree on which random resident to execute (for example using a random number generator). The citizens then all vote for this person, and (because they have the majority) the unlucky resident is executed. There’s nothing the mafia can do to change that. Each resident has an equal chance of being selected. Therefore the probability that a mafia member is executed is $m/n$, and the probability a citizen is executed is $(n - m)/n$.

[Figure: Sorting into cases.]

The case when $m = n - m$ (ie $n = 2m$) is a special case. The citizens and mafia can propose a player each, and then one of them will randomly be executed. The probability that a mafia member is executed is therefore $1/4$ (and so the probability a citizen is executed is $3/4$). If a citizen is executed, the mafia outnumber the citizens, and so they win. If a mafia member is executed, there remains an equal number of mafia and citizens (as a citizen is killed during the night phase). This puts us in exactly the same situation as before, with the same probabilities. For the citizens to win, a mafia member must be executed on all remaining turns (of which there must be $m$, as two residents are killed per round). Hence $P(\text{citizens win}) = (1/4)^m = (1/2)^{2m} = (1/2)^n$, and so $w(2m, m) = 1 - (1/2)^n$.

If we combine all this, and use equation (f), we get the following:

$$w(n, m) = \begin{cases} 1 & \text{if } m > n - m \\ 1 - \left(\dfrac{1}{2}\right)^{n} & \text{if } m = n - m \\ \dfrac{m}{n}\, w(n - 2, m - 1) + \dfrac{n - m}{n}\, w(n - 2, m) & \text{otherwise.} \end{cases}$$

This is pretty neat and is now just an iterative equation. It would be perfectly possible to calculate $w(n, m)$ now, by just plugging through the steps (or even writing some code to do it for you if you’re that way inclined). Finding a more general formula becomes pretty complicated pretty fast. But there is still more that we can say, without our brains hurting too much.
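For readers who are that way inclined, here is a minimal Python sketch of the three-case formula above. The function name `mafia_win_probability` and the explicit `m == 0` base case (no mafia left, so the citizens have already won) are assumptions added for illustration; exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def mafia_win_probability(n: int, m: int) -> Fraction:
    """Probability w(n, m) that the mafia win from n residents, m of them mafia."""
    if m == 0:
        # Assumed base case: with no mafia left the citizens have won.
        return Fraction(0)
    if m > n - m:
        # The mafia hold a majority and control every vote.
        return Fraction(1)
    if m == n - m:
        # Tie case, using the value 1 - (1/2)^n derived in the text.
        return 1 - Fraction(1, 2) ** n
    # Citizens hold a majority: a uniformly random resident is executed,
    # then the mafia kill one citizen during the night.
    p_mafia_executed = Fraction(m, n)
    return (p_mafia_executed * mafia_win_probability(n - 2, m - 1)
            + (1 - p_mafia_executed) * mafia_win_probability(n - 2, m))


print(mafia_win_probability(3, 1))  # 2/3
print(mafia_win_probability(5, 1))  # 8/15
```

With one mafia member and an odd number of residents the recursion never reaches the tie case, so those values depend only on the random-execution rule; for an even number of residents the result also depends on the tie rule used here.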

(The philosopher’s st)one mafia member

Let us consider the game with only one mafia member. In order for this mafia member to win, in every day phase a citizen must be executed. Remember that day phases precede night phases. Therefore

$$w(n, 1) = \frac{n - 1}{n} \times \frac{n - 3}{n - 2} \times \cdots \times \frac{1 + n \pmod{2}}{2 + n \pmod{2}} = \frac{(n - 1)!!}{n!!}. \tag{f f}$$

Here $!!$ is the double factorial function (like factorial, but taking every other element). It’s worth noting that because $n \pmod{k}$ is the remainder when $n$ is divided by $k$, we have

$$n \pmod{2} = \begin{cases} 0 & \text{if } n \text{ is even} \\ 1 & \text{if } n \text{ is odd.} \end{cases} \tag{f f}$$

This highlights a rather interesting property: the dependence on the parity of the number of residents. This doesn’t seem unreasonable, because two players are killed each day, and whether there are an even or odd number of players does affect the proportion of players needed to have a clear majority. A quick calculation using equation (f f) shows that $w(2, 1) = 1/2$ and $w(3, 1) = 2/3$. In the second case, despite there being a greater proportion of citizens, the probability that the mafia win is actually higher. In fact, with a single mafia member, it is always true that adding an extra citizen to make the total number of players odd increases the mafia’s chance of winning. We can prove this by induction, as shown on the right.

So now we’ve shown that the mafia’s chance of winning is higher with an additional citizen making the total number of players odd, which I found pretty surprising. So you can only imagine how shocked I was when I learned that the parity of the number of players has such an effect that $w(9, 1) > w(4, 1)$. In fact, we can plot a graph of $w(n, 1)$ for odd and even $n$ using our expression in equation (f f) to highlight this:

[Figure: A graph of $w(n, 1)$ against $n$ for $n$ odd (green), and $n$ even (blue).]

Proof by induction

We use $w(2, 1)$ and $w(3, 1)$ as our base case. Our inductive hypothesis is that $w(2k + 1, 1) > w(2k, 1)$. Let’s now consider

$$w(2k + 3, 1) = \frac{2k + 2}{2k + 3}\, w(2k + 1, 1).$$

It is true in general that $(a + 1)/(a + 2) > a/(a + 1)$ for all non-negative $a$ (if you don’t believe me, you can try proving it yourself). Therefore

$$w(2k + 3, 1) > \frac{2k + 1}{2k + 2}\, w(2k + 1, 1).$$

Now we apply our inductive hypothesis to get

$$w(2k + 3, 1) > \frac{2k + 1}{2k + 2}\, w(2k, 1) = w(2k + 2, 1),$$

which is exactly what we want and completes our inductive step. Therefore $w(2n + 1, 1) > w(2n, 1)$ for all $n > 0$.
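The closed form for a single mafia member is easy to check numerically. The sketch below is illustrative only: the helper names `double_factorial` and `single_mafia_win` are assumptions, and the code simply evaluates $(n - 1)!!/n!!$ with exact fractions.

```python
from fractions import Fraction


def double_factorial(n: int) -> int:
    """n!! = n * (n - 2) * (n - 4) * ... down to 1 or 2 (with 0!! = 1)."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def single_mafia_win(n: int) -> Fraction:
    """Closed form w(n, 1) = (n - 1)!! / n!!."""
    return Fraction(double_factorial(n - 1), double_factorial(n))


print(single_mafia_win(2), single_mafia_win(3))   # 1/2 2/3
print(single_mafia_win(9) > single_mafia_win(4))  # True

# Adding one citizen to an even-sized game helps the lone mafia member.
print(all(single_mafia_win(2 * k + 1) > single_mafia_win(2 * k)
          for k in range(1, 50)))                 # True
```

Note that this closed form treats every day phase, including the one with only two residents left, as a uniformly random execution.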

Time to get Sirius

And now we reach the limit of what I can explain without writing a thesis. Luckily for me, Piotr Migdał has written a paper for his bachelor’s degree. There is one main result from that, extending the ideas above, which I’d like to share with you. It considers the case of there being multiple mafia members. In a similar way to how we derived $w(n, 1)$, Migdał shows that

$$w(n, m) = 1 - \sum_{i=0}^{m} \binom{m}{i} \frac{(n - i)!!}{n!!\,\bigl(n \pmod{2} - i\bigr)!!}.$$

Now observe that $(n - i)!!/(n - 1)!! \to 0$ as $i$ increases. Therefore only the first two terms of the sum contribute significantly. Hence we can write

$$w(n, m) \approx m\, \frac{(n - 1)!!}{n!!}. \tag{f f f}$$

To write this in a nicer form involves a few neat ideas. One fact that will prove very useful is that for any $k$, $(2k + 1)!! = (2k + 1)(2k - 1)!!$ (using the definition of double factorial above). We first consider the product of $w(2k + 1, m)$ and $w(2k, m)$. By equation (f f f), this gives:

$$w(2k + 1, m)\, w(2k, m) \approx \frac{m (2k)!!}{(2k + 1)!!} \times \frac{m (2k - 1)!!}{(2k)!!} = \frac{m (2k)!!}{(2k + 1)(2k - 1)!!} \times \frac{m (2k - 1)!!}{(2k)!!} = m^2\, \frac{1}{2k + 1}. \tag{f f f}$$

Now we’re going to look at $w(2k + 1, m)/w(2k, m)$, again using equation (f f f). This may seem a bit odd right now, but trust me on this one.

$$\frac{w(2k + 1, m)}{w(2k, m)} \approx \frac{m (2k)!!}{(2k + 1)!!} \div \frac{m (2k - 1)!!}{(2k)!!} = \frac{\bigl((2k)!!\bigr)^2}{(2k + 1)!!\,(2k - 1)!!} = \frac{\bigl((2k)!!\bigr)^2}{(2k + 1)\bigl((2k - 1)!!\bigr)^2} = \left(\frac{(2k)!!}{(2k - 1)!!}\right)^2 \frac{1}{2k + 1}. \tag{f f f f}$$

At this point, you’d be forgiven for having your doubts that I’m making anything more simple here. Fear not! It all becomes clear with the introduction of the Wallis formula:

$$\frac{\pi}{2} = \lim_{k \to \infty} \left(\frac{(2k)!!}{(2k - 1)!!}\right)^2 \frac{1}{2k + 1}.$$

We can now write equation (f f f f) in the limit $k \to \infty$ as

$$\frac{w(2k + 1, m)}{w(2k, m)} \approx \frac{\pi}{2}. \tag{f f f f}$$
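The Wallis limit is slow but easy to watch converge. As a rough numerical check (the function name `wallis_ratio` is an assumption for illustration), the sketch below builds $\bigl((2k)!!/(2k - 1)!!\bigr)^2/(2k + 1)$ one factor at a time, using the fact that it equals the product of $(2j)^2/\bigl((2j - 1)(2j + 1)\bigr)$ for $j = 1, \dots, k$.

```python
import math


def wallis_ratio(k: int) -> float:
    """((2k)!! / (2k - 1)!!)^2 / (2k + 1), computed as a running product."""
    value = 1.0
    for j in range(1, k + 1):
        value *= (2 * j) ** 2 / ((2 * j - 1) * (2 * j + 1))
    return value


# The values creep up towards pi / 2 as k grows.
for k in (1, 10, 100, 1000):
    print(k, wallis_ratio(k), math.pi / 2)
```

Working with the running product avoids the very large integers that the double factorials themselves would produce.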


Okay, we’re nearly there. I promise. The final clever idea requires considering even and odd $n$ separately. Write $n = 2k$ for $n$ even (where $k$ is a positive integer). For $n$ odd, write $n = 2k + 1$ (again for a positive integer $k$). Now here’s the magic. Write:

$$w(n, m) = \begin{cases} \left(\dfrac{w(2k + 1, m)}{w(2k, m)}\right)^{-1/2} \sqrt{w(2k + 1, m)\, w(2k, m)} & \text{for } n = 2k \text{ (ie } n \text{ is even)} \\ \left(\dfrac{w(2k + 1, m)}{w(2k, m)}\right)^{1/2} \sqrt{w(2k + 1, m)\, w(2k, m)} & \text{for } n = 2k + 1 \text{ (ie } n \text{ is odd).} \end{cases}$$

Substituting in our values from equations (f f f) and (f f f f) above gives

$$w(n, m) \approx \begin{cases} \left(\dfrac{\pi}{2}\right)^{-1/2} \sqrt{\dfrac{m^2}{2k + 1}} & \text{for } n = 2k \\ \left(\dfrac{\pi}{2}\right)^{1/2} \sqrt{\dfrac{m^2}{2k + 1}} & \text{for } n = 2k + 1. \end{cases}$$

This can be put into a single line by recalling equation (f f), and using the fact that for large $k$, $2k + 1 \approx 2k$:

$$w(n, m) \approx \left(\frac{\pi}{2}\right)^{n \pmod{2} - 1/2} \frac{m}{\sqrt{n}}. \tag{f f f f f}$$

So we finally have an approximate expression for $w(n, m)$. Phew. From this, it’s only a small step to calculate how many of the $n$ players need to be made mafia initially in order to give the two teams equal chance of winning. To do this, we set $w(n, m) = 1/2$ in equation (f f f f f). Hence we find the optimal value of $m$ is approximately

$$\frac{\sqrt{n}}{2} \left(\frac{\pi}{2}\right)^{1/2 - n \pmod{2}}.$$

This is particularly interesting as it means that when creating a game of mafia, you can choose the initial number of mafia to ensure that the mafia (or the citizens) are unlikely to win by luck alone, and some skill has to be involved. Unfortunately though, this gives no indication of what that skill should be.

[Figure: A fair game with 13 people (during a night phase): 1.44 mafia members (top) v 11.56 citizens (bottom). In a real game, however, it will probably be easier to round the number of people on each team to the nearest integer.]
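To see what the approximation in equation (f f f f f) implies for a given table size, it can be solved for $m$ directly. The sketch below is a minimal illustration under that approximation only; the function names `approx_mafia_win` and `fair_mafia_count` are assumptions, and the output is an estimate, not an exact game value.

```python
import math


def approx_mafia_win(n: int, m: float) -> float:
    """Approximate w(n, m) = (pi / 2)^(n mod 2 - 1/2) * m / sqrt(n)."""
    return (math.pi / 2) ** (n % 2 - 0.5) * m / math.sqrt(n)


def fair_mafia_count(n: int) -> float:
    """Value of m for which the approximation above equals 1/2."""
    return math.sqrt(n) / 2 * (math.pi / 2) ** (0.5 - n % 2)


# Fair (non-integer) mafia counts for a few table sizes.
for n in (12, 13, 20, 21):
    m = fair_mafia_count(n)
    print(n, round(m, 2), round(approx_mafia_win(n, m), 3))
```

Because the exponent changes sign with the parity of $n$, neighbouring values of $n$ can give noticeably different fair counts.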

One final thing that I’d like to point out is the effect of the parity of $n$ on $w(n, m)$ and the optimal value of $m$. We saw in the case of one mafia member how much of a difference it makes, and we see it again here. It becomes even clearer when we plot the optimal value of $m$ against $n$:

[Figure: A graph of the optimal value of mafia $m$ against $n$ for $n$ odd (green), and $n$ even (blue).]

So there we have it: I don’t have any surefire winning strategy to reveal to you. But in a way, that’s what I love so much about Mafia. Yes, maths can be used to play the game better, or to give an idea of how to structure a perfect game, but maths doesn’t give a way to guarantee you’ll win. It can inform wiser choices, but ultimately it comes down to how much you trust your friends. And if there’s one thing I’ve learnt playing Mafia, it’s that you can never truly trust your friends.

Sophie Maclean
Sophie has never been to UCL, is not a student, and is not a proper adult either. Sophie is the impostor.
@SophieTheMathmo

[Cartoon caption: “Prof. Fafner hoards his supply of Hagoromo!” Credit: Smitha Maretvadakethope]


Which number system are you?

[Flowchart: a quiz that begins at START and asks “How do you like your potatoes?”, “What number is on your taxi?”, “What is this?”, “Which hand do you write with?”, “How many mathematicians walk into the hotel bar?” and “What is 0.99999…?”. The possible outcomes are Roman numerals, hexadecimal, binary, Cistercian, tallies, Cuneiform, the p-adics, fractions, decimals, continued fractions and the surreals.]


On conditional probability: Cards, Covid, and Crazy Rich Asians

Madeleine Hall

I was watching the film Crazy Rich Asians the other day, as there’s not a lot to do at the moment besides watching Netflix and watching more Netflix. I thoroughly enjoyed the film and would highly recommend it. However, there was something that happened in the first few minutes which really got me thinking and inspired the subject of this Chalkdust article.

The main character in the film is an economics professor (the American kind where you can achieve the title of professor while still being in your 20s and without having to claw your way over a pile of your peers to the top of your research field). Within the opening scenes of the film we see her delivering a lecture, in which she is playing poker with a student, while also making remarks about how to win using ‘mathematical logic’. The bell rings seemingly halfway through the lecture the way it always does in American films and TV shows, and our professor calls out to her students “…and don’t forget your essays on conditional probability are due next week!”

Now, I am not going to delve into the question of what type of economics course she is teaching that involves playing poker and mathematical logic, but it got me thinking: what exactly would an ‘essay on conditional probability’ entail?

What is conditional probability?

Conditional probability is defined as a measure of the probability of an event occurring, given that another event (by assumption, presumption, assertion, or evidence) has already occurred. For all intents and purposes here, for two events $A$ and $B$, we’ll write the conditional probability of $A$ given $B$ as $P(A \mid B)$, and define it as

$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}.$$

The little bar $\mid$ can just be thought of as meaning ‘given’.

Now that we’ve got some technicalities out of the way, let’s look at some examples of conditional probability. Imagine you are dealt exactly two playing cards from a well-shuffled standard 52-card deck. The standard deck contains exactly four kings. What is the probability that both of your cards are kings? We might, naively, say it must be simply $(4/52)^2 \approx 0.59\%$, but we would be gravely mistaken. There are four chances that the first card dealt to you (out of a deck of 52) is a king. Conditional on the first card being a king, there remain three chances (out of a deck of 51) that the second card is also a king. Conditional probability then dictates that:

$$P(\text{both are kings}) = P(\text{second is a king} \mid \text{first is a king}) \times P(\text{first is a king}) = \frac{3}{51} \times \frac{4}{52} \approx 0.45\%.$$

The events here are dependent upon each other, as opposed to independent. In the realm of probability, dependency of events is very important. For example, coin tosses are always independent events. When tossing a fair coin, the probability of it landing on heads, given that it previously landed on heads 10 times in a row, is still $1/2$. Even if it lands on heads 1000 times, the chance of it landing on heads on the 1001st toss is still 50%.
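The two-kings calculation can be checked both with the conditional-probability product and by brute force. The sketch below is only an illustration: the deck is modelled as the numbers 0 to 51, and treating cards 0 to 3 as the four kings is an arbitrary labelling chosen for the example.

```python
from fractions import Fraction
from itertools import permutations

# The naive answer treats the two cards as independent draws.
naive = Fraction(4, 52) ** 2

# Conditional probability: P(second king | first king) * P(first king).
both_kings = Fraction(3, 51) * Fraction(4, 52)

# Brute force over every ordered pair of distinct cards.
kings = {0, 1, 2, 3}
pairs = list(permutations(range(52), 2))
favourable = sum(1 for first, second in pairs if first in kings and second in kings)
brute_force = Fraction(favourable, len(pairs))

print(f"naive: {float(naive):.2%}, conditional: {float(both_kings):.2%}")
# naive: 0.59%, conditional: 0.45%
print(both_kings == brute_force)  # True
```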

Bayes’ theorem

Any essay on conditional probability would be simply incomplete without a mention of Bayes’ theorem. Bayes’ theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event. It is stated mathematically as:

Bayes’ theorem

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}.$$


We can derive Bayes’ theorem from the definition of conditional probability above by considering $P(A \mid B)$ and $P(B \mid A)$, and using that $P(A \text{ and } B)$ equals $P(B \text{ and } A)$.

A fun (and topical!) example of Bayes’ theorem arises in a medical test/screening scenario. Suppose a test for whether or not someone has a particular infection (say scorpionitis) is 90% sensitive, or equivalently, the true positive rate is 90%. This means that the probability of the test being positive, given that someone has the infection, is 0.9, or $P(\text{positive} \mid \text{infected}) = 0.9$. Now suppose that this is a somewhat prevalent infection, and 6% of the population at any given time are infected, ie $P(\text{infected}) = 0.06$. Finally, suppose that the test has a false positive rate of 5% (or equivalently, has 95% specificity), meaning that 5% of the time, if a person is not infected, the test will return a positive result, ie $P(\text{positive} \mid \text{not infected}) = 0.05$.

Now imagine you take this test and it comes up positive. We can ask, what is the probability that you actually have this infection, given that your test result was positive? Well,

$$P(\text{infected} \mid \text{positive}) = \frac{P(\text{positive} \mid \text{infected})\, P(\text{infected})}{P(\text{positive})}.$$

We can directly input the probabilities in the numerator based on the information provided in the previous paragraph. For the $P(\text{positive})$ term in the denominator, this probability has two parts to it: the probability that the test is positive and you are infected (true positive), and the probability that the test is positive and you are not infected (false positive). We need to scale these two parts according to the group of people that they apply to: either the proportion of the population that are infected, or the proportion that are not infected. Another way of thinking about this is considering the fact that $P(\text{positive}) = P(\text{positive and infected}) + P(\text{positive and not infected})$. Thus, we have

$$P(\text{positive}) = P(\text{positive} \mid \text{infected})\, P(\text{infected}) + P(\text{positive} \mid \text{not infected})\, P(\text{not infected}).$$

And we can infer all the probabilities in this expression from the information that’s been given. Thus, we can work out that

$$P(\text{infected} \mid \text{positive}) = \frac{0.9 \times 0.06}{0.9 \times 0.06 + 0.05 \times 0.94} \approx 0.5347.$$

Unpacking this result, this means that if you test positive for an infection, and if 1 in 17 people in the population (approximately 6%) are infected at any given time, there is an almost 50% chance that you are not actually infected, despite the test having a true positive rate of 90%, and a false positive rate of 5% (compare to the proportion of the shaded area in the diagram filled by infected people). That seems pretty high. Here are some takeaways from this example: the probability that you have an infection, given that you test positive for said infection, not only depends on the accuracy of the test, but it also depends on the prevalence of the disease within the population.

[Figure: In a population with 6% infected, this test will come back positive in 10.1% of cases.]
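The calculation above is short enough to wrap in a function, which makes it easy to try other test characteristics. This is a minimal sketch using the scorpionitis numbers from the text; the function name `posterior_infected` and its parameter names are assumptions chosen for illustration.

```python
def posterior_infected(sensitivity, false_positive_rate, prevalence):
    """Return (P(infected | positive), P(positive)) using Bayes' theorem."""
    true_positive = sensitivity * prevalence
    false_positive = false_positive_rate * (1 - prevalence)
    p_positive = true_positive + false_positive
    return true_positive / p_positive, p_positive


posterior, p_positive = posterior_infected(
    sensitivity=0.9, false_positive_rate=0.05, prevalence=0.06
)
print(f"P(positive) = {p_positive:.3f}")             # P(positive) = 0.101
print(f"P(infected | positive) = {posterior:.4f}")   # P(infected | positive) = 0.5347
```

Changing only `prevalence` while keeping the other two arguments fixed shows how strongly the answer depends on how common the infection is.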


Unprecedented applicability

Of course, in a real-world scenario, it’s a lot more complicated than this. For something like (and, apologies in advance for bringing it up) Covid-19, the prevalence of infection (our $P(\text{infected})$ value) changes with time. For example, according to government statistics, the average number of daily new cases in July 2020 was approximately 667, whereas in January 2021 it was 38,600. Furthermore, $P(\text{infected})$ depends on a vast number of factors including age, geographical location, and physical symptoms to name only a few. Still, it would be nice to get a sense of how Bayes’ theorem can be applied to these uNpReCeDeNtEd times.

An article from the UK Covid-19 lateral flow oversight team (catchy name, I know) released on 26 January 2021 reported that lateral flow tests (which provide results in a very short amount of time but are less accurate than the ‘gold standard’ PCR tests) achieved 78.8% sensitivity and 99.68% specificity in detecting Covid-19 infections. In the context of probabilities, this means that $P(\text{positive} \mid \text{infected}) = 0.788$, and $P(\text{positive} \mid \text{not infected}) = 0.0032$.

[Figure: If 3% are infected with Covid-19, a lateral flow test will come back positive 2.7% of the time.]

On 26 January 2021, there were 1,927,100 active cases of Covid-19 in the UK. Out of a population of 66 million, this gives us a prevalence of approximately 3%, or $P(\text{infected}) = 0.03$. Taking all these probabilities into account, we have

$$P(\text{infected} \mid \text{positive}) = \frac{0.788 \times 0.03}{0.788 \times 0.03 + 0.0032 \times 0.97} \approx 0.8839,$$

which means that the chances of you actually having Covid-19, given that you get a positive result from a lateral flow test, are about 88%. This seems pretty good, but can we make this any better?

Instead of just taking the number of active cases as a percentage of the total population of the UK to give us our prevalence, we can alternatively consider $P(\text{infected})$ for a particular individual. For someone who has a cough, a fever, or who recently interacted with someone who was then diagnosed with Covid-19, we could say that their $P(\text{infected})$ is substantially higher than the overall prevalence in the country. The article Interpreting a Covid-19 test result in the BMJ suggests a reasonable value for such an individual would be $P(\text{infected}) = 0.8$. It’s worth mentioning that this article has a fun interactive tool where you can play around with sensitivity and specificity values to see how this affects true and false positivity and negativity rates. Taking this new value of prevalence, $P(\text{infected})$, into account, then

$$P(\text{infected} \mid \text{positive}) = \frac{0.788 \times 0.8}{0.788 \times 0.8 + 0.0032 \times 0.2} \approx 0.9990,$$

giving us a 99.9% chance of infection given a positive test result, which is way closer to certainty than the previous value of 88%.

[Figure: If the prevalence increases to 80%, you can be much more certain of a positive result, but there are also more false negatives.]

Can we do any better than this? Well, compared with the lateral flow Covid-19 tests, it has been found that PCR tests (which use a different kind of technology to detect infection) have substantially higher sensitivity and specificity. Another recent article in the BMJ published in September 2020 reported that the PCR Covid-19 test has 94% sensitivity and very close to 100% specificity. In a survey conducted by the Office for National Statistics in the same month, they measured how many people across England and Wales tested positive for Covid-19 infection at a given point in time, regardless of whether they reported experiencing symptoms. In the survey, even if all positive results were false, specificity would still be 99.92%. For the sensitivity and specificity reported in the BMJ article, this is equivalent to having a false negative rate of 6% and a false positive rate of 0%. If we plug these numbers in, regardless of what the prevalence is taken to be, we have:

$$P(\text{infected} \mid \text{positive}) = \frac{0.94 \times P(\text{infected})}{0.94 \times P(\text{infected}) + 0 \times P(\text{not infected})} = 1.$$

So when a test has a false positive rate of almost 0%, if you achieve a positive test result, there is essentially a 100% chance that you do in fact have Covid-19.

So what can we take away from this? Well, we have seen that if a test has higher rates of sensitivity and specificity, then the probability of the result being a true positive is also higher. However, prevalence and the value of the probability of infection also play a big role in this scenario. This could be used as an argument for targeted testing only, for example if only people with symptoms were tested then this would increase the probability of the result being a true positive. Unfortunately, it is the case that a large number of Covid-19 infections are actually asymptomatic: in one study it was found that over 40% of cases in one community in Italy had no symptoms. So, if only people with symptoms were tested, a lot of infections would be missed.

In conclusion, I’m no epidemiologist, just your average mathematician, and I don’t really have any answers. Only that conditional probability is actually pretty interesting, and it turns out you can write a whole essay on it. The ending of Crazy Rich Asians was much better than the ending to this article. Go watch it, if you haven’t already.

Bae’s theorem

$$P(\text{Netflix} \mid \text{chill}) = \frac{P(\text{chill} \mid \text{Netflix})\, P(\text{Netflix})}{P(\text{chill})}$$

Madeleine Hall
Madeleine is a pHd StUdEnT in mathematics and behavioural genomics at Imperial College London. She likes open water swimming, toast, the Oxford comma, and tHiS mEmE. She has found none of her optimised strokes of any use in the Serpentine.
@maddygracehall


THE BIG ARGUMENT

THIS ISSUE… IS THE EINSTEIN SUMMATION CONVENTION WORTH IT?

The Einstein summation convention is a way to write and manipulate vector equations in many dimensions. Simply put, when you see repeated indices, you sum over them, so $\sum_{i=1}^{N} a_i b_i$ is written $a_i b_i$ for example.

YES, ARGUES ELLEN JOLLEY

This debate boils down to just one question: how much of your life do you spend doing tensor algebra? Those of us who undertake a positive amount of tensor algebra or vector calculus know that the goal is to be done with it as fast as possible! Try tensor algebra for even five minutes without using the summation convention. I promise you will tire of constantly explaining “yes, the sum still starts from $1$, and yes, it still goes to $N$.”

You’ll scream, “All of them! I am summing over all indices! Obviously! Why’d I ever skip some??” If you’re confused how many you’ve got, use this simple guide: physicists use four; fluid dynamicists use three; and Italian plumbers use two. Wouldn’t it be nice to avoid saying this in every equation?

You may cry that it’s easier to make mistakes with the convention; but for applied mathematicians, the joy comes in speeding ahead to the answer by any means: time spent on accuracy and proof is time wasted. And as the great mathematician Bob Ross said: there are no mistakes, just happy little accidents!

NO, ARGUES SOPHIE MACLEAN

Before writing this argument, I had to Google ‘summation convention’ which is all the evidence I need for why it’s just not worth it. I’ve learnt how to use the convention, multiple times! In fact, I’d say it’s something I’m able to use, yet I’m still not sure I know exactly what it is. Some of our readers won’t have ever heard of it (which is one strike against it). Some have heard of it but won’t know much about it (another strike). But I guarantee none would be confident saying they can use it without making any errors (if you think you would be, you’re in denial).

We don’t even have need for the convention! We already have a suitable way to notate summation: $\sum$

It’s taught to schoolkids. There is no ambiguity. And it’s so much less pretentious. Yes, the summation convention is fractionally faster to write out, but mathematicians are famed for being lazy and aloof. Maybe dispensing with it is all we need to break that stereotype!


Moonlighting agony uncle Professor Dirichlet answers your personal problems. Want the prof’s help? Contact deardirichlet@chalkdustmagazine.com

Dear Dirichlet,
As a successful author on spies who are also fish, I’m looking to branch out a little. What with the number of streaming platforms, I’m hoping I can get a TV company to make my series of novels into a ten-episode drama. But it feels like a buyers’ market: how can I hook a producer? Let minnow!
– Micholas Herron, Oxford

dirichlet says: May I recommend the school market. Each year there is a new set of 7-year-olds looking to be entertained. For example, I am about to pitch the BBC my Downton Abbey / second world war / great railway infrastructure crossover series for children, with all the characters played by simple 3D shapes. I have already written to Cube-onneville, Dame Sphera Lynne and Prismbard Kingdom Brunel. (Still waiting for a reply from the latter two; Cube’s on board.) ■

Dear Dirichlet,

For my birthday this year my partner bought me some new jeans. But when I get them out of the wardrobe, I always find matchsticks in the pockets! During the day, my partner will come over, put his hand in one of my pockets, and pull out some of the matches! Am I supposed to do the same? Does he not remember the moral from 1990s public information film Frances the Firefly?... Never play with matches!
– The Wrong Trousers, Wigan

dirichlet says: Félicitations! You’ve been given the latest in French fashion: couture deNim! But also com-misère-ations: nobody’s going to remove the last matchstick for you. If you’re happy to play along, sew up all but two pockets and keep the sticks in each pocket equal. Failing that, I suggest an eXORcism to heal these obviously cursed jeans. A word to the wise: run away if your partner offers you chocolate where you are only allowed to eat squares if you also eat those that are below it and to its right. ■

Dear Dirichlet,
I’m putting on a hilarious satirical political play at the Zoom theatre next month but I am having difficulty finding the right actors for the job. One brilliant scene involves multiple copies of the chancellor of the duchy of Lancaster chattering over each other. Genius! Not sure why I can’t find any faces so far.
– Kimberly Donglesworth, Newcastle

dirichlet says: In general, when drawing up your CAST, one should go anticlockwise from the fourth quadrant. But anyway, pop along to your local colliery and see if you can convince a square number of employees there to stand on an oversized chessboard. Ask everyone on a white square to stand on their heads. Bob’s your uncle! Your matrix of miners has become a matrix of ‘Gove actors’. ■

Dear Dirichlet,

some business from is running an event to drum up It’s all gone wrong! Our village stsβ€”the attracted the wrong sort of gue ’ve we but k, par ari saf al loc visitors to the village’s central sts are marauding over the the bea ge Hu d! ape esc e hav ls anima we get rid of the d for the festival tents. How can grass area, which we repurpose β€” Ray & Dave, Devon ls! invaders? We don’t have the skil

dirichlet says: Sounds like a mammoth task Β­ but ivorything’s going to be OK. Given that you’ve already set up the village Green’s funcΒ­ tion, just keep the noise down: you want the volume nice and discrete. Then naturally the animals should head to the village outskirts. I call this... the boundary elephant method. (Pass my regards to the catering team: as Hank β€˜Hankie’ Williams used to say, β€œHey Galerkin, what you got cookin’?”) β– 

Dear Dirichlet, Over lockdown I have become a bit of a Twitter celebrity. How do I capitalise on my success?

– Brabara Barrington, Wellington

dirichlet says: ON MY SUCCESS. ■

More Dear Dirichlet, including seasonal specials, online at chalkdustmagazine.com


How to be the least popular American president

Brian Copeland, CC BY 2.0

Francisco Berkemeier

Some people say the US presidential election system is unfair, since one candidate can win the popular vote (meaning more people vote for that candidate than for any other) but still fail to win the election. This means that the difference between the number of votes for each candidate is irrelevant to the election outcome, in the sense that if you didn’t count the extra votes, the result would be the same.

This is the result of how the electoral system is designed: the presidency is not determined by the popular vote, but by a system called the electoral college, which distributes 538 electoral college votes among the 50 states and DC. A state’s electoral votes are equal to the number of representatives and senators the state has in Congress. House seats are apportioned based on population and so are representative of a state’s population, but then the extra two Senate seats per state give smaller states more power in an election.

Martin Falbisoner, CC BY-SA 3.0
Electoral college votes correspond to seats in Congress, plus three additional votes for DC.

The electoral college is supposed to guarantee that populous states can’t dominate an election, but it also sets up a disparity in representation by misrepresenting every state. As a result, five times since the founding of the republic a president has won an election without winning the popular vote. Let me invite you to a thought experiment on the implications of such a system in an extreme scenario.


How to win the presidency with only 22% of the vote

We could ask how much candidate L (loser) can win the popular vote by and still lose the election. A possible strategy is to first let candidate W (winner) marginally win enough states to guarantee at least 270 electoral votes. Then, in the remaining states, award candidate L all of the available votes in those states. Schematically,

• If W wins a state, they win it with one or two more votes than L (depending on the parity of the total number of voters);
• If L wins a state, they get 100% of the votes from that state.

In fact, this is the optimal strategy, since in the states where candidate W wins, the popular vote difference is negligible, and the remaining states only increase the popular vote for candidate L, which is what we want. Any other vote-per-state distribution would decrease the popular vote difference.

With our maximising strategy chosen, the question then becomes: how should we distribute the states between the two candidates? The best way to solve this problem is to use linear programming. This method is used to optimise a certain outcome (for example, maximising profit or minimising costs) given certain restrictions that are represented by linear relationships. In our case, we want to minimise the popular vote for W given that the total number of electoral votes they win is greater than or equal to 270. Notice that with the strategy mentioned above, this is exactly the same question as maximising the popular vote difference. In fact, W wins with precisely 270 electoral votes.

Considering maximal turnout rates in this extreme scenario, assume there are 214 million people voting. The calculations then tell us that L wins the popular vote with roughly 122m more votes than W. This is more than three times the population of Canada! If 57% of the votes weren’t cast, the result would remain the same. Furthermore, candidate L gets 168m votes, which is approximately 78% of the total votes, and still loses! The electoral map in such a situation is below. See the table in the online version of this article for a detailed breakdown of the voting.

      EV     PV
W     270    46m (22%)
L     268    168m (78%)

Map data ©OpenStreetMap contributors

US map in an extreme election scenario with two candidates and their respective electoral votes (EV) and popular votes (PV) in millions. The winner W is in yellow. Registered voters data from worldpopulationreview.com
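The two-candidate optimisation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author’s model: it swaps the linear programming solver for a small knapsack-style dynamic programme, the function name `min_votes_to_win` is hypothetical, and the four "states" are invented toy data rather than real US figures.

```python
def min_votes_to_win(states, threshold):
    """Fewest popular votes W needs to collect at least `threshold` electoral votes.

    `states` is a list of (electoral_votes, voters) pairs. W carries a state
    with a bare majority of its voters; L takes every vote everywhere else.
    """
    INF = float("inf")
    # best[e] = fewest popular votes that secure at least e electoral votes
    best = [0] + [INF] * threshold
    for electoral_votes, voters in states:
        cost = voters // 2 + 1  # one or two votes more than L, depending on parity
        # Work downwards so that each state is counted at most once
        for e in range(threshold, 0, -1):
            candidate = best[max(0, e - electoral_votes)] + cost
            if candidate < best[e]:
                best[e] = candidate
    return best[threshold]


# Invented toy data (NOT real US figures): (electoral votes, voters)
toy_states = [(3, 100), (4, 200), (5, 300), (10, 1000)]
toy_threshold = 12  # a majority of the 22 toy electoral votes

w_votes = min_votes_to_win(toy_states, toy_threshold)
total_voters = sum(voters for _, voters in toy_states)
print(w_votes, total_voters)  # 303 1600
```

In this toy example W wins by carrying the three smallest states with 303 of the 1,600 votes, about 19% of the total, which mirrors the way small states are favoured in the full-size problem.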

Usually, electoral votes more or less align with the popular vote. However, a number of times in US history, the person who took the White House did not receive the most popular votes. Our scenario is obviously extreme, but it is mathematically possible and raises the questions: should someone who only gets 22% of the popular vote really be the president? Should the US have a system that allows the possibility of over 100 million voters being irrelevant? Is that really fair?


More than two candidates

Another curious case to consider is the mathematical consequence of having more than two candidates running for the presidency, as seen for example in the electoral college systems employed by Germany or India. In the US, even though typically there are other candidates running with other parties or independently, the race usually comes down to two sides. Assuming a tight race between n candidates, we can explore various questions within the same extreme scenario. For instance, if n candidates run, what is the maximum popular vote difference between candidate W (who wins the election with the most electoral college votes) and the total popular vote of candidates L_1, …, L_{n−1}? Furthermore, is it possible to make every single other candidate win more popular votes than W and still lose?

In reality, the US election system has a separate process, called a contingent election, to decide the presidency if no candidate gains more than half of the electoral college votes. However, in this scenario any of the top three candidates according to the electoral vote could win, and so this process isn’t amenable to simple mathematical modelling.

Let’s consider the case of three candidates. Running the model shows that this is indeed possible, resulting in:

       EV     PV
W      181    19m (9%)
L_1    178    97m (45%)
L_2    179    98m (46%)

Map data ©OpenStreetMap contributors

US map in an extreme election scenario with 3 candidates.

In this case, W wins with only 9% of the popular vote, while candidates L_1 and L_2 get 45% and 46% of the popular vote, respectively. Notice that, in the states that W wins, each candidate gets roughly 1/3 of the votes in that state (for n candidates, they would get roughly 1/n), with W marginally winning, and in the remaining states, the winning candidate still gets 100% of the votes. This could naturally be distributed differently, since there is now more than one losing candidate, but we have kept the same idea as before for simplicity.

       EV    PV
W      92    4m (2%)
L_1    89    43m (20%)
L_2    89    43m (20%)
L_3    90    42m (19%)
L_4    89    41m (19%)
L_5    89    42m (20%)

Map data ©OpenStreetMap contributors

Allowing more candidates results in bizarre, yet possible, popular vote differences for different numbers of candidates.
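To make "marginally winning" a state concrete when there are n candidates, here is a small sketch. It assumes, as in the simplified scenario above, that the rivals split the remaining votes as evenly as possible; the function name `min_votes_to_carry` is made up for this illustration and is not part of the author’s model.

```python
def min_votes_to_carry(voters, n_candidates):
    """Fewest votes that strictly beat every rival in a single state.

    Assumes the other n_candidates - 1 candidates share the remaining votes
    as evenly as possible, so the strongest rival gets the ceiling of the
    even share.
    """
    rivals = n_candidates - 1
    for w in range(voters + 1):
        remaining = voters - w
        strongest_rival = -(-remaining // rivals)  # integer ceiling division
        if w > strongest_rival:
            return w
    return None  # only reached when there are no voters at all


for n in (2, 3, 6):
    print(n, min_votes_to_carry(600, n))
```

For a state with 600 voters this gives 301, 201 and 101 votes for 2, 3 and 6 candidates: just over 1/n of the votes each time, which is why W’s share of the popular vote shrinks so quickly as n grows.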


Solving the minimisation problem for larger values of n is still possible and yields more interesting results. The map above shows the outcome of an election with six candidates, where the winner gains the smallest popular vote, and the graph to the right shows the maximal popular vote difference up to n = 8. For each n, every losing candidate gains a higher popular vote, but fewer (or equal) electoral votes than the winner. The fate of some states seems not to change with n. Interestingly, when n = 6, W wins with only 2% of the popular vote! You might also notice that as the number of candidates increases, more or less every vote becomes irrelevant.

[Graph: popular vote difference (millions, axis from 100 to 220) against number of candidates (n) from 2 to 8, with the total eligible voting population marked.]

Maximal popular vote difference for between 2 and 8 candidates, effectively the number of irrelevant votes.

Disenfranchisement laws

We can also study the impact of felony disenfranchisement laws that prevent millions of Americans from voting due to their felony convictions. Rates of disenfranchisement vary dramatically by state due to broad variations in voting prohibitions. For example, in 27 states felons lose their voting rights only while incarcerated, and receive automatic restoration upon release or after a period of time. In 11 other states, voting rights are lost indefinitely for some crimes, while in three states (namely DC (OK, not technically a state), Maine, and Vermont), felons never lose their right to vote, even while they are incarcerated. As of 2020, some of the key numbers can be summarised as follows:

• An estimated 5.2 million people are disenfranchised due to a felony conviction.
• One out of 44 adults (2.3% of the total eligible US voting population) is disenfranchised due to a current or previous felony conviction.
• The disenfranchisement distribution across correctional populations goes as follows: post-sentence (43%), prison (24%), probation (22%), parole (10%) and other (1%).

Map data ©OpenStreetMap contributors

(a) Felony disenfranchisement rates (%), 2020.

(b) The results: W is yellow, L is green.

US map in an extreme election scenario without disenfranchisement laws.



In this case, it is naturally interesting to study the disenfranchisement rates per state, as seen in heat map (a). The map represents the disenfranchised population as a percentage of the adult voting eligible population in each state. Assuming again an extreme scenario where every felon can vote, we can redo the optimisation problem with these new numbers. Map (b) shows the results under this assumption. In this case, W wins with 21% of the popular vote, while candidate L gets 79%, which is not that different from the previous case. We could then conclude that taking felon votes into account doesn’t dramatically change this extreme scenario. That’s not to say that disenfranchisement is completely irrelevant: four states changed fate, namely Indiana, Missouri, Maryland and Georgia. Of course, a more sophisticated model which accounts for the political landscape in the US may find that one party is more affected by felony voters than another.

Final thoughts

Naturally, there are many other parameters and assumptions that could be included in our testing, but I suspect that there will still be the possibility of candidates winning with much less than a majority of the popular vote. For instance, what would happen if DC or Puerto Rico became a state? Depending on the impact such a change would have on the electoral votes attributed to each case, perhaps a different state distribution would emerge, but the overall disparity in an extreme scenario should persist. We could even extend these ideas to other election systems and challenge them by considering extreme scenarios.

The main goal of this article was to highlight and discuss, in an overly dramatic manner, some of the issues with the US electoral system from a purely mathematical perspective. The model only looks at the implications of the electoral college in an extreme scenario, but I hope it is a starting point to think about why the system works in the way that it does, and perhaps how it could be adjusted to avoid the possibility of such unrepresentative outcomes. Again, I stress that the oversimplification in the model does not do justice to the complex world that is politics, but, if nothing else, it reveals some striking consequences of the US electoral system.

Francisco Berkemeier
Francisco is a mathematician born in China and raised in Portugal. After spending two years in the deserts of Saudi Arabia doing a master’s at KAUST, he’s now a PhD student in mathematics and biology at University College London. He can be found playing the classical guitar and ‘singing’ whenever cells and maths allow him to.

@fpberkemeier  franciscoberkemeier

Did you know...
…that a graph is non-planar if and only if it contains a subgraph that is homeomorphic to either K_5 or K_{3,3}.

On the cover

Cellular automata

Matthew Scroggs

The game of life, invented by John Conway (see pages 56–57) in 1970, is perhaps the most famous cellular automaton. Cellular automata consist of a regular grid of cells (usually squares) that are (usually, see page 38) either ‘on’ or ‘off’. From a given arrangement of cells, the state of each cell in the next generation can be decided by following a set of simple rules. Surprisingly complex patterns can often arise from these simple rules. While the game of life uses a two-dimensional grid of squares for each generation, the cellular automaton on the cover of this issue of Chalkdust is an elementary cellular automaton: it uses a one-dimensional row of squares for each generation. As each generation is a row, subsequent generations can be shown below previous ones.

Elementary cellular automata

In an elementary cellular automaton, the state of each cell is decided by its state and the state of its two neighbours in the previous generation. An example of such a rule is shown below: in this rule, a cell will be on in the next generation if it and its two neighbours are on–off–on in the current generation. A cellular automaton is defined by eight of these rules, as there are eight possible states of three cells.

1 0 1
  1

An example rule


In 1983, Stephen Wolfram proposed a system for naming elementary cellular automata. If on cells are 1 and off cells are 0, all the possible states of three cells can be written out (starting with 1 1 1 and ending 0 0 0). The states given to each middle cell in the next generation give a sequence of eight ones and zeros, or an eight-digit binary number. Converting this binary number into decimal gives the name of the rule. For example, rule 102 is shown below.

Current state:  1 1 1   1 1 0   1 0 1   1 0 0   0 1 1   0 1 0   0 0 1   0 0 0
Next state:       0       1       1       0       0       1       1       0

Rule 102: so called because (0)1100110 is 102 in binary
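Wolfram’s naming scheme is easy to turn into code. The sketch below is an illustration only (the function name `rule_table` is made up here and is not taken from the author’s repository): it writes the rule number as eight binary digits and pairs them with the eight neighbourhoods, from 1 1 1 down to 0 0 0. It assumes the rule number is between 0 and 255.

```python
def rule_table(rule_number):
    """Map each three-cell neighbourhood (as a string) to the next state of its middle cell."""
    bits = format(rule_number, "08b")  # eight binary digits, e.g. 102 -> "01100110"
    neighbourhoods = [format(i, "03b") for i in range(7, -1, -1)]  # "111" down to "000"
    return {cells: int(bit) for cells, bit in zip(neighbourhoods, bits)}


print(rule_table(102))
```

Reading off the dictionary for 102 reproduces the table above: for instance, "110" maps to 1 and "111" maps to 0.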

Rule 102 is, in fact, the rule that created the pattern shown on the cover of this issue of Chalkdust. To create a pattern like this, first start with a row of squares randomly assigned to be on or off:

1 1 0 1 0 0 0 1 0 1

You can then work along the row, working out whether the cells in the next generation will be on or off. To fill in the end cells, we imagine that the row is surrounded by an infinite sea of zeros. Working from the left, 0 0 1 gives 1, then 0 1 1 gives 0, then 1 1 0 gives 1, and so on until you get the full second generation:

1 0 1 1 1 0 0 1 1 1 1
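The row-by-row procedure can be sketched as a short function. This is a minimal illustration rather than the author’s published code: the name `step` and the choice to return a row that is one cell wider on each side are assumptions made here to mirror the "infinite sea of zeros".

```python
def step(row, rule_number):
    """Return the next generation of an elementary cellular automaton.

    Cells outside `row` are treated as 0, and the returned row is one cell
    wider on each side than the input.
    """
    padded = [0, 0] + list(row) + [0, 0]
    next_row = []
    for i in range(1, len(padded) - 1):
        # Read the three cells above as a binary number between 0 and 7
        neighbourhood = 4 * padded[i - 1] + 2 * padded[i] + padded[i + 1]
        # Bit number `neighbourhood` of the rule number is the new state
        next_row.append((rule_number >> neighbourhood) & 1)
    return next_row


first = [1, 1, 0, 1, 0, 0, 0, 1, 0, 1]
print(step(first, 102))
```

For the starting row in the text, this gives 1 0 1 1 1 0 0 1 1 1 1, followed by a trailing 0 for the extra right-hand cell, which stays off under rule 102. Calling `step` repeatedly on its own output builds up the picture one row at a time.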

If you continue adding rows, and colour in some of the regions you create, you will eventually get something that looks like the picture to the right. It’s quite surprising that such simple rules can lead to such an intricate pattern. In some parts, you can see that the same pattern repeats over and over, but in other parts the pattern seems more chaotic.


The pattern gets a square wider each row. This is due to the state 001 being followed by 1: each new 1 from this rule will lead to another 1 that is one square further left. But just when you think you’re getting used to the pattern of some small and some slightly larger triangles...

Surprise! There’s this huge triangle that appears out of nowhere.

Other rules

Rule 102 is of course not the only rule that defines a cellular automaton: there are 256 different rules in total. Some of these are particularly boring. For example, in rule 204 each generation is simply a copy of the previous generation. Rule 0 is a particularly dull one too, as after the first generation every cell will be in the off state.

Current state:  1 1 1   1 1 0   1 0 1   1 0 0   0 1 1   0 1 0   0 0 1   0 0 0
Next state:       1       1       0       0       1       1       0       0

Rule 204 is one of the most boring rules as each new cell is a copy of the cell directly above it.

Some other rules are more interesting. For example, rules 30 and 150 make interesting patterns.

100 rows of rules 30 (left) and 150 (right) starting with a row of 100 cells in a random state

If you want to have a go at creating your own cellular automaton picture, you can find a template to fill in on the inside back cover of this issue of Chalkdust. If you’d rather get a computer to do the colouring for you, you can download the Python code I wrote to create the pictures in this article from github.com/mscroggs/cellular-automata and try some rules out.

There are also many ways that you can extend the ideas to create loads of different automata. For example, you could allow each cell to be in one of three states (‘on’, ‘off’, or ‘f’) instead of the two we’ve been allowing. You could then choose a rule assigning one of the three states to each of the 27 possible configurations that three neighbouring three-state cells could be in. But there are 3^27 = 7,625,597,484,987 different automata you could make in this way, so don’t try to draw them all...

Matthew Scroggs
Matthew is a postdoctoral researcher at the University of Cambridge. He hasn’t had time to play Klax since the noughties, but he’s pretty sure that Coke is it!

mscroggs.co.uk  @mscroggs  mscroggs

My favourite game

Velocity Raptor by TestTubeGames

Jakob Stein

An online JavaScript (formerly Flash) game about special relativity, it’s a fun way to wrap your head around geometry that changes as you move. E/mc²

Who is the best England manager?

Flickr user Eric Kilby, CC BY-SA 2.0

Paddy Moore

My quest to answer this simple question began for the noblest of reasons: to win an argument with my wife. We are both football fans and have been following England all our lives (the phrase ‘long-suffering’ has never been more apt). Our house is usually an oasis of calm and tranquillity, but one thing is guaranteed to get things kicking off: was Sven-Göran Eriksson a good England manager? This year, we have had more time than usual at home together and the discussion has become heated. I believe that Sven took a golden generation of England players and led them to disappointing performances in three major tournaments, and she points to the team reaching the last 8 at consecutive World Cups under his stewardship. I’ve been a maths teacher for over 25 years, so surely I can prove I am correct using maths, right?[1]

[1] At this point, it is definitely not worth mentioning that in the 15 years we’ve been having this row, I never thought of applying any maths to the problem until my wife suggested it.

Anders Henrikson, CC BY 2.0
Sven-Göran Eriksson wondering where his hat has gone.

How do I prove that I am right?

Well, the first, and most obvious, thing to do is to look at the playing record of Sven-Göran Eriksson. He was manager of England from January 2001 until July 2006. In that time the team played 67 games and won 40 of them, a win percentage of 59.7%. On its own that is a bit meaningless, so we need something to compare it to. It’s time for a spreadsheet.

I am going to tidy up the data a little though. Firstly, up to 1946, there was no England coach. Even under Walter Winterbottom, the players were selected by committee, so I am going to exclude them. Secondly, caretaker managers like Stuart Pearce or Joe Mercer (or Sam Allardyce, who was sacked after one game for ‘reasons’) didn’t have enough games and so I’ll drop them from consideration. That gives us the trimmed list below, ranked by winning percentage. This shows that Sven had a pretty average record, only just reaching the top half of the table.

Tonywalt, CC BY-SA 3.0
Walter Winterbottom

I was feeling suitably pleased with myself for having come up with such a convincing statistic, only to be shot down with, “Yeah, but a lot of those games were meaningless friendlies.” I mean, you could argue that playing for England in any game is the pinnacle of a footballer’s career, and that international friendlies are always important games, but I decided to look at this as it seemed interesting (and I was confident that it would support my point even more).

Since we were looking at competitive internationals, I decided to look at the overall results record rather than using just the win percentage. After all, there are three possible outcomes in football and a draw has value (although this value varies with the opponent: a draw against Brazil is generally seen as a fairly decent result, whereas a draw against Greece is not).

Manager               Years        P     W    %
Fabio Capello         2008–2011    42    28   66.7%
Alf Ramsey            1963–1974    113   69   61.1%
Glenn Hoddle          1996–1999    28    17   60.7%
Ron Greenwood         1977–1982    55    33   60.0%
Sven-Göran Eriksson   2001–2006    67    40   59.7%
Gareth Southgate      2016–2020    49    29   59.2%
Roy Hodgson           2012–2016    56    33   58.9%
Steve McClaren        2006–2007    18    9    50.0%
Bobby Robson          1982–1990    95    47   49.5%
Don Revie             1974–1977    29    14   48.3%
Terry Venables        1994–1996    23    11   47.8%
Graham Taylor         1990–1993    38    18   47.4%
Kevin Keegan          1999–2000    18    7    38.9%

Selected England managers’ full competitive international results (Correct to Nov 2020)

To calculate this, I used 3 points for a win and 1 for a draw. This has been the standard across football since the 1980s as it rewards positive play. This may disadvantage the managers from before it was introduced because playing for a draw would have been more profitable in group games and qualifiers, but I feel it is the best of the options available (and I’m trying to prove that Sven was a negative manager and I think this will help me)...

Manager               P    W    D    L    F    A    Win %    Pts   Pts available   Pts %
Sven-Göran Eriksson   38   26   9    3    69   26   68.4%    87    114             76.3%
Fabio Capello         22   15   5    2    54   16   68.2%    50    66              75.8%
Ron Greenwood         26   17   5    4    48   17   65.4%    56    78              71.8%
Roy Hodgson           31   19   9    3    73   18   61.3%    66    93              71.0%
Glenn Hoddle          15   9    3    3    26   8    60.0%    30    45              66.7%
Don Revie             10   6    2    2    22   7    60.0%    20    30              66.7%
Alf Ramsey            33   20   6    7    56   29   60.6%    66    99              66.7%
Gareth Southgate      36   22   6    8    80   29   61.1%    72    108             66.7%
Bobby Robson          43   22   14   7    90   22   51.2%    80    129             62.0%
Terry Venables[2]     5    2    3    0    8    3    40.0%    9     15              60.0%
Graham Taylor         19   8    8    3    34   14   42.1%    32    57              56.1%
Kevin Keegan          11   4    3    4    17   10   36.4%    15    33              45.5%

Selected England managers’ full competitive international results (Correct to Nov 2020)
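The percentage columns in this table follow directly from the win, draw and loss counts. As a quick check, here is a small sketch using 3 points for a win and 1 for a draw; the function name `results_summary` is made up for this illustration.

```python
def results_summary(wins, draws, losses):
    """Win percentage and points percentage using 3 points for a win, 1 for a draw."""
    played = wins + draws + losses
    points = 3 * wins + draws
    available = 3 * played
    return {
        "played": played,
        "win_pct": round(100 * wins / played, 1),
        "points": points,
        "points_available": available,
        "points_pct": round(100 * points / available, 1),
    }


# Sven-Göran Eriksson's competitive record from the table: W 26, D 9, L 3
print(results_summary(26, 9, 3))
```

For Eriksson’s row this reproduces 38 games, a win percentage of 68.4%, and 87 points out of 114 (76.3%).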

This did not go well and there was a significant amount of smugness, which I felt was inappropriate and irritating. To be honest, this is a compelling result and I needed to come back strong if I was to maintain any credibility in this argument. I felt a little disappointed that I had done all that work to prove this important point and it wouldn’t be any use. I was starting to get concerned that manipulating statistics to get the result I wanted was not working, when a thought occurred to me: I might be able to use the Fifa ranking data to demonstrate that Sven-Göran Eriksson’s England team was only able to beat lesser teams and often struggled against higher ranking sides. In short, I chose to take a leaf from the Trump playbook: when you’re in trouble, smear the opposition.

OK, so it’s not classy but, in this case, I think it is a valid point to explore. Were most of Sven’s competitive games against weaker opposition? This is a possibility because qualifiers and group games are seeded, and so England would be facing so-called lesser teams. For example, let’s consider the 2006 World Cup qualifying group (below). Apart from England, only Poland finished ranked in the world’s top 50 international teams, which supports my contention that England were flat-track bullies under Sven. But this raised two interesting[3] questions:

Team                Pld   Pts   Ranking
England             10    25    9
Poland              10    24    23
Austria             10    15    72 (=)
Northern Ireland    10    9     101
Wales               10    8     72 (=)
Azerbaijan          10    3     113

The final positions and 2005 Fifa world rankings of Uefa group 6 in the 2006 World Cup qualifying.

1. How are the Fifa rankings calculated?
2. How can I use them to win this argument?

[2] Terry Venables’ stats are decimated here because in the run up to Euro 96, England were only playing friendlies because they had already qualified as hosts.
[3] In my opinion.

The Fifa ranking system

The Fifa ranking system was introduced in December 1992, and initially awarded teams points for every win or draw, like a traditional league table. However, Fifa quickly[4] realised that there were many other factors affecting the outcome of a football match and, over time,[5] moved to a system based on the work of Hungarian–American mathematician Árpád Élő (more on him in a moment).

The Fifa rankings are not helpful to me because they don’t cover all the managers I’m considering and because their accuracy, reliability and the many methods used to generate them were always questioned. Luckily, football fans have had these arguments before and there is an Elo ranking for all men’s international teams, which has been calculated back to the first international between England and Scotland in 1872 (a disappointing goalless draw).

Wikimedia commons user BaldL, CC BY-SA 4.0

Competitors in an esports event.

The Elo rating system compares the relative performance of the competitors in two-player games. Although it was initially developed for rating chess players, variations of the system are used to rate players in sports such as table tennis, esports and even Scrabble. Strictly speaking, we should be saying an Elo system, rather than the Elo system as each sport has modified the formula to suit their own needs.

So how does an Elo system calculate a ranking? Well, at the most basic level, each team has a certain number of points and at the end of each game, one team gives some points to the other. The number of points depends on the result and the rankings of the two teams. When the favourite wins, only a few rating points will be traded, or even zero if there is a big enough difference in the rankings (eg in September 2015, England beat San Marino 6–0, but no Elo points were exchanged). However, if the underdog manages a surprise win, lots of rating points will be transferred (for example, when Iceland beat England at Euro 2016, they took 40 points from England). If the ratings difference is large enough, a team could even gain or lose points if they draw. So teams whose ratings are too low or too high should gain or lose rating points until the ratings reflect their true position relative to the rest of the group.

But how do you know how many points to add or take away after each game? Elo produced a formula for this, but there is a bit of maths, so brace yourself. Firstly, Elo assumed that a team would play at around the same standard, on average, from one game to the next. However, sometimes they would play better or worse, but with those performances grouped towards the average. This is known as a normal distribution[6] or bell curve, where outstanding results are possible but rare. In the graph, the x-axis would represent the level of performance, and the y-axis shows the probability of that happening. So, we can see that the chance of an exceptional performance is smaller than that of an unremarkable one and the bulk of games will have a middling level of skill shown.

[4] Five years later.
[5] Over twenty years. I mean, why use an established and respected system when you can faff about making your own useless one? To be fair, the women’s rankings have used a version of the Elo system since their inception, which may make Fifa’s unwillingness to use it for the men even stranger.
[6] Elo uses a logistic distribution rather than the normal, but the differences are small (I mean, what’s a couple of percent between friends?).

This means that if both teams perform to their standard, we can predict an expected score, which Elo defined as their probability of winning plus half their probability of drawing. Because we do not know the true relative strengths of both teams, this expected score is calculated using their current ratings and the formulas

$$E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}} \quad\text{and}\quad E_B = \frac{1}{1 + 10^{(R_A - R_B)/400}}.$$

In these formulas, $E_A$ and $E_B$ are the expected results for the teams, and $R_A$ and $R_B$ are their ratings. If you plot a graph of the $E$ values for different values of $R_A - R_B$ you get the graph shown below. It’s interesting[7] to note the shape of this graph, which is a sigmoid, a shape that anyone who has drawn a cumulative frequency graph for their GCSE maths will recognise. It is an expression of the area under the distribution (ie the cumulative distribution function). The graph shows that if the difference between ratings is zero, the expected result is 0.5. The system uses values of 1 for a win, 0.5 for a draw and 0 for a loss, so this suggests a draw is the most likely outcome. And if the difference is 380 in your favour, the expected score is 0.9, which suggests you are likely to win.[8]

[Graph: expected score ($E_A$), from 0 to 1, against difference in rating ($R_A - R_B$), from −1,000 to 1,000.]

The expected score for a range of differences in team ratings.
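The expected score formula is short enough to check directly. The sketch below is an illustration with a made-up function name (`expected_score`); it simply evaluates the formula for team A at the two rating differences mentioned in the text.

```python
def expected_score(rating_a, rating_b):
    """Expected score for team A: P(win) + 0.5 * P(draw) under the Elo model."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


for difference in (0, 380):
    print(difference, round(expected_score(difference, 0), 2))
```

A difference of 0 gives 0.5 and a difference of 380 gives 0.9, matching the values read off the graph. The two teams’ expected scores always add up to 1, which is why one team’s gain in rating is the other’s loss.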

The system then compares the actual result to the expected outcome and uses a relatively simple calculation[9] to calculate the number of points exchanged:

$$R'_A = R_A + K(S_A - E_A).$$

In this equation, $R'_A$ is the new rating for team A, $S_A$ is the actual result of the game, and $K$ is a scaling factor. We’ll come back to $K$ in a moment.

Recently, England (rating 1969) played Belgium (rating 2087) at the King Power stadium in Leuven, Belgium. It is generally thought that the home team is at an advantage and, to reflect this, the home team gets a bonus 100 points added to their rating, which means there is a 218-point difference between the teams. England are clear underdogs, and we can calculate the expected result as follows:

$$E_A = \frac{1}{1 + 10^{(2187 - 1969)/400}} \approx 0.22$$

[7] Again, interesting to me.
[8] An $E_A$ of 0.9 doesn’t necessarily mean you’ll win 90% of the games and lose the rest, as other combinations also give an expected score of 0.9. For example, winning 80% and drawing the rest, or winning 85%, drawing 10% and losing 5%, gives the same value.
[9] Honestly, it’s easier than it looks.


This shows that this will be a tricky game for England, and a draw would be a good result. Unfortunately, England lost the game 2–0, an $S_A$ of 0 (still using 1 for a win, etc). Therefore we can calculate the rating change using the formula:

$$R'_A = 1969 + K(0 - 0.22)$$

Now we need to understand the $K$ value. In simple terms, the bigger the $K$ value we use, the more the rating will change with each result. We need to choose a suitable value so that it isn’t too sensitive, which would lead to wild swings, but also allows for teams to change position when they start to improve. The world football Elo rankings adjust the $K$ value depending on the score and the competition. In our example, which was a Nations League game (a new competition between European teams with similar Fifa rankings), the base value for $K$ is 40. This is multiplied by 1.5 for a win by 2 clear goals, giving a $K$ value of 60:

$$R'_A = 1969 + 60(0 - 0.22) \approx 1956$$

This is a change of −13 points, and so Belgium would change by +13 points to a new rating of 2100.

Although I have focused on the world football Elo rankings, the Fifa rankings now use a system which is basically similar, with slight variations in the weightings and allowances. This brings me to the second, and more important, part of the question: can I use this to prove that I’m right? Unfortunately, this explanation shows that you can only use this type of ranking, whether it’s the Elo or the Fifa system, to compare with teams that were playing at that time. This means that trying to use it to look back over time is pointless. You can’t compare the performance of Alf Ramsey’s England with that of Steve McClaren’s using the Elo rankings, because they are not designed to do that.
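For readers who want to replay the England v Belgium calculation, here is a minimal sketch. The function and variable names are made up for this illustration; the ratings, the 100-point home bonus, the base K of 40 and the 1.5 multiplier are the figures quoted in the article, and nothing here is taken from the official Elo or Fifa code.

```python
def expected_score(rating_a, rating_b):
    """Expected score for team A under the Elo model."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def updated_rating(rating, k, actual, expected):
    """New rating after a game: R' = R + K(S - E)."""
    return rating + k * (actual - expected)


england, belgium = 1969, 2087
home_bonus = 100   # added to the home team's rating for the prediction only
k = 40 * 1.5       # Nations League base value, scaled for a two-goal win

e_england = expected_score(england, belgium + home_bonus)
new_england = updated_rating(england, k, 0, e_england)      # England lost: S = 0
new_belgium = updated_rating(belgium, k, 1, 1 - e_england)  # Belgium won: S = 1

print(round(e_england, 2), round(new_england), round(new_belgium))
```

This prints 0.22, 1956 and 2100, in line with the worked example: the 13 or so points England lose are exactly the points Belgium gain.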

What can I do?

I can, however, use a similar idea (looking at England’s performance against differently rated teams) to judge Sven. To achieve this, I’ve collated all of England’s results in competitive games under Sven and used some spreadsheet magic to create the tables shown below.[1] This is conclusive.[2]

Against teams in the top 20:

P    W    D    L    F    A    Win %    Points %
11   4    3    4    18   10   36.4%    45.5%

Against teams outside the top 20:

P    W    D    L    F    A    Win %    Points %
27   22   4    1    51   15   81.5%    86.4%

Performance of England under Sven-Göran Eriksson against teams in the top 20 (top) and teams outside the top 20 (bottom). The full data is available at chalkdustmagazine.com.

Under Sven-Göran Eriksson, England were brilliant, if the team they were playing were outside the top twenty. Against good teams, England were awful. For comparison, in the 2020–21 season, Manchester United have a win percentage of 63.2% and a points percentage of 70.2%. On the other hand, Chelsea had a win percentage of 42.1% and a points percentage of 50.9% (based on results up to 27 January 2021), and they sacked the manager. I can finally conclude that I was right. Sven was a rubbish manager who was worse than Frank Lampard.

[1] Do not ask how long this took.
[2] It is. Just trust me on this.

Paddy Moore
Paddy has been a maths teacher for over 25 years. He is a proud nerd and a perpetually disappointed football fan.

@PaddyMaths

My least favourite game

Poker

Sophie Maclean

I know it’s possible to calculate the probabilities of my opponents having each hand, and the expected amount of money I’d win from each pot, thereby determining the optimal strategy in a random game. What’s more, I know it’s within my mathematical capabilities to do this. But I can just never be bothered and always lose and it’s not fun. Or maybe this is just an incredible bluff…

My favourite game

Nim

David Sheard

You and a friend have $N$ piles of things (coins, marbles, sticks… whatever), maybe each of a different size, and you take it in turn to take any positive number of things from exactly one pile. The winner (unless you are feeling particularly miserly) takes the final thing. At first it’s a kinda fun, surprisingly complicated, strategy game. Then you Google the winning strategy and find it involves a pretty disappointing and unintuitive operation on binary numbers called XOR- or Nim-sum which comes out of nowhere. Deflated but not defeated, after much head scratching it transpires that the losing states of the game form a beautiful $N$-dimensional Sierpiński triangle elegantly described by XOR-sum, and which is strategically intuitive. And then after more Googling you learn that XOR-sum actually works for a huge number of games in a really fascinating way… or is that just me?
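For the curious, the XOR-sum strategy mentioned above fits in a few lines of Python. This is a minimal sketch for the version of Nim described here (whoever takes the final thing wins); the names `nim_sum` and `winning_move` are invented for this example.

```python
from functools import reduce
from operator import xor


def nim_sum(piles):
    """XOR together all pile sizes (the Nim-sum)."""
    return reduce(xor, piles, 0)


def winning_move(piles):
    """Return (pile_index, new_size) leaving a Nim-sum of zero, or None.

    None means the position is already lost for the player to move,
    assuming the opponent plays perfectly.
    """
    total = nim_sum(piles)
    if total == 0:
        return None
    for i, size in enumerate(piles):
        target = size ^ total
        if target < size:  # we may only remove things, never add them
            return i, target
    return None  # not reached: a non-zero Nim-sum always has a move


print(nim_sum([3, 4, 5]))
print(winning_move([3, 4, 5]))
print(winning_move([1, 2, 3]))
```

The losing states are exactly the positions whose Nim-sum is zero, which is where the Sierpiński-like pattern comes from.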



Puzzles

Looking for a fun puzzle but not got time to tackle the crossnumber? You’re on the right page.

One or two
Put the answers to the clues in the grid by placing either one or two letters in each box. For example, if the answers to the clues were cone, speed, cusp, and ended, the completed puzzle would look like:

c    on   e
u         nd
sp   e    ed

[Puzzle grid not reproduced; its boxes carry the clue numbers 1 to 5.]

Across
1 Hypotenuse ÷ opposite.
4 The volume of these shapes is (area of base) × height.
5 $x$ in $2^x$.

Down
1 Not real and not imaginary.
2 Adjacent ÷ hypotenuse.
3 It is impossible to ______ an angle using a ruler and compass.

Arrange the digits
Put the numbers 1 to 9 (using each number exactly once) in the boxes so that the sums are correct. The sums should be read left to right and top to bottom ignoring the usual order of operations. For example, $4 + 3 \times 2$ is 14, not 10.

[Puzzle grid not reproduced: the arrangement of boxes, operators and totals could not be recovered from the extracted text.]

Arrange the digits II
Put the numbers 1 to 9 (using each number exactly once) in the boxes so that the sums are correct.

[Puzzle grid not reproduced: the arrangement of boxes, operators and totals could not be recovered from the extracted text.]



Square filler
Place the digits 0 to 9 in gaps in the grid (using each digit exactly once) so that every number in the completed crossnumber is square. As usual, no number begins with 0.

[Puzzle grid not reproduced: the positions of the given digits could not be recovered from the extracted text.]

You can cross off the digits below as you use them.
0 1 2 3 4 5 6 7 8 9

Extra letters
The words on the right are anagrams of words with a common theme with an extra letter added. If you write the themed words in the boxes to the left, and the extra letters in the extra letters column, two more words with the same theme will appear in the orange boxes.

THRIFTY
EITHER
ENFIN
TOZER
YOWT
STEVEN
OWEN
NOTE

Add brackets
Add one set of brackets to each equation below to make them correct. The usual order of operations (× and ÷ before + and −) should be followed.

$2 + 2 \times 2 + 2 \div 2 + 2 = 11$

$10 \times 3 + 6 \times 8 - 3 = 480$

$6 + 4^2 \div 2^2 = 70$


Surfing on wavelets
Johannes Huber
(Image: Flickr user Warm Winds Surf Shop, CC BY 2.0)

High-speed internet and digital storage get cheaper, but the challenge of sharing large files is one that anybody who spends their time working on computers faces. Digital images in particular can be a pain if they lead to long loading times and high server costs. If you have ever seen an image on the internet, then you have certainly encountered the JPEG format because it has been the web standard for almost 30 years. I am sure, however, that you have never heard of its potential successor, JPEG2000, even though it recently celebrated its 20th anniversary. If so, then that is unfortunate because it produces much better results than its predecessor.

Same principle but different outcome

The best way to understand why different formats give very different outcomes is to look at a specific example. I have compressed a picture of myself using JPEG and JPEG2000. In both cases I sacrifice image quality in favour of space savings, which leads to errors in the resulting image. With JPEG, an image usually starts to become blurry as soon as it is compressed. You’ve probably noticed it with images on the web. Additionally, there can also be a colour loss so the image has a duller appearance overall. In my picture, the most obvious thing to suffer is that smooth colour gradients have been replaced by monochrome areas which make it appear like a picture inside a colour-by-numbers book. We can see some distinct patches of grey on the left wall for example. With JPEG2000, on the other hand, image quality usually only starts to noticeably decrease after extensive compression. As you can see, the image on the right still looks relatively unchanged.

[Figure panels, left to right: Original, JPEG, JPEG2000.]
Image of the author compressed using JPEG and JPEG2000. Note that compression does not make the image smaller; the images have been scaled down here to make room for the details.

The differences between the two compressions are the result of the underlying mathematical procedures. JPEG uses the discrete cosine transform, whereas JPEG2000 uses the discrete wavelet transform. ‘Discrete’ refers to the fact that computers only deal with information in chunks called bits while ‘cosine’ and ‘wavelet’ stand for the functions that are used to sample the image. The wavelet transform is so-called because the function it uses looks like small waves when graphed:

Left to right: Daubechies wavelets of order 1, 2, 4 and 8.

The one on the left is also called a Haar wavelet. We will encounter it in matrix form later when I explain how JPEG2000 works. I want to show you how it can save more storage space while still delivering better looking images.

From analogue to digital

When we transform an analogue image to a digital image, we view it as a grid of blocks with a predetermined size. We call these blocks pixels, which stands for ‘picture elements’, and assign each of them a number corresponding to their colour. And just like that, we have transformed our image with coloured blocks into a matrix with numbers. The entries in a monochrome image matrix are brightness values from 0 for black to 255 for white. Colour images require a few additional parameters, but the important thing is that the file size of a digital image depends on the number of pixels and the size of the colour scale.

Sometimes a low resolution is quite sufficient since our human eyes cannot detect any difference from the original once we view an image at a certain distance. That is why JPEG is ideal for small images such as thumbnails where resolution does not matter as long as you can recognise what is depicted. However, when the image is larger, you have already witnessed some potential side effects of the compression with JPEG in my sample picture. We can check the occurrence of these so-called artefacts with suitable image galleries. Think of these as collections of images with problematic patterns that are susceptible to various defects. The image ‘Barbara’ below, for example, is perfectly suited for the detection of block artefacts due to the rapid sequences of light-dark areas it contains.

[Figure panels, left to right: Original, JPEG, JPEG2000. Image: Fabien Petitcolas, public domain.]
The test image ‘Barbara’, with a detail compressed using JPEG and JPEG2000.

The principle of compression is that unnecessary information must be located and thrown away. We will use some elegant mathematics to change the values in our image matrix so that this redundant information becomes visible. We try to find as many entries as possible according to specific criteria and set them to zero. The result is an approximation that is then efficiently coded. I will not go into detail here, but the general idea of this last step is as follows: since each digital colour value is saved as a list of ones and zeros, we look for values that appear with higher frequency and assign them abbreviated labels instead to save space. In the case of the value zero, we could write it as just 0 instead of its complete 8-bit binary notation 00000000. If we have lots of black areas in our image we can use this shorter notation to save space. This, however, is not exactly the algorithm used in image compression; look up ‘Huffman encoding’ if you are interested.
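To make the "shorter labels for frequent values" idea concrete, here is a toy Huffman coder in Python. It is a sketch of the textbook algorithm only, not the entropy coder actually specified by JPEG or JPEG2000; the function name `huffman_code` and the sample pixel values are invented for illustration.

```python
import heapq
from collections import Counter


def huffman_code(values):
    """Build a prefix code in which frequent values get shorter bit strings."""
    counts = Counter(values)
    if len(counts) == 1:  # degenerate case: only one distinct value
        return {next(iter(counts)): "0"}
    # Heap entries: (frequency, tie-breaker, {value: code so far}).
    heap = [(n, i, {v: ""}) for i, (v, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)    # two rarest groups
        n2, _, right = heapq.heappop(heap)
        merged = {v: "0" + c for v, c in left.items()}
        merged.update({v: "1" + c for v, c in right.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


pixels = [0] * 12 + [255] * 3 + [17]  # a mostly black toy "image"
code = huffman_code(pixels)
encoded_bits = sum(len(code[p]) for p in pixels)
print(code, encoded_bits, 8 * len(pixels))
```

In this toy example the black pixels get a one-bit label, so the sixteen values need far fewer bits than sixteen full 8-bit numbers.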

Let’s look at the numbers

The memory requirement of a digital image is measured in bits. One bit corresponds to the smallest possible digital storage unit that takes either the value 0 or 1. We can use this to work out the compression rate by dividing the number of bits for the compressed image by the number for the original. If the fraction is close to zero, this means that we saved a lot of space and vice versa. My original image is 2976 × 3968 ≈ 12,000,000 pixels large and requires about 12,500,000 bytes of storage space. To convert bytes into bits, we multiply by eight (1 byte = 8 bits), which gives us 100,000,000 bits. The JPEG file requires about 2,220,000 bits, and the JPEG2000 file only about 800,000 bits. With this, we get compression rates of approximately 0.02 for JPEG and 0.01 for JPEG2000.

Both of my compressions are quite good since each requires only a tiny percentage of the original memory space. We can see, however, that the JPEG file suffers from a noticeable drop in image quality. The JPEG2000 version of my picture, on the other hand, looks relatively unchanged while still being more efficient with only about 36% the size of the JPEG file. Let’s find out how this is possible.
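The arithmetic in this section is easy to check with a short script. The sketch below simply re-does the division using the approximate file sizes quoted in the text; the variable and function names are chosen for this example.

```python
BITS_PER_BYTE = 8

original_bits = 12_500_000 * BITS_PER_BYTE  # about 100,000,000 bits
jpeg_bits = 2_220_000
jpeg2000_bits = 800_000


def compression_rate(compressed_bits: int, uncompressed_bits: int) -> float:
    """Compressed size divided by original size (closer to 0 means smaller)."""
    return compressed_bits / uncompressed_bits


jpeg_rate = compression_rate(jpeg_bits, original_bits)
jpeg2000_rate = compression_rate(jpeg2000_bits, original_bits)
relative_size = jpeg2000_bits / jpeg_bits  # JPEG2000 file relative to JPEG file

print(f"{jpeg_rate:.3f} {jpeg2000_rate:.3f} {relative_size:.0%}")
```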

[Image: Flickr user Christiaan Colen, CC BY-SA 2.0]
Computers store data in bits containing the value 0 or 1.

As a rule of thumb: JPEG files with compression rates below 0.25 tend to experience severe quality losses. The discrete cosine transformation used by JPEG represents the values of pixel blocks (usually 8 × 8 pixels in size) as combinations of cosine oscillations and makes use of orthogonal projections to replace them with simplified versions. Imagine it like this: the transformation has a set of versatile building blocks like Lego bricks that we can combine to build any image. If you restrict yourself to use only a few different types of Lego bricks, you can still approximate the image, but the more you limit your options, the cruder the approximation will look, as you can see in the ‘Barbara’ image. The reason why the corners and edges of block artefacts become more apparent at higher compression rates is that the available options are not versatile enough to create a convincing approximation.

The discrete wavelet transform

To figure out how JPEG2000 works, we start with a simple task: we count the number of passengers at a train station throughout an eight-hour shift, which gives us a list of numbers: (206, 306, 59, 69, 16, 16, 5, 3). Let’s suppose we want to send this information to someone, but we need to compress it first to save space (this is just an example; in practical applications, compression becomes necessary only when you send a lot more data). How should we choose the numbers we send? Assuming we are satisfied with a rough count, a viable option could be to round the numbers to their nearest multiple of ten: (210, 310, 60, 70, 20, 20, 10, 0), but there is a better way.

Another possibility is to form averages of the four consecutive pairs: (256, 64, 16, 4). Doing this requires only four values, but nobody can decrypt the original numbers without additional information. Fortunately, we still have four empty slots left in our list. We can choose another number for each of the four pairs which allows us to decrypt the input from the average and tells us something else about the numbers. The linear transform $\tilde{W}: (a, b) \mapsto ((b + a)/2, |b - a|/2)$ generates both the mean of a number pair $(a, b)$ and the absolute difference between the mean and either of the two numbers. The four additional numbers are now (50, 5, 0, 1). This process is easily reversible: we get our starting numbers by adding and subtracting the differences from the corresponding means. More revealing, however, is that these differences measure the distribution of the original numbers. A high value means that the original number pair was far apart and a low value means they were close together.
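The averaging-and-differencing step can be tried out directly on the passenger counts. The Python sketch below is an illustration with made-up function names (`forward`, `inverse`); it keeps the sign of each difference, $(b - a)/2$, rather than the absolute value, so that the code knows which number of each pair was the larger one when reversing the process. That is why the last entry comes out as −1 rather than 1.

```python
def forward(values):
    """Pairwise means followed by pairwise signed half-differences."""
    pairs = list(zip(values[0::2], values[1::2]))
    means = [(a + b) / 2 for a, b in pairs]
    diffs = [(b - a) / 2 for a, b in pairs]
    return means + diffs


def inverse(coeffs):
    """Rebuild the original list: a = mean - diff, b = mean + diff."""
    half = len(coeffs) // 2
    out = []
    for mean, diff in zip(coeffs[:half], coeffs[half:]):
        out += [mean - diff, mean + diff]
    return out


counts = [206, 306, 59, 69, 16, 16, 5, 3]
coeffs = forward(counts)
print(coeffs)
print(inverse(coeffs))
```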



The discrete wavelet transform used by JPEG2000 takes advantage of the fact that these differences capture the redundant information we can remove when we want to compress an image. As long as the values are smaller than a certain threshold which depends on the intended level of compression, we can set them to zero. In our example, the threshold could be 10, so the compressed differences are (50, 0, 0, 0), and reversing the process gives us the list (206, 306, 64, 64, 16, 16, 4, 4). The result is very similar to the original list even though we have simplified most of the differences. For images, this means that we match neighbouring pixels with a similar colour by setting their difference to zero, which we can then efficiently code to save space.

When doing this with a computer, we can use matrix algebra to simplify the calculations. If we have a list of numbers, that is, a vector $\boldsymbol{v}$ of even length, we can write the transformed vector: $\boldsymbol{w} = \tilde{W}\boldsymbol{v}$. If we have two or four data points, for example, the transformation is a 2 × 2 or 4 × 4 matrix:

$$\tilde{W}_2 = \begin{pmatrix} 1/2 & 1/2 \\ -1/2 & 1/2 \end{pmatrix}, \qquad \tilde{W}_4 = \begin{pmatrix} 1/2 & 1/2 & 0 & 0 \\ 0 & 0 & 1/2 & 1/2 \\ -1/2 & 1/2 & 0 & 0 \\ 0 & 0 & -1/2 & 1/2 \end{pmatrix}.$$

We can also reverse the process by using the rules for matrix multiplication: $\tilde{W}^{-1}\boldsymbol{w} = \boldsymbol{v}$.

The discrete Haar wavelet transform (HWT) for an input vector of length $N = 2k$ (with $k \in \mathbb{N}$) is defined as $W_N = \sqrt{2}\,\tilde{W}_N$. The factor $\sqrt{2}$ is added to make the matrix orthogonal (this simplifies calculating the reverse transformation) but you only need to pay attention to $\tilde{W}_N$ to understand what is going on. It is great for illustrating how wavelet transforms work since it is relatively simple. The first $N/2$ rows of the result of the Haar wavelet transform contain the averages of the number pairs and the remaining $N/2$ rows their respective differences.
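The matrix version, including the thresholding step, can be sketched in plain Python. The helper names (`haar_tilde`, `matvec`, `transpose`, `threshold`) are invented for this example, and the inverse is obtained from the orthogonality remark above: if $W = \sqrt{2}\,\tilde{W}$ is orthogonal, then $\tilde{W}^{-1} = 2\tilde{W}^{T}$.

```python
def haar_tilde(n):
    """The n x n averaging/differencing matrix (n must be even)."""
    half = n // 2
    rows = [[0.0] * n for _ in range(n)]
    for i in range(half):
        rows[i][2 * i] = rows[i][2 * i + 1] = 0.5  # mean of pair i
        rows[half + i][2 * i] = -0.5               # signed half-difference
        rows[half + i][2 * i + 1] = 0.5
    return rows


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def threshold(coeffs, limit):
    """Zero every difference (second half) whose size is below `limit`."""
    half = len(coeffs) // 2
    kept = [d if abs(d) >= limit else 0.0 for d in coeffs[half:]]
    return coeffs[:half] + kept


v = [206, 306, 59, 69, 16, 16, 5, 3]
w_tilde = haar_tilde(len(v))
w = matvec(w_tilde, v)
compressed = threshold(w, limit=10)

# Since sqrt(2) * W_tilde is orthogonal, W_tilde^{-1} = 2 * W_tilde^T.
w_tilde_inverse = [[2 * x for x in row] for row in transpose(w_tilde)]
restored = matvec(w_tilde_inverse, compressed)
print(compressed)
print(restored)
```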

Transforming an image

Now we can apply the transform to any $N \times M$ pixel image matrix $B$. Since an image is a two-dimensional array of numbers, we want to apply the transform both vertically and horizontally. More precisely, instead of a single list with $N$ elements, we now transform $M$ lists with $N$ elements each. The result is also an $N \times M$ matrix divided into two halves: the upper average block and the lower difference block. You can follow the process in the picture overleaf, starting from the left.

This first step corresponds to the one-dimensional Haar wavelet transform (1D-HWT) because we only transformed the matrix vertically. To do the same with the rows and thus transform the image completely, we use the rules for matrix multiplication and get $\tilde{B} = W_M B W_N^{-1}$. The result of the complete, or two-dimensional, Haar wavelet transform (2D-HWT) is again an $N \times M$ matrix divided into four blocks. On the right, you can see that most of the intensities are contained in the upper left block while the remaining blocks are mostly black because they include mainly the sought-after values close to zero that we can round down to save space.

Let’s take a closer look at their composition. The upper left block is composed of averages of the rows and columns and looks like a smaller version of the image. The lower left block reveals information about the horizontal details, while the upper right block contains the vertical ones. Lastly, the lower right block is made up of differences of the rows and columns and corresponds to the diagonal details. Notice that the three detail blocks only have high values at locations with a drastic change in brightness, which is why we can see the outlines of the images there.

[Figure: left, 1D-HWT; right, 2D-HWT.]
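A tiny example shows how the four blocks arise. The sketch below applies one level of averaging and differencing to every column and then to every row of a 4 × 4 toy image stored as a list of rows. It uses the unnormalised averages and differences (the $\tilde{W}$ version, without the factor $\sqrt{2}$), and the function names are invented for illustration.

```python
def transform_1d(values):
    """One level of pairwise means followed by signed half-differences."""
    pairs = list(zip(values[0::2], values[1::2]))
    return [(a + b) / 2 for a, b in pairs] + [(b - a) / 2 for a, b in pairs]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def transform_2d(image):
    """Transform every column, then every row, of a list-of-rows image."""
    columns_done = transpose([transform_1d(col) for col in transpose(image)])
    return [transform_1d(row) for row in columns_done]


# Toy image: dark everywhere except a bright right-hand column,
# so the only change in brightness is a vertical edge.
image = [
    [10, 10, 10, 200],
    [10, 10, 10, 200],
    [10, 10, 10, 200],
    [10, 10, 10, 200],
]
result = transform_2d(image)
for row in result:
    print(row)
```

Only the upper left (averages) and upper right blocks end up non-zero here, which matches the description of the upper right block as holding the vertical details.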

Repetition and reversion

In practice, we repeat this process as often as possible to increase the saving potential even further by reapplying it to the previous approximation in the top left corner. Each time the process is repeated, the block containing the averages becomes more compact. For an $N \times M$ pixel image matrix, $p$ iterations of the Haar wavelet transform can be performed if both $N$ and $M$ are divisible by $2^p$.

To restore the original image, we need to apply the inverse transform to the approximation as many times as needed. You can think of it as sharpening the image because the approximation gets more and more detailed with each iteration we undo. Of course, this is only completely possible if we compressed the file without discarding any data. When we reverse the process after rounding values to zero, we end up with an approximation of the original image which is our compressed version.

[Figure: 2D-HWT applied twice.]
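The divisibility condition is easy to test for a given image size. As a small sketch (the function name `max_levels` is made up), the following counts how many times both dimensions can be halved, using the 2976 × 3968 image from earlier as the example.

```python
def max_levels(n: int, m: int) -> int:
    """Largest p such that 2**p divides both n and m."""
    if n <= 0 or m <= 0:
        raise ValueError("image dimensions must be positive")
    p = 0
    while n % 2 == 0 and m % 2 == 0:
        n, m, p = n // 2, m // 2, p + 1
    return p


print(max_levels(2976, 3968))
```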


Wrapping things up

Now we know why the results with JPEG2000 are not only more efficient but also better looking. It transforms the image as a whole instead of small 8 × 8 chunks. You might ask: “Why is it not being used by the general public despite faring so much better than JPEG?” The answer could be that it takes a lot of time and effort to change the standards for such a large-scale application like image processing. Another hurdle is that the old format still seems to be sufficient. JPEG2000 has found its home in fields like medical scanning, but we probably will not see it taking over anytime soon. Today there are also some new contenders like WebP and AVIF that may supersede JPEG2000 anyway. Increasing demand for less data traffic due to environmental concerns might lead to a rise in popularity for alternative formats. I urge everybody interested in image processing to check out JPEG2000 and compress a few image files for themselves.

Johannes Huber
Johannes is a teacher trainee who studies maths and geography education at the University of Vienna. He is part of the project Mathematik macht Freu(n)de which could be translated as ‘maths brings joy (and friends)’ where he creates explanatory videos and coaches high school students. In his free time he leads cub scouts and does parkour.

underdetermined.blogspot.com

Cryptic crossnumber
by Em—Dasher

[Crossnumber grid not reproduced; its squares carry the clue numbers 1 to 9.]

Each clue points to a word or phrase whose length is in square brackets [ ], which in turn points to a number whose length is in round brackets ( ). These numbers should be entered in the grid. All clues have a unique answer and, as usual, no numbers begin with 0. Use of the internet is recommended for some clues. If you get stuck, you can find some hints at chalkdustmagazine.com.

Across
1 Breadmaker’s act mindful! [6,5] (2)
3 Muddled cue in French from exclamation! [5] (1)
4 Heartless noble learner’s offspring is a Lord? [6] (3)
6 Magazine formation. [4,4,3] (3)
9 Roman god’s first saint replaced by church tie. [5] (2)

Down
1 Batting team adds energy to sindarin. [6] (2)
2 Enumerate, on reflection, a sleuth’s base for a tart. [3] (3)
5 Gain height without width requires, apparently, German Bee repellent? [5] (3)
7 Introduction of sphere’s central goal. [5] (2)
8 Caesar’s parade lacks acidity and meets very angry power-share. [11] (1)


We love it when our readers write to us. Here are some of the best emails, tweets and letters we’ve been sent. Send your comments by email to contact@chalkdustmagazine.com, on Twitter @chalkdustmag, or by post to Chalkdust Magazine, Department of Mathematics, University College London, Gower Street, London WC1E 6BT, UK.

Dear Chalkdust,

Some might say I was too over-excited when I received this card from one of our passionate & dedicated leaders in maths this week – I say it was totally justified & I can’t wait to get started!
Rhiannon Rainbow, @Noni_Rainbow

Dear Chalkdust,

I look forward to your card every year. It is a gift!! Thank you for doing it.
Halcyon Foster, @halcyonfoster

Great from Chalkdust: flo-map fractions love investigating repeating decimals with learners from elementary through univ math majors.
John Golden, @mathhombre

My copy of Issue 12 has arrived containing an article by me! Very excited right now! They even sent Issue 11 so I have plenty of fun maths reading to enjoy!
Brad Ashley, @pogonomaths

Nice review of our recent publication Geometry Juniors by Ed Southall from Chalkdust. This title has been shortlisted by Chalkdust as their Book of the Year.
The Mathematical Association, @MathematicalA

“It’s the kind of book that I wouldn’t be surprised to find a future mathematician citing as the book that made them fall in love with maths.” humbled! Thank you Chalkdust
Ed Southall, @edsouthall

Absolutely delighted to see Vicky Neale’s Why Study Mathematics? on the Chalkdust Book of the Year 2020 shortlist. Thanks for including it. Much deserved, Vicky: it is such a helpful book!
LPP, @LPPbooks


Significant figures

John Conway
Jamie Handitye and Jakob Stein
(Image: Thane Plambeck, CC BY 2.0)

My most memorable encounter with the work of late mathematician John Horton Conway came from a friend of mine I met as a first year graduate student. As we sat across from each other in the department common room, each having made little progress with our research, he slid me a piece of paper with five dots drawn on it. This game, he explained, consisted of us each taking turns to draw a line between any two dots, with the midpoint of the line we drew then counting as an additional dot. Although the lines could bend in any direction, they were not allowed to intersect each other, and each dot could join at most three line segments. The game was over when one player could not make any more moves, and the other player was declared the winner. At first, I was quickly defeated, and I spent quite some time trying to come up with the best strategies against my skilled opponent.

The game that we spent our lunchtime playing was Sprouts, invented by Conway and his friend Michael Paterson during their time at the University of Cambridge, and was later popularised by Martin Gardner in his Scientific American column Mathematical Games.

Conway is perhaps best known for his interest in games: he invented many, and his two books on the subject, On numbers and games and Winning ways for mathematical plays, include detailed analyses of many two-player games. He was a regular contributor to Gardner’s column, and was a major figure in the world of recreational mathematics in his own right.

As a graduate student, Conway proved that every positive whole number can be written as the sum of at most 37 fifth powers.


Born in 1937, Conway grew up in Liverpool, and attended Cambridge as an undergraduate, staying on for his postdoctoral research in number theory, and eventual appointment as fellow and lecturer. He moved to Princeton in 1986, where he remained for the rest of his career. According to those who knew him, he was always ready to play: he would carry around puzzles, pennies, coat-hangers, and dice on him, ready to stoke the imagination of some unwitting colleague with a lively demonstration or challenge. Often described as charismatic, he certainly fulfilled certain stereotypes of the eccentric mathematician, but was also an inspiration for many of those he taught and spoke to, and remains so even after his death in April 2020 from complications due to Covid-19.

One of Conway’s most famous inventions was the game of life, a very simple type of ‘game’, that takes place on a grid of pixels. Each pixel starts either switched on or off, and each second, any off pixel with exactly three on neighbours will also switch on, and only on pixels with two or three on neighbours will remain on. Despite these simple rules, the game of life is actually so-called Turing complete, meaning that in theory, any computer programme could be run using these pixels. This is an example of a cellular automaton; for more, see pages 35–38.
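The two rules of the game of life translate almost word for word into code. Here is a minimal Python sketch on an unbounded grid, where the set `live` holds the coordinates of the pixels that are on; the function name `step` and the choice of starting pattern are ours, not part of the article.

```python
from collections import Counter


def step(live):
    """Advance one generation; `live` is a set of (x, y) pixels that are on."""
    # For every pixel next to a live pixel, count its live neighbours.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


row_of_three = {(0, 1), (1, 1), (2, 1)}  # three on pixels in a horizontal row
print(sorted(step(row_of_three)))
print(step(step(row_of_three)) == row_of_three)
```

A row of three on pixels flips between horizontal and vertical forever, one of the simplest repeating patterns in the game.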

Conway was a prominent mathematician, not only dedicated to his work on popular games: on the contrary, his willingness to approach any topic with the same enthusiasm led to him contributing to research fields across mathematics. His interests included number theory, topology, analysis, group theory, classical geometry, even theoretical physics. Analysts, for example, may be familiar with his base 13 function, a function that takes every value between 0 and 1, but is discontinuous everywhere. Among academics, he is better known for his work in group theory, in particular on sporadic simple groups and the Monstrous Moonshine conjecture: a mathematical theory that connects the sporadic groups, mysterious algebraic structures coming from group theory, with functions called modular forms, coming from analysis.

His name continues to be relevant, not only through his own considerable research, but also through those who took inspiration from him. In 2018, in a branch of topology called knot theory, a long-standing conjecture was solved by then-graduate student Lisa Piccirillo, which involved the classification of a knot which bears Conway’s name.

But for all his contributions, it was Conway’s willingness to collaborate, and share his love of ideas, that are an example for all those interested in mathematics. So, to live by that example, I encourage you to pick up some pencils and a sheet of paper, find a friend, and go play a game of Sprouts.

Jamie Handitye
Jamie is a second year mathematics student at Christ’s College Cambridge. His main interests are in group theory, number theory, and a touch of algebraic geometry.

Jakob Stein
Jakob is a PhD student and mathematician from London and works mainly in differential geometry. In his spare time, he likes to draw, and think about mathematics in art.
@jakob_media


Crossnumber #13
Set by Humbug

[Crossnumber grid not reproduced; its squares carry the clue numbers 1 to 57.]

Each clue in this crossnumber contains two statements joined by a logical connective. If the connective is and, then both the statements are true. If the connective is nand, then at most one of the statements is true. If the connective is or, then at least one of the statements is true. If the connective is nor, then neither of the statements is true. If the connective is xor, then exactly one of the statements is true. If the connective is xnor, then either the statements are both true or they are both false.

Although many of the clues have multiple answers, there is only one solution to the completed crossnumber. As usual, no numbers begin with 0. Use of Python, OEIS, Wikipedia, etc is advised for some of the clues.

To enter, send us the sum of the across clues via the form on our website (chalkdustmagazine.com) by 18 September 2021. Only one entry per person will be accepted. Winners will be notified by email and announced on our blog by 1 November 2021. One randomly-selected correct answer will win a £100 Maths Gear goody bag, including non-transitive dice, a Festival of the Spoken Nerd DVD, a dodecaplex puzzle and much, much more. Three randomly-selected runners up will win a Chalkdust T-shirt.

Maths Gear is a website that sells nerdy things worldwide. Find out more at mathsgear.co.uk
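Since Python is advised for some of the clues, here is one possible way to encode the six connectives. It is only a sketch: the dictionary name `CONNECTIVES` and the helper `clue_holds` are invented for this example, and the script prints a truth table rather than solving any clue.

```python
from itertools import product

# Each connective maps the truth values of the two statements
# to whether the clue as a whole is satisfied.
CONNECTIVES = {
    "and": lambda p, q: p and q,
    "nand": lambda p, q: not (p and q),
    "or": lambda p, q: p or q,
    "nor": lambda p, q: not (p or q),
    "xor": lambda p, q: p != q,
    "xnor": lambda p, q: p == q,
}


def clue_holds(connective: str, first: bool, second: bool) -> bool:
    """Check a clue given the truth of its two statements."""
    return CONNECTIVES[connective](first, second)


for name in CONNECTIVES:
    row = [clue_holds(name, p, q) for p, q in product([True, False], repeat=2)]
    print(f"{name:5}", row)
```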



1A is a multiple of 13 and 1A is a square number. 1D is equal to 31A or 1D is equal to 41A. 1D is prime xor 7A is odd. 2D is equal to 3D or 2D is equal to 4D. The sum of the digits of 3A is 27 or the sum of the digits of 3D is 27.

17D and 15D share a factor greater than 1 xnor 17D is the product of 1D and 48A. 18A is a palindrome and 18D is a palindrome. The sum of the digits of 18A is 2 and the sum of the digits of 18D is 2. 19D is prime xor 27A is prime.

4D is a cube number xor 9D is a cube number.

20A is prime nor 20A is a multiple of a 2-digit prime.

4D is a square number xor 11D is a square number.

20D is a palindrome and the sum of the digits of 20D is 16.

5A is equal to 1A xor 5A is equal to 3A.

22A is a multiple of 5 xnor 22A is the product of 1D and 48A.

5D is equal to 2D xor 6D is equal to 1D. 7A is equal to 54A xor 7A is equal to 12A. 8A is equal to 3A xor 8A is equal to 10A. 10A is a factor of 3A or 4D is a factor of 3A. 12A is equal to 54A xor 12A is equal to 48A. The sum of the digits of 13D is 17 and the sum of the digits of 7D is 17. 14A is prime nor 14A is a multiple of a 2-digit prime.

23A is 6 times 27A and 28D is equal to 19D. 24A is a multiple of 8 and 45D is a multiple of 24A. The sum of the digits of 24D is 5 xnor the sum of the digits of 49D is 5. 29A is equal to 37A xor 30D is a palindrome. 29A is prime xor 37A is prime. 32D is a multiple of 41A and 41D is a multiple of 41A. 35D is equal to 39A and 35D is equal to 35A.

The sum of 15D and 16A is 742 and the difference between 15D and 16A is 38.

37D is a prime xor the sum of the digits of 37D is 5.

16A is equal to 25A xor 17D is equal to 25A.

37D’s first digit is 1 nand 37D’s last digit is 1.

16D is a multiple of 100 xor 21A is a multiple of 100.

The sum of the digits of 37D is 7 xor the sum of the digits of 30D is 7.

The sum of the digits of 16D is 5 xor the sum of the digits of 21A is 5.

38A is a multiple of 10A and 39A is a multiple of 10A.

40A is a multiple of 100 and 45D is a multiple of 100. 40A is a multiple of 45D nor 45D is a multiple of 40A. 40A is a square number xnor 45D is a square number. 42A is a multiple of 9 xnor 39A is a multiple of 9. 42A is an anagram of 33A and 34D is an anagram of 33D. 43D is a multiple of 10A and 26D is a multiple of 10A. 44A is equal to 5A xor 36D is equal to 5A. 46A is a factor of 40A xor 46A is a factor of 45D. 47D is a factor of 40A xor 47D is a factor of 45D. 48A is a cube number and 50A is a cube number. 48A is a square number and 50A is a square number. 49D is a multiple of 5 xor 50D is a multiple of 5. 52A is two more than 55A and the sum of the digits of 52A is 3. 52D is a Fibonacci number and 53D is a Fibonacci number. 54A is equal to 7A xor 54A is equal to 12A. 54D is a Fibonacci number xor 54D is the sum of 52D and 53D. 54D is equal to 52D nor 54D is equal to 53D. 56A is a square number xor 56A is a cube number. 57A is a square number xor 57A is greater than 200.



Highways Agency, CC BY 2.0

Aryan Ghobadi

As trends go, diagrammatic algebra has taken mathematics by storm. Appearing in papers on computer science, pure mathematics and theoretical physics, the concept has expanded well beyond its birthplace, the theory of Hopf algebras. Some use these diagrams to depict difficult processes in quantum mechanics; others use them to model grammar in the English language! In algebra, such diagrams provide a platform to prove difficult ring theoretic statements by simple pictures. As an algebraist, I’d like to present you with a down-to-earth introduction to the world of diagrammatic algebra, by diagrammatising a rather simple structure: namely, the set of natural numbers! At the end, I will allude to the connections between these diagrams and the exciting world of higher and monoidal categories. Now, imagine yourself in a lecture room, with many others as excited about diagrams as you (yes?!), plus a cranky audience member, who isn’t a fan of category theory, in the front row:



What we would like to draw today is the process of multiplication for the natural numbers. In its essence, multiplication, ×, takes two natural numbers, say 2 and 3, and produces another natural number...

– six! –

Because it takes two elements and produces just one, multiplication is formally called a binary operation: we can say it is a function m : ℕ × ℕ → ℕ, where, for example, m(2, 3) = 6. We will keep this m notation for natural number multiplication to avoid confusion with the so-called product of two sets A and B, which is the set of all possible pairs from A and B and is denoted by

A × B = {(a, b) : a ∈ A, b ∈ B}.

Now we draw (reading diagrams from top to bottom):

[Diagram: the road m, with two entry lanes from ℕ at the top carrying the cars 2 and 3, and one exit lane into ℕ at the bottom carrying the car 6.]

Multiplication, m, can really be thought of as a ‘meta-road’: it’s a one-way road with two entry lanes, both departing from two cities whose cars correspond to natural numbers, and one exit lane leading to natural-number-land again. We call our roads ‘meta’ because two cars, 2 and 3, enter the lanes at the same time, possibly colliding in the middle, passing through time and space, and a brand new car, 6, exits into the city.

– But how does your picture show any of the properties of multiplication on the natural numbers?

Do not be alarmed by this interruption! I am ready to respond.

Diagrams for a monoid
A monoid structure is a fancy word for some of the nice properties that the multiplication of natural numbers satisfies:

(i) associativity: m(x, m(y, z)) = m(m(x, y), z), for example 2 × (3 × 5) = 30 = (2 × 3) × 5;
(ii) a unit element exists: m(1, x) = x = m(x, 1), that is, 1 × x = x = x × 1 for all x ∈ ℕ.

Now we simply visualise these properties using our pictorial notation. Associativity translates to these compound meta-roads being the same:



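[Diagram: two compound meta-roads, each with three entry lanes from ℕ and one exit lane into ℕ, set equal. One multiplies the first two lanes first; the other multiplies the last two lanes first.]

OK... –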

But why are the diagrams the same? The key ingredient is that we need to put on our topological glasses! We don’t care about length or curvature in our roads. It’s as if the asphalt moves freely above the sand! With our new glasses, all the following diagrams are the same and the middle lane can move freely from one side to the other:

[Diagram: a chain of equal pictures in which the middle lane slides from one side to the other.]
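For readers who prefer a computer to a pair of topological glasses, properties (i) and (ii) can also be spot-checked numerically. The sketch below is only an illustration: the helper names and the finite sample are assumptions of mine, and checking a finite sample is evidence rather than a proof.

```python
from itertools import product

def m(x, y):
    """Natural-number multiplication as a binary operation m : N x N -> N."""
    return x * y

UNIT = 1
SAMPLE = range(1, 9)  # hypothetical finite sample of natural numbers

def is_associative(op, sample):
    """Check op(x, op(y, z)) == op(op(x, y), z) for every triple in the sample."""
    return all(op(x, op(y, z)) == op(op(x, y), z)
               for x, y, z in product(sample, repeat=3))

def has_unit(op, unit, sample):
    """Check op(unit, x) == x == op(x, unit) for every x in the sample."""
    return all(op(unit, x) == x == op(x, unit) for x in sample)

def subtract(x, y):
    """A binary operation that fails associativity, included for contrast."""
    return x - y

if __name__ == "__main__":
    print(is_associative(m, SAMPLE), has_unit(m, UNIT, SAMPLE))
    print(is_associative(subtract, SAMPLE))
```

Subtraction is included to show that the check can fail: not every binary operation gives a monoid.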

The second property we need to visualise is the unit element 1 ∈ ℕ. In previous diagrams, any car from ℕ can use the roads, whereas to discuss multiplication by 1, we need a unique car to use the road. So we draw a special diagram for the road where only the car corresponding to 1 can use the lane. The unit conditions require one more ingredient. Each city can have a boring ‘identity road’ id, where nothing happens to cars taking this road. They simply leave and enter the city looking the same. With this in mind, the diagrams representing the unit condition turn into the following picture:

[Diagram: the road m with the special 1-only lane attached to one of its entry lanes, on either side, set equal to the identity road id_ℕ from ℕ to ℕ.]

This should not be a surprise since it is natural to think of multiplication by 1, m(1, x) for any x, as a function from ℕ to ℕ, which ultimately sends every number to itself. Putting our topological glasses back on, it looks as if the diagram for the identity road grew an extra hair, so we can push it back in!

[Diagram: a chain of equal pictures in which the extra lane shrinks back into the identity road.]

In our car metaphor, the left side represents a main road with an additional lane entering it, but this lane is reserved for a ‘harmless’ car that does not interact with any of the other cars. So, it’s the same as if the main road were the identity road, where nothing happens to the cars driving on it.

– So we did it!

Not so fast! ℕ is a commutative monoid... where’s your diagram for that? –

Here the cranky listener is using the old trick of deploying fancy words to heckle me. The word commutative just means that the order in which we multiply the numbers doesn’t matter. Formally, m being commutative means

m(a, b) = m(b, a) for any a, b ∈ ℕ.

For example, 2 × 3 = 6 = 3 × 2. To represent this, we need our roads to pass over each other. We need to build bridges! If we can build bridges and allow lanes to pass over each other, ie draw diagrams in which one lane crosses above another, then commutativity translates to these diagrams being equal:

[Diagram: the road m from two copies of ℕ into ℕ, set equal to the same road with its two entry lanes first crossing over each other.]

To truly see this property, we need to upgrade our glasses to 3D glasses to capture three-dimensional topology. If we view the string diagrams through our 3D glasses, then one could unwind the right-hand diagram by rotating it like so:

[Diagram: a chain of equal pictures in which the crossing is unwound by rotating the diagram.]
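In symbols, the bridge is a road that swaps the two lanes, and commutativity says that crossing the bridge before multiplying changes nothing. Here is a small sketch of that statement; the names `swap` and `m_on_pairs` are invented for illustration, and the check runs over a finite sample only. String concatenation is included as a monoid that is not commutative.

```python
from itertools import product

def m(a, b):
    """Natural-number multiplication."""
    return a * b

def swap(pair):
    """The 'bridge': send (a, b) to (b, a)."""
    a, b = pair
    return (b, a)

def m_on_pairs(pair):
    """The road m, taking a pair from N x N as its single input."""
    return m(*pair)

def concat_on_pairs(pair):
    """Concatenation of strings: associative with unit '', but not commutative."""
    return pair[0] + pair[1]

def unchanged_by_swap(road, pairs):
    """Check road(swap(p)) == road(p) for every pair p."""
    return all(road(swap(p)) == road(p) for p in pairs)

if __name__ == "__main__":
    number_pairs = list(product(range(1, 9), repeat=2))
    word_pairs = list(product(["ab", "cd"], repeat=2))
    print(unchanged_by_swap(m_on_pairs, number_pairs))
    print(unchanged_by_swap(concat_on_pairs, word_pairs))
```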

– But even still… why would anyone care how you draw multiplication as a diagram?

To placate this restless member of the audience, I will present the punchline a bit early and use the keyword ‘category’ before explaining what it is. The reason we can draw a commutative monoid such as ℕ as a three-dimensional diagram is that commutative monoids live in what we call braided categories, such as the category of sets. Today’s algebraists will tell you that a braided category is an example of a weirder structure called a 3-category, which has some 3D topology hidden in it. But this takes us into the daunting world of higher categories, and by this point my heckler is hopefully intrigued but has too much pride to ask me to elaborate.

So what’s a category? –

Aha! Back to our story...

Categories
In the same way that looking at the connections between cities in a country is more enlightening than looking at the cities independently, in mathematics it’s more useful to understand the relation between mathematical objects. For example, instead of looking at sets ℕ, ℝ, {1, 2, 3}, ∅, I really need to discuss functions between sets to understand how sets relate to each other. This now fits in a bigger framework, a category. A category has some cities, for example sets A, B and C, and some roads f : A → B between the cities, with two extra rules!

1. If roads f : A → B and g : B → C are part of my category, then so is a composition road gf : A → C which is made up from joining roads f and g (first taking the road f to the city B followed by the road g).

2. Every city should have a special ‘safe’ road, called the identity road, like the identity function id_ℕ for ℕ.

[Diagrams: the composition road gf : A → C drawn as f followed by g; and the identity rules, in which putting id_A before f, or id_B after f, gives f again.]

Categories provide a platform to draw one-dimensional diagrams and a ‘1D calculus’, ie a way to manipulate these diagrams, as in the pictures above.

The category of sets has sets as cities and functions as roads. The identity road for each city A is just the identity function id_A : A → A, where id_A(a) = a for all a ∈ A.
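In the category of sets, the two rules are ordinary facts about functions, and they can be tried out directly. The sketch below is an illustration under my own choice of example roads (`f`, `g` and `h` are hypothetical): it builds composition roads and checks, on a few sample elements, that identity roads do nothing and that the order of bracketing a journey does not matter.

```python
def compose(g, f):
    """The composition road gf: first take f, then g."""
    return lambda a: g(f(a))

def identity(a):
    """The identity road: every car leaves looking the same."""
    return a

# Hypothetical example roads between four cities A, B, C, D.
def f(word):
    """f : A -> B, from strings to natural numbers."""
    return len(word)

def g(n):
    """g : B -> C, from natural numbers to truth values."""
    return n % 2 == 0

def h(flag):
    """h : C -> D, from truth values back to strings."""
    return "even" if flag else "odd"

A_SAMPLE = ["road", "city", "monoid", "braid"]

def same_road(r1, r2, sample):
    """Two roads agree if they send every sample car to the same place."""
    return all(r1(a) == r2(a) for a in sample)

if __name__ == "__main__":
    print(same_road(compose(f, identity), f, A_SAMPLE))   # f after id_A is f
    print(same_road(compose(identity, f), f, A_SAMPLE))   # id_B after f is f
    print(same_road(compose(h, compose(g, f)),
                    compose(compose(h, g), f), A_SAMPLE)) # bracketing is irrelevant
```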

Monoidal categories
The missing piece for a 2D calculus is a way to write in the horizontal direction. When we visualised m : ℕ × ℕ → ℕ as a diagram, we said that writing two cities A and B next to each other meant the product of the two sets A × B. In other words, writing cities in rows should have a good meaning, where ‘good’ means that roads between these cities can run parallel in the vertical direction. That is, in the case of sets, for every pair of functions f : A_1 → A_2 and g : B_1 → B_2, we have a new function f × g : A_1 × B_1 → A_2 × B_2, given by

(f × g)(a_1, b_1) = (f(a_1), g(b_1)) for (a_1, b_1) ∈ A_1 × B_1.

In our diagrams, we represent the road f × g by the roads f and g running parallel:

[Diagram: the single road f × g from A_1 × B_1 to A_2 × B_2, set equal to the road f from A_1 to A_2 drawn next to the road g from B_1 to B_2.]

Similar to the identity roads acting as ineffective components in the vertical direction, we require an ‘empty city’ E which behaves indifferently in the horizontal direction: A E = A = E A.
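For sets, both ingredients can be written down in a few lines. The sketch below is illustrative only: the helper `tensor` and the example roads are my own names, and taking the empty city to be a one-element set is an assumption made here for the sake of the example. It checks that parallel roads compose lane by lane, and that placing the one-element city next to A gives a relabelled copy of A.

```python
from itertools import product

def compose(g, f):
    """First f, then g."""
    return lambda x: g(f(x))

def tensor(f, g):
    """Parallel roads: (a, b) goes to (f(a), g(b))."""
    return lambda pair: (f(pair[0]), g(pair[1]))

# Hypothetical example roads.
def double(n):
    return 2 * n

def succ(n):
    return n + 1

def shout(s):
    return s.upper()

def exclaim(s):
    return s + "!"

A = {1, 2, 3}
B = {"x", "y"}
E = {"*"}  # assumption: a one-element set plays the role of the empty city

A_times_B = set(product(A, B))
A_times_E = set(product(A, E))

# Composing in each lane and then running in parallel...
lanes_first = tensor(compose(succ, double), compose(exclaim, shout))
# ...agrees with running in parallel and then composing.
parallel_first = compose(tensor(succ, exclaim), tensor(double, shout))

if __name__ == "__main__":
    print(all(lanes_first(p) == parallel_first(p) for p in A_times_B))
    print({a for (a, _) in A_times_E} == A, len(A_times_E) == len(A))
```

Strictly speaking A × E consists of pairs (a, ∗) rather than elements of A, so the two sets match up by forgetting the ∗ instead of being literally equal.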

A bit more formally, for each pair of objects A and B, the object ‘A next to B’ is written as A ⊗ B. Parallel roads are written as f ⊗ g and E is called the unit. A category with an ⊗ operation on pairs of cities and roads and a unit E is called monoidal. It should be clear that monoidal categories provide a setting for 2-dimensional diagrams:

[Diagrams: a road f : A → B ⊗ C with one entry lane and two exit lanes; a road g : A ⊗ B → E with two entry lanes and no exit lane, since ‘E’ = nothing; and a road h : A_1 ⊗ A_2 ⊗ A_3 → B_1 ⊗ B_2 with three entry lanes and two exit lanes.]

The monoidal structure on the category of sets is given by A ⊗ B = A × B, f ⊗ g = f × g; and E = {∗} is the set with one element, so that {∗} × A = {(∗, a) : a ∈ A} is just a relabelled copy of A.

By now the room is probably silent and the fear that the audience has long drifted off into sweet dreams of differential equations dawns on me. But...

– How do these monoidal categories relate to monoids like ℕ you were talking about at the start?

An intelligent question! In the same way you call a set a monoid when you can multiply its elements, a category is called monoidal when you can ‘multiply’ its cities and roads, and instead of a unit element you have a unit city. A trendier way to say this is “monoidal categories categorify monoids”. This is reflected in the fact that a monoid structure on an object of a category only makes sense when the category itself has a monoidal structure.

Braided monoidal categories
In a braided category, the order of cities in a row can be swapped! To swap any two cities A and B, we need a method of travel, a road, from A ⊗ B to B ⊗ A. These roads should have two entry lanes from the cities A and B, and two exit lanes into B and A, in that order. We’d also like these roads, which we denote by b_{A,B}, to resemble the 3D crossing picture which we saw when describing the commutative property of ℕ. The next rules which need to be satisfied are directly influenced by topology. Firstly, each passover road b_{A,B} should also be invertible by a road b_{A,B}^{-1}, resembling the opposite crossing. As is apparent in the diagram, the composition of two such roads should be the same as the identity roads of A and B running parallel.

[Diagram: a passover road followed by its inverse, set equal to two parallel identity roads.]

The other conditions which need to hold just mean that if you take a number of cities (A, B, C) and reorder them (maybe to C, B, A) via such passover roads, the outcome should be the same journey:

(b_{B,C} ⊗ id_A) ∘ (id_B ⊗ b_{A,C}) ∘ (b_{A,B} ⊗ id_C) = (id_C ⊗ b_{A,B}) ∘ (b_{A,C} ⊗ id_B) ∘ (id_A ⊗ b_{B,C})

[Diagram: the two journeys from A B C at the top to C B A at the bottom, each built from three passover roads.]

Geometrically this translates to ‘the order in which the roads lie above each other matters, not the order in which one passes over the other’. As in this picture, the road connected to A lies above the road connected to B, which itself lies above the road connected to C. However, the order in which they pass over each other does not matter. A monoidal category with passover roads for any pair of cities, as described above, is called braided. In the category of sets, the passover roads for sets A and B are provided by

b_{A,B} : A × B → B × A, b_{A,B}(a, b) = (b, a), for a ∈ A, b ∈ B.

For those with some university algebra knowledge, another important example of braided monoidal categories is the category of vector spaces with the tensor product of vector spaces. This is in fact where the notation ⊗ comes from.

I’m sure he can’t top this... –

Well...

The big finale... higher algebra!
Let’s say we want to describe a larger system than cities and roads between them. We really want to know how two roads f, g between two cities A, B are related to each other. Under this geographical metaphor, this would entail looking at which streets connect the two roads within the two cities:

[Diagram: cities A and B joined by two roads f and g; inside each city a street links road f to road g. This pair of streets is labelled ‘2-road data’.]

We call such a pair of streets connecting roads f and g a 2-road between f and g. A 2-category carries the information of cities, roads and 2-roads (for those not entertained by my metaphors: objects, morphisms and 2-morphisms) where we draw roads and 2-roads by → and ⇒, respectively. Similarly to how we can compose ordinary roads, we compose 2-roads θ : f ⇒ g and η : g ⇒ h ‘vertically’ to produce a new 2-road η ∘_v θ : f ⇒ h. We can only do this when f, g and h are all roads between the same two cities A, B. But in addition to this vertical composition, 2-roads also have a horizontal composition:

[Diagram, vertical composition: 2-roads θ : f ⇒ g and η : g ⇒ h between roads from A to B, stacked to give η ∘_v θ : f ⇒ h.]

[Diagram, horizontal composition: a 2-road θ : f ⇒ g between roads from A to B and a 2-road π : h ⇒ k between roads from B to C, placed side by side to give π ∘_h θ : hf ⇒ kg between roads from A to C.]

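Such compositions need to act well together, ie the order of composing horizontally or vertically should not matter:

[Diagram: a grid of 2-roads between cities •, composed horizontally and then vertically along one route and vertically and then horizontally along the other, with equal results.]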

Diagrams like the above provide a platform for a 2-dimensional calculus as well and this is no coincidence. The information for a monoidal category is equivalent to the information needed for a 2-category with a single city. To better understand this, compare the pictures we have been drawing:

monoidal category                       equivalent to   2-category with one city ∗
cities, eg A                                            roads from ∗ to ∗, eg A
roads, eg m                                             2-roads, eg m
composition of roads                                    vertical composition
monoidal operation ⊗ for cities                         roads composing
roads running parallel: ⊗ for roads                     horizontal composition
empty city                                              identity road from ∗ to ∗

[Diagram: a string diagram in a monoidal category, with cities A to F and roads m, f, g and id_C, redrawn as roads and 2-roads around the single city ∗.]

The diagram on the right shows how information transfers between the two settings. This brings us back to why we can draw a commutative monoid, such as the natural numbers, via 3D diagrams. First remember that to talk about a monoid being commutative, we needed to be able to swap elements. So we really need a braided monoidal category. In a similar fashion to how monoidal categories are 2-categories in disguise, a braided category is a 3-category with one city and one road, and provides a 3D calculus, where our commutative monoid ℕ can live!
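For the category of sets, where the passover road simply swaps the two entries of a pair, the braiding conditions can be checked by hand or by machine. The following sketch is an illustration with invented helper names, and it treats a row of three cities as a flat triple, ignoring how the triple is bracketed. It confirms that swapping twice does nothing and that the two ways of reordering (A, B, C) into (C, B, A) agree.

```python
from functools import reduce
from itertools import product

def compose(*roads):
    """Compose right to left: compose(h, g, f)(x) == h(g(f(x)))."""
    return lambda x: reduce(lambda car, road: road(car), reversed(roads), x)

def braid(pair):
    """The passover road in sets: (a, b) goes to (b, a)."""
    a, b = pair
    return (b, a)

def on_first_two(road):
    """Run a road on the first two lanes of three, leaving the third alone."""
    return lambda t: road((t[0], t[1])) + (t[2],)

def on_last_two(road):
    """Run a road on the last two lanes of three, leaving the first alone."""
    return lambda t: (t[0],) + road((t[1], t[2]))

# A B C -> B A C -> B C A -> C B A
left_journey = compose(on_first_two(braid), on_last_two(braid), on_first_two(braid))
# A B C -> A C B -> C A B -> C B A
right_journey = compose(on_last_two(braid), on_first_two(braid), on_last_two(braid))

if __name__ == "__main__":
    triples = list(product(["a1", "a2"], ["b1", "b2"], ["c1", "c2"]))
    print(all(left_journey(t) == right_journey(t) for t in triples))
    print(compose(braid, braid)(("a1", "b1")) == ("a1", "b1"))
```

In sets the swap is its own inverse, so over-crossings and under-crossings cannot be told apart; in a general braided category they can differ.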

– n ∈ ℕ cheers! Monoidal categories rule! –

So maybe now while these cheers fill the air, my heckler walks out of the lecture room and slams the door. I smile with pride, knowing that ‘category theory won today’.

No mathematicians were harmed during the making of this article. All audience members were fictitious and no real mathematicians were forced to attend my lecture.

Aryan Ghobadi
Aryan is a PhD student in mathematics at Queen Mary University of London, working with categories in quantum algebra. He is often the cranky audience member in the front row.

sites.google.com/view/aghobadimath

My favourite game

Prisoner’s dilemma
Belgin Seymenoğlu

What I find interesting about the prisoner’s dilemma is that it shows that even if the most beneficial outcome for two parties appears to be for both to cooperate, one or both of them may be tempted to defect anyway. Moreover, we see many variants showing up on TV and in the real world, eg will countries cooperate to cut carbon emissions? Will two players on the game show Golden Balls choose to split their money or steal from the other?

Prisoners      Cooperate         Defect
Cooperate      (1 yr, 1 yr)      (10 yrs, free)
Defect         (free, 10 yrs)    (5 yrs, 5 yrs)

8/10
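The temptation to defect can be read straight off the table of sentences. The sketch below is a hedged illustration: it encodes ‘free’ as 0 years, reads the first entry of each pair as the sentence of the prisoner choosing the row, and uses variable names of my own. It then asks which choice gives a prisoner the shorter sentence for each possible choice by the other.

```python
# Sentences in years as (row prisoner, column prisoner); 'free' is encoded as 0.
YEARS = {
    ("cooperate", "cooperate"): (1, 1),
    ("cooperate", "defect"): (10, 0),
    ("defect", "cooperate"): (0, 10),
    ("defect", "defect"): (5, 5),
}
CHOICES = ("cooperate", "defect")

def best_response(other_choice):
    """The row prisoner's choice with the shortest sentence, given the other's choice."""
    return min(CHOICES, key=lambda mine: YEARS[(mine, other_choice)][0])

def total_years(outcome):
    """Combined sentence of both prisoners for a pair of choices."""
    return sum(YEARS[outcome])

if __name__ == "__main__":
    for other in CHOICES:
        print(f"If the other prisoner chooses to {other}, it is best to {best_response(other)}.")
    print(total_years(("cooperate", "cooperate")), total_years(("defect", "defect")))
```

Whatever the other prisoner does, defecting gives the shorter sentence, even though both defecting leaves the pair worse off than both cooperating.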

Did you know... …that while there are the five Platonic solids we all know and love, there are actually six β€˜Platonic’ polytopes in four dimensions, but only three in each dimension greater than four.

My favourite game

Gran Turismo The Cardigans


8/11


ZOOM CONFERENCE
It’s the twenties and there is time for… Zingo!

Tired of those long meetings that seem never to end? Stuck in a boring conference talk? Just want virtual learning to be over? While away the time playing our most up-to-date version of the classic game of BINGO! Play with friends, just ensure you’re on mute before shouting ‘full zouse’.

Call with 144 participants who aren’t automatically muted

“There’s a weird echo (echo) (echo)”

Speaker who doesn’t know how to share their slides

Arriving late because of time zones

Losing connection and presenting to the void for 5 minutes

“You’re on mute”

Windows XP hill virtual background

Having a complete meltdown and garnering national infamy

Pet cameo

“Can you repost the link?”

“You’re not on mute”

Leaning back and half disappearing into your background

Three people have a personal conversation in a call of 30

Speaker overruns by 10 minutes yet the chair doesn’t Jackie Weaver them

Arriving early because of time zones

Someone off camera hands speaker a cup of tea…

… “Ooh can I have one, milk no sugar” says some joker in the chat

Turning a boring editorial meeting into a magazine feature


On this page, you can find out what we think of recent books, films, games, and anything else vaguely mathematical. Full reviews of many of the items featured here can be found at chalkdustmagazine.com

Only Connect
This was a thrilling final, with fiendish questions and some excellent quizzing. It was also lovely to be able to support long-time friend of Chalkdust, Katie Steckles, and her team. What comes next in this sequence: ★★☆☆☆ ★★★☆☆ ★★★★☆ ?
★★★★★

The Tired Sounds of Stars of the Lid
Great lockdown album.
★★★★★

Formalized Music
Iannis Xenakis
Formalised Music is a manual, of sorts, to create a machine that writes music. It can be quite frustrating to decipher Xenakis’s writing, but his ideas on the mathematics of composition remain influential and are worth exploring.
★★½☆☆

Stars of the Lid and their Refinement of the Decline
Perhaps the only thing better than The Tired Sounds of...
★★★★★

Poems and Paradoxes
Kyle Evans & Hana Ayoob
A really fun collection of mathematical titbits and poems with delightful artwork.
★★★☆☆

Klax
It’s no longer the nineties, but there’s still time for Klax.
★★★★½

Apollo 13
Brilliant. The 13th numbered thing is always the best.
★★★★★

Virtual meetings
There’s no √−1 in MS Teams. I just wish i wasn’t on it all the time.
★☆☆☆☆


Chalkdust Book of the Year 2020
Molly and the Mathematical Mystery by Eugenia Cheng and Aleksandra Artymowska
This is a beautiful book. Its pages are large, and full of wonderful illustrations. On each page, the reader is encouraged to help Molly continue on her adventure by finding information under flaps, opening flaps to change available routes, or even using the flaps to construct a path for Molly that takes her out of the page. This book was selected by the editors of Chalkdust to be the Chalkdust Book of the Year 2020, based on our four judging criteria: style, control, damage and aggression.

Chalkdust Readers’ Choice 2020
Mathematical Adventures! by Ioanna Georgiou and Asuka Young
This book doesn’t shy away from difficult ideas, such as the existence of different sizes of infinity, and offers an excellent opportunity for a child to meet interesting bits of maths that would often be deemed ‘too difficult’ for a few more years. This book would be a great way to rediscover and share these interesting mathematical ideas with a younger relative. This book was voted by our readers to be the Chalkdust Readers’ Choice 2020.

Shortlisted
The winners were selected from our shortlist of seven books released in 2020. The five non-winning books are all also very good. They were:

The Wonder Book of Geometry by David Acheson; How to Make the World Add Up by Tim Harford; Hello Numbers! What Can You Do? by Edmund Harriss, Houston Hughes and Brian Rea; Why Study Mathematics? by Vicky Neale; and Geometry Juniors by Ed Southall.


TOP TEN

This issue features the top ten calculator buttons. To vote on the top ten waves, go to chalkdustmagazine.com

At 10, it’s Mambo No. 5 (A Little Bit Of...) by Lou Bega.

At 9, it’s All Apologies by Nirvana.

10

9

At 8, it’s Mambo No.5 (A Little Bit Of... ) by Lou Bega.

At 7, it’s Up Allnight by Beck.

8

7

At 5, it’s M+ambo No.5 (A Little Bit Of...) by Lou Bega.

At 6, it’s Mambo No.5 (A Little Bit Of.. . ) by Lou Bega.

5

6

At 4, it’s Mambo No.5 (A Little Bit 0f...) by Lou Bega.

At 3, it’s My Name = by Eminem.

4

3

At 2 this issue, it’s Thunderstruck by AC/DC.

At 1, it’s Mambo No.5 (A Little Bit Off...) by Lou Bega.

2


1



Colour your own cellular automaton
(see pages 35–38)

1 Pick a number between 0 and 255.

2 [Eight boxes, one under each three-cell pattern: 111 110 101 100 011 010 001 000]
  Convert your number to binary and write its digits in the boxes above.

3 Flip a coin 8 times and write the results (0 for heads, 1 for tails) in the first row.

4 Use the rules you defined above to fill the rest of the grid with 1s (black) and 0s (any other colour you like).
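If you would rather let a computer do the colouring, the four steps can be sketched in Python. Several details below are assumptions, because the instructions leave them open: the binary number is padded to eight digits, each new cell is looked up from the three cells above it (left, middle, right), cells beyond the edge of the grid count as 0, and the grid height is a free parameter. The function names are made up for this example.

```python
import random

PATTERNS = ["111", "110", "101", "100", "011", "010", "001", "000"]

def rule_table(number):
    """Steps 1 and 2: write the 8 binary digits of the number under the 8 patterns."""
    if not 0 <= number <= 255:
        raise ValueError("pick a number between 0 and 255")
    digits = format(number, "08b")  # assumption: pad with leading zeros
    return {pattern: int(digit) for pattern, digit in zip(PATTERNS, digits)}

def coin_flips(count=8, seed=None):
    """Step 3: a random first row of 0s (heads) and 1s (tails)."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(count)]

def next_row(row, rule):
    """Step 4 for one row: look up each cell from the three cells above it."""
    padded = [0] + list(row) + [0]  # assumption: cells beyond the edge are 0
    return [rule[f"{padded[i - 1]}{padded[i]}{padded[i + 1]}"]
            for i in range(1, len(padded) - 1)]

def fill_grid(number, first_row, height):
    """Step 4: keep applying the rule until the grid has `height` rows."""
    rule = rule_table(number)
    grid = [list(first_row)]
    while len(grid) < height:
        grid.append(next_row(grid[-1], rule))
    return grid

if __name__ == "__main__":
    for row in fill_grid(90, coin_flips(), height=8):
        print("".join("#" if cell else "." for cell in row))
```

Changing the first argument of `fill_grid` to any other number between 0 and 255 gives a different set of rules and a different picture.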


People think mathematics is complicated. Mathematics is the simple bit – it’s the stuff we can understand. It’s cats that are complicated.
John Conway, 1937–2020

@chalkdustmag chalkdustmagazine.com

