Welcome EDITORIAL Editor Catherine Emma Ellis Editor-in-chief Graham Barlow Art editor Efrain Hernandez-Mendoza
MARKETING
Welcome!
Group marketing manager NOT YET APPOINTED Marketing manager Richard Stephens
PRODUCTION & DISTRIBUTION Production controller Marie Quilter Production manager Mark Constance Printed in the UK by William Gibbons & Sons Ltd on behalf of Future Distributed by Seymour Distribution Ltd, 2 East Poultry Avenue, London EC1A 9PT, Tel: 0207 429 4000 Overseas distribution by Seymour International
CIRCULATION Trade marketing manager Juliette Winyard (07551 150 984)
LICENSING International director Regina Erak regina.erak@futurenet.com +44 (0)1225 442244 Fax +44 (0)1225 732275
MANAGEMENT Content & marketing director Nial Ferguson Head of content & marketing, technology Nick Merritt Group editor-in-chief Paul Newman Group art director Steve Gotobed Future is an award-winning international media group and leading digital business. We reach more than 49 million international consumers a month and create world-class content and advertising solutions for passionate consumers online, on tablet & smartphone and in print. Future plc is a public company quoted on the London Stock Exchange (symbol: FUTR). www.futureplc.com
Chief executive Zillah Byng-Maddick Non-executive chairman Peter Allen Chief financial officer Richard Haley Tel +44 (0)207 042 4000 (London) Tel +44 (0)1225 442 244 (Bath)
All contents copyright © 2014 Future Publishing Limited or published under licence. All rights reserved. No part of this magazine may be reproduced, stored, transmitted or used in any way without the prior written permission of the publisher. Future Publishing Limited (company number 2008885) is registered in England and Wales. Registered office: Quay House, The Ambury, Bath, BA1 1UA. All information contained in this publication is for information only and is, as far as we are aware, correct at the time of going to press. Future cannot accept any responsibility for errors or inaccuracies in such information. You are advised to contact manufacturers and retailers directly with regard to the price and other details of products or services referred to in this publication. Apps and websites mentioned in this publication are not under our control. We are not responsible for their contents or any changes or updates to them. If you submit unsolicited material to us, you automatically grant Future a licence to publish your submission in whole or in part in all editions of the magazine, including licensed editions worldwide and in any physical or digital format throughout the world. Any material you submit is sent at your risk and, although every care is taken, neither Future nor its employees, agents or subcontractors shall be liable for loss or damage.
Programming is no longer the domain of the computer scientist. It has become an essential everyday skill for anyone involved with computers. Whether that’s tinkering with websites, writing a macro for your home accounts, or playing around with Minecraft on a Raspberry Pi, being able to put one line of code after another is your path to computing liberation. Coding Academy 2015 offers plenty of help for beginners, with a whole section dedicated to essential programming concepts and principles, plus a selection of projects that turn those concepts into real tools. You can then use the same skills to create an interactive online calendar, or go into Web 3.0 company startup mode with Ruby on Rails. And if you’re looking to expand your repertoire of programming languages even further, we’ll give you a tour of everything from C and SDL to Python and Perl. So what are you waiting for? Get stuck in, and happy coding!
LINUX is a trademark of Linus Torvalds, GNU/Linux is abbreviated to Linux throughout for brevity. All other trademarks are the property of their respective owners. Where applicable code printed in this magazine is licensed under the GNU GPL v2 or later. See www.gnu.org/copyleft/gpl.html. Disclaimer: All tips in this magazine are used at your own risk. We accept no liability for any loss of data or damage to your computer, peripherals or software through the use of any tips or advice.
Contents
Code concepts
Types of data (p8)
More data types (p10)
Abstraction (p12)
Files and modules (p14)
Use an IDE (p18)
Write a program (p20)
Add features (p22)
Put it all together (p24)
Data modules (p26)
Data storage (p28)
Data organisation (p30)
Data encryption (p32)
Spot mistakes (p34)
Ruby
Ruby: Master the basics (p38)
Ruby: Add a little more polish (p42)
Ruby: Modules, blocks and gems (p49)
Ruby on Rails: Web development (p54)
Ruby on Rails: Code testing (p58)
Ruby on Rails: Site optimisation (p62)
More languages
C and beyond: Code a starfield (p68)
Scheme: Learn the basics (p72)
Scheme: Recursion (p76)
Scheme: High order procedures (p80)
PHP
PHP: Write your first script (p86)
PHP: Build an online calendar (p90)
PHP: Extend your calendar (p94)
PHP: Get started with MySQL (p98)
PHP: Do more with MySQL (p102)
Modern Perl
Modern Perl: Track your reading (p108)
Modern Perl: Build a web app (p112)
Modern Perl: Adding to our app (p116)
Python
Python: Different types of data (p122)
Python: Code a system monitor (p124)
Python: Clutter animations (p128)
Python: Stream video (p132)
Python: Code a Gimp plugin (p136)
Python: Gimp snowflakes (p140)
Python: Make a Twitter client (p144)
Minecraft: Start hacking (p148)
Minecraft: Image wall importing (p150)
Minecraft: Make a trebuchet (p154)
Minecraft: Build a cannon (p158)
Code concepts
Every programmer, whether they’re coding for bare-metal embedded programs or stringing together website functions, needs to know the basics. Whether you’re a relative beginner or an old hand, this look at the fundamentals of coding will strengthen your command of the basic techniques so you can spend more time being creative.
Code concepts: Types of data
Functions tell programs how to work, but it’s data that they operate on. Jonathan Roberts explains the basics of data in Python.
In this article, we’ll be covering the basic data types in Python and the concepts that accompany them. In later articles, we’ll look at a few more advanced topics that build on what we do here: data abstraction (p12), fancy structures such as trees, and more.
What is data?
While we’re looking only at basic data types, in real programs getting the wrong type can cause problems, in which case you’ll see a TypeError.
In the world, and in the programs that we’ll write, there’s an amazing variety of different types of data. In a mortgage calculator, for example, the value of the mortgage, the interest rate and the term of the loan are all types of data; in a shopping list program, there are all the different types of food and the list that stores them – each of which is its own kind of data.
The computer’s world is a lot more limited. It doesn’t know the difference between all these data types, but that doesn’t stop it from working with them. The computer has a few basic types it can work with, and you have to use these creatively to represent all the variety in the world.
We’ll begin by highlighting three data types. First, we have numbers – 10, 3 and 2580 are all examples of these. In particular, these are ‘ints’, or integers. Python knows about other types of numbers, too, including ‘longs’ (long integers), ‘floats’ (such as 10.35 or 0.8413) and ‘complex’ (complex numbers).
There are also strings, such as 'Hello World', 'Banana' and 'Pizza'. These are identified as a sequence of characters enclosed within quotation marks. You can use either double or single quotes.
Finally, there are lists, such as ['Bananas', 'Oranges', 'Fish']. In some ways, these are like a string, in that they are a sequence. What makes them different is that the elements that make up a list can be of any type. In this example, the elements are all strings, but you could create another list that mixes different types, such as ['Bananas', 10, 'a']. Lists are identified by the square brackets that enclose them, and each item or element within them is separated by a comma.
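As a rough sketch of the three kinds of data just described, the snippet below stores one example of each. The variable names are illustrative only and are not taken from the article; the code is written so that it should behave the same under Python 2 and Python 3.

```python
# Illustrative values only; none of these names come from the article.
loan_value = 150000               # an int
interest_rate = 4.25              # a float
greeting = 'Hello World'          # a string
shopping = ['Bananas', 10, 'a']   # a list mixing strings and an int

# Strings and lists are both sequences, so len() works on each of them.
greeting_length = len(greeting)   # counts characters
shopping_length = len(shopping)   # counts elements
```

Here len() is used only to underline that a string and a list are both sequences, whatever their elements happen to be.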
Working with data
There are lots of things you can do with the different types of data in Python. For instance, you can add, subtract, divide and multiply two numbers and Python will return the result:
    >>> 23 + 42
    65
    >>> 22 / 11
    2
If you combine different types of numbers, such as an int and a float, the value returned by Python will be of whatever type retains the most detail: that is to say, if you add an int and a float, the returned value will be a float. You can test this by using the type() function. It returns the type of whatever argument you pass to it.
    >>> type(8)
    <type 'int'>
    >>> type(23.01)
    <type 'float'>
    >>> type(8 + 23.01)
    <type 'float'>
You can also use the same operations on strings and lists, but they have different effects. The + operator concatenates (combines together) two strings or two lists, while the * operator repeats the contents of the string or list.
    >>> "Hello " + "World"
    'Hello World'
    >>> ["Apples"] * 2
    ['Apples', 'Apples']
Strings and lists also have their own special set of operations, including indexing and slicing. These let you select a particular part of the sequence by its numerical index, which begins from 0.
    >>> word = "Hello"
    >>> word[0]
    'H'
    >>> word[3]
    'l'
    >>> list = ['banana', 'cake', 'tiffin']
    >>> list[2]
    'tiffin'
Indexes work in reverse, too. If you want to reference the last element of a list or the last character in a string, you can use the same notation with -1 as the index. -2 will reference the second-to-last character, -3 the third-to-last, and so on. Note that when working backwards, the indexes don’t start at 0.
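To make negative indexes concrete, here is a minimal sketch that also shows a slice, which selects a run of elements rather than a single one. The list is called foods here (an illustrative name) so that it doesn’t hide Python’s built-in list type.

```python
word = "Hello"
foods = ['banana', 'cake', 'tiffin']

last_letter = word[-1]         # counting from the end starts at -1
second_to_last = word[-2]
last_food = foods[-1]

# A slice takes a start and a stop index; the stop itself is not included.
first_three = word[0:3]
last_two_foods = foods[-2:]    # leaving out the stop means "to the end"
```

Slices always return a new sequence of the same type, so first_three is a string and last_two_foods is a list.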
Methods
Lists and strings also have a range of other special operations, each unique to that particular type. These are known as methods. They’re similar to functions such as type() in that they perform a procedure. What makes them different is that they’re associated with a particular piece of data, and hence have a different syntax for execution. For example, among the list type’s methods are append and insert.
    >>> list.append('chicken')
    >>> list
    ['banana', 'cake', 'tiffin', 'chicken']
    >>> list.insert(1, 'pasta')
    >>> list
    ['banana', 'pasta', 'cake', 'tiffin', 'chicken']
As you can see, a method is invoked by placing a period between the piece of data that you’re applying the method to and the name of the method. Then you pass any arguments between round brackets, just as you would with a normal function. It works the same with strings and any other data object, too:
    >>> word = "HELLO"
    >>> word.lower()
    'hello'
There are lots of different methods that can be applied to lists and strings, and to tuples and dictionaries (which we’re about to look at). To see the order of the arguments and the full range of methods available, you’ll need to consult the Python documentation.
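One detail worth seeing in code is that the list methods used above change the list itself, while string methods hand back a new string and leave the original alone. The following is a small sketch of that difference, with illustrative variable names.

```python
foods = ['banana', 'cake', 'tiffin']
result = foods.append('chicken')   # alters foods in place; append() returns None

word = "HELLO"
lowered = word.lower()             # returns a new string; word is untouched
```

This is why the interpreter prints nothing after list.append() but does print the result of word.lower().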
Variables
In the previous examples, we used the idea of variables to make it easier to work with our data. Variables are a way to name different values – different pieces of data. They make it easy to manage all the bits of data you’re working with, and greatly reduce the complexity of development (when you use sensible names). As we saw above, in Python you create a new variable with an assignment statement. First comes the name of the variable, then a single equals sign, followed by the piece of data that you want to assign to that variable. From that point on, whenever you use the name assigned to the variable, you’re referring to the data that you assigned to it. In the examples, we saw this in action when we referenced a character in a string or the third element in a list by appending index notation to the variable name. You can also see this in action if you apply the type() function to a variable name:
    >>> type(word)
    <type 'str'>
    >>> type(list)
    <type 'list'>
Other data types
There are two other common types of data that are used by Python: tuples and dictionaries. Tuples are very similar to lists: they’re a sequence data type, and they can contain elements of mixed types. The big difference is that tuples are immutable – that is to say, once you create a tuple you cannot change it – and that tuples are identified by round brackets, as opposed to square brackets: ('bananas', 'tiffin', 'cereal').
Dictionaries are similar to a list or a tuple in that they contain a collection of related items. They differ in that the elements aren’t indexed by numbers, but by ‘keys’, and are created with curly brackets: {}. It’s quite like an English language dictionary. The key is the word that you’re looking up, and the value is the definition of the word. With Python dictionaries, however, you can use any immutable data type as the key (strings are immutable, too), so long as it’s unique within that dictionary. If you try to use an already existing key, its previous association is forgotten completely and that data lost forever.
    >>> english = {'free': 'as in beer', 'linux': 'operating system'}
    >>> english['free']
    'as in beer'
    >>> english['free'] = 'as in liberty'
    >>> english['free']
    'as in liberty'
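A short sketch of what immutability means in practice: trying to change a tuple raises a TypeError, and a mutable list is rejected as a dictionary key for the same underlying reason. The flag variables are illustrative names added for this example.

```python
snacks = ('bananas', 'tiffin', 'cereal')
try:
    snacks[0] = 'pasta'               # tuples cannot be changed in place
    tuple_changed = True
except TypeError:
    tuple_changed = False

english = {'free': 'as in beer', 'linux': 'operating system'}
english['free'] = 'as in liberty'     # same key, so the old value is replaced
english[(1, 2)] = 'a tuple key'       # tuples are immutable, so they can be keys
try:
    english[['a', 'list']] = 'oops'   # lists are mutable, so this is refused
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
```

After this runs, english holds three entries: the two original keys plus the tuple key.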
The Python interpreter is a great place to experiment with Python code and see how different data types work together.
Looping sequences
One common operation that you may want to perform on any of the sequence types is looping over their contents to apply an operation to every element contained within. Consider this small Python program:
    list = ['banana', 'tiffin', 'burrito']
    for item in list:
        print item
First, we created the list as we would normally, then we used the for… in… construct to print each item in the list. The second word in that construct doesn’t have to be item; that’s just a variable name that gets assigned temporarily to each element contained within the sequence specified at the end. We could equally have written for letter in word and it would have worked just as well.
That’s all we have time to cover in this article, but with the basic data types covered, we’ll be ready to look at how you can put this knowledge to use when modelling real-world problems in later articles. In the meantime, read the Python documentation to become familiar with some of the other methods that it provides for the data types we’ve looked at here. You’ll find lots of useful tools, such as sort and reverse!
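As a hedged illustration of the same for… in… pattern, the sketch below loops over a list and over a string, collecting results in new lists instead of printing them so that it runs unchanged on Python 2 and Python 3. It then tries the sort and reverse list methods mentioned above. All variable names are illustrative.

```python
foods = ['banana', 'tiffin', 'burrito']

shouted = []
for item in foods:             # item is each element of the list in turn
    shouted.append(item.upper())

letters = []
for letter in 'Hi!':           # a string is a sequence too
    letters.append(letter)

foods.sort()                   # sorts the list in place, alphabetically
foods.reverse()                # then reverses it, also in place
```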
Code concepts: More data types
Learn how different types of data come together to solve a real problem as Jonathan Roberts counts some words.
In the first Code Concepts tutorial, we introduced Python’s most common data types: numbers (ints and floats), strings, lists, tuples and dictionaries. We demonstrated how they work with different operators and a few of their most useful methods. We didn’t, however, give much insight into how they might be used in real situations. In this article, we’re going to fix that.
We’re going to write a short program that counts the number of times each unique word occurs in a text file. Punctuation marks will be excluded, and if the same word occurs but in different cases (eg, the and The), they will be taken to represent a single word. Finally, the program will print the results to the screen. It should look like this:
    the: 123
    you: 10
    a: 600
    ...
As an example, we’ll be using The Time Machine, by HG Wells, which you can download from Project Gutenberg, saving it in the same folder as your Python file under the name timemachine.txt. As the program description suggests, the first thing we’ll need to do is make the text accessible from inside our Python program. This is done with the open() function:
Hardly surprisingly, our counting program, after being sorted, finds ‘the’ to be the most common word in The Time Machine, by HG Wells.
    tm = open('timemachine.txt', 'r')
In this example, open() is passed two arguments. The first is the name of the file to open; if it were in a different directory from the Python script, the entire path would have to be given. The second argument specifies which mode the file should be opened in: r stands for read, but you can also use w for write or r+ for reading and writing. Notice we’ve also assigned the file to a variable, tm, so we can refer to it later in the program.
With a reference to the file created, we also need a way to access its contents. There are several ways to do this, but today we’ll be using a for… in… loop. To see how this works, try opening timemachine.txt in the interactive interpreter and then typing:
    >>> for line in tm:
    ...     print line
    ...
The result should be every line of the file printed to the screen. By putting this code into a .py file, say cw.py, we’ve got the start of our Python program.
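The sketch below shows the same open-and-loop idea in a self-contained form. Two things in it are assumptions rather than part of the article: it writes a tiny stand-in file to a temporary folder instead of relying on timemachine.txt being present, and it uses Python’s with statement, an alternative to a bare open() call that closes the file automatically.

```python
import os
import tempfile

# Stand-in for timemachine.txt so the sketch can run anywhere.
sample_path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(sample_path, 'w') as out:
    out.write('The Time Traveller\nwas expounding\n')

lines = []
# The file is closed automatically when the with block ends.
with open(sample_path, 'r') as tm:
    for line in tm:            # each pass gives one line, newline included
        lines.append(line)
```

Note that every line still ends with its newline character, which is one of the things the clean-up stage has to deal with.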
Cleaning up
The program description also specified that we should exclude punctuation marks, consider the same word but in different cases as one, and that we’re counting individual words, not lines! As it stands, however, we’ve been able to read only entire lines as strings, with punctuation, strange whitespace characters (such as \r\n) and different cases intact.
Looking at the Python string documentation (http://docs.python.org/library), we can see that there are four methods that can help us convert line strings into a format closer to that specified by the description: strip(), translate(), lower() and split(). Each of these is a method, and as such is a function applied to a particular string using the dot notation. For example, strip(), which removes specified characters from the beginning and end of a string, is used like this:
    >>> line.strip()
When called with no arguments, it removes all whitespace characters, which is one of the jobs we needed to get done.
translate() is a method that can be used for removing a set of characters, such as all punctuation marks, from a string. To use it in this capacity, it needs to be passed two arguments, the first being None and the second being a string of the characters to be deleted.
    >>> line.translate(None, '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
lower() speaks for itself, really: it converts every character in a string to lower-case. split() splits distinct elements inside a string into separate strings, returning them as a list. By passing an argument to split(), it’s possible to specify which character identifies the end of one element and the start of another.
    >>> line.split(' ')
In this example, we’ve passed a single space as the character to split the string around. With all punctuation removed, this will create a list, with each word in the string stored as a separate element.
Put all of this in the Python file we started working on earlier, inside the for loop, and we’ve made considerable progress. It should now look like this:
    tm = open('timemachine.txt', 'r')
    for line in tm:
        line = line.strip()
        line = line.translate(None, '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
        line = line.lower()
        list = line.split(' ')
Because all of the string methods return a new, modified string, rather than operating on the existing string, we’ve re-assigned the line variable in each line to store the work of the previous step.
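The two-argument translate(None, …) call used above exists only in Python 2. If you’re following along in Python 3, something like the sketch below should do the same clean-up. The function name clean_line is hypothetical, string.punctuation is used as a stand-in for the hand-typed list of punctuation characters, and split() is called with no argument so that runs of several spaces don’t produce empty strings.

```python
import string

def clean_line(line):
    """Return the lower-cased words in line with ASCII punctuation removed."""
    # In Python 3, translate() takes a table built by str.maketrans();
    # the third argument lists the characters to delete.
    table = str.maketrans('', '', string.punctuation)
    # With no argument, split() breaks the string on any run of whitespace.
    return line.strip().translate(table).lower().split()

words = clean_line('  The Time Traveller (for so it will be...)\r\n')
```

Chaining the methods works because each one returns a new string for the next to operate on.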
Uniqueness
Phew, look at all that work we’ve just done with data! By using the string methods, we’ve been able to remove all the bits of data that we weren’t interested in. We’ve also split one large string, representing a line, into smaller chunks by converting it to a list, and in the process got at the exact, abstract concept we’re most interested in: words.
Our stunning progress aside, there’s still work to be done. We now need a way to identify which words are unique – and not just in this line, but in every line contained within the entire file! The first thing that should pop into your head when thinking about uniqueness is a dictionary, the key-value store we saw in the first article (p8). It doesn’t allow duplicate keys, so by entering each word as a key within a dictionary, we’re guaranteed there won’t be any duplicates. What’s more, we can use the value to store the number of times each word has occurred, incrementing it as the program comes across new instances of each key.
Start by creating the dictionary, and ensuring that it persists for the entire file – not just a single line – by placing this line before the start of the for loop:
    dict = {}
This creates an empty dictionary, ready to receive our words. Next, we need to think about a way to get each word into the dictionary. As we saw last time, ordinarily a simple assignment statement would be enough to add a new word to the dictionary. We could then iterate over the list we created above (using another for loop), adding each entry to the dictionary with a value of 1 (to represent that it has occurred once in the file).
    for word in list:
        dict[word] = 1
But remember, if the key already exists the old value is overwritten and the count will be reset. To get around this, we can place an if-else clause inside the loop:
    if word in dict:
        count = dict[word]
        count += 1
        dict[word] = count
    else:
        dict[word] = 1
This is a bit confusing because dict[word] is being used in two different ways. In the second line, it returns the value and assigns it to the variable count, while in the fourth and sixth lines, count and 1 are assigned to that key’s value, respectively. Notice, too, that if a word is already in the dictionary, we increment the count by 1, representing another occurrence.
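For comparison, here is a more compact sketch of the same counting logic. It swaps the if-else clause for the dictionary’s get() method, which returns a default value when the key is missing; this is an alternative technique, not the approach the article takes. The function name count_words and the sample word list are illustrative.

```python
def count_words(words):
    """Count how many times each word appears in a list of words."""
    counts = {}
    for word in words:
        # get() returns 0 if the word hasn't been seen yet.
        counts[word] = counts.get(word, 0) + 1
    return counts

counts = count_words(['the', 'time', 'the', 'machine', 'the'])
```

The if-else version spells out each step, which is easier to follow when you’re starting out; the get() version simply does the lookup and the default in one go.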
Python’s Standard Library reference, http://docs.python.org/library, is an invaluable source for discovering what methods are available and how to use them.
Putting it together
Another data type wrestled with, another step closer to our goal. At this point, all that’s left to do is insert some code to print the dictionary and put it all together and run the program. The print section should look like this and be at the very end of the file, outside of the line-looping code.
    for word,count in dict.iteritems():
        print word + ": " + str(count)
This for loop looks different to what you’ve seen before. By using the iteritems method of the dictionary, we can access both the key (word) and value (count) in a single loop. What’s more, we’ve had to use the str() function to convert count, an integer, into a string, as the + operator can’t concatenate an integer and a string. Try running it, and you should see your terminal screen filled with lines like:
    ...
    other: 20
    sick: 2
    ventilating: 2
    ...
Data everywhere!
That’s all we planned to achieve in this particular tutorial and it’s actually turned out to be quite a lot. As well as having a chance to see how several different types of data and their methods can be applied to solve a real problem, we hope you’ve noticed how important it is to select the appropriate type for representing different abstract concepts. For example, we started off with a single string representing an entire line, and we eventually split this into a list of individual strings representing single words. This made sense until we wanted to consider unique instances, at which point we put everything into a dictionary.
As a further programming exercise, why not look into sorting the resulting dictionary in order to see which words occur most often? You might also want to consider writing the result to a file, one entry on a line, to save the fruits of your labour.
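If you’d like a nudge on the sorting exercise, the following is one possible sketch rather than the only answer. It uses the built-in sorted() function with a key function that picks out the count from each (word, count) pair. The sample counts are borrowed from the output shown in this tutorial, and the variable names are illustrative.

```python
counts = {'other': 20, 'sick': 2, 'ventilating': 2, 'the': 123}

# items() gives (word, count) pairs; in Python 2 iteritems() works here too.
# Sort the pairs by their count, largest first.
ranked = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)

top_two = ranked[:2]
report_lines = [word + ': ' + str(count) for word, count in ranked]
```

Each entry in report_lines is already in the word: count format, so writing them to a file, one per line, covers the second half of the exercise.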
Code concepts: Abstraction
Jonathan Roberts shows you how creating abstractions can make code more reliable and easier to maintain.
In the first couple of Code Concepts tutorials, we’ve been looking at data. First, we introduced some of Python’s core data types, and then we demonstrated how they can be put to use when solving a real problem. The next data-related topic we want to consider is abstraction, but before we get on to that, we’re first going to look at abstraction in general and as it applies to procedures. So, this time we’ll take a brief hiatus from data, before returning to it in a later article.
Square roots
To get our heads around the concept of abstraction, let’s start by thinking about square roots and different techniques for finding them. One of these was discovered by Newton, and is thus known as Newton’s method. It says that when trying to find the square root of a number (x), we should start with a guess (y) of its square root. We can then improve that by averaging our guess (y) with the result of dividing the number (x) by our guess (y). As we repeat this procedure, we get closer and closer to the square root. In most attempts, we’ll never reach a definite result; we’ll only make our guess more and more accurate. Eventually, we’ll reach a level of accuracy that is good enough for our needs and give up. Just to be clear about what’s involved, take a look at the table below for how you would apply this method to find the square root of 2 (ie, x = 2).
It’s a lot of work just to find the square root of a number! Imagine if when you were in school, every time you had to find a square root you had to do all these steps manually. For instance, solving problems involving Pythagoras’ theorem would be much more unwieldy. Luckily, assuming you were allowed calculators at school, there’s another, much simpler method to find square roots. Calculators come with a button marked with the square root symbol, and all you have to do is press this button once – what could be easier?
This second approach is what’s known as an abstraction. When working on problems, such as those involving Pythagoras’ theorem, we don’t care how to calculate the square root, only that we can do it and get the correct result. We can treat the square root button on our calculator as a black box – we never look inside it and we don’t know how it does what it does; all that matters is we know how to use it and that it gives the correct result. Abstraction is a very powerful technique that makes programming a lot easier, as it helps us to manage complexity. To demonstrate how abstraction can help, consider this Python code for finding the longest side of a right-angled triangle:
    import math
    def pythag(a, b):
        a2b2 = (a * a) + (b * b)
        guess = 1.0
        while (math.fabs((guess * guess) - a2b2) > 0.01):
            guess = (((a2b2 / guess) + guess) / 2)
        return guess
The first thing to note is that it’s not in the least bit readable. Sure, with a piece of code this short, you can read through it reasonably quickly and figure out what’s going on, but at a glance it’s not obvious, and if it were longer and written like this, you’d have a terrible time figuring out what on Earth it was doing. What’s more, it would be very difficult to test the different parts of this code as you go along (aka incremental development, vital for building robust software). For instance, how would you break out the code for testing whether or not a guess is close enough to the actual result (if you can even identify it), or the code for improving a guess, to check that it works? What if this function didn’t return the expected results? How would you start testing all the different parts to find where the error was? Finally, there’s code in here that could be reused in other functions, such as that for squaring a number, for taking an average of two numbers, and even for finding the square root of a number, but none of it is reusable because of the way it’s written.
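To see the guess-improving step of Newton’s method on its own, here is a minimal sketch that records each successive guess for the square root of 2, starting from a first guess of 1. The function name newton_guesses and its steps parameter are made up for this illustration.

```python
def newton_guesses(x, steps):
    """Return the first `steps` improved guesses for the square root of x."""
    guess = 1.0
    guesses = []
    for _ in range(steps):
        # Average the current guess with x divided by the current guess.
        guess = ((x / guess) + guess) / 2
        guesses.append(guess)
    return guesses

guesses = newton_guesses(2, 3)   # roughly 1.5, then 1.4167, then 1.4142
```

Three passes are already enough to agree with the true value to four decimal places, which is why the method can stop after a handful of improvements.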
You could type it all out again, or copy and paste it, but the more typing you have to do and the more obscure code you have to copy and paste, the more likely mistakes are to make it into your program. Let’s try writing that code again, this time coming up with some abstractions to fix the problems listed above. We haven’t listed the contents of each new function we’ve
Finding a square root
Guess (y)    Division (x/y)       Average (((x/y) + y)/2)
1            2/1 = 2              (2 + 1)/2 = 1.5
1.5          2/1.5 = 1.3333       (1.3333 + 1.5)/2 = 1.4167
1.4167       2/1.4167 = 1.4118    (1.4118 + 1.4167)/2 = 1.4142
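The rows of the table above can be reproduced with a short loop. This is only a sketch: the function name newton_steps and the choice of three steps are our own, picked to match the three rows shown.

```python
def newton_steps(x, guess=1.0, steps=3):
    """Return (guess, division, average) rows for Newton's method on x."""
    rows = []
    for _ in range(steps):
        division = x / guess
        average = (division + guess) / 2
        rows.append((guess, division, average))
        guess = average  # the average becomes the next guess
    return rows


for guess, division, average in newton_steps(2):
    print(f"{guess:.4f}  {division:.4f}  {average:.4f}")
```

Each pass through the loop is one row of the table, and the last column of one row becomes the first column of the next.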
created, leaving them for you to fill in.

    import math

    def square(x):
        ...

    def closeEnough(x, guess):
        ...

    def improveGuess(x, guess):
        ...

    def sqrt(x, guess=1.0):
        ...

    def pythag(a, b):
        a2b2 = square(a) + square(b)
        return sqrt(a2b2)

Here, we’ve split the code into several smaller functions, each of which fulfils a particular role. This has many benefits. For starters, how much easier is the pythag() function to read? In the first line, you can see clearly that a2b2 is the result of squaring two numbers and adding them together, and everything below that has been consolidated into a single function call, the purpose of which is also obvious. What’s more, because each part of the code has been split into a different function, we can easily test it. For example, testing whether improveGuess() was doing the right thing would be very easy: come up with a few values for x and guess, do the improvement by hand, and then compare your results with those returned by the function. If pythag() itself was found not to return the correct result, we could quickly test all these auxiliary functions to narrow down where the bug was. And, of course, we can easily reuse any of these new functions. If you were finding the square root of a number in a different function, for instance, you could just call the sqrt() function: six characters instead of four lines means there’s far less opportunity to make mistakes. One final point: because our sqrt code is now abstracted, we could change the implementation completely, but so long
as we kept the function call and arguments the same, all code that relies on it would continue to work properly. This means that if you come across a much more efficient way of calculating square roots, you’re not stuck with working through thousands of lines of code, manually changing every section that finds a square root: you do it once, and it’s done everywhere. This code can be improved still further by taking advantage of scope – see http://bit.ly/1CQS7xl for
[Figure: Abstraction – layers labelled Java, C, Assembler and Object code.] There are layers of abstraction underneath everything you do on a PC – you just don’t often think of them.
more details. closeEnough() and improveGuess() are particular to the sqrt() function – that is to say, other functions are unlikely to rely on their services. To help keep our code clean, and make the relationship between these functions and sqrt() clear, we can place their definitions inside the definition of sqrt():

    def sqrt(x, guess=1.0):
        def closeEnough(x, guess):
            ...
        def improveGuess(x, guess):
            ...
        ...

These functions are now visible only to code within the sqrt() definition – we say they’re in the scope of sqrt(). Anything outside of it has no idea that they even exist. This way, if we later need to define similar functions for improving a guess in a different context, we won’t face the issue of colliding names or the headache of figuring out what improveGuess1() and improveGuess2() do.
Our final code for finding the longest side of a triangle is longer than what we had to start, but it’s more readable, more robust, and generally better!
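The article leaves the function bodies as an exercise, so here is one possible way to fill them in. The bodies are our own guess rather than the author’s solution: the 0.01 tolerance and the starting guess of 1.0 are carried over from the original single-function version, and making guess a default argument is an assumption that lets pythag() call sqrt() with one argument.

```python
def square(x):
    return x * x


def sqrt(x, guess=1.0):
    """Approximate the square root of x using Newton's method."""

    def closeEnough(x, guess):
        # Tolerance of 0.01 copied from the original single-function version.
        return abs(square(guess) - x) <= 0.01

    def improveGuess(x, guess):
        # Average the guess with x divided by the guess.
        return ((x / guess) + guess) / 2

    while not closeEnough(x, guess):
        guess = improveGuess(x, guess)
    return guess


def pythag(a, b):
    a2b2 = square(a) + square(b)
    return sqrt(a2b2)


print(pythag(3, 4))
```

Because closeEnough() and improveGuess() live inside sqrt(), a call such as improveGuess(2, 1) from outside the function would fail with a NameError, which is exactly the hiding that scope provides.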
Layers of abstraction
Hopefully, this example has demonstrated how powerful a technique abstraction is. Bear in mind that there are many layers of abstraction present in everything you do on a computer that you never think of. For instance, when you’re programming, do you know how Python represents integers in the computer’s memory? Or how the CPU performs arithmetic operations such as addition and subtraction? The answer is probably no. You just accept the fact that typing 2 + 3 into the Python interpreter returns the correct result, and you never have to worry about how it does this. You treat it as a black box. Think how much longer it would take you to program if you had to manually take care of what data went in which memory location, to work with binary numbers, and translate alphabetic characters into their numeric representations – thank goodness for abstraction!
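To get a small taste of what sits beneath the black box, Python can show a number written in binary and the numeric code behind a character. This is only a peek at the notation; it doesn’t reveal how the interpreter actually lays integers out in memory.

```python
total = 2 + 3
print(total)                        # the abstraction: just the answer
print(bin(2), bin(3), bin(total))   # the same numbers written in binary
print(ord("A"))                     # the numeric code behind the letter A
print(chr(66))                      # and back again from a number to a letter
```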
Code concepts: Files and modules Graham Morrison expands your library of functions and grabs external data with just two lines of Python.
For the majority of programming projects, you don’t get far before facing the age-old problem of how to get data into and out of your application. Whether it’s using punched cards to get patterns into a 19th century Jacquard textile loom, or Google’s robots skimming websites for data to feed its search engine, dealing with external input is as fundamental as programming itself. And it’s a problem and a concept you may be more familiar with on the command line. When you type ls to list the contents of the current directory, for example, the command is reading in the contents of a file, the current directory, and outputting the contents to another, the terminal. Of course, the inputs and outputs aren’t files in the sense most people would recognise, but that’s the way the Linux filesystem has been designed – nearly everything is a file. This helps when you want to save the output of a command, or use that output as the input to another. You may already know that typing ls >list.txt will redirect the output from the command to a file called list.txt, but you can take this much further because the output can be treated exactly like a file. ls | sort -r will pipe (that’s the vertical bar character) the output of ls into the input of sort to create a reversed alphabetical list of a folder’s contents. The complexity of how data input and output can be accomplished is entirely down to your programming environment. Every language will include functions to load and save data, for instance, but this can either be difficult or easy depending on how many assumptions the language is willing to make on your behalf. However, there’s always a logical sequence of events that needs to occur. You will first need to open a file, creating one if it doesn’t exist, and then either read data from this file, or write data to it, before explicitly closing the file again so that other processes can use it. Most languages require you to specify a mode, such as read-only or write, when you open a file, as this tells the filesystem whether to expect file modifications or not. This is important because many different processes may also want to access the file, and if the filesystem knows the file is being changed, it won’t usually allow access. However, many processes can access a read-only file without worrying about the integrity of the data it holds, because nothing is able to change it. If you are familiar with databases, it’s the same kind of problem you face with multiple users accessing the same table. In Python, as with most other languages, opening a file to write or as read-only can be done with a single line:

    >>> f = open("list.txt", "r")

If the file doesn’t exist, Python will generate a “No such file or directory” error. To avoid this, we’ve used the output from our command line example to create a text file called list.txt. This is within the folder from where we launched the Python interpreter.
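The open, write and read sequence described here can be sketched as follows. We’re assuming Python 3, where a missing file raises FileNotFoundError, and we’ve used a temporary directory (our own choice) so the sketch doesn’t touch any real list.txt.

```python
import os
import tempfile

# Work in a throwaway directory so the sketch does not touch real files.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "list.txt")

try:
    f = open(path, "r")  # read-only mode; the file must already exist
except FileNotFoundError as error:
    print("Could not open file:", error)

# Opening in write mode creates the file if it does not exist.
f = open(path, "w")
f.write("b.txt\na.txt\n")
f.close()

# Now that the file exists, read-only mode works.
f = open(path, "r")
contents = f.read()
f.close()
print(contents)
```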
When you read a file, most languages will step through its data from the beginning to the end in chunks you specify. In this example, we’re reading a line at a time.
Environment variables
Dealing with paths, folders and file locations can quickly become complicated, and it’s one of the more tedious issues you’ll face with your own projects. You’ll find that different environments have different solutions for finding files, with some creating keywords for common locations and others leaving it to the programmer. This isn’t so bad when you only deal with files created by your projects, but it becomes difficult when you need to know where to store a configuration file or load a default icon. These locations may be different depending on your Linux distribution or desktop, but with a cross-platform language such as Python, they’ll also be different for each operating system. For that reason, you might want to consider using environment variables. These are similar to variables with a global scope in many
programming languages, but they apply to any one user’s Linux session rather than within your own code. If you type env on the command line, for instance, you’ll see a list of the environment variables currently set for your terminal session. Look closely, and you’ll see a few that apply to default locations and, most importantly, one called HOME. The value assigned to this environment variable will be the location of your home folder on your Linux system, and if we want to use this within our Python script, we first need to add a line to import the operating system-specific module. The line to do this is:

    import os

This command is also opening a file, but not in the same way we opened list.txt. This file is known as a module in Python terms, and modules like this ‘import’ functionality, including statements and definitions, so that a programmer doesn’t have to keep re-inventing the wheel. Modules extend the simple constructs of a language to add portable shortcuts and solutions, which is why other languages might call them libraries. Libraries and modules are a little like copying and pasting someone’s own research and insight into your own project. Only it’s better than that, because modules such as ‘os’ are used by everyone, turning the way they do things into a standard.
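Once the module is imported, the environment variables are available through os.environ, which behaves much like a dictionary. A minimal sketch follows; the use of .get() and of os.path.expanduser() as a fallback are our own additions for systems where HOME isn’t set.

```python
import os

# os.environ behaves like a dictionary of the current environment variables.
home = os.environ.get("HOME")  # None if HOME is not set
print("HOME is", home)

# os.path.expanduser("~") is another way to ask for the home folder.
home_folder = os.path.expanduser("~")
print("Home folder:", home_folder)
```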
Setting the standard
There are even libraries called std, and these embed standard ways of doing many things a language doesn’t provide by default, such as common mathematical functions, data types and string services, as well as file input/output and support for specific file types. You will find the documentation for what a library does in its API reference. This will list each function, what it does, what it requires as input and what it returns. You should also be able to find the source files used by the import (and by #include in other languages). On most Linux systems, for example, /lib/python2.x will include all the modules. If you load os.py into a text editor, you’ll see the code you’ve just added to your project, as well as which functions are now accessible to you. There are many, many different modules for Python – it’s one of the best reasons to choose it over any other language – and more can usually be installed with just a couple of clicks from your package manager. But this is where the ugly spectre of dependencies can start to have an effect on your project, because if you want to give your code to someone else, you need to make sure that person has also got the same modules installed. If you were programming in C or C++, where your code is compiled and linked against binary libraries, those binary libraries will also need to be present on any other system that runs your code. They become dependencies for your project, and resolving dependencies like these is what package managers do when you install a complex package.
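You can also ask Python itself where a module’s source file lives and which names it provides, rather than hunting through the filesystem. A small sketch follows; the exact path printed will vary from system to system, and some built-in modules have no source file at all, hence the cautious getattr().

```python
import os

# Where the interpreter loaded the module from (may be absent for built-ins).
print(getattr(os, "__file__", None))

# A few of the public names the module makes available.
public_names = [name for name in dir(os) if not name.startswith("_")]
print(public_names[:10])
```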
The os module
Getting back to our project, the ‘os’ module is designed to provide a portable way of accessing operating system-dependent functionality so that you can write multi-platform applications without worrying about where files should be placed. This includes knowing where your home directory might be.
To see what we mean, add the following piece of code to your project:

    f = open(os.environ["HOME"] + "/list.txt", "r")

This line will open the file list.txt in your home folder. Python knows which home folder is yours, because os.environ from the ‘os’ module holds the environment variables as strings, and the one we’ve asked it for is HOME. But all we’ve done is open the file; we’ve not yet read any of its contents. This might seem counterintuitive, but it’s an historical throwback to the way that files used to be stored, which is why this is also the way nearly all languages work. It’s only after a file has been opened that you can start to read its contents:

    f.readline()

The above instruction will read a single line of the text file and output this to the interpreter as a string. Repeating the command will read the next line, because Python is remembering how far through the file it has read. Internally, this is being done using something called a pointer, and this too is common to the vast majority of languages. Alternatively, if you wanted to read the entire file, you could use f.read(). As our file contains only text, copying the contents to a Python string is an easy conversion. The same isn’t true of a binary file. Rather than being treated as text, the bits and bytes that make up a binary file are organised according to the file type used by the file – or no file type at all if it’s raw data. As a result, Python (or any other language) would be unable to extract any context from a binary file, and trying to read it into a text string can cause an error. The solution, at least for the initial input, is to add a b flag to the mode when you first open the file, as this warns Python to expect raw binary. When you then try to read the input, you’ll see the hexadecimal values of the file output to the display.
To make this data useful, you’ll need to do some extra work, which we’ll look at next time; but first, make sure you close the open file, as this should ensure the integrity of your filesystem. As you might guess, the command looks like this:

    f.close()

and it’s as easy as that!
Binary files have no context without an associated file type and a way of handling them, which is why you get the raw data output when you read one.
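The whole open, read and close cycle from this article can be sketched in one place. We’re assuming Python 3 and, so that the example is self-contained, we create a small stand-in for list.txt in a temporary directory rather than in your home folder.

```python
import os
import tempfile

# Create a small text file to read back (a stand-in for list.txt).
path = os.path.join(tempfile.mkdtemp(), "list.txt")
with open(path, "w") as f:
    f.write("first\nsecond\nthird\n")

f = open(path, "r")
line_one = f.readline()   # reads up to and including the first newline
line_two = f.readline()   # the file position has moved on, so this is line two
rest = f.read()           # everything from the current position to the end
f.close()                 # note the parentheses: f.close alone does nothing

# The same file opened with the b flag yields bytes rather than text.
with open(path, "rb") as f:
    raw = f.read()

print(repr(line_one), repr(line_two), repr(rest))
print(raw)
```

The with statement used for the binary read is an alternative to calling close() yourself: Python closes the file automatically when the indented block ends.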
Code concepts: Use an IDE Lazy Graham Morrison explains why it’s never too early to start using a development environment.
It’s a good idea to use an editor with syntax highlighting when writing your code. Using an editor such as Kate or Gedit, with your language selected for highlighting, marks each of the different elements within your code in a different colour. There’s a good reason why this makes your life easier – when you are new to a language, syntax highlighting will help you to see easily when an element isn’t recognised, when a parenthesis is left unclosed, or when you’ve made a simple formatting error. But this is a solution that also scales – many experts use highlighting too because it means they have one less thing to worry about, especially if they’re hammering out code faster than a famished Richard Stallman. Which is maybe why Stallman’s Emacs editor has got some excellent syntax highlighting of its own.
Why use an IDE?
Syntax highlighting is just as useful for beginners as it is for experts because it gives you less to think about and more time to code, which is why – even when you’re a beginner – it’s worth finding an editing environment you can grow into, and one that will accommodate your projects as they get larger while you learn. Text editors are great
for editing single-file projects using an interpreted language, but can become cumbersome when projects get bigger. This is where Integrated Development Environments can help take the strain. Not only will they manage the various files within a project, they’ll also manage how those files are built into a single executable, as well as how functions and objects from one file can be used within another. This might not be particularly applicable to Python, but most IDEs work in the same way, so you can take your skills with you when you move to a different environment. And to get a better understanding of how capable IDEs can be, even for the beginner, we’re going to cover a few of their essential functions and how you can start using them for your projects.
Down in Komodo
In almost all the examples we’ve used in this series of code concept guides, we’ve used Python to illustrate the ideas and concepts covered in the text. We’ve even mentioned some of the Python IDEs available, such as Eric, but we’ve never covered any alternative, or how you use any of them with the language, and how similar functions are available for other IDEs for other languages. There are many Python IDEs, but perhaps because of the language’s cross-platform credentials, some of the more popular ones are commercial. This isn’t ideal when you’re starting out, so we’re going to forgo the commercial options and look at a free alternative. The one we’ve chosen is called Komodo Edit. It’s open source and fairly comprehensive, but it’s also the little brother of a closed source commercial version called Komodo IDE, so your skills will be transferable if you need a more comprehensive solution. Installation from the download is as easy as untarring the file and running ./install.sh in the new directory. Most distributions will now show a shortcut to the IDE on your desktop, or you can run the bin/komodo executable from your home directory. Your first view of this application might be a little overwhelming, as the default configuration offers a large news pane at the top, mostly containing an advertisement for the commercial version. But spend a few moments familiarising yourself, and it will soon feel like home. The two panels on the left, for instance, contain a simple file manager at the top and a project view at the bottom. A ‘project’ is what an IDE calls the glut of code, configuration and IDE files that come together to create a single application or project. You can create a new project by clicking on the small symbol to the right, and when you’ve created a new
Syntax highlighting and code completion are two of the best reasons for using an integrated development environment such as Komodo.
project, you can drag source files onto the project name, or right-click on it, to add new files. We’d recommend starting with a new Python 3-derived template. The New File dialog allows you to choose between many different languages supported by Komodo, but all this really does is pre-define an environmental variable at the top of the file and make sure the file extension is correct.
Code constructions
With a fresh file ready for your Python code, we’ll now give some examples of how an IDE will help with code constructions. Taking a cue from previous tutorials in the Code Concepts series, type import. As we’ve covered previously, this is the command to add extra functionality to Python by importing code from other modules or libraries. When using a simple text editor, you had to know the exact name of the module you wanted to import. With Komodo, you’ll be presented with a list of modules that are already installed, and you just have to choose the one you’re after. Choose math to add the mathematical functions. Now add the following code to the file:

    def square(x):
        return x * x

As you might remember, this is a super-simple function that returns the square of x, and you should have found typing those two lines easier than with a text editor. The tab would have been added automatically, for example, and now when you type print(square(10)) on a separate line, Komodo already knows about your new square function and prompts you to include a value within its brackets. Unfortunately, Komodo Edit doesn’t integrate the running functionality within the application, which means you need to run your scripts semi-manually. Pressing Ctrl+R or selecting Tools > Run Command from the menu opens a small dialog, and into this you need to type %(python) %F. All this is doing is replacing %(python) with the name of the default Python executable, as defined within the Preferences panel, and %F with the full path to the script you’re currently editing. Running this command will attempt to execute the script, printing any output into a new panel that appears below the editor. If there are any errors, they’ll also appear in the Command Output panel and you can click on the errors to force the editor to jump to their position within the file you’re editing.
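What the Run Command does can be imitated from Python itself: it hands the path of a script to a Python executable and collects whatever the script prints. The sketch below is only an analogy and makes no claim about how Komodo works internally; we’re assuming sys.executable plays the role of %(python) and a temporary file plays the role of %F.

```python
import subprocess
import sys
import tempfile

# The script being edited: the square() function plus one line that uses it.
source = "def square(x):\n    return x * x\n\nprint(square(10))\n"

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as script:
    script.write(source)

# Equivalent in spirit to "%(python) %F": interpreter path, then script path.
result = subprocess.run(
    [sys.executable, script.name], capture_output=True, text=True
)
print(result.stdout, end="")   # what would appear in the output panel
print(result.stderr, end="")   # any errors would appear here
```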
Komodo includes many templates for starting a project, but if you find yourself with the same setup each time, you can also create your own.
Syntax checking
You should also take a look at the Syntax Checking Status page. This updates in real-time to show you any errors that creep into your code as you’re typing, such as an incorrect indentation when you create a new function in Python. This is also why you need to make sure the correct language is preconfigured from the drop-down menu on the bottom-right of the screen, as this is where Komodo loads all its language-dependent intelligence from. This is set automatically when you create a Python project. You should also see that when you do create a function in the editor, small square brackets in the left margin enclose those sections of code that are logically disconnected from the main flow of execution. You can click on the small plus icons to fold these sections away to make your code easier to read. Many IDEs, and even editors, offer the same facility. In fact, the simple process we’ve just stepped through explains how the majority of IDEs work. They act mostly as an advanced editor sitting on top of the tools that run the code or build the binaries. Now we’ve executed the run command, we can add it to Komodo’s toolbox, and from there create a keybinding to run the same command when we need to run our scripts. From the Run dialog, enable the Add to Toolbox checkbox. A panel on the right will appear, complete with the Python command we’ve been using to execute our code. Right-click this and select Properties. From here, you can rename the command to something slightly friendlier and use the Key Binding page to assign a keyboard shortcut to the function. Back at the code face, and to explore Komodo Edit further, start another function called def square_root(x):, and for the code within the function, type math. What you’ll then see is the list of functions provided by the math module we imported with the previous command. Select math.sqrt and as soon as you add the first (, you’ll see a small pop-up box that informs you of exactly what the function is going to do, complete with its expectations for passed variables. This is what makes Komodo Edit so powerful for learning how to use a language. It helps beginners to work on their projects without having to refer constantly to the documentation.
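To see the kind of mistake a syntax checker catches, you can ask Python to compile a piece of source code without running it. This is our own illustration using the built-in compile() function; we’re not suggesting it is the mechanism Komodo uses, and check_syntax is a hypothetical helper name.

```python
def check_syntax(source):
    """Return None if source compiles, else a short error description."""
    try:
        compile(source, "<editor>", "exec")
    except SyntaxError as error:  # IndentationError is a subclass of SyntaxError
        return f"line {error.lineno}: {error.msg}"
    return None


good = "def square_root(x):\n    return x ** 0.5\n"
bad = "def square_root(x):\nreturn x ** 0.5\n"   # body is not indented

print(check_syntax(good))
print(check_syntax(bad))
```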
Part-time programmers
Some developers argue that code completion promotes lazy programming because, they say, you never really learn a language while you let an application complete function names for you and highlight any mistakes. We think this is partly true, but it doesn’t take into account hobbyist or part-time programmers. Professionals spend every day of their working lives surrounded by code, so their best option is always going to be to master the language they rely on for a living. It will happen without them trying. But for those of us who code only occasionally, when our schedules allow, tools such as code completion and syntax highlighting can make us more productive. And this is where IDEs can make a massive difference.
Code concepts: Write a program Jonathan Roberts shows you how to re-implement classic Unix tools to bolster your Python knowledge and build real programs.
In the next few pages, we’re aiming to get you writing real programs. Over the next few tutorials, we’re going to create a Python implementation of the popular Unix tool cat. Like all Unix tools, cat is a great target because it’s small and focused on a single task, all the while using several different operating system features, including accessing files, pipes, and so on. This means it won’t take too long to complete, but at the same time will expose you to a selection of Python’s core features in the Standard Library, and once you’ve mastered the basics, it’s learning the ins-and-outs of your chosen language’s libraries that will let you get on with real work.
Our goal for the project overall is to: Create a Python program, cat.py, that when called with no arguments accepts user input on the standard input pipe until an end of line character is reached, at which point it sends the output to standard out. When called with file names as arguments, cat.py should send each line of the files to standard output, displaying the whole of the first file and then the whole of the second file. It should accept two arguments: -E, which will make it put $ signs at the end of each line and -n, which will make it put the current line number at the beginning of each line. This time, we’re going to create a cat clone that can work with any number of files passed to it as arguments on the command line. We’re going to be using Python 3, so if you want to follow along, make sure you’re using the same version, as some features are not backwards-compatible with Python 2.x.
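To make the -E and -n parts of the goal concrete, here is a sketch of what they would do to a single line of text. The function name format_line and the single space after the line number are our own assumptions for illustration; the real cat pads its line numbers differently, and this isn’t necessarily how the finished cat.py handles it.

```python
def format_line(line, number=None, show_ends=False):
    """Apply the -n and -E behaviour to one line of text."""
    has_newline = line.endswith("\n")
    text = line[:-1] if has_newline else line
    if show_ends and has_newline:
        text += "$"                 # -E: mark where the line ends
    if number is not None:
        text = f"{number} {text}"   # -n: prefix with the line number
    return text + ("\n" if has_newline else "")


print(format_line("hello\n", number=1, show_ends=True), end="")
print(format_line("world\n", number=2), end="")
```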
Python files
The final program we’ll be implementing. It’s not long, but it makes use of a lot of core language features you’ll be able to re-use time and again.
Let’s start at the easiest part of the problem: displaying the contents of a file, line by line, to standard out. In Python, you access a file with the open function, which returns a file-object that you can later read from, or otherwise manipulate. To capture this file-object for use later in your program, you need to assign the result of running the open function to a variable, like so:

    file = open("hello.txt", "r")

This creates a variable, file, that will later allow us to read the contents of the file hello.txt. It will only allow us to read from this file, not write to it, because we passed a second argument to the open function, "r", which specified that the file should be opened in read-only mode. With access to the file now provided through the newly created file object, the next task is to display its contents, line by line, on standard output. This is very easy to achieve, since in Python files are iterable objects. Iterable objects, like lists, strings, tuples and dictionaries, allow you to access their individual member elements one at a time through a for loop. With a file, this means you can access each line contained within simply by putting it in a for loop, as follows:

    for line in file:
        print(line)

The print function then causes whatever argument you pass to it to be displayed on standard output.
If you put all this in a file, make it executable and create a hello.txt file in the same directory, you’ll see that it works rather well. There is one oddity, however – there’s an empty line between each line of output. The reason this happens is that print automatically adds a newline character to the end of each line. Since there’s already a newline character at the end of each line in hello.txt (there is, even if you can’t see it, otherwise everything would be on one line!), the second newline character leads to an empty line. You can fix this by calling print with a second, named argument: print(line, end=""). This tells print to put an empty string, or no character, at the end of each line instead of a newline character.
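Put together, the steps so far look something like the sketch below. So that it runs anywhere, we create a stand-in for hello.txt in a temporary directory first; that part, and the lines list used to keep a copy of what was read, are our own additions.

```python
import os
import tempfile

# Stand-in for hello.txt, created in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "hello.txt")
with open(path, "w") as out:
    out.write("Hello\nWorld\n")

file = open(path, "r")
lines = []
for line in file:          # a file object yields one line per iteration
    lines.append(line)
    print(line, end="")    # each line already ends with its own newline
file.close()
```

If you change the print call back to print(line), the blank lines described above reappear, because every element of lines already ends in a newline character.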
Passing arguments
This is perfectly fine, but compared to the real cat command, there’s a glaring omission here: we would have to edit the program code itself to change which file is being displayed to standard out. What we need is some way to pass arguments on the command line, so that we could call our new program by typing cat.py hello.txt on the command line. Since Python has ‘batteries included’, this is a fairly straightforward task as well. The Python interpreter automatically captures all arguments passed on the command line, and a module called sys, part of the Standard Library, makes these available to your code. Even though sys is part of the Standard Library, it’s not available to your code by default. Instead, you first have to import it to your program and then access its contents with dot notation – don’t worry, we’ll explain this in a moment. First, to import it to your program, add:

    import sys

to the top of your cat.py file. The part of the sys module that we’re interested in is the argv object. This object stores all of the arguments passed on the command line in a Python list, which means you can access and manipulate it using various techniques we’ve seen in past Code Concepts and will show in future ones. There are only two things you really need to know about this:
• The first element of the list is the name of the program itself – all arguments follow this.
• To access the list, you need to use dot notation – that is to say, argv is stored within sys, so to access it, you type sys.argv, or sys.argv[1] to get the first argument to your program.
Knowing this, you should now be able to adjust the code we created previously by replacing "hello.txt" with sys.argv[1]. When you call cat.py from the command line, you can then pass the name of any text file, and it will work just the same.
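With that change made, a single-file version of cat.py might look like the sketch below. Wrapping the loop in a function called cat_file and printing a usage message when no argument is given are our own additions, not part of the tutorial’s code.

```python
import sys


def cat_file(path):
    """Print the contents of one file to standard output, line by line."""
    with open(path, "r") as file:
        for line in file:
            print(line, end="")


if __name__ == "__main__":
    # sys.argv[0] is the script name; sys.argv[1] is the first real argument.
    if len(sys.argv) > 1:
        cat_file(sys.argv[1])
    else:
        print("usage: cat.py FILE", file=sys.stderr)
```

Saved as cat.py, this would be called as python3 cat.py hello.txt.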
The output of the real Unix command, cat, and our Python re-implementation, are exactly the same in this simple example.
“The part of the sys module we’re interested in is the argv object.”
Many files
Of course, our program is meant to accept more than one file and output all their contents to standard output, one after another, but as things stand, our program can only accept one file as an argument! To fix this particular problem, you need to loop over all the files in the argv list. The only thing you need to be careful of when you do this is that you exclude the very first element, since this is the name of the program itself. If you think back to our previous article on data types and common list operations, you’ll realise that this is easily done with a slice. This is just one line:
    for file in sys.argv[1:]:
Because operating on all the files passed as arguments to a program is such a common operation, Python provides a shortcut for doing this in the Standard Library, called fileinput. To use this shortcut, you must first import it by putting import fileinput at the top of your code. You will then be able to use it to recreate the rest of our cat program so far, as follows:
    for line in fileinput.input():
        print(line, end="")
This shortcut function takes care of opening each file in turn and then making all their lines accessible through a single iterator. That’s about all that we have space for in this tutorial. Although there’s not been much code in this example, we hope you’ve started to get a sense for how much is available in Python’s Standard Library (and therefore how much work is available for you to recycle), and how a good knowledge of its contents can save you a lot of work when implementing new programs.
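To make the two approaches concrete, here is a minimal sketch of both: a manual loop over a list of file names (standing in for sys.argv[1:]) and the fileinput version. The function names and the explicit paths argument are hypothetical, added so that the snippet can be tried without a command line. When fileinput.input() is called with no arguments, it falls back to sys.argv[1:], or to standard input if that is empty.

```python
import fileinput
import sys

def cat_manual(paths, out=sys.stdout):
    """Open each named file in turn, as with: for file in sys.argv[1:]"""
    for path in paths:
        with open(path, "r") as f:
            for line in f:
                print(line, end="", file=out)

def cat_fileinput(paths, out=sys.stdout):
    """Same result, but fileinput chains all the files into one iterator."""
    with fileinput.input(files=paths) as lines:
        for line in lines:
            print(line, end="", file=out)
```

Both functions should produce identical output for the same list of files; the second simply leaves the opening and closing to the Standard Library.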
Code concepts: Add features Jonathan Roberts’ tour of the Python programming language continues, as we write a clone of the Unix cat command.
Last time, we showed you how to build a simple cat clone in Python. In this tutorial, we’re going to add some more features to our program, including the ability to read from the standard input pipe, just like the real cat, and the ability to pass options to your cat clone. So, without further delay, let’s dive in. Fortunately, you already know everything you need to interact with the standard input pipe. In Linux, all pipes are treated just like files: you can pass a file as an argument to a command, or you can pass a pipe as an argument – it doesn’t matter which you do, because they’re basically the same thing. In Python, the same is true. All you need to get to work with the standard input pipe is access to the sys module, which if you followed along last time you already have. Let’s write a little sample program first to demonstrate:
    import sys
    for line in sys.stdin:
        print(line, end="")
The first line imports the sys module. The lines that follow are almost identical to those we had last time. Rather than specifying the name of a file, however, we specified the name of the file object, stdin, which is found inside the sys module. Just like a real file, in Python the standard input pipe is an iterable object, so we use a for loop to walk through each line. You might be wondering how this works, however, since standard input starts off empty. If you run the program, you’ll see what happens. Rather than printing out everything that’s present straight away, it will simply wait. Every time a new line character is passed to standard input (by pressing return), it will then print everything that came before it to standard output.
Right, now we have two modes that our program can operate in, but we need to put them together into a single program. If we call our program with arguments, we want it to work like last time – that is, by concatenating the files’ contents together; if it’s called without any arguments, we want our program to work by repeating each line entered into standard input. We could easily do this with what we know so far: simply check to see what the length of the sys.argv list is. If it’s greater than 1, do last lesson’s version, otherwise do this version:
    if len(sys.argv) > 1:
        [last month...]
    else:
        [this month...]
Pretty straightforward. The only point of interest here is the use of the len() function, seeing as we’re on a journey to discover different Python functions. This function is built in to Python, and can be applied to any type of sequence object (a string, tuple or list) or a mapping (like a dictionary), and it always tells you how many elements are in that object. There are more useful functions like this, which you can find at http://docs.python.org/3/library/functions.html.
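Here is one way the two placeholders might be filled in. It is only a sketch: the function name cat and the stdin and out parameters are our own assumptions, included so that the two modes can be exercised without a real terminal.

```python
import sys

def cat(argv, stdin=sys.stdin, out=sys.stdout):
    """Concatenate the files named in argv[1:], or echo stdin if none are given."""
    if len(argv) > 1:
        # Last month's mode: argv[0] is the program name, so skip it.
        for path in argv[1:]:
            with open(path, "r") as f:
                for line in f:
                    print(line, end="", file=out)
    else:
        # This month's mode: the pipe is iterated over just like a file.
        for line in stdin:
            print(line, end="", file=out)
```

A real script would finish by calling cat(sys.argv), so that running it with no file names leaves it waiting for lines typed at the keyboard.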
“Python provides us with a much more powerful alternative to sys.argv.”
The Python language comes with all the bells and whistles you need to write useful programs. In this example, you can see the replace method applied to a string in order to remove all white space – in the tutorial, we used the rstrip method for a similar purpose.
The Python 3 website provides excellent documentation for a wealth of built-in functions and methods. If you ever wonder how to do something in Python, docs.python.org/3/ should be your first port of call.
Parsing arguments and options
This is quite a simplistic approach, however, and Python provides us with a much more powerful alternative to sys.argv. To demonstrate, we’re going to add two options to our program that will modify the output generated by our program. You may not have realised it, but cat does in fact have a range of options. We’re going to implement -E, which shows dollar symbols at the end of lines, and -n, which displays line numbers at the beginning of lines. To do this, we’ll start by setting up an OptionParser. This is a special object, provided as part of the optparse module, that will do most of the hard work for you. As well as automatically detecting options and arguments, OptionParser will automatically generate help text for your users in the event that they use your program incorrectly or pass --help to it, like this:
    [jon@LT04394 ~]$ ./cat.py --help
    Usage: cat.py [OPTION]... [FILE]...
    Options:
      -h, --help  show this help message and exit
      -E          Show $ at line endings
      -n          Show line numbers
just like a real program! To get started with OptionParser, first import the necessary components:
    from optparse import OptionParser
You may notice that this looks a bit different to what we saw before. Instead of importing the entire module, we’re only importing the OptionParser object. Next, you need to create a new instance of the object, add some new options for it to detect with the add_option method, and pass it a usage string to display:
    usage = "usage: %prog [option]... [file]..."
    parser = OptionParser(usage=usage)
    parser.add_option("-E", dest="showend", action="store_true", help="Show $ at line endings")
    parser.add_option("-n", dest="shownum", action="store_true", help="Show line numbers")
The %prog part of the usage string will be replaced with the name of your program. The dest argument specifies what name you’ll be able to use to access the value of an argument once the parsing has been done, while the action specifies what that value should be. In this case, the action store_true says to set the dest variable to True if the argument is present; if it isn’t, the variable keeps its default value, None, which Python treats as false. You can read about other actions at http://docs.python.org/3/library/optparse.html. Finally, with everything set, you just need to parse the arguments that were passed to your program and assign the results to two variables:
    (options, args) = parser.parse_args()
The options variable will contain all user-defined options, such as -E or -n, while args will contain all positional arguments left over after parsing out the options. You can call these variables whatever you like, but the two will always be set in the same order, so don’t confuse yourself by putting the variables the other way around!
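The pieces above can be gathered into a short sketch. The two helper functions, build_parser and parse, are hypothetical names of our own; parse_args normally reads sys.argv[1:], but it also accepts an explicit list, which is what we pass here so the behaviour can be checked without a command line.

```python
from optparse import OptionParser

def build_parser():
    """Set up the parser with the -E and -n options described in the text."""
    usage = "usage: %prog [option]... [file]..."
    parser = OptionParser(usage=usage)
    parser.add_option("-E", dest="showend", action="store_true",
                      help="Show $ at line endings")
    parser.add_option("-n", dest="shownum", action="store_true",
                      help="Show line numbers")
    return parser

def parse(argument_list):
    """Parse an explicit list of arguments in place of sys.argv[1:]."""
    options, args = build_parser().parse_args(argument_list)
    return options, args

options, args = parse(["-E", "hello.txt", "world.txt"])
print(options.showend)  # True
print(options.shownum)  # None (the option was not given)
print(args)             # ['hello.txt', 'world.txt']
```

Note that args contains only the file names: unlike sys.argv, it has no program name in first position.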
With the argument-parsing code written, you’ll next want to start implementing the code that will run when a particular option is set. In both cases, we’ll be modifying the string of text that’s output by the program, which means you’ll need to know a little bit about Python’s built-in string editing functions. Let’s think about the -E, or showend, option first. All we want this to do is remove the invisible line break that’s at the end of every line of a file (or every line of the standard input pipe, as implied by pressing return), and replace it with a dollar symbol followed by a line break. The first part, removing the existing new line, can be achieved with the string.rstrip() method. By default, this removes all white space characters at the right-hand edge of a string. If you pass a string to it as an argument, it will strip those characters from the right-hand edge instead of white space. In our case, just white space will do.
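A few quick experiments show the difference between the two ways of calling rstrip. The helper function names below are our own, used purely for illustration.

```python
def strip_line_end(line):
    """Remove all trailing white space, including the newline."""
    return line.rstrip()

def strip_newline_only(line):
    """Remove only trailing newline characters, keeping other white space."""
    return line.rstrip("\n")

print(repr(strip_line_end("hello world  \n")))      # 'hello world'
print(repr(strip_newline_only("hello world  \n")))  # 'hello world  '
print(repr("path/to/dir///".rstrip("/")))           # 'path/to/dir'
```

The default form is the simplest, but it also discards any trailing spaces or tabs, which may or may not matter for your own programs.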
Completing the job
The second part of the job is as simple as setting the end argument of the print function to the string "$\n", and the job is almost complete. We say almost complete because we still need to write some more logic to further control the flow of the program based on what options were set, as well as whether or not any arguments are passed. The thing is, this logic needs to be a bit more complicated than it ordinarily would be because we need to maintain a cumulative count of lines that have been printed as the program runs to implement the second option, -n, or shownum. While there are several ways you could achieve this, in the next tutorial we’re going to introduce you to a bit of object-oriented programming in Python and implement this functionality in a class. We’ll also introduce you to a very important Python convention – the main() function and the __name__ variable. In the meantime, you can keep yourself busy by investigating the string.format() method and see if you can figure out how you can add a number to the beginning of each line.
“Don’t confuse yourself by putting the variables the other way around!”
Code concepts: Put it all together Jonathan Roberts’ guide to the Python programming language continues. In this tutorial, we’re going to finish our clone of cat.
We’ve come quite a long way over the last two tutorials, having implemented the ability to echo the contents of multiple files to the screen, the ability to echo standard input to the screen and the ability to detect and act upon options passed by the user of our program. All that remains is for us to implement the line number option and to gather together everything else we’ve written into a single, working program.
“It’s a very natural way of thinking because it mirrors the real world.”
Objects
Last time, we ended by saying that there are many ways we could implement the line counting option in our program. We’re going to show you how to do it in an object-oriented style, as it gives us an excuse to introduce you to this aspect of Python programming. You could, however, with a bit of careful thought, implement the same function with some nested for loops, although they’re not nearly as readable as object-oriented code! When building complicated programs, it can be challenging to organise them so that they remain easy to read, so that it’s easy to track which variables are being used by which functions, and so that they’re easy to update, extend or add new features to. To make this easier, there are various ‘paradigms’ that provide techniques for managing complexity. One of these paradigms is object-oriented programming. In object-oriented programming, the elements of the program are broken down into objects. These contain state (that is, variables that describe the current condition of the object) and methods that allow us to perform actions on those variables or with that object. It’s a very natural way of thinking, because it mirrors the real world so closely. I can describe a set of properties about my hand, such as having five fingers that are in certain locations, and I can describe certain methods or things I can do with my hand, such as moving one finger to press a key, or holding a cup. My hand is an object, complete with state and methods that let me work with it. We’re going to turn our cat program into an object, where its state records how many lines have been displayed, and its methods perform the action of the cat program – redisplaying file contents to the screen.
Python objects
Just to prove that it works, here’s our cat implementation, with all of the options being put to use.
Python implements objects through a class system. A class is a template, and an object is a particular instance of that class, modelled on the template. We define a new class with a keyword, much like we define a new function:
    class catCommand:
Inside the class, we specify the methods (functions) and state that we want to associate with every instance of the object. There are some special methods, however, that are often used. One of these is the __init__ method. This is run when the class is first instantiated into a particular object, and allows you to set specific variables that you want to belong to that object.
        def __init__(self):
            self.count = 1
In this case, we’ve assigned 1 to the count variable, and we’ll be using this to record how many lines have been displayed. You probably noticed the self variable, passed as the first argument to the method, and wondered what on Earth that was about. Well, it is the main distinguishing feature between methods and ordinary functions. Methods, even those with no other arguments, must have the self variable. It is an automatically populated variable that will always point to the particular instance of the object that you’re working with. So self.count is a count variable that’s exclusive to individual instances of the catCommand class.
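To see that each instance really does get its own count, here is a small sketch. The bump method is a hypothetical helper added only for this demonstration; it is not part of the finished cat program.

```python
class catCommand:
    def __init__(self):
        # Runs once for every new object, so each instance gets its own counter.
        self.count = 1

    def bump(self):
        # Hypothetical helper: self is filled in automatically with the instance.
        self.count += 1
        return self.count

first = catCommand()
second = catCommand()
first.bump()
first.bump()
print(first.count, second.count)  # 3 1
```

Calling first.bump() passes first as self behind the scenes, which is why the method is defined with one argument but called with none.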
The run method
We next need to write a method that will execute the appropriate logic depending on whether certain options are set. We’ve called this the run method:
    def run(self, i, options):
        #set default options
        e = ""
        for line in i:
            #modify printed line according to options
            if options.showend:
                [...last month]
            if options.shownum:
                line = "{0} {1}".format(self.count, line)
            self.count += 1
            print(line, end=e)
Notice that we’ve passed the self variable to this method, too. The two other arguments passed to this function are arguments that we’ll pass when we call the method later on, just like with a normal function. The first, i, is going to be a reference to whichever file is being displayed at this moment, while the options variable is a reference to the options decoded by the optparse module. The logic after that is fairly clear: for each line in the current file, modify the line depending on what options have been set. Either we do as last tutorial, and modify the end character to be "$\n", or we modify the line, using the .format method that we suggested you research last time, to put the count variable, defined in the __init__ method, in front of the rest of the line. We then increment the count and print the line. The most important part is the use of self. It lets us refer to variables stored within the current instance of the object. Because the count is stored as part of the object, it will persist after the current execution of the run method ends. As long as we use the run method attached to the same object each time we cat a new file in the argument list, the count will remember how many lines were displayed in the last file, and continue to count correctly. It might seem more natural, given the description of methods as individual actions that can be taken by our objects, to split each option into a different method, and this is a fine way to approach the problem.
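Below is one possible way to fill in the elided showend branch, based on the rstrip and "$\n" description from the previous tutorial. Treat it as a sketch rather than the original listing: the out parameter and the FakeOptions class are assumptions of ours, standing in for standard output and for the options object that optparse returns.

```python
import io
import sys

class catCommand:
    def __init__(self):
        self.count = 1

    def run(self, i, options, out=sys.stdout):
        for line in i:
            e = ""                      # default: the line keeps its own newline
            if options.showend:
                line = line.rstrip()    # drop the existing line break...
                e = "$\n"               # ...and print "$" plus a new one instead
            if options.shownum:
                line = "{0} {1}".format(self.count, line)
            self.count += 1
            print(line, end=e, file=out)

class FakeOptions:
    """Stand-in for the optparse options object (an assumption for testing)."""
    showend = True
    shownum = True

c = catCommand()
result = io.StringIO()
c.run(io.StringIO("one\ntwo\n"), FakeOptions(), result)
c.run(io.StringIO("three\n"), FakeOptions(), result)   # the count carries on
print(result.getvalue(), end="")
# 1 one$
# 2 two$
# 3 three$
```

Because both calls to run use the same object, c, the numbering continues from the first "file" into the second.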
The reason we’ve done it this way is we found it meant we could re-use more code, making it more readable and less error-prone. Now all that’s left to do is to tie everything together. We’re going to do this by writing a main function. This isn’t required in Python, but many programs follow this idiom, so we will too:
    def main():
        [option parsing code ...]
        c = catCommand()
        if len(args) > 0:
            for a in args:
                f = open(a, "r")
                c.run(f, options)
        else:
            c.run(sys.stdin, options)
We’ve not filled in the option parsing code from last time, because that hasn’t changed. What’s new is the c = catCommand() line. This is how we create an instance of a class – how we create a new object. The c object now has a variable, count, that is accessible by all its methods as the self.count variable. This is what will allow us to track line numbers. We then check to see whether any arguments have been passed. Remember that args holds only the arguments left over after option parsing, so unlike sys.argv it doesn’t include the program’s name. If there are any, we call the run method of the object c for each file that was passed as an argument, passing in any options extracted by optparse along the way. If there aren’t any arguments, we simply call the run method with sys.stdin instead of a file object. The last thing we need to do is actually call the main function when the program is run:
    if __name__ == "__main__":
        main()
These last two lines are the oddest of all, but quite useful in a lot of circumstances. The __name__ variable is special – when the program is run on the command line, or otherwise as a standalone application, it is set to "__main__"; when it’s imported as an external module to other Python programs, it’s set to the module’s name instead. This way, we can automatically execute main when run as a standalone program, but not when importing it as a module.
The completed program isn’t very long, but it has given us a chance to introduce you to many different aspects of the Python language.
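If you would like to watch the __name__ test at work, the following sketch writes a tiny throwaway program to a temporary directory and executes it twice using the Standard Library’s runpy module: once under the name "__main__", as if it had been launched from the command line, and once under a module-style name, roughly as an import would. The file name demo_cat.py and the calls list are hypothetical, invented for this demonstration.

```python
import os
import runpy
import tempfile

SOURCE = '''\
calls = []

def main():
    calls.append("main ran")

if __name__ == "__main__":
    main()
'''

path = os.path.join(tempfile.mkdtemp(), "demo_cat.py")
with open(path, "w") as f:
    f.write(SOURCE)

# Executed as a standalone program: __name__ is "__main__", so main() is called.
as_script = runpy.run_path(path, run_name="__main__")
# Executed under another name, much like an import: main() is skipped.
as_module = runpy.run_path(path, run_name="demo_cat")

print(as_script["calls"])  # ['main ran']
print(as_module["calls"])  # []
```

The same source file behaves differently in the two runs, and the only thing that changed is the value of __name__.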
“The last thing to do is call the main function when the program runs.”
Code concepts: Data modules Jonathan Roberts introduces a way of untangling the mess of your code and adding structure to your programs.
In the last few code concept tutorials, we’ve mentioned how programming is all about managing complexity, and we’ve introduced you to quite a few of the tools and techniques that help programmers do this. From variables to function definitions or object orientation – they all help. One tool we’ve yet to cover, in part because you don’t start to come across it until you’re writing larger programs, is the idea of modules and name spaces. Yet, if you’ve written a program of any length, even just a few hundred lines, this is a tool that you’re no doubt desperate for. In a long file of code, you’ll have noticed how quickly it becomes more difficult to read it. Functions seem to blur into one another, and when you’re trying to hunt down the cause of the latest error, you find it difficult to remember exactly where you defined that all-important variable. These problems are caused by a lack of structure. With all your code in a single file, it’s harder to determine the dependencies between elements of your program – that is, which parts rely on each other to get work done – and it’s harder to visualise the flow of data through your program. As your programs grow in length, other problems also occur. For instance, you may find yourself with naming conflicts, as two different parts of your program require functions called add (adding integers or adding fractions in a mathematics program, for example), or you may have written a useful function that you want to share with other programs that you’re writing, and the only tool you have to do that is boring and error-prone copy and pasting.
“As your programs grow in length, other problems also occur.”
Untangling the mess Modules are a great way to solve all of these problems, letting you put structure back in to your code, enabling you to avoid naming conflicts, and making it easier for you to share useful chunks of code between programs. You’ve no doubt been using them all the time in your code as you’ve relied on Python built-in or third-party modules to provide lots of extra functionality. As an example, remember the optparse module we used on page 22. We included it in our program with the import statement, like so: import optparse After putting this line at the top of our Python program, we magically got access to a whole load of other functions that automatically parsed command line options. We could access
them by typing the module’s name, followed by the name of the function we wanted to execute: optparse.OptionParser() This was great from a readability perspective. In our cat clone, we didn’t have to wade through lots of code about parsing command line arguments; instead we could focus on the code that dealt with the logic of echoing file contents to the screen and to the standard output pipe. What’s more, we didn’t have to worry about using names in our own program that might collide with those in the optparse module, because they were all hidden inside the optparse namespace, and reusing this code was as easy as typing import optparse – no messy copy and pasting here.
How modules work
Modules sound fancy and you might think they’re complicated, but – in Python at least – they’re really just plain old files. You can try it out for yourself. Create a new directory and inside it create a fact.py file. Inside it, define a function to return the factorial of a given number:
    def factorial(n):
        result = 1
        while n > 0:
            if n == 1:
                result *= 1
            else:
                result *= n
            n -= 1
        return result
Then, create a second Python file called doMath.py. Inside this, first import the module you just created and then execute the factorial function, printing the result to the screen:
    import fact
    print(fact.factorial(5))
Now, when you run the doMath.py file, you should see 120 printed on the screen. You should notice that the name of the module is just the name of the file, in the same directory, with the extension removed. We can then call any function defined in that module by typing the module’s name, followed by a dot, followed by the function name.
By splitting your code up into smaller chunks, each placed in its own file and directory, you can bring order to your projects and make future maintenance easier.
The Python path
The big question that’s left is, how does Python know where to look to find your modules? The answer is that Python has a pre-defined set of locations that it looks in to find files that match the name specified in your import statements. It first looks inside all of the built-in modules, the location of which are defined when you install Python; it then searches through a list of directories known as the path. This path is much like the Bash shell’s $PATH environment variable: it uses the same syntax, and serves much the same function. It varies, however, in how the contents of the Python path are generated. Initially, the path consists of the following locations:
- The directory containing the script doing the importing.
- The directories listed in the PYTHONPATH environment variable, if it is set, followed by a set of directories predefined by your Python installation.
You can inspect the path in your Python environment by importing the sys module, and then inspecting the path attribute (typing sys.path will do the trick). Once a program has started, it can even modify the path itself and add other locations to it.
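The following sketch ties the last two sections together: it creates a module file in a fresh temporary directory, adds that directory to sys.path while the program is running, and then imports the module by name. The module name fact_demo is hypothetical, chosen to avoid clashing with any fact.py you have already made, and the factorial inside it is a shortened variant written for this example.

```python
import importlib
import os
import sys
import tempfile

# Create a new directory containing a one-function module file.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "fact_demo.py"), "w") as f:
    f.write(
        "def factorial(n):\n"
        "    result = 1\n"
        "    for i in range(2, n + 1):\n"
        "        result *= i\n"
        "    return result\n"
    )

# The new directory is not on the path yet, so add it at run time.
sys.path.append(module_dir)
importlib.invalidate_caches()   # make sure the import system notices the new file

import fact_demo                # found by searching the directories in sys.path
print(fact_demo.factorial(5))   # 120
```

Printing sys.path before and after the append is a quick way to see exactly where Python will look.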
Variable scope in modules
Before you head off and start merrily writing your own modules, there’s one more thing you need to know about: variable scope. We’ve no doubt talked about scope as a concept before, but as a quick refresher, scope refers to the part of a program from which particular variables can be accessed. For instance, a single Python module might contain the following code:
    food = ["apples", "oranges", "pears"]
    print(food)
    def show_choc():
        food = ["snickers", "kitkat", "dairy milk"]
        print(food)
    show_choc()
    print(food)
If you run that, you’ll see that outside the function the variable food refers to a list of fruit, while inside the function, it refers to a list of chocolate bars. This small program demonstrates two different scopes: the global scope of the current module, in which food refers to a list of fruit, and the local scope of the function, in which food refers to a list of chocolate. When looking up a variable, Python starts with the innermost scope and works its way out, starting with the immediately enclosing function, and then any functions enclosing that, and then the module’s global scope, and then finally it will look at all the built-in names. In simple, single-file programs, it’s a bad idea to put variables in the global scope. It can cause confusion and subtle errors elsewhere in your program. Modules help with this problem, because each module has its own global scope. As we saw above, when we import a module, its contents are all stored as attributes of the module’s name, accessed via dot notation. This makes global variables less troublesome, although you should still be careful when using them.
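The lookup order described above (local, enclosing function, module global, built-in) can be tried out with a few small functions. Apart from food and show_choc, the names here are our own, invented for illustration.

```python
food = ["apples", "oranges", "pears"]            # global (module) scope

def show_choc():
    food = ["snickers", "kitkat", "dairy milk"]  # local: hides the global
    return food

def show_fruit():
    # No local variable called food, so the lookup reaches the module's global.
    return food

def outer():
    food = ["bread"]
    def inner():
        # Not local to inner, so Python tries the enclosing function next.
        return food
    return inner()

def count_fruit():
    # len is found last of all, among the built-in names.
    return len(food)

print(show_choc()[0], show_fruit()[0], outer()[0], count_fruit())
# snickers apples bread 3
```

Notice that assigning to food inside show_choc or outer never changes the module-level list.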
Python style
While many people think of Python as a modern language, it’s actually been around since the early 1990s. As with any programming language that’s been around for any length of time, people who use it often have learned a lot about the best ways to do things in the language – in terms of the easiest way to solve common problems, and the best ways to format your code to make sure it’s readable for co-workers and anyone else working on the code with you (including your future self!). If you’re interested in finding out more about these best practices in Python, there are two very useful resources from which you can start learning:
- http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
- www.python.org/dev/peps/pep-0008
Read these and you’re sure to gain some deeper insight into the language.
Code concepts: Data storage Jonathan Roberts explains how to deal with persistent data and store your files in Python.
Storage is cheap: you can buy a 500GB external hard drive for less than £40 these days, and even smartphones come with at least 8GB of storage, and many are easily expandable up to 64GB for only the price of a pair of jeans. It’s no surprise, then, that almost every modern application stores data in one way or another, whether that’s configuration data, cached data to speed up future use, saved games, to-do lists or photos. The list goes on and on. With this in mind, this Code Concepts is going to demonstrate how to deal with persistent data in our language of choice – Python. The most obvious form of persistent storage that you can take advantage of in Python is file storage. Support for it is included in the standard library, and you don’t even have to import any modules to take advantage of it! To open a file in the current working directory (that is, wherever you ran the Python script from, or wherever you were when you launched the interactive shell), use the open() function:
    file = open("lxf-test.txt", "w")
The first argument to open is the filename, while the second specifies which mode the file should be opened in – in this case, write, but other valid options include read-only (r) and append (a). In previous issues, we’ve shown you that this file object is in fact an iterator, which means you can use the in keyword to loop through each line in the file and deal with its contents, one line at a time. Before reviewing that information, however, let’s look at how to write data to a file.
“Almost every modern application stores data in one way or another.”
Writing to files
Suppose you’re writing your own RSS application to replace Google Reader. You’ve already got some way to ask users to enter in a list of feeds (perhaps using raw_input(), which is called input() in Python 3, or perhaps using a web form and CGI), but now you want to store that list of feeds on disk so you can re-use it later when you’re checking for new updates. At the moment, the feeds are just stored in a Python list:
    feeds = ["http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",
             "http://www.tuxradar.com/rss"]
To get the feeds into the file is a simple process. Just use the write method:
    for feed in feeds:
        file.write("{0}\n".format(feed))
Easy! Notice how we used the format string method to add a new line to the end of each string, otherwise we’d end up with everything on one line – which would have made it harder to use later. Re-using the contents of this file would be just as simple. Using the file as an iterator, load each line in turn into a list, stripping off the trailing new line character. We’ll leave you to figure this one out. When using files in your Python code, there are two things that you need to keep in mind. The first is that you need to convert whatever you want to write to the file to a string first. This is easy, though, since you can just use the built-in str() function, eg, str(42) => "42". The second is that you have to close the file after you’ve finished using it – if you don’t do this, you risk losing data that you thought had been committed to disk, but that had not yet been flushed. You can do this manually with the close method of file objects. In our example, this would translate to adding file.close() to our program. It’s better, however, to use the with keyword:
    with open("lxf-test.txt", "r") as file:
        feeds = [line.rstrip("\n") for line in file]
This simple piece of Python handles opening the file object and, when the block inside the with statement is finished, automatically closes it for us, too! If you’re unsure what the second line does, look up Python list comprehensions; they’re a great way to write efficient, concise code and to bring a little bit of functional style into your work.
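Putting the writing and reading halves together gives a complete round trip. This is a minimal sketch: it saves the file in a temporary directory rather than the current one, purely so that trying it out leaves no clutter behind, and the variable names path and loaded are our own.

```python
import os
import tempfile

feeds = ["http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/front_page/rss.xml",
         "http://www.tuxradar.com/rss"]

path = os.path.join(tempfile.mkdtemp(), "lxf-test.txt")

# Write one feed per line; the with block closes (and flushes) the file for us.
with open(path, "w") as file:
    for feed in feeds:
        file.write("{0}\n".format(feed))

# Read the lines back, stripping the newline that was added when writing.
with open(path, "r") as file:
    loaded = [line.rstrip("\n") for line in file]

print(loaded == feeds)  # True
```

Note the two different modes: "w" to create the file, then "r" to read it back.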
Serialising
Working with files would be much easier if you didn’t have to worry about converting your list (or dictionary, for that matter) into a string first of all – for dictionaries in particular, this could get messy. Fortunately, Python provides two tools to make this easier. The first of these tools is the pickle module. Pickle accepts many different kinds of Python objects, and can then convert them to a byte string and back again. You still have to do the file opening and closing, but you no longer have to worry about figuring out an appropriate string representation for your data:
    import pickle
    ...
    with open("lxf-test.txt", "wb") as file:
        pickle.dump(feeds, file)
    ...
    with open("lxf-test.txt", "rb") as file:
        feeds = pickle.load(file)
Note the b in the file modes: in Python 3, pickle reads and writes bytes, so the file must be opened in binary mode.
If you’re interested in persistent data in Python, a good next stopping point is the ZODB object database. It’s much easier and more natural in Python than a relational database engine (www.zodb.org).
This is much easier, and it has other applications outside of persisting data in files, too. For example, if you wanted to transfer your feed list across the network, you would first have to turn it into a string of bytes, too, which you could do with pickle. The problem with this is that it will only work in Python – that is to say, other programming languages don’t support the pickle data format. If you like the concept of pickling (more generically, serialising), there’s another option that does have support in other languages, too: JSON. You may have heard of JSON – it stands for JavaScript Object Notation, and is a way of converting objects into human-readable string representations, which look almost identical to objects found in the JavaScript programming language. It’s great, because it’s human readable, and also because it’s widely supported in many different languages, largely because it’s become so popular with fancy web 2.0 applications. In Python, you use it in almost exactly the same way as pickle – in the above example, replace pickle with json throughout and open the file in text mode (“w” to write, “r” to read), and you’ll be writing interoperable, serialised data!
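Here is a side-by-side sketch of the two serialisers under Python 3. The file names feeds.pickle and feeds.json and the temporary directory are assumptions made for the example; the dump, load and dumps calls are the standard ones from each module. One caution that applies to pickle in general: only load pickled data from sources you trust, because unpickling can run arbitrary code.

```python
import json
import os
import pickle
import tempfile

feeds = ["http://www.tuxradar.com/rss"]
tmp = tempfile.mkdtemp()

# pickle: a Python-only binary format, so the file is opened with "wb"/"rb".
pickle_path = os.path.join(tmp, "feeds.pickle")
with open(pickle_path, "wb") as file:
    pickle.dump(feeds, file)
with open(pickle_path, "rb") as file:
    from_pickle = pickle.load(file)

# json: a text format other languages can read, so plain "w"/"r" is used.
json_path = os.path.join(tmp, "feeds.json")
with open(json_path, "w") as file:
    json.dump(feeds, file)
with open(json_path, "r") as file:
    from_json = json.load(file)

# The in-memory variants: dumps() returns bytes for pickle and str for json.
print(type(pickle.dumps(feeds)).__name__, type(json.dumps(feeds)).__name__)  # bytes str
print(json.dumps(feeds))  # ["http://www.tuxradar.com/rss"]
```

The dumps variants are the ones you would reach for when sending data across a network rather than saving it to a file.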
Shelves Of course, some code bases have many different objects that you want to store persistently between runs, and keeping track of many different pickled files can get tricky. There’s another Python standard module, however, that uses Pickle underneath, but makes access to the stored objects more intuitive and convenient: the Shelve module. Essentially, a shelf is a persistent dictionary – that is to say, a persistent way to store key-value pairs. The great thing about shelves, however, is that the value can be any Python object that Pickle can serialise. Let’s take a look at how you can use it. Thinking back to our RSS reader application, imagine that as well as the list of feeds to check, you wanted
to keep track of how many unread items each feed had, and which item was the last to be read. You might do this with a dictionary, eg:
tracker = {
    "bbc.co.uk": {"last-read": "foo", "num-unread": 10},
    "tuxradar.co.uk": {"last-read": "bar", "num-unread": 5},
}
You could then store the list of feeds and the tracking details for each in a single file by using the shelve module, like so:
import shelve
shelf = shelve.open("lxf-test")
shelf["feeds"] = feeds
shelf["tracker"] = tracker
shelf.close()
There are a few important things that you should be aware of about the shelve module:
The shelve module has its own operations for opening and closing files, so you can’t just use the standard open function.
To save some data to the shelf, you must first use a standard Python assignment operation to set the value of a particular key to the object you want to save.
As with files, you must close the shelf object once you’ve finished with it, otherwise your changes may not be stored.
Accessing data inside the shelf is just as easy. Rather than assigning a value to a key in the shelf, you assign the value stored at that key to a variable: feeds = shelf["feeds"]. If you want to modify the data that was stored in the shelf, modify it in the temporary variable you assigned it to, then re-assign that variable back to the shelf before closing it again. That’s about all we have space for in this tutorial, but keep reading, as we’ll discuss one final option for persistent data: relational databases (eg, MySQL).
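The read, modify and re-assign cycle described above can be sketched as follows. The temporary directory is an assumption made so the example cleans up after itself; the feeds list is a stand-in for the one built earlier in the tutorial.

```python
import os
import shelve
import tempfile

feeds = ["bbc.co.uk", "tuxradar.co.uk"]  # stand-in feed list
tracker = {
    "bbc.co.uk": {"last-read": "foo", "num-unread": 10},
    "tuxradar.co.uk": {"last-read": "bar", "num-unread": 5},
}

# Keep the demo shelf in a throwaway directory (an assumption for this sketch).
shelf_path = os.path.join(tempfile.mkdtemp(), "lxf-test")

# Store both objects under their own keys, then close to write them out.
shelf = shelve.open(shelf_path)
shelf["feeds"] = feeds
shelf["tracker"] = tracker
shelf.close()

# Later: read a value out, change the copy, and assign it back.
shelf = shelve.open(shelf_path)
tracker_copy = shelf["tracker"]
tracker_copy["bbc.co.uk"]["num-unread"] = 0
shelf["tracker"] = tracker_copy  # without this line the change is not saved
shelf.close()

# Reopen once more to confirm what was actually stored.
shelf = shelve.open(shelf_path)
saved_unread = shelf["tracker"]["bbc.co.uk"]["num-unread"]
saved_feeds = shelf["feeds"]
shelf.close()
print(saved_unread, saved_feeds)
```

Re-assigning the whole value is the simplest approach; changing a nested dictionary in place on the shelf itself would not be written back by default.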
Code concepts: Data organisation Jonathan Roberts uses SQL and a relational database to add some structure to his extensive 70s rock collection.
In the last tutorial, we looked at how to make data persistent in your Python programs. The techniques we looked at were flat-file based, and as useful as they are, they’re not exactly industrial scale. As your applications grow more ambitious, as performance becomes more important, or as you try to express more complicated ideas and relationships, you’ll need to look towards other technologies, such as an object database or even a relational database. As relational databases are by far the most common tool for asking complex questions about data today, in this tutorial we’re going to introduce you to the basics of relational databases and the language used to work with them (which is called SQL, or Structured Query Language). With the basics mastered, you’ll be able to start integrating relational databases into your code. To follow along, make sure you’ve got MySQL or one of its drop-in replacements installed and can get access to the MySQL console:
mysql -uroot
If you’ve set a password, use the -p switch to give that as well as the username. Throughout, we’ll be working on a small database to track our music collection.
“Relational databases are used to ask complex questions about data.”
Relationships
Let’s start by thinking about the information we want to store in our music collection. A logical place to start might be thinking about it in terms of the CDs that we own. Each CD is a single album, and each album can be described by lots of other information, or attributes, including the artist who created the album and the tracks that are on it. We could represent all of this data in one large, homogeneous table – like the one below – which is all well
Duplicated data
Album         Artist  Track               Track Time  Album Time  Year  Band Split
Free At Last  Free    Little Bit of Love  2:34        65:58       1972  1973
Free At Last  Free    Travellin’ Man      3:23        65:58       1972  1973
Relational database
Album Name    Running Time  Year  Artist_id
Free At Last  65:58         1972  1
and good, but very wasteful. For every track on the same album, we have to duplicate all the information, such as the album name, its running time, the year it was published, and all the information about the artist, too, such as their name and the year they split. As well as being wasteful with storage space, this also makes the data slower to search, harder to interpret and more dangerous to modify later. Relational databases resolve these problems by letting us split the data and store it in a more efficient, useful form. They enable us to identify separate entities within the database that would benefit from being stored in independent tables. In our example, we might split information about the album, artist and track into separate tables. We would then only need to have a single entry for the artist Free (storing the name and the year they split), a single entry for the album Free At Last (storing its name, the year published and the running time), and a single entry for each track in the database (storing everything else) in each of their respective tables. All that duplication is gone, but now all the data has been separated, what happens when you want to report all the information about a single track, including the artist who produced it and the album it appeared on? That’s where the ‘relational’ part of relational database comes in. Every row within a database table must in some way be unique, either based on a single unique column (eg unique name for an artist, or unique title for an album), or a combination of columns (eg album title, year published). These unique columns form what is known as a primary key. Where a natural primary key (a natural set of unique columns) doesn’t exist within a table, you can easily add an artificial one in the form of an automatically incrementing integer ID. We can then add an extra column to each of our tables that references the primary key in another table. For example, consider the table above. 
Here, rather than giving all the information about the artist in the same table, we’ve simply specified the unique ID for a row in another table, probably called Artist. When we want to present this album to a user, in conjunction with information about the artist who published
it, we can get the information first from this Album table, and then retrieve the information about the artist, whose ID is 1, from the Artist table, combining it together for presentation.
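Before any SQL is involved, the idea of following an ID from one table to another can be sketched with plain Python dictionaries. This is only an analogy: the dictionary names and the describe_album helper are made up for illustration, and a real database does this work for you.

```python
# Each "table" is a dictionary keyed by an integer ID (the primary key).
artists = {
    1: {"name": "Free", "band_split": 1973},
}
albums = {
    1: {"name": "Free At Last", "running_time": "65:58", "year": 1972, "artist_id": 1},
}


def describe_album(album_id):
    """Combine an album row with the artist row it references."""
    album = albums[album_id]
    artist = artists[album["artist_id"]]  # follow the reference to the other table
    return f'{album["name"]} ({album["year"]}) by {artist["name"]}'


print(describe_album(1))
```

The artist details live in one place only, so correcting the year the band split means changing a single entry rather than every track.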
SQL
That, in a nutshell, is what relational databases are all about: splitting information into manageable, reusable chunks of data, and describing the relationships between those chunks. To create these tables within the database, to manage the relationships, and to insert and query data, most relational databases make use of SQL, and now that you know what a table and a relationship are, we can show you how to use SQL to create and use your own. After logging into the MySQL console, the first thing we need to do is create a database. The database is the top-level storage container for bits of related information, so we need to create it before we can start storing or querying anything else. To do this, you use the create database statement:
create database lxfmusic;
Notice the semi-colon at the end of the command – in the MySQL console, every SQL statement must end with a semi-colon. Also notice that we’ve used lower-case letters: SQL keywords are not case sensitive, so you can issue your commands in whatever case you like. Names are a different matter: depending on the platform, MySQL can treat database and table names as case sensitive, so spell them consistently. With the database created, you now need to switch to it. Much as you work within a current working directory on the Linux console, in MySQL, many commands you issue are relative to the currently selected database. You can switch databases with the use command:
use lxfmusic;
Now to create some tables:
create table Album (
  Album_id int auto_increment primary key,
  name varchar(100)
);
create table Track (
  Track_id int auto_increment primary key,
  title varchar(100),
  running_time int,
  Album_id int
);
The most obvious things to note here are that we’ve issued two commands, separated by semi-colons, and that we’ve split each command over multiple lines. SQL doesn’t care about white space, so you can split your code up however you like, as long as you put the right punctuation in the correct places. As for the command itself, notice how similar it is to the create database statement.
We specify the action we want to take, the type of object we’re operating on and then the properties of that object. With the create database statement, the only property was the name of the database; with the create table statement, we’ve also got a whole load of extra properties that come inside the parentheses and are separated by commas. These are known as column definitions, and each comma-separated entry describes one column in the table. First, we give the column a name, then we describe the type of data that is stored in it (this is necessary in most databases), and then after that we specify any additional properties of that column, such as whether or not it is part of the primary key. The auto_increment keyword means that you don’t have to worry about specifying the value of Track_id when inserting data, as MySQL will ensure that this is an integer that gets incremented for every row in the table, thus forming a
MariaDB is a drop-in replacement for the MySQL database, and is quickly finding favour among distros including Mageia, OpenSUSE and even Slackware.
primary key. You can find out more about the create table statement in the MySQL documentation at http://dev.mysql.com/doc/refman/5.5/en/create-table.html.
Inserts and queries
Inserting data into the newly created tables isn’t any trickier:
insert into Album (name) values ("Free At Last");
Once again, we specify the action and the object on which we’re acting, we then specify the columns which we’re inserting into, and finally the values of the data to be put in. Before we can insert an entry into the Track table, however, we must discover what the ID of the album Free At Last is, otherwise we won’t be able to link the tables together very easily. To do this, we use the select statement:
select * from Album where name = "Free At Last";
This command says we want to select all columns from those rows of the Album table whose name field is equal to Free At Last. Pretty self-explanatory really! If we’d only wanted to get the ID field, we could have replaced the asterisk with Album_id and it would have returned just that column. Since that returned a 1 for me (it being the first entry in the table), we can insert into the Track table as follows:
insert into Track (title, running_time, Album_id) values ('Little Bit of Love', 154, 1);
The big thing to note is that we specified the running time in seconds (2:34 becomes 154) and stored it as an integer. With most databases, you must always specify a data type for your columns, and sometimes this means you need to represent your data in a different manner than in your application, and you’ll need to write some code to convert it for display. That said, MySQL does have a wide range of data types, so many eventualities are covered. That’s all we have space for this month, but don’t let your MySQL education stop there. Now you’ve seen the basics, you’ll want to investigate foreign keys and joins, two more advanced techniques that will let you be far more expressive with your SQL. You’ll also want to investigate the different types of relationship, such as one-to-one, one-to-many, many-to-one and many-to-many.
Finally, if you want to integrate MySQL with your programming language of choice, look out for an appropriate module, such as the MySQLdb module for Python (often packaged as python-mysqldb).
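To see the same statements driven from a program, here is a minimal sketch that uses Python’s built-in sqlite3 module as a stand-in for MySQL, so it runs without a server. That swap is an assumption of this example: SQLite spells auto-increment differently, and the as_minutes helper is made up to show the seconds-to-display conversion mentioned above. It also includes a simple join, one of the next steps suggested.

```python
import sqlite3

# An in-memory SQLite database stands in for the MySQL server here.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# SQLite's spelling of an auto-incrementing integer primary key.
cur.execute(
    "create table Album ("
    " Album_id integer primary key autoincrement,"
    " name varchar(100))"
)
cur.execute(
    "create table Track ("
    " Track_id integer primary key autoincrement,"
    " title varchar(100), running_time int, Album_id int)"
)

# Insert the album, then look up the ID it was given.
cur.execute("insert into Album (name) values (?)", ("Free At Last",))
cur.execute("select Album_id from Album where name = ?", ("Free At Last",))
album_id = cur.fetchone()[0]

cur.execute(
    "insert into Track (title, running_time, Album_id) values (?, ?, ?)",
    ("Little Bit of Love", 154, album_id),
)
conn.commit()


def as_minutes(seconds):
    """Convert a running time in seconds to m:ss for display."""
    return f"{seconds // 60}:{seconds % 60:02d}"


# A join pulls the track and its album back together in one query.
cur.execute(
    "select Track.title, Track.running_time, Album.name"
    " from Track join Album on Track.Album_id = Album.Album_id"
)
rows = [(title, as_minutes(secs), album) for title, secs, album in cur.fetchall()]
print(rows)
conn.close()
```

The question marks are placeholders that the module fills in safely, which is generally preferable to pasting values into the SQL string yourself.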
Code concepts: Data encryption Learn the principles behind encryption – Ben Everard unpacks how it prevents snoopers from reading your data.
When writing a program, we usually assume that the other applications are friendly. We don’t normally try to hide our data from them. In fact, we normally try to write our files so that other programs can read them. We use text encodings, XML and other standards to make sure that our files play nicely with other software. However, there are times when we want to keep our information to ourselves. Perhaps we need to transmit it across an insecure network, or put it on a USB key that could be lost. Whatever the case, we need to make sure that prying eyes can’t see what it is, and for this we use encryption. Before we get started, we should say that the first rule of data encryption is never try to create your own. Unless, that is, you have a PhD in the appropriate area of mathematics and plenty of experience. There are a number of standards that are generally considered unbreakable, and most languages have a good set of encryption libraries that support these. These libraries will be far more secure than any you can create yourself; just make sure you read the documentation properly to avoid using insecure options. Here we’re going to break this rule, but only to show how encryption works. All encryption starts with data you want to hide. It then applies some method of rendering that data unreadable. Ideally, it should have some method of recovering the data that only the original user can perform. Usually this is done with a password (when the password is a string of binary information rather than an alpha-numeric word, it’s referred
“The first rule of data encryption is this: never try to create your own.”
to as a ‘key’, but the basic principle is the same). There are two different types of encryption: symmetric key and asymmetric key. The former uses the same key to decrypt information as it used to encrypt it, while the latter uses two different keys.
Symmetric encryption
Here we’re going to take a look at a simple (and fairly insecure) method of symmetric encryption (also known as private key encryption) for text that uses the XOR (exclusive OR) function. XOR takes two binary digits as input (each of which can be either 0 or 1). It outputs a 1 if exactly one of the two inputs is a 1, and a 0 if neither or both are (see the XOR truth table). We can XOR strings of data by XORing each item in turn. That simple XOR provides all we need for our encryption method, which we’ll get on to in a minute. First we’ll look at the information we’re going to encrypt. Text, like all computer data, is stored in binary as 1s and 0s. ASCII text encoding stores each character as a string of eight ‘bits’ of binary information. For example, ‘B’ is 01000010, ‘e’ is 01100101 and ‘n’ is 01101110. ASCII text, then, is just a chain of these characters, each one eight bits long. Ben, therefore, is 010000100110010101101110. Now, back to our encryption method. We’re going to use a password that’s just a single character (we told you it wasn’t going to be secure!). It can be any single character. And our encryption method is to XOR each letter of our text with our key (see the XORing characters box). The resulting ciphertext is our unreadable text. Without knowing the key, there’s no way a program can read it… or is there? We know the key (it’s ‘A’), but how can we use this to get the original text back? Actually, it’s really simple. Back in
Asymmetric encryption
Symmetric encryption is great for securing files, but it has some problems when securing communication. For starters, you need a secure way of sharing the keys with everyone. If you wanted an encrypted communication with, for example, Google, you’d somehow need to obtain a key, and this key would need to be different for every person Google communicated with, otherwise they’d be able to eavesdrop. To get around this, we have asymmetric encryption (which is also known as public key encryption). In this method there are two keys,
one public and the other private. Anything that is encrypted with the public key can only be decrypted with the private key and vice versa. Now, for example, if you need an encrypted communication with Google, you only need to know Google’s public key. Using this public key encryption, you can exchange a single-use key for a session of symmetric encryption. For further info and an example of an asymmetric implementation, see Neil Bothwick’s tutorial at techradar.com/news/internet/data-privacy-howsafe-is-your-data-in-the-cloud--1170332/1.
XORing characters
                  B         e         n
Text              01000010  01100101  01101110
Key (A)           01000001  01000001  01000001
XOR (cyphertext)  00000011  00100100  00101111
We can encrypt a stream by XORing each character in turn with our key.
school you may have learned that (a + b) + c = a + (b + c). That is, with addition at least, it doesn’t matter which order you do the addition; you always get the same result. Well, it turns out that the same thing is true of XOR. But before we can use that, there are two more things we need to know: anything that is XOR’d with itself is 0, and anything XOR’d with 0 is unchanged. So, key XOR key is 0 and text XOR 0 is text. That means that text XOR (key XOR key) is text, and we’ve just learned that this is the same as (text XOR key) XOR key. Since our cyphertext is just text XOR key, we now know that cyphertext XOR key is our original text. Basically, that’s just a really long way of saying that we can decrypt something exactly the same way we encrypted it.
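The whole scheme fits in a few lines of Python. This is a sketch for illustration only, in keeping with the warning above about not rolling your own encryption; the xor_with_key name is made up for this example.

```python
def xor_with_key(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(byte ^ key for byte in data)


key = ord("A")  # 01000001
plaintext = b"Ben"
ciphertext = xor_with_key(plaintext, key)

# Show each encrypted byte as eight binary digits.
print(" ".join(f"{byte:08b}" for byte in ciphertext))

# Applying exactly the same operation again recovers the original text.
recovered = xor_with_key(ciphertext, key)
print(recovered)
```

Because encryption and decryption are the same function, there is nothing extra to write for the reverse direction.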
Statistical attacks
We now have our method of encryption and decryption, but it’s not very secure. First of all, there are only 256 possible keys (if we include all eight-bit strings, and not just the ones that have ASCII characters), so it’s perfectly feasible for an attacker to check every one in turn. However, it turns out they don’t have to. In English text, some characters are very common, and others are quite rare. For example, the space character usually makes up 15–20% of all the characters in a piece of text and the lower-case E about 10%, while lower-case Z can be as little as 0.02% and capital Z almost never appears. Since a given character in our text will always evaluate to the same character in our cyphertext, these proportions will come across. For example, given the cyphertext:
00001001 00011110 00000011 01110001 00110010 00111000 00100001 00111001 00110100 00100011 00100010 01111101 01110001 00110000 00100010 01110001 00101000 00111110 00100100 01110001 00100010 00110100 00110100 01111101 01110001 00110000 00100011 00110100 01110001 00111111 00111110 00100101 01110001 00100010 00110100 00110010 00100100 00100011 00110100
We can see that the most common 8-bit string is: 01110001.
XOR truth table
Input 1  Input 2  XOR
0        0        0
0        1        1
1        0        1
1        1        0
We can take a guess that this is the encrypted form of the space character. We know that text XOR text is 0 and key XOR 0 is key. Therefore we know that (text XOR key) XOR text is key. So, if we XOR the most common character with space (00100000), we get the key. In this case, it’s 01010001, which corresponds to the ASCII character ‘Q’. Using this, we can decrypt the whole text to:
XOR ciphers, as you see, are not secure
This is known as a statistical attack and is one of the most common ways of attacking encryption. The simplest way to prevent your data falling victim to it is to use an encryption method that’s well known and has been tested by the best minds in the business. AES, Twofish, Threefish and Serpent are all good choices. As we mentioned at the start, you should find a library that implements one or more of them in your language of choice.
You may have noticed a slight flaw in the public key plan. How do you find out a server’s public key in the first place? After all, if an attacker could trick you into using their public key, they could read all the supposedly secure communications. We get around this by using certificates. When you install a web browser, it will come with public keys for a number of trusted certificate authorities. A website can then get one of these authorities to sign their public key. When you go to an encrypted web page, the server sends you a certificate (which contains both their public key and the signature from the authority) and the encryption method. Since you trust the authority, you can now trust this certificate, and read the page safe in the knowledge that it’s been securely transmitted and not tampered with. This method isn’t perfect. First, it requires you to trust a range of authorities (you can see how many authorities you trust in your web browser; for example, in Firefox, go to Edit > Preferences > Advanced > View Certificates > Authorities).
It’s quite a few, and most are probably companies you’ve never heard of, much less trust. If a hacker or disgruntled employee gets into the computer systems at any one of them, they could intercept almost any encrypted web traffic they wanted. But that’s not the only way these certificates can be subverted. If an attacker can find a way of forging the digital signature, they can trick a computer into thinking a communication comes from a particular source when it doesn’t. In fact, it was this method that allowed the Flame malware to get into Iranian computers. Microsoft had used the insecure MD5 (rather than the more secure SHA) hash, and attackers were able to use this to make the computer think something was signed when it wasn’t. Be careful!
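Returning to the statistical attack on the single-character XOR cipher described earlier, the guess-the-space trick can be sketched in Python. The example builds its own cyphertext first so that it is self-contained; an attacker would start from the cyphertext alone. The function and variable names are made up for this illustration.

```python
from collections import Counter


def xor_with_key(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(byte ^ key for byte in data)


# Build a cyphertext to attack; only `cyphertext` would be visible to an attacker.
secret_key = ord("Q")
cyphertext = xor_with_key(b"XOR ciphers, as you see, are not secure", secret_key)

# Guess that the most common byte is an encrypted space character.
most_common_byte, _count = Counter(cyphertext).most_common(1)[0]
guessed_key = most_common_byte ^ ord(" ")

print(chr(guessed_key))
decrypted = xor_with_key(cyphertext, guessed_key).decode("ascii")
print(decrypted)
```

On a message this short the guess could easily be wrong, in which case the next most common bytes, or all 256 keys, are quick to try.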
Code concepts: Spot mistakes Bug reports are useful, but you don’t really want to cause too many. Alex Cox explains what to avoid and how to avoid it.
It doesn’t matter how much care you put into writing your code. Even if you’ve had four cups of coffee and triple-check every line you write, sooner or later you are going to make a mistake. It might be as simple as a typo – a missing bracket or the wrong number – or it could be as complex as broken logic, memory problems or just inefficient code. Either way, the results will always be the same – at some point, your program won’t do what you wanted it to. This might mean it crashes and dumps the user back to the command line. But it could also mean a subtle rounding error in your tax returns that prompts the Inland Revenue to send you a tax bill for millions of pounds, forcing you to sell your home and declare yourself bankrupt.
Finding the problems
The IDLE Python IDE has a debug mode that can show how your variables change over time.
How quickly your mistakes are detected and rectified is dependent on how complex the problem is, and your skills in the delicate art of troubleshooting. For instance, even though our examples of code from previous tutorials stretch to no more than 10 lines, you’ve probably needed to debug them as you’ve transferred them from these pages to the Python interpreter. When your applications grow more complex than just a few lines or functions, you can spend more time hunting down problems than you do coding. Which is why before you worry about debugging, you should follow a few simple rules while writing your code. The first is that, while you can’t always plan what you’re going to write or how you’re going to solve a specific problem, you should always go back and clean up whatever code you end up with. This is because it’s likely you’ll have used now-redundant variables and bolted on functionality into illogical places. Going back and cleaning up these areas makes the code easier to maintain and easier to understand. And making your project as easy to understand as possible
becomes important as it starts to grow, and you seldom revisit these old bits of code. Whenever you write a decent chunk of functionality, the second thing you should do is add a few comments to describe what it does and how it does it. Comments are simple text descriptions about what your code is doing, usually including any inputs and expected output. They’re not interpreted by the language or the compiler – they don’t affect how your code works, they are there purely to help other developers and users understand what a piece of code does. But, more importantly, they are there to remind you of what your own code does. This might sound strange, but no matter how clear your insight might have been when you wrote it, give it a few days, weeks or months, and it may as well have been written by someone else for all the sense it now makes. And as a programmer, one of the most frustrating things you have to do is solve a difficult problem twice – once when you create the code, and again when you want to modify it but don’t understand how it works. A line or two of simple description can save you days of trying to work out what a function calculates and how it works, or may even obviate the need for you to understand anything about what a piece of code does, as you need to know only the inputs and outputs.
The importance of documentation
This is exactly how external libraries and APIs work. When you install Qt, for instance, you’re not expected to understand how a specific function works. You need only to study the documentation of the interface and how to use it within the context of your own code. Everything a programmer needs to know should be included in the documentation. If you want to use Qt’s excellent sorting algorithms, for example, you don’t have to know how they manage to be so efficient; you need to know only what to send to the function and how to get the results back. You should model your own comments on the same idea, both because it makes documentation easier, and because self-contained code functionality is easier to test and forget about. But we don’t mean you need to write a book. Keep your words as brief as they need to be – sometimes that might mean a single line. How you add comments to code is dependent on the language you’re using. In Python, for example, comments are marked by the # symbol, often in the first column of a line. Everything that comes after this symbol on the same line will be ignored by the interpreter, and if you’re using an editor with syntax highlighting, the comment will also be coloured differently to make it more obvious. The more detail
you put into a comment the better, but don’t write a book. Adding comments to code can be tedious when you just want to get on with programming, so make them as brief as you can without stopping your flow. If necessary, you can go back and flesh out your thoughts when you don’t feel like writing code (usually the day before a public release). When you start to code, you’ll introduce many errors without realising it. To begin with, for example, you won’t know what is and isn’t a keyword – a word used by your chosen language to do something important. Each language is different, but Python’s list of keywords is quite manageable, and includes common language words such as and, if, else, import, class and break, as well as less obvious words such as yield, lambda, raise and assert. This is why it’s often a good idea to create your own variable names out of composite parts, rather than go with real words. If you’re using an IDE, there’s a good chance that its syntax highlighting will stop you from using a protected keyword.
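If you are unsure whether a word is reserved, Python can tell you itself through the standard keyword module. A small sketch follows; the is_safe_name helper and the candidate names are made up for illustration.

```python
import keyword


def is_safe_name(name: str) -> bool:
    """True if name is a valid identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


# Try a few candidate variable names.
for candidate in ["class", "lambda", "feed_count", "2fast"]:
    print(candidate, is_safe_name(candidate))

# The full list of keywords for the running interpreter.
print(len(keyword.kwlist), "keywords, including", keyword.kwlist[:3])
```

Composite names such as feed_count pass the check, which is the practical reason for the advice above.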
Undeclared values
A related problem that doesn’t affect Python is using undeclared variables. This happens in C or C++, for instance, if you use a variable without first saying what type it’s going to be, such as int x to declare x an integer. It’s only after doing this that you can use the variable in your own code. This is a big difference between statically typed languages such as C and C++ and dynamically typed ones such as Python. However, in both kinds of language, you can’t assume a default value for an uninitialised variable. Typing print(x) in Python, for instance, will result in an error, but not if you precede the line with x = 1. This is because a name doesn’t exist in Python until you’ve assigned it a value. C and C++ can be even more random, not necessarily generating an error, but the value held in an uninitialised variable is unpredictable until you’ve assigned it a value. Typos are also common, especially in conditional statements, where they can go undetected because they are syntactically correct. Watch out for using a single equals sign to check for equality, for example – although Python is pretty
good at catching these problems. Another type of problem Python is good at avoiding is inaccurate indenting. This is where conditions and functions use code hierarchy to split the code into parts. Python enforces this by refusing to run code that gets it wrong, but other languages try to make sense of code hierarchy, and sometimes a misplaced bracket is all that’s needed to create unpredictable results. However, this can make Python trickier to learn. Initially, if you don’t know about its strict indentation requirements, or that it needs a colon at the end of compound statement headers, the errors created don’t make sense. You also need to be careful about case sensitivity, especially with keywords and your own variable names. When you’ve got something that works, you need to test it – not just with the kind of values your application might expect, but with anything that can be input. Your code should fail gracefully, rather than randomly. And when you’ve got something ready to release, give it to other people to test. They’ll have a different approach, and will be happier to break your code in ways you couldn’t imagine. Only then will your code be ready for the wild frontier of the internet, and you’d better wear your flameproof jacket for that release.
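The three behaviours discussed in this section can be seen in a short Python sketch: using a name before assigning it, writing = where == was meant, and failing gracefully on bad input. The parse_unread_count helper and the variable names are hypothetical, chosen only for this example.

```python
# Using a name before assigning to it raises NameError at run time.
try:
    print(undefined_total)  # deliberately never assigned
except NameError as err:
    name_error = str(err)
    print("NameError:", name_error)

# A single = inside a condition is rejected before the code runs at all.
try:
    compile("if x = 1:\n    print(x)\n", "<example>", "exec")
    caught_syntax_error = False
except SyntaxError:
    caught_syntax_error = True
print("SyntaxError caught:", caught_syntax_error)


def parse_unread_count(text):
    """Fail gracefully: return None unless text is a non-negative integer."""
    try:
        value = int(text)
    except (TypeError, ValueError):
        return None
    return value if value >= 0 else None


# Test with more than just the values you expect.
for raw in ["10", "-3", "ten", None]:
    print(repr(raw), "->", parse_unread_count(raw))
```

Returning None is only one way to fail gracefully; raising a clear, documented exception is an equally reasonable design.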
You have to be careful in Python that the colons and indentation are in the correct place, or your script won’t run. But this does stop a lot of runtime errors.
Comment syntax
Different languages mark comments differently, and there seems to be little consensus on what a comment should look like. However, there are a couple of rules. Most languages offer both inline and block comments, for example. Inline are usually for a single line, or a comment after a piece of code on the same line, and they’re initiated by using a couple of characters before the comment. Block comments are used to wrap pieces of text (or code you don’t want interpreted/compiled), and usually have different start and end characters.
Bash
# A hash is used for comments in many scripting languages. When # is followed by a ! it becomes a shebang
# and is used to tell the system which interpreter to use, for example: #!/usr/bin/bash
BASIC
REM For many of us, this is the first comment syntax we learn
C
/* This kind of comment in C can be used to make a block of text span many lines */
C++
// Whereas this kind of comment is used after the
// code or for just a single line
HTML
<!-- Though not a programming language, we’ve included this because you’re likely to have already seen the syntax, and therefore comments, in action -->
Java
/** Similar to C, because it can span lines, but with an extra * at the beginning */
Perl
=head1 Overview
As well as the hash, in Perl you can also use something called Plain Old Documentation. It has a specific format, but it does force you to explain your code more thoroughly
=cut
Python
''' As well as the hash, Python users can denote blocks of comments using a source code literal called a docstring, which is a convoluted way of saying ‘enclose your text in blocks of triple quotes’, like this '''
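In Python, the two styles might be combined as in the sketch below. The unread_total function is a made-up example; the point is that the docstring describes inputs and output, while the hash comment explains a single line.

```python
def unread_total(tracker):
    """Return the total number of unread items across all feeds.

    tracker maps a feed name to a dictionary with a "num-unread" entry.
    """
    # Sum the per-feed counts; feeds without a count are treated as zero.
    return sum(feed.get("num-unread", 0) for feed in tracker.values())


# Unlike a hash comment, the docstring is kept on the function,
# so tools such as help() can display it.
summary = unread_total.__doc__.splitlines()[0]
print(summary)
total = unread_total({"bbc.co.uk": {"num-unread": 10}, "tuxradar.co.uk": {"num-unread": 5}})
print(total)
```

Strictly speaking, a docstring is a string literal rather than a comment, which is why the interpreter can hand it back to you at run time.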
Ruby
If you want to be a hip, cool web developer, you had better learn Ruby on Rails. This web development framework takes the grunt work out of building scalable web apps, leaving you to do the problem-solving without having to reinvent the wheel every time you start a new project.
Ruby: Master the basics
Ruby: Add a little more polish
Ruby: Modules, blocks and gems
Ruby on Rails: Web development
Ruby on Rails: Code testing
Ruby on Rails: Site optimisation
Ruby: Master the basics Juliet Kemp introduces the ins and outs of the Ruby programming language – enough to write your first program.
Quick tip Indentation doesn’t matter from a code point of view, but the Ruby community prefers two-character indentation.
Ruby on Rails is the current web stack framework, popping up on open source projects all over the web. The Rails part is the web stack; the language underlying it is Ruby. It’s flexible, highly object-oriented, and quick to develop in. Ruby itself is growing rapidly in popularity alongside Rails, and it’s very easy to jump in and get started. To install Ruby, see the boxout. Once you’ve installed it, to get an idea of how it works just type irb at the command line. This fires up the Interactive Ruby Shell, which allows you to type in code and get its value back immediately. Try it out:
:001 > 3*5
 => 15
:003 > print 'Hello!'
Hello! => nil
:004 > puts 'Is there anyone there?'
Is there anyone there?
 => nil
Here, both print and puts (‘put string’ – print a string to standard output) are methods. You could also put the parameter in brackets if you prefer, eg, puts("Is there anyone there?"). In Ruby, brackets are often optional, so it’s up to you (or the project you’re working on) to decide what your preferred coding style is (the Ruby community norms tend to be minimal and leave brackets out unless needed for clarity). In both cases, the return value of the function is nil, whereas the return value of 3*5 is 15. You can even write a method in IRB. Try this out:
> def hithere
?>   return 'Hello!'
?> end
 => nil
> hithere
 => "Hello!"
Here we define a method that returns a string, then call it, and
the return value is, as expected, our string. Note that IRB is smart enough to recognise that the method isn’t finished until the end line and doesn’t return anything until then. However, it’s not necessary to use return to get a return value from a Ruby method. Try this:

> def hithere2
?>   'Hello; no return'
?> end
 => nil
> hithere2
 => "Hello; no return"

Ruby methods automatically return the value of the last expression evaluated in the method. So you need return only if you have multiple possible return values, or to improve code
Note the different return values with 7 (treated as integer) and 7.0 (treated as floating point).
Install Ruby
RVM (the Ruby Version Manager) is the easiest way to install Ruby. Among other things, it allows you to install and use multiple versions of Ruby on one machine, which may come in handy later on in your Ruby experience. If you have Git installed, all you need is:

\curl -L https://get.rvm.io | bash -s stable --ruby

(yes, the backslash is correct: it bypasses any shell alias you may have for curl). This will download and install RVM, Ruby, and any other basic necessities, and you’ll be prompted to do anything else you need to (as a rule, this should just be to source the RVM script). For more information, see the RVM website: https://rvm.io/rvm/install, or for other ways of installing Ruby try www.ruby-lang.org/en/downloads.
Halfway through installing RVM.
clarity with more complex code. To run code as a file rather than in IRB, just put the commands in a file with the extension .rb, and run it with ruby myfile.rb. Alternatively, you can add a shebang line at the top, make the file executable, and call it anything you like:

#!/usr/bin/ruby -w
puts 'I can run Ruby!'

Note the -w flag, which turns on warnings – this is good practice. You can run this file (once executable) with ./myfile.rb.
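To make the earlier point about return values concrete, here is a minimal sketch of when an explicit return earns its keep. The method names greeting and describe_number are made up for this illustration and are not part of the tutorial code.

```ruby
# Implicit return: the value of the last expression is the method's result.
def greeting
  'Hello; no return'
end

# Explicit return: handy for leaving a method early when there are
# several possible results. (describe_number is a hypothetical helper.)
def describe_number(n)
  return 'negative' if n < 0
  return 'zero' if n == 0
  'positive'
end

puts greeting
puts describe_number(-3)
puts describe_number(7)
```

Save it as a .rb file and run it with ruby, or paste the definitions into irb and call them by hand.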
Write your first Ruby program
We’re going to write a little Ruby program that acts as a basic ‘notebook’. By the end of the tutorial, notebook.rb will show your current notes, and allow you to add another one on the end. See www.linuxformat.com/files/ca2015.zip for full code details; the initial version is called notebook_v1.rb. First, let’s look at writing to a file. Input and output in Ruby are handled by the IO class, and the File class is a subclass of that. Create a notebook.rb file that looks like this:

#!/usr/bin/ruby -w
nbk = File.open('notebook.txt', 'w')
nbk.puts 'My first note'
nbk.close

As you’ve almost certainly heard, in Ruby everything is an object. This includes things like numbers, which other OO languages (eg Java) often treat as primitive types. In Ruby, absolutely everything can have a method or an instance variable associated with it. Among other things, this means that the standard way of making something happen looks like thing.method. Here, we use the File class methods to open a new file to write. Note that unlike in (for example) Java, you don’t have to explicitly use new() to create a new object of a particular type. nbk is automatically set up as a File object – to test this, you can run that line in irb then type nbk.class.name to return File. We can then call .puts "string" on it, and close it. Set the file’s execute bit, run it with ./notebook.rb, then take a look at notebook.txt.

There are a couple of alternative methods you could use to write to the file; both nbk.write "My String\n" and nbk << "My String\n" will work. However, with both of these you need to explicitly add the newline, which puts will automatically add. As it stands, this program will clobber any data that already exists in the file every time you run it. To add more data on the end of the file, you need to use append mode instead of write mode, using File.open('notebook.txt', 'a'). Make that change now, so your notes file will get steadily bigger.
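The difference between write mode and append mode, and between puts, write and <<, is easy to see side by side. The sketch below is only an illustration: write_modes_demo is a hypothetical helper, and it works in a temporary directory (via the standard tmpdir library) so that it never touches your real notebook.txt.

```ruby
require 'tmpdir'

# Hypothetical helper: returns the file contents after two 'w' opens,
# and again after one 'a' open.
def write_modes_demo(path)
  # 'w' truncates the file each time it is opened.
  nbk = File.open(path, 'w')
  nbk.puts 'My first note'
  nbk.close

  nbk = File.open(path, 'w')
  nbk.puts 'My second note'
  nbk.close
  after_write = File.read(path)

  # 'a' keeps what is already there and adds to the end.
  nbk = File.open(path, 'a')
  nbk.write "Third note\n"   # write needs an explicit newline
  nbk << "Fourth note\n"     # so does <<
  nbk.puts 'Fifth note'      # puts adds the newline itself
  nbk.close
  after_append = File.read(path)

  [after_write, after_append]
end

Dir.mktmpdir do |dir|
  after_write, after_append = write_modes_demo(File.join(dir, 'notebook.txt'))
  puts after_write
  puts '---'
  puts after_append
end
```

Only the second note survives the two write-mode opens, while the append-mode open leaves it in place and adds three more lines after it.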
What about reading the data back? Add these lines at the end of notebook.rb:

nbk_read = File.open('notebook.txt')
while line = nbk_read.gets do
  puts line
end
nbk_read.close

We open the file again (you might want to reorder the lines to avoid opening it twice), then start a while block. In Ruby, the syntax for this is while CONDITION do on a single line, followed by the block to perform on each repetition, then end to mark the end of the block. Here, the while block repeats for as long as gets returns a line from the file (myfile.gets takes a single line at a time from myfile). All we do in the block is to output that line with puts, to standard out (the console).

To create a new note, we want to be able to get user input. The most basic way to do this is with a single string. Comment out the write to file section of your notebook.rb
file, and edit it to look like this:

nbk = File.open('notebook.txt', 'a+')
while line = nbk.gets do
  puts line
end
puts 'Enter a new note'
note = gets
nbk.puts note
nbk.close

We open the file with the parameter 'a+' – this means to open for reading and appending. So we’ll read and output the existing notes, then ask the user for a new one, using puts to output the query string to the console, and gets to get a string from the console (standard in). We then use nbk.puts to write the string (our new note) to the file.

Run this, and you may notice that while you do get asked for a new note, you don’t see the old ones. But if you add the note then look at notebook.txt, all your old notes and your new one are still there. What’s happened? The answer is that a+ can position the IO stream at the end of the file, ready to append (where reading starts in this mode varies between platforms). To be sure of reading back all the old notes, add this line before the while loop:

nbk.rewind

This repositions the IO stream at the start of the file. Note that if you do a writing operation after this, the write still goes to the end of the file, so you won’t clobber anything. Using gets and puts like this automatically does the Right Thing with newlines; each added note is put on its own line. In other circumstances, you might want to lose the extraneous newline from your new note, which you can do by using note = gets.chomp. You can also make the code even neater by reducing that last section to a mere three lines:

puts 'Enter a new note'
nbk.puts gets
nbk.close
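Here is the read-then-append pattern gathered into one place, as a sketch rather than the finished program. To keep it runnable without a keyboard, the new note is passed in as an argument instead of being read with gets; add_note is a hypothetical helper name, and the file lives in a temporary directory.

```ruby
require 'tmpdir'

# Hypothetical helper: returns the notes already in the file,
# then appends the new one.
def add_note(path, note)
  nbk = File.open(path, 'a+')
  nbk.rewind                 # make sure reading starts at the top
  existing = []
  while line = nbk.gets do
    existing << line.chomp
  end
  nbk.puts note              # in append mode, writes go to the end
  nbk.close
  existing
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'notebook.txt')
  add_note(path, 'My first note')
  puts add_note(path, 'My second note')   # lists the notes that were there before
  puts File.read(path)
end
```

Calling rewind is harmless when the stream already starts at the top of the file, so it is a reasonable habit whenever you read from a file opened with a+.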
Quick tip Ruby treats both semicolons and newline as the end of a statement. An operator (+, -, \, etc) at the end of a line indicates a continuation. Other whitespace is usually ignored – use the -w switch to flag up the rare occasions where it is used to interpret ambiguous statements.
Running version one of the code a couple of times; including fixing an error where I left an old line hanging around at the bottom of the file.
Create a class
Quick tip When looking for a method, Ruby will try the named class first, then its parent, up the inheritance chain. Here, there’s only one ancestor: the basic Object class (check this with Note.superclass). You can explicitly call the ancestor of a method you’re overriding with super.
Running the code on the notes generated by the first version has errors, as there are no titles; once that’s fixed, it runs fine. The second set of edits allow me to input a new note.
So far, this has been structured much as a procedural program – do one thing, then do the next thing. If you want to write a more complicated program, you’ll want to create your own classes. We’ll start again with a blank file – the code for this version is in notebook_v2.rb in the ZIP file at www.linuxformat.com/files/ca2015.zip. This time, we’ll start by defining a Note class. Each Note object has a title and a body:

class Note
  def initialize(title, body)
    @title = title
    @body = body
  end
end

initialize is automatically called when you create a new object by calling Class.new, so you can set up your object’s initial state. Here, initialize takes two parameters, title and body. By Ruby convention, local variables (and parameters, like these, that act like them) start with a lowercase letter, while classes start with a capital letter. Each Note object will have its own title and body, so each object will have instance variables for title and body. Instance variables in Ruby always begin with @. Here, we have @title and @body. To test this, add these lines under the class definition:

myNote = Note.new('Note 1', 'this is a note')
puts myNote.inspect

This creates a new Note object with the given title and body, then uses the inspect method to take a look at the object. Run this, and you should get this output:

#<Note:0x10e092978 @body="this is a note", @title="Note 1">

It looks like it’s done the right thing, but the formatting isn’t great. Objects in Ruby have a standard method, to_s, which will output the object as a string. However, if we try puts myNote.to_s, we’ll just get the output #<Note:0x10e092978> – the object ID, which isn’t that useful to us. To solve this problem, we can override the to_s method for our Note class. Add this method definition inside the Note class definition, after the initialize method:

class Note
  def to_s
    "Note: #{@title}, #{@body}"
  end
end

myNote = Note.new("Note 1", "this is a note")
puts myNote.to_s
For the first time, we’re using " rather than ' – this is because " allows variable interpolation, while ' doesn’t, and here we want to use interpolation. In the interpolation, #{title} calls a method named title, so it only works once the class has a reader for title (we add one below); until then, write #{@title}, or the shorthand #@title, to read the instance variable directly. The preference in the Ruby community seems to be for #{title}. Now if we run the program, we’ll get the more useful output Note: Note 1, this is a note.

Next, what if we want to access just the title of the note, or just the body? The instance variables @title and @body are private to their specific object; no other object can access them. This is useful in that it avoids objects changing other objects accidentally. But it does mean you need to do something explicit if you want to be able to access them. We could write a couple of methods to do that, by adding this in the class definition:

class Note
  def title
    @title
  end
  def body
    @body
  end
end

myNote = Note.new("Note 1", "this is a note")
puts myNote.title
puts myNote.body

This will output the title and the body. However, because this is such a common thing to want to do, Ruby provides you with a shortcut method, attr_reader. Replace the title and body methods we just added with this:

class Note
  attr_reader :title, :body
end

myNote = Note.new("Note 1", "this is a note")
puts myNote.title
puts myNote.body

You could make only the title, or only the body, accessible via attr_reader. The :foo notation creates a Symbol object that corresponds to the @foo instance variable, allowing you to manipulate it via meta methods like this. You might also want to be able to set the instance variables, and sure enough there’s a convenient shortcut method for that, too:

class Note
  attr_writer :title, :body
end

myNote = Note.new("Note 1", "this is a note")
myNote.title = "Note 1 edited"
puts "New title is: " + myNote.title

To create getter and setter methods both at once, use the shortcut attr_accessor :title, :body.
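Putting those pieces together, a minimal sketch of the Note class at this stage might look like the following. It assumes you have chosen attr_accessor, so both the readers used inside to_s and the writers exist; the snake_case variable name my_note is just this example’s choice.

```ruby
class Note
  # Generates title, title=, body and body= in one go.
  attr_accessor :title, :body

  def initialize(title, body)
    @title = title
    @body = body
  end

  # Uses the reader methods generated by attr_accessor.
  def to_s
    "Note: #{title}, #{body}"
  end
end

my_note = Note.new('Note 1', 'this is a note')
puts my_note                     # puts calls to_s for us
my_note.title = 'Note 1 edited'  # uses the generated writer
puts my_note.title
puts my_note
```

If you swap attr_accessor for attr_reader here, the assignment to my_note.title fails with a NoMethodError, which is a quick way to check which methods each shortcut really creates.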
Next, add a few lines to the program to request a second note:

puts "Enter new note title"
myTitle = gets.chomp
puts "Enter new note body"
myBody = gets.chomp
myNote2 = Note.new(myTitle, myBody)
puts myNote2.to_s

This prompts the user for a new note (title and body), creates a new Note object, then outputs it. One question you might be interested in is how many notes there are in total. To keep track of this, you need to use a class variable; a variable that exists only once, for the Note class as a whole, and is incremented every time you create a new Note:
Documentation
It’s always a good idea to document your code clearly for others (or for yourself at a later time!). One popular option for Ruby is TomDoc (http://tomdoc.org). Here’s how that looks with the final version of our Note class:

# Public: class to define a note.
class Note
  @@notes = 0
  @@notebook_file = 'notebook.txt'

  # Public: Initialize a Note.
  #
  # title - The String title of the Note.
  # body - The String body of the Note.
  def initialize(title, body)
    # ... code here ...
  end

  # Public: Write Note to file.
  #
  # Returns nothing.
  def write_to_file
    # ... code here ...
  end

  # Public: Class method to return name of notebook file.
  #
  # Returns the String name of the notebook file.
  def self.notebook_file
    # ... code here ...
  end

  # Public: Gets/Sets the String title and body of the Note.
  attr_accessor :title, :body
end

You should state what the method does, describe any arguments, and state the return value. Constructor (initialize) and attribute (attr_accessor, etc) methods can be shorthanded as shown here. TomDoc is designed to be both human-readable and machine-parsable; check the webpage out for more information.

class Note
  @@notes = 0
  def initialize(title, body)
    @title = title
    @body = body
    @@notes += 1
  end
  def Note.total_notes
    "Total notes: #{@@notes}"
  end
end

Class variables are written as @@foo. We set it at the top of the class, then increment it in the constructor every time a new Note is created. To find out the value of a class variable, we can create a class method, using Class.classmethod, as here. Add this line to the end of the file, after you’ve added your two notes, to call the method:

puts Note.total_notes

You can also refer to a class variable within a regular instance method, so you could do the same thing with an instance method:

class Note
  def total_notes
    "Total notes: #{@@notes}"
  end
end

puts myNote.total_notes

However, that means having to call it via a particular note. Conceptually, it makes more sense to use a class method. Another way to define a class method is to write def self.total_notes. It’s just a matter of preference.

You may have noticed that in this version of the program, your notes don’t last from one instance of the program running to the next one. Let’s roll in the File interaction we used before to write out to a file. To see the version of the code for this last part of the tutorial, download the archive from www.linuxformat.com/files/ca2015.zip. Add this method to the Note class:

class Note
  @@notebook_file = "notebook.txt"
  def write_to_file
    nbk = File.open(@@notebook_file, 'a')
    nbk.puts(@title + "," + @body)
    nbk.close
  end
end

myNote = Note.new("Note 1", "this is a note")
myNote.write_to_file
# comment out the rest of the file for now, for ease of testing

Our write_to_file method does what it says on the tin: writes a given note to the end of the general notebook file (defined as a class variable). Run this, then have a look at
notebook.txt and you should see the note added. What about reading your notes back? This shouldn’t be an instance method, as we want to be able to read back the existing notes independently of any specific note. We could do it either as part of the command flow of the program (outside the class altogether), or as a class method. Here it is outside the Note class, with a small class method so that the outside code can find the notebook file:

class Note
  def self.notebook_file
    @@notebook_file
  end
end

nbk = File.open(Note.notebook_file, 'r')
while line = nbk.gets do
  note = line.split(',')
  thisNote = Note.new(note.first, note.last)
  puts thisNote.to_s
end

The variable note is used as an array; but Ruby doesn’t insist that you declare variables or their types in advance, so we just go ahead and use it. line.split(',') splits line on comma. If you miss out the argument (line.split) you would split it on whitespace, which is no good for us, as we can have whitespace within either a note title or note body. We then use the first and last elements of the array to create a new Note, and print it to screen. As we’re not doing anything with thisNote other than printing it, we could reduce this further:

while line = nbk.gets do
  puts line.split(',')
end

This will output each title and body on a separate line. If you use the print command, you won’t get any spaces. Creating thisNote and using to_s gives you more control over the output format.
Quick tip You might want to error-check here that you have exactly two values in the array. Try:

if note.length != 2
  puts "There is a problem: note has the wrong number of fields!"
  next
end
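One way to soften the comma problem is to give split a limit, so that only the first comma separates the title from the body. This is a sketch of one possible approach, not the tutorial’s own code; parse_note_line is a hypothetical helper name.

```ruby
# Hypothetical helper: returns [title, body], or nil if the line
# does not contain a comma at all.
def parse_note_line(line)
  # The limit of 2 means "split at the first comma only", so commas
  # inside the body are left alone.
  fields = line.chomp.split(',', 2)
  return nil if fields.length != 2
  fields
end

p parse_note_line("Note 1,this is a note\n")
p parse_note_line("Shopping,milk, eggs, bread\n")
p parse_note_line("no comma here\n")
```

The trade-off is that a comma in the title still confuses things, so a sturdier file format would be worth considering if your notes get more elaborate.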
Experimentation
irb, Interactive Ruby, is a great tool for experimenting and trying out code snippets. Making good use of irb can really speed up code production. For example, if you enter a string, then a dot, then hit Tab, irb will give you a list of the methods you can use on a String object. Since in Ruby everything is an object, this works for anything you input. If completion isn’t working, try irb --readline -r irb/completion. ri provides online Ruby documentation.
To see all the classes ri knows about, try ri -c; then try ri FileUtils to see documentation for the FileUtils class. To get documentation on a specific method, try ri String#split. You can also install the Ruby Documentation Bundle for easy access to a bunch of resources, including the free version of Programming Ruby: The Pragmatic Programmer’s Guide (aka the Pickaxe), an FAQ, and a couple of tutorials. It’s also available online.
Ruby: Add a little more polish
Build on your Ruby – reorganise your code, learn about modules and blocks, and introduce a few tests – with Juliet Kemp.
In the first part of this series of introductory Ruby tutorials, we got started with Ruby and wrote a basic single-file notebook program. In this next section, we’ll improve the overall structure, interface and usability of the code, looking at how best to organise it and split different parts out. We’ll also introduce command-line option parsing, learn about modules, and do a bit of testing. Last time, I suggested using RVM to install Ruby and a couple of other bits and pieces. It’s a good idea to update RVM fairly regularly, as a new stable release comes out every month or two. It’s easy to do: just type rvm get stable, and check for any notes that appear in the output.
Organise your code
In the previous tutorial, all of our code was in a single file. This is fine for getting started, but as soon as your project gets to be any reasonable size, it is likely to become confusing. Single-file code is also less likely to be reusable by other projects, and it is harder to write automated tests, because you can’t test parts of the code without having to run the whole program. If you look at the code from the previous tutorial, we have the definition of a Note mixed up with the logic that creates test Notes. It would be much easier to read if we break that out into separate files. Unlike some other languages (such as Java), Ruby doesn’t enforce any particular organisational standards. But there is a set of conventions emerging from within the community (which are also used by the RubyGems system, which we’ll look at in the next tutorial), so we’ll take that approach here. Even with a very small program, it’s worth getting the hang of structuring things like this, so it’s a habit when you start on larger things.
“As your project gets to any reasonable size, it’ll become confusing.”
Let’s take a look at our notebook program – the original is in the ZIP file (www.linuxformat.com/files/ca2015.zip) as notebook_old.rb. Currently, it has several sections:

The Note class, which defines a Note and a couple of methods.
Writing a test note to file.
Opening the file and reading the notes back.
Getting another note from the user.

The second one of these (writing a test note) is a bit of a red herring; it’s really more of a test, or a proof-of-concept, so we’ll ignore it. We can divide the code up then into three operations: a class with the Note definition, a section which reads Notes back from a file, and an input interface. Effectively, the first two of these are library files, and the final one is a command-line interface. The Ruby convention, then, is a directory set up like this:

notebook/    # top level
  bin/       # command line interface
  lib/       # library files
  test/      # test files

We’re also going to set this all up as a module; if other people want to use parts of this code in future, it’s not good practice to have it all sloshing around in the top-level namespace (see the boxout for more on modules, and a brief introduction to mixins, which we won’t be using here). So we’ll
Modules and mixins
Modules are a way of grouping classes together to use the same namespace. This eliminates problems of a method that is named the same as another unrelated method from another class. So if light is a method concerning the electromagnetic spectrum, but light also concerns the set of things which are not heavy, you can identify them with EMSpectrum.light() and Weight.light() (or Mass.light(), but that’s another discussion). Modules remove the need for multiple inheritance, by providing mixins. If you include a module within a class definition:

require 'bar'
class Foo
  include Bar
  # ....
end

then the class gains access to all of that module’s methods and variables. Note that you do need to require the file that the module definition lives in; include doesn’t do that for you. It’s also important to remember that include doesn’t copy the methods; it creates a reference to them. If you change the definition of a method in Bar, the change will be seen by every class that includes Bar. Mixins can get more complicated and useful than this, but we’ll tackle that in another tutorial.
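The ‘reference, not copy’ behaviour is easiest to believe when you see it. In this sketch the module Describable and the class Reminder are hypothetical names invented for the example, and everything lives in one file, so no require is needed.

```ruby
# A mixin: any class that includes it gains the describe method.
module Describable
  def describe
    "I am a #{self.class.name}"
  end
end

class Note
  include Describable
end

class Reminder
  include Describable
end

puts Note.new.describe
puts Reminder.new.describe

# Reopen the module and change the method. Because include stores a
# reference to the module, both classes pick up the new behaviour.
module Describable
  def describe
    "Now I am a #{self.class.name.downcase}"
  end
end

puts Note.new.describe
puts Reminder.new.describe
```

Run with warnings on (-w), Ruby will point out that describe has been redefined, which is exactly what the second half of the example does on purpose.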
call the module Notebook, and the Note class will be Notebook::Note. This also means that it’s good practice to create a lib/notebook/ subdirectory to keep the library files in, for ease of navigation at a later date. A Notebook::Foo class will then be in lib/notebook/foo.rb, which makes it easy to find. So, we can shift the Note class wholesale into lib/notebook/note.rb, and just add a module line at the top (with a matching end at the bottom):

module Notebook
  class Note
    # see the DVD for the code
  end
end

Next, we’ll shift the reading-in-from-file code into lib/notebook/reader.rb. Currently, this code looks like this:

nbk = File.open(Note.notebook_file, 'r')
while line = nbk.gets do
  note = line.split(',')
  if note.length != 2
    puts 'There is a problem!'
    next
  end
  thisNote = Note.new(note.first, note.last)
  puts thisNote.to_s
end

That doesn’t look much like a class just yet. What we want is a Reader constructor that takes a file as its parameter, then has a read method that reads and outputs the notes (of course, this isn’t the only way to organise this code; you can probably think of several different ways off the top of your head. Feel free to play around with them). Let’s rewrite Reader:

module Notebook
  # Public: class to read back notes from a file
  class Reader
    # Public: initialize the reader
    #
    # file - The String name of the file to read in from
    def initialize(file)
      @file = file
    end

    # Public: read back from the notebook file
    #
    # Returns nothing
    def read
      File.open(@file, 'r') do |f|
        while line = f.gets do
          note = line.split(',')
          if note.length != 2
            puts "There is a problem!"
            next
          end
          puts Note.new(note.first, note.last).to_s
        end
      end
    end
  end
end

This demonstrates an important feature of Ruby: the block. The while...do...end is one style of block, which is probably familiar to you from other languages, and does what you’d expect: execute the code between do and end while f.gets continues to return lines. The other block is here:

File.open(@file, 'r') do |f|
Ruby in blocks
Blocks in Ruby are used to interact with methods. The basic format is:

variable.method do |n|
  # put some code here that operates on the variable n
end

The variable n is identified by method, then the code block is passed into method, and is applied to each n in turn. So when opening a file, File.open(@file, 'r') do |f| opens a file and creates a filehandle, here f. The code block after this will then be applied to f. We won’t delve into the code that makes this possible just yet, but this highly extensible way of hooking an arbitrary piece of code into an existing method is one of the things that makes Ruby so powerful, and fun, to use. Blocks can be either enclosed in braces, or with do...end. The Ruby standard is to use braces for single-line blocks, and do...end for multi-line blocks. In the next tutorial, we’ll delve into blocks a bit more, and look at how to write methods that use blocks. For now, just get accustomed to the syntax and the way blocks are used in existing Ruby methods.
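To get a feel for the two block styles with methods that already exist, try something like the sketch below. The three method names are hypothetical wrappers, made up so the results are easy to inspect; map, each and File.open are the standard Ruby methods doing the real work.

```ruby
require 'tmpdir'

# Single-line block in braces: map applies the block to each element.
def double_all(numbers)
  numbers.map { |n| n * 2 }
end

# Multi-line block with do...end: each hands over one word at a time.
def label_all(words)
  labels = []
  words.each do |word|
    labels << "Note #{word}"
  end
  labels
end

# File.open with a block closes the file when the block finishes.
def closed_after_block?(path)
  handle = nil
  File.open(path, 'w') do |f|
    f.puts 'hello'
    handle = f        # keep a reference so we can inspect it afterwards
  end
  handle.closed?
end

p double_all([1, 2, 3])
p label_all(%w[first second])
Dir.mktmpdir do |dir|
  p closed_after_block?(File.join(dir, 'demo.txt'))
end
```

Dir.mktmpdir is itself a block-taking method: it creates a temporary directory, hands its path to the block, and removes the directory when the block is done.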
  # execute code here
end

This block of code is executed on the filehandle returned by File.open(@file, 'r'), as represented by f. One of the advantages of using this syntax is that the file will automatically be closed once the block is finished. Blocks are a powerful and important part of Ruby, but they can also take a little getting used to. See the Ruby in blocks box for a bit more detail.

Now we can write the bin/notebook command (no .rb extension – commands usually don’t have an extension). We want this to have as little in it as possible:

require_relative '../lib/notebook/runner'
runner = Notebook::Runner.new()
runner.run

Spot the deliberate mistake here: we don’t yet have a Runner class. Next, then, dump all the rest of the code from the original version into notebook/lib/notebook/runner.rb, with a little bit of a rewrite to wrap it in a class inside the module:

require_relative 'note'
require_relative 'reader'

module Notebook
  class Runner
    def run
      reader = Reader.new(Note.notebook_file)
      reader.read
      puts 'Enter new note title'
      myTitle = gets.chomp
      puts 'Enter new note body'
      myBody = gets.chomp
      myNote2 = Note.new(myTitle, myBody)
      myNote2.write_to_file
    end
  end
end

Where require searches Ruby’s load path (which includes the standard library files), require_relative looks at a path relative to the current file. So it’s a good choice for loading files within your project. This class doesn’t need an initialize() method, as it has no instance variables or anything else that needs initialising; it’s just created as a blank object to hold the run() method. You’ll need to touch notebook.txt (in the main notebook/ directory) to avoid getting a missing file error; then run ruby -I lib bin/notebook, and you should see any existing notes printed to screen, before being asked to input a new note.
Quick tip To make the program create the file if it doesn’t exist, go back to the read method in Reader and replace File.open(@file, 'r') with File.open(@file, 'a+'). This is read-append mode and creates the file if it doesn’t exist.
Testing, one, two, three
Quick tip Running a test suite is easy. Just create test/test_suite.rb, with the following lines:

require_relative 'test_note'
require_relative 'test_reader'

and run it with ruby notebook/test/test_suite.rb.
Testing is an important part of software development, and the way that our code is now set up is intended to make it easier to test. There’s a standard Ruby test framework, Test::Unit, and a useful gem, shoulda, to help you out a bit more. Install this with gem install shoulda (you’ll likely get quite a lot of other stuff along with it). Shoulda is a test library that helps you to write clearer tests. With shoulda, you can provide context for tests, so you can group them by feature or scenario. Specifically, it allows you to write context, setup and should blocks, which combine to create specific tests. Let’s take a look at this in action. We’ll create some tests for note.rb in test/test_note.rb. Here we’ll look at one of them in detail:

require 'test/unit'
require 'shoulda'
require_relative '../lib/notebook/note'

class TestNote < Test::Unit::TestCase
  context "With no notes" do
    should "have total_notes return 0" do
      assert_equal "Total notes: 0", Notebook::Note.total_notes
    end
  end

  context "With notes" do
    setup do
      @title = "test_title"
      @body = "test_body"
      @test_note = Notebook::Note.new(@title, @body)
    end

    should "output title and body string for to_s" do
      assert_equal "Note: #{@title}, #{@body}", @test_note.to_s
    end

    should "have last line of file equal to test note values after write_to_file" do
      @test_note.write_to_file
      @last_line = `tail -n 1 notebook.txt`
      assert_equal "#{@title}, #{@body}", @last_line.chomp
    end

    should "have total_notes return 2" do
      assert_equal "Total notes: 2", Notebook::Note.total_notes
    end
  end
end

This generates these tests:

"test: With no notes should have total_notes return 0"
"test: With notes should output title and body string for to_s"
"test: With notes should have last line of file equal to test note values after write_to_file"
"test: With notes should have total_notes return 2"

The string labels for the context and should blocks are just labels; they can be whatever you like. Within the context block, the setup block will be run once for each test. You can also nest context blocks if you want (see the boxout earlier for more on blocks).

Running tests, and looking at the structure of the module.
Testing
assert_equal does what you’d expect from the name. The first argument should be what you expect, and the second argument what you actually get. There’s a full set of assertions available via Test::Unit, and documented at RubyDoc; they include assert_not_equal, assert_raise, assert_throws, and so on. Shoulda also adds assert_contains and assert_same_elements, for working with arrays. Run this with ruby test/test_note.rb. It turns out that we get a failure:

1) Failure:
test: With notes should have last line of file equal to test note values after write_to_file. (TestNote) [notebook/test/test_note.rb:32]:
<"test_title, test_body"> expected but was
<"test_title,test_body">.

There’s a misplaced space in there. We need to decide whether we want the space (in which case we edit the code), or don’t (in which case we edit the test). Note that the tests run in alphabetical order, first of the context blocks, and then of the should blocks, within each context block. This means that our ‘zero notes’ test, for example, needs to be alphabetically first, or the @@notes class variable behind total_notes will already be incremented. If you have a situation like this, you could consider either running the tests separately, or numbering the tests to be clear about the requirement. The same issue happens with "have total_notes return 2"; the number returned will depend on how many other tests are run before this one. In fact, this also draws attention to a problem with total_notes more generally: as things stand, it only keeps track of notes created during this session, rather than tracking how many notes are in the notebook file. This is a code problem rather than a test problem, though! We could also run some more tests – for example, we could change the code to require that either or both of the note title and body are non-nil, and test accordingly. Another issue with the tests as they currently stand is that our tests are messing up our actual notebook file!
It would certainly be worth changing the code to use a test notebook – but in fact, we want to be able to specify a notebook file anyway, so we’ll leave that till the next section. We can also write some tests for the Reader class:
    class TestReader < Test::Unit::TestCase
      context "read" do
        should "return nothing when reading test file" do
          test_file = "testfile"
          note = "Test Note,test body"
          File.open(test_file, 'w') { |f| f.write(note) }
          reader = Notebook::Reader.new(test_file)
          assert_equal nil, reader.read
          File.delete(test_file)
        end
      end
    end
You may have noticed the problem here: the Reader class currently just outputs to the console, and doesn’t return anything. There are ways to test console output, but we won’t go into them here because we’re going to do a bit of rewriting of this class anyway. You’ll see another way of writing a File block, though:
    File.open(test_file, 'w') { |f| f.write(note) }
As with the File block in the Reader class, this opens the file, runs the block (in {}) on it, then closes it when the block is finished. Remember to delete the test file afterwards!
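For the curious, one common way to test console output is to point $stdout at an in-memory buffer while the code runs. The following is only a minimal sketch, not part of the Notebook code: capture_stdout is a hypothetical helper name, and it relies on nothing beyond StringIO from the standard library.

```ruby
require 'stringio'

# Hypothetical helper (not part of the Notebook code): runs the given block
# with $stdout temporarily pointed at an in-memory buffer, and returns
# whatever the block printed as a String.
def capture_stdout
  original_stdout = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  # Always restore the previous output stream, even if the block raises.
  $stdout = original_stdout
end

printed = capture_stdout { puts "Test Note,test body" }
# printed now holds the text that would otherwise have gone to the console
```

In a test, you could then compare the captured string with assert_equal, for example by wrapping a call to reader.read in capture_stdout.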
Another failed test; this time I’d removed a piece of code but not the corresponding test.
Options and minor improvements
Another improvement would be to add in some command-line option parsing. The OptionParser class provides an API for that; it lives in the optparse library, which is part of the main Ruby install, so there is nothing extra to install. The first option we’ll add is one to take a reference to a notebook file, or to use the default notebook.
    require 'optparse'

    module Notebook
      class Options
        DEFAULT_NOTEBOOK = "notebook.txt"
        attr_reader :notebook

        def initialize(argv)
          @notebook = DEFAULT_NOTEBOOK
          parse(argv)
        end

        private

        def parse(argv)
          OptionParser.new do |opts|
            opts.banner = "Usage: notebook [ options ]"
            opts.on("-n", "--notebook PATH", String, "Path to notebook file") do |notebook|
              @notebook = notebook
            end
            opts.on("-h", "--help", "Show this message") do
              puts opts
              exit
            end
            opts.parse!(argv)
          end
        end
      end
    end
The attr_reader line sets up an accessor method for @notebook, so it can be used elsewhere in the code. Then initialize() just takes the arguments passed in when creating the object, and runs the private parse() method. This is where the work is done. Check out the box for the details on the various OptionParser methods. Now we need to fix up Runner to use the options:
    require_relative 'options'
    # as well as other files
    class Runner
      attr_reader :options
      def initialize(argv)
        @options = Options.new(argv)
      end
      def run
        reader = Reader.new(@options.notebook)
        # ... rest of code as before
      end
Parsing options
When creating a new OptionParser, you use a do block to set up how it behaves in various situations. (new() yields itself when called with a block – see the other boxout for more on blocks). opts.banner() creates a heading banner for any output produced. opts.on() adds an option switch and handler for that switch. The first argument is a short switch (you could miss this out if you prefer), and the second one is a long switch with a mandatory argument. To specify, instead, an optional argument, you would use "--notebook [PATH]". For a switch with no argument, use "--notebook". Note that if you don’t specify an argument here, an argument passed in on the command line will be silently ignored. We also tell OptionParser to cast the argument to a String, and provide a description string. The do block then acts on the command-line argument, referred to by |notebook|. The help option demonstrates an option switch without an argument. Once all the option switch handlers are set up, opts.parse!() parses whatever has actually been passed in on the command line, removing each one as it is dealt with (parse would leave them in place). There’s scope to get much more complicated and detailed with OptionParser, but it’s straightforward once you have the basic idea.
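To see the behaviour described in this box in isolation, here is a small sketch that is separate from the Notebook code. The parse_demo_args helper and the --quiet switch are made up for illustration; the OptionParser calls themselves (on, parse!) are the standard ones.

```ruby
require 'optparse'

# Hypothetical helper for illustration only; not part of the Notebook code.
# Parses argv in place and returns the settings it found.
def parse_demo_args(argv)
  settings = { notebook: "notebook.txt", quiet: false }
  OptionParser.new do |opts|
    opts.banner = "Usage: demo [ options ]"
    # Long switch with a mandatory argument: "-n" with nothing after it
    # raises OptionParser::MissingArgument.
    opts.on("-n", "--notebook PATH", String, "Path to notebook file") do |path|
      settings[:notebook] = path
    end
    # Switch with no argument at all.
    opts.on("-q", "--quiet", "Say less") { settings[:quiet] = true }
  end.parse!(argv) # parse! removes each switch from argv as it is handled
  settings
end

args = ["-n", "mine.txt", "-q", "leftover"]
settings = parse_demo_args(args)
# settings[:notebook] is "mine.txt", and args is now just ["leftover"]
```

Anything that isn’t a recognised switch or its argument is left behind in the array, which is handy if your tool also takes plain file names.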
    end
And edit notebook/bin/notebook to pass in command-line arguments:
    require_relative '../lib/notebook/runner'
    runner = Notebook::Runner.new(ARGV)
    runner.run
If you try running it without an argument, it should now work; but if you try bin/notebook -n myfile, nothing will be written to myfile. This is because we still have the notebook file hard-coded in lib/notebook/note.rb. So let’s take out any references to @@notebook_file in the Note class (including self.notebook_file), and rewrite write_to_file to take an argument:
    def write_to_file(file)
      File.open(file, 'a+') { |f| f.puts(@title + ',' + @body) }
    end
You’ll notice that we’re using that same handy block syntax again. Back to the Runner class, and edit the write_to_file line:
    myNote.write_to_file(@options.notebook)
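Passing the file name in also makes it easy to try the method against a throwaway file rather than your real notebook. The sketch below assumes a cut-down stand-in for the Note class (DemoNote) and a hypothetical helper, write_two_demo_notes; Tempfile comes from the standard library.

```ruby
require 'tempfile'

# Stand-in for the tutorial's Note class, reduced to what this sketch needs.
class DemoNote
  def initialize(title, body)
    @title = title
    @body = body
  end

  # Append this note to whichever file the caller names.
  def write_to_file(file)
    File.open(file, 'a+') { |f| f.puts(@title + ',' + @body) }
  end
end

# Hypothetical helper: writes two notes to a temporary file, reads the
# lines back, and lets Tempfile.create remove the file afterwards.
def write_two_demo_notes
  Tempfile.create('notebook') do |tmp|
    DemoNote.new("first", "note").write_to_file(tmp.path)
    DemoNote.new("second", "note").write_to_file(tmp.path)
    File.readlines(tmp.path).map(&:chomp)
  end
end

lines = write_two_demo_notes
```

Because the file is opened in append mode ('a+'), the second note ends up after the first rather than replacing it.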
This or that?
Trying out various option switches from the command line.
The Runner here is still pretty basic – it outputs what you have, and adds something else. It would be good to have the option to do one or the other. Back to Options for a bit more parsing. Add this into the class:
    class Options
      attr_reader :notebook, :add, :read
      def parse(argv)
        @add = false
        @read = false
        OptionParser.new do |opts|
          # .... as before
          opts.on("-a", "--add", "Add a note") { @add = true }
          opts.on("-r", "--read", "Read back notes") { @read = true }
          # .... as before
        end
      end
    end
Note that you can do these easy options as a single-line piece of code, using { } rather than do end to set off the block. This can improve readability when used judiciously; but always bear in mind that it’s better to use a couple more lines and have more readable code, than to crush it all onto one line. While we’re here, we’ll also add something to deal with an invalid argument. Replace the line opts.parse!(argv) with this:
    begin
      opts.parse!(argv)
    rescue OptionParser::InvalidOption => e
      puts e
      puts opts
      exit(1)
    end
begin/rescue/end is the Ruby way of handling exceptions. The begin block contains code that might raise an exception, and the rescue block, or blocks, handle specific exceptions. You can also add an else block, which runs if there are no exceptions, and an ensure block, which runs whatever happens, before closing it out with end. Now edit Runner to deal with these new options:
    class Runner
      def run
        reader = Reader.new(@options.notebook)
        if @options.read
          reader.read
        end
        if @options.add
          puts 'Enter new note title'
          title = gets.chomp
          puts 'Enter new note body'
          body = gets.chomp
          note = Note.new(title, body)
          note.write_to_file(@options.notebook)
        end
      end
    end
Try out your new arguments, and you should be able to either read back your old notes (ruby bin/notebook -r), or add a new one (ruby bin/notebook -a). If you try an invalid switch, eg, -f, you’ll get the help output. By default (ie, without any switch), this now does nothing; to make it either read or add by default, change the default value at the top of Options.parse. Note that you may also want to change the behaviour of the switch further down – eg, if you set @read = true at the top, then unless you set @read = false when you parse the -a switch, you’ll both read back old notes and add a new one.
You may have realised at some point during this tutorial that total_notes is no longer doing the Right Thing, as it only tracks notes within a particular session, rather than reading the total from the file. For now, just delete it and any references to it, as we’re going to be arranging the notes a bit differently as of the next round of edits in the next tutorial.
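To make the begin/rescue/else/ensure structure described above a little more concrete, here is a minimal sketch that is unrelated to the Notebook code. safe_divide is a made-up method that records which branches ran, so you can see the order for yourself.

```ruby
# Hypothetical example method: divides a by b and records which parts of
# the begin/rescue/else/ensure structure were executed.
def safe_divide(a, b)
  log = []
  begin
    result = a / b
  rescue ZeroDivisionError => e
    # Runs only if the begin block raised this specific exception.
    log << "rescued: #{e.message}"
    result = nil
  else
    # Runs only if the begin block raised nothing.
    log << "no exception"
  ensure
    # Runs whatever happened above.
    log << "ensure always runs"
  end
  [result, log]
end

safe_divide(6, 3) # => [2, ["no exception", "ensure always runs"]]
safe_divide(1, 0) # => [nil, ["rescued: divided by 0", "ensure always runs"]]
```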
In the next and final tutorial in this series, we’ll look at blocks a bit further; find out more about data storage so we can edit and delete notes; look at packaging, gems, and rake; find out more about mixins; and discover a few more bits and pieces of Ruby syntax and usage along the way.
Ruby: Modules, blocks and gems
Learn more about modules and mixins, blocks and yields, and how to get your code out there, with Juliet Kemp.
In the previous two tutorials we got started with Ruby, learnt some more syntax and structures, and began to organise our code in the way that’s expected in the Ruby community. In this tutorial, we’ll find out more about mixins (the other big use of modules in Ruby), blocks and yields (one of Ruby’s neatest features), and finally, how to package your code as a gem for ease of install and sharing with others. We’ll be building again on the code used in the last tutorial, the Notebook command-line tool to collect short notes (see www.linuxformat.com/files/ca2015.zip for the code).
Data structures and storage
At the end of the last tutorial, we had code that could read our notes back from a file, print them out to screen, and add one to the end. What this doesn’t allow for is deleting or editing any notes, since the notes aren’t held in memory at any point. To delete or edit notes, we’ll need to read them into a data structure so we can refer to them elsewhere. Ruby has the standard data structures, including hashes and arrays, so let’s stick with the straightforward option and put our notes into an array once read in (we’d need to add a key to use a hash, as neither title nor body is guaranteed to be unique).
    class Reader
      attr_reader :notebook
      def read
        notebook = Array.new
        File.open(@file, 'a+') do |f|
          while line = f.gets do
            @@total_notes += 1
            note = line.split(',')
            if note.length != 2
              puts "There is a problem!"
              next
            end
            notebook << Note.new(note.first, note.last)
          end
        end
        notebook.each { |x| puts x.to_s }
      end
      # rest of class
    end
This creates an array, and adds each Note to it with <<, the ‘shovel’ operator, which adds an item to the end of the array. It then outputs the whole array. Run this, and you should see an output a bit like this:
    Note: argh, bin
    Note: a, b
    Note: any, ping
    Note: my, dog
    Note: test_title, test_body
The main issue is that there is no easy way for the user to reference each note. Replace that notebook.each line with:
    notebook.each_with_index { |val, index| puts "#{index}: #{val}" }
We don’t need to explicitly call the to_s method; since we’re referring to our Note in a String context, Ruby will automatically use the appropriate to_s method (while we’re at it, though, edit the to_s method to remove that extraneous “Note:” string). Run this, and your notes will have an index associated with them:
    0: argh, bin
    1: a, b
    2: any, ping
    3: my, dog
    4: test_title, test_body
But how are we going to interact with these? More perturbingly, if you start experimenting and try to refer to this array from another file, you’ll find it’s empty. What’s going on?
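The way string interpolation picks up to_s can be seen in a stand-alone sketch. PrintableNote and numbered_lines are hypothetical names used only here; the real Note class has more to it.

```ruby
# Cut-down stand-in for the tutorial's Note class.
class PrintableNote
  def initialize(title, body)
    @title = title
    @body = body
  end

  # Interpolating an object into a String ("#{note}") calls this method.
  def to_s
    "#{@title}, #{@body}"
  end
end

# Hypothetical helper: builds the numbered lines instead of printing them.
def numbered_lines(notes)
  notes.each_with_index.map { |note, index| "#{index}: #{note}" }
end

notes = [PrintableNote.new("a", "b"), PrintableNote.new("my", "dog")]
numbered_lines(notes) # => ["0: a, b", "1: my, dog"]
```

Returning the lines rather than printing them is a design choice for the sketch; it makes the result easy to check in a test.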
Flexible initialization
When you use Foo.new in Ruby, it calls the new object’s initialize method. In our code so far, we have some initialize() methods without any arguments, and one with an argument (Reader’s initialize(file)). If you call Reader.new with no argument, Ruby will raise an ArgumentError. But what if we wanted to set a default value? (In our code the default value is set in Options, but we could move it.) We could then either specify a notebook file, or call Reader.new without an argument and use the default. Ruby provides a way to do this without having to write multiple constructors:
    def initialize(file = "notebook.txt")
      @file = file
    end
This will use a value passed into the constructor if there is one (Reader.new("foo.txt")) and notebook.txt if not. You could also set a constant earlier in the file and use that (def initialize(file = DEFAULT_NOTEBOOK)).
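As a quick check of the constant-based variant mentioned above, here is a tiny sketch. DemoReader is a hypothetical class that only stores the file name; it is not the tutorial’s Reader.

```ruby
# Hypothetical class showing a default argument taken from a constant.
class DemoReader
  DEFAULT_NOTEBOOK = "notebook.txt"

  attr_reader :file

  # If the caller supplies no argument, file falls back to the constant.
  def initialize(file = DEFAULT_NOTEBOOK)
    @file = file
  end
end

DemoReader.new.file            # => "notebook.txt"
DemoReader.new("foo.txt").file # => "foo.txt"
```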
Singletons
The problem with the Reader code above is that every time you create a new Reader, you’ll also create a new notebook array, which will make it impossible to be sure that you’re always referring to the same array (or that the array has any notes in it). What we want instead is a Singleton class, which can be instantiated only once. Happily, Ruby provides a module to do this. We’ll create a singleton NoteStore class to go alongside Reader, and move some of our functionality into that:
    require 'singleton'

    module Notebook
      class NoteStore
        include Singleton
        attr_accessor :notebook_array
        def initialize
          @notebook_array = Array.new
        end
        def add(note)
          @notebook_array << note
        end
        def edit(number)
          new_note = @notebook_array[number].edit
          @notebook_array[number] = new_note
        end
        def delete(number)
          @notebook_array.delete_at(number)
        end
        def output
          @notebook_array.each_with_index { |val, index| puts "#{index}: #{val}" }
        end
      end
    end
You may at this point notice that this all looks quite clear and straightforward, which is often a good sign that your code is doing the right thing. The reading in will still be done in Reader (see below), but this new class is used to store and access the array data. We’re doing much the same as we were with the array in Reader. The magic happens with that include Singleton line. This makes NoteStore use the Singleton module, which is an example of how Ruby uses modules as mixins, to provide an inheritance mechanism. See the boxout (Modules, classes and mixins, oh my…) for more on this.
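If you’d like to poke at what include Singleton does before wiring it into the Notebook code, the following sketch may help. DemoStore is a made-up class; only the Singleton module itself comes from Ruby’s standard library.

```ruby
require 'singleton'

# Hypothetical class used only to demonstrate the Singleton mixin.
class DemoStore
  include Singleton

  attr_reader :items

  def initialize
    @items = []
  end
end

first = DemoStore.instance
second = DemoStore.instance
first.items << "a note"
# second.items also contains "a note": both names refer to the same object

begin
  DemoStore.new
rescue NoMethodError => e
  # new is private once Singleton is included, so this branch runs
  error_message = e.message
end
```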
Singularity
What the Singleton module does, among other things, is to make the new() method private and add an instance() method. The first time instance() is called, a new object is created; but because new() is private, it can’t be called from any other class or module, and NoteStore.new raises a NoMethodError. This ensures that there is one, and only one, instance of the class, and instance() is how you get at it. To make use of this, Reader now looks like this:
    class Reader
      # no @notebook or @total_notes variable needed
      def read
        File.open(@file, 'a+') do |f|
          # while loop as before, but take out total_notes
          NoteStore.instance.add(Note.new(note.first, note.last))
        end
        NoteStore.instance.output
      end
      def write_all
        File.open(@file, 'w') do |f|
          NoteStore.instance.notebook_array.each { |x| f.puts(x) }
        end
      end
      def total_notes_string
        total_notes = NoteStore.instance.notebook_array.length
        "Total notes: #{total_notes}"
      end
    end
Reading in from the file adds each element to the array in our NoteStore instance, then we output it to the user. Writing the array out again (once we’ve edited it) overwrites the existing file content, again accessing the contents of that NoteStore array. And we can use the array length to get our total notes info. NoteStore.edit calls a Note.edit method, so let’s next write that:
    class Note
      def edit
        puts "Current title is: #{title}; enter new title or enter to keep"
        new_title = gets.chomp
        if (new_title != "")
          @title = new_title
        end
        puts "Current body is: #{body}; enter new body or enter to keep"
        new_body = gets.chomp
        if (new_body != "")
          @body = new_body
        end
        return self
      end
    end
Again, this is straightforward. If there’s a new title or body, we change the values in the Note, and return the Note itself. In NoteStore.edit, that new note is used to replace the old one in the notebook array. Finally, then, Runner and Options need to be set up to fire all of this off. First, we add an edit option to Options:
    class Options
      attr_reader :notebook, :add, :read, :edit, :edit_number
      def parse(argv)
        # rest of code here as before
        @edit = false
        OptionParser.new do |opts|
          opts.on("-e", "--edit NUMBER", Integer, "Edit a specific note") do |number|
            @edit = true
            @edit_number = number
          end
        end
      end
    end
Now we add code to Runner to handle it:
    require_relative 'notebook'
    module Notebook
      class Runner
        def run
          reader = Reader.new(@options.notebook)
          # read and add options as before
          if @options.edit
            reader.read
            if @options.edit_number >= NoteStore.instance.notebook_array.length
              puts "No note of that number; can't edit!"
              return
            end
            NoteStore.instance.edit(@options.edit_number)
            reader.write_all
          end
        end
      end
    end
The only thing to really draw your attention to here is the error-handling; we need to check that the number to edit actually exists within the array. Try now with ruby bin/notebook -r to see what notes you have, then ruby bin/notebook -e 2 to edit the note with index 2, then ruby bin/notebook -r again to see what you now have. It should all work as expected. In fact, as you might already have thought, Reader and NoteStore could just as well be the same (singleton) class; try making that change to the code yourself.
Quick tip
Blocks and yields: the lowdown
We’ve used blocks in several places in the code, but without really looking at what they’re doing. Let’s try using a block with a yield to understand what is actually happening under the surface. Blocks and yields are one of the most powerful features of Ruby, so they’re well worth getting to grips with. Before doing something with our real code, take a look at a very simple code block.
    def my_first_block
      puts "Starting block..."
      yield
      puts "...and ending block"
    end
    my_first_block { puts "Hello" }
Run this and it should output
    Starting block...
    Hello
    ...and ending block
The yield statement simply runs whatever block was passed in between the braces. The block can be multi-line if you want. So far, so good; but the blocks we’ve already seen in code use a parameter (look at the File.open blocks, for example). How do we set up a block with a parameter?
    def things_with_five
      yield 5
    end
    things_with_five { |x| puts 3 * x }
This outputs 15. If you swap in 15 / x for 3 * x in the block, you’ll get 3. Now try this:
    def things_with_five_and_ten
      yield 5
      yield 10
    end
    things_with_five_and_ten { |x| puts 3 * x }
Output is 15, then 30. Effectively, what happens is this pseudocode:
    things_with_five_and_ten:
      puts 3 * 5
      puts 3 * 10
Each time, yield takes the block, sticks first 5, then 10, into the x variable, and runs the block. The yield statement runs the code you write in the block, using the values that its own method supplies. Now let’s write a NoteStore.do_to_all method, which will apply a change to every single note, a Note.edit_title method as a test, and an option in Runner to call it:
    class NoteStore
      def do_to_all
        i = 0
If you get an error LoadError: cannot load such file -- myfile, check that all the files are included in the s.files line of the gemspec. If using git ls-files, make sure everything is checked in to git!
Modules, classes and mixins, oh my…
In our code until now, we’ve really only been using modules for their namespace purposes. The use of the Singleton module demonstrates their other purpose: mixins. One way of thinking of modules is to see them as providing characteristics, whereas classes provide things. Since things can have characteristics, classes can include modules, and thereby access their ‘characteristics’ (methods and variables). This is demonstrated by NoteStore. The include Singleton line means that NoteStore includes the instance() method, the newly private new() method, and the other rewritten or added methods that make the Singleton pattern work. Another example might be if we wanted to have two different sorts of notes: ones which were editable and ones which were not. We could set this up as follows:
An Editable module, describing various methods which could apply to an Editable thing.
A Note class, which has methods and variables which apply to (all, including non-editable) Notes.
An EditableNote class, which is a subclass of Note and includes Editable.
Subclasses, in Ruby as with other languages, can be thought of as a ‘specialisation’ of their parent class. It would look like this:
    class EditableNote < Note
      include Editable
      # rest of class goes here
    end
EditableNote would inherit methods from both Editable and Note, and could also override those to make its own versions. However! Notice that classes cannot inherit variables. Instance variables in Ruby are created when a value is first assigned to them. If an instance of a subclass uses an inherited method that assigns a value to a variable (for example, EditableNote might inherit a set_title method which sets the @title variable), it will then acquire its very own @title variable. But that variable belongs to the EditableNote object itself; it doesn’t ‘shadow’ a variable in the parent class. Another thing to keep in mind is an important feature of modules: they can’t be instantiated. Only a class can be instantiated. This means that you could, instead of using Singleton, write a module as a type of singleton class (containing variables and methods); however, the Singleton module is probably a better way of doing this, as it has done the hard work for you. Keep an eye out when looking at Ruby code for ways in which modules are used as mixins, and how that can make your Ruby code more flexible and user-friendly.
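Here is one way the Editable arrangement described in this box might be sketched. All three names (Editable, PlainNote, EditableNote) are hypothetical, with PlainNote standing in for the tutorial’s Note class so that the example runs on its own.

```ruby
# A characteristic that can be mixed into any class with a @title.
module Editable
  def edit_title(new_title)
    @title = new_title
    self
  end
end

# Stand-in for the tutorial's Note class: no editing methods of its own.
class PlainNote
  attr_reader :title

  def initialize(title)
    @title = title
  end
end

# A specialised note: inherits from PlainNote and mixes in Editable.
class EditableNote < PlainNote
  include Editable
end

note = EditableNote.new("old title")
note.edit_title("new title").title  # => "new title"
PlainNote.new("fixed").respond_to?(:edit_title) # => false
EditableNote.ancestors.take(3) # => [EditableNote, Editable, PlainNote]
```

The ancestors list shows where Ruby looks for a method: first the class itself, then any included modules, then the parent class.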
Blocks in action!
        while @notebook_array[i]
          new_note = yield @notebook_array[i]
          @notebook_array[i] = new_note
          i += 1
        end
      end
    end
    class Note
      def edit_title(title)
        @title = title
        return self
      end
    end
    class Runner
      def run
        # ... code as before
        if @options.all
          reader.read
          NoteStore.instance.do_to_all do |n|
            n.edit_title(@options.all_title)
          end
          reader.write_all
        end
      end
    end
(You’ll also need to add an all option to the Options class, which takes a parameter with which we’ll replace all the titles. This is exactly the same syntax as the other options; see the link on the Contents page for the code). So, Runner.run calls the do_to_all method on our singleton NoteStore, with an edit_title block applied to its variable n. do_to_all uses a while loop to put each element of @notebook_array in turn in as n, via the yield statement. One way to see this is that do_to_all throws one note at a time back to Runner.run, so it can be substituted in for that n variable, like this:
    new_note = @notebook_array[i].edit_title(@options.all_title)
The do_to_all method then runs its next couple of lines of code before looping round again. It’s acting as an iterator. In fact, because @notebook_array is an Array, and Arrays already have iterator methods such as each and map!, you can simplify this further:
    def do_to_all
      @notebook_array.map! do |note|
        yield note
      end
    end
We use map! rather than each here because map! stores whatever the block returns back into the array, just as the while loop did; with each, assigning to the block variable would leave the array untouched. This time, we’re nesting our yield/block structures: do_to_all itself uses a block to access the members of @notebook_array one by one, and then passes them back up to the block in Runner.run. Which is pretty neat, if a little hard to get your head around initially.
OK! With all that in place, try running ruby bin/notebook -x NOTHING to see all your titles replaced by the string NOTHING (to use this in anger, you’ll probably want to think of some other methods to apply to your Notes; you might want to append a string to the title or body instead, for example, or even create an interactive method which allows you to make a change to each title one by one). Blocks and yields are one of the most powerful aspects of Ruby, so keep playing around with them, and look out for them in other methods and classes that you use. You’ll soon notice that they show up nearly everywhere, and that too will help you get used to their structure and uses.
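As a starting point for that kind of experiment, here is a stand-alone variation that appends a string to every title. DemoNoteStore is a hypothetical, non-singleton stand-in that holds plain strings rather than Notes, and it uses Array#map! so that whatever the block returns replaces each element.

```ruby
# Hypothetical stand-in for NoteStore, holding plain strings for brevity.
class DemoNoteStore
  attr_reader :notebook_array

  def initialize(titles)
    @notebook_array = titles
  end

  # Yields each element to the caller's block and stores the block's
  # return value back into the array in its place.
  def do_to_all
    raise ArgumentError, "do_to_all needs a block" unless block_given?
    @notebook_array.map! { |title| yield title }
  end
end

store = DemoNoteStore.new(["milk", "eggs"])
store.do_to_all { |title| title + " (urgent)" }
store.notebook_array # => ["milk (urgent)", "eggs (urgent)"]
```

The block_given? check is optional, but it gives a clearer error than the LocalJumpError Ruby raises when yield is called without a block.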
Building, rake and gems
Once you’ve produced a decent piece of code, the next thing you might want to do with it is to share it with the Ruby community; and the standard way of doing that is with a gem. You may have already used gems – effectively, they’re packages for Ruby, and RubyGems provides a straightforward package management system. As of Ruby 1.9, it’s part of the standard Ruby install, so you shouldn’t need to do anything more to use gems. The standard commands to manage already-existing gems are:
    gem install mygem
    gem uninstall mygem
    gem list --local   # lists installed gems
    gem list --remote  # lists available gems
What about creating your own gem? We’ve already set our Notebook code up in a gem-like way, with bin/, lib/, and test/ directories. There are a couple of things we’re missing, though, before we can package up our gem. First, it’s customary to have a notebook.rb, which just sets up our other library files:
    require 'notebook/note'
    require 'notebook/notestore'
    require 'notebook/options'
    require 'notebook/reader'
    require 'notebook/runner'
This helps to ensure that namespaces work properly and no one steps on anyone else’s toes. We’ll also rewrite notebook/bin/notebook just a little to fit the new gem structure:
    #!/usr/bin/env ruby
    begin
      require 'notebook'
    rescue LoadError
      require 'rubygems'
      require 'notebook'
    end
    runner = Notebook::Runner.new(ARGV)
    runner.run
The begin/rescue/end control structure here avoids requiring RubyGems unless it’s needed. If someone is not using RubyGems to manage their path, you don’t want to force them to do so. But
it’s reasonable to use RubyGems in the rescue block, as a second chance to load your gem if it hasn’t been found by whatever the local path management system is. Your gem will need a version number, and best practice is to store that in lib/notebook/version.rb:
    module Notebook
      VERSION = "0.0.1"
    end
Next, we need to write a basic gemspec, notebook.gemspec, which lives in the top directory of your project. A gemspec is a specification for your gem, with a list of attributes, most of which are optional. The required ones are date, name, summary, and version. platform and require_paths are technically also required, but both have defaults that should work fine, so you needn’t specify them yourself. Here’s a short gemspec for our gem:
    # -*- encoding: utf-8 -*-
    lib = File.expand_path('../lib', __FILE__)
    $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
    require 'notebook/version'

    Gem::Specification.new do |s|
      s.name        = 'notebook'
      s.version     = Notebook::VERSION
      s.date        = '2013-02-10'
      s.summary     = "Notebook"
      s.description = "A notebook gem to hold single-line notes"
      s.authors     = ["Juliet Kemp"]
      s.email       = 'juliet@example.com'
      s.files       = `git ls-files`.split("\n")
      s.test_files  = `git ls-files test/*`.split("\n")
      s.executables = ["notebook"]
      s.homepage    = ""
    end
The lines at the top allow your version file to be located. Check the RubyGems documentation for more gemspec options you might want. One common one is to specify runtime dependencies, but as our only external library, OptionParser, is part of the main Ruby install, there’s no need to include it here. Git is the strongly recommended way to keep track of the files in a gemspec without having to write them all out (some gem/bundle tools won’t work at all if there’s no git repo), but a flat list of files here would of course also do the job. It’s important to list all of the files in your gem; any files not listed here won’t be available to the gem, which will cause it to break. It’s also good practice to write a README.md file to demonstrate usage; here’s a very brief one:
    # Notebook
    ## Installation
    gem install notebook
    ## Usage
    require 'notebook'
    'notebook -h' for help information on the command line
Normally, you’d include API information here, but our gem is currently designed to be used from the command line rather than within another piece of Ruby code. Finally, build and install the gem:
    gem build notebook.gemspec
    gem install ./notebook-0.0.1.gem
You should now be able to type just notebook -r and have your notebook read back (one thing to bear in mind is the current location of your notebook file; you may need to specify it on the command line if you’ve moved directories). You’re now ready to contribute your (ruby) gems of code wisdom to the community as a whole, as you continue along your new Ruby programming path.
Here we’re building and installing our gem package.
Using Rake and Bundler
Rake is the Ruby equivalent of the Unix tool make, and operates similarly. It’s useful to automate your gem build process. The Bundler gem (available using gem install bundler) will also help you to build a well-structured gem. If you’re starting from scratch, you can use Bundler to create a directory for you to code in, including a nice new git repo for it. If you have already worked through your gemspec construction by hand, as above, you can still use the Rake tools to automate your install, by using this Rakefile:
    require "bundler/gem_tasks"
Now rake install will build and install your gem. The other tasks Bundler installs automatically are ‘build’ (which builds your gem) and ‘release’ (which tags it, pushes the source to GitHub, and pushes the gem to rubygems.org). Be sure it really is finished and ready for public release before you do this! You can also set Rake up to manage your tests. One way to do this is to add these lines to your Rakefile:
    require 'rake/testtask'
    Rake::TestTask.new do |t|
      t.libs << "test"
      t.test_files = FileList['test/test*.rb']
      t.verbose = true
    end
Type rake test to run everything in test/test*.
Running tests with Rake; time to rewrite the test suite…
Ruby on Rails: Web development
Gavin Montague introduces us to Ruby on Rails, a powerful framework that puts programmer happiness first.
The banner on the Ruby on Rails homepage isn’t what you might expect. It doesn’t proclaim Rails to be the fastest, most powerful, or most all-encompassing framework; instead, it gives Rails’ goal as “optimising programmer happiness and sustained productivity”. If you’ve had previous dalliances with other frameworks, you might appreciate this goal. Too many development tools sacrifice the productivity of the programmer in exchange for more ‘power’ in the code. Thankfully, Rails is not only a powerful tool – it’s nice to use too! The structure of Rails is such that it encourages good design, manageable code and best practice at each stage of development. This means you’ll spend less time being tied in knots by cryptic code, unexpected bugs and boilerplate, giving you more time to build awesome things. Over the course of this three-part series, we’ll see how Rails can help us to not only make great web apps, but build them in a way that keeps us happy and productive. In this part, we’ll take a quick tour of the framework, bootstrap our sample application and add some functionality.
Get started
Although your distro almost certainly has packages available for Ruby, we’re going to use rbenv to get us up and running. Rbenv allows us to create and manage entirely isolated versions of Ruby inside a user’s home environment. Ruby developers use this to hop between different versions of Ruby as projects require, but its main use for us is to avoid touching the root-owned system and ensure that we all have a consistent starting point. Start by installing rbenv and ruby-build via the Git source code management tool; if you don’t have Git, your package manager will be able to provide it. Open a terminal and type:
    $ git clone git://github.com/sstephenson/rbenv.git ~/.rbenv
Make your shell aware of rbenv by adding it to your start-up files. If you don’t have .bash_profile in your home directory, change the end of the command to ~/.profile, or
The console Rails is a web framework, but it’s not just accessible in your browser. bundle exec rails console This starts the interactive console. You can execute code directly from within your Rails environment but without the need for a browser. For
54
example, you can add a new task: t = Task.create(:title=>”Use the console”, :due_at=>Time.zone.now+10. minutes) The console is a great way of quickly experimenting with objects and methods that you’re unfamiliar with.
see the rbenv home page for help. $ echo ‘export PATH=”$HOME/.rbenv/bin:$PATH”’ >> ~/. bash_profile Reload your shell, or open a new window, and type rbenv – you should see rbenv print itself out. If so, we can now install Ruby – the language underlying Rails. $ git clone git://github.com/sstephenson/ruby-build.git ~/. rbenv/plugins/ruby-build $ rbenv install 1.9.3-p392 $ rbenv rehash After this, running `ruby --version` should print out the path to Ruby in your ~/.rbenv folder. We’ll also be using SQLite3 to store data later. Your package manager will have a suitable version, eg: $ sudo apt-get install sqlite3
Introducing Ruby Ruby is intended to be a language that makes programmers productive and happy. The syntax lends itself to very clear expression of ideas, and a working grasp of its simple but powerful features can be picked up in an hour or two. In technical terms, Ruby is an Object Oriented language: everything you interact with is treated as a self-contained ‘box’ of data that you perform operations on via methods. For example:

    a = "a string"
    a.reverse    # "gnirts a"
    array = [3, 2, 1]
    array.sort   # [1, 2, 3]
    5.next       # 6

In Ruby, you’ll spend most of your time writing classes, best thought of as blueprints for creating objects. Here, we create a new class of Person, who knows that they have a name and how to return a greeting.

    class Person
      attr_accessor :name

      def initialize(name)
        self.name = name
      end

      def greet
        "Hi, I'm #{name}"
      end
    end

    bob = Person.new "Bob"
    bob.greet    # => "Hi, I'm Bob"

An interesting feature of Ruby is that parentheses are largely optional, and certain non-alphabetic characters can be used in method names, so it’s perfectly valid to write:

    name = "Bob"
    name.is_a? String
    array = ["apple", "orange", nil]
    array.empty?
    array.compact!

This makes Ruby ideal for writing very expressive, easily read code. The following is executable Rails code, but its meaning is quite apparent, even to non-programmers.

    objects.each do |o|
      o.update_timestamp and o.save! unless o.new_record?
    end

We’ll be focusing on Rails from here on in, but I’d recommend having a look at the previous Ruby tutorials starting on page 38.
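To make the naming conventions above a little more concrete, here is a minimal sketch in plain Ruby (no Rails required). The TodoItem class and its methods are invented purely for illustration and are not part of the tutorial’s application.

```ruby
# Hypothetical class for illustration; not part of the tutorial's app.
class TodoItem
  attr_accessor :title, :done

  def initialize(title)
    self.title = title
    self.done = false
  end

  # By convention, methods ending in ? answer a yes/no question.
  def done?
    done
  end

  # By convention, methods ending in ! change the object in place.
  def complete!
    self.done = true
    self
  end
end

item = TodoItem.new "Write tutorial"
puts item.done?
item.complete!
puts item.done?

# compact! removes nil entries from the array itself.
list = ["apple", nil, "orange"]
list.compact!
puts list.inspect
```

The ? and ! suffixes are only conventions: Ruby doesn’t enforce them, but following them is what keeps code like the examples above readable.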
Set up Rails Rails is distributed as a gem, the Ruby community’s standard package format, so let’s install it and generate the skeleton for our application:

    $ gem install bundler
    $ gem install rails --version=3.2.12
    $ rbenv rehash
    $ rails new todolist --skip-test-unit

Rails is an opinionated framework. There’s a correct place for everything in its world, and this is enforced across all projects, starting with the pre-defined file structure. If you use Git for source control management, you’ll also appreciate that Rails has dropped in appropriate .gitkeep and .gitignore files. Dependencies on other Ruby libraries are managed via the Gemfile at the root of your new project. Open it up and add the following around line three.

    gem "therubyracer", "~> 0.11.4"
    group :development, :test do
      gem "rspec-rails", "~> 2.13.0"
    end

Then, back in the terminal, move into the project directory, install these gems with Bundler and run some post-install magic. We’ll use these libraries in the next tutorial, when we’ll learn about behaviour-driven development, but for now they can be ignored.

    $ cd todolist
    $ bundle install
    $ rbenv rehash
    $ rails generate rspec:install
Rails’ scaffold tool might not produce the prettiest pages in the world, but it’s a great way of getting up and running quickly with well-designed code.

Explore the MVC Rails adheres to the MVC (Model/View/Controller) design pattern. This is an abstract way of thinking about where responsibility for different aspects of your application lies. Rails maps these layers to three eponymous folders in /app.

Your models are meaningful collections of data that will generally be very specific to your application. Users, Messages, Calendars, Tasks and Projects are suitable model objects. It’s up to the Model to handle its own storage and relationships with other Models, and ensure that its internal state is both accurate and valid. In our app, Tasks and Projects will be models.

The views present our data to the end-user. In a Rails app, this normally means HTML, JSON, XML or PDF. Views should worry about how stuff looks, but not much else. Rails provides a number of helper libraries for turning Models into HTML, dealing with internationalisation and so forth.

The middle men of the stack are the controllers. They deal with interpreting the incoming request and lining up the correct Models to pass to the View for display. In our app, the controller decides which Task is to be operated on, what the operation is and what templates to render.

The usefulness of the MVC approach is that it creates clear areas of responsibility. For example, a Task will have certain requirements before it can be considered ‘valid’ (eg, it must have a title and a due date). The controller need never know what the requirements are, but it will know how to ask a task, “are you valid?” and where to direct the browser as a result. Similarly, Tasks know nothing about HTML, but the view knows how to take data and present it as a form, table and so forth.
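The division of labour described above can be sketched in a few lines of plain Ruby. This is only an illustration of the idea, not how Rails implements it: the class names (SketchTask, SketchTaskView, SketchTasksController) and the messages are all hypothetical.

```ruby
# Plain-Ruby sketch of the MVC split. All names here are hypothetical;
# real Rails models, views and controllers are far richer than this.

# Model: owns its data and decides whether it is valid.
class SketchTask
  attr_reader :title, :due_at

  def initialize(title, due_at)
    @title = title
    @due_at = due_at
  end

  def valid?
    !title.to_s.strip.empty? && !due_at.nil?
  end
end

# View: only cares about presentation.
class SketchTaskView
  def render(task)
    "<h1>#{task.title}</h1><p>Due: #{task.due_at}</p>"
  end
end

# Controller: asks the model "are you valid?" and picks what happens next.
class SketchTasksController
  def create(title, due_at)
    task = SketchTask.new(title, due_at)
    if task.valid?
      [:show, SketchTaskView.new.render(task)]
    else
      [:form, "Please supply a title and a due date"]
    end
  end
end

controller = SketchTasksController.new
p controller.create("Buy milk", "2014-12-01")
p controller.create("", nil)
```

Notice that the controller never inspects the title or the date itself; it only reacts to the answer the model gives.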
Create your tasks Let’s see how this works out in practice by building our first feature:

    $ bundle exec rails generate scaffold task title:string notes:text due_at:datetime done:boolean

This generates a ‘scaffold’: a boilerplate version of our MVC stack for the Task class. This scaffold contains just enough code to enable us to manipulate a database table called ‘tasks’ via a Model, a Controller and some Views. On seeing the scaffold, most people react in one of two ways: “OMG! Rails writes code for me!” or “Oh, Rails is just a boilerplate generator”. In fact, Rails is neither of these things. The scaffold is intended as a training wheel and a tool for getting running quickly. As its name suggests, scaffolding gives support during construction: it’s not meant to be permanent. What the scaffold does give us is an idiomatic example of how to write code in Rails. Rails has also written the instructions for building a table in our SQLite database to store our objects. It does this through migrations: timestamped Ruby files, which can be found in /db/migrate. Notice how we’ve not written anything that ties us to a particular kind of database. Whether I were running MySQL, PostgreSQL or SQLite, this migration would generate the correct SQL to build the table.
By describing our tables in Ruby rather than SQL, we remain adaptable. Let’s now build the database and start Rails’ built-in server:

    $ bundle exec rake db:migrate
    $ bundle exec rails server

Open http://localhost:3000/tasks in your browser to start interacting with the scaffold by adding, editing and deleting items. On the new and edit pages, you’ll see how Rails has read the types of our database columns and generated appropriate form fields: checkboxes for booleans, selects for dates, textareas for big strings. If you need to quit the Rails server, just press Control+C. Notice how Rails has also decided your URL structure for you. By default, Rails adheres to a REST architecture (http://en.wikipedia.org/wiki/Representational_state_transfer). In practical terms, this means that all Rails projects automatically share the same relationships between objects, controllers and URLs. This makes building open-source extensions easier because all apps will share the same assumptions about where functionality should live. The file config/routes.rb is responsible for mapping URLs to controllers, actions and parameters, as we’ll see later. You can do an awful lot with Rails’ routing language, but the default will serve our purposes today.
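To get a feel for the URL structure Rails has chosen, it can help to see the seven conventional routes for a resource written out as data. The sketch below is plain Ruby rather than Rails’ own routing code; the TASK_ROUTES table and the recognise helper are hypothetical names used only to illustrate how a request’s verb and path together select an action.

```ruby
# The seven conventional routes for a "tasks" resource, written out as plain
# Ruby data. This is only a sketch; Rails builds the real table from routes.rb.
TASK_ROUTES = [
  ["GET",    %r{\A/tasks\z},            :index],
  ["GET",    %r{\A/tasks/new\z},        :new],
  ["POST",   %r{\A/tasks\z},            :create],
  ["GET",    %r{\A/tasks/(\d+)\z},      :show],
  ["GET",    %r{\A/tasks/(\d+)/edit\z}, :edit],
  ["PUT",    %r{\A/tasks/(\d+)\z},      :update],
  ["DELETE", %r{\A/tasks/(\d+)\z},      :destroy]
]

# Hypothetical helper: find the action (and id, if any) for a request.
def recognise(verb, path)
  TASK_ROUTES.each do |route_verb, pattern, action|
    match = pattern.match(path)
    next unless route_verb == verb && match
    return { :action => action, :id => match[1] }
  end
  nil
end

p recognise("GET", "/tasks")
p recognise("PUT", "/tasks/42")
p recognise("GET", "/nowhere")
```

The same path can map to different actions depending on the HTTP verb, which is why /tasks/42 serves for showing, updating and destroying a task.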
Rails comes with a built-in web server that’s suitable for development work. Fire it up, visit localhost:3000 and follow what’s happening in the terminal.

Rails 4 The most recent version of Rails at the time of writing is 4.1.6, which includes some exciting features to make developers and users even happier. These include:
Support for streaming data to the browser.
Turbolinks – AJAX goodness to speed up every request.
A common API for background queuing.
And a lot more!
We’ve used Rails 3 here to ensure that you can complete this series of tutorials without having to rely on prerelease software. However, upgrading from 3 to 4 should be relatively easy, and the basic structure and features of any Rails 3 app won’t be too heavily affected. If you’re interested in progressing further with Rails, keep an eye on www.rubyonrails.org, where you’ll be able to find out when new versions are released.

The model Open your text editor inside the ./app directory, and we’ll start examining models/task.rb. Not a lot there, is there? Rails’ ActiveRecord classes take care of all the interactions with your database and leave you to think about the behaviour of the object. All our model initially contains is a list of attributes that can be set by bulk assignment. Rails will infer what attributes a task object should have, based on the columns of its database table. We just have to make sure we protect any sensitive fields that users shouldn’t be able to set as they edit records. Additionally, Rails provides a huge number of commonly needed functions for database-backed applications, such as the management of inter-model relationships and management of validity. A task isn’t much use if you don’t know what it is, so let’s make sure we always have a title before allowing tasks to be saved. Add the following inside the Task class – anywhere between the first line and the final ‘end’:

    validates :title, :presence => true

Go back to your browser and try creating a new task without a title – you’ll be presented with an error state. The validates method in Rails allows us to enforce a large number of common requirements on our models without having to hand-roll code each time we want to make sure a field is a certain length, a certain state, or present/absent. Let’s add a slightly more clever validation. We shouldn’t be able to create Tasks with due dates that have passed, so edit task.rb, then try to create a new task that’s in the past.

    validate :due_at_is_in_the_past

    def due_at_is_in_the_past
      errors.add(:due_at, 'is in the past!') if due_at < Time.zone.now
    end

That worked, but wait. What happens if we try to mark a previously created task that’s overdue as complete? Go to one of your pre-existing tasks whose due date has passed and try to save it. Hmm, looks like we’ll have to be a bit smarter and specify that this condition applies only to new tasks. Alter your code:

    validate :due_at_is_in_the_past, :on => :create

Rails now knows to only apply this validation when records are first created.
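If the idea of a validation that only applies on creation seems abstract, the following plain-Ruby imitation may help. It doesn’t use Rails’ validation machinery at all: SketchRecord and everything in it is hypothetical, and, unlike the model code above, the sketch also skips the date check when no due date is set.

```ruby
# A plain-Ruby imitation of the two validations, to show the idea only.
# Rails' real validation machinery is not used here; every name is hypothetical.
class SketchRecord
  attr_accessor :title, :due_at
  attr_reader :errors

  def initialize(title, due_at)
    @title = title
    @due_at = due_at
    @errors = {}
    @saved = false
  end

  def new_record?
    !@saved
  end

  def valid?
    @errors = {}
    @errors[:title] = "can't be blank" if title.to_s.empty?
    # Only checked when the record is first created, like :on => :create.
    if new_record? && due_at && due_at < Time.now
      @errors[:due_at] = "is in the past!"
    end
    @errors.empty?
  end

  def save
    return false unless valid?
    @saved = true
  end
end

# A brand new record with a past due date is rejected.
overdue = SketchRecord.new("Old job", Time.now - 3600)
puts overdue.save
puts overdue.errors.inspect

# A saved record that later becomes overdue can still be saved again.
task = SketchRecord.new("Job", Time.now + 3600)
task.save
task.due_at = Time.now - 3600
puts task.save
```

The key point is that the model keeps track of whether it has been saved before, so the rule about due dates can be relaxed for existing records without the controller being involved.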
The controller When you submitted the task form, how did Rails translate it into data and how did it then know to display the form again or redirect us off to the show action? To answer that, we look at app/controllers/tasks_controller.rb. Each method in our tasks_controller becomes the entry point for a user’s browser request. Depending on the URL and the type of request (POST, GET, etc), Rails’ routing system will execute one action on one controller inside your application and deliver the output back to the browser. Take a look at the update method:

    def update
      @task = Task.find(params[:id])

      respond_to do |format|
        if @task.update_attributes(params[:task])
          format.html { redirect_to @task, notice: '...' }
          format.json { head :no_content }
        else
          format.html { render action: "edit" }
          format.json { render json: @task.errors, status: :unprocessable_entity }
        end
      end
    end
In not a lot of code we accomplish quite a lot. We start by finding one of our tasks via the Task class’s find method and the id parameter that Rails has extracted from the URL. We then go into a respond_to block, which is Rails’ way of dealing with different output types. We’re not dealing with JSON here, so focus on the HTML formatters. First, we try to update_attributes on our task object with the data passed in from the form. This will return either true or false, depending on what state our task has been put into. If we fail to save the task, usually because of a validation error, the controller renders our edit action. If we are successful, the browser is redirected to the @task itself. Rails assumes that if we ask to redirect to an object rather than a URL then there must be a ‘show’ action of a controller named after the object’s class. Notice here what we didn’t have to do. We never had to pick apart the incoming request ourselves: Rails automatically parsed it into Ruby objects for us to manipulate. We didn’t have to know about the internals of our task object. Our controller simply relied on @task to manage its own state, and instead concerned itself with the result. Finally, by following Rails’ naming conventions we were able to redirect/render as required without directly calling URLs or template files. This might not seem like a lot, but consider that pretty much every page of every web application on the planet is a collection of showing, editing, updating, creating or destroying database records. Rails provides just the right level of abstraction to make the whole process almost seamless. You can always go deeper if need be, but an awful lot can be achieved without the need to do this.
The best place to get more information on Rails is www.rubyonrails.org, the project’s home. Here, you’ll find numerous screencasts, links to other tutorials and more than enough information to take Rails to the next level.

The view Finally, let’s look at our templating language. Open the views directory, and you’ll see we have five files: in Rails terms, four are ‘actions’ and one is a ‘partial’. All are written in Rails’ default templating language, ERB. Essentially, ERB is just Ruby injected into text files. Although this means you can technically call any Ruby method in your templates, it would be exceptionally bad form to do so. Templates should only concern themselves with the presentation of data. Generally speaking, if you find more than two consecutive lines of Ruby in a template then something’s wrong. Inside our ‘edit’ template, you’ll see that it just renders a different template: the ‘form’ partial. This makes a lot of sense when you consider that updating an existing task and creating a new one are essentially the same process. Why bother having two copies of the same form when we can just reuse one? This exposes another principle of Rails: Don’t Repeat Yourself, or DRY. Code should exist in one definitive, canonical place. If we had to maintain two copies of the same form then one would almost certainly get out of step with the other. DRY keeps our code clean and our minds uncluttered. Look at the _form.html.erb partial. You’ll see that the majority of the template takes place inside form_for(@task). This wraps our @task object inside a form helper, which maps our object’s data to HTML tags, deals with displaying error messages and decides which URL our form should point to.
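ERB ships with Ruby itself, so the idea of one shared partial can be tried outside Rails. The sketch below is a simplification under stated assumptions: FORM_PARTIAL and render_form are hypothetical stand-ins for _form.html.erb and Rails’ partial rendering, and the markup is far simpler than what form_for produces.

```ruby
require 'erb'

# One shared "partial", reused by two hypothetical actions (DRY).
FORM_PARTIAL = ERB.new(<<-TEMPLATE)
<form action="<%= action %>">
  <input name="title" value="<%= ERB::Util.html_escape(title) %>">
</form>
TEMPLATE

# Hypothetical helper standing in for Rails' rendering of a partial.
# The template reads the local variables action and title via the binding.
def render_form(action, title)
  FORM_PARTIAL.result(binding)
end

puts "<h1>New task</h1>"
puts render_form("/tasks", "")
puts "<h1>Edit task</h1>"
puts render_form("/tasks/1", "Buy milk & eggs")
```

Both pages get their form from the same template, so a change to the form only ever has to be made in one place.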
Add a new feature Now that we know what lives where, let’s add a new feature to our app: tasks should have a priority, and higher-priority tasks should be displayed first on our index. First off, we’ll need a new field on our model to store the priority in. We do this by creating a new migration. By following the convention of ending the name with that of the model, Rails will deal with making sure the correct table is altered.

    $ bundle exec rails generate migration add_priority_to_tasks priority:integer
    $ bundle exec rake db:migrate

Next, we’ll update our form to allow users to assign a priority. Open up the form partial and add a new field to expose our priority in HTML:

    <div class="field">
      <%= f.label :priority %><br />
      <%= f.text_field :priority %>
    </div>

Before we can make use of this new field, tell our model that users are allowed to make changes to the field directly:

    class Task < ActiveRecord::Base
      attr_accessible :done, :due_at, :notes, :title, :priority

Actually, while we’re here let’s also ensure that people don’t try anything silly, such as assigning the priority “very” to a task...

    validates :priority, :numericality => true

We can now build tasks and ensure that they have a priority. All that’s left is to make them display in the correct order. Tell our controller’s index action to specifically request that the collection is sorted by priority by changing the first line where we build @tasks:

    @tasks = Task.order(:priority)

This sorts in ascending order, so a task with priority 1 appears before a task with priority 2. Rails provides a very nice syntax for building up quite complex queries on our database, without having to drop down to SQL. For example, we could exclude any completed task from our index by changing to:

    @tasks = Task.where(:done => false).order(:priority)

To make sure your tasks are displayed in the correct order, try to get your index template to explicitly display the priority for each task. At this point, we’ve got a functional Rails application and have oriented ourselves to the correct way of doing things.
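For readers who want to check their understanding of the filtered, sorted index without a database to hand, here is a rough plain-Ruby analogue of that final query. SketchItem and visible_items are hypothetical names; in the real app the database does this work, not Ruby’s array methods.

```ruby
# Plain-Ruby analogue of Task.where(:done => false).order(:priority).
# Struct stands in for the model; no database or Rails is involved.
SketchItem = Struct.new(:title, :priority, :done)

# Hypothetical helper: unfinished items, lowest priority number first.
def visible_items(items)
  items.reject { |item| item.done }.sort_by { |item| item.priority }
end

items = [
  SketchItem.new("Pay bills", 2, false),
  SketchItem.new("Buy milk", 1, false),
  SketchItem.new("Book dentist", 3, true)
]

visible_items(items).each do |item|
  puts "#{item.priority}: #{item.title}"
end
```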
In the next tutorial, we’ll focus on perhaps the most important part of Rails’ march to happiness: Test Driven Development. We’ll see how to drive development of features by creating an automated test suite that runs alongside our application and constantly checks for bugs, design problems and unnecessary code.
Ruby on Rails: Code testing Gavin Montague shows how Test Driven Development can improve your code and catch bugs.
In the previous tutorial we looked at how Rails seeks to optimise developer productivity and happiness. We took a tour of the Rails framework and built a basic to-do list application. You can find the completed app, along with instructions to get it up and running, in the ZIP file at www.linuxformat.com/files/ca2015.zip. This time we’ll look at how Test Driven Development (TDD) can be used to build our application, catch bugs and improve code, all things that I think you’ll agree will contribute to programmer happiness. The TDD methodology is a deceptively simple rethink of how most developers find themselves working. To see the changes that TDD brings, let’s first look at the ‘Test’ part. If you were to watch most web developers at work they’d constantly skip back and forth between an editor and a browser. After changing code they’ll swap to the browser, reload the page, look for errors and then hop back to the editor. This isn’t very productive behaviour. Firstly, it’s massively inefficient to be shifting back and forth between two programs. Not only does it force you to push the code to the back of your mind as you remember what steps to take in the browser, but it’s also physically slow. Imagine trying to manually test a sign-in system where each change requires the developer to sign out, clear cookies and start over. Secondly, consider all the moving parts between the edited code and the end result in the browser. If the page doesn’t load, what does that mean? How can we isolate what part of our application has failed? Even worse, what if it’s failed in some silent way that’s not immediately apparent from the front-end?

In TDD we use an automatic test suite to address these issues. Over time a series of standalone tests are built that can automatically run without any human input and make a series of ‘assertions’ to check our code. The tests are reproducible so we never have to worry about someone forgetting to manually check functionality each time they edit the code. In terms of speeding the process up, a well-written suite can typically run several hundred tests in less than 10 seconds. Additionally, the tests we write will be isolated, which means that they will run without reference to the rest of our application as a whole. This means that if we break something it will be immediately obvious where the fault is and we won’t have to spend time picking our way back through the code looking for what’s gone wrong. At a higher level, it’s even possible to write tests that actually take control of your browser to exercise the full stack of your application, with tools like Capybara.
Run tests with guard You’ll quickly get bored of manually triggering your test suites, so why not do it automatically? You can use the guard gem (https://github.com/guard/guard) to watch your project and automatically run the correct subset of tests whenever a change is saved. Add guard’s dependencies to your Gemfile:

    gem 'guard'
    gem 'guard-rspec'
    gem 'libnotify'

Then install them and start guard:

    $ bundle install && rbenv rehash
    $ bundle exec guard init
    $ bundle exec guard

Now, as you save files in your project, guard will intelligently run the matching parts of your suite. It can also hook into several desktop notification systems to provide more visual feedback on how you’re progressing.
Your code as documentation Strangely, very little of a programmer’s day is spent writing new stuff. We spend most of our time reading and rewriting code provided by others or our past selves. It’s therefore important to optimise for readability and understandability, or to quote Damian Conway: “Always write your code as though it will have to be maintained by an angry axe murderer who knows where you live.” A test suite can help developers get oriented with new code by giving them a living, breathing set of specifications to run. This is often easier to understand than parsing cryptic inline comments that were written weeks ago and never updated. You’ll also find that because we always want to be able to run our tests in isolation, our methods will be smaller, more descriptive and have less dependency on other parts, all of which will help with readability.
Red, green, refactor All the benefits mentioned so far are what you get for writing tests, but what about the ‘Driven’ part of TDD? Well, we call it ‘Test Driven’ because the first step of development is to write a test that fails. Hang on – we start by writing a test? How can we write a test before we have any code? This is the key
feature of Test Driven Development. Before we write any production code we first define what the code should do by way of a test. In this context the test serves two purposes: it acts as a target for us to work towards and as a line over which we don’t step. We write just enough code to pass our current test and then re-evaluate where we are before either starting on a new test or altering our existing code. As you build up code through multiple cycles, two trends should naturally emerge. Your code will become simpler as a result of focusing on writing just enough to hop to the next green stage. Too many developers will go off on flights of fancy writing large, tightly coupled, overly complex methods and classes that become impossible to debug. A Test Driven system is more likely to be composed of many tiny interconnected parts that can all be operated independently and are easy to understand in isolation. Additionally, you’ll spend less time chasing developmental dead-ends. And because you start by writing a test that actually exercises the code as it’s finally intended to be used, you’ll become much more aware of dependencies and flaws in the interface design. The TDD development cycle can be summed up as ‘red, green, refactor’. We start with a failing test: ‘red’. We then write just enough code to make the test pass: ‘green’. Finally, we ‘refactor’ our new code for maintainability and performance, safe in the knowledge that if we break anything our suite will catch it. A full discussion of the merits and drawbacks of Test Driven Development could fill up an entire bookshelf, but if you’re interested in the theory and evidence behind the technique, I recommend you take a look at this paper from – dare I say it – Microsoft, which provides a comprehensive overview (http://research.microsoft.com/en-us/groups/ese/nagappan_tdd.pdf). But that’s enough theory, let’s look at TDD in practice.
It might not look like much, but if you can see this, congratulations! You’ve test-driven your first feature.

A simple RSpec example Rails ships with the Test::Unit library, which is a perfectly decent test tool, but we’ll be using RSpec today because I find its syntax and setup much easier to get my head around. Technically, RSpec is a Behaviour Driven Development (BDD) tool, but for our purposes we can ignore the slight differences between this and classic TDD. Open a new file inside the previous tutorial’s project at lib/hello_bot.rb:

    describe HelloBot do
      describe "#greet" do
        it "says 'hi' to friends" do
          HelloBot.greet("Bob").should == "Hi, Bob"
        end
      end
    end

This shows us the format of an RSpec test. We use Ruby’s block syntax and the describe method to lay out a series of tests which are marked by the it method. Inside each of our tests we will make an assertion. In this case we assert that our output should be “Hi, Bob”. Now run your spec with the command:

    $ bundle exec rspec ./lib/hello_bot.rb

Notice that we’re running our test even though we haven’t actually defined the HelloBot module yet. This is how we ‘drive’ our development: by starting off writing a test that exercises the code that we haven’t yet written, it becomes easier to imagine how we want to use that code later on. RSpec will complain that the module doesn’t exist, so let’s jump forward a few steps by adding a definition to the top of the file and rerunning the spec:

    module HelloBot
      def self.greet(name)
        "Hi, #{name}"
      end
    end

When you run the tests a single dot will appear in your terminal, indicating a passing test. Try altering the output of the greet method and rerunning to see how that alters the output. RSpec has many configuration options that can be used to print more or less information, or even to format your results as HTML. These options will become more useful as your suite gets bigger:

    $ bundle exec rspec ./lib/hello_bot.rb -f d --color
    $ bundle exec rspec ./lib/hello_bot.rb -f h > ~/Desktop/RSpec.html
    $ bundle exec rspec --help

If you find a combination you like, add it to the .rspec file in the project root to apply it automatically. I’m going to add another test to our suite for you to complete: our method should also know how to deal with formal situations. Add this test to your spec, then try to get back to green. Have a look at the code in the source archive (www.linuxformat.com/files/ca2015.zip) if you get stuck:

    it "is more formal with people whose names are unknown" do
      HelloBot.greet("Miss Smith").should == "How do you do, Miss Smith?"
    end
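The red-then-green rhythm of the first HelloBot test can also be reproduced without RSpec installed, which may help to show that there is no magic involved. The run_check helper below is a hypothetical, deliberately tiny stand-in for a test runner; it is not how RSpec works internally.

```ruby
# Tiny stand-in for a test runner (hypothetical; RSpec does far more).
# It runs the block and reports a pass, a failure or a missing definition.
def run_check(description)
  passed = yield
  passed ? "PASS: #{description}" : "FAIL: #{description}"
rescue NameError => error
  "ERROR: #{description} (#{error.class})"
end

# Red: the test exists, but HelloBot has not been written yet.
red = run_check("says hi to friends") { HelloBot.greet("Bob") == "Hi, Bob" }

# Just enough production code to satisfy the test.
module HelloBot
  def self.greet(name)
    "Hi, #{name}"
  end
end

# Green: the same check now passes.
green = run_check("says hi to friends") { HelloBot.greet("Bob") == "Hi, Bob" }

puts red
puts green
```

The check itself never changes between the two runs; only the production code does, which is exactly the discipline the cycle is meant to encourage.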
Rails and RSpec In your project’s spec directory you’ll find an approximate mirror of the app directory. We subdivide our tests to match the various layers of our application to make management easier. Running the tests in Rails requires a bit more orchestration than in our simple example, but this is mostly taken care of automatically. To conform to our idea of isolation, RSpec will do its best to ensure that nothing bleeds between tests. All instance variables, class definitions and so forth are reset between each test, but because almost all Rails applications will be backed by a database, we have to get round the fact that each of our tests might potentially alter our persistent data. Luckily, Rails handles almost all of this as part of its various environments, which you can see in the config/environments and config/database.yml files. Running our tests via rake will ensure that our test database is created and correctly reset for each test. Our test suite can be run via rake, either in full or by providing one of the test subsets, such as ‘models’:

    $ bundle exec rake spec
    $ bundle exec rake spec:models
(Fig 1) RSpec can format the results of your tests in a variety of readable formats, including HTML.

Uncle Bob’s three rules of TDD Bob Martin, known by the community as ‘Uncle Bob’, is one of the clearest voices on what good TDD looks like. There are many pearls of wisdom on his site (www.cleancoder.com), but my personal favourites are his Three Rules for Test Driven Development:
1 You are not allowed to write any production code unless it is to make a failing unit test pass.
2 You are not allowed to write any more of a unit test than is sufficient to fail; and compilation failures are failures.
3 You are not allowed to write any more production code than is sufficient to pass the one failing unit test.
If you can stay within these boundaries when practising TDD, you won’t go far wrong.

Fixing a bug If you run the full suite from the last tutorial’s app you’ll see that we’re starting with half a dozen errors. The output is too big to print here, but if you read through it you’ll see that the error relates to our requirement that a new task can’t have a due_at time in the past. If you try to save a task that has no due_at date set, an exception is thrown. Let’s fix that by opening spec/models/task_spec.rb and adding a failing test:

    describe "#due_at_is_in_the_past" do
      it "doesn't throw an exception if due_at is nil" do
        lambda { Task.new(:due_at => nil).due_at_is_in_the_past }.should_not raise_error
      end
    end

Here we use a different expectation to trap any raised exception and report it back as a test failure. In this test we’re doing more than just capturing the bug in code: we’re also suggesting the correct outcome. If a task isn’t supplied with a due_at value then it should simply save ‘nil’. Run this test with rake spec:models and watch it explode. Our tests are failing because our record tries to compare a Time object with nil: a no-no in Ruby. We now adjust our Task class:

    def due_at_is_in_the_past
      errors.add(:due_at, 'is in the past!') if due_at && due_at < Time.zone.now
    end

Our test now passes! There’s not much to refactor so we’ll skip that step and run our full suite. It’s good practice to do this after each successful cycle to make sure our changes haven’t broken any other part of the app.

In order to better prioritise our time, items that are due soon should appear in red on our index. To implement this we’ll first add the concept of due_soon? to our model. As before, start by writing a test in task_spec.rb that expresses the code we want to be able to call:

    describe "#due_soon?" do
      it "is true if due in less than 24 hours" do
        task = Task.new(:due_at => Time.zone.now + 23.hours)
        task.should be_due_soon
      end

      it "is false if due in more than 24 hours" do
        task = Task.new(:due_at => Time.zone.now + 25.hours)
        task.should_not be_due_soon
      end
    end

There’s a little bit of RSpec magic being used here to make your tests more readable: RSpec understands that be_due_soon means calling the method due_soon? and checking whether it returns true (or, with should_not, false). Next, update the Task class in app/models/task.rb:

    def due_soon?
      due_at < Time.zone.now + 24.hours
    end

This passes, but recall that our due_at attribute is allowed to be nil. We should add a test to describe what should happen here:

    it "is false if no due date is set" do
      task = Task.new(:due_at => nil)
      task.should_not be_due_soon
    end

As it did earlier, the test fails because we can’t compare nil with a Time object. We’d best amend our method:

    def due_soon?
      return false if !due_at
      due_at < Time.zone.now + 24.hours
    end

This will give us green tests in our model layer and we can rerun the full suite. Try extending your tests to cope with the specification, ‘A completed task can never be due soon’. A solution is available in the sample app.
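The nil guard in due_soon? is easy to try outside Rails. The sketch below makes a few assumptions: SketchDeadline is a hypothetical stand-in for the Task model, plain seconds replace Rails’ 24.hours, and the current time is passed in as an optional argument so that the three cases can be checked against the same instant.

```ruby
# Plain-Ruby version of due_soon?, using seconds instead of Rails' 24.hours.
# SketchDeadline is a hypothetical stand-in for the Task model.
class SketchDeadline
  DAY_IN_SECONDS = 24 * 60 * 60

  attr_accessor :due_at

  def initialize(due_at)
    @due_at = due_at
  end

  # A missing due date is never "due soon"; otherwise compare against
  # a point 24 hours after the supplied (or current) time.
  def due_soon?(now = Time.now)
    return false if due_at.nil?
    due_at < now + DAY_IN_SECONDS
  end
end

now = Time.now
puts SketchDeadline.new(now + 23 * 3600).due_soon?(now)
puts SketchDeadline.new(now + 25 * 3600).due_soon?(now)
puts SketchDeadline.new(nil).due_soon?(now)
```

Passing the time in is a design choice of this sketch rather than something the tutorial’s model does; it simply makes the behaviour easier to pin down in a test.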
Helper tests
It’s bad practice to put anything other than the most minimal flow control in templates, so we’ll put our formatting logic in a helper. Add a failing test to spec/helpers/tasks_helper.rb:

  describe "task_title_formatter(task)" do
    before do
      @task = Task.new(:title => "task")
    end

    it "adds a 'due' css class to tasks which are due_soon" do
      @task.due_at = Time.zone.now
      task_title_formatter(@task).should == "<span class='due'>task</span>"
    end

    it "adds no extra classes to tasks which aren't due_soon" do
      @task.due_at = nil
      task_title_formatter(@task).should == "<span>task</span>"
    end
  end

Again, I’d recommend you add the tests one at a time and try to develop incrementally towards a solution, which should look something like this:

  module TasksHelper
    def task_title_formatter(task)
      # Escape the title before marking the markup as safe, so a title
      # containing HTML can't inject tags into the page
      title = ERB::Util.html_escape(task.title)
      if task.due_soon?
        "<span class='due'>#{title}</span>".html_safe
      else
        "<span>#{title}</span>".html_safe
      end
    end
  end

Finally, you will need to update your index.html.erb view to call task_title_formatter(task) and add a .due rule to your application.css.scss file (we’ll look at why this isn’t just a plain CSS file in the next and final tutorial). If you start up the Rails server your index page should now look something like the grab on p59.
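A title typed in by a user can contain markup of its own, so it is worth escaping it before wrapping it in a span. Here is a minimal plain-Ruby sketch of that idea; format_title is a hypothetical function that takes the title and a due flag directly rather than a Task, and it uses ERB::Util.html_escape from Ruby’s standard library.

```ruby
require "erb"

# Hypothetical plain-Ruby version of the helper: it takes the title and a
# due flag directly instead of a Task model.
def format_title(title, due_soon)
  safe_title = ERB::Util.html_escape(title)  # neutralise <, >, & and quotes
  if due_soon
    "<span class='due'>#{safe_title}</span>"
  else
    "<span>#{safe_title}</span>"
  end
end

puts format_title("task", true)
puts format_title("<script>alert(1)</script>", false)
```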
To find out more about RSpec, your best resource is the project documentation at www.relishapp.com/RSpec.

Controller test
Notice that our new feature didn’t actually need to be tested in either the controller or the view. Rails tends towards a style of design known as ‘Fat Model, Skinny Controller’. Where we have custom functionality it’s often pushed down to the model or, for presentation data, out to a helper. A well-designed Rails controller will usually contain very little code because all the web-specific stuff, like session-handling, URL parsing and header generation, is handled automatically by Rails. That said, it’s the controller’s responsibility to manage the users’ login status, permissions etc and these should be tested thoroughly.
Once a task has been completed it should no longer be editable through the web interface. Let’s drive out this feature in two parts: we’ll remove the edit links from our task’s show page and stop our controller from allowing updates. We’ll start with the controller’s spec. Go to the description of our update method around line 105 of tasks_controller_spec.rb and start a new nested describe block:

  describe "PUT update" do
    describe "where the task has already been completed" do
      before do
        @task = Task.create! valid_attributes
        @task.update_attribute :done, true
      end

      it "does not update the task" do
        # The expectation has to be in place before the request is made
        Task.any_instance.should_not_receive(:update_attributes)
        put :update, {:id => @task.to_param, :task => { "title" => "t" }}, valid_session
      end

      it "redirects the user back to the index" do
        put :update, {:id => @task.to_param, :task => { "title" => "t" }}, valid_session
        response.should redirect_to(:action=>:index)
      end
    end
    # (the existing "PUT update" examples continue here)

Our tests here are slightly different from before. Remember I said that a controller’s main job is to orchestrate other objects. This means that we’re not so much interested in the outcome of some actions, but whether the actions trigger other events. In our first test we attach an expectation directly onto our Task class using a mocking library, and we do so before making the request, otherwise the call we want to rule out would already have happened unobserved. A mock object can be used in place of a real one, but may have additional behaviour. If you were testing a library to transfer money between bank accounts it could get pretty expensive to run your tests against real APIs. Instead you would mock out the various responses the bank could give (ok.xml, fail.xml etc) and run tests against them. Here we use a mock to make sure that no update_attributes call is made to any task. You can alter the update method in tasks_controller.rb to pass the tests, so that its respond_to block begins:

  respond_to do |format|
    if @task.done?
      format.html { redirect_to tasks_url, notice: "Completed tasks can't be changed." }
    elsif @task.update_attributes(params[:task])
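To see what the mock is checking, it can help to write the same idea by hand. The sketch below is plain Ruby with no RSpec or Rails: SpyTask and update_task are hypothetical names, and the return symbols are invented for illustration. The spy simply counts calls to update_attributes, which is roughly the bookkeeping a mocking library does for you.

```ruby
# Hand-rolled spy standing in for the mocking library. SpyTask and
# update_task are hypothetical names used only in this sketch.
class SpyTask
  attr_reader :update_calls

  def initialize(done)
    @done = done
    @update_calls = 0
  end

  def done?
    @done
  end

  def update_attributes(_attrs)
    @update_calls += 1   # record the call instead of touching a database
    true
  end
end

# Mirrors the controller's decision: completed tasks are never updated.
def update_task(task, attrs)
  return :redirect_with_notice if task.done?
  task.update_attributes(attrs) ? :updated : :edit_again
end

finished = SpyTask.new(true)
puts update_task(finished, { "title" => "t" })
puts finished.update_calls
```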
View testing
Finally, we can test our view. Open up show.html.erb_spec.rb and add a test against a completed task:

  it "doesn't link to edit on complete tasks" do
    @task.done = true
    render
    rendered.should_not match(/Edit/)
  end

That test will fail, so make it pass by updating show.html.erb:

  <%= link_to 'Edit', edit_task_path(@task) unless @task.done? %>

Congratulations, you’ve now test-driven your second feature. Try extending the behaviour to not show edit links on the index page. Remember to write your tests first.
We’ve only grazed the surface of TDD here, but I hope it’s given you some idea of how useful it can be in helping your development workflow. The key to getting the greatest value from it is to apply it from the very start of your project and to avoid the temptation to skip steps of the Red-Green-Refactor loop. Over time, you’ll get faster at writing tests and faster at deciding what tests should be written, and that’s the path to TDD happiness.
In the next tutorial we’ll look at how Rails simplifies the client-side aspects of web development with the asset pipeline and CoffeeScript, a language that compiles to JavaScript.
Ruby
Ruby on Rails: Site optimisation Gavin Montague shows how Rails, with help from Ajax, CoffeeScript and SASS, can help front-end development.
Over the past two tutorials we’ve looked at how Ruby on Rails optimises developer happiness by providing sensible defaults and enforcing best practice in its users. In part one we took a tour of a simple Rails application, explored MVC and built our basic to-do list manager in a matter of minutes with the Scaffold tool. In the previous guide, we looked at how Rails uses Test Driven Development to produce higher quality code in less time. This time we’ll turn our attention to the client side of web development. Although Rails is a server-side tool it is remarkably opinionated about how one should build a front end. We’ll see how Rails removes a lot of the friction of working with JavaScript and CSS, and how best practice for handling static assets is incorporated into the framework.
JavaScript has a mixed reputation because of its occasional eccentricities. CoffeeScript tries to fix them and you can try it out for yourself in the browser.
Unobtrusive JavaScript Rails was one of the first web frameworks to integrate Ajax into its core. This made for faster, more responsive sites, but the markup it generated wasn’t notably good quality. That all changed in Rails 3, which generates clean, semantic markup and integrates with a range of JavaScript frameworks in an unobtrusive manner. Let’s look at how easy it is to add Ajax interactivity to our to-do application. Before we get started, you’ll need a good set of front-end developer tools.
Debugging JavaScript can be a frustrating task, but it’s much more bearable if you have the right tools for the job. Install an up-to-date version of Firefox and the Firebug toolbar (www.getfirebug.com). If you haven’t used Firebug before, refer to its dedicated box (see Debugging JavaScript, p64) for usage instructions. At present, deleting a task in our application triggers a full page reload. You can see this happening with the Firebug Net Viewer. Our application must rebuild the whole page, send it back to the client and then have the client render it. How wasteful! It would be much nicer to deal with the deletion as Ajax and remove the offending task from the current page. In many frameworks we’d have to build this from the ground up. Rails has support for this kind of action baked in, which saves each developer from reinventing the wheel.
Ajax on Rails
When you delete a task from the list, you’ll be met with a JavaScript confirmation before you can proceed. But where does this come from? Inspect the page source and you’ll see that each delete link contains:

  <a [...] data-confirm="Are you sure?">delete</a>

Rails uses HTML5 data attributes to describe behaviours to a JavaScript framework – in this case jQuery – in an unobtrusive way. The markup of the page isn’t littered with script tags or inline ‘onclick’ attributes. This behaviour is injected in afterwards, as is best practice. In the <head> of your page you’ll see our scaffold has already included the jQuery and jquery_ujs libraries, which deal with setting up and monitoring the behaviour indicated by these extra attributes. If you haven’t used jQuery before, I’d recommend taking a look at the official tutorial (http://bit.ly/13t9K7S). If you have a preferred JavaScript framework, there’s likely to be an analogous adaptor library available for use with Rails.
To let jQuery know that our delete links should use Ajax we only need to make two small changes to our code. Update the delete link to include:

  link_to 'Destroy', task, remote: true, method: :delete, data: { confirm: 'Are you sure?' }

In your tasks controller, flip the respond_to block inside the destroy method:

  respond_to do |format|
    format.json { head :no_content }
    format.html { redirect_to tasks_url }
  end
Reload the page and the rendered links will now contain a data-remote attribute. This tells jQuery to intercept any clicks on the link and submit the request via an XMLHttpRequest instead. Open the network console in Firebug then delete a task. You’ll see the request being fired off in the background. Refresh the page and the task will be gone, but that’s not the most user-friendly behaviour. It would be better if our user got instant visual feedback. Create a new file at app/assets/javascripts/ajax_tasks.js and add:

  $(document).on('ajax:success', '.index-table a[data-method="delete"]', function() {
    return $(this).closest('tr').fadeOut();
  });

If you’ve used jQuery before, this should look relatively familiar. We tell our document to watch for any successful Ajax pseudo-event that originates from links with the delete data-method. When we receive this event we fade out the table row around the originating link. Save your changes, refresh the page and try deleting a task.

CoffeeScript & SASS everywhere!
If you’re keen on trying out CoffeeScript or SASS, but not fortunate enough to be into Ruby on Rails, no problem! Both are standalone tools that can be used in any project, albeit without the built-in integration that Rails provides. SASS is distributed as a Ruby Gem and can usually be installed system-wide with one line:

  sudo gem install sass

Once it’s installed you simply have to tell SASS which scss files you want it to watch and what to call the generated output. This, for example, is how you tell it to watch mobile.scss:

  $ sass --watch mobile.scss:mobile.css

SASS can be left running in the background while you work on the scss file: it will output a new version of the stylesheet immediately after each save.
The installation of CoffeeScript is a little more involved. Although the actual compiler can be run in any JavaScript environment, you’ll need node.js and its package manager, npm, to run the command-line tool. Installing node.js itself will vary between platforms and your best guide is the project’s site (http://nodejs.org/download). With that in place:

  $ npm install -g coffee-script
  $ coffee --watch --compile application.coffee

As with SASS, the Coffee command-line tool can be left to monitor all the required files for changes in the background.

Rails requests
At this point it’s worth looking at how Rails deals with different kinds of requests when rendering output. Open your tasks_controller and focus on the destroy method we altered. In the tutorial on p54, I said that the job of the controller is twofold: it lines up the correct operations on the models and then decides what output should be rendered. Generally these two halves are independent of one another and Rails’ respond_to syntax gives us a neat way to organise our code.
The destroy method starts by dealing with the model-wrangling: we find the correct task and we destroy it. In the respond_to block we then tell Rails how different kinds of client should be handled. In our full-page refresh the browser request for HTML is handled as a redirect back to the index page. Our Ajax request is handled by sending back a successful, but empty, HTTP response.
Developing APIs in Rails for other consumers is often as simple as tacking on an extra respond_to formatter. For example, if the output is to be consumed by a desktop program then HTML wouldn’t be suitable, but JSON would be ideal. Try visiting /tasks.json and /tasks/<id>.json to see how the respond_to block in each of these methods automatically generates JSON formatted output. Using the way the JSON is generated as a guide, try getting your ToDo controller to respond with XML output.
The Ajax support in Rails is useful for simple functionality. But at some point in a complex application you’ll have to write a chunk of complex JavaScript and, if you’re anything like me, that is not a happy thought. JavaScript is a very good language and when it comes to client-side interactivity it’s the only game in town. However, it does have some eccentricities, particularly in the eyes of Ruby and Python developers:
  Trailing semicolons are sort of required, but not really.
  Variables are global by default unless specified as var.
  Return values are explicit (in Ruby, all methods return values implicitly).
  C-style control structures for loops and decisions can feel quite messy.

“CoffeeScript is an attempt to expose the good parts of JavaScript in a simple way.”

Using CoffeeScript
None of these things make JavaScript a bad interpreted programming language per se, but it’s a language that a lot of developers don’t look forward to working with. That’s why we have CoffeeScript, a language that compiles to JavaScript. To quote its website: “Underneath that awkward Java-esque patina, JavaScript has always had a gorgeous heart. CoffeeScript is an attempt to expose the good parts of JavaScript in a simple way.”
The CoffeeScript project isn’t part of Rails, but out of the box, our to-do app has been set up to work seamlessly with the compiler. Let’s look at a small piece of CoffeeScript to demonstrate its key features. Create a new file at app/assets/congratulations.js.coffee:

  @congratulation_bot =
    messages: ["Well done", "Great job", "", "Top stuff"]
    name: name
    congratulate: (message) ->
      alert message+" "+@name unless message.length is 0
    overcongratulate: ->
      for message in @messages
        @congratulate message

By amending our Ajax deletion method from earlier, we can receive the praise of our congratulation bot each time we delete a task.

  window.congratulation_bot.name = "Gavin";
  $(document).on('ajax:success', '.index-table a[data-method="delete"]', function() {
    window.congratulation_bot.overcongratulate();
    return $(this).closest('tr').fadeOut();
  });

On the next refresh, Rails will automatically compile the CoffeeScript and serve the JavaScript output in its place. This happens seamlessly and you’ll never have to worry about making sure an up-to-date version is being served.
To see how CoffeeScript’s syntax breaks down, go to http://coffeescript.org and paste our code into the online compiler. You’ll see the output appear on the right-hand side. It’s slightly easier to experiment online with the code than reloading pages in your app. Let’s look at some of the differences CoffeeScript brings.
CoffeeScript features
If you’re debugging JavaScript or examining the dialogue between the browser and the server, Firefox’s Firebug extension is a must-have tool.
First, like Python, CoffeeScript is white space sensitive. This is becoming a feature of more modern languages. Given that all developers indent their code for readability, why not make a feature of it and lose the curly brackets? The ungainly for-loop is represented in CoffeeScript as the much more readable for message in @messages, which should be familiar to Ruby developers. We also get access to Ruby-style suffix flow control and the unless keyword, which serves as a negative if. Also, as a nod to Ruby, we can use ‘@’ as a shortcut for the keyword this when referencing a scoped variable.
The most ubiquitous feature of CoffeeScript is probably ‘->’, the so-called ‘dash-rocket’ that replaces the function keyword. Most JavaScript rapidly becomes a mess of callbacks and nested functions, which become difficult to read. The dash-rocket is far more readable. There are other lumps of syntactic sugar to sweeten our code, such as testing for the existence of a variable with ‘?’, Python-style chained comparison and multi-line strings.
CoffeeScript’s inclusion in Rails is somewhat divisive, but I find it to be a welcome addition. The project’s website contains a more detailed walkthrough of the language’s features, but its basics can be picked up in a few minutes. You could start by trying to replace our unobtrusive JavaScript delete method with its CoffeeScript equivalent.
In the same way CoffeeScript extends JavaScript to increase programmer productivity, Rails also includes SASS to improve our relationship with stylesheets. Anyone who’s ever tried to wrangle the stylesheets for a large web project will probably have encountered the following problems:
  CSS selector structure leads to a lot of duplication when targeting related nested elements.
  The only way to share style between unrelated elements is via classes, which results in either duplication of styles or presentational class names.
  There’s no way to show relationships between numbers: how can we show that the margin should always be twice the line-height?
  Although CSS keywords exist for some colours, there’s no way to specify our own, such as ‘error-red’ or ‘logo-blue’.
SASS
SASS extends CSS with variables and nesting, which we’ll see below, along with ‘mixins’ (reusable chunks of CSS), mathematical operations (for example, ‘set the margin to line-height/2’) and even colour maths (for example, ‘set the background to halfway between blue and green’). The true value of SASS doesn’t become apparent until you’re managing large stylesheets and sadly our little application doesn’t quite meet that criterion. We’ll just look at two of the most important features of SASS: nesting and variables. Append the following to app/assets/stylesheets/tasks.css.scss and then reload the page to see it in action.

  $header_color: #0000FF;
  $danger_color: #FF0000;

  .fancy_text {
    font-weight: bold;
    color: $header_color;
  }

  h1 {
    @extend .fancy_text;
  }

  #error_explanation {
    h2 {
      background: $danger_color;
    }
  }

  table {
    th {
      @extend .fancy_text;
    }
    a[data-method=delete] {
      color: $danger_color;
    }
  }

With SASS we are able to nest CSS selectors and use it to organise our styles into a more visible hierarchy. We also gain the ability to extract common colours as meaningfully named variables, rather than having to always refer back to an external style guide to remind us which shade of red is which. The SASS website has excellent documentation on all the language’s features. Even if you don’t use Rails, I’d urge you to have a read and see how you can incorporate it into your web framework of choice.
Speed is important. Users have a spectacularly low attention span and any lag in your site will cost you visitors. Amazon famously established a relationship between response time and income where a 100 millisecond reduction in load time resulted in a 1% bump in revenue. While you should always try to optimise the server-side component (by correctly indexing your database, caching templates etc) a more substantial improvement is optimising how a browser will render your site. Rails provides a set of built-in features (collectively called the Asset Pipeline) which attempts to maximise the delivery speed of your pages.

Debugging JavaScript
Developing JavaScript and Ajax behaviour is best done with a good client-side extension to your browser, such as Firebug for Firefox. Install Firebug as you would any Firefox extension, and then press F12 to bring up its extensive set of tools. The Console and Net tabs are of most interest to us. Both of these are disabled by default, but you can click on their little black triangle icons to activate them. The Console gives you access to a JavaScript REPL that runs within the scope of your page. Fire it up when your browser is pointed at a page within our Rails app, and start to type ‘jQuery’ into the bottom bar. Firebug will autocomplete function and object names, show you the output of operations, and is a generally invaluable help when trying to work out what’s going on inside a particular page.
When debugging Ajax it’s often difficult to visualise why problems occur. For instance, is a failure because an Ajax request isn’t firing or because it’s not being answered by the server? Sometimes you’ll find it’s something more arcane. The content-type, maybe. The best tool to use here is the Net Monitor. Enable the Net Monitor and reload the page. As resources come in you’ll see them represented as rows showing the request, the status and the loading time. Opening up any of these rows will enable you to drill down and examine every detail of the request, from the outgoing headers through to the response from the server.
Optimise website rendering
To demonstrate how Rails helps us, we’ll need to alter our server settings. By default, Rails runs in development mode, where speed is sacrificed in favour of clarity: more is logged, files are delivered unadulterated, and much of the environment is reloaded on every request. Start the server in development mode, as you have done to date, and save the task index’s HTML source from your browser out to a file. We’ll use this for comparison as we examine the optimisations that are contained in production mode.
The built-in web server Rails uses is not at all suitable for running in a production environment, but we can use it here. Open up the config/environments/production.rb file and remove the comment from the following line to make our Rails server handle static content.

  config.serve_static_assets = true

You’ll then need to build a new database for the production environment, prepare your assets for delivery and start the server:

  $ RAILS_ENV=production bundle exec rake db:migrate
  $ bundle exec rake assets:precompile
  $ bundle exec rails server -e production

One thing to note about the production environment is that changes made to the app won’t take effect until the server is restarted. When you’re done with this section, remember to drop back into development mode. Let’s look at some of the differences between our development and production pages.
Rails ships with support for SASS – Syntactically Awesome Stylesheets – a rethink of how Cascading Stylesheets ought to work.

“Speed is important. Users have a spectacularly low attention span.”

Get faster load times
Comparing the <head> of your development and production pages, you’ll see that the production version contains far fewer resource declarations. Where the development environment loaded in each of our scripts and stylesheets separately, our production page has compressed all these down to just two files. Rails allows you to specify ‘manifests’ of nested JavaScript/CSS files, as shown in your application.js/css files. The development environment expands this out to include the individual files, but production will concatenate them all into a single lump. A surprising amount of a page’s loading time is spent waiting for the HTTP overhead associated with requesting each resource on a page. Fewer files means fewer handshakes and faster load times. In addition to concatenating files, the pipeline also attempts to minimise the file size of the grouped file.

Achieve better caching
By default Rails will remove non-essential white space from JavaScript and CSS, but it’s also possible to use more destructive compressors that will rewrite JavaScript to use shorter variable names and various other optimisation methods. On our production page you’ll notice that the filenames of both your CSS and JavaScript files have been suffixed with seemingly random strings, eg application-27b252c669ac588cef435fa3d3e8aebf.css
These suffixes are MD5 hashes of the file contents. By tying the name of our files directly to the content that they contain, we can aggressively cache them on the browser without any fear of serving stale content to a client. Any change to the file will result in a change to the hash, thus causing a cache-miss in the browser which will, in turn, request the updated file. If the file’s contents do not change between versions then a correctly configured web server can make sure the browser never has to download the file more than once.
As with SASS and CoffeeScript, Rails provides developers with a sensible set of defaults for handling asset delivery. By enforcing this at the level of the framework, it ensures that all Rails projects automatically incorporate a decent set of optimisations without any intervention by the developer.
I hope you’ve enjoyed this three-part tour of Rails. By necessity we’ve skipped over much of what Rails can do in order to focus on a few of the aspects that distinguish it from other frameworks. There’s a huge amount left to explore, including how ActiveRecord makes working with complex database relationships trivial; the wealth of community tools available, and much, much more. If you’d like to learn more about Rails, pick up the Pragmatic Programmers book Agile Web Development With Rails by David Thomas or follow the popular RailsCasts screencast series (http://railscasts.com). Good luck with your future Rails projects and I hope they leave you full of programmer happiness!
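The idea of naming a file after a hash of its contents is easy to try in plain Ruby. This is a minimal sketch, assuming the fingerprint is nothing more than an MD5 digest of the file body; fingerprinted_name is a hypothetical helper, and the real pipeline may well mix other inputs into its hash.

```ruby
require "digest/md5"

# Hypothetical helper: builds a fingerprinted filename by appending an MD5
# hash of the contents to the base name.
def fingerprinted_name(base, ext, contents)
  "#{base}-#{Digest::MD5.hexdigest(contents)}.#{ext}"
end

old_name = fingerprinted_name("application", "css", "h1 { color: blue; }")
new_name = fingerprinted_name("application", "css", "h1 { color: red; }")
puts old_name
puts new_name
```

Because the two stylesheets differ, the two names differ, so a browser holding the old file under the old name will fetch the new one. Identical contents always produce the same name, which is what makes long-lived caching safe.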
Lesser-known Languages
More languages
This section contains a potpourri of languages; some that are cutting edge and some established. But they each offer a unique insight into how code can be constructed and organised, offering many techniques you can take back to your language of choice.
C and beyond: Code a starfield
Scheme: Learn the basics
Scheme: Recursion
Scheme: High order procedures
C and beyond: Code a starfield
Mike Saunders takes you on a whirlwind tour of three programming languages and toolkits, showing how to make a starfield effect in each.

We’ve taken a good look at Ruby, but how many programming languages do you need to know to be a really good coder? It’s not an exact science, but we’d say three or four. Of course, you can learn a single language inside-out and become an absolute genius at it, but no programming language does everything perfectly, so you’d still be missing out on other features and ideas. It’s a bit like human languages: even if you can speak English wonderfully well, there’s a lot to be gained from learning one of its relatives, such as German or French. You pick up more than just replacement words – you learn to think in a different way, and discover a new culture in the process. So a really great hacker usually knows a handful of different programming languages.
This is why we recommend that everyone, even hobbyist coders, should try a few different languages at some point, as they all have something to contribute. Learn a low-level language and you’ll discover a lot about memory management and pointers, for instance; and with a high-level language you can manage complicated algorithms more easily. Once you’re comfortable with a bunch of languages, you can always pick the right one to solve a job (too many coders become experts at just a single language, such as C++, and then try to solve every single problem using it).
So with this in mind, we decided to make this tutorial about multiple languages. But there’s a twist: we’re going to do the same thing in each. That might sound a bit pointless, but it’s a really good way to discover how these languages work in action, and how identical goals are achieved using their varying approaches. We’re not going to provide lengthy, meandering introductions to the languages – the best thing you can do is read the code and our short explanations, and then start hacking around with it yourself. So, without further ado…

Low-level: C and SDL
We’re going to write a parallax starfield simulation, which shows a bunch of stars scrolling across the screen at varying speeds. It’s the sort of thing you can use in a screensaver, or as the background for a side-scrolling shoot ’em-up game. Most importantly, it shows how to achieve a number of things in the language: creating arrays, performing maths, doing loops, plotting pixels and so forth.
Now, we’re starting off at a pretty low level in the form of C and SDL. C is regarded by many as a ‘portable assembly language’, and doesn’t include hand-holding features of higher-level languages, such as garbage collection. With C, you can interact more closely with memory and hardware, and that’s ideal for certain tasks. SDL (Simple DirectMedia Layer), meanwhile, is a very popular multimedia library that enables you to work with images, fonts and sounds. It’s not the easiest library to use when compared with game development kits, but like C it gives you plenty of control.
So we’ll start with this happy couple, looking at the ‘raw’ way of making a starfield, and then move on to some high-level alternatives later. Here’s the code – you can find it on the coverdisc as starfield.c (open the c_with_sdl directory inside starfield.tgz).

  #include <stdlib.h>
  #include <SDL.h>

  #define MAX_STARS 100

  typedef struct {
    int x, y, speed;
  } star_type;

  star_type stars[MAX_STARS];

  int main()
  {
    int i;
    Uint8 *p;
    SDL_Surface *screen;

    SDL_Init(SDL_INIT_VIDEO);
    atexit(SDL_Quit);
    screen = SDL_SetVideoMode(640, 480, 8, SDL_SWSURFACE);

    for(i = 0; i < MAX_STARS; i++) {
      stars[i].x = rand()%640;
      stars[i].y = rand()%480;
      stars[i].speed = 1 + rand()%16;
    }

    for(i = 0; i < SDL_NUMEVENTS; i++) {
      if (i != SDL_QUIT)
        SDL_EventState(i, SDL_IGNORE);
    }

    while (SDL_PollEvent(NULL) == 0) {
      SDL_FillRect(screen, NULL, SDL_MapRGB(screen->format, 0, 0, 0));

      for(i = 0; i < MAX_STARS; i++) {
        stars[i].x -= stars[i].speed;
        /* Wrap to the last valid column: x = 640 would lie outside the row */
        if(stars[i].x < 0)
          stars[i].x = 639;

        p = (Uint8 *) screen->pixels + stars[i].y * screen->pitch
            + stars[i].x * screen->format->BytesPerPixel;
        *p = 255;
      }

      SDL_UpdateRect(screen, 0, 0, 0, 0);
      SDL_Delay(30);
    }

    return 0;
  }

Not bad, eh? In just over 50 lines of code, we have a snazzy starfield effect, all using plain C and SDL without any fancy layers on top. To compile the binary from this, enter:

  gcc -o starfield starfield.c `sdl-config --cflags` `sdl-config --libs`

Note the backtick characters here – the key to generate them will probably be at the top-left of your keyboard. Basically, anything inside backticks is treated as a command, which is run first so that its output can be substituted into the complete command (from gcc onwards) before that is processed. So if you type sdl-config --libs on its own, for instance, you can see the compiler parameters required to build SDL programs. To run the compiled program, enter ./starfield.
Anyway, let’s look at the code. The first two lines simply tell the compiler that we want to include header files for the standard library (so that we can generate random numbers later), and SDL (so that we can use routines from the library). Then we have a #define line, which tells GCC that all instances of MAX_STARS from here onwards in the source code should be replaced with the number 100. This allows us to experiment with the number of stars by changing just one line, instead of having to make changes all over the code.
Next up, we define a structure – that is, a collection of variables that can be referenced together under one name. We call this star_type, and every variable we create with this type will have X and Y coordinate variables inside it, along with a speed (they’re all integer numbers). With this line:

  star_type stars[MAX_STARS];

we create a new array of star_type structures, called stars. So, in the coverdisc version of the code, that’s an array of 100 stars, which can be referenced from stars[0] to stars[99].
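If the structure and array feel abstract, it may help to see the same shape in the Ruby used earlier in this issue. The following is a rough analogue rather than a port of the SDL program: Star, rng and the fixed seed are assumptions made for this sketch, and only the 100-element array of x, y and speed values mirrors the C code.

```ruby
# Rough Ruby analogue of the C structure and array. The names mirror the C
# listing, but this is an illustration, not a port of the SDL program.
MAX_STARS = 100

Star = Struct.new(:x, :y, :speed)   # plays the role of star_type

rng = Random.new(42)                # fixed seed so runs are repeatable
stars = Array.new(MAX_STARS) do
  Star.new(rng.rand(640), rng.rand(480), 1 + rng.rand(16))
end

puts stars.size
puts stars[0].inspect
puts stars[MAX_STARS - 1].inspect   # the last valid element, like stars[99]
```

One difference worth noting: indexing past the end of a Ruby array returns nil, whereas reading stars[100] in C is undefined behaviour.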
My main man Now it’s time to kick off the code itself, inside the main() function (the first one that is executed in a C program – and indeed the only one in our case). We declare i as an integer variable that we’ll use for counting purposes later. Then we declare p as an unsigned 8-bit integer pointer, which is a variable that we’ll use to point to the graphics data later on. And, lastly, we have a screen, a pointer that we’ll use with SDL.
To make the screenshot more exciting, here’s the C implementation with 2000 double-size stars whizzing around.
If you’ve never heard of pointers before, they’re like variables, but instead of holding numbers that we can use directly they hold the address in memory of other variables. So you might have the integer variable x, which holds the number 50, and is sitting at location 1000 in RAM. If you create a pointer variable called y and point it to x, then y won’t contain 50 but will contain the memory address instead – 1000. They can be quite fiddly to use and many high-level languages avoid them, but it’s worth being aware of them.
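The x and y example can be tried out without writing any C. The following is a minimal sketch that assumes Python's standard ctypes module as a stand-in for real C pointers; the variable names address_of_x, value_via_y and value_after_write are ours, and the address you see will be whatever your machine hands out rather than 1000.

```python
import ctypes

# An integer variable x holding 50, as in the example above.
x = ctypes.c_int(50)

# y is a pointer to x: it records where x lives, not the value 50.
y = ctypes.pointer(x)

address_of_x = ctypes.addressof(x)   # machine-specific address of x
value_via_y = y.contents.value       # dereferencing y reads the value stored in x

# Writing through the pointer changes x itself.
y.contents.value = 255
value_after_write = x.value

print(hex(address_of_x), value_via_y, value_after_write)
```

Writing 255 through y and then reading x back is the same move the C listing makes with *p = 255, only there p points into the screen's pixel data.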
Moving on, we have three lines which initialise the SDL subsystems, arrange for the SDL_Quit routine to be called when the program ends, and create a new window (640 pixels wide by 480 high, in 8-bit colour mode for 256 colours). The call to SDL_SetVideoMode returns a structure containing the display information, so we make sure that our screen pointer is pointing to it.

Next up, we have two for loops. The first one populates our array of stars, using random X and Y locations for the starting points, along with their speeds. Try changing the 16 in the speed line to higher and lower numbers to see the effects. The second for loop tells SDL that we don’t want it to pester us with every keyboard or mouse movement event it receives – it should ignore anything apart from when the user closes the window.

And then we come to the while loop, where all the fun takes place. We tell SDL that we want to perform this loop until the window is closed, and our first order of business is to blank out the screen with black (RGB 0, 0, 0) using the SDL_FillRect function (the NULL here makes it draw to the whole window). Then we cycle through the star array, updating the horizontal positions of each star by subtracting their speed values from their x coordinates. The window’s pixels go from 0 on the left to 639 on the right, and 0 on the top to 479 at the bottom. So, if a star flies off the left of the window (ie, its x is less than zero), we start it again from the right-hand side, at 640.

The two lines beginning with p and *p are used to plot the stars. In the first, we set our p pointer variable to point at the exact area of the graphics data where the pixel should be plotted. This data is stored inside the pixels member of the screen structure, and because it’s a linear one-dimensional row of bytes we need to do some maths to match it up with our 2D image. Once we have p pointing to the right place, we place the number 255 – meaning white – inside the byte that it points to by dereferencing it (using the star).

Finally, we tell SDL that we’re finished with our drawing operations so it should render the whole lot to the screen, and then have a delay of 30 milliseconds so that it doesn’t run too quickly.

And we’re done! This might seem a bit complicated if you’ve never done any C programming before, so let’s move on to the higher-level alternatives. Once you have those sussed out, come back to this one and it will be clearer.

Even humble Ncurses, mixed up with a bit of Perl, is capable of parallax-scrolling starfields.

High-level: Python and Pygame
Compared with what we’ve just done, life is a lot easier when you’re using Python and its Pygame library. Working with graphics is a lot simpler, and you don’t have to fiddle around with pointers. Here’s the code – it’s called starfield.py, and if you’ve got Pygame installed then you can run it with ./starfield.py. If you’re new to Python you should find this code quite readable, but it’s important to note the indentation here, which is vital for program flow. Code belonging to blocks (such as loops) must always be indented.

    #!/usr/bin/env python
    import pygame
    from random import randrange

    MAX_STARS = 100

    pygame.init()
    screen = pygame.display.set_mode((640, 480))
    clock = pygame.time.Clock()

    stars = []
    for i in range(MAX_STARS):
        star = [randrange(0, 639), randrange(0, 479), randrange(1, 16)]
        stars.append(star)

    while True:
        clock.tick(30)
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                exit(0)

        screen.fill((0,0,0))
        for star in stars:
            star[0] -= star[2]
            if star[0] < 0:
                star[0] = 640
            screen.set_at((star[0], star[1]), (255, 255, 255))

        pygame.display.flip()

Logic-wise, this implementation is very similar to the C one, so you can compare features in the languages. We start off by telling Python that we want to use the Pygame module and the randrange number-generating routine from the random module, and then set up a variable called MAX_STARS which has the same purpose as its C equivalent. Then we get Pygame fired up and create a window, assigning it to a variable called screen, before setting up a background timer using Pygame’s clock facility, to slow things down a bit later.

Next comes the construction of our star array (or list in Python parlance), in the form of stars = []. You can see that Python is a lot more flexible than C, and you don’t have to declare the size of an array at the start. What we do here is create, step by step, 100 star objects and provide each of them with three values: the X coordinate, the Y coordinate, and the speed, just like in the C version. After creating each star object we drop it into our stars list using append.

Then there’s the while True loop, which is the main loop. First, we have a delay using clock.tick, and then we tell Pygame to process any keyboard or mouse events coming in (so it can quit the program if the user closes the window). Then, we use the fill routine of our screen object to black out the window, and start cycling through the stars. For each star, we subtract its third element (speed) from its first (X position), making the stars move left (elements in an array or list are counted from zero). Like in the C version, we check to see if the star has gone off the left-hand side of the screen. Then we plot a white pixel (255, 255, 255 in RGB format) at the X and Y positions, and we’re done with the star processing loop.

Lastly, we call the display flip routine, which renders all of our previous drawing operations to the screen. Overall, it’s shorter, simpler and easier to read than the C/SDL version, and you can imagine that writing games in Pygame is a lot of fun.

A path to follow
We’re bound to start a few flame wars here, but anyway… If you want to spread your wings and become a great all-round programmer, these are the languages we’d recommend exploring:

Assembly: This will get you familiar with the nuts and bolts of programming, working directly with hardware and memory. Telling the CPU directly what to do is way more satisfying than magically obscuring things behind compilers. x86 is a bit of a mess, so to start with get a ZX Spectrum or Commodore 64 emulator, and try out their assembly languages (Z80 and 6502 respectively).

C: It’s close to being the ‘standard’ programming language, if there could ever be such a thing. It’s available almost everywhere, most of the Linux kernel is written in it, and it combines low-level features with some higher-level abstractions. Most implementations of other programming languages are written in C.

Python: This gives you a taste of a high-level language, with object orientation and highly readable code. There’s a vast range of add-on modules, making it ideal for all kinds of programming, from network tools to GUI desktop apps. We’ll look at Python later on.

Lisp: The constant use of brackets may drive you insane, but it’s a good way to learn about functional programming – a very different approach to the likes of C and Python.

We’re not saying these are the best languages to learn, or indeed the most useful (if you want to make a career in programming, then you’ll want to learn Java, C# or Objective-C). But we think that if you spend some time with the above languages, you’ll absorb a huge amount of knowledge and flesh out your programming prowess so that you’re a great all-rounder.
Unusual-level: Perl and Ncurses
Finally, let’s take a look at that equally loved and hated master of text processing, Perl. And to mix things up even more, instead of using graphics to render the starfield, we’re going to use the text terminal. How? By printing full-stop characters for the stars! There’s a very helpful library called Ncurses that’s available to most programming languages; it makes handling the terminal window (such as moving around, disabling the cursor, etc) pretty easy. Here’s the code, which you’ll find in the zip file listed on the contents page:

    #!/usr/bin/perl

    $numstars = 100;
    use Time::HiRes qw(usleep);
    use Curses;

    $screen = new Curses;
    noecho;
    curs_set(0);

    for ($i = 0; $i < $numstars ; $i++) {
        $star_x[$i] = rand(80);
        $star_y[$i] = rand(24);
        $star_s[$i] = rand(4) + 1;
    }

    while (1) {
        $screen->clear;
        for ($i = 0; $i < $numstars ; $i++) {
            $star_x[$i] -= $star_s[$i];
            if ($star_x[$i] < 0) {
                $star_x[$i] = 80;
            }
            $screen->addch($star_y[$i], $star_x[$i], ".");
        }
        $screen->refresh;
        usleep 50000;
    }

By now, the general structure of this code should make a lot of sense to you. There’s one major difference here, though; instead of having an array of stars containing coordinates and speeds for each one (ie, an array of arrays), to make things simpler we’ve just set up three arrays. star_x holds the X coordinates of the stars, star_y the Y coordinates, and star_s the speeds (at least one, so that no stars are static). You’ll see that the dimensions of 80x24 refer to the standard X Window System terminal size, but you can change these for bigger terminals.

Here, addch is the Ncurses routine to print a character – and interestingly, it takes the Y coordinate first. usleep pauses execution for the specified number of microseconds. Oh, and the noecho and curs_set instructions at the start say that we don’t want to see our own keyboard output or the text cursor.

So, we’ve explored generating a starfield in three languages with three toolkits, and hopefully it has tempted you to try more languages and libraries. You can see how the basic algorithms in a program are usually the same across implementations, but there’s always something new to learn. Enjoy, and happy hacking.

Get hacking!
Once you’ve spent a bit of time with these programs, why not see if you can expand them? Some ideas:

Try experimenting with different window sizes and dimensions.
For the C and Python versions, you can add a few lines so that the stars have random colours.
Take input from the keyboard to affect the direction of the stars and their speed.

For C and SDL, see the documentation website at www.libsdl.org/cgi/docwiki.cgi – the quick guide is especially useful. Pygame has especially good tutorials and reference guides at www.pygame.org/docs, while Perl + Ncurses aren’t so well documented, but you can often find examples of a specific instruction by searching online. Also see http://tldp.org/HOWTO/NCURSES-Programming-HOWTO – it’s for C, but most of the functions are implemented in Perl as well. And if you get totally stuck, or just want to share your work, pop by the LXF forums at www.linuxformat.com/forums and head into the Programming section.

Quick tip
If you’re thinking about writing a game, check out the Allegro library (http://alleg.sf.net). It’s stable, mature, cross-platform and many impressive games have been written with it, as you can see at www.allegro.cc (browse the Action category in Projects on the left, for instance).
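To see how little of the starfield depends on the toolkit, the part that the C, Python and Perl listings share can be pulled out on its own. This is only a sketch: make_stars and step are hypothetical helper names of ours, not part of any of the three programs, and drawing is deliberately left out.

```python
from random import randrange

def make_stars(count, width, height, max_speed):
    """Build [x, y, speed] lists, laid out like the stars in the Pygame listing."""
    return [[randrange(0, width), randrange(0, height), randrange(1, max_speed + 1)]
            for _ in range(count)]

def step(stars, width):
    """Move every star left by its speed; restart it at the right edge if it leaves."""
    for star in stars:
        star[0] -= star[2]
        if star[0] < 0:
            star[0] = width
    return stars

# The same routine serves a 640x480 window and an 80x24 terminal.
pixel_stars = step(make_stars(100, 640, 480, 16), 640)
text_stars = step(make_stars(100, 80, 24, 4), 80)
```

Only the final plotting call differs between the versions: a write through a pointer in C, set_at in Pygame, and addch in Ncurses.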
Scheme: Learn the basics
To help you get used to learning new languages, Jonathan Roberts explains how to get started with the simple but popular Scheme.
So far in this guide we’ve given you a crash course in Ruby and introduced you to the essentials of some other popular languages. One that we haven’t mentioned so far is Scheme, which is a shame really, as many of the most well-regarded introductions to programming and computer science choose it as their teaching tool. While there are many reasons for this, suffice it to say that Scheme is a popular choice because it’s a simple language with little syntax to learn. This means it does a wonderful job of demonstrating many important principles that are obscured in other languages by their more complex syntax.

As well as being a good language for new programmers, Scheme is also interesting to those with more experience. Unlike most of the languages we’ll focus on in this guide, it’s primarily a functional language, as opposed to an object-orientated or imperative one. As such, it’s a great chance to see some different approaches to common problems. With all that in mind, over the next three issues we’re going to introduce you to Scheme. To begin with, we’re going to focus on some of the basics of Scheme, with later articles looking at some particularly functional approaches to solving problems.
Installing Scheme
Almost all Linux distributions come with a Scheme interpreter built in. It’s called Guile, but it’s pretty complex and we won’t be using it. Instead, we’re going to use a tool called Dr. Racket, which you can find in some distributions’ repositories or by heading to its official website: http://racket-lang.org/download. Dr. Racket is actually designed to work with a particular dialect of Scheme (which is itself a dialect of Lisp, the language Emacs is written and extended in) called Racket. But it’s a great programming environment, and you can make it work with standard Scheme by going to the Language > Choose Language menu and selecting R5RS from the window that appears. With Dr. Racket happily installed, we can start programming.

Dr. Racket is our tool of choice for programming in Scheme. It provides many useful features, including bracket balancing and tools for debugging code.
Dr. Racket
Dr. Racket is a programming environment that makes coding in Scheme much easier than in a plain text editor. Its main window is split in two. The top half can be used for entering, saving and loading entire programs, made up of many functions and definitions. When you have completed work on part of a program and you want to see how it works, press the Run button in the top-right of the screen and Dr. Racket will evaluate the program and display any results in the bottom half of the window.

As well as being used to display results of programs entered in the top half, the bottom half of Dr. Racket’s main window can be used as an interactive interpreter. If you type Scheme expressions into it and press return, it will immediately show the results. This is a handy way to quickly check that small snippets of code work as you expect. It’s also the part that we’ve used throughout this month’s tutorial.

There are other parts of Dr. Racket that make it an appealing environment to code in. Elsewhere in the article, we’ve mentioned its automatic bracket balancing, which helps us to avoid syntactic mistakes. However, it also provides a number of other tools to help us spot mistakes in our code. For instance, if you try to run a program with a mistake, Dr. Racket will provide a description of the error and even highlight the part of the code where it occurred. What’s more, when we start working with more complicated programs the Debug button in the top-right can help us track what the program does and understand how it evaluates different types of expressions.

Programming paradigms
At the start of this article, we said one of the things that sets Scheme apart is it’s a largely functional language, as opposed to an object-orientated or imperative one. If you’re new to programming, this might not have meant much to you, but we’ve got you covered. These three obscure-sounding titles refer to different programming paradigms – that is, particular ways of thinking about problems and programming solutions for them, different styles of programming.

When programming in an imperative style, the main focus is on recording and adjusting the state of the program. This is often done through the use of variables and functions that modify their value directly. This particular paradigm matches the way programs are executed on the hardware, but it can make programs harder to develop and test, since the programmer must take into account the state of many external variables, not just the inner workings of a single function. That is to say, in imperative languages functions have side-effects. The imperative paradigm is most often associated with Assembly language or C.

Object-orientated programming attempts to resolve the problem of external variables by structuring a program around the idea of separate objects. Each object records its own state, which it keeps separate from the rest of the program. Each object also specifies a number of methods (functions) that let other parts of the program inspect and modify its state. It models the real world well, where most things appear to us as individual objects. A cooker, for instance, is an object. It has controls, like methods, that let me modify and inspect its state without affecting other parts of the kitchen.

Finally, functional programming attempts to do away with the idea of state completely. There are no external variables, no side effects. This makes designing and testing programs much simpler, since every time a function is run with the same input values, it will return the same output. There’s no need to take into consideration any external variables, or its interaction with any external objects.

Simple expressions
All programming is really about manipulating information, data. That data can represent anything, from the ingredients used to mass produce a certain kind of biscuit to the position of a robot’s arms and claws on a factory assembly line. In order to manipulate this data, we must find ways to represent it that a computer can understand, and we must be able to define the processes that we will use to manipulate it, too. Doing this for complex kinds of data is obviously very challenging, so we’re going to start off with a far simpler type: numbers.

In Scheme, numbers and many common (and some not so common) operations that you might perform on them are known as primitives – they’re built into the language itself. You can experiment with this in the bottom half of the Dr. Racket window. This part of the window is an interactive interpreter: you can type Scheme expressions into it and the result of evaluating the expression will immediately be displayed to you. Try typing a 5 and pressing return and you’ll see another 5 – this tells us that the result of evaluating a simple number (a primitive) is the number itself.

Doing calculations
If you combine numbers with their basic operations, you can use Dr. Racket’s interpreter and Scheme as a simple calculator, although the syntax is a bit different from what you might be used to from school:

    > (+ 5 5)
    10
    > (- 8 4)
    4
    > (* 2 3 4)
    24

As you can see, the operation comes first and the numbers it’s applied to (the operands) come afterwards; what’s more, the whole expression must be wrapped in brackets. This syntax has two advantages: first, you can easily apply the operation to as many numbers as you like without having to repeat the symbol. The other advantage is that there’s never any question about what order to perform the calculation in: just remember that you evaluate the inner-most expression first and work your way out:

    > (+ 3 (* 2 4) (+ 4 (- 8 3)))
    20

In this example, you calculate (- 8 3) first, then (+ 4 5) and (* 2 4), before finally doing (+ 3 8 9).

To make it easier to keep track of all the brackets, Dr. Racket will highlight matching brackets and which part of the expression they apply to as you close them off. Our lives are also made easier thanks to a convention called pretty-printing. Rather than writing the entire expression and all its sub-expressions on a single line, you can break it, aligning all the operands (the numbers each operation is applied to) vertically:

    > (+ 3
         (* 2 4)
         (+ 4
            (- 8 3)))
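The inner-most-first rule is easy to try out in another language. Below is a rough sketch of a tiny evaluator in Python, assuming nested tuples as a stand-in for Scheme's bracketed expressions; evaluate and OPERATIONS are our own illustrative names, and this is not how Dr. Racket itself is implemented.

```python
import operator
from functools import reduce

# Hypothetical table covering only the three operations used in the examples.
OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(expr):
    """Evaluate a number, or a tuple of (operator, operand, operand, ...)."""
    if isinstance(expr, (int, float)):
        return expr                       # a primitive number evaluates to itself
    symbol, *operands = expr
    values = [evaluate(operand) for operand in operands]   # inner expressions first
    return reduce(OPERATIONS[symbol], values)              # then apply the operation

# The nested example from the text: (+ 3 (* 2 4) (+ 4 (- 8 3)))
result = evaluate(("+", 3, ("*", 2, 4), ("+", 4, ("-", 8, 3))))
print(result)  # 20
```

Because each operation is folded across all of its operands, ("*", 2, 3, 4) works without repeating the symbol, just as in Scheme. Single-operand forms such as (- 8) are not treated the way Scheme treats them; the sketch leaves that case out.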
Pretty-printing and bracket-balancers make dealing with Scheme’s many brackets much easier. Once you get the hang of the brackets, you’ll come to love the lack of ambiguity.

Variables
When we’re programming, the data we’re working with rarely represents just numbers. For instance, a number might be the quantity of elderberries needed to make 30 litres of elderberry wine, or it might be the volume of a bottle needed to store the wine that’s being made. To keep our program manageable and readable as it becomes more complex, one of the most important things a language can do is let us give names to the numbers we’re working with. In Scheme, this is done with the define statement:

    > (define pi 3.14)
    > (define radius 3)

As you can see, it works just like any of the mathematical operators in our earlier example. The only difference is that the operands must come in a certain order: first is the name you want to give to the variable and then comes the value to be assigned. After defining a variable, you can then use it just as you would any primitive value built into the language:

    > pi
    3.14
    > (* 2 pi radius)
    18.84

A good way to think about how Scheme evaluates this expression is like you were taught to do algebra in school: through substitution. It will look up the values for pi and radius and then re-write the expression with these values in place: (* 2 3.14 3). After getting down to primitives, it will then carry out the final evaluation and return the result (this is a simplification, but it’s a good way to think about things).
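The substitution step can be made visible with a few lines of Python. This is a simplified sketch under the same caveat the text gives: it models the idea, not a real Scheme interpreter. The names table and the substitute and evaluate helpers are hypothetical, and expressions are written as nested tuples.

```python
import operator
from functools import reduce

OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def substitute(expr, names):
    """Rewrite an expression with every defined name replaced by its value."""
    if isinstance(expr, tuple):
        symbol, *operands = expr          # the operator symbol is left untouched
        return (symbol, *[substitute(operand, names) for operand in operands])
    if isinstance(expr, str):
        return names.get(expr, expr)      # look the name up, as with pi and radius
    return expr                           # numbers are already primitives

def evaluate(expr):
    """Evaluate an expression that contains only numbers and operators."""
    if not isinstance(expr, tuple):
        return expr
    symbol, *operands = expr
    return reduce(OPERATIONS[symbol], [evaluate(operand) for operand in operands])

names = {"pi": 3.14, "radius": 3}         # the two define statements
rewritten = substitute(("*", 2, "pi", "radius"), names)
print(rewritten)                          # ('*', 2, 3.14, 3)
print(evaluate(rewritten))
```

The intermediate value rewritten corresponds to the re-written expression (* 2 3.14 3) described above, and evaluating it gives the circumference.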
Procedures, aka functions
We’re not restricted to giving names only to primitive pieces of data, however. We can even give names to entire procedures if we want to. The second expression, above, calculates the circumference of a circle with a radius of 3. You might remember from school that there’s a general formula for describing how to calculate the circumference of any circle: 2 times pi times the radius. In the expression above, we executed a specific instance of the procedure described by this formula, but Scheme lets us express the general form, as well:

    > (define (circumference radius) (* 2 3.14 radius))

The final part of this expression (* 2 3.14 radius) is exactly the same as we saw above (* 2 pi radius); the only difference is that the value of radius isn’t set yet. Instead, it refers to the name in the previous part of the expression (circumference radius). We can also define procedures using pretty-printing:

    > (define (circumference radius)
        (* 2 3.14 radius))

This makes it easier to distinguish between the different parts of the expression: the top line gives the name and any parameters, while everything below that is the body of the procedure – the actual operations to be carried out. Once we’ve defined our procedure like this, we can use it just like any of the primitive operations:

    > (circumference 10)
    62.800...
    > (circumference 5)
    31.400...

What happens when we do this is that, within the expression, the substitution model is once again applied – circumference is substituted by the body of the procedure, while the value given in place of radius is put into the appropriate parts of the expression. In these examples, this results in the evaluation of (* 2 3.14 10) and (* 2 3.14 5).

Pretty-printing helps separate the function name and formal parameters from the body. Once defined, we can use functions just like any other primitive operation.
Conditional expressions
One tool that all languages, including Scheme, have is the ability to take an action only if a certain condition is true. This is obviously a vital ability, as it’s something we do all the time when following a procedure. For example, if the elderberry wine is clear on day 35, progress to the next step; if it’s not, keep stirring. In Scheme, this ability is implemented through three separate tools. The first is the existence of relational operations that allow us to inspect the relationship between two things:

    > (< 10 5)
    #f
    > (> 10 5)
    #t
    > (= 10 10)
    #t

< is the less than symbol, and checks whether the number on the left is less than the number on the right; > is the greater than symbol and checks whether the number on the left is greater than the number on the right; finally, = tests whether the two numbers are equal. These operations will always return #f for false and #t for true.

The second is the existence of the boolean operations that allow us to combine the results of multiple relational expressions:

    > (and (> 10 5) (= 10 10))
    #t
    > (or (< 10 5) (= 10 10))
    #t
    > (not (> 10 5))
    #f

As you would expect, and returns #t if, and only if, all the expressions within it are true; or returns #t if any of the expressions is true; and not inverts the result of the expression, turning true to false and false to true.
Case analysis
The final tool that Scheme gives us is the ability to perform case analysis. This allows us to perform different operations depending on the result of any of the tests described above. The general form of a case analysis in Scheme is:

    (cond (<test> <expression>)
          (<test> <expression>)
          ...
          (<test> <expression>))

The evaluation of a cond expression like this checks the top test first. If it’s true, it will evaluate the associated expression and the case analysis will be finished; if it’s false, however, it will proceed to the next test down and do the same. If none of the tests are true, it will eventually reach the bottom of the cond expression and nothing will have happened. There’s an alternative form of cond, too, that replaces the final test with the keyword else. This means that if none of the other tests evaluate to true, the final expression is carried out regardless.

Fizz-Buzz
For instance, imagine we were making a program to play the game Fizz-Buzz. The rules of the game say that if a number is a multiple of 3, it should return the word Fizz, if it’s a multiple of 5, it should return the word Buzz and if it’s a multiple of 3 and 5, it should return Fizz-Buzz:

    > (define (fizz-buzz x)
        (cond ((and (= (remainder x 3) 0)
                    (= (remainder x 5) 0))
               'fizz-buzz)
              ((= (remainder x 3) 0) 'fizz)
              ((= (remainder x 5) 0) 'buzz)
              (else 'shh)))

You can see how this fits the pattern of the case analysis laid out above, even as it introduces a few other concepts. First, note that remainder is a primitive function that returns the remainder of a division between two numbers, eg, the remainder of 5 divided by 3 is 2. By checking to see if the remainder is 0, we can check whether a number is a multiple of another. Second, note that to use plain English words in Scheme, we have to put a single quote at the beginning of the word.

The first test, then, checks to see whether x is a multiple of 3 and 5, in which case it evaluates the expression 'fizz-buzz, which as with numbers, returns itself. The second checks whether x is a multiple of just 3, returning 'fizz, and the one after that whether x is a multiple of 5, returning 'buzz. The else clause at the end returns 'shh in case the number isn’t a multiple of either 3 or 5.

Now you know all the basic elements of the Scheme programming language, we’ll move on to look at lists and recursion – two elements that are put to great use in functional programming languages.

By putting the code for Fizz-Buzz in the definitions window, the top part of Dr. Racket, we can use its debugger to walk through its execution step-by-step.
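For comparison, the same case analysis can be sketched in Python, where an if/elif/else chain plays the role of cond. The function name fizz_buzz and the use of strings in place of quoted Scheme symbols are our choices for the sake of illustration.

```python
def fizz_buzz(x):
    """Check the tests from top to bottom and return at the first true one."""
    if x % 3 == 0 and x % 5 == 0:      # multiple of both 3 and 5
        return "fizz-buzz"
    elif x % 3 == 0:                   # multiple of just 3
        return "fizz"
    elif x % 5 == 0:                   # multiple of just 5
        return "buzz"
    else:                              # plays the part of cond's else clause
        return "shh"

print([fizz_buzz(n) for n in (3, 5, 15, 7)])
```

The order of the tests matters in both languages: if the check for 3 came first, 15 would return fizz and the combined test would never be reached.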
Exercises
Having learned the basics of Scheme, you ought to solidify your knowledge by working on some example problems. Having a good grasp of the material covered here will make sure you’re ready for the next tutorial.

EXERCISE 1 Without using Dr. Racket’s interactive interpreter, evaluate the following expression: (+ 2 (* (- 5 3) 4 (/ 2 (- 3 1)))). Rewrite the expression in pretty-printing format.

EXERCISE 2 Write a procedure, called currency-conv, that accepts a value in Pounds Sterling and returns a value in US Dollars. Use Google to look up the current rate of conversion.

EXERCISE 3 Without using Dr. Racket’s interactive interpreter, evaluate the following expressions:

    > (define a 2)
    > (define b 3)
    > (define (square x) (* x x))
    > (square 2)
    > (define (sum-squares a b)
        (+ (square a) (square b)))
    > (sum-squares 2 4)

EXERCISE 4 A cinema sets prices based on age. If you’re younger than six, you get in for free; if you’re younger than 12, you get in for £3; if you’re older than 65, you also get in for £3. Otherwise, you have to pay full adult price, which is £5. Complete the following Scheme procedure to calculate how much someone has to pay to go to the cinema.

    > (define (ticket-price age)
        (...
Ready to get your brain in a twist? In the second part of our guide to Scheme, Jonathan Roberts introduces recursion and some useful data types.
Last time, we introduced you to Scheme and walked you through the basics of the language, and indeed of programming in any language. By the end of it, you knew how to create procedures, including ones that can handle different situations through case analysis. This time, we’re going to introduce you to recursion. As we do so, we’re going to introduce a new data type that will let your programs work with much more than simple numbers.
Recur… what? If you’re put off by the weird-sounding name – recursion – in the introduction to this article, don’t panic as there’s nothing much to it really; in fact, once you know what it’s all about, you may think it’s pretty cool – that is, if you enjoyed Inception. If you’ve ever watched that movie, or stood in a hall of mirrors and seen the way the reflections seem to carry on for ever, that’s recursion – the thing that you’re interested in repeats itself within itself: dreams within dreams, mirrors within mirrors. To understand how this can be put to work in programming, imagine that you work for a large publishing company, with 100 advertising sales people, 30 database marketers, 20 photographers and 100 magazines, each with six people working on them. Your boss comes along and asks you to find out what every single person had for breakfast. “Ugh!” you think, “what a boring task!”. After procrastinating for a while, you decide that the easiest way to get out of this is to pass the ball along. You call the head of advertising and say that the boss wants them to find out what all their staff had for breakfast, and ask them to let you have the information once they’ve found out; you then do the same to the head of database marketing, the
photographers and the magazines. Obviously, the head of the magazines doesn’t want to ask 600 people, so she does the same thing you did: asks the head of each magazine to find out what each of their staff had for breakfast, and to let her know the results when they’ve found out. The advertising team do the same thing. The magazine editors have a small enough team to just ask everyone, which is what they do. After some time passes, magazine editors begin reporting in what their staff had for breakfast, as do the advertisers. Eventually, all the managers you contacted know what all the staff below them had for breakfast, and they ring you and let you know. You can now tell the boss. After getting in touch with just four people, you’ve managed to find out what 750 people had for breakfast – this is recursion in action. One over-arching task, finding out what everyone in the company had for breakfast, was completed by breaking it in to smaller, although identical, sub-tasks, with a tiny amount of effort. What’s more, while the whole task seemed daunting and complex to solve, the smaller tasks were very easy to solve – just ask a handful of people what they had for breakfast.
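The breakfast survey maps quite directly onto code. In this sketch the company is a made-up nested structure (a dict for a manager with groups beneath them, a list for a team small enough to ask directly); every name and breakfast in it is invented for illustration, and collect_breakfasts is our own hypothetical function.

```python
# Hypothetical org chart: dicts are managers with groups below them,
# lists are small teams of (name, breakfast) pairs.
company = {
    "advertising": {"north": [("Ann", "toast")], "south": [("Bob", "cereal")]},
    "photographers": [("Cy", "porridge"), ("Di", "toast")],
    "magazines": {"mag-a": [("Ed", "eggs")], "mag-b": [("Flo", "fruit")]},
}

def collect_breakfasts(group):
    """Return {name: breakfast} for everyone at or below this group."""
    if isinstance(group, list):            # small team: just ask everyone
        return dict(group)
    answers = {}
    for subgroup in group.values():        # otherwise pass the same task down
        answers.update(collect_breakfasts(subgroup))
    return answers

breakfasts = collect_breakfasts(company)
print(len(breakfasts), "people surveyed")
```

The top-level call only ever touches three groups; each of those repeats the identical task on a smaller part of the company until a team is small enough to answer directly.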
Factorials Exactly the same principle can be applied to code. A popular example of this is calculating factorials – that is, calculating the number of ways a list of items can be arranged. By itself, this sounds like a daunting problem to solve. Where on earth do you start, and how can you be sure that you’ve got all the arrangements? Mathematicians figured out how to do this a long time ago, and the procedure can be described thus: If the length of the list is one, then the answer is 1 – there’s
Iterative recursion Experienced programmers may know that the factorial problem is easily solved in other languages through a more common method. Instead of writing a recursive function, you can just assign some variables and write a loop, as this Python example shows:
    def factorial(n):
        fact = 1
        for x in range(1, n + 1):
            fact *= x
        return fact
This is, in fact, a more efficient way to solve the factorial problem. By reusing variables like this, the amount of memory required to calculate a factorial is constant, no matter how big a number we feed in to it. In contrast, the recursive implementation given in the article will need more and more space in order to remember all the parts of the multiplication it will have to do later. If you write a few more examples of our Scheme procedure by hand, you’ll see this illustrated in the way the line grows to the right. Scheme isn’t inefficient, however. Being a simple language, it doesn’t have a special form to support loops like this, but the same effect can be achieved with standard procedures and recursion:
    (define (fact n)
      (define (fact-iter count result)
        (cond ((> count n) result)
              (else (fact-iter (+ count 1) (* count result)))))
      (fact-iter 1 1))
only one way to arrange a list with a single item in it. For any other length of list, the number of arrangements equals that list’s length multiplied by the factorial of the length one smaller than it. For example, 1 factorial (written 1!) equals 1; 2! = 2 x 1!; 3! = 3 x 2!; and so on. There are two points to note here. First is that this is a recursive technique – the factorial of any number can be found by finding the factorial of smaller and smaller numbers. The second is that there’s a ‘base case’, that is a simple case at which the recursion stops and a simple, primitive answer is returned (in this case, the base case is 1! = 1). This can be translated in to a Scheme procedure quite literally:
    (define (factorial n)
      (cond ((= n 1) 1)
            (else (* n (factorial (- n 1))))))
The clever thing about this is that the procedure calls itself! This is what makes it a recursive procedure. To understand how this works, you can apply the substitution model that we talked about last time and work through a small run of factorial by hand:
    (factorial 3)
    (* 3 (factorial 2))
    (* 3 (* 2 (factorial 1)))
    (* 3 (* 2 1))
    (* 3 2)
    6
Each time the procedure is evaluated, the interpreter substitutes the formal parameters into the appropriate parts of the procedure’s body. Instead of reaching primitive operations that it can evaluate straightaway, however, it finds it has to evaluate another procedure – itself. Eventually, the parameter given to factorial will be 1, which is a primitive and can be evaluated, and the procedure
Exercises As with the previous tutorial, here are some Scheme-based exercises to keep you busy.
EXERCISE 1 Write a procedure that takes a list as input and returns the reversed version of it: eg, (1 2 3 4) becomes (4 3 2 1).
EXERCISE 2 Draw a box and pointer diagram that shows how you can use pairs and lists to build a tree structure.
EXERCISE 3 The Towers of Hanoi is a famous puzzle. In it, there are three pegs: A, B and C. On peg A is a stack of six discs, each of a different size, with the largest at the bottom and the smallest at the top. The challenge is to move the entire stack to peg C, obeying the following rules:
1. Only one disc can be moved at a time.
2. A disc can never be sat on another smaller than it.
EXERCISE 4 Describe a recursive method for solving this puzzle. For an extra challenge, implement it in Scheme.
can unravel itself, just like the managers all reporting back to the person above them. Without this base case, however, the recursive procedure would run forever and your interpreter would eventually give up. Having a base case, then, is a vital element in all recursive programming and thinking, and it’s always wise to start writing recursive procedures with a base case to ensure you don’t miss it.
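For comparison with the Scheme procedure, here is a minimal recursive version in Python. The name factorial_recursive is ours, chosen to avoid a clash with the looping version shown in the boxout, and the sketch assumes it is only ever given whole numbers of 1 or more.

```python
def factorial_recursive(n):
    """Recursive factorial for whole numbers n >= 1."""
    if n == 1:
        # Base case: there is only one way to arrange a single item.
        return 1
    # Recursive case: n multiplied by the factorial of the next number down.
    return n * factorial_recursive(n - 1)


print(factorial_recursive(3))  # 6
```

Writing the base case first, as suggested above, makes it hard to forget. Without it, Python would keep calling the function until it hit its recursion limit and raised a RecursionError.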
Pairs The other way recursion is put to work in programming is in creating data representations. To see how this works in Scheme, we must first introduce one of its fundamental data structures: pairs. These are Scheme’s data building blocks. Remember what we said at the start of the last article: programming is all about data, and specifically manipulating that data. So far, we’ve seen how to write procedures that can
Dreams within dreams, mirrors within mirrors…
manipulate simple numbers, but it’s rare that real world data is this simple. For instance, we might want to write procedures that can manipulate information about a CD, including title, artist and year of production; equally, we might want to write a procedure that controls the pixels on our monitor, including their colour and position. In the pixel example, we could obviously treat each piece of information separately, but it’s far more natural to understand the pixel as a single object that’s composed of the other, smaller bits of data. In Scheme, it is pairs that let us represent compound data. For instance, to represent a pixel on screen at position (3, 4), we could create a pair called point:
    > (define point (cons 3 4))
The procedure cons, short for construct, takes two arguments, which it compounds together in to a single object. The elements of that object, that is, the x and y coordinates, can be accessed through two more primitive procedures, car and cdr.
    > (car point)
    3
    > (cdr point)
    4
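If Scheme’s pairs feel unfamiliar, it may help to imitate them in Python. The sketch below is only an analogy, and it assumes a two-element tuple is a fair stand-in for a pair; the Python functions cons, car and cdr are our own and simply borrow the Scheme names.

```python
def cons(a, b):
    """Combine two values in to one object, here a two-element tuple."""
    return (a, b)


def car(pair):
    """Return the first element of a pair."""
    return pair[0]


def cdr(pair):
    """Return the second element of a pair."""
    return pair[1]


point = cons(3, 4)
print(car(point))  # 3
print(cdr(point))  # 4
```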
Calculating distance Once you’ve created a new compound object with cons, you can pass it around just as you would any primitive object. To see how this works, consider using cons, car and cdr to write a small procedure that calculates the distance a point on a graph is from the origin, that is the point (0, 0). To do this you need to:
1. Add the squares of the x and y coordinates together
2. Take the square root of the resulting number
As ever, this translates quite literally in to Scheme. First, create the square procedure:
    (define (square x) (* x x))
Then you can create the distance procedure, using the primitive sqrt:
    (define (distance point)
      (sqrt (+ (square (car point))
               (square (cdr point)))))
To use this, you can then create a new compound structure, as we did above, and pass it to distance, where car and cdr do the work of accessing the individual coordinates. Now, cons, car and cdr all have their names for historical reasons, and what they do is hardly obvious. You can make your programs more readable, however, by wrapping them in other procedures. For instance, the following procedures give more obvious names, given the context of our distance program, that will make it easier to read:
    (define (make-point x y) (cons x y))
    (define (get-x point) (car point))
    (define (get-y point) (cdr point))
“In Scheme, it’s pairs that let us represent compound data.”
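The same distance calculation can be sketched in Python. This is an illustration rather than a translation of any particular Scheme implementation: it assumes a tuple stands in for the pair, and the helper names make_point, get_x and get_y mirror the readable wrappers suggested in the text.

```python
import math


def make_point(x, y):
    """Build a point; a tuple plays the role of the Scheme pair."""
    return (x, y)


def get_x(point):
    return point[0]


def get_y(point):
    return point[1]


def square(x):
    return x * x


def distance(point):
    """Distance of `point` from the origin (0, 0)."""
    return math.sqrt(square(get_x(point)) + square(get_y(point)))


print(distance(make_point(3, 4)))  # 5.0
```

Because distance only ever touches the point through get_x and get_y, the representation of a point could later change without distance needing to be rewritten.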
Pairs within pairs This is quite useful, but what if you wanted to represent a compound object with more than two elements? The answer is that cons can be used to combine any kind of object, even other pairs:
    > (define x (cons (cons 1 2) (cons 3 4)))
    > (car x)
    (1 . 2)
    > (cdr x)
    (3 . 4)
    > (car (car x))
    1
    > (car (cdr x))
    3
This time, you can see that the car and cdr of x both point to another pair, where calling car and cdr once again will
Scope In the iterative version of factorial that we demonstrated, you may have noticed that we defined a procedure inside another procedure. What was going on here? The first thing you need to know is that a program is constructed of a number of ‘environments’. These environments provide a mapping between names, including primitive or user-defined ones, and the values and procedures they specify. The environment can then provide a context for procedures to be evaluated in. For example:
    > (define x 10)
    > (define y 7)
    > (+ x y)
    17
The define procedure adds the names x and y to the environment, associating them with their respective values. When the addition is performed, the interpreter looks up the procedure associated with the symbol +, and
then substitutes the values of x and y in to its body for evaluation, all according to the values stored in the environment.
Recursive environments One problem with the environment model is that there can only be a single occurrence of each name in an environment, since otherwise the interpreter wouldn’t know which value or procedure you meant to refer to when you used it. The trouble is, within a single program you might want to refer to different kinds of points in different circumstances, but you’d be unable to re-use the make-point, get-x and get-y procedures, since they’d already been used. To get around this, Scheme has many environments arranged in a hierarchy. At the top of the hierarchy is the global environment, which we saw above. Below this, however, each procedure gets its own unique environment, with
its own set of names and values. So when we do (define (make-point x y) we’re creating a new environment. All the assignments made in this local environment are then invisible to the outside world. Nothing from the global environment can see inside. So if you define a global procedure get-x and a local one, they won’t conflict. It’s worth noting that local values always take precedence over global ones, so if you do define two values with the same name, the local value gets evaluated. If there’s no local value supplied, however, the interpreter will look to whichever environment surrounds it for a value to use. It’s often a good idea, then, to define helper procedures, such as fact-iter, inside the local environment rather than the global one. This decreases the chance of creating names that conflict, and helps make the close association between two procedures clearer.
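Python follows broadly similar scoping rules, so the idea can be tried out there too. In this sketch, fact_iter is defined inside fact just as in the Scheme boxout, and the function shadow is a hypothetical example added only to show a local name taking precedence over a global one.

```python
x = 10  # a name in the global environment


def fact(n):
    # fact_iter lives in fact's local environment, so it cannot
    # clash with any other helper of the same name elsewhere.
    def fact_iter(count, result):
        # n is not local to fact_iter, so Python looks it up in the
        # surrounding environment, that of fact.
        if count > n:
            return result
        return fact_iter(count + 1, count * result)

    return fact_iter(1, 1)


def shadow():
    x = 7  # a local x, which takes precedence over the global one
    return x


print(fact(5))   # 120
print(shadow())  # 7
print(x)         # 10, the global value is untouched
```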
reach the base values. As you can see, when using car and cdr as we did before, you get access to the pairs that are held in those positions. By calling car or cdr again, you then get access to the primitive data that’s held inside those pairs. Dreams within dreams, mirrors within mirrors, pairs within pairs – this, as you might have guessed, is a recursive data structure. It is a pair’s ability to combine other pairs that makes it such a powerful building block, allowing us to build all kinds of fancy data structures that we can use to represent the real world with. For instance, pairs can be used to represent sequences:
    > (define li (cons 1 (cons 2 (cons 3 '()))))
    > li
    (1 2 3)
Here, the car of each pair is an item in the list and the cdr is the next pair in the chain. The final pair’s cdr is a special symbol, '(), that is used by Scheme to represent an empty list. This is, however, such a common structure in Scheme that there is some syntactic sugar to make this easier to code:
    > (define l (list 1 2 3))
This will create exactly the same list, only called l and without all the confusing cons. The elements of a list can be accessed with successive cars and cdrs:
    > (car l)
    1
    > (car (cdr l))
    2
    > (cdr l)
    (2 3)
The third example here is the most important. Notice that the cdr of the list is the list minus the first element. We’re going to put this to use in a moment.
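To see that a list really is nothing more than a chain of pairs, here is a rough Python imitation. It assumes tuples stand in for pairs and None stands in for the empty list '(); the helper make_list is hypothetical and plays the part of Scheme’s list.

```python
EMPTY = None  # stands in for Scheme's empty list, '()


def cons(a, b):
    return (a, b)


def car(pair):
    return pair[0]


def cdr(pair):
    return pair[1]


def make_list(*items):
    """Build a chain of pairs ending in EMPTY, much as Scheme's list does."""
    result = EMPTY
    for item in reversed(items):
        result = cons(item, result)
    return result


by_hand = cons(1, cons(2, cons(3, EMPTY)))
lst = make_list(1, 2, 3)

print(by_hand == lst)  # True
print(car(cdr(lst)))   # 2
print(cdr(lst))        # (2, (3, None))
```

The last line shows the nesting directly: the cdr of the list is itself a list, just one element shorter.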
[Figure: box-and-pointer diagrams of the pair x and of a list, each box divided in to car and cdr.] Finally, you can see that a sequence is easily represented by a chain of pairs, each with its cdr pointing to the next pair in the sequence, and its car to the value of the current element. The final cdr has a strike through it, representing '(), the empty list.
Recursing a list Lists, being a recursive data structure, are naturally dealt with by recursive procedures. Imagine writing a procedure to find the length of a list. That sounds quite complicated, but if you recognise that the empty list, at the end of the list, has length 0, then you’ve got a base case that can be used for making a simple recursive procedure to do the hard work for you.
[Figure: organisation chart. Labels: You; Advertising, Database marketing, Photography, Magazines; Advertising managers, Magazine editors; Magazine staff.] By asking a few people, recursion enables you to find out what many people had for breakfast. The same technique can be used to solve difficult programming problems.
    (define (list-length list)
      (cond ((null? list) 0)
            (else (+ 1 (list-length (cdr list))))))
In this example, we’ve used the primitive null? predicate that checks to see if we’re looking at an empty list. Besides that, the procedure is almost identical to the one we used to calculate factorials, the only differences being that rather than passing on a smaller number, we use cdr to pass on a smaller segment of the overall list, and the base case is the empty list rather than 1. As another example, consider the task of returning the nth element of a list. You could do it manually with a lot of cdrs, but that would look terrible and be very error prone. It’s quite easy to solve as a recursive procedure, though:
    (define (list-get list n)
      (cond ((= n 0) (car list))
            (else (list-get (cdr list) (- n 1)))))
Once again, the structure of the procedure is exactly the same as we’ve seen before; the only things we’re changing are the base case and what we pass on to the next iteration of the procedure. In this example, we’ve had to construct an external base case – that is, n, the number of the element we want to return from the list, which we reduce as we step along the list, counting the number of elements we’ve already passed. Once this counter reaches 0, we return the current element, not the rest of the list (hence we use car). That’s all we’ve got space for in this tutorial. In the next and final article of this chapter, we’ll look at map, filter, enumerate and accumulate, four procedures that are key to functional programming, and which can even shed some light on how to get the best out of Bash.
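Both procedures carry over to Python almost word for word. The sketch below assumes ordinary Python lists, with items[0] playing the role of car and the slice items[1:] playing the role of cdr; the function names are our own. Slicing copies the list each time, so this is for illustration rather than efficiency.

```python
def list_length(items):
    """Count the elements of a list recursively."""
    if not items:
        # Base case: the empty list has length 0.
        return 0
    # Recursive case: one for this element, plus the length of the rest.
    return 1 + list_length(items[1:])


def list_get(items, n):
    """Return the nth element (counting from 0) of a list recursively."""
    if n == 0:
        # Base case: the counter has run down, so return this element.
        return items[0]
    # Recursive case: step along the list and reduce the counter.
    return list_get(items[1:], n - 1)


print(list_length([1, 2, 3]))         # 3
print(list_get(["a", "b", "c"], 2))   # c
```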
Scheme: High-order procedures Jonathan Roberts explains how to take your Scheme skills to the next level with an advanced technique for building procedures that can multi-task.
In the previous tutorial, we introduced Scheme’s universal data building block, pairs, along with the idea of recursion. This month, we’re going to look at a more advanced idea that is central to functional programming: higher-order procedures. Just as we saw with giving names to variables and functions in part one of this series, or with the use of pairs which let us pass related bits of data around a program together, higher-order procedures are another technique for managing complexity in programs. Instead of giving names or combining lumps of data together, however, higher-order procedures let us build procedures that are capable of doing more than one thing, based on their parameters. In most programming languages, the ability to construct higher-order procedures comes from their treatment of functions as first-class objects – that is to say, like any other primitive, they can be passed as arguments to other functions and returned as values from them, too. To see how this works, how it relates to higher-order procedures and to see why it’s useful, read on.
“We’re going to look at an idea that is central to functional programming.”
Abstract procedures Before we look at higher-order procedures, let’s start by looking at the simpler idea of abstract procedures. Imagine that you run an online shopping business and your users can construct wish lists of items they would like to buy. In Scheme, a typical wish list might look like this:
    (define wish-list-a (list 'lxf-subscription '1984 'car))
    (define wish-list-b (list 'laptop 'holiday 'wallet))
One thing you might like to do is find out whether a particular wish list contains a certain item – maybe you want to use this information to find users with similar desires, and make recommendations based on their peers’ wish lists. In Scheme, you could write a simple recursive procedure to do this:
    (define (car-in-wish-list wish-list)
      (cond ((null? wish-list) #f)
            ((eq? (car wish-list) 'car) #t)
            (else (car-in-wish-list (cdr wish-list)))))
This example tests to see whether a given wish list contains the symbol 'car: if it does, it returns true (#t); otherwise, it returns false (#f). As with all our work on lists last time, this procedure takes a recursive approach to solving the problem. It looks very much like the procedures we saw in Part 2: There’s a base case at the start of the function, ensuring that we don’t get stuck in a never-ending loop. In this instance, our base case is the empty list, as found by the primitive null? predicate. We then do some work on the current element of the list, this time checking to see whether it meets a certain criterion – that is, whether it’s the symbol 'car. Finally, we apply the same process to the remaining elements in the list. As clever as this is, it’s actually not that useful. What happens if you want to check to see whether a wish list contains the symbol lxf-subscription or laptop? You’d have
Lazy evaluation One very cool functional programming technique that we haven’t had time to cover in detail is the idea of lazy evaluation. The idea is that, if you assign the result of an expression to a variable, the value of that variable doesn’t matter until you make use of it. As such, evaluation of the expression can be delayed until later in the application or, depending on the expression, it can be partially evaluated, doing only those calculations that are strictly necessary at the current point of the program.
As well as reducing execution time, this can also reduce the amount of memory used when expressions are applied to very large sets of data. A good example of lazy evaluation would be a random number generator. You wouldn’t want to use one that could only return 10, 100 or maybe 1,000 random numbers, but rather you want one capable of creating an infinite set of random numbers. If you were to build an infinitely large set of random numbers right from the start, however, you’d use an infinite amount of memory
and time to generate it. Instead, you can use lazy or deferred evaluation to only generate one random number at a time. In Scheme, lazy evaluation can be achieved by encapsulating variables within a scope and with the delay procedure. Other languages provide support for lazy evaluation as well, however; in particular, Python’s iterator data structure represents this idea in that language, and in Python 3, a common example is the range function.
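The random number example can be sketched with a Python generator. The function random_numbers and its seed parameter are our own invention for illustration; random.Random and itertools.islice are part of Python’s standard library.

```python
import itertools
import random


def random_numbers(seed=None):
    """Yield an endless stream of random floats, one at a time."""
    rng = random.Random(seed)
    while True:
        # Nothing is computed until the caller asks for the next value.
        yield rng.random()


stream = random_numbers(seed=42)

# Take just the first three values; the rest are never generated.
first_three = list(itertools.islice(stream, 3))
print(len(first_three))  # 3
```

Although the loop inside random_numbers never ends, the program does, because each number is only produced when something asks for it.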
to write new procedures for each and every symbol you want to check for. This is bad because repetition leads to mistakes; it’s also bad because it makes programming, normally a fun and challenging hobby, into a boring one! To fix this, you can create a more abstract, or more general, version of the procedure. For example, consider what the differences would have been between car-in-wish-list and laptop-in-wish-list: the only thing to change between the two would be the symbol 'car in the second line, which would become 'laptop. Recognising this, you can easily create a more general version by adding a new parameter to the procedure: the symbol to search for in the list.
    (define (in-wish-list symbol wish-list) ...
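The Scheme body is left for you to fill in, but the idea of turning the searched-for item in to a parameter can be illustrated in Python. This is only one possible sketch, under the assumption that wish lists are plain Python lists of strings; the names in_wish_list, wish_list_a and wish_list_b are ours.

```python
def in_wish_list(symbol, wish_list):
    """Report whether `symbol` appears anywhere in `wish_list`."""
    if not wish_list:
        # Base case: nothing left to check.
        return False
    if wish_list[0] == symbol:
        # The current element is the one being searched for.
        return True
    # Otherwise apply the same process to the remaining elements.
    return in_wish_list(symbol, wish_list[1:])


wish_list_a = ["lxf-subscription", "1984", "car"]
wish_list_b = ["laptop", "holiday", "wallet"]

print(in_wish_list("car", wish_list_a))     # True
print(in_wish_list("car", wish_list_b))     # False
print(in_wish_list("laptop", wish_list_b))  # True
```

One function now answers every question of this kind, so there is no need for a separate version per item.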
[Figure: flow diagram for car-in-wish-list. Steps: wish-list is null? (yes: return false); (car wish-list) equals car? (yes: return true; no: repeat with cdr wish-list).] This flow diagram shows how the recursive procedure car-in-wish-list goes about checking to see whether a list contains a given item.
Functional programming Many of the examples we’ve looked at over the last few months have involved mathematical problems. These kinds of problems suit pure functional programming well, since most mathematical functions don’t involve side-effects. What’s more, mathematical examples have allowed us to focus on patterns and techniques, as opposed to libraries for interacting with more complex data types, such as files, images or web pages, since numbers are built straight in to most programming languages. Because of this focus, you might have come to the conclusion that while functional programming is an interesting novelty, it’s not much use in the real world. You would be wrong! The recursive programming model is well suited to working with files and directories, while list or stream processing can easily be applied to text file processing or network programming. For a good introduction to the application of functional programming to real-world problems, there are two good free books. The first is Text Processing with Python (available at http://gnosis.cx/TPiP/) and uses Python’s functional features to do increasingly complex text processing. This might not sound that interesting, but all the configuration files on Linux systems are made of text, and so are web pages and many of the protocols that power networks. Being able to effectively do text processing with functional techniques immediately opens up a whole world of possibilities. The other book is called Real World Haskell (http://book.realworldhaskell.org). Haskell is a functional programming language that’s increasing in popularity. Its syntax isn’t as simple as Scheme’s, but it has a range of libraries that let you do anything from systems to GUI to database programming using techniques that should now be familiar to you.
A higher-order procedure This was an easy example, but in many situations featuring repetition, the same techniques work: look at examples of the similar procedures, note what changes between them, and then abstract these out of the body of the procedure and in to the parameters. In fact, exactly the same technique can be applied when creating higher-order procedures. Consider the following procedure, which extracts all the even numbers from a list and returns them as a new list:
    (define (ev? list)
      (cond ((null? list) '())
            ((= (remainder (car list) 2) 0)
             (cons (car list) (ev? (cdr list))))
            (else (ev? (cdr list)))))
Again, in itself this is quite a clever procedure, but it could be made more general or abstract. What would happen if you wanted to create a new procedure that would extract all the odd numbers from a list, or all the numbers that are divisible by 3? You’d once again find yourself writing three almost identical procedures: first checking for the empty list, then checking to see if the current item in the list matches your criteria, whether even, odd or divisible by 3, and then moving on to the next item if it doesn’t match. As before, you can create a more abstract version of this procedure by looking for the differences in similar functions. Unlike in our last example, this time the difference actually comes in the predicate used to check whether the current item meets our requirements, which seems altogether more difficult to abstract than a simple data variable. The thing is, it’s really no more difficult, since in Scheme predicates are just procedures, and procedures are first class objects and can be passed in to other procedures just like any other. This means that we can abstract the above procedure in exactly the same way as before, by abstracting the differences in to parameters.
    (define (filter list pred)
      (cond ((null? list) '())
            ((pred (car list))
             (cons (car list) (filter (cdr list) pred)))
            (else (filter (cdr list) pred))))
As you can see, we’ve given the procedure a new name, filter, as it more accurately represents what this new, more abstract version does. We’ve also created a new parameter called pred, a place holder for the predicate that’s used to do the work of the function – to determine whether or not a given element meets our criteria. With this in hand, we can easily find all even numbers, or all odd numbers, or any other category of number, simply by writing a new predicate to do the work, without nearly so much duplication:
    (define (ev? list)
      (define (z x)
        (= (remainder x 2) 0))
      (filter list z))
    (define (threes list)
      (define (z x)
        (= (remainder x 3) 0))
      (filter list z))
Filter is one example of a higher-order procedure. It’s also one of three procedures designed to operate on lists, which when combined together can be made to do an amazing number of things for very little effort on your part. Let’s look at these other procedures now.
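Here is how the filter idea might look in Python, where functions can also be passed around like any other value. The sketch keeps the article’s argument order, list first and predicate second, and uses the name filter_list so as not to hide Python’s own built-in filter; the two predicates are hypothetical examples.

```python
def filter_list(items, pred):
    """Return a new list of the elements of `items` for which pred is true."""
    if not items:
        # Base case: the empty list filters to the empty list.
        return []
    if pred(items[0]):
        # Keep this element, then filter the rest.
        return [items[0]] + filter_list(items[1:], pred)
    # Skip this element and filter the rest.
    return filter_list(items[1:], pred)


def is_even(x):
    return x % 2 == 0


def is_multiple_of_three(x):
    return x % 3 == 0


numbers = [1, 2, 3, 4, 5, 6]
print(filter_list(numbers, is_even))               # [2, 4, 6]
print(filter_list(numbers, is_multiple_of_three))  # [3, 6]
```

Only the small predicate changes between the two calls; all the list-walking logic is written once.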
Higher-order procedures The first of these other procedures we’re going to look at is map, another useful higher-order procedure for operating on lists, which lets us modify all the elements in a list.
    (define (map list proc)
      (cond ((null? list) '())
            (else (cons (proc (car list))
                        (map (cdr list) proc)))))
With this procedure, you could pass a list of numbers in and return a new list with the square of all the original list’s elements contained within, find the absolute value of all those items (that is, removing the sign from all the numbers), or anything else that you can imagine. We’ve shown you this example map procedure so you can get an idea of how it works, but Scheme actually has a built-in version of it that is more flexible and more powerful, capable of applying the given procedure to the elements of an arbitrary number of lists – it’s well worth investigating. The second of these is a bit more complicated – it’s called accumulate. It compresses an entire list to a single element. To see why this is useful, consider the case of summing all the values in the list. You could do this with an independent procedure, sum:
    (define (sum list)
      (cond ((null? list) 0)
            (else (+ (car list) (sum (cdr list))))))
But then the same problem we’ve seen throughout this article rears its head again: what happens if you want to find the product of all the elements in the list, for instance? Once again, you’d have to write a new procedure, product. While it’s a touch more tricky, with more variables to keep track of, the same technique of looking for differences between similar functions, such as sum and product, makes creating a higher-order procedure simple. On this occasion, the differences lie in the base case, and the procedure that’s applied to the current element of the list and the rest of it. In sum, for instance, the base case would be 0, as above, but in product it would be 1 (think about it, if you kept the base case as 0, you’d get 0 as your answer every time – anything multiplied by 0 is 0!). Thinking it through, you’ll find yourself with a procedure like the one below:
    (define (accumulate proc base list)
      (cond ((null? list) base)
            (else (proc (car list)
                        (accumulate proc base (cdr list))))))
After looking at all these procedures, you might be feeling a bit lost – they all look quite similar, and none of them seem incredibly powerful or useful alone, even if the idea of writing one procedure to do the work of many seems like a clever idea. Let’s take a look at an example to see how these higher-order procedures can let us express complex ideas in clear and simple ways. Our example task will be to find the sum of all the cubes of numbers which are multiples of three between any two numbers. If we were to write a procedure to accomplish this without the help of any higher-order procedures, it might look something like this, without all the helper procedures’ definitions:
    (define (sum-cubes a b)
      (cond ((> a b) 0)
            ((= (remainder a 3) 0)
             (+ (cube a) (sum-cubes (+ a 1) b)))
            (else (sum-cubes (+ a 1) b))))
After everything else we’ve seen over the last three instalments, with a bit of careful thought you can probably get
“After looking at all these procedures, you might be feeling lost.”
Functional programming in Python As we hope you’ve seen, Scheme is a powerful little language with a simple syntax that makes it easy to pick up and do very clever things with. That said, it doesn’t have the same level of adoption and library support that many other popular languages enjoy. If the functional techniques demonstrated by Scheme appeal to you, however, then you may want to investigate applying some of them in a mainstream language, such as Python or Perl. In Python, for example, there are many built-in functions provided explicitly for functional programming, including map, filter, reduce (aka accumulate, found in the functools module in Python 3) and enumerate. There are also many useful related features, including generators, list comprehensions and iterators. In fact, things like iterators are deeply integrated in to the language, as common objects such as open files are themselves iterators. If you’re interested in delving deeper into functional programming in Python, we’d recommend the functional programming HOWTO on the Python website as a good starting point (http://bit.ly/TRyWMF). For applying the same ideas in Perl, another very popular language, there’s an entire book dedicated to the subject, called Higher Order Perl (http://hop.perl.plover.com/) and it’s an excellent read. If this chapter has piqued your interest in functional programming, a book like Higher Order Perl can show how these techniques work in a more familiar language.
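As a taste of Python’s own functional tools, here is one way the sum-of-cubes task from this tutorial might be written with the built-in filter and map, functools.reduce and range. The function name sum_cubes and the choice to include both end points are our assumptions.

```python
from functools import reduce
from operator import add


def sum_cubes(a, b):
    """Sum the cubes of the multiples of three between a and b inclusive."""
    # range plays the part of enumerate in the Scheme version.
    numbers = range(a, b + 1)
    # In Python 3, filter and map are lazy: nothing is computed yet.
    multiples = filter(lambda x: x % 3 == 0, numbers)
    cubes = map(lambda x: x ** 3, multiples)
    # reduce is Python's accumulate; 0 is the base case for a sum.
    return reduce(add, cubes, 0)


print(sum_cubes(1, 10))  # 972, that is 27 + 216 + 729
```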
[Figure: box-and-pointer diagram of the list (1 2 3) passed through map cube to give the list (1 8 27).] The map procedure lets you transform the elements of one list in to another, making use of a helper procedure to do the work of transformation to the value of the current element. The final cdr has a strike through it, representing '(), the empty list.
your head around this, but it’s not the most readable piece of code you’ll ever see. One of the main reasons for this, despite the use of clear procedure names such as cube, is that the overall logic is muddled and obscured by details not relevant to this specific task. For instance, the first, fourth and fifth lines of this procedure are all involved in walking the procedure through the integers between a and b. Wouldn’t it be simpler if this part of the process could be kept to one small area of code, rather than spread through the rest of it? What’s more, if you came to this code without any guidance, you might find yourself tripped up by the remainder statement or by the two recursive calls – it’s not exactly clear what the purpose of any of these lines is. You may have noticed another significant problem with this version of the procedure: it’s not very re-usable, so if you ever wanted to complete a similar task, such as finding the sum of the cubes of multiples of four, you’d have to write an entirely new procedure, all the while watching for typos and other mistakes. And if there were any mistakes, because all the parts of the code are mixed together, it would be very difficult to debug. Fortunately, it doesn’t have to be this way: with the help of the higher-order procedures discussed in the rest of this article, you can come up with a much better solution.
    (define (sum-cubes-better a b)
      (accumulate + 0
        (map (filter (enumerate a b) multiple-three)
             cube)))
Note that map and filter are called here with the list first and the procedure second, matching the definitions given earlier in this article. The only thing that we’ve not seen so far in this version is the enumerate procedure at the very centre. This simply creates a list containing all the integers between a and b. It’s not as flexible as the other procedures, but the idea of a procedure that enumerates things is very useful. At first glance, and while you’re still unfamiliar with exactly what these different procedures do, this may not seem any easier to read. If you do find that to be the case, first, try to think about what each of these procedures does and not how they do it – if you try to do the latter, you’ll get caught in a maze of recursive calls that’s almost impossible to parse manually. Then, start from the innermost call and work outwards: The enumerate procedure creates a list of all the integers we’re interested in. Then the filter procedure creates a new list, based on the one generated by enumerate, that only contains multiples of three. The map procedure then transforms this list, cubing every element in it. Finally, the accumulate procedure sums all the elements contained. Seen like this, each step of the process is expressed distinctly – far more so than the previous example – and once you’re familiar with the job each of the higher-order procedures does, it’s much easier to read, too, since all the inner-workings of recursion and conditional testing have been hidden away. It’s also an easier procedure to write: since each step of the process is contained in an independent procedure, each part can be tested and debugged independently. Finally, all of these procedures can be re-used elsewhere, reducing mistakes from repetition and making programming more fun and less boring.
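To check your understanding of how the four pieces fit together, here is a hand-rolled Python sketch of the whole pipeline. The tutorial does not show definitions for enumerate, cube or multiple-three, so the versions below are our guesses at what they might do; the names enumerate_interval, filter_list and map_list are chosen to avoid hiding Python’s built-ins.

```python
def enumerate_interval(a, b):
    """List the integers from a to b inclusive (our guess at enumerate)."""
    if a > b:
        return []
    return [a] + enumerate_interval(a + 1, b)


def filter_list(items, pred):
    """Keep only the elements for which pred is true."""
    if not items:
        return []
    rest = filter_list(items[1:], pred)
    return [items[0]] + rest if pred(items[0]) else rest


def map_list(items, proc):
    """Apply proc to every element, returning a new list."""
    if not items:
        return []
    return [proc(items[0])] + map_list(items[1:], proc)


def accumulate(proc, base, items):
    """Combine all elements in to one value, starting from base."""
    if not items:
        return base
    return proc(items[0], accumulate(proc, base, items[1:]))


def cube(x):
    return x * x * x


def multiple_three(x):
    return x % 3 == 0


def sum_cubes_better(a, b):
    numbers = enumerate_interval(a, b)               # every integer
    multiples = filter_list(numbers, multiple_three) # only multiples of 3
    cubes = map_list(multiples, cube)                # each one cubed
    return accumulate(lambda x, y: x + y, 0, cubes)  # summed


print(sum_cubes_better(1, 10))  # 972
```

Each helper can be tried on its own in the interpreter, which is exactly the independent testing the article describes.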
PHP
PHP is one of the most popular programming languages around, and is the secret sauce that powers many millions of websites. Follow our series of tutorials and find out how you can get in on the act.
PHP: Write your first script
PHP: Build an online calendar
PHP: Extend your calendar
PHP: Get started with MySQL
PHP: Do more with MySQL
PHP: Write your first script
Using this open source technology and your Linux platform, Mike Mackay explores how to dive into the popular world of PHP programming.
PHP dates back to 1995 when its creator, Greenland-born programmer Rasmus Lerdorf, began work on a scripting toolset that was originally known as Personal Home Page (PHP). The sudden demand for the toolset spurred Rasmus to further develop the language and, in 1997, version 2.0 was released with a number of enhancements and improvements from programmers worldwide. The version 2.0 release was hugely popular and spurred a team of core developers to join Rasmus in developing the language even further. Version 3.0, released in 1998, saw a rewrite of the parsing engine, and in the same year it was estimated that more than 50,000 users were using PHP on their web pages. This version also saw the name change that we know now – PHP: Hypertext Preprocessor. Fast-forward a year to 1999 and, with an estimated base of more than one million users, PHP was rapidly becoming one of the most popular languages in the world. Development continued at a frenzied pace, with hundreds of functions being added. It was at this time that two core developers, Zeev Suraski and Andi Gutmans, decided to rethink the way that PHP operated, and so the parser was once again rewritten, dubbed the Zend scripting engine, and released in version 4.0.
A few months after version 4.0 was released, Netcraft estimated that PHP had been installed on more than 3.6 million domains. Version 4.0 represented a massive leap forward at an enterprise and programming level, but the language still had some drawbacks, mainly due to its infancy. Version 5.0 was released to the world in 2004, and with it came a myriad of improvements that took the language to a peak of both maturity and installations: it’s thought that PHP is running on more than 20 million domains, and it’s reported to be the most popular Apache module, available on almost 54% of all Apache installations. At the time of writing, version 6.0 is nearing public release and is intended to further improve the functionality and maturity of the language. With websites such as Wikipedia, Facebook, Flickr and Digg all making use of it, it’s no wonder that PHP has become so widely adopted amongst web developers. Let’s see just how easy it is to get started with this dynamic, server-side scripting language.
Setup and installation
Most of the latest distributions of Linux come with PHP, so this tutorial assumes that it’s already been installed and set up on your Linux platform and is being parsed correctly through your web server of choice. Although you can run PHP scripts via the command line, we’ll be using the browser (and therefore a web server) for this tutorial. You can follow along by uploading your PHP files to a web server on the internet (if you have one). We’re using a default installation of Apache2 on our local Linux machine, though, because we find it easier and quicker to write and test PHP on a local machine instead of having to upload files via FTP each time. If you require installation and/or setup instructions or guides for your local machine, we recommend reading the Installation on Unix Systems manual on the official PHP site, available at http://php.net/manual/en/install.unix.php. Alternatively, there are hundreds of installation guides written for pretty much every flavour of Linux. Simply search Google for your distribution if the official guide doesn’t tick all the boxes.
The PHP website might not be the most eye-catching in the world, but it’ll be a site you return to time and time again.
Getting started
Now we get to the fun part – working with, and writing, our first PHP script. Historically, we’d write a basic “Hello World” PHP script, but that can be, well, a little boring. Instead we’ll write some dynamic text output using PHP’s date() function. Before we get into the real nitty gritty of the language, we must first understand how the interpreter reads in our PHP code and generates the necessary output.
Easy embedding
One of the advantages of PHP is that you can embed your code directly into your static HTML pages; the entire page is then sent through the interpreter. It’s extremely important to note that all of your PHP files must end in the .php extension. Embedding your PHP code in HTML or HTM files means they won’t be run through the interpreter and won’t get executed – instead, you’ll just see the plain text code in your pages. For your PHP code to be extracted from the rest of your content, it must be enclosed within delimiters. PHP will execute any code found within these delimiters – anything else is passed straight through to the output untouched. The default, and most common, delimiters that we use are <?php to indicate the start of our code and ?> to signal the end. There are a few other options available for delimiters, such as ‘short tags’, but some of these can clash with XML and XHTML. For the purpose of this tutorial, we’re going to stick with the recommended default. With this in mind, open up your favourite text editor and write the following:
    <?php
    echo 'Welcome to the world of PHP';
    ?>
Save this file as welcome.php in your web server’s root folder – that is, the folder that your web server reads when you request the site in your browser. Once the file has been saved, open your web browser and point it to the file on your local web server. For me this is http://127.0.0.1/welcome.php – this URL may differ based on your Linux setup/configuration. When run, you should simply see ‘Welcome to the world of PHP’ displayed in your browser. If, instead, you see the raw PHP code, this means that you haven’t set up your web server to interpret your PHP files correctly. Go back, or find the relevant installation guide, and make sure you’ve followed all the steps outlined. If you successfully see the text without any PHP code, then we’re ready to move on.
Why choose PHP?
PHP has long been a popular choice for web developers. This is not only because it has a massive userbase, and therefore support for developing (and debugging) your code is widely and freely available, but also because web servers or hosts with PHP installed and ready to use are ten-a-penny these days. The relative ease of the language has been one of the reasons for its massive uptake. PHP can also be very ‘forgiving’ when it comes to programming. For instance, you don’t have to declare your variables (or their type) before using or instantiating them, and there are a couple of other ways in which you can bypass traditional programming methods. The easiest way to understand why PHP is so popular is to simply dive in and get started. If you’ve already worked with another programming language you’ll soon see just how easy PHP is to get to grips with. If this is your very first language, you’ll be pleasantly surprised at just how quickly you can get usable results.
Syntax, data types and functions
You’ll notice that we ended our code with a semi-colon before the closing delimiter. PHP uses a semi-colon to indicate the end of a line of code, or statement – without it PHP wouldn’t know when to stop evaluating our code, in turn breaking the script. PHP is very forgiving when it comes to formatting – it will ignore any white space and new lines (except when they’re contained inside string quotes), allowing you to be pretty free over how you format (indent etc) your code. PHP supports many data types, giving us enormous flexibility when writing our programs. To quote Wikipedia: “In computing, a data type is a classification identifying one of various types of data, such as floating-point, integer or Boolean.” PHP supports all of these data types and more, including strings and compound data types such as Arrays and Objects. In the standard distribution of PHP, there are more than 1,000 functions available to use. These range from simple things such as date & time functions (which we’re using in this tutorial) to more advanced concepts such as LDAP and MySQL database functionality. For anything that’s missing (or something you want to improve) you can simply ‘roll your own’ function to give additional support. We won’t go into too much detail about functions here as we’ll be focusing on the basics and keeping things simple.
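To make the data types mentioned above a little more concrete, here is a minimal sketch that asks PHP what type it thinks a handful of values are, using the built-in gettype() function. The variable names and sample values are invented purely for illustration.

```php
<?php
// Hypothetical sample values, one for each data type named in the text.
$a_string  = 'Welcome to the world of PHP';
$an_int    = 29;
$a_float   = 3.14;
$a_boolean = true;
$an_array  = array('Red', 'Orange');
$an_object = new stdClass();

// gettype() returns PHP's own name for the type of a value.
$type_string  = gettype($a_string);
$type_int     = gettype($an_int);
$type_float   = gettype($a_float);   // PHP reports floating-point numbers as "double"
$type_boolean = gettype($a_boolean);
$type_array   = gettype($an_array);
$type_object  = gettype($an_object);

echo $type_string . ', ' . $type_int . ', ' . $type_float . ', '
   . $type_boolean . ', ' . $type_array . ', ' . $type_object . "\n";
```

Saved as a .php file and run, this should print one type name per value. The one surprise is that floating-point numbers are reported as double, for historical reasons.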
Flexible scripts
In our first example, we simply instructed PHP to output a specific string of text by using the echo statement. The value of our string can come from many places – a database, the output of a function, a file on the server or even from user interaction on our site. By hard-coding this value we’re pretty much stuck with what the string value can be. Instead we’ll now assign the value to a variable, so open up a new file in your text editor, enter the code below and save it as welcome-var.php:
    <?php
    $display_text = 'Welcome to the world of PHP';
    echo $display_text;
    ?>
When you run this script you shouldn’t see any difference in output from the first file, but in the code you’ll notice a new line starting with $display_text. This is a variable. Variables in PHP can hold a single piece of data at any one time. This data can change and can be of any type at any point.
Quick tip Use a text editor that has syntax highlighting for PHP – it’ll help you quickly identify your code and specific parts, or functions, inside it. There are free programs available too so have a look around and go with the one you prefer the look of.
Believe it or not we used to write all our PHP code in Notepad, then we realised how much nicer life is in colour. Text editors are heaven to coding eyes.
You can clearly see the rapid growth in PHP usage and the number of installations is still on the rise.
Variables begin with a $ followed by the variable name, in this case, display_text. A variable name can only begin with a letter or an underscore, but the rest of the name can consist of any letters, underscores or numbers. An important note to be aware of is that variables are case-sensitive, meaning that $Display_text is different (and a separate variable) from $display_text. In our script above, we’re declaring and assigning the value simultaneously. Some languages do not allow this, but PHP is very flexible when it comes to programming. Value assignment is simply the process of copying a value to the assigned variable, such as:
    $display_text = 'Welcome to the world of PHP';
    $my_age = 29;
Typically, you would declare your variable before assigning a value, but given the nature and context of this tutorial it’s acceptable to do the above. The next line in our script simply changes from echoing the hard-coded string to echoing the value of the variable (that we assigned in the previous line). Although they essentially do the same job of outputting the string value, this method allows us to be truly flexible with the output we see in the browser.
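The two rules that most often catch newcomers out, case-sensitivity and the fact that a variable can change type, are easy to check for yourself. The following is only a sketch: the second string and the re-assignment of $my_age are made up for the demonstration.

```php
<?php
$display_text = 'Welcome to the world of PHP';
$Display_text = 'A completely separate variable';   // note the capital D

// The two names refer to different variables, so the values differ.
$are_same = ($display_text === $Display_text);

// A variable is not tied to one type: here an integer is replaced by a string.
$my_age = 29;
$was_integer = is_int($my_age);
$my_age = 'twenty-nine';
$is_string_now = is_string($my_age);

echo $display_text . "\n";
echo $Display_text . "\n";
echo $my_age . "\n";
```

The helper functions is_int() and is_string() are built into PHP and simply report whether a value currently has that type.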
Quick tip
Where possible, make use of indenting as it will make things ‘flow’ better and help you read the code on the page. Some text editors auto-indent for you but most developers use either a single tab or two to four spaces.
Time gentlemen, please
So far, so good... but static text is pretty boring. Let’s do something about that and add the date and time into the mix. For this, we’re going to make use of two things – the first is the date() function and the second is the concatenate operator. To concatenate means to combine two or more ‘things’ together to form one single entity; in this script we’re going to combine a welcome message with the date and time. To do this, open up a new file in your text editor and enter the following code. Once you’ve done that save it as welcome-date.php:
    <?php
    $display_text = 'Welcome to the world of PHP. It is ';
    echo $display_text . date('l, jS F') . ', and the time is ' . date('H:iA');
    ?>
When run, you should see the line of text with the current date and time embedded. It’s important to note that the date and time comes from the server and NOT your browser, as it would from a JavaScript program (or similar browser-based script). The first line of our code is almost identical to the previous script; all we’ve changed here is the copy to reflect the more dynamic nature of the output. The second line is where the magic (also known as concatenation) happens. Essentially, and in non-computer talk, all the second line is saying is “print the display text, followed by the date, then some more text, and finally append the time.”
At this point you might be wondering how we’ve specified the date that gets printed. That is all to do with the parameters that we supply to the date function. PHP’s date function accepts input parameters in order to represent exactly what value/string is returned from the function. It currently accepts 35+ date parameters, each one representing a unique ‘piece’ of date and/or time. In our example we’ve split the date and time into two different date() calls – it would be perfectly acceptable to merge them into one. The more eagle-eyed amongst you will notice that I’ve hard-coded the text in the middle of the date functions. Again, this could be assigned to a variable instead (as we did for the initial text), for greater flexibility, in which case the line might look something like:
    echo $display_text . date('l, jS F') . $secondary_text . date('H:iA');
For a full list of date input parameters, check out the date() function page of the official PHP docs: http://php.net/manual/en/function.date.php.
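Because date() normally uses the current time, its output changes every time you run it, which makes it awkward to check your format string. As a sketch, the version below passes a fixed timestamp as date()’s optional second argument and pins the timezone to UTC, so the result is always the same. The timestamp 0 and the $merged variable are assumptions made for this illustration, and the time format is simplified to H:i.

```php
<?php
// Pin the timezone so the result does not depend on the server's settings.
date_default_timezone_set('UTC');

// Assumed fixed timestamp: 0 is midnight on 1 January 1970 (UTC).
$timestamp = 0;

$display_text   = 'Welcome to the world of PHP. It is ';
$secondary_text = ', and the time is ';

// Two date() calls joined to the text with the concatenation operator (.).
$message = $display_text . date('l, jS F', $timestamp)
         . $secondary_text . date('H:i', $timestamp);

// The same information from a single date() call; the backslashes stop
// the letters "a" and "t" being treated as format characters.
$merged = date('l, jS F \a\t H:i', $timestamp);

echo $message . "\n";
echo $merged . "\n";
```

Remove the second argument from each date() call and the script goes back to reporting the server’s current date and time.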
Essential resources
There are plenty of books dedicated to learning PHP and it’s often hard to tell which one(s) to buy to steer you in the right direction. While I can’t help you choose the book that suits you the most, I can point you in the general direction of great ‘companion’ websites:
http://php.net The ultimate resource for anything PHP related.
http://php.net/manual/en/intro-whatcando.php A taste of things that can be done with PHP.
http://phpsec.org/ A great resource for anything security related in PHP.
Despite owning a few PHP books, I often find myself heading over to the official PHP documentation online – it’s often quicker than picking up a book and looking for the right page. For some inspiration, check out the second link – let your mind wander and think of something you’d love to build! The last link is equally important if you plan on installing PHP on a public-facing web server. Install the script, as it gives you some recommended base settings, then read up on general security practices.
PHP 6 – what’s it all about?
So what can we expect in version 6? One of the core updates will be better support for Unicode strings, allowing for a much broader set of available characters to cover greater international support. For the more advanced developers, it’s bringing in better support for Namespaces. With the massive take up of Web 2.0 functionality, version 6 is also giving default support for the SOAP protocol, and the library of XML features (for both reading and writing) is being overhauled. A handful of features are being dropped from the core language; these include magic_quotes, register_globals, register_long_arrays and safe_mode. The main reason is security related – some of them allowed for potential security holes to be exposed while others lead to poor programming practice. You can already download a developer version of PHP 6 to try out, but at the time of writing there’s no official public release date. Once it’s been released, you can expect a good wait before it’s available on public servers, as most companies let it have a good run to iron out any bugs before installing it.
Put it all together
We mentioned earlier that one of PHP’s great selling points is the ability to embed snippets of code into static HTML documents with ease. This becomes apparent when we want to create ‘dynamic’ sections inside an otherwise static page, for example, our date script above. We could easily ‘drop’ this code into our existing template (if we have one). Let’s see how that might look:
    <html>
    ...
    <body>
    <div id="welcome-text">
    <?php
    $display_text = 'Welcome to the world of PHP. It is ';
    echo $display_text . date('l, jS F') . ', and the time is ' . date('H:iA');
    ?>
    </div>
    ...
    </body>
    </html>
In this case, I’ve pasted some trimmed and rudimentary HTML code and you can clearly see where I’ve embedded the PHP code to output my dynamic text on the page. The PHP code block can sit anywhere on the page and any number of times inside a page – don’t be concerned if you have five, 10 or sometimes more code blocks within your HTML. Something that’s really handy is that internally, PHP will ‘communicate’ between each code block on your page. For example, if you set the value of a variable in the first code block at the top of your page, it will be available to the last code block at the bottom of your page. This can work wonders when you’re altering the display of the content based on the value of a variable elsewhere in the page – a really common use of this is a Login/Logout system where a user is presented with the Login or Logout options based on their logged in ‘state’.
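The way code blocks share variables can be sketched with a stripped-down version of that Login/Logout idea. Everything here is hypothetical: the $logged_in flag is simply hard-coded (a real site would work it out from a session), and the output buffering calls are only there so the finished page ends up in $page where it can be inspected.

```php
<?php
// Collect the page in a buffer instead of sending it straight out.
ob_start();

// First code block: a hard-coded, made-up login state.
$logged_in = true;
?>
<html>
<body>
<div id="nav">
<?php
// Second code block: $logged_in from the first block is still available.
if ($logged_in) {
    echo 'Logout';
} else {
    echo 'Login';
}
?>
</div>
</body>
</html>
<?php
// Third code block: fetch the buffered page and print it.
$page = ob_get_clean();
echo $page;
```

Change $logged_in to false in the first block and the second block, further down the page, prints Login instead.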
Don’t forget, when embedding PHP in your pages like this, that you must save your files with the .php extension, otherwise your PHP code will fail to be executed and you’ll be left with plain text code on your page.
Multiple updates
If you find that you’re adding the same block of code to multiple pages and you need to change it, going through each page and updating your code can be a treacherous job. Thankfully, PHP has got you covered. To help with this process, we can use the include() function. This allows us to write our PHP to a file (just as we have in our welcome-var.php file). Instead of embedding the full code in our HTML each time, we can instead do:
    <?php include('welcome-var.php'); ?>
When the page is run, PHP will pick up the include() request and will read in and execute the code in that file, simply embedding the output – it works as if the code was directly on the page.
A note of caution – the filename specified in the include() function is relative to the script that’s calling it. In other words, if your main HTML is located in the root folder and your welcome-var.php file is in a folder called scripts, your PHP code would look like this instead:
    <?php include('scripts/welcome-var.php'); ?>
While it’s not quite rocket science, we’ve actually covered some pretty decent fundamentals about PHP in this tutorial. We have learned a little bit about how PHP got started and just how much it’s grown. You’ve been introduced to some basic, but core, programming skills and we’ve covered the basic syntax. By now you’re hopefully beginning to understand a little of PHP’s potential, and be ready to tackle the next part of this chapter.
Over to you...
We’ve written our first script and now have first-hand experience of how easy it is to make use of PHP on a website. From here, why not play around further with the date example, try changing the input parameters to something different or even try embedding this code into an existing site. Take a few minutes to have a look through the official PHP documentation online, www.php.net/manual/en, and see what other functions PHP has to offer – you’ll be surprised how much you can achieve with just the standard installation. It’s worth bookmarking that URL, as the more you use PHP the more you’ll use the website as a reference manual – and a great one at that.
Facebook loves PHP so much, it even wrote its own Facebook Optimised version called HipHop.
Build an online calendar
Continuing from his introductory tutorial, Mike Mackay explores arrays and functions to build a basic events calendar for our website.
In the last tutorial, we covered the basics of PHP, including how the language was created and subsequently grew. We were also introduced to various parts of the language, such as variables, strings, integers and PHP’s internal date() function. In this tutorial, we’ll expand on those parts, but we’ll also introduce the concept of arrays and functions to make a fully working calendar. We’ll assume that you have your Linux platform configured and serving PHP pages through your web server, as outlined in the previous tutorial. If not, please refer to the previous article or the section titled Installing PHP on Linux. So what exactly is an array? Well, to help us define this, let’s go back to the last tutorial, where we made use of a variable ($display_text) to hold a simple string message. The problem with variables is that they can hold only one piece of information at any one time. Wouldn’t it be great if we could store multiple items inside a variable? Well, this is where arrays come in.
Introducing arrays
The best way to think of an array is as a special variable that holds other variables. An array allows you to hold as many items inside it as necessary (the only limitation on the size of the array is based on how much memory PHP is allocated). You can step through all the items inside an array (known as traversing), and PHP comes with more than 70 functions, allowing you to perform certain actions on your array, such as searching inside, counting the number of items inside it, removing duplicate items, and even reversing the order. There’s almost nothing to it when creating an array either:
    $data = array();
We have now created a new, empty array called $data. Arrays are structured using a key index and value data architecture. By default, when you add an item to an empty array, that item’s position in the array is 0. If you add another item, that item’s position becomes 1 in the array. You can also create your array with pre-populated data (if you already know what’s going to be in it). To do this, we just create the array as before, but this time we supply the data in a comma-separated list:
    $data = array('Red', 'Orange', 'Yellow', 'Green', 'Blue');
This is where the key system comes into play. The way that PHP interprets this array will be as follows:
    0 = 'Red', 1 = 'Orange', 2 = 'Yellow', 3 = 'Green', 4 = 'Blue'
As you can see, each key is associated with a value in the array. The most important part to remember is that arrays always start at key 0 and not key 1 as many might assume; it’s easy at first to forget.
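A few of the built-in array functions mentioned above (counting, searching, reversing and removing duplicates) can be tried out on the colours array. This is a minimal sketch; the $with_duplicates array is invented for the example.

```php
<?php
$data = array('Red', 'Orange', 'Yellow', 'Green', 'Blue');

// Keys start at 0, so the first item lives at $data[0].
$first = $data[0];

// Counting the number of items in the array.
$total = count($data);

// Searching: array_search() returns the key of the first match.
$green_key = array_search('Green', $data);

// Reversing the order gives a new array; $data itself is unchanged.
$reversed = array_reverse($data);

// Removing duplicates from a hypothetical array with repeated values.
$with_duplicates = array('Red', 'Red', 'Blue');
$unique = array_values(array_unique($with_duplicates));

echo $first . ' is first of ' . $total . " colours\n";
echo 'Green is at key ' . $green_key . "\n";
echo 'Reversed, the first colour is ' . $reversed[0] . "\n";
echo implode(', ', $unique) . "\n";
```

array_unique() keeps the original keys of the items that survive, so array_values() is used here to renumber the result from 0 again.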
Despite the many books that have been written on the language, the PHP website is still the most up-to-date and comprehensive reference manual available.
Associative arrays
Arrays also have the flexibility of allowing us to specify our own keys (arrays built this way are known as associative arrays). This helps a lot when you want to store a value against a specific key instead of having to rely on automatic indexes. Let’s say we wanted to store data about a person in an array; to do that we can do the following:
    $person = array('name' => 'Mike Mackay', 'location' => 'Essex', 'age' => 29);
By using the associate instruction (=>), we’re telling PHP that we want to create a key called name and store the value Mike Mackay against it. You can store any data type in an array – even other arrays. The way that PHP interprets our person array is how we would expect:
    'name' = 'Mike Mackay', 'location' = 'Essex', 'age' = 29
When you want to use an array item, all you have to do is call the array and key you want:
    echo $data[0];
This echoes out the word Red to the screen. To echo out the word Orange, you would simply change the key from 0 to 1. On our person array it’s just as simple – to echo the name to the screen, all I need to do is:
    echo $person['name'];
Add data to your array
If we have our existing array, but want to add more data to it, how do we accomplish that? There are a few ways in which we can do this and often it depends largely on whether your array has custom key indexes or not; but to add another item to our $data array we can simply do:
    $data[] = 'Indigo';
By supplying square brackets next to the array name, PHP recognises this action as wanting to push a value into the array. PHP has a built-in function that does the same trick:
    array_push($data, 'Indigo');
This function takes a minimum of two arguments. The first is the array you want to push the data to, then any items afterwards are pushed to the end of the array. This conveniently allows you to push multiple items into the array at once, for example:
    array_push($data, 'Indigo', 'Violet');
If you need only to push one item, the first method (using the square brackets) is recommended, as it avoids the overhead of calling a function. If we wanted to add another item to our $person array, we need only to specify the key we wish to use, along with the required value:
    $person['profession'] = 'Developer';
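Putting those techniques side by side, the sketch below adds Indigo with square brackets, Violet with array_push(), and a profession to the person array, then checks where everything ended up. The $new_total variable is an addition for this example; it holds the value array_push() returns, which is the new number of items in the array.

```php
<?php
$data = array('Red', 'Orange', 'Yellow', 'Green', 'Blue');

// Square brackets push one item onto the end (it becomes key 5).
$data[] = 'Indigo';

// array_push() does the same job and returns the new item count.
$new_total = array_push($data, 'Violet');

$person = array('name' => 'Mike Mackay', 'location' => 'Essex', 'age' => 29);

// For an associative array, name the key you want to add.
$person['profession'] = 'Developer';

echo $data[5] . ' and ' . $data[6] . ' added, ' . $new_total . " colours in total\n";
echo implode(', ', array_keys($person)) . "\n";
```

The built-in array_keys() function is used at the end to list the keys now present in $person.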
Installing PHP on Linux
Most of the latest distributions of Linux come with PHP. Although you can run PHP scripts from the command line, we’ll be using the web browser (and therefore web server) for this tutorial. You can follow this tutorial by uploading your PHP files to a web server on the internet (if you have one). For me, though, I’m using a default installation of Apache 2 on my local Linux machine. I find it easier and quicker to write and test PHP on my local machine instead of having to upload files via FTP each time. If you require installation and/or setup instructions or guides for your local machine, I recommend reading through the Installation on Unix systems manual found on the official PHP site, available at the following address: http://php.net/manual/en/install.unix.php. Alternatively, there are hundreds of installation guides written for pretty much each flavour of Linux. Simply search Google for your distribution if the official PHP guide doesn’t tick all of your boxes.
Arrays within arrays
As I mentioned before, an array can hold any type of item inside – this includes other arrays. The practice of multiple arrays is quite common, and you’ll find it extremely useful. Again, there are a couple of different ways of achieving this, and the one you use will be based on your array structure. For this example, let’s say we have an array of McLaren F1 racing drivers. Open up a text editor, enter the following PHP code and save it as drivers.php in your web root:
    <?php
    $drivers[] = array(
        'name' => 'Jenson Button',
        'nationality' => 'British',
        'championships' => 1,
    );
    $drivers[] = array(
        'name' => 'Lewis Hamilton',
        'nationality' => 'British',
        'championships' => 1,
    );
    ?>
We’re using the square brackets to instruct PHP that we wish to push the driver data to the end of the array. Each item inside the master array() must be separated by a comma. Our $drivers array now contains two items – these items are arrays of data containing driver information that we wish to display. In PHP’s eyes, the data for Jenson Button is located in $drivers[0], while the data for Lewis Hamilton is located in $drivers[1]. We could have created custom keys instead of using 0 and 1, but it’s not strictly worthwhile for this example. We could simply display the data using echo and then specifying the array index key (such as $drivers[0]), but how would we display each item when we don’t know how long the array is? Thankfully for us, there’s a simple control structure called foreach() that allows us to do this. So you might be asking why we wouldn’t know the length of an array? Well, we know what driver data is contained in each item (name, nationality and championships), but a database query (or similar function) might return one driver, or it might return five drivers. We could get the total number of items in the array by using a PHP function such as count(), but using foreach() is simpler and lets us write shorter code. The foreach() control gives us an easy way to iterate over an array. Using drivers.php that we’ve just created, copy the following code just below the PHP block that contains the drivers array:
    <?php
    foreach($drivers as $driver) {
        echo 'Name: ' . $driver['name'] . '<br />';
        echo 'Nationality: ' . $driver['nationality'] . '<br />';
        echo 'World Championships: ' . $driver['championships'] . '<br /><br />';
    }
    ?>
When you view the file in your browser, you should see a basic list of drivers on your screen:
    Name: Jenson Button
    Nationality: British
    World Championships: 1

    Name: Lewis Hamilton
    Nationality: British
    World Championships: 1
Quick tip Use a text editor that has syntax highlighting for PHP – it’ll help you quickly identify your code and specific parts, or functions, inside it. There are free programs available too, so have a look around and go with the one you prefer the look of.
Using a code editor that has a built-in syntax checker (such as Eclipse, the winner of our IDEs Roundup in LXF152) can save a lot of time and frustration!
The 2012 F1 calendar we’ll be recreating with our code.
Quick tip
Where possible, make use of indenting, as it will make things flow better and will help you read the code on the page. Some text editors auto-indent for you, but most developers use either a single tab or 2–4 spaces.
On each loop of $drivers, the value of the current item is assigned to $driver (the first loop being Jenson Button), and the internal array pointer is moved on by one; so on the next loop you’ll get the next item from the array (Lewis Hamilton). This continues throughout each item in the array until the end is met. The foreach() construct requires two things – the first is the array we want to loop through. We then use the PHP keyword as, followed by a temporary variable name that we want to assign the loop item to (this variable is intended for use inside the loop, although PHP does leave it holding the last item once the loop has finished). In literal terms, we’re saying: loop through each item in the $drivers array and store each driver item to a temporary array called $driver. Hopefully, in the example you’ll recognise a few parts from the last tutorial; we’re echoing out a string concatenated with a variable – this being each bit of information about the driver in the array. We then concatenate another string, which is HTML, allowing us to format the output in a basic manner. On the last array item, $driver['championships'], we echo out two line breaks; this just gives us a bit of separation between each driver.
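foreach() can also hand you the key of each item as well as its value, which is handy for numbering a list. The sketch below is a variation on the drivers loop rather than part of the calendar code: the $index and $lines names are made up, and the results are collected into an array before being printed.

```php
<?php
$drivers = array(
    array('name' => 'Jenson Button', 'nationality' => 'British', 'championships' => 1),
    array('name' => 'Lewis Hamilton', 'nationality' => 'British', 'championships' => 1),
);

$lines = array();
foreach ($drivers as $index => $driver) {
    // $index is the numeric key (0, then 1); $driver is the inner array.
    $lines[] = ($index + 1) . '. ' . $driver['name'] . ' (' . $driver['nationality'] . ')';
}

echo implode("\n", $lines) . "\n";

// After the loop, $driver still holds the last item that was visited.
echo 'Last driver seen: ' . $driver['name'] . "\n";
```

The loop body runs once per driver however many there are, so the same code copes with one driver or five.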
Let’s talk about functions
There are two types of functions in PHP:
1 Built-in PHP functions, such as date() and array_push().
2 User-defined functions.
We’ll be focusing on the second type for now (we’ve already covered a few built-in PHP functions). A user-defined function is a special block of PHP code that we write that can perform custom operations any time it’s called. Some functions are written to manipulate data and then send that new value back, while others perform one-way operations, such as writing data to a file or inserting the data into a database. To create a function, all we need to do is write the word function followed by the name we wish to call our function (it’s important to note that function names can only start with letters or underscores), followed by parentheses and then a pair of curly braces:
    function shout() { }
We can also supply data, known as arguments, to our function to be used inside it. When calling this function and sending information to it, the function assigns this data to the internal variable called $text, where it can manipulate it, or do as required. This data is then held locally to the function and does not overwrite any variables outside of it:
function shout($text) { }
If we want the newly modified data back, we can use return to send it back:
function shout($text) {
    return $text;
}
To call a function, all you need to do is write the function name followed by parentheses, with or without arguments inside them (based on the function’s requirements):
shout('Hello');
We have our basic function, but all it does is send back exactly what we sent to it – pretty pointless I’m sure you’ll agree. Let’s make our function do something a bit more interesting. Create a new PHP file, copy the following code into it and save it as function.php:
<?php
echo shout('Hello World');

function shout($text) {
    return strtoupper($text);
}
?>
If you run that in your browser, you should see ‘HELLO WORLD’ being displayed. What’s happening is we’re sending a string straight to the function and echoing out the value it returns. The built-in PHP function strtoupper() has a simple purpose – take the string input and convert it to uppercase. We could modify the function to perform the echo inside instead of using return, but our original method gives us greater flexibility for multi-purpose use (we may not always want to echo a value out immediately). We could write any kind of code inside our function, and we’re not limited to doing string transformations.
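To illustrate the flexibility that return gives us, here is a small sketch using the same shout() function. The variable names $greeting and $length are made up for the example.

```php
<?php
function shout($text) {
    return strtoupper($text);
}

// Because shout() returns its result, the caller decides what happens next
echo shout('Hello World');          // print it straight away
$greeting = shout('hello') . '!';   // or store it and build on it
$length = strlen(shout('abc'));     // or pass it on to another function
```

Had shout() echoed the text itself, the second and third uses would not be possible without rewriting the function.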
If() and else()
You’ll notice something else with our code… we’ll be performing a conditional check using if and else. If/else provides a simple way of deciding which code to run, based on the outcome of a particular check, or condition. The if block is only executed when the condition inside the parentheses equates to (or returns) TRUE; otherwise the else block is run instead. In each case, just the code between that block’s curly braces is executed, and only one of the two blocks will ever be run:
if (condition is true) {
    // Run the code in this block
} else {
    // Run the code in this block instead
}
In literal terms, all we’re going to say is: if today’s date is found as an index in the array, then echo out the race data, otherwise echo the no races message instead. Let’s now create a basic events calendar using our arrays and function knowledge.

Essential PHP resources
There are plenty of books dedicated to learning PHP, and it’s often hard to tell which one(s) to buy to steer you in the right direction. While I can’t help you choose the book that suits you most, I can point you in the general direction of great companion websites:
http://php.net The ultimate resource for anything PHP related.
http://php.net/manual/en/intro-whatcando.php A taste of things that can be done with PHP.
http://phpsec.org A fantastic resource for any security related to PHP.
Despite owning a few PHP books, I often find myself heading over to the official PHP documentation online – it’s often quicker than picking up a book and looking for the right page. For some inspiration, check out the second link – let your mind wander and think of something that you would love to build! The last link is equally important if you plan on installing PHP on a public-facing web server. Install their script, as it gives you some recommended base settings to use, then read up on general security practices.

From the previous tutorial…
In case you missed it, here are some of the basics from what we covered in the last tutorial: In the standard distribution of PHP, there are more than 1,000 functions available to use. These range from simple things, such as date and time functions, through to more advanced concepts, such as LDAP and MySQL database functionality.
All PHP code (usually) starts with the <?php delimiter and ends with ?>. PHP supports many data types, such as strings, booleans, integers, arrays, objects and more. Variables in PHP can hold a single piece of data at any one time. Variables begin with $ followed by the variable name, which must start with a letter or an underscore.
Put it all together
The rest of the name can consist of any letters, underscores or numbers, and variable names are case-sensitive. The built-in date() function accepts more than 35 format characters that control exactly what is returned, and it uses the server’s clock, not the browser’s. PHP can be run as a standalone script, or as part of an existing template, using include().
We don’t want anything over the top, so for now we’ll create an F1 2012 race calendar. We’ll get today’s date inside a function and echo out the current race if one is happening on that day, otherwise we’ll echo out a generic message. Start off by creating a new PHP file called calendar.php, then create a function that initially holds an array of the race dates:
<?php
function is_race_day() {
    $race_dates = array(
        '18/3/2012' => array('title' => 'Australian Grand Prix', 'location' => 'Melbourne'),
        '25/3/2012' => array('title' => 'Malaysia Grand Prix', 'location' => 'Kuala Lumpur'),
        '15/4/2012' => array('title' => 'Chinese Grand Prix', 'location' => 'Shanghai'),
        '22/4/2012' => array('title' => 'Bahrain Grand Prix', 'location' => 'Sakhir'),
        '13/5/2012' => array('title' => 'Spanish Grand Prix', 'location' => 'Catalunya'),
    );
}
?>
I’ve included just the first five dates of the season for now, but feel free to add more. Next, let’s update the function. Get today’s date (see the first tutorial on the date() function) and check whether it has a Grand Prix or not; for this we’ll use a built-in function, array_key_exists().

“Get today’s date and check whether it has a Grand Prix or not.”

Write the following just below the end of the array and before the function’s closing curly brace:
    $date = date('j/n/Y');
    if (array_key_exists($date, $race_dates)) {
        echo "Today's race is the " . $race_dates[$date]['title'] . " in " . $race_dates[$date]['location'] . ".";
    } else {
        echo "There is no race today.";
    }
The format string 'j/n/Y' asks date() for the day and month without leading zeros, followed by a four-digit year, so the result is written in exactly the same way as our array keys (13/5/2012, for example). We use the value of $date inside the array_key_exists() function – this function accepts two parameters: the first is the key you’re looking for (in our case it’s the date) and the second is the array you wish to check against (our $race_dates array). The array_key_exists() function returns a boolean of TRUE if the key exists, or FALSE if it doesn’t. If the race is found, we’re going to echo out the race details. We can retrieve this data because we know the key exists, therefore we use the $date variable as a shortcut to retrieve the information. Essentially, it’s the same as writing:
echo $race_dates['13/5/2012']['title'];
All that’s left for us to do is to call the function. We do this in exactly the same way as our earlier function script and put the function name (with parentheses) at the top of our script:
is_race_day();
We can then run this script in our browsers by going to calendar.php, or we can use include() to embed it on an existing website. If a race exists on the day the script is run, we’ll be presented with the details. To test this, simply hardcode a date in the $date variable:
$date = '13/5/2012';
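The whole calendar check can be sketched in one place as follows. This is a variation on the tutorial's script rather than a copy of it: the function name race_message() is hypothetical, it takes the date as an argument and returns the message (instead of echoing it) so that it's easy to try with different dates, and the array is cut down to two races.

```php
<?php
// Hypothetical variant of is_race_day() that accepts the date to look up
function race_message($date) {
    $race_dates = array(
        '18/3/2012' => array('title' => 'Australian Grand Prix', 'location' => 'Melbourne'),
        '13/5/2012' => array('title' => 'Spanish Grand Prix', 'location' => 'Catalunya'),
    );

    // The key only matches if the date is written exactly like the array keys
    if (array_key_exists($date, $race_dates)) {
        return "Today's race is the " . $race_dates[$date]['title']
             . ' in ' . $race_dates[$date]['location'] . '.';
    }
    return 'There is no race today.';
}

// 'j/n/Y' gives day/month without leading zeros, e.g. 13/5/2012
echo race_message(date('j/n/Y'));
```

Calling race_message('13/5/2012') directly is a quick way to check the output without waiting for race day.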
And with that, we’re done! This might all feel like quite a lot to take in at once if you’re new to arrays and functions, but hopefully you’ll see that what we’ve learnt here is extremely important and intrinsic to programming. Try modifying the calendar; you could be more specific with your dates and have something on every day of a month if you wish. As an exercise, look at the date() function and alter the calendar to echo messages based on the hour of the day. Remember, you won’t need an entry for every day – only every hour.
PHP
Extend your calendar
Following his second tutorial, Mike Mackay explains how to add the functionality to dynamically select races and view details.
In the last tutorial, we covered the basics of using arrays, control statements (if/else) and functions – both custom and native to PHP. We put this together to create an F1 season calendar. In this tutorial, we’re altering the functionality of the calendar by allowing the user to select a specific race from an HTML drop-down list. We’ll assume that you have your Linux platform configured and serving PHP pages through your web browser, as outlined in the first tutorial. If not, please refer to the previous article or the section titled Installing PHP on Linux.
Forms and security
Forms are a common occurrence in every developer’s career, yet some good practices are often overlooked – especially when you’re starting out in the world of PHP. On any kind of user input, we should always perform a certain amount of validation and filtering – because we can’t always be sure where the data came from and what it contains. We want to verify that the user has submitted only the data we’re looking for and nothing else. Validation is the process of making sure the input we receive is the input we’re expecting (correct format, type etc). Filtering, or sanitising, means cleaning the input of undesirable characters so that it’s safe to use. Without doing this, we open our PHP scripts up to possible code injection attacks. We’re going to use PHP’s built-in validation and filtering functions so that we can work safely when dealing with user input. Start by creating a blank PHP file (mine’s called races.php), and at the top make a PHP code block containing a single variable, $race_data, set to FALSE (we’ll come to using this variable a little later on):
<?php
$race_data = FALSE;
?>
Before we begin building our full script, we need a few things in place. First, create an array of race dates inside the code block after the $race_data line you’ve just created (refer back to the previous article if you’ve forgotten all about arrays). You don’t have to use F1 races – you’re free to use any kind of dates you want. For F1 race dates, please refer to www.formula1.com/races/calendar.html. My array looks like this (for display purposes, I’ve truncated most of my array data):
$races = array(
    'Australia' => array('title' => 'Australian Grand Prix', 'location' => 'Melbourne', 'date' => '18/3/2012'),
    'Malaysia' => array('title' => 'Malaysia Grand Prix', 'location' => 'Kuala Lumpur', 'date' => '25/3/2012'),
    'China' => array('title' => 'Chinese Grand Prix', 'location' => 'Shanghai', 'date' => '15/4/2012'),
    'Bahrain' => array('title' => 'Bahrain Grand Prix', 'location' => 'Sakhir', 'date' => '22/4/2012'),
    'Spain' => array('title' => 'Spanish Grand Prix', 'location' => 'Catalunya', 'date' => '13/5/2012'),
    ...
You may notice that the array differs from last time – I’m now using a location instead of a date as the array key. We’re going to allow the user to specify the location of the race, so that they can view more details – for this we need to change the array key.
Add the HTML
The official Formula One website (www.formula1.com) is the best resource, with all the latest news, reviews and race dates available.
The next thing we need is to add the HTML code and form that lets the user choose the race. We’re going to assume that you have some HTML knowledge, as covering HTML is outside the scope of this article. Create a basic page structure (head, body etc) directly below (and outside) the PHP code block. The form is basic and consists of one drop-down menu and a Submit button. To populate the locations into the drop-down menu dynamically, we’re going to use PHP’s foreach() loop. We covered this previously, but to summarise, it allows us to simply loop, or iterate, through an array and access each array item’s data. In your HTML body area, add the following form:
<form method="post">
  <fieldset>
    <label for="location">Choose a race:</label>
    <select name="location" id="location">
      <?php foreach($races as $location => $race): ?>
      <option value="<?php echo $location; ?>"><?php echo $location; ?></option>
      <?php endforeach; ?>
    </select>
    <input type="submit" name="submit" value="View" />
  </fieldset>
</form>
View the script in your browser, and you should be presented with all races in the drop-down. As you can see, a minimal amount of code gives us a dynamically generated drop-down of the items inside our array. I’ve called my drop-down ‘location’ (the matching id is there so that the label’s for attribute points at it), but you can choose whatever name you want. After the drop-down, I’ve added a Submit button, so that once the user has chosen a location, they can then submit the form. You may notice that I’ve set the form method to post. You can use get if you wish, but to keep the URL tidy, I like to use post for my forms. If you want to know the difference between POST and GET, refer to the information box titled GET v POST in short.
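The same drop-down can also be built as a string in plain PHP, which makes it easier to see what the loop produces. This is only a sketch: race_options() is a hypothetical helper, and the htmlspecialchars() call is an addition of mine (not part of the tutorial's form) that keeps awkward characters in a key from breaking the HTML.

```php
<?php
// Hypothetical helper: returns one <option> per array key
function race_options(array $races) {
    $html = '';
    foreach ($races as $location => $race) {
        // Escape the key before placing it inside HTML (an added precaution)
        $value = htmlspecialchars($location, ENT_QUOTES, 'UTF-8');
        $html .= '<option value="' . $value . '">' . $value . "</option>\n";
    }
    return $html;
}

$races = array(
    'Australia' => array('title' => 'Australian Grand Prix'),
    'Spain' => array('title' => 'Spanish Grand Prix'),
);

echo race_options($races);
```

For keys such as ‘Spain’ the escaping changes nothing; it only matters if a key ever contains characters like quotes or angle brackets.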
Checking for user data
By omitting the action attribute, the form will post to itself. How do we know when the form has been posted? By default, when GET or POST data is sent to a script, PHP converts that data to an array ($_GET or $_POST), using the field names as array keys – this is extremely helpful, as we can easily check for the presence of a specific form field by using the isset() function. Add the following code right beneath the races array:
if (isset($_POST['location'])) check_for_race($_POST['location']);
Here, we’re checking if the $_POST data is available and has the array key location inside it. If you changed your select name from location then you should change the field name in the $_POST array accordingly. If the field is present (meaning the form was submitted), we’re going to call the function check_for_race() with the location field sent to it as an argument. We haven’t built that function yet, so let’s do so now.
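To see the isset() check on its own, outside of a real form submission, here is a small sketch. The chosen_location() helper and the two sample arrays are assumptions standing in for $_POST before and after the form is submitted.

```php
<?php
// Hypothetical helper: takes the submitted fields as a plain array
function chosen_location(array $post) {
    // isset() is TRUE only if the key exists and its value is not NULL
    if (isset($post['location'])) {
        return $post['location'];
    }
    return null;
}

// Before the form is submitted there are no fields at all
$before_submit = chosen_location(array());

// After submission, each field name becomes an array key
$after_submit = chosen_location(array('location' => 'Spain', 'submit' => 'View'));
```

In the real script you would pass $_POST itself, which PHP fills in for you on every request.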
Gentlemen, start your engines
Installing PHP on Linux Most of the latest distributions of Linux come with PHP. Although you can run PHP scripts on the command line, we’ll be using the web browser (and therefore web server) for this tutorial. You can follow this tutorial by uploading your PHP files to a web server on the internet (if you have one). For me, though, I’m using a default installation of Apache 2 on my local Linux machine. I find it easier, and quicker, to write and test PHP on my local machine instead of having to upload files via FTP
each time. If you require installation and/or setup instructions, or guides for your local machine, I recommend reading through the Installation on Unix Systems manual found on the official PHP site, available at the following URL: http://php.net/manual/en/install.unix.php Alternatively, there are hundreds of installation guides written for pretty much every flavour of Linux. Simply search on Google for your distribution if the official PHP guide doesn’t tick all of your boxes.
We have our array of races, we have our HTML page and form; all we need to do now is validate the user data and check for the race details. When the race is found in the array, we will display the details to the user, otherwise we’ll show an error message. Copy the following PHP code just below the code snippet (if(isset($_POST...) added above:
function check_for_race($location) {
    global $races, $race_data;
    $location = filter_input(INPUT_POST, 'location', FILTER_SANITIZE_STRING, FILTER_FLAG_STRIP_LOW);
    if (isset($races[$location]))
        $race_data = $races[$location];
    else
        $race_data = 'No matching races have been found.';
    return;
}
Before we do anything, we’re declaring our function, check_for_race(). This function houses the crux of our validation and assignment code. The first line (starting with global) is new to us. Because we defined the races array outside of the function, we have to tell PHP that we want our function to be able to access it. By writing global, followed by the name(s) of any existing arrays and/or variables, we’re giving our function scope to access the data. Scope can be difficult to understand – if you’re new to it, check out the information box titled Scope within PHP for a more detailed explanation. The next line is where PHP’s internal validation and filtering routine comes in. PHP (as of 5.2+) comes with built-in functions for checking specific input. You can verify that input is in a certain format, within a range, or conforms to a specific pattern. The filter_input() function allows you to validate integers, floats, email addresses and URLs, or supply your own regular expression pattern to test against. You can also perform a few sanitising functions on data, too – this is the part we’re most interested in for this tutorial. You’ll notice that we’ve supplied four arguments to the filter_input() function. The first one tells PHP that we want to deal with data that’s coming from the POST input. Following on, we then specify the field name that we want to sanitise, location. Don’t forget, if you changed the field name earlier to something other than location, you’ll need to amend this line accordingly.

Quick tip Always comment your code throughout. While at the time everything makes sense to you, will it do so if you have to return to your code a few months later? Commenting will help you make sense of those complicated functions that you’ve written.
The official PHP website is the best place online to learn about the other filtering and validation options available natively to PHP.
Quick tip Use a text editor that has syntax highlighting for PHP – it’ll help you quickly identify your code and specific parts, or functions, inside it. There are free programs available too, so have a look around and go with the one you prefer the look of.
The next two arguments are the really interesting ones. The first, FILTER_SANITIZE_STRING, tells PHP that we’re expecting a string input, and that we want to remove any potentially unsafe characters; by default, PHP will remove any tags from the string for you. We then supply the FILTER_FLAG_STRIP_LOW argument, which tells PHP to remove any characters with an ASCII value below 32. There are six possible options we can use, ranging from encoding ampersands, not encoding quotes within strings and stripping or encoding low or high values. When PHP talks about low or high values, it’s referring to the ASCII table. Standard printable ASCII input (numbers, letters, ampersands) starts at #32 and ends at #126, with #127 being the delete control character. Anything below 32 is considered low (inputs such as line endings, tab spaces, null characters etc), and anything above 127 is considered high (inputs such as foreign accents, currency symbols, numeracy symbols etc). By specifying whether we want these removed or encoded, we can better control how PHP deals with the string. As our race names, or other data, could include foreign characters, we’ll leave these in – input such as tabs and line endings isn’t required, so we’re going to remove it altogether.
Validating and sanitising input data is extremely important when the data you need to work with is coming from the outside world. Not only is there the potential for a hacker to break your code, but users aren’t always predictable and can submit incorrect data by mistake. By checking the data before we use it, we’re reducing the possibility of bad things happening. This isn’t by any means the only precaution you should take when dealing with user input, but it’s a good start. Once we’ve validated and sanitised our input data (the value from the drop-down menu), we assign it back to the $location variable – we know now that this data is safe to work with. The next two lines deal with checking if the race exists, and what to do if it doesn’t.
Hopefully, you’ll recognise the isset() call from the previous tutorial – it’s also almost identical to how we’re checking for our input POST data.
Because our array keys are defined by the place names, it’s easy to check if the chosen race exists. The key will exist in our array if we have the data – we verify this by using the shortcut to an array item with the square brackets and using the isset() function to return a TRUE or FALSE value (boolean). If the function returns TRUE, we have array data, and so we assign the value of that array item to the $race_data variable. If the function returns FALSE, it means the value the user chose doesn’t exist – so we supply a string message back. You might be wondering how a user could choose a value that’s not in the array when we’re using the array itself to populate the drop-down menu in the HTML? Well, nine times out of ten this won’t be an issue, but this is in place as a preventative measure. Sometimes, URLs get broken, or people tamper with the HTML to see what they can do to sites – by only showing the array data if it’s there or an error message if it isn’t, we’re preventing our script from breaking and showing a PHP error message, or disclosing private information about our script or server. You may not think this is important right now, but if you’re using a database or have potentially private data in the script or on the server, the less you can reveal about your hardware, or script’s content, the better. For example, in one case, I saw a website break on a database connection, and the error message displayed to me contained the database username and password. Anyone with a mind to do so could use this information to get further into the database (or server) to get their hands on information they wouldn’t otherwise be able to access. You should always be proactive in your security measures, and not have the bad luck of needing to be reactive instead. After we’ve assigned a value to the $race_data variable, be it the race data array or an error message, we finish off the function with return.
This isn’t completely necessary, but it’s always good to return from a function when you’re done using it.
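The sanitise-then-look-up idea can be tried out without a form using the sketch below. It differs from the tutorial's function in three labelled ways: it uses filter_var() on a value passed in, rather than filter_input() on the POST data, so that it runs from any script; it uses the FILTER_UNSAFE_RAW filter with the same FILTER_FLAG_STRIP_LOW flag, because FILTER_SANITIZE_STRING is deprecated as of PHP 8.1; and it receives $races as an argument instead of using global. The name find_race() is hypothetical.

```php
<?php
$races = array(
    'Spain' => array('title' => 'Spanish Grand Prix', 'location' => 'Catalunya', 'date' => '13/5/2012'),
);

// Hypothetical helper: returns the race array, or an error string
function find_race($input, array $races) {
    // Tampered forms can send arrays instead of strings, so reject those first
    if (!is_string($input)) {
        return 'No matching races have been found.';
    }

    // Strip control characters (ASCII below 32); accented characters are kept
    $location = filter_var($input, FILTER_UNSAFE_RAW, FILTER_FLAG_STRIP_LOW);

    // Only keys that really exist in our own array are ever used
    if (isset($races[$location])) {
        return $races[$location];
    }
    return 'No matching races have been found.';
}

$found = find_race("Spain\n", $races);   // the stray line ending is stripped
$missing = find_race('Monaco', $races);  // not in our array
```

Because the result is only ever read from our own array, a tampered value can at worst produce the error message.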
“You should always be proactive in your security measures.”
GET v POST in short
The SecurePHP Wiki (www.securephpwiki.com) is a good site to visit for general security practices and tips related to PHP.
GET and POST are both request methods defined in the RFCs for the HTTP protocol, which set out how data is transferred between the client (your internet browser, for example) and a web server. When you type a URL into your browser, it sends a GET request (in plain text) to the server for the content you want to view. GET URLs can be fairly long and can provide copyable ‘deep’ links into websites, or search pages. Data (typically in key/value pairs) is sent as a part of the URL, and interpreted by the server. POST requests are similar to GET in that they, too, can be used to retrieve content. However, any data that’s sent along with the request (again, in key/value pairs) travels in the body of the request rather than in the URL, out of the user’s sight – it can’t easily be manipulated, viewed, or copied via the URL. POST is perfect for sending data to a site without cluttering the URL.
Show the user their data
Now that we have data, no matter what its contents are, we need a way of showing it to the user. We do this by embedding some additional PHP code into our HTML page, similar to how we embedded our drop-down menu code. Below the form, in the body area, add in the following:
<?php if(is_array($race_data)): ?>
<h3><?php echo $race_data['title']; ?> - <?php echo $race_data['date']; ?></h3>
<h4><?php echo $race_data['location']; ?></h4>
<?php elseif(is_string($race_data)): ?>
<h3><?php echo $race_data; ?></h3>
<?php endif; ?>
This may look a little confusing at first, but if you break it down line by line you should be able to work out what’s happening. Essentially, we’re checking our $race_data variable to see what sort of content it holds – if it contains an array, we know we have race data; if it contains just a string, we have an error message. PHP provides us with some great, simple tools to check the data type of a variable. The first line uses the function is_array(), with $race_data supplied to it as an argument. This function will return a TRUE or FALSE value. By checking the response, we know whether to show all the data fields that we know are contained inside it, or to show a standard string line. In our previous tutorial, we used only if and else; this is the first time we’ve come across elseif. The elseif control simply allows us to execute a certain block of code if an additional criterion is met – one that differs from the condition in the initial if block. In readable terms, we’re saying: if it’s an array, show this data, otherwise if it’s just a string, show this data instead. We’re not using else here, because $race_data holds an initial value of FALSE, set at the start of our script. If we used else, then we’d see an error message even if the form hadn’t been submitted – not really ideal.
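The three possible states of $race_data can be seen side by side in this sketch, which wraps the same if/elseif logic in a hypothetical render_race() function that returns the HTML rather than mixing it into a page.

```php
<?php
// Hypothetical helper mirroring the template logic above
function render_race($race_data) {
    if (is_array($race_data)) {
        // An array means we found a race
        return '<h3>' . $race_data['title'] . ' - ' . $race_data['date'] . '</h3>'
             . '<h4>' . $race_data['location'] . '</h4>';
    } elseif (is_string($race_data)) {
        // A string means we have an error message to show
        return '<h3>' . $race_data . '</h3>';
    }
    // FALSE (the initial value) means the form hasn't been submitted yet
    return '';
}

echo render_race(array('title' => 'Spanish Grand Prix', 'date' => '13/5/2012', 'location' => 'Catalunya'));
echo render_race('No matching races have been found.');
echo render_race(FALSE);
```

The final call prints nothing, which is exactly why the template uses elseif rather than a plain else.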
Scope within PHP
Scope, in general programming terms, refers to where inside your script a variable can be seen/accessed. In short, in PHP variables created outside a function cannot be seen inside it, and variables created inside a function (local variables) cannot be seen outside it. To give a function or method access to a variable that was created outside of it (as we did in our tutorial), we can declare the variable as ‘global’ – this term should be self-explanatory, but means that the variable from the main script is accessible anywhere within the function in which it’s declared as ‘global’. It’s important to remember that variables created inside functions and methods have local scope and aren’t accessible outside (unless declared global), whereas control structures, such as if() and while(), don’t create a scope of their own. Scope can initially be tricky to understand and keep in mind, but for more details I recommend reading the official PHP documentation page available at: http://php.net/manual/en/language.variables.scope.php
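The scope rules in this box can be checked with a few lines of PHP. All of the names below ($counter, $inside, $from_if and the three functions) are invented for the illustration, and the sketch assumes the code runs at the top level of a script, not inside another function.

```php
<?php
$counter = 10; // created at the top level of the script, outside any function

function read_without_global() {
    // $counter is not visible here, so isset() reports FALSE
    return isset($counter) ? $counter : 'not visible';
}

function read_with_global() {
    global $counter; // now refers to the variable created outside
    return $counter;
}

function make_local() {
    $inside = 'local'; // exists only while this function runs
}
make_local();

if (true) {
    $from_if = 'visible'; // control structures don't create their own scope
}

echo read_without_global(), "\n";
echo read_with_global(), "\n";
echo isset($inside) ? 'set' : 'not set', "\n";
echo $from_if, "\n";
```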
Wrapping it all up
With the output code in place, you should now have everything together to run an interactive F1 calendar. When the user first visits the page, they’re presented with a drop-down menu of race locations. By selecting a location and submitting the form, we’re dynamically showing the race data on the page to the user, and, if the user chooses any race that doesn’t exist (or if the form is tampered with), we display an error message instead – helpfully informing the user that something has gone wrong. Try it for yourself by visiting the script in your browser and then submitting the form with a location chosen. You are, of course, free to display the data differently, however you choose. Why not try spicing up the page with a sprinkling of CSS, or altering the date output to show weekday names or full month names? By making a few basic changes here and there, you can dramatically transform the look and feel of the calendar. Perhaps you have an existing template, or theme that you want to replicate? As mentioned before, you can choose any kind of events calendar, and aren’t confined to using F1 races. You could be creative and use this on your personal website for family occasions, or memorable events – you’re bound only by the limit of your imagination. All that you need to remember is to use a distinct ‘key’ as the array index, and then you’re free to store and display any amount of data you want. If you decide that you would prefer to store dates instead of the names for the array keys, and have those shown in the drop-down list instead, then you can simply replace them as required. It should be a straightforward task to alter the code, but if you’re unsure how to do this then please refer back to the previous article in this chapter, in which we created our first version of the calendar and used race dates for the array keys.

“There’s a plethora of ways to get data in to your website.”
Deeper into the dynamic web
I love comments in code; I describe most things in my scripts. If I have to return it’s easy to pick back up.
Working with, and displaying, data doesn’t stop at arrays. There’s a whole plethora of ways to get data into your website and/or application, and databases are one of, if not the, most popular ways of doing so. Dynamic and responsive websites play a massive part in today’s web as we know it, and you’ll soon begin to realise the potential of what you can build once you get to grips with it all. Read on to find out more.
Get started with MySQL
Mike Mackay puts the M in LAMP, and shows how easily we can use databases by writing our own live visitor counter.
In our last tutorial, we finished off our F1 season calendar by extending the functionality to allow a user to select a specific race from an HTML drop-down list. This time, we’re starting on something new – databases. We’re going to create a ‘live’ visitor counter for our website. An additional step is now required for this tutorial, and that’s adding MySQL into the server mix. You can download and install MySQL for Linux from http://dev.mysql.com/downloads/mysql – you should follow the links for the appropriate version of Linux you’re using. If in doubt, the MySQL online documentation has installation guidelines and can be found at http://dev.mysql.com/doc. We’ll assume that you have your Linux platform configured and serving PHP pages through your web browser, as outlined in the first tutorial. If not, please refer to the previous article or the section titled Installing PHP on Linux. You’ll also now need to make sure you have compiled PHP with the --with-mysql[=DIR] configuration option. See the official PHP documentation at http://php.net/manual/en/mysql.installation.php. To manage your database(s), I’d recommend downloading and installing phpMyAdmin. phpMyAdmin provides a web-based interface to a MySQL server and allows you to fully manage all aspects of your databases. It also makes debugging and testing our application easier. You can download the software at www.phpmyadmin.net.
So, what is MySQL? MySQL is a relational database management system (RDBMS) that runs as a server, providing access to databases (collections of data) using Structured Query Language (SQL). MySQL has been around since 1996, and is currently one of
the world’s most popular database platforms. It puts the M in the acronym LAMP (Linux Apache MySQL PHP). It’s open source and is used by some of the biggest names on the web, including Facebook, Twitter and Wikipedia. Data inside a database is held there using tables. A table is simply a collection of related data entries consisting of, and defined by, columns and rows. We’ll cover a brief table structure shortly as we make our first database application. MySQL is available on a plethora of operating systems, and is extremely quick. I’ve used it for a database containing more than 33 million rows, and haven’t noticed any issues with speed or performance – in fact, it’s my preferred database choice.
Planning our script
We have a task – count the number of visitors on a live site and then display the output on our page(s). Whenever I have a specific task, I always sit down and plan out what’s going to happen. As soon as I have that information clear in my mind, it makes writing the script 100 times easier – it’s a practice that I’d recommend highly, as there’s nothing worse than being confused. It doesn’t have to be anything complicated – just layman’s terms on a bit of paper, with some kind of basic logic flow, is usually enough. With that said, this is how I envisage we’ll approach the script:
1 A visitor lands on a page with our whosonline.php script included on it.
2 The script gets the visitor’s IP address and adds, or updates, it in our database.
3 The script counts how many unique IP addresses are in the database inside a five-minute window.
4 The script then outputs an integer representing the above value.
Now, how easy was that? Granted, it gets more complicated for larger scripts and processes, but now we’ve got that written down we can start writing our script.
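Before a database is involved, the plan above can be sketched with a plain PHP array standing in for the visitors table. Everything here is an assumption made for illustration: the function names, the array keyed by IP address with a Unix timestamp as its value, and the example addresses.

```php
<?php
// Step 2: add the visitor's IP address, or update its timestamp if already present
function record_visit(array $visits, $ip, $now) {
    $visits[$ip] = $now; // array keys are unique, so each IP appears only once
    return $visits;
}

// Step 3: count the IP addresses seen inside the time window (300 seconds = five minutes)
function count_recent_visitors(array $visits, $now, $window = 300) {
    $count = 0;
    foreach ($visits as $ip => $last_seen) {
        if ($now - $last_seen <= $window) {
            $count++;
        }
    }
    return $count;
}

$now = 1000000; // a fixed "current time" keeps the example predictable
$visits = array();
$visits = record_visit($visits, '192.0.2.1', $now - 600); // ten minutes ago
$visits = record_visit($visits, '192.0.2.2', $now - 60);  // one minute ago
$visits = record_visit($visits, '192.0.2.2', $now);       // same visitor again

// Step 4: output the integer
echo count_recent_visitors($visits, $now);
```

In the real script the array is replaced by the database table, but the logic of "one row per IP address, counted if recent enough" stays the same.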
Getting started
Creating a database in phpMyAdmin is laughably easy. All you need to do is give it a name and hit Create.
The simplest way to achieve what we need is to create one PHP script that does all the hard work and simply echoes an integer value out (as we outlined above); that way we can make use of the native PHP include() function (as we’ve done in our other tutorials) and ‘drop’ our counter into any existing PHP pages we want. So, open up your favourite text editor and create a new PHP file called whosonline.php and add the standard opening and closing PHP tags:
<?php
?>
In our planning steps, I mentioned that we’d count only visitors inside a five-minute window. You might prefer a larger, or even smaller, timeframe, so what we’ll do is allow you to customise this amount. We’re going to use the current date and time value in EPOCH in order to allow us some pretty fine-grained control. What exactly is EPOCH, I hear you cry? EPOCH, or Unix Time, or even POSIX Time, is simply the number of seconds that have elapsed since midnight (UTC) of 1 January, 1970. For example, the EPOCH value for midnight (UTC) of 1 January, 2012 is 1,325,376,000. So that’s 1 billion, 325 million and 376 thousand seconds since midnight of 1 January, 1970. Capiche?
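PHP can confirm that figure for us and turn a five-minute window into a cut-off point. This is a minimal sketch using the built-in gmmktime(), gmdate() and time() functions; the variable names are my own.

```php
<?php
// gmmktime(hour, minute, second, month, day, year) works in UTC
$new_year_2012 = gmmktime(0, 0, 0, 1, 1, 2012);

// Second zero of Unix time, formatted as a UTC date
$epoch_start = gmdate('j F Y H:i:s', 0);

// A five-minute window expressed in seconds
$window = 5 * 60;

// Any timestamp at or after this value counts as "inside the window"
$cutoff = time() - $window;

echo $new_year_2012, "\n";
echo $epoch_start, "\n";
```

Changing $window to, say, 10 * 60 is all it takes to widen the window to ten minutes.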
Our Visitors database isn’t complicated, so all we need is to create two columns to house our user data in.
Creating our database
Before our PHP script can do anything, we need a database to work with. Using the copy of phpMyAdmin installed earlier (go back and do this if you skipped that step), create a new database called whosonline. You can see in my screenshot that I’ve chosen to specify a utf8_unicode_ci format for my database; this isn’t strictly required, though, as we’re storing only an IP address and a timestamp. Once your database has been created, you’ll be presented with a screen that says your database has no tables. This is fixed easily by creating the table that will house our visitor data. Do so by creating a new table called visitors and specifying the number of fields as two. The third screen may look a little confusing at first, but don’t worry as I’ll explain what to select for each field and why we’re doing it. Let’s now start structuring our table…
Field: This is the name of the column that you want. For the first row, enter ip_address, and underneath on the second row enter visited.
Type: This instructs MySQL what sort of data type we’re using for this column. MySQL supports many different data types, including integers, long text, binary, floats, varchars, dates and times. For this exercise, we want the first row for ip_address to be a VARCHAR – meaning variable character. The second row should be set to TIMESTAMP, indicating we want MySQL to store a date/time representation.
Length/Values: For certain data types, we can limit the length of that field. For our VARCHAR record (ip_address) we want a maximum string length of 15 characters, which is the longest an IPv4 address in dotted form can be. (If your server is also reachable over IPv6, allow 45 characters instead.) Restricting the length of the field can save database storage and help keep speeds up. Because TIMESTAMP is a fixed-length field, we can leave this blank for visited.
Default: MySQL allows you to specify a ‘fallback’ value when inserting records into a database if there isn’t a value supplied for a field. We always want the user’s IP address, so we can leave this blank, but we’ll specify CURRENT_TIMESTAMP for visited, as it saves us having to supply that information when saving the visitor in the database.
Index: Indexing in relational databases can be extremely useful. You can gain a performance boost from your SELECT queries when indexes are in place. For more information on indexing, please refer to the info box titled Indexing, what gives? For our database, though, select PRIMARY for our ip_address row, and INDEX for the visited row.
That’s pretty much it where structuring our table is concerned. You can leave Collation, Attributes, Null, AI and Comments empty in the column rows, as we don’t need to mess around with them. The only additional change I’ve made is specifying MyISAM as my Storage Type and, again, utf8_unicode_ci as my Collation on the general structure. Make sure you hit Save, and let’s move on!
Now that our database and table have been set up, we’re ready to move on to writing our script. Let’s set up some default values. Open up whosonline.php in your editor and, just below the opening PHP tag, insert the following lines:
    $time_limit = 300;
    $dbHost = '127.0.0.1';
    $dbUsername = 'myuser';
    $dbPassword = 'mypassword';
    $dbDatabase = 'whosonline';
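If you prefer typing SQL to clicking through forms, the visitors table described above could probably be created with a single statement along these lines. This is only a sketch of what the phpMyAdmin choices translate to, held in a PHP string so it can be pasted into phpMyAdmin’s SQL tab or passed to mysqli_query(); the variable name $createVisitorsSql is my own, and the utf8 character set is assumed from the utf8_unicode_ci collation.

```php
<?php
// Assumed SQL equivalent of the phpMyAdmin steps: two columns, a primary key
// on ip_address, an ordinary index on visited, MyISAM storage.
$createVisitorsSql = <<<SQL
CREATE TABLE IF NOT EXISTS visitors (
    ip_address VARCHAR(15) NOT NULL,
    visited TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (ip_address),
    INDEX (visited)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
SQL;

// Print the statement so it can be copied elsewhere.
echo $createVisitorsSql . "\n";
```

Keeping the statement in a file alongside the script also gives you a record of the table structure if you ever need to rebuild it.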
Time limit
Quick tip Always comment your code throughout. While, at the time, everything makes sense to you – will it do so if you have to return to your code a few months later? Commenting will help you make sense of those complicated functions that you’ve written.
This code should be relatively self-explanatory. One thing to note is the value we’ve got for $time_limit – this value is the number of seconds you want the visitor window to count for (continuing from our earlier discussion about EPOCH, 300 seconds is 5 minutes). Feel free to change the value to suit your needs. The $dbHost value should point to your MySQL server. In most cases, this will be 127.0.0.1 or localhost. $dbUsername and $dbPassword will again be different for your set-up. You should replace these values with the correct ones for your server (see the info box titled MySQL users in brief for more information). $dbDatabase should remain the same unless you have called your database something other than whosonline. We’re going to be using the MySQLi set of PHP functions. It’s similar to its predecessor, the original mysql extension, but offers both procedural and object-orientated methods of working. For the purpose of simplicity, we’ll work exclusively with the procedural method in this tutorial. We’ll be using the mysqli_connect() function initially, and then we’ll take a look at mysqli_query() and the trick to echoing out the live visitor count easily, the mysqli_num_rows() function.
We have many different data and structure options in MySQL. Take a browse through the options available and see what else is there.
Installing PHP on Linux
Most of the latest distributions of Linux come with PHP. Although you can run PHP scripts from the command line, we’ll be using the web browser (and therefore web server) for this tutorial. You can follow this tutorial by uploading your PHP files to a web server on the internet (if you have one). For me, though, I’m using a default installation of Apache2 on my local Linux machine. I find it easier and quicker to write and test PHP on my local machine instead of having to upload files via FTP each time. If you require installation and/or setup instructions or guides for your local machine, I recommend reading through the Installation on Unix systems manual found on the official PHP site, available at the following URL: http://php.net/manual/en/install.unix.php. Alternatively, there are hundreds of installation guides written for pretty much each flavour of Linux. Simply search Google for your distribution if the official PHP guide doesn’t tick all of your boxes.
MySQL users in brief
Users in MySQL work in much the same way as users on your Linux platform. When you first install MySQL, you are asked to create a password for the root user. Once the installation has been completed, you can then connect to your server as root. For local development, a lot of users run and test their PHP application as root. It goes without saying that you should never connect your application as root on the web. Personally, I always create a local user that mimics the same user I will be using on my public web server. This user has fewer privileges and, therefore, is limited in what they can perform. Setting up a user correctly can also limit the damage that a SQL injection attack can do. The permissions you give your user should be only what they need to complete the job. You can easily create users in phpMyAdmin under the Privileges tab. You can assign global permissions or restrict them to certain databases. Think carefully about their needs and go from there.
MySQL.com is a great resource for the latest news in the world of MySQL. It’s also home to official documentation.
Quick tip Use a text editor that has syntax highlighting for PHP – it’ll help you quickly identify your code and specific parts, or functions, inside it. There are free programs available, too, so have a look around and go with the one you prefer the look of.
Connect, Query and Display
Now we’ve come to the main focus of this tutorial – working with our MySQL database. With the whosonline.php file already open in your editor, just below the database variables we wrote, add the following lines of code:
    $mysqli = mysqli_connect($dbHost, $dbUsername, $dbPassword, $dbDatabase);
    if(mysqli_connect_errno())
        echo "Failed to connect to MySQL: " . mysqli_connect_error();
Again, this should be relatively easy going. We’re calling the mysqli_connect() function using our database connection credentials and assigning a database connection ‘handle’ to the variable $mysqli. Why do we do this? Well, this is because the additional database functions need an available resource to work with (which you’ll see shortly); otherwise, well, they just won’t work. It’s important to note that this rule applies only to the procedural method and not the object-orientated method. The next line down, containing the if() statement, performs a quick check to make sure we connected to our specified database without any problems. Note that mysqli_connect_errno() takes no parameters. If the connection failed, the script prints a convenient error message explaining why (as written it then carries on running, so swap echo for exit() if you would rather it stopped there). Providing you didn’t encounter any connection errors, or if you did you have since corrected them, it’s now time to insert the visiting user into our visitors table. Beneath the connection code, write the following:
    mysqli_query($mysqli, "REPLACE INTO visitors (ip_address) VALUES ('" . mysqli_real_escape_string($mysqli, $_SERVER['REMOTE_ADDR']) . "')");
This is now starting to look a little more complicated. The first part isn’t too bad, though. mysqli_query() is the function we use to perform a database query. We have to supply two parameters to this function – the first is our database ‘handle’ and the second is the SQL query we want to perform. For our query, we use the SQL “REPLACE INTO” to tell MySQL that we want to update or insert a value into our visitors table. We could go about this another way and check if an IP address exists in the table first, and then perform an update or insert it if it doesn’t exist, but this approach is more succinct. One caveat to this approach is that we can use this method only on distinct values in our database, but seeing as we treat each distinct IP address as one visitor, this makes it perfect for our script. In order to make the database accept only distinct values and allow the “REPLACE INTO” query to work, we had to set the “ip_address” field as a PRIMARY KEY in our table structure. The role of a PRIMARY KEY within a database table, put simply, is to identify uniquely a single row, or piece of data, within a table.
The next part of the query specifies which table we want our data added to (visitors) and then what our data is. The SQL approach is very similar to the key => value relationship that we use in arrays. Here, we’re saying simply (column1, column2) VALUES (value1, value2), but when it comes to the actual data insert, MySQL translates that to column1 => value1, column2 => value2. Our query inserts only one item of data, the user’s IP address, but to make sure we do this securely, we’re going to enclose that piece of data inside a special function. In the previous tutorial, we touched on some important security basics, such as validating and sanitising user input data. The same rules apply to working with databases, too. Not only should we protect the data that goes in, but we also need to make sure that the user isn’t inserting malicious code that can mess around with our database, such as deleting every single record – a practice more commonly known as SQL Injection. To counteract this, we’ll use the MySQLi extension’s built-in escaping function, mysqli_real_escape_string(). As with our other functions, the first parameter is a database handle. This function takes the second input parameter and escapes any special characters inside it, making it safe for use inside a quoted string in an SQL query. This can be a real life-saver, so I’d recommend using it all the time, even for user input that you believe is safe. While we can guarantee with 99.9% certainty that $_SERVER['REMOTE_ADDR'] contains the user’s IP address, safeguarding our input in this way prevents anyone from messing around on the server and inserting inappropriate or malicious content. The $_SERVER global array contains information such as headers, paths and script locations that are created by the web server itself. Check out the PHP website if you want further information.
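Escaping protects the query, but it can also be worth checking that the value really looks like an IP address before it goes anywhere near the database. The following is a minimal sketch of that idea using PHP’s built-in filter_var(); the helper name visitorIp() is hypothetical and is not part of the tutorial’s script, and it assumes IPv4 only, to match the 15-character column.

```php
<?php
// Hypothetical helper: returns the visitor's address if it is a well-formed
// IPv4 address, or null otherwise. Pass in $_SERVER (or a stand-in array).
function visitorIp(array $server)
{
    if (!isset($server['REMOTE_ADDR'])) {
        return null;
    }
    // filter_var() returns the value on success and false on failure.
    $ip = filter_var($server['REMOTE_ADDR'], FILTER_VALIDATE_IP, FILTER_FLAG_IPV4);
    return $ip === false ? null : $ip;
}

// Example values standing in for $_SERVER.
var_dump(visitorIp(array('REMOTE_ADDR' => '203.0.113.7')));          // the address
var_dump(visitorIp(array('REMOTE_ADDR' => "1.2.3.4'; DROP TABLE"))); // NULL
```

In whosonline.php you could call visitorIp($_SERVER) and skip the REPLACE INTO query when it returns null; escaping with mysqli_real_escape_string() is still worthwhile afterwards.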
Indexing, what gives?
Indexes are used to find rows with specific values quickly. Without an index, MySQL must begin with the first row and then read through each row until the correct value is found. The larger the table, the slower this can be. If an index is present, MySQL can quickly ‘jump’ to the approximate position to start searching from, giving your database a performance boost. Think of it like an address book with your contacts in. Without an alphabetical index, you would have to flip through each contact until the right one is found, but with an index you know to jump straight to M when looking for Mike Mackay, saving you time by not needing to go through A-L first. When an index is in place, each INSERT, UPDATE or DELETE means MySQL has to recalculate its index data. If you place too many indexes on a table, every write gets slower, as MySQL is busy updating its indexes.
Counting our unique visitors
Now that our visitor has been added, all we need to do is count the number of unique visitors in our database and echo out the value. Beneath the above query, add the following lines to your whosonline.php file:
    $visitors = mysqli_query($mysqli, "SELECT ip_address FROM visitors WHERE UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(visited) <= '" . mysqli_real_escape_string($mysqli, $time_limit) . "'");
Here, we again use the mysqli_query() function (supplying the database handle), but this time we perform a SELECT query on our visitors table. We specify an additional statement in our SQL that restricts the data that’s pulled in from the database; we do this by adding a WHERE clause:
    WHERE UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(visited) <= '" . mysqli_real_escape_string($mysqli, $time_limit) . "'"
There are two new things to spot here – these are UNIX_TIMESTAMP() and NOW(). You could be forgiven for thinking that they are PHP functions, as the syntax is identical, but they’re SQL functions. NOW() is a function in SQL that returns the current date and time (in the format YYYY-MM-DD HH:MM:SS), and UNIX_TIMESTAMP() converts a timestamp to EPOCH. Effectively, we’re saying here that we want any IP addresses back where the time they visited was less than, or equal to, 300 seconds (5 minutes) ago. We can achieve this by taking the visitor’s stored time from the database (ie the time their record was added, visited) and subtracting it from the current time. The difference that remains from the calculation is the number of seconds since they last looked at a page on the site, which we compare against our counter time window. We’re concatenating our $time_limit variable, as we’ve done in previous tutorials. Even though we’re in control of this variable, I have still chosen to enclose it in the mysqli_real_escape_string() function. Where security is concerned, even if you believe the data won’t change, it’s still worth putting in the extra characters to be sure. It’s better that than losing your database!
All that’s left to do now is display the number of visitors from our query. We have two ways to do this, but I’m opting for the neater solution (we’ll explain the other way in the final tutorial in this chapter). As we want only the row count, and not the specific data set, we can use the function mysqli_num_rows(). This simply gives us a count of the number of rows that a query has returned, so simply add in:
    echo mysqli_num_rows($visitors);
In order for this function to work, it needs the result of a successful mysqli_query() call supplied as a parameter. In our previous query, we stored the results of our SELECT query in the $visitors variable, so this is all we need to supply to the function to get the count. If we wanted the number of rows for an alternate query, we could assign that to a different variable and supply as necessary.
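The WHERE clause is doing nothing more than subtraction and a comparison. As an illustration only, here is the same test written in plain PHP against a handful of made-up EPOCH values; in the real script MySQL does this work for us, and the function name countRecentVisitors() is my own invention.

```php
<?php
// Counts how many visit times fall inside the window, using the same rule as
// the SQL: (now - visited) <= time limit. All values are EPOCH seconds.
function countRecentVisitors(array $visitedTimes, $now, $timeLimit)
{
    $count = 0;
    foreach ($visitedTimes as $visited) {
        if ($now - $visited <= $timeLimit) {
            $count++;
        }
    }
    return $count;
}

$now = 1325376000; // a fixed 'current time' so the example is repeatable
$visitedTimes = array(
    $now - 10,  // 10 seconds ago: inside the window
    $now - 300, // exactly 5 minutes ago: still inside, because of <=
    $now - 301, // just over 5 minutes ago: outside
);

echo countRecentVisitors($visitedTimes, $now, 300) . "\n"; // 2
```

Changing the third argument has the same effect as changing $time_limit in whosonline.php.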
Putting the script to use
With that, our script is complete. If you open your web browser and run the script, you should simply see “1” displayed. If you then look at the database using phpMyAdmin, you’ll be able to view the IP address and timestamp that’s held in the record. Using databases really can be this easy! I mentioned earlier that we could drop this into all our pages on a website, or inside a web template. I’m doing this by using the include function inside an HTML tag and wrapping some text around the output, for example:
    <h4>Current Site Visitors: <?php include('whosonline.php'); ?></h4>
This value is then shown to all the users on the website. You could, of course, choose to display it a different way, or even change the text surrounding the count – be adventurous and see what you can do with it.
Do more with MySQL
Mike Mackay explains how to add to your site’s visitor counter database to show not only the all-important numbers but also visitor data.
In the previous tutorial in this chapter, we looked at using MySQL and databases for the first time. We created a basic live visitor counter that simply told us the number of visitors to our website at any given moment. Now, let’s take a look at how we can work closer with MySQL and display not only the number of visitors, but also their IP addresses and visit times.
Installing MySQL
Just as before, we have now added MySQL to our arsenal. If you followed along with the last tutorial, you can skip this installation step. If you’re new to MySQL then you’ll need to download and install the MySQL Server software so that PHP can make use of it. You can download and install MySQL for Linux from dev.mysql.com/downloads/mysq – you should follow the links for the appropriate version of Linux you are using. If in doubt, the MySQL online documentation has installation guidelines, and can be found at dev.mysql.com/doc. We’re going to make the assumption here that you already have your Linux platform configured and serving PHP pages through your web browser, as outlined in the first tutorial. If not, please refer to the previous tutorials in this chapter, or the box titled Installing PHP on your Linux platform. You’ll also now need to make sure PHP has been compiled with MySQL support: the --with-mysql[=DIR] configuration option covers the older mysql functions, while the mysqli functions used in these tutorials are enabled with --with-mysqli. See the official PHP documentation at http://php.net/manual/en/mysql.installation.php
For managing your database(s), I would recommend downloading and installing phpMyAdmin. phpMyAdmin provides a web-based interface to a MySQL server and allows you to manage fully all aspects of your databases. It also makes debugging and testing our application easier. You can download the software at www.phpmyadmin.net
As we’re working on from the previous tutorial, we’ll be using a lot of the same code we built. It’s a great timesaver and already gives us a foundation on which we can expand. We’ll be using the same database table structure too. Just to recap, MySQL is a relational database management system (RDBMS) that runs as a server providing access to databases (collections of data) using Structured Query Language (SQL). Data is held inside a database using tables. A table is simply a collection of related data entries consisting of, and defined by, columns and rows. MySQL is available on a plethora of operating systems, and is extremely quick.
Installing PHP on your Linux platform
Most of the latest distributions of Linux come with PHP. Although you can run PHP scripts from the command line, we’ll be using the web browser (and therefore the web server) for this tutorial. You can follow this tutorial by uploading your PHP files to a web server on the internet (if you have one). For me, though, I’m using a default installation of Apache2 on my local Linux machine. I find it easier and quicker to write and test PHP on my local machine, instead of having to upload files via FTP each time. If you require installation and/or setup instructions or guides for your local machine, I recommend reading through the Installation on Unix systems manual found on the official PHP site, which is available at the following URL: http://php.net/manual/en/install.unix.php Alternatively, there are hundreds of installation guides written for pretty much every flavour of Linux. Simply search Google for your distribution if the official PHP guide doesn’t tick all of your boxes.
Diving back in
So, our task now is to be able to display the visitor data along with our counter. Let’s get started with our changes and review what we need to update. Open up whosonline.php in your favourite text editor and spend a few minutes refamiliarising yourself with the script. Check out the variables we set at the top of the script, such as the database connection details and our visitor timeout window. If you need to alter these for your local setup then feel free to do so. The $dbHost value should point to your MySQL server; in most cases this will be 127.0.0.1 or localhost. $dbUsername and $dbPassword will again be different for your setup, and you should replace these values with the correct ones for your server (see the info box titled MySQL users in a nutshell). $dbDatabase should remain the same unless you called your database something other than ‘whosonline’. We still need to continue using the mysqli_connect() function as well as the mysqli_query() function, as these form the basics of our SQL query and are required. We’re calling the mysqli_connect() function using our database connection credentials, and assigning a database connection ‘handle’ to the variable $mysqli:
    $mysqli = mysqli_connect($dbHost, $dbUsername, $dbPassword, $dbDatabase);
    if(mysqli_connect_errno())
        echo "Failed to connect to MySQL: " . mysqli_connect_error();
Beneath our initial connection function, we want to keep the exact same query to insert the visitor’s IP address into our table so a record of their visit is tracked:
    mysqli_query($mysqli, "REPLACE INTO visitors (ip_address) VALUES ('" . mysqli_real_escape_string($mysqli, $_SERVER['REMOTE_ADDR']) . "')");
Keep in the next part that retrieves all the applicable visitor data. However, we will need to make a small change here:
    $visitors = mysqli_query($mysqli, "SELECT ip_address FROM visitors WHERE UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(visited) <= '" . mysqli_real_escape_string($mysqli, $time_limit) . "'");
Getting all the data
The query above retrieves only one item of data from the database – the IP address. Because we are now going to display all the data, we need exactly that back – all the data. MySQL has a great shortcut when it comes to asking for every column in a table; instead of having to manually type in each field name we can simply insert a *. The * works like a wildcard, and MySQL interprets that as: “I want every column of data”. Put simply, you’re asking for everything the table holds for each matching row. If, however, we wanted to limit the data, we’d simply write out each column name separated by a comma. For example, ip_address, visited. Let’s now alter that query to give us every item of data back from the database:
    $visitors = mysqli_query($mysqli, "SELECT * FROM visitors WHERE UNIX_TIMESTAMP(NOW())-UNIX_TIMESTAMP(visited) <= '" . mysqli_real_escape_string($mysqli, $time_limit) . "'");
Remember here that we make use of UNIX_TIMESTAMP() and NOW(). NOW() is a function in SQL that returns the current date and time (in the format YYYY-MM-DD HH:MM:SS), and UNIX_TIMESTAMP() converts a timestamp to EPOCH. For more information on EPOCH, please refer to the information box titled What’s EPOCH when it’s at home? Now we’re at the point in the script where we want to change the behaviour from the previous tutorial of just outputting an integer (representing the number of visitors). Instead, we want to also show a list of IP addresses and timestamps of the visitor(s). So, then, let’s start doing that now.
Changing our output behaviour
Before, we made use of the function mysqli_num_rows() to get the number of entries returned from our SQL query. We still want to keep this in, as it saves us having to manually write additional PHP code to count the number of items in the results array, but now we also want to locally store the database data so that we can display it easily on our page(s). For this, we’re going back to basics and will create an array ($visitors_data) to hold our information. Our array will consist of two items, total and visitors. The total item will store the integer output from the mysqli_num_rows() function, while visitors will contain another array of all the visitor data. If we were to visualise this array, this is how the structure would look:
    $visitors_data = array(
        'total' => [integer],
        'visitors' => [array]
    );
As you can see, this is extremely straightforward. Let’s go straight into the bulk of our change and add the code that will get and store the visitor data. It might look complicated now, but we’ll break it down into bite-sized pieces and will review every line afterwards. Directly below the above SQL query (where we made our * change), add in the following code:
    if(is_numeric(mysqli_num_rows($visitors)) && mysqli_num_rows($visitors) > 0) {
        $visitors_data['total'] = mysqli_num_rows($visitors);
        while($online_visitors = mysqli_fetch_array($visitors)) {
            $visitors_data['visitors'][] = array(
                'ip_address' => $online_visitors['ip_address'],
                'visited' => $online_visitors['visited'],
            );
        }
    }
The first line, with the if() statement, is relatively easy. Here, we are checking that our SQL query returned a set of rows; we do this by using the is_numeric() function wrapped around our mysqli_num_rows() function. The function is_numeric() returns simply TRUE or FALSE (a boolean) based on the contents supplied to it, in this case the output of our mysqli_num_rows() function. If the value inside is indeed numeric, the function returns TRUE and, provided the count is also greater than zero, the code inside the curly braces is then executed. The initial line within the if() braces assigns the integer value (the number of rows returned) from the database query to our $visitors_data['total'] array item. The next line, containing the while() statement, is the most complicated part of our update. The while() loop contains the crux of our local data storage/assignment and makes use of the massively helpful PHP function, mysqli_fetch_array().
Using phpMyAdmin, we can quickly view our visitor data without having to check it through the website.
MySQL users in a nutshell
Users in MySQL work in much the same way as users on your Linux platform. When you first install MySQL, you are asked to create a password for the root user. Once the installation is complete, you can then connect to your server as root. For local development, a lot of users run and test their PHP application as root. It goes without saying that you should never connect your application as root on the web. I always create a local user that mimics the same user I will be using on my public web server. This user has fewer privileges, and so is limited in what they can perform. Setting up a user correctly can also limit the damage that a SQL injection attack can do. The permissions you give your user should be only what they need to do the job. You can easily create users in phpMyAdmin under the Privileges tab. You can assign global permissions, or restrict them to certain databases. Think carefully about their needs and go from there.
Using loops
The official online PHP reference has details of every single MySQLi function available – an absolute must!
Loops are helpful in PHP when you want to execute a block of code repeatedly while a condition is met. The code will continue to be run for as long as the condition evaluates to TRUE. The way we achieve this with our MySQL dataset is with the mysqli_fetch_array() function and assignment. The mysqli_fetch_array() function takes the returned dataset from the SQL query, row by row, and allows you to retrieve the columns/data from that via an array – the array index is the field name of the respective column. In our example, our database has two fields, ip_address and visited. PHP will automatically create a temporary array with these names as the array keys, and will associate the row data to them. We then take the output from the function (an array) and assign it to $online_visitors. This makes $online_visitors an array containing the row from the database. Because assigning a non-empty array to a variable equates to TRUE in PHP terms, the code inside the while() braces is then executed. Once that code has finished, the while condition is then re-evaluated and the entire process starts all over again. When there are no rows left, the assignment no longer evaluates to TRUE, the while() loop is broken and any code following on from it is then executed.
Thankfully, PHP does all the hard work for us when dealing with the dataset, and on each iteration of the while() loop it automatically moves on to the next available row of data (where applicable). The contents assigned to $online_visitors are only temporary and are overwritten on each iteration of the loop – it’s for this reason that we store a copy of this array data inside another array, making it easy for us to access the data at a later stage without needing to communicate with MySQL. The actual array assignment we see inside the curly braces of our while() loop should look familiar to you. As explained above, we’re storing the actual visitor data as an array inside the array item for $visitors_data['visitors']. We’re making use of a previously discussed method to automatically append data to the end of an array, using []. As soon as the while() loop has finished, and PHP has looped through all returned rows of data from our SQL query, our new $visitors_data array is available to us and ready to use on our page(s). We’ve worked with arrays a lot so far in this series, but if you’re unsure of what’s happening, feel free to refer back to previous tutorials where this type of functionality is explained.
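If you’d like to see the loop-and-append pattern in isolation, the sketch below runs without a database. It swaps mysqli_fetch_array() for PHP’s array_shift() on a hand-written list of rows, which is purely an assumption for demonstration: both hand back one row at a time and then NULL when nothing is left. It also gives $visitors_data its two keys up front, a small addition of mine so that later code can rely on them even when no rows come back.

```php
<?php
// Stand-in rows, shaped like the arrays mysqli_fetch_array() would hand back.
$rows = array(
    array('ip_address' => '203.0.113.7', 'visited' => '2012-01-01 12:00:00'),
    array('ip_address' => '198.51.100.2', 'visited' => '2012-01-01 12:03:10'),
);

// Start with an empty structure so both keys always exist.
$visitors_data = array('total' => 0, 'visitors' => array());

// array_shift() removes and returns the first row, or NULL when none are
// left, which ends the loop just as an exhausted result set would.
while ($online_visitors = array_shift($rows)) {
    // [] appends a new element to the end of the 'visitors' array.
    $visitors_data['visitors'][] = array(
        'ip_address' => $online_visitors['ip_address'],
        'visited'    => $online_visitors['visited'],
    );
}
$visitors_data['total'] = count($visitors_data['visitors']);

echo $visitors_data['total'] . "\n";                     // 2
echo $visitors_data['visitors'][1]['ip_address'] . "\n"; // 198.51.100.2
```

Without the [] on the assignment, each pass would overwrite the previous visitor and only the last row would survive the loop.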
Displaying our visitor data It would be silly to go to all the hassle of retrieving and storing each unique visitor item without showing it on our page. Doing so is simple, and all you need to do is echo out which pieces of information you want. In order to display the total number of visitors, we know that we just need to echo out the integer value returned from the mysqli_num_rows() function. But this has changed slightly from the previous tutorial and this value is now held in a different ‘place’. In order to show the total, all you need to write on any PHP page is the following: <?php echo $visitors_data[‘total’]; ?> You could surround this with various HTML tags to create
104
PHP
What’s EPOCH when it’s at home? EPOCH, or Unix Time, or even POSIX Time, is simply the number of seconds that have elapsed since Midnight (UTC) of 1 January, 1970. For example, the EPOCH value for midnight of 1 January, 2012 is 1325419200. So that’s one billion, 325 million, 419 thousand and 200 seconds since midnight of 1 January, 1970. You might be wondering what the importance, or relevance, of this data is
MySQL.com is a great resource for the latest news in the world of MySQL. It’s also home to official documentation.
an emphasis on the value – the choice is entirely yours. But how do we go about displaying the actual visitors data? Well, by making use of the PHP function foreach(), we can achieve this with relative ease. Semantically speaking, our visitor data would be considered tabular data. As I’m always keen to try to use the correct mark-up for the correct data, I will be showing the visitor data on my site using an HTML table. It’s nothing fancy, but does the job nicely for me. We’ve used the foreach() function before to loop through a set of data before, so this shouldn’t be too unfamiliar to you: <table border=”1” cellpadding=”5” cellspacing=”0”> <tr> <th>IP Address</th> <th>Visited</th> </tr> <?php foreach($visitors_data[‘visitors’] as $visitor): ?> <tr> <td><?php echo $visitor[‘ip_address’]; ?></td> <td><?php echo $visitor[‘visited’]; ?></td> </tr> <?php endforeach; ?> </table> Here, you’ll notice that the PHP code is embedded inside a few HTML table tags. Hopefully, you will already have some HTML knowledge and this won’t be too confusing for you. Unfortunately, covering HTML is beyond the scope of this tutorial, but if you’re interested in learning more then there are some fantastic tutorials and training sites to be found on the internet. The key part in this code is our foreach() loop. We are looping through each item inside the $visitors_ data[‘visitors’] array (that we know contains all visitor records) and assigning each row to a new temporary variable called $visitor. The newly-created $visitor array consists of two fields, ip_address and visited (replicating how our MySQL table was structured). As we loop through, the code inside the curly braces is executed, creating a new table row. The first table column outputs an IP address followed by the next table column for the visited time. It’s important to note that the timestamp we see in the output is in the default formatting from MySQL. If you’d prefer
when you’re programming in PHP? Well, it allows us to easily (and mathematically) work out various date calculations. You could easily determine the number of seconds from one user, or script action, to another. You might even want to use it to add in some time-sensitive ‘locks’. EPOCH really does come in handy, usually when you least expect it to!
something a little bit fancier, then you can format this data with the date() function in PHP. This function has a multitude of formatting options, all of which can be found on the official reference at http://php.net/manual/en/function.date.php. This display method assumes that you are showing the data on the same PHP file, or page, as the main MySQL functionality. If this is not the case, you can simply include() the main whosonline.php file (substitute accordingly if you’ve changed the filename) on any page you wish to show the data on (note that this must be a file ending in .php for it to be parsed correctly). You’d then just embed the HTML table code where necessary. When you execute the code and view the output on your page, you’ll notice that I’m far from a designer and my table is, well, extremely bland and boring. Why not make good use of some CSS and style the table to fit around your website theme or design? You are free to customise it as much as you need or want to.
Quick tip Use a text editor that has syntax highlighting for PHP – it’ll help you quickly identify your code and specific parts, or functions, inside it. There are free programs available too, so have a look around and go with the one you prefer the look of.
Where to next?

Now that you’ve learnt the basics of retrieving data from a MySQL query and looping through the records, why not adapt this function to output some other types of data? You could perhaps store the F1 2012 season calendar in a MySQL database. By using the DATE() function (in MySQL, not PHP) in a WHERE clause, you could display only the races that are left in the season. You might even decide to take things a step further and introduce teams and driver data to the mix, as well. You could achieve this by creating further tables in a database and writing queries to retrieve specific information from them. By following the guide in the previous tutorial in this chapter, which covered creating MySQL tables with phpMyAdmin, you should find it relatively straightforward to set up any required tables and fields. You can also use phpMyAdmin to enter the data manually through the Insert tab found on any table properties page. The most important aspect of learning not just PHP but any new language is to be creative and to keep challenging yourself. Push your PHP skills forward. Make use of all the available online resources, from tutorial sites to forums. There are thousands of people out there who are willing to help others out when they get stuck. It’s always a good idea to get involved in the community, so dive in and have fun.
Modern Perl
Why should a modern GNU/Linux user know the fundamentals of Perl? Because if you want to automate a task, be it web management or desktop customisation, Perl can do it. Text is its speciality, but with the right modules Perl can process every conceivable category of data.

Modern Perl: Track your reading
Modern Perl: Build a web app
Modern Perl: Adding to our app
Modern Perl: Track your reading Modern Perl makes it simple to write a database program – say, for example, to keep tabs on your books – without using SQL. Dave Cross explains how.
In this article we will build a simple command-line program that accesses a database. The program we are going to write will keep track of a reading list. We’ll tell it about the books that we’re reading or about to read, and it will display that information in various lists. In the following tutorial we’ll make the program into a web application. First, we’re going to need a database to store this information in. I’m going to use MySQL as it’s the most widely available database system, but the same code will work with minor amendments with any other relational database. We’ll store the data in two tables – author and book. In the interests of keeping things simple during this tutorial, we’ll ignore books with multiple authors. To begin, we’ll create a new database to contain the tables and switch to that database:

    create database if not exists books;
    use books;

We’ll also create a user for our application. You might want to change the password. If you do, you’ll need to change it in the get_schema subroutine as well:

    create user 'books'@'localhost' identified by 'README';
    grant all privileges on books.* to 'books'@'localhost';

The author table is very simple:

    create table if not exists author (
      id integer primary key auto_increment,
      name varchar(100)
    ) engine innodb;

The engine innodb is important as that means that we can give these tables constraints that define the relationships between them. We’ll see that being used in the book table:

    create table if not exists book (
      id integer primary key auto_increment,
      isbn char(10),
      author integer,
      title varchar(250),
      started datetime,
      ended datetime,
      image_url varchar(250),
      foreign key (author) references author (id)
    ) engine innodb;

The foreign key line at the end of the definition says that the author column in the book table contains values that are equal to the id column in the author table. So if Douglas Adams has the id 1 in the author table then the record in the book table for The Hitchhiker’s Guide to the Galaxy will have a
1 in its author column. Splitting the author out into a separate table means that we can store information about several Douglas Adams books in the book table without duplicating the information about the author. Avoiding data duplication is called ‘normalisation’ and is an important topic in database design. Having created our database, we now want to set up some Perl code to talk to the database. We could use the DBI (Database Interface) module and write raw SQL. But no one likes writing SQL so we’re going to use Object Relational Mapping (or ORM) to convert Perl code into SQL. This will make our code much easier to write at the cost of a small amount of set-up. The ORM we are going to use is called DBIx::Class so we’ll need to ensure we have that module installed. We’ll also need a separate module called DBIx::Class::Schema::Loader which can generate Perl libraries that are specific to our database. You can probably install both of these libraries using your distribution’s packaging tools, but if they aren’t available you can get them both from CPAN. DBIx::Class::Schema::Loader comes with a command-line program called dbicdump, which will look at the tables in your database and create the Perl code needed to manipulate those tables. You run it like this:

    $ dbicdump -o components='["InflateColumn::DateTime"]' \
        Book dbi:mysql:database=books books README

The -o components option loads some extra functionality that we’ll see later on. Book is the name of the Perl module
“No one likes writing SQL, so we’ll use ORM to convert our Perl code.”
You can get more information about DBIx::Class from the website at http://dbix-class.org.
you want to create. Then there is a Perl DBI connection string, which includes information about the type of database we are talking about (mysql) and the actual database that we’re interested in (books). The last two arguments are the username (books) and the password (README). When you run that command you’ll find a new file in your current directory called Book.pm and a new directory called Book. Within the Book directory you’ll find another directory called Result and within that there are two files called Author.pm and Book.pm. If you look at the contents of these last two files, you’ll see code that closely matches the definitions of the two tables in your database.
Fishing the Amazon

There’s one more thing that we need to do before starting to write our program. We’ll be using the Amazon API to get various details about the books in our database, and we need to register for an API key in order to use the API. You can register for a key at www.amazon.com/gp/aws/registration/registration-form.html. Once you’ve signed up you can go to the Security Credentials part of the site to get your Access Key ID and Secret Access Key. We recommend setting environment variables to these values like this:

    export AMAZON_KEY=[Your key here]
    export AMAZON_SECRET=[Your secret key here]

It then becomes easy to access these from within a program. Finally, we’re ready to start looking at the program. We’re going to create a program called book that has four sub-commands. Typing book add <ISBN> will allow us to add a book to our reading list. Typing book start <ISBN> will flag that we’ve started to read a book and book end <ISBN> will flag that we’ve finished it. At any time, typing book list (or just book without a sub-command) will display the list of books in our database, indicating which ones we are currently reading and which we have finished. The start of the program looks like this:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use 5.010;
    use Book;
    use Net::Amazon;
    use DateTime;

A lot of this will be common to every Perl program that you write. The first line is the ‘shebang’ line. This tells the Linux shell to run this program using the Perl compiler. The next two lines load two standard Perl libraries called strict and warnings. Think of these as programming safety nets. The most important thing that the strict library does is to force you into declaring your variables. The warnings library looks for a number of potentially unsafe programming practices and displays a (non-fatal) warning if it finds any. No serious Perl programmer writes programs without loading these two libraries. The third use statement is slightly different.
It doesn’t load a module, but tells Perl that this program needs to be run on a particular minimum version of Perl. We’re forcing the use of Perl 5.10 as we’re going to use the say function that was added in this version. The following three lines are back to loading libraries. Book is the library that we created to talk to the database. Net::Amazon is the library that we’ll use to talk to the Amazon API. And finally, DateTime is a powerful Perl library
CPAN – Perl’s killer app If you’re programming in Perl then you need to know about the Comprehensive Perl Archive Network (or CPAN). On the CPAN you’ll find almost 100,000 extra Perl modules that you can use in your programs. The CPAN is at www.cpan.org, but most people use the search page at http://search.cpan.org. A new project called MetaCPAN (at http://metacpan.org) aims to provide a better interface and an API.
A large number of the most useful CPAN modules have been repackaged for popular Linux distributions, and this will be the easiest way to install most modules. For example, if you want to install the DateTime module on a Red Hat or Fedora system you just need to run sudo yum install perl-DateTime. On a Debian or Ubuntu system that command becomes sudo apt-get install libdatetime-perl.
for the manipulation of dates and times. Next we need to work out which of the sub-commands has been invoked and run the appropriate code:

    my %command = (
      add => \&add,
      list => \&list,
      start => \&start,
      end => \&end,
    );

This statement defines the valid sub-commands that our program will implement. It does this by setting up a hash (or dictionary) called %command. The % at the start of a variable name indicates that it’s a hash. A hash is like a lookup table. It has keys which are associated with values. In our case, the keys are the names of the sub-commands and the values are references to the subroutines which implement those commands. Putting an ampersand on the front of a subroutine gives us a way to refer to the subroutine without executing it and a backslash is the standard Perl syntax to get a reference to something. The code that follows picks the sub-command and runs it:

    my $what = shift || 'list';
    if (exists $command{$what}) {
      $command{$what}->(@ARGV);
    } else {
      die "Invalid command: $what\n";
    }

These few lines deal with the command-line options and calling the appropriate subroutine to do the work. Command-line arguments to a Perl program are stored in an array called @ARGV (the @ indicates an array in the same way that a % indicates a hash). The shift function removes the first element from an array and returns it. You’ll notice that we don’t give shift an argument. That’s because of its special behaviour. If you call shift without an argument outside of a subroutine then it will work on @ARGV by default.
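The dispatch-table idea can be tried on its own, without the database or Amazon. The following is a minimal sketch under our own assumptions: the two handlers are stubs that only return a string, and the dispatch subroutine is a hypothetical wrapper that takes its arguments from @_ rather than @ARGV so it is easy to call repeatedly.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;

# Stub handlers (hypothetical): each just reports what it was asked to do.
sub add  { return "add @_" }
sub list { return 'list' }

# Dispatch table: sub-command name => reference to the handler.
my %command = (
    add  => \&add,
    list => \&list,
);

# Look up a sub-command and call it with the remaining arguments.
# Falls back to 'list' when no sub-command is given.
sub dispatch {
    my @args = @_;
    my $what = shift(@args) || 'list';
    die "Invalid command: $what\n" unless exists $command{$what};
    return $command{$what}->(@args);
}

say dispatch('add', '0330258648');
say dispatch();
```

Adding a new sub-command then means writing one subroutine and adding one line to the hash; the lookup code does not change.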
Quick tip The language is called Perl. The program that compiles Perl programs is called perl. Typing either of these as PERL is wrong.
For argument’s sake

If we haven’t been given a command-line argument then @ARGV will be empty and shift will return a false value. In this case we want to act as if the user gave us the sub-command list. The || operator lets us do this. This is the Boolean or operator. It returns its left operand if that value is true, otherwise it returns the right operand. So if there’s a value in @ARGV we get that, otherwise we get list. The value calculated from that expression is stored in $what (Perl scalar variables begin with a $). Having got the sub-command, we now need to know if it’s
Quick tip Perl comes with a lot of documentation which you can read using the perldoc program. Alternatively, it’s all online at http://perldoc.perl.org.
a valid value. We do this by looking in the %command hash. We use the exists function to see if $what matches one of the keys in the hash. If it does, we call the appropriate function; if not, we die with an appropriate error message. Notice that as the hash contains subroutine references, we need to call the subroutine using a dereferencing arrow. Also notice that we pass what is left of @ARGV on to the function that we are calling. In some cases it will be empty, but in others it will contain the ISBN for a book. That’s the main structure of the program complete. All we need to do now is to implement the various subroutines that do the actual work for the various sub-commands. Before we start those, we’ll write a useful utility subroutine that they will all use:

    sub get_schema {
      my $schema = Book->connect(
        'dbi:mysql:database=books', 'books', 'README'
      ) or die "Cannot connect to database\n";
      return $schema;
    }

All of the commands will need to communicate with the database. When using DBIx::Class all communication with a database is carried out through an object called a schema. Our get_schema subroutine just connects to our books database (using the Book module we created earlier). If it can’t connect for any reason, it just kills the program with an error message.
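One detail is worth knowing whenever return and or die meet in a subroutine like this: the word-style or binds more loosely than return, so the die in return ... or die can never run. The sketch below shows the difference using a hypothetical try_connect subroutine that always fails; it stands in for a real connection attempt and is not part of DBIx::Class.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical stand-in for a connection attempt that fails.
sub try_connect { return }

# 'or' has lower precedence than 'return', so the sub returns
# before the die can be reached.
sub loose {
    no warnings 'syntax';   # newer Perls warn about exactly this line
    return try_connect() or die "Cannot connect\n";
}

# Assigning first and returning afterwards lets the die fire.
sub checked {
    my $conn = try_connect() or die "Cannot connect\n";
    return $conn;
}

my $loose = loose();
say defined $loose ? 'loose: got a value' : 'loose: returned undef, no die';
say eval { checked(); 1 } ? 'checked: survived' : "checked: died: $@";
```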
One for the books

The first sub-command we will look at is the one to add books to the database:

    sub add {
      my $isbn = shift || die "No ISBN to add\n";
      my $schema = get_schema();
      my $books_rs = $schema->resultset('Book');
      if ($books_rs->search({ isbn => $isbn })->count) {
        warn "ISBN $isbn already exists in db\n";
        return;
      }
      my $amz = Net::Amazon->new(
        token => $ENV{AMAZON_KEY},
        secret_key => $ENV{AMAZON_SECRET},
        locale => 'uk',
      ) or die "Cannot connect to Amazon\n";
Object Relational Mapping Many programs are going to need a persistent data store and in many cases that will be a relational database such as MySQL or SQLite. In order to talk to these you’ll need some kind of database interface (such as Perl’s DBI module) and a lot of SQL scattered throughout your code. Object Relational Mapping (ORM) allows you to write code that interacts with a database at a higher level. You no longer write SQL, you just manipulate objects in your program and ORM takes care of converting that into SQL. Three concepts in Object Oriented Programming (OOP) map rather well onto matching concepts in relational databases. In OOP, a class defines a
type of object (such as books), and that’s very similar to a table in a database. A particular instance of a class is an object (a particular book) and that’s like a row in a database table. Finally, classes and objects have attributes which are the individual properties of the object (for example title and author) and this is similar to columns in a database. ORM uses these similarities to map data from relational databases into OOP objects within your program. A good ORM, such as DBIx::Class, will be able to automatically generate the classes from the metadata stored in the database which describes the various tables.
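The class/table, object/row and attribute/column parallel can be seen in a few lines of plain Perl. This is only an illustration: My::Book is a hypothetical hand-written class with no database behind it, not the code that an ORM generates.

```perl
use strict;
use warnings;
use 5.010;

# A class plays the role of a table (hypothetical, hand-written).
package My::Book;

# An object plays the role of one row; the constructor
# takes the column values as name => value pairs.
sub new {
    my ($class, %columns) = @_;
    return bless { %columns }, $class;
}

# Attributes play the role of columns.
sub title  { return $_[0]{title} }
sub author { return $_[0]{author} }

package main;

my $book = My::Book->new(
    title  => "The Hitchhiker's Guide to the Galaxy",
    author => 'Douglas Adams',
);
say $book->title, ' (', $book->author, ')';
```

An ORM saves us from writing classes like this by hand and, unlike this sketch, also takes care of reading and writing the rows.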
      my $resp = $amz->search(asin => $isbn);
      unless ($resp->is_success) {
        say 'Error: ', $resp->message;
        return;
      }
      my $book = $resp->properties;
      my $title = $book->ProductName;
      my $author_name = ($book->authors)[0];
      my $imgurl = $book->ImageUrlMedium;
      my $author = $schema->resultset('Author')->find_or_create({
        name => $author_name,
      });
      $author->add_to_books({
        isbn => $isbn,
        title => $title,
        image_url => $imgurl,
      });
      say "Added $title ($author_name)";
      return;
    }

We need the ISBN of the book to add. This is passed as a parameter into the subroutine. Perl passes parameters into subroutines in an array called @_. In the same way that shift works on @ARGV when called without an argument outside of a subroutine, it works on @_ when called without an argument inside a subroutine. If no value is found, then the program dies. Having got an ISBN, we first need to check that the book isn’t already in the database. We get a schema object and use that to give us a resultset object for the book table. In DBIx::Class, all manipulation of a specific table is done using a resultset object. We can use the resultset’s search method to look for books with the same ISBN. The search returns another resultset object and we can use the count method on that to see how many books already exist in the database with this given ISBN. Hopefully there aren’t any. But if there are, we can display an appropriate message and return from the subroutine without doing any more work. If the book isn’t already in the database, then we can add it. But first we need to get more details from Amazon. We create a Net::Amazon object giving it the key and secret that we got from Amazon. We also set the locale to uk to indicate that we want to use Amazon’s UK data. We can then use the search method on the Amazon object to look for products with our ISBN. If the search is successful, we can get details of the matching book from the returned object.
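The way shift picks up its argument inside a subroutine, described above, can be tried in isolation. This is a small sketch with a hypothetical first_arg subroutine; eval is used only so that the script carries on after the die and can show the message.

```perl
use strict;
use warnings;
use 5.010;

# Inside a subroutine, a bare shift takes the first element of @_.
sub first_arg {
    my $isbn = shift || die "No ISBN given\n";
    return $isbn;
}

say first_arg('0330258648');

# With no argument, shift returns undef and the die fires;
# eval traps the error so we can print it.
my $ok = eval { first_arg(); 1 };
say $ok ? 'no error' : "Error: $@";
```

Because || tests for truth rather than definedness, a value such as 0 or the empty string would also trigger the die. That is fine for ISBNs, but worth remembering elsewhere.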
Having got the details of the book, we can extract various interesting things from the object and store them in our database. Notice that the authors method returns a list of authors and we’re only taking the first one. To insert the book, we first look for the author in the database by getting an author resultset and using the find_or_create method to either find an existing author record or create a new one.
Find the author

Once we have the author object we can use its add_to_books method to add a new book related to that author. The add_to_books method is one that was created automatically by DBIx::Class::Schema::Loader when it created our classes. It knew that this relationship between the tables existed
Taking it further
Once you’ve read a couple of books, your reading list should look a bit like this.
because of the foreign key constraint that we created on the book table. We can now try adding a book to our database. Get the ISBN of a book from Amazon and try running the command:

    $ ./book add 0330258648
    $ ./book list

The next sub-command we’ll implement will be list, so that we can see what is in our database. This looks complex, but actually it’s rather repetitive as we print the list in three sections (Reading, To Read and Read). The only difference between the sections is the selection criteria we use. Books being read have a value in the started column but a null ended column. Books that have been read have a value in ended. A book with a null started is still in the to be read pile. To run these queries we use the search method on a book resultset object. A null value in the database is represented by the undef value in Perl. The reading query looks like this:

    foreach ($books_rs->search({
      started => { '!=', undef },
      ended => undef,
    })) {
      say ' * ', $_->title, ' (', $_->author->name, ')';
    }

The search arguments say that started is not null and ended is null. For each book found by the query we print the title and the author’s name. Again, these methods are created for us by DBIx::Class::Schema::Loader using information it finds about the columns in the tables and the relationships between tables. For the list of books read, the query looks like this:

    foreach ($books_rs->search({
      ended => { '!=', undef },
    })) {
      say ' * ', $_->title, ' (', $_->author->name, ')';
    }

And for the list of books still to read, it looks like this:

    foreach ($books_rs->search({
      started => undef,
    })) {
      say ' * ', $_->title, ' (', $_->author->name, ')';
    }

The full version of the list subroutine is on the CD.
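The three selection criteria are easy to check without a database. Here is a minimal sketch in plain Perl, assuming a hypothetical in-memory list of records in place of the book table; defined() plays the part of the "is not null" test.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical in-memory records standing in for rows of the book table.
my @books = (
    { title => 'Book A', started => '2012-01-01', ended => '2012-01-20' },
    { title => 'Book B', started => '2012-02-01', ended => undef },
    { title => 'Book C', started => undef,        ended => undef },
);

# Reading: started is set, ended is not.
sub reading  { return grep { defined($_->{started}) && !defined($_->{ended}) } @_ }
# Read: ended is set.
sub finished { return grep { defined($_->{ended}) } @_ }
# To read: started is not set.
sub to_read  { return grep { !defined($_->{started}) } @_ }

say 'Reading: ', join(', ', map { $_->{title} } reading(@books));
say 'Read: ',    join(', ', map { $_->{title} } finished(@books));
say 'To read: ', join(', ', map { $_->{title} } to_read(@books));
```

In the real program the database does this filtering for us, which matters once the list is too large to load into memory.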
The beginning and the end

The last two sub-commands we need to implement are start and end to indicate when we start and finish reading a book.
There’s a lot to learn about Perl. Here are some suggestions for places to go for more information. The Perl home page is at http://perl.org. From there you can find links to many other resources about Perl. One of the best ways to read about what is going on in the Perl world is to follow the Perl Iron Man blog aggregator at http://ironman.enlightenedperl.org. The definitive book about Perl is called Programming Perl. The third
edition has been out for rather a long time now, but a fourth edition is due to be published later this year. The best book for learning Perl is called, unimaginatively, Learning Perl and the sixth edition was published this summer. There are two more books in this series called Intermediate Perl and Mastering Perl. Perl user groups are known as Perl Mongers. You can get in touch with your nearest Perl Monger group by visiting the website at http://pm.org.
They are very similar, so I’ll just show the start one here:

    sub start {
      my $isbn = shift || die "No ISBN to start\n";
      my $schema = get_schema();
      my $books_rs = $schema->resultset('Book');
      my ($book) = $books_rs->search({ isbn => $isbn });
      unless ($book) {
        die "ISBN $isbn not found in db\n";
      }
      $book->started(DateTime->now);
      $book->update;
      say 'Started to read ', $book->title;
    }

A lot of this looks very standard by now. We check we’ve been given an ISBN and then we get a schema object and a book resultset object. We use the search method to get the book object from the database (and die if it can’t be found). Then we use the started method to update that column and call the update method to save the changes back to the database. When we set up our database classes using dbicdump we asked for an extra component called InflateColumn::DateTime to be included. This is where we see the advantage of that. It identifies any date and time columns in the database and converts those values into Perl DateTime objects in our program. So we can create a Perl DateTime object using the class’s now method and DBIx::Class will automatically convert that into the appropriate string to be stored in the database. The end sub-command looks very similar to this, with only the column name changed from started to ended.
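To get a feel for the kind of string that ends up in a datetime column, here is a rough sketch using only core Perl. It is an illustration of the format, not of how DBIx::Class performs the conversion: mysql_datetime is a hypothetical helper, and we assume the time is wanted in UTC.

```perl
use strict;
use warnings;
use 5.010;
use POSIX qw(strftime);

# Format an epoch time as a 'YYYY-MM-DD HH:MM:SS' string in UTC,
# which is the layout MySQL uses for datetime values.
sub mysql_datetime {
    my ($epoch) = @_;
    return strftime('%Y-%m-%d %H:%M:%S', gmtime($epoch));
}

say mysql_datetime(time);
```

With InflateColumn::DateTime loaded we never have to build these strings ourselves, which is the advantage described above.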
Quick tip There’s a useful program called perltidy, which will tidy up Perl code. It’s almost certainly available prepackaged for your distribution.
Write your own

We now have a working system. We can add books, start reading books, finish reading books and see what the current state of our reading list is. Having already added a book to the system above, try running the following commands:

    $ ./book list
    $ ./book start 0330258648
    $ ./book list
    $ ./book end 0330258648
    $ ./book list

You’ll see the book moving between the different sections of the report.
Build a web app Dancer is a Perl framework for building web applications, and Dave Cross discovers it’s an ideal way to expand his simple reading list program.
In the previous tutorial we built a simple command-line program to manage a reading list. We could add books to it and note when we started and finished reading them. At any time the program would display a list of books that we were reading, that we’d read and that were waiting in the pile. Command-line programs aren’t particularly pretty, however. It would be nicer to display these lists on a web page. Perl is, of course, a great language for doing this and in this article we’ll build a web application that displays our reading list. We’ll be using the Dancer framework, as it’s well suited to the simple web page we’re going to build. Dancer is one of a number of Perl frameworks, though, so have a look at the box for a brief discussion of some of the alternatives. As well as the modules we installed in the previous tutorial, we’re going to need some more modules from CPAN (http://metacpan.org). Not least of these is Dancer itself. Fortunately, Dancer is available from the package repositories of most major Linux distributions, so you’ll just need to run yum install perl-Dancer, apt-get install libdancer-perl or something similar for your version of Linux. The Dancer installation includes a number of useful Dancer tools, but we’ll also need one which is distributed separately on CPAN. This is called Dancer::Plugin::DBIC and will make it easy for us to take the DBIx::Class libraries that we built in the previous article and use them with Dancer.
The first Dance

Dancer makes it easy to start writing a web application. Once it’s installed, you get a command-line program called dancer which helps you to create the skeleton of an application. All you need to do is type:

    $ dancer -a BookWeb

This will create a directory called BookWeb and fill it with the beginnings of a Dancer application. Move into this directory and take a look at the files. We’ll be editing these later, but already Dancer has given us enough to demonstrate a running program. One of the directories created was called bin, and within that directory you’ll see a single file called app.pl. That’s our web application, so let’s run it:

    $ ./bin/app.pl
    [29957] core @0.000009> loading Dancer::Handler::Standalone handler in /usr/share/perl5/vendor_perl/Dancer/Handler.pm l. 41
    [29957] core @0.000274> loading handler 'Dancer::Handler::Standalone' in /usr/share/perl5/vendor_perl/Dancer.pm l. 366
    >> Dancer 1.3072 server 29957 listening on http://0.0.0.0:3000
    == Entering the development dance floor ...

If you open a browser and visit http://localhost:3000/ you’ll see your application. It doesn’t do very much right now, but it looks pretty and contains useful links to pages that will help you to learn more about Dancer.
Creating our web pages

The first thing that we’re going to do is undo all of that nice HTML formatting and replace it with our own pages. The HTML is stored in two files. There’s a layout file in views/layouts/main.tt and the content of the page is in views/index.tt. There’s also a stylesheet in public/css/style.css. Immediately, you can see how Dancer splits up the files it serves. Files that are processed in some way to produce output are stored under views. Static files, such as stylesheets and images, are stored under public. Open the file views/layouts/main.tt in a text editor. All we’re going to do here is remove the line that loads jQuery. Our application won’t get complex enough to use JavaScript so this is unnecessary. Whilst editing the file, notice the <% content %> tag that follows the HTML <body> tag. This is an example of a Dancer Template tag. Tags are processed by Dancer and replaced with other text. The <% content %> tag will be replaced by the contents of whichever template is used for the current request. In this example, we’re dealing only with a single request and that’s a request for the index page of the application. The template for that page is views/index.tt and that’s the next
The default Dancer application is a good way for the programmer to start learning what it’s all about.
file we need to look at. It’s really up to you how much you change this file. I removed most of the text, left a few of the <div> elements and changed the headers. My file ended up looking like this:

    <div id="page">
      <div id="sidebar">
      </div>
      <div id="content">
        <div id="header">
          <h1>BookWeb</h1>
          <h2>Here's your reading list</h2>
        </div>
      </div>
    </div>
Web Frameworks in Perl Writing a web application is a complex process, but using a framework can make the job much easier. There are a number of web frameworks for Perl and you can get them all from CPAN. In this article we’ve used Dancer, which is based on the Ruby framework, Sinatra. It defines a web application as a number of routes which are the HTTP requests that the application will respond to. This approach makes it easy to get an application up and
running quickly. You can get more information at http://perldancer.org. The best known Perl web framework is Catalyst, a very powerful and flexible framework. It’s behind a number of well-known web applications. Visit www.catalystframework.org. Another alternative is Mojolicious, which concentrates on making things as simple as possible while not compromising on power or flexibility. Take a look at http://mojolicio.us.
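The idea of routes mentioned in this box can be sketched in a few lines of plain Perl. This is a deliberately simplified illustration and not how Dancer is implemented: %routes and handle are hypothetical names, and routes are matched by exact method and path only.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical route table: "METHOD path" => code to run.
my %routes = (
    'GET /' => sub { return 'index page' },
);

# Find the handler for a request and run it,
# or fall back to a 'not found' response.
sub handle {
    my ($method, $path) = @_;
    my $handler = $routes{"$method $path"};
    return $handler ? $handler->() : '404 Not Found';
}

say handle('GET', '/');
say handle('GET', '/missing');
```

A real framework adds pattern matching on paths, parameters and templates on top of this basic lookup.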
At last some Perl

But this is supposed to be a Perl tutorial, so it’s about time we wrote some Perl code. We know that bin/app.pl is the file that drives the program, but if you look in there you’ll see that it’s very simple:

    #!/usr/bin/env perl
    use Dancer;
    use BookWeb;
    dance;

It loads the Dancer library and the BookWeb library, and then calls Dancer’s dance function. All of the real work goes on in the BookWeb library. And that lives in the lib/BookWeb.pm file. So let’s have a look at that:

    package BookWeb;
    use Dancer ':syntax';

    our $VERSION = '0.1';

    get '/' => sub {
      template 'index';
    };

    true;

Again, there’s not much there yet. But you can see how Dancer responds to requests. In a Dancer application you define a number of routes and the Dancer request handler matches each incoming request against the definitions of the routes and runs the code associated with the first route that matches. A route is a combination of an HTTP request type (GET, POST, etc) and a path. Currently, we have only one route defined, which handles a GET request to the root of our application. Any request that doesn’t match that definition will be handled by Dancer’s default ‘resource not found’ handler, which will send a 404 response to the browser. If a request does match the route, then the code associated with that route is run. In this case, that just renders the index template that we just cleared out. A lot of Dancer’s power is in the new keywords, such as template, that it makes available to your application. The template keyword hides quite a lot of work – searching the filesystem to find the templates, dealing with the expansion of variables and embedding the content templates inside the layout template. So, if we want to put useful things into our web page we need to pass variables to the template call. Specifically, we will want to pass in the lists of books that we’re reading, have read and are planning to read. The call will then look a bit like the following:
    template 'index', {
      reading => \@reading,
      read => \@read,
      to_read => \@to_read,
    };

We’ve added another parameter to the template call. This parameter is a hash reference, which contains details of the data we need while processing the template. The keys of the hash are names that we can use within the template (reading, read and to_read) and the values are arrays of books. Actually, they are references to arrays of books, as you can’t store an array in a Perl hash, but you can store a reference to an array. See the perldoc perlreftut manual page for more explanation of this.
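Storing array references in a hash, and getting the arrays back out again, looks like this in plain Perl. It is a small sketch with made-up book titles; count_in is a hypothetical helper that shows the same dereference applied through a hash reference.

```perl
use strict;
use warnings;
use 5.010;

my @reading = ('Book B');
my @to_read = ('Book C', 'Book D');

# A hash value must be a single scalar, so we store
# references to the arrays rather than the arrays themselves.
my %vars = (
    reading => \@reading,
    to_read => \@to_read,
);

# Given a hash reference and a key, count the items in that array.
sub count_in {
    my ($vars, $key) = @_;
    return scalar @{ $vars->{$key} };
}

# @{ ... } turns a reference back into the array it points to.
for my $key (sort keys %vars) {
    say "$key: ", join(', ', @{ $vars{$key} });
}
```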
Getting data out of a database

The next thing that we need to do is to populate those arrays. As we did in the previous article, we’re going to get that data from our database, and we’re going to use the Object Relational Mapper DBIx::Class in the same way as we did last time. Dancer has a number of plugins available on CPAN, and one of them is called Dancer::Plugin::DBIC. That will give us a ‘schema’ keyword which allows us to get easy access to the DBIx::Class schema object that we use to talk to the database. Once we’ve installed Dancer::Plugin::DBIC from CPAN, we need to configure our Dancer application to use it. We do that by editing the config.yml file, which is in the BookWeb directory. At the end of this file, add the following lines:

    plugins:
      DBIC:
        book:
          schema_class: Book
          dsn: dbi:mysql:database=books
          user: books
          pass: README

You’ll recognise this as the connection information that we used to talk to the database in the previous tutorial. The only new information is the schema_class value, which is the name of the Book.pm class that we created to enable us to interact with the database. Our new application will need access to that class and the easiest, if slightly low-tech, way to achieve that is to copy Book.pm into the application’s lib directory. You’ll also need to copy the Book sub-directory that contains the other database classes (Book/Result/Author.pm and Book/Result/Book.pm).
Quick tip The Perl motto is “There’s more than one way to do it”. So don’t be too surprised if you find Perl code that is written differently to my examples.
Modern Perl

While you’re editing config.yml you should also change the template engine. Dancer comes with support for two templating engines. The default engine is a simple one that isn’t quite powerful enough for our needs, so we need to change to use the Template Toolkit. Do that by changing the template section in config.yml to look like this:

    # template: "simple"
    template: "template_toolkit"
    engines:
      template_toolkit:
        encoding: 'utf8'
        start_tag: '<%'
        end_tag: '%>'

Notice that we’ve changed the template toolkit’s start and end tags from ‘[%’ and ‘%]’ to ‘<%’ and ‘%>’. That’s because our existing templates already contain these tags.

Quick tip
If you want advice on improving your Perl code, try the ‘perlcritic’ program that is probably available pre-packaged for your distribution.
Templating the output

With these libraries in place we can finally write code that accesses the database. Change BookWeb.pm so that our route looks like this:

    get '/' => sub {
        my $books_rs = schema->resultset('Book');

        my @reading = $books_rs->search({
            started => { '!=', undef },
            ended   => undef,
        });

        my (@read, @to_read);

        template 'index', {
            reading => \@reading,
            read    => \@read,
            to_read => \@to_read,
        };
    };

We’re using the schema keyword to access our schema object, but the rest of the database code is exactly the same as the code we used last time. We get a book resultset object and then use its search method to get an array of the books we’re interested in. Books that we’re currently reading have a not null value in the started column and a null value in the ended column. We’ll ignore the other two lists for now and look at how we deal with the data inside the index template.

Here’s what the index.tt file looks like once we’ve added code to handle the reading array.

    <div id="page">
      <div id="sidebar">
      </div>
      <div id="content">
        <div id="header">
          <h1>BookWeb</h1>
          <h2>Here’s your reading list</h2>
        </div>
        <h3>Reading</h3>
        <% IF reading.size %>
        <ul>
        <% FOREACH book IN reading %>
          <div class="book"><p><img src="<% book.image_url %>" />
          <a href="http://amazon.co.uk/dp/<% book.isbn %>"><% book.title %></a>
          <br />By <% book.author.name %></p>
          <p><% IF book.started %>Began reading: <% book.started.strftime('%d %b %Y') %>.<% END %>
          <% IF book.ended %>Finished reading: <% book.ended.strftime('%d %b %Y') %>.<% END %></p>
          </div>
        <% END %>
        </ul>
        <% ELSE %>
        <p>No books found.</p>
        <% END %>
      </div>
    </div>

The interesting bits are in the <% ... %> tags. You’ll see an if/else statement and a foreach loop. These are both standard programming constructs that act as you’d expect. The hash of variables that we passed into the template call had a key called reading, and that becomes the name that we use to refer to that data inside the template. As the variable is an array, we can call a size method to see how many elements it contains. If the array is empty, the if condition is false (Perl treats zero as a false value) so we execute the else code which displays no books found. If there are books in the list then we iterate over the list, putting each book in turn into a temporary variable called book.

Each of these book values is a DBIx::Class book object like the ones we used in the previous article. And that means they all have methods for each of the columns in the database table. We can use those to print the title and the author’s name. We also use the ISBN to construct a link back to the book’s page on Amazon. Notice that the started column returns a Perl DateTime object so that we can use that class’s strftime method to get a nicely formatted date. In order to make the page look a little more attractive, we can add the following tweaks to the stylesheet in public/css/style.css:

    img {
        float: left;
        margin: 0 5px 5px 0;
        clear: both;
    }
    .book, h3 {
        clear: both;
    }

All we need to do now is to write the code to retrieve and display the other two lists – the list of books we’ve read and the list of books we haven’t started. But that’s going to get a little repetitive, so before we do that let’s make a few changes to the index.tt to make our life as easy as possible. The Template Toolkit has the concept of ‘macros’. These are a bit like functions in the templating world. They make it easy to bundle up and re-use repetitive code. We can define a showbook macro that defines how we display a book and then call it whenever we want to show the details of a book. Here’s the macro:

    <% MACRO showbook(book) BLOCK %>
    <div class="book"><p><img src="<% book.image_url %>" />
    <a href="http://amazon.co.uk/dp/<% book.isbn %>"><% book.title %></a>
    <br />By <% book.author.name %></p>
    <p><% IF book.started %>Began reading: <% book.started.strftime('%d %b %Y') %>.<% END %>
    <% IF book.ended %>Finished reading: <% book.ended.strftime('%d %b %Y') %>.<% END %></p>
    </div>
    <% END %>

Re-usable bundles

It’s exactly the same code as before, just bundled in the MACRO ... BLOCK ... END syntax that makes it re-usable. With this block added to the top of the index template we can write the section that actually displays the books like this:

    <h3>Reading</h3>
    <% IF reading.size %>
    <% FOREACH book IN reading %>
    <% showbook(book) %>
    <% END %>
    <% ELSE %>
    <p>No books found.</p>
    <% END %>
    <h3>Read</h3>
    <% IF read.size %>
    <% FOREACH book IN read %>
    <% showbook(book) %>
    <% END %>
    <% ELSE %>
    <p>No books found.</p>
    <% END %>
    <h3>To Read</h3>
    <% IF to_read.size %>
    <% FOREACH book IN to_read %>
    <% showbook(book) %>
    <% END %>
    <% ELSE %>
    <p>No books found.</p>
    <% END %>

Arrays of light

Each section is exactly the same; they just work on different arrays. All we need to do now is to fill in the values of the read and to_read arrays that are passed to the template. We wrote code that did this in the previous article and we just need to replicate that in the route in BookWeb.pm. When we’ve finished, the complete route definition will look something like this:

    get '/' => sub {
        my $books_rs = schema->resultset('Book');

        my @reading = $books_rs->search({
            started => { '!=', undef },
            ended   => undef,
        });

        my @read = $books_rs->search({
            ended => { '!=', undef },
        });

        my @to_read = $books_rs->search({
            started => undef,
        });

        template 'index', {
            reading => \@reading,
            read    => \@read,
            to_read => \@to_read,
        };
    };

Next time

We still have to update the database using the book command-line program we wrote last time, but we have an attractive web version to display our current reading list.

The Template Toolkit

In this example we’ve used the Template Toolkit to build the HTML pages for our application. A templating engine is an essential tool for building a website of any complexity. A good templating engine will make it easy to separate the business logic of your application from the logic that displays the data to the users. The Template Toolkit is generally accepted as the most powerful and flexible templating engine for Perl. It has its own simplified display language, but a plugin system gives you easy access to much of the power of CPAN. Version 2 of TT has been around for several years, but version 3 is now getting close to being released. TT has a website at http://tt2.org which is not only a useful resource, but also a demonstration of the power of the toolkit. There’s also a book, Perl Template Toolkit, which is published by O’Reilly.

Routes in Dancer

We’ve said that Dancer is built around the concept of routes, but what exactly is a route and how does Dancer use them? A route is a combination of three things: an HTTP request method; a request path; and some code. When Dancer matches an incoming request with the method and the path, it runs the associated code. In our example, this was the definition of the only route.

    get '/' => sub { ... };

In this example, get is the HTTP method, ‘/’ is the path and the code is defined in the subroutine. When Dancer sees an HTTP GET request to the path ‘/’ (ie the root of the website) the code is run. All parts of the route definition can be more complex. You could match a POST request with the post keyword or any HTTP request method with any. You can also access data passed in the request path using code like this:

    get '/hello/:name' => sub {
        return "Hello " . params->{name};
    };

Once you’ve put a couple of months’ effort into reading the classics, your booklist will look something like this.

“Macros – like functions – make it easy to bundle and re-use repetitive code.”
Adding to our app The power of web frameworks is in how they take care of standard features. Dave Cross uses Dancer to add interactivity to his reading list program.
In the previous tutorial in this chapter, we added a web front-end to our reading list program, but this interface displayed only the contents of our database and we still needed to use the command-line program to change the data. In this tutorial, we’ll fix that by adding interactivity to our web application. By the end of this article, you won’t need the command-line program at all. This will involve two major changes to the web app. Firstly, we’ll add actions to deal with adding books to the reading list and starting and finishing books. But if you want to put your reading list on a public website, you don’t want just anyone to be able to edit it, so we’ll also implement a basic level of authorisation and authentication. As in the previous article, we’ll find that Dancer will make this all a lot easier than it would be doing it all from scratch.
How to read a book
The books application, with links allowing you to maintain your reading list.
We’ll start by adding routes to our application allowing us to start and finish reading books. We’ll do this before adding books to the list, as these actions are simpler. We’ll implement these actions by adding new route definitions to the BookWeb.pm file. Here’s the definition of the start route:

    get '/start/:isbn' => sub {
        my $books_rs = schema->resultset('Book');
        my $book = $books_rs->find({ isbn => param('isbn') });

        if ($book) {
            $book->update({ started => DateTime->now });
        }

        return redirect '/';
    };

Like all Dancer routes, this definition consists of an HTTP action (in this case get), a path and some code to execute when the first two items are matched. The path here is more complex than the path that we saw last time, as it contains a parameter. The URL that we want to use to start reading a book looks like http://example.com/start/1930110006. This will flag that you have started reading the book with ISBN 1930110006. Obviously, that ISBN value will change for different books, so we need a way to capture that parameter and use it in our code. In a Dancer route, you can match parameters with the :name syntax that you see in our definition. You can have more than one parameter defined in the route as long as they are named and separated by slashes. You access these parameters using Dancer’s param function.

The rest of the code will look familiar to anyone who read the first article in this chapter (beginning on p108) where we wrote the command-line version of this program. We get a resultset for our book table, search it for a book with the given ISBN and then update the started column in that object to be equal to the current date and time. You might also remember that the DBIx::Class tool that we are using for database access automatically converts between Perl DateTime objects and date/time columns in your database. Notice that if we don’t find a book with the given ISBN, then we do nothing. It might be worth displaying an error message at that point. Or, perhaps, redirecting to the add action (which we haven’t written yet).

Once we have updated the book record, we just use Dancer’s redirect function to redirect the browser back to the main page of the application. The user will then see that the chosen book has moved from the ‘To Read’ list to the ‘Reading’ list. The code for the end route is almost identical. Only the path and the database column will differ. The path will be /end/:isbn, and we’ll need to update the ended column in the database.
Adding new books

The next thing we need to do is to add new books to the list. Again, we’ll be repurposing code from the original command-line program. As we need to go to Amazon for details of the book, we need to create a Net::Amazon object. We’ll need this object in a couple of places, so we’ll write a get_amazon() subroutine that creates the object for us.

    sub get_amazon {
        # Build the object first: "return ... or die" would never reach
        # the die, because "or" binds more loosely than return.
        my $amz = Net::Amazon->new(
            token         => $ENV{AMAZON_KEY},
            secret_key    => $ENV{AMAZON_SECRET},
            associate_tag => $ENV{AMAZON_ASSTAG},
            locale        => 'uk',
        ) or die "Cannot connect to Amazon\n";

        return $amz;
    }

There’s nothing complicated here. It’s just calling the constructor on the Net::Amazon class and returning the object that is created. Annoyingly, Amazon has changed the way that this works since I wrote the first article in this series. See the Amazon API Changes boxout for more details. We can now define our add route. The path will be a similar format to the start and end routes. The code looks like this:

    get '/add/:isbn' => sub {
        my $author_rs = schema->resultset('Author');
        my $amz = get_amazon();

        # Search for the book at Amazon
        my $resp = $amz->search(asin => param('isbn'));
        unless ($resp->is_success) {
            die 'Error: ', $resp->message;
        }

        my $book = $resp->properties;
        my $title = $book->ProductName;
        my $author_name = ($book->authors)[0];
        my $imgurl = $book->ImageUrlMedium;

        # Find or create the author
        my $author = $author_rs->find_or_create({
            name => $author_name,
        });

        # Add the book to the author
        $author->add_to_books({
            isbn      => param('isbn'),
            title     => $title,
            image_url => $imgurl,
        });

        return redirect '/';
    };

In this function, we need to talk to both the database and Amazon, so the first thing we do is create an author resultset and a Net::Amazon object. We then search Amazon for the ISBN that we’ve been given. If we find it, we first create an author record (or find the existing one if we already know about this author) and then insert details of the book. Once again, when we’ve finished, we just need to redirect to the front page and the user will see their new book in the ‘To Read’ list.
Amazon API changes It’s rare for a big company such as Amazon to make changes to its web service’s API in such a way that it breaks a lot of existing code. But, unfortunately, that’s exactly what happened at some point after I wrote the previous article in this series. In the older version of the API you needed a key and a secret. These values were passed to Net::Amazon as you created the object. Amazon has now added a third mandatory parameter – your Amazon Associates ID. Like the other two parameters, you can get this value from your Amazon web services account information.
The Net::Amazon module checks that you have given it all of the mandatory parameters when you call its constructor method. Older versions checked for the key and the secret. But once the API change was introduced those parameters weren’t enough and any API calls were failing with an error about the missing parameter. Version 0.61 of Net::Amazon adds the associates ID to the list of mandatory parameters. The new version of the call is shown in the code in this article. I recommend that you update your version of Net::Amazon to avoid any potential problems.
Adding links

That’s all very well, but currently the only way to access our new routes is by typing addresses, including the ISBNs, into the location bar in your browser. That’s hardly user-friendly. Let’s fix that by adding links to the list of books. In the file views/index.tt, we have a macro called showbook which is responsible for displaying an individual book in the main list. We can edit that and have the links appear for every book. Once the links have been added, the macro looks like this:

    <% MACRO showbook(book) BLOCK %>
    <div class="book"><p><img src="<% book.image_url %>" />
    <a href="http://amazon.co.uk/dp/<% book.isbn %>"><% book.title %></a>
    <br />By <% book.author.name %></p>
    <p><% IF book.started %>Began reading: <% book.started.strftime('%d %b %Y') %>.<% END %>
    <% IF book.ended %>Finished reading: <% book.ended.strftime('%d %b %Y') %>.<% END %></p>
    <% IF book.started AND NOT book.ended -%>
    <p><a href="/end/<% book.isbn %>">Finish book</a></p>
    <% ELSIF NOT book.started -%>
    <p><a href="/start/<% book.isbn %>">Start book</a></p>
    <% END %>
    </div>
    <% END %>

Our additions are towards the end. If the book has a value in the start date but no value in the end date then it must be in the reading list and we display a finish book link. If it has no start date then it must be in the to read list and we display a start book link. If we make these changes and start our application (with bin/app.pl), you should see these links appearing next to the books – assuming that you have books on the list. And that brings us neatly to the next problem. We need a better way to add books to the list. Let’s do it by searching Amazon.

“We need a better way to add books to the list. Let’s do it by searching Amazon.”

Quick tip
The best book about Perl is Programming Perl. The fourth edition has just been published.
Amazon exploration

The best place for a search box is in a sidebar that appears on every page. Our sidebar is defined in views/layouts/main.tt. Edit the sidebar div so it looks like this:

    <div id="sidebar">
    <form method="POST" action="/search"><p>Search Amazon:
    <input name="search" value="<% search %>" />
    <input type="submit" value="Search" /></p></form>
    </div>

That will put a search box on every page in our application. But now we need to write code to carry out the search and display the results. Notice in the form definition we’ve said that the form sends a POST request to /search. That gives us a couple of clues as to how our route definition should look:
Quick tip
There are a huge number of blogs dedicated to Perl programming. Many of the best ones are collected at http://mgnm.at/ironman.
    post '/search' => sub {
        my $amz = get_amazon();
        my $resp = $amz->search(
            keyword => param('search'),
            mode    => 'books',
        );

        my %data;
        $data{search} = param('search');
        if ($resp->is_success) {
            $data{books} = [ $resp->properties ];
        } else {
            $data{error} = $resp->message;
        }

        template 'results', \%data;
    };

We need a Net::Amazon object in order to search Amazon, so we get that first. We can then use the same search method as we used before, but with different arguments. We tell Amazon that we’re looking for a book and that the keyword we’re looking for is the search term that the user has given us. If the search is successful then the books that match are retrieved by calling the properties method on the response object. We put that list in a hash called %data, along with the text we searched for, and pass that to the results template. Which means we need to create a template called views/results.tt. It looks like this:

    <h1>BookWeb - Search Results</h1>
    <% IF error -%>
    <p class="error"><% error %></p>
    <% ELSE %>
    <p>You searched for: <b><% search %></b></p>
    <% IF books.size %>
    <ul>
    <% FOREACH book IN books -%>
    <li><b><% book.title %></b> (<% book.authors.list.0 %>) <a href="/add/<% book.isbn %>">Add to list</a></li>
    <% END %>
    </ul>
    <% ELSE %>
    <p>Your search returned no results.</p>
    <% END %>
    <% END %>

There’s a bit of code there for displaying an error if the search failed and for displaying a “no results” message, but most of the code is used to display a list of books that are returned from Amazon. For each book in the list we display the title, the author and a link to add the book to our reading list. If you save these changes and restart the application, you should find that you have a fully functional website that now allows you to do anything that our original command-line program did. You can add new books to the list and tell the system when you start and finish a book. The only problem is that anyone else can do all of that too.
“Presumably you’d like to display your reading list to anyone who is interested.”
Deploying your application

In the past two articles, we’ve been using Dancer’s built-in test web server to run our web application. But if you find the app to be useful, you’ll eventually want to deploy it on a real, public web server. There are a number of different options. Dancer is built on top of Perl technology called PSGI, which is a protocol that defines the interactions between a web application and the web hosting environment where the application runs. If you have a PSGI-compatible application, then it’s simple enough to deploy it in any PSGI-ready web hosting environment. And as any Dancer application is already PSGI compatible, you can deploy it just about anywhere. Details of some common deployment scenarios are in the Dancer::Deployment manual page, which comes as part of the standard Dancer distribution. Just enter perldoc Dancer::Deployment at the command line. For more details of PSGI (and Plack, a reference implementation of the specification), see the project’s website at http://plackperl.org.

The search results page. Amazon seems to have a rather liberal definition of ‘Perl’.
Adding security

Presumably, you’d like to display your reading list to anyone who’s interested, but you’d prefer it if only you can update it. For that we need to introduce some security. We’re going to use some really basic authentication, but I hope it will be obvious how to extend it for use in the real world. We’re going to add the concept of a logged in user. And we’re going to store whether the current user is logged in or logged out using a session cookie. Support for sessions comes as a part of the standard Dancer distribution, but in order to store your session in a cookie, you will need to install the extra Dancer::Session::Cookie module from CPAN. Having installed the module, you need to configure it by adding the following two lines to your config.yml file:

    session: cookie
    session_cookie_key: somerandomnonsense

The value of the cookie key can be any random string – the more random the better. Mine isn’t a great example. In order to add session support, we need to add use Dancer::Session to the list of modules near the top of BookWeb.pm. Now we need to think about how our security will work. I’m going to define a list of paths that are public. Anyone can see those pages, but anyone trying to access pages outside of this list will be prompted to log in if they haven’t already. Dancer has the concept of a before hook which is fired before any route is run. That’s a perfect place to check whether the user is allowed to do whatever they are trying to do:

    my %public_path = map { $_ => 1 } ('/', '/login', '/search');

    hook before => sub {
Python

Python is the Swiss Army knife of programming languages. Its range of add-on modules means you can do almost anything quickly and easily. Here we’re going to discover how to create Clutter graphics, code a Gimp plugin and have fun hacking Minecraft on a Raspberry Pi.

Python: Different types of data
Python: Code a system monitor
Python: Clutter animations
Python: Stream video
Python: Code a Gimp plugin
Python: Gimp snowflakes
Python: Make a Twitter client
Minecraft: Start hacking
Minecraft: Image wall importing
Minecraft: Make a trebuchet
Minecraft: Build a cannon
Python: Different types of data Functions tell programs how to work, but it’s data that they operate on. Nick Veitch goes through the basics of data in Python.
In this article, we’ll be covering the basic data types in Python and the concepts that accompany them. In later articles, we’ll look at a few more advanced topics that build on what we do here: data abstraction, fancy structures such as trees, and more.
What is data?
While we’re looking only at basic data types, in real programs getting the wrong type can cause problems, in which case you’ll see a TypeError.
In case you’ve hopped straight to this chapter, let’s go back to basics. In the world, and in the programs we write, there’s an amazing variety of different types of data. In a mortgage calculator, for example, the value of the mortgage, the interest rate and the term of the loan are all types of data; in a shopping list program, there are all the types of food and the list that stores them – each of which has its own kind of data. The computer’s world is a lot more limited. It doesn’t know the difference between all these data types, but that doesn’t stop it from working with them. The computer has a few basic ones it can work with, and that you have to use creatively to represent all the variety in the world. We’ll begin by highlighting three data types: first, we have numbers. 10, 3 and 2580 are all examples of these. In particular, these are ints, or integers. Python knows about other types of numbers, too, including longs (long integers), floats (such as 10.35 or 0.8413) and complex (complex numbers). There are also strings, such as ‘Hello World’, ‘Banana’ and ‘Pizza’. These are identified as a sequence of characters enclosed within quotation marks. You can use either double or single quotes. Finally, there are lists, such as [‘Bananas’, ‘Oranges’, ‘Fish’]. In some ways, these are like a
string, in that they are a sequence. What makes them different is that the elements that make up a list can be of any type. In this example, the elements are all strings, but you could create another list that mixes different types, such as [‘Bananas’, 10, ‘a’]. Lists are identified by the square brackets that enclose them, and each item or element within them is separated by a comma.
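To see these basic types side by side, here is a minimal sketch. It is written in Python 3 syntax (an assumption on our part; the interactive sessions in this article come from Python 2, which prints types as <type 'int'>), and the describe helper and the samples name are hypothetical, introduced only for illustration.

```python
def describe(value):
    """Return the name of a value's type, such as 'int' or 'list'."""
    return type(value).__name__

# One value of each basic type, plus a list that mixes several types.
samples = [10, 10.35, 2 + 3j, "Banana", ["Bananas", 10, "a"]]

type_names = [describe(value) for value in samples]
print(type_names)  # ['int', 'float', 'complex', 'str', 'list']

# The elements of the mixed list keep their own individual types.
mixed_names = [describe(element) for element in samples[-1]]
print(mixed_names)  # ['str', 'int', 'str']
```

Note that Python 3 has no separate long type: ordinary ints grow as large as they need to.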
Working with data

There are lots of things you can do with the different types of data in Python. For instance, you can add, subtract, divide and multiply two numbers and Python will return the result:

    >>> 23 + 42
    65
    >>> 22 / 11
    2

If you combine different types of numbers, such as an int and a float, the value returned by Python will be of whatever type retains the most detail – that is to say, if you add an int and a float, the returned value will be a float. You can test this by using the type() function. It returns the type of whatever argument you pass to it.

    >>> type(8)
    <type 'int'>
    >>> type(23.01)
    <type 'float'>
    >>> type(8 + 23.01)
    <type 'float'>

You can also use the same operations on strings and lists, but they have different effects. The + operator concatenates, that is combines together, two strings or two lists, while the * operator repeats the contents of the string or list.

    >>> "Hello " + "World"
    'Hello World'
    >>> ["Apples"] * 2
    ['Apples', 'Apples']

Strings and lists also have their own special set of operations, including indexing and slices. These enable you to select a particular part of the sequence by its numerical index, which begins from 0.

    >>> word = "Hello"
    >>> word[0]
    'H'
    >>> word[3]
    'l'
    >>> list = ['banana', 'cake', 'tiffin']
    >>> list[2]
    'tiffin'

Indexes work in reverse, too. If you want to reference the last element of a list or the last character in a string, you can use the same notation with a -1 as the index. -2 will reference the second-to-last character, -3 the third-to-last, and so on. Note that when working backwards, the indexes don’t start at 0.
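The indexing rules above, including the reverse indexes, can be sketched in a few lines. This is an illustration in Python 3 syntax rather than a transcript of the article’s Python 2 session, and foods is a hypothetical name chosen so that the built-in list type is not overwritten.

```python
word = "Hello"
foods = ["banana", "cake", "tiffin"]  # hypothetical name; avoids shadowing list

first_char = word[0]     # 'H': indexes count from 0
last_char = word[-1]     # 'o': -1 is the last element
second_last = foods[-2]  # 'cake': -2 is the second-to-last
middle = word[1:4]       # 'ell': a slice from index 1 up to, but not including, 4

greeting = "Hello " + "World"  # + joins two sequences of the same type
repeated = ["Apples"] * 2      # * repeats the contents

print(first_char, last_char, second_last, middle)
print(greeting, repeated)
```

A slice always returns a new sequence of the same type as the original, so slicing a string gives a string and slicing a list gives a list.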
Methods

Lists and strings also have a range of other special operations, each unique to that particular type. These are known as methods. They’re similar to functions such as type() in that they perform a procedure. What makes them different is that they’re associated with a particular piece of data, and hence have a different syntax for execution. For example, among the list type’s methods are append and insert.

    >>> list.append('chicken')
    >>> list
    ['banana', 'cake', 'tiffin', 'chicken']
    >>> list.insert(1, 'pasta')
    >>> list
    ['banana', 'pasta', 'cake', 'tiffin', 'chicken']

As you can see, a method is invoked by placing a period between the piece of data that you’re applying the method to and the name of the method. Then you pass any arguments between round brackets, just as you would with a normal function. It works the same with strings and any other data object, too:

    >>> word = "HELLO"
    >>> word.lower()
    'hello'

There are lots of different methods that can be applied to lists and strings, and to tuples and dictionaries (which we’re about to look at). To see the order of the arguments and the full range of methods available, you’ll need to consult the Python documentation.
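One detail worth seeing in code is that the list methods used above change the list in place, while a string method such as lower hands back a new string and leaves the original alone. The sketch below uses Python 3 syntax, and the names foods, shout and quiet are hypothetical.

```python
foods = ["banana", "cake", "tiffin"]

# append and insert modify the existing list and return None.
append_result = foods.append("chicken")
foods.insert(1, "pasta")
print(foods)          # ['banana', 'pasta', 'cake', 'tiffin', 'chicken']
print(append_result)  # None

# Strings are immutable, so lower() returns a new string instead.
shout = "HELLO"
quiet = shout.lower()
print(shout, quiet)   # HELLO hello
```

This is why list.append('chicken') prints nothing at the interactive prompt, whereas word.lower() prints its result.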
Variables

In the previous examples, we used the idea of variables to make it easier to work with our data. Variables are a way to name different values – different pieces of data. They make it easy to manage all the bits of data you’re working with, and greatly reduce the complexity of development (when you use sensible names). As we saw above, in Python you create a new variable with an assignment statement. First comes the name of the variable, then a single equals sign, followed by the piece of data that you want to assign to that variable. From that point on, whenever you use the name assigned to the variable, you are referring to the data that you assigned to it. In the examples, we saw this in action when we referenced a character in a string or the third element in a list by appending index notation to the variable name. You can also see this in action if you apply the type() function to a variable name:

    >>> type(word)
    <type 'str'>
    >>> type(list)
    <type 'list'>
Other data types
There are two other common types of data that are used by Python: tuples and dictionaries. Tuples are very similar to lists – they’re a sequence data type, and they can contain elements of mixed types. The big difference is that tuples are immutable – that is to say, once you create a tuple you cannot change it – and that tuples are identified by round brackets, as opposed to square brackets: ('bananas', 'tiffin', 'cereal'). Dictionaries are similar to a list or a tuple in that they contain a collection of related items. They differ in that the elements aren’t indexed by numbers, but by ‘keys’, and are created with curly brackets: {}. It’s quite like an English language dictionary. The key is the word that you’re looking up, and the value is the definition of the word. With Python dictionaries, however, you can use any immutable data type as the key (strings are immutable, too), so long as it’s unique within that dictionary. If you assign to an already existing key, its previous association is forgotten completely and that data is lost for ever.

>>> english = {'free': 'as in beer', 'linux': 'operating system'}
>>> english['free']
'as in beer'
>>> english['free'] = 'as in liberty'
>>> english['free']
'as in liberty'
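To see immutability and key replacement for yourself, try the following sketch. The names meals, tuple_error and list_key_rejected are ours, chosen purely for illustration.

```python
# Tuples cannot be changed after creation.
meals = ('bananas', 'tiffin', 'cereal')
try:
    meals[0] = 'toast'
except TypeError as err:
    tuple_error = str(err)   # Python refuses the assignment

# Assigning to an existing dictionary key replaces the old value.
english = {'free': 'as in beer', 'linux': 'operating system'}
old_meaning = english['free']
english['free'] = 'as in liberty'

# A tuple of strings is immutable, so it can serve as a key...
english[('open', 'source')] = 'two words, one key'

# ...but a list is mutable, so Python rejects it as a key.
try:
    english[['open', 'source']] = 'not allowed'
except TypeError:
    list_key_rejected = True

print(old_meaning, '->', english['free'])
```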
The Python interpreter is a great place to experiment with Python code and see how different data types work together.
Looping sequences
One common operation that you may want to perform on any of the sequence types is looping over their contents to apply an operation to every element contained within. Consider this small Python program:

list = ['banana', 'tiffin', 'burrito']
for item in list:
    print item

First, we created the list as we would normally, then we used the for… in… construct to perform the print function on each item in the list. The second word in that construct doesn’t have to be item; that’s just a variable name that gets assigned temporarily to each element contained within the sequence specified at the end. We could have written for letter in word and it would have worked just as well. That’s all we have time to cover in this article, but with the basic data types covered, we’ll be ready to look at how you can put this knowledge to use when modelling real-world problems in later articles. In the meantime, read the Python documentation to become familiar with some of the other methods that it provides for the data types we’ve looked at before. You’ll find lots of useful tools, such as sort and reverse!
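The same for… in… construct works on any sequence. Here is a small sketch, using print() with brackets so that it runs under Python 3 as well; the names foods, letters and numbered are just illustrative.

```python
foods = ['banana', 'tiffin', 'burrito']

# Loop over a list, one element at a time.
for item in foods:
    print(item)

# Loop over a string, one character at a time.
letters = []
for letter in 'tiffin':
    letters.append(letter)
print(letters)

# enumerate() hands back the index alongside each element.
numbered = []
for position, item in enumerate(foods):
    numbered.append((position, item))
print(numbered)
```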
Python
Python: Code a system monitor Tidying up some code with Clutter, Nick Veitch takes you far from the command line into a new realm of technicolour graphical possibilities.
It’s a bit dark in here… Could be the promising start of a 3D adventure game, perhaps. Or your first Clutter effort!
We have touched on a few web-based wonders you can build with Python elsewhere in this guidebook, but we are going to do a rare thing now and cover using a GUI to display stuff graphically to the user. One of the reasons for this being unusual is that, for the most part, GUI code gets very big very quickly, so a whole tutorial would be taken up by just drawing a panel and a few buttons on the screen. We’re going to take a break from being so user-unfriendly for a while, as for the next few pages we’re going to be building applications using the PyClutter library. If you don’t know much about Clutter, check the boxout (A Note About Versions) over the page. For the first tutorial we’re going to build a small but useful little utility to get to grips with how Clutter and PyClutter work. As Clutter has a dearth of documentation and examples, hopefully the code we will cover here will give you an idea of how we can use it practically within our Python apps. Our task here is to create an app that will show us the current network speeds for our internet connection. Yes, there are plenty of monitors out there, but this will be our own, and delivered in about 70 lines of simple code. The first thing you need to get to grips with in Clutter is the basic terminology. Unlike other GUI toolkits, which usually
define objects like windows or panels, Clutter refers to the visual area as a ‘stage’. To continue the analogy, objects that appear on (or actually, in, but it sounds weird to say it) the stage are called ‘actors’. It makes more sense when you start coding it, and the names don’t seem so strange after a while. The thing about the actors is that they have more properties than a standard widget because they actually exist in a 3D environment, rather than a 2D one.
All the world’s a stage
Anyway, enough hyperbabble – it will make more sense when we write some code. Open up your standard Python environment (mine is a Bash shell, but you can use some of those fancy ones if you like), and let’s create our very first Clutter script…

>>> import clutter
>>> stage = clutter.Stage()
>>> stage.set_size(500,300)
>>> red=clutter.Color(255,0,0,255)
>>> black=clutter.Color(0,0,0,255)
>>> stage.set_color(black)
>>> stage.show_all()

When you’re done, click the Close gadget on the window that opened. I know it didn’t do anything amazing, but it does have the potential to! Let’s take a look at what just happened. The first line obviously loaded the Clutter module. In turn,
Clutter opens a few more modules itself – back-end stuff that links into display libraries to be able to put things on the screen. Next up we created a stage object. The stage is like a viewport – an area where your actor objects can play. Setting the attributes is as simple as calling some methods for the stage class, in this case a size and a colour. The parameters for the size method are x and y dimensions, and the colour is taken from the clutter.Color object (which takes values for RGB and alpha). As with other GUI toolkits, we should cause the object to be shown before any of it is drawn on the screen, which is what the final command does. But what of our actors, the objects that we want to show on the screen? Let’s add some text objects:

>>> a=clutter.Text()
>>> a.set_font_name("Sans 30")
>>> a.set_color(red)
>>> a.set_text("Hello World!")
>>> a.set_position(130,100)
>>> stage.add(a)

Now we’ve added a text object, our first actor. Hopefully it will be fairly clear what the methods are doing – picking a font, a colour, setting the text string and positioning it on the stage. The final call in the code example adds the actor to the stage; until that call is made, you won’t be able to see it. Now that it’s there though, you can continue to play around with it – try setting it to a different position or adding new colours. As I mentioned earlier, the PyClutter documentation is scanty, but we can gain some solace in the fact that Python has good introspection. Try typing in dir(a) at this point to see the methods and attributes available for this object. Our next step is to build a running script, but there’s something we haven’t covered yet: for all the Clutter magic to work properly, we should turn control of the application over to the clutter.main() function, but we don’t want to do that without some way to exit the program. In such situations, Python will not respond to Ctrl+C interrupts while the main loop is running, so we will have no way of quitting.
The answer is to provide some keyboard events. When the stage window is active, Clutter will receive signals for keypresses. All we need to do is provide a callback function that will process that event, and if the correct key has been pressed, quit out of the main loop. You could also assign other actions to some keys, like changing the colour of the stage for example.

>>> def parseKeyPress(self, event):
...     if event.keyval == clutter.keysyms.q:
...         clutter.main_quit()
...     elif event.keyval == clutter.keysyms.r:
...         self.set_color(red)
...
>>> stage.connect('key-press-event', parseKeyPress)
>>> clutter.main()

When run in the interactive Python shell, the quit function will not quit Python itself, or even destroy the application; it will just return control to the Python shell. In the case of a running script though, calling the clutter.main_quit() function will effectively end the application, or at least the Clutter part of it.
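If you end up assigning lots of keys, a lookup table can be tidier than a long chain of elif tests. The following is only a sketch of that idea in plain Python: FakeStage, KEY_COLOURS and handle_key are hypothetical names of our own and are not part of Clutter, and keys are passed as simple strings rather than Clutter event objects.

```python
# Hypothetical stand-ins so the idea can be tried without Clutter installed.
RED = (255, 0, 0, 255)
GREEN = (0, 255, 0, 255)
BLUE = (0, 0, 255, 255)
KEY_COLOURS = {'r': RED, 'g': GREEN, 'b': BLUE}


class FakeStage(object):
    """Remembers a colour and a running flag; NOT a Clutter class."""

    def __init__(self):
        self.color = (0, 0, 0, 255)
        self.running = True

    def set_color(self, color):
        self.color = color


def handle_key(stage, key):
    """Quit on 'q', recolour on a known key, ignore everything else."""
    if key == 'q':
        stage.running = False
    elif key in KEY_COLOURS:
        stage.set_color(KEY_COLOURS[key])


stage = FakeStage()
for key in ['r', 'x', 'b', 'q']:
    handle_key(stage, key)
print(stage.color, stage.running)
```

Adding a new colour key then means adding one entry to the dictionary rather than another branch to the callback.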
The traditional first app, though rather daringly we have left out the comma.
Time to monitor something Right, now we have the interface sorted out, how are we going to build an amazing bandwidth monitor? We first need to find out the speed of the network traffic. Whenever I am confronted with a question about some piece of system statistics, I always go and ask my old friend, proc. Yes, the /proc pseudo filesystem is the repository of everything you ever needed to know about a running Linux box. proc is a huge sprawling mess of files, but the one we want is /proc/net/dev. This lists all the network devices, and reading the file will give you statistics on bytes in and out, packets,
A note about versions The Clutter library, and consequently the Python module that uses the Clutter library, has been updated recently to version 1.18.0. Normally updates may cause a few inconsistencies between old versions and new versions of software, but in this case there are fundamental differences between the code of versions before and after 0.9. The PyClutter module and the Clutter library should be available in your distro’s repository, but when you install it, make sure you have a version 0.9 (preferably 1.0) or above, otherwise I can guarantee you that none of the code in this tutorial will work. If you think that’s a faff, you ought to try writing a tutorial and then discovering the whole library changes…
Messing around in the interactive shell is a quick, safe way of finding out about Clutter objects and methods.
Quick tip
Keeping track of versions can be a nightmare, but most modules store their version number in <modulename>.__version__. Not only is this useful for you to check, but your applications can check for a compatible version before they try to do anything tricky.
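Here is one way an application might make that check. It is a sketch only: version_tuple and is_compatible are names we have made up, and whether a given module sets __version__ at all (and in what form) is up to its authors, so treat the attribute as something to test for rather than rely on.

```python
def version_tuple(version_string):
    """Turn '1.0.2' into (1, 0, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in version_string.split('.'):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_compatible(version_string, minimum=(0, 9)):
    """True if the version is at least the minimum we can work with."""
    return version_tuple(version_string) >= minimum


# Comparing tuples of numbers avoids the trap of comparing strings,
# where '1.18.0' would sort before '1.9.0'.
print(is_compatible('0.8.4'), is_compatible('1.18.0'))
```

With a real module you might write something like getattr(module, '__version__', None) first, and only call is_compatible if the result is a string.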
dropped packets, errors and so on. The only things we are interested in are the bytes sent and the bytes received. I know that the number there is a total, and we wanted a speed, but behold the power of proc – just open the file again and the magic numbers will have changed. Now, I hope I am not going too fast for you, but simple arithmetic should not be beyond us. If we poll the file every second and subtract the old number from the new number, everything should be fine. All we really have to do is build a little function that will read in the file, parse it for the information we want, and compute the deltas. Before we leave we will save the old number so we can subtract it next time. Here’s how the function should look, more or less:

devfile=open('/proc/net/dev','r')
for line in devfile.readlines():
    line=line.strip()
    if (line[:4] == 'eth0'):
        line=line[5:].split()
        print line[0], line[8]

Hopefully, this will make some sense to you without me needing to draw diagrams. We read in the file and iterate through the lines, looking for the one that begins eth0: – it is necessary to strip the line before searching because the output is padded by an amount to make the tables line up. When we have the correct line, we take off the interface part and split the string up, so we have each of the numbers as part of a list. The counts for bytes in and out happen to be at the 0 and 8 positions in this list. Here we have just printed them out – you can type in the code and see what it gives you. All that needs to be added to that is to convert the strings to integers and store them so we can keep track of what’s going on.
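If you want to experiment with the parsing and the arithmetic away from a live network interface, the sketch below works on a made-up sample of the file. SAMPLE, parse_net_dev and rate_kb_per_s are our own illustrative names and the numbers are invented. It splits each line at the colon instead of slicing off the first five characters, an alternative we have chosen so that interface names of any length work.

```python
# Invented sample in the layout of /proc/net/dev (two header lines, then devices).
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5242880    4000    0    0    0     0          0         0  1048576    3000    0    0    0     0       0          0
"""


def parse_net_dev(text, interface='eth0'):
    """Return (bytes_in, bytes_out) for one interface, or None if it is absent."""
    for line in text.splitlines():
        name, colon, stats = line.strip().partition(':')
        if colon and name == interface:
            fields = stats.split()
            return int(fields[0]), int(fields[8])
    return None


def rate_kb_per_s(old_bytes, new_bytes, seconds):
    """Speed in KB/s from two byte totals taken 'seconds' apart."""
    return (new_bytes - old_bytes) / (1024.0 * seconds)


bytes_in, bytes_out = parse_net_dev(SAMPLE)
print(bytes_in, bytes_out)
print(rate_kb_per_s(0, 5120, 5))
```

On a real Linux box you would pass in open('/proc/net/dev').read() rather than SAMPLE, and keep the previous totals around between calls.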
Why should I care about Clutter?
Clutter is a GPL graphics and GUI library that was originally developed by the OpenedHand team. It was later sold to Intel, which is committed to further development and deployment. The great thing about Clutter is that it’s a simple, fast and powerful way to deliver 3D or 2D graphics on a number of platforms. The back-end is essentially OpenGL, but by using the Clutter library developers can take advantage of a fast, efficient and friendly way to develop graphically rich apps without messing around with more technical aspects of the OpenGL libraries. Clutter also forms an integral part of Moblin, an attempt to deliver a powerful graphical version of Linux to run on mobile devices. Moblin, via MeeGo, lives on in a fork called Mer, which is being developed as the Sailfish OS and a new smartphone by Jolla (www.jolla.com).

Numbers. Coloured numbers. That change. And monitor things. This is also a pretty good start.

Maths is your friend
The more detail-oriented of you might question whether we take into account the length of time it takes this snippet of code to run. If you want to time this code, go ahead – on my development system it takes 0.0001 seconds to run. In case you’re interested, a complete command line app would look something like this:

import time
lasttime=1
lastin=0
lastout=0
def getspeed():
    x=open('/proc/net/dev','r')
    for line in x.readlines():
        line=line.strip()
        if (line[:4] == 'eth0'):
            line=line[5:].split()
            bin=int(line[0])
            bout=int(line[8])
            return (bin, bout)
while True :
    z= getspeed()
    timedelta=time.time()-lasttime
    lasttime=time.time()
    sin=(float(z[0]-lastin))/(1024*timedelta)
    sout=(float(z[1]-lastout))/(1024*timedelta)
    print sin, sout
    lastin=z[0]
    lastout=z[1]
    time.sleep(5)

This incorporates a timing function to more accurately calculate the speeds, but bear in mind that we’re only talking about a couple of milliseconds, so it doesn’t make a lot of difference. It is useful, however, if we ever want to alter the timing period elsewhere in the software. Now what we have to do is to incorporate this functionality into our Clutter application. We could just stick the loop at the end of our program and fail to ever call the main Clutter loop. We can still update the actor objects whenever we like, but this would be a Bad Thing. The nicer way to do it is to give liberty, autonomy and freedom back to the actors, but make use of an animation timeline to control their text. Timelines are covered in slightly more detail in the box over the page, but to give you a brief summary, a timeline is just a timer that counts to some value and then emits the programmatic equivalent of a beep – a signal. The signal can be caught and fed to a callback, and as well as the timeline itself, you can supply other parameters to the call. For our purposes, we can make the timer call a function that will test the network speed and update our two actors. The timeline is an object unto itself, but when we execute the connection between the timeline and the callback function, we can pass along our text actor objects too, so the callback function will be able to change them directly. Note that if you’re going for more complicated behaviours, this doesn’t preclude you from having other timers too – you could set one up to change the colour of the objects every
second if you wanted, and it needn’t interfere with the timeline we have already created. Timelines can be used like threads in a multithreaded app – they aren’t quite as flexible, but they are easier to manage and they make it easier to deal with animated objects, because you can separate the business of animating the object from the other interactions it has.

import clutter
import time

lasttime=1
lastbin=0
lastbout=0
black =clutter.Color(0,0,0,255)
red = clutter.Color(255, 0, 0, 255)
green =clutter.Color(0,255,0,255)
blue =clutter.Color(0,0,255,255)

def updatespeed(t, a, b):
    global lasttime, lastbin, lastbout
    f=open('/proc/net/dev','r')
    for line in f.readlines():
        line=line.strip()
        if (line[:4] == 'eth0'):
            line=line[5:].split()
            bin=int(line[0])
            bout=int(line[8])
    timedelta=time.time()-lasttime
    lasttime=time.time()
    speedin=round((bin-lastbin)/(1024*timedelta), 2)
    speedout=round((bout-lastbout)/(1024*timedelta), 2)
    lastbin, lastbout = bin, bout
    a.set_text(str(speedin)+'KB/s')
    xx, yy=a.get_size()
    a.set_position(int((300-xx)/2),int((100-yy)/2) )
    b.set_text(str(speedout)+'KB/s')
    xx, yy=b.get_size()
    b.set_position(int((300-xx)/2),int((100-yy)/2)+100 )

def parseKeyPress(self, event):
    # Parses the keyboard
    # As this is called by the stage object
    if event.keyval == clutter.keysyms.q:
        # if the user pressed "q" quit the test
        clutter.main_quit()
    elif event.keyval == clutter.keysyms.r:
        # if the user pressed "r" make the object red
        self.set_color(red)
    elif event.keyval == clutter.keysyms.g:
        # if the user pressed "g" make the object green
        self.set_color(green)
    elif event.keyval == clutter.keysyms.b:
        # if the user pressed "b" make the object blue
        self.set_color(blue)
    elif event.keyval == clutter.keysyms.Up:
        # up-arrow = make the object black
        self.set_color(black)
    print 'event processed', event.keyval

stage = clutter.Stage()
stage.set_size(300,200)
stage.set_color(blue)
stage.connect('key-press-event', parseKeyPress)
intext=clutter.Text()
intext.set_font_name("Sans 30")
intext.set_color(green)
stage.add(intext)
outtext=clutter.Text()
outtext.set_font_name("Sans 30")
outtext.set_color(red)
stage.add(outtext)
stage.show_all()
t=clutter.Timeline()
t.set_duration(5000)
t.set_loop(True)
t.connect('completed', updatespeed, intext, outtext)
t.start()
clutter.main()

Here we’ve brought together all the elements we have explored in this tutorial. We have created a stage, populated it with actors, and then used the timeline objects in Clutter to make them update themselves at our whim. But so far we have only scratched the surface of Clutter’s graphical capabilities. We haven’t even learned about behaviours or animations yet, never mind the alpha channel effects. Please trust us that we will be including these in our next project.

It’s all about timing
The Clutter library uses objects called timelines to do practically everything that needs to be done while an application is running. The timeline is the heartbeat of your script, and makes sure that everything at least makes a good attempt at running together. Timelines are used extensively for controlling animations and effects within Clutter, but you can also use them as your own interrupts to call routines every so often. A timeline does this by emitting signals for events such as started, next-frame, completed and so on. Each of these signals can be bound to a callback function to control something else. Here is a short example you can type into a Python shell:

>>> import clutter
>>> t=clutter.Timeline()
>>> t.set_duration(2000)
>>> t.set_loop(True)
>>> def ping(caller):
...     print caller
...
>>> t.connect('completed',ping)
9L
>>> t.start()
>>> <clutter.Timeline object at 0xb779639c (ClutterTimeline at 0x95b9860)>

Hopefully the methods of the timeline object should be easy to follow. The duration is set as a number of milliseconds. The timeline is then set to loop. Here we have created a simple function called ping, which just prints out the parameter it was called with. Next, we connect the completed signal to the ping function and start the timeline running. Without any further interaction, the ping function will now be called every two seconds, as the timeline completes, until you kill the Python shell.
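The part of this that tends to puzzle people is how the extra arguments given to connect() turn up in the callback. The toy below mimics the idea in plain Python so you can step through it without Clutter installed. TinyTimeline, Label and update_labels are hypothetical names of our own; this is an analogy for the calling convention, not how Clutter is implemented, and there is no real timer in it (we fire the signal by hand).

```python
class TinyTimeline(object):
    """A toy signal emitter standing in for a timeline; NOT the Clutter class."""

    def __init__(self):
        self._handlers = []
        self.ticks = 0

    def connect(self, signal, callback, *extra):
        # Remember the callback together with any extra arguments.
        self._handlers.append((signal, callback, extra))
        return len(self._handlers)   # a handler id, loosely like Clutter's

    def emit(self, signal):
        # Call every matching callback: the emitter first, then the extras.
        self.ticks += 1
        for name, callback, extra in self._handlers:
            if name == signal:
                callback(self, *extra)


class Label(object):
    """A stand-in for a text actor that just stores its text."""

    def __init__(self):
        self.text = ''

    def set_text(self, text):
        self.text = text


def update_labels(timeline, a, b):
    a.set_text('in: tick %d' % timeline.ticks)
    b.set_text('out: tick %d' % timeline.ticks)


timeline = TinyTimeline()
intext, outtext = Label(), Label()
handler_id = timeline.connect('completed', update_labels, intext, outtext)
timeline.emit('completed')
timeline.emit('completed')
print(intext.text, '|', outtext.text)
```

The callback receives the emitting object as its first parameter, followed by whatever was handed to connect(), which is the same shape as updatespeed(t, a, b) in the listing.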
The main Clutter website at www.clutter-project.org doesn’t have much help for Python users, but there is lots of background info and plenty of C documentation.
Python: Clutter animations The code master Nick Veitch gets his head in a spin with the help of a news feed and some clever clutter animations.
Well, no news is good news, but at least we know what internationally respected agency it’s coming from.
In the last tutorial we had a look at the basics of Clutter as we used it to build a network speed monitor. This time we’ll be looking at some of the very powerful animation techniques used in Clutter, how to group objects, and a little more about text actors. We will be doing this in the guise of implementing a feed reader. There isn’t enough space for us to implement a complete multi-stream reader and explore the animations, but we will be covering enough ground to get you started on building such a beast, including fetching the data from the feed and applying it to the Clutter objects. For those of you who haven’t been tempted by one of these magnificent Python tutorials before, we usually try to do as much as possible in the interactive mode of Python first. It is a kinder, gentler environment than the normal mode in which programs are run, as you can type things in and experiment. The code listings in these cases include the Python prompt >>> at the beginning of the line when you have something to type in, and without it when the environment is giving you some feedback, just as it appears on screen.
Getting fed
The very first thing we should look at is how to get the data from our feed. We have come across the most excellent Feedparser library for Python before, and we will be using it again this time in order to retrieve the amazingly interesting data for our app. Before we do that though, we need to have a URL for a feed. You can choose any you like. The best way to find out a particular feed address is to go to the relevant page in a web browser and look for the RSS syndication icon. More often than not, this icon is linked to the feed address, so you can just copy the link location with your browser or read it off the status bar. You could use the TuxRadar feed, for example, at www.tuxradar.com/rss. For our example we are going to use the BBC news feed, for two reasons. The first is that it supplies an image reference, which will be nice for our experiments with Clutter textures, and the second is that it gets updated with news stories pretty much constantly, which makes it good for testing. The BBC news feed is at http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/world/rss.xml.
The Clutter website is worth checking every so often, because new documentation and new and life-changing versions of Clutter have a habit of appearing there.
>>> import feedparser
>>> f = feedparser.parse('http://newsrss.bbc.co.uk/rss/newsonline_uk_edition/world/rss.xml')
>>> f
{'feed': {'lastbuilddate': u'Wed, 30 Dec 2009 19:11:25 GMT', 'subtitle': u'Get the latest BBC World News: international news, features and analysis from Africa, Americas, South Asia, Asia-Pacific, Europe and the Middle East.', 'language ...

Actually, causing Python to display the variable we assigned to the feed just spews out the whole content of the container (we have truncated it in the output shown). To get to the bits you are interested in, you need to use the keys. All of the actual feed items are stored in a big list referenced by ‘entries’, and each one contains the information for the item – summary, timestamp, link URL and so on.

>>> f.entries[0].title
u'Nokia expands claim against Apple'
>>> f.entries[0].link
u'http://news.bbc.co.uk/1/hi/technology/8434132.stm'
>>> f.entries[0].updated_parsed
time.struct_time(tm_year=2009, tm_mon=12, tm_mday=30, tm_hour=18, tm_min=0, tm_sec=46, tm_wday=2, tm_yday=364, tm_isdst=0)
>>> f.entries[0].updated
u'Wed, 30 Dec 2009 18:00:46 +0000'
>>> f.entries[0].summary
u'Nokia ramps up its legal fight against Apple, claiming that almost all of its products infringe its patents.'

As you can see, we can get pretty much everything we need to build a parser out of these elements. There is one optional thing that we might want though – an image. The feed specs allow for an image to be supplied as a channel ‘ident’ for the feed, which is passed along as a URL to the image and some text describing it. This should be in the body of the feed, in an element called image, so we can reference it with (in this case) f.feed.image.href. There are many ways we could use this image, but one of the simplest is to just download it locally – then we can do what we like with it. A great way to do this in Python is to use the splendid urllib module. Among its ever-sharp
RSS and other feeds There are plenty of different elements stuffed into an average RSS feed. Apart from the feed image, we’re only using ones that are guaranteed to be present in any feed that you might come across. If you want to know more about what things might be there, you should take a look at the different standards documents. Yes, to make things more confusing there were different versions of RSS developed at different times by largely different groups with very different ideas about how things should go. Using the Feedparser module smooths out a lot of the nonsense. The Harvard website has a really useful tutorial on constructing an RSS feed, which, cunningly for our purposes, works as an equally good tutorial for ripping one apart too. http://cyber.law.harvard.edu/rss/rss.html.
tools is the urlretrieve method, which will download the content of a URL and save it to the system’s temporary storage (usually /tmp), before handing back a filename and a bunch of HTTP info. The saved file should get cleaned up when the temporary storage is hosed down, or we can delete it when we quit if we want to be nice. Here it is:

>>> import urllib
>>> img, data =urllib.urlretrieve(f.feed.image.href)
>>> img
'/tmp/tmpTsCyDc.gif'

We’ll do something useful with this file in just a moment.
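If you would like to poke at the same two steps without a network connection, or without Feedparser installed, here is an offline sketch. It is not what Feedparser does internally: it uses the standard library’s ElementTree on a tiny made-up RSS document, and a local file stands in for the feed’s logo. The feed contents, file names and variable names are all invented for the example. Note too that Python 3 moved urlretrieve to urllib.request, which is what this sketch imports, whereas the session above uses the Python 2 location.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path
from urllib.request import urlretrieve   # Python 3 home of urlretrieve

# A local file pretending to be the feed's logo, so no network is needed.
workdir = tempfile.mkdtemp()
logo = Path(workdir) / 'logo.gif'
logo.write_bytes(b'GIF89a-not-a-real-image')

# A made-up feed with one image and one item.
RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <image>
      <url>{logo_url}</url>
      <title>Example logo</title>
    </image>
    <item>
      <title>First headline</title>
      <link>http://example.com/1</link>
      <description>Summary of the first story.</description>
    </item>
  </channel>
</rss>""".format(logo_url=logo.as_uri())

channel = ET.fromstring(RSS).find('channel')
titles = [item.findtext('title') for item in channel.findall('item')]
image_url = channel.findtext('image/url')

# Fetch the image; giving a destination makes urlretrieve copy it there.
dest = os.path.join(workdir, 'downloaded.gif')
filename, headers = urlretrieve(image_url, dest)
print(titles, filename)
```

Leave out the second argument when fetching a remote URL and urlretrieve picks a temporary filename for you, as in the interactive session.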
There are many places you can use as a source of RSS feeds; just look for the little orange icon with the three white arcs in it.
Cluttering up
With that little excursion into the realm of feeds now out of the way, we can get down to the serious business of getting all cluttery. Now, in the last tutorial we introduced the basic Clutter elements of the stage, actors and timelines. If you haven’t read it yet, it’s a good idea to go back now and take a look because we’ll be building on the things we went through there. Anyhow, for now, here is the ‘Previously on Lost’ short recap. A Clutter window is called a stage. This is where the action happens, and by default in PyClutter is implemented in a standard GTK window. The elements that appear on (in) the stage are called actors, and can be anything from text to simple shapes or pixmap textures. The last one is what interests us this time. We will be using text actors again, but Clutter allows us to import images directly from files to use. Those of you with memories longer than a Channel 4 ad break will no doubt remember that we have such a file lurking around. Just to prove that it works, we should set up a stage and all the usual Clutter, erm, clutter, and then import our new pixmap.

>>> import clutter
>>> black=clutter.Color(0,0,0,255)
>>> white=clutter.Color(255,255,255,255)
>>> stage= clutter.Stage()
>>> stage.set_size(400,60)
>>> stage.set_color(white)
>>> stage.show_all()
>>> ident =clutter.texture_new_from_file(img)
>>> stage.add(ident)

This code just sets up a simple stage, sets the background colour to white and then adds the pixmap image. You can see from the code that unlike many graphic toolkits, we can add the actor to the stage after the stage is already on display. You should see something like, if not exactly the same as, the image on the first page of this tutorial. Foresight is a wonderful thing, which is why the pixmap fits so nicely on the stage. As we didn’t set a position for it, it just comes in at (0,0), which is the top-left of the stage.
Quick tip
Want to find a complete list of the Clutter built-in animation codes? Try the C documentation, which is more up to date: http://clutter-project.org/docs/clutter/stable/clutter-Implicit-Animations.html#ClutterAnimationMode.

Getting animated
Before we finish our masterful feed reader, now would be a useful interlude during which to mess around with some animation. Last time, we animated some text objects through the use of the timeline – a powerful part of the Clutter magic that gives us easy interrupts that we can use to animate all sorts of items. However, since the 1.0 release of Clutter there is an even more powerful way to animate objects. The Clutter module now provides new methods for animating actors, and the most powerful of these is one that we are going to experiment with next. A Clutter actor has an animate method. The arguments it takes are an animation mode, followed by a duration (in milliseconds) and then the properties and values to be animated. Let’s break that down a bit. The animation mode can be defined, but Clutter has already built in several types that can be referenced through the Clutter module. These act as the tweening mechanisms for all the values you list. Here is a selection:

CLUTTER_LINEAR
CLUTTER_EASE_IN_QUAD
CLUTTER_EASE_OUT_QUAD
CLUTTER_EASE_IN_OUT_QUAD
CLUTTER_EASE_IN_CUBIC
CLUTTER_EASE_OUT_EXPO
CLUTTER_EASE_IN_OUT_EXPO
CLUTTER_EASE_IN_OUT_CIRC

and there are many more. These are the names used in the C documentation; in the Python examples that follow they appear without the CLUTTER_ prefix, as in clutter.LINEAR. These modes compute the in-between phases of properties at any frame in the animation. The linear mode is a linear progression, while the others are variations producing different effects. So, say we had an object at position 0,0 and we animated it over 2,000 milliseconds to a position 100,0, animating only the x position. With a linear animation mode it would be at position 50,0 after one second. The animation takes the
Doc Holiday
Clutter is really great. The only thing that sucks at the moment is the documentation for the Python module. Although the classes and methods are largely identical to the C implementation of Clutter (naturally), there are a few subtle differences, and sometimes the semantics of doing something in Python can get you in a muddle. Python’s introspection tools are useful for this, particularly the dir() function, which you can call on any object, even a module. Try it on Clutter to see a list of the static types and methods available: dir(clutter), or on a class: dir(clutter.Text).
current values as the start point, and the supplied values as the end point. Any property that is not supplied is not animated. The supplied properties have to come in a pair with the end value following, and any property can be animated. It’s easier with a few examples (I hope you still have your stage open):

>>> ident.set_anchor_point(60,30)
>>> ident.set_position(60,30)
>>> ident.animate(clutter.LINEAR,2000,'rotation-angle-y',720)
<clutter.Animation object at 0xa3f861c (ClutterAnimation at 0xa4bb050)>
>>> ident.animate(clutter.LINEAR,2000,'rotation-angle-y',720)
<clutter.Animation object at 0xa3f85f4 (ClutterAnimation at 0xa4bb0c8)>

The second time we ran the rotation, nothing happened. That is because the property value was the same – the value you supply isn’t the amount to animate by, it is the position the object should end up in. The second time we ran the animation, the object was already in that position, so it was animated – it just didn’t go anywhere. You can, of course, animate more than one property at a time. Try this:

>>> ident.animate(clutter.LINEAR,2000,'x',100, 'rotation-angle-y', 360 )
<clutter.Animation object at 0xa3f85f4 (ClutterAnimation at 0xa4bb2c8)>
>>> ident.animate(clutter.EASE_IN_SINE,2000,'x',0, 'rotation-angle-y', 0 )
<clutter.Animation object at 0xa3f85f4 (ClutterAnimation at 0xa4bb3ae)>

This time the little logo did a dance and then returned to its original position.
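If the idea of an animation mode still feels abstract, the arithmetic behind tweening is easy to try in plain Python. This is a sketch under our own assumptions: the function names are invented, and the two quadratic curves are the usual textbook easing formulas rather than anything lifted from Clutter’s source.

```python
def linear(t):
    """Progress is proportional to time."""
    return t


def ease_in_quad(t):
    """Starts slowly and speeds up."""
    return t * t


def ease_out_quad(t):
    """Starts quickly and slows down."""
    return t * (2 - t)


def tween(start, end, elapsed_ms, duration_ms, mode=linear):
    """Value of a property part-way through an animation."""
    t = elapsed_ms / float(duration_ms)
    t = min(max(t, 0.0), 1.0)          # clamp to the 0..1 range
    return start + (end - start) * mode(t)


# Half-way through a two-second move from x=0 to x=100:
print(tween(0, 100, 1000, 2000, linear))
print(tween(0, 100, 1000, 2000, ease_in_quad))
print(tween(0, 100, 1000, 2000, ease_out_quad))

# Start and end values identical: the property 'animates' but never changes.
print(tween(720, 720, 1000, 2000))
```

The last line is the situation in the repeated rotation example: the end value is a destination, not an amount to move by, so an actor that is already there stays put.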
Just the facts
Well, spinning logos are all well and good, but what we really want to see is the text for our headline and the summary of the story sitting there in the space next to it. This will leave us with three actors on the screen. While Clutter can handle animating different actors concurrently, it becomes a bit of a pain for us to have to handle separate animations for the whole group of actors. The key word there, if you hadn’t picked up on it, is ‘group’. Clutter now supports containers, including a group container (the others are rather like GTK stacking containers, and we will no doubt use them some other time). First of all we’ll create a new group, then define the elements to go in it and add them all. To save messiness we will import the ident image again as a new object. We also need a blue colour for the headline, which we haven’t defined in this session yet.

>>> blue=clutter.Color(0,0,255,255)
>>> group1= clutter.Group()
>>> ident1=clutter.texture_new_from_file(img)
>>> head1=clutter.Text()
>>> head1.set_position(130, 5)
>>> head1.set_color(blue)
>>> head1.set_text(f.entries[0].title)
>>> body1=clutter.Text()
>>> body1.set_max_length(75)
>>> body1.set_position(130, 22)
>>> body1.set_size(250, 100)
>>> body1.set_line_wrap(True)
>>> body1.set_text(f.entries[0].summary)
>>> group1.add(ident1, head1, body1)
>>> group1.show_all()
>>> stage.add(group1)
In this code we have created a new ident image and two text objects – one to represent the title of the news item (head1), and one for the summary (body1). We have set the text for them from the first entry we found in the feed and set them up in a good position next to the ident image. Here we seem to have set absolute positions for the head1 and body1 text objects, but these positions will actually remain relative to the group. When we add the actors to the group, it becomes their parent object, rather than the stage as before. So the positions of the objects are relative to the group position (which starts off by default at 0,0). You should try to remember this. For example:
>>> head1.get_position()
(130,5)
>>> group1.set_position(10,10)
>>> head1.get_position()
(130,5)
>>> group1.set_position(0,0)
You saw all the objects move along the screen when we changed the group position, but don’t fall into the trap of thinking that the child objects think they have moved. They haven’t – they are still in the same place, but their world has moved… The great thing about this is, of course, that we can also animate a group – we need only supply one transformation, because all the properties of the individual actors are relative to the group.
>>> group1.animate(clutter.LINEAR,2000,"x",200,"y",30)
<clutter.Animation object at 0xa3f85f4 (ClutterAnimation at 0xa4bb2c8)>
>>> group1.animate(clutter.LINEAR,2000,"x",0,"y",0)
<clutter.Animation object at 0xa3f85f4 (ClutterAnimation at 0xa4bb4a8)>
This time all the elements moved smoothly as though they were one, which (in terms of the group) they are. We can still adjust them individually – change the text or move them – but any transformations are relative to the group, not the stage.
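The parent-relative idea can be sketched without Clutter at all. The Node class below is a hypothetical stand-in for an actor or group, not a Clutter class: get_position() reports what the child itself believes, while stage_position() (an invented name) adds up the parents' offsets to show where it really ends up on screen.

```python
class Node:
    """A toy stand-in for an actor or group (not a Clutter class)."""

    def __init__(self, x=0, y=0, parent=None):
        self.x, self.y, self.parent = x, y, parent

    def get_position(self):
        # Position relative to the parent, as the actor itself sees it.
        return (self.x, self.y)

    def stage_position(self):
        # Walk up the parents, adding each offset, to find the on-screen spot.
        x, y, node = self.x, self.y, self.parent
        while node is not None:
            x, y = x + node.x, y + node.y
            node = node.parent
        return (x, y)


group = Node()
head = Node(130, 5, parent=group)

# Move the group: the child's own coordinates do not change...
group.x, group.y = 10, 10
print(head.get_position())
# ...but its place on the stage does.
print(head.stage_position())
```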
So, now we have a group, we can add another, then implement two animation modes to make the transition between the different items from the news feed:
>>> group2= clutter.Group()
>>> ident2=clutter.texture_new_from_file(img)
>>> head2=clutter.Text()
>>> head2.set_position(130, 5)
>>> head2.set_color(blue)
>>> head2.set_text(f.entries[1].title)
>>> body2=clutter.Text()
>>> body2.set_max_length(75)
>>> body2.set_position(130, 22)
>>> body2.set_size(250, 100)
>>> body2.set_line_wrap(True)
>>> body2.set_text(f.entries[1].summary)
>>> group2.add(ident2, head2, body2)
>>> group2.hide()
>>> stage.add(group2)
>>> group1.animate(clutter.EASE_OUT_EXPO,4000,'x',800,'y',0,'rotation-angle-y',180)
<clutter.Animation object at 0x92ca61c (ClutterAnimation at 0x935a2a0)>
>>> group2.animate(clutter.EASE_OUT_EXPO,1,'x',400,'y',100,'rotation-angle-y',720)
<clutter.Animation object at 0x92ca66c (ClutterAnimation at 0x935a118)>
>>> group2.show()
>>> group2.animate(clutter.EASE_OUT_EXPO,3000,'x',0,'y',0,'rotation-angle-y',0)
<clutter.Animation object at 0x92ca16c (ClutterAnimation at 0x8f46028)>
Here we have set up a new group and hidden it before adding it to the stage. Then, for convenience’s sake, we animate it out of the way and reveal it. As it is out of the stage area, it can’t be seen and doesn’t interfere with the first group. When we reveal it, it is still offstage, but the new animation then puts it back in the proper position, and it seems to glide through the ether to come to a perfect stop. Hurrah! We are now expert animators to rival the likes of Disney. Maybe.
That reveal in all its animated glory. Well, obviously you can’t see it here, but you can if you run the listing or type it in.
Moving on
It’s looking good so far, but our feed reader could still do with some extensions. There is very little in the way of error checking, for instance (what if the feed is empty or the web connection is lost?), and it only handles a single feed rather than merging together several feeds. Next time we’ll be looking at more advanced methods of animation, and stringing animations together. We will also extend the basic building blocks of Clutter by showing how to incorporate elements of the Cairo graphics library.
Python: Stream video
Combine the video power of GStreamer with the graphical cunning of Clutter and you get what Nick Veitch categorises as ‘neat stuff’.
The primary purpose of Clutter is to make it easy to create neat graphical interfaces. It adopts a sort of fire-and-forget attitude to a lot of things, particularly animation – once you’ve set up a sequence, you can just start it and leave it to do its thing. However, Clutter can’t do this on its own. The main module has neat animation stuff, but it’s sorely lacking in other areas. It only has a few primitive ‘actor’ objects for a start, which is why there’s an extension that makes use of Cairo, the 2D graphics library. Another library that Clutter leans on is GStreamer, which can do wonders with streaming media data. Indeed, when it comes to multimedia, GStreamer must be one of the most-used libraries, so it’s no surprise that there’s a Python module for it. We will need to understand a small amount of the GStreamer framework to use it in our application, but of course we won’t have space to cover all of it. Fortunately, it isn’t complicated. We just need something straightforward – we have a URI for a video file or source of some description, and we want to be able to play it. First of all, we have to set up the usual Clutter stage:
>>> import gst
>>> import clutter
>>> import cluttergst
>>> stage=clutter.Stage()
>>> stage.set_title("Clutter_Streamer")
>>> stage.set_size(320,290)
>>> stage.set_color(clutter.Color(255,255,255,255))
>>> stage.show()
>>>
If you’ve been following along with us for previous episodes of this chapter, this should all be familiar. For newbies, having imported the all-important modules (note that we need gst as well as cluttergst), we create a Stage object (a window, in Clutter-speak), give it a title, a size and a background colour, and then display it. You should see a white window.
All the world’s a stage
Now things start to get interesting. Objects displayed on the Clutter stage are called actors. There are several types of actor. Some basic ones, such as the rectangle and text objects, live inside the main Clutter module, but there are other, special actors in some of the extra support libraries. The VideoTexture object lives in the cluttergst module, and is similar to the rectangle actor. It has many of the same properties that we have investigated before, but it is special because it links to a video player that it uses to update its own texture. This is one of the powerful, useful things about Clutter: once you have set up your actor, and maybe even given it an animation to follow, it just carries on and does what it was told without you having to nip back and check up on it.
You now have streaming video in a window, and you haven’t even written any sort of application yet.
The video player is known as a playbin, and is a sort of encapsulated player. You give it a resource you want to play, and it connects everything up. Let’s do that now:
>>> vid=cluttergst.VideoTexture()
>>> playbin=vid.get_playbin()
>>> playbin.set_property('uri','mmsh://live.camstreams.com/cscamglobal16?MSWMExt=.asf')
>>>
First we set up our video texture actor, then we extract the playbin object that was created. Before we can play anything, we have to set the source. The best way to do this is by calling the set_property method for the playbin object, including the property we want to set (the URI) and what we want to set it to. In this example, we’ve used a web-based video source. It’s actually a traffic cam in Piccadilly Circus, London, which streams live video (as opposed to ones that upload a new image every five seconds). To begin with, you may wish to eschew the dizzying delights of two dozen Transit vans trying to negotiate the lights at the same time and use a local file. It means there’s one less thing to go wrong. The URI can point to a file or anything else that qualifies as a valid resource. GStreamer will figure out what it is and what to do with it, assuming it has the correct plugins. If it doesn’t, you’ll get an error message about not being able to play the stream. Stick to something simple that you know will play, such as a video file you’ve played successfully in Totem. For files (or anything else), you need to prefix the URI with the appropriate protocol. So, for example, a valid file URI might be file:///home/evilnick/Videos/killmike.ogg.
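If you would rather not type the file:// prefix by hand, a small helper can build it for you. This is a minimal sketch using only the standard library; file_uri is a hypothetical name, and it assumes an absolute Unix-style path. It also escapes awkward characters such as spaces, which would otherwise make the URI invalid.

```python
from urllib.parse import quote


def file_uri(path):
    """Turn an absolute Unix-style path into a file:// URI.

    Hypothetical helper: quote() escapes characters such as spaces
    but leaves the slashes alone.
    """
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    return "file://" + quote(path)


print(file_uri("/home/evilnick/Videos/killmike.ogg"))
print(file_uri("/home/evilnick/Videos/my holiday.ogg"))
```

The resulting string is what you would hand to the playbin as its uri property.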
Next we need to create another GStreamer object that we can use to control the playbin object we just made:
>>> pipe=gst.Pipeline("pipe")
>>> pipe.add(playbin)
>>> stage.add(vid)
Having created the pipe object, we can connect it to the playbin using the pipe’s own add() method. The pipe is just being used as a container for our player. Finally, we can add the video texture object to the stage – that’s the bit that’s updated when the stream is playing. You now have a white window with a video texture displayed inside it, although you won’t be able to see it because you haven’t turned it on yet.
>>> pipe.set_state(gst.STATE_PLAYING)
<enum GST_STATE_CHANGE_ASYNC of type GstStateChangeReturn>
The pipe container is used to control the stream. Using the predefined gst constant, gst.STATE_PLAYING, we turn the player on. If you’re playing from a file, you should see the image instantly. If you see only part of it, you need to make the stage bigger. We’re still in interactive mode, so we can adjust that using the stage.set_size method.
>>> stage.set_size(640,480)
We can also change the state of the player to pause it:
>>> pipe.set_state(gst.STATE_PAUSED)
<enum GST_STATE_CHANGE_SUCCESS of type GstStateChangeReturn>
>>> pipe.set_state(gst.STATE_PLAYING)
<enum GST_STATE_CHANGE_ASYNC of type GstStateChangeReturn>
Note that the video stream does, in fact, pause. You might have imagined that is what would happen with a file, but it also happens on a live webcam feed.
My stream won’t play
If your GStreamer pipe quits with an error, or more unusually, just doesn’t do anything, then you may have a codec problem. Your distro probably doesn’t have every codec installed, but you can get more of them by installing a package called gstreamer_plugins_good, or something similar. The other ones are the ‘bad’ and the ‘ugly’ (do you see the joke there?). Depending on your location, you may be able to legally download and use these extra plugins to see more types of stream.
Awww. There are some quite interesting cams if you search for them. Or just use your own files.
Getting freaky
All we’ve done is create a Clutter version of a stream viewer – like Kaffeine, but with fewer options. However, our video texture is a Clutter actor, so we can make it do things.
>>> vid.set_size(100,90)
>>> vid.set_size(100,290)
>>> vid.set_size(320,50)
Even when we change the size, or move the actor around…
>>> vid.move_by(10,10)
>>> vid.move_by(10,10)
You can use pretty much all the standard actor methods on the video. Previously we took a detailed look at the animation methods, and these will work on a video texture too. As a brief recap, the method takes a value for the animation effect (a Clutter constant that enumerates various styles), the duration in milliseconds and a list of property/value pairs. The actor is then transformed to these absolute values. For example:
>>> vid.animate(clutter.LINEAR, 1000, 'width',640, 'height', 480, 'x',0,'y',0)
<clutter.Animation object at 0x96f3884 (ClutterAnimation at 0xb4799a00)>
>>> vid.animate(clutter.LINEAR, 1000, 'width',320, 'height', 240, 'x',160,'y',120)
<clutter.Animation object at 0x96f4284 (ClutterAnimation at 0xb4799c00)>
If you want more of an idea what can be done with the animation transforms, check out the previous tutorial.
Quick tip
Confusingly, the VideoTexture actor has a get_uri() method that returns nothing. That’s because it is just a texture – the playbin object is the one that has the URI data in it, and if you ever forget what feed it’s linked to, you can use playbin.get_property('uri').
We can now have some picture-in-picture fun with our little camera experiment, but there’s one more new element we need to understand. First, we need to construct another video object, just like the first, but with a different stream:
>>> vid2=cluttergst.VideoTexture()
>>> playbin2=vid2.get_playbin()
>>> playbin2.set_property('uri','mmsh://live.camstreams.com/cscamglobal5?MSWMExt=.asf')
>>> pipe2=gst.Pipeline("pipe2")
>>> pipe2.add(playbin2)
>>> stage.add(vid2)
>>> pipe2.set_state(gst.STATE_PLAYING)
>>> vid.set_position(0,0)
>>> vid.set_size(100,80)
>>> vid.set_depth(2)
You’ll have come across most of this before, but the last line is new. The concept of depth is part of the 2.5 dimensions of Clutter. Objects are closer to or further away from the stage. In a strange twist of logic, at least to my mind, depth is positive out from the stage, and negative into it. So, if you want to layer objects, as we have done here, you will want the top object’s depth to be greater than the lower object’s. This changes the way the actor is rendered, according to the perspective settings, but for small values of depth, you probably won’t even notice. Do play around, though. We now have everything we need to make a multichannel streaming video browser.
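The layering rule can be sketched in a few lines of plain Python. Nothing here touches Clutter: paint_order is a hypothetical helper, and the sketch assumes that vid2 keeps a depth of 0 because we never set it, and that larger depth values sit nearer the viewer as described above.

```python
def paint_order(depths):
    """Return actor names from furthest back to nearest the viewer.

    Assumes larger depth values are nearer the viewer.
    """
    return sorted(depths, key=depths.get)


# vid was given a depth of 2; vid2 is assumed to be at 0.
depths = {"vid": 2, "vid2": 0}
order = paint_order(depths)

print(order)
print("on top:", order[-1])
```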
For this next bit of code, we’ll build an actual Python application, so fire up your friendliest text editor and type in the following:
import clutter
import gst
import cluttergst

class videobrowser:
    def __init__ (self):
        self.channel1="mmsh://live.camstreams.com/cscamglobal16?MSWMExt=.asf"
        self.channel2="mmsh://live.camstreams.com/cscamglobal5?MSWMExt=.asf"
        # initialize stage
        self.stage = clutter.Stage()
        self.stage.set_color(clutter.Color(255, 255, 255, 255))
        self.stage.set_size(640, 480)
        self.stage.set_title('LXF Traffic Watch - press "t" to toggle view')
        self.stage.connect('key-press-event', self.parseKeyPress)
        self.stage.connect("destroy", clutter.main_quit)
        #set up 2 video textures
        self.video1 = cluttergst.VideoTexture()
        self.playbin1 = self.video1.get_playbin()
        self.playbin1.set_property('uri', self.channel1)
        self.pipeline1 = gst.Pipeline("pipe1")
        self.pipeline1.add(self.playbin1)
        self.video1.set_position(0,0)
        self.video1.set_size(640, 480)
        self.video1.set_depth(-2)
        self.stage.add(self.video1)
        self.pipeline1.set_state(gst.STATE_PLAYING)
        # second one
        self.video2 = cluttergst.VideoTexture()
        self.playbin2 = self.video2.get_playbin()
        self.playbin2.set_property('uri', self.channel2)
        self.pipeline2 = gst.Pipeline("pipe2")
        self.pipeline2.add(self.playbin2)
        self.video2.set_position(0,0)
        self.video2.animate(clutter.LINEAR, 1000, 'rotation_angle_y', 0, 'rotation_angle_z', 0, 'width', 80, 'height', 60 )
        self.stage.add(self.video2)
        self.video2.set_depth(0)
        self.pipeline2.set_state(gst.STATE_PLAYING)
        #display the stage and run
        self.stage.show_all()
        clutter.main()

    def parseKeyPress(self, stage, event):
        print 'parsekey got ', self, event
        #do stuff when the user presses a key
        if event.keyval == clutter.keysyms.q:
            #if the user pressed "q" quit the app
            clutter.main_quit()
        elif event.keyval == clutter.keysyms.t:
            #if the user pressed "t", toggle video
            #which is in front?
            if self.video1.get_depth() == -2 :
                #video2 is on top
                self.video2.animate(clutter.LINEAR, 300, 'rotation_angle_y', 360, 'rotation_angle_z', 360, 'width', 640, 'height', 480 )
                self.video1.animate(clutter.LINEAR, 1000, 'rotation_angle_y', 0, 'rotation_angle_z', 0, 'width', 80, 'height', 60 )
                self.video2.set_depth(-2)
                self.video1.set_depth(0)
            else :
                # video 1 is on top
                self.video1.animate(clutter.LINEAR, 300, 'rotation_angle_y', 360, 'rotation_angle_z', 360, 'width', 640, 'height', 480 )
                self.video2.animate(clutter.LINEAR, 1000, 'rotation_angle_y', 0, 'rotation_angle_z', 0, 'width', 80, 'height', 60 )
                self.video1.set_depth(-2)
                self.video2.set_depth(0)

if __name__ == '__main__':
    videobrowser()
One minute you are gazing out over Piccadilly Circus, the next you’re in Trafalgar Square – or Tokyo or anywhere you can find a feed.
A word in your shell-like
The GStreamer objects will automatically play sound on any video resources, and if you have two sources playing at the same time, you’ll just get noise. You can use gst functions to change the sound output (and pipe it to different places if you like), but for just adjusting the volume, the VideoTexture actor from Clutter has a suitable method – set_audio_volume().
The init code sets up two video streams, placing one full-size at the back of the stage and the other as a small actor on top of it in the corner. After a few seconds the feeds should spring to life and you’ll see double-decker buses and whatnot. The cunning part of the code is the callback signal that traps keypresses. While the window on screen has focus, any keypress will cause an event.
The stage.connect() method connects this signal to the keypress method we defined in the main application class. If the toggle key has been pressed, the two video feeds are animated into the opposite position. We have to remember to set the depths again, so the new, small image goes on top (the depth setting is how we determine which
actor is on top when the button is pressed). Feel free to experiment, add screens and play with animation calls. Clutter does the animation for you, so there’s no need to worry about doing anything once it’s stopped. The Q key is also trapped to give a clean exit from the app.
The GStreamer site can help you if you want to do clever things with your pipes and feeds.
Going further
It’s possible to add further streams to this and cascade them down one side. Instead of toggling between two positions, the streams could all rotate around one spot. You might also want to play more with the animations: the values given in the animate method are absolute, rather than relative, which is why the angular rotations are multiples of 360 degrees – so the video texture ends up the right way round. In the last two parts of this series, we looked at the clutter.Text() actors. It would be pretty simple to add a text actor here at the front to let you know which stream you’re looking at, and you can change the value of the text when the screens are being switched without worrying about animating that too. For streams with audio, I guess you would want to mute the stream that is minimised. Check the boxout (A word in your shell-like) for details about doing that. For more tricks with GStreamer objects, you should also check out the GStreamer documentation, which is a lot more effusive than the PyClutter docs at the moment.
Flash! Ahh-ahh! There was a time when any sort of live video feed was just that – a nice streaming socket you could clamp on to and suck the goodness out of. These days it seems that everyone would rather you use their silly, customised and Flash-based embedded players instead. I’m not sure whether they do this to get better metrics on users or to
‘protect’ content, but apart from making it difficult to have many open at a time in Firefox, it makes it hard to find suitable live streams. Traffic cams are a good bet, or if you wanted to be naughty, well, some of the embedded Flash players kindly show the URL of the raw feed in the page source. But I didn’t tell you that. Shh!
Python: Code a Gimp plugin
Jonni Bidwell uses Python to add some extra features to the favourite open source image-manipulation app, without even a word about Gimp masks.
Multitude of innuendoes aside, Gimp enables you to extend its functionality by writing your own plugins. If you wanted to be hardcore, then you would write the plugins in C using the libgimp libraries, but that can be pretty off-putting or rage-inducing. Mercifully, there exist softcore APIs to libgimp so you can instead code the plugin of your dreams in Gimp’s own Script-Fu language (based on Scheme), Tcl, Perl or Python. This tutorial will deal with the last in this list, which is probably the most accessible of all these languages, so even if you have no prior coding experience you should still get something out of it.
Get started
On Linux, most packages will ensure that all the required Python gubbins get installed alongside Gimp; your Windows and Mac friends will have these included as standard since version 2.8. You can check everything is ready by starting up Gimp and checking for the Python-Fu entry in the Filters menu. If it’s not there, you’ll need to check your installation. If it is there, then go ahead and click on it. If all goes to plan this should open up an expectant-looking console window, with a prompt (>>>) hungry for your input. Everything that Gimp can do is registered in something called the Procedure Database (PDB). The Browse button in the console window will let you see all of these procedures, what they operate on and what they spit out. We can access every single one of them in Python through the pdb object. As a gentle introduction, let’s see how to make a simple blank image with the currently selected background colour. This involves setting up an image, adding a layer to it, and then displaying the image.
image = pdb.gimp_image_new(320,200,RGB)
layer = pdb.gimp_layer_new(image,320,200,RGB_IMAGE,'Layer0',100,NORMAL_MODE)
pdb.gimp_image_insert_layer(image,layer,None,0)
pdb.gimp_display_new(image)
So we have used four procedures: gimp_image_new(), which requires parameters specifying width, height and image type (RGB, GRAY or INDEXED); gimp_layer_new(), which works on a previously defined image and requires width, height and type data, as well as a name, opacity and combine mode; gimp_image_insert_layer() to actually add the layer to the image; and gimp_display_new(), which will display an image. You need to add layers to your image before you can do anything of note, since an image without layers is a pretty ineffable object. You can look up more information about these procedures in the Procedure Browser – try typing gimp-layer-new into the search box, and you will see all the different combine modes available. Note that in Python, the hyphens in procedure names are replaced by underscores, since hyphens are reserved for subtraction. The search box will still understand you if you use underscores there, though.
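The naming rule is simple enough to express as a one-line conversion. This is only an illustration in plain Python (python_name is a hypothetical helper, not part of Gimp): it turns a name as shown in the Procedure Browser into the form you would type after pdb.

```python
def python_name(pdb_name):
    """Convert a Procedure Browser name to its Python spelling."""
    return pdb_name.replace("-", "_")


for name in ["gimp-image-new", "gimp-layer-new", "gimp-display-new"]:
    print("pdb." + python_name(name))
```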
Draw the line
You can customise lines and splodges to your heart’s content, though frankly doing this is unlikely to produce anything particularly useful.
All well and good, but how do we actually draw something? Let’s start with a simple line. First select a brush and a foreground colour that will look nice on your background. Then throw the following at the console:
pdb.gimp_pencil(layer,4,[80,100,240,100])
Great, a nicely centred line, just like you could draw with the pencil tool. The first parameter gimp_pencil() takes is just the layer you want to draw on. The syntax specifying the points is a little strange: first we specify the number of coordinates, which is twice the number of points because each point has an x and a y component; then we provide a list of the form [x1, y1, …, xn, yn]. Hence our example draws a line from (80,100) to (240,100). The procedures for selecting and adjusting colours, brushes and so forth are in the PDB too:
pdb.gimp_context_set_brush('Cell 01')
pdb.gimp_context_set_foreground('#00a000')
pdb.gimp_context_set_brush_size(128)
pdb.gimp_paintbrush_default(layer,2,[160,100])
If you have the brush called ‘Cell 01’ available, then the above code will draw a green splodge in the middle of your canvas. If you don’t, then you’ll get an error message. You can get a list of all the brushes available to you by calling pdb.gimp_brushes_get_list(''). The paintbrush tool is more suited to these fancy brushes than the hard-edged pencil, and if you look in the procedure browser at the function gimp_paintbrush, you will see that you can configure gradients and fades too. For simplicity, we have just used the defaults/current settings here. For the rest of this tutorial we will describe a slightly more advanced plugin for creating bokeh effects in your own pictures. ‘Bokeh’ derives from a Japanese word meaning blur or haze, and in photography refers to the out-of-focus effects caused by light sources outside of the depth of field. It often results in uniformly coloured, blurred, disc-shaped artefacts in the highlights of the image, which are reminiscent of lens flare (think Star Trek: Into Darkness). The effect you get in each case is a characteristic of the lens and the
aperture – depending on design, one may also see polygonal and doughnut-shaped bokeh effects. For this exercise, we’ll stick with just circular ones.
Applying our bokeh plugin has created a pleasing bokeh effect in the highlights.
“Our plugin creates a layer with ‘bokeh discs’ and another with a blurred copy of the image”
Our plugin will have the user pinpoint light sources using a path on their image, which we will assume to be single-layered. They will specify disc diameter, blur radius, and hue and saturation adjustments. The result will be two new layers: a transparent top layer containing the ‘bokeh discs’, and a layer with a blurred copy of the original image. The original layer remains untouched beneath these two. By adjusting the opacities of these two new layers, a more pleasing result may be achieved. For more realistic bokeh effects, a part of the image should remain in focus and be free of discs, so it may be fruitful to erase parts of the blurred layer. Provided the user doesn’t rename layers, then further applications of our plugin will not burden them with further layers. This means that one can apply the function many times with different parameters and still have all the flare-effect discs on the same layer. It is recommended to turn the blur parameter to zero after the first iteration, since otherwise the user would just be blurring the already blurred layer.
Quick tip
For many, many more home-brewed plugins, check out the Gimp Plugin Registry at http://registry.gimp.org
After initialising a few de rigueur variables, we set about making our two new layers. For our blur layer, we copy our original image and add a transparency channel. The bokeh layer is created much as in the previous example.
blur_layer = pdb.gimp_layer_copy(timg.layers[0],1)
pdb.gimp_image_insert_layer(timg, blur_layer, None, 0)
bokeh_layer = pdb.gimp_layer_new(timg, width, height, RGBA_IMAGE, "bokeh", 100, NORMAL_MODE)
pdb.gimp_image_insert_layer(timg, bokeh_layer, None, 0)
Our script’s next task of note is to extract a list of points from the user’s chosen path. This is slightly non-trivial since a general path could be quite a complicated object, with curves and changes of direction and all sorts. Details are in the box below, but don’t worry – all you need to understand is that the main for loop will proceed along the path in the order drawn, extracting the coordinates of each component point
as two variables x and y. Having extracted the point information, our next challenge is to get the local colour of the image there. The PDB function for doing just that is called gimp_image_pick_color(). It has a number of options, mirroring the dialog for the Colour Picker tool. Our particular call has the program sample within a 10-pixel radius of the point x,y and select the average colour. This is preferable to just selecting the colour at that single pixel, since it may not be indicative of its surroundings.
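The path-walking loop can be tried outside Gimp with made-up data. This sketch assumes the stroke layout given in the box on paths (entry 1 of the tuple holds the list length, entry 2 the flat list in which each point appears twice); path_points is a hypothetical helper and the sample tuple is invented, so real stroke data may look different.

```python
def path_points(gpoints):
    """Extract (x, y) pairs from a stroke tuple.

    Assumed layout: gpoints[1] is the length of the coordinate list and
    gpoints[2] is the flat list [x0, y0, x0, y0, x1, y1, x1, y1, ...].
    """
    coords = gpoints[2]
    # Step by 4 so that each doubled-up point is only read once.
    return [(coords[j], coords[j + 1]) for j in range(0, gpoints[1], 4)]


# Invented stroke data for two points; the first and last entries of the
# tuple are placeholders that the function ignores.
sample = (0, 8, [10.0, 20.0, 10.0, 20.0, 30.0, 40.0, 30.0, 40.0], False)
print(path_points(sample))
```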
Paths, vectors, strokes, points, images and drawables
Paths are stored in an object called vectors. More specifically, the object contains a series of strokes, each describing a section of the path. We’ll assume a simple path without any curves, so there is only a single stroke from which to wrest our coveted points. In the code we refer to this stroke as gpoints, which is really a tuple that has a list of points as its third entry. Since Python lists start at 0, the list of points is accessed as gpoints[2]. This list takes the form [x0,y0,x0,y0,x1,y1,x1,y1,...]. Each point is counted twice, because in other settings the list needs to hold curvature information. To avoid repetition, we use the range() function’s step parameter to increment by 4 on each iteration, so that we get the xs in positions 0, 4, 8 and the ys in positions 1, 5, 9. The length of the list of points is bequeathed to us in the second entry of gpoints:
for j in range(0,gpoints[1],4):
You will see a number of references to variables timg and tdraw. These represent the active image and layer (more correctly image and drawable) at the time our function was called. As you can imagine, they are quite handy things to have around because so many tools require at least an image and a layer to work on. So handy, in fact, that when we come to register our script in Gimp, we don’t need to mention them – it is assumed that you want to pass them to your function. Layers and channels make up the class called drawables – the abstraction is warranted here since there is much that can be applied equally well to both.
Bring a bucket
To draw our appropriately-coloured disc on the bokeh layer, we start – somewhat counter-intuitively – by drawing a black disc. Rather than use the paintbrush tool, which would rely on all possible users having consistent brush sets, we will make our circle by bucket filling a circular selection. The selection is achieved like so:
pdb.gimp_image_select_ellipse(timg, CHANNEL_OP_REPLACE, x - radius, y - radius, diameter, diameter)
There are a few constants that refer to various Gimp-specific modes and other arcana. They are easily identified by their shouty case. Here the second argument stands for the number 2, but also for the fact that the current selection should be replaced by the specified elliptical one. The dimensions are specified by giving the top left corner of the box that encloses the ellipse, followed by the box’s width and height. We feather this selection by two pixels, just to take the edge off, and then set the foreground colour to black. Then we bucket fill this new selection in Behind mode so as not to interfere with any other discs on the layer:
pdb.gimp_selection_feather(timg, 2)
pdb.gimp_context_set_foreground('#000000')
pdb.gimp_edit_bucket_fill_full(bokeh_layer, 0,BEHIND_MODE,100,0,False,True,0,0,0)
And now the reason for using black: we are going to draw the discs in additive colour mode. This means that regions of overlapping discs will get brighter, in a manner which vaguely resembles what goes on in photography. The trouble is, additive colour doesn’t really do anything on transparency, so we fill with black first, and then all the black is undone by our new additive disc.
pdb.gimp_context_set_foreground(color)
pdb.gimp_edit_bucket_fill_full(bokeh_layer, 0,ADDITION_MODE,100,0,False,True,0,0,0)
Here are our discs. If you’re feeling crazy you could add blends or gradients, but uniform colour works just fine.
Once we’ve drawn all our discs in this way, we do a Gaussian blur – if requested – on our copied layer. We said that part of the image should stay in focus; you may want to work on this layer later so that it is less opaque at regions of interest. We deselect everything before we do the blur, since otherwise we would just blur our most-recently drawn disc.
if blur > 0:
    pdb.plug_in_gauss_iir2(timg, blur_layer, blur, blur)
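Two pieces of arithmetic in this step are easy to check away from Gimp. The sketch below is plain Python with hypothetical helper names: ellipse_box works out the box handed to the ellipse selection, and add_colours models additive blending as a channel-by-channel sum clamped at 255, which is a common definition but only an assumption about what Gimp does internally.

```python
def ellipse_box(x, y, radius):
    """Top-left corner, width and height of the box around a disc."""
    diameter = 2 * radius
    return (x - radius, y - radius, diameter, diameter)


def add_colours(base, top):
    """Additive blend of two RGB triples, clamped to the 0-255 range."""
    return tuple(min(255, b + t) for b, t in zip(base, top))


# A disc of radius 30 centred on (160, 100).
print(ellipse_box(160, 100, 30))

# Adding a colour to black just gives the colour, which is why the black
# disc disappears under the additive fill.
print(add_colours((0, 0, 0), (200, 120, 40)))

# Where two discs overlap, the sum is brighter than either of them.
print(add_colours((200, 120, 40), (100, 100, 100)))
```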
Softly, softly
Finally we apply our hue and lightness adjustments, and set the bokeh layer to Soft-Light mode, so that lower layers are illuminated beneath the discs. And just in case any black survived the bucket fill, we use the Color-To-Alpha plugin to squash it out.
pdb.gimp_hue_saturation(bokeh_layer, 0, 0, lightness, saturation)
pdb.gimp_layer_set_mode(bokeh_layer, SOFTLIGHT_MODE)
pdb.plug_in_colortoalpha(timg, bokeh_layer, '#000000')
And that just about summarises the guts of our script. You will see from the code on the disc that there is a little bit of housekeeping to take care of, namely grouping the whole series of operations into a single undoable one, and restoring
any tool settings that are changed by the script. It is always good to tidy up after yourself and leave things as you found them. In the register() function, we set its menupath to ‘<Image>/Filters/My Filters/PyBokeh...’ so that if it registers correctly you will have a My Filters menu in the Filters menu. You could add any further scripts you come up with to this menu to save yourself from cluttering up the already crowded Filters menu. The example images show the results of a couple of PyBokeh applications.
After we apply the filter, things get a bit blurry. Changing the opacity of the layer will bring back some detail.
“To finish, group the operations into a single undoable one, and reset any changed tool settings”
Critics may proffer otiose jibes about the usefulness of this script, and indeed it would be entirely possible to do everything it does by hand, possibly even in a better way. That is, on some level at least, true for any Gimp script. But this manual operation would be extremely laborious and error-prone – you’d have to keep a note of the coordinates and colour of the centre of each disc, and you’d have to be incredibly deft with your circle superpositioning if you wanted to preserve the colour addition.
Registering your plugin
In order to have your plugin appear in the Gimp menus, it is necessary to define it as a Python function and then use the register() function. The tidiest way to do this is to save all the code in an appropriately laid out Python script. The general form of such a thing is:
#!/usr/bin/env python
from gimpfu import *

def myplugin(params):
    # code goes here

register(
    proc_name,   # e.g. "python_fu_linesplodge"
    blurb,       # e.g. "Draws a line and a splodge"
    help,
    author,
    copyright,
    date,
    menupath,
    imagetypes,
    params,
    results,
    function)    # "myplugin"

main()
The proc_name parameter specifies what your plugin will be called in the PDB; ‘python_fu’ is actually automatically prepended so that all Python plugins have their own branch in the taxonomy. The menupath parameter specifies what kind of plugin you’re registering, and where your plugin will appear in the Gimp menu: in our case “<Image>/Filters/Artistic/LineSplodge...” would suffice. imagetypes specifies what kind of images the plugin works on, such as “RGB*, GRAY*”, or simply “” if it doesn’t operate on any image, such as in our example. The list params specifies the inputs to your plugin: you can use special Python-Fu types here such as PF_COLOR and PF_SPINNER to get nice interfaces in which to input them. The results list describes what your plugin outputs, if anything. In our case (PF_IMAGE, image, “LSImage”) would suffice. Finally, function is just the Python name of our function as it appears in the code. To ensure that Gimp finds and registers your plugin next time it’s loaded, save this file as (say) myplugin.py in the plugins folder: ~/.gimp-2.8/plug-ins for Linux (ensure it is executable with chmod +x myplugin.py) or %USERPROFILE%\.gimp-2.8\plug-ins\ for Windows users, replacing the version number as appropriate.
Python
Python: Gimp snowflakes
Winter is here, so set your White Walker traps and snares, hunker down and admire some fractal snowflakes with Jonni Bidwell.
As I write this tutorial, the temperatures have begun their steady decline, which seems a reasonable excuse to draw some snowflakes. The canonical snowflake of choice for programmers is based on a fractal curve invented by the Swedish mathematician Helge von Koch. As well as making a pretty picture, this tutorial serves as a gentle introduction to recursion, one of the trickier programming paradigms. Inside the ZIP archive at http://linuxformat.com/files/ca2015.zip you'll find a file called kochflake.py. Assuming you have the newest version (2.8) of Gimp installed, copy this file to your ~/.gimp-2.8/plug-ins folder and give it the chmod +x treatment. If you don't have Gimp installed, then it will certainly be in your distribution's repositories (it will also be there if you do have it installed, incidentally) so let fly with apt-get install, pacman -S, yum install, or whatever is your weapon of choice. When you start Gimp, have a meander to the Filters menu, and there you should find a new sub-menu called My Filters (when you start playing with lots of plug-ins, the Filters menu can easily get crowded, so it's a good idea to annex off your custom additions in this way). Inside the My Filters menu there will be an entry called FractalFlake, so go ahead and click it, if you haven't already, you impatient devil. You will be greeted with a new dialogue – don't worry about the Image and Drawable options, these are irrelevant for plug-ins that output new images. In fact, don't worry about any of the options for your first snowflake, just go ahead and click OK and watch the script work away.
When you're done admiring your handiwork, have a fiddle with the parameters. Size corresponds to the pixel size of the square canvas produced by the plugin (of which the snowflake occupies about 60%). Minimum Length is the line length of each straight line in your fractal. This corresponds to the base case for the recursion (see below): if you have a larger image and a smaller minimum length, then the image will take longer to produce. The random wobble will randomly deviate the vertices of the snowflake by up to this number of pixels, making for a more organic look, or a bit of a mess if you set it too high.
Quick tip
See more on the Thue-Morse connection in a blog piece by Zachary Abel: http://bit.ly/ThueMorse
Recursion, see recursion
Now that you've satisfied all your snowflake-rendering desires and are ready to learn what's going on behind the scenes, let's take a deep breath and think about recursion. A recursive function is one which calls itself. "Shenanigans!", I hear you cry, "Only an infinite loop of despair could result from such a confabulation". And you would be quite correct, were it not for a (good) recursive function calling itself with different parameters and having a non-recursive base case. The
Following steps derived from a binary sequence, our Turtle can trace an ever more convincing von Koch curve.
Turtles and the Thue-Morse sequence
The von Koch curve is a little less messy to program if you forget about geometry and think like a turtle. If you have Python's turtle module installed together with the tk graphical toolkit we can do it in the following snippet of code:
import turtle

def von_koch(t, order, size):
    if order == 0:
        t.forward(size)
    else:
        for angle in [60, -120, 60, 0]:
            # divide by 3.0 so that Python 2 does not truncate to an integer
            von_koch(t, order - 1, size / 3.0)
            t.left(angle)

von_koch(turtle, 5, 400)
This enables us to see a connection with the binary sequence obtained by starting with a 0
Fibonacci sequence example in our Mathematica tutorial [see p88] provides a reasonable example (we won't say a good example, because it's hideously inefficient). The j-th Fibonacci number F(j) is defined as the sum of the (j-1)-th and (j-2)-th Fibonacci numbers, so F(j) = F(j - 1) + F(j - 2). This definition as it stands is not satisfactory until we first specify two initial Fibonacci numbers, traditionally F(0) := 0 and F(1) := 1. Armed with this knowledge, we can work out F(3) as the sum F(2) + F(1). F(2) we don't know, but it is F(1) + F(0) = 1, so F(3) = F(1) + F(0) + F(1) = 2. Another example which might be closer to some of your hearts is a recursive directory listing: print a list of files in the current directory, then for each directory do the same, possibly with some indentation. Here the base case is when the current directory contains no subdirectories, and applying the procedure will traverse down a directory and give a lengthy listing of its entire contents. So once we have our base cases sufficiently well-defined, then recursion is all good. Granted, it's much easier for computers to understand than humans, so have a peruse at the step-by-step guide [see p142] to see how the von Koch curve is constructed. The snowflake is just three of these curves arranged around an equilateral triangle.
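The Fibonacci definition above translates almost word for word into code. This is a minimal sketch of the naive (and, as noted, inefficient) recursive version; the function name fib is our own choice for illustration.

```python
def fib(j):
    """Return the j-th Fibonacci number using the naive recursive definition."""
    if j < 2:
        # Base cases: F(0) = 0 and F(1) = 1
        return j
    # Recursive case: the function calls itself with smaller arguments
    return fib(j - 1) + fib(j - 2)

print([fib(j) for j in range(8)])   # [0, 1, 1, 2, 3, 5, 8, 13]
```

Without the base case the calls would never stop, which is exactly the "infinite loop of despair" a recursive function has to avoid.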
Understanding the code
The code contains a number of housekeeping lines which might be initially distracting, so let's jump straight into the fractalflake() function, which is where all the action is. All Python-fu plugins accept the timg and tdraw arguments (respectively an image and a drawable), even though they are not relevant for functions like this which output new, rather than acting on existing, images. So forgetting about these arguments we have size, min_length and rnd, which are exactly what the user passes to Gimp via the initial dialogue. Our first tasks are to set up an RGB image and a layer to draw on, and start an undo group so that the whole process is seen as one operation, not several hundred carefully directed little lines. We also set up a temporary paintbrush, since drawing lines 1 pixel thick is otherwise tricky, and set the foreground colour to snow white. This is all dealt with in the first nine lines of fractalflake(). After that we have our recursive step, drawStep(), which we will skip over for a minute so we can see how it is called. The code on line 52 refers to three points (ax,ay), (bx,by) and (cx,cy). The first two are the base of our triangle, located 75% of the way down the image and at 20% and 80% in the horizontal direction. The point (cx,cy) is horizontally centred, and up top, 25% down the page.
and adding the sequence's complement at each stage. Thus 0, 01, 0110, 01101001, and so on. This is known as the Thue-Morse sequence. And if we interpret a 0 as an instruction to move the turtle forward by one unit, and a 1 to rotate 60° counter-clockwise, then a term sufficiently far down the Thue-Morse sequence begins to look uncannily von Koch-like, as in the image below-left.
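The construction of the sequence is easy to try out. The following is a small sketch, with thue_morse being a name made up for this illustration; it builds each term by appending the complement of the previous one, exactly as described above.

```python
def thue_morse(steps):
    """Start with '0' and append the bitwise complement `steps` times."""
    seq = "0"
    for _ in range(steps):
        complement = "".join("1" if bit == "0" else "0" for bit in seq)
        seq += complement
    return seq

for n in range(4):
    print(thue_morse(n))   # 0, 01, 0110, 01101001
```

Each pass doubles the length, so thue_morse(n) has 2 to the power n digits.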
We call our drawStep() function to draw three von Koch fractals between these points, and this is where the magic happens. So let's now delve into this function. We first calculate the distance, using the Pythagorean theorem, between the two points passed to drawStep(), then we see if this length is greater than the supplied min_length:
dy = y2 - y1
dx = x2 - x1
length = math.sqrt(dx ** 2 + dy ** 2)
if length > min_length:
First, pay attention to what happens if this isn't the case (skipping over the mess down to line 48), so the points are sufficiently close together. This is our base case and involves nothing more than drawing a straight line:
pdb.gimp_pencil(layer, 4, [x1, y1, x2, y2])
The syntax is a little strange – the 4 refers to the number of co-ordinates, hence twice the number of points. So now we can tackle the recursive case. This looks ugly, but that’s just geometry. We define some new points (px,py), (qx,qy) and (rx,ry): the first two divide the line segment into thirds, and the last (which is tricky to calculate) is located perpendicular to the midpoint at a distance such that an equilateral triangle is formed by these points.
if length > min_length:
The plug-in’s output with a minimum line length of five.
    px = x1 + dx / 3.
    py = y1 + dy / 3.
    mpx = x1 + dx / 2.
    mpy = y1 + dy / 2.
    h = length / 3 * math.sqrt(3) / 2
    qx = px + dx / 3.
    qy = py + dy / 3.
    rx = mpx + h * (y1 - y2) / length
    ry = mpy + h * (x2 - x1) / length
Next, we consider if we are adding a random wobble. If so we make a list of 10 random numbers in the required range; if not we make a list of 10 zeros. We then do the recursive call with the new points perturbed, if requested.
    if rnd > 0:
        r = [random.randrange(0, rnd) for j in range(10)]
    else:
        r = [0 for j in range(10)]
    drawStep(x1 + 0, y1 + 0, px + r[2], py + r[3])
    drawStep(px + r[2], py + r[3], rx + r[4], ry + r[5])
    drawStep(rx + r[4], ry + r[5], qx + r[6], qy + r[7])
    drawStep(qx + r[6], qy + r[7], x2 + 0, y2 + 0)
Notice that we don't let the random wobble affect the end points (x1,y1) and (x2,y2). You are welcome to do this, using x1 + r[0], y1 + r[1] and x2 + r[8], y2 + r[9], but the resulting curve will not be closed, which looks a bit odd.
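The recursion can be tried outside Gimp too. The sketch below follows the same geometry as drawStep() but, instead of drawing, collects the straight segments in a list; koch_segments is a hypothetical name, the random wobble is left out, and qx, qy are computed as two thirds of the way along, which is equivalent to px + dx / 3.

```python
import math

def koch_segments(x1, y1, x2, y2, min_length):
    """Return the straight segments of a von Koch curve from (x1, y1) to (x2, y2)."""
    dx = x2 - x1
    dy = y2 - y1
    length = math.sqrt(dx ** 2 + dy ** 2)
    if length <= min_length:
        # Base case: short enough, so this piece is a single straight line.
        return [(x1, y1, x2, y2)]
    # Points one third and two thirds of the way along the segment.
    px, py = x1 + dx / 3.0, y1 + dy / 3.0
    qx, qy = x1 + 2 * dx / 3.0, y1 + 2 * dy / 3.0
    # Apex of the equilateral triangle built on the middle third.
    h = length / 3.0 * math.sqrt(3) / 2
    rx = x1 + dx / 2.0 + h * (y1 - y2) / length
    ry = y1 + dy / 2.0 + h * (x2 - x1) / length
    segments = []
    for ax, ay, bx, by in [(x1, y1, px, py), (px, py, rx, ry),
                           (rx, ry, qx, qy), (qx, qy, x2, y2)]:
        # Recursive case: each of the four pieces is itself a smaller curve.
        segments.extend(koch_segments(ax, ay, bx, by, min_length))
    return segments

segs = koch_segments(0, 0, 9, 0, 2)
print(len(segs))   # 16: two levels of subdivision, four pieces each time
```

A line of length 9 with a minimum length of 2 is split twice (into pieces of length 3, then 1), giving 4 × 4 = 16 segments.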
And there you have it: barring some trivial housekeeping tasks at the end of the function, fractals and snowflakes are your oysters. There are all manner of other fractals you can draw in this way. Trees and ferns are particularly popular. The register() function at the end is used to register the plug-in in the Gimp Procedure Database. The first argument here is the main function name prefixed by python_fu. Then we have fairly self-explanatory entries for a descriptive name, a more verbose description, an author, a licence and a date. After this we need to specify where in the menu structure the plug-in will appear. If (as in our case) the plug-in has some options then it is customary to indicate this by adding an ellipsis at the end of the menu entry. The next entry specifies which type of images the plugin works on, and we have set this to an empty string since it is irrelevant in our case. Then, more interestingly, we have a list of arguments to pass to our function. The PF_SPINNER type gives a neat way of entering an integer. The first number is the default and then we have a triplet consisting of the minimum, maximum and step size for manipulating the spinner. The same structure works for PF_SLIDER, which controls the input with a sliding bar. Other useful types are PF_TOGGLE, for boolean (on or off) options, as well as the self-explanatory PF_FONT, PF_BRUSH and PF_LAYER.
Ordering up the perfect snowflake
1
Beginning at order 0 (or a line)
The von Koch curve of order 0 is just a humble straight line. There really isn’t that much more one can possibly say about it, aside from it being rather peaceful and reminding us of Flatliners…
2
An order 1 curve (pointy)
If we divide this line into thirds and form an equilateral triangle with the middle third we get the order 1 curve. This curve is made up of four order 0 curves at one third scale.
3
Order 2 curve (emergent snowflake)
Now subdivide each order 0 curve, so that our order 2 curve is constituted of four order 1 curves, or 16 order 0 curves. We can see the familiar snowflake-edge pattern emerging.
4
Order 3 curve (or pretty)
The order 3 curve, composed of 64 order 0 curves. Definitely things are getting more complicated, and you’ve probably got the idea by now: the von Koch curve of order n has 4^n straight lines, and is pretty.
Make a Twitter client Jonni Bidwell shows you how to do Twitter like a boss. A command-line boss that accepts arguments and catches errors.
While prior studies have shown that a great deal of Twitter content can be categorised as ‘phatic’, ‘pointless babble’ and ‘self promotion’, and a great deal more is just plain spam, it is nevertheless a fact that among all the chaff there is some highly informative and up-to-the-minute wheat. Or ‘information’ if you prefer. Twitter gives developers access to its comprehensive REST API, so that they can use custom applications to interact with various tweety resources in a sensible manner. While you could use the API directly in Python, you would have to write a bunch of messy code to parse your queries correctly or unwrap lengthy JSON responses. Mercifully, all of this has been done for you in the python-twitter module, available from all good distributions, or via pip install if you want the latest version. Besides twittering, we will also see how command line options are dealt with using the argparse module, as well as how to do some simple error catching. In order to use the REST API, you must register as a Twitter developer, so hop along to http://dev.twitter.com and declare yourself with your regular Twitter credentials. Then create a new application, populating the Name, Description and Website fields with anything you like. Leave the Callback URL field blank and click the create button. You will get shouted at if your application’s name contains ‘twitter’, so don’t do that. Now go to the Permissions section of your application and change its access to Read and Write. You want
to be able to post stuff after all. Now go to the API Keys section and create an OAuth token. Now grab yourself a copy of the code linked at the top of the page at www.linuxformat.com/archives?issue=184, unzip it and populate config.py with the API Key, API Secret and Access Token Secret respectively. And now that the stage is set, let us see what people are saying about us with a quick search. Run:
python twitter_api.py --search="\"linux format\""
Such slander! We have to escape the inner quotes so that bash doesn’t disappear our results, and the argument is passed quoted to the Twitter search function as a phrase. If you have a butchers at the code, you will see that twitter_api.py processes all the command line options and calls the relevant functions in twitter_functions.py. In the case of our search above, once we have set up our API object, then all it takes is a call to api.GetSearch() and we’ve got ourselves a list of 15 tweets on our chosen subject. Tweets have their own class with a GetText() method for extracting the content, but since this content could be in any character encoding we use the helper function safe_print() to force UTF-8 output where possible. You can read about all the available API methods from the command line with pydoc twitter.Api or you can visit the Google Code website here: http://bit.ly/1jZ5qIl. Note that the argparse module now replaces the old optparse module, providing a handy means of parsing command line arguments.
“Twitter enables developers access to its comprehensive REST API.”
Our search for Windows XP found a lot of worried people – survival kit indeed. Notice how nicely the unicode characters are printed.
Adding argparse
By importing the argparse module, creating an ArgumentParser object and calling parse_args(), your program will get a -h or --help option for free. You can see this in action by creating a file argtest.py with the following contents:
import argparse
parser = argparse.ArgumentParser()
args = parser.parse_args()
Then run python argtest.py -h to see your free usage message. As it stands this is not particularly useful, but once we start adding arguments this will change. Arguments can be positional (mandatory) or optional and we can add a mandatory argument to argtest.py by inserting the following just above the last line:
parser.add_argument("grr_arg", help="Repeat what you just told me")
Taking a REST REST is short for Representational State Transfer and refers to a set of principles for gathering and sharing data rather than any concrete protocol. Twitter implements two major APIs, a RESTful one, which we will use, and a streaming one, which we won’t. The streaming API provides low-latency access to real-time data, which you can do all sorts of fancy stuff with, but the RESTful API provides a simple query and response mechanism which suits our purposes just fine.
There are a couple of ways to authenticate your application with Twitter. If you just want to access public data, then there’s an application-only method. Otherwise you will need to use OAuth tokens. This may seem slightly convoluted for this simple personal-use exercise, but userid/password authentication was turned off last year. Proper OAuth2 authentication is a back and forth dance with a few variations depending on the context. Ultimately it asks the user if an app can use their account, and if the user consents then an access token is returned to the app via a callback URL. Only the authenticated application can use the token and it can be revoked by the user at any time. The upshot is that the application never gets to see the user’s credentials. In our simple situation we hardcode the token to our developer account. If you were making something distributable you would never share the variable secret, and all the access tokens would be requested dynamically.
Now when you run python argtest.py you will be given a stern reprimand about “too few arguments”. If you run it with the -h option, you will see that correct usage of your program requires you to provide a value for grr_arg. We haven’t added any functionality for this option yet, but at least if we run our program with an argument, eg python argtest.py foo, then we no longer get an error, or indeed any output whatsoever. The args namespace we created contains all the arguments that our program expects, so we can use grr_arg by adding the following to our file: print "You argued: {}. Huh.".format(args.grr_arg)
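For reference, here is what argtest.py might look like once those pieces are assembled. This is a sketch rather than the file from the archive: the list passed to parse_args() stands in for typing foo on the command line, and print() is used with a single argument so that it runs under both Python 2 and Python 3.

```python
import argparse

parser = argparse.ArgumentParser()
# A positional (mandatory) argument, as described in the text
parser.add_argument("grr_arg", help="Repeat what you just told me")

# Passing a list stands in for the command line "python argtest.py foo"
args = parser.parse_args(["foo"])
print("You argued: {}. Huh.".format(args.grr_arg))
```

Calling parse_args() with no list makes argparse read the real command line instead, which is what the tutorial's version does.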
Using arguments
More complicated arguments can easily be dealt with; for example we could sum an arbitrarily long list of integers by modifying the add_argument call like this:
parser.add_argument('integers', metavar='N', type=int, nargs='+', help='some integers')
By default, arguments are assumed to be strings, so we use the type= option to stipulate that integers are provided. The metavar directive refers to how our argument is referred to in the usage message, and nargs='+' refers to the fact that many integers may be provided. We could make a regular-ordinary program for summing two integers with nargs=2, but where would be the fun in that? The arguments provided are gathered into the list args.integers, so we can process it like so:
print "The answer is {}.".format(sum(args.integers))
Our Twitter project works exclusively with optional arguments. These creatures are preceded with dashes, often having a long form, eg --verbosity, and a short form, say -v. Our Twitter program has five options in total (not counting the complimentary --help option): --search, --trending-topics, --user-tweets, --trending-tweets, and --woeid. As it stands --woeid only affects the --trending-topics and --trending-tweets options. While the argparse module could easily handle grouping these arguments so that an error is issued if you try and use --woeid with another option, it’s much easier to not bother and silently ignore the user’s superfluous input: haven’t we all seen enough errors? For example, the search argument which takes an additional string argument (the thing you’re searching for) is described as follows:
parser.add_argument("-s", "--search", type=str, dest="search_term", nargs=1, help="Display tweets containing a particular string.")
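Both kinds of argument can be explored without touching Twitter at all. In this sketch the --search definition is the one quoted above, while the --woeid definition (type=int, default None) is an assumption made for illustration and may differ from the one in twitter_api.py.

```python
import argparse

# Positional integers: nargs='+' gathers one or more values into a list
sum_parser = argparse.ArgumentParser()
sum_parser.add_argument('integers', metavar='N', type=int, nargs='+',
                        help='some integers')
sum_args = sum_parser.parse_args(['1', '2', '3'])
print("The answer is {}.".format(sum(sum_args.integers)))

# Optional arguments: --search as in the tutorial, --woeid assumed
parser = argparse.ArgumentParser()
parser.add_argument("-s", "--search", type=str, dest="search_term", nargs=1,
                    help="Display tweets containing a particular string.")
parser.add_argument("--woeid", type=int, default=None,
                    help="Hypothetical definition of the WOEID option.")
args = parser.parse_args(["--search", "linux format"])

# nargs=1 yields a one-element list rather than a bare string
print(args.search_term)
# An optional argument that was not supplied takes its default
print(args.woeid)
```

Note that dest="search_term" controls the attribute name on the namespace, which is why the value is read from args.search_term rather than args.search.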
Once we’ve built up all the arguments then we collate them into a namespace with:
args = parser.parse_args()
so that our search term is accessible via args.search_term, which we pass to search() in twitter_functions.py. This function acquires a list of tweets via:
tweets = api.GetSearch(searchTerm)
and the following block prints them all out, prefixed by the screen name of the individual responsible:
for tweet in tweets:
    print '@' + tweet.user.screen_name + ': ',
    util.safe_print(tweet.GetText())
Our usage instructions for all the optional arguments you can use.
Trends near you
The original Python Twitter code originated from Boston, and hard-coded the Where On Earth ID (WOEID) used by the trendingTopics() function accordingly (it’s 2367105). We can forgive the authors’ clinging to their New England roots, but for this tutorial we have added the --woeid option to see what’s hot elsewhere. This is an optional parameter and only affects the trending topics/tweets functions. If you don’t provide it then results are returned based on global trends using the GetTrendsCurrent() method of the API, rather than GetTrendsWoeid(). You can find the WOEID for a location with the looker-upper at http://zourbuth.com/tools/woeid. For example, we can see what’s going on in sunny Glasgow by the invocation:
python twitter_api.py --trending-topics --woeid=21125
This only works for a few cities, so the wretched backwater wasteland you call home may not have any trends associated
OpenHatch community
OpenHatch.org is a Boston-based not-for-profit with the admirable and noble goal of lowering the barriers into open source development. Its website provides a system for matching volunteer contributors to various community and education projects and it runs numerous free workshops imparting the skills required to become a bona fide open source contributor. Since 2011 it has been running outreach events with a particular focus on Python, but also covering other software and striving to get more women involved with programming. In this tutorial we’ve built on OpenHatch’s Python code for providing simple yet powerful interaction with the Twitter social networking platform. The original code was developed for a Python workshop in 2012 and we have brought it up to date and expanded on it for purposes of this tutorial. In particular we now use the argparse module rather than the deprecated optparse. You can check out some of the other great Python projects from this and other events at the official site (http://bit.ly/1fuabFI). You could even use your mad programming skillz to help out some thoroughly worthy causes.
with it, which results in an error. You can test for this in the Python interpreter as follows, where woeid is the WOEID of your desired location:
import twitter_functions
test = twitter_functions.api.GetTrendsWoeid(woeid)
If you don’t get an error ending with “Sorry, this page does not exist”, then all is well. We use Python’s error catching to fall back to the global trends function GetTrendsCurrent() when this happens:
try:
    trending_topics = api.GetTrendsWoeid(woeid)
except twitter.TwitterError:
    trending_topics = api.GetTrendsCurrent()
It’s prudent (but not necessarily essential, the catchall clause except: is entirely valid) to specify the exception that you want to catch – if you aren’t specific, however, confusion and hairpulling may arise. The common base exceptions include IOError, for when file operations go wrong, and ImportError which is thrown up when you try and import something that isn’t there:
try:
    import sys, absent_module
except ImportError:
    print "the module is not there"
    sys.exit()
Modules will also provide their own exceptions, for example if you try and do this tutorial without a network connection you’ll get an error from the urllib2 module. So we catch that by wrapping the net-dependent functions. We can chain except: clauses, so the next bit of the above code is:
except twitter.urllib2.URLError:
    print ("Error: Unable to connect to twitter, giving up")
    twitter.sys.exit()
The userTweets() function is pretty straightforward, so we’ll just print the relevant segment here:
tweets = api.GetUserTimeline(screen_name=username)
for tweet in tweets:
    util.safe_print(tweet.GetText())
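The fall-back pattern used for the trending topics can be tried without a Twitter account. Everything in this sketch is a stand-in: TrendsError plays the part of twitter.TwitterError, and local_trends() and global_trends() are hypothetical substitutes for GetTrendsWoeid() and GetTrendsCurrent().

```python
class TrendsError(Exception):
    """Hypothetical stand-in for twitter.TwitterError."""

def local_trends(woeid):
    """Pretend lookup: in this sketch only WOEID 21125 has local trends."""
    if woeid != 21125:
        raise TrendsError("Sorry, this page does not exist")
    return ["local topic"]

def global_trends():
    """Pretend global lookup that always succeeds."""
    return ["global topic"]

def trending(woeid):
    try:
        return local_trends(woeid)
    except TrendsError:
        # Only the named exception is caught; anything else still surfaces
        return global_trends()

print(trending(21125))
print(trending(1))
```

Naming the exception keeps genuine bugs, such as a misspelt variable raising NameError, from being silently swallowed by the fall-back.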
Unicode fixer
The function trendingTweets() is a little more complicated: we need to first get a list of trending topics, and then for each of these grab some tweets. But there’s a sting in the tail – sometimes the topics returned will have funky unicode characters in them, and these need to be sanitised before we can feed them to our search function. Specifically, we need to use the quote function of urllib2 to do proper escaping, otherwise it will try and fail to ASCII-ize them.
trending_topics = api.GetTrendsCurrent()
for topic in trending_topics:
    print "**", topic.name
    esc_topic_name = twitter.urllib2.quote(topic.name.encode('utf8'))
    tweets = api.GetSearch(esc_topic_name)
    for tweet in tweets[:5]:
        print '@' + tweet.user.screen_name + ': ',
        util.safe_print(tweet.GetText())
    print '\n'
We’ve been a bit naughty in assuming that there will be at least five tweets; the syntax for limiting the number of tweets GetSearch returns seems to be in a state of flux, but since these are trending it’s reasonable that there will be plenty. And that completes our first foray into pythonic twittering. We have developed the beginnings of a command-line Twitter client, we have parsed options, caught exceptions and sanitised strings. If your appetite is sufficiently whetted then why not go further? You could add a --friends option to just display tweets from your friends, a --post option to post stuff, a --follow option, and really anything else you want.
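The escaping step from trendingTweets() can be tried in isolation. This sketch uses a made-up topic name and imports quote from wherever the running Python keeps it; under Python 2 that is urllib2, as in the tutorial, while the urllib.parse import is our addition for Python 3.

```python
try:
    from urllib.parse import quote   # Python 3
except ImportError:
    from urllib2 import quote        # Python 2, as used in the tutorial

# Made-up topic name containing a non-ASCII character and a space
topic = u"caf\u00e9 society"

# Encode to UTF-8 bytes first, then percent-escape anything unsafe
escaped = quote(topic.encode("utf8"))
print(escaped)   # caf%C3%A9%20society
```

The é becomes the two UTF-8 bytes C3 and A9, each written as a percent escape, so the result is plain ASCII that can be handed to the search function.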
“For this tutorial we have added the --woeid option to see what’s hot elsewhere.”
You might not get exactly the same results as the website, but both methods show that people appear to care about acorns.
Minecraft: Start hacking Use Python on your Pi to merrily meddle with Minecraft, says Jonni Bidwell.
Arguably more fun than the generously provided Wolfram Mathematica: Pi Edition is Mojang’s generously provided Minecraft: Pi Edition. The latter is a cut-down version of the popular Pocket Edition, and as such lacks any kind of life-threatening gameplay, but includes more blocks than you can shake a stick at, and three types of saplings from which said sticks can be harvested. This means that there’s plenty of stuff with which to unleash your creativity, then, but all that clicking is hard work, and by dint of the edition including an elegant Python API, you can bring to fruition blocky versions of your wildest dreams with just a few lines of code.
Don’t try this at home, kids… actually do try this at home.
Assuming you’ve got your Pi up and running, the first step is downloading the latest version from http://pi.minecraft.net to your home directory. The authors stipulate the use of Raspbian, so that’s what we’d recommend – your mileage may vary with other distributions. Minecraft requires the X server to be running so if you’re a boot-to-console type you’ll have to startx. Start LXTerminal and extract and run the contents of the archive like so:
$ tar -xvzf minecraft-pi-0.1.1.tar.gz
$ cd mcpi
$ ./minecraft-pi
See how smoothly it runs? Towards the top-left corner you can see your x, y and z co-ordinates, which will change as you navigate the block-tastic environment. The x and z axes run parallel to the floor, whereas the y dimension denotes altitude. Each block (or voxel, to use the correct parlance) which makes up the landscape is described by integer co-ordinates and a BlockType. The ‘floor’ doesn’t really have any depth, so is, instead, said to be made of tiles. Empty space has the BlockType AIR, and there are about 90 other more tangible substances, including such delights as GLOWING_OBSIDIAN and TNT. Your player’s co-ordinates, in contrast to those of the blocks, have a decimal part since you’re able to move continuously within AIR blocks. The API enables you to connect to a running Minecraft instance and manipulate the player and terrain as befits your megalomaniacal tendencies. In order to service these our first task is to copy the provided library so that we don’t mess with the vanilla installation of Minecraft. We’ll make a special folder for all our mess called ~/picraft, and put all the API stuff in ~/picraft/minecraft. Open LXTerminal and issue the following directives:
$ mkdir ~/picraft
$ cp -r ~/mcpi/api/python/mcpi ~/picraft/minecraft
Dude, where’s my Steve?
Here we can see our intrepid character (Steve) inside the block at (0,0,0). He can move around inside that block, and a few steps in the x and z directions will take Steve to the shaded blue block. On this rather short journey he will be in more than one block at times, but the Minecraft API’s getTilePos() function will choose the block which contains most of him. Subtleties arise when trying to translate standard concepts, such as lines and polygons, from Euclidean space into discrete blocks. A 2D version of this problem occurs whenever you render any kind of vector graphics: say, for instance, you want to draw a line between two points on the screen. Unless the line is horizontal or vertical, a decision has to be made as to which pixels need to be coloured in. The earliest solution to this was provided by Jack Elton Bresenham in 1965, and we will generalise this classic algorithm to three dimensions a little later in this chapter.
Isometric projection makes Minecraft-world fit on this page.
Python
Now without further ado, let’s make our first Minecraftian modifications. We’ll start by running an interactive Python session alongside Minecraft, so open another tab in LXTerminal, start Minecraft and enter a world, then Alt-Tab back to the terminal and open up Python in the other tab (start it from the ~/picraft directory, so that the import can find our copy of the API). Do the following in the Python tab:

    import minecraft.minecraft as minecraft
    import minecraft.block as block
    mc = minecraft.Minecraft.create()
    posVec = mc.player.getTilePos()
    x = posVec.x
    y = posVec.y
    z = posVec.z
    mc.postToChat(str(x) + ' ' + str(y) + ' ' + str(z))

Behold, our location is emblazoned on the screen for a few moments (if not, you’ve made a mistake). These co-ordinates refer to the current block that your character occupies, and so have no decimal point. Comparing these with the co-ordinates at the top-left, you will see that these are just the result of rounding down those decimals to integers (e.g. -1.1 is rounded down to -2). Your character’s co-ordinates are available via mc.player.getPos(), so in some ways getTilePos() is superfluous, but it saves three float to int coercions so we may as well use it. The API has a nice class called Vec3 for dealing with three-dimensional vectors, such as our player’s position. It includes all the standard vector operations such as addition and scalar multiplication, as well as some other more exotic stuff that will help us later on.

We can also get data on what our character is standing on. Go back to your Python session and type:

    curBlock = mc.getBlock(x, y - 1, z)
    mc.postToChat(str(curBlock))

Here, getBlock() returns an integer specifying the block type: 0 refers to air, 1 to stone, 2 to grass, and you can find all the other block types in the file block.py in the ~/picraft/minecraft folder we created earlier. We subtract 1 from the y value since we are interested in what’s going on underfoot – calling getBlock() on our current location should always return 0, since otherwise we would be embedded inside something solid or drowning.
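The rounding down described above is easy to check without Minecraft running. The following is a minimal sketch which assumes that getTilePos() behaves like a floor on each co-ordinate, as the text suggests; tile_pos is a hypothetical helper of our own and not part of the API.

```python
import math

def tile_pos(x, y, z):
    """Hypothetical helper: floor each float co-ordinate to get a block position."""
    return (int(math.floor(x)), int(math.floor(y)), int(math.floor(z)))

print(tile_pos(-1.1, 12.0, 3.7))   # (-2, 12, 3)
print(int(-1.1))                   # -1: plain truncation rounds towards zero instead
```

The second line shows why a bare int() is not quite the same thing for negative co-ordinates: it truncates towards zero rather than rounding down.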
As usual, running things in the Python interpreter is great for playing around, but the grown-up way to do things is to put all your code into a file. Create the file ~/picraft/gps.py with the following code:

    import minecraft.minecraft as minecraft
    import minecraft.block as block

    mc = minecraft.Minecraft.create()
    oldPos = minecraft.Vec3()
    while True:
        playerTilePos = mc.player.getTilePos()
        if playerTilePos != oldPos:
            oldPos = playerTilePos
            x = playerTilePos.x
            y = playerTilePos.y
            z = playerTilePos.z
            t = mc.getBlock(x, y - 1, z)
            mc.postToChat(str(x) + ' ' + str(y) + ' ' + str(z) + ' ' + str(t))

Now fire up Minecraft, enter a world, then open up a terminal and run your program:

    $ python gps.py

The result should be that your co-ordinates and the BlockType of what you’re standing on are displayed as you move about. Once you’ve memorized all the BlockTypes (joke), Ctrl+C the Python program to quit.

We have covered some of the ‘passive’ options of the API, but these are only any fun when used in conjunction with the more constructive (or destructive) options. Before we sign off, we’ll cover a couple of these. As before, start Minecraft and a Python session, import the Minecraft and block modules, and set up the mc object:

    posVec = mc.player.getTilePos()
    x = posVec.x
    y = posVec.y
    z = posVec.z
    for j in range(5):
        for k in range(x - 5, x + 5):
            mc.setBlock(k, y + j, z + 1, 246)

Behold! A 10x5 wall of glowing obsidian has been erected adjacent to your current location. We can also destroy blocks by turning them into air, so we can make a tiny tunnel in our obsidian wall like so:

    mc.setBlock(x, y, z + 1, 0)

This assumes, of course, that you haven’t moved since inputting the previous code. In the rest of this chapter, we’ll see how to build and destroy some serious structures, dabble with physics, rewrite some of the laws thereof, and go a bit crazy within the confines of our 256x256x256 world. Until then, try playing with the mc.player.setPos() function. Teleporting is fun!
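If you would like to see exactly which blocks a wall-building loop will touch before letting it loose on your world, you can list the co-ordinates first. This is only a sketch: wall_coords is a made-up helper that never talks to Minecraft, and it assumes the wall should start at the player’s own altitude, one step away in the z direction.

```python
def wall_coords(x, y, z, width=10, height=5):
    """Return the (x, y, z) of every block in a wall one step away in z.

    Hypothetical helper for dry runs; it only computes co-ordinates.
    """
    coords = []
    for j in range(height):
        for k in range(x - width // 2, x + width // 2):
            coords.append((k, y + j, z + 1))
    return coords

blocks = wall_coords(0, 10, 0)
print(len(blocks))             # 50
print(blocks[0], blocks[-1])   # (-5, 10, 1) (4, 14, 1)
```

With a live connection you could then loop over the list and call mc.setBlock(bx, by, bz, 246) for each tuple.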
All manner of improbable structures can be yours.
Quick tip
Check out Martin O’Hanlon’s website, www.stuffaboutcode.com, which includes some great examples of just what the API is capable of.
Minecraft: Image wall importing
Have you ever wanted to reduce your pictures to 16 colour blocks? You haven’t? Tough – Jonni Bidwell is going to tell you how regardless.
Not some sort of bloodshot cloud, but a giant raspberry floating in the sky. Just another day at the office.
Technology has spoiled us with 32-bit colour, multi-megapixel imagery. Remember all those blocky sprites from days of yore, when one had to invoke something called one’s imagination in order to visualise what those giant pixels represented? In this tutorial we hark back to those halcyon days from the comfort of Minecraft-world, as we show you how to import and display graphics using blocks of coloured wool. Also Python. And the Raspberry Pi.
Standard setup
If you’ve used Minecraft: Pi Edition before you’ll be familiar with the drill, but if not this is how to install Minecraft and copy the API for use in your code. We’re going to assume you’re using Raspbian, and that everything is up to date. You can download Minecraft from http://pi.minecraft.net, then open a terminal and extract the archive as follows (assuming you downloaded it to your home directory):

    $ tar -xvzf ~/minecraft-pi-0.1.1.tar.gz

All the files will be in a subdirectory called mcpi. To run Minecraft you need to have first started X, then from a terminal do:

    $ cd ~/mcpi
    $ ./minecraft-pi

It is a good idea to set up a working directory for your Minecraft project, and to copy the API there. The archive on the disc will extract into a directory called mcimg, so you can extract it to your home directory and then copy the API files like so:

    $ tar -xvzf mcimg.tar.gz
    $ cp -r ~/mcpi/api/python/mcpi ~/mcimg/minecraft

For this tutorial we’re going to use the PIL (Python Imaging Library), which is old and deprecated but is more than adequate for this project’s simple requirements. It can import your .jpg and .png files, among others, so there’s no need to fiddle around converting images. Install it as follows:

    $ sudo apt-get install python-imaging

With just 16 colours, Steve can draw anything he wants (inaccurately).

The most colourful blocks in Minecraft are wool (blockType 35): there are 16 different colours available, which are selected using the blockData parameter. For this tutorial we shall use these exclusively, but you could further develop things to use some other blocks to add different colours to your palette. The process of reducing an image’s palette is an example of quantization – information is removed from the image and it becomes smaller. In order to perform this colour quantization we first need to define our new restrictive palette, which involves specifying the Red, Green and Blue components for each of the 16 wool colours. This would be a tedious process, involving importing an image of each wool colour into Gimp and using the colour picker tool to obtain the component averages, but fortunately someone has done all the hard work already.

We also need to resize our image – Minecraft-world is only 256 blocks in each dimension, so since we will convert one pixel to one block our image must be at most 256 pixels in its largest dimension. However, you might not want your image taking up all that space, and blocks cannot be stacked more than 64 high, so the provided code resizes your image to 64 pixels in the largest dimension, maintaining the original aspect ratio. You can modify the maxsize variable to change this behaviour, but the resultant image will be missing its top if it is too tall.

The PIL module handles the quantization and resizing with one-line simplicity, but we must first define the palette and compute the new image size. The palette is given as a flat list of RGB values, which we then pad out with zeroes so that it has the 256 entries that an 8-bit palette requires. For convenience, we will list our colours in order of the blockData parameter.

    mcPalette = [
        221,221,221,  # White
        219,125,62,   # Orange
        179,80,188,   # Magenta
        107,138,201,  # Light Blue
        177,166,39,   # Yellow
        65,174,56,    # Lime Green
        208,132,153,  # Pink
        64,64,64,     # Dark Grey
        154,161,161,  # Light Grey
        46,110,137,   # Cyan
        126,61,181,   # Purple
        46,56,141,    # Blue
        79,50,31,     # Brown
        53,70,27,     # Green
        150,52,48,    # Red
        25,22,22,     # Black
    ]
    # Pad the palette with black up to 256 colours (three values per colour)
    mcPalette.extend((0,0,0) * (256 - len(mcPalette) // 3))

Unfortunately the division by three is missing from the code on the disc, though it is a mistake without any real consequence
(phew). Padding out the palette in this manner does however have the possibly unwanted side-effect of removing any really black pixels from your image. This happens because their value is closer to absolute black (with which we artificially extended the palette) than the very slightly lighter colour of the ‘black’ wool. To work around this you can change the (0,0,0) above to (25,22,22), so that there are no longer any absolute blacks to match against. A reasonable hack if you’re working with a transparent image is to replace this value with your image’s background colour, so that the transparent parts will not get drawn. We make a new single-pixel dummy image to hold this palette:

    from PIL import Image

    mcImagePal = Image.new("P", (1,1))
    mcImagePal.putpalette(mcPalette)

The provided archive includes the file test.png, which is in fact the Scratch mascot, but you are encouraged to replace it with your own images to see how they survive the {res,quant}ize. You can always TNT the bejesus out of them if you are not happy. To ensure the aspect ratio is accurate we use a float in the division to avoid rounding to an integer.

    mcImage = Image.open("test.png")
    width = mcImage.size[0]
    height = mcImage.size[1]
    ratio = height / float(width)
    maxsize = 64

As previously mentioned, blocks in Minecraft-world do not stack more than 64 high (perhaps for safety reasons). The next codeblock proportionally resizes the image to 64 pixels in its largest dimension.

    if width > height:
        rwidth = maxsize
        rheight = int(rwidth * ratio)
    else:
        rheight = maxsize
        rwidth = int(rheight / ratio)

If you have an image that is much longer than it is high,
then you may want to use more than 64 pixels in the horizontal dimension and fix the height at 64. Replacing the above block with just the two lines of the else clause would achieve precisely this.

Now we convert our image to the RGB colourspace, so as not to confuse the quantize() method with transparency information, and then force upon it our woollen palette and new dimensions. You might get better results by doing the resize first and the quantization last, but we prefer to keep our operations in lexicographical order.

    mcImage = mcImage.convert("RGB")
    mcImage = mcImage.quantize(palette = mcImagePal)
    mcImage = mcImage.resize((rwidth,rheight))

For simplicity, we will position our image close to Steve’s location, five blocks away and aligned in the x direction to be precise. If Steve is close to the positive x edge of the world, or if he is high on a hill, then parts of the image will sadly be lost. Getting Steve’s co-ordinates is a simple task:

    playerPos = mc.player.getPos()
    x = playerPos.x
    y = playerPos.y
    z = playerPos.z

Then it is a simple question of looping over both dimensions of the new image, using the slow but trusty getpixel() method to obtain an index into our palette, and using the setBlock() function to draw the appropriate colour at the appropriate place. If your image has an alpha channel then getpixel() will return None for the transparent pixels and no block will be drawn. To change this behaviour one could add an else clause to draw a default background colour. Image co-ordinates start with (0,0) in the top-left corner, so to avoid drawing upside-down we subtract the iterating variable k from rheight.

    for j in range(rwidth):
        for k in range(rheight):
            pixel = mcImage.getpixel((j,k))
            if pixel < 16:
                mc.setBlock(j + x + 5, rheight - k + y, z, 35, pixel)

Unlike in Doom, this strawberry/cacodemon doesn’t spit fireballs at you. This is good.

To do all the magic, start Minecraft and move Steve to a position that befits your intended image. Then open a terminal and run:

    $ cd ~/mcimg
    $ python mcimg.py

So that covers the code on the disc, but you can have a lot of fun by expanding on this idea. A good start is probably to put the contents of mcimg.py into a function. You might want to give this function some arguments too. Something like the following could be useful as it enables you to specify the image file and the desired co-ordinates:

    def drawImage(imgfile, x=None, y=None, z=None):
        if x is None:
            playerPos = mc.player.getPos()
            x = playerPos.x
            y = playerPos.y
            z = playerPos.z

If no co-ordinates are specified, then the player’s position is used. If you have a slight tendency towards destruction, then you can use live TNT for the red pixels in your image. Just replace the mc.setBlock line inside the drawing loop with the following block:

    if pixel == 14:
        mc.setBlock(j + x + 5, rheight - k + y, z, 46, 1)
    else:
        mc.setBlock(j + x + 5, rheight - k + y, z, 35, pixel)

If you don’t like the resulting image, then it’s good news, everyone – it is highly unstable and a few careful clicks on the TNT blocks will either make some holes in it or reduce it to dust. It depends how red your original image was.

While Minecraft proper has a whole bunch of colourful blocks, including five different types of wooden planks and
stairs, six kinds of stone, emerald, and 16 colours of stained glass, the Pi Edition is a little more restrictive. There are some good candidates for augmenting your palette, though:

    Block name     Block ID   Red   Green   Blue
    Gold           41         241   234     81
    Lapis Lazuli   22         36    61      126
    Sandstone      24         209   201     152
    Ice            79         118   165     244
    Diamond        57         116   217     212

That’s right, Steve, go for the ankles. Let’s see how fast he runs without those!

We have hitherto had it easy insofar as the mcPalette index aligned nicely with the coloured wool blockData parameter. Now that we’re incorporating different blockTypes things are more complicated, so we need a lookup table to do the conversion. Assuming we just tack these colours on to the end of our existing mcPalette definition, like so:

    mcPalette = [
        …
        241,234,81,
        36,61,126,
        209,201,152,
        118,165,244,
        116,217,212
    ]
    mcPaletteLength = len(mcPalette) // 3

then we can structure our lookup table as follows:

    mcLookup = []
    for j in range(16):
        mcLookup.append((35,j))
    mcLookup += [(41,0),(22,0),(24,0),(79,0),(57,0)]

Thus the list mcLookup comprises the blockType and blockData for each colour in our palette. And we now have a phenomenal 31.25% more colours [gamut out of here - Ed] with which to play. To use this in the drawing loop, use the following code inside the for loops:

    pixel = mcImage.getpixel((j,k))
    if pixel < mcPaletteLength:
        bType = mcLookup[pixel][0]
        bData = mcLookup[pixel][1]
        mc.setBlock(j + x + 5, rheight - k + y, z, bType, bData)

In this manner you could add any blocks you like to your palette, but be careful with the lava and water ones: their pleasing orange and blue hues belie an inconvenient tendency to turn into lava/waterfalls. Incidentally, lava and water will combine to create obsidian. Cold, hard obsidian.
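To get a feel for how a colour ends up as a particular block, it can help to do the matching by hand for a single pixel. The sketch below uses a handful of the palette entries from this tutorial and picks the closest one by squared distance in RGB space. That distance measure is an assumption made for illustration rather than a description of what PIL’s quantize() does internally, and PALETTE and nearest_block are names invented here.

```python
# A few palette entries from the tutorial: (R, G, B) -> (blockType, blockData)
PALETTE = [
    ((221, 221, 221), (35, 0)),   # white wool
    ((150, 52, 48),   (35, 14)),  # red wool
    ((25, 22, 22),    (35, 15)),  # black wool
    ((241, 234, 81),  (41, 0)),   # gold
    ((36, 61, 126),   (22, 0)),   # lapis lazuli
]

def nearest_block(rgb, palette=PALETTE):
    """Return the (blockType, blockData) whose colour is closest to rgb.

    Closeness is plain squared Euclidean distance in RGB space, which is
    an assumption of this sketch.
    """
    def dist2(colour):
        return sum((a - b) ** 2 for a, b in zip(rgb, colour))
    colour, block = min(palette, key=lambda entry: dist2(entry[0]))
    return block

print(nearest_block((255, 255, 0)))   # (41, 0)
print(nearest_block((0, 0, 0)))       # (35, 15)
print(nearest_block((200, 30, 30)))   # (35, 14)
```

Pure yellow lands on gold, pure black on black wool and a strong red on red wool. Adding an extra entry such as ((0, 0, 0), None) to the list would reproduce the vanishing-black effect that palette padding causes.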
More dimensions
One of the earliest documentations of displaying custom images in Minecraft: Pi Edition is Dav Stott’s excellent tutorial on displaying Ordnance Survey maps, http://bit.ly/1lP20E5. Two-dimensional images are all very well, but Steve has a whole other axis to play with. To this end the aforementioned Ordnance Survey team has provided, for the full version of Minecraft, a world comprising most of Great Britain, with each block representing 50m. Its Danish counterpart has also done similar, though parts of Minecraft Denmark were sabotaged by miscreants. Another fine example is Martin O’Hanlon’s excellent 3D modelling project. This can import .obj files (text files with vertex, face and texture data) and display them in Minecraft: Pi Edition. Read all about it at http://bit.ly/1sutoOS. Of course, we also have a temporal dimension, so you could expand this tutorial in that direction, giving Steve some animated gifs to jump around on. If you were to proceed with this, then you’d probably have to make everything pretty small – the drawing process is slow and painful. Naturally, someone (Henry Garden) has already taken things way too far and has written Redstone – a Clojure interface to Minecraft which enables movies to be rendered. You can see the whole presentation including a blockified Simpsons title sequence at http://bit.ly/1sO0A2q.
Minecraft: Make a trebuchet
Build your labour of love and then blow it sky high with pyrotechnic Jonni Bidwell, a stash of TNT and an age-old siege machine.
Now that we’re au fait with the basics of the API, it’s time to get crazy creative. Building a house is hard, right? Wrong. With just a few lines of sweet Python your dream home can be yours. Provided your dream home is a fairly standard box construction, that is. If your dreams are wilder all it takes is more code. You will never have to worry about planning permission, utility connection, chancel repair contributions or accidentally digging up a neolithic burial ground (unless you built it first). It never actually rains in Minecraft Pi, so a flat-roof construction will suit our purposes just fine. We kick off proceedings by defining two corners for our house: v1 is the block next to us in the x direction and one block higher than our current altitude, whereas v2 is an aesthetically pleasing distance away:

    pos = mc.player.getTilePos()
    v1 = minecraft.Vec3(1,1,0) + pos
    v2 = v1 + minecraft.Vec3(10,4,6)

Now we create a solid cobblestone cuboid between these vertices and then hollow it out by making a smaller interior cuboid full of fresh air:

    mc.setBlocks(v1.x,v1.y,v1.z,v2.x,v2.y,v2.z,4)
    mc.setBlocks(v1.x+1,v1.y,v1.z+1,v2.x-1,v2.y,v2.z-1,0)
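To get a feel for how much masonry those two setBlocks() calls involve, here is a small dry run that needs no Minecraft connection. It assumes that setBlocks() fills both corner blocks and everything in between, and that the player is standing at (0, 0, 0); cuboid_blocks is a helper made up for this sketch.

```python
def cuboid_blocks(x0, y0, z0, x1, y1, z1):
    """Yield every block position in the inclusive cuboid between two corners."""
    for x in range(min(x0, x1), max(x0, x1) + 1):
        for y in range(min(y0, y1), max(y0, y1) + 1):
            for z in range(min(z0, z1), max(z0, z1) + 1):
                yield (x, y, z)

# Corners as in the tutorial, with the player assumed to be at (0, 0, 0)
v1 = (1, 1, 0)
v2 = (v1[0] + 10, v1[1] + 4, v1[2] + 6)

solid = set(cuboid_blocks(*(v1 + v2)))
air = set(cuboid_blocks(v1[0] + 1, v1[1], v1[2] + 1,
                        v2[0] - 1, v2[1], v2[2] - 1))
print(len(solid), len(air), len(solid - air))   # 385 225 160
```

So the shell of the house is 160 blocks, all placed by two API calls rather than 160 clicks.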
Great, except our only means of egress and ingress is via the rather generous skylight, and a proper floor wood (geddit?) be nice. If you’re standing in a fairly flat area, you’ll notice that the walls of your house are hovering one block above ground level. This space is where our floor will go. If your local topography is not so flat, then your house may be embedded in a hill, or partly airborne, but don’t worry – the required terraforming or adjustments to local gravity will all be taken care of. Let’s make our rustic hardwood floor:

    mc.setBlocks(v1.x,v1.y-1,v1.z,v2.x,v1.y-1,v2.z,5)

The windows are just another variation on this theme:

    mc.setBlocks(v1.x,v1.y+1,v1.z+1,v1.x,v1.y+2,v1.z+3,102)
    mc.setBlocks(v1.x+6,v1.y+1,v1.z,v1.x+8,v1.y+2,v1.z,102)
    mc.setBlocks(v2.x,v1.y+1,v1.z+1,v2.x,v1.y+2,v1.z+3,102)
    mc.setBlocks(v1.x+2,v1.y+1,v2.z,v1.x+4,v1.y+2,v2.z,102)

The roof uses the special half block 44, which has a few different types. Setting the blockData to 2 makes it wooden, matching our floor:

    mc.setBlocks(v1.x,v2.y,v1.z,v2.x,v2.y,v2.z,44,2)

The door is a bit more complicated. The gory details are in the ‘Double door details’ box on page 156, but the following three lines do the job:

    mc.setBlocks(v1.x+2,v1.y,v1.z,v1.x+3,v1.y,v1.z,64,3)
    mc.setBlock(v1.x+2,v1.y+1,v1.z,64,8)
    mc.setBlock(v1.x+3,v1.y+1,v1.z,64,9)

Having lovingly and laboriously constructed our new property, the next step is to come up with new and inventive ways of destroying it. We have already mentioned that TNT can be made live, so that a gentle swipe with a sword (or
anything really) will cause it to detonate. It would be trivial to use setBlocks to fill your house with primed TNT, but we can do much better. Readers, let me introduce my beta trebuchet. Rather than simulating a projectile moving through space we will instead trace its parabolic trajectory with hovering TNT. Detonating the origin of this trajectory will initiate a most satisfying chain reaction, culminating in a big chunk of your house being destroyed.

A house. Now let’s blow it up.

It doesn’t look like much, but just you wait...

First we will cover some basic two-dimensional mechanics. In the absence of friction, a projectile will trace out a parabola determined by the initial launch velocity, the angle of launch and the local gravitational acceleration, which on earth is about 9.81 ms⁻². As a gentle introduction, we will fiddle these constants so that the horizontal distance covered by this arc is exactly 32 blocks and at its peak it will be 16 blocks higher than its original altitude. If blocks were metres, then this fudge would correspond to a muzzle velocity just shy of 20 ms⁻¹, and an elevation of about 63 degrees. We will only worry about two dimensions, so the arc will be traced along the z axis with the x co-ordinate fixed just next to our door. This is all summed up by the simple formula y = z(2 - z/16), which we implement this way:

    for j in range(33):
        height = v1.y + int(j*(2 - j/16.))
        mc.setBlock(v1.x+4,height,v1.z-j,46,1)

The final argument sets the TNT to be live, so have at it with your sword and enjoy the fireworks. Or maybe not: the explosions will, besides really taxing the Pi’s brain, cause some TNT to fall, interrupting the chain reaction and preserving our lovely house. We don’t want that, so we instead use the following code:

    height = v1.y
    oldheight = height
    ground = height - 1
    j = 0
    while ground <= height:
        mc.setBlocks(v1.x + 4,oldheight,v1.z - j,v1.x + 4,height,v1.z - j,46,1)
        j += 1
        oldheight = height
        height = v1.y + int(j * (2 - j / 16.))
        ground = mc.getHeight(v1.x + 4, v1.z - j)

This ensures that our parabola is gap-free and also guards against the TNT arc-en-ciel terminating in mid-air. We have dealt with this latter quandary using the getHeight() function to determine ground level at each point in the arc, and stop building when we reach it. Note that we have to make the getHeight() call before we place the final TNT block, since the height of the world is determined by the uppermost non-air object, even if the said object is hovering. If our construction exceeds the confines of the Minecraft world, then you could just build another house in a better situation, or you could change v1.z - j to max(-116,v1.z-j) in the above loop, which would make a vertical totem of danger right at the edge of the world. Now that we have our trajectory, we can add the mighty siege engine:

    z = v1.z - j - 1
    mc.setBlocks(v1.x + 3, oldheight, z + 10, v1.x + 6, oldheight + 2, z + 7, 85)
    mc.setBlocks(v1.x + 4, oldheight + 2, z + 12, v1.x + 4, oldheight + 2, z + 1, 5)

Up until this point, we have aligned everything along a particular axis: our house (before you blew it up) faces the negative z direction, which might be akin to facing south, and this is also the direction along which our explosive parabola is traced. Naturally, we could rotate everything 90 degrees and the code would look much the same – modulo some judicious permuting of x, y and z and +/- – though your house would look a bit funny built on its side. Things get complicated
if we want to shed the yoke of these grids and right angles, to work instead with angles of our choosing. The problem is how to approximate a straight line when our fundamental units are blocks of fixed orientation rather than points. A general three-dimensional drawline() function will prove invaluable in your subsequent creations, enabling you to create diverse configurations from parallelepipeds to pentagrams. What is required is a 3D version of the classical Bresenham algorithm. Pi guru Martin O’Hanlon’s GitHub contains several wonderful Minecraft Pi Edition projects, including a mighty cannon from which this beginner project takes its inspiration. Martin has a whole Python drawing class, which includes the aforementioned 3D line algorithm, but once you understand the 2D version the generalisation is reasonably straightforward.

Let us imagine we are in a Flatland-style Minecraft world in which we wish to approximate the line in the (x,y) plane connecting the points (-2,-2) and (4,1). This line has the equation y = 0.5x - 1. The algorithm requires that the gradient of the line is between 0 and 1, so in this case we are fine. If we wanted a line with a different slope, then we can flip the axes in such a way as to make it conform. The crux of the algorithm is the fact that our pixel line will fill only one pixel (block) per column, but multiple pixels per row. Thus as we rasterize pixel by pixel in the x direction, our y co-ordinate will either stay the same or increment by 1. Some naïve Python would then be:

    dx = x1 - x0
    dy = y1 - y0
    y = y0
    error = 0
    grad = dy / float(dx)
    for x in range(x0, x1 + 1):
        plot(x, y)
        error = error + grad
        if error >= 0.5:
            y += 1
            error -= 1

where plot() is some imaginary plotting function and grad is between 0 and 1. Thus we increment y whenever our error term accumulates sufficiently, and the result is the image which meets your gaze. Bresenham’s trick was to reduce all the calculations to integer operations, which were far more amenable to 1960s hardware. Nowadays we can do floating point calculations at great speed, but it is still nice to appreciate these novel hacks. The floating point variables grad and error arise due to the division by dx, so if we multiply everything by this quantity and work around this scaling, then we are good to go.

Quick tip
The trebuchet code was inspired by the amazing Martin O’Hanlon and his Pi-based projects on www.stuffaboutcode.com.

Swiss cheese, baby! (Your house may need some repairs after this tutorial.)

Quick tip
You can do all the coding here in the interpreter, but copying errors are frustrating. Thus it might be easier to put it all in a file called house.py which you can execute with python house.py while Minecraft is running.

Getting this working in three dimensions is not so much of an abstractive jump. We first find which is the dominant axis (the one with the largest change in co-ordinates) and flip things around accordingly. We then move along the dominant axis one block at a time, incrementing the co-ordinates of the minor axes as required. We have to pay careful attention to the sign of each co-ordinate change, which we store in the variable ds. The ZSGN() function returns 1, -1 or 0 if its argument is positive, negative or zero respectively; I have left coding this as an exercise for the reader. We make extensive use of a helper function minorList(a,j) which returns a copy of the list a with the jth entry removed. We can code this using a one-liner thanks to lambda functions and list slicing:

    minorList = lambda a,j: a[:j] + a[j + 1:]

Our function getLine() will take two vertices, which we will represent using three-element lists, and return a list of all the vertices in the resulting 3D line. All of this is based on Martin’s code, for which we should all be grateful.

The first part of it initialises our vertex list and deals with the easy case where both input vertices are the same...
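Returning briefly to the two-dimensional case, here is one way the ‘multiply everything by dx’ step might look in practice. It is a sketch rather than Bresenham’s original formulation: the error term is scaled by 2*dx so that the comparison with 0.5 also becomes an integer test, and bresenham_2d is a name chosen for illustration. It only handles lines with a gradient between 0 and 1, drawn left to right.

```python
def bresenham_2d(x0, y0, x1, y1):
    """Integer-only line from (x0, y0) to (x1, y1), for 0 <= gradient <= 1."""
    dx = x1 - x0
    dy = y1 - y0
    if not 0 <= dy <= dx:
        raise ValueError("gradient must be between 0 and 1 with x0 <= x1")
    points = []
    y = y0
    error = 0                 # 2*dx times the floating point error term
    for x in range(x0, x1 + 1):
        points.append((x, y))
        error += 2 * dy       # was: error += dy/dx
        if error >= dx:       # was: error >= 0.5
            y += 1
            error -= 2 * dx   # was: error -= 1
    return points

print(bresenham_2d(-2, -2, 4, 1))
# [(-2, -2), (-1, -1), (0, -1), (1, 0), (2, 0), (3, 1), (4, 1)]
```

Every quantity in the loop is now a whole number, and the list it returns for the example line from (-2,-2) to (4,1) has exactly one block per column.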
Double door details
Putting doors into our house is our first encounter with the additional blockData parameter. This is an integer from 0 to 15 and controls additional properties of blocks, such as the colour of wool and whether or not TNT is live. Our door occupies four blocks and is aligned in the x direction. It’s recessed slightly back from the surrounding walls, closed, and has the handles helpfully placed towards the middle. These properties are controlled by various bits of the blockData. We number the four bits from the rightmost (bit 0) to the leftmost (bit 3), so that 8 is represented in binary as 1000. Bit 3 is set if the block is part of the top section of a door. If this is the case, then bit 0 is the only other bit of concern: it determines the placement of the handles/hinges. Top sections of doors thus have blockData 8 or 9. For the bottom sections we have the following bit assignments:

    bit 3    off
    bit 2    door is open
    bit 1    door recessed
    bit 0    alignment (off=x, on=z)

The top sections must be placed after the bottom ones, since they inherit their properties from their inferiors.
Doors are always a good idea for those wishing to avoid claustrophobia/death.
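If bit-twiddling is new to you, a tiny decoder makes the door values used in this tutorial easier to read. The sketch below simply follows the bit assignments given in the box; the function name and dictionary keys are invented for illustration and are not part of the Minecraft API.

```python
def decode_door(data):
    """Split a door's blockData (0-15) into the flags described in the box."""
    if data & 8:                      # bit 3: top section of the door
        return {"top": True, "hinge_bit": bool(data & 1)}
    return {
        "top": False,
        "open": bool(data & 4),       # bit 2
        "recessed": bool(data & 2),   # bit 1
        "z_aligned": bool(data & 1),  # bit 0
    }

for value in (3, 8, 9):
    print(value, format(value, "04b"), decode_door(value))
```

Running it on 3, 8 and 9 (the three values used for our double door) prints each number in binary next to the flags it sets.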
Here our line is just a single block:

    def getLine(v1, v2):
        vertices = []
        if v1 == v2:
            vertices.append(v1)

After this it gets a bit ugly. We set up the previously mentioned list of signs ds, and a list of absolute differences (multiplied by two) a. The idx = line is technically bad form: we want to find our dominant axis, thus the index of the maximum entry in a. Using the index() method together with max means that we are looping over our list twice, but since this is such a short list we shan’t worry, and it looks much nicer this way. We refer to the dominant co-ordinates by X and X2. Our list s is a re-arrangement of ds, with the dominant co-ordinate at the beginning. And there are some other lists to keep track of the errors. The variable aX holds twice the absolute co-ordinate change along our dominant axis.

        else:
            ds = [ZSGN(v2[j] - v1[j]) for j in range(3)]
            a = [abs(v2[j]-v1[j]) << 1 for j in range(3)]
            idx = a.index(max(a))
            X = v1[idx]
            X2 = v2[idx]
            delta = a[idx] >> 1
            s = [ds[idx]] + minorList(ds,idx)
            minor = minorList(v1,idx)
            aminor = minorList(a,idx)
            dminor = [j - delta for j in aminor]
            aX = a[idx]

With all that set up we can delve into our main loop, in which vertices are added, differences along minor axes examined, errors recalculated, and major co-ordinates incremented. Then we return a lovely list of vertices.

            loop = True
            while(loop):
                vertices.append(minor[:idx] + [X] + minor[idx:])
                if X == X2:
                    loop = False
                for j in range(2):
                    if dminor[j] >= 0:
                        minor[j] += s[j + 1]
                        dminor[j] -= aX
                    dminor[j] += aminor[j]
                X += s[0]
        return vertices

To conclude in style, we will test this function by making a mysterious and precarious beam of wood next to where we are standing as a fitting testament to your wondrous labours this day, padawan.

    v1 = mc.player.getTilePos() + minecraft.Vec3(1,1,0)
    v2 = v1 + minecraft.Vec3(5,5,5)
    bline = getLine([v1.x,v1.y,v1.z],[v2.x,v2.y,v2.z])
    for j in bline:
        mc.setBlock(j[0],j[1],j[2],5)

Over the page, we’ll look at creating a fully functioning Minecraft cannon. Boom!
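For anyone who would like to check their attempt at the ZSGN() exercise, or try the line algorithm without a Pi to hand, here is one possible self-contained assembly of the pieces described in this tutorial. The ZSGN() shown is just one way of writing it, and the loop exits with a break rather than a flag; treat the whole thing as a sketch for experimenting with, not as Martin’s original code.

```python
def ZSGN(a):
    """Return 1, -1 or 0 for a positive, negative or zero argument."""
    return (a > 0) - (a < 0)

# Copy of list a with the jth entry removed
minorList = lambda a, j: a[:j] + a[j + 1:]

def getLine(v1, v2):
    """Return the blocks on a 3D line between two three-element vertices."""
    v1, v2 = list(v1), list(v2)
    if v1 == v2:
        return [v1]
    vertices = []
    ds = [ZSGN(v2[j] - v1[j]) for j in range(3)]      # direction of travel
    a = [abs(v2[j] - v1[j]) << 1 for j in range(3)]   # doubled differences
    idx = a.index(max(a))                             # dominant axis
    X, X2 = v1[idx], v2[idx]
    delta = a[idx] >> 1
    s = [ds[idx]] + minorList(ds, idx)
    minor = minorList(v1, idx)
    aminor = minorList(a, idx)
    dminor = [j - delta for j in aminor]              # scaled error terms
    aX = a[idx]
    while True:
        vertices.append(minor[:idx] + [X] + minor[idx:])
        if X == X2:
            break
        for j in range(2):
            if dminor[j] >= 0:
                minor[j] += s[j + 1]
                dminor[j] -= aX
            dminor[j] += aminor[j]
        X += s[0]
    return vertices

print(getLine([0, 0, 0], [5, 5, 5]))
print(getLine([-2, -2, 0], [4, 1, 0]))
```

The first call prints the six blocks of a perfect diagonal, and the second reproduces the Flatland example from earlier with a zero z co-ordinate tacked on.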
Now we can escape the gridlock and build at whatever angles our heart desires.
Don’t try this at home, kids.
Minecraft: Build a cannon
Learn some object-oriented Python and, just as importantly, blow more stuff up as Jonni Bidwell continues his adventures with Minecraft: Pi Edition.

Quick tip
Try experimenting with the velocity and blast radius arguments to the cannon.fire() method on line 376 of minecraft-cannon.py.

In the previous tutorial, your blockophilic author had some fun building and demolishing a house, while you learned about the Minecraft API, the Bresenham algorithm for drawing blocky lines and how to fiddle with the bits of the blockData value to make lovely doors and live TNT. If you’ve skipped straight to this section, I advise you to go back and try the trebuchet first. In this outing we’ll learn some equally valuable lessons and continue with the destructive theme, this time by way of a controllable cannon courtesy of Martin O’Hanlon (www.stuffaboutcode.com). All the code is on his site (http://bit.ly/1u9D2bs) and you can run it from its directory with a simple python minecraft-cannon.py. The code runs to nearly 400 lines so we won’t cover everything – just the juicy bits.

This project serves as a nice introduction to object oriented programming, so first a few words on this topic. In programming parlance, an object is an instantiation of a class. This definition is, to start with at least, unsatisfactory at best and meaningless at worst. Think instead of objects as a family of which all the standard programming constructs (arrays, data types, functions – pretty much anything) are distinct species. Thus you have likely already done some work with objects, possibly without knowing it. A particular object has some methods associated with it: for example we can add, subtract, divide and multiply integers; compare and concatenate strings; slice, curtail and append to lists, and so on. These methods all come for free when we instantiate an object. So when we do proper object oriented programming we make a blueprint detailing our own custom methods for our own bespoke objects. This blueprint is called a class, and in Python the methods therein are defined as functions. If you look at Martin’s code, the first class we come across (line 33) is called MinecraftDrawing. It’s fairly lengthy, comprising six methods, dealing with all the drawing primitives one could hope for – the drawing of points, lines, spheres and faces.

Taking advantage of objects
You might wonder at this point why exactly this objectification of drawings is necessary, as opposed to just having a module housing all the drawing functions. This is certainly a valid concern – it is entirely possible to modularise this, but if you look closely within the class block you will see that many of the methods share the variable self.mc defined in the special __init__() method. Thus if you were to extract all these methods to independent functions, many of them would have to take an extra mc parameter. For a single variable this won’t hurt, but you can easily imagine how the situation could deteriorate. Object oriented programming helps us group variables and functions (or data and behaviour) in a sensible
Steve watches nonchalantly as fiery obsidian hell is unleashed on an unsuspecting, if rashly exposed, tree.
manner. If nothing else, you will have noted the proliferation of self throughout each class. Strictly speaking self is not a keyword but the conventional name for the first argument of a method, and it is through this argument that the method reaches the object’s own properties – the variables self.* specified in the __init__() method. The other three classes in the module (MinecraftBullet, MinecraftCannon and CannonCommands) all define more than one property, more than justifying their class-ification, and refer to each other in a coherent manner as required.

The cannon is controlled by a simple command interpreter provided by the cmd module, which lets you call functions with arguments by wrapping them in directive functions. This is a great example of Python taking something which would otherwise be laborious and tedious and making it child’s play. All that is needed is a class which subclasses the cmd.Cmd class and contains all the commands you require the interpreter to understand. These commands are defined as methods with names of the form do_*() – hence, for example, the method describing the exit command is named do_exit(). Any such method which returns a true value will exit the interpreter loop, thus all of them except do_exit() and do_EOF() (called when Ctrl + D is pressed) don’t return anything. Since we have subclassed cmd.Cmd we have to call its __init__() method manually to initialise the interpreter, and we also set up a custom introductory message and prompt here, so that the CannonCommands class begins as follows:

    class CannonCommands(cmd.Cmd):
        def __init__(self):
            cmd.Cmd.__init__(self)
            self.prompt = "Stuffaboutcode.com Cannon >> "
            self.intro = "Minecraft Cannon - www.stuffaboutcode.com"

With all this set up, we have a fully functional command line, with history, line editing and even bash-like tab completion. Furthermore, we even get a help command which will print the docstrings for the do_* functions.
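To see how little is needed, here is a minimal sketch of a cmd.Cmd subclass that has nothing to do with Minecraft. The class name DemoCommands and its tilt command are made up for illustration and are not taken from Martin’s code; only the cmd module calls are standard.

```python
import cmd


class DemoCommands(cmd.Cmd):
    """A toy interpreter; every name in this class is hypothetical."""

    def __init__(self):
        cmd.Cmd.__init__(self)
        self.prompt = "demo >> "
        self.intro = "Demo interpreter - type help or exit"
        self.angle = 0

    def do_tilt(self, line):
        "Set the angle of elevation, e.g. tilt 45"
        try:
            self.angle = int(line)
        except ValueError:
            print("usage: tilt <whole number of degrees>")

    def do_exit(self, line):
        "Leave the interpreter"
        return True   # a true return value ends the command loop

    def do_EOF(self, line):
        "Leave the interpreter when Ctrl + D is pressed"
        return True


# To try it interactively, run: DemoCommands().cmdloop()
```

Whatever follows the command name arrives in the line argument as a single string, so each do_*() method is responsible for parsing its own arguments. Typing help tilt at the prompt prints that method’s docstring.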
“Delightful, but how do I get my weapon?” I hear you interject, your enthusiasm perhaps giving way to impatience. This is simply a case of invoking the start command which sets up the mc object (put in the self namespace since it is shared amongst the whole class) and instantiates a MinecraftCannon object three blocks away from the player’s current position. The cannon itself is pretty simple: drawCannon() draws a 3x3 wooden base, and drawGun() draws 5 blocks of dark wool in
a line. The reason for having two functions here is that we can change the angles of azimuth and elevation for the cannon, necessitating its redrawing. This is achieved from the interpreter using the commands rotate and tilt respectively, which in turn call the setDirection() and setAngle() methods of the cannon class. The coordinates of the end of the cannon are calculated by considering the point on an appropriately sized sphere centred at the fuse end of the cannon. Details are in the Spherical Trigonometry box overleaf – paint over it if trigonometry triggers youth-related trauma. When the cannon is tilted or rotated the clearGun() method is called, which draws over the gun with air blocks. Then we calculate the new end-point of the cannon as described in the box, and use mcDrawing.drawLine to draw the appropriate line. The latter function calls getLine() (the longest function in the module), which is an implementation of the Bresenham line algorithm which we talked about last issue. (If you missed it, don’t worry, but if you worry get yourself a back issue as instructed below.) So now we come to the best bit: firing the cannon. This instantiates a MinecraftBullet object with velocity 1 (it moves
Steve wishes his balls were just a bit more destructive.
My first objects
As a gentle introduction to object oriented programming, let’s consider a simple music library application. Here our objects will be the library itself and our favourite tracks (whether that’s some rousing Brahms or the latest Israeli psytrance anthems) and we will have a method for adding tracks.

    class library:
        def __init__(self):
            self.lib = []
        def add(self,trackobj):
            self.lib += [trackobj]

    class track:
        def __init__(self,artist,title):
            self.artist = artist
            self.title = title

Here our library uses just the standard list methods, and we bring it to fruition and populate it as follows:

    >>> mylib = library()
    >>> mylib.add(track("Tom Lehrer", "Poisoning pigeons in the park"))
    >>> mylib.add(track("Bill Bailey", "Beautiful ladies in danger"))

It’s admittedly not much, but you get the idea. You can (and should) add another special method __repr__() which returns how these objects are displayed – the default representation is not so helpful:

    >>> mylib.lib[0]
    <__main__.track object at 0x7f12eaad9908>

So inside the track class you could add the following lines:

    def __repr__(self):
        return("<Track> %s,%s" % (self.artist, self.title))

This will give you a slightly more informative description in each case.
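Put together as one runnable script, the pieces in this box might look like the sketch below. The __repr__() on the library class is our own addition for illustration and is not part of the box.

```python
class track:
    def __init__(self, artist, title):
        self.artist = artist
        self.title = title

    def __repr__(self):
        return "<Track> %s,%s" % (self.artist, self.title)


class library:
    def __init__(self):
        self.lib = []

    def add(self, trackobj):
        self.lib.append(trackobj)

    def __repr__(self):
        # Hypothetical extra: summarise the library rather than dump every track
        return "<library> %d tracks" % len(self.lib)


mylib = library()
mylib.add(track("Tom Lehrer", "Poisoning pigeons in the park"))
mylib.add(track("Bill Bailey", "Beautiful ladies in danger"))
print(mylib.lib[0])   # <Track> Tom Lehrer,Poisoning pigeons in the park
print(mylib)          # <library> 2 tracks
```

Because print() falls back on __repr__() when no __str__() is defined, the same method serves both the interactive prompt and printed output.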
You have to fire this one manually, but Steve don’t care, he crazy.
at 1 block per tick or hundredth of a second) and blast radius 3 (when it hits something an empty sphere of radius 3 will result). The bullet itself is just a single block of glowing obsidian, so drawing it and erasing it are straightforward – see the one-line methods draw() and clear(). We have to work out the velocities in three dimensions, which requires similar trigonometry to that involved in drawing the gun barrel. We then enter into a while loop, calling bullet.update() once per tick, which will return True until a collision occurs. Velocity in the negative y direction (due to gravity) increases linearly with time, as we see in line 250:

    self.yVelocity = self.yStartVelocity + self.gravity * self.ticks

whereas in the x and z dimensions it remains constant as friction is not worth bothering with. Velocity is measured in blocks per tick, so the new position is calculated by incrementing each component of the old position with the corresponding component of the velocity (line 253) and then rounded to integer co-ordinates (line 258). If the projectile is moving slowly, then the rounding could result in its remaining in situ across ticks, in which case there’s no point wasting
Spherical trigonometry
Given an azimuthal (horizontal) angle phi and an angle of elevation (vertical) theta, the point on a sphere centred on the origin and having radius l is calculated by trigonometry as shewn in the diagram. Here the blue dot represents a point on the sphere and the black dot its projection onto the x-z plane. Because of how the angles are defined, the y co-ordinate ends up with a slightly tidier expression than the others, which we can see in the function findPointOnSphere() (line 24):

    def findPointOnSphere(cx, cy, cz, radius, phi, theta):
        x = cx + radius * math.cos(math.radians(theta)) * math.cos(math.radians(phi))
        z = cz + radius * math.cos(math.radians(theta)) * math.sin(math.radians(phi))
        y = cy + radius * math.sin(math.radians(theta))

The trig functions require angles to be converted to radians (we assume that the rotate and tilt commands take their angles in degrees). You may recall that there are exactly pi radians in 180 degrees, and they enjoy the property of being an entirely dimensionless measure.
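A quick way to convince yourself of these formulae is to feed them a few easy angles. The sketch below is a standalone version of the function; the return statement, which rounds to whole blocks, is our assumption about how the excerpt ends rather than a quotation from the original.

```python
import math


def find_point_on_sphere(cx, cy, cz, radius, phi, theta):
    """Point on a sphere centred at (cx, cy, cz); phi and theta are in degrees."""
    phi_r = math.radians(phi)      # azimuth, measured around the vertical axis
    theta_r = math.radians(theta)  # elevation above the x-z plane
    x = cx + radius * math.cos(theta_r) * math.cos(phi_r)
    z = cz + radius * math.cos(theta_r) * math.sin(phi_r)
    y = cy + radius * math.sin(theta_r)
    # Assumption: round to the nearest whole block for drawing
    return int(round(x)), int(round(y)), int(round(z))


print(find_point_on_sphere(0, 0, 0, 5, 0, 0))    # (5, 0, 0): level, along x
print(find_point_on_sphere(0, 0, 0, 5, 90, 0))   # (0, 0, 5): level, along z
print(find_point_on_sphere(0, 0, 0, 5, 0, 90))   # (0, 5, 0): straight up
```

With the elevation at 90 degrees the cosine factor vanishes, so the azimuth no longer matters and only the y co-ordinate changes.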
Nobody else in the Coding Academy 2015 team respects me for including this image.
effort redrawing it. We test that this is not the case with

    if matchVec3(newDrawPos, self.drawPos) == False:

and then proceed with the redrawing. This involves a simple check that the new position is empty space:

    if self.mc.getBlock(newDrawPos.x, newDrawPos.y, newDrawPos.z) == block.AIR:

If this is the case then we disappear the old obsidian block, update the draw position and redraw:

    self.clear()
    self.drawPos = minecraft.Vec3(newDrawPos.x, newDrawPos.y, newDrawPos.z)
    self.draw()

If not then we make our crater and change movedBullet to False so that we exit the update loop:

    self.mcDrawing.drawSphere(newDrawPos, self.blastRadius, block.AIR)
    movedBullet = False
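The logic of the update step can be tried out without a Minecraft server. What follows is a rough sketch under several assumptions: the class name BulletSketch, the gravity value of -0.05 blocks per tick per tick and the is_air callback (which stands in for the getBlock() test) are all invented for illustration, and the drawing calls are reduced to comments.

```python
import math


class BulletSketch:
    """Projectile stepped once per tick; names and gravity value are illustrative."""

    def __init__(self, start, direction, angle, speed, gravity=-0.05):
        self.pos = [float(c) for c in start]
        self.draw_pos = tuple(int(round(c)) for c in start)
        phi, theta = math.radians(direction), math.radians(angle)
        # Same trigonometry as the gun barrel, with speed in place of radius
        self.x_velocity = speed * math.cos(theta) * math.cos(phi)
        self.z_velocity = speed * math.cos(theta) * math.sin(phi)
        self.y_start_velocity = speed * math.sin(theta)
        self.gravity = gravity
        self.ticks = 0

    def update(self, is_air):
        """Advance one tick; return False once the bullet hits something."""
        self.ticks += 1
        y_velocity = self.y_start_velocity + self.gravity * self.ticks
        self.pos[0] += self.x_velocity
        self.pos[1] += y_velocity
        self.pos[2] += self.z_velocity
        new_draw_pos = tuple(int(round(c)) for c in self.pos)
        if new_draw_pos == self.draw_pos:
            return True                    # rounding left us in the same block
        if is_air(*new_draw_pos):
            self.draw_pos = new_draw_pos   # the real code clears and redraws here
            return True
        return False                       # the real code carves the crater here


# A flat world in which everything below y = 0 is solid
bullet = BulletSketch((0, 1, 0), direction=0, angle=45, speed=1.0)
ticks = 0
while bullet.update(lambda x, y, z: y >= 0) and ticks < 1000:
    ticks += 1
print(ticks, bullet.draw_pos)
```

The bullet’s true position is kept as floats and only the drawing position is rounded, so slow-moving shots accumulate their fractional progress instead of stalling.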
Better explosions
Remember in our previous tutorial how we blew all that stuff up using chain reactions from TNT? Now we’re going to upgrade our cannon fire using similar principles. Since the only way to detonate TNT is by hitting it or by detonating another block of TNT in its vicinity, we will have to manually trigger the explosion, which means that the fire command will work differently this time around. Specifically, it will now add a block of TNT to the end of the barrel, leaving poor Steve to light the blue touch paper and run. We need to monitor this block and act swiftly when it is struck – a matter of delicate timing, to be sure. Just before it explodes, we need another TNT block to appear just in front of it, and so on – creating the illusion of a moving/exploding/super nashwan fireball of destruction. The code for this part of the exercise is called tntcannon.py.

Once Steve hits the TNT and it starts flashing, the getBlock() method actually detects it as air, which is a convenient trigger for us to prepare to place the next block in the chain. This waiting with bated breath is achieved using the following loop, in which pass is the standard Python ‘do nothing’ command:

    while self.mc.getBlock(xt, yt, zt) == 46:
        pass
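A loop built around pass asks the game for the block as fast as Python can manage. If that troubles you, one possible variation (not what tntcannon.py does) is to sleep briefly between polls and optionally give up after a while. The function name and its parameters below are hypothetical, and get_block is any zero-argument callable, so the idea can be tested without Minecraft.

```python
import time


def wait_while_block_is(get_block, expected_id, poll_interval=0.05, timeout=None):
    """Poll get_block() until it stops returning expected_id.

    Returns True when the block changed, False if timeout seconds passed first.
    """
    start = time.time()
    while get_block() == expected_id:
        if timeout is not None and time.time() - start > timeout:
            return False
        time.sleep(poll_interval)
    return True


# Stand-in for the game: TNT (block id 46) three times, then air (0)
readings = iter([46, 46, 46, 0])
print(wait_while_block_is(lambda: next(readings), 46, poll_interval=0))   # True
```

In the cannon this might be called with something like lambda: self.mc.getBlock(xt, yt, zt) as the first argument. The price is that ignition is noticed up to one poll interval late, which matters little next to the sleeps of several tenths of a second used elsewhere in the firing sequence.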
From this moment of ignition we have a four-second window in which to get someplace safe and ensure that the next block is placed in a timely manner. The initial explosion will instigate a chain reaction, and empirical studies reveal that this reaction propagates at a rate of about a block every 0.3 seconds, so we shall synchronise the placing of new TNT with this imaginary metronome. We will need to tweak the trajectory slightly to avoid duplicates in the path (we discussed this resulting from rounding a few paragraphs ago – keep up!), which would otherwise spoil the show. The upshot of all this is that we don’t really have any control over the velocity of our cannonball: it will travel at between three and four blocks per second, give or take. In the y direction, since we want to preserve our modelling of gravity, we will use a trick from last issue and draw a column of TNT in the event that there is significant vertical movement. This way at least the trajectory will be vaguely accurate even if the projectile doesn’t really accelerate, and moreover appears to elongate and contract as it moves vertically. The complete fire() method then looks like this:

    def fire(self, velocity, blastRadius):
        xt, yt, zt = findPointOnSphere(self.baseOfGun.x, self.baseOfGun.y, self.baseOfGun.z, self.lenghtOfGun, self.direction, self.angle)
        # draw the TNT trigger
        self.mcDrawing.drawPoint3d(xt, yt, zt, 46, 1)
        # support so that it don’t fall when hit
        self.mcDrawing.drawPoint3d(xt, yt - 1, zt, block.WOOL.id, 15)
        # wait patiently for trigger
        while self.mc.getBlock(xt, yt, zt) == 46:
            pass
        time.sleep(3.6)
        startPos = minecraft.Vec3(xt, yt, zt)
        tntBullet = MinecraftTNTBullet(self.mc, startPos, self.direction, self.angle, 1)
        while not tntBullet.update():
            time.sleep(0.3)

In much the same way as in the earlier part of the tutorial, we will have an update() method for our object which will calculate the new position and return False until we hit something. It looks like this:
Even crazy Steve is awed by the power of the new weapon.
Yeah, it don’t always work right. If you can fix it I’ll let you paint my fence.
    def update(self):
        self.yVelocity += self.gravity
        oldPos = minecraft.Vec3(int(round(self.Posx)), int(round(self.Posy)), int(round(self.Posz)))
        self.Posx += self.xVelocity
        self.Posy += self.yVelocity
        self.Posz += self.zVelocity
        newPos = minecraft.Vec3(int(round(self.Posx)), int(round(self.Posy)), int(round(self.Posz)))
        if newPos != oldPos:
            if self.mc.getBlock(newPos.x, newPos.y, newPos.z) != block.AIR.id:
                return True
            if abs(newPos.x) > 128 or newPos.y > 128 or abs(newPos.z) > 128 or newPos.y < -10:
                # off the edge of the world or deep underground
                return True
            self.mc.setBlocks(newPos.x, newPos.y, newPos.z, newPos.x, oldPos.y, newPos.z, 46, 1)
            height = newPos.y
        return False

Since TNT is quite destructive it is possible that our bullet could rip through quite a bit of scenery before finally coming to rest, but we limit the damage by having the update() method return True when the bullet’s altitude drops below –10. We also do this if it leaves the confines of the Minecraft world – that is, if any of its co-ordinates gets larger in magnitude than 128. Once the fireworks finally die down, we redraw our cannon since it will have suffered some damage at launchtime, and then return to the command prompt so that Steve can have another pop at some innocent blocks.

Unfortunately, thanks to the unfathomable forces at work when so many TNT blocks are exploding at once, there is an element of randomness in all of this. As a result, blocks can fly off in all directions or detonate at the wrong time – culminating in the unfortunate side-effect of stopping the reaction and leaving half a parabola’s worth of TNT in the sky. The sleep time values on lines 276 and 280 were chosen in a fairly ad hoc manner, so possibly by tweaking these you might be able to improve the situation. On the other hand, no weapon is perfect, and this unreliability is merely a reflection of this ineluctable truth. Have fun experimenting and happy coding!
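If you do go tweaking, the column trick and the boundary test in update() are easy to examine on their own. The two helper functions below are hypothetical names of ours, working on plain (x, y, z) tuples, and they assume (as the setBlocks() call above appears to) that the filled cuboid spans both corners whichever of them is higher.

```python
def column_blocks(new_pos, old_pos):
    """Blocks covered by the setBlocks() call in update(): a vertical column
    at the new x and z, running between the old and new heights."""
    x, y_new, z = new_pos
    y_old = old_pos[1]
    low, high = min(y_new, y_old), max(y_new, y_old)
    return [(x, y, z) for y in range(low, high + 1)]


def out_of_bounds(pos, limit=128, floor=-10):
    """True once the bullet is off the edge of the world or deep underground."""
    x, y, z = pos
    return abs(x) > limit or y > limit or abs(z) > limit or y < floor


print(column_blocks((3, 7, 1), (2, 4, 1)))   # four blocks, from y = 4 up to y = 7
print(out_of_bounds((0, -11, 0)))            # True
```

The faster the bullet moves vertically, the taller each column becomes, which is why the projectile seems to stretch on the way up and on the way down.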