
from ource ...rch 2013


Have You Powered Your Application with Memcached?

Developers know that in computer/network technologies, the speed of an application is really important. Here, we are going to demonstrate how fast a memcached-powered application will perform.

The article published in the March 2012 edition of OSFY (‘Speed up Your Cloud with Memcached’) covered the concepts related to memcached. In this article, I aim to demonstrate just how much performance improves if you use a memcached layer between your application and database.

For example, consider a Web application. Any website that is served up ‘dynamically’ probably has some components that stay static throughout the life of the page. Loading such information through a database query each time is wasteful. Let us consider a website that has a huge list of names in a database, which does not change frequently, and this list has more than 3,000,000 rows of data. Now, you want to display some of the names that match some user criteria (say, when a user clicks a button). What happens if that page gets reasonably good traffic, and many users click that button to fetch the data? The database will ‘cry’, right? Every click by every user makes a direct call to the database, which imposes an unacceptable overhead on it. This is where caching comes into the picture, and as a caching mechanism, memcached shines.

Given below are the steps to set up an environment for demonstrating the speed of a memcached-powered application.

Step 1: Setting up the database

Create a database called ips in MySQL, and a table called ips_class inside it. Then generate a set of roughly 33.5 million IP addresses using bash brace expansion:

echo {192..193}.{0..255}.{0..255}.{0..255}

Insert all these IPs into the table ips_class using LOAD DATA INFILE [see http://dev.mysql.com/doc/refman/5.5/en/load-data.html].
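The brace expansion above prints every address on a single space-separated line, whereas LOAD DATA INFILE by default reads one row per line. As a minimal sketch (not the author's method; the function names and the ips.txt filename are assumptions made for illustration), the same address range can be written one address per line from Python:

```python
import itertools


def generate_ips(first_octets=range(192, 194), rest=range(256)):
    """Yield dotted-quad strings for every combination of the four octets."""
    for a, b, c, d in itertools.product(first_octets, rest, rest, rest):
        yield f"{a}.{b}.{c}.{d}"


def write_ips(path, ips):
    """Write one address per line and return how many lines were written."""
    count = 0
    with open(path, "w") as handle:
        for ip in ips:
            handle.write(ip + "\n")
            count += 1
    return count


# Hypothetical usage (writes a large file, so it is left commented out):
# write_ips("ips.txt", generate_ips())
```

The full range holds 2 x 256^3 = 33,554,432 addresses, one more than the row count the article reports; the source does not explain the difference.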

mysql> select count(*) from ips_class;
+----------+
| count(*) |
+----------+
| 33554431 |
+----------+
1 row in set (18.24 sec)

Step 2: Setting up memcached

Download the memcached source from http://memcached.org/ and install it. Start memcached on the local system on two ports, using the following commands:

memcached -p 11212 &
memcached &

I now have two memcached instances, one running on port 11212 and the second on the default port (i.e., 11211).

$ telnet 0 11211
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.

$ telnet 0 11212
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.

What next? Get the memcached_demo.py script, for which you can use the following command:

git clone git@github.com:sukujgrg/mysql_memcached.git

What this script does

This script contacts the local MySQL database and executes two SQL fetch queries (below). On the first execution, the script prints the query results, which are fetched directly from the database, and also stores that data in a locally running memcached instance. From the next execution onwards, the script fetches the data directly from memcached. The SQL queries to fetch data in this script are:

select * from ips_class where c_class like '192.168.1.25%';
select count(*) from ips_class;
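The behaviour described above (look in the cache first, fall back to the database on a miss, then store the result) can be sketched as follows. This is not the code of memcached_demo.py: FakeCache is an in-memory stand-in for a memcached client, and cached_query, cache_key and the run_query callable are hypothetical names chosen for illustration. The query is hashed because memcached keys are limited to 250 bytes and may not contain whitespace.

```python
import hashlib
import time


class FakeCache:
    """In-memory stand-in exposing get/set, like a memcached client would."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)  # None signals a cache miss

    def set(self, key, value):
        self._store[key] = value
        return True


def cache_key(query):
    """Turn an SQL string into a short key without spaces."""
    return hashlib.md5(query.encode("utf-8")).hexdigest()


def cached_query(cache, run_query, query):
    """Return (rows, source, seconds); source is 'cache' or 'db'."""
    start = time.perf_counter()
    key = cache_key(query)
    rows = cache.get(key)
    source = "cache"
    if rows is None:
        rows = run_query(query)  # slow path: ask the database
        cache.set(key, rows)     # remember the answer for the next run
        source = "db"
    return rows, source, time.perf_counter() - start
```

With a real client, FakeCache would be replaced by a connection to the memcached instance started in Step 2, and run_query by a function that executes the SQL against MySQL.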

On first execution of the script:

$ python memcached_demo.py
Fetched from DB (Execution time = 17.4954309464 seconds)
Total IPs in the DB = 33554431
Fetched from DB (Execution time = 18.8784821033 seconds)
All IPs in DB which is matching to the regex 192.168.1.25* = 192.168.1.25 192.168.1.250 192.168.1.251 192.168.1.252 192.168.1.253 192.168.1.254 192.168.1.255

On second execution of the script:

$ python memcached_demo.py
Cached Result (Execution time = 0.000431060791016 seconds)
Total no. of IPs in the DB = 33554431
Cached Result (Execution time = 0.00189614295959 seconds)
All IPs in DB which is matching to the regex 192.168.1.25* = 192.168.1.25 192.168.1.250 192.168.1.251 192.168.1.252 192.168.1.253 192.168.1.254 192.168.1.255

You can clearly see the difference in ‘Execution time’ between the two runs: each query drops from more than 17 seconds to a couple of milliseconds or less. Very impressive, right?
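To put a number on that difference, the reported timings can simply be divided. The short calculation below uses only the figures printed in the two runs above; the labels and the speedup helper are illustrative names.

```python
# (seconds from the database, seconds from the cache), copied from the runs above
timings = {
    "count query": (17.4954309464, 0.000431060791016),
    "like query": (18.8784821033, 0.00189614295959),
}


def speedup(db_seconds, cached_seconds):
    """How many times faster the cached run was."""
    return db_seconds / cached_seconds


for label, (db_seconds, cached_seconds) in timings.items():
    factor = speedup(db_seconds, cached_seconds)
    print(f"{label}: about {factor:,.0f} times faster from the cache")
```

On these numbers the cached count query comes out roughly 40,000 times faster and the LIKE query roughly 10,000 times faster, though a single pair of runs on one machine is only indicative.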

Of course, there are many options, including invalidating or expiring the data, in memcached. For more on the topic, delve deeper into the References.
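Expiry and invalidation matter because a cached answer goes stale when the table changes. The sketch below models the idea with an in-memory class; ExpiringCache, its ttl_seconds argument and the injectable clock are assumptions for illustration and not part of any memcached client library.

```python
import time


class ExpiringCache:
    """Toy cache whose entries disappear after a time-to-live."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so the behaviour can be tested
        self._store = {}

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: behave like a miss
            return None
        return value

    def delete(self, key):
        """Explicit invalidation, e.g. after the underlying rows change."""
        self._store.pop(key, None)
```

A short time-to-live keeps the row count reasonably fresh without any extra code, while an explicit delete after each write keeps it exact.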

References

[0] http://www.ibm.com/developerworks/xml/library/osmemcached/index.html
[1] http://www.adayinthelifeof.nl/2011/02/06/memcache-internals/
[2] http://dev.mysql.com/doc/mysql-ha-scalability/en/ha-memcached-interfaces-python.html
[3] http://blog.echolibre.com/2009/11/memcache-andpython-getting-started/

By: Suku

The author is an open source fan. Feel free to respond to this article and direct your feedback to sukujgrg@gmail.com.


CODE SPORT

Sandya Mannarswamy

Over the next few columns, we will take a look at data storage systems, and how they are evolving to cater to the data-centric computing world.

Last month, we featured a special edition of ‘CodeSport’, which discussed the evolution of programming languages over the past 10 years, and how they are likely to evolve over the coming 10 years. The article went on to hazard a guess that the ‘Big Data’ explosion would shift the momentum to languages that make data processing simple and efficient, and that give programs a ‘data-centric’ rather than a ‘code-centric’ perspective. Many of our readers responded with their own views on how they see computing paradigms evolving over the coming 10 years. Thanks a lot to all our readers for their feedback and thoughts.

One of our readers, Ravikrishnan, sent me a pertinent comment, which I want to share: “Thank you for your article on the evolution of programming languages. Indeed, there is a heavy momentum towards processing huge amounts of data using commodity hardware and software. While the basic concepts and algorithms of computer science would continue to hold sway, the sheer scale of the data explosion would require programmers to understand and apply algorithms where data does not fit in main memory. Hence programmers need to start worrying about data latency of secondary storage such as flash SSD/disk storage systems. In a way, the shift towards data-centric computing means more intelligent storage systems, and a need for programmers to understand about state-of-the-art storage systems, where big data is stored, processed and preserved. While this is not a traditional topic covered in ‘CodeSport’, given the importance of data storage in a ‘Big Data’ world, it would be great if ‘CodeSport’ does a deep-dive into state-of-the-art storage systems in a future column.”

It was a timely reminder for me. While I have discussed various ‘Big Data’ computing paradigms in some of our past columns, I have not covered storage systems at all. So over the next few columns, I am going to discuss storage systems, and how they have evolved over the years to cater to the ‘Big Data’ explosion. I will take readers through some of the challenging problems as well as the state-of-the-art research directions in this space.

A storage systems primer

Let us start our journey into storage by understanding some of the basic concepts and terminology. In the traditional view of storage, we all know about the triumvirate of the CPU, memory and disk, where the hard disk (also known as secondary storage) is part of, or directly attached to your computer, and acts as the permanent storage. From now onwards, when I use the term ‘storage’, I actually imply the traditional secondary storage, which acts as the backup to the main memory (which is the primary storage). These include hard disk drives, flash/SSD storage, tape drives, etc.

Traditional HDDs are accessed using a variety of protocols such as SCSI, ATA, SATA and SAS. SCSI stands for Small Computer System Interface, a parallel peripheral interface standard widely used in personal computers for attaching devices such as printers and hard disks. ATA, also known as IDE, is another interface used for attaching disks; here the controller is integrated into the disk drive itself. Like SCSI, ATA is a parallel interface. Both have serial equivalents, namely Serial Attached SCSI (SAS) and Serial ATA (SATA), which transmit a serial stream of data between the PC and the disk drive.

Note, however, that in the traditional view, storage is part of the compute server, since it is directly attached to the server and is accessed through it. It is not an independently addressable entity, and is not shared across multiple computers. Typically, this is known as ‘Direct Attached Storage’ (DAS). Access to the data in secondary storage is through the server to which it is attached; hence, if the server is down due to some failure, the data becomes inaccessible.

Also, as data storage requirements increase, we need greater storage capacity. We produce 2.5 quintillion bytes of information every day from the Web searches we do, the online purchases we make, the mobile calls we make, and the social network presence we have (a quintillion is 1000 x 1000 x 1000 times a billion). Given the volume of Big Data that gets produced, storage requirements keep increasing exponentially. However, in the case of Direct Attached Storage, the number of I/O cards (for example, SCSI cards) that can be connected to a computer is limited. Also, the maximum length of a SCSI cable is 25 m. Given these restrictions, the amount of storage that can be realised using conventional directly attached storage is limited.

Also note that DAS results in uneven storage utilisation. If one of the servers has used up all its disk storage and needs further storage, it cannot use any free storage available in the other servers. This is shown in Figure 1, where disks on Server 2 are full, whereas there is excess disk capacity available on Servers 1 and 3. However, it is not possible for Server 2 to use them.

Figure 1: Disks on Server 2 are full
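The uneven utilisation in Figure 1 can be made concrete with a small model. The capacities below are invented for illustration (the column gives no figures), and the function names are hypothetical; the point is only that free space on one server is unusable by another under DAS, while the same space counts when storage is pooled.

```python
# Hypothetical capacities in GB; Server 2 is full, as in Figure 1.
servers = {
    "Server 1": {"capacity_gb": 600, "used_gb": 200},
    "Server 2": {"capacity_gb": 300, "used_gb": 300},
    "Server 3": {"capacity_gb": 300, "used_gb": 100},
}


def free_gb(server):
    return server["capacity_gb"] - server["used_gb"]


def das_can_store(all_servers, name, size_gb):
    """With DAS, a server can only use its own disks."""
    return free_gb(all_servers[name]) >= size_gb


def pooled_can_store(all_servers, size_gb):
    """With shared storage, free space anywhere in the pool counts."""
    return sum(free_gb(s) for s in all_servers.values()) >= size_gb


print(das_can_store(servers, "Server 2", 50))  # Server 2 has no room of its own
print(pooled_can_store(servers, 50))           # the pool as a whole does
```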

As opposed to the traditional server-centric paradigm we have seen above, in a storage-centric view, storage exists as an independent entity, apart from the compute servers. Storage can be addressed independently, from multiple servers. A simple form of a storage network is shown in Figure 2, where we can visualise the SCSI cables of the computers having been replaced with connections to the network storage. Though the storage is an independent entity on a network, to the operating system running on the compute server, it appears as if locally attached to the compute server.

Figure 2: Storage network

Two popular forms of storage-centric architectures are Network Attached Storage (NAS) and Storage Area Network (SAN). The latter (SAN) allows storage entities to exist in a network, which can be accessed from compute servers using either special protocols such as Fibre-Channel or standard TCP/IP protocols such as iSCSI (internet SCSI). SANs provide block-level access to storage, just like traditional locally attached storage. In contrast, in the case of Network Attached Storage or NAS, a dedicated storage computer exists as an entity on the network, and is accessible from multiple compute servers concurrently. Unlike a SAN, a NAS provides file-level storage semantics to multiple compute servers, appearing as a file server to the operating system running on the compute server. Internally, the NAS file server would access the physical storage at block level to access the actual data, while this is transparent to the OS on the compute server, which is exposed only to file-level operations on the NAS server.
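The difference between block-level and file-level access can be sketched with two toy classes. BlockDevice plays the role of what a SAN exposes (numbered fixed-size blocks), and FileServer plays the role of a NAS head that maps file names onto those blocks internally. Both classes, the tiny block size and the method names are assumptions made for illustration, not any real storage API.

```python
BLOCK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


class BlockDevice:
    """Block-level view: the client addresses numbered blocks."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks

    def write_block(self, index, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        self.blocks[index] = data.ljust(BLOCK_SIZE, b"\0")

    def read_block(self, index):
        return self.blocks[index]


class FileServer:
    """File-level view: the client names files; block handling is hidden."""

    def __init__(self, device):
        self.device = device
        self.table = {}      # file name -> (block indices, length in bytes)
        self.next_free = 0

    def write_file(self, name, data):
        indices = []
        for offset in range(0, len(data), BLOCK_SIZE):
            self.device.write_block(self.next_free, data[offset:offset + BLOCK_SIZE])
            indices.append(self.next_free)
            self.next_free += 1
        self.table[name] = (indices, len(data))

    def read_file(self, name):
        indices, length = self.table[name]
        raw = b"".join(self.device.read_block(i) for i in indices)
        return raw[:length]  # drop the padding of the last block
```

A client of FileServer never sees block numbers, which is also why protections and access control can be applied per file at that layer.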

Hybrids of SAN and NAS also exist. Since there is no file system concept for SANs, various file protection and access-control mechanisms need to be taken care of in the OS running on the compute server. In case of NAS, file protections and access control can be enforced at the NAS server. The next concept we need to understand in the storage domain is Scale-up Storage vs Scale-out Storage, which we will discuss next month.

Remembering Aaron Swartz

It has been almost two months since the death of Aaron Swartz. Most of us would have read about the enormous outpouring of grief caused by this tragic loss. Aaron Swartz was a programmer first and foremost, and the reason I wanted to mention him in our column was not just because he was a well-known activist who fought for the freedom of information on the Internet, but because he is a sterling example of what differentiates a great programmer from the run of the mill. He had an enormous enthusiasm for building software that solves challenging problems. He was involved in the development of the RSS format, wrote the Web.py framework, and was a technical architect of reddit.com, just to mention a few examples of his work. He had a great passion for expanding and sharing his knowledge with all developers. Rest in peace, Aaron.

My ‘must-read book’ for this month

This month’s must-read book suggestion comes from one of our readers, Aruna Rajan. She recommends the book ‘Introduction to Information Retrieval’ by Christopher D. Manning, Prabhakar Raghavan and Heinrich Schütze. This book focuses on various information retrieval techniques, including the most popular one of Web search engines. The book is available online at http://nlp.stanford.edu/IR-book/html/htmledition/irbook.html. Thank you, Aruna, for your suggestion.

If you have a favourite programming book/article that you think is a must-read for every programmer, please do send me a note with the book’s name, and a short write-up on why you think it is useful, so I can mention it in the column. This would help many readers who want to improve their coding skills.

If you have any favourite programming puzzles that you would like to discuss on this forum, please send them to me, along with your solutions and feedback, at sandyasm_AT_yahoo_DOT_com. Till we meet again next month, happy programming and here’s wishing you the very best!

By: Sandya Mannarswamy

The author is an expert in systems software and is currently in a happy state between jobs. Her interests include compilers, multi-core technologies and software development tools. If you are preparing for systems software interviews, you may find it useful to visit Sandya’s LinkedIn group ‘Computer Science Interview Training India’ at http://www.linkedin.com/groups?home=&gid=2339182.


Build Your Own Web Page With QForms

QForm is the state-conscious, event-driven object management system for creating Web pages. QForm is the ‘C’ and ‘V’ of the MVC architecture of QCubed. It relieves you from typing HTML code for every single element, and dealing with monotonous and repetitive code for handling user actions.

The entire world is moving to the Web. Socket programming is becoming passé even for lower-level programs as JavaScript catches on with JSON, while other Web technologies such as WebSockets promise better results in an increasingly distributed world. We have become habituated to using Web services that promise to be reliable. Facebook, Twitter, Gmail, and even WordPress and Drupal, all deliver content so well that we often forget that the Web is a stateless mess. No machine remembers anything about a previous action, primarily because HTTP itself is stateless.

What is a state anyway?

When I say stateless, I mean that in most cases, when we log into a Web application, the server does not know the state of the application on the client side. Which elements are being displayed, which ones are hidden, and what can be done on the page: the server knows none of this. To make sure that an invalid request does not cause a security breach, and to ensure the right response for every request, a number of methods are in use. In most cases, cookies are used to validate a user’s identity and authenticity, and the programmer has to build a system that keeps track of what the user does in the browser. This is important to make sure that the server side of the application does not send any invalid or unnecessary data to the client. QCubed, using its QForm library, however, takes a snapshot of the page and saves it on the server side before sending the HTML to the client. This state is used in later requests from the client. This technique improves application reliability, and reduces the programmer’s work and the complexity of the design by a very large margin.
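The snapshot idea can be modelled in a few lines. The sketch below is a language-neutral toy in Python, not QCubed's implementation: FormStateStore, handle_request and the dictionary layout are all hypothetical. It shows the server saving the page state under an identifier, restoring it on the next request, and rejecting input for a control that was never on the page.

```python
import copy
import itertools


class FormStateStore:
    """Keeps server-side snapshots of page state, keyed by a state ID."""

    def __init__(self):
        self._states = {}
        self._counter = itertools.count(1)

    def save(self, controls):
        state_id = f"state{next(self._counter)}"
        self._states[state_id] = copy.deepcopy(controls)
        return state_id

    def load(self, state_id):
        return copy.deepcopy(self._states[state_id])


def handle_request(store, state_id, posted_values):
    """Restore the snapshot, apply the posted values, save a new snapshot."""
    controls = store.load(state_id)
    for control_id, text in posted_values.items():
        if control_id not in controls:
            raise KeyError(f"unknown control: {control_id}")
        controls[control_id]["text"] = text
    return controls, store.save(controls)
```

Because the server compares each request against its own snapshot, a request that refers to something the page never contained can be refused before any application code runs.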

What is QForm?

In the previous article, we have already looked at Codegen, which helps build forms, and takes care of the ‘Model’ part of QCubed’s MVC architecture. QForm is the part of QCubed that handles the ‘View’ and ‘Controller’ parts of MVC. QForm is responsible for creation of the page, along with elements (we would call them controls) on the page (text-boxes, buttons, text, etc.) and helps the developer determine the events that will trigger actions, and the way in which those actions are to be handled. Before we go into code, let us first answer an interesting question: “If QForm handles the Web page, why is it named QForm, not QPage?”

The reason is that QCubed treats the entire page as a ‘Form’ that submits to itself. This technique ensures that the client does not have to know the names of other files used in the process, their paths, or their function names, which gives a better level of security. For example, if you allow the user to enter data in newpost.php, while insert_newpost.php handles the values from the form submission and inserts data into the database, then a bug in insert_newpost.php can open up a plethora of vulnerabilities! QCubed would normally hide the information about which file handles which request on the server. This technique has another benefit: it helps handle multiple events and actions easily. Now, without further ado, we will delve into the code.

Building the first Web page in QCubed

In previous articles, we created a small MySQL database to store blog posts and comments. Let us create a Web page that allows us to insert a new blog post in the database.

Code

Create a file named newpost.php in the root directory of your QCubed installation, and put the following code in it:

<?php

require_once 'qcubed.inc.php';

class NewPost extends QForm {
    // Declaring elements: input boxes
    protected $txtTitle, $txtBody;
    // Submit button
    protected $btnSubmit;
    // Text labels
    protected $lblMessage, $lblResult;

    protected function Form_Create() {
        // Set message at top
        $this->lblMessage = new QLabel($this);
        $this->lblMessage->Text = "<strong>Create new Blog Post</strong>";
        // Define Title input box
        $this->txtTitle = new QTextBox($this);
        // Define Body input box
        $this->txtBody = new QTextBox($this);
        // Make it a textarea
        $this->txtBody->TextMode = QTextMode::MultiLine;
        // Submit button
        $this->btnSubmit = new QButton($this);
        $this->btnSubmit->Text = 'Submit';
        $this->btnSubmit->AddAction(new QClickEvent(), new QAjaxAction('btnSubmit_Click'));
        // Result text
        $this->lblResult = new QLabel($this);
    }

    protected function btnSubmit_Click($strFormId, $strControlId, $strParameter) {
        // Create new post
        $objNewPost = new Post();
        // Load defaults
        $objNewPost->Initialize();
        // Assign values from the form
        $objNewPost->Title = $this->txtTitle->Text;
        $objNewPost->Body = $this->txtBody->Text;
        // Save
        $objNewPost->Save();
        // Saved. Show the result
        $this->lblResult->Text = "Post was saved";
        // Empty the textboxes
        $this->txtTitle->Text = '';
        $this->txtBody->Text = '';
    }
}

NewPost::Run('NewPost');

And in the same directory, create another file named newpost.tpl.php, and put the following in it:

<html><head>
<title>Create new Blog Post</title>
<link rel="stylesheet" type="text/css" href="<?php echo __CSS_ASSETS__ . "/styles.css" ?>" />
</head><body>
<?php $this->RenderBegin(); ?>
<?php $this->lblMessage->Render(); ?>
Title <?php $this->txtTitle->Render(); ?>
Body <?php $this->txtBody->Render(); ?>
<?php $this->lblResult->Render(); ?>
<?php $this->btnSubmit->Render(); ?>
<?php $this->RenderEnd(); ?>
</body></html>

Then visit the newpost.php file in your browser; the output should look like Figure 1. Before you think that the strong HTML tag at the top is because of an error somewhere, let me tell you that such output is expected; we will know why shortly.

Understanding the code

QForm separates the presentation layer and program logic into two different files. The newpost.php file contains the logic of the page (controller), while the newpost.tpl.php governs the layout (view). We begin with the logic.

The first line of the controller file includes the qcubed.inc.php file from the root directory of your QCubed installation. This readies the framework capabilities to be used throughout the file. Next, we define a class NewPost, which is derived from the QForm class. Before we proceed with creating the HTML controls (such as text-boxes, buttons, etc.), we define them as class members. If these variables are not defined as class members but as local variables inside the Form_Create function, you would not be able to render them, much less handle any actions on them. This is because the layout is created by the view file, which has no access to variables in the Form_Create function but can access class members.

The initial behaviour and content are determined by the Form_Create function. In this function, we define the controls (HTML elements). For example, consider this line:

$this->lblMessage = new QLabel($this);

It tells QCubed to make lblMessage a class member of type QLabel, with the current QForm as the parent. The $this variable passed to the QLabel constructor indicates that the parent of lblMessage is the current QForm object.

The concept of parent is very important to HTML controls in QCubed (QCubed calls them QControls). One of the biggest features of the QForm and QControl library is that it allows you to create sophisticated composite controls and panels using other child controls. These controls and panels can have their own layout, defined behaviour and database interactions; they can have their own validation logic and are highly reusable. For example, you could create a calculator panel that can do basic math functions. Once you have created the panel, you can use it on any page by adding just a couple of lines; there is no need to write the same code over and over again for every page. Since controls can be nested, the concept of parent becomes important in controlling the behaviour, placement and visibility.

Figure 1: Initial form created by Form_Create()
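Why the parent matters for placement and visibility can be seen in a small model of nested controls. This is a toy in Python, not QCubed's QControl class: Control, Label and their methods are hypothetical names. Each control registers itself with its parent, and rendering a parent renders its children, so hiding a panel hides everything inside it.

```python
class Control:
    """Toy control that registers itself with its parent on creation."""

    def __init__(self, parent=None, visible=True):
        self.parent = parent
        self.children = []
        self.visible = visible
        if parent is not None:
            parent.children.append(self)

    def render_self(self):
        return ""

    def render(self):
        if not self.visible:
            return ""  # a hidden control hides its whole subtree
        return self.render_self() + "".join(c.render() for c in self.children)


class Label(Control):
    def __init__(self, parent, text):
        super().__init__(parent)
        self.text = text

    def render_self(self):
        return self.text
```

A reusable panel, such as the calculator mentioned above, would be one such subtree that can be attached to any form by passing that form as its parent.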

QLabel is a control that, when rendered, will create text on the page. We set the text to be displayed using the QLabel’s Text property by doing:

$this->lblMessage->Text = "<strong>Create new Blog Post</strong>";

In a very similar fashion, we define other controls. The two text-boxes, txtTitle and txtBody, allow the user to input the title and body of a blog post; the btnSubmit button submits these values to the server. For most QControls, the Text property defines the text they display. For example, QLabel’s Text will appear directly where rendered; QButton’s Text will appear on the button; QTextBox’s Text will appear as the text in the input box. Every control has some common properties and some unique ones. For example, the TextMode property exists only in QTextBox-based controls (QTextBox, QIntegerTextBox and QFloatTextBox) and can convert the text-box to a text-area, a password input control, or an HTML5 ‘search’ text-box.

It is noteworthy that the Text of the label lblResult was kept blank, only so we could change it to something else later on.

This line:

$this->btnSubmit->AddAction(new QClickEvent(), new QAjaxAction('btnSubmit_Click'));

adds an action to the btnSubmit button for a click event. The AddAction function takes two parameters: the event and the action. When specifying the event type, in most cases, creating a new event object inline is enough and recommended. QCubed supports all events allowed by JavaScript, and you can choose any one of them. The second parameter is the action, which can be one of these three types:

1. QServerAction: This will reload the entire page with the changes defined in the event-handler function. When a QServerAction is executed, the client sends an HTTP POST request over the wire. The server receives the request, runs the event-handler function, makes any required visual changes to the page accordingly, and then sends back the entire page, thus reloading the page.
2. QAjaxAction: This will render the changes without reloading the page (using AJAX). In this case, the client submits the form using an AJAX request. The server runs the event-handler function, and records any changes needed for the layout. These changes are then transformed and sent to the client for reflecting the changes on screen.
3. QJavaScriptAction: This will call a JavaScript function on the client side. In this case, the event-handler function must be present on the client side. No calls are made to the server.

All these actions take one parameter: the name of the function to be called when the event occurs. In our case, the function btnSubmit_Click will be called when btnSubmit is clicked. In the case of QJavaScriptAction, the function passed in as the argument must include the parentheses of the call, and the function must be present on the client side, either included in another JS file or inline in the document. QCubed cannot validate the availability of the event-handler functions, and it is the developer’s job to make sure that they exist.

Our event-handler function btnSubmit_Click takes in three parameters:

1. strFormId: This is the form ID (same as the class name).
2. strControlId: This is the control ID of the HTML element that initiated the action. Control IDs are generated automatically by QCubed when a new control is created. These are the same ones used for the id attribute of the element when it is rendered. The developer can enforce a desired ID by supplying it as the second parameter to the constructor, like:

$this->lblResult = new QLabel($this, 'resultLabel');

This will set the control ID of lblResult to ‘resultLabel’. QCubed allows only alphanumeric control IDs.
3. strParameter: This is any extra parameter that needs to be passed to the function. This parameter can be set by using the ActionParameter property of the QControl.

In most cases, you would not need to use these parameters, but they must be present in the signature of the event-handler function. With time, you would discover that these three parameters can help you reduce coding efforts and cut short development time even more.
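The way an event is bound to a handler by name, and the handler is later called with the form ID, control ID and an extra parameter, can be sketched as below. This is a toy model in Python and not QCubed code: Button, Form, add_action and trigger are hypothetical names that only mirror the roles of AddAction and the three handler parameters described above.

```python
class Button:
    """Toy control that remembers which handler name goes with which event."""

    def __init__(self, control_id, action_parameter=None):
        self.control_id = control_id
        self.action_parameter = action_parameter
        self.actions = {}

    def add_action(self, event, handler_name):
        self.actions[event] = handler_name


class Form:
    form_id = "NewPost"

    def __init__(self):
        self.result = ""
        self.controls = {}
        button = Button("btnSubmit")
        button.add_action("click", "btn_submit_click")
        self.controls[button.control_id] = button

    def trigger(self, control_id, event):
        """Look the handler up by name and call it with the three parameters."""
        control = self.controls[control_id]
        handler = getattr(self, control.actions[event])
        handler(self.form_id, control_id, control.action_parameter)

    def btn_submit_click(self, form_id, control_id, parameter):
        self.result = f"{control_id} clicked on {form_id}"
```

Because the handler is looked up by name at the moment the event fires, a misspelt name only fails then, which matches the warning that the developer must make sure the handler exists.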

Now we take a look at what the event handler does. Inside the event handler, we create a new blog post object named $objNewPost. (Remember, QCubed creates classes for each of your tables, and your rows are treated as objects of the table class.) Like we did in the last article, we initialise the object with database defaults, set the value of the post title and body, and save the post. We also set the text of lblResult to ‘Post was saved’ and empty the text-boxes txtTitle and txtBody. There are a few points worth noting:

1. We retrieve the values input into QTextBox controls by just fetching their Text property value. The same property is used to set the values as well (as we did for the QButton and QLabels). You do not have to make any extra effort to retrieve values from the $_POST or $_GET arrays, anywhere.
2. Though all the actions are taken via AJAX, QCubed does not require us to deal with low-level JavaScript, or with XMLHttpRequest requests and responses. You do not write any HTML to be sent. Just set the values to what you want, and QCubed will work out what needs to be done by itself.
3. The event handler does not return any value; event-handler functions never return values. If you want to terminate the function conditionally (e.g., if the input in the txtTitle text-box was blank), then simply return without any value.
4. The server keeps a copy of the state of the application on the client side, and every time the page updates itself via a QServerAction or QAjaxAction, the server updates and syncs the copy of the page state. You do not need to worry about what data the variables contain on the server side, since they always contain the last submitted value.

If you use the back button, the server knows that you clicked the back button (well, not exactly, but it's safe to assume it at the outset of our journey with QCubed) and uses the corresponding state to produce the HTML. Either way, you will not run into anomalies in the page state.

All the code in the controller file is useless if you do not run the form. To do so, we call the public static function Run, with the class name as the parameter. This function is built into QForm, and is not meant to be written manually. It tells QCubed to search for the template file in the same directory. By default, it looks for a file with the same name but ending in ‘.tpl.php’ instead of ‘.php’. If you want to use a different file as the view file, you may pass the filename as a second parameter to the Run function, like:

NewPost::Run('NewPost', 'newpost_alternate.tpl.php');

This will force QCubed to use newpost_alternate.tpl.php as the template file. Such a technique can be used to create multiple layouts for the same page!
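The naming convention described above (same name, ‘.tpl.php’ instead of ‘.php’, unless another file is given) is simple enough to express as a function. The sketch below is a Python illustration of the rule only; template_for is a hypothetical name and says nothing about how QCubed resolves the file internally.

```python
import os


def template_for(controller_path, override=None):
    """Return the view file for a controller file, following the convention."""
    if override is not None:
        return override  # an explicitly supplied template wins
    root, extension = os.path.splitext(controller_path)
    return root + ".tpl" + extension


print(template_for("newpost.php"))
print(template_for("newpost.php", "newpost_alternate.tpl.php"))
```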

The view file

The view file contains the HTML skeleton of the page. The only thing you do not have to write in this file is the HTML for the individual controls, which you defined in the controller file. You instead call the Render function on those controls, which makes sure that QCubed takes control of their behaviour. The view file would typically begin with the HTML head section containing the page title, and any extra JS and CSS to be loaded. You might want to create a static PHP page to be ‘included’ for the purpose. Remember that, as of now, QCubed does not come with any special functions to help you with the HTML head section. All controls must be rendered between two lines: ‘$this->RenderBegin();’ and ‘$this->RenderEnd();’. An attempt to render a control outside these boundaries will cause the framework to produce an error and halt. It is advisable to put the RenderBegin() call just after the body tag begins, and RenderEnd() just before the body tag closes. Again, controls will not render themselves unless you ask them to. If you remove the line ‘$this->btnSubmit->Render();’ from the view file, the submit button will disappear from the page.

And now, the one question we left unanswered—why does lblMessage show the strong tags on the page instead of making the text bold? It is because QCubed enables the htmlentities function for the text in all QLabel variables. In the Form_Create() function, just set:

$this->lblMessage->HtmlEntities = false;

You will see (Figure 2) that the strong tag gets back its importance!

Behaviour

The page behaviour is pretty obvious. It will accept the title and body of a post, and when the Submit button is hit, it creates a new post and saves it to the database, erases the contents of the title and body input boxes, and displays a message that the post was saved.

The real power of QForms

Throughout this article, we have discovered some of the core features and functionalities of QForms. Here is how they help you:

1. Separate View and Controller files: QForm separates the HTML layout from behaviour. Chances of messing up HTML and PHP code are lower than ever. Since ‘model’ is automatically taken care of by Codegen, we are not going to worry about that.
2. Event-driven: You define a control, bind it to an event, and write the event-handler function for the same. QForm does the rest.
3. AJAX is easy: One does not need to write JS functions, remember form element names and IDs, and there is no need to deal with the $_POST array either. QCubed does the dirty work, allowing you to focus on ‘what’ to do rather than ‘how’ to do.
4. AJAX to page reloads and vice versa: Changing the behaviour of an action from a ‘full page reload’ to ‘AJAX based’ can be done by changing just one word: the action type.
5. Less HTML than ever: QCubed produces HTML code for single elements, while you have to take care of only the overall page structure. Combined with Codegen, it means roughly 80 per cent reduction in the amount of HTML and JavaScript you would write otherwise. Also, the fact that most HTML code is automatically generated means that you have to deal with far fewer syntax errors in HTML code.
6. Stateful: The server knows the state of the page on the client side, and that improves security as well as ease of use for the programmer.

Figure 2: With HtmlEntities set to false, QLabel honours HTML code in 'Text'

This article hardly scratches the surface of QForm. There are a lot of other powerful features built into QForm. Some interesting ones include control and form validation, custom controls, database-based session handling, centralised form state handling, and HTML input validation and purification for preventing XSS attacks. We will peek into a few of them in the next article of this series.

By: Vaibhav Kaushal

The author is a 25 year old Web developer from Bengaluru who also happens to be a core contributor to QCubed. He loves writing for technology magazines when he is not busy fiddling around with QCubed or developing his website (http://www.cintegration.com/), where he gives QCubed-related advice.

Anil Seth

Video Subtitle Translation

This column looks at how to make MOOC videos accessible to non-English speakers.

The news that California State Universities were tying up with Udacity to offer inexpensive MOOCs (massive open online courses) for credit was not surprising. The only surprise is the speed with which the changes are taking place. I am inclined to agree with the analysis on TechCrunch (http://goo.gl/fRppX) that this online project is going to end college education as we know it. The advantage is that a wealth of options will become available for all to learn. The disadvantage is that most of the content will be in English.

Can the videos in English be accessible to students not very comfortable with the language? They would benefit a lot if subtitles (http://en.wikipedia.org/wiki/Subtitling) are provided in their language.

So how does one go about getting content with language subtitles in it? The time and effort required to translate the content into the vast number of languages would be huge. Crowd-sourcing can be an answer, for example, by using http://www.amara.org/.

Subtitling/captioning and the Web

Video players can merge video frames with subtitles while playing. There are numerous formats available for subtitling; the basic content, though, will be similar. Each subtitle is a text line to be displayed, along with information about when to start the display and when to stop. The best way to provide this information is by specifying the starting time and the end time, or the duration for each subtitle. This makes the subtitle file independent of the frame rate at which a video file may be created. One common format is the SubRip (.srt extension), which was the basis of another useful format, WebVTT, which may become widespread, as it is now a W3C standard. W3C has a competing timed text (TTML) standard, which is an XML document, intended to ensure interoperability of streaming video and captions on the Web.
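To make the start-time/end-time idea concrete, here is a minimal Python sketch that converts a small SubRip snippet into WebVTT. It relies only on two well-known differences between the formats: a WebVTT file begins with a ‘WEBVTT’ line, and its timestamps use a full stop rather than a comma before the milliseconds. The function name, the sample captions and the handling of cue counters are illustrative assumptions; real-world files may need more care (styling tags, positioning hints, byte-order marks).

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,500".
SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_webvtt(srt_text):
    """Convert SubRip text to WebVTT text (simplified sketch)."""
    # Cues are separated by blank lines in both formats.
    blocks = re.split(r"\n\s*\n", srt_text.strip())
    cues = []
    for block in blocks:
        lines = block.splitlines()
        # Drop the numeric cue counter that SRT puts on the first line.
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        if not lines:
            continue
        # WebVTT uses "." instead of "," before the milliseconds.
        lines[0] = SRT_TIMING.sub(r"\1.\2 --> \3.\4", lines[0])
        cues.append("\n".join(lines))
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"


# Hypothetical sample captions, for illustration only.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:04,000
Welcome to the course.

2
00:00:04,500 --> 00:00:08,250
Today we look at sorting.
"""

print(srt_to_webvtt(SAMPLE_SRT))
```

Since each cue carries absolute start and end times, the same subtitle file works regardless of the frame rate at which the video was encoded.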

However, the HTML5 video element supports a track element, which can be used to specify the subtitle file, perhaps in WebVTT format, and meet the needs of streaming video with captions in a user-defined language.
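As a rough illustration of how one video can offer captions in several languages, the Python sketch below builds the markup for a video element with one track child per subtitle file. The kind, srclang, label, src and default attributes are standard attributes of the HTML5 track element; the helper name and the file names are hypothetical.

```python
from html import escape


def video_with_tracks(video_src, tracks):
    """Return HTML for a <video> element with one <track> per subtitle file.

    `tracks` is a list of (language_code, label, vtt_url) tuples.
    The first track is marked as the default one.
    """
    lines = ['<video controls src="%s">' % escape(video_src, quote=True)]
    for index, (lang, label, url) in enumerate(tracks):
        default = " default" if index == 0 else ""
        lines.append(
            '  <track kind="subtitles" srclang="%s" label="%s" src="%s"%s>'
            % (
                escape(lang, quote=True),
                escape(label, quote=True),
                escape(url, quote=True),
                default,
            )
        )
    lines.append("</video>")
    return "\n".join(lines)


# Hypothetical file names, for illustration only.
print(
    video_with_tracks(
        "lecture1.mp4",
        [
            ("en", "English", "lecture1.en.vtt"),
            ("hi", "Hindi", "lecture1.hi.vtt"),
        ],
    )
)
```

The browser then lets the viewer pick a language from the listed tracks, so adding a translation is only a matter of adding one more subtitle file.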

It is common these days to have same language subtitles (http://en.wikipedia.org/wiki/Same_language_subtitling) for television and video. The obvious advantage is that it makes content accessible to the hearing-impaired. Another advantage is its educational value: it helps viewers practise reading as an incidental and sub-conscious part of entertainment. However, on the Web, it has an even greater significance, which is probably why Google has been in the forefront of the WebVTT format: it allows video content to be searched easily!

Machine translation of captions

Manual translation is time-consuming and expensive, even with crowd-sourcing. The quantum of content is too large to be translated within a useful timeframe for all languages of interest. Furthermore, the content of technical courses is likely to be unambiguous, and not open to subtle differences in the interpretation of words and phrases. Machine translation may provide the answer.

If you search the Web for open source machine translation engines, you will find Moses (http://www.statmt.org/moses/), a statistical translator, and Apertium (http://www.apertium.org/), a rule-based translator. Moses' capabilities are, in principle, similar to the software used by Google and Microsoft. However, it does not come with language models and datasets for carrying out translation, so for it to be useful, you need to provide language models and training datasets. Apertium, however, comes with translation capabilities for a number of language pairs. The current list and status can be seen at http://wiki.apertium.org/wiki/List_of_language_pairs.

Unfortunately, the progress in pure open source tools is likely to be slow. The reason is fairly obvious; Web-based translators from Google, Microsoft and others provide excellent functional alternatives. These sites have a wealth of data, e.g., pages from multi-lingual sites, which may be used for training and fine-tuning translations.

If same-language subtitles are available, you may rely on machine translation for generating subtitles in a language for which a machine translator is available. YouTube provides this feature for translated captions on its site by using Google Translate, e.g., http://www.youtube.com/watch?v=1St0tJVGCW8. So, the easiest option is to use Google or Bing translators on the Web. Several open source tools were created to translate subtitles using the Google Translate API. These tools no longer work after changes in the usage policy of Google's Translate API, but they may be modified to use Microsoft's translation API instead.
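The general shape of such a subtitle translation tool can be sketched in a few lines of Python: leave the cue counters and timing lines untouched, and pass only the caption text through a translator. The translate argument below is a placeholder for whichever engine is available (a local Apertium installation or a Web translation API, for instance); no real translation service is called here, and str.upper merely stands in for one so that the example runs on its own.

```python
import re

# A timing line in SRT ("," before milliseconds) or WebVTT ("." instead).
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3} --> \d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def translate_subtitles(text, translate):
    """Translate caption lines only; keep counters, timings and blanks as-is.

    `translate` is any callable that maps a source-language string to a
    target-language string (an assumption: plug in your own engine).
    """
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        is_structure = (
            not stripped                # blank line between cues
            or stripped == "WEBVTT"     # WebVTT file header
            or stripped.isdigit()       # SRT cue counter (a caption made
                                        # only of digits is also skipped)
            or TIMING.match(stripped)   # start --> end timing line
        )
        out.append(line if is_structure else translate(stripped))
    return "\n".join(out) + "\n"


# Hypothetical sample; str.upper is a stand-in for a real translator.
SAMPLE = """1
00:00:01,000 --> 00:00:04,000
Welcome to the course.
"""

print(translate_subtitles(SAMPLE, str.upper))
```

This sketch translates one caption line at a time, which is simple but loses context when a sentence is split across lines or cues. A more careful tool would join the lines of each sentence before translating them.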

We can hope that the MOOC course videos will make same-language captions available, so that machine translation can spread this knowledge to an even wider group of learners.

A side lesson: The sudden changes in the usage policy for the Google Translate API reinforce the need for pure open source solutions for translation applications as well as for language models and translation datasets. The generosity of commercial sites will be aligned with their commercial interests and cannot be taken for granted.

By: Anil Seth

The author has earned the right to do what interests him. You can find him online at http://sethanil.com/, http://sethanil.blogspot.com/ and reach him via email at anil@sethanil.com.
