
Introducing Celery for Python+Django
From Open Source For You, March 2013
by Hiba Dweib
For background task processing and deferred execution in Python with Django, Web developers and system admins can try out Celery.
Asynchronous mass email delivery; clickstreams such as the number of hotels being watched or the number of 'likes'; image resizing; video processing; calls to third-party APIs; running large numbers of reporting queries against a database... these are some of the scenarios in which you might want a distributed job queue, so that work executes in the background rather than in the request-response loop and your application can scale. Job queues also separate the tasks to execute in real time from those to be scheduled for later, and there are many use cases in which they help you deliver a better user experience. This article introduces readers to the use of Celery for this purpose in Python and Django applications.
Celery is based on distributed message-passing for asynchronous task queues/job queues. It is capable of supporting both real-time operations as well as scheduled jobs. The tasks can be run on a single worker or multiple workers concurrently, to take advantage of multiprocessing. Celery provides a powerful and flexible interface for defining, executing, managing and monitoring tasks. A Celery system can consist of multiple workers and brokers, yielding high availability and horizontal scaling. Celery is suitable for applications that need to achieve any of the following:
1. Executing tasks asynchronously.
2. Distributed execution of expensive processes.
3. Third-party API usage.
4. Periodic/scheduled tasks.
5. Retrying tasks.
6. Enhancing the user experience.

Celery architecture
Task queues are used to distribute work across workers. Celery task queues are based on the Advanced Message Queuing Protocol (AMQP). By default, it uses RabbitMQ as its message broker; however, users are not limited to RabbitMQ but can use Redis, MongoDB or Beanstalk too. Figure 1 depicts this process.
Step 1: The client, which may be a Web app or a Python program, sends a task message to the AMQP broker.
Step 2: The workers constantly monitor the queue; as soon as a message is dropped into the queue, one of the workers picks it up and executes it.
Step 3: Once the worker has finished executing the task, it may or may not save the result, depending on the configuration.
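These three steps can be sketched with nothing but the standard library. The sketch below is only an illustration of the message flow, with a plain in-process queue standing in for the broker and a dict standing in for the result backend; it is not Celery itself:

```python
import queue

broker = queue.Queue()   # stand-in for the AMQP broker
result_backend = {}      # step 3: optional storage for results

def submit(task_id, payload):
    broker.put((task_id, payload))      # step 1: the client drops a message in the queue

def worker():
    task_id, payload = broker.get()     # step 2: a worker picks the message up
    result_backend[task_id] = payload.upper()  # "execute" the task

submit("t1", "hello")
worker()
print(result_backend["t1"])  # HELLO
```

Celery automates exactly this flow across processes and machines, with durable queues instead of an in-process one.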
Setting up Celery
Although the choice of message broker is entirely your decision, for this article, I assume we are using RabbitMQ (it's what I use in production, too). Before installing Celery, you must have RabbitMQ installed and running (for example, with service rabbitmq-server start, or by launching rabbitmq-server directly). Then, all you need to install Celery is pip install -U celery, and you're ready to create your first program using Celery.
Make a project folder, and in it, create a file tasks.py, which will contain the tasks you want to perform using Celery. Here's a sample program I’ll be using to fetch JSON and read its contents:
Figure 1: Block diagram of the Celery architecture: the Web app sends a task (1) to the AMQP broker; the workers W1 to Wn monitor the queue and one of them picks the task up (2); the result (3) is stored in the result backend
from time import sleep
import urllib2
import simplejson
from celery import Celery

# Configure Celery with RabbitMQ as the broker.
celery = Celery('tasks', broker='amqp://guest@localhost//')

@celery.task  # Decorator which registers the underlying function as a Celery task.
def fetch_data(json_name):
    sleep(10)  # Simulate a slow task.
    url_to_open = "http://localhost/%s" % json_name
    req = urllib2.Request(url_to_open)
    opener = urllib2.build_opener()
    f = opener.open(req)
    data_fetched = simplejson.load(f)
    print data_fetched
    return data_fetched
Now run the celery daemon from the terminal using the following command:
celery worker -A tasks --loglevel=INFO
These are the minimum arguments you need to pass to start the service. Other options like events, concurrency levels and CeleryBeat can also be passed as arguments. You’ll learn about them later in the article.
In another terminal, use the Python interpreter to call the tasks module file:
>>> from tasks import fetch_data
>>> result = fetch_data.delay('sample.json')
Next, track the task state and fetch the result. There are a variety of ways to achieve this, depending on your use case: (1) you may just want to execute the task, without saving the result; (2) you may want to check whether the task has finished executing or is still pending; (3) you may want to save the result, whether in the message queue itself, in MySQL, or in a back-end of your choice.
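In Celery, this polling is done through the AsyncResult returned by delay(), with methods such as ready() and get(). As a broker-free way to see the same pattern in action, Python's stdlib concurrent.futures exposes an analogous interface; the following is an analogy, not Celery code:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_task(x):
    time.sleep(0.2)  # stand-in for real work
    return x * 2

pool = ThreadPoolExecutor(max_workers=1)
future = pool.submit(slow_task, 21)  # analogous to fetch_data.delay(...)
print(future.done())                 # typically False: still pending, like result.ready()
answer = future.result()             # blocks until the task finishes, like result.get()
print(future.done(), answer)         # True 42
pool.shutdown()
```

The same submit/poll/fetch rhythm applies to Celery tasks, except that the work runs in a separate worker process reached through the broker.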
To achieve the third, you need to configure this setting in your tasks.py file, as follows:
celery = Celery('tasks', backend='amqp', broker='amqp://guest@localhost//')
Now the message queue is configured to save the result of the job. You can configure any back-end that you wish to use here. This is how an immediate task is executed, though there might be use-cases wherein you would want to run scheduled jobs. To run a task as a scheduled task, you need to define the schedule in the decorator of the task, as follows:
import datetime
from celery.task import periodic_task

@periodic_task(run_every=datetime.timedelta(minutes=1))
def print_name():
    print "Welcome to Tutorial"
The run_every entry can be given as a timedelta or as a crontab schedule. The command for running the daemon now becomes:

celery worker -A tasks --loglevel=INFO -B

The -B option embeds CeleryBeat, the scheduler used for periodic tasks; if -B is not passed, periodic tasks will not run.
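For the timedelta case, what the scheduler does amounts to simple date arithmetic: each next run is the previous run plus the interval. A minimal stdlib sketch of that bookkeeping (the variable names are made up for illustration):

```python
import datetime

run_every = datetime.timedelta(minutes=1)
last_run = datetime.datetime(2013, 3, 1, 12, 0, 0)

# Beat-style bookkeeping: each next run is the previous run plus the interval.
next_runs = [last_run + i * run_every for i in range(1, 4)]
print([t.strftime("%H:%M:%S") for t in next_runs])
# ['12:01:00', '12:02:00', '12:03:00']
```

A crontab schedule works the same way, except the next run time is derived from the cron fields rather than a fixed interval.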
Next, let’s look at how we can integrate Celery with Web frameworks—in this case, Django.
Integrating Celery with Django
Create a Django project using django-admin.py startproject simple_django_project, and then create an app in the project with python manage.py startapp celery_demo. Next, install django-celery with pip install django-celery, and then modify the settings.py file to configure the message queue, as shown below:
import djcelery
djcelery.setup_loader()

INSTALLED_APPS = (
    ...
    'djcelery',
)

BROKER_HOST = "localhost"
BROKER_PORT = 5672
BROKER_USER = "myusername"
BROKER_PASSWORD = "mypassword"
BROKER_VHOST = "myvhost"
Next, sync the database with python manage.py syncdb, after which create tasks.py inside the app. Now you can create
Continued on page 60...
S.G.Ganesh
Why Care About Design Smells?
Design smells are poor solutions to recurring implementation and design problems. This article gives readers an overview on what design smells are, why we should be concerned about them, and what we can do to fix them.
Martin Fowler describes design smells as “structures in the code that suggest (sometimes they scream for) the possibility of refactoring.” Design smells are known by various other names, such as code smells, design flaws, design anti-patterns, etc. Martin Fowler popularised the term ‘smells’ in his book on refactoring, and this is the term known to most programmers today.
Why should we care about design smells? Because they are ‘problem patterns’ that can negatively impact the quality of the design, and ultimately affect the quality of the software. One way of looking at the cost of taking short-cuts to address immediate concerns is that the project accumulates ‘technical debt’. In other words, when we take short-cuts and don’t do the right thing, we have to pay a price for it—a form of ‘debt’. Over a period of time, if not addressed, this debt accumulates and when it crosses a threshold, it can result in completely derailing the software project. The stories of such failed projects are all too common. Note that this phenomenon is common in both commercial and open source projects.
Consider a software development project that was led by an inexperienced designer. Assume that the project was delivered under tight schedules, and the main objective of the project was to satisfy all the functional requirements elicited from the customers, without providing much focus on Non-Functional Requirements (NFRs). This short description of the project is in fact sufficient for you to understand that the code will have numerous smells! Now, let’s explore why.
One cause of design smells is when the designer makes wrong design decisions – a mistake that inexperienced designers are prone to make. Another cause is tight development schedules. When a project is developed under stiff and implausible deadlines, programmers and designers take numerous ‘short-cuts’, i.e., they do what works instead of doing what is right. From the code, you can observe these short-cuts in the form of design smells. When functionality completion is the main focus of software development and when NFRs are not given due importance, it results in sloppy design, which gets noticed as design smells.
Let us discuss a specific smell to get a better understanding of the subject.
One of the smells described by Martin Fowler is the ‘refused bequest’. The word ‘bequest’ means ‘a gift of personal property by will’, so the term ‘refused bequest’ means ‘not accepting inherited property’. In the context of object-oriented programming, a refused bequest means ‘derived classes reject what is inherited from the base class’.

Figure 1: An example of the ‘refused bequest’ smell from JDK: java.sql.Date and java.sql.Time both derive from java.util.Date
Why is it a problem? Because it violates Liskov’s Substitution Principle (LSP), which is a cardinal rule in OO programming/design. LSP states that we should be able to use a base-class reference/pointer without being aware of whether it points to a derived object or not. When there is no logical or conceptual ‘is-a’ relationship between the base and derived classes, we cannot use a base pointer/reference without knowing whether it points to a base-type or a derived-type object.
Let us look at an example of this smell from the open source Java Development Kit (JDK). The java.util.Date class supports both date and time functionality. The class has two derived classes: java.sql.Date and java.sql.Time, which represent date and time responsibilities (see Figure 1). The java.sql.Date class supports only date-related functionality, and throws exceptions if time-related functionality is invoked! Consider the following code segment, for example:
java.util.Date date1 = new java.util.Date(2013, 01, 01);
System.out.println(date1.getSeconds());
// prints 0, since zero seconds have elapsed from the start of the day

java.util.Date date2 = new java.sql.Date(2013, 01, 01);
System.out.println(date2.getSeconds());
// throws java.lang.IllegalArgumentException, since time-related
// methods are not supported in java.sql.Date
As you can see, the ‘refused bequest’ smell can have a real effect on the users of the software: when programmers use these two classes, they can unexpectedly get an IllegalArgumentException. Since JDK is a public API, it is difficult to fix this mistake, since changing the interface of these classes will result in breaking backward compatibility.
So how can one fix (i.e., refactor) this smell? Clearly, java.sql.Date and java.sql.Time require parts of the functionality from java.util.Date. So, if these classes use composition and contain an instance of the java.util.Date class, this problem could be solved. [As an aside, it is a bad practice to use the same name for both the base and derived classes—the name of both base and derived class is Date, in this case. The compiler accepts it because the base and derived classes are located in different packages. To address this problem, the names of the classes java.sql.Date and java.sql.Time could be changed to java.sql.SQLDate and java.sql.SQLTime respectively, to avoid having the same names (though in different packages).]
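The same inheritance-versus-composition choice can be sketched in a few lines of Python. These are hypothetical classes purely for illustration, not the actual JDK fix: the first class inherits and then refuses part of the interface, while the second contains the full object and exposes only date behaviour.

```python
import datetime

# Refused bequest: inherit the full interface, then reject part of it.
class DateOnlyByInheritance(datetime.datetime):
    def time(self):
        raise NotImplementedError("time-related behaviour is refused")

# Composition fix: contain the full object, expose only date behaviour.
class DateOnlyByComposition(object):
    def __init__(self, year, month, day):
        self._dt = datetime.datetime(year, month, day)  # delegate, don't inherit

    def isoformat(self):
        return self._dt.date().isoformat()

d = DateOnlyByComposition(2013, 1, 1)
print(d.isoformat())  # 2013-01-01
```

With composition, callers of DateOnlyByComposition simply never see time-related methods, so there is nothing to refuse and LSP is not violated.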
Unlike ‘refused bequest’, most other smells don’t have visible or directly perceivable effects on the end software quality. However, that doesn’t mean they are not important. An analogy would be the internal cracks in a bridge or a building. These cracks may not immediately cause the structure to collapse, but if left unaddressed, they would develop further and will eventually bring down the structure. Similarly, design smells, whether their impact is externally perceivable or not, have a real impact on the quality of the software. If they are left unaddressed, they can wreak havoc and eventually result in the failure of the software project.

By: S G Ganesh
The author works for Siemens (Corporate Research & Technologies), Bengaluru. You can reach him at sgganesh at gmail dot com.
Continued from page 58...
a URL entry in urls.py that maps to a view function, which will be used to call the tasks that we have defined in tasks.py. Run the Celery daemon now, with the following commands:
python manage.py celeryd -l info -c 2      # Without CeleryBeat
python manage.py celeryd -l info -c 2 -B   # With CeleryBeat
This is a simple method to integrate Celery with Django.
Adding multiple workers
Now scaling workers is not a concern: all you need is to ship your tasks app to a new machine, set up Celery and start running it. The new Celery daemon will start talking to the same message queue, and multiple workers will start executing tasks. Celery makes sure that each task message is picked up by only one worker, so a task is not executed by multiple workers at once.
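This "each message goes to exactly one worker" behaviour comes from the broker, and the pattern can be illustrated with stdlib primitives, with an in-process queue and threads standing in for the broker and the Celery workers:

```python
import queue
import threading

task_queue = queue.Queue()
done = []
lock = threading.Lock()

def worker(worker_id):
    while True:
        try:
            job = task_queue.get_nowait()  # the queue hands each job to exactly one worker
        except queue.Empty:
            return
        with lock:
            done.append(job)               # record that this job ran

for job in range(10):
    task_queue.put(job)

workers = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(sorted(done))  # every job appears exactly once
```

Three workers drain the queue concurrently, yet no job is processed twice, because a dequeue removes the message for everyone.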
Monitoring
As your application grows, so will the need to make it more stable and robust. To achieve this, you need to monitor all the components of your Celery set-up.
Monitoring RabbitMQ
To check the number of messages in your queues from the console, run rabbitmqctl list_queues, which lists each queue along with the number of messages in it. For a GUI-based view, you can simply install the RabbitMQ management plug-in.
Monitoring Celery
First of all, to manage Celery, you need to switch events on: start the Celery daemon with the option -E, so the command becomes python manage.py celeryd -l info -c 2 -B -E. This starts capturing events, and now you can monitor your workers, task states, etc, using:
1. Celery command-line utilities
2. The django-celery admin
3. Flower, a real-time Celery Web monitor
Celery is one of the most stable systems available. It is very easy to get started with, very simple to configure, fast at executing millions of tasks, and flexible, as almost any component of Celery can be used on its own, changed, or configured as per requirements. Some other great features of Celery are:
1. Designing workflows: to chain multiple tasks, you can use the canvas primitives to divide your tasks into subtasks.
2. Webhooks: to enjoy the power of Celery from other languages like PHP, Ruby, etc.
3. Routing: to send tasks to a particular queue rather than just any queue, and to implement all the routing mechanisms that the message broker supports.
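Celery's canvas implements chaining with primitives such as chain(), where each subtask's result is piped into the next. Conceptually this is just left-to-right function composition, which the following stdlib sketch illustrates (the subtask names here are made up):

```python
from functools import reduce

def fetch(x):       # hypothetical subtask standing in for a Celery task
    return x + 2

def transform(x):   # another hypothetical subtask
    return x * x

def run_chain(subtasks, initial):
    # Feed each subtask's return value into the next, as chain() does.
    return reduce(lambda value, task: task(value), subtasks, initial)

print(run_chain([fetch, transform], 3))  # (3 + 2) ** 2 = 25
```

In real Celery code, the composition happens asynchronously across workers, with each link's result delivered to the next task through the broker.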
There are loads of other great features of Celery, which are beyond the scope of this article. I am sure that if you have a use-case, chances are that you can do it with Celery.
By: Konark K. Modi
The author works as a senior systems engineer with Website Operations at Makemytrip.com, and is a technology enthusiast at heart. Follow him at: @konarkmodi.