On Boto and Chef Community Cookbooks

Today I learned:

  1. It is time to let the old Boto retire; it would not let me access the attached managed policies of an IAM role. Boto3 supports this.
  2. Chef community cookbooks that are bundled with a project have a tendency to not work if left alone for a while—and upgrading them can get messy due to the way dependencies are handled. Stick to the cookbooks with no or simple dependency graphs or just roll your own solution.
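
For the first point, here is a minimal sketch of reading the managed policies attached to a role with Boto3. The helper name and the role name in the usage comment are made up for illustration; `list_attached_role_policies` and its paginator are part of the Boto3 IAM client.

```python
def attached_policy_arns(iam_client, role_name):
    """Return the ARNs of all managed policies attached to an IAM role."""
    # The paginator follows the Marker/IsTruncated fields for us.
    paginator = iam_client.get_paginator("list_attached_role_policies")
    arns = []
    for page in paginator.paginate(RoleName=role_name):
        for policy in page["AttachedPolicies"]:
            arns.append(policy["PolicyArn"])
    return arns


# Usage, assuming AWS credentials are configured ("my-role" is a placeholder):
#   import boto3
#   print(attached_policy_arns(boto3.client("iam"), "my-role"))
```

Passing the client in as an argument keeps the helper easy to exercise with a stub object.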

How to compile SimpleParse 2.1.0a1 for Python 2.6 on Windows Vista

SimpleParse is a fast Python single-pass parser generator that I use regularly. When I finally made the move onto Python 2.6 it turned out that there is no pre-compiled package for 2.6 on Windows. So, here is my procedure for compiling the source package on Windows Vista.

1. Install Cygwin if you don’t already have it on your system, and make sure that the version of Python you are installing SimpleParse for is on either the system or the Cygwin path.

2. Download and install Microsoft Visual C++ 2008 Express Edition. You should ensure that you have the latest Vista service packs installed before attempting this. If the installer quits on you then just reboot the computer and try again. Without this installed, you will get an ‘Unable to find vcvarsall.bat’ error.

3. Download and unpack the SimpleParse 2.1.0a1 source. Using the Cygwin shell, place yourself in the root source directory.

4. If we try to run python setup.py install at this point, the Visual C++ compiler will complain:

stt/TextTools/mxTextTools/mxTextTools.c(149) : error C2133:
'mxTextSearch_Methods' : unknown size
stt/TextTools/mxTextTools/mxTextTools.c(920) : error C2133:
'mxCharSet_Methods': unknown size
stt/TextTools/mxTextTools/mxTextTools.c(2103) : error C2133:
'mxTagTable_Methods' : unknown size
error: command '"C:\Program Files\Microsoft Visual Studio 9.0\VC\BIN\cl.exe"'
failed with exit status 2

We have to add the following lines to stt/TextTools/mxTextTools/mxTextTools.c, starting at line 148 (before staticforward is used for the first time):

#ifdef _MSC_VER
#define staticforward extern
#endif

5. with is a Python 2.6 keyword, meaning it can’t be used as a variable, as is the case in the SimpleParse source code. So, we have to replace it with something else:

$ sed -r 's/with/with_t/g' < stt/TextTools/TextTools.py > tmp.txt
$ cp tmp.txt stt/TextTools/TextTools.py

6. Finally, run python setup.py install as usual.
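
A note on step 5: the sed pattern replaces every occurrence of the letters "with", including those inside longer words such as "without". If that worries you, a word-boundary replacement is a gentler alternative. The sketch below is an assumption-laden illustration rather than part of the original procedure; the function name and the sample text are hypothetical, and it still renames the word inside comments and strings.

```python
import re


def rename_identifier(source, old="with", new="with_t"):
    """Rename whole-word occurrences of `old`, leaving longer words alone."""
    pattern = re.compile(r"\b%s\b" % re.escape(old))
    return pattern.sub(new, source)


sample = "def f(with, without):\n    return with + without\n"
print(rename_identifier(sample))
```

Running it twice is harmless, since `with_t` no longer matches the whole-word pattern.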

Running pytst 1.15 on a 64-bit platform

Update: The latest version, 1.17, compiles on 64-bit platforms out of the box, so the patch below is no longer necessary.

Nicolas Lehuen’s pytst is a C++ ternary search tree implementation with a Python interface. It’s an excellent tool—and it is also really, really fast.

Unfortunately version 1.15 doesn’t compile on 64-bit platforms, giving the following error messages:

pythonTST.h:178: error: cannot convert 'int*' to 'Py_ssize_t*' for argument '3'
to 'int PyString_AsStringAndSize(PyObject*, char**, Py_ssize_t*)'
tst_wrap.cxx: In function 'PyObject* _wrap__TST_walk__SWIG_1(PyObject*, int, PyO
tst_wrap.cxx:3175: error: cannot convert 'int*' to 'Py_ssize_t*' for argument '3
' to 'int PyString_AsStringAndSize(PyObject*, char**, Py_ssize_t*)'
tst_wrap.cxx: In function 'PyObject* _wrap__TST_close_match(PyObject*, PyObject*
tst_wrap.cxx:3250: error: cannot convert 'int*' to 'Py_ssize_t*' for argument '3
' to 'int PyString_AsStringAndSize(PyObject*, char**, Py_ssize_t*)'
tst_wrap.cxx: In function 'PyObject* _wrap__TST_prefix_match(PyObject*, PyObject
[...and so on...]

Until Nicolas releases an updated version, here is the quick fix:

cp pythonTST.h pythonTST.h.orig
cp tst_wrap.cxx tst_wrap.cxx.orig
sed -r 's/int size/Py_ssize_t size/' < tst_wrap.cxx.orig > tst_wrap.cxx
sed -r 's/int length/Py_ssize_t length/' < pythonTST.h.orig > tmpfile
sed -r 's/sizeof\(int\)/sizeof(long)/' < tmpfile > pythonTST.h

Run these commands from the pytst source directory and you should be all set. I’m not sure if this is a fully satisfactory solution, but at least this will get the test suite running again.
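
To see why the compiler objects, it helps to compare the sizes of the two types on your machine. This small sketch uses `ctypes.c_ssize_t` as a stand-in for `Py_ssize_t`; it only illustrates the mismatch and is not part of the patch.

```python
import ctypes

int_size = ctypes.sizeof(ctypes.c_int)
ssize_size = ctypes.sizeof(ctypes.c_ssize_t)
pointer_size = ctypes.sizeof(ctypes.c_void_p)

print("int: %d bytes, ssize_t: %d bytes, pointer: %d bytes"
      % (int_size, ssize_size, pointer_size))

# When the sizes differ, an int* cannot safely stand in for a Py_ssize_t*.
sizes_differ = int_size != ssize_size
print("sizes differ: %s" % sizes_differ)
```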

Hacking comments in Django 1.0

The recent release of Django 1.0 included a full rewrite of the comments framework. Comments have been available in Django for a while but were never properly documented until now.

This article will show you how to adapt and extend the comments framework so that it fits the needs of your application. Why extend it? Well, mainly because the framework does what it says on the box—and nothing more. It allows you to attach comments to any Django object instance but for the rest of the business logic—e.g. regulating who can modify and delete comments—you are on your own.

Also, the current documentation does not cover all features so what I am writing here should hopefully fill a few gaps.


You need to be familiar with Django. If you’re not, then have a look at the tutorial in the official documentation or alternatively at my previous article on how to get started with Django on Google App Engine.

How comments work

It’s really simple—just skim through the well-written documentation and you should pretty much be able to figure it out.

For example, to show a comment form for an instance of a model called my_model_instance, you just need two lines of template code:

{% load comments %}
{% render_comment_form for my_model_instance %}

The magic behind the comments framework lies in its use of generic model relations. This is a very powerful (and well-hidden) Django feature that allows your models to have generic foreign keys, meaning they can link to any other model. The comments framework uses this technique to ensure that comments can be attached to an arbitrary model in your application.
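
The idea behind a generic relation can be sketched in plain Python. This is only an analogy, not Django's implementation: instead of a foreign key to one fixed model, each comment stores the type of its target and that target's primary key. All class and function names below are hypothetical.

```python
class Event(object):
    """Stand-in for any model that can receive comments."""
    def __init__(self, pk, title):
        self.pk = pk
        self.title = title


class BlogEntry(object):
    """A second, unrelated model."""
    def __init__(self, pk, body):
        self.pk = pk
        self.body = body


class GenericComment(object):
    """Stores the target's type name and key instead of a fixed foreign key."""
    def __init__(self, target, text):
        self.content_type = type(target).__name__
        self.object_pk = target.pk
        self.text = text


def comments_for(target, comments):
    """Return the comments attached to one specific object."""
    return [c for c in comments
            if c.content_type == type(target).__name__
            and c.object_pk == target.pk]


event = Event(1, "A workshop")
entry = BlogEntry(1, "Whiteboard post")
all_comments = [
    GenericComment(event, "See you there"),
    GenericComment(entry, "Nice post"),
]
print([c.text for c in comments_for(event, all_comments)])
```

Note that the event and the blog entry share the primary key 1; the stored type name is what keeps their comments apart.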

The scenario

I will be describing a real-life case from my company web site, Eventseer.net. Eventseer is an event tracker that helps researchers stay informed on upcoming conferences and workshops. It uses the comments framework for two different purposes.

Firstly, registered users can add comments to each event in our database. Secondly, all users can claim a personal profile page where they get what we call a whiteboard—which is simply a blogging application. Each entry on a whiteboard can be commented on by other registered users.

The problem

There are some limitations when it comes to adding comments on Eventseer. For example, only registered users are allowed to add comments. After a comment has been added, only the user who added it or an administrator is allowed to delete it.

These are fairly typical requirements—which are not supported out of the box in the comments framework. There is some support for using the built-in permissions system, but this will still not let you exercise fine-grained per user access control.

Moreover, the default comment templates are ugly as sin and will have to be adapted to fit your application.

Step 1: Enabling comments

This is described well enough in the standard documentation. However, if we want to add extra functionality there are a couple of extra things to be done.

First, we add the comments framework to INSTALLED_APPS in settings.py:

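# eventseer/settings.py

INSTALLED_APPS = (
    # ... the apps already listed in your project ...
    'django.contrib.comments',
    'eventseer.mod_comments',
)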

Note that I also added an app called eventseer.mod_comments. This is where our comments wrapper code will reside. (I will be using the eventseer project name for the rest of this tutorial).

Now synchronize the database:

$ python manage.py syncdb

This creates the tables necessary for storing the comments.

Finally, add an entry in your base urls.py:

# eventseer/urls.py

from django.conf.urls.defaults import *

urlpatterns = patterns('',
    (r'^comments/', include('eventseer.mod_comments.urls')),
)

This is where we deviate from the standard documentation: Instead of routing all comment URLs to the bundled comments application we instead route them to our own custom application. This allows us to intercept comment URLs as required.

Step 2: Add the modified comments application

This is done the usual way:

$ python manage.py startapp mod_comments

In the previous step we added a reference to urls.py in the mod_comments application, so this file must be added:

# eventseer/mod_comments/urls.py

from django.conf.urls.defaults import *

urlpatterns = patterns('',
    (r'^delete/(?P<comment_id>\d+)/$', 'eventseer.mod_comments.views.delete'),
    (r'', include('django.contrib.comments.urls')),
)

The first line routes requests to /comments/delete/ to a custom delete view which we will create in the next step. For this example this is the only behavior we wish to modify. The last line ensures that all other requests are passed through to django.contrib.comments.urls.
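
The interception relies on patterns being tried in order, with the first match winning. Here is a simplified model of that behavior using only the `re` module; it is not Django's resolver, and the target labels are placeholders. The paths are what remains after the `comments/` prefix has been stripped by the base urls.py.

```python
import re

# Patterns in the same order as in mod_comments/urls.py.
ROUTES = [
    (re.compile(r"^delete/(?P<comment_id>\d+)/$"), "custom delete view"),
    (re.compile(r""), "bundled comments URLs"),
]


def resolve(path):
    """Return the first matching target and any named groups."""
    for pattern, target in ROUTES:
        match = pattern.search(path)
        if match:
            return target, match.groupdict()
    return None, {}


print(resolve("delete/42/"))
print(resolve("post/"))
```

Swapping the two entries would send every request to the bundled URLs, since the empty pattern matches everything.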

Step 3: Create the wrapper view

We want to make sure that only the user who wrote a comment or administrators are allowed to delete it. This can be taken care of in mod_comments/views.py:

# eventseer/mod_comments/views.py

from django.contrib.auth.decorators import login_required
from django.contrib.comments.models import Comment
from django.http import Http404
from django.shortcuts import get_object_or_404
import django.contrib.comments.views.moderation as moderation

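@login_required
def delete(request, comment_id):
    comment = get_object_or_404(Comment, pk=comment_id)
    # is_staff is assumed here to be what identifies an administrator.
    if request.user == comment.user or \
            request.user.is_staff:
        return moderation.delete(request, comment_id)
    else:
        raise Http404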

First we wrap the delete function with the login_required decorator so as to keep out non-authenticated users. We then check if the user who made the delete request actually owns the comment or if the user has administrator permissions. If either case holds true we pass the request on to the original delete method. Otherwise a 404 (page not found) error is raised.
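
The ownership rule itself is easy to isolate and check without a running Django project. The sketch below uses namedtuples as stand-ins for the user and comment objects, and assumes that administrators are marked with an `is_staff` flag; the names are illustrative.

```python
from collections import namedtuple

# Minimal stand-ins for Django's User and Comment objects.
User = namedtuple("User", "name is_staff")
Comment = namedtuple("Comment", "user text")


def may_delete(user, comment):
    """Only the comment's author or an administrator may delete it."""
    return user == comment.user or user.is_staff


alice = User("alice", False)
bob = User("bob", False)
admin = User("admin", True)
comment = Comment(alice, "First!")

print(may_delete(alice, comment), may_delete(bob, comment), may_delete(admin, comment))
```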

We can of course modify the view method signature as required. In fact, the original delete method can be completely bypassed if that is what we want.

Step 4: Modifying delete behavior

By default the delete view shows a confirmation page (comments/delete.html) on GET requests and does the actual deletion on POST requests. After the deletion is done you will be shown the standard deleted.html template. Alternatively, adding a next parameter to the POST request will send the user to the given URL.

Say we wish to make some changes to the confirmation page, comments/delete.html. Instead of modifying the original in the Django distribution we create our own version. Create the directory eventseer/mod_comments/templates/comments and copy delete.html into it.

You will typically find this file in /usr/lib/python2.5/site-packages/django/contrib/comments/templates/comments on Linux systems or C:/Python2.5/Lib/site-packages/django/contrib/comments/templates/comments on Windows systems—your mileage may vary.

Typically you will wish to change this template to fit in with your site design, for instance by inheriting from your base templates.

To make the modified template take precedence, just add the new directory to settings.py:

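# eventseer/settings.py

import os

TEMPLATE_DIRS = (
    # Illustrative: the path is built relative to this settings file.
    os.path.join(os.path.dirname(__file__), 'mod_comments', 'templates'),
)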

This makes sure that the Django template loader searches the eventseer/mod_comments/templates directory, where it will find our alternative version of comments/delete.html. Requests to comment views that use the other default templates fall through to the default location.


The Django comments framework is the easiest and quickest way to add commenting functionality to your application. The flip side of this simplicity is that you will often have to extend the framework to make it behave according to your requirements. As this tutorial has shown, this can be done without making changes to the comments framework itself. One of the core strengths of Django is how it provides a set of reusable building blocks upon which you can add your own advanced functionality as required.

At the time of writing, the comments framework documentation is somewhat sparse. If you want to learn more about the inner workings of Django comments you will have to consult the source code—there are quite a few undocumented features that are really useful.


Tim Hoelscher noticed that I hadn’t said anything about how to work around the Django permission system, which was an unintentional omission.

The original delete method in django.contrib.comments.views.moderation requires that the user who wants to delete a comment has the comments.can_moderate permission. Regular users do not have this permission by default, so we have to set it for all users who are allowed to delete comments. (Remember, the wrapper delete makes sure that they can only delete their own comments.)

An easy way to solve this is to create a ‘user’ group, assign the comments.can_moderate permission to this group, and finally assign all users to this group. This can be done through the admin interface, with a few lines of SQL, or within your Django application. Refer to the Django permissions documentation for more information on how permissions work.

Serving static files with Django and AWS – going fast on a budget

I just posted an article on how to improve Django response times through the use of pre-generated static files:

Speed matters.

When Google tried adding 20 extra results to their search pages, traffic dropped by 20%. The reason? Page generation took an extra .5 seconds.

This article will show how Eventseer utilizes an often overlooked way of improving the responsiveness of a web application: Pre-generating and serving static files instead of dynamic pages.

The full posting can be read here.

Porting legacy databases to Google App Engine

A reader posed the following question:

“I’m trying to convert my django app to work with google app engine. This is preferred rather than spending $100/year extra for a site with ssh access, plus I love the appengine dashboard.

Here is my issue: My current django app is fairly static. It pulls all its data from a mysql database containing ~6,000 rows. This itself is built from a gadfly database, so it should be pretty easy to get these values into the datastore/gql.

How can I sync my database with appengine?”

This is a highly relevant problem if you are porting an existing Django application to the Google App Engine. Luckily, the App Engine SDK includes a bulk data uploader tool that does the job. Let’s work through an example where we use this tool to transfer data from an existing MySQL database onto a Django application running on Google App Engine.

Case description: We have an inventory database that is currently stored in MySQL. This database is to be made available through a Django web application that allows visitors to review the inventory. The database is named ‘customerdb’ and has a single table called ‘inventory’:

mysql> select * from inventory;
+----------+----------+
| name     | quantity |
+----------+----------+
| ham      |        2 |
| cheese   |        7 |
| macaroni |        1 |
+----------+----------+
3 rows in set (0.00 sec)

Setup: We need an App Engine-ready Django application that provides us with the views and models we need to display our inventory. For this scenario we will name the application ‘upload-demo’ and make it available on http://upload-demo.appspot.com. My earlier tutorials should provide you with what you need to build the basic application structure.

The full set of application files can be downloaded here. References to the application name and paths will have to be changed according to your system setup.

Once the fundamentals are in place you should add an inventory model that mirrors the table in our database:

# upload-demo/uploaddemo/main/models.py

from google.appengine.ext import db

class Inventory(db.Model):
    name = db.StringProperty()
    quantity = db.IntegerProperty()

We also need a view that displays the data:

# upload-demo/uploaddemo/main/views.py

from django.http import HttpResponse
from uploaddemo.main.models import Inventory

def main(request):
    result = ""
    items = Inventory.all()

    for item in items:
        result += "%s: %i<br/>" % (item.name, item.quantity)

    return HttpResponse(result)

Finally, your urls.py should point towards the view:

# upload-demo/uploaddemo/urls.py

from django.conf.urls.defaults import *

urlpatterns = patterns("",
    (r"^$", "uploaddemo.main.views.main"),
)

The application directory structure should look exactly like this:

[Image: project directory structure]

To verify that we are good to go, deploy the application to App Engine:

[test@mybox ~]$ appcfg.py update upload-demo

You should see an empty page—which makes sense since we have no data yet.

Step 1 – Create a bulk load handler: The bulk loader accepts CSV-formatted data, which it feeds into the datastore:

# upload-demo/loader.py

from google.appengine.ext import bulkload

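class InventoryLoader(bulkload.Loader):
    def __init__(self):
        # Column order must match the order of the columns in the CSV file.
        fields = [
            ("name", str),
            ("quantity", int)
        ]
        bulkload.Loader.__init__(self, "Inventory", fields)

if __name__ == "__main__":
    bulkload.main(InventoryLoader())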

In this case we have created a loader for the Inventory model where the fields match the name and type of the fields in the model. Note that the loader is kept outside of the Django application.
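
The field list is essentially a recipe for converting each CSV column to the right type. The following stand-alone sketch mimics that idea with the standard `csv` module. It does not use the App Engine SDK, and `parse_rows` is a hypothetical helper, so treat it as an illustration of the concept only.

```python
import csv
import io

# Same (name, converter) pairs as in the loader above.
FIELDS = [("name", str), ("quantity", int)]


def parse_rows(csv_text, fields=FIELDS):
    """Turn CSV text into dicts, converting each column with its field type."""
    entities = []
    for row in csv.reader(io.StringIO(csv_text)):
        entity = {}
        for (name, convert), value in zip(fields, row):
            entity[name] = convert(value)
        entities.append(entity)
    return entities


print(parse_rows("ham,2\ncheese,7\n"))
```

A non-numeric value in the quantity column raises a ValueError here, which is a useful early warning before any upload is attempted.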

Step 2 – Add the handler to the project: This is done by adding an entry to app.yaml that references loader.py:

# upload-demo/app.yaml

application: upload-demo
version: 1
runtime: python
api_version: 1

handlers:
- url: /load
  script: loader.py
  login: admin
- url: /.*
  script: main.py

A login will be required to access the loader URL—we don’t want anyone to add to our inventory without permission.

Step 3 – Convert the data to CSV:

Getting this step right can be surprisingly tricky, depending on your legacy database. For MySQL you may have to make sure that the user account has file write access rights:

[root@mybox ~]# mysql -u root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 74740
Server version: 5.0.45 Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> grant file on *.* to 'test'@'localhost';
Query OK, 0 rows affected (0.01 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)

This command might have to be run as root, depending on how your database is configured. To do the data dump we run the following select statement:

[test@mybox ~]$ mysql -u test customerdb -e "select * into
    outfile '/tmp/inventory.txt' fields terminated by ',' from
    inventory"
[test@mybox ~]$ cat /tmp/inventory.txt

If you are using PostgreSQL you can achieve the same by using the COPY command.
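
If your data contains commas or quotes, a plain comma-terminated dump can produce ambiguous rows. One option is to write the file with Python's `csv` module, which quotes such values. This is a sketch under assumptions: the rows are hard-coded here where you would normally read them from a database cursor, the "salt, coarse" row is invented to show the quoting, and it presumes the loader parses standard CSV quoting.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize (name, quantity) rows as CSV text with proper quoting."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for name, quantity in rows:
        writer.writerow([name, quantity])
    return buffer.getvalue()


inventory = [("ham", 2), ("cheese", 7), ("macaroni", 1), ("salt, coarse", 3)]
print(rows_to_csv(inventory))
```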

Step 4 – Upload the data: First, redeploy your application to App Engine:

[test@mybox ~]$ appcfg.py update upload-demo

We then use the bulkload_client.py script to upload our CSV file. The script is found in the tools folder of your App Engine installation—you may have to add it to your PATH. Note that you have to use double dashes for the parameters.

[test@mybox ~]$ bulkload_client.py --filename=/tmp/inventory.txt
    --kind=Inventory --url=http://upload-demo.appspot.com/load

INFO 2008-06-15 07:39:21,682 bulkload_client.py]
    Starting import; maximum 10 entities per post
INFO 2008-06-15 07:39:21,684 bulkload_client.py]
    Importing 3 entities in 29 bytes
ERROR 2008-06-15 07:39:21,997 bulkload_client.py]
    An error occurred while importing: Received code 302: Found
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<H1>302 Moved</H1>
The document has moved
<A HREF="https://www.google.com/accounts/ServiceLogin?service=ah&

ERROR    2008-06-15 07:39:21,997 bulkload_client.py] Import failed

Now, that didn’t work. Remember that app.yaml says we have to authenticate ourselves as an admin user before we can upload data. Try visiting http://upload-demo.appspot.com/load in a web browser. After having authenticated yourself using your Google account you will be redirected to the following page:

[Image: loader authentication screen]

Just what we needed! Add the cookie string parameter to the previous request and try again:

[test@mybox ~]$ bulkload_client.py --filename=/tmp/inventory.txt
    --kind=Inventory --url=http://upload-demo.appspot.com/load
    --cookie='<cookie string shown on the page above>'

INFO 2008-06-15 07:50:58,541 bulkload_client.py]
    Starting import; maximum 10 entities per post
INFO 2008-06-15 07:50:58,549 bulkload_client.py]
    Importing 3 entities in 29 bytes
INFO 2008-06-15 07:50:59,102 bulkload_client.py]
    Import succcessful

If you visit http://upload-demo.appspot.com you should now see the data we just uploaded.

Final notes: This simple example should be enough to get you started. When converting real-life databases you will have to deal with more complex schemas with references between tables. The discussion here should point you in the right direction. You may also find the SDK documentation on types and property classes useful when porting your legacy database.

Django on Google App Engine: Templates and static files

In a previous tutorial we learned how to set up a simple Django project on the Google App Engine. We also saw how to use the App Engine datastore in place of the Django model API.

Now, let’s have a look at how to integrate Django templates. I will also show you how to serve static files.

Important: Remember to upgrade to the latest version of the App Engine SDK (version 1.0.1 at the time of writing). Otherwise, this tutorial will not work for you if you are developing on Windows.

Step 1: Set up an App Engine project—I am calling mine djangostatic. Follow steps 1 through 7 from the previous tutorial, remembering to substitute the project directory path and project name in main.py and app.yaml, and you will be all set.

Step 2: We will create a simple view that makes use of a template. First, let us define the template. Create a directory where you can store templates:

tmp/djangostatic$ cd djangostatic/main
tmp/djangostatic/djangostatic/main$ mkdir -p templates/main

Then, add the file main.html to your new template directory:

# djangostatic/djangostatic/main/templates/main/main.html

<html>
    <head>
        <link href="/css/main.css" type="text/css"
            rel="stylesheet" />
    </head>
    <body>
        <p>
            Hello world!
        </p>
    </body>
</html>

Note that the template refers to a style sheet file, main.css, which we will create later on.

Step 3: Django needs to be told where to search for template files: this is done in the settings.py file. The settings file is mostly pre-configured; we just have to modify the part that sets the TEMPLATE_DIRS variable:

# djangostatic/djangostatic/settings.py

import os
ROOT_PATH = os.path.dirname(__file__)

TEMPLATE_DIRS = (
    ROOT_PATH + "/main/templates",
)

Step 4: After creating the template and telling Django where to find it, we have to write a view that does the actual rendering:

# djangostatic/djangostatic/main/views.py

from django.shortcuts import render_to_response

def main(request):
    return render_to_response("main/main.html")

This tells Django to use the template main/main.html when rendering the response. The render_to_response method is a convenient shortcut for rendering a template and returning a response in one step.

Step 5: Finally, we need to map a URL to our view—this is done in urls.py:

# djangostatic/djangostatic/urls.py

from django.conf.urls.defaults import *

urlpatterns = patterns("",
    (r"^$", "djangostatic.main.views.main"),
)

Start your development server (dev_appserver.py djangostatic), fire up your browser, and open the page at http://localhost:8080/. If you have done everything right so far, you should get the “hello world” message from the template.

Step 6: So what about the style sheet file, main.css? A style sheet file is a typical example of a static file. We use Django for rendering dynamic pages, so requests for static files should not be handled by the Django engine. In a regular Django application, we usually configure the web server to route such requests to a specific directory. On the App Engine, we achieve the same effect by adding a static handler to app.yaml:

# djangostatic/app.yaml

application: djangostatic
version: 1
runtime: python
api_version: 1

handlers:
- url: /css
  static_dir: media/css
- url: /.*
  script: main.py

Here, we have added an entry that routes all requests beginning with /css to the directory media/css. Let us create this directory:

tmp/djangostatic$ mkdir -p media/css
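
To make the routing concrete, here is a simplified model of how the two handlers divide requests between them. It is a sketch of the matching order only, not App Engine's actual dispatch code, and the `route` function is hypothetical.

```python
import re

# Handlers in the order they appear in app.yaml.
HANDLERS = [
    {"url": "/css", "static_dir": "media/css"},
    {"url": "/.*", "script": "main.py"},
]


def route(path, handlers=HANDLERS):
    """Return what would serve `path`: a static file or a script."""
    for handler in handlers:
        if "static_dir" in handler:
            # A static_dir URL is treated as a directory prefix.
            prefix = handler["url"].rstrip("/") + "/"
            if path.startswith(prefix):
                return "static", handler["static_dir"] + "/" + path[len(prefix):]
        elif re.match("^(?:%s)$" % handler["url"], path):
            return "script", handler["script"]
    return None


print(route("/css/main.css"))
print(route("/"))
```

Because the catch-all pattern comes last, only requests that miss the /css prefix ever reach Django.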

Step 7: The link in our template specified /css/main.css as the full URL, so we have to add the main.css file to our new directory:

# djangostatic/media/css/main.css

p {
    font-size: 48px;
}

Reload the application page; the browser should now be able to make use of the style sheet so that the message is displayed in a larger font. You can view the final results here.

Final notes: To learn more about how to serve static files on App Engine, have a look at the official documentation on how to configure an app. Django templates are very powerful—this tutorial has only shown you the absolute basics. Visit the Django template documentation to get the full story.

Django on Google App Engine in 13 simple steps

In this tutorial I will show you how to get a simple datastore-backed Django application up and running on the Google App Engine. I will assume that you are somewhat familiar with Django.

Update 1: You can download the full set of files from here. Make sure to fix the sys.path in main.py.

Update 2: There is now a Turkish translation of this tutorial, courtesy of Türker Sezer.

Update 3: Now in Russian as well.

Update 4: Brazilian Portuguese translation by Marcio Andrey Oliveira.

Step 1: Register an app name and install the development kit per the instructions.

Step 2: Create a directory for your application—for this tutorial my application is called mashname:

tmp$ mkdir mashname
tmp$ cd mashname

Step 3: Add a file called main.py to your new directory:

# main.py

import os, sys
os.environ["DJANGO_SETTINGS_MODULE"] = "mashname.settings"

# Placeholder path: point this at the directory that contains your project.
sys.path.append("/path/to/mashname")

# Google App Engine imports.
from google.appengine.ext.webapp import util

# Force Django to reload its settings.
from django.conf import settings
settings._target = None

import django.core.handlers.wsgi
import django.core.signals
import django.db
import django.dispatch.dispatcher

# Log errors.
#django.dispatch.dispatcher.connect(
#   log_exception, django.core.signals.got_request_exception)

# Unregister the rollback event handler.
django.dispatch.dispatcher.disconnect(
    django.db._rollback_on_exception,
    django.core.signals.got_request_exception)

def main():
    # Create a Django application for WSGI.
    application = django.core.handlers.wsgi.WSGIHandler()

    # Run the WSGI CGI handler with that application.
    util.run_wsgi_app(application)

if __name__ == "__main__":
    main()

This is basically the same file as suggested here, except I had to set the Python path to be able to test locally. I also had to set the DJANGO_SETTINGS_MODULE—this might not be necessary when running on the App Engine. I had to disable the error logging which I was not able to get working.

Step 4: Add a file called app.yaml to the same directory:

application: mashname
version: 1
runtime: python
api_version: 1

handlers:
- url: /.*
  script: main.py

Make sure to get the application name right.

Step 5: From your mashname directory, create a new Django project:

tmp/mashname$ django-admin.py startproject mashname

(I’m assuming that your current Django setup is working as it should.)

Step 6: You should now be able to test your application:

tmp/mashname$ cd ..
tmp$ dev_appserver.py mashname
INFO     2008-04-08 19:08:10,023 appcfg.py] Checking for updates to the SDK.
INFO     2008-04-08 19:08:10,384 appcfg.py] The SDK is up to date.
INFO     2008-04-08 19:08:10,404 dev_appserver_main.py] Running application mash
name on port 8080: http://localhost:8080

Point your browser towards http://localhost:8080 and you should get the standard Django It worked! message.

Step 7: Create a Django app within the project:

tmp$ cd mashname
tmp/mashname$ python mashname/manage.py startapp main

Step 8: Now it is time to add a model. We will be creating a simple application that logs all visitors to the data store and displays their IP address and time of visit. Edit ~/mashname/mashname/main/models.py so that it looks like this:

# models.py

from google.appengine.ext import db

class Visitor(db.Model):
    ip = db.StringProperty()
    added_on = db.DateTimeProperty(auto_now_add=True)

There is no need to sync the database since we are not using regular Django models.

Step 9: Now we create a view that is responsible for both adding data to the Visitor model and showing the previous visitors. Edit views.py (in the same directory as models.py) so that it does what we want:

# views.py

from django.http import HttpResponse

from mashname.main.models import Visitor

def main(request):
    # Store the current visitor in the datastore.
    visitor = Visitor()
    visitor.ip = request.META["REMOTE_ADDR"]
    visitor.put()

    result = ""
    visitors = Visitor.all()

    # Show up to 40 stored visits, one per line.
    for visitor in visitors.fetch(limit=40):
        result += visitor.ip + u" visited on " + unicode(visitor.added_on) + u"<br/>"

    return HttpResponse(result)

Step 10: Finally, make your urls.py point towards the view:

# urls.py

from django.conf.urls.defaults import *

urlpatterns = patterns("",
    (r"^$", "mashname.main.views.main"),
)

Step 11: Test your application (as in step 6) and everything should hopefully work. For each page reload a new entry is added to the Visitor model and shown in the view.

Step 12: Upload your application to the Google App Engine:

tmp$ appcfg.py update mashname

For the first upload you will have to provide the mail address and password for your Google account.

Step 13: Enjoy! To view the final results, go to http://mashname.appspot.com/.