Porting legacy databases to Google App Engine

A reader posed the following question:

“I’m trying to convert my django app to work with google app engine. This is preferred rather than spending $100/year extra for a site with ssh access, plus I love the appengine dashboard.

Here is my issue: My current django app is fairly static. It pulls all its data from a mysql database containing ~6,000 rows. This itself is built from a gadfly database, so it should be pretty easy to get these values into the datastore/gql.

How can I sync my database with appengine?”

This is a highly relevant problem if you are porting an existing Django application to the Google App Engine. Luckily, the App Engine SDK includes a bulk data uploader tool that does the job. Let’s work through an example where we use this tool to transfer data from an existing MySQL database onto a Django application running on Google App Engine.

Case description: We have an inventory database that is currently stored in MySQL. This database is to be made available through a Django web application that allows visitors to review the inventory. The database is named ‘customerdb’ and has a single table called ‘inventory’:’

mysql> select * from inventory;
+----------+----------+
| name     | quantity |
+----------+----------+
| ham      |        2 |
| cheese   |        7 |
| macaroni |        1 |
+----------+----------+
3 rows in set (0.00 sec)

Setup: We need an App Engine-ready Django application that provides us with the views and models we need to display our inventory. For this scenario we will name the application ‘upload-demo’ and make it available on http://upload-demo.appspot.com. My earlier tutorials should provide you with what you need to build the basic application structure.

The full set of application files can be downloaded here. References to the application name and paths will have to be changed according to your system setup.

Once the fundamentals are in place you should add an inventory model that mirrors the table in our database:

# upload-demo/uploaddemo/main/models.py

from google.appengine.ext import db

class Inventory(db.Model):
    name = db.StringProperty()
    quantity = db.IntegerProperty()

We also need a view that displays the data:

# upload-demo/uploaddemo/main/views.py

from django.http import HttpResponse
from uploaddemo.main.models import Inventory

def main(request):
    result = ""
    items = Inventory.all()

    for item in items:
        result += "%s: %i<br/>" % (item.name, item.quantity)

    return HttpResponse(result)

Finally, your urls.py should point towards the view:

# upload-demo/uploaddemo/urls.py

from django.conf.urls.defaults import *

urlpatterns = patterns("",
    (r"^$", "uploaddemo.main.views.main"),
)

The application directory structure should look exactly like this:

Project directory structure

To verify that we are good to go, deploy the application to App Engine:

[test@mybox ~]$ appcfg.py update upload-demo

You should see an empty page—which makes sense since we have no data yet.

Step 1 – Create a bulk load handler: The bulk loader accepts CSV-formatted data which it will feed it into the datastore:

# upload-demo/loader.py

from google.appengine.ext import bulkload

class InventoryLoader(bulkload.Loader):
    def __init__(self):
        fields = [
            ("name", str),
            ("quantity", int)
        ]
       
        bulkload.Loader.__init__(self, "Inventory", fields)

if __name__ == "__main__":
    bulkload.main(InventoryLoader())

In this case we have created a loader for the Inventory model where the fields match the name and type of the fields in the model. Note that the loader is kept outside of the Django application.

Step 2 – Add the handler to the project: This is done by adding an entry to app.yaml that references loader.py:

# upload-demo/app.yaml

application: upload-demo
version: 1
runtime: python
api_version: 1

handlers:
- url: /load
  script: loader.py
  login: admin
- url: /.*
  script: main.py

A login will be required to access the loader URL—we don’t want anyone to add to our inventory without permission.

Step 3 – Convert the data to CSV:

Getting this step right can be surprisingly tricky, depending on your legacy database. For MySQL you may have to make sure that the user account has file write access rights:

[root@mybox ~]# mysql -u root
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 74740
Server version: 5.0.45 Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> grant file on *.* to 'test'@'localhost';
Query OK, 0 rows affected (0.01 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)

This command might have to be run as root, depending on how your database is configured. To do the data dump we run the following select statement:

[test@mybox ~]$ mysql -u test customerdb -e "select * into
    outfile '/tmp/inventory.txt' fields terminated by ',' from
    inventory"
[test@mybox ~]$ cat /tmp/inventory.txt
ham,2
cheese,7
macaroni,1

If you are using PostgreSQL you can achieve the same by using the COPY command.

Step 4 – Upload the data: First, redeploy your application to App Engine:

[test@mybox ~]$ appcfg.py update upload-demo

We then use the bulkload_client.py script to upload our CSV file. The script is found in the tools folder of your App Engine installation—you may have to add it to your PATH. Note that you have to use double dashes for the parameters.

[test@mybox ~]$ bulkload_client.py --filename=/tmp/inventory.txt
    --kind=Inventory --url=http://upload-demo.appspot.com/load

INFO 2008-06-15 07:39:21,682 bulkload_client.py]
    Starting import; maximum 10 entities per post
INFO 2008-06-15 07:39:21,684 bulkload_client.py]
    Importing 3 entities in 29 bytes
ERROR 2008-06-15 07:39:21,997 bulkload_client.py]
    An error occurred while importing: Received code 302: Found
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="https://www.google.com/accounts/ServiceLogin?service=ah&
continue=http://upload-demo.appspot.com/_ah/login%3Fcontinue%3Dhttp://
upload-demo.appspot.com/load&ltmpl=gm&ahname=Django+data+u
pload+demo&sig=f9861d41d527e55f15742b8d54504bcc">here</A>.
</BODY></HTML>

ERROR    2008-06-15 07:39:21,997 bulkload_client.py] Import failed

Now, that didn’t work. Remember that app.yaml says we have to authenticate ourself as an admin user before we can upload data. Try visiting http://upload-demo.appspot.com/load in a web browser. After having authenticated yourself using your Google account you will be redirected to the following page:

Loader authentication screen

Just what we needed! Add the cookie string parameter to the previous request and try again:

[test@mybox ~]$ bulkload_client.py --filename=/tmp/inventory.txt
    --kind=Inventory --url=http://upload-demo.appspot.com/load
    --cookie='ACSID=AJKiYcE[...]1Hh4'

INFO 2008-06-15 07:50:58,541 bulkload_client.py]
    Starting import; maximum 10 entities per post
INFO 2008-06-15 07:50:58,549 bulkload_client.py]
    Importing 3 entities in 29 bytes
INFO 2008-06-15 07:50:59,102 bulkload_client.py]
    Import succcessful

If you visit http://upload-demo.appspot you should now see the data we just uploaded.

Final notes: This simple example should be enough to get you started. When converting real-life databases you will have to deal with more complex schemas with references between tables. The discussion here should point you in the right direction. You may also find the SDK documentation on types and property classes useful when porting your legacy database.

Django on Google App Engine: Templates and static files

In a previous tutorial we learned how to set up a simple Django project on the Google App Engine. We also saw how to use the App Engine datastore in place of the Django model API.

Now, let’s have a look at how to integrate Django templates. I will also show you how to serve static files.

Important: Remember to upgrade to the latest version of the App Engine SDK (version 1.0.1 at the time of writing). Otherwise, this tutorial will not work for you if you are developing on Windows.

Step 1: Set up an App Engine project—I am calling mine djangostatic. Follow steps 1 through 7 from the previous tutorial, remembering to substitute the project directory path and project name in main.py and app.yaml, and you will be all set.

Step 2: We will create a simple view that makes use of a template. First, let us define the template. Create a directory where you can store templates:

tmp/djangostatic$ cd djangostatic/main
tmp/djangostatic/djangostatic/main$ mkdir -p templates/main

Then, add the file main.html to your new template directory:

# djangostatic/djangostatic/main/templates/main/main.html

<html>
    <head>
        <link href="/css/main.css" type="text/css"
               rel="stylesheet"></link>
    </head>
    <body>
        <p>
            Hello world!
        </p>
    </body>
</html>

Note that the template refers to a style sheet file, main.css, which we will create later on.

Step 3: Django needs to be told where to search for template files: this is done in the settings.py file. The settings file is mostly pre-configured; we just have to modify the part that sets the TEMPLATE_DIRS variable:

# djangostatic/djangostatic/settings.py

import os
ROOT_PATH = os.path.dirname(__file__)

TEMPLATE_DIRS = (
    ROOT_PATH + "/main/templates",
)

Step 4: After creating the template and telling Django where to find it, we have to write a view that does the actual rendering:

# djangostatic/djangostatic/main/views.py

from django.shortcuts import render_to_response

def main(request):
    return render_to_response("main/main.html")

This tells Django to use the template main/main.html when rendering the response. The render_to_response method is a convenient shortcut for rendering a template and returning a response in one step.

Step 5: Finally, we need to map a URL to our view—this is done in urls.py:

# djangostatic/djangostatic/urls.py

from django.conf.urls.defaults import *

urlpatterns = patterns("",
    (r"^$", "djangostatic.main.views.main"),
)

Start your development server (dev_appserver.py djangostatic), fire up your browser, and open the page at http://127.0.0.1:8080/. If you have done everything right so far, you should get the “hello world” message from the template.

Step 6: So what about the style sheet file, main.css? A style sheet file is a typical example of a static file. We use Django for rendering dynamic pages, so requests for static files should not be handled by the Django engine. In a regular Django application, we usually configure the web server to route such requests to a specific directory. On the App Engine, we achieve the same effect by adding a static handler to app.yaml:

# djangostatic/app.yaml

application: djangostatic
version: 1
runtime: python
api_version: 1

handlers:
- url: /css
  static_dir: media/css
 
- url: /.*
  script: main.py

Here, we have added an entry that routes all requests beginning with /css to the directory media/css. Let us create this directory:

tmp/djangostatic$ mkdir -p media/css

Step 7: The link in our template specified /css/main.css as the full URL, so we have to add the main.css file to our new directory:

# djangostatic/media/css/main.css

p {
    font-size: 48px;
}

Reload the application page; the browser should now be able to make use of the style sheet so that the message is displayed in a larger font. You can view the final results here.

Final notes: To learn more about how to serve static files on App Engine, have a look at the official documentation on how to configure an app. Django templates are very powerful—this tutorial has only shown you the absolute basics. Visit the Django template documentation to get the full story.

Django on Google App Engine in 13 simple steps

In this tutorial I will show you how to get a simple datastore-backed Django application up and running on the Google App Engine. I will assume that you are somewhat familiar with Django.

Update 1: You can download the full set of files from here. Make sure to fix the sys.path in main.py.

Update 2: There is now a Turkish translation of this tutorial, courtesy of Türker Sezer.

Update 3: Now in Russian as well.

Update 4: Brazilian Portuguese tranlation by Marcio Andrey Oliveira.


Step 1: Register an app name and install the development kit per the instructions.

Step 2: Create a directory for your application—for this tutorial my application is called mashname:

tmp$ mkdir mashname
tmp$ cd mashname

Step 3: Add a file called main.py to your new directory:

# main.py

import os, sys
os.environ[&quot;DJANGO_SETTINGS_MODULE&quot;] = &quot;mashname.settings&quot;
sys.path.append(&quot;/home/brox/tmp/mashname&quot;)

# Google App Engine imports.
from google.appengine.ext.webapp import util

# Force Django to reload its settings.
from django.conf import settings
settings._target = None

import django.core.handlers.wsgi
import django.core.signals
import django.db
import django.dispatch.dispatcher

# Log errors.
#django.dispatch.dispatcher.connect(
#   log_exception, django.core.signals.got_request_exception)

# Unregister the rollback event handler.
django.dispatch.dispatcher.disconnect(
django.db._rollback_on_exception,
django.core.signals.got_request_exception)

def main():
    # Create a Django application for WSGI.
    application = django.core.handlers.wsgi.WSGIHandler()

    # Run the WSGI CGI handler with that application.
    util.run_wsgi_app(application)

if __name__ == &quot;__main__&quot;:
    main()

This is basically the same file as suggested here, except I had to set the Python path to be able to test locally. I also had to set the DJANGO_SETTINGS_MODULE—this might not be necessary when running on the App Engine. I had to disable the error logging which I was not able to get working.

Step 4: Add a file called app.yaml to the same directory:

application: mashname
version: 1
runtime: python
api_version: 1

handlers:
- url: /.*
  script: main.py

Make sure to get the application name right.

Step 5: From your mashname directory, create a new Django project:

tmp/mashname$ django-admin.py startproject mashname

(I’m assuming that your current Django setup is working as it should.)

Step 6: You should now be able to test your application:

tmp/mashname$ cd ..
tmp$ dev_appserver.py mashname
INFO     2008-04-08 19:08:10,023 appcfg.py] Checking for updates to the SDK.
INFO     2008-04-08 19:08:10,384 appcfg.py] The SDK is up to date.
INFO     2008-04-08 19:08:10,404 dev_appserver_main.py] Running application mash
name on port 8080: http://localhost:8080

Point your browser towards http://127.0.0.1:8080/ and you should get the standard Django It worked! message.

Step 7: Create a Django app within the project:

tmp$ cd mashname
tmp/mashname$ python mashname/manage.py startapp main

Step 8: Now it is time to add a model. We will be creating a simple application that logs all visitors to the data store and displays their IP address and time of visit. Edit ~/mashname/mashname/main/models.py so that it looks like this:

# models.py

from google.appengine.ext import db

class Visitor(db.Model):
    ip = db.StringProperty()
    added_on = db.DateTimeProperty(auto_now_add=True)

There is no need to sync the database since we are not using regular Django models.

Step 9: Now we create a view that is responsible for both adding data to the Visitor model and showing the previous visitors. Edit views.py (in the same directory as models.py) so that it does what we want:

# views.py

from django.http import HttpResponse

from mashname.main.models import Visitor

def main(request):
    visitor = Visitor()
    visitor.ip = request.META[&quot;REMOTE_ADDR&quot;]
    visitor.put()

    result = &quot;&quot;
    visitors = Visitor.all()
    visitors.order(&quot;-added_on&quot;)

    for visitor in visitors.fetch(limit=40):
        result += visitor.ip + u&quot; visited on &quot; + unicode(visitor.added_on) + u&quot;&quot;

    return HttpResponse(result)

Step 10: Finally, make your urls.py point towards the view:

# urls.py

from django.conf.urls.defaults import *

urlpatterns = patterns(&quot;&quot;,
    (r&quot;^$&quot;, &quot;mashname.main.views.main&quot;),
)

Step 11: Test your application (as in step 6) and everything should hopefully work. For each page reload a new entry is added to the Visitor model and shown in the view.

Step 12: Upload your application to the Google App Engine:

tmp$ appcfg.py update mashname

For the first upload you will have to provide the mail address and password for your Google account.

Step 13: Enjoy! To view the final results, go to http://mashname.appspot.com/.