Persistent Django on Amazon EC2 and EBS - The easy way
August 21, 2008 – 8:12 pm
Now that Amazon’s Elastic Block Store (EBS) is publicly available, running a complete Django installation on Amazon Web Services (AWS) is easier than ever.
Why EBS? EBS provides persistent storage, which means that the Django database is kept safe even after the Django EC2 instances terminate.
This tutorial will take you through all the necessary steps for setting up Django with a persistent PostgreSQL database on AWS. I will be assuming no prior knowledge of AWS, so those of you who have dabbled with it before might want to skim through the first steps. Knowing your way around Django is an advantage but not a requirement.
I am deliberately keeping things simple—to get a deeper understanding of the hows and whys of AWS you should take a look at James Gardner’s excellent article as well as the official documentation.
The command line tools can be a bit intimidating so I will also show you how Elasticfox can be a fully satisfactory alternative.
Summary
We are going to register with AWS, get acquainted with Elasticfox, start up an EC2 instance, install Django and PostgreSQL on the instance, and finally mount an EBS drive and move our database to it.
Step 1: Set up an AWS account
To use AWS you need to register at the AWS web page. If you already have an account with Amazon you can extend this to also cover AWS.
Step 2: Download and install the Elasticfox Firefox extension
This tool will make life a whole lot easier for you. Down the road, if you want to access AWS programmatically, there is no avoiding the official command line tools or, alternatively, boto. For now, let’s stick with Elasticfox.
You can install the extension from this page.
Step 3: Add your AWS credentials to Firefox
Launch Elasticfox (‘Tools’ -> ‘Elasticfox’) and click on the ‘credentials’ button. Enter your account name (typically the email address you registered with), AWS access key and AWS secret access key. This information can be found via ‘Your web services account’ on the AWS start page.
Step 4: Create a new EC2 security group
Let’s pause for a while to consider what we are doing.
You will be running your Django installation off an EC2 instance. There is no magic to them at all—they are simply fully functional servers that you access the same way as, say, a dedicated server or a web hosting account.
By default, EC2 instances are an introverted lot: They prefer keeping to themselves and don’t expose any of their ports to the outside world. We will be running a web application on port 8000, so port 8000 has to be opened. (Normally we would open port 80, but since I will only be using the Django development web server, port 8000 is preferable.) SSH access is also essential, so port 22 should be opened as well.
To make this happen we must create a new security group where these ports are opened.
Click on the ‘Security Groups’ tab and then the ‘Refresh’ icon. The list should update to show you the ‘default’ group.
Then click the ‘Create Security Group’ icon and create a new group named ‘django’.
Now we need to add the actual permissions. Click the ‘Grant Permission’ icon and add ‘From port 8000 to 8000’ under ‘Protocol Details’. Repeat the same action for port 22.
Your security group is now ready for use.
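The rules in this group can also be written down as plain data, which makes it easy to double-check that a port is really covered before debugging anything else. The following is a minimal sketch in Python, meant to be run on your own machine with a current Python 3. The `Rule` type and the `is_port_open` helper are hypothetical names made up for illustration; they are not part of Elasticfox or of any AWS library.

```python
from collections import namedtuple

# Hypothetical in-memory model of a security group rule (not an AWS API).
Rule = namedtuple("Rule", ["protocol", "from_port", "to_port"])

# The two rules added to the 'django' group in this step.
DJANGO_GROUP = [
    Rule("tcp", 8000, 8000),  # Django development server
    Rule("tcp", 22, 22),      # SSH
]


def is_port_open(rules, port, protocol="tcp"):
    """Return True if any rule in the group covers the given port."""
    return any(
        r.protocol == protocol and r.from_port <= port <= r.to_port
        for r in rules
    )


if __name__ == "__main__":
    for port in (22, 80, 8000):
        print(port, is_port_open(DJANGO_GROUP, port))
```

If the development server later starts without errors but the browser times out, a missing rule for port 8000 is the first thing to check.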
Step 5: Set up a key pair
Having a security group is not enough; we also have to set up a key pair to access the instance via SSH.
Why is this necessary? Think about it: You are launching a server instance but no one has told you the root password. So, setting up a private/public key pair is the only way to gain access.
Click on the ‘KeyPairs’ tab and then the ‘Create a new keypair’ icon. Name your new key pair ‘django-keypair’. A save dialog will pop up, allowing you to save the private key in a safe location. Use the filename ‘django.pem’.
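One detail that is easy to trip over: OpenSSH refuses to use a private key file that other users on your machine can read. A minimal sketch for tightening the permissions on a Unix-like system follows, assuming a current Python 3 on your own machine. The helper name `restrict_key_permissions` is hypothetical, and behaviour under Cygwin or plain Windows may differ.

```python
import os
import stat


def restrict_key_permissions(path):
    """Make the private key readable and writable by its owner only (mode 0600).

    Returns the resulting permission bits so the caller can verify them.
    """
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


# Example call, assuming the key was saved as 'django.pem' in the
# current directory:
#     restrict_key_permissions("django.pem")
```

This does the same job as running `chmod 600 django.pem` from a shell.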
Step 6: Launch an EC2 instance
I have a certain fondness for Fedora, so I’ll be using the fedora-8-i386-base-v1.07 AMI with AMI ID ami-2b5fba42.
Return to the ‘AMIs and Instances’ tab.
If you click the ‘Refresh’ icon in the ‘Machine Images’ section you will get a list of all public images. To find the one we’re after, enter ‘fedora-8’ in the search box; after a while, all the relevant images will appear.
Right-click the image with the AMI ID as above and select ‘Launch instance(s) of this AMI’.
This is where the actions from the previous steps start making sense. Set the key pair to ‘django-keypair’ and add the ‘django’ security group to the launch set. Leave all the other settings as they are. Then click the ‘Launch’ button.
Important: From this point on the meter will be running! If the fire alarm goes off, you get bored with this tutorial, or whatever, do remember to shut down the instance before you leave; otherwise it will cost you $2.40 per day.
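The daily figure is just the hourly price multiplied out. As a quick sketch of the arithmetic, assuming the small-instance rate of $0.10 per hour that the $2.40 figure implies (check the current AWS price list before relying on it; the function name `running_cost` is made up for this example):

```python
HOURS_PER_DAY = 24


def running_cost(hourly_rate, days):
    """Cost in dollars of keeping one instance running for `days` full days."""
    return round(hourly_rate * HOURS_PER_DAY * days, 2)


if __name__ == "__main__":
    rate = 0.10  # assumed price in dollars per instance-hour
    print("one day: $%.2f" % running_cost(rate, 1))
    print("30 days: $%.2f" % running_cost(rate, 30))
```

Storage and traffic are billed separately and are not included in this estimate.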
The ‘Your Instances’ section should update, showing you that the instance you just launched is ‘pending’. Click the ‘Refresh’ icon after a while—in a minute or so the status should change to ‘running’.
Step 7: Connect with your new instance
Double click on the running instance and copy the ‘Public DNS Name’ entry. This is the domain name you use to access the instance from the outside. In this tutorial, my instance is hosted at ‘ec2-75-101-248-101.compute-1.amazonaws.com’.
Now we are going to SSH into the instance. I am doing this via Cygwin on Windows, but any SSH client should do. If you are on Windows and have PuTTY installed you can even launch it directly from Elasticfox by right-clicking on the running instance and selecting ‘SSH to Public DNS Name’.
Let’s start with a basic sanity check:
$ ssh root@ec2-75-101-248-101.compute-1.amazonaws.com
The authenticity of host 'ec2-75-101-248-101.compute-1.amazonaws.com (75.101.248.101)' can't be established.
RSA key fingerprint is db:0a:85:36:99:5f:65:6b:c7:77:3e:37:59:fc:16:fd.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-75-101-248-101.compute-1.amazonaws.com,75.101.248.101' (RSA) to the list of known hosts.
Permission denied (publickey,gssapi-with-mic).
As expected, this isn’t working; we need to use the private key you saved earlier. Go to the directory where you saved the django.pem file and type the following:
$ ssh -i django.pem root@ec2-75-101-248-101.compute-1.amazonaws.com
__| __|_ ) Fedora 8
_| ( / 32-bit
___|\___|___|
Welcome to an EC2 Public Image
: -)
Base
[root@ ~]#
That’s better!
If you try pointing your browser towards ‘http://ec2-75-101-248-101.compute-1.amazonaws.com:8000/’ you should get a ‘can’t establish a connection’ error since there is no web server running on port 8000 as of yet.
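If you would rather test reachability from a script than from the browser, a plain TCP connection attempt tells you the same thing. Below is a minimal sketch using only the Python standard library, assuming a current Python 3 on your own machine; `can_connect` is a hypothetical helper name.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False
    conn.close()
    return True
```

Calling `can_connect("ec2-75-101-248-101.compute-1.amazonaws.com", 22)` should succeed once the instance is running, while the same call for port 8000 should fail until the Django development server has been started. Note that the check cannot tell a closed security group apart from a server that is simply not running.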
Step 8: Install required software
Most AMIs are stripped to the bone, so we have to add the software packages we need to get Django up and running. The steps required will of course vary from AMI to AMI, but running the following script as root is sufficient for our v1.07 Fedora 8 instance:
# Install subversion
yum -y install subversion

# Install, initialize and launch PostgreSQL
yum -y install postgresql postgresql-server
service postgresql initdb
service postgresql start

# Modify PostgreSQL config to avoid username/password problems
# Note: This grants access to _all_ local traffic!
cat > /var/lib/pgsql/data/pg_hba.conf <<EOM
local all all trust
host all all 127.0.0.1/32 trust
EOM

# Restart PostgreSQL to enable new security policy
service postgresql restart

# Set up a database for Django
psql -U postgres -c "create database djangotest encoding 'utf8'"

# Install Django (I always checkout from SVN)
cd /opt
svn co http://code.djangoproject.com/svn/django/trunk/ django-trunk
ln -s /opt/django-trunk/django /usr/lib/python2.5/site-packages/django
ln -s /opt/django-trunk/django/bin/django-admin.py /usr/local/bin

# Install psycopg2 (for database access from Python)
yum -y install python-psycopg2
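Since the ‘trust’ method lets anyone who can reach the socket log in as any database user, it is worth knowing exactly which rules use it. The sketch below lists them from the text of a pg_hba.conf file. It assumes a current Python 3 and the simple column layout used in the script above (addresses in CIDR form, so the method is the fourth field on ‘local’ lines and the fifth on ‘host’ lines); files that use a separate netmask column are not handled. `trust_rules` is a hypothetical helper name.

```python
def trust_rules(conf_text):
    """Return the pg_hba.conf rules (as lists of fields) that use 'trust'."""
    rules = []
    for line in conf_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # 'local' lines have no address column; 'host' lines do.
        method_index = 3 if fields[0] == "local" else 4
        if len(fields) > method_index and fields[method_index] == "trust":
            rules.append(fields)
    return rules


if __name__ == "__main__":
    example = "local all all trust\nhost all all 127.0.0.1/32 trust\n"
    for rule in trust_rules(example):
        print(" ".join(rule))
```

For the configuration written by the script, both rules are reported, which is acceptable for a throwaway test instance but not for anything long-lived.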
Step 9: Set up a Django project
First we set up an account for our test Django project:
[root ~]# useradd djangotest
[root ~]# su - djangotest
[djangotest ~]$
For the full story on how to create a new Django project you should have a look at the official tutorial. For now, just execute the following as the ‘djangotest’ user:
[djangotest ~]$ django-admin.py startproject mysite
Now we have all we need to test if the installation is working. Launch the development server like this:
[djangotest ~]$ python mysite/manage.py runserver ec2-75-101-248-101.compute-1.amazonaws.com:8000
Validating models...
0 errors found

Django version 1.0-beta_1-SVN-8461, using settings 'mysite.settings'
Development server is running at http://ec2-75-101-248-101.compute-1.amazonaws.com:8000/
Quit the server with CONTROL-C.
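Note that I am using the full external domain name with the ‘runserver’ command. Without an address argument the development server only listens on 127.0.0.1 and cannot be reached from outside the instance.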
Visit ‘http://ec2-75-101-248-101.compute-1.amazonaws.com:8000/’ with your browser and you should see the regular Django ‘It worked!’ page.
Note: Please don’t use the Django development server in a production setting. In fact, you probably shouldn’t use it on anything that is exposed to the outside world. The only reason I am doing it this way in this tutorial is to keep things simple—normally you should set up a proper web server such as Apache or Lighttpd. Refer to the Django documentation for information on how to do this.
Step 10: Create a Django application
I will show you how to put the Django database in persistent storage later on, so we have to set up a simple database-backed Django application.
Modify mysite/settings.py as follows:
DATABASE_ENGINE = 'postgresql_psycopg2'
DATABASE_NAME = 'djangotest'
DATABASE_USER = 'postgres'
DATABASE_PASSWORD = ''
...
INSTALLED_APPS = (
    'django.contrib.admin',
    'django.contrib.auth',
    ...
Then modify mysite/urls.py to allow access to the admin GUI:
from django.conf.urls.defaults import *
# Uncomment the next two lines to enable the admin:
from django.contrib import admin
admin.autodiscover()
urlpatterns = patterns('',
    # Example:
    # (r'^mysite/', include('mysite.foo.urls')),
    # Uncomment the next line to enable admin documentation:
    # (r'^admin/doc/', include('django.contrib.admindocs.urls')),
    # Uncomment the next line to enable the admin:
    (r'^admin/(.*)', admin.site.root),
)
Now we have to sync the database:
[djangotest ~]$ python mysite/manage.py syncdb
You will be asked to create an admin user—set both the username and the password to ‘djangotest’.
Then create a Django app:
[djangotest ~]$ python mysite/manage.py startapp myapp
If you got the preceding steps right, you should now be able to log on to the admin GUI at http://ec2-75-101-248-101.compute-1.amazonaws.com:8000/admin/ with the ‘djangotest’ user.
Add a new user to verify that the database connection works—we will be needing that new user later on.
Step 11: Create and mount an EBS volume
This is where things get really cool!
There is a huge problem with our current setup: Once you shut down the EC2 instance, all the data in our database will disappear. Enter EBS.
EBS lets you define a persistent storage volume that can be mounted by EC2 instances. If we move our database files to an EBS volume then they will persist no matter what happens to our EC2 instances.
First, go back to Elasticfox and make a note of the availability zone of your running instance—this should be something like ‘us-east-1b’.
Then click on the ‘Volumes and Snapshots’ tab. Click the ‘Create Volume’ icon and create a 1GB volume that belongs to the same availability zone as your instance.
Right-click the new volume and choose ‘Attach this volume’. This will let you attach the volume to the running instance. Use /dev/sdh as the device. Refresh after a couple of seconds and the ‘Attachment status’ should have changed to ‘attached’.
Go back to your terminal and create an ext3 filesystem on the new volume:
[root ~]# mkfs.ext3 /dev/sdh
mke2fs 1.40.4 (31-Dec-2007)
/dev/sdh is entire device, not just one partition!
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
131072 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 35 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
All that remains is to mount the filesystem, in this case to /vol:
[root ~]# echo "/dev/sdh /vol ext3 noatime 0 0" >> /etc/fstab
[root ~]# mkdir /vol
[root ~]# mount /vol
[root ~]# df --si
Filesystem   Size  Used  Avail  Use%  Mounted on
/dev/sda1     11G  1.4G   8.8G   14%  /
/dev/sda2    158G  197M   150G    1%  /mnt
none         895M     0   895M    0%  /dev/shm
/dev/sdh     1.1G   35M   969M    4%  /vol
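The line appended to /etc/fstab packs six whitespace-separated fields into one string: device, mount point, filesystem type, mount options, dump flag and fsck pass number. A small sketch that splits such a line into named fields may make the entry easier to read. It assumes a current Python 3, and `FstabEntry` and `parse_fstab_line` are hypothetical names introduced only for this example.

```python
from collections import namedtuple

# Field names follow the column order of /etc/fstab.
FstabEntry = namedtuple(
    "FstabEntry",
    ["device", "mount_point", "fs_type", "options", "dump", "fsck_pass"],
)


def parse_fstab_line(line):
    """Split one fstab line into its six fields."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError("expected 6 fields, got %d" % len(fields))
    device, mount_point, fs_type, options, dump, fsck_pass = fields
    return FstabEntry(
        device, mount_point, fs_type,
        options.split(","),  # options are a comma-separated list
        int(dump), int(fsck_pass),
    )


if __name__ == "__main__":
    print(parse_fstab_line("/dev/sdh /vol ext3 noatime 0 0"))
```

For our entry this shows that /dev/sdh is mounted at /vol as ext3 with the single option ‘noatime’, and that both the dump flag and the boot-time fsck pass are set to 0, meaning disabled.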
Step 12: Moving the database to persistent storage
First make sure that PostgreSQL is stopped:
[root ~]# service postgresql stop
Stopping postgresql service: [ OK ]
You should also terminate your Django development server in case it is still running.
Now move the PostgreSQL database files to the EBS volume mounted at /vol:
[root ~]# mv /var/lib/pgsql /vol
For this to work we have to make a small modification to the /etc/init.d/postgresql file—make sure that the lines starting at around line 100 look exactly like this:
...
# Set defaults for configuration variables
PGENGINE=/usr/bin
PGPORT=5432
PGDATA=/var/lib/pgsql
if [ -f "$PGDATA/PG_VERSION" ] && [ -d "$PGDATA/base/template1" ]
then
    echo "Using old-style directory structure"
else
    PGDATA=/var/lib/pgsql/data
fi
PGDATA=/vol/pgsql/data
PGLOG=/vol/pgsql/pgstartup.log
...
Note that this is a Fedora-specific hack—the main idea is to have the $PGDATA system variable point at /vol/pgsql/data.
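The hack works because the unconditional assignment at the end overrides whatever the earlier lines decided. To check an edited init script without starting the service, something like the following sketch can report the textually last assignment to a variable. It assumes a current Python 3 and deliberately ignores shell control flow, so it is only meaningful when the final assignment is unconditional, as it is here. `last_assignment` is a hypothetical helper name.

```python
import re


def last_assignment(script_text, name):
    """Return the value of the last 'NAME=value' line in a shell script.

    Control flow is ignored: this only looks at the text, line by line.
    Returns None if the variable is never assigned.
    """
    pattern = re.compile(r"^\s*%s=(\S+)" % re.escape(name))
    value = None
    for line in script_text.splitlines():
        match = pattern.match(line)
        if match:
            value = match.group(1)
    return value


# Example, assuming the script is readable on the machine running this:
#     text = open("/etc/init.d/postgresql").read()
#     print(last_assignment(text, "PGDATA"))
```

After the edit, the reported value for PGDATA should be /vol/pgsql/data.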
For other databases the procedure will differ. A similar procedure for MySQL is available here.
PostgreSQL can now be restarted:
[root ~]# service postgresql start
Starting postgresql service: [ OK ]
To verify that Django is using the same database as before you can revisit the admin GUI—the new user you added previously should still be available.
And there you have it!
Step 13: Shutting down
For completeness’ sake, let’s review the steps required to shut everything down.
First, stop the database server and unmount the EBS volume:
[root ~]# service postgresql stop
Stopping postgresql service: [ OK ]
[root ~]# umount /vol
Then return to Elasticfox, right-click the EBS volume and select ‘Detach this instance’. When you are done with this tutorial you can delete the volume as well, since keeping it in storage will cost you money.
Finally, go to the ‘AMIs and Instances’ tab and terminate the running instance. That should conclude your current transaction with AWS. (Refresh the volume and instances sections to verify that everything has really shut down).
Final words
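If you now repeat steps 6 to 11 you should be able to launch a brand new EC2 instance that uses the database on your stored volume; this is left as an exercise for the reader. The deviations from the procedure are that you shouldn’t have to run the PostgreSQL ‘initdb’ command or create the ‘djangotest’ database, and that in step 11 you attach and mount the existing volume instead of creating a new one. Do not run mkfs.ext3 on it again, as that would erase the stored database. You also still need the /etc/init.d/postgresql change from step 12 so that PostgreSQL looks for its data under /vol.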
This has been a bare-bones introduction to how EBS lets you run a persistent Django installation on AWS. In real life, the following issues have to be considered:
- Use a proper web server.
- Make sure the web server log files, database log, django logs etc. are moved to persistent storage as well.
- Create a custom AMI that is properly set up for your Django project (so that you don’t have to do the full setup procedure every time you launch an instance).
Then there’s scaling, backup, and so on. Nonetheless, hopefully this article should be enough to get you started.

31 Responses to “Persistent Django on Amazon EC2 and EBS - The easy way”
Sweet. I tried it out and this was very very clear. It seems though that unless one saves a modified AMI (can one do this?) to EBS, or installs everything in a separate /software partition mounted on EBS, one would have to reinstall django, etc. Is this true? If so the nonrelocatable nature of most debs or rpms would be problematic.
By Rahul Dave on Aug 21, 2008
Rahul, you can definitely save a modified AMI - the James Gardner article I’m mentioning describes the procedure. I think you can also do it directly from Elasticfox. (I had to leave out that part or the posting would have been way too long.)
By Thomas Brox Røst on Aug 21, 2008
Thanks, I went and read that…seems you create the AMI locally and upload to S3…looks simple enough. I was wondering however, if you could kinda directly save to EBS..hmm perhaps an AMI could have a boot+ramdisk image and the root image be got from a EBS partition?
By Rahul Dave on Aug 22, 2008
Thank you. Very well written article!
By Shabda on Aug 22, 2008
I’ve rewritten ElasticFox and named it SpandexFox - it supports saving your AMI back to S3 from the GUI, and has some INSTRUCTIONS. Plus it’s compatible with FF3 (don’t know if ElasticFox is yet). Maybe give it a try?
SpandexFox.com
Thanks
By Joshua McKenty on Aug 22, 2008
Rahul, I usually start with a fresh AMI, install and configure all the software I need to run my service, and then save it as a new AMI. After launching an instance of this new AMI I update it with the latest version of my code and the data I need. Before EBS I would get the data from S3 but EBS opens up for some more interesting solutions. (So far I have never run a full site on AWS - I have only used it for background processing).
By Thomas Brox Røst on Aug 22, 2008
wonderful article for us to get started with amazon!
thanks a lot
By sean on Aug 22, 2008
When creating the db for django, I get a
psql: FATAL: Ident authentication failed for user “postgres”
Is there a step that I’m missing?
By Wilkes Joiner on Aug 27, 2008
Wilkes, I missed a small but important step.
After having run the code that modifies the PostgreSQL configuration file you should do a “service postgresql restart”. This enables the new (less strict) PostgreSQL security policy.
I have updated the relevant part of the article - thanks for pointing it out!
By Thomas Brox Røst on Aug 27, 2008
Have you any experience snap-shotting the postgresql EBS volume to S3 and restoring postgresql from S3 to EBS ?
I am particularly interested in the process needed to snapshot the Postgresql volume ensuring the volume/data is in a consistent state - with a minimum outage of postgresql (or none at all)
By Paul on Sep 10, 2008
Paul, I do that on Eventseer and have had no problems so far.
After having shut down PostgreSQL and unmounted the volume, as in the last step above, I create a snapshot. This snapshot then contains all my database files.
When I need to use the database on an EC2 instance I just follow these steps:
1) Create a volume from the snapshot.
2) Attach the volume.
3) Mount the volume as /vol.
4) Stop PostgreSQL.
5) Change the PostgreSQL database location (as above).
6) Start PostgreSQL.
This gives me access to the database in the state it was in when the snapshot was made. As long as you remember to shut down PostgreSQL before doing anything with the data files you should be fine.
By Thomas Brox Røst on Sep 10, 2008
I’m using this post to get started on AWS and it’s helping a lot. One thing I got stuck on was using putty through elasticfox. I needed to use puttygen to create a putty-friendly key file out of the one given from elasticfox.
By Greg on Sep 22, 2008
Greg, thanks for pointing that out.
By Thomas Brox Røst on Sep 22, 2008
How can I find out how much it will cost to have django and postgresql on the EC2 instance? I want to use to host my django apps, but I’m uncertain if it will be cost effective. The app I want to deploy is pretty small in size with not much data.
Thanks!
By Frank on Sep 25, 2008
Frank, I guess that depends on the nature of your application. Running a small EC2 instance ($0.10 per hour) will set you back $72 a month. Then there’s the added costs of EBS storage and traffic, which are probably somewhat negligible for a small-volume site. If you need a more powerful EC2 instance then multiply the costs by four.
For comparison’s sake, an economy plan dedicated server from GoDaddy will at the time of writing cost you $80 per month if you go with their monthly payment plan. You can probably find cheaper options than that.
If you refer to the pricing plans on the AWS site you can probably make a rough estimate of whether this is a cost-effective solution for your needs.
By Thomas Brox Røst on Sep 25, 2008
Thanks Thomas!
This is a great tutorial, but as you mentioned, the aws seems to be a little more expensive than a dedicated server…
By sean on Oct 3, 2008
Sean, that is true. Personally I still use a dedicated server for the things that have to be done in real-time and where continuous presence is required. Everything else, such as data aggregation, log processing, maintenance and even testing, I now run exclusively on AWS.
These are the types of tasks I would previously run on stand-by dedicated servers. Since these servers would be idle most of the time it makes a lot more sense for me to only buy computing resources as required.
Also, EBS has made it both faster and easier to set up replicas of my Django production environment, which is really useful for many different tasks.
By Thomas Brox Røst on Oct 3, 2008
I am unable to complete the instructions as given. The following is invalid:
service postgresql initdb
Usage: /etc/init.d/postgresql {start|stop|status|restart|condrestart|condstop|reload|force-reload}
so, a bit later on:
service postgresql restart
Stopping postgresql service: [FAILED]
Initializing database: [FAILED]
Starting postgresql service: [FAILED]
By b2 on Nov 5, 2008
Also, python-psycopg2 doesn’t seem to exist - it’s not found by a yum search.
python-psycopg does exist.
By b2 on Nov 5, 2008
b2: Are you sure you launched an instance of the correct Fedora 8 AMI (ami-2b5fba42)? While logged in, try running
cat /etc/fedora-release
This should output
Fedora release 8 (Werewolf)
By Thomas Brox Røst on Nov 5, 2008
I swear the article referred to a different ami but of course I was using the wrong one - one marked “getting-started”.
So now I reached the point of starting the project and python fails:
[root@domU-12-31-39-00-C6-01 opt]# useradd djangotest
[root@domU-12-31-39-00-C6-01 opt]# su - djangotest
[djangotest@domU-12-31-39-00-C6-01 ~]$ django-admin.py startproject mysite
Traceback (most recent call last):
File "/usr/local/bin/django-admin.py", line 2, in <module>
from django.core import management
ImportError: No module named django.core
By b2 on Nov 5, 2008
b2: Looks like Python can’t find the Django installation. Try the following steps to verify if Django is available:
[djangotest@domU-12-31-39-00-55-E8 ~]$ python
Python 2.5.1 (r251:54863, Jul 10 2008, 17:24:48)
[GCC 4.1.2 20070925 (Red Hat 4.1.2-33)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import django
>>>
You will probably get an ImportError. If that is the case, try redoing the Django installation steps as root:
# cd /opt
# svn co http://code.djangoproject.com/svn/django/trunk/ django-trunk
# ln -s /opt/django-trunk/django /usr/lib/python2.5/site-packages/django
# ln -s /opt/django-trunk/django/bin/django-admin.py /usr/local/bin
It is especially important that you get the symbolic links right, or Python won’t know where Django is located. Also make sure that the svn checkout doesn’t give you any error messages.
Edit: There was actually a missing space in one of the statements. I have corrected it now.
By Thomas Brox Røst on Nov 5, 2008
Thanks for your patience. The development server now starts with 0 errors but there is no response to a browser request - network timeout. I am using the correct url.
Is it possible the pre-alpha 1.1 build is broken?
I’ll be trying to install django 1.0 to test sure but I am a linux/fedora noob.
By b2 on Nov 6, 2008
Looks like my fault again - I did not have 8000 in my security group - works now.
By b2 on Nov 6, 2008
b2: No worries - and thanks for helping me find that error.
By Thomas Brox Røst on Nov 6, 2008