Persistent Django on Amazon EC2 and EBS – The easy way

by Thomas Brox Røst on August 21, 2008
in AWS, Django, EBS, EC2, PostgreSQL

Now that Amazon’s Elastic Block Store (EBS) is publicly available, running a complete Django installation on Amazon Web Services (AWS) is easier than ever.

Why EBS? EBS provides persistent storage, which means that the Django database is kept safe even after the Django EC2 instances terminate.

This tutorial will take you through all the necessary steps for setting up Django with a persistent PostgreSQL database on AWS. I will be assuming no prior knowledge of AWS, so those of you who have dabbled with it before might want to skim through the first steps. Knowing your way around Django is an advantage but not a requirement.

I am deliberately keeping things simple—to get a deeper understanding of the hows and whys of AWS you should take a look at James Gardner’s excellent article as well as the official documentation.

The command line tools can be a bit intimidating so I will also show you how Elasticfox can be a fully satisfactory alternative.

Summary

We are going to register with AWS, get acquainted with Elasticfox, start up an EC2 instance, install Django and PostgreSQL on the instance, and finally mount an EBS drive and move our database to it.

Step 1: Set up an AWS account

To use AWS you need to register at the AWS web page. If you already have an account with Amazon you can extend this to also cover AWS.

Step 2: Download and install the Elasticfox Firefox extension

This tool will make life a whole lot easier for you. Further down the road there is no avoiding the official command line tools (or, alternatively, boto) if you want to access AWS programmatically. For now, let’s stick with Elasticfox.

You can install the extension from this page.

Step 3: Add your AWS credentials to Firefox

Launch Elasticfox (‘Tools’ -> ‘Elasticfox’) and click on the ‘credentials’ button. Enter your account name (typically the email address you registered with), AWS access key and AWS secret access key. This information can be found via the ‘Your web services account’ on the AWS start page.

Step 4: Create a new EC2 security group

Let’s pause for a while to consider what we are doing.

You will be running your Django installation off an EC2 instance. There is no magic to them at all—they are simply fully functional servers that you access the same way as, say, a dedicated server or a web hosting account.

By default, EC2 instances are an introverted lot: They prefer keeping to themselves and don’t expose any of their ports to the outside world. We will be running a web application on port 8000, so port 8000 has to be opened. (Normally we would open port 80, but since I will only be using the Django development web server, port 8000 is preferable.) SSH access is also essential, so port 22 should be opened as well.

To make this happen we must create a new security group where these ports are opened.

Click on the ‘Security Groups’ tab and then the ‘Refresh’ icon. The list should update to show you the ‘default’ group.

Then click the ‘Create Security Group’ icon and create a new group named ‘django’.

Now we need to add the actual permissions. Click the ‘Grant Permission’ icon and add ‘From port 8000 to 8000’ under ‘Protocol Details’. Repeat the same action for port 22.

Your security group is now ready for use.
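To make the default-deny behaviour concrete, here is a toy model of the ‘django’ group in Python. It is only a sketch of the rule logic, not the AWS API: the rule tuples, the DJANGO_GROUP name and the is_allowed helper are made up for illustration, and TCP is assumed as the protocol for both ports.

```python
from typing import List, Tuple

# Each rule is (protocol, from_port, to_port), mirroring the
# 'From port X to Y' fields in the Elasticfox dialog.
# Hypothetical representation, not an AWS data structure.
Rule = Tuple[str, int, int]

DJANGO_GROUP: List[Rule] = [
    ("tcp", 8000, 8000),  # Django development server
    ("tcp", 22, 22),      # SSH
]


def is_allowed(rules: List[Rule], protocol: str, port: int) -> bool:
    """Return True if some rule opens the port; anything else is denied."""
    return any(
        proto == protocol and low <= port <= high
        for proto, low, high in rules
    )


if __name__ == "__main__":
    for port in (22, 80, 8000):
        print(port, is_allowed(DJANGO_GROUP, "tcp", port))
```

With these two rules, port 80 stays closed, which is why a web server on the standard port would need a third rule.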

Step 5: Set up a key pair

Having a security group is not enough; we also have to set up a key pair to access the instance via SSH.

Why is this necessary? Think about it: You are launching a server instance but no one has told you the root password. So, setting up a private/public key pair is the only way to gain access.

Click on the ‘KeyPairs’ tab and then the ‘Create a new keypair’ icon. Name your new key pair ‘django-keypair’. A save dialog will pop up, allowing you to save the private key in a safe location. Use the filename ‘django.pem’.

Step 6: Launch an EC2 instance

I have a certain fondness for Fedora, so I’ll be using the fedora-8-i386-base-v1.07 AMI with AMI ID ami-2b5fba42.

Return to the ‘AMIs and Instances’ tab.

If you click the ‘Refresh’ icon in the ‘Machine Images’ section you will get a list of all public images. To find the one we’re after, enter ‘fedora-8’ in the search box; after a while all the relevant images will appear.

Right-click the image with the AMI ID as above and select ‘Launch instance(s) of this AMI’.

This is where the actions from the previous steps start making sense. Set the key pair to ‘django-keypair’ and add the ‘django’ security group to the launch set. Leave all the other settings as they are. Then click the ‘Launch’ button.

Important: From this point on the meter will be running! If the fire alarm goes off, you get bored with this tutorial, or whatever: Do remember to shut down the instance before you leave, otherwise it will cost you $2.40 per day.
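The daily figure is just the hourly rate multiplied out. A small sketch of the arithmetic, assuming the small-instance rate of $0.10 per hour that the $2.40 figure implies (prices change, so treat the constant as an example and check the current AWS pricing page); the helper names are made up:

```python
# Hourly price in cents, kept as an integer to avoid floating-point
# rounding. Assumption: $0.10 per hour for a small instance.
HOURLY_RATE_CENTS = 10


def running_cost_cents(hours: int, rate_cents: int = HOURLY_RATE_CENTS) -> int:
    """Cost in cents of keeping one instance running for `hours` hours."""
    return hours * rate_cents


def format_dollars(cents: int) -> str:
    """Format an integer number of cents as a dollar amount."""
    return "$%d.%02d" % divmod(cents, 100)


if __name__ == "__main__":
    print(format_dollars(running_cost_cents(24)))       # one day: $2.40
    print(format_dollars(running_cost_cents(24 * 30)))  # 30 days: $72.00
```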

The ‘Your Instances’ section should update, showing you that the instance you just launched is ‘pending’. Click the ‘Refresh’ icon after a while—in a minute or so the status should change to ‘running’.

Step 7: Connect with your new instance

Double click on the running instance and copy the ‘Public DNS Name’ entry. This is the domain name you use to access the instance from the outside. In this tutorial, my instance is hosted at ‘ec2-75-101-248-101.compute-1.amazonaws.com’.

Now we are going to SSH into the instance. I am doing this via Cygwin on Windows, but any SSH client should do. If you are on Windows and have Putty installed you can even launch directly from Elasticfox by right-clicking on the running instance and selecting ‘SSH to Public DNS Name’.

Let’s start with a basic sanity check:

$ ssh root@ec2-75-101-248-101.compute-1.amazonaws.com
The authenticity of host 'ec2-75-101-248-101.compute-1.amazonaws.com (75.101.248.101)' can't be established.
RSA key fingerprint is db:0a:85:36:99:5f:65:6b:c7:77:3e:37:59:fc:16:fd.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-75-101-248-101.compute-1.amazonaws.com,75.101.248.101' (RSA) to the list of known hosts.
Permission denied (publickey,gssapi-with-mic).

As expected, this isn’t working; we need to use the private key you saved earlier. Go to the directory where you saved the django.pem file and type the following:

$ ssh -i django.pem root@ec2-75-101-248-101.compute-1.amazonaws.com

         __|  __|_  )  Fedora 8
         _|  (     /    32-bit
        ___|\___|___|

 Welcome to an EC2 Public Image
                       : -)
    Base

[root@ ~]#

That’s better!

If you try pointing your browser towards ‘http://ec2-75-101-248-101.compute-1.amazonaws.com:8000/’ you should get a ‘can’t establish a connection’ error since there is no web server running on port 8000 as of yet.
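If you would rather check the port from a script than from the browser, something like the following sketch should do. It only uses Python’s standard socket module; the function name port_is_open and the three-second timeout are arbitrary choices of mine.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False


# Example call (replace the host name with your own Public DNS Name):
# port_is_open("ec2-75-101-248-101.compute-1.amazonaws.com", 8000)
```

At this stage the call should return False for port 8000 and True for port 22. A connection that is refused quickly usually means nothing is listening yet, while a long timeout is more typical of a port that is not opened in the security group.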

Step 8: Install required software

Most AMIs are stripped to the bone, so we have to add the software packages we need to get Django up and running. The steps required will of course vary from AMI to AMI, but running the following script as root is sufficient for our v1.07 Fedora 8 instance:

# Install subversion
yum -y install subversion

# Install, initialize and launch PostgreSQL
yum -y install postgresql postgresql-server
service postgresql initdb
service postgresql start

# Modify PostgreSQL config to avoid username/password problems
# Note: This grants access to _all_ local traffic!
cat > /var/lib/pgsql/data/pg_hba.conf <<EOM
local all all trust
host all all 127.0.0.1/32 trust
EOM

# Restart PostgreSQL to enable new security policy
service postgresql restart

# Set up a database for Django
psql -U postgres -c "create database djangotest encoding 'utf8'"

# Install Django (I always checkout from SVN)
cd /opt
svn co http://code.djangoproject.com/svn/django/trunk/ django-trunk
ln -s /opt/django-trunk/django /usr/lib/python2.5/site-packages/django
ln -s /opt/django-trunk/django/bin/django-admin.py /usr/local/bin

# Install psycopg2 (for database access from Python)
yum -y install python-psycopg2
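The two pg_hba.conf lines written by the script are worth a second look, since they decide who may connect without a password. The following Python sketch splits such lines into named fields and lists the ‘trust’ entries. It is a simplified reader of my own (no quoted values, no separate netmask column, no trailing options), not something PostgreSQL ships.

```python
from typing import List, NamedTuple, Optional


class HbaRule(NamedTuple):
    conn_type: str          # 'local' (Unix socket) or 'host' (TCP/IP)
    database: str
    user: str
    address: Optional[str]  # only present for 'host' rules
    method: str             # e.g. 'trust', 'ident', 'md5'


def parse_hba(text: str) -> List[HbaRule]:
    """Parse simple pg_hba.conf lines; comments and blank lines are skipped."""
    rules = []
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if not fields:
            continue
        if fields[0] == "local":
            conn_type, database, user, method = fields[:4]
            address = None
        else:
            conn_type, database, user, address, method = fields[:5]
        rules.append(HbaRule(conn_type, database, user, address, method))
    return rules


# The exact lines written by the install script above.
TUTORIAL_HBA = """\
local all all trust
host all all 127.0.0.1/32 trust
"""

if __name__ == "__main__":
    for rule in parse_hba(TUTORIAL_HBA):
        if rule.method == "trust":
            print("no password needed:", rule)
```

Both rules use ‘trust’, so any local user can connect as any database user. That is convenient for a tutorial and worth tightening for anything else.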

Step 9: Set up a Django project

First we set up an account for our test Django project:

[root ~]# useradd djangotest
[root ~]# su - djangotest
[djangotest ~]$

For the full story on how to create a new Django project you should have a look at the official tutorial. For now, just execute the following as the ‘djangotest’ user:

[djangotest ~]$ django-admin.py startproject mysite

Now we have all we need to test if the installation is working. Launch the development server like this:

[djangotest ~]$ python mysite/manage.py runserver ec2-75-101-248-101.compute-1.amazonaws.com:8000
Validating models...
0 errors found

Django version 1.0-beta_1-SVN-8461, using settings 'mysite.settings'
Development server is running at http://ec2-75-101-248-101.compute-1.amazonaws.com:8000/
Quit the server with CONTROL-C.

Note that I am using the full external domain name with the ‘runserver’ command.

Visit ‘http://ec2-75-101-248-101.compute-1.amazonaws.com:8000/’ with your browser and you should see the regular Django ‘It worked!’ page.

Note: Please don’t use the Django development server in a production setting. In fact, you probably shouldn’t use it on anything that is exposed to the outside world. The only reason I am doing it this way in this tutorial is to keep things simple—normally you should set up a proper web server such as Apache or Lighttpd. Refer to the Django documentation for information on how to do this.

Step 10: Create a Django application

I will show you how to put the Django database in persistent storage later on, so we have to set up a simple database-backed Django application.

Modify mysite/settings.py as follows:

DATABASE_ENGINE = 'postgresql_psycopg2'
DATABASE_NAME = 'djangotest'
DATABASE_USER = 'postgres'
DATABASE_PASSWORD = ''
...

INSTALLED_APPS = (
    'django.contrib.admin',
    'django.contrib.auth',
...
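A note for readers on newer Django versions: the DATABASE_* settings above are the style used at the time of writing. From Django 1.2 onwards the same information goes into a DATABASES dictionary, and the engine is given by its full module path. The helper below is a hypothetical sketch of the mapping, not part of Django:

```python
def to_databases_setting(engine, name, user, password="", host="", port=""):
    """Build a DATABASES dict from old-style DATABASE_* values.

    Hypothetical helper for illustration; in a real settings.py you
    would simply write the dictionary out by hand.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends." + engine,
            "NAME": name,
            "USER": user,
            "PASSWORD": password,
            "HOST": host,  # empty string means a local connection
            "PORT": port,  # empty string means the default port
        }
    }


# The values used in this tutorial.
DATABASES = to_databases_setting("postgresql_psycopg2", "djangotest", "postgres")
```

Recent Django releases spell the PostgreSQL engine ‘django.db.backends.postgresql’, so check the documentation for the version you are running.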

Then modify mysite/urls.py to allow access to the admin GUI:

from django.conf.urls.defaults import *

# Uncomment the next two lines to enable the admin:
from django.contrib import admin
admin.autodiscover()

urlpatterns = patterns('',
    # Example:
    # (r'^mysite/', include('mysite.foo.urls')),

    # Uncomment the next line to enable admin documentation:
    # (r'^admin/doc/', include('django.contrib.admindocs.urls')),

    # Uncomment the next line to enable the admin:
    (r'^admin/(.*)', admin.site.root),
)

Now we have to sync the database:

[djangotest ~]$ python mysite/manage.py syncdb

You will be asked to create an admin user—set both the username and the password to ‘djangotest’.

Then create a Django app:

[djangotest ~]$ python mysite/manage.py startapp myapp

If you got the preceding steps right, you should now be able to log on to the admin GUI at http://ec2-75-101-248-101.compute-1.amazonaws.com:8000/admin/ with the ‘djangotest’ user.

Add a new user to verify that the database connection works—we will be needing that new user later on.

Step 11: Create and mount an EBS volume

This is where things get really cool!

There is a huge problem with our current setup: Once you shut down the EC2 instance, all the data in our database will disappear. Enter EBS.

EBS lets you define a persistent storage volume that can be mounted by EC2 instances. If we move our database files to an EBS volume then they will persist no matter what happens to our EC2 instances.

First, go back to Elasticfox and make a note of the availability zone of your running instance—this should be something like ‘us-east-1b’.

Then click on the ‘Volumes and Snapshots’ tab. Click the ‘Create Volume’ icon and create a 1GB volume that belongs to the same availability zone as your instance.

Right-click the new volume and choose ‘Attach this volume’. This will let you attach the volume to the running instance. Use /dev/sdh as the device name. Refresh after a couple of seconds and the ‘Attachment status’ should have changed to ‘attached’.

Go back to your terminal and create an ext3 filesystem on the new volume:

[root ~]# mkfs.ext3 /dev/sdh
mke2fs 1.40.4 (31-Dec-2007)
/dev/sdh is entire device, not just one partition!
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
131072 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376

Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 35 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
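If you want to convince yourself that the numbers in the mkfs output describe the 1GB volume we just created, the arithmetic is quick to check. A short sketch, using only figures copied from the output above (the variable names are mine):

```python
# Figures copied from the mkfs.ext3 output above.
BLOCK_SIZE = 4096          # "Block size=4096"
BLOCKS = 262144            # "262144 blocks"
BLOCKS_PER_GROUP = 32768   # "32768 blocks per group"
INODES_PER_GROUP = 16384   # "16384 inodes per group"
RESERVED_PERCENT = 5       # "(5.00%) reserved for the super user"

size_bytes = BLOCK_SIZE * BLOCKS                # total size of the filesystem
block_groups = BLOCKS // BLOCKS_PER_GROUP       # "8 block groups"
inodes = block_groups * INODES_PER_GROUP        # "131072 inodes"
reserved = BLOCKS * RESERVED_PERCENT // 100     # "13107 blocks"

if __name__ == "__main__":
    print(size_bytes, size_bytes == 2 ** 30)    # exactly 1 GiB
    print(block_groups, inodes, reserved)
```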

All that remains is to mount the filesystem, in this case to /vol:

[root ~]# echo "/dev/sdh /vol ext3 noatime 0 0" >> /etc/fstab
[root ~]# mkdir /vol
[root ~]# mount /vol
[root ~]# df --si
Filesystem             Size   Used  Avail Use% Mounted on
/dev/sda1               11G   1.4G   8.8G  14% /
/dev/sda2              158G   197M   150G   1% /mnt
none                   895M      0   895M   0% /dev/shm
/dev/sdh               1.1G    35M   969M   4% /vol
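The line we appended to /etc/fstab packs six fields into one row. As a reading aid, this Python sketch splits it into named parts; the FstabEntry type and parse_fstab_line function are illustrative names of my own, and the parser assumes a plain whitespace-separated line without comments.

```python
from typing import NamedTuple


class FstabEntry(NamedTuple):
    device: str       # what to mount: the attached EBS device
    mount_point: str  # where to mount it
    fs_type: str      # filesystem type created by mkfs
    options: str      # mount options; noatime skips access-time updates
    dump: int         # 0 = not backed up by dump
    fsck_pass: int    # 0 = not checked by fsck at boot


def parse_fstab_line(line: str) -> FstabEntry:
    """Split one six-field fstab line into named fields."""
    device, mount_point, fs_type, options, dump, fsck_pass = line.split()
    return FstabEntry(device, mount_point, fs_type, options,
                      int(dump), int(fsck_pass))


if __name__ == "__main__":
    print(parse_fstab_line("/dev/sdh /vol ext3 noatime 0 0"))
```

Because the mount point is recorded in fstab, the later ‘mount /vol’ command needs no further arguments.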

Step 12: Moving the database to persistent storage

First make sure that PostgreSQL is stopped:

[root ~]# service postgresql stop
Stopping postgresql service:                               [  OK  ]

You should also terminate your Django development server in case it is still running.

Now move the PostgreSQL database files to the EBS volume mounted at /vol:

[root ~]# mv /var/lib/pgsql /vol

For this to work we have to make a small modification to the /etc/init.d/postgresql file—make sure that the lines starting at around line 100 look exactly like this:

...
# Set defaults for configuration variables
PGENGINE=/usr/bin
PGPORT=5432
PGDATA=/var/lib/pgsql
if [ -f "$PGDATA/PG_VERSION" ] && [ -d "$PGDATA/base/template1" ]
then
        echo "Using old-style directory structure"
else
        PGDATA=/var/lib/pgsql/data
fi
PGDATA=/vol/pgsql/data
PGLOG=/vol/pgsql/pgstartup.log
...

Note that this is a Fedora-specific hack—the main idea is to have the $PGDATA system variable point at /vol/pgsql/data.
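Before restarting PostgreSQL it can be reassuring to check that the new location really contains a database cluster. The sketch below looks for the PG_VERSION file that the init script itself tests for, plus the ‘base’ directory. The function name is my own and the check is deliberately shallow: it says nothing about whether the data is intact.

```python
import os


def looks_like_pgdata(path: str) -> bool:
    """Rough check that `path` holds a PostgreSQL data directory."""
    return (
        os.path.isfile(os.path.join(path, "PG_VERSION"))
        and os.path.isdir(os.path.join(path, "base"))
    )


if __name__ == "__main__":
    # The location used in this tutorial after the move to the EBS volume.
    print(looks_like_pgdata("/vol/pgsql/data"))
```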

For other databases the procedure will differ. A similar procedure for MySQL is available here.

PostgreSQL can now be restarted:

[root ~]# service postgresql start
Starting postgresql service:                               [  OK  ]

To verify that Django is using the same database as before you can revisit the admin GUI—the new user you added previously should still be available.

And there you have it!

Step 13: Shutting down

For completeness’ sake, let’s review the steps required to shut everything down.

First, stop the database server and unmount the EBS volume:

[root ~]# service postgresql stop
Stopping postgresql service:                               [  OK  ]
[root ~]# umount /vol

Then return to Elasticfox, right-click the EBS volume and select ‘Detach this volume’. When you are done with this tutorial you can delete the volume as well, since keeping it in storage will cost you money.

Finally, go to the ‘AMIs and Instances’ tab and terminate the running instance. That should conclude your current transaction with AWS. (Refresh the volume and instances sections to verify that everything has really shut down).

Final words

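If you now repeat steps 6 to 11 you should be able to launch a brand new EC2 instance that uses the database on your stored volume; this is left as an exercise for the reader. The deviations from the procedure are that you shouldn’t have to run the PostgreSQL ‘initdb’ command or create the ‘djangotest’ database, that in step 11 you should attach and mount the existing volume rather than create and format a new one (running mkfs.ext3 on it again would erase the database), and that you still need the PGDATA change from step 12 so that PostgreSQL reads its data from /vol.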

This has been a bare-bones introduction to how EBS lets you run a persistent Django installation on AWS. In real life, the following issues have to be considered:

  • Use a proper web server.
  • Make sure the web server log files, database log, Django logs, etc. are moved to persistent storage as well.
  • Create a custom AMI that is properly set up for your Django project (so that you don’t have to do the full setup procedure every time you launch an instance).

Then there’s scaling, backup, and so on. Nonetheless, hopefully this article should be enough to get you started.

Addendum

A reader pointed out that the PostgreSQL user home directory should also be changed. While I haven’t tried this myself, the correct procedure is probably to do a usermod -d /vol/pgsql postgres as root.

Comments

67 Responses to “Persistent Django on Amazon EC2 and EBS – The easy way”
  1. Rahul Dave says:

    Sweet. I tried it out and this was very very clear. It seems though that unless one saves a modified AMI (can one do this?) to EBS, or installs everything in a separate /software partition mounted on EBS, one would have to reinstall django, etc. Is this true? If so the nonrelocatable nature of most debs or rpms would be problematic.

  2. Thomas Brox Røst says:

    Rahul, you can definitely save a modified AMI – the James Gardner article I’m mentioning describes the procedure. I think you can also do it directly from Elasticfox. (I had to leave out that part or the posting would have been way too long.)

  3. Rahul Dave says:

    Thanks, I went and read that…seems you create the AMI locally and upload to S3…looks simple enough. I was wondering however, if you could kinda directly save to EBS..hmm perhaps an AMI could have a boot+ramdisk image and the root image be got from a EBS partition?

  4. Shabda says:

    Thank you. Very well written article!

  5. I’ve rewritten ElasticFox and named it SpandexFox – it supports saving your AMI back to S3 from the GUI, and has some INSTRUCTIONS. Plus it’s compatible with FF3 (don’t know if ElasticFox is yet). Maybe give it a try?

    SpandexFox.com

    Thanks

  6. Thomas Brox Røst says:

    Rahul, I usually start with a fresh AMI, install and configure all the software I need to run my service, and then save it as a new AMI. After launching an instance of this new AMI I update it with the latest version of my code and the data I need. Before EBS I would get the data from S3 but EBS opens up for some more interesting solutions. (So far I have never run a full site on AWS – I have only used it for background processing).

  7. sean says:

    wonderful article for us to get started with amazon!
    thanks a lot

  8. Wilkes Joiner says:

    When creating the db for django, I get a

    psql: FATAL: Ident authentication failed for user "postgres"

    Is there a step that I’m missing?

  9. Thomas Brox Røst says:

    Wilkes, I missed a small but important step.

    After having run the code that modifies the PostgreSQL configuration file you should do a “service postgresql restart”. This enables the new (less strict) PostgreSQL security policy.

    I have updated the relevant part of the article – thanks for pointing it out!

  10. Paul says:

    Have you any experience snap-shotting the postgresql EBS volume to S3 and restoring postgresql from S3 to EBS ?
    I am particularly interested in the process needed to snapshot the Postgresql volume ensuring the volume/data is in a consistent state – with a minimum outage of postgresql (or none at all)

  11. Thomas Brox Røst says:

    Paul, I do that on Eventseer and have had no problems so far.

    After having shut down PostgreSQL and unmounted the volume, as in the last step above, I create a snapshot. This snapshot then contains all my database files.

    When I need to use the database on an EC2 instance I just follow these steps:

    1) Create a volume from the snapshot.

    2) Attach the volume.

    3) Mount the volume as /vol.

    4) Stop PostgreSQL.

    5) Change the PostgreSQL database location (as above).

    6) Start PostgreSQL.

    This gives me access to the database in the state it was in when the snapshot was made. As long as you remember to shut down PostgreSQL before doing anything with the data files you should be fine.

  12. Greg says:

    I’m using this post to get started on AWS and it’s helping a lot. One thing I got stuck on was using putty through elasticfox. I needed to use puttygen to create a putty-friendly key file out of the one given from elasticfox.

  13. Thomas Brox Røst says:

    Greg, thanks for pointing that out.

  14. Frank says:

    How can I find out how much it will cost to have django and postgresql on the EC2 instance? I want to use to host my django apps, but I’m uncertain if it will be cost effective. The app I want to deploy is pretty small in size with not much data.

    Thanks!

  15. Thomas Brox Røst says:

    Frank, I guess that depends on the nature of your application. Running a small EC2 instance ($0.10 per hour) will set you back $72 a month. Then there’s the added costs of EBS storage and traffic, which are probably somewhat negligible for a small-volume site. If you need a more powerful EC2 instance then multiply the costs by four.

    For comparison’s sake, an economy plan dedicated server from GoDaddy will at the time of writing cost you $80 per month if you go with their monthly payment plan. You can probably find cheaper options than that.

    If you refer to the pricing plans on the AWS site you can probably make a rough estimate of whether this is a cost-effective solution for your needs.

  16. sean says:

    Thanks Thomas!
    This is a great tutorial, but as you mentioned, the aws seems to be a little more expensive than a dedicated server…

  17. Thomas Brox Røst says:

    Sean, that is true. Personally I still use a dedicated server for the things that have to be done in real-time and where continuous presence is required. Everything else, such as data aggregation, log processing, maintenance and even testing, I now run exclusively on AWS.

    These are the types of tasks I would previously run on stand-by dedicated servers. Since these servers would be idle most of the time it makes a lot more sense for me to only buy computing resources as required.

    Also, EBS has made it both faster and easier to set up replicas of my Django production environment, which is really useful for many different tasks.

  18. b2 says:

    I am unable to complete the instructions as given. The following is invalid:

    service postgresql initdb
    Usage: /etc/init.d/postgresql {start|stop|status|restart|condrestart|condstop|reload|force-reload}

    so, a bit later on:
    service postgresql restart
    Stopping postgresql service: [FAILED]
    Initializing database: [FAILED]
    Starting postgresql service: [FAILED]

  19. b2 says:

    Also, python-psycopg2 doesn’t seem to exist – it’s not found by a yum search.

    python-psycopg does exist.

  20. Thomas Brox Røst says:

    b2: Are you sure you launched an instance of the correct Fedora 8 AMI (ami-2b5fba42)? While logged in, try running

    cat /etc/fedora-release

    This should output

    Fedora release 8 (Werewolf)

  21. b2 says:

    I swear the article referred to a different ami but of course I was using the wrong one – one marked “getting-started”.

    So now I reached the point of starting the project and python fails:

    [root@domU-12-31-39-00-C6-01 opt]# useradd djangotest
    [root@domU-12-31-39-00-C6-01 opt]# su - djangotest
    [djangotest@domU-12-31-39-00-C6-01 ~]$ django-admin.py startproject mysite
    Traceback (most recent call last):
    File "/usr/local/bin/django-admin.py", line 2, in <module>
    from django.core import management
    ImportError: No module named django.core

  22. Thomas Brox Røst says:

    b2: Looks like Python can’t find the Django installation. Try the following steps to verify if Django is available:

    [djangotest@domU-12-31-39-00-55-E8 ~]$ python
    Python 2.5.1 (r251:54863, Jul 10 2008, 17:24:48)
    [GCC 4.1.2 20070925 (Red Hat 4.1.2-33)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import django
    >>>

    You will probably get an ImportError. If that is the case, try redoing the Django installation steps as root:

    # cd /opt
    # svn co http://code.djangoproject.com/svn/django/trunk/ django-trunk
    # ln -s /opt/django-trunk/django /usr/lib/python2.5/site-packages/django
    # ln -s /opt/django-trunk/django/bin/django-admin.py /usr/local/bin

    It is especially important that you get the symbolic links right, or Python won’t know where Django is located. Also make sure that the svn checkout doesn’t give you any error messages.

    Edit: There was actually a missing space in one of the statements. I have corrected it now.

  23. b2 says:

    Thanks for your patience. The development server now starts with 0 errors but there is no response to a browser request – network timeout. I am using the correct url.

    Is it possible the pre-alpha 1.1 build is broken?
    I’ll be trying to install django 1.0 to test sure but I am a linux/fedora noob.

  24. b2 says:

    Looks like my fault again – I did not have 8000 in my security group – works now.

  25. Thomas Brox Røst says:

    b2: No worries – and thanks for helping me find that error. :)

  26. Andrey says:

    Thanks for a great tutorial. There is one case it doesn’t cover.
    Django recommends using a separate server for media files. This is a good opportunity to use S3, but S3 is slow for a lot of small files. Are there any recommendations?

  27. Thomas Brox Røst says:

    Andrey, have you had a look at Amazon’s new content delivery service, CloudFront? It uses S3 on the back end while distributing all your files to edge location servers. I’m using it on EventSeer and it works great so far.

  28. Matt Tucker says:

    In an earlier comment, Paul had asked about doing EBS snapshots of Postgres. The answer of shutting down Postgres in order to take the snapshot makes sense, but I’m looking to find a way to do live snapshots of a running database.

    So far I’m looking at doing the following:

    * Run snapshot command on Postgres to flush as much as possible to disk
    * Pause file system
    * Issue EBS snapshot command
    * Un-pause file system

    From what I can tell that should do the trick, but I’m interested to hear if anyone has practical experience with this or another approach.

  29. Matt Tucker: On Postgresql, you should configure wal-archiving and PITR snapshots. I think it requires 8.3, but it is /so/ worth it.

    See http://www.postgresql.org/docs/8.3/interactive/continuous-archiving.html for an explanation on how to setup it (but read the whole thing, there are things that will affect your ability to replay data!)

  30. Jack Briner says:

    You have decided to reinstall your software rather than put it in a repackaged ami on each instance creation.

    After trying to package my own ami rather miserably, I can see why one might go this route.

    However, I was trying to decide if I should install the software in ebs with my data. If I customize some of the software, I will have real headaches keeping everything properly synced unless I do some kind of remote version control.

    There would be a performance penalty with both the data and the software running over the network. However, it seems the easiest.

    Another option would be to cache the software (and data?) to the ec2 local disk from ebs and return it later to the ebs after I was done with the ami.

    Thanks in advance for your thoughts,
    Jack

  31. Thomas Brox Røst says:

    Jack, what I usually do is to make a custom AMI with all the software that shouldn’t change very often (e.g. web servers, web application frameworks, Python libraries) and then download the latest stable version of my source code for each instance I start up. So far I have only used EBS for data.

    I don’t think there’s a right way of doing this, but my principle so far has been to keep the things that change infrequently (i.e. the combination of libraries and tools where I know the version releases work well together and that I don’t want to mess with unless I really have to…) and the things that change frequently (i.e. my database) separate.

    A good strategy would probably be to start with the simplest possible solution that does the job, even if it is suboptimal. For me, just getting into the mindset of doing distributed computing took a while. You will also find that there a lot of practical issues, such as timing the distributed jobs, merging results into your production environment, dealing with errors, and so on. (Looking into Hadoop might be a good idea). Just keep it simple and then optimize once you have a stable and fault tolerant infrastructure.

  32. mark says:

    Hi,
    I have 2 instances running Fedora and SUSE respectively, but I am not able to ssh between the 2 instances.
    On both the servers when i run nmap it shows port 80,443,22 open.
    Can anybody give me tips please

  33. Thomas Brox Røst says:

    Mark, if you can ssh from your computer into each launched instance, using a private key as described in the article, then the ports should not be a problem. Do you by any chance use different security groups/key pairs for each instance? If so, have you checked that you use the correct private key on the ‘from’ instance for connecting to the ‘to’ instance?

  34. marc says:

    There is one step which i am still not getting.
    With a simple webapp, i’ll have some log, the webapp itself (maybe rails, maybe php), and my database

    so i’d probably setup maybe 2 ebs volumes (webapp+logs and database).
    then i’d copy my database and webapp to the volumes, create symlinks (or change the config), tweak the config of my instance, upload my ssl keys for the webserver and so on….

    but… when i stop/restart my instance i have persistent logs, webapp and database… but still my whole config is gone, my ssl keys are gone… shouldn’t there be a way to store the main volume itself on ebs or am i missing something?

    looking forward for clarification! :)

  35. Thomas Brox Røst says:

    Marc, you need to create your own AMI bundle that is preloaded with all the configuration files, software, keys etc. that you need for your webapp – I deliberately omitted that part from this tutorial. You then launch instances using your custom AMI rather than the stock AMI. Have a look at James Gardner’s article (http://jimmyg.org/2007/09/01/amazon-ec2-for-people-who-prefer-debian-and-python-over-fedora-and-java/) to learn how to do this.

  36. Tom says:

    OK…I hope this is not a silly question. I am trying to use EC2 for a demo environment for numerous non-web applications. I created a volume and placed all the necessary files on there and mapped everything to a specific drive. I shut the instance down and restarted it later and attached my volume and it came up as another drive. Is there a way to ensure that the volume comes up as the same drive or do I just have to use UNC mappings. Thanks in advance.

  37. Umair says:

    Thanks for the tutorial.
    One thing though, you should change the home directory for the user “postgres” to the new location, i.e. /vol/pgsql

  38. Thomas Brox Røst says:

    Umair, I have updated the tutorial. Thanks!

  39. Thank you, this was very very helpful!!
    Uri

  40. Thomas Brox Røst says:

    Ritesh, thanks!

  41. “/dev/sda2 158G 197M 150G 1% /mnt”

    Why not store the PostgreSQL database on the “/dev/sda2” disk, in the “/mnt” directory structure? If you never “terminate” the instance, the data will survive.

    PS: I am newbie in the AWS’s field.

  42. Thomas Brox Røst says:

    Ariel, the /dev/sda2 partition is not persistent. If the instance goes down for whatever reason the data will be lost.

    This posting is getting a bit old; nowadays you can have EBS-backed instances that can be stopped and restarted and where the data on the root device will be kept intact. See http://serverfault.com/questions/158647/ebs-volum-on-ec2-with-ubuntu-image-from-alestic for a good explanation.

    I still prefer having my non-ephemeral data on dedicated EBS volumes that are mounted onto the instance. This makes it easy to set up a snapshot scheme for backup purposes and to replicate and move data around if required. It also tends to simplify recovery from catastrophic failure.

  43. Dear Thomas, thank you so much!!!

  44. Thien Nguyen says:

    Hi, I followed your instructions and it worked fine up to step 9. The development server was launched with my public DNS on port 8000. However, when I enter the URL in a browser, the server just doesn’t respond and the connection times out. Do you know what’s wrong with it?
    Thanks!

  45. Thomas Brox Røst says:

    Hi Thien, make sure that port 8000 in the EC2 security group is opened, as described in step 4. Otherwise your server will be blocked by the firewall, which might explain the timeout.

  46. Thien Nguyen says:

    Awesome, it works. Thanks, Thomas!

  47. David Phillips says:

    Thomas,

    Great tutorial! Thank you for cutting hours off my install! I am a noob to Django and Python, but the app I have created is using MySQL. Would it be difficult to install MySQL instead of PostgreSQL? Would it simply be a matter of installing MySQL instead of PostgreSQL (and then formatting the appropriate Django files)?

    Again, thank you for all your hard work on this!

    dp

  48. Thomas Brox Røst says:

    Thanks, David. It shouldn’t be too difficult to substitute PostgreSQL with MySQL. The Running MySQL on Amazon EC2 with EBS article on the AWS site might help you out. As long as you have a working MySQL installation and tell Django to use it you should be all set.
