AWS – Thomas Brox Røst

On Boto and Chef Community Cookbooks

July 5, 2017July 5, 2017 Thomas Brox Røst Leave a comment

Today I learned:

It is time to let the old Boto retire; it would not let me access the attached managed policies of an IAM role. Boto3 supports this.
Chef community cookbooks that are bundled with a project have a tendency to not work if left alone for a while—and upgrading them can get messy due to the way dependencies are handled. Stick to the cookbooks with no or simple dependency graphs or just roll your own solution.

AWS cuts data transfer rates: Pricing comparison update

February 2, 2010 Thomas Brox Røst Leave a comment

AWS just cut their outbound data transfer rates from $0.17 to $0.15 per GB/month up until 10TB. I have updated my previous comparison between Go Daddy and AWS with the latest numbers.

Updated AWS/Go Daddy dedicated server cost comparison

August 28, 2009February 2, 2010 Thomas Brox Røst 3 Comments

UPDATE 1: Corrected the bandwidth calculation in the formulas for AWS.

UPDATE 2: Added new data for the February 2010 AWS data transfer price reduction.

In a previous posting I did a cost comparison of a reserved Amazon Web Services EC2 instance and a comparable dedicated server from Go Daddy. Amazon recently announced a set of price cuts for reserved instances, so an updated comparison is in order.

The server configurations I’m comparing are the same as last time:

	Go Daddy	AWS	AWS (new)
Processor	Core 2 Duo 2.66 GHz	4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each) [1 unit equals a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor]	4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each) [1 unit equals a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor]
Hard Drive(s)	Dual 300GB drives	850GB of instance storage	850GB of instance storage
Memory	3.2GB	7.5GB	7.5GB
1-year plan, w/o bandwidth	$2,483.46	$2,351.20	$1,962.40
3-year plan, w/o bandwidth	$6,622.56	$5,153.60	$4,557.20

I have added an extra column for the new EC2 reserved instance pricing scheme. Notably, the prices for the Go Daddy options haven’t changed in the last six months. For the 1- and 3-year AWS plans, the total costs have dropped by $390 (17%) and $600 (12%), bandwidth excluded.

(For the full discussion, refer to the previous posting.)

When including bandwidth use, the updated table looks as follows:

	1GB/mth	20GB/mth	100GB/mth	400GB/mth	800GB/mth
Go Daddy, 1-year plan	$2,483.46	$2,483.46	$2,483.46	$2,483.46	$2,723.34
AWS, 1-year plan	$2,353.24	$2,392.00	$2,555.20	$3,167.20	$3,983.20
AWS, 1-year plan (Aug 2009)	$1,963.24	$2,002.00	$2,165.20	$2,777.20	$3,593.20
AWS, 1-year plan (Feb 2010)	$1,963.00	$1,997.20	$2,141.20	$2,681.20	$3,401.20

Go Daddy, 3-year plan	$6,622.56	$6,622.56	$6,622.56	$6,622.56	$7,342.20
AWS, 3-year plan	$5,159.72	$5,276.00	$5,765.60	$7,601.60	$10,049.60
AWS, 3-year plan (Aug 2009)	$4,559.72	$4,676.00	$5,165.60	$7,001.60	$9,449.60
AWS, 3-year plan (Feb 2010)	$4,559.00	$4,661.60	$5,093.60	$6,713.60	$8,873.60

Updated comparison between AWS and Go Daddy pricing plans

With the old pricing, the AWS option was preferable unless bandwidth exceeded 100GB per month (for the 1-year plan) or 250GB per month (for the 3-year plan). After the August 2009 price cuts, AWS became an even more competitive option, although one that still falls behind for high bandwidth scenarios.

In February 2010, the price for outgoing data traffic dropped from $0.17 to $0.15. With the 3-year plan, AWS now matches Go Daddy up until almost 400GB per month.

Addendum: Some of the background data used in this posting:

Go Daddy quotes from August 28, 2009
Go Daddy 3-year plan cost: 2-year plan quote * 1.5
AWS 1-year plan cost (< Aug 2009): $1,300 + (24 * 365 * 1 * $0.12) + (GB/mth * $0.17 * 12)
AWS 1-year plan cost (Aug 2009): $910 + (24 * 365 * 1 * $0.12) + (GB/mth * $0.17 * 12)
AWS 1-year plan cost (Feb 2010): $910 + (24 * 365 * 1 * $0.12) + (GB/mth * $0.15 * 12)
AWS 3-year plan cost (< Aug 2009): $2,000 + (24 * 365 * 3 * $0.12) + (GB/mth * $0.17 * 36)
AWS 3-year plan cost (Aug 2009): $1,400 + (24 * 365 * 3 * $0.12) + (GB/mth * $0.17 * 36)
AWS 3-year plan cost (Feb 2010): $1,400 + (24 * 365 * 3 * $0.12) + (GB/mth * $0.15 * 36)
EC2 pricing information
Go Daddy dedicated server pricing information

Why Amazon Web Services just became a competitive web hosting provider

March 12, 2009February 2, 2010 Thomas Brox Røst 5 Comments

UPDATE 1: There is now an updated version of this posting. The new version incorporates the August 2009 AWS reserved instance pricing changes.

UPDATE 2: Corrected the bandwidth calculation in the formulas for AWS.

Amazon Web Services just announced a new reserved instances pricing plan. In short, this plan allows you to reserve EC2 instances for a 1 to 3 year period by paying a one-time reservation fee. The hourly rate for reserved instances is considerably lower than for regular spot-market instances. For comparisons sake, a large standard on-demand instance will set you back $0.40 per hour, while the large standard reserved instance is only $0.12 per hour.

With the old pricing scheme, hosting a web service on AWS instead of on a dedicated server was not a very cost-competitive option, at least not for resource-intensive applications. For my web site, Eventseer, I require at least a large standard instance—at $0.40 per hour for 24/7 operation (bandwidth costs not included), this turned out way too expensive compared with offerings from traditional dedicated server providers.

To see if the new pricing scheme fares any better, I have compared the cost of an AWS EC2 reserved large instance with a similar dedicated server from Go Daddy:

	Go Daddy	AWS
Processor	Core 2 Duo 2.66 GHz	4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each) [1 unit equals a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor]
Hard Drive(s)	Dual 300GB drives	850GB of instance storage
Memory	3.2GB	7.5GB
1-year plan, w/o bandwidth	$2,483.46	$2,351.20
3-year plan, w/o bandwidth	$6,622.56	$5,153.60

I’m not sure how the 4 EC2 compute units compete with a dedicated Core 2 Duo 2.66 GHz. This probably also depends on the nature of your application. Note that the AWS solution has twice the amount of memory. From what I can see, you can not get a Go Daddy dedicated server with more than 3.2GB of memory, while AWS offers up to 15GB on the extra large instances.

When disregarding bandwidth costs, AWS suddenly makes a lot of sense. As bandwidth use is highly application-dependent, let’s consider a few different bandwidth use scenarios:

	1GB/mth	20GB/mth	100GB/mth	400GB/mth	800GB/mth
Go Daddy, 1-year plan	$2,483.46	$2,483.46	$2,483.46	$2,483.46	$2,723.34
AWS, 1-year plan	$2,353.24	$2,392.00	$2,555.20	$3,167.20	$3,983.20

Go Daddy, 3-year plan	$6,622.56	$6,622.56	$6,622.56	$6,622.56	$7,342.20
AWS, 3-year plan	$5,159.72	$5,276.00	$5,765.60	$7,601.60	$10,049.60

Comparison between AWS and Go Daddy pricing plans

Conclusion: With a 1-year plan, AWS is the cheapest option until you reach about 100GB of external bandwidth per month. With the 3-year plan, the AWS bandwidth cost isn’t a problem until about 250GB per month. (Bandwidth is “free” with Go Daddy dedicated servers up until 500GB per month; after that it’s an extra $19.99 per month until you reach 1,000GB).

Considering that the AWS solution gets you twice the amount of RAM, AWS suddenly seems a very viable option even for web service hosting—as long as you’re not expecting extreme
amounts of traffic. However, once you get popular the outgoing data transfer pricing will take its toll.

Addendum: Some of the background data used in this posting:

Go Daddy quotes from March 12, 2009
Formula for Go Daddy 3-year plan cost: 2-year plan quote * 1.5
Formula for AWS 1-year plan cost: $1,300 + (24 * 365 * 1 * $0.12) + (GB/mth * $0.17 * 12)
Formula for AWS 3-year plan cost: $2,000 + (24 * 365 * 3 * $0.12) + (GB/mth * $0.17 * 36)
EC2 pricing information
Go Daddy dedicated server pricing information

Persistent Django on Amazon EC2 and EBS – The easy way

August 21, 2008February 13, 2010 Thomas Brox Røst 69 Comments

Now that Amazon’s Elastic Block Store (EBS) is publicly available, running a complete Django installation on Amazon Web Services (AWS) is easier than ever.

Why EBS? EBS provides persistent storage, which means that the Django database is kept safe even after the Django EC2 instances terminate.

This tutorial will take you through all the necessary steps for setting up Django with a persistent PostgreSQL database on AWS. I will be assuming no prior knowledge of AWS, so those of you who have dabbled with it before might want to skim through the first steps. Knowing your way around Django is an advantage but not a requirement.

I am deliberately keeping things simple—to get a deeper understanding of the hows and whys of AWS you should take a look at James Gardner’s excellent article as well as the official documentation.

The command line tools can be a bit intimidating so I will also show you how Elasticfox can be a fully satisfactory alternative.

Summary

We are going to register with AWS, get acquainted with Elasticfox, start up an EC2 instance, install Django and PostgreSQL on the instance, and finally mount an EBS drive and move our database to it.

Step 1: Set up an AWS account

To use AWS you need to register at the AWS web page. If you already have an account with Amazon you can extend this to also cover AWS.

Step 2: Download and install the Elasticfox Firefox extension

This tool will make life a whole lot easier for you. Down the road there is no avoiding the official command line tools or alternatively boto if you want to access AWS programmatically. For now, let’s stick with Elasticfox.

You can install the extension from this page.

Step 3: Add your AWS credentials to Firefox

Launch Elasticfox (‘Tools’ -> ‘Elasticfox’) and click on the ‘credentials’ button. Enter your account name (typically the email address you registered with), AWS access key and AWS secret access key. This information can be found via the ‘Your web services account’ on the AWS start page.

Step 4: Create a new EC2 security group

Let’s pause for a while to consider what we are doing.

You will be running your Django installation off an EC2 instance. There is no magic to them at all—they are simply fully functional servers that you access the same way as, say, a dedicated server or a web hosting account.

By default, EC2 instances are an introverted lot: They prefer keeping to themselves and don’t expose any of their ports to the outside world. We will be running a web application on port 8000 so therefore port 8000 has to be opened. (Normally we would be opening port 80, but since I will only be using the Django development web server then port 8000 is preferable). SSH access is also essential, so port 22 should be opened as well.

To make this happen we must create a new security group where these ports are opened.

Click on the ‘Security Groups’ tab and then the ‘Refresh’ icon. The list should update to show you the ‘default’ group.

Then click the ‘Create Security Group’ icon and create a new group named ‘django’.

Now we need to add the actual permissions. Click the ‘Grant Permission’ icon and add ‘From port 8000 to 8000’ under ‘Protocol Details’. Repeat the same action for port 22.

Your security group is now ready for use.

Step 5: Set up a key pair

Having a security group is not enough; we also have to set up a key pair to access the instance via SSH.

Why is this necessary? Think about it: You are launching a server instance but no one has told you the root password. So, setting up a private/public key pair is the only way to gain access.

Click on the ‘KeyPairs’ tab and then the ‘Create a new keypair’ icon. Name your new key pair ‘django-keypair’. A save dialog will pop up, allowing you to save the private key in a safe location. Use the filename ‘django.pem’.

Step 6: Launch an EC2 instance

I have a certain fondness for Fedora, so I’ll be using the fedora-8-i386-base-v1.07 AMI with AMI ID ami-2b5fba42.

Return to the ‘AMIs and Instances’ tab.

If you click the ‘Refresh’ icon in the ‘Machine Images’ section you will get a list of all public images. To find the one we’re after, enter ‘fedora-8’ in the search box—after a while all the relevant images will appear.

Right-click the image with the AMI ID as above and select ‘Launch instance(s) of this AMI’.

This is where the actions from the previous steps start making sense. Set the key pair to ‘django-keypair’ and add the ‘django’ security group to the launch set. Leave all the other settings as they are. Then click the ‘Launch’ button.

Important: From this point and on the meter will be running! If the fire alarm goes off, you get bored with this tutorial, or whatever: Do remember to shut down the instance before you leave, otherwise it will cost you $2.40 per day.

The ‘Your Instances’ section should update, showing you that the instance you just launched is ‘pending’. Click the ‘Refresh’ icon after a while—in a minute or so the status should change to ‘running’.

Step 7: Connect with your new instance

Double click on the running instance and copy the ‘Public DNS Name’ entry. This is the domain name you use to access the instance from the outside. In this tutorial, my instance is hosted at ‘ec2-75-101-248-101.compute-1.amazonaws.com’.

Now we are going to SSH into the instance. I am doing this via Cygwin on Windows, but any SSH client should do. If you are on Windows and have Putty installed you can even launch directly from Elasticfox by right-clicking on the running instance and selecting ‘SSH to Public DNS Name’.

Let’s start with a basic sanity check:

$ ssh root@ec2-75-101-248-101.compute-1.amazonaws.com
The authenticity of host 'ec2-75-101-248-101.compute-1.amazonaws.
com (75.101.248.101)' can't be established.
RSA key fingerprint is db:0a:85:36:99:5f:65:6b:c7:77:3e:37:59:fc:16:fd.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-75-101-248-101.compute-1.amazonaws.
com,75.101.248.101' (RSA) to the list of known hosts.
Permission denied (publickey,gssapi-with-mic).

As expected, this isn’t working; we need to use the private key you saved earlier. Go to the directory where you saved the django.pem file and type the following:

$ ssh -i django-keypair.pem root@ec2-75-101-248-101.compute-1.amazonaws.com

         __|  __|_  )  Fedora 8
         _|  (     /    32-bit
        ___|\___|___|

 Welcome to an EC2 Public Image
                       : -)
    Base

[root@ ~]#

That’s better!

If you try pointing your browser towards ‘http://ec2-75-101-248-101.compute-1.amazonaws.com:8000/’ you should get a ‘can’t establish a connection’ error since there is no web server running on port 8000 as of yet.

Step 8: Install required software

Most AMI instances are stripped to the bone, so we have to add the software packages we need to get Django up and running. The steps required will of course vary from AMI to AMI, but running the following script as root is sufficient for our v1.07 Fedora 8 instance:

# Install subversion
yum -y install subversion

# Install, initialize and launch PostgreSQL
yum -y install postgresql postgresql-server
service postgresql initdb
service postgresql start

# Modify PostgreSQL config to avoid username/password problems
# Note: This grants access to _all_ local traffic!
cat &gt; /var/lib/pgsql/data/pg_hba.conf &lt;&lt;EOM
local all all trust
host all all 127.0.0.1/32 trust
EOM

# Restart PostgreSQL to enable new security policy
service postgresql restart

# Set up a database for Django
psql -U postgres -c &quot;create database djangotest encoding 'utf8'&quot;

# Install Django (I always checkout from SVN)
cd /opt
svn co http://code.djangoproject.com/svn/django/trunk/ django-trunk
ln -s /opt/django-trunk/django /usr/lib/python2.5/site-packages/django
ln -s /opt/django-trunk/django/bin/django-admin.py /usr/local/bin

# Install psycopg2 (for database access from Python)
yum -y install python-psycopg2

Step 9: Set up a Django project

First we set up an account for our test Django project:

[root ~]# useradd djangotest
[root ~]# su - djangotest
[djangotest ~]$

For the full story on how to create a new Django project you should have a look at the official tutorial. For now, just execute the following as the ‘djangotest’ user:

[djangotest ~]$ django-admin.py startproject mysite

Now we have all we need to test if the installation is working. Launch the development server like this:

[djangotest ~]$ python mysite/manage.py runserver ec2-75-101-248-101.compute-1.amazonaws.com:8000
Validating models...
0 errors found

Django version 1.0-beta_1-SVN-8461, using settings 'mysite.settings'
Development server is running at http://ec2-75-101-248-101.compute-1.amazonaws.com:8000/
Quit the server with CONTROL-C.

Note that I am using the full external domain name with the ‘runserver’ command.

Visit ‘http://ec2-75-101-248-101.compute-1.amazonaws.com:8000/’ with your browser and you should see the regular Django ‘It worked!’ page.

Note: Please don’t use the Django development server in a production setting. In fact, you probably shouldn’t use it on anything that is exposed to the outside world. The only reason I am doing it this way in this tutorial is to keep things simple—normally you should set up a proper web server such as Apache or Lighttpd. Refer to the Django documentation for information on how to do this.

Step 10: Create a Django application

I will show you how to put the Django database in persistent storage later on, so we have to set up a simple database-backed Django application.

Modify mysite/settings.py as follows:

DATABASE_ENGINE = 'postgresql_psycopg2'
DATABASE_NAME = 'djangotest'
DATABASE_USER = 'postgres'
DATABASE_PASSWORD = ''
...

INSTALLED_APPS = (
    'django.contrib.admin',
    'django.contrib.auth',
...

Then modify mysite/urls.py to allow access to the admin GUI:

from django.conf.urls.defaults import *

# Uncomment the next two lines to enable the admin:
from django.contrib import admin
admin.autodiscover()

urlpatterns = patterns('',
    # Example:
    # (r'^mysite/', include('mysite.foo.urls')),

    # Uncomment the next line to enable admin documentation:
    # (r'^admin/doc/', include('django.contrib.admindocs.urls')),

    # Uncomment the next line to enable the admin:
    (r'^admin/(.*)', admin.site.root),
)

Now we have to sync the database:

[djangotest ~]$ python mysite/manage.py syncdb

You will be asked to create an admin user—set both the username and the password to ‘djangotest’.

Then create a Django app:

[djangotest ~]$ python mysite/manage.py startapp myapp

If you got the preceding steps right, you should now be able to log on to the admin GUI at http://ec2-75-101-248-101.compute-1.amazonaws.com:8000/admin/ with the ‘djangotest’ user.

Add a new user to verify that the database connection works—we will be needing that new user later on.

Step 11: Create and mount an EBS instance

This is where things get really cool!

There is a huge problem with our current setup: Once you shut down the AMI instance, all the data in our database will disappear. Enter EBS.

EBS lets you define a persistent storage volume that can be mounted by EC2 instances. If we move our database files to an EBS volume then they will persist no matter what happens to our EC2 instances.

First, go back to Elasticfox and make a note of the availability zone of your running instance—this should be something like ‘us-east-1b’.

Then click on the ‘Volumes and Snapshots’ tab. Click the ‘Create Volume’ icon and create a 1GB volume that belongs to the same availability zone as your instance.

Right-click the new volume and choose ‘Attach this volume’. This will let you attach the volume to the running instance. Use /dev/sdh as the mount point. Refresh after a couple of seconds and the ‘Attachment status’ should have changed to ‘attached’.

Go back to your terminal and create an ext3 filesystem on the new volume:

[root ~]# mkfs.ext3 /dev/sdh
mke2fs 1.40.4 (31-Dec-2007)
/dev/sdh is entire device, not just one partition!
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
131072 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376

Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 35 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

All that remains is to mount the filesystem, in this case to /vol:

[root ~]# echo &quot;/dev/sdh /vol ext3 noatime 0 0&quot; &gt;&gt; /etc/
fstab
[root ~]# mkdir /vol
[root ~]# mount /vol
[root ~]# df --si
Filesystem             Size   Used  Avail Use% Mounted on
/dev/sda1               11G   1.4G   8.8G  14% /
/dev/sda2              158G   197M   150G   1% /mnt
none                   895M      0   895M   0% /dev/shm
/dev/sdh               1.1G    35M   969M   4% /vol

Step 12: Moving the database to persistent storage

First make sure that PostgreSQL is stopped:

[root ~]# service postgresql stop
Stopping postgresql service:                               [  OK  ]

You should also terminate your Django development server in case it is still running.

Now move the PostgreSQL database files to the EBS volume mounted at /vol:

[root ~]# mv /var/lib/pgsql /vol

For this to work we have to make a small modification to the /etc/init.d/postgresql file—make sure that the lines starting at around line 100 look exactly like this:

...
# Set defaults for configuration variables
PGENGINE=/usr/bin
PGPORT=5432
PGDATA=/var/lib/pgsql
if [ -f &quot;$PGDATA/PG_VERSION&quot; ] &amp;amp;&amp;amp; [ -d &quot;$PGDATA/base/template1&quot; ]
then
        echo &quot;Using old-style directory structure&quot;
else
        PGDATA=/var/lib/pgsql/data
fi
PGDATA=/vol/pgsql/data
PGLOG=/vol/pgsql/pgstartup.log
...

Note that this is a Fedora-specific hack—the main idea is to have the $PGDATA system variable point at /vol/pgsql/data.

For other databases the procedure will differ. A similar procedure for MySQL is available here.

PostgreSQL can now be restarted:

[root ~]# service postgresql start
Starting postgresql service:                               [  OK  ]

To verify that Django is using the same database as before you can revisit the admin GUI—the new user you added previously should still be available.

And there you have it!

Step 13: Shutting down

For completeness’ sake, let’s review the steps required to shut everything down.

First, stop the database server and unmount the EBS volume:

[root ~]# service postgresql stop
Stopping postgresql service:                               [  OK  ]
[root ~]# umount /vol

Then return to Elasticfox, right-click the EBS volume and select ‘Detach this instance’. When you are done with this tutorial you can delete the volume instance as well—having it in storage will cost you money.

Finally, go to the ‘AMIs and Instances’ tab and terminate the running instance. That should conclude your current transaction with AWS. (Refresh the volume and instances sections to verify that everything has really shut down).

Final words

If you now repeat steps 6 to 11 you should be able to launch a brand new EC2 instance that uses the database on your stored volume—this is left as an exercise for the reader. The only deviations from the procedure are that you shouldn’t have to run the PostgreSQL ‘initdb’ command, or create the ‘djangotest’ database.

This has been a bare-bones introduction to how EBS lets you run a persistent Django installation on AWS. In real life, the following issues have to be considered:

Use a proper web server.
Make sure the web server log files, database log, django logs etc. are moved to persistent storage as well.
Create a custom AMI that is properly set up for your Django project (so that you don’t have to do the full setup procedure every time you launch an instance).

Then there’s scaling, backup, and so on. Nonetheless, hopefully this article should be enough to get you started.

Addendum

A reader pointed out that the PostgreSQL user home directory should also be changed. While I haven’t tried this myself, the correct procedure is probably to do a usermod -d /vol/pgsql postgres as root.

Serving static files with Django and AWS – going fast on a budget

August 14, 2008 Thomas Brox Røst Leave a comment

I just posted an article on how to improve Django response times through the use of pre-generated static files:

Speed matters.

When Google tried adding 20 extra results to their search pages, traffic dropped by 20%. The reason? Page generation took an extra .5 seconds.

This article will show how Eventseer utilizes an often overlooked way of improving the responsiveness of a web application: Pre-generating and serving static files instead of dynamic pages.

The full posting can be read here.