Drupal on Rackspace Cloud Sites with Automated Backups
Hosting is anything but a one-size-fits-all situation. Sites that demand the highest performance will certainly benefit from customized backend architecture and tuning. For the large majority of less demanding sites, the options range from managed shared hosting at a few dollars per month to a VPS (virtual private server).
At Urban Insight we've found that some Drupal sites are a good fit with hosting on Rackspace Cloud Sites. One requirement of ours is automated backups, which Rackspace Cloud Sites does not provide. Rackspace does provide some help with this in the form of scripts that can be used for backups, but you need to do the setup yourself, and one feature we wanted was lacking. What follows is a quick overview of setting up a Drupal site on Rackspace Cloud Sites with an emphasis on backups and how we added the missing backup feature we wanted, rotating backups.
From the perspective of getting Drupal running, Rackspace Cloud Sites is similar to managed shared hosting. What you will not get with Rackspace Cloud Sites is a shell. You will not be able to SSH into the 'server' and use drush or even untar files, and I haven't yet figured out a way to use version control for deployment. FTP is the only way to interact with files on the site. Since the number of sites you can install in your account is unlimited, this might be a good option for those with many smaller sites. Another option, Rackspace Cloud Server, does allow full shell access over SSH if you require it.
Adding a New Site
When adding a new site to Rackspace Cloud Sites, there are some options. See this page for explanations. The domain you use to set up the site can't be changed later. For simplicity, it's best to use the actual production domain. Select Linux/Apache and set up a database. Note the database information including the hostname. It should be something like 'mysql50-10.wc1.dfw1.stabletransit.com'.
By default, your Drupal site root will be /your.domain.name/web/content/. Some Rackspace documents refer to the /your.domain.name/ part as your webroot, so beware of this potential confusion.
Getting Drupal Running
Use FTP to upload your Drupal files and install Drupal. Rackspace has a guide, but it's pretty standard stuff.
A short list of things to check when installing Drupal on Rackspace Cloud Sites
- Don't forget to upload the .htaccess file at the root of your Drupal installation.
- To get clean URLs to work, uncomment the following line in .htaccess:
  RewriteBase /
- If you are importing an existing Drupal site, check the files directory permissions. The default location is sites/default/files.
- Using Rackspace Cloud Sites gives you access to Rackspace Cloud Files, but at an additional cost. Cloud Files can be used as a CDN, but we are using it to store our site backups.
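Since there is no shell on Cloud Sites, the RewriteBase change has to be made locally before uploading. A minimal sketch of doing that with sed follows; the function name enable_rewrite_base is made up for illustration, and it assumes the line appears in your .htaccess exactly as "# RewriteBase /" (with optional leading whitespace).

```shell
#!/bin/sh
# Hypothetical helper: print an .htaccess with the "RewriteBase /" line
# uncommented. Only the exact line "# RewriteBase /" is changed; variants
# such as "# RewriteBase /drupal" are left commented out.
enable_rewrite_base() {
    sed 's|^\([[:space:]]*\)#[[:space:]]*RewriteBase /$|\1RewriteBase /|' "$1"
}

# Example (run locally, then upload the result over FTP):
#   enable_rewrite_base .htaccess > .htaccess.new && mv .htaccess.new .htaccess
```

The output goes to stdout rather than being edited in place, so you can review it before replacing the original file.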
Implementing Backups
Rackspace provides a couple of scripts to automate site backups. The first, backup.sh, dumps the database, creates a compressed tar archive of the database and filesystem in your account root (not your site root), and then calls the second script. The second, cloudfiles_backup.php, copies the tar archive to Cloud Files. A cron job can be set up to run backup.sh, and at whatever frequency that job runs, it will keep creating backups ad infinitum. What if you want to rotate the backups? There are a few solutions in the comments of the Cloud Files backup guide, but I think I came up with a simpler one.
The other solutions rely on parsing the date from the filename. If your backups are being created at regular intervals, then there is a direct relationship between the age of the backup and number of backups. My solution takes advantage of that fact. I've added a variable, $num_backups, to define the number of backups to keep on hand. After the rest of the stock script runs, a check is performed and any backups in excess of $num_backups are deleted. Let's look at how this is accomplished.
If you are using Cloud Files, then I recommend taking a look at the cloudfiles-php GitHub project. This is what the Rackspace script uses to interact with Cloud Files. The included PHP objects and methods can authenticate, create containers and data objects, and list and delete them. The documentation in the cloudfiles.php file is pretty good, and with it I was able to understand how to list data objects and delete them.
The $container->list_objects line gets a list of the data objects in that container, in this case backups, and returns their names as an array with the oldest backup at index 0. Suppose you want to keep only 30 backups at any time. After the 31st is saved, we want to remove the first one in the array. PHP's array_shift() function seems made to order: it returns the first element of the array, removes that element, and re-indexes the remaining elements so that each moves down one position.
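The same count-based rotation can be sketched in shell, where the positional parameters play the role of the PHP array and shift plays the role of array_shift(). This is only an illustration of the logic, not part of the Rackspace scripts: the function name backups_to_delete and the sample file names are made up, and it prints the names it would delete instead of deleting anything.

```shell
#!/bin/sh
# Print the names that a keep-N rotation would delete.
# Usage: backups_to_delete KEEP NAME...   (names ordered oldest first)
backups_to_delete() {
    keep=$1
    shift                      # drop KEEP; "$@" is now the backup list
    while [ "$#" -gt "$keep" ]; do
        printf '%s\n' "$1"     # oldest remaining name, like array_shift()'s return value
        shift                  # remove it; the rest move down one position
    done
}

# Three backups, keep two: only the oldest is selected.
backups_to_delete 2 \
    backup.2011-01-01-03-00.tar.gz \
    backup.2011-01-02-03-00.tar.gz \
    backup.2011-01-03-03-00.tar.gz
# prints: backup.2011-01-01-03-00.tar.gz
```

Note that nothing here parses the date out of the file name. The loop only relies on the list being ordered oldest first.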
The while loop uses the data object name (backup file) returned by array_shift() to delete the backup. There's plenty of room for enhancements. Creating the container on Cloud Files if it doesn't exist is the first enhancement that came to mind. Other suggestions are welcome.
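Another possible enhancement, on the backup.sh side, is to stop when a step fails, so that a failed archive step does not lead to an incomplete backup being uploaded. The sketch below is one way to do that. The run_step helper is hypothetical and not part of the Rackspace script, and the usage line assumes the same $webroot variable that backup.sh defines.

```shell
#!/bin/sh
# Hypothetical helper, not part of the Rackspace script.
# Runs a command; on failure, reports the step name on stderr
# and returns non-zero so the caller can abort.
run_step() {
    step=$1
    shift
    if "$@"; then
        return 0
    fi
    echo "backup aborted: $step failed" >&2
    return 1
}

# Possible use inside backup.sh:
#   run_step "site archive" tar -czpf "$webroot/sitebackup.tar.gz" "$webroot/web/content/" || exit 1
```

Steps that use shell redirection, such as the mysqldump line, cannot be passed through the wrapper as written. Those would need their own check of the exit status.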
backup.sh
#!/bin/sh
#Set information specific to your site
webroot="YOUR WEBROOT"
db_host="YOUR DB HOST"
db_user="YOUR DB USERNAME"
db_password="YOUR DB PASSWORD"
db_name="YOUR DB NAME"
#Set the date and name for the backup files
date=`date '+%F-%H-%M'`
backupname="backup.$date.tar.gz"
#Dump the mysql database
mysqldump -h "$db_host" -u "$db_user" --password="$db_password" "$db_name" > "$webroot/db_backup.sql"
#Backup Site
tar -czpvf "$webroot/sitebackup.tar.gz" "$webroot/web/content/"
#Compress DB and Site backup into one file
tar --exclude 'sitebackup' --remove-files -czpvf "$webroot/$backupname" "$webroot/sitebackup.tar.gz" "$webroot/db_backup.sql"
#Upload your files to cloud files.
#First argument is the location of the backup file, second argument is the name to be used when uploaded
php "$webroot/cloudfiles_backup.php" "$webroot/$backupname" "$backupname"
#After your backup has been uploaded, remove the tar ball from the filesystem.
rm "$webroot/$backupname"
cloudfiles_backup.php
<?php
// include the API - note we must use the absolute server path because this script is run by the PHP command-line interpreter, not over HTTP
require("/YOUR WEBROOT/cloudfiles/cloudfiles.php");
// cloud info
$username = "YOUR USERNAME"; // username
$key = "YOUR API KEY"; // api key
$containername = "YOUR CONTAINER NAME"; // container name
// Backup conditions
// Number of backups to keep
$num_backups = 30;
// backup file name from command-line argument
$backup = $argv[1];
// Name to use for file once uploaded
$uploadname = $argv[2];
// Connect to Rackspace
$auth = new CF_Authentication($username, $key);
$auth->authenticate();
$conn = new CF_Connection($auth);
// Get the container we want to use
$container = $conn->get_container($containername);
// upload file to Rackspace
$object = $container->create_object($uploadname);
$object->load_from_filename($backup);
// remove old backups
$backup_list = $container->list_objects(0, NULL, NULL, NULL); // limit, marker, prefix, path
while (count($backup_list) > $num_backups) {
    // array_shift() removes and returns the oldest backup name
    $to_delete = array_shift($backup_list);
    $deleted = $container->delete_object($to_delete);
}
?>