As Django has matured it’s being used for bigger and bigger projects. At the same time it’s also being used by more and more people building relatively simple applications quickly. Everyone’s application is different, but I’d wager the vast majority of these have a range of common performance problems. Performance is often something only larger teams get to spend time really getting to grips with. This is sometimes because smaller projects can’t afford the time, or more often probably that things are thought to be fast enough anyway.
One advantage of using a framework is the sharing of common solutions to common problems that happens as a community forms. In what is hopefully going to be a bit of a series I’m going to cover some simple things everyone can do to improve application performance. The patterns are generally applicable, but I’m going to focus on Django examples.
I’m going to be pretty opinionated about the stack I’m using when necessary. I’m not looking to compare different web servers or databases or python versions. And I’d rather give concrete examples than generalise. If you’re using a different stack that’s fine, somethings will just work and others will need you to know how to configure the software you’ve already chosen. I’m also going to focus on a very small and simple to understand application. Most of these techniques scale up just fine, but I feel people don’t often use them on smaller projects because they thing you can only use them on larger ones. Or that you won’t see much impact on a smaller project. Both of these don’t ring true in my opinion and I’ll hopefully show why.
Measuring Performance
In this first part of the series lets take a quick detour to frame everything else. Lets talk about ways we can measure performance so we can see if the changes we’re making have the desired impact. If you’re not measuring performance already then start here.
We’ll start out looking at a few tools which are useful when taking a view by view approach to analysing performance. These generally ignore the impact of load on the system but because of this are generally easier to understand and read.
Django Debug Toolbar
Most Django developers will hopefully already be using the excellent Debug Toolbar It has a number of features relevant to our current quest but the most interesting is the query count. Less queries is nearly always better. That’s a whopping generalisation, but looking for unnecessary queries or duplicated queries or particularly slow running queries is a great way of making your application faster. The ORM makes it pretty easy to end up with a querysplosion if you’re not paying attention.
It’s very simple to install:
pip install django-debug-toolbar
The query section shows you the number of queries, the individual queries themselves and the time taken. It’s designed to be run in debug mode, so the actual query times will likely be lower in production, but the query that’s taking ages in development will probably still be slow when you go live.
YSlow
YSlow is a browser extension for Firefox and Chrome that gives information and recommendations about a number of mainly HTTP, CSS or javascript issues individual pages might have. It will give you a score as well as suggestions for improvement:
Also useful is the break down of the number of HTTP requests, and the affect of a primed cache on page loading.
Profiling Middleware
Sometimes you want to know the very low level calls that go into making a page render. For this you’ll want to look at profiling tools. The Django wiki has a useful page on profiling which is good if dispiriting reading.
Django Snippets has several profiling middleware, one of which is packaged up in the excellent Django Snippetscream. We can install that like so:
pip install django-snippetscream
Just include the middleware in your debug environment:
You can then append ?prof to any of your URLs and instead of seeing the view you’ll see something like the following:
Look at where your code spends it’s time and whether you have repeated calls to the same methods and functions. Sometimes getting down to this level of detail is the easiest way of finding the bottleneck in your application.
Nginx Logging
Here’s the first time I’m being opinionated about the stack, by choosing Nginx as my frontend server. I’ll talk a little about why this is a good idea later, but for the moment lets concentrate on why this is useful for measuring performance.
Log files are wonderful things, and Nginx has quite a powerful syntax for adding extra information to log files. Note the last line in the definition below.
We are adding the entire request time, the time taken by the upstream server (in my case gunicorn) to respond and also the gzip ratio. This is really handy if you’re optimising an application already in production. By collecting this data here it’s easy to then analyse the logs and pull out things like slow urls or assets not being gzipped effectively.
Django Timelog
Very similar to the above nginx logging, but implemented as a django 1.3 application (so it can be used in development as well) is one of my projects, django-timelog. As well as logging the time taken for each request, django-timelog provides a management command to analyse the resulting log file. It produces output which can show in aggregate the average response time of either views or individual URLs.
It’s packaged so installation should be straightforward.
pip install django-timelog
Again this can be used in a production environment, or it can be used locally while developing. You can also use load testing tools as described in a moment to generate traffic which is then logged.
Load Testing
I’m mainly looking for a tool here which can easily generate HTTP traffic in volume, sending a decent number of concurrent requests against your application and returning some useful results. I mainly turn to ab (Apache bench) because it’s available everywhere and it’s very simple to use.
For example lets hit a site with 100 requests, with requests being sent in batches of 5.
ab -c 5 -n 100 http://www.vagrantbox.es/12/
This will print something like the following. For our purposes we’re mainly interested in the requests per second value and the mean request time.
Concurrency Level: 5
Time taken for tests: 1.981 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 328300 bytes
HTML transferred: 297400 bytes
Requests per second: 50.47 [#/sec] (mean)
Time per request: 99.064 [ms] (mean)
Time per request: 19.813 [ms] (mean, across all concurrent requests)
Transfer rate: 161.82 [Kbytes/sec] received
Load testing is a pretty large topic. For instance even with the above simple example how do we know if 100 requests is enough? (it’s not.) Or whether a concurrency of 5 is useful? Often what you’re interested in is where your application starts to saturate or where it starts to error. But even without getting bogged down in the details a simple test like this can show changes have had a positive or negative effect. I’ll show examples of this as we investigate optimisation techniques.
If you’re working on a larger project hopefully you’ll have the time to investigate other approaches too. I’m quite a fan of using production logs to replay requests for instance, and of using Funkload for running scenarios under load. I’ll hopefully write more about those later. I’ve heard good things about Tsung as well, HTTPerf is excellent and JMeter has many fans. I’m using ab for examples because it’s point and shoot and you probably already have it installed without knowing.
Hopefully that’s a useful list of tools to get a baseline of where you’re at with performance. The rest of the articles in this series will show approaches to improve performance, and come back to one or more of these tools to confirm we’re heading in the right direction.
So I’m a huge Ganglia fan. It’s my go-to tool for standard low level metrics and for more ad-hoc higher level stuff as well. So I’m very happy to see the newly released version of the Ganglia Web interface.
Here’s the old look and feel:
And here’s the new version:
The first thing that stands out to me is the wide range of new options. We have a navigation bar with items like search, views, agregate graphs, events and mobile. And we have dynamic date ranges, and filtering and sorting options for metrics.
Search is what you’d expect. A search as you type facility for hosts and metrics, great if you have a large cluster and want to find something quickly.
Views are interesting. Basically a way of putting together a specific set of metrics on one screen and linking to them, rather than always having to go via the node or grid overview.
You could always create agregate graphs in Ganglia, but you had to delve into writing some PHP to get what you wanted. Not you can do certain types of graph on the fly, simply by specifying what you want.
I’m cheating slightly here and using the very latest code rather than the 2.0.0 release. But it’s worth it for the Events feature. Events gives you the ability to record events like a deploy or a daily backup onto your graphs, immediately making some types of diagnosis easier. More about how I’m using this when I finish automating it.
Gnaglia always worked OK on a mobile device mainly because all you really wanted was the graphs. The new mobile UI however just makes navigating much easier. It’s not yet doing device detection and automatically redirecting my iphone to that view which would be nice but it’s definately a good addition.
Combined with views, automatic rotation shows a series of graphs, one every 30 seconds. Particularly useful for big screen dashboards, in fact I’ve built this exact feature before more than once before.
I’ve concentrated here on the new visual features, but the new UI also contains CSV and JSON output for all metrics as well as a much easier JSON based approach to defining custom metrics. The amount of stuff in this release is huge. Massive thanks to the folks behind all the new features, in particular Vladimir Vuksan.
Ganglia is such a staple that I’m used to just using whatever is in the latest distro provided packages. When it comes to the web UI however I’m going to make an exception from now on. Not only can you run the new UI in parallel with the old, deployment is simply copying a directory into place (It’s easy to forget just how simple PHP deployment is sometimes). If you’re already using Ganglia spend a few minutes installing the new UI. If you’ve rejected Ganglia previously because it didn’t have one feature or another then now is the time to look again.
I’ve been doing some basic performance profiling work with Ruby on Rails recently and one tool I found very useful was the request log analyzer. It’s a relatively simple command line application that you can point at the Rails application log files and it outputs lots of information in agregate. So information about request duration averages or about SQL queries run. When working on a recent Django project I wanted a tool to do the same thing and ended up writing timelog.
I did a bit of research to see if I could find something similar. Here are a few other interesting tools that didn’t quite do what I wanted:
Django Slow Log - This logs things like memory and load averages
Django Dump Slow - Similar but designed to just log slow requests rather than everything, also needs a Redis backend
Zamboni Middleware - This is very similar to what I wanted, but it’s not a separate module and I didn’t find anything to analyse the results
Timelog is very similar to the middleware in Zamboni, the only real difference being I’m using the new logging support in Django 1.3. More interesting is the bundled management command which will output something like:
At the moment I’ve not done any real benchmarking or optimisation of the script, but it will chew through a 300,000 line (20MB) log file in under 20s on my aging macbook.
Next add timelog to your INSTALLED_APPS list. This is purely for the management command discovery.
INSTALLED_APPS = (
'timelog',
Then configure the logger you want to use. This really depends on what you want to do, the django 1.3 logging setup is pretty powerful. Here’s how I’ve got logging setup as an example:
In order for the analyser script to work correctly you’ll need to use the plain formatter and to register a handler for the timelog.middleware logger.
With that all configured try hitting your application a few times. You should see a log file created at the location specified in TIMELOG_LOG. Fill that up with a few lines and then run the analyze_timelog management command:
python manage.py analyze_timelog
This should output something similar to the above table but with your timings and views. The management command currently allows you to point the analyzer at a different file say from a backup, or a file you’ve pulled down from production but want to run the command locally. I’ll likely add some filters as time permits for things like not showing all views or showing only views between a given date range.
The above table showing the view function is good for big picture trends, but if you want to dig down into a particular path then you can pass an argument to not reverse the path:
I’ve been using the Vagrantbox.es site as a bit of a playground recently and I’ve been meaning to blog about some of the overengineering I’ve been doing. Here’s a smaller starter.
Getting the headers returned by your web server correct is both easy to do and easy to forget about. Unless you go actively looking for headers with curl or similar you’ll probably miss them, and even then will you spot an incorrect header by eye? RedBot is an excellent online tool that not only shows you the headers but makes recommendations about what might be missing, invalid or things you haven’t considered.
Or at least they do now after a few tweaks to my nginx configuration. In particular I’ve added
add_header Vary Accept-Encoding;
I’d assumed that enabling gzip compression with gzip_vary would have set this automatically and never checked. The way RedBot provides a checklist of recommendations is very handy.
As well as checking the page itself you can also check all the assets associated with a page by adding a query string argument. For instance here is the assets view for Vagrantbox.es. RedBot also provides a JSON encoded version of the result which might be useful in a CI system. If you’d rather run your own instance of the software you can, the code can be found on github at mnot.github.com/redbot/. It currently doesn’t work with HTTPS resources but that’s about the only thing I’d like to see added.
For those that haven’t yet had a look Cloudfoundry from VMware is two things, one of which is nice, one of which is very cool indeed:
On one hand it’s a platform as a service, allowing you to easily deploy Ruby, Java and Node.js applications to cloudfoundry.com.
On the other hand it’s an open source project with all the code on Github allowing you to run the entire stack wherever you like.
I’m pretty interested in the latter. Its API could in theory become a defacto standard for application and service buildouts, in the same way as we’re seeing the EC2 API expand outside AWS for managing infrastructure (and arguably how we’re using Chef and Puppet for managing the things installed on that infrastructure). The really interesting bit is the fact it’s all open source. Not only does that mean you can run it on your own infrastructure (including as I’m doing on a virtual machine on my laptop), but it’s also designed so new services (like Redis, MySQL, RabbitMQ), new runtimes (like Ruby 1.8, RUby 1.9, Java) and new frameworks (like Rails, Sinatra, Spring) can be added easily.
I’m running vcap on a vagrant managed VirtualBox instance, but you could install it anywhere you like. I used these chef recipes to get everything installed. I ran into an issue with the mysql service not starting correctly that I fixed and I needed to manually install chef into the rvm gemset part way through the install, but the recipes pretty much just worked.
Looking for an excuse to have a route through the vcap code I decided to see if I could add rudimentary support for Python applications. After a day of noodling around I’ve forked the code and sent a few pull requests back with it basically working. This required changes to the vmc client, to the vcap service and like all good open source contributions to the test suite.
Thanks hugely to existing pull requests (mainly the one’s for adding PHP support) it’s not taken long at all. The Sinatra and Rails support which shipped with the first release from VMWare supports using Bundler to define gem dependencies that can be pulled in at deploy time. It shouldn’t be too much effort I’m hoping to do the same for using pip and virtualenv for each deployed python application. I think I can also see how to support python3 and how to add django as a supported framework.
I had huge fun, but you might not at this early stage in the project. I’m relatively happy with reading and writing Ruby, futzing with rvm, debugging installation woes and hunting down service configuration problems. The best tool for working out what was happening was generally tailing the logs in /tmp/vcap-run/ and finding the code that wrote a given message. If you just want something to work I’d wait a little while, if you like the sound of the above it’s a pretty nice codebase to play around in. I’d love to eventually see some official contributor documentation and some hints and tips on things like running the tests. At the moment flicking through reported issues and pull requests on GitHub is the best place to start.
After Patrick released Sahara, a nifty extension for the Vagrant command line tool, I’ve been meaning to put together a similar extension for interacting with the growing list of base boxes on vagrantbox.es.
The extensions adds a vagrantboxes namespace to the vagrant command line tool which provides a few useful commands. Specifically you can list all the available boxes, do text searches, show the full details of a box and most handily of all add a box that takes your fancy to your local base box store, all without having to worry about the URLs of the boxes if you don’t want to.
And another quick example, this time of show printing the full description. In the future I might look to store more structured metadata and make this more useful.
>> vagrant vagrantboxes show 18
puppet debian lenny 64
http://puppetlabs.s3.amazonaws.com/pub/debian_lenny_64.box
Debian Lucid 64 bit vagrant box. Puppet 2.6.6 installed and ready to provision using the Puppet provisioner in Vagrant.
For good sample modules, try the jeffmccune-motd and jeffmccune-mockbuild modules posted to http://forge.puppetlabs.com/
Created by Ken Barber, [email protected]
This proved a good excuse to delve into the Vagrant source code which is pretty readable for the most part once you understand the libraries it’s build upon. It’s simple enough to extend for adding commands like this too, which bodes well for other more useful additions in the future.
If anyone has any problems with this extensions do let me know. Error handling currently consists of returning blank output rather than sensible error codes or messages, and as I’ve yet to add a small test suite so their might (will) be a few bugs lying around. I’ll try and make it better behaved in the next week or two but reasoned it’s useful straight away.
A recent question on Twitter prompted me to write a quick blog post about managing cron jobs. As more and more people want to automate provisioning and deployment of web applications some, maybe previously manually managed, items come into the fold.
Cron jobs are interesting because you may prefer to see them as part of the infrastructure (like apache or mysql) or as part of your application code. I think both are valid, sometimes at the same time. For instance you might have a cron job which deals with scheduled database backups. All that requires is the database and the script to be present. At other times your cron jobs might require your entire application stack. For instance a rake task which uses a Rails application model, or a django management command.
Configuration Management and Cron
Both Chef and Puppet provide a cron resource type. This is particularly handy for things like database backup scripts or ganglia gmetric scripts. You probably want these scripts and cron jobs to be installed on machines that have the related software installed, and you’re probably already describing this in your Chef recipes or Puppet manifests. If you’re not already using one of these tools using them to manage just your cron jobs is a nice way of starting out.
<code>cron "describe your job" do
hour "2"
minute "0"
command "/usr/local/sbin/command"
end</code>
The important part is that by describing your cron jobs in code you can then version control them easily, and both Chef and Puppet have mechanisms to push these jobs out to be installed by the relevant hosts. With cron jobs you might not want all your machines to be running the same jobs for instance.
Using Whenever
An alternative, or complimentary, approach to versioning and deploying cron jobs would be to tie it in with your application code. This makes sense when those jobs are tightly coupled to your application, for instance rails specific rake tasks or django management commands. Whenever is a tool I’ve been using recently that makes this pretty simple. You describe your cron jobs in a file called schedule.rb like so:
<code>every 3.hours do
command "/usr/local/sbin/command"
end</code>
And then running the provided whenever command line application will generate a working crontab. Whenever ships with capistrano integration and some useful shortcuts for running rake tasks or accessing Ruby code, but it’s simple to write your own command shortcuts without having to write ruby code too.
Other Approaches
I have seen some tools which replace cron completely, but I’ve never liked that idea much. Cron works pretty well, and is clever enough to deal with things like daylight saving time and leap years inteligently if needed. I know other folks who are centralising all regular jobs into something like Jenkins. I can see advantages to that, although I’ve yet to find a really nice way of automating this outside the gui interface or manually modifying configuration files.
I’ve written before about why I like System Packages, but even I’ll admit that the barriers to creating them mean I don’t use them for everything. FPM however is making it much easier, to the point where I’m starting to create a few packages I can reuse on projects. I thought a write up of how I’m doing that for Cucumber-Nagios might be useful.
For those that haven’t seen it yet, FPM (or Effing Package Management) is a tool that helps you build packages, like RPMs and DEBs, quickly. It can take in gems, python packages, node.js npm files or just plain directories and make files and from that create you one or more system packages. Lets have a look at a full example with a Ruby gem.I really like using cucumber-nagios, whatever platform or language I happen to be using at the time. I have a number of Django projects for instance with cucumber-nagios checks, so being able to not worry about most of the Ruby install is useful.
In order to use FPM you’ll need to install it. It’s available as a Ruby gem so lets start there. I’m not going to delve into setting up a Ruby Gems environment as it’s annoying and covered for most platforms elsehere on the internet.
<code>gem install fpm</code>
First off lets install the cucumber-nagios gem, along with all it’s dependencies, into a local folder on my build machine. This might be a virtual machine or a separate machine in your cluster. It should be running the same OS as the intended production machines however. The following examples are from Ubuntu, but it’s much the same for RPM based distros.
Cucumber-nagios has a large number of dependencies, so we’re going to need to create packages for all of those too. FPM will cleverly deal with maintaining the specified dependencies thought. We’ll use find to do this quickly, and then instruct FPM to convert from a gem to a deb. You could obviously do this line by line if you prefer.
<code>find ~/tmp/gems/cache -name '*.gem' | xargs -rn1 fpm -s gem -t deb -a all</code>
That should leave us with lots of new .deb files that we can have a closer look at:
If you already have a private apt repository set up then just upload these packages and away you go. I’d like to use apt for installing them so I can leave it to manage all the dependencies easily. If not then I’ll show you briefly how to create a local file system repo, it’s not much more work to create a shared repo available over HTTP.
First create a directory to store our packages and move our newly created .deb files into it. You’ll need to be able to execute some of these commands as root but given the topic I’m assuming you’ll be able to deal with that.
For the next part you’ll need to install the dpkg development tools, and then create a file that can be read by apt when it’s looking for packages it can install.
Now add your new package repo to your apt sources and update your package cache.
<code>/etc/apt/sources.list
deb file:/usr/local/repo ./
apt-get update</code>
At this point everything should be up and running. Let’s do a search in our repo:
<code>apt-cache search rubygem-
rubygem-amqp - AMQP client implementation in Ruby/EventMachine
rubygem-builder - Builders for MarkUp.
rubygem-bundler - The best way to manage your application's dependencies
rubygem-cucumber - cucumber-0.10.2
rubygem-cucumber-nagios - Systems testing plugin for Nagios using Cucumber.
rubygem-diff-lcs - Provides a list of changes that represent the difference between two sequenced collections.
rubygem-eventmachine - Ruby/EventMachine library
rubygem-extlib - Support library for DataMapper and Merb
rubygem-gherkin - gherkin-2.3.6
rubygem-highline - HighLine is a high-level command-line IO library.
rubygem-json - JSON Implementation for Ruby
rubygem-mechanize - The Mechanize library is used for automating interaction with websites
rubygem-net-ssh - Net::SSH: a pure-Ruby implementation of the SSH2 client protocol.
rubygem-nokogiri - Nokogiri (鋸) is an HTML, XML, SAX, and Reader parser
rubygem-rack - a modular Ruby webserver interface
rubygem-rack-test - Simple testing API built on Rack
rubygem-rspec - rspec-2.5.0
rubygem-rspec-core - rspec-core-2.5.1
rubygem-rspec-expectations - rspec-expectations-2.5.0
rubygem-rspec-mocks - rspec-mocks-2.5.0
rubygem-templater - Templater has the ability to both copy files from A to B and also to render templates using ERB
rubygem-term-ansicolor - Ruby library that colors strings using ANSI escape sequences
rubygem-webrat - Ruby Acceptance Testing for Web applications</code>
And finally lets install cucumber-nagios from our shiny new package.
If everything has worked out you should be able to run the cucumber-nagios-gen command to create a new project. Note that the path may be different, and in the case of debian based distros the gem bin path is not on the Path.
<code>/usr/lib/ruby/gems/1.8/bin/cucumber-nagios-gen project test
Generating with project generator:
[ADDED] .gitignore
[ADDED] .bzrignore
[ADDED] Gemfile
[ADDED] README
[ADDED] features/steps
[ADDED] features/support</code>
Since I launched it back in November my Devops Weekly email has been pretty well received I think. Folks on twitter seem to like it, as do a few people I’ve met at recent events who have said nice things. One thing a few people have been after though is an online archive, either because email just doesn’t work for them or more often because they missed out on the earlier issues.
With a little time this weekend and the help of Jekyll I’ve just added the seventeen iss§ues to date to the new Devops Weekly archive. I’d have liked to do something a bit more useful, maybe introduce per link tags or a nifty search feature, but I’m pretty busy at present and reason this should serve the main purpose of getting these links online.
I had the pleasure of speaking at Cambridge Geek Night on Monday again, the topic of conversation being using Ganglia to collect more than just base systems metrics.
The audience of web developers, the odd sysadmin and business folk seemed to enjoy it and we had lots of time for questions at the end. The main point I tried to get across was that Ganglia makes a great platform for ad-hoc metrics gathering due to the ability to just throw values at it and get time series graphs without any extra configuration.
The slides include a few bits of code I’ve been using that I’ll throw onto GitHub as a proper project when I get the time. These are very simple Python servers, one which allows for sending metrics information over HTTP, the other using TCP instead. Both really handy for getting more people to add hooks into applications.