In my last blog post I wrote about problems with common web server stress tests. Even though I wanted to talk about KVM vs. hardware setups, it seems that most of that post ended up comparing Apache and lighttpd. Since I didn’t test lighttpd and Apache properly then, I’ve decided to do that now. I tested how these two perform when delivering static content (HTML, JPEGs, GIFs, docs, etc.).
To start, let me introduce the hardware and software used. It’s a Dell PowerEdge T300 with an Intel Xeon Quad Core X3323 processor and 6GB of RAM. The disks are in a software mirror, but that’s not relevant for this test. The operating system is Ubuntu 9.04 (still unreleased), 64-bit version. The version of lighttpd in Ubuntu 9.04 is 1.4.19, while Apache is 2.2.11. The tools used for testing were ab and siege. All software came from the Ubuntu repositories for 9.04, with no configuration changes (other than those mentioned in the text). I did 6 tests with siege for every setup, removed the worst and the best score, and calculated the average of the remaining four. With ab I did just one test, since those tests are, IMO, flawed anyway (ab itself isn’t flawed, it’s just misused by lots of people).
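The scoring rule for the siege runs (six runs, drop the worst and the best, average the remaining four) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the sample numbers are made up, they are not measurements from this post.

```python
def trimmed_average(scores):
    """Drop the single lowest and single highest score, average the rest."""
    if len(scores) < 3:
        raise ValueError("need at least three scores to drop two of them")
    ordered = sorted(scores)
    kept = ordered[1:-1]  # everything except the worst and the best run
    return sum(kept) / len(kept)


# Hypothetical requests-per-second values for six runs (not real data).
runs = [1000.0, 1200.0, 1100.0, 900.0, 1150.0, 1050.0]
print(trimmed_average(runs))
```

Dropping the extremes keeps a single unlucky (or lucky) run from skewing the average.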
AB
For my first test, I disabled three cores, so that I could simulate the tests most people do. In my last post I explained why I consider them flawed. Seriously flawed. So, for the first test I ran ‘ab -n 10000 -c 25 http://localhost/index.htm’. index.htm is a very simple file, 4KB in size. I used the plain default configuration for lighttpd and apache2-mpm-prefork. The results favor lighttpd:
While these tests, IMO, don’t mean a thing, I went with the flow and did the same test with all four cores. For that test, I had to tweak lighttpd’s configuration. As mentioned on lighttpd’s wiki, lighttpd doesn’t like SMP very much. Among the problems mentioned there, the possibility of corrupted access logs was the scariest. Since this was only testing, I didn’t care about that, and the access logs weren’t corrupted in the end. I also did more tweaking, following the performance docs. So, finally, I added:
server.max-keep-alive-requests = 4
server.max-keep-alive-idle = 4
server.event-handler = "linux-sysepoll"
server.network-backend = "linux-sendfile"
server.max-fds = 2048
server.stat-cache-engine = "simple"
server.max-worker = 8
to lighttpd’s configuration. Next, for Apache, I enabled the disk_cache module, just to put it on level terms with lighttpd, which does caching by default. Since Apache scales on SMP by default, no other changes were made to its configuration. I also tested the Apache worker MPM. The test again showed that lighttpd is faster at serving 10000 copies of the same file:

SIEGE
When you notice problems with your current web server setup, it’s wise to have some statistical data on your most visited pages. With that data you can create a list of URLs to use for testing your new setup. So, if index.htm accounts for 20% of the requests on your site, you create a file in which index.htm makes up 20% of the lines. It’s also good to have a testing tool that randomizes requests; I use siege for that. The files behind the URLs in my test ranged from a couple of bytes to 0.25MB. In the absence of a gigabit switch, I used localhost for testing (most people do tests that way, even though that’s wrong, but let’s go with the flow). Again I disabled three cores on my system and removed the SMP tweaks from lighttpd. The results are:
As you can see, Apache worker with the cache in memory is slower than both lighttpd and Apache worker with the cache on disk. I must admit, I wasn’t expecting that. Even Apache worker without any cache is faster than Apache worker with the cache in memory. But let’s leave worker/mem in the dust, where it belongs. Even the fact that Apache delivered more requests per second than lighttpd isn’t that important here. What’s really important isn’t shown in this graph, but in the ODS file with the raw data: lighttpd didn’t finish a single test. After ~40 seconds it just gave up serving content. I still haven’t figured out why this happens. Anyway, I used the data collected while it was still serving content.
Now, the same test with all four cores. I again added the SMP settings to lighttpd and again didn’t change a thing in Apache’s configuration (except enabling the cache). The results are similar to those with one core:
It’s noticeable that Apache worker with the cache in memory scaled much better than lighttpd and even Apache worker with the cache on disk. That was expected, since I also raised the bar to 100 concurrent connections, instead of just 25 as on a single core. One could probably squeeze more req/s out of Apache by tweaking ThreadsPerChild and MaxClients.
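Going back to the URL list used for these siege runs: here is a minimal Python sketch of how such a weighted list could be generated. The paths, the shares and the file name urls.txt are hypothetical; in practice they would come from your own access-log statistics.

```python
def build_url_list(shares, total_lines=100, base="http://localhost"):
    """Return a list of URLs where each path appears in proportion to its share.

    shares maps a path to its fraction of all requests (fractions sum to 1.0).
    """
    lines = []
    for path, share in shares.items():
        count = round(share * total_lines)  # e.g. 0.20 -> 20 lines out of 100
        lines.extend([f"{base}/{path}"] * count)
    return lines


def write_url_file(urls, filename):
    """Write one URL per line, the layout a siege URL file uses."""
    with open(filename, "w") as handle:
        handle.write("\n".join(urls) + "\n")


# Hypothetical traffic shares, not taken from the site tested in this post.
shares = {"index.htm": 0.20, "logo.gif": 0.50, "manual.doc": 0.30}
urls = build_url_list(shares)

if __name__ == "__main__":
    write_url_file(urls, "urls.txt")
```

The resulting file can then be given to siege with its -f option, and the -i option makes siege pick URLs from the file at random, so the order of the lines doesn’t matter.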
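For anyone who wants to try that tuning: as far as I understand the Apache 2.2 worker MPM, MaxClients is the total number of threads across all child processes, each child runs ThreadsPerChild threads, and the number of children is capped by ServerLimit. The sketch below only does that arithmetic. The function names and the numbers are illustrative assumptions, not settings I tested.

```python
import math


def worker_processes_needed(max_clients, threads_per_child):
    """Child processes needed to provide max_clients threads in total."""
    if max_clients <= 0 or threads_per_child <= 0:
        raise ValueError("both values must be positive")
    return math.ceil(max_clients / threads_per_child)


def fits_server_limit(max_clients, threads_per_child, server_limit):
    """True if the required number of children stays within ServerLimit."""
    return worker_processes_needed(max_clients, threads_per_child) <= server_limit


# Hypothetical values: 400 clients with 25 threads per child need 16 children.
print(worker_processes_needed(400, 25))
print(fits_server_limit(800, 25, 16))
```

If the second check fails, raising MaxClients alone is not enough; ServerLimit has to be raised too.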
This post was in no way written to claim that one is better than the other. It’s just that I’ll be needing a static web server on new hardware and I wanted to have solid data. At the moment, in my case, Apache seems the better tool for the job. I might test Cherokee and nginx too, just to be sure I’ve chosen the right tool.
Right, and I still haven’t figured out why my virtualized servers give so much worse results than hardware servers. :)
Raw data: raw-data.ods


It’s not the network stack in KVM, since it’s easy to achieve >10MB/s (on a 100mbit network) when downloading a single large file. But with lots of little static files, all three give 5-6MB/s. For those same files, 11MB/s is standard on real hardware. Until I do more tests, I can conclude that virtualized web servers are OK for low-traffic web sites (5-6MB/s is a ~45mbit/s link), but if you are going to have a high-traffic site, you should really put it on a dedicated server. I’m eager to test dynamic content in a virtualized environment.
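The MB/s to mbit/s conversion above is just a factor of eight. A tiny Python sketch, assuming 1 byte = 8 bits and ignoring protocol overhead (the function name is made up for illustration):

```python
def mbytes_to_mbits(mbytes_per_second):
    """Convert a throughput in MB/s to Mbit/s (1 byte = 8 bits)."""
    return mbytes_per_second * 8


# The throughput figures mentioned in the text above.
for rate in (5, 6, 11):
    print(rate, "MB/s =", mbytes_to_mbits(rate), "Mbit/s")
```

So 5-6MB/s is 40-48mbit/s, while 11MB/s is 88mbit/s, which is close to what a 100mbit link can carry.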