Probando nginx

Estoy cacharreando con nginx. El nuevo servidor virtual donde tengo el dominio está un poco justo, con la migración lo dejé con Apache en vez de volver a lighttpd y ahora estoy probando alternativas.

Como aparte de configurar el servidor y el PHP con fast-cgi hay que convertir todos los rewrites del Apache (wpmu y super-cache) al formato de nginx, es posible que el blog haga cosas raras, no se actualice debidamente, algunas páginas no vayan bien, pete, no acepte comentarios, o cobre consciencia de sí mismo y decida matar a todos los humanos. YMMV.

Consejos para pruebas de carga de servidores y proxies web

  • english
  • spanish

At work, one of the technologies I work with are web proxies. On my previous work, web servers and web applications. On both cases before going into production (or before rolling an update) it’s wise doing some stress-testing, trying to assess the limits of the technology or the need for more servers. There are commercial solutions available, like e.g. Spirent‘s Avalanche, but with some work you can roll out your own stress-testing infrastructure using free software.

On the first place, before beginning with the test, you need a measuring tool. Just throwing some load to Apache and then taking a look at the logs, free, ps, and server-status by hand is plain crazy, too much raw information to deal with. One tool I’ve found extremely useful, both for analyzing stress-tests results and for everyday operations monitoring is Cacti.

Cacti is a LAMP-based software which runs a series of (programmable) probes on your systems and makes graphs out of them, graphs of different time periods: the last hours, last days, months, etc. so that you have an historic view of the evolution of the service. Having this historic information is critical, when doing these kind of tests you not only want to know how well the latest release works but how does it compare to the previous one. Or in operations, you want to know the difference between the web traffic spike on a Monday morning compared to the calm of the Sunday afternoon. Just running a test and taking a measure in that moment is not enough, you need to put it in perspective with your previous tests and what you have in your production environment. And you can develop your own probes in whatever programming language you want, so there’s nothing that can be measured (CPU or memory usage, latency, I/O load, concurrent connections, etc.) that you can’t get on the graphs. You’ll add more and more probes and graphs as you go testing and discovering new variables that can affect system performance. Detecting regressions, memory leaks, etc. is very easy (well… visual) when using Cacti.

After having some measuring tool in place, we can start stress testing the systems. If you’re dealing with a web server, you only need a client-simulation program; if a proxy, both that and a server. The server part is easy, just use Apache or (even better) lighttpd or nginx. You want to stress test your proxy application, not the web server behind it, so you want a web server as fast as possible. Even think of putting your cloud (the set of web pages you’re going to test with) in a ramdisk and disabling the server’s access.log. Make sure that the web server doesn’t become a bottleneck or you won’t be really stressing the proxy!

As for the client-simulation software, you have several options: develop your own from the ground up, develop a series of scripts with wget or curl, or use some of the existent solutions available out there, like curl-loader, Apache’s JMeter. or may others. Just Google them.

No matter which software you use for the test, one consideration: These programs usually generate a very high load but under ideal conditions. I mean: you’re running them on a local network, likely with Gbps speeds, without packet loss or rearrangements … That’s not what you’ll find “out there”. Depending on what you want to test, either the theoretical top performance or how a new development would cope with real traffic, you need to adapt the traffic of your test to that.

Things you’ll find useful for tweaking your traffic:

  • If your client-simulation software allows you, don’t run it at full network speed! trickle may come in handy here. At least run two tests: one without speed limit, and another with e.g. a 3mbps limit simulating a DSL line. Bear in mind that this is just a speed limit, the server will still have a high load because at a lesser speed it’ll likely have to cope with more concurrent (blocking?) connections. I’ve seen some software do great on a high speed laboratory test and choke in seconds with slow, real-world traffic.
  • Lose packets. Rearrange them. Insert a router/bridge linux server between your client and your test object and use Linux’s tc system. Take a look here, here, here and here.
  • Simulate multiple origin IPs. Take a whole B or C class network, make sure the routes in all your systems are coherent, and run something like this to randomize the clients’ IP addresses (a bit hacky, but it works):
    while :
    do
      I=$((1+RANDOM%254))
      J=$((1+RANDOM%254))
      /sbin/iptables -t nat -F
      /sbin/iptables -t nat -A POSTROUTING -d 1.1.1.1 -o eth1 -j SNAT --to-source 10.10.$I.$J
      sleep 0.1s
    done

With all this in place, you should have a pretty decent testing framework going.

cut | sort | uniq: Apache logs

  • english
  • spanish

Many times the reason because a web server is slow and unresponsive is that it’s under “attack”, on purpose or not, by a bot. I’ve seen cases where Google Bot, bots from research engines from universities or some other kind of indexer were responsible for more than half the traffic of a site. These cases are not real DoS attacks, this traffic can be considered legitimate, but the result is that it brings the service down. You can instruct some of these bots not to visit your site so often, like Google Bot using the Google Webmaster Tools and the sitemaps and/or robots.txt files, but usually you can’t and have to consider filtering all this traffic at the firewall. But in any case, the first step is realizing that a single IP (or a couple of them) is responsible for most of your traffic, identifying this IP and using whois learn who it belongs to.

You can run something like this to list the top five IP addresses on your Apache’s access.log:

cut -d" " -f 1 access.log | sort | uniq -c | sort -nr | head -n 5

Varios websites con HTTPS sobre una misma IP

  • english
  • spanish

Some days ago, a coworker posted on his blog an article about how to serve several HTTPS sites from a single IP address using TLS. I told him that it was easier to use a single X.509 certificate for all those domains with the subjectAltName attribute, and he dared me to write a howto about it.

I’ve done something better: I’ve modified easy-rsa so that it can generate multi-domain certs.