Archivos de la categoría lighttpd

Depurando problemas con nginx

  • english
  • spanish

Last week I’ve been debugging a problem I had with this site’s nginx server: from time to time it hanged and I had to restart the process. Some time ago I wrote a little script that checked if it was running OK and restarted it otherwise, but anyway that wasn’t a real solution.

So I spent some days really looking into it and asking for support and reporting my findings to the nginx mailing list. One useful tip I got there was enabling the “debug” mode on the error log, which shows full traces of the processes (including their PID) as they’re processing the request, the rewrites, upstreams, etc.

error_log /var/log/nginx/$host-error.log debug;

With this extended log and the PID of the process malfunctioning, it’s quite easy finding out what that process was doing right before hanging. In order to find out the PID of the hanged processes, I extended my check-reboot script to log some generic system metrics right before restarting nginx: netstat -nap (which shows the PID), ps, vmstat, etc.


LOG=/var/log/checkWeb/checkWeb-$(date +%Y%m%d).log
LOGR=/var/log/checkWeb/restart-$(date +%Y%m%d).log

if ! wget -t 1 -o /dev/null -O /dev/nul -T $TIMEOUT $CHECK
echo "ERROR, restarting nginx"
echo "** RESTARTING **" >> $TMP
date >> $TMP
echo "- CLOSE_WAIT:" >> $TMP
netstat -nap | grep -c CLOSE_WAIT >> $TMP
echo "- vmstat" >> $TMP
vmstat 1 5 >> $TMP
echo "- free" >> $TMP
free >> $TMP
echo "- ps" >> $TMP
ps aux >> $TMP
echo "- netstat" >> $TMP
netstat -nap >> $TMP
echo "" >> $TMP
echo "" >> $TMP

#       pkill -9 -f php-cgi
pkill -9 -f nginx
sleep 1s
/etc/init.d/nginx start

cat $TMP
cat $TMP >> $LOG
date >> $LOGR

rm -rf $TMP

This way, each time localhost/wp-admin was unresponsive (I was debugging a WP site), besides restarting nginx I was getting a lot of system info. With time I got to realize that nginx processes were not actually hanging, but some of their sockets got on the CLOSE_WAIT state forever until the process was restarted. Looking for the PID of those processes according to netstat on the error log, the last request they were processing before getting to the CLOSE_WAIT state was always the same: on my blog I have some examples of how running servers with daemontools; daemontools uses named pipes (FIFOs), which can become kind of black holes if there’s no process feeding them; when nginx hit one of these FIFOs, it hanged.

Funny thing is that I never had this problem with either Apache nor lighttpd. But anyway the problem is not nginx but those FIFOs which shouldn’t really be there. I removed them and have had no hanged processes in five days, while before this nginx was restarting 3-4 times a day.

Probando nginx

Estoy cacharreando con nginx. El nuevo servidor virtual donde tengo el dominio está un poco justo, con la migración lo dejé con Apache en vez de volver a lighttpd y ahora estoy probando alternativas.

Como aparte de configurar el servidor y el PHP con fast-cgi hay que convertir todos los rewrites del Apache (wpmu y super-cache) al formato de nginx, es posible que el blog haga cosas raras, no se actualice debidamente, algunas páginas no vayan bien, pete, no acepte comentarios, o cobre consciencia de sí mismo y decida matar a todos los humanos. YMMV.

cut | sort | uniq: Apache logs

  • english
  • spanish

Many times the reason because a web server is slow and unresponsive is that it’s under “attack”, on purpose or not, by a bot. I’ve seen cases where Google Bot, bots from research engines from universities or some other kind of indexer were responsible for more than half the traffic of a site. These cases are not real DoS attacks, this traffic can be considered legitimate, but the result is that it brings the service down. You can instruct some of these bots not to visit your site so often, like Google Bot using the Google Webmaster Tools and the sitemaps and/or robots.txt files, but usually you can’t and have to consider filtering all this traffic at the firewall. But in any case, the first step is realizing that a single IP (or a couple of them) is responsible for most of your traffic, identifying this IP and using whois learn who it belongs to.

You can run something like this to list the top five IP addresses on your Apache’s access.log:

cut -d" " -f 1 access.log | sort | uniq -c | sort -nr | head -n 5

Scripts daemontools para lighttpd y PHP

  • english
  • spanish

I’ve prepared a set of daemontools scripts to launch and monitor lighttpd and its PHP processes spawned with spawn-fcgi. Here is the README, a tar file with the scripts, and here you can browse the directories with all the scripts.

PS: yes, I like daemontools. It helps me achieving high availability with many services, keeping them up even when a server misbehaves and some process dies. This avoids a lot of late night calls. It’s a great invention. :)