csshX









  • english
  • spanish

Some time ago I wrote an article about Cluster SSH.

Now that I’m using a Mac at work too and am working with clusters again, I’ve googled for a MacOS X-based program similar to cssh. And of course, there’s a guy who has done it! csshX is a different implementation of the same idea, this time written on perl instead of TCL/TK and using OS X’s Terminal.app.

Works like a charm.

nginx, memcached y HTTP 304









  • english
  • spanish

I’m working on the server architecture for a new web portal. It’s going to be Amazon-EC2based and will have a nginx front end to several tomcat back end servers. The nginx servers will proxy-cache some dynamic pages (search results, internal web-services queries) and will also have a static page cache based on memcached.

While starting to roll memcached into the solution I noticed that I stoped receiving HTTP 304 responses, everything served off memcached where 200. After digging a little bit on Google I found out it was the expected behavior as memcached doesn’t store the timestamp (last-modified) of the files. So I’ve given it a try and developed a quick patch. You can find it here.

I’m quite puzzled at the simplicity of the patch, as I didn’t need to code the logic to check if the cached page is more recent than the client’s one and return 302 or 200. After successfully returning the last-modified header from memcached, nginx seems to do the rest of the job somewhere on its own!

I have tested the patch today at work and everything seems to work OK. Will do more extensive testings tomorrow and update the path if needed. In the meantime, if someone gives it a try and works for them (or doesn’t), I’m waiting for your feedback.

Samsung ML-1520 en MacOS X Snow Leopard









  • english
  • spanish

GPL drivers for many printer models for MacOS X. If the system doesn’t support your printer out of the box and the printer’s manufacturer doesn’t provide them, give these drivers a try. They work for me with a Samsung ML-1520. And if you already had them but they stopped working after upgrading to Snow Leopard, just reinstall the latest release from the links above. Happened to me.

Consejos para pruebas de carga de servidores y proxies web









  • english
  • spanish

At work, one of the technologies I work with are web proxies. On my previous work, web servers and web applications. On both cases before going into production (or before rolling an update) it’s wise doing some stress-testing, trying to assess the limits of the technology or the need for more servers. There are commercial solutions available, like e.g. Spirent‘s Avalanche, but with some work you can roll out your own stress-testing infrastructure using free software.

On the first place, before beginning with the test, you need a measuring tool. Just throwing some load to Apache and then taking a look at the logs, free, ps, and server-status by hand is plain crazy, too much raw information to deal with. One tool I’ve found extremely useful, both for analyzing stress-tests results and for everyday operations monitoring is Cacti.

Cacti is a LAMP-based software which runs a series of (programmable) probes on your systems and makes graphs out of them, graphs of different time periods: the last hours, last days, months, etc. so that you have an historic view of the evolution of the service. Having this historic information is critical, when doing these kind of tests you not only want to know how well the latest release works but how does it compare to the previous one. Or in operations, you want to know the difference between the web traffic spike on a Monday morning compared to the calm of the Sunday afternoon. Just running a test and taking a measure in that moment is not enough, you need to put it in perspective with your previous tests and what you have in your production environment. And you can develop your own probes in whatever programming language you want, so there’s nothing that can be measured (CPU or memory usage, latency, I/O load, concurrent connections, etc.) that you can’t get on the graphs. You’ll add more and more probes and graphs as you go testing and discovering new variables that can affect system performance. Detecting regressions, memory leaks, etc. is very easy (well… visual) when using Cacti.

After having some measuring tool in place, we can start stress testing the systems. If you’re dealing with a web server, you only need a client-simulation program; if a proxy, both that and a server. The server part is easy, just use Apache or (even better) lighttpd or nginx. You want to stress test your proxy application, not the web server behind it, so you want a web server as fast as possible. Even think of putting your cloud (the set of web pages you’re going to test with) in a ramdisk and disabling the server’s access.log. Make sure that the web server doesn’t become a bottleneck or you won’t be really stressing the proxy!

As for the client-simulation software, you have several options: develop your own from the ground up, develop a series of scripts with wget or curl, or use some of the existent solutions available out there, like curl-loader, Apache’s JMeter. or may others. Just Google them.

No matter which software you use for the test, one consideration: These programs usually generate a very high load but under ideal conditions. I mean: you’re running them on a local network, likely with Gbps speeds, without packet loss or rearrangements … That’s not what you’ll find “out there”. Depending on what you want to test, either the theoretical top performance or how a new development would cope with real traffic, you need to adapt the traffic of your test to that.

Things you’ll find useful for tweaking your traffic:

  • If your client-simulation software allows you, don’t run it at full network speed! trickle may come in handy here. At least run two tests: one without speed limit, and another with e.g. a 3mbps limit simulating a DSL line. Bear in mind that this is just a speed limit, the server will still have a high load because at a lesser speed it’ll likely have to cope with more concurrent (blocking?) connections. I’ve seen some software do great on a high speed laboratory test and choke in seconds with slow, real-world traffic.
  • Lose packets. Rearrange them. Insert a router/bridge linux server between your client and your test object and use Linux’s tc system. Take a look here, here, here and here.
  • Simulate multiple origin IPs. Take a whole B or C class network, make sure the routes in all your systems are coherent, and run something like this to randomize the clients’ IP addresses (a bit hacky, but it works):
    while :
    do
      I=$((1+RANDOM%254))
      J=$((1+RANDOM%254))
      /sbin/iptables -t nat -F
      /sbin/iptables -t nat -A POSTROUTING -d 1.1.1.1 -o eth1 -j SNAT --to-source 10.10.$I.$J
      sleep 0.1s
    done

With all this in place, you should have a pretty decent testing framework going.

Segundo monitor en el Lenovo T400 con Jaunty









  • english
  • spanish

A couple of weeks ago I upgraded the laptop I use at work to Jaunty, and since then I couldn’t get the external monitor to work properly.

This laptop has two graphic cards: an Intel one integrated on the mainboard, which uses few battery power; and an ATI one with all the bells and whistles, which of course takes more battery. On Windows the graphic driver can change from one to the other on the fly, but on Linux you can’t. You need to go to the BIOS and select one, disabling the other one.

So since I upgraded to Ubuntu 9.04, I the Intel card (which is the one I always use, I use this laptop for work only so I’d rather have half an hour of extra battery than ultra-fast OpenGL… heck, I even have compiz disabled) wouldn’t even boot, X hanged while booting when using that card. And with the ATI card and the fglrx driver, getting the external monitor to work is a PITA. And I need it, as from time to time I give presentations and training sessions and need to hook the laptop to a projector.

I finally found the solution on a Ubuntu forum somewhere (lost the link, sorry): uninstall the fglrx driver! Even if I didn’t use it, even if the ATI card was disabled on the BIOS (wouldn’t even show on lspci) and on the xorg.conf file was using the Intel driver and not the fglrx one, if this driver was installed X would crash when using the Intel card. Uninstall it and voila! The Intel card works again and so does the external monitor configuration applet. Weird.

Optimizando el SQL de osCommerce









  • english
  • spanish

An entry about how the way SQL queries are built affects their performance, something I think many people don’t pay attention to:

For the last couple of days I’ve been struggling with the database of an osCommerce-based on-line shop. Everything ran perfectly until after four years of existence we have a DB with over 1500 categories and 16000 products, and the web is getting increasingly slower over time. We use a heavily modified osCommerce 2.2 installation, so moving to osCommerce 3 is out of the question (besides, it’s still alpha and this is a production site). After some hw and os-level optimization that didn’t help to improve performance, I had to start digging into the code and the DB.

First of all: activate the log-slow-queries option on mySQL’s config. This creates a new log with all the queries that take longer than expected to run (def. more than 10 seconds). With this information, we can concentrate on those queries that are hitting the bigger performance penalties and start the optimization process there.

To get you on context, this is the query that extracts all the products from a given category:

SELECT pd.products_name, pd.products_description, m.manufacturers_name,
p.products_image, p.products_quantity,  p.products_date_added,
p.products_id, p.manufacturers_id, p.products_price,
p.products_tax_class_id, IF(s.STATUS, s.specials_new_products_price, NULL)
AS specials_new_products_price, IF(s.STATUS, s.specials_new_products_price,
p.products_price) AS final_price
FROM products_description pd, products p
LEFT JOIN manufacturers m ON p.manufacturers_id = m.manufacturers_id
LEFT JOIN specials s ON p.products_id = s.products_id,
products_to_categories p2c
WHERE
p.products_status = ’1′ AND p.products_id = p2c.products_id AND
pd.products_id = p2c.products_id AND pd.language_id = ’3′ AND
p2c.categories_id = ’114′
ORDER BY p.products_date_added DESC;

On the first place for those unfamiliar with osCommerce, there are a heap of tables involved. Six on this example: categories, products, products_to_categories, products_description, manufacturers and specials (products with discounted prices). According to mysql-slow.log, that query was taking around 12 seconds to run. The log also read “Rows_examined: 1903433″. No wonder it was taking that long. :-D

Why so many “rows_examined”? Going back to DB theory on the University, ;-D if memory serves and leaving each particular DB engine’s optmizations aside, when selecting from several tables the DB engine does the cartesian product (product set) of all their tuples, this is, every possible combination of each tuple on one table with every other tuple on the other tables. After that, those combinations matching the WHERE criteria are selected (and I say engine optimizations aside because I don’t get the math here: 16k products, 16k descriptions, 1k5 categories, and just 1903433 combinations?). On the other hand with JOIN the related data is appended to those tuples already selected by the WHERE clause and ONLY TO THOSE.

So, why using more than one table on the where clause if it isn’t strictly necessary? On the previous example, we want to extract all the info of the products on a given category, but we select from the products, products_description and products_to_categories tables. Wouldn’t it be better using only the products_to_categories table on the from clause, reducing the result set to only the category we are interested on, and then get all the products’ information with JOIN? It’s the same principle as when piping data through several commands on a shell-script, using first grep and then cut: first reduce the data set, so that on each following step there is less data to spend CPU cycles with, increasingly reducing the overall processing time.

With this modification, the resulting query looks like:

SELECT pd.products_name, pd.products_description, m.manufacturers_name,
p.products_image, p.products_quantity,  p.products_date_added,
p.products_id, p.manufacturers_id, p.products_price,
p.products_tax_class_id, IF(s.STATUS, s.specials_new_products_price, NULL)
AS specials_new_products_price, IF(s.STATUS, s.specials_new_products_price,
p.products_price) AS final_price
FROM products_to_categories p2c
LEFT JOIN products p ON p.products_id=p2c.products_id
LEFT JOIN products_description pd ON p.products_id=pd.products_id AND pd.language_id=’3′
LEFT JOIN manufacturers m ON p.manufacturers_id = m.manufacturers_id
LEFT JOIN specials s ON p.products_id = s.products_id
WHERE p2c.categories_id = ’114′ AND p.products_status = ’1′
ORDER BY p.products_date_added DESC

The result sets of both queries are identical, but while the first one took around 12 seconds to run, the second one only takes a couple of milliseconds. :-) I’ve found several similar cases already on my osCommerce installation, that have gone from more than 10 seconds to milliseconds after rearanging the query in a similar fashion.

Another important point are indexes: I’ve found some queries that couldn’t be further optimized, they already used only one table on the from clause, the more restricting one, and then some joins. On these cases what I’ve found is that some table lacked an index on the foreign key column, and after adding it the execution time has gone down again from seconds to milliseconds.

Validar campos de verificación en CakePHP









  • english
  • spanish

When designing a registration form for a web site, it’s usually a good idea to include verification fields for both the password and the e-mail address: a typo on the password would leave the user unable to log into the system, while one on the e-mail address would prevent us from reaching the user. Asking twice for these fields allows us and the user to verify that there are no typos and that both values are correct, thus avoiding future problems.

With CakePHP these verification fields can’t be validated automatically with the model’s definition rules, it has to be done programmatically on the controller as there is no model verification rule to check that an object’s field’s value equals that of another field (equalTo compares against a string as a literal, not against a variable).

By defining the following function on the model we are defining a custom verification rule that will allow us to do this verification properly on the model (or download it here, the plugin I use for code highlighting or wpmu itself keep trashing all > symbols):

<pre>        /**
         * Verifies that the field beeing validated matches the value of the
         * field in the second parameter. If omitted and the field beeing
         * validated’s name is NAME_verification, the value will be compared
         * to that of the NAME field.
         *
         * Examples:
         * <code>
         * var $validate(
         *      ’password_check’ =&gt; array(
         *              ’rule’ =&gt; array(‘verifies’, ‘password’),
         *              ’message’ =&gt; ‘The passwords don\’t match’
         *      ),
         *      ’email_verification’ =&gt; array(
         *              ’rule’ =&gt; array(‘verifies’),
         *              ’message’ =&gt; ‘The email addresses don\’t match’
         *      ),
         * )
         * </code>
         *
         * The second case will verify automagically against the "email" field.
         */

        function verifies($data, $field=null) {
                $keys = array_keys($data);
                $key = $keys[0];
                if(!is_string($field)) {
                        if( ($pos = strpos($key, "_verification")) === FALSE ) {
                                return FALSE;
                        }
                        $field = substr($key, 0, $pos);
                }
                return ($data[$key] == $this-&gt;data[$this-&gt;name][$field]);
        }</pre>

Cluster de correo escalable con software libre









  • english
  • spanish

At my previous job I was responsible for the MTA of a group of companies, handling around 3000 e-mail accounts spread over 20 domains. This MTA received around 150,000 mails daily, and over 95% of them was discarded/marked because it was identified as SPAM or viruses (as of last year, don’t know how this evolved since I left). We used a homegrown cluster of seven servers, which enabled us to scale as needed. And it was based on free software.

This is not an step-by-step installation guide with technical details and configuration files, but rather the story of the evolution of the service, the various problems that we faced, how we solved them, and the design decisions in each case.

Migration

The first incarnation of the server was in 2001 when we had to migrate the old server, which was starting to give lots of trouble, to more current software and hardware. I seem to remember it was a mail server from Netscape (!?) that stored the account information in an LDAP directory, but can’t recall the exact name or version of the product. The server we chosed for the migration was qmail-ldap, mainly because of the good reviews we read about its stability, reliability and security, ease of setup (personally I still think qmail is much simpler than eg sendmail) and because it also used an LDAP directory. The latter may seem a silly reason, but in the end the migration had to be done in extremis at a time that the original server wouldn’t even boot most of the times, and we got away with it with a simple ldapsearch and a little script that “translated” the LDAP scheme of one server into that of the other one. Over time the choice of qmail-ldap proved to be the right one, because thanks to its modular design it allowed us to progressively move from a one server deploy to the cluster that I refered about in the introduction.

This first server was a rack-mounted one, with redundant power supplies and hw RAID5, so that all the data was secure (or so we thought back then). We also rolled qmail-scanner and the Kaspersky anti-virus (there was no ClamAV yet, we moved to it some years later). The same server held the SMTP, POP, IMAP and WebMail (SquirrelMail) services.

Active/Passive backup

We had to do the first architectural upgrade a couple of months after the migration: a RAID5 hiccup lead to a corrupted filesystem which was quite difficult to fix. It became clear that the RAID discs and the redundant power supplies were not enough to ensure the data integrity and service availability, so we installed another server exactly like the first one, and synchronized the configuration and mailboxes using rsync and cron jobs. The switching from the primary to the backup server was manual back then, using NAT at the router.

Over time the server was upgraded to new models several times, but we kept the active/passive backup structure. The syncronization between both servers was also improved, with DRBD for the mailboxes and csync2 for the configuration, AV bases, and so on. Master-backup monitoring and service switch was automatized with heartbeat.

The SPAM flood, specialization by resources

Sometime around 2002-2003 viruses ceased beeing e-mail’s biggest problem: the increasing number of SPAM messages received every day was way worse. So we threw SpamAssassin into the mix. Over time this lead to an ever-increasing CPU and memory consumption, slowing the server to a crawl. At first it seemed that the only option was to migrate every year to a new, more powerful server (and what would we do with the old one then?), or have multiple servers and distribute all the domains among them in an attempt to distribute the load.

Finally we realized that we had two different kinds of resource needs, with different growth patterns:

  • HD space for the mailboxes: the number of mailboxes in our system was fairly stable and the vast majority of our users downloaded their e-mails using POP, so HD scalability wasn’t really that big of a problem for us. We could easily afford to upgrade disk every few years, moving the service to the backup server while we were upgrading the master one.
  • CPU for the filtering: SPAM was growing at an exponential rate, we basically needed to double the CPU power each year.

So, why not specialize our servers into storage servers and a filtering farm? We moved the SMTP service from the main servers to a front-line of SMTP servers with the follwing characteristics:

  • they were off-the-shelf PCs and their configuration was practically identical (no variations appart from hostnames and IP addressess). We prepared a system image we could easily dump in a matter of minutes to a new PC, in case one of the servers went down or we needed more raw CPU power because of an increase in SPAM.
  • we had a router load-balaincing port 25 among all these servers.
  • all these SMTP servers were independent from the central ones, except for the final step of delivering the already analized mail to its destination mailbox: each server had a local copy of the LDAP directory (synchronized with slurpd), a copy of all the configuration files and all the AV bases and the SpamAssassin bayesian database (synchronized with csync2), and a DNS resolver/cache (dnscache).
  • they did local logs, but also sent them to a centralized syslog server for easier analysis.
  • they didn’t store the mails locally for later delivery, in other words they had no delivery queue: e-mails were analyzed on the fly during the SMTP session and if one of them met certain anti-SPAM/AV criteria (blacklisted IP, a number of RBL hits, certain keywords, etc.) it was immediatelly rejected with an SMTP error and the connection was closed; on the other hand if the mail was let through (it was either legitimate, or marked as possible SPAM), it was sent to the central server on the spot, and the filtering server never gave the OK to the origin MTA until the mailboxes server acknowledged the delivery. This is done quite simply with qmail by means of replacing the qmail-queue binary with the qmail-qmqpc one. By doing this we were able to guarantee that no mail would be lost in the event that a filtering server crashed, as the origin MTA wouldn’t receive the OK from us and would re-try the delivery after a couple of minutes.

Mailboxes, the POP and IMAP services, the LDAP master, webmail, and the remote queue remained in the central server, although most of them could have been moved to independent servers if needed, but we never needed to.

Specialization by type of client

The next problem we faced came about 2-3 years ago when image- and PDF-based SPAM became popular: we added an SpamAssassin plugin which re-composed animated GIF images and did OCR to all image attachments. This extra analysis greatly increased our CPU needs (we had to go from 2 or 3 filtering servers to 5 in a couple of days) and even so there were times when a server got overloaded for some 5-10 minutes and an e-mail could take not less than 2 minutes to be processed, delivered and SMTP-OK’d. When this happened and the sending party was another MTA it represented no bigger issue, as in the event of a timeout or disconnection the remote server would re-try the delivery several times; however, if the sender was an end-user with his MUA, a longer-than-usual delivery time or (God forbid) an error message from Outlook because of an eventual dropped connection lead to a phone call to the IT team because “the mail wouldn’t work.” :-)

The solution was splitting the SMTP and analysis farm into two: one for external mail and another for internal ones, for our users. The first farm is the one the DNS’ MX records pointed to, and had all the SPAM filtering options activated; while the second one retained the domain name end users used as the SMTP server in their MUAs, had all the heavy-weight lifting filters disabled and required SMTP authentication (wouldn’t accept non-authenticated sesions even for local domains). This way all external e-mail coming from remote MTAs would go through all the filters, and our users went to the privileged servers with somewhat lesser filering capabilities (but enough for internal mail) and great response times.

The big picture

mpstat









  • english
  • spanish

One of the most basic variables to monitor on a system is CPU usage, something we usually do interactively with top or with vmstat when we want to script the process.

But there’s a problem with multi-CPU systems, with either physical or virtual CPUs, because in these cases vmstat shows the average usage across all CPUs. Depending on the software or hardware architecture, a CPU at 100% can become a bottleneck and produce process blockings ever if the other CPUs are completely idle, while vmstat would show us just a 50% CPU usage. This is quite typical with single CPU systems with HyperThreading technology.

An alternative to vmstat for these situations is mpstat, from the sysstat package, that shows the individual per-CPU usage rates. Very useful when writing a script for displaying graphs with Cacti or raising alarms with Nagios.

# mpstat -P ALL
Linux 2.6.9-023stab046.2-enterprise (domain.com) 	25/09/08
20:39:02     CPU   %user   %nice %system %iowait    %irq   %soft   %idle    intr/s
20:39:02     all    0,79    0,00    0,17    5,74    0,00    0,00   93,30      0,00
20:39:02       0    0,87    0,01    0,19    6,85    0,00    0,00   92,09      0,00
20:39:02       1    0,87    0,00    0,18    5,94    0,00    0,00   93,00      0,00
20:39:02       2    0,74    0,00    0,16    5,14    0,00    0,00   93,96      0,00
20:39:02       3    0,68    0,00    0,16    5,02    0,00    0,00   94,15      0,00

iostat & iotop: I/O debugging









  • english
  • spanish

A couple of months ago, we had an interesting issue at a customer: an application wasn’t performing well, but the system had more than 20% CPU idle and wasn’t swapping memory, so it wasn’t a lack of resources. After a deeper look into vmstat, we saw a constant 30% of CPU in I/O state. We had some kind of I/O bottleneck.

To discover the root of the issue we used two programs:

  • iostat (comes with the sysstat package): similar to vmstat or ifstat, but shows I/O operations per device and partition, updating its output every X seconds.
  • iotop: like the classic top, sorting the processes according to their I/O rate.

By using these two utilities it’s quite easy to discover which process is creating the I/O bottleneck, and on which particular device.

In our case, the problem was a RAID controller that was giving a terrible writing performance, coupled with a process that was doing around 15 small, random access writes per second.