At the company I work for, we manage the digital editions of several local newspapers spread all over Spain. None of them is big on a nationwide scale, but almost all of them are leaders in their region.
For quite some time, we’ve had performance problems with one of them: measured from our office, performance was good (<5s load times), but the users in the region where that particular newspaper is distributed kept complaining about poor performance (>40s load times, unbelievably high). The more we optimized our server and network infrastructure, the HTML layout, CSS, code… the more they complained and the more obvious it became that there was something else going on.
After some investigation we discovered that the routing between the major ISP of that region, which almost all of our readers used, and our own ISP was the cause of the problem: a traceroute from a local DSL line there to our servers showed that the traffic went to Germany before coming back to Spain, with very high round-trip times.
So it wasn’t our fault, and the real solution to the real problem was out of our reach. But in the end our image was at stake, so it was OUR problem. What could we do?
After some inspiration the solution became clear: rent housing (colocation) at the local ISP which had the problems, set up a reverse proxy there, and redirect all clients of that ISP to this proxy. Sure, the connection between the proxy and our servers would be as bad as before, but as the content would be cached and refreshed in the background, the final user shouldn’t notice it any more!
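To make the caching idea concrete, here is a minimal Python sketch of what a caching proxy does for us. It is an illustration only, not how squid is implemented: the `CollapsingCache` class, the `slow_origin` function and the TTL value are all made up for the example. It shows two behaviours: fresh content is served without touching the slow link, and several concurrent requests for the same URI are collapsed into a single trip to the origin (the idea behind the `collapsed_forwarding` option that appears in the squid configuration in this post). It does not model background refresh.

```python
import threading
import time
from typing import Callable, Dict, List, Tuple


class CollapsingCache:
    """Toy in-memory cache: TTL-based freshness plus request collapsing."""

    def __init__(self, fetch: Callable[[str], bytes], ttl: float = 300.0) -> None:
        self._fetch = fetch  # hypothetical slow call to the origin server
        self._ttl = ttl
        self._lock = threading.Lock()
        self._store: Dict[str, Tuple[float, bytes]] = {}  # uri -> (expiry, body)
        self._inflight: Dict[str, threading.Event] = {}   # uri -> "fetch finished"

    def get(self, uri: str) -> bytes:
        while True:
            with self._lock:
                entry = self._store.get(uri)
                if entry and entry[0] > time.monotonic():
                    return entry[1]  # fresh hit: the origin is never contacted
                waiter = self._inflight.get(uri)
                if waiter is None:
                    # First miss for this URI: this caller becomes the fetcher.
                    done = threading.Event()
                    self._inflight[uri] = done
                    break
            # Another caller is already fetching this URI: wait, then re-check.
            waiter.wait()
        try:
            body = self._fetch(uri)
            with self._lock:
                self._store[uri] = (time.monotonic() + self._ttl, body)
            return body
        finally:
            with self._lock:
                del self._inflight[uri]
            done.set()


# Demo: ten simultaneous readers, one slow origin.
origin_calls: List[str] = []


def slow_origin(uri: str) -> bytes:
    origin_calls.append(uri)
    time.sleep(0.2)  # stands in for the badly routed link
    return b"content of " + uri.encode()


cache = CollapsingCache(slow_origin, ttl=60.0)
readers = [threading.Thread(target=cache.get, args=("/index.html",)) for _ in range(10)]
for t in readers:
    t.start()
for t in readers:
    t.join()
print(len(origin_calls))  # 1: ten readers, a single trip to the origin
```

Only the first reader pays the price of the slow link; everybody else either waits for that one fetch or gets the cached copy.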
There are just two pieces of software involved here:
- squid, the most used proxy in the Linux/UNIX world.
- djbdns, our DNS server of choice. Among other things, it has the ability to return different IP addresses to an A query depending on the IP address of the client.
squid
squid is quite easy to set up as a reverse proxy. After installing it (“apt-get install squid” on our Debian-based server) edit the main config file at /etc/squid/squid.conf and:
    # Listen on port 80 and use the Host header to tell sites apart
    http_port 80 vhost
    # Treat several concurrent queries for the same URI as one,
    # reduces bandwidth and in our case improves performance
    collapsed_forwarding on
    # Define which domains we are going to serve
    acl myDomains dstdomain www.example.com isp2.example.com
    # Refuse anything else
    http_access deny !myDomains

Obviously there's much more to configuring squid than this. These are just the basic options to get our solution going and do some preliminary tests. Then there are memory limits, object-cache management, cache-expiry management (which you had better handle in your application code anyway), peer caches, and much more. Get a good Squid HOW-TO or book if you want to learn how to tweak it for optimum performance.

djbdns

Now the tricky part: directing some users to our servers and others to the proxy. Luckily, the DNS server we use (djbdns) has a built-in option to do this. What we've done is define two names, isp1.example.com pointing to our IP and isp2.example.com pointing to the proxy, and then a CNAME which points to one or the other depending on the client's IP, much like Akamai does. This way we can also easily access each server individually.

    # Clients of the ISP with the routing problem (IP prefix XXX.YYY)
    %PX:XXX.YYY
    # All the rest
    %RS:
    # A records for our server and the proxy
    =isp1.example.com:A.B.C.D:300
    =isp2.example.com:Z.Y.X.W:300
    # Pivoting CNAME depending on the client's IP
    Cwww.example.com:isp2.example.com.:300::PX
    Cwww.example.com:isp1.example.com.:300::RS

Of course, following this scheme we could add as many proxies on as many other ISPs as we wanted, creating an Akamai-like CDN (Content Delivery Network). You get the picture. For more info on djbdns data syntax, please check: http://cr.yp.to/djbdns/tinydns-data.html
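To see how the client-dependent CNAME plays out, here is a small Python simulation of the selection logic. It is only a sketch of my reading of the tinydns-data format (a client belongs to the location with the longest matching IP prefix, and a record tagged with a location is only visible to clients in it); it is not tinydns code. The addresses are made-up stand-ins: 10.20 for XXX.YYY, 192.0.2.10 for A.B.C.D and 198.51.100.20 for Z.Y.X.W.

```python
from typing import Dict, List, Optional, Tuple

# Same shape as the tinydns data in this post, with hypothetical addresses.
DATA = """\
%PX:10.20
%RS:
=isp1.example.com:192.0.2.10:300
=isp2.example.com:198.51.100.20:300
Cwww.example.com:isp2.example.com.:300::PX
Cwww.example.com:isp1.example.com.:300::RS
"""

Parsed = Tuple[List[Tuple[List[str], str]], Dict[Tuple[str, str], str], Dict[str, str]]


def parse(data: str) -> Parsed:
    locations: List[Tuple[List[str], str]] = []  # (prefix octets, location code)
    cnames: Dict[Tuple[str, str], str] = {}      # (name, location) -> target
    addresses: Dict[str, str] = {}               # name -> IP address
    for line in data.splitlines():
        if not line or line.startswith("#"):
            continue
        kind, fields = line[0], line[1:].split(":")
        if kind == "%":    # %lo:ipprefix
            prefix = fields[1] if len(fields) > 1 else ""
            locations.append(([o for o in prefix.split(".") if o], fields[0]))
        elif kind == "=":  # =fqdn:ip:ttl
            addresses[fields[0]] = fields[1]
        elif kind == "C":  # Cfqdn:target:ttl:timestamp:lo
            loc = fields[4] if len(fields) > 4 else ""
            cnames[(fields[0], loc)] = fields[1].rstrip(".")
    return locations, cnames, addresses


def locate(client_ip: str, locations: List[Tuple[List[str], str]]) -> str:
    """Return the location whose prefix matches the most leading octets."""
    octets = client_ip.split(".")
    best: Optional[Tuple[List[str], str]] = None
    for prefix, loc in locations:
        if octets[: len(prefix)] == prefix:
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, loc)
    return best[1] if best else ""


def resolve(name: str, client_ip: str, data: str = DATA) -> Tuple[str, Optional[str]]:
    """Follow the CNAME visible from the client's location, then its A record."""
    locations, cnames, addresses = parse(data)
    loc = locate(client_ip, locations)
    target = cnames.get((name, loc)) or cnames.get((name, ""), name)
    return target, addresses.get(target)


print(resolve("www.example.com", "10.20.30.40"))  # client inside the problem ISP
print(resolve("www.example.com", "203.0.113.7"))  # any other client
```

With this data, the first call ends up at isp2.example.com (the proxy) and the second at isp1.example.com (our own server), which is exactly the split we are after.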
These are just the basics of what Akamai offers, but it’s a cool idea. Thanks for posting!
Yes, Akamai offers much, much more than this. But in some cases you may need nothing more than to bring your content closer to just one or two ISPs (or branch offices, etc.). This is an easy, cheap and maintainable way to do it.
And with some re-working and better caching this could also be used as a high-availability solution for static web sites: keep the cache updated and serve all the content off it in case the main site goes down.
Thanks for the comment.
I like Akamai, much more than CDN which is what we have used before. Nice post, good to hear your experiences.
If I have a dynamic website backed by a database, can I use this tutorial? Or could you send me the topology you are using for this case? Thanks a lot.
Yes you can, but you’ll have to be very careful with what you cache and what you don’t, fine-tuning it either in squid’s config or in your app using the HTTP cache-control headers. Bear in mind that, by default, squid will NOT cache any dynamic content (your case) unless told to by the HTTP headers, so you’ll only be caching the static files of your site (images, CSS, JavaScript, etc.). But if most of your pages, albeit dynamically generated, won’t change for hours, you can set up an aggressive caching strategy and take a lot of load off your web server.
For a great tutorial about HTTP cache control, take a look at:
http://www.mnot.net/cache_docs/
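As a rough illustration of doing this on the app side, here is a tiny Python WSGI application that labels its dynamic pages as cacheable. It is just a sketch: the /account path, the lifetimes and the function names are invented for the example, so pick values that match how often your pages really change. `max-age` is the lifetime for browsers, `s-maxage` the one for shared caches such as our squid proxy, and `private, no-store` keeps per-user pages out of any shared cache.

```python
from typing import Callable, Dict, Iterable, List, Tuple

StartResponse = Callable[[str, List[Tuple[str, str]]], object]


def cache_policy(path: str) -> str:
    """Pick a Cache-Control value for a request path (hypothetical rules)."""
    if path.startswith("/account"):
        # Per-user pages must never be stored by a shared cache.
        return "private, no-store"
    # Generated pages that stay valid for a while: browsers keep them for
    # one minute, shared caches (the reverse proxy) for one hour.
    return "public, max-age=60, s-maxage=3600"


def app(environ: Dict[str, str], start_response: StartResponse) -> Iterable[bytes]:
    path = environ.get("PATH_INFO", "/")
    body = ("<html><body>Generated page for %s</body></html>" % path).encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        ("Cache-Control", cache_policy(path)),
    ]
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Serve on localhost:8000 and inspect the headers with: curl -I http://localhost:8000/
    make_server("127.0.0.1", 8000, app).serve_forever()
```

The point of keeping the policy in one small function is that the caching rules live next to the code that knows how fresh each page needs to be, instead of being scattered across the proxy configuration.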
Thanks a lot for posting this. It helped me fix another problem that I had been trying to find a solution for. Your reference to collapsed_forwarding helped me crack the issue.
Cheers
San
@San: glad you found it useful.