Christina arranged for us to meet with John, a great Valley entrepreneur who co-owns SV Colo and several other ventures, such as Tellme. He knows hosting. I was fortunate enough to get to pick his brain for an hour. Here’s what I learned:
The challenge in a data center is cooling. Google and others have started building datacenters in Oregon by the ocean, where real estate is cheaper and you can use the cool water from the Pacific Ocean to cool servers. The capacity to cool is measured in Watts per square foot, and the watts are simply measured by how much power the boxes consume. Sure, some of that power ends up moving the fans and producing some noise, and machines may not run on full power all the time, but it’s a pretty solid measure.
What that means is that you can’t pack your data center very densely. In particular, 1U units are not worth it, because you can’t fill up a cabinet with 1U’s anyway: they make too much heat. A typical cooling capacity is in the 100-140 Watts per square foot range. You do the math. And beware of hosting companies that do put too many units into their space. They might end up letting your machine overheat. Bad.
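Doing that math might look something like the sketch below. Only the 100-140 Watts per square foot range comes from John; the floor area charged to each cabinet, the power draw of a 1U box, and the 42-slot cabinet height are my own assumptions, so plug in your real numbers.

```python
def max_servers_per_cabinet(watts_per_sqft, sqft_per_cabinet, watts_per_server):
    """Number of servers a cabinet's share of the cooling budget can support."""
    budget_watts = watts_per_sqft * sqft_per_cabinet
    return int(budget_watts // watts_per_server)


# From the conversation: typical cooling capacity range.
COOLING_RANGE = (100, 140)   # Watts per square foot

# Assumptions for illustration only (not from John):
SQFT_PER_CABINET = 25        # floor area per cabinet, including aisle share
WATTS_PER_1U_SERVER = 250    # power draw of one 1U box
SLOTS_PER_CABINET = 42       # height of a full-size cabinet in rack units

if __name__ == "__main__":
    for density in COOLING_RANGE:
        servers = max_servers_per_cabinet(
            density, SQFT_PER_CABINET, WATTS_PER_1U_SERVER
        )
        print(
            f"{density} W/sq ft: room to cool {servers} "
            f"of {SLOTS_PER_CABINET} slots"
        )
```

With these assumed numbers, the cooling budget runs out long before the cabinet is full, which is the point about 1U’s.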
How you get the air moving through the racks is another challenge. Increasingly, people are using racks that pour air in from the bottom and suck it out from the top, which keeps the air moving nicely. The next step up might be some form of liquid, either water being led directly to the CPU, or something else that boils at a low temperature. But that’s not quite relevant right now.
I also learned to watch out for who the ISP is that provides the bandwidth, and make sure that they’re reliable and deal promptly with spammers so you don’t risk getting black-listed because of some other idiot on the same IP block as yours. And we should avoid Cogent like the plague. They’re the low-cost leader but unreliable. And I learned a term, “well-lit,” meaning there’s a lot of glass fiber in the ground. Silicon Valley is quite well-lit :)
If you want to get really good uptime, forget about one data center, go for two completely separate datacenters, with DNS that you can control outside of this, so you can switch from one to the other by changing the DNS. Trying to get five nines in one data center isn’t cost-effective.
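To get a feel for why two data centers beat chasing five nines in one, here is a rough back-of-the-envelope sketch. It assumes the two sites fail independently and that switching over is instant, which a DNS change is not (you wait for the TTL to expire), so treat the combined figure as an upper bound. The 99.9% figure per site is just an illustrative assumption.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability):
    """Expected minutes of downtime per year at a given availability (0..1)."""
    return (1 - availability) * MINUTES_PER_YEAR


def combined_availability(site_a, site_b):
    """Availability when either site can serve traffic.

    Assumes independent failures and instant failover, so the service
    is only down when both sites are down at the same time.
    """
    return 1 - (1 - site_a) * (1 - site_b)


if __name__ == "__main__":
    five_nines = 0.99999
    print(f"five nines: {downtime_minutes_per_year(five_nines):.2f} min/year")

    three_nines = 0.999  # assumed availability of one ordinary data center
    both = combined_availability(three_nines, three_nines)
    print(f"one site at 99.9%: {downtime_minutes_per_year(three_nines):.0f} min/year")
    print(f"two such sites:    {downtime_minutes_per_year(both):.2f} min/year")
```

Five nines leaves you a bit over five minutes of downtime a year, and two fairly ordinary sites can in principle do better than that, provided the failover itself is quick.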
On firewalls, John’s recommendation was that we forget the single firewall in front, because it’s a single point of failure, and instead firewall each machine aggressively. Specifically, that means running iptables (Red Hat, Debian, Fedora, etc.) or ipfw via rc.firewall (FreeBSD).
For load balancing, you can go with a dedicated box (a Foundry one will run you $5K, but try getting a used one on eBay). For redundancy, if each individual app server is available on the public internet with its own IP and DNS name, then if the load balancer should blow out, you can, as a backup, go in and change the DNS of the main host to point to one of the app servers (it would be a CNAME), and have that server configured to do load balancing instead, while you rush out and buy a new load balancer. That’s the advantage of firewalling the hosts and having them publicly available. Neat, huh?
Another thing that allows for is what’s called “direct server return”, which basically means that the request comes through the load balancer, but the hosts send the response directly to the client, rather than going back through the load balancer. I’m going to have to look more into the implications and configuration of that, but it seems like it would lessen the burden on the load balancer a bit.
Databases are quite a bit more tricky, but one thing you can do is set up master/slave replication, and then if the master gets lost, the application can be switched to run in read-only mode against the slave. That way people can still get to their data, but they can’t add or update anything. Not sure if it’s worth adding the extra overhead and complexity to the app for this, but it’s an option.
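Here is a minimal sketch of what that read-only fallback could look like in the app. Everything in it is hypothetical: `Database`, `FakeServer` and the two exception classes are names I made up for illustration, and a real app would wrap actual database connections instead.

```python
class ServerUnavailable(Exception):
    """Raised by a connection when its server cannot be reached."""


class ReadOnlyError(Exception):
    """Raised when a write is attempted while in read-only mode."""


class FakeServer:
    """Stand-in for a real database connection (hypothetical, demo only)."""

    def __init__(self, up=True):
        self.up = up

    def execute(self, sql):
        if not self.up:
            raise ServerUnavailable("server is down")
        return "ok: " + sql


class Database:
    """Sends queries to the master and degrades to read-only if it is lost."""

    def __init__(self, master, slave):
        self.master = master
        self.slave = slave
        self.read_only = False

    def read(self, sql):
        if not self.read_only:
            try:
                return self.master.execute(sql)
            except ServerUnavailable:
                self.read_only = True  # master lost: fall back to the slave
        return self.slave.execute(sql)

    def write(self, sql):
        if not self.read_only:
            try:
                return self.master.execute(sql)
            except ServerUnavailable:
                self.read_only = True  # master lost: refuse writes from now on
        raise ReadOnlyError("master is unavailable; writes are disabled")


if __name__ == "__main__":
    db = Database(master=FakeServer(up=False), slave=FakeServer(up=True))
    print(db.read("SELECT name FROM users"))  # served by the slave
    try:
        db.write("UPDATE users SET name = 'x'")
    except ReadOnlyError as err:
        print("write refused:", err)
```

Note that this sketch never switches back on its own. Once the master returns, something (probably a human) has to reset `read_only`, which is part of the extra complexity mentioned above.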
One more thing: Don’t run NFS. Why? Because if one server gets hacked or goes down, it can hose the whole cluster. A better model is to have a server that’s the primary file server, and then use rsync to keep things in sync. Really simple and it works. Anecdotally, John related that in his experience, NFS and LDAP were the two things that were most often responsible for taking a whole cluster down. Also, you can’t run NFS between two data centers easily, and it requires rpcbind, which makes it harder to secure the server. Without NFS, you can pretty much close down all UDP traffic and only allow TCP.
That’s about it. I also got to chat with Tom from EngineYard at the STIRR mixer in Palo Alto on Wednesday. These guys definitely have their s**t together. They were using Global File System for shared files, which is also very cool, but requires special hardware. And Tom described how they had already thought about the two-data center solution, and had some ideas for how to manage the MySQL installation, so you could do active failover to another master. Or they’d rely on Oracle for that. It seems to me like a problem where the cost of solving it outweighs the cost of downtime should it happen weighted by the risk that it does happen. But I haven’t done the math on that one.
I hope you can use all of this for something. I’ve definitely learned a bundle from talking to these smart people. I have to say I find hosting interesting and fun. Developing apps and interfaces is really nice, but when you get it out there in the world, you have to take all of these things into consideration. And seeing the racks of some of the companies that they host at SV Colo was enlightening. You can tell that some have grown really quickly without much planning, and then next to it, you can see their new rack, all nice and clean with cables running neatly between the boxes. Tells you something about the companies.