2011-05-31

Maintainability, Security vs Perfectly Deployed

I was part of an interesting conversation today. Specifically, a client at the day job wanted some feedback on their proposed push to the cloud. The plan was architected by their main developer, but the client wanted our input since we manage their existing servers: an eclectic set of differently configured and versioned FreeBSD boxes, plus the CentOS 5 web server we had migrated their main web presence (and store) to a few years back.

The developer's server layout was simple enough, and should cause no problems: using GoGrid's cloud offering, a set of web servers and app servers behind the included F5 load balancer (it comes with the cloud), with a MySQL cluster on the back end. Now, I'm sure it will start as just a single server of each type, if even that, but it's a fairly vanilla scalable cloud architecture; no surprises there. The problem is, of course, in the details. It even starts innocuously enough: the developer wants to run Debian on the systems.

Now, I don't have anything against Debian. It's a fine distribution, one of the big few, in fact. My suggestion, though, was to use a corporate-backed distribution in lieu of Debian. My reasoning goes like so: having market pressure to fix problems is a wonderful incentive to get important things fixed FAST. In this respect, for an enterprise deployment, I view Ubuntu LTS as the more fitting choice if a Debian-like distribution is desired. When running a business, it's important to know that the company responsible for your server software is tied financially to providing the support you need.

That's simple enough, and hardly a point to get caught up on for anyone not trying to take a hardline stance on software freedom (a stance that would be hard to defend for a company that, like the client, develops and sells software, and that wasn't their intention anyway). The real problem comes next. The developer wants to install a minimal Debian root and manually compile all needed software into /srv along with the content. His reasons are simple, but in my opinion, very, very wrong.

I'll outline them here, and explain in detail why, from an administration and maintenance standpoint, these are the wrong choices.

  • He wants the extra performance that a lean Apache, with only the required modules compiled in directly, allows

  • He doesn't want to use a package manager, as it makes everything too hard to deal with; he would prefer a few hand-written shell scripts to do the building

  • He wants more control over security updates, applying them and reloading services only when the security problem actually affects them



I'm not going to spend much time on the specific incident that spurred this. Based on the conversation, I think this developer just needs some education as to what features are available. I'm much more interested in addressing the idea as a whole: that a customized environment is better than what you get from the enterprise distributions. As such, I want to focus on the generalities of these statements, and why they either don't hold true or need to be weighed against both their positive and negative consequences.

Note: This is all aimed at small to medium businesses, where a large staff of system administrators is neither a requirement nor desired. At a certain scale (or in certain markets), it DOES make sense to start doing your own system engineering to get a competitive edge. If you are one of those companies, this doesn't apply to you. If you don't know whether you are one of those companies, then you either aren't one, or you aren't in a position to care.


Let's start by addressing the idea that compiling a custom version of Apache with components and features excluded will result in extra performance. First, this is true: my experience, common sense, and the developers themselves say there are compilation options that will produce a faster program. My point is that it's generally not worth it. If you are building a system and you expect load to increase, scale out, don't scale up. That is, design the system so extra nodes can be added easily; don't over-optimize each node yet. All optimization, especially early optimization, limits your future choices in some respect (or at a minimum makes certain choices harder than they would otherwise be) while enhancing the feasibility of others. That isn't conducive to finding the best solutions later.
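
For context, the kind of hand-rolled build being proposed looks something like this (the version, flags, and paths here are illustrative, not his actual script):

    # Hypothetical lean Apache build into /srv -- illustrative only
    cd /usr/local/src/httpd-2.2.19
    ./configure --prefix=/srv/httpd \
                --with-mpm=worker \
                --enable-modules='rewrite ssl'   # only the modules actually needed
    make && make install
    # Every security update now means re-running this by hand on every node.

It looks harmless as a one-off, but every node, and every update, inherits those manual steps.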

The next item is about package management. The claim is that it makes everything too hard to deal with (the actual words used included "nightmare"). I can only attribute this to a lack of experience with package management systems. The RPM format, for instance, is quite flexible, and allows for everything from compiled binaries, scripts, and documentation to an empty package that just acts as a node to pull in all the required packages for a complete subsystem of functionality. In an RPM spec file, you specify the requirements, the build process (how to unpack, patch, build, and deploy), any pre- or post-install procedures, and a list of all installed files with their locations and types. This last bit is really helpful, as it lets the package manager know which files are configuration files and how to treat them (back up the existing file and overwrite it, place the new file next to the existing one under a new extension, etc.). With this you get complete package file listings, file integrity and permission checks, easy installation and removal, dependency tracking, and a complete separation of your runtime environment from your content, all in a manageable way.
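
To make that concrete, here's a skeletal spec file; every name and path in it is made up for illustration:

    # example-app.spec -- minimal sketch, hypothetical names throughout
    Name:           example-app
    Version:        1.0
    Release:        1%{?dist}
    Summary:        Illustrative application package
    License:        Proprietary
    Source0:        example-app-1.0.tar.gz
    Requires:       httpd                  # dependency tracking

    %description
    Shows the moving parts of a spec file.

    %prep
    %setup -q                              # unpack the source (patches apply here too)

    %build
    ./configure --prefix=/usr
    make %{?_smp_mflags}

    %install
    make install DESTDIR=%{buildroot}

    %post
    /sbin/service example-app condrestart >/dev/null 2>&1 || :

    %files
    /usr/bin/example-app
    %config(noreplace) /etc/example-app.conf

The %config(noreplace) flag is what drives the "place the new file next to the existing one" behavior (an updated default lands as a .rpmnew file if you've edited yours); plain %config instead backs your edited copy up as .rpmsave and overwrites it.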

Finally, there is apparently a desire by the developer to manage all security updates manually, so that security updates don't negatively impact the production environment. This shows the biggest misunderstanding of what an enterprise distribution really provides, which is reliability. There are a few things to address here, so I'll spread it out over a few areas: security management, testing, and finally division of roles.

Something it seems many people don't understand is that enterprise distributions back-port patches to the versions of programs they shipped, precisely so your environment doesn't have to change. That is, if RHEL 5 shipped with Apache 2.2.3, chances are it will stay version 2.2.3. Specifically, Red Hat will handle any security problems, back-porting the fix from the version it first appeared in to the version they shipped. This allows for a stable environment, where all you have to worry about is that bugs and security holes are dealt with, while all other functionality stays the same. API changes in newer versions? Removed features? New bugs? Not your problem. Note: in RHEL systems, some packages may be considered for new versions at point releases, generally every six months, but you can rely on those updates not to reduce functionality and to be compatible with the prior version in every aspect you may depend on. Additionally, instead of upgrading the version, they may choose to back-port an entire feature into the older one. These changes all go through extensive QA, and are deployed to thousands of systems, which happens to be the next point. In the end, what this really means is that you get security updates without having to worry unduly about whether they will cause a problem in your environment.
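
You can see this back-porting in action on any RHEL or CentOS box by asking RPM for a package's changelog; the output below is paraphrased from memory, but the pattern is real:

    # Security fixes appear as changelog entries, not as version bumps
    rpm -q --changelog httpd | grep CVE | head -2
    #   - fix denial of service in mod_deflate (CVE-2009-1891)
    #   - fix mod_proxy_http interim response handling (CVE-2008-2364)
    rpm -q httpd
    #   httpd-2.2.3-43.el5    <- still the 2.2.3 the release shipped with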

Testing is important. By rolling your own environment, you are saying that you believe you can integrate all the components and test them thoroughly enough to be confident the environment will perform as needed without problems. Most developers get this part right, as they know their core needs better than anyone else. The problem is with all the little changes. You need to re-certify the environment after any change. How many developers take the time to do this? It's drudge work, it's hard to find all the edge cases (much less test them), and most of the time, it just works. Or appears to, at least. One of the single biggest advantages of using a pre-packaged binary with wide distribution is that any problem is likely to have been encountered before you hit it, and may well have been documented and fixed already. You are essentially leveraging the QA, testing, and pre-existing deployments of both the distributor (such as Red Hat or Ubuntu) and the thousands of companies that rely on them.

Finally, there's the division of roles. Should a developer really be responsible for tracking down, examining, applying, and testing security updates? These should be fire-and-forget procedures that cause no worry, and nothing but a slight blip on the meters as services are restarted as needed. A developer should be building new products or supporting old ones, not performing the role of a sysadmin because they decided they needed a little more control. That just means the company isn't getting its money's worth from the developer, as that's all wasted time.
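
On the RHEL/CentOS side, this is about as close to fire-and-forget as patching gets. A sketch, assuming the RHEL 5-era plugin (named yum-security there, yum-plugin-security in later releases; note that stock CentOS repositories don't always carry the security metadata this relies on):

    # Install the security plugin, then apply only security errata
    yum install yum-security
    yum list-security            # list pending security updates
    yum update --security        # apply just those, and nothing else

Run from cron, plus a restart of affected services, that covers the "only when it affects them" goal without a developer in the loop.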

In the end, this all boils down to a trade-off: giving up some flexibility in exchange for ease of maintenance and security. In most cases, you don't want to skimp on maintainability, and especially not on security; those can become real liabilities for a company later. Becoming a little less flexible but much more stable and secure is usually the easier path. And who's to say there's no middle ground here? Some packages may very well be worth running as hand-compiled, cutting-edge versions, but that doesn't necessarily mean EVERYTHING must be.

1 comment:

  1. I'm surprised that the sysadmin who wanted to do the roll-your-own compile scripts never considered Gentoo Linux (especially given the history of FreeBSD in that environment). Gentoo has a fairly robust package manager combined with the ability to customize features and compile for a specific platform (and it works a lot like the FreeBSD ports system).

    That being said, and as was said here, there is something to be said for the consistency and back-porting of security fixes that a distro like Red Hat Enterprise Linux (RHEL) would provide.

    Either way, having used both Red Hat and Gentoo, I would go with either of these two solutions rather than trying to do all the porting, compiling, and maintenance of software by myself. It sounds easy, and in some circumstances it is, but it is far better to become active in the maintenance of a distribution like Gentoo (for example) and gain the expertise of hundreds (if not thousands) of fellow system admins than to try to handle it all by yourself.
