Archive for the ‘Servers’ Category



Patches

Friday, September 25th, 2009

Meet Patches. Keep him healthy and he’ll be with you a long time. Look at that face! Knowing the anxiety Patches suffers when going to the vet, do you religiously take him every time there’s a new medicine on the market, just in case he might catch some exotic bug he has .001% chance of contracting?  No, most likely not.  But you do take him to the vet for regular shots to prevent things dogs of his kind are likely to have problems with – a regular maintenance visit, you might say.


Why is it worth the cost and commotion of going for the maintenance visit, but not every time a new vaccine or pill is announced?  Because the cost/benefit equation is right for one and not the other.

A lot can be learned from Patches about the discipline of patching servers. We are occasionally asked “How often should I patch my servers?” and we get into discussions with a wide variety of customers with widely differing views on the subject. Often, though, we find that it largely boils down to one’s view of the world: is your glass half empty or half full? Certainly, we need to keep systems patched to at least the minimum level supported by our software vendors, but given the cost and commotion (dare I say trauma) of the patching process, how far beyond that is necessary or prudent? If you have Internet-facing assets, then clearly you want to apply the latest security patches to them as soon as they’re available. But if you have private, stable, non-web assets well behind well-managed firewalls, a less rigorous approach is reasonable. There is no need or rational justification to apply a patch willy-nilly simply because it’s available. Who has not been the victim of downtime because an ill-behaved patch did something that it was not supposed to do? And, lest we forget, rebooting a Windows server after patching is not always a trivial event; just ask the sysadmin of a BlackBerry Enterprise Server.

Remember, the purpose of infrastructure is to keep running, the very namesake of this blog.  Our infrastructure does us no good when it’s down. Every patch brings with it some level of risk to uptime. So, the obvious thing to do would be to test every patch before we apply it to a production system. Do you? Really? Every time? Or is it easier to just apply the latest raft of fixes from, say, Microsoft, and hope for the best? For those of us who have to endure the regular water-boarding process of a SAS 70 Type II audit, hope is not a strategy. Not only do we have to test every patch before applying it to a live system, but we also have to prove that we did so, and that we have a defined process that passes muster with the auditors.


This process of patching is costly in terms of time, money, and risk. So how often should we patch? Somewhere between hope and SAS 70 lies the right answer for most of us. Like maintaining your car, regular maintenance of a server is necessary to keep a system “on the road.”  This may mean spending time regularly (like an oil change) researching patches to see which of them you really need as opposed to those that make you feel warm and fuzzy, and then testing appropriately first. Then again, regular maintenance may not imply regular patching. If a system needs to be running the latest Windows server OS, or the application vendor forces your hand, then you will certainly be patching more often. If, on the other hand, you have a functionally stable system that doesn’t change much, has been running well, and isn’t the flagship of your ecommerce empire, then you will probably patch extremely infrequently, if ever, and that’s OK. We’ve got a Red Hat 7.2 system here that sees heavy daily usage, has not been patched in years, and has not been hacked or had any problems over that same span of time. Sacrilege? Perhaps, but we believe it’s prudence. It could also be Pennsylvania Dutch stubbornness.

You do need a patching process, but it should reflect your particular situation and account for the nature of each of your servers. Like the Pirate Code, best practices in this arena are more like guidelines. You can spend a lot of money and create a lot of headaches with a one-size-fits-all approach.  A socialist patching approach sounds good on paper, but as you would expect with anything socialistic, it tends not to work out well in reality.


Weigh the risks and cost of downtime vs. the potential benefit of a patch.  Part of your process should include a justification phase where IT and business stakeholders have an opportunity to understand what is being patched, why it’s been deemed necessary, and what the possible ramifications are if things go awry. And, most importantly, the stakeholders should have both veto power and the power to determine the scheduling of patch activity.
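One way to make that weighing concrete is a back-of-the-envelope expected-cost comparison. The sketch below is only an illustration: the function names and every probability and dollar figure in it are hypothetical placeholders, not measurements from any real system, and a real justification phase would involve far more judgment than two multiplications.

```python
def expected_cost_of_patching(p_bad_patch, outage_cost, labor_cost):
    """Labor to research, test, and apply the patch, plus the
    chance-weighted cost of an outage caused by the patch itself."""
    return labor_cost + p_bad_patch * outage_cost


def expected_cost_of_waiting(p_incident, incident_cost):
    """Chance-weighted cost of an incident the patch would have prevented."""
    return p_incident * incident_cost


def patch_is_justified(p_incident, incident_cost,
                       p_bad_patch, outage_cost, labor_cost):
    """True when leaving the system alone is expected to cost more than patching."""
    waiting = expected_cost_of_waiting(p_incident, incident_cost)
    patching = expected_cost_of_patching(p_bad_patch, outage_cost, labor_cost)
    return waiting > patching


# Hypothetical Internet-facing server: exposure is high, so patching wins.
internet_facing = patch_is_justified(
    p_incident=0.20, incident_cost=100_000,
    p_bad_patch=0.02, outage_cost=20_000, labor_cost=1_500)

# Hypothetical stable internal server behind firewalls: exposure is tiny,
# so the cost and risk of the patch outweigh its benefit.
internal_stable = patch_is_justified(
    p_incident=0.001, incident_cost=100_000,
    p_bad_patch=0.02, outage_cost=20_000, labor_cost=1_500)

print(internet_facing, internal_stable)
```

The same patch comes out justified for one server and not for the other, which is the point: the answer depends on the server, not on the patch.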

Patching is a necessary evil, but it is manageable if you take the time to think through the process and come up with a practical plan that fits your business. Or, you could simply delegate the process to folks who know how to both open and close Pandora’s box.

//spk


Up or Out?

Friday, June 26th, 2009

I recently read a post that took a new twist on the long-running debate over whether it’s better to scale up (buy bigger servers) or scale out (buy more servers).  Traditionally this battle has been fought mostly on technical considerations.  “Which is better for processing the real-time inventory of my growing Dippin’ Dots empire vs. the fast serving of web pages on my trendy social site?”


The conversation is often reduced to raw number crunching power vs. the benefits of highly parallel processing or high availability.  But in this era of sacrifice, we might want to take a look at the oft-overlooked cost factors lurking behind the curtain.  Following the framework of the aforementioned post, let’s consider the costs using IBM hardware.

First, we fire up IBM’s server configuration tool and build a big dreadnought-class server with an x3950 M2:

  • 4 CPU sockets (using 6-core processors)
  • 32 memory sockets
  • 4 drive bays
  • 2 power supplies
  • 4U

Total Price:  $68,429  MSRP.  In Pennsylvania, we throw on another 6% for the Governor, so the number rounds to $72,500.   Sure enough, scaling up has the hefty price tag one would expect.

If instead we were to scale out, what kind of horsepower could we get for the same money?  Taking a trip to the other end of the product line, we find the modest x3250 M2:

  • 1 CPU socket (using a 4 core processor)
  • 4 memory sockets
  • 2 drive bays
  • 1 power supply
  • 1U

Total Price: $2,431 MSRP.  Allowing again for the Governor, we come in at $2,575, which means that for the same $72,500 we could buy 28 of these unassuming smaller servers.

So, if we decide to go shopping at IBM with $72,500 as our budget, what can we get for our money?

                Scaling Up      Scaling Out
CPU cores       24              112
RAM             256GB           224GB
Disk            1.2TB           28TB

It would seem that scaling out puts more resources in our data center for the same money.  Score 1 for scaling out.
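The hardware comparison is easy to reproduce. The sketch below uses the post's rounded, tax-included prices; the per-server RAM and disk figures for the small box (8GB and 1TB) are not stated anywhere above and are back-calculated here from the scale-out totals, so treat them as assumptions.

```python
# Prices as rounded in the post (MSRP plus 6% Pennsylvania sales tax).
BUDGET = 72_500              # cost of one x3950 M2
SMALL_SERVER_PRICE = 2_575   # cost of one x3250 M2

# Resources of the single big server.
scale_up = {"cores": 4 * 6, "ram_gb": 256, "disk_tb": 1.2}

# Resources of ONE small server. RAM and disk are assumptions,
# back-calculated from the post's totals (224GB / 28 and 28TB / 28).
small_server = {"cores": 1 * 4, "ram_gb": 8, "disk_tb": 1.0}

# Whole servers only: integer division discards the leftover budget.
servers = BUDGET // SMALL_SERVER_PRICE

scale_out = {name: amount * servers for name, amount in small_server.items()}

print(f"Small servers for the same money: {servers}")
for name in scale_up:
    print(f"{name:8} up={scale_up[name]:<8} out={scale_out[name]}")
```

Note that the budget does not divide evenly: 28 small servers cost $72,100, leaving $400 unspent.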

Now let’s take a look at things from the software angle:

                        Scaling Up      Scaling Out
Windows 2K8 Server*     $2,515          $20,524
SQL Server              $7,400          $16,800

Our quick mental math says that scaling out costs nearly 4 times as much in software.  Score 1 for scaling up.
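For anyone who would rather not trust mental math, the totals can be checked in a few lines. This is just the post's own quoted prices summed per column; the dictionary names are made up for the sketch.

```python
# Software prices as quoted in the post, in dollars.
scale_up_software = {"windows": 2_515, "sql_server": 7_400}
scale_out_software = {"windows": 20_524, "sql_server": 16_800}

up_total = sum(scale_up_software.values())
out_total = sum(scale_out_software.values())
ratio = out_total / up_total

print(f"Scaling up:  ${up_total:,}")
print(f"Scaling out: ${out_total:,}")
print(f"Scaling out costs {ratio:.2f}x as much in software")
```

The totals come to $9,915 and $37,324, a ratio of about 3.76.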

And now for the tie breaker, let’s examine operational power costs, assuming the boxes run on average at 50% of peak and without factoring in cooling:

                        Scaling Up      Scaling Out
Peak Watts              1,440W          9,828W
Power Cost/Year         $441            $3,013

Scaling out costs nearly seven times as much in power.  Final score: Scaling up appears to win in a narrow 2-1 victory.
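The yearly power figures can be reproduced from the peak wattage and the 50% average load. The electricity rate is not given in the post; the $0.07 per kWh used below is an assumption, chosen because it reproduces the quoted dollar amounts. The function name is likewise invented for this sketch.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_power_cost(peak_watts, utilization=0.5, rate_per_kwh=0.07):
    """Yearly electricity cost in dollars, excluding cooling.

    rate_per_kwh is an assumed rate, not a figure from the post.
    """
    average_kw = peak_watts * utilization / 1000
    return average_kw * HOURS_PER_YEAR * rate_per_kwh


up_cost = annual_power_cost(1_440)    # one x3950 M2
out_cost = annual_power_cost(9_828)   # 28 x3250 M2 servers combined

print(f"Scaling up:  ${up_cost:,.2f} per year")
print(f"Scaling out: ${out_cost:,.2f} per year")
print(f"Ratio: {out_cost / up_cost:.1f}x")
```

Because both sides use the same load and rate, the ratio depends only on the peak wattage: 9,828 / 1,440, or roughly 6.8 times the power cost for scaling out.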

Having seen the costs, which approach seems to make more sense?   If you object to this question, you’re quite right to do so.  From a strictly financial point of view, scaling up seems to be the way to go, unless you decide to level the playing field by zeroing the software costs with open source (e.g. Linux and PostgreSQL).  Scaling out becomes more financially appealing when open source is in play, which is what we often find in places like Google.

Of course, the decision can’t be made solely from a financial point of view, but prior to this exercise, had you ever even considered these hidden-in-plain-sight costs?  Ultimately the decision still comes down to your particular business needs and the technical requirements involved.   Watching your team whiteboard the various options can sometimes be more tedious than reading Klingon poetry, but you need to let the team work through both the technical and cost considerations to arrive at the best solution.

There are two take-aways from this example.  The first is that when the technical requirements don’t point hard in either direction, you may be able to appeal to cost to help arbitrate the decision.  The second is that you really don’t need to make these types of decisions anymore.  The infrastructure utility trend is already in motion and is gaining momentum.   Before investing significant capital of any scale, consider deploying new applications in a professional hosting data center. Outsource these ongoing scaling decisions to others while you focus on the bigger picture of providing the right applications for your business.

//spk

* Please don’t flame me with comments like “How’d you get those prices, we only pay $20 for Win2K8 server?”   I haven’t spent the four years of education required to be fully conversant in the Microsoft  Licensing program, which is more complex and complicated than ancient Hebrew Law.  These prices were based on recent customer quotes and internal pricing from our distributors.


Who’s Afraid Of The Big Bad Wolf?

Monday, April 20th, 2009


News of Cisco’s intent to enter the server market with its Unified Computing System offering has set the industry pundits’ hair ablaze.   “How will IBM & HP respond?”, “How much market share will be lost to Cisco?”, “Do you want a plumber building your servers?” and on it goes.  The FUD truly has been flying.  You would think the Big Bad Wolf had just come back to Grandma’s house.

So, what does the announcement of UCS mean to us here in the non-rarefied air of business computing?  Will it help us run our shops better?

Listen to Cisco CEO Chambers closely…

We look at this as bringing virtualization to life…unleashing the power of virtualization.   We go about it catching market transitions and trying to set timing, first in the data center, but make no mistake about it [UCS will make it] all the way in the home… [emphasis added]

 

What market transitions, pray tell, is he referring to?  Could it be anything other than the transition to utility-based computing? It’s fairly clear he’s not talking about our server rooms and data centers.  No, it would seem Cisco has its sights on something much larger. Chambers’ message is unmistakable.  If the coming world of utility-based computing were to be compared to The Matrix, Cisco would not be content with simply supplying the network plumbing; they want to be the Matrix itself. Having already tucked away the network, we now see a move into processors. Can storage be far behind? Perhaps the Big Bad Wolf already has that in the oven.

It doesn’t seem on the surface that UCS is intended for the typical IT shop, but let’s assume otherwise for a moment.  Is there a compelling reason for us to consider (or fear) UCS?    What would make us willing to try a  brand new brand?

In many ways, owning server hardware is a lot like owning a vehicle. First, you make your purchase based on size, looks, performance, the features you need, reliability, serviceability, and of course the price. Sometimes you’re looking to save gas (power), but not always. Maybe you decide to lease it. If you end up with a lemon, you know that very early in the game, and you get the vehicle fixed or replaced under warranty. From that point on, if you put in decent gasoline (clean UPS power), do regular maintenance (clean the fan grids, do disk defrags), and operate it within its design limits (proper cooling), it will run well for a long time.   When it wears out, or after you simply get tired of it and want something new and sexy, you buy a new one, sell or trade the old one, or possibly keep it and run it until the wheels fall off.

In the final analysis, whether you buy Chevy, Ford, Chrysler, or a brand you’ve never tried before really doesn’t matter. You go through the same decision process and ultimately you buy what you like or what you feel comfortable with.  The care, maintenance, and disposal process is the same no matter what you buy. And statistically, the reliability is pretty much the same across the board, despite the religious fervor that surrounds each brand. They all run well on balance, and they all have an occasional breakdown. For every hardware horror story out there, there are scores of identical hardware instances that run their entire lifetimes without a glitch.

Of course, if you absolutely must be the first kid on the block with a new hardware vendor, your mileage may vary.

Early UCS adopters on the phone with Cisco Tech Support


For most of us, UCS is not going to help with the primary purpose of our infrastructure.  So what does make a difference in how well our business systems stay up and running?

If you put a good driver (software) behind the wheel of your vehicle, you can be confident it will stay on the road doing what you intend it to do.  If you put an unskilled, abusive or reckless driver behind the wheel, you can expect more mechanical breakdowns (minor outages), accidents (major outages), or worse (disaster declaration).

I resisted naming operating system names above, but ask yourself, when was the last time you had downtime because an operating system or application went off into the weeds?   Do you schedule weekly or nightly reboots “just for good measure” because you can’t trust things to stay healthy?    It is an alarmingly common practice in our client base.

There’s a Red Hat 7.2 system that’s been hosting workload here for years that only comes down when we take it down to replace or upgrade the hardware.   We have a farm of VMware ESX servers that behave just as well.   Yet we also have a number of Win32 servers running on the same hardware for which I can’t say the same.

It’s not the hardware.

Lemons notwithstanding, the brand of hardware, be it IBM, HP, Dell, or now ostensibly Cisco, really is not the key factor in maintaining uptime.   In this day of clusters-everywhere and RAID-everything, it’s typically not the hardware that takes you down; it’s unreliable software, change, or human error.

As for UCS, it doesn’t look like the Big Bad Wolf is coming to our house anytime soon, but it is a good idea to keep a watchful eye on where he is going.  Cisco has cold hard cash and a big vision, but that vision seems cast for The Matrix, not our server rooms.


Buy what you’re comfortable with and put the right driver behind the wheel, or better yet, let us worry about that for you.

 

 


