Retro Hardware
April 15th, 2010, by Scott Kantner
Take a look at this awesome photo essay. Thankfully, this gear is long gone from our data centers.
//spk

April 1st, 2010, by Scott Kantner
Don’t ever press the wed, err red one. While I laughed hysterically at this cartoon as a kid, I never thought it would become my reality one day. Yes, I have pressed the red one, but I hope to never have to again.
The “red one” is none other than the Emergency Power Off button, and here on the east coast it’s pretty hard to build a data center without one. What?! You don’t have one? Shhhh…I won’t tell. Your secret is safe with me. Here’s what a real EPO red button looks like in case you’ve never seen one.
Notice the label. I firmly believe it should also say “UPDATE YOUR RESUME BEFORE PRESSING” as pressing this is in most cases a resume-generating, if not career-ending, event. Why? When activated, this button’s job is to do one thing, and one thing only: cut the power to your data center. All of it. Let that sink in for a moment. Think through what that would mean in your shop. No power. No sound. Just deafening silence; that is, of course, unless you pressed it by accident and the silence gives way to the sound of clanging pitchforks and the smell of torches being lit over in the end-user community.
I am obviously a bit biased about this topic. I don’t think these systems are necessary, but you should do some research and draw your own conclusions. I am 100% all for safety, but from the historical evidence I’ve seen, the risk that EPO is designed to mitigate is lower than what you’re exposed to driving to work every day. APC’s white paper #22 pretty much nails it:
EPO is a subsystem that is specifically designed to override all redundancy and fault tolerance built into the
network-critical physical infrastructure (NCPI), thereby putting the entire network at risk. EPO operation is
one of the largest causes of unplanned data center shutdown. The design of an EPO system must
therefore try to prevent any possibility of accidental operation, and it must minimize deliberate operation for
any reason other than a valid life-threatening emergency. [Emphasis mine]
Red buttons are no panacea, but we are nevertheless forced to install them, and then make them nigh unto impossible to press unless you Really Mean It. Note in the photo above that the button is both recessed and protected by a plastic cover. Without the plastic cover, the recessed nature of the button is the only thing preventing it from being bumped accidentally, and it hopefully also slows down a would-be pusher enough to stop and ask “Do I Really Mean It?” Speaking of the cover, note also the small gray loop of wire in the upper left corner of the housing – we opted to install covers with alarms. Lifting the cover results in a piercing electronic squeal capable of penetrating 2-hour fire-rated walls and forces one once again to stop and ask “Do I Really Mean It?” Cover alarms are designed to stop electricians and others who are not data center savvy from innocently doing something disastrous, such as pressing the red button before installing a new circuit breaker. Yes, it happens. Well, the label does contain the word “off”, doesn’t it? Changing the label from “Electrical Power Off” to “Emergency Power Off” tends to alter the results little. The word “off” seems to be the Pavlovian trigger.
As I write this, our EPO system is being expanded to accommodate the growth of our operations. If you are building a new data center with EPO, make sure the designer includes a way to disable the system during maintenance and expansion activities. This seems like an obvious feature to include, but don’t take it for granted. This is also a handy feature to have if your operations are prone to having “civilians” in the data center, i.e. those who are unfamiliar with the various buttons and switches on the walls. It is very reassuring to be able to disarm the red buttons while such folks are meandering about the room. Even when escorted, such folks have been known to find ways to activate the EPO system, either accidentally by bumping a non-recessed red button, or deliberately pushing it out of curiosity when no one is watching.
Once you have an EPO system in place, you will have to learn to live with it. It is a risk that must be managed like all the others. If you’re building a new data center, you at least have the opportunity to design and build it properly, and then test it without jeopardizing your operations. Retrofitting an existing data center with EPO or expanding an existing system is a different matter entirely. You will want to engage an engineering firm and electricians that are very experienced with EPO systems, as most electricians are not familiar with the complexities involved with wiring EPO into a live data center environment. There is no second chance to get it right.
Here is a scary story that makes my point. Cutting to the chase, the article states:
About a month after opening a new facility in March 2003, Roberts, the director of data center services for Novi, Mich.-based Trinity Health, got a call. It was Easter morning, and a contractor had accidentally activated the EPO switch as he tried to replace a module connecting the button to the fire alarm system. According to Roberts, the fiasco “took the data center out.”
“We went out at 8:30 that morning,” he said. “By 11:30 that night, we were probably 95% up and going, so we were pretty lucky. But from that day forward, I tried to lessen the effect of this EPO.”
Lessen the effect indeed. This is not the kind of resurrection we want to be talking about on Easter Sunday.
Stress Relief Department
After all of this talk about outages, and with my own data center’s EPO being modified as we speak, it’s time for some needed stress relief:
Happy Easter!
//spk
P.S. I did press the red button, several times actually, but it wasn’t in a live situation. It was during the initial testing of our system. The lead engineer said “May as well press it now if you want, because you never will again.” Hopefully he was a genuine prophet.
March 12th, 2010, by Scott Kantner
A discussion of labeling in the data center could go on for days and probably be done as a nine-part DVD mini-series and sold as a cure for insomnia. Nevertheless, the importance of good and proper labeling cannot be overstated, but it can be simply stated: Label Everything. Let’s take a quick look at the big items.
Racks – Label both front and back doors. We use a scheme based on the row number and position within the row, such that the first rack in row 5 would be labeled “5A”, the second “5B”, etc. Other folks use the time-honored “Battleship”-style system, based on an XY grid that maps out the room, most often based on the two-foot squares that make up a typical raised floor system.
An example would be “AJ06”, where the “X” coordinate is “AJ” and the “Y” is “06”. Neither method is necessarily superior to the other, and we happen to use both. We use the row/rack scheme for our racks and XY coordinates for infrastructure items like Air Handlers, floor PDUs, chilled water valves, etc. The reason we use row/rack rather than XY coordinates for racks is that in a large room full of equipment, it is often hard to see the grid system on the walls and figure out where things are located. We believe it’s easier for a new sysadmin to find rack 5C (row 5, third rack) than to ask him to find rack BQ59 in a room chock-full of racks where he can’t see any latitude/longitude markers on the walls to get his bearings. Again, there is neither right nor wrong here; just a couple of different ways to approach it.
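For anyone who keeps rack locations in an inventory script, here is a minimal Python sketch of the two schemes described above. It is only an illustration, not a tool we run: the function names are made up, and it assumes a row never holds more than 26 racks (positions A through Z) and that a grid label is simply letters followed by digits.

```python
import re


def rack_label(row: int, position: int) -> str:
    """Build a row/rack label: row 5, first rack -> '5A'.

    Assumption: at most 26 racks per row, lettered A..Z.
    """
    if row < 1 or not 1 <= position <= 26:
        raise ValueError("row must be >= 1 and position must be 1..26")
    return f"{row}{chr(ord('A') + position - 1)}"


def parse_rack_label(label: str) -> tuple[int, int]:
    """Split a row/rack label such as '5C' into (row, position)."""
    match = re.fullmatch(r"(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a row/rack label: {label!r}")
    return int(match.group(1)), ord(match.group(2)) - ord("A") + 1


def parse_grid_label(label: str) -> tuple[str, int]:
    """Split a Battleship-style label such as 'AJ06' into (X, Y)."""
    match = re.fullmatch(r"([A-Z]+)(\d+)", label)
    if match is None:
        raise ValueError(f"not a grid label: {label!r}")
    return match.group(1), int(match.group(2))


print(rack_label(5, 1))          # first rack in row 5
print(parse_rack_label("5C"))    # row 5, third rack
print(parse_grid_label("AJ06"))  # X is 'AJ', Y is 6
```

Validating labels this way catches typos such as a grid label entered where a rack label was expected, since the two patterns never match the same string.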
Servers and Network Gear – Label both the front and the back. The name is probably sufficient. Security-sensitive folks in your shop may balk at using IP addresses on server labels.
PDUs and power whips – Both floor and in-rack units. If you have an A+B redundant power distribution system, everything can be tagged with a number to identify the unit and a color to indicate to which feed (“A” or “B”) it’s attached. Note in the pictures how this flows all the way through – even the whips are colored.
Patch Panels – We could talk about patch panel labeling schemes for days. The important thing to do is pick one system and stick with it. Here’s a look at ours. We label our panels using a “source/destination” scheme, so in the photo “1A/1D (1-6)” means that these are the first 6 ports running from rack 1A (the rack this panel is in) to rack 1D. Very easy for new sysadmins to grasp. This does not follow the ballyhooed TIA standard for labeling patch panels, but we find it to be very practical and easy for the people working in the room.
Cables – Labeling cables is a religious issue for another day also, but in our data center we typically only label the key cables in the network backbone and edges so that troubleshooting is easier at 3 AM. When we do label a cable, we label each end with a wrap-around style label that identifies where the other end of the cable can be found. You can see an example of this in the photo above. If you click on the photo to enlarge it, you can almost read the label.
Air Handler (a.k.a. “CRAC”) units – Simple to do, and very helpful when the units send alarms to systems management tools.
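If you track panels in a spreadsheet or script, the “source/destination” label is easy to take apart. The sketch below is a hypothetical Python helper, not something from our shop: the `PanelLabel` type and function names are invented for illustration, and it assumes labels always look like the “1A/1D (1-6)” example, with a single space before the port range.

```python
import re
from typing import NamedTuple


class PanelLabel(NamedTuple):
    source: str       # rack the panel is mounted in, e.g. '1A'
    destination: str  # rack at the far end, e.g. '1D'
    first_port: int
    last_port: int


# Assumed layout: '<rack>/<rack> (<first>-<last>)'
PANEL_RE = re.compile(r"(\d+[A-Z])/(\d+[A-Z]) \((\d+)-(\d+)\)")


def parse_panel_label(text: str) -> PanelLabel:
    """Parse a label such as '1A/1D (1-6)' into its parts."""
    match = PANEL_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a source/destination label: {text!r}")
    first, last = int(match.group(3)), int(match.group(4))
    if first > last:
        raise ValueError(f"port range runs backwards: {text!r}")
    return PanelLabel(match.group(1), match.group(2), first, last)


def format_panel_label(label: PanelLabel) -> str:
    """Render a PanelLabel back into the printed form."""
    return (f"{label.source}/{label.destination} "
            f"({label.first_port}-{label.last_port})")


label = parse_panel_label("1A/1D (1-6)")
print(label.source, "->", label.destination)
print(format_panel_label(label))
```

Round-tripping through `parse_panel_label` and `format_panel_label` is a cheap way to check that a batch of labels is consistent before sending it to the label printer.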
Emergency Power Off (EPO) and Fire suppression controls – I actually think the EPO button should be labeled “Update Resume Before Pushing,” but that’s a topic for another day.
Mechanical Support Systems – Here you need to not only identify the control accurately, but sometimes you need to be very specific about its operation:
After all the hard work of designing and implementing a label system is done, you’ll need to put ongoing enforcement into place for which a label shouldn’t be necessary:
Take-away: What’s most important is that you pick a labeling system that works for you and is easily maintainable, because it needs to be useful, and it’s a never-ending process. If it’s confusing to use or a pain to keep up to date, even Bubba and his .50 cal aren’t going to help.
//spk