Thursday, May 08, 2008

Green Data Center Seminar

Today I attended a seminar / marketing event sponsored by Ziff Davis, with presentations by IBM and Sycomp, a data center solutions vendor (some of the attendees were themselves senior data center consultants). The view from Sycomp’s 17th floor conference room in Foster City, CA was spectacular.

The following is a write-up of my notes, offered as a public service. [My comments are in square brackets.] I have no financial or business relationship with any company or product mentioned. Apologies for any inaccuracies or omissions. Feedback is welcome.

The IT industry accounts for 2% of human CO2 emissions, about equal to the airline industry. Electricity and cooling costs over 5 years exceed the cost of the hardware itself. A 1U server generates 2.0 tons of CO2 / year, versus a typical car (driven 12K mi / year) that generates 2.4 tons. Thus eliminating a server is comparable to taking a car off the road.
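
For a rough sanity check of the 2.0 tons figure, here is a back-of-the-envelope estimate. The wattage and grid emission factor are my own assumptions, not numbers from the presenters:

```python
# Back-of-the-envelope CO2 estimate for one always-on 1U server.
watts_total = 400                 # assumed: server draw plus cooling/site overhead
hours_per_year = 24 * 365.25
kwh_per_year = watts_total * hours_per_year / 1000    # ~3,500 kWh
kg_co2_per_kwh = 0.57             # assumed: rough grid emission factor
tons_co2_per_year = kwh_per_year * kg_co2_per_kwh / 1000
print(f"{tons_co2_per_year:.1f} tons CO2 / year")     # ~2.0
```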

Site power and cooling accounts for 55% of energy consumption, versus 45% for power drawn by the IT equipment itself. Of the amount drawn by the equipment, 30% goes to the processor and the remaining 70% to the power supply, fans, memory, drives, etc. Of the processor’s 30%, 80% is idle time versus 20% actual use. (Green processors are not the only issue.) Unlike other corporate players, IT wants low asset utilization, to accommodate spikes and failovers.
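
Multiplying those percentages through shows how little of the facility’s total power ends up doing useful computation; a quick sketch using the figures above:

```python
# Fraction of total facility power that goes to actual processor work,
# using the percentages quoted in the seminar.
it_share = 0.45         # IT equipment's share of facility power
processor_share = 0.30  # processor's share of the equipment's draw
active_share = 0.20     # non-idle fraction of processor time
useful = it_share * processor_share * active_share
print(f"{useful:.1%} of facility power does useful processor work")  # 2.7%
```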

Tackle the building first [not a VC play]. Talk with building engineers about fans and chillers, and whether they need to be running and when. Get them on your team. It is usually desirable to retrofit to more efficient lighting, and your power company may provide rebates for HVAC upgrades. Several states offer “Energy Efficiency Credits” for upgrading servers. These may require audits, or may simply be based on the number of machines replaced.

In most organizations there is a disconnect on the accounting side. Those who procure equipment don’t see the power bill, and those who authorize retrofits don’t see the power company rebate in their P&L. They are then told to cut further, without getting credit for the savings that accrued to the firm.

“Comfort cooling” is only needed about 2,200 hours / year, during working hours, whereas “process cooling” runs 24 × 365.25 = 8,766 hours / year, nearly 4X the hours of comfort cooling (hence the need for close attention to reducing its cost).

In former times it was gospel that data centers ran at 59° F with humidity of 52% ± 2%. Modern equipment tolerates higher temperatures, e.g. 68-98° F, if old habits can be changed. It’s okay to let the temperature rise to around 75-78° F, as long as it doesn’t fluctuate [which stresses bi-metallic connections, including solder joints]. A/C systems are more efficient at temperatures above 59° F.

Other energy conservation measures considered helpful include:
  • Motion sensors on light switches
  • Variable speed fans (fan power draw is proportional to the cube of the speed; see the sketch after this list)
  • Close up floor tiles where too much cold air is going to the wrong place
  • Hot and cold aisles, in-row cooling
  • Liquid cooling removes heat much more efficiently than air (Rear door liquid cooling, cools back of rack. Direct cooling, cools processor.)
  • Reducing the hours a chiller runs makes it last longer too
  • Expelling hot air to the outside is cheaper than re-cooling
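
To illustrate the cube law behind variable speed fans, a minimal sketch of the relationship stated in the list above:

```python
# Fan affinity law: power draw scales with the cube of fan speed.
def relative_fan_power(speed_fraction: float) -> float:
    """Power draw relative to full speed, for a given fraction of full speed."""
    return speed_fraction ** 3

for pct in (100, 90, 80, 60):
    print(f"{pct}% speed -> {relative_fan_power(pct / 100):.0%} of full power")
# 80% speed draws only about 51% of full power, so modest slowdowns save a lot.
```
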
Power conservation technology was pioneered in laptops (the various sleep modes) and is now migrating to servers. Prior versions of Windows Server shipped with power conservation off by default, but in Windows Server 2008 it defaults to on.

Some IBM power distribution units (PDUs) will communicate with a software console that displays a detailed machine-by-machine analysis of power consumption patterns. If a machine is only used at certain times, do aggressive power management during its off-hours (by putting components into sleep modes).
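
The console itself is proprietary, but the off-hours idea can be sketched generically. Everything below is hypothetical: the usage window, the idle threshold, and the shape of the PDU log data are my assumptions.

```python
# Hypothetical sketch: flag machines whose measured draw outside business
# hours suggests they could be put into sleep modes off-shift.
from datetime import time

BUSINESS_HOURS = (time(7, 0), time(19, 0))   # assumed usage window
IDLE_WATTS_THRESHOLD = 120                   # assumed "effectively idle" draw

def off_hours_candidates(readings):
    """readings: iterable of (hostname, timestamp, watts) tuples from PDU logs."""
    candidates = set()
    for host, ts, watts in readings:
        in_hours = BUSINESS_HOURS[0] <= ts.time() <= BUSINESS_HOURS[1]
        if not in_hours and watts <= IDLE_WATTS_THRESHOLD:
            candidates.add(host)   # idle off-shift: schedule sleep states here
    return candidates
```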

[My take: Data center A/C should be separate from the rest of the building and maintain 75° F or so year round. Servers and racks should be liquid cooled (direct or rear-door) by yet another unit. Put in all the monitoring tools you can find.]

Turning to virtualization and related IT issues –
  • Consolidate apps onto one machine via virtualization
  • Consolidate machines into one box as blade servers
  • Consolidate disk drives into a storage network
Inactive data such as medical x-rays can be pushed out to tape, then reloaded to disk when the patient returns 6-12 months later.
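
As an illustration of that kind of age-based tiering, a minimal sketch; the 180-day threshold and the idea of sweeping by last-access time are my assumptions, not the archiving product the presenters had in mind:

```python
# Hypothetical age-based archival sweep: list files not accessed in N days
# as candidates to push out to a tape/archive tier.
import os
import time

ARCHIVE_AFTER_DAYS = 180   # assumed policy threshold
cutoff = time.time() - ARCHIVE_AFTER_DAYS * 86400

def archive_candidates(root):
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_atime < cutoff:   # last access older than cutoff
                yield path
```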

Major virtualization products include VMware, Xen (open source), Novell ZENworks, and IBM Director (with energy management). Oracle has something that virtualizes their database servers. VMware is winning, because there’s no stomach for training data center staff on multiple environments. IBM servers come with VMware on a chip.

Many ISV software products are not virtualizable. [Presumably this will be remedied in the next few years, after they master multi-core programming]. Many vendors won't support their apps when run virtually (even on the same hardware and OS).

[The Intel x86 has 17 non-virtualizable instructions. Intel is working on this issue. Presumably apps that use any of these instructions won’t run under VMware.] Someone said BSD will not run under VMware, perhaps for this reason, but I have not verified that.

Good candidates for virtualization are servers and processes with low utilization, such as weekly accounting jobs. High volume apps such as major commercial web servers are not virtualization candidates, since they need to run on more servers with load balancing, not fewer.
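
A simple way to screen for such candidates is to rank servers by average utilization from whatever monitoring data you already collect; a sketch with made-up numbers:

```python
# Rank servers by average CPU utilization; low-utilization boxes are the
# natural virtualization candidates. (Sample data and cutoff are made up.)
avg_cpu_utilization = {
    "accounting-batch": 0.04,
    "intranet-wiki": 0.07,
    "mail-relay": 0.18,
    "public-web-01": 0.82,   # high-volume: scale out with load balancing instead
}

THRESHOLD = 0.20
candidates = sorted(h for h, u in avg_cpu_utilization.items() if u < THRESHOLD)
print("Virtualization candidates:", ", ".join(candidates))
```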

Virtualization can be done on large multi-way servers as well as on smaller blade servers, each with its advantages. There is a max of 14 blades in a chassis before you need another one. VMware can only access up to 3.6 GB of memory and 2 processors, and is limited to one blade. It does not load balance across blades. However, on a multi-processor server it will dynamically allocate more processors to an instance that needs them. Hence multi-way servers are a prime option for virtualizing larger apps.

Virtualization is a big help in fail-over and DR situations, since a server image can be restored and back in operation on a new instance, possibly at a DR site, in 5 minutes or less. If your DR site has fewer boxes, your response time will degrade, but presumably your SLAs won't apply during a disaster.

All-in, per-server, “before” costs can run $1,800 / month (including labor), so aggressive virtualization (of servers, apps, and storage) can yield 75-90% savings. With up-front hardware costs in the $N million range this can yield a payback period of 8-11 months, and is economically viable even if only half as effective.
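
As a rough illustration of that payback arithmetic (every number below is hypothetical except the $1,800 / month figure cited above):

```python
# Rough payback-period calculation for a virtualization project.
servers_affected = 100         # hypothetical fleet size
cost_per_server_month = 1800   # all-in "before" cost, $/month (cited above)
savings_rate = 0.80            # assume 80% of those costs are eliminated
upfront_hardware = 1_200_000   # hypothetical capital outlay, $

monthly_savings = servers_affected * cost_per_server_month * savings_rate
payback_months = upfront_hardware / monthly_savings
print(f"Payback in about {payback_months:.1f} months")   # ~8.3 with these inputs
```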

Data center consultants (like IBM and Sycomp) will send in people to sell your management, analyze and redesign your facility, formulate a big hardware order, and drive the installation and auditing process. Many senior execs want to know as little as possible about details.

There's a big opportunity to consolidate end-user PCs, giving users thin-client terminals with flash memory (and no drives) and virtualizing their PCs onto shared boxes. However this is not as far along. Good initial applications are point-of-sale and call centers [perhaps because such users feel little "ownership" of their box]. Lack of drives might reduce data theft.

[Virtualization introduces new security risks. Each instance is isolated from all others, and thus from anti-virus running in another instance. If clever malware can launch its own instance, nothing can touch it. Microsoft has demonstrated a proof-of-concept. No one in the room had any answers. As usual technology is being massively deployed with security in the back seat. But have no fear. VCs have recently funded 4 startups addressing virtualization security, so maybe they will come up with something.]
