Fountainhead
By Ken Oestreich

Is an "internal" cloud an oxymoron?

By definition, "the cloud" lies external to the enterprise data center. And it's got great properties: in an Infrastructure-as-a-Service example, per-CPU operating costs are on the order of $800/year (see Amazon's EC2 price list), whereas CPU operating costs are typically $3k-$5k/year in the average data center.

It seems to me that the industry has become overly-fixated on hosted clouds (IaaS, PaaS, SaaS etc.) that are run by third parties which have all of those nice economies-of-scale.

But what about implementing an "Internal" cloud inside of corporate data centers? John Foley of InformationWeek just raised this question in talking about Elastra.
"[they are] working on a version of Cloud Server for data center VMware environments, or what it refers to as 'private clouds.' That's an oxymoron since cloud computing, by definition, happens outside of the corporate data center, but it's the technology that's important here, not the semantics."
Semantics aside, what properties would an ideal "internal" cloud have? Clearly the same economics as a "traditional" cloud, but with some added benefits to avoid the current pitfalls of external clouds. The improved properties include --
  • Should work with existing physical & virtual resources in the data center (heterogeneous platforms & O/S's)
  • Should let you specify whether your apps are virtualized or not (but either way, provide capacity-on-demand)
  • Wouldn't require that sensitive data be hosted outside the enterprise; it would maintain internal auditability
  • Ought to adhere to internal security and configuration management processes
  • Would not disrupt existing software architectures
  • Would allow you to add additional capacity (compute resources) on-the-fly
  • Could be segregated to support both production & development environments
  • Would provide internal metering & billing for internal users and business units
Maybe an "internal cloud" needs a new name, but what it represents is essentially the basis of Utility Computing. Check out the Cloudy Times blog, which also references the potential of an Internal Cloud.

I'm guessing that, as cloud computing gains steam, IT organizations will want the same properties internally - through implementing an "internal" cloud leveraging utility computing infrastructure.

ProductionScale recently ruminated on this topic (calling it a private cloud, instead of internal):
"What is private cloud computing? To make a non-technical analogy, Private Cloud Computing is a little like owning your own car instead of using a rental car that you share with others and that someone else owns for your automobile and transportation needs. Rental cars haven't completely replaced personal automobile ownership for many obvious reasons. Public Cloud Services will not likely replace dedicated private servers either and will likely drive adoption of private cloud computing".
Working for Cassatt, I'm biased toward believing that a market for Internal Cloud infrastructure providers will emerge.... and potentially help enterprises dovetail their internal clouds with public clouds. Any other opinions?

Sanity check: Data center energy summit

As I mentioned in yesterday's entry, the Silicon Valley Leadership Group's (SVLG) energy summit was fantastic - the first time I've seen actual implementations and data from innovations to help with energy efficiency. But on further reflection, there was something missing and rather alarming.

BTW, all of the data was correct, and the conclusions were dead-on - the findings/predictions from the 2007 EPA report on data center efficiency were validated.

However, check out the list of projects in the Accenture report: 9 of them focused on site infrastructure, while only 3 of them focused on IT equipment.

Why weren't more projects aimed at making the IT equipment itself more efficient?

Now, if you look at where power is used in a data center, you'll find that with a "good" PUE, 30% might go toward infrastructure (with 70% getting to IT equipment), and that with a "bad" PUE, maybe 60% goes toward infrastructure (with the remaining 40% getting to IT equipment). In either case, the IT equipment is chewing-up a great deal of the total energy consumed... and yet only 1/4 of the projects undertaken had to do with curbing that energy.
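For anyone who wants to turn those percentages into PUE numbers, here's a quick sketch. It relies only on the standard definition of PUE (total facility power divided by the power that reaches IT equipment); the function names are my own, purely for illustration.

```python
def pue_from_it_share(it_share: float) -> float:
    """Return PUE given the fraction of total facility power reaching IT equipment.

    PUE = total facility power / IT equipment power, so if IT equipment
    receives `it_share` of the total, PUE = 1 / it_share.
    """
    if not 0.0 < it_share <= 1.0:
        raise ValueError("it_share must be in the range (0, 1]")
    return 1.0 / it_share


def infrastructure_share(pue: float) -> float:
    """Return the fraction of total power consumed by site infrastructure."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return 1.0 - 1.0 / pue


if __name__ == "__main__":
    # The "good" and "bad" cases from the paragraph above.
    for label, it_share in (("good", 0.70), ("bad", 0.40)):
        print(f"{label}: {it_share:.0%} to IT -> PUE {pue_from_it_share(it_share):.2f}")
```

So the "good" case works out to a PUE of roughly 1.4 and the "bad" case to 2.5. Either way, IT equipment is a large share of the total draw.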

This is like saying that a car engine is the chief energy-consuming component in a car, but that to increase gas mileage, scientists are focusing on drive trains and tire pressure.

My take is that the industry is addressing the things it knows and feels comfortable with: wires, pipes, ducts, water, freon, etc. Indeed, these are the "low-hanging fruit" of opportunities to reduce data center power. But why aren't IT equipment vendors addressing the other side of the problem: Compute equipment and how it's operated?

IT equipment is operated as an "always-on" and statically-allocated resource. Rather, it needs to be viewed as a dynamically-allocated, only-on-when-needed resource. More of a "utility" style resource. This approach will ultimately (and continuously) minimize capital resources, operational resources and (by association) power, while always optimizing efficiency. It is where the industry is going -- what's termed as cloud computing. This observation cuts directly to Bill Coleman's keynote (video here) earlier this week at O'Reilly's Velocity conference. It also alludes to Subodh Bapat's keynote where he outlined a continuously-optimized IT, energy, facilities and power grid system.
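To make the "only-on-when-needed" idea a bit more concrete, here's a minimal sketch. Everything in it is an assumption for illustration: the function names, the notion of a fixed per-server capacity, and the 25% headroom are all hypothetical, and it doesn't describe how any particular product (Cassatt's included) actually works.

```python
import math


def servers_needed(demand: float, per_server_capacity: float,
                   headroom: float = 0.25) -> int:
    """Smallest number of servers covering current demand plus a safety headroom.

    `demand` and `per_server_capacity` share one (hypothetical) unit,
    e.g. requests per second.
    """
    if per_server_capacity <= 0:
        raise ValueError("per_server_capacity must be positive")
    if demand <= 0:
        return 0
    return math.ceil(demand * (1.0 + headroom) / per_server_capacity)


def power_adjustment(powered_on: int, total_servers: int, demand: float,
                     per_server_capacity: float, headroom: float = 0.25) -> int:
    """Return how many servers to power on (positive) or off (negative)."""
    target = min(total_servers,
                 servers_needed(demand, per_server_capacity, headroom))
    return target - powered_on


if __name__ == "__main__":
    # 10 servers always on, but demand only justifies 5 of them right now.
    change = power_adjustment(powered_on=10, total_servers=10,
                              demand=400, per_server_capacity=100)
    print(f"adjust powered-on servers by {change}")
```

A statically-allocated pool would leave all 10 machines running; the utility-style view powers down whatever the current demand (plus headroom) doesn't justify, and powers machines back up as demand returns.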

I certainly hope that at the SVLG's data center energy summit '09 next year, more projects focus on how IT equipment is operated, rather than on the "plumbing" that surrounds it. I can't wait to see the efficiency numbers that emanate from an "IT Cloud" resource.

Real data center efficiency project results

I spent the day at the SVLG data center energy summit, where 11 actual case studies were presented, each detailing how companies and their technology partners addressed energy efficiency, while quantifying the specific results. For all of the hubbub with industry associations talking about theory, metrics, standards and efficiency, today marked the first time someone was actually doing something. And, representing Cassatt's efforts, I was happy to provide input and real-life user data as well.

We had about 300 people in the room, including press, analysts, industry and government. It was a full day of presentations and reviews of various types of projects that were undertaken, as well as what the outcomes were. The results given during the day were then collated by Teresa Tung of Accenture Technology Labs, and compared against the results posited by the 2007 EPA report to Congress on data center energy efficiency. The results matched up quite well. (Even Andrew Fanara, EPA Energy Star's project director, breathed a sigh of relief that his report in fact matched reality.)

The complete set of session reports, as well as the Accenture report, are already on the web for all to see and learn from. That others could learn from these pioneers was part of the goal of the day.

The first session of the day featured IT resource optimization. Synopsys outlined how they consolidated data centers and saved power, square feet and more. Cassatt (presented by yours truly) then described how Active Power Management helped a nearby company curb over 20% of their power bill with innovative power management software.

The day also had a number of sessions sponsored by Lawrence Berkeley Labs, Oracle, the USPS, Power Assure and others, focussing on air flow management, cooling, power distribution projects and more.

Over lunch, we were treated to a CIO panel that included Teri Takai, California's CIO -- overseeing more than 120 other CIOs of state agencies (how does it feel to be CIO for the earth's 7th largest economy?) plus PK Agarwal, California's CTO.

I also sat in on a session given by Dave Shroyer of NetApp, Bill Tschudi of Lawrence Berkeley National Labs, and Ray Pfeifer -- covering NetApp's ultra-efficient data centers. Dave pointed out that during certain seasons of the year, NetApp achieves a PUE of 1.19!! That's probably the best I've heard of ever.

Finally, at the end of the day, Accenture presented its report findings, summarizing the results of the 11 projects. Receiving the report were Andrew Fanara of the EPA, Paul Roggensack of the California Energy Commission, Paul Scheihing of the DOE, Bill Tschudi of Lawrence Berkeley Labs, and Kevin Timmons of Yahoo. I suppose you could call this peer-review. But what it signals to the industry is that people are finally taking action, comparing reality against theory, and sharing their pragmatic best-practices. Crazy as this sounds, I feel it was a seminal event, and hope that other industry associations follow suit.

I'm sure you'll see additional coverage of this by other online folks I met there: Debra Grove of Grove Associates, as well as Rich Miller of Data Center Knowledge.

Energy, data centers, and the cloud

It's been a pretty information-loaded week regarding energy-management issues within the data center.

The first was the keynote Bill Coleman gave at O'Reilly's Velocity Conference earlier. He addressed why the current trends (point-products, resulting complexity) in data centers are unsustainable, and why economies-of-scale are declining. His answer: the cloud. He talks about cloud 1.0 (where we are today), cloud 2.0 (more sophisticated, may even replace the PC), and cloud 3.0, where it really represents the "webtone" of services available on-demand, reliable, and composable. And the more efficient these economies-of-scale become, the more energy efficient they become as well. Data Center Knowledge covered that as well, with a few relevant pointers to boot.

The other nifty reference was in Microsoft's TechNet Magazine. There, Dave Ohara pointed out similar observations from a number of sources -- that turning off unused assets is a path to sustainable computing. He gave a few examples, including:

  • Weill Medical College's HPC clusters leveraging IPMI for node shut-down
  • Citrix' approach to using their PowerSmart utility to power-down idle HP-based presentation servers
  • Microsoft's "Power-Aware Server Provisioning and Load Dispatching for Connection-Intensive Internet Services"
  • Commercially-available solutions, such as Cassatt's
With this mounting set of options, will power management continue to gain in popularity?

Postcards from Gartner's IT Infrastructure, Operations & Management Summit 2008 (Day 2)

Today's presentations were almost entirely about virtualization with generous servings of analysis of cloud computing.

Thomas Bittman
Thomas Bittman opened with a keynote on Gartner's predictions on cloud computing, and the likely march the industry will take over the next 5+ years toward this eventuality.

Gartner's predictions are that there will be many varieties of cloud computing, from the AWS-style of raw hardware, up through various types of service providers of platforms, services, and even component services that will be wired-together by other types of providers.

Bittman even went so far as to suggest that service "brokers" could emerge. For example, you've established an SLA with a cloud computing service provider, and for whatever reason, that SLA isn't met (maybe AWS has another glitch). Your broker immediately finds another compatible cloud and "fails-over" to that provider.
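Here's a rough sketch of what that broker's decision might look like. To be clear, this is my own illustration, not anything Gartner presented: the Provider record, the pick_provider function, and the use of a single measured-availability number as "the SLA" are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Provider:
    """Hypothetical record a broker might keep for each cloud provider."""
    name: str
    availability: float  # measured availability, e.g. 0.999
    compatible: bool     # can it run the workload unchanged?


def pick_provider(providers: Sequence[Provider], sla_availability: float,
                  current: Optional[Provider] = None) -> Optional[Provider]:
    """Stay with the current provider while it meets the SLA; otherwise
    fail over to the best compatible provider that does, or None if none does."""
    if (current is not None and current.compatible
            and current.availability >= sla_availability):
        return current
    candidates = [p for p in providers
                  if p != current and p.compatible
                  and p.availability >= sla_availability]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.availability)


if __name__ == "__main__":
    # Made-up providers and numbers, purely for illustration.
    a = Provider("cloud-a", 0.990, True)
    b = Provider("cloud-b", 0.9995, True)
    chosen = pick_provider([a, b], sla_availability=0.999, current=a)
    print(chosen.name if chosen else "no provider meets the SLA")
```

The interesting (and hard) part in practice is the "compatible" flag: a fail-over is only instant if the second cloud can run the same workload without rework.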

Gartner's sense was that there will likely be a few "mega" providers (AWS, Salesforce, Google, MSFT, others) and then hundreds/thousands of smaller mid-market and specialty providers... not unlike the evolution of the hardware market today. And on that note, they also predicted that the hardware providers (like Dell) will probably get into the hardware-as-a-service market shortly as well. That should be interesting to watch.


Cameron Haight
Next, Cameron Haight spoke about emerging "virtualization standards."

He made the very reasonable assumption that users will want to manage multiple VM technologies using a single tool. (And, with a straw-poll, the audience conclusively agreed).

A few of the initiatives already underway include:

* DMTF (Distributed Management Task Force) is already working on draft specifications for interesting standards... not for VMs, but for properties that would aid in management -- such as a Virtual system profile (i.e. for re-creating a set of VMs), and a resource allocation capability profile (i.e. for monitoring and managing VM hardware resources like CPU, memory, network ports, storage, etc.).

* Also, an Open Virtualization Format (OVF) is underway. This isn't a standard for VM files. Rather, this would tag VMs with metadata to ID them, say for packaging/distribution. For example, it would help characterize what's "inside" a VM before powering it on. My suspicion is that this could be the foundation for a "common" type of SW container, and a common approach to monitoring/managing such VMs. But I also suspect that the vendors will either (a) fight this tooth-and-nail, or (b) adopt it, but "extend" it to suit their needs...

Cameron also ran a few interesting audience polls during his session. Following are some notes I took, but I believe he'll probably publish them in a forthcoming research note:

Q: What VMware products are you currently using?
71% VirtualCenter
36% Update Manager
31% Site Recovery Manager
23% Lab Manager

Q: What do you think is most important for VMware to focus on?
30% optimization of VM performance
18% VM sprawl
16% Maintenance/patching
12% Accounting/chargeback
11% Root-cause analysis

Yep, these pass my sanity check. It should be quite interesting to see what VMware's next moves will be.

Postcards from Gartner's IT Infrastructure, Operations & Management Summit 2008

Here I am in Orlando at the Gartner IT conference. The day's been insightful and validating, if you happen to be in the Real-Time Infrastructure (RTI) business.

Andy Kyte
The opening keynote was from Andy Kyte, Gartner VP and Fellow. He's a dynamic speaker, and focussed mostly on IT Modernization and strategy. He was careful to define strategy/strategic planning -- and accused just about every IT management organization of buying "puppies". That is, most orgs buy products because of cute "here-and-now" reasons, without realizing that they're really signing-up to a 15-year-long relationship with a dirty, hairy, high-maintenance and expensive pet. His point was validated when he pointed-out all of the point-products that organizations have purchased that essentially only add to cost & complexity, rather than reduce it. He posited that more products have to be purchased with a long-term (7+ years) strategic vision, and that short-term economic validation was often to blame for the morass that IT finds itself within.

Donna Scott
Next was Donna Scott, speaking about IT Ops management trends -- and later on in the day, speaking about IT modernization and RTI. She led-off with a list of projects that enable business growth ... and argued that projects that don't enable business growth should be canceled.

But most interesting was her coverage of the "cloud" which she (and Thomas Bittman, next) predicted would be where IT is evolving. She suggested that IT ops will evolve into an "insourced hosting" model - where IT departments will be building "internal cloud-computing" style infrastructures to support business owners. We here at Cassatt salute you, since that's what we enable :)

What was also cool about Donna's presentations were her many polls from the audience (probably 1,000 plus). Her first question was "what grade would you give leading IT management providers?" 70% of them (CA, BMC, HP and IBM) got a "C" or worse. Her conclusion was that they still don't manage complexity (they may monitor it, though), they still support the point-solution mentality, and most focus on single homogeneous platforms.

Finally, and most validating, Donna listed the chief properties/components of a Real-time infrastructure system... which she feels is practically on the market. Her list:
  • IT services provisioning
  • IT services automation (starting & stopping applications as-needed)
  • Process automation & change management
  • Dynamic Virtualization management
  • Services optimization
  • Performance management, capacity management.
Personally, it sounds like a pretty familiar list. She outlined what RTI could enable; the list was also pretty familiar:
  • Service virtualization management
  • J2EE management
  • Oracle RAC management
  • Disaster Recovery - sharing & re-configuring assets
  • Managing a shared test environment
  • "loosely-coupled" HA - replacing failed nodes
  • Dynamic Repurposing nodes
  • Dynamic capacity on demand / capacity expansion
Thomas Bittman
Thomas gave a great talk on "Virtualization changes virtually everything"... and essentially outlined the path the industry will likely take towards cloud computing. He pointed out where "automation" is going wrong today... that "automation" tools are focusing on components, rather than on service levels. Until that changes, IT will continue down its complexity path.

Then he hit on a concept that will IMHO be the next big thing: The Meta-O/S. Think of it as an O/S for the data center -- the O/S that enables RTI. For example, what if you started with VirtualCenter, made it work with any VM technology (Xen, MSFT, etc.), made it manage physical/native resources as well, and finally abstracted away the rest of your physical infrastructure? Then, what if it could be told to optimize resources for application service levels, and/or to minimize power or capital or some combination at all times?

We're probably closer to this vision than you think - and the more industry is comfortable with sharing resources, and more dissatisfied with vendor point-solutions, the more it will be accepting of this meta-O/S concept.

I sometimes use this analogy:
What if you walked into a data center and were told to manage it -- 10,000 servers, 100 different HW models, 5,000 applications, various O/S flavors and revs, multiple networks, etc. etc. Well, you *wouldn't* tell me that you'd hire 200 sysadmins, buy multiple software management tools and analysis packages, set up complicated CMDBs and change-management boards, and buy a bunch of pagers for after-hours fire-drills. But that's how it's done today.

Rather (given a clean slate) you'd say "I'd get a computer to figure-out how and when to run applications, and to govern what software was paired with what hardware when. It would prioritize resources, and continually optimize overall operating costs." That's the rational approach, and that's what the Meta-O/S will do.