Virtualization no longer matters

There is no doubt. The product is there. The vision, too. At times, they leave some room for arrogance as well but, come on, they are the market leader, aware of being far ahead of anybody else in this field. A field they actually invented themselves. We almost feel like forgiving that arrogance. Don't we?

The AWS Summit 2013 in London was, one more time, confirmation that the cloud infrastructure market is there, that the potential is higher than ever and that Amazon "gets" it, drives it and dominates it quite undisturbed. All the others struggle to distinguish themselves among a huge number of technology companies, old and new, strongly convinced they have jumped into the cloud business but whose executives, I'm pretty sure, mostly think that cloud is just the new name for hosting services.

Before going forward, I want to thank Garret Murphy (@garrettmurphy) for transferring his AWS Summit ticket to me without even knowing who I was, simply and kindly responding to my tweeted inquiry. I wish him and his Dublin-based startup 247tech.ie the amount of luck that, coupled with great talent, leads to success.

Now, I won't go through the whole event: this is a roadshow and London was not its first stop, so much has been said already here and here. The general perception I got is that AWS is still focused on presenting the advantages of cloud-based over on-premises IT infrastructure, showing off the rich toolset they have put in place and, above all, bringing MANY (I counted nearly 20) customers to testify to how they are effectively using the AWS cloud and what advantages they gained by doing so. OK, most of them were the usual hyper-scale Internet companies, but I've seen the effort to bring enterprise testimonials like ATOC (the Association of Train Operating Companies of the UK). However, they all said they use AWS only for web-facing applications, staging environments or big data analytics: the usual stuff we know to be cloud friendly.

What really impressed me was the OpsWorks demo. OpsWorks was released not long ago as the nth complementary Amazon Web Service, meant to help operate resilient, self-healing applications in the cloud. Aside from the confusion around what-to-use-when, given the large number of tools available (without even counting third-party ones, which grow uncontrolled day by day), there is one evident trend arising from it.

For those who don't know OpsWorks, it is an API-driven layer built on top of Chef to automate the setup, deployment and un-deployment of application stacks. An attempt at DevOps automation. How this is going to meet customers' actual requirements while still keeping things simple (a.k.a. without exposing too many options) is not clear yet.
During the session demonstrating OpsWorks, the AWS solution architect remarked that no custom AMIs (Amazon Machine Images) are available for selection while creating an application stack. Someone in the audience immediately complained about this on Twitter, probably because he wasn't happy about having to rebuild all his customizations as Chef recipes on top of lightweight basic OS images, rather than baking them into a custom VM image.
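
To give an idea of what that API-driven layer looks like in practice, here is a rough sketch of creating a stack through the OpsWorks API using the Python SDK (boto3); the ARNs, repository URL and recipe names are placeholders, not a real deployment. Note how the stack definition revolves around a stock OS choice plus Chef recipes attached to layers, rather than a prebuilt machine image.

```python
# Rough sketch of driving OpsWorks through its API (boto3 assumed; ARNs,
# names and recipes below are placeholders, not a real deployment).
import boto3

opsworks = boto3.client("opsworks", region_name="us-east-1")

# A stack starts from a stock OS choice, not a custom machine image;
# everything application-specific is layered on top with Chef.
stack = opsworks.create_stack(
    Name="demo-stack",
    Region="us-east-1",
    ServiceRoleArn="arn:aws:iam::123456789012:role/aws-opsworks-service-role",
    DefaultInstanceProfileArn="arn:aws:iam::123456789012:instance-profile/aws-opsworks-ec2-role",
    DefaultOs="Amazon Linux",
    UseCustomCookbooks=True,
    CustomCookbooksSource={
        "Type": "git",
        "Url": "https://github.com/example/opsworks-cookbooks.git",  # placeholder repo
    },
)

# Layers bundle the Chef recipes that turn a clean OS into an app server.
layer = opsworks.create_layer(
    StackId=stack["StackId"],
    Type="custom",
    Name="web",
    Shortname="web",
    CustomRecipes={"Setup": ["myapp::install"], "Deploy": ["myapp::deploy"]},
)

# Instances are then booted into the layer and configured at boot time.
opsworks.create_instance(
    StackId=stack["StackId"],
    LayerIds=[layer["LayerId"]],
    InstanceType="m1.small",
)
```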

In fact, there are several advantages to moving the actual machine setup to a post-bootstrap automation layer. For example, the ease of upgrading software versions (e.g. Apache, MySQL) simply by changing a line in a configuration file instead of rebuilding the whole operating system image. But mostly because, by keeping OS images adherent to clean vendor releases, you will probably find the same images available at other cloud providers, making your application setup completely cross-cloud. Of course there are disadvantages too, including the delay added by operations like software downloads or configuration runs that may be necessary each time you decide to scale up your application.
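
As a concrete (and purely hypothetical) illustration of the "change a line in a configuration file" point, the sketch below bumps a package version through the stack's custom JSON, which Chef recipes see as node attributes, instead of rebuilding any image; the attribute names and the stack ID are made up.

```python
# Hypothetical version bump without touching any machine image: the stack's
# custom JSON becomes Chef node attributes, so a recipe reading
# node['myapp']['apache_version'] (an attribute name invented here purely
# for illustration) picks up the new value on the next setup run.
import json
import boto3

opsworks = boto3.client("opsworks", region_name="us-east-1")

opsworks.update_stack(
    StackId="STACK_ID",  # placeholder
    CustomJson=json.dumps({"myapp": {"apache_version": "2.4.4"}}),
)

# Re-running setup applies the recipes with the updated attributes.
opsworks.create_deployment(
    StackId="STACK_ID",
    Command={"Name": "setup"},
)
```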

Cross-cloud application deployment. No vendor lock-in. Cool. There is actually a Spanish startup called Besol that is building its entire (amazing) product “Tapp into the Cloud” on the management of cross-cloud application stacks, leveraging a rich library of Chef cookbook templates. And while I was writing this post on a flight from London, Jason Hoffman (@jasonh) was being interviewed by GigaOM and, while announcing a better integration between Joyent and Chef, he mentioned the compatibility between cloud environments as a major advantage of using Chef.

What we're observing is a major shift from leveraging operating system images towards the adoption of automation layers that can quickly prepare whatever application you want your virtual server to host. That means that one of the major advantages introduced by virtualization technology, the software manipulation of OS images, which was one of the triggers of the rise of cloud computing, no longer matters.

Potentially, with the adoption of automation platforms like Chef, Puppet or CFEngine, service providers could build a complete cloud infrastructure service without employing any kind of hypervisor. And recent moves across the industry further confirm this trend.

Of course there are still advantages to using a hypervisor: certain applications require architectures made of many micro-instances performing parallel computing, so it's still necessary to slice a server into many small portions. However, with processors packing ever more cores and applications able to exploit threads, virtualization may no longer be so important for the cloud.

In the end, I think we can no longer say that virtualization is the foundation of cloud computing. The correct statement could perhaps be that virtualization inspired cloud computing. But the future may leave even less room for it.

IaaS eats the biggest slice

When I read market research firms saying that SaaS is the cloud model most adopted by enterprises, I can't but concur, given its ease of use and the simplicity of integration with existing IT assets. Actually, the integration ends up being minimal and entirely in developers' hands, who can make use of the SaaS service's usually comprehensive API, completely bypassing their internal IT department.

So what about IaaS and PaaS? Should those who invested heavily in those two cloud models start worrying about their choice? No way. As my provocative title says, I am fairly convinced that the lower layers of the cloud stack will eventually share the whole cloud business, with IaaS eating the biggest slice of it, both directly and indirectly.

I am actually writing this post to give further insight and supporting data to a tweet of mine from some time ago, which made essentially the same point.

Now let’s see what I mean by indirectly.

Layers over layers

In computer science we are used to having layers upon layers, called "abstraction layers", each one aimed at hiding the complexity of the layer below while providing some added value and an interface through which the layer immediately above can access resources. With the rise of cloud services, the community's approach has been the same again: using abstraction layers to handle the increased complexity of IT infrastructures, which now involve thousands of resources to be managed and orchestrated.

As mentioned above, there are three main cloud layers largely accepted by the community: IaaS, PaaS and SaaS. However, many cloud providers don’t fit exclusively in one of them as they tend to enlarge their offering with different services at multiple layers of the stack. Since this creates a little confusion among cloud consumers, I want to take the opportunity to present them one more time from a different perspective, trying to concentrate on what added value each layer brings to the stack.

Ok, I still have to work a bit on my ability to represent concepts visually, but I hope the above chart helps clarify things. First, we have raw resources at the bottom of the stack, and if we add some elasticity we obtain an IaaS. This is over-simplified, as there is certainly more value brought by any good IaaS layer; however, for the sake of understanding, I'll limit myself to the most evident one: elasticity, a.k.a. the ability to create, destroy, enlarge and shrink computing resources on demand via an API.
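
To make "elasticity via an API" concrete, here is a minimal sketch against the EC2 API using boto3; the AMI ID is a placeholder. The point is simply that capacity is created and destroyed with plain API calls rather than purchase orders.

```python
# Minimal sketch of "elasticity via an API" (boto3 assumed; the AMI ID is a
# placeholder): capacity is created and destroyed with plain API calls.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Grow: boot two more instances when load increases.
reservation = ec2.run_instances(
    ImageId="ami-12345678",  # placeholder image
    InstanceType="m1.small",
    MinCount=2,
    MaxCount=2,
)
instance_ids = [i["InstanceId"] for i in reservation["Instances"]]

# Shrink: give the capacity back (and stop paying for it) when load drops.
ec2.terminate_instances(InstanceIds=instance_ids)
```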

Let's now move one layer up: we have an IaaS and we decide to add some DevOps tools and operations such as middleware, auto-scaling, application deployment and code validation mechanisms. While doing that, if the principle of abstraction layers is respected, we no longer need to care about how to handle raw resources, since the IaaS provides us with tools to automate their management. What we obtain is a Platform-as-a-Service, an environment where multiple users can deploy their applications.

Finally, let's take some business logic that solves a specific problem (e.g. CRM, ERP) and, provided of course that we have done all the multi-tenancy work and that we want it to be consumed as a service, we are now working at the SaaS layer. At this stage, we can concentrate on making our software more powerful, adding killer features and conquering our market niche. We don't need (nor do we want, right?) to take care of all the infrastructure that serves our users, nor do we want to know what hardware lies underneath, as those would just be a distraction from our core business focus.

Sounds logical, doesn't it? All the layers stack up together so nicely and look so complementary. Indeed they are. In fact, cloud companies end up buying services from other cloud companies that operate at a lower level of the stack. For further evidence, I did some research and found that most SaaS companies deploy their software on top of a PaaS provider which, in turn, deploys its automation layer on top of one (or more) IaaS providers. What does that mean? That if an enterprise adopts a SaaS cloud service and pays for it, eventually some dollars will end up in some IaaS provider's pocket. Whether you like it or not.
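
The dollars-trickling-down argument can be caricatured in a few lines of code. The sketch below is purely illustrative, with made-up interfaces: each layer only talks to the one directly below it, so every action at the SaaS level ultimately turns into IaaS consumption.

```python
# Purely illustrative, with made-up interfaces: each layer only calls the one
# directly below it, so usage at the top ultimately becomes IaaS consumption.

class IaaS:
    """Sells raw, elastic resources: create and destroy servers on demand."""
    def __init__(self):
        self.billed_hours = 0

    def create_server(self):
        self.billed_hours += 1              # every server booked is IaaS revenue
        return "server"


class PaaS:
    """Adds deployment, scaling and runtime services on top of an IaaS."""
    def __init__(self, iaas: IaaS):
        self.iaas = iaas

    def deploy_app(self, code):
        server = self.iaas.create_server()  # the PaaS buys infrastructure
        return f"{code} running on {server}"


class SaaS:
    """Ships business logic (CRM, ERP, ...) as a service on top of a PaaS."""
    def __init__(self, paas: PaaS):
        self.paas = paas

    def onboard_customer(self, name):
        return self.paas.deploy_app(f"crm tenant for {name}")


iaas = IaaS()
saas = SaaS(PaaS(iaas))
saas.onboard_customer("ACME")
print(iaas.billed_hours)  # > 0: SaaS dollars end up in the IaaS provider's pocket
```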

The infrastructure of PaaS providers

To bring supporting examples, let's check the most popular PaaS providers' infrastructures, as they are most likely obliged to reveal their backends in order to inform their customers about data center locations.

Most popular PaaS services rely on IaaS providers

| PaaS Provider | Supported Languages | IaaS backends |
|---|---|---|
| Heroku | Ruby, Java, Python, Node.js | AWS |
| Engine Yard | Ruby, PHP, Node.js | AWS, Verizon Terremark |
| AppFog | PHP, Java, Node, .NET, Ruby | AWS, Rackspace, HP Cloud Services |
| OpenShift | PHP, Java, Node.js, Python, Ruby | AWS, Rackspace |
| Nodejitsu | Node.js | Joyent |
| AppHarbor | .NET | AWS |
| CloudBees | Java | AWS, HP Cloud Services |

The cloud market is known to be huge and it is mandatory for every player in the IT industry today to take up a position, a vision and a direction within this space. If you're an investor who wants to participate in the cloud opportunity, it is extremely valuable to understand how different cloud models currently share the market. On the other hand, if you're an enterprise evaluating the adoption of any cloud service, you should be concerned about who's running the game up and down the cloud stack, as this will eventually affect your service level, your security and your data integrity.

POST UPDATE on 4/8/2013

I've been asked by Jack Clarke (@mappingbabel) of ZDnet on what basis I singled out the above PaaS providers as "most popular". The answer I gave him is press coverage as well as "on the field" experience, meaning talking to customers and gathering their stories. It's a personal feeling, not backed by any scientific data: I'm a field person, not a researcher. Besides, I don't think any of those providers is really willing to disclose customer data.

However, it's worth mentioning that there are other PaaS services offered by large vendors whose popularity is difficult to assess: the press usually refers to the vendor as a whole and, since they're no longer in the startup phase, you can't even measure the funding they've received from VCs. Despite this difficulty, I owe them a mention in this post for being active players in the PaaS landscape, contributing effectively to the cloud awareness battle.

And one can assume the above theory holds for those providers as well, for example with Elastic Beanstalk running on top of EC2 and App Engine running on top of Compute Engine. However, since in those cases the PaaS and the underlying IaaS come from the same vendor, no economic transaction is triggered and thus there is no real shift in the measurement of the market size.