Cloud Computing

May 08, 2013

Do You Have a Plan for Switching Cloud Providers?

Last week's announcement from Xeround that it is discontinuing its public database service may have taken some people by surprise.


Xeround's web site claims that they power 32,000 applications, and surely a few of those will have trouble moving out on such short notice. It doesn't matter whether you use a small cloud provider like Xeround or a larger one like AWS or Azure: having a plan for moving your workloads between providers is important for your business continuity.


Here are a few things you want to think about when you develop such a plan.

Do you have all the login information?

This one sounds silly, but you will be surprised how many companies don't keep track of who has access to their cloud provider accounts. Having a central lock box for such passwords and auditing access to it has long been an enterprise practice, but startups normally don't have such a rigorous process. When you receive a message from your provider, your first question should not be: "Does anyone know how to log in?"

Is your application provider agnostic?

There is a lot of debate about the so-called "provider lock-in" and I am not going to dig deeper into it. If you use provider-specific functionality in your application, make sure you follow these steps:

  1. Know which provider services or functionality you use and where they are used
  2. While developing or migrating your application to the cloud, abstract those services from the main logic of the application
  3. Estimate the effort needed to remove the custom functionality
  4. Look at a few other cloud providers and plan what it would take to re-code the application with their services
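
Step 2 above can be sketched in code. This is a minimal Python illustration, not a prescription: the `BlobStore` interface, the `InMemoryBlobStore` adapter and the `save_report` function are hypothetical names invented for this example. A real adapter would wrap one provider's SDK behind the same interface, so that switching providers means writing a new adapter rather than touching the application logic.

```python
from abc import ABC, abstractmethod
from typing import Dict


class BlobStore(ABC):
    """Hypothetical provider-neutral interface for file storage."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class InMemoryBlobStore(BlobStore):
    """Stand-in adapter; a real one would call a provider's SDK here."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def save_report(store: BlobStore, name: str, body: str) -> str:
    """Application logic depends only on BlobStore, never on a provider."""
    key = f"reports/{name}.txt"
    store.put(key, body.encode("utf-8"))
    return key


store = InMemoryBlobStore()
key = save_report(store, "q1", "all good")
print(key, store.get(key).decode("utf-8"))
```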

What is the business impact of the switch?

With the current level of standardization, switching between cloud providers is not a trivial task. In a lot of cases you may incur some downtime. Think about the following:

  1. Is downtime acceptable for the business?
  2. How long will the switch (and the downtime) take?
  3. Does it require development effort, and if so, how much?
  4. Does it require data migration?
  5. What will be the impact on the customers?

How easy is it for you to obtain your code and data?

The assumption is that you own your code and data although they are hosted on the cloud provider's infrastructure. But the problem with the data is that it can be… well, a lot. Getting your data may be one of the problem areas. While your application is running, it is most probably collecting information and storing it either in a cloud database or a cloud file system. Over time the amount of data grows, and downloading it over the Internet may be troublesome.


Work with your cloud provider to have a plan to retrieve your data upon request in the easiest way for you. 
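
To get a feel for why volume matters, a back-of-the-envelope estimate helps. The sketch below is illustrative only: the data sizes and the link speed are made-up inputs, and the function ignores protocol overhead, throttling and retries, so real transfers will take longer.

```python
def transfer_hours(data_gb: float, link_mbps: float) -> float:
    """Hours needed to move data_gb gigabytes over a link_mbps link.

    Uses decimal units (1 GB = 8,000 megabits) and assumes the link
    is fully utilized, which is optimistic.
    """
    megabits = data_gb * 8_000
    seconds = megabits / link_mbps
    return seconds / 3600


# Hypothetical inputs: 500 GB and 5 TB over a 100 Mbit/s connection.
for size_gb in (500, 5_000):
    print(f"{size_gb} GB: {transfer_hours(size_gb, 100):.1f} hours")
```

With these assumed numbers, 500 GB takes about 11 hours and 5 TB about 111 hours, that is more than four days of uninterrupted transfer.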

How supportive is your current cloud provider?

A good cloud provider will have an established support channel that is constantly monitored by its employees. Having a phone number that you can call 24/7 should be a requirement. You should work with your cloud provider throughout the switch process to make sure it goes smoothly.


Last but not least, your cloud provider should give you enough advance notice to enable a smooth migration.

 

It is not a bad exercise to replay the scenario at least once to see whether you have accounted for all the variables. Here is a challenge for you! If you have only a week to move your app from one cloud provider to another (as is the case with Xeround), can you make it?

May 01, 2013

How Do You Choose Your Cloud Provider?

If you have been thinking about how to choose your public cloud vendor, you are not the only one. There are hundreds of offerings to choose from, and comparing them can be a cumbersome exercise. Hence most people just run to the vendor (or technology) that they are most familiar with or that gives them the best price. This is all good until they discover that… well, that vendor is not what they have been looking for.

 

Lately I've been asked a few times: "Which public cloud provider would you recommend to deploy our application?" No matter how much I want it to be, the answer is unfortunately not that simple. Here are a few questions to ask yourself while doing the research.

Do you plan to migrate existing applications or to develop brand new applications? 

Very few, if any, legacy applications are designed with cloud patterns in mind. Things like sticky sessions, admin access, usage of local machine resources or direct access to the OS are often deeply woven into the application logic, and ripping them out may be equivalent to starting from scratch. In such cases you may be better off keeping those apps in house or limiting yourself to IaaS providers. Even in the IaaS case you may not be able to use some of the functionality, sticky sessions for example. For enterprises this is the predominant scenario, and if you want to get to the answer more easily you need to be very familiar with your internal application portfolio.

 

The answer is much simpler for startups - they don't have the legacy apps baggage of the enterprises. 

What technology do you plan to use?

If you plan to migrate a legacy application then you don't have a choice - you are stuck with the technology it is already using. Over the past decade Java and .NET made their way into the enterprise world, and most public cloud providers offer them as a choice. However, you need to be careful about what each provider means by Java and .NET support. Very few PaaS providers allow you to move your legacy application to the cloud without any modifications - most of the time the vendor supports just the language on their platform, not the full technology stack.

 

PHP and Ruby seem to be the choices for startups, and a lot of vendors offer them.

What is your developers' expertise?

If your staff spent most of their lives developing in Java or .NET and you want to use PHP or Ruby in the cloud, you are obviously not making the right choice. The migration to the cloud should be smooth. Think about the ramp-up time they will need just to learn this new paradigm. Although very promising and exciting, it will require them to change a lot of the development and design processes they have been using for years. It will be easier for them if they don't need to learn yet another language.

 

Ruby is the choice for startups because it is easy for prototyping but may not be well suited for large scale apps. If your startup gets traction you may need to migrate from one cloud vendor to another.

What is your budget?

The question of price is always pressing. It is a mistake, though, to make decisions based on price alone. Two things going on right now impact the prices of cloud computing. One is the big players like Amazon and Microsoft, who are almost in a price war, and the other is the numerous new players who try to enter the market by offering more competitive prices. While the first one is good for you, the second is, in my opinion, not so desirable.

 

The reason is that some of those smaller players may not survive the pressure of the lower prices and may end up out of business in a year or two. If you are one of their unlucky customers, you may need to migrate your applications again even before you complete your first migration. This doesn't mean that you should not choose any of the smaller providers. What it means, though, is that you should always have a backup plan, and this is true for AWS and Azure as well.

How long has the vendor been on the market?

Last but not least, you should think about how stable and experienced the provider is. There are a lot of players in the area, which is good for bringing the price down, but the newer ones lack the experience that the older ones have. And once again the question here is not about size - think about HP, which released its cloud just recently, vs Joyent, which has been hosting wordpress.com for years.

 

Choosing a cloud provider should be approached thoughtfully, with consideration of all the above factors and maybe a few more. Building a solid plan and evaluating as many choices as possible is your best bet for migrating your apps successfully.

 

April 29, 2013

Business Strategy for Enterprise Cloud Startups

With cloud computing becoming the center of almost every new enterprise IT project, more and more startups decide to compete in the area. This raises the question: "Are they ready to fulfill the enterprise needs?" Forget the need to have one big customer. This can open a few doors, but if your startup's business strategy is wrong they will be shut soon. It is true that one prominent customer can boost your sales, but in my opinion there are two more important things that can help your startup get customers fast.

The first question you need to ask yourself is: "Do I target the right audience?" If you want to play in the cloud space you need to look at two different points of view - the developer one and the IT one - and make sure that you understand both. And trust me, John Engates, CTO of Rackspace, is right in saying "the traditional datacenter admin, even the CIO, don't necessary understand a developer's standpoint" (see Cloud Operability and the Battle for the Open Cloud at forbes.com).

If your product is targeted at a developer audience, the capabilities you should highlight are speed and agility (aka faster time-to-market), integration with existing development tools and abstraction from the underlying infrastructure. Think about whether your product offers services that developers can stitch together to deliver their application faster. Or whether it makes the build and deployment process smoother and faster. Or whether your product allows them to run their application uninterrupted without the need to deploy and configure new instances every time something fails.

If you target IT admins, on the other hand, you need to think more about cost savings, standards and security, and integration with the underlying infrastructure. A customer once told me: "IT will always be on the expense side of the balance sheet. The more we are able to save costs the longer we will keep our jobs". Does your product help IT teams reduce operating expenses or capital expenditures? Does it follow established standards and allow easy integration with existing infrastructure and applications? Does it require extensive training and ramp-up time?

The other question you need to ask yourself is: "What is my pricing strategy?" Subscription models are nice because they guarantee you regular payments, but they do not always work. Some enterprises may have rules to finance their IT through credit instruments like bonds, and such can only be issued for capital expenditures. On the other hand, if you only offer perpetual licenses then the customer may not be willing to upgrade at the pace you want them to. Getting the right balance between pricing and needs can only be achieved if you know your audience. If you target developer organizations, OpEx may be fine because they are revenue driven. IT teams, on the other hand, are expense driven, and they may need more time to prove that your product saves money than your subscription license allows them.

The more traction the cloud industry gains, the harder it will be for new startups to enter the area. Hence, defining your audience and pricing strategy early on will be crucial for the success of your venture.

March 27, 2013

The Importance of Private Clouds

A few days ago I noticed a question on a LinkedIn group that made me think about how important the notion of private clouds is. First, let's briefly look at the difference between public, private and community clouds as well as hybrid clouds. Once again, those are very well defined in the NIST Definition of Cloud Computing, but stated in simple words they are:


  • Private Cloud is cloud infrastructure that belongs to a single organization (enterprise, university, government organization etc.), is hosted either on or off premise, and is managed by the organization or by a third party contracted by the organization. The key point for a private cloud is that the infrastructure is dedicated to this particular organization. Very often, though, you will notice that the term is used only for cloud infrastructure that is hosted in the organization's datacenter.
  • Community Cloud is very similar to the Private Cloud, with the difference that instead of belonging to a single organization it belongs to a community of customers (which can be organizations). You can look at the community cloud as a special case of the private cloud.
  • Public Cloud is cloud infrastructure that belongs to the cloud vendor and is open for use by the general public. What this means is that any organization or individual can use the public cloud infrastructure to deploy and run their application.
  • Hybrid Cloud is any mix of the above three. However, most often the term "hybrid" is used to describe the extension of the private cloud to the public infrastructure for a specific organization.

Sometimes you will also hear the term personal cloud. What it refers to is either a cloud simulation environment or scaled back cloud environment that you can run on your desktop. Personal clouds can also be specialized cloud appliances that are used for in-house development and testing. 


Now, let's go back to private clouds and look at why they are so important.

If you look back in time you will notice that organizations have been developing their own IT infrastructure for years. This infrastructure has been used and continues to be used to host business critical applications for the organization, and will exist for years to come. The two most important reasons why organizations will continue to build and manage their own data centers are:


  • Data sovereignty - they would like to keep business critical data internally and not expose it through widely available public interfaces
  • Regulations - certain organizations deal with Personally Identifiable Information (PII) and must comply with rules like PCI or HIPAA

But there are a few others that a lot of people do not consider right now.


The first one is the so-called "vendor lock-in". With the lack of standardization between cloud providers, moving workloads between them is still a hassle. There are of course third-party vendors who make the process easier, but this comes at a higher price.


This brings us to the second reason that can make private clouds more attractive - the financials, or more specifically the Operational Expenses (OpEx). At a certain point the benefits of using public infrastructure start to diminish. This happens because the operational cost, although significantly reduced, continues to be carried by the organization (think of managing VMs or applications deployed on the public cloud the same as managing VMs and applications on-premise). At some point, though, the margin that you pay to the cloud vendor can outgrow the operational cost of managing the infrastructure on-premise. This may not apply to a business that runs tens or even hundreds of servers, but it is certainly true for an enterprise that needs thousands of servers.


The last one is also related to the financials, but this time it is the Capital Expenditures (CapEx). All the infrastructure that organizations have built over the years still needs to be utilized. They have the choice to continue using it in the traditional way and incur OpEx as until now, or to repurpose it into private cloud infrastructure and significantly reduce their management expenses.


With all the factors outlined above, it is clear that private clouds will play a significant role in the cloud computing trends over the next couple of years.

March 20, 2013

Migrating Legacy Applications to the Cloud

With everybody jumping on the cloud computing bandwagon lately, developers and architects need to spend extra time analyzing applications that can become good candidates for migration. It is wrong to believe that every legacy application can be easily migrated from the traditional on-premise infrastructure to any cloud computing environment. Therefore such migration efforts should be approached carefully and systematically.

Let's look at a couple of issues that you may face when trying to migrate legacy applications to the cloud.

Client-Server Applications

Client-server applications are characterized by tight coupling between the business logic and the data tier. Most of the time the business logic is implemented as stored procedures in the database, and pulling it out can be a substantial effort. In addition, such applications establish a sticky session between the client and the server, which violates common cloud architecture patterns and complicates the migration process.

The obvious approach for migrating client-server applications to the cloud is to gradually abstract the business logic into a service layer and deploy the latter to the cloud. The cleaned-up data tier can still be hosted on the current infrastructure until the time comes to either migrate the data or retire it. At a high level you should follow these steps:

  • Identify the business services that are exposed to the clients
  • Implement those services as a separate business layer
  • Deploy the new business layer on a cloud enabled infrastructure (either IaaS or PaaS)
  • Implement a thin client layer on top of the services (in certain cases you may be able to modify the existing clients to connect to the services instead of the data tier)
  • Roll out the new client among your users
  • Retire the business logic in the data tier

This approach provides a smooth migration because it postpones the migration of the data, a highly critical business component, to a later stage. In the meantime the organization gains important knowledge and discovers potential issues with the cloud technologies.

Scheduled Tasks

Scheduled tasks or batch jobs are another legacy application pattern that can introduce some challenges when migrating to the cloud. The premise of such applications is that they are triggered either at certain intervals or by a new batch of data that gets delivered. Most of the time the latter approach involves transfers of files between machines. Two things that are at the core of such applications contradict modern cloud architectural patterns:

  • The reliance on always-up machines that will trigger the execution at certain intervals
  • The reliance on always-available file system used for file exchange

The functionality that such applications provide is easily achieved through the queue-centric workflow pattern as described by Bill Wilder in his book Cloud Architecture Patterns. However, redesigning those legacy applications to use message queues can be a substantial implementation effort. Hence you should approach the migration in phases. For jobs that rely on file transfers you can use these steps:

  • Change the jobs to use cloud storage instead of local file systems
  • Add functionality at the delivery side to drop a message in the queue in addition to dropping the file
  • Remove the polling functionality in the processing job and instead use the message in the queue as a triggering mechanism
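
The three steps above can be sketched with Python's standard library standing in for the cloud services. This is a toy model under stated assumptions: the `cloud_storage` dictionary and the in-process `queue.Queue` are stand-ins for a real storage service and a real message queue, and `deliver` and `process_next` are hypothetical names.

```python
import queue
from typing import Dict

cloud_storage: Dict[str, bytes] = {}            # stand-in for cloud storage
work_queue: "queue.Queue[str]" = queue.Queue()  # stand-in for a message queue


def deliver(name: str, content: bytes) -> None:
    """Delivery side: store the file, then announce it with a message."""
    cloud_storage[name] = content
    work_queue.put(name)


def process_next(timeout: float = 1.0) -> int:
    """Processing side: block on the queue instead of polling a directory.

    Returns the number of bytes processed for the announced file.
    """
    name = work_queue.get(timeout=timeout)
    data = cloud_storage[name]
    work_queue.task_done()
    return len(data)


deliver("batch-001.csv", b"id,amount\n1,10\n2,20\n")
print(process_next())
```

The processing job no longer needs an always-up machine that scans a file system on a timer; it only needs to be able to read the next message whenever it runs.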

For the scheduled tasks you need to change the implementation to use messages in the queue instead of time intervals to trigger the tasks.

You can achieve additional benefits if you add Map-Reduce as part of your modern application design. 

Scale Up Applications

Last but not least is the type of application that relies on additional local resources in order to handle increased loads. Such resources can be CPU speed, memory or disk storage. Unfortunately such applications are hard to migrate to the cloud unless they get redesigned to use horizontal instead of vertical scaling. Most of the time such challenges are imposed at the data tier of the application and can be solved through data sharding.

The process for migration involves:

  • Analyzing the data and potential de-normalization
  • Identifying the shard key
  • Splitting the data amongst the shards
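
Once a shard key is chosen, routing a record to a shard can be as simple as hashing the key. The sketch below assumes a hypothetical `customer_id` shard key and four shards. It uses a stable hash from `hashlib` rather than Python's built-in `hash()`, whose value for strings changes between processes.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # assumed number of shards


def shard_for(shard_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical rows, keyed by customer_id.
rows = [{"customer_id": f"cust-{i}", "total": i * 10} for i in range(10)]

shards = defaultdict(list)
for row in rows:
    shards[shard_for(row["customer_id"])].append(row)

for index in sorted(shards):
    print(index, [r["customer_id"] for r in shards[index]])
```

Plain modulo hashing moves most keys to a different shard when the number of shards changes, so the shard count is worth deciding with growth in mind.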

The bottom line is that the gains for the organization from the above-mentioned migration approaches are:

  • Improved (and more cloud-ready) application architecture
  • Enabled economies of scale at the different tiers of the application

However, the biggest benefit is the cloud computing knowledge that the organization gains throughout the process.

March 06, 2013

Are there Other "as-a-Service" Cloud Offerings?

As already described in the previous few articles, NIST defines three different service models for the cloud - IaaS, PaaS and SaaS. However, if you look at the Wikipedia article about cloud computing you will notice that there are quite a few more "as-a-service" models mentioned there. A lot of vendors are using those to differentiate their offerings, but two problems arise:

  • First, those additional models are not officially recognized by any standardization authority and
  • Second, and maybe more important, those new acronyms add to the confusion among the novice cloud users

The proliferation of service models confuses even people with long experience with the cloud and doesn't help with the adoption. If you look closely you will notice that all of the new "as-a-service" models are just a subset of one of the three main service models - IaaS, PaaS and SaaS. Let's look at them one by one!

  • Network-as-a-Service (or NaaS) - The network is the lowest level in the application stack, and its configuration and management have always been considered an infrastructure component. NaaS is just the networking part of IaaS. Network-as-a-Service is not new as a concept - Akamai, Limelight and the TelCos have been offering networking services for more than a decade already, either as Content Delivery Networks (CDNs), Virtual Private Networks (VPNs) or leased lines (bandwidth).
  • Storage-as-a-Service (or STaaS) - Storage is the next level in the application stack and similar to the Network layer it should be considered infrastructure. 
  • Security-as-a-Service (or SECaaS) can be looked at from two different points of view. The first one is as a platform component (think of authentication and authorization services) that one can use when building new applications. The other one is as Software-as-a-Service that one uses on a subscription basis, for example for antivirus and anti-malware detection.
  • Data-as-a-Service (or DaaS) - even Wikipedia's article mentions that Data-as-a-Service is a cousin of SaaS - data and software quite often go together. The difference here is that, besides monetizing their investment in code (software) through a service model, vendors also monetize their investments in data through the same service model.
  • Desktop-as-a-Service (or DaaS) - the first thing you need to be careful about here is the overuse of the acronym DaaS. Now that tablets have become mainstream, having a powerful desktop on demand that is accessible from anywhere is becoming a necessity. Nevertheless, Desktop-aaS can be looked at from two different points of view - either as part of the infrastructure (or IaaS) that enables new types of applications, or as software-as-a-service where the software is the operating system that it runs. The concept is not new at all - it even reminds us of the old days when we had big supercomputers with multiple terminals connected to them. In addition, services like remote desktop access and terminal servers have been available for decades.
  • Integrated-Development-Environment-as-a-Service (or IDEaaS) - this is just a specialized type of software-as-a-service (i.e. one targeted at developers). There are numerous examples here, but maybe the most prominent ones are Cloud9 and Orion. However, we should not think of IDEs as editors only but as a whole development infrastructure. I would add Atlassian, GitHub and Microsoft's hosted TFS to that category.
  • Database-as-a-Service (or DBaaS) - the database normally sits between the Operating System and the Middleware in the application stack, and its logical place is as part of the platform (PaaS).
  • Test-Environment-as-a-Service (or TEaaS) - test environments are just specialized infrastructure or can be easily simulated on platform-as-a-service. Hence TEaaS blends easily in either IaaS or PaaS.
  • API-as-a-Service (or APIaaS) - in my humble opinion this one is quite confusing. The problem is that the majority of the functionality in the cloud is exposed as APIs (hence it warrants the as-a-service part in its name), and this acronym is no more descriptive than saying "service-as-a-service". I would certainly require more clarification from a vendor who has such an offering.
  • Backend-as-a-Service (or BaaS) - BaaS became very popular with the development of mobile applications. In essence, though, BaaS is a platform-as-a-service that offers services specific to mobile applications (like location-based or check-in services, for example). A good example of BaaS is Buddy.
  • Integration-Platform-as-a-Service (or IPaaS) - As its name says, this is just an integration PaaS, and you can consider every PaaS vendor that offers Service Bus and messaging functionality an IPaaS vendor too.

The lesson here is to look at any new "as-a-service" model as something that can easily be described with the three main cloud computing service models. Some of the acronyms above are legit vendor offerings that complete the cloud computing picture (NaaS or Data-aaS for example) while others are just another name for one of the three main service models (like BaaS and IDEaaS) used by vendors for differentiation.

February 26, 2013

Evaluating Cloud Computing Uptime SLAs

Last week's Windows Azure Storage outage made me think about how many of us evaluate the vendor's Service Level Agreement (SLA) before we decide to deploy workloads in the cloud. I bet many think about it only when it is too late.

Let's take the Windows Azure SLA and see how we as consumers of the cloud services are protected in case of downtime. First, though, I would like to point out that it is in the nature of any service (public or private) to experience an outage once in a while - think about the power outages that we hear about or live through every winter. It is important to understand that this will happen, and as users of cloud services we need to be prepared for it. In this post I will use Windows Azure as an example, not because their services are better or worse than those of other cloud vendors but to illustrate how SLAs impact us and how they differ from vendor to vendor.


Each SLA (or at least the ones that the bigger cloud vendors offer) contains a few main sections:

  • Definitions - defining the terms used in the document
  • Claims - describing how and under what terms one can submit a claim for incidents as well as how much you will be credited
  • Exclusions - describing in what cases the vendor is not liable for the outage
  • The actual SLAs - these can be of two types:
    • Guaranteed performance characteristics of the service
    • Uptime for the service

Looking at the Windows Azure SLAs web page, the first thing you will notice is that there are different SLAs for each service. You don't need to read all of them unless you utilize all of the services the vendor offers. The main point here is that you need to read the SLAs for the services you use. If, for example, you use Windows Azure Storage and Windows Azure Compute, you will notice that the uptime guarantees for those differ by 0.05 percentage points (Compute has an uptime guarantee of 99.95% while Storage has an uptime guarantee of 99.90%). Although this difference looks negligible at first sight, using an SLA calculator you will notice that the expected downtime for Storage is twice the expected downtime for Compute. The closer the guaranteed uptime is to 100%, the better the service is.
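
The arithmetic behind such a calculator is short. The sketch below assumes a 30-day month, and the function name is my own rather than anything taken from a vendor's tooling.

```python
def allowed_downtime_minutes(uptime_percent: float, period_hours: float) -> float:
    """Downtime (in minutes) that an uptime guarantee permits in a period."""
    return (1 - uptime_percent / 100) * period_hours * 60


MONTH_HOURS = 30 * 24  # assumption: a 30-day month

compute = allowed_downtime_minutes(99.95, MONTH_HOURS)
storage = allowed_downtime_minutes(99.90, MONTH_HOURS)
print(f"Compute (99.95%): {compute:.1f} min/month")
print(f"Storage (99.90%): {storage:.1f} min/month")
```

Under these assumptions the figures come to 21.6 and 43.2 minutes per month, which is where the factor of two comes from.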


The next thing that you need to keep in mind is the timeframe over which the uptime is calculated. In the case of Windows Azure the uptime is guaranteed on a monthly basis (for both Storage and Compute). In comparison, Amazon's EC2 has an annual uptime guarantee. Monthly SLA guarantees are preferable because they cover the case where the service experiences a severe outage in a particular month and stays up the rest of the year. To illustrate, imagine that EC2 experiences a 3-hour outage in a particular month and stays up for the next 11 months. This outage is within the roughly 4 hours and 23 minutes (4:22:48) of acceptable downtime per year that a 99.95% guarantee allows, and you will not be eligible for credit for it. If, on the other hand, the SLA guarantee is on a monthly basis, you will be eligible for the maximum credit because the outage far exceeds the roughly 21 minutes of acceptable downtime per month.
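
Here is the same example in code, as a rough sketch: it assumes a 365-day year and a 30-day month, and it ignores the exact credit rules, which differ from vendor to vendor. The function name is hypothetical.

```python
def breaches_sla(outage_hours: float, uptime_percent: float, period_hours: float) -> bool:
    """True if the outage exceeds the downtime the guarantee allows."""
    allowed_hours = (1 - uptime_percent / 100) * period_hours
    return outage_hours > allowed_hours


OUTAGE_HOURS = 3.0

# Annual guarantee: about 4.38 hours of downtime are allowed per year.
print("annual: ", breaches_sla(OUTAGE_HOURS, 99.95, 365 * 24))
# Monthly guarantee: about 0.36 hours (21.6 minutes) are allowed per month.
print("monthly:", breaches_sla(OUTAGE_HOURS, 99.95, 30 * 24))
```

The 3-hour outage stays inside the annual allowance but is far outside the monthly one.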


One note about the acceptable downtime. In reality hardware in cloud datacenters fails all the time, which may result in downtime for your particular service but will not impact other services or workloads. Such outages are normally covered by the exclusion clause of the SLA and are your own responsibility. You should follow the standard architectural practices for cloud applications and always make your services redundant in order to avoid this. The acceptable downtime metric is calculated for outages that impact a vast number of services or customers. Surprisingly, though, nowhere in the SLAs is it mentioned how many customers need to be impacted in order for the vendor to report the outage. It may happen that a rack of servers in the datacenter goes down and a few tens of customers are impacted for some amount of time. If you are one of those, do not expect to see an official statement from the cloud vendor about the outage. As a rule of thumb, if the outage doesn't show up in the news you may have a hard time proving that you deserve credit.
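
To see why redundancy pays off, consider a simplified model in which a service is up as long as at least one of its replicas is up. The sketch below assumes that replicas fail independently, which real deployments only approximate, and the 99% single-instance availability is a made-up input for illustration.

```python
def combined_availability(single: float, replicas: int) -> float:
    """Availability of a service that is up if at least one replica is up.

    Assumes replicas fail independently; shared racks, power and
    networks make real failures more correlated than this.
    """
    return 1 - (1 - single) ** replicas


# Hypothetical single-instance availability of 99%.
for n in (1, 2, 3):
    print(n, f"{combined_availability(0.99, n):.6f}")
```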


The last thing to keep in mind when evaluating SLAs from big cloud providers is Beta and trial services. It is simple - there are no SLAs for services released as Beta. You are free to use them at your own risk, but don't expect any guarantees for uptime from the vendor.


Where the so-called secondary cloud providers are concerned you need to be much more careful. Those providers (and there are a lot of them) build their services on top of the bigger cloud vendors and thus are very dependent on the uptime of the big guys. Hence they don't publish standard SLAs but negotiate the contracts on a customer-by-customer basis. Most of the time this is based on the size of the business you create for them, and you can rely on good terms if you are a big customer. Of course they put a lot of effort into helping you design your application for redundancy, to avoid the risk of invoking the SLA because of a primary vendor outage. In the opposite case, where you are a single developer, you may end up without any guarantees for uptime from smaller cloud vendors.

February 11, 2013

There is more to PaaS than you think

As described in last week's post, NIST defines three different cloud computing service models - IaaS, PaaS and SaaS. IaaS and SaaS are really easy to grasp, but I see people struggling to understand the PaaS model. As a long-time application developer, though, I find the PaaS model the most compelling one for new applications. Here is why.

I will look at two examples: one from the enterprise and one from the consumer world.

Let's start with the enterprise scenario. If you examine any enterprise application portfolio you will find that almost every development team has implemented its own code for handling common functionality like authentication, authorization, database access etc. There are also numerous cases where the same team developed the same functionality over and over in each new project. Even the componentization model doesn't help in this situation, because either developers are not aware of the existence of the components or there are so many options to choose from that they cannot find the right fit for their scenario. Service Oriented Architecture (SOA) was supposed to be the holy grail for this problem, but many enterprises are still far from achieving this goal.

The next problem that you will see in enterprises is that each application has its own way of accessing common services and resources like external systems, databases and storage. This results in configuration sprawl and configuration management overhead.

Last but not least, purchasing and provisioning the necessary infrastructure for every application is a long and tedious process that significantly impacts time-to-market and adds unnecessary tension between the IT department and the business groups.

Implementing PaaS in the enterprise can alleviate each one of those problems: it provides common functionality like authentication, authorization, database access, messaging etc. as shared services, reduces configuration sprawl through a central service catalog and dynamic reconfiguration, and decouples the underlying infrastructure from the application. In addition, PaaS leverages the underlying IaaS functionality to provide load balancing, high availability and auto-scaling at the application level.

All PaaS benefits from the enterprise scenario apply equally to a consumer application. They are even more important in mobile, where the growth of users can become exponential. Delivering fluent, scalable and reliable functionality can be crucial to the success of every mobile application. Getting it to market fast, though, is one of the most important parts. By leveraging PaaS services, mobile application developers can build new experiences quickly and easily, without spending time on reimplementing basic functionality. Common features like location awareness, push notifications and even Instagram-like filters are offered by many public PaaS providers - mobile application developers just need to stitch those together into a new experience and publish it on the app stores. The device-independent nature of those services also makes cross-platform rollouts several times faster than if everything had to be implemented from scratch.

More than a decade ago application servers advanced the way new applications are developed by offering a common framework and a set of reusable components. Platforms-as-a-Service are the next step in the evolution of application development, adding inherent cloud computing characteristics like elasticity, on-demand self-service and measured service.

January 30, 2013

Cloud Computing Service Models

In this post I will look at the three different service models for cloud computing as defined by NIST. More specifically I will look at the management and operations overhead for each one of the models and compare it to the traditional on-premise model.

Cloud Service Models

Traditional Model

Let's look at how things have been done in the past. Traditionally enterprises have been responsible for managing their own IT infrastructure as well as the software stack that runs their applications. For small companies that meant hiring polyglot employees with a wide range of skills, varying from low-level networking to high-level application support. For larger ones, which can afford more staff, it meant creating specialized teams responsible for only networking infrastructure, or only storage, or only servers and virtualization. However, for a lot of those enterprises the core business has never been managing IT infrastructure - the only thing they are interested in is managing their line-of-business (LOB) apps.

Here are just some of the tasks enterprise IT teams have been required to do in the past:

  • Build racks with servers and wire them into the network
  • Build storage arrays and wire them into the network
  • Configure routers
  • Configure firewalls and DMZ zones
  • Install operating system software on the servers
  • Create virtual machines (if virtualization is utilized)
  • Install operating system software on the virtual machines
  • Install databases, set up replication and backups
  • Install middleware used for hosting the application code
  • Patch and update operating system software
  • Patch and update databases
  • Patch and update middleware
  • Patch and update runtime
  • Install application software
  • Patch and update application software

Although long, this is by no means the complete list of tasks that IT personnel have been responsible for. From the list above only the last two (install application software and update application software) have been essential to the core business of the enterprise. In addition to the IT operational costs (OpEx), enterprises also incurred significant capital expenditures (CapEx) to procure the necessary hardware.

More than a decade ago hosting providers recognized the need to help businesses with those tasks and allowed them to outsource the build-up of infrastructure, and concentrate on just managing their applications. Although hosting providers helped enterprises with OpEx and CapEx they still lacked some of the essential cloud characteristics like on-demand self-service, rapid elasticity and measured service as outlined in Essential Cloud Computing Characteristics.


Infrastructure-as-a-Service (IaaS) Model

The IaaS model was the first model to comply with the NIST cloud computing characteristics. In essence it offers a cloud computing environment consisting of virtual machines. It offers a self-service portal where you can start a virtual machine on demand with your preferred operating system; it is broadly accessible; it is elastic (you can easily start an identical virtual machine or shut down an existing one); it uses a pool of virtual machines that are collocated on common hardware; and finally it measures your usage of those virtual machines.

If you look at the picture above you will see that the IaaS model provides automation in the lower layers (up to the virtualization layer) of the application stack. What that means is that tasks like starting the virtual machine, adding it to the network, configuring the routing and the firewalls, and attaching storage to it are done automatically by the automation software. The vendor that provides the service is also responsible for handling any hardware failures and servicing the underlying hardware.

As you have already noticed the IaaS model provides cloud services up to the virtualization layer of the application stack. However as a consumer of the IaaS service you are still responsible for managing the virtual machine. Hence you are still responsible for patching and updating the operating system on the VM, installing and maintaining any databases or middleware that your application uses in addition to maintaining your actual application.

IaaS is very similar to the traditional hosting model with the added benefits of self-service, elasticity and metering.


Platform-as-a-Service (PaaS) Model

With PaaS you have far fewer things to worry about. As you can see from the picture, the whole stack needed for your application is managed by the vendor. Your only responsibilities are your application and the data your application uses. In addition to the tasks managed in the IaaS case, the vendor (or in the case of private PaaS the platform owner) is also responsible for patching and updating the operating system, and for installing and maintaining the middleware as well as the runtime your application uses.

One important thing that you need to be aware of when using PaaS is that the automatic updates the vendor performs may sometimes have a negative impact on your application. Why is that? Very often OS and middleware vendors make incompatible changes between versions of their software. If your application depends on any underlying OS or middleware functionality, it may break between platform updates. And because you are not in control of those updates, you may end up with your application being down.

The premise of PaaS, though, is to offer not only a maintenance-free application stack but also additional services that you can utilize in your application. Very often PaaS providers expose middleware and databases as services and abstract the connectivity to those through APIs in order to free developers from the need to locate the actual systems. Additional services can be authentication and authorization, video encoding, location-based services etc. Using the PaaS services allows you to abstract your application from the underlying stack, and as long as the APIs are kept intact it will be protected from failures between platform updates.
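
The idea of reaching a service through a stable API instead of a hard-coded location can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ServiceCatalog class, the "orders-db" service name and the host names are hypothetical and do not correspond to any particular PaaS provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceBinding:
    """Connection details the platform hands to the application."""
    name: str
    url: str


class ServiceCatalog:
    """Hypothetical stand-in for a PaaS service catalog."""

    def __init__(self):
        self._bindings = {}

    def register(self, name, url):
        # Called by the platform operator, not by the application,
        # for example when a database is moved to a new host.
        self._bindings[name] = ServiceBinding(name, url)

    def lookup(self, name):
        if name not in self._bindings:
            raise KeyError(f"no service named {name!r} in the catalog")
        return self._bindings[name]


def connection_url(catalog, service):
    # Application code asks for a service by its logical name only.
    return catalog.lookup(service).url


catalog = ServiceCatalog()
catalog.register("orders-db", "postgres://db-host-1.internal/orders")
print(connection_url(catalog, "orders-db"))

# A platform update relocates the database; the application code is unchanged.
catalog.register("orders-db", "postgres://db-host-2.internal/orders")
print(connection_url(catalog, "orders-db"))
```

The application only ever calls connection_url with a logical name, so a change of host during a platform update does not require touching the application.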


Software-as-a-Service (SaaS) Model

SaaS is the model with the highest abstraction and offers the most maintenance-free option. As a SaaS consumer you are just using the software offered by the vendor. As depicted in the picture, the whole stack is maintained by the vendor. This also includes updates for the application as well as application data management. The SaaS model is very similar to the off-the-shelf software model where you go and buy the CD, install the software and start using it.

Traditionally one of the hardest problems application developers had to deal with was data migration between different versions. SaaS vendors are also responsible for migrating your data and keeping it consistent. Similar to the off-the-shelf software model, you can rely on being able to access and read your data once you upgrade to a new version.

The SaaS model is the most resource-efficient model because it utilizes application multi-tenancy. What this means is that the same application instance handles multiple user organizations. This is good for both the vendor and the customer, because better resource utilization brings the maintenance costs down and hence the price of the services down. On the other side, though, tenant data is commingled and there is a security risk of one tenant accidentally getting access to another tenant's data.
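
To make the multi-tenancy trade-off concrete, here is a minimal Python sketch of one shared store serving several tenants. The TenantStore class and the tenant names are hypothetical; a real SaaS application would enforce the same rule in its database or data-access layer.

```python
class TenantStore:
    """One shared store holding records for many tenants (hypothetical)."""

    def __init__(self):
        # Commingled data: every tenant's rows live in the same list.
        self._rows = []

    def add(self, tenant_id, record):
        self._rows.append((tenant_id, record))

    def records_for(self, tenant_id):
        # Every read is filtered by tenant. Forgetting this filter is
        # the kind of bug that exposes one tenant's data to another.
        return [record for owner, record in self._rows if owner == tenant_id]


store = TenantStore()
store.add("acme", "invoice-1")
store.add("globex", "invoice-7")
store.add("acme", "invoice-2")
print(store.records_for("acme"))
```

One instance and one data structure serve everybody, which is where the efficiency comes from, and the tenant filter on every read is what keeps the tenants apart.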

Although not exhaustive the cloud computing service models explanation above should be enough to kick-start your initial discussion about your cloud strategy.

January 21, 2013

Essential Cloud Computing Characteristics

If you ask five different experts you will get maybe five different opinions on what cloud computing is. And all five may be correct. The best definition of cloud computing that I have ever found is the National Institute of Standards and Technology Definition of Cloud Computing. According to NIST the cloud model is composed of five essential characteristics, three service models, and four deployment models. In this post I will look at the essential characteristics only, and compare them to the traditional computing models; in future posts I will look at the service and deployment models.

Because computing always implies resources (CPU, memory, storage, networking etc.), the premise of cloud is an improved way to provision, access and manage those resources. Let's look at each essential characteristic of the cloud:

On-Demand Self-Service

Essentially what this means is that you (as a consumer of the resources) can provision the resources at any time you want to, and you can do this without assistance from the resource provider.

Here is an example. In the old days, if your application needed additional computing power to support growing load, the process you went through was roughly as follows: call the hardware vendor and order new machines; once the hardware is received, install the operating system, connect the machine to the network, configure any firewall rules etc.; next, install your application and add the machine to the pool of other machines that already handle the load for your application. This is a very simplistic view of the process, but it still requires you to interact with many internal and external teams in order to complete it - those can be, but are not limited to, hardware vendors, IT administrators, network administrators, database administrators, operations etc. As a result it can take weeks or even months to get the hardware ready to use.

Thanks to cloud computing, though, you can reduce this process to minutes. The whole lengthy process comes down to a click of a button or a call to the provider's API, and the additional resources are available within minutes. Why is this important?

Because in the past the process involved many steps and usually took months, application owners often over-provisioned the environments that host their applications. Of course this results in huge capital expenditures at the beginning of the project, resource underutilization throughout the project, and huge losses if the project doesn't succeed. With cloud computing, though, you are in control and you can provision only enough resources to support your current load.

Broad Network Access

Well, this is not something new - we've had the Internet for more than 20 years already and the cloud did not invent this. And although NIST says that the cloud promotes the use of heterogeneous clients (like smartphones, tablets etc.), I do think this would be possible even without the cloud. However, there is one important thing that in my opinion the cloud enabled that would be very hard to do with the traditional model. The cloud made it easier to bring your application closer to your users around the world. "What is the difference?", you will ask. "Isn't that the same as the Internet or the Web?" Yes and no. Thanks to the Internet you were able to make your application available to users around the world, but there were significant differences in the user experience in different parts of the world. Let's say that your company is based in California and you have a very popular application with millions of users in the US. Because you are based in California, all servers that host your application are either in your basement or in a datacenter nearby, so that you can easily go and fix any hardware issues that may occur. Now, think about the experience that your users will get across the country! People on the East Coast will see slower response times and possibly more errors than people in the West. If you wanted to expand globally, these problems would be amplified. The way to solve this issue was to deploy servers on the East Coast and in any other part of the world that you wanted to expand to.

With cloud computing though you can just provision new resources in the region you want to expand to, deploy your application and start serving your users.

It again comes down to the cost that you incur by deploying new datacenters around the world versus just using resources on demand and releasing them if you are not successful. Because the cloud is broadly accessible, you can rely on having the ability to provision resources in different parts of the world.

Resource Pooling

One can argue whether resource pooling is good or bad. The part that raises the most concerns among users is the colocation of applications on the same hardware or on the same virtual machine. Very often you hear that this compromises security, can impact your application's performance and can even bring it down. Those have been real concerns in the past, but with the advancement in virtualization technology and the latest application runtimes you can consider them outdated. That doesn't mean that you should not think about security and performance when you design your application.

The good side of resource pooling is that it enables cloud providers to achieve higher application density on a single piece of hardware and much higher resource utilization (sometimes going up to 75% to 80%, compared to the 10%-12% of the traditional approach). As a result, the price for resource usage continues to fall. Another benefit of resource pooling is that resources can easily be shifted to where the demand is, without the customer needing to know where those resources come from and where they are located. Once again, as a customer you can request from the pool as many resources as you need at a certain time; once you are done using them you can return them to the pool so that somebody else can use them. Because you as a customer are not aware of the size of the resource pool, your perception is that the resources are unlimited. In contrast, in the traditional approach application owners have always been constrained by the resources available on a limited number of machines (i.e. the ones that they have ordered and installed in their own datacenter).
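
A quick back-of-the-envelope calculation shows what those utilization figures mean in hardware terms. The workload size (300 units) and per-machine capacity (10 units) below are made-up numbers for illustration; only the 12% and 75% utilization levels come from the ranges quoted above.

```python
def machines_needed(required_units, units_per_machine, utilization_pct):
    """Machines required when each one runs at the given average utilization.

    Integer arithmetic (percent instead of a fraction) avoids floating
    point rounding surprises in the ceiling division.
    """
    if not 0 < utilization_pct <= 100:
        raise ValueError("utilization_pct must be between 1 and 100")
    usable_x100 = units_per_machine * utilization_pct
    # Ceiling division: round up to a whole machine.
    return (required_units * 100 + usable_x100 - 1) // usable_x100


# Assumed workload: 300 units of work, 10 units of capacity per machine.
traditional = machines_needed(300, 10, 12)
pooled = machines_needed(300, 10, 75)
print(traditional, pooled)
```

Under these assumptions the same workload needs 250 machines at 12% utilization but only 40 at 75%, which is the kind of density gain that lets providers keep lowering prices.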

Rapid Elasticity

Elasticity is tightly related to the pooling of resources and allows you to easily expand and contract the amount of resources your application is using. The best part here is that this expansion and contraction can be automated and thus save you money when your application is under light load and doesn't need many resources.

In order to achieve this elasticity in the traditional case the process would look something like this: when the load on your application increases you need to power up more machines and add them to the pool of servers that run your application; when the load on your application decreases you start removing servers from the pool and then powering them off. Of course we all know that nobody does this, because it is much more expensive to constantly add and remove machines from the pool, and thus everybody runs the maximum number of machines all the time with very low utilization. And we all know that if the resource planning is not done right and the load on the application is so heavy that the maximum number of machines cannot handle it, the result is an increase in errors, dropped requests and unhappy customers.

In the cloud scenario, where you can add and remove resources within minutes, you don't need to spend a great deal of time doing capacity planning. You can start very small, monitor the usage of your application and add more and more resources as you grow.
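
The automated expansion and contraction described above usually boils down to a simple rule evaluated on a schedule. Here is a minimal Python sketch of one such rule, assuming a target-tracking approach; the 60% target and the instance limits are arbitrary illustrative values, not any provider's defaults.

```python
import math


def desired_instances(current, avg_utilization_pct, target_pct=60,
                      min_instances=1, max_instances=20):
    """Size the pool so that average utilization lands near the target.

    If 4 instances run at 90% and the target is 60%, the same load
    spread over 6 instances would sit at roughly 60%.
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    wanted = math.ceil(current * avg_utilization_pct / target_pct)
    # Never scale below the floor or above the budget ceiling.
    return max(min_instances, min(max_instances, wanted))


print(desired_instances(4, 90))   # heavy load: grow the pool
print(desired_instances(4, 15))   # light load: shrink the pool
```

A scheduler would call this with the latest monitoring data and then ask the provider to add or release instances to match the result, so you only pay for the larger pool while the load is actually there.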

Measured Service

In order to make money, cloud providers need the ability to measure resource usage. Because in most cases cloud monetization is based on the pay-per-use model, they need to be able to give customers a breakdown of how much of each resource they have used. As mentioned in the NIST definition, this allows transparency for both the provider and the consumer of the service.

The ability to measure resource usage is important to you, the consumer of the service, in several different ways. First, based on historical data you can budget for the future growth of your application. It also allows you to better budget new projects that deliver similar applications. It is also important for application architects and developers, who can optimize their applications for lower resource utilization (in the end everything comes down to dollars on the monthly bill).
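
A pay-per-use bill is just metered quantities multiplied by unit prices, which is easy to sketch in Python. The resource names, rates and usage figures below are invented for illustration and do not reflect any real provider's pricing.

```python
from decimal import Decimal

# Hypothetical unit prices in dollars; real providers publish their own.
RATES = {
    "vm_hours": Decimal("0.08"),
    "storage_gb_month": Decimal("0.10"),
    "egress_gb": Decimal("0.12"),
}


def bill(usage):
    """Return a per-resource cost breakdown and the total for one month."""
    lines = {
        resource: RATES[resource] * Decimal(quantity)
        for resource, quantity in usage.items()
    }
    return lines, sum(lines.values(), Decimal("0"))


# Assumed metered usage for one month.
usage = {"vm_hours": 720, "storage_gb_month": 50, "egress_gb": 100}
lines, total = bill(usage)
for resource, cost in lines.items():
    print(f"{resource}: ${cost:.2f}")
print(f"total: ${total:.2f}")
```

Keeping a few months of such breakdowns gives you the historical data for budgeting, and the per-resource lines show architects where an optimization would actually reduce the bill.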

On the other side it helps the cloud providers to better optimize their datacenter resources and provide higher density per hardware. It also helps them with the capacity planning so that they don't end up with 100% utilization and no excess capacity to cover unexpected consumer growth.

Compare this to the traditional approach, where you never knew how much of your compute capacity was utilized, how much of your network capacity was used, or how much of your storage was occupied. In rare cases companies were able to collect such statistics, but those were almost never used to provide financial benefit for the enterprise.

Knowing those five essential characteristics, you should be able to recognize the "true" cloud offerings available on the market. In the next posts I will go over the service and deployment models for cloud computing.