The Cloud Operating Model

The Cloud Operating Model is a high-level representation of how an organization will deliver on its Cloud Strategy. It is a blueprint for organizing effectively to provide the capabilities and outcomes required to deliver value through cloud services. Once the Cloud Strategy has been defined, ratified, and communicated, the Cloud Operating Model defines the operational processes required for an organization to execute on the strategy.

Cloud services (either created or brokered) cannot be supported effectively by traditional Enterprise IT organizations. The siloed approach to operations – network, operating system, storage, middleware teams – will be less effective supporting cross-functional, cross-domain services. These specialized teams remain critical to providing the subject matter expertise in their respective domains, but we need a dedicated set of resources to manage the new model. Just as the Cloud Operating Model provides a layer of abstraction above hybrid and public cloud complexity, dedicated teams are needed to provide a similar layer of abstraction between our IT consumers and our core infrastructure teams. These resources can be net new or trained and moved from existing teams.

The Cloud Strategy

In my previous post “The Importance of a Cloud Strategy” I discussed why a good cloud strategy is imperative for a successful IT transformation initiative. The Cloud Strategy aligns business outcomes with technical outcomes and defines a high-level framework for governance and management of resources in a multi-cloud world. Where the Cloud Strategy defined the “what” and “why” of cloud service adoption, the Cloud Operating Model defines the “how” and “who” for the ongoing governance, management and operationalization of cloud services.

The Cloud Platform

“Cloud Platform” describes the technology platform that supports the delivery and operations of IT services that adhere to the principles of cloud computing (self-service, shared resource pools, metered chargeback, etc.). The Cloud Platform is a critical component of the Cloud Operating Model. It supports the creation and management of cloud services and includes the technical capabilities required to adopt some of the operational best practices used by the hyper-scale cloud providers to increase operational efficiency in enterprise organizations, such as self-driving operations and programmable provisioning.

[Figure: Cloud Platform]

The Cloud Platform provides the technical capabilities needed to support the following types of cloud services:

  • Private Cloud Services – Built internally, usually on on-prem infrastructure, on the Cloud Platform.
  • Brokered Services – Public Cloud Services brokered and managed via the Cloud Platform.
  • Public Cloud Services – Services built and maintained outside of the Cloud Platform with the potential for the platform to discover and bring under management.

The VMware Cloud Management Portfolio provides all of the capabilities needed for a comprehensive Cloud Platform.

An Operating Model by any other name…

Like everything related to “cloud” in the industry, there is no agreed-upon definition for “Cloud Operating Model”. Similar concepts are described as “Cloud Adoption Framework”, “Cloud Operating Architecture”, “Service-Oriented Model” or any number of other combinations. I define the Cloud Operating Model as the operational processes required to maximize the benefit from cloud service adoption. In other words, which operating concepts will the organization adopt in order to operate more like a cloud service provider – to be able to rapidly scale services and resources, onboard or create new services efficiently, and provide a secure, trusted cloud platform for internal and external consumers?

Like all operating models, the Cloud Operating Model consists of the people, processes, and technology required to execute successfully.

[Figure: The Cloud Operating Model]

People

Despite the relentless focus on automation to increase operational efficiency, people are still the most critical component of the operating model. What changes are needed to the organizational structure to operate effectively? How will we organize to prioritize communication and collaboration? Are there new roles to be defined? Do we need to recruit new skillsets, or can we train existing resources? How do we ensure there are clear lines of accountability and responsibility? Who are our consumers and stakeholders?

Consumer Personas

It is important to understand the different personas that will be the consumers of your services. In the past, application teams (either developers or business units buying packaged software) would interact with IT via a project manager and/or ticketing system to request a resource from a standard offering of VMs and operating systems. IT Operations was usually responsible up to the operating system (and standard packages) and beyond that responsibility was handed to the application team. In the Cloud Operating Model, if you do not understand the use-cases and requirements of the consumer you will not be able to meet their needs. For each service delivered using the cloud service model, the persona should be well understood – what they need, when they need it, how long they will need it for, and how they want to consume the services. A high-level analysis of consumer personas is in the Cloud Strategy, but the onboarding or creation of each service should provide more detail on the target and future consumers of the service.

[Figure: Personas]

Stakeholder Personas

Your stakeholders can be your biggest supporters or your biggest detractors. Stakeholders are defined in the Cloud Strategy, and the Operating Model must provide a feedback loop to make sure that the success of an initiative is communicated to all the right people. IT teams are historically not great at self-promotion but need to break the perception of the expensive, slow, inflexible group. KPIs associated with each service must be shared regularly with the relevant stakeholders, preferably via a report or dashboard capability.

[Figure: Stakeholder Personas]

Operational Teams

We are no longer delivering infrastructure components; we are delivering solutions consumed as cloud services, and supporting them requires that the operational teams be organized around the services. Depending on your approach to transformation (evolutionary or revolutionary), this may be one big reorganization effort or a phased approach. A good first step is the creation of a cross-functional team with experts from the critical IT areas represented. Often this matures into a “Cloud Center of Excellence” that is responsible for setting standards, overseeing governance and making decisions about the onboarding or creation of new services. As the organization matures further, this team will either be superseded or augmented by service delivery teams that are dedicated to, and organized around, the services delivered by the Cloud Service Model.

[Figure: Cloud Operations]

The example shown here is not a reporting structure, but rather a hierarchy of governance, best practices, and shared services. The reporting structure is important, however. Candidly, I have encountered too many enterprise organizations where the VP stakeholders – infrastructure, applications, architecture, and strategy – are not aligned (or worse, competing). This makes it very hard to be successful. The earlier all stakeholders are included in the requirements gathering and planning stages for the new delivery model, the more support you will get from those critical leadership roles. To identify the critical stakeholder in your organization, continue moving up the organizational chart until you reach the person who can make the final decision on all aspects of the plan. This is usually the CIO but could be a VP if you have a combined organizational structure. Stakeholder alignment should have been achieved as part of the cloud strategy ratification. If you find yourself encountering resistance at the operating layer, you may have more work to do on the creation and communication of the cloud strategy.

Process

For generations IT has operated environments that place reliability and stability above all else – after all, if users cannot access the solution then IT is considered to be failing. This led to a host of frameworks, processes and protections that, while created in good faith, led to environments paralyzed by processes. It was often easier to simply say no – to change, to new requests, to customizations – than to navigate the warren of process and policy. Think ITIL, Change Management Boards and Architecture Review Boards.

Automation can help in some areas, but it is not enough. Fundamental change is required. Consider a simple VM provisioning process. In the first private cloud environment that I built, bringing a VM online required a change ticket (with 48 hours' advance notice), and someone had to speak to the change at the Change Review Board, which met twice a week. Presumably this was because someone, at some time in the past, had brought a system online without a unique IP address or had inadvertently promoted an infrastructure server that caused havoc. Nobody ever asked questions about the change, and the exercise was widely accepted to be a waste of time. Human intervention in this case added zero value. What was required was a record that the change had happened. I automated this into the server build process – create CR, create RFC, build server, update CR, close RFC – and we made the change a pre-approved change. Now, if there was an issue, there was a record of the activity in the system.

There are hundreds of similar processes involved in the day-to-day operations of an enterprise IT organization. Some can be automated; some are no longer needed at all, but nobody can remember why they were created in the first place, so people are hesitant to change them. For 30+ years IT operations has been groomed to work in a reactive, CYA mode – the opposite of “move fast and break things”. Enterprise IT cannot continue to survive in this mode. Application teams are adopting new technologies that allow them to build resiliency into the application, and the ability to move fast is a must; if internal IT cannot adapt, it risks becoming irrelevant.
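To make the pre-approved change idea concrete, here is a minimal Python sketch of the build sequence described above (create CR, create RFC, build server, update CR, close RFC). Everything in it is an assumption for illustration: `ChangeLog` is a hypothetical in-memory stand-in for a real change-management system, `fake_build` stands in for the real build automation, and the VM name and address are made up. It is not the tooling I used at the time.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ChangeLog:
    """Hypothetical in-memory stand-in for a change-management system."""
    events: List[str] = field(default_factory=list)
    status: Dict[str, str] = field(default_factory=dict)

    def open(self, kind: str, summary: str) -> str:
        # Record ids are simple counters, e.g. "CR-1", "RFC-2".
        record_id = f"{kind}-{len(self.status) + 1}"
        self.status[record_id] = "open"
        self.events.append(f"create {record_id}: {summary}")
        return record_id

    def update(self, record_id: str, note: str) -> None:
        self.events.append(f"update {record_id}: {note}")

    def close(self, record_id: str) -> None:
        self.status[record_id] = "closed"
        self.events.append(f"close {record_id}")


def provision_vm(name: str, log: ChangeLog, build: Callable[[str], str]) -> str:
    """Build a VM as a pre-approved change, recording every step."""
    cr = log.open("CR", f"pre-approved build of {name}")
    rfc = log.open("RFC", f"provision {name}")
    try:
        address = build(name)
        log.update(cr, f"{name} built at {address}")
        return address
    except Exception as exc:
        # A failed build still leaves a record behind for troubleshooting.
        log.update(cr, f"build of {name} failed: {exc}")
        raise
    finally:
        log.close(rfc)


def fake_build(name: str) -> str:
    """Placeholder for the real build automation; returns a made-up address."""
    return "10.0.0.10"


log = ChangeLog()
address = provision_vm("dev-web-01", log, fake_build)
for event in log.events:
    print(event)
```

The point of the sketch is that no human approval step appears anywhere in the flow, yet every build, successful or not, leaves an auditable trail in the change system.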

The Cloud Operating Model requires a complete review of existing operational standards, policies, and processes to determine which can be ported from the legacy way of operating, and those that should be updated or retired.

There may be a requirement to create new processes and rewrite policies. When IT did not know which applications or data were on systems they often defaulted to a position of “err on the side of caution”. This meant architecting every environment with the highest level of data protection, resiliency and security. This is an expensive, inflexible position to take and does not scale. The Cloud Operating Model requires that you understand the applications running in the environment, that data classification processes are in place, and that you can automate the provisioning requests to place resources in the environments that meet their needs. Want to deploy a development machine for a week of testing? Great, choose the resource from the self-service catalog and allow the automated logic to place the system on a cheaper tier of infrastructure with a built-in lease period where the resource will be decommissioned when it expires.
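The placement logic behind a request like that development machine might look something like the following Python sketch. The tier names, classification labels, default lease length, and the `Request`, `Placement`, `place`, and `is_expired` names are all hypothetical; a real Cloud Platform would express this as policy in its own tooling rather than in hand-written code.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

DEFAULT_LEASE_DAYS = 30  # assumed default for non-production requests


@dataclass(frozen=True)
class Request:
    environment: str                   # e.g. "dev", "test", "prod" (assumed labels)
    data_classification: str           # e.g. "public", "internal", "restricted"
    days_needed: Optional[int] = None  # how long the consumer needs the resource


@dataclass(frozen=True)
class Placement:
    tier: str
    lease_expires: Optional[date]      # None means no automatic decommissioning


def place(request: Request, today: date) -> Placement:
    """Pick an infrastructure tier and lease from what we know about the workload."""
    if request.data_classification == "restricted":
        tier = "tier-1-protected"      # highest protection only where the data needs it
    elif request.environment == "prod":
        tier = "tier-2-standard"
    else:
        tier = "tier-3-economy"        # cheaper tier for short-lived, low-risk work

    lease_expires = None
    if request.environment != "prod":
        days = request.days_needed if request.days_needed is not None else DEFAULT_LEASE_DAYS
        lease_expires = today + timedelta(days=days)
    return Placement(tier, lease_expires)


def is_expired(placement: Placement, today: date) -> bool:
    """True once the lease has run out and the resource can be decommissioned."""
    return placement.lease_expires is not None and today >= placement.lease_expires


# A development machine for a week of testing:
placement = place(Request("dev", "internal", days_needed=7), date(2024, 1, 1))
print(placement)
```

Note that the decision depends entirely on inputs the old model did not have: the data classification and the intended lifetime of the resource. Without those, the only safe default is the most expensive tier for everything.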

Common operating processes need to be reviewed, streamlined or replaced to provide an efficient, consistent experience across public and private cloud environments. Examples* are listed below.

*This is not an exhaustive list. Yes, it is overwhelming. It has taken decades to get to where we are today, and it will take some effort to reevaluate.

  • Data classification and data protections
  • Compliance
  • Workload placement policy and process
  • Access control
  • Lifecycle management
  • Change management
  • Financial model – chargeback or funding of shared, short-term resources, budgeting, reporting
  • Integration process with backend systems, shared system requirements
  • Governance processes and policy enforcement processes
  • Responsibility and Accountability
  • Streamlined decision-making processes
  • Monitoring, SLAs, reporting, alerting, logging, incident response, service monitoring
  • Resource rightsizing, dynamic scaling
  • License management

DevOps and Agile

Agile is a software development methodology that embraces the change inherent to project requirements over time. It prioritizes iterative change and cross-functional collaboration more than traditional methods like waterfall do. The success of Agile for software development has led to the adoption of these concepts across the larger IT organization, with varying degrees of success. What is clear is that the adoption of Agile for software development requires an IT organization to be able to respond more quickly, a capability that McKinsey & Company calls “Agile Infrastructure”. The recommendations that McKinsey & Company makes for agile infrastructure are very much aligned with the Cloud Operating Model principles described in this post.

DevOps defines a culture where development and operational teams are embedded to own all aspects of a service or application lifecycle, to deliver software faster and more reliably.

“DevOps”, “Agile” and other processes can be considered components of the Cloud Operating Model. There is enough overlap in organizational structure and processes that they do not conflict. My recommendation is to adopt the pieces of each that make the most sense for your organization.

Technology

Technology provides the foundation for this new model, and the choices made here can make it either much easier or much harder to be successful. For this reason, VMware is quite prescriptive on our technology recommendations to support the Cloud Operating Model, starting with the multi-cloud capabilities that we consider essential to deliver and manage cloud services. We believe the tools that you use to provide the supporting functionality must be able to support the heterogeneous resources required to build the solutions. We believe that there is an essential set of capabilities required to support a Cloud Operating Model and that they are consistent across your private, hybrid and multi-cloud environments.

We layer these capabilities from a solid, consistent infrastructure foundation (inclusive of private and public resources), up through the stack to application delivery and observability. I introduced these capabilities at the start of this blog post; let’s review them again here.

[Figure: Cloud Platform]

Our VMware Cloud Foundation solution, along with our cloud management portfolio (vRealize Suite and CloudHealth), offers all of the capabilities that we consider critical to a private, hybrid and multi-cloud environment. I will go into more detail regarding capabilities essential to support a cloud platform in an upcoming blog post.

[Figure: Cloud Management]

Next Steps

The change does not have to happen all at once, although there are certainly customers who are implementing these changes more aggressively than others. The reward is certainly worth the considerable effort, but many organizations have no resource capacity to do anything beyond keeping the lights on. This is why we recommend starting with the adoption of technologies to improve the operational efficiency layer of the operating model; not only will this free up hardware resources for reallocation, but it will free up human resources so that they can focus on building the service delivery capabilities that are critical to the Cloud Operating Model.

Final Thoughts

People remain the most critical component of the Cloud Operating Model, and making sure teams have the right support in place to make this transition is essential – from retraining existing resources to hiring new ones, from updating hiring practices and compensation models to attract and retain talent, to updating on-call and flex-time policies. VMware has made this transition internally and is guiding thousands of customers through various stages of the transition. Reach out to your VMware team to hear more about how we can help.

The post The Cloud Operating Model appeared first on VMware Cloud Management.
