Of all the seven Rs of cloud migration, “lift and shift”, or “rehost”, is perhaps the easiest to do, yet in many people’s opinion it provides the least benefit. As I often explain to clients, moving to the cloud without changing the way you work is just expensive hosting. Yet many organisations that want to move to cloud consider lift and shift as their first option.
Let’s take a look at why this migration method is so attractive to some, yet such anathema to others, and how you can make the most of it if you absolutely have to lift-and-shift.
A quick recap of the 7 Rs of Cloud Migration
Most discussions of the 7 Rs of cloud migration describe what they are, listing them in order of increasing effort (and therefore cost).
- Retire: Don’t migrate it, don’t keep it in the data centre – turn it off and save!
- Retain: Keep it in the data centre (“on premises”) and treat it as you always have.
- Relocate: vMotion it to AWS and run it on VMware Cloud on AWS (VMC)
- Rehost: “Lift and Shift” – don’t make architectural changes, just run the virtual machine on an EC2 instance in AWS
- Repurchase: Time for a change. Migrate to a different (better) solution, hosted in the cloud, possibly SaaS
- Replatform: Without fundamentally changing the solution architecture, replace components with cloud based services, such as moving the database layer to a managed service like RDS.
- Refactor: Rearchitect the solution to use cloud native architectures, including serverless and container-based approaches.
There are many in-depth discussions about how to determine the best treatment for each workload, complete with high level decision criteria and flow charts. Because these methodologies (rightly) take a pragmatic ROI approach, they focus on whether it is worth investing the effort required for each migration treatment, given the nature and strategic value of each workload.
What these discussions often leave out are the ongoing operational consequences of your ROI-based decision. What cloud benefits do you forgo as you live with your migration R of choice?
What is lift and shift (rehost)?
“Lift and shift,” also known as “rehosting,” means migrating an exact copy of your application or workload, including its operating system and data store, to the cloud without changes.
Because it is as much as possible an exact copy, there are several tools available to help you move your workload and rehost it in the cloud. These tools automate the continuous replication of your on-premises server, keeping it in sync until you cut over.
When should you choose to lift and shift?
Many cloud or devops purists might answer “never” if they were truly allowed to speak their mind, even though they know it is not a practical position to take for most businesses. They imagine a future world not anchored to the past, where every IT solution and business workload uses cloud technologies and paradigms to provide the resilience, agility and scale that business desires.
Most business case migration assessments, however, include lift and shift workloads. Certainly, if the business is committed to getting out of their co-lo data centre (perhaps at the end of a contract), then speed is often a priority, and there is simply not enough runway to transform all the workloads as a refactor, replatform or a repurchase. The philosophy is “focus on getting everything out, and we can worry about the rest later”. The very real danger, of course, is that kicking the can down the road can be habit forming, and “later” never comes.
There are three common cases for lift and shift in migrations even if a data centre exit is not the end goal. The first is the large (and often critical) legacy architecture enterprise system, where a hardware refresh is looming. As a legacy system, the architecture is based on old world assumptions, and the components aren’t easily replaceable with PaaS offerings. It possibly uses a database engine not available as an RDS option, or utilises features that rule out RDS.
The second case is dealing with those systems that nobody knows much about. The last person who did left years ago. We know the system is still used, but not how it fits into the bigger IT picture.
The third is the system that is already labelled for sunset, which we need for a bit longer while we wait for a different project to make it obsolete, or it dies a natural death. The cloud migration becomes the holding pattern.
Why is lift and shift less valuable?
To answer this question, we need to have a good understanding of the drivers behind moving to the cloud, and most importantly what it is about moving to cloud that provides “value”.
According to a 2018 IDC study of 27 organisations running workloads on AWS, only 8% of the average annual benefits came from infrastructure cost reductions. The other 92% were all productivity benefits.
How does running your workloads on AWS introduce productivity benefits?
The productivity benefits in this study were realised through user productivity due to risk mitigation and reduced downtime (13%), IT staff productivity (32%), and most importantly, increased business productivity (47%).
Lift and shift workloads see very few of these productivity benefits because they retain the same architecture they had on premises, and are managed using the same personnel, processes, and tools. In many cases this is traditional sysadmin clickops.
I quite like Brazeal’s lift and shift shot clock analogy. Once a basketball team gains possession of the ball the shot clock starts and they have 24 seconds to attempt a shot at goal. If they don’t attempt a shot that at least touches the rim before the shot clock expires, they lose possession.
Once your lift and shift workload hits AWS, your hosting shot clock starts. The short term tactical advantage you gained by hosting this workload in the cloud is now diminishing as you pay for cloud hosting without cloud productivity gain. You aren’t getting the operational efficiencies, and eventually the savings from avoiding hardware refresh will be eaten away. What’s more, your nice cloud environment has started to accrete technical debt, and if your cloud operations folks are on the hook to support a raft of click-ops pets, they’re not going to remain your happiest campers.
What if I have to lift and shift?
As mentioned above, there are several use cases that point to a lift and shift migration being an appropriate tactical choice. So if you are in the position of having to lift and shift one or more workloads, here are some suggestions.
1. Automate what you can
I often say to my clients that “cloud” is far more about “how” than “where”. To add more time to the shot clock, do what you can at the start to take advantage of changing to cloud ways of working.
This includes installing agents which provide cloud security, monitoring and observability. With lift and shift migrations, you don’t have the option to reset to a standard machine image with all of the required agents and configurations installed. That doesn’t mean the migrated servers shouldn’t be brought in line as much as possible. One way to do this is with post-launch actions in AWS Application Migration Service (AWS MGN).
At a minimum, automate backup and patching, and try to stop the accumulation of tech debt that comes from being marooned on systems several versions behind current.
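As a minimal sketch of the backup half of that advice, the snippet below builds a daily plan document in the shape AWS Backup's CreateBackupPlan operation accepts. The plan name, vault name, schedule and 35-day retention are illustrative assumptions, not recommendations, and the function only builds the document; submitting it is left as a comment. Patching would need its own automation, for example through AWS Systems Manager Patch Manager.

```python
import json


def daily_backup_plan(plan_name: str, vault_name: str, retention_days: int = 35) -> dict:
    """Build a BackupPlan document with one daily rule (values are assumptions)."""
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return {
        "BackupPlanName": plan_name,
        "Rules": [
            {
                "RuleName": f"{plan_name}-daily",
                "TargetBackupVaultName": vault_name,
                # AWS cron syntax (six fields): 03:00 UTC every day.
                "ScheduleExpression": "cron(0 3 * * ? *)",
                # Recovery points are deleted after this many days.
                "Lifecycle": {"DeleteAfterDays": retention_days},
            }
        ],
    }


if __name__ == "__main__":
    # "rehosted-servers" and "Default" are hypothetical names.
    plan = daily_backup_plan("rehosted-servers", "Default")
    print(json.dumps(plan, indent=2))
    # With boto3 installed and credentials configured, this could be submitted as:
    #   boto3.client("backup").create_backup_plan(BackupPlan=plan)
```

The plan still needs a resource selection (for instance, by tag) before it protects anything, so tagging the rehosted instances consistently at cutover makes this much easier.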
2. Add resilience where you can
While some may say we’re drifting into the area of re-architecting, there is a lot that can be done around the edges of your rehosted workload, without changing the application architecture. You don’t have to containerise a workload to improve its resilience and make it at least a little cloud-like.
Components that can be treated as immutable should be, and put behind load balancers. Even if they don’t scale horizontally, they should be put in an autoscaling group configured for a single instance across multiple availability zones, so that they will be replaced if they fail a health check.
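To make the single-instance pattern concrete, here is a rough sketch of the parameters for such a group, in the shape accepted by the Auto Scaling CreateAutoScalingGroup operation. The group name, launch template ID, subnet IDs and target group ARN are all hypothetical, and the code assumes the subnets you pass sit in different availability zones; it cannot verify that from the IDs alone.

```python
def single_instance_asg_params(name, launch_template_id, subnet_ids, target_group_arn):
    """Parameters for a self-healing group that always runs exactly one instance."""
    if len(subnet_ids) < 2:
        # Assumption: each subnet is in a different availability zone.
        raise ValueError("provide subnets in at least two availability zones")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateId": launch_template_id, "Version": "$Latest"},
        # Min = max = desired = 1: no horizontal scaling, only replacement.
        "MinSize": 1,
        "MaxSize": 1,
        "DesiredCapacity": 1,
        # Comma-separated subnet IDs let the replacement launch in another zone.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "TargetGroupARNs": [target_group_arn],
        # Use the load balancer's health check, not just the EC2 status check.
        "HealthCheckType": "ELB",
        "HealthCheckGracePeriod": 300,
    }


if __name__ == "__main__":
    params = single_instance_asg_params(
        "legacy-app",                      # hypothetical group name
        "lt-0123456789abcdef0",            # hypothetical launch template
        ["subnet-aaa111", "subnet-bbb222"],
        "arn:aws:elasticloadbalancing:ap-southeast-2:111122223333:targetgroup/legacy/abc",
    )
    print(params)
    # With boto3 and credentials available:
    #   boto3.client("autoscaling").create_auto_scaling_group(**params)
```

The five-minute grace period is a guess; a slow-starting legacy application may need longer before health checks begin to count against it.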
Even servers that can’t be treated as immutable can at least take advantage of EC2 automatic recovery. If the underlying hardware fails a system status check, this moves the instance to a new physical machine while keeping its instance ID, private IP addresses and EBS volumes.
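One way to set up recovery explicitly is a CloudWatch alarm on the system status check with the EC2 recover action attached. The sketch below builds the parameters for CloudWatch's PutMetricAlarm operation; the alarm name, the two-minute evaluation window and the example instance ID are assumptions. Many current instance types also recover automatically by default, so check whether you need the alarm at all.

```python
def auto_recovery_alarm_params(instance_id: str, region: str) -> dict:
    """Alarm that triggers the EC2 recover action when the host hardware fails."""
    return {
        "AlarmName": f"auto-recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        # System status checks cover the underlying host, not the guest OS.
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,
        # Two consecutive failed minutes before acting (an assumed threshold).
        "EvaluationPeriods": 2,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


if __name__ == "__main__":
    params = auto_recovery_alarm_params("i-0123456789abcdef0", "ap-southeast-2")
    print(params)
    # With boto3 and credentials available:
    #   boto3.client("cloudwatch").put_metric_alarm(**params)
```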
3. Security at the infrastructure layer
Security and compliance by design is not only implemented at the OS level. Many of the guardrails and controls sit outside the virtual machine instance (EC2). Your lift and shift workload can still benefit from micro-segmentation through security groups, and VPC network level controls (such as NACLs). It also gets the benefit of things like CloudTrail logs, Amazon GuardDuty, AWS Config, and Amazon Inspector, as well as organisational level controls such as service control policies (SCPs), which cap the maximum permissions available in an account.
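As an illustration of micro-segmentation, the sketch below builds an ingress rule that admits traffic to a database tier only from the application tier's security group, rather than from an IP range. The parameters follow the shape of the EC2 AuthorizeSecurityGroupIngress operation; the group IDs and port are hypothetical.

```python
def tier_ingress_params(target_sg_id: str, source_sg_id: str, port: int) -> dict:
    """Allow TCP on one port into target_sg_id, only from members of source_sg_id."""
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return {
        "GroupId": target_sg_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                # Referencing a security group instead of a CIDR block means the
                # rule keeps working when instances are replaced and IPs change.
                "UserIdGroupPairs": [
                    {"GroupId": source_sg_id, "Description": "app tier to db tier"}
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical group IDs; 1433 is the default SQL Server port.
    params = tier_ingress_params("sg-0db00000000000000", "sg-0app0000000000000", 1433)
    print(params)
    # With boto3 and credentials available:
    #   boto3.client("ec2").authorize_security_group_ingress(**params)
```

None of this requires touching the application or the operating system, which is what makes it practical for a system nobody fully understands.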
Take the opportunity to configure the server with secure versions of the protocols used by the application, if the application supports them, so that data is encrypted in flight. Data encryption at rest is a relatively trivial exercise on AWS, even for applications you have no knowledge of or control over.
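A simple starting point for encryption at rest is finding out which migrated disks are not yet encrypted. The helper below is a sketch that scans a response in the shape returned by the EC2 DescribeVolumes operation and lists the unencrypted volume IDs; the sample response is made up for illustration.

```python
from typing import List


def unencrypted_volume_ids(describe_volumes_response: dict) -> List[str]:
    """Return the IDs of EBS volumes whose Encrypted flag is false or missing."""
    return [
        volume["VolumeId"]
        for volume in describe_volumes_response.get("Volumes", [])
        if not volume.get("Encrypted", False)
    ]


if __name__ == "__main__":
    # Made-up sample in the shape of a DescribeVolumes response.
    sample = {
        "Volumes": [
            {"VolumeId": "vol-0aaa", "Encrypted": True},
            {"VolumeId": "vol-0bbb", "Encrypted": False},
        ]
    }
    print(unencrypted_volume_ids(sample))
    # Against a real account (large accounts need pagination):
    #   response = boto3.client("ec2").describe_volumes()
    #   print(unencrypted_volume_ids(response))
```

An existing unencrypted volume is typically remediated by snapshotting it, copying the snapshot with encryption enabled, and swapping in a volume created from the copy. Turning on EBS encryption by default for the account covers new volumes.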
4. Make sure you have a plan
If you don’t do anything to move towards scoring a strategic goal, your hosting shot clock will eventually expire. There are many factors that can affect the amount of time that goes on that shot clock when your cloud platform takes possession, including business criticality and strategic value, along with the rate of change and relative operational stability. In the absence of any other data point, however, aim to complete your strategic treatment, whether it’s replatform, repurchase or refactor, within a two-year timeframe.
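If it helps to put a number on a plan, the shot clock is easy to track per workload. This is a small sketch that takes the two-year default above as its only rule; the function names and the example dates are hypothetical, and a real plan would adjust the window for criticality and rate of change.

```python
from datetime import date


def shot_clock_deadline(cutover: date, years: int = 2) -> date:
    """Date by which the strategic treatment should be complete."""
    try:
        return cutover.replace(year=cutover.year + years)
    except ValueError:
        # Cutover was 29 February and the target year is not a leap year.
        return cutover.replace(year=cutover.year + years, day=28)


def days_remaining(cutover: date, today: date, years: int = 2) -> int:
    """Days left on the clock; negative once the deadline has passed."""
    return (shot_clock_deadline(cutover, years) - today).days


if __name__ == "__main__":
    # Hypothetical workload cut over on 1 March 2023, reviewed a year later.
    print(days_remaining(date(2023, 3, 1), date(2024, 3, 1)))
```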
Think bigger
It’s important to remember that your cloud migration is just the start of a journey, and that journey is a whole-of-business endeavour. The systems you had to lift and shift are planks forming a makeshift bridge between two sides of a divide. If you are committed to the efficiency and agility benefits that come from operating in cloud, you need to start looking to the future and make sure you’re not dragging your old world technical debt, in the form of those lifted and shifted planks, with you into the long term.
Now is the time to educate your finance teams and procurement function on cloud purchasing. You should be giving all your application vendors clear notice that in the future you will only use products that can be deployed and managed with cloud-based operations and automation. That includes resilient self-healing architectures that take advantage of autoscaling to handle capacity management. Let your vendors know that while you would like to stay with them, if they aren’t bringing their product to cloud native deployments then you will be moving to an alternative.
While lifting and shifting isn’t always wrong, use it only where it makes sense, and make sure you have a plan so that you are not stuck with it forever.
How long is left on your lift and shift shot clock?