Five steps for controlling cloud costs
Cloud costs accounted for nearly a third of IT budgets in 2021 and are predicted to increase dramatically in the coming years as more companies migrate to the cloud, so the need for organizations to get the highest possible value out of their cloud spend is clear. Actually doing so, however, is increasingly challenging because of the complexity of public cloud environments and the growing reliance on containers and microservices.
When it comes down to it, the cloud promises speed, but it doesn’t necessarily guarantee that speed at a lower cost than traditional data centers. Maximizing the efficiency of your cloud spend requires buy-in from the entire organization, from the company leadership that makes buying decisions, to the finance teams that track and monitor that spend, all the way down to the developers, engineers and architects responsible for building and implementing those solutions. While there are those who believe that this is just not possible in the world of the public cloud, more modern and diligent approaches have proved that it most certainly is.
With that in mind, here are five of the most important things to keep top of mind when launching a cloud optimization initiative.
Observability drives understanding and optimization
It should go without saying that you can’t optimize what you don’t understand, and this is especially true of cloud environments. The dynamic and inherently elastic nature of the cloud means that services come and go rapidly, are used in vast and varied ways, and are generally metered and priced differently. There are so many dimensions to understand that it can seem as if you need a Rosetta Stone to decipher your bill. Nevertheless, there is a clear, urgent and increasing demand for more granular data about how cloud costs are allocated.
The first step to understanding and getting control of cloud spend is to ensure that you have visibility into all the different facets of your cloud environment and can derive meaningful and actionable insights from the data that you collect. Rather than spending engineering time trying to pinpoint the root cause(s) of increased costs after they’ve occurred, put systems in place that will make it easy to pull and analyze data in as close to real time as possible, as well as accurately forecast your spend for the days, weeks, and months ahead.
For example, when spinning up a new piece of infrastructure, assign metadata to it, such as billing account, team, location and/or project. That metadata helps tremendously when allocating costs to the resources that incurred them and determining why your cloud bill is what it is.
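As a minimal sketch of why that metadata matters, the snippet below rolls up spend by one tag and lists which tags a resource is missing. The resource records, tag keys, costs and function names are all hypothetical, and no real cloud provider API is called; in practice the input would come from your provider's billing export.

```python
from collections import defaultdict

# Hypothetical tag keys, taken from the examples in the text.
REQUIRED_TAGS = ("billing_account", "team", "location", "project")


def missing_tags(resource):
    """Return the required tag keys that a resource lacks."""
    tags = resource.get("tags", {})
    return [key for key in REQUIRED_TAGS if not tags.get(key)]


def cost_by_tag(resources, tag_key):
    """Sum monthly cost per tag value; untagged spend lands in 'UNALLOCATED'."""
    totals = defaultdict(float)
    for resource in resources:
        value = resource.get("tags", {}).get(tag_key) or "UNALLOCATED"
        totals[value] += resource["monthly_cost"]
    return dict(totals)


# Made-up inventory for illustration only.
resources = [
    {"id": "vm-1", "monthly_cost": 420.0,
     "tags": {"billing_account": "ba-01", "team": "search",
              "location": "us-east", "project": "indexer"}},
    {"id": "vm-2", "monthly_cost": 310.0,
     "tags": {"billing_account": "ba-01", "team": "checkout",
              "location": "eu-west", "project": "payments"}},
    {"id": "db-1", "monthly_cost": 150.0, "tags": {"team": "search"}},
    {"id": "vm-3", "monthly_cost": 95.0, "tags": {}},
]

print(cost_by_tag(resources, "team"))
# -> {'search': 570.0, 'checkout': 310.0, 'UNALLOCATED': 95.0}
print(missing_tags(resources[2]))
# -> ['billing_account', 'location', 'project']
```

The "UNALLOCATED" bucket is the useful part: it shows how much of the bill nobody can currently explain.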
Correlate costs to business objectives
Once you have the level of observability in place necessary to get truly granular and digestible data, it’s much easier to correlate that cost data to what’s happening across the rest of your business. That data granularity and resource allocation make it much easier to understand how different development and engineering teams are consuming the cloud, and to put necessary guardrails in place to avoid nasty surprises in your bill.
However, it’s also important to keep in mind that increased costs aren’t necessarily a bad thing, particularly when they’re tied to overall business growth. Young companies might want to see their cloud spend increase if it means that they are growing and requiring more resources to sustain that growth.
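One way to tell healthy growth from waste is to divide spend by a business metric. The sketch below is an illustration with made-up numbers: the choice of metric (orders served), the 5% threshold and the function names are assumptions, not a prescribed method.

```python
def unit_cost(cloud_spend, business_units):
    """Cloud spend per unit of business output (e.g. per order served)."""
    if business_units <= 0:
        raise ValueError("business_units must be positive")
    return cloud_spend / business_units


def flag_unit_cost_increases(months, threshold=0.05):
    """Return labels of months whose unit cost rose more than `threshold`
    (as a fraction) over the previous month."""
    flagged = []
    previous = None
    for label, spend, units in months:
        current = unit_cost(spend, units)
        if previous is not None and (current - previous) / previous > threshold:
            flagged.append(label)
        previous = current
    return flagged


# Hypothetical data: (month, cloud spend in dollars, orders served).
months = [
    ("Jan", 40000.0, 200000),
    ("Feb", 50000.0, 250000),  # spend up 25%, but so are orders
    ("Mar", 66000.0, 300000),  # spend up 32%, orders up only 20%
]

print(flag_unit_cost_increases(months))  # -> ['Mar']
```

February's bill is 25% higher, yet the cost per order is unchanged, which is the growth scenario described above. Only March, where spend outpaces the business, is flagged for a closer look.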
Even unpredictable cost anomalies can sometimes be a good thing. Seasonal bursting during peak times, new software releases and regional expansion (or even just load testing for any of those events) will likely incur greater cloud costs, but they are also hallmarks of a successful business strategy. The challenge lies in rate optimization: making sure you pay the best available rate for what you consume, so that your money is being spent wisely. That endeavor, too, can be tied back to observability and data granularity.
Prioritize elasticity when architecting your systems
Understanding the importance of financial observability and how it can help you manage your cloud usage and spend is a good first step. Once you do, it’s crucial to architect your environment in a way that allows you to glean those insights.
It’s also inevitable that at some point your cloud infrastructure will start driving costs in an unforeseen way, and business objectives will likely change over time. When either of those things occurs, you need elasticity built into the architecture to accommodate the change. For instance, if a new ad campaign leads to more traffic on your site, you must be able to accommodate the increased demand and stress on the system, and then be able to do a post-mortem analysis to determine why and how those changes were implemented.
The method for doing this varies among the public cloud providers. For example, AWS provides elasticity through Auto Scaling groups (ASGs) that launch EC2 instances to scale up as needed. However, if you’re able to scale with Spot Instances rather than On-Demand Instances (the most expensive way to buy EC2 capacity), then you can save a great deal when those unforeseen demand increases occur.
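To make the potential saving concrete, here is a back-of-the-envelope sketch. The hourly rates, instance hours and function name are hypothetical and are not actual AWS prices; real Spot prices vary by instance type, region and time.

```python
def burst_cost(extra_instance_hours, on_demand_rate, spot_rate, spot_fraction):
    """Cost of a demand burst when `spot_fraction` of the extra capacity
    runs on Spot Instances and the remainder runs On-Demand."""
    if not 0.0 <= spot_fraction <= 1.0:
        raise ValueError("spot_fraction must be between 0 and 1")
    spot_hours = extra_instance_hours * spot_fraction
    on_demand_hours = extra_instance_hours - spot_hours
    return spot_hours * spot_rate + on_demand_hours * on_demand_rate


# Hypothetical rates in dollars per instance hour.
ON_DEMAND_RATE = 0.10
SPOT_RATE = 0.03
BURST_HOURS = 1000  # extra instance hours caused by the traffic spike

all_on_demand = burst_cost(BURST_HOURS, ON_DEMAND_RATE, SPOT_RATE, 0.0)
mostly_spot = burst_cost(BURST_HOURS, ON_DEMAND_RATE, SPOT_RATE, 0.7)
print(f"all On-Demand: ${all_on_demand:.2f}")  # -> all On-Demand: $100.00
print(f"70% Spot:      ${mostly_spot:.2f}")    # -> 70% Spot:      $51.00
```

Because Spot capacity can be reclaimed by the provider, the sketch keeps part of the burst On-Demand rather than assuming everything can move to Spot. The right split depends on how well your workload tolerates interruption.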
When building for elasticity, it’s important to configure your architecture so that the on-demand capacity you consume at scale is as cost-efficient as possible. This is why truly forward-leaning organizations look at architectural decisions before they even start the development process to ensure that they can leverage the cloud and structure their workloads in a way that won’t drain their finances.
Understand the risks of discount strategies
All public cloud providers offer ways to pay a lower rate for your instances, be it through Savings Plans or Reserved Instances with AWS, or Committed Use Discounts with GCP. Yet purchasing these plans comes with inherent risk, because the discount depends on committing to a level of usage in advance. Commit too much and you run the risk of paying for instances that never get used; too little and you end up paying a higher incremental rate than may be necessary. This obviously becomes an even bigger challenge when your business is in a growth stage, because there’s virtually no way to accurately predict how heavy your cloud usage will be over the course of the coming year.
When faced with that level of uncertainty, it’s probably best to play it safe and only commit to the minimum that you know for sure that you’ll consume over the length of the deal, and then try to optimize your on-demand spend as much as possible once you reach your commitment threshold. It may look like you left money on the table when the year is over and you have the benefit of hindsight, but the good news is that there are partners that you can work with to help you manage that commitment risk and ensure that you get the most out of whatever on-demand spend you incur.
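The trade-off can be illustrated with a small worked example. Everything here is an assumption made for the sketch: the rates (in cents, to keep the arithmetic exact), the usage profile and the simplified billing model, in which committed capacity is paid for every hour whether or not it is used and anything above the commitment is billed at the on-demand rate.

```python
def total_cost(hourly_usage, commitment, committed_rate, on_demand_rate):
    """Total cost in cents for a usage profile under a given commitment.

    hourly_usage: instances actually needed in each hour
    commitment:   instances committed to (paid for every hour, used or not)
    """
    cost = 0
    for used in hourly_usage:
        cost += commitment * committed_rate            # paid regardless of use
        cost += max(0, used - commitment) * on_demand_rate  # overflow
    return cost


# Hypothetical profile: a baseline of 4 instances with a peak of 10.
usage = [4, 6, 10, 6, 6, 4]
COMMITTED_RATE = 6   # cents per instance hour (assumed discount rate)
ON_DEMAND_RATE = 10  # cents per instance hour (assumed list rate)

for commitment in (0, 4, 6, 10):
    cost = total_cost(usage, commitment, COMMITTED_RATE, ON_DEMAND_RATE)
    print(f"commit {commitment:>2}: {cost} cents")
# -> commit  0: 360 cents
# -> commit  4: 264 cents
# -> commit  6: 256 cents
# -> commit 10: 360 cents
```

In this toy profile, committing only to the known minimum of 4 instances costs 264 cents, slightly more than the hindsight optimum of 256. Committing to the peak of 10 wipes out the discount entirely and costs the same as no commitment at all. That asymmetry is the case for playing it safe when usage is uncertain.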
Realize the scope of the challenge and plan accordingly
One of the biggest mistakes we see is companies trying to do all of this without the proper resources in place. It’s not like setting up an old-school IT budget, where you didn’t run much risk of exceeding your thresholds because the budget relied on fixed capital spending strategies, slower supply chains when you needed to expand, and centralized approvals for every decision.
The truth is that tracking and managing cloud spend and costs is a full-time job for at least one person (likely more) who not only has a great deal of expertise in cloud environments, but also the ability to contextualize cloud usage and spend to overall business objectives. It requires in-depth understanding of cloud infrastructure, financial expertise and the ability to create forecasting models that look months and years into the future.
Therefore, you have to decide if you want to try to do this all in-house and assume the risk that comes if you get it wrong. If you can’t or don’t want to invest in the necessary resources and expertise to accomplish all this, then you need to work with a technology-enabled partner that can provide advice on how to architect your cloud environment, manage your reservations and instances to ensure optimal rates and coverage, and provide support for unforeseen roadblocks along the way. Choosing one that offers SaaS solutions to carry out many of these functions (rather than just make recommendations) is the only truly scalable option.
Additionally, keeping up with the continuous and rapid changes in cloud infrastructure is an almost impossible task for humans to do on their own. Intelligent technology has emerged in recent years as the only way to keep pace, and the strongest ecosystem partners will supply it to you so that your systems don’t become outdated and inefficient.
Getting all of this off your plate not only reduces the risk that you take on, but also frees up resources that can be dedicated to what actually matters most: growing your business.
John Purcell is Chief Product Officer, DoiT International