Why enterprises need to take control of their cloud costs [Q&A]
Modern businesses are more reliant than ever on the cloud, but it's easy to develop an 'out of sight, out of mind' mentality about costs when systems are not hosted on site.
We spoke to Archera CEO Aran Khanna to find out about the challenges businesses face with cloud costs and how they can keep them under control.
BN: Companies are pushing further into the cloud. What challenges are they facing with regard to costs?
AK: The first issue every customer faces is visibility. With many systems and developers running on a single AWS organization or common set of accounts, it is hard to pick apart which team or department owns specific resources. It's even tougher to understand whether increased cloud spend is caused by the core business doing well and gaining traffic, or by wasteful deployments and cloud sprawl.
Until now, labeling cloud resources, reporting on their usage, and executing optimizations of these resources have been heavy manual processes. This tedium fell on the backs of engineers, whose motivation for moving to the cloud -- ironically -- was to move faster and avoid the undifferentiated heavy lifting of infrastructure management.
The burden on engineers doesn't end there. Engineers also have to monitor the day-to-day usage of each system to ensure they aren’t hit with unexpected charges. In addition, they have to forecast their systems' usage across thousands of changing SKUs. Then they have to choose the most cost-effective of the 36+ commitment types. These purchase options carry varying discount rates, and there is a dizzying array of vendor offerings for every cloud resource.
Given that procurement and financial control processes for cloud resources are still nascent, managing the entire cost and commitments lifecycle is often third priority on an engineer’s checklist. Short on time, they may well opt for the easiest or most flexible option to get it done, leaving money on the table.
BN: What are the nuances of cloud costs? Why does this seem to be such a blind spot?
AK: The lack of visibility begins with the complexity of the billing structures, where customers consume programmatically before they pay. The vast array of choices in resource and payment options just makes it worse. There are trillions of combinations when you consider contract types and different cloud resources. Factor in the problem of attributing the costs and savings that stem from different contract commitments to their owners, and it goes beyond what you'd want to calculate in a spreadsheet. In massive shared environments, slight miscalculations can impact costs in a big way.
AWS and other providers have so many services, and so many ways to purchase them, that procurement and finance people simply don’t have the expertise needed to determine what engineers need. Your cost managers must really be experts in both the cloud service offerings and the engineering resources each service requires -- not just today, but also looking ahead.
You can see why it’s incredibly difficult to optimize this mix. It takes a ton of back and forth between whoever centrally manages resources and the engineering teams. That slows down corrective actions, which leaves money on the table, and many hours of engineer time are lost to cost accounting.
BN: Do you think if cloud costs continue to be a hindrance to growth that companies will start to bring more workloads back on-prem?
AK: The genie is out of the bottle, and many workloads, especially those that must dynamically scale, work best in the cloud. The world will move to more commoditized IaaS services and horizontal, cross-cloud managed services, like Snowflake versus Redshift or Datadog versus CloudWatch. Platforms like Archera help you model, manage, migrate and optimize cloud resources, so that companies get better value from running in the cloud versus on-prem.
The speed to ship new features and services that the cloud offers is too much of a competitive advantage to give up, and that holds true even if core COGS services return to on-prem. Being able to capitalize costs on margin as a business grows means most R&D will remain in the cloud. I expect that only very large businesses, with $100M+ in revenue and a core product that runs on AWS, would make the case for repatriating to on-prem.
BN: What are some ways that companies can curb their cloud costs?
AK: Most have done the best they could with a legacy approach that combines existing visibility tools, spreadsheets and scripts. There's great room for improvement, though. We see highly automated, centralized cloud resource management as the most effective way to take control of cloud costs. Either way, there are five steps to execute in sequence.
It starts with visibility and labeling each resource as belonging to a canonical team, project, product, cost-center, etc. This clarifies which pieces of your cloud environment are actually driving costs and who owns each dollar being spent. Traditionally, each engineering team did this by hand, following company practices. Legacy visibility tools like AWS Cost Explorer, CloudHealth and CloudAbility only come into play after this labeling is done manually. More modern tools like Archera handle this centrally within a separate data platform and rules engine.
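To make the idea of central, rule-based labeling concrete, here is a minimal Python sketch. It is an illustration only and does not represent how Archera or any other tool works: the rule format, the resource fields (`name`, `account`, `monthly_cost`) and the team names are all hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    field: str    # resource attribute to inspect (hypothetical: "name" or "account")
    pattern: str  # shell-style glob matched against that attribute
    team: str     # owner assigned when the pattern matches


# Hypothetical rules, evaluated in order; the first match wins.
RULES = [
    Rule("name", "checkout-*", "payments"),
    Rule("name", "ml-*", "data-science"),
    Rule("account", "sandbox", "platform"),
]


def assign_team(resource, rules=RULES, default="unallocated"):
    """Return the owning team for a resource, or `default` if no rule matches."""
    for rule in rules:
        if fnmatchcase(resource.get(rule.field, ""), rule.pattern):
            return rule.team
    return default


def cost_by_team(resources, rules=RULES):
    """Sum monthly cost per owning team."""
    totals = {}
    for resource in resources:
        team = assign_team(resource, rules)
        totals[team] = totals.get(team, 0.0) + resource["monthly_cost"]
    return totals


# Made-up inventory for illustration.
resources = [
    {"name": "checkout-api", "account": "prod", "monthly_cost": 1200.0},
    {"name": "ml-training", "account": "prod", "monthly_cost": 300.0},
    {"name": "scratch-vm", "account": "sandbox", "monthly_cost": 50.0},
    {"name": "legacy-db", "account": "prod", "monthly_cost": 75.0},
]
print(cost_by_team(resources))
```

Anything that lands in the `unallocated` bucket is spend nobody has claimed yet, which is usually the first thing worth chasing down.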
The next step is planning resource changes like moving to newer instances and terminating unused/underutilized resources. This entails forecasting overall spend, which requires domain expertise across each application. Aligning expected usage with overall business forecasts may involve collaboration among multiple teams.
With these forecasts in place, you can set budgets. Then it becomes a process of governance, checking actuals against the plan frequently enough to correct issues that arise. Continuous monitoring and alerting are key to catching divergences from the plan quickly, since manually checking dashboards every day and then involving engineers to correct problems is prone to omissions and delays. Modern tools like Archera allow this set of processes to be automated, with configurable governance and alerting rules and with action items pushed immediately to owners or first responders.
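Checking actuals against plan can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not any vendor's method: it projects month-end spend with a plain linear run rate, and the 10 percent tolerance, the owner names and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    owner: str
    budget: float
    projected: float


def project_month_end(spend_to_date, days_elapsed, days_in_month):
    """Linear run-rate projection: assumes the daily spend so far continues unchanged."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    return spend_to_date / days_elapsed * days_in_month


def check_budgets(budgets, spend_to_date, days_elapsed, days_in_month, tolerance=0.10):
    """Return an Alert for each owner projected to exceed budget by more than `tolerance`."""
    alerts = []
    for owner, budget in budgets.items():
        projected = project_month_end(
            spend_to_date.get(owner, 0.0), days_elapsed, days_in_month
        )
        if projected > budget * (1 + tolerance):
            alerts.append(Alert(owner, budget, projected))
    return alerts


# Made-up figures: day 10 of a 30-day month.
budgets = {"payments": 15000.0, "data-science": 9000.0}
spend = {"payments": 6000.0, "data-science": 2500.0}

for alert in check_budgets(budgets, spend, days_elapsed=10, days_in_month=30):
    print(f"{alert.owner}: projected {alert.projected:.0f} vs budget {alert.budget:.0f}")
```

In practice the alerts would be routed to the resource owner rather than printed, and a real forecast would account for seasonality instead of assuming a flat run rate.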
Once budgets and governance are in place, customers need to plan their commitment purchases and renewals. It's a very complex process if done correctly. Archera is designed to quickly find optimal choices from the countless potential blends of commitments.
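A toy Python example shows why the commitment decision is a trade-off even in the simplest case. It assumes a single resource type, one commitment option, and a known hourly usage forecast; the rates and usage numbers are hypothetical, and real purchasing involves many more options and constraints than this sketch covers.

```python
def total_cost(usage, commit_level, on_demand_rate, committed_rate):
    """Cost of a usage forecast when `commit_level` units are committed every hour.

    Committed units are paid for whether used or not; usage above the
    commitment is billed at the on-demand rate.
    """
    committed = commit_level * committed_rate * len(usage)
    overflow = sum(max(0, u - commit_level) for u in usage) * on_demand_rate
    return committed + overflow


def best_commit_level(usage, on_demand_rate, committed_rate):
    """Brute-force search over every whole-unit commitment level up to peak usage."""
    candidates = range(0, max(usage) + 1)
    return min(
        candidates,
        key=lambda c: total_cost(usage, c, on_demand_rate, committed_rate),
    )


# Hypothetical forecast: units needed in each of four hours, with a 40% discount.
usage = [10, 10, 4, 4]
on_demand_rate = 1.0
committed_rate = 0.6

level = best_commit_level(usage, on_demand_rate, committed_rate)
print(level, round(total_cost(usage, level, on_demand_rate, committed_rate), 2))
```

With these numbers, committing to the baseline of 4 units beats both committing to the peak of 10 and staying fully on demand, because extra committed units sit idle for half the hours. Multiply that reasoning across thousands of SKUs and dozens of commitment types and the search space grows quickly.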
Once you define your purchasing strategy, the challenge then comes in monitoring the commitments themselves and making adjustments as actual usage and costs diverge from the original plan. You might respond by reselling commitments in the marketplace, if appropriate, or exchanging commitments via the cloud providers' first-party mechanisms. Again, it's a very complex task for a specialist working from the recommendations of legacy tools. A modern tool automates the task, end to end, and we de-risk the obligations by guaranteeing to buy back any unused commitments.
Photo Credit: ImageFlow/Shutterstock