The Seismic Shift in IT Infrastructure Finances: Part Deux
Welcome back! In my prior post, I talked about how the move to cloud infrastructure has created financial challenges through usage-based pricing and increased bill complexity. Both are key factors behind what I believe is one of the fundamental mistakes in managing an organization’s cloud infrastructure: something called the Tragedy of the Commons.
Tragedy of the Commons
When a resource is shared, people feel less inclined to take care of it. This phenomenon is known as the Tragedy of the Commons. Think of a dog park. It’s a shared resource that folks use with their pups. If you were to compare it to someone’s private backyard, which do you think would be cleaner? Part of the difference would certainly be due to the larger volume of traffic in the dog park, but part of it comes down to how responsible each person feels for the park. If that were not the case, there would be little need for large signs threatening hefty fines for owners who don’t pick up after their dogs. How does this relate to the cloud? Cloud infrastructure is the digital version of our dog park.
How many times have you heard about people leaving instances provisioned or not cleaning up old log files that are no longer needed? It happens constantly. The problem is that retroactively cleaning up those resources is perceived as a long-term solution, when in reality it is a point-in-time cost optimization that doesn’t address the root cause. Addressing the root cause requires two key ingredients: transparency and accountability.
Transparency is a crucial ingredient for large organizations in the cloud. Costs need to be transparent down to at least the application level. There are many technical strategies for this, such as resource tagging or account-level cost breakdowns, which I won’t dive too deep into. If costs are not easy to see, then one shouldn’t be surprised when people keep forgetting to shut off that mega server when they are not using it.
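To make the tagging idea a little more concrete, here is a minimal sketch in Python of rolling costs up to the application level. Everything in it is an assumption for illustration: the line items, the field names, and the "application" tag key are made up and do not reflect any particular provider’s billing export. The point is simply that anything without a tag lands in its own bucket, where it is hard to ignore.

```python
from collections import defaultdict

# Hypothetical billing line items. The field names and the "application"
# tag key are illustrative assumptions, not a real provider's export format.
line_items = [
    {"resource": "vm-001", "cost": 412.50, "tags": {"application": "checkout"}},
    {"resource": "vm-002", "cost": 1280.00, "tags": {}},
    {"resource": "bucket-logs", "cost": 95.25, "tags": {"application": "checkout"}},
    {"resource": "db-001", "cost": 640.00, "tags": {"application": "search"}},
]


def cost_by_application(items, tag_key="application"):
    """Sum cost per application tag; resources without the tag go to UNTAGGED."""
    totals = defaultdict(float)
    for item in items:
        app = item["tags"].get(tag_key, "UNTAGGED")
        totals[app] += item["cost"]
    return dict(totals)


totals = cost_by_application(line_items)

# Print the largest buckets first so the biggest spend is the most visible.
for app, cost in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{app:<10} {cost:>10.2f}")
```

In this made-up data, the untagged server turns out to be the single largest bucket, which is exactly the kind of thing a breakdown like this is meant to surface.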
On its own, though, transparency isn’t quite enough. The other key ingredient is accountability. Each provisioned resource needs a single accountable party, for example the product owner for a particular workload. The purpose of accountability isn’t to create some draconian set of rules, but rather to set the expectation that, with all the great things the cloud has to offer, one still has to act responsibly and be a good steward. We also have to recognize that we’re all human and that mistakes will happen. It’s our job to learn from them and, if we can’t eliminate them entirely, at least minimize their impact.
As I mentioned, these posts aren’t really focused on the more technical aspects, such as account structure or available SaaS tools. They are meant to be a look at the human side of why we’re seeing such an increased focus on managing cloud infrastructure spending effectively. Please drop a note if you’re interested in hearing more or if you have additional perspectives on this problem.