Metrics for calculating the cost of processing

created: Saturday, Aug 19, 2023

Every cloud platform must have some metric of cost associated with its services. For most public cloud offerings, the target metrics for cost are usually CPU core count and amount of memory. These get multiplied by the amount of time used, and we arrive at usage-based pricing.
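To make the calculation concrete, here is a minimal sketch of that multiplication. The function name and the rates are made up for illustration and do not reflect any real provider's price list.

```python
def usage_cost(vcpus, memory_gib, hours, price_per_vcpu_hour, price_per_gib_hour):
    """Usage-based cost: (CPU rate + memory rate) multiplied by time used."""
    hourly_rate = vcpus * price_per_vcpu_hour + memory_gib * price_per_gib_hour
    return hourly_rate * hours


# Hypothetical example: 4 vCPUs and 16 GiB of memory running for 720 hours.
# Both rates are assumptions, chosen only to show the arithmetic.
cost = usage_cost(
    vcpus=4,
    memory_gib=16,
    hours=720,
    price_per_vcpu_hour=0.04,
    price_per_gib_hour=0.005,
)
print(f"{cost:.2f}")  # prints 172.80
```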

This model has its pros and cons, so let’s go through them.

The Pros

Since all cost is tied to a usage pattern, the goal is to reduce cost by reducing usage.

The overall price can be roughly estimated by adding up the required CPU and memory over time. This makes pricing somewhat predictable (more on that under the Cons).

The Cons

Companies that aim for cost predictability will choose the solution that provides it. So if I want to be sure about cost, I will choose the least flexible path and provision for peak load. This sounds counterintuitive at first (money-wise), but it happens often in large organizations, since most prefer predictability over uncertain cost savings.

Everyone has some idea of what a CPU/vCPU is, but it is actually not a good metric for comparing providers. I always have to do my own benchmarking to see how instance types, CPU architecture, and vertical scaling affect my performance.

The variable CPU processing power also makes it very difficult to compare the actual impact, as in environmental impact, of my project. Some providers have already stepped up and made CO2 (or CO2-equivalent) metrics available, but even these get complex quickly, because it is hard to understand what is included and what is excluded.

Conclusion

So usage-based pricing is a big step forward compared to paying some monthly amount without anyone caring what it entails. It also tries to give an incentive to reduce the overall footprint, but it offers only limited means of comparing services between providers.

So maybe it is time to take usage-based pricing a step further and change the metric here.

A first attempt

So if we start with CPU and memory consumption over time, we can capture the amount of resources utilised at the provider level. One metric we are missing here, or that simply gets folded into the price, is power (as in watts consumed).

So instead of having a cost calculated from processing and an environmental footprint calculated through some CO2 metric, how about we merge the two and measure processing in watts? This way we can provide one metric for both cost and impact.

That way, a company that wanted to improve its environmental footprint would automatically improve its cost and vice versa. Both goals would be visible and actionable.

How should it work

That leads to the question of how it should work, because in theory it all sounds simple. In reality there is a big problem: our current hardware does not provide this kind of metric at the resolution we need.

So for now we have to estimate and work around our current hardware limitations.

What we have implemented for now is a power meter on each machine, so we know the total amount of energy consumed (at one-minute resolution). This metric gets overlaid onto the per-user workload on those machines. What this gives us is an estimate of how much energy was consumed by which user. We also want to bring this metric into the UI as soon as possible.
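One way such an overlay could work is sketched below. This is only an illustration under assumptions: it splits each minute's meter reading between users in proportion to their CPU seconds in that minute, and it books energy from minutes without any user workload under a separate "unattributed" key. The function name, the input shapes, and the proportional split are all hypothetical, not a description of the actual implementation.

```python
from collections import defaultdict


def attribute_energy(meter_wh_per_minute, cpu_seconds_per_minute):
    """Estimate energy per user for one machine.

    meter_wh_per_minute: energy readings in Wh, one per minute (assumed format).
    cpu_seconds_per_minute: one dict per minute mapping user -> CPU seconds
        used in that minute (assumed format).
    """
    per_user = defaultdict(float)
    for energy_wh, usage in zip(meter_wh_per_minute, cpu_seconds_per_minute):
        total_cpu = sum(usage.values())
        if total_cpu == 0:
            # Assumption: idle energy is kept separate instead of being
            # spread across users.
            per_user["unattributed"] += energy_wh
            continue
        for user, cpu_seconds in usage.items():
            # Proportional split of this minute's energy by CPU time.
            per_user[user] += energy_wh * cpu_seconds / total_cpu
    return dict(per_user)


# Made-up readings for three minutes on one machine.
meter = [2.0, 3.0, 1.0]
workload = [
    {"alice": 30, "bob": 10},
    {"alice": 15, "bob": 45},
    {},
]
print(attribute_energy(meter, workload))
# prints {'alice': 2.25, 'bob': 2.75, 'unattributed': 1.0}
```

The per-user values always add up to the meter total, so the estimate stays consistent with what was actually measured. How to treat idle power and shared overhead is a policy decision that this sketch leaves open.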

Epilogue

So this is our first attempt at this topic. We will provide another update if we see movement in this direction or if we run into problems that make this a bad idea.

For now, this looks like the way to go.