Azure costs can mount quickly without careful supervision and management. This article details cost mitigation strategies based on security and design.
The cloud offers a wide range of tools at a fraction of what they would cost if we managed them entirely ourselves. However, we may still run into scenarios where monitoring our Azure costs is key to identifying a possible attack, finding possible optimizations, or giving our organization incentives to improve how we design our applications. Before we look at techniques for auditing our use of resources, we should consider how we use Azure across our organization. Good design can save us hours of development when we begin to monitor our costs.
Design Considerations for Auditing Azure Costs
With the exception of start-up situations where we use few resources in Azure, how we design our applications, or how we use specific cloud tools alongside on-premises applications, will affect how we want to audit our costs. For example, if we maintain only one subscription, we could simply look at that subscription’s total spend for a month, as the image below shows.
But this figure may mean very little if we manage many teams that use many resources across many different environments, such as a development environment, a preproduction environment, and a production environment. Likewise, if we manage an environment shared by multiple teams, seeing only the spend for that environment may tell us less than seeing the spend by team.
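To illustrate why a single total can hide the detail we care about, here is a minimal sketch that breaks the same spend down by environment and by team. The records, field names, and amounts are hypothetical; in practice they would come from whatever cost export we use.

```python
from collections import defaultdict

# Hypothetical cost records; the field names and amounts are illustrative
# assumptions, not the format of any particular Azure export.
records = [
    {"team": "payments", "environment": "production", "cost": 420.00},
    {"team": "payments", "environment": "development", "cost": 35.50},
    {"team": "search", "environment": "production", "cost": 310.25},
    {"team": "search", "environment": "preproduction", "cost": 80.00},
    {"team": "search", "environment": "development", "cost": 54.25},
]


def total_by(records, key):
    """Sum the cost of all records grouped by the given field."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["cost"]
    return dict(totals)


subscription_total = sum(record["cost"] for record in records)
by_environment = total_by(records, "environment")
by_team = total_by(records, "team")

print(f"Total: {subscription_total:.2f}")
print("By environment:", by_environment)
print("By team:", by_team)
```

The total alone says nothing about which team or environment drives the spend; the two grouped views answer different questions, which is why the grouping we choose should follow how our organization is structured.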
Outside of the rare exceptions where we use very few cloud resources and can simply look at the total costs, we want to consider how we use Azure. The questions below make strong starting points for deciding how to audit our usage.
- Are we using Azure with on-premises resources for our applications? We have to be careful about how meaningful any Azure cost audit may be here, because issues with our on-premises resources can cause a spike in Azure use. Any auditing in Azure should be complemented with auditing of the on-premises resources.
- How are our applications managed at the staffing level? Do we have teams that create and maintain the applications that run in Azure? Do we run multiple applications, or can we divide our application into meaningful partitions for auditing spending?
- How do we manage multiple environments in Azure? I’ve seen and discussed designs where all applications and their necessary cloud infrastructure run under one grouping (subscription, resource group, etc.) for all environments, such as development, preproduction, and production, and I’ve seen designs where these environments are partitioned into different groupings. Because of possible resource limitations, the latter may be more useful for auditing Azure cost information, and it may also scale horizontally better.
- Other than management, who is the primary consumer of this information? Will we share this information with teams and use it for incentives (e.g., bonuses for improving performance)?
With these initial questions answered, we can design our infrastructure in Azure (or update existing infrastructure) to provide the information we need. The two big areas of concern with costs are security and organization, whether team organization or environment organization.
Security and Azure Costs
While the cloud may provide some security features, there are risks associated with using any cloud provider. One risk is the compromise of an administrator account, or of another account with enough permissions, which an attacker then uses to scale resources above what’s needed. Imagine a SQL database being scaled to the highest level when the basic level is all we need. Such a compromise directly raises a person’s or organization’s costs. Unlike a denial-of-service attack, where a site goes offline for customers, this attack keeps the resources online but overuses them, costing the company or individual money.
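One way to catch this kind of over-scaling is to compare what is running against what we expect to need. The sketch below assumes we maintain our own record of the highest tier each resource should use; the tier names, resource names, and observed values are all hypothetical.

```python
# Hypothetical tier ordering, lowest to highest; real service tiers differ.
TIER_ORDER = ["basic", "standard", "premium"]

# Highest tier each resource is expected to need (a record we maintain ourselves).
expected_max_tier = {
    "orders-db": "basic",
    "reporting-db": "standard",
}

# Tiers observed at audit time (hypothetical values).
observed_tier = {
    "orders-db": "premium",
    "reporting-db": "standard",
    "unknown-db": "basic",
}


def find_overscaled(expected, observed):
    """Return resources running above their expected tier, or with no expectation on record."""
    findings = []
    for name, tier in sorted(observed.items()):
        if name not in expected:
            findings.append((name, tier, "no expected tier on record"))
        elif TIER_ORDER.index(tier) > TIER_ORDER.index(expected[name]):
            findings.append((name, tier, f"expected at most {expected[name]}"))
    return findings


findings = find_overscaled(expected_max_tier, observed_tier)
for name, tier, reason in findings:
    print(f"ALERT {name}: running at {tier} ({reason})")
```

Flagging resources that have no expectation on record is a deliberate choice here: a resource nobody planned for is at least as suspicious as one that has been scaled up.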
With the rise of SIM swaps and other two-factor authentication attacks (hackers using standard two-factor authentication against users), we should be careful about how we allow ourselves to be validated. In general, these are some best practices I recommend for individuals to avoid these situations:
- Use extreme caution when using any mobile phone for two-factor authentication. The Department of Homeland Security produced a report called DHS Study on Mobile Device Security, which I highly recommend reading.
- In general, email is superior for two-factor authentication, provided that your mobile phone is not tied directly to your email and your email address doesn’t follow a standard naming convention.
Unfortunately, organizations tend to violate these security practices, for example by using email naming conventions that make spear-phishing attacks easy. If login credentials are compromised and changed, even the best audit for Azure costs may not be effective, because an attacker can reset or disable it. This is an important point worth considering for design: we want multiple receivers of our audit information, and any silence on spending could be a sign of an attack. Still, in the case of security, prevention is much cheaper than cure.
Attackers can compromise Azure in other ways, such as by introducing malicious code, but in this article we limit the discussion to attacks that cause overspending on resources. One compromised login may be enough to reset any Azure cost auditing we have, but if the audit is disabled (or stops following the pattern we expect), we may detect the problem quickly. Likewise, if our users hold only the permissions they need, an attacker who compromises one of them may be unable to scale resources at all, or unable to scale them without triggering alerts from our monitoring of those users.
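The idea that silence itself is a signal can be sketched as a simple overdue check that each receiver of the audit runs independently. The function name, the one-hour grace period, and the timestamps below are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def is_audit_silent(last_report_at, now, expected_interval, grace=timedelta(hours=1)):
    """Return True when the latest cost report is overdue by more than the grace period."""
    return now - last_report_at > expected_interval + grace


# Hypothetical timestamps: a daily report that last arrived two days ago.
now = datetime(2024, 1, 10, 12, 0)
last_report_at = datetime(2024, 1, 8, 9, 0)

if is_audit_silent(last_report_at, now, expected_interval=timedelta(days=1)):
    print("ALERT: no cost report received on schedule; verify the audit is still running")
```

Because the check runs on the receiving side, an attacker who disables the audit inside Azure cannot also suppress the alert unless they compromise every receiver as well.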
Organization and Azure Costs
How our organization uses Azure directly affects how we want to audit our Azure costs. With the organization component, we want to look at how Azure is used across teams, how we’ve partitioned our environments, and how we want our environments to be grouped. Consider the two images below, which contrast two ways of horizontally partitioning environments: one demarcates environments at the subscription level and the other at the resource group level. For tracking our costs, we want to think about designs that allow us to give our organizational structure the most meaningful information possible.
The team part of the organization is the easiest, because most companies with multiple teams have either shared development, where teams work on any resource, or partitioned development, where a specific team develops specific resources. The information we get from auditing can be shared at the team level or the management level, and we may face more challenges in the shared model, where improvements must get consensus across multiple developers. For the actual auditing, we may require that individual teams audit their own resources and organize them for that purpose (such as by using tags in Azure).
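If teams organize their resources with tags, the audit is only as good as the tagging. A minimal sketch of a tag completeness check follows; the required tag names and the resource list are hypothetical and would be replaced by our own convention and inventory.

```python
# Tags we assume every resource must carry for cost reporting (our own convention).
REQUIRED_TAGS = ("team", "environment")

# Hypothetical resource inventory with the tags currently applied.
resources = [
    {"name": "orders-db", "tags": {"team": "payments", "environment": "production"}},
    {"name": "search-api", "tags": {"team": "search"}},
    {"name": "scratch-vm", "tags": {}},
]


def missing_tags(resource, required=REQUIRED_TAGS):
    """List the required tags that are absent or empty on a resource."""
    return [tag for tag in required if not resource["tags"].get(tag)]


# Map each incompletely tagged resource to the tags it still needs.
untagged = {r["name"]: missing_tags(r) for r in resources if missing_tags(r)}

for name, tags in untagged.items():
    print(f"{name} is missing tags: {', '.join(tags)}")
```

Any spend on resources that appear in this report cannot be attributed to a team or environment, so it is worth resolving these before relying on per-team figures.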
When we consider environment partitions and auditing Azure costs, we have to think about the purpose of our environments. The development, preproduction, and production approach is very common (some companies have three to nine different environments). A growing alternative is to use separate environments for development, functional testing only, security testing only, performance testing only, beta customers, and production. This differs because we may not care as much about scale for functional or security testing, whereas we may scale our performance environment higher before a test and lower it again afterward. In a similar manner, we would probably expect our costs for beta customers to be much lower than for our main group of customers, provided the beta customers are a small fraction of all customers.
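The expectation that beta costs stay proportionate to the beta customer base can be checked by comparing each environment's share of cost with its share of customers. The figures and the tolerance factor in this sketch are hypothetical.

```python
# Hypothetical monthly cost and customer counts per customer-facing environment.
monthly_cost = {"beta": 300.0, "production": 1700.0}
customers = {"beta": 50, "production": 950}


def share(values):
    """Convert absolute values into fractions of their total."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


def disproportionate(cost, customer_counts, tolerance=2.0):
    """Return environments whose cost share exceeds tolerance times their customer share."""
    cost_share = share(cost)
    customer_share = share(customer_counts)
    return [env for env in cost if cost_share[env] > tolerance * customer_share[env]]


flagged = disproportionate(monthly_cost, customers)
print("Environments costing more than expected for their customer base:", flagged)
```

With these numbers, beta serves 5% of customers but accounts for 15% of cost, so it is flagged. A fixed overhead per environment can legitimately push a small environment's share up, which is why the tolerance is a parameter rather than a strict one-to-one comparison.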
How we group environments matters because we may want them in their own subscriptions, in their own resource groups, or partitioned in another organized manner, depending on what resources we’re using. Consider the example image below, where three Azure SQL databases for different environments share the same server. If we think about this in the context of monitoring Azure costs, is this design the most appropriate? What if we need to add new databases in the future? The answer depends on our situation: we may only need one Azure SQL database, or these databases may be part of a larger group of resources we need to use.
To save ourselves the most time when automating the auditing of Azure costs, we want to review our design first, from how we secure access to Azure resources to how our organization uses Azure. Automation and monitoring become much easier when we use Azure in a consistent way and when attackers who compromise a login cannot easily disable the audit.