In this article, I am going to explain the AWS Well-Architected Framework, which helps AWS customers follow best practices when designing the architecture of their solutions. It enables users to design secure, reliable and high-performing cloud applications and workloads. The framework is largely conceptual: it is a set of principles to keep in mind whenever you think about the architecture of a system. It is built on five pillars that enable customers to evaluate their existing architectures and implement scalable solutions. In this article, we will learn more about those five pillars and the best practices around them. The discussion below is a summarized form of the official whitepaper: AWS Well-Architected Framework.
Five pillars of the AWS Well-Architected Framework
As mentioned above, the Well-Architected Framework is based on five pillars.
- Operational Excellence
- Security
- Reliability
- Performance Efficiency
- Cost Optimization
You can find more information about the individual pillars at https://aws.amazon.com/architecture/well-architected.
Figure 1 – AWS Well-Architected Framework (Source)
Let us now discuss all these pillars in detail.
Operational Excellence is one of the important pillars of the AWS Well-Architected Framework. It is about running workloads, monitoring them and responding efficiently to the events they generate.
- Operations as Code – Automate the creation of infrastructure using tools like CloudFormation
- Automated Documentation from Annotations – Document how the different components of the system interact with each other, and generate that documentation automatically so that it is updated whenever the system changes. This helps prevent integrations from breaking when something changes
- Make frequent and reversible changes – Make small, reversible changes to the production environment rather than large ones. This makes it quick to roll back to a previous version if a change causes issues
- Anticipate Failure – Design your system to anticipate and tolerate failures, and test those failure scenarios to make the system more robust
- Learn from Operational Failures – Whenever there is a failure, record the root cause and apply the lessons learned
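To make the "Operations as Code" principle concrete, here is a minimal sketch in Python that builds a CloudFormation template as a plain dictionary and prints it as JSON. The function name `build_template` and the logical ID `ArticleExampleBucket` are hypothetical names chosen for illustration; the whitepaper does not prescribe this layout.

```python
import json


def build_template(bucket_logical_id: str = "ArticleExampleBucket") -> dict:
    """Return a minimal CloudFormation template describing one versioned S3 bucket.

    The logical ID is a hypothetical name used only for this illustration.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative template: one versioned S3 bucket",
        "Resources": {
            bucket_logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Versioning keeps older object versions, which makes changes reversible.
                    "VersioningConfiguration": {"Status": "Enabled"},
                },
            }
        },
    }


if __name__ == "__main__":
    # The printed JSON can be saved to a file and kept under version control.
    print(json.dumps(build_template(), indent=2))
```

Because the template is ordinary data, it can be reviewed and versioned like any other code, which also supports the idea of small, reversible changes.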
To implement Operational Excellence, use services like CloudWatch, CloudTrail, X-Ray and VPC Flow Logs to understand the health of your workload and operations and to manage workload and operational events.
Figure 2 – AWS CloudWatch (Source)
The Security pillar focuses on protecting your data and systems from unauthorized access and threats by conducting continuous risk assessments and devising strategies to mitigate the risks.
- Strong Identity Foundation – Follow key principles like granting least privilege, separation of duties, appropriate authorization level, etc.
- Enable Traceability – Audit every change or action in every environment, including who made it. This maintains transparency within the organization. Monitor logs and take action when an anomaly is detected
- Security at all Layers – Apply security at multiple layers, like VPC, Load Balancers, Security Groups, EC2 instances, etc.
- Automate Security Best Practices – Implement security as code and version control all security measures for future use
- Protect Data in Transit and at Rest – Data should be protected using encryption, tokenization and access control mechanisms
- Keep people away from data – As far as possible, reduce the need for direct human access to data by implementing proper policies and access controls
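The least-privilege principle can be illustrated with a small IAM policy document. The sketch below builds one in Python; the helper `read_only_bucket_policy` and the bucket name used in the usage note are hypothetical, and the policy is only an example of narrowing access, not a recommended policy for any real workload.

```python
import json


def read_only_bucket_policy(bucket_name: str, prefix: str = "") -> dict:
    """Build an IAM policy document that only allows reading objects
    under one prefix of one S3 bucket (hypothetical helper for illustration)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadObjectsOnly",
                "Effect": "Allow",
                # A single read action instead of a wildcard such as "s3:*".
                "Action": ["s3:GetObject"],
                # Scoped to one bucket and prefix instead of "*".
                "Resource": f"arn:aws:s3:::{bucket_name}/{prefix}*",
            }
        ],
    }


if __name__ == "__main__":
    # "example-reports-bucket" is a made-up bucket name.
    print(json.dumps(read_only_bucket_policy("example-reports-bucket", "2024/"), indent=2))
```

Keeping such policy documents in version control is one way to treat security as code, as described in the list above.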
Use services like Identity and Access Management (IAM), Multi-Factor Authentication (MFA) and Organizations to secure your account. Enable GuardDuty and CloudTrail to monitor for unwanted access and take appropriate action. Use VPC, Shield and WAF to define rules on who is authorized to access the applications and how. Use data encryption to secure data, and Macie to identify unsecured data stored in S3 buckets.
The Reliability pillar is about minimizing failures and, when a failure does occur, recovering from it quickly and efficiently. It is also important that your applications scale dynamically based on the workload rather than relying on static inputs for scaling, which can lead to underprovisioning or overprovisioning of resources.
- Test Recovery Procedures – Inject or simulate failures in your system and test how it recovers from them
- Automatically Recover from Failure – Ensure that recoveries from failures are always automated, monitor metrics on CloudWatch, and take proper actions whenever any thresholds are reached. Automated notifications to humans should also be set up as a best practice
- Scale horizontally – Avoid monolithic architectures and use multiple smaller resources that are isolated from one another, so that a single failure has less impact on the overall system
- Stop guessing capacity – Since the cloud allows dynamic capacity management, you should never guess your capacity beforehand. Let the system automatically scale up and down based on the demand
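As a rough illustration of "Stop guessing capacity", the sketch below computes a new instance count from a measured metric and a target value. This is a simplified proportional rule written for this article, not the algorithm AWS Auto Scaling uses; the function name `desired_capacity` and its parameters are hypothetical.

```python
import math


def desired_capacity(current: int, metric_value: float, target_value: float,
                     min_size: int = 1, max_size: int = 10) -> int:
    """Scale the instance count in proportion to how far the metric is from its target.

    Simplified illustration only; real scaling policies also apply cooldowns,
    warm-up periods and other safeguards.
    """
    if current < 1 or target_value <= 0:
        raise ValueError("current must be >= 1 and target_value must be > 0")
    # If the metric is 1.5x the target, propose 1.5x the capacity (rounded up).
    proposed = math.ceil(current * metric_value / target_value)
    # Never go outside the configured bounds.
    return max(min_size, min(max_size, proposed))


if __name__ == "__main__":
    print(desired_capacity(4, metric_value=90.0, target_value=60.0))  # scale out
    print(desired_capacity(4, metric_value=30.0, target_value=60.0))  # scale in
```

The point of the sketch is that capacity follows measured demand within explicit bounds instead of being fixed from a forecast.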
Put foundations in place, such as service quotas, to prevent accidental overprovisioning of resources. Detect and respond to failures, and prevent the same failures from recurring. Back up data and environment configurations to improve recovery time, and make sure a proper disaster recovery plan exists and is ready to be implemented whenever necessary. Use the Personal Health Dashboard to understand the health of your resources.
Figure 3 – AWS Personal Health Dashboard (Source)
Maintaining performance efficiency has two parts: first, choosing the right resources and services, and second, continuously evolving your resources as technology changes.
- Consume advanced technologies as a service – Use managed services where possible, as they reduce the effort of provisioning, configuring, scaling, backing up, etc.
- Go global in minutes – Since AWS is deployed globally across multiple regions, you can deploy your application to several regions to lower its latency for users
- Use serverless architectures – Serverless architectures let you run your code directly without managing servers. For example, use S3 to host a static website instead of running it on an EC2 instance
- Experiment more often – Testing your solution with different configurations and measuring several metrics helps you identify performance bottlenecks and take appropriate action
Deploy services to multiple regions and use serverless functions like AWS Lambda instead of running applications on an EC2 instance.
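To show what running code without managing servers looks like, here is a minimal sketch of a Python function in the `(event, context)` shape that AWS Lambda uses for Python handlers. The event layout assumed here (a `queryStringParameters` key, as sent by an API Gateway proxy integration) and the greeting logic are assumptions made for illustration.

```python
import json


def lambda_handler(event, context):
    """Handle one request and return an HTTP-style response.

    Assumes the event may carry "queryStringParameters", as an
    API Gateway proxy integration would send; other triggers differ.
    """
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local invocation with a hand-written event; no AWS account is needed.
    print(lambda_handler({"queryStringParameters": {"name": "AWS"}}, None))
```

The function contains only business logic; provisioning, patching and scaling of the underlying compute are left to the platform.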
The Cost Optimization pillar is another important part of the AWS Well-Architected Framework. It helps AWS customers deliver business value at the lowest possible cost.
- Adopt a consumption model – Pay only for the resources you actually use, and scale them up or down according to demand. There is no need to pay fixed charges based on forecasted demand
- Measure overall efficiency – Keep measuring your costs over a time period to understand and keep track of the trends. Optimize wherever and whenever possible
- Avoid spending money on operations – Leave data center operations to AWS and focus on your customers and business logic
- Analyze and attribute expenditure – Attribute resources to their owners, and analyze and monitor the expenditure of each individual department or team. Implement resource tagging and resource groups to analyze expenditure more efficiently
- Use managed and app-level services to reduce TCO – Using managed services helps to reduce the overall cost of maintaining them
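The idea of attributing expenditure through tags can be sketched in a few lines of Python. The record layout (`cost_usd`, `tags`), the `team` tag key and the sample figures below are all made up for illustration; real billing exports have a different and richer format.

```python
from collections import defaultdict


def cost_by_tag(records, tag_key, untagged_label="untagged"):
    """Sum costs per value of one tag; resources without the tag are grouped together."""
    totals = defaultdict(float)
    for record in records:
        owner = record.get("tags", {}).get(tag_key, untagged_label)
        totals[owner] += record["cost_usd"]
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical cost records, not real billing data.
    sample = [
        {"resource": "db-1", "cost_usd": 12.5, "tags": {"team": "analytics"}},
        {"resource": "vm-7", "cost_usd": 7.5, "tags": {"team": "analytics"}},
        {"resource": "vm-9", "cost_usd": 3.0, "tags": {"team": "web"}},
        {"resource": "bucket-2", "cost_usd": 1.25, "tags": {}},
    ]
    print(cost_by_tag(sample, "team"))
```

Grouping untagged resources under their own label makes gaps in the tagging policy visible, which is often the first thing such a report reveals.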
Use the Cost Explorer to monitor current and forecast future costs. Visualize costs by resources and cut down costs by removing unused resources. Use Budgets to get alerts based on your budgeted values.
Figure 4 – AWS Cost Explorer (Source)
In this article, we have looked in detail at the AWS Well-Architected Framework and why it matters. The framework combines five areas that allow AWS customers and users to build robust, scalable architectures and solutions and to evaluate them accordingly. If you are an AWS Solutions Architect, you should always keep these five pillars in mind. Although this article focuses on design principles in AWS, the same concepts apply to application architectures on any other cloud. You can read more about this in the official documentation.