An overview of AWS Well-Architected Principles

In this article, I am going to explain the AWS Well-Architected Framework, which helps AWS customers follow best practices when designing the architecture of their solutions. It helps users design secure, reliable and high-performing cloud applications and workloads. The framework is largely conceptual guidance that is worth keeping in mind whenever you think about the architecture of a system. It rests on five pillars that let customers evaluate their existing architectures and implement scalable solutions. In this article, we will learn more about those five pillars and the best practices around them. The discussion below summarizes the official whitepaper, AWS Well-Architected Framework.

Five pillars of the AWS Well-Architected Framework

As mentioned above, the Well-Architected Framework is based on five pillars:

Operational Excellence
Security
Reliability
Performance Efficiency
Cost Optimization

You can find more information about the individual pillars at https://aws.amazon.com/architecture/well-architected.

Figure 1 – AWS Well-Architected Framework (Source)

Let us now discuss each of these pillars in detail.

Operational Excellence

Operational Excellence is one of the pillars of the AWS Well-Architected Framework. It covers running workloads, monitoring them, and responding efficiently to the events they generate.

Design Principles

Operations as Code – Automate the creation of infrastructure using tools like CloudFormation
Automated Documentation from Annotations – Document how the different components of the system interact with each other, and have the documentation update automatically whenever the system changes. This helps prevent integrations from breaking when changes are made
Make frequent and reversible changes – Make small, reversible changes to the production environment rather than large ones. This makes it quick to roll back to a previous version if a change causes issues
Anticipate Failure – Design your system to expect failures, and test your failure scenarios to make the system more robust
Learn from Operational Failures – Whenever there is a failure, record the root cause and apply the lessons learned
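
To make the Operations as Code principle more concrete, here is a minimal sketch that generates a CloudFormation template from Python. It assumes a workload that needs only a single versioned S3 bucket. The function name build_bucket_template and the logical ID ExampleBucket are hypothetical names chosen for illustration.

```python
import json


def build_bucket_template(bucket_logical_id: str = "ExampleBucket") -> dict:
    """Return a minimal CloudFormation template describing one versioned S3 bucket.

    The logical ID is a hypothetical name used only for this illustration.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative template: a single versioned S3 bucket",
        "Resources": {
            bucket_logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Versioning makes object changes reversible.
                    "VersioningConfiguration": {"Status": "Enabled"},
                },
            }
        },
    }


if __name__ == "__main__":
    # Print the template as JSON so it can be saved and kept under version control.
    print(json.dumps(build_bucket_template(), indent=2))
```

Keeping a template like this in version control means that every infrastructure change is reviewed and can be rolled back, which also supports the principle of making frequent and reversible changes.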

To implement Operational Excellence, use services like CloudWatch, CloudTrail, X-Ray and VPC Flow Logs to understand the health of your workload and operations, and to manage workload and operational events.

Figure 2 – AWS CloudWatch (Source)

Security

The Security pillar covers protecting your data and systems from unauthorized access and threats by conducting continuous risk assessments and devising strategies to mitigate the risks.

Design Principles

Strong Identity Foundation – Follow key principles such as least privilege, separation of duties and appropriate authorization levels
Enable Traceability – Audit every change or action in every environment, including who made it. This maintains transparency within the organization. Monitor logs and take action when an anomaly is detected
Security at all Layers – Apply security at multiple layers, like VPC, Load Balancers, Security Groups, EC2 instances, etc.
Automate Security Best Practices – Implement security as code and keep all security measures under version control for future use
Protect Data in Transit and at Rest – Data should be protected using encryption, authorization tokens and access control mechanisms
Keep people away from data – As far as possible, limit direct human handling of data by implementing proper policies and access control
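
As an illustration of least privilege, the sketch below builds an IAM policy document that allows read-only access to a single S3 bucket and nothing else. The function name read_only_bucket_policy, the statement IDs and the bucket name in the usage example are hypothetical. The policy structure and the s3:ListBucket and s3:GetObject actions follow the standard IAM policy format.

```python
import json


def read_only_bucket_policy(bucket_name: str) -> dict:
    """Return an IAM policy document granting read-only access to one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "example-reports-bucket" is a made-up bucket name.
    print(json.dumps(read_only_bucket_policy("example-reports-bucket"), indent=2))
```

The policy grants no write or delete actions and names one specific bucket instead of using a wildcard, so a compromised credential can do only limited damage.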

Leverage services like Identity and Access Management (IAM), Multi-Factor Authentication (MFA) and Organizations to secure your account. Enable GuardDuty and CloudTrail to monitor for unwanted access and take appropriate action. Use VPC, Shield and WAF to control and protect network access to your applications. Use encryption to secure data, and Macie to identify sensitive or unprotected data stored in S3 buckets.

Reliability

For a system to be reliable, failures should be minimized, and when they do occur, the system should recover from them quickly and efficiently. It is also important that your applications scale dynamically with the workload rather than relying on static inputs for scaling, which can lead to underprovisioning or overprovisioning of resources.

Design Principles

Test Recovery Procedures – Inject or simulate failures in your system and test how it recovers from them
Automatically Recover from Failure – Automate recovery from failures wherever possible: monitor metrics on CloudWatch and trigger the appropriate actions whenever thresholds are reached. Automated notifications to humans should also be set up as a best practice
Scale horizontally – Avoid monolithic architectures and use multiple smaller resources, isolated from one another, so that a single failure has less impact on the overall workload
Stop guessing capacity – Since the cloud allows dynamic capacity management, you do not need to guess your capacity beforehand. Let the system automatically scale up and down based on demand
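
The idea of acting automatically when a metric crosses a threshold can be sketched in a few lines of plain Python. This is only an illustration of the logic, not the CloudWatch API: the function name should_trigger_recovery, its parameters and the sample CPU values are all hypothetical. The rule assumed here is that an action fires only when the most recent datapoints all exceed the threshold, so that a single spike does not trigger a recovery.

```python
from typing import Sequence


def should_trigger_recovery(
    datapoints: Sequence[float], threshold: float, periods: int
) -> bool:
    """Return True when the last `periods` datapoints all exceed `threshold`."""
    # Not enough data yet, or a meaningless period count: do nothing.
    if periods <= 0 or len(datapoints) < periods:
        return False
    return all(value > threshold for value in datapoints[-periods:])


if __name__ == "__main__":
    # Made-up CPU utilization samples, in percent.
    cpu_samples = [40.0, 85.0, 92.0, 95.0]
    if should_trigger_recovery(cpu_samples, threshold=80.0, periods=3):
        print("Threshold breached: scale out and notify the on-call engineer")
```

Requiring several consecutive breaches is a design choice that trades a slightly slower reaction for fewer false alarms.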

Put foundations such as service quotas in place to prevent accidental overprovisioning of resources. Detect and respond to failures, and prevent the same failures from recurring. Back up data and environment configurations to improve recovery time, and make sure a proper disaster recovery plan is in place and ready to be executed whenever necessary. Use the Personal Health Dashboard to understand the health of your resources.

Figure 3 – AWS Personal Health Dashboard (Source)

Performance Efficiency

There are two parts to maintaining performance efficiency: first, choosing the correct resources and services, and second, continuously evolving your resources as technology changes.

Design Principles

Consume advanced technologies as a service – Use managed services where possible, as they reduce the effort of provisioning, configuring, scaling, backing up, etc.
Go global in minutes – Since AWS is deployed globally across multiple regions, you can deploy your application to multiple regions to lower latency for your users
Use serverless architectures – Serverless architectures let you run your code without managing servers. For example, use S3 to host a static website instead of running it on an EC2 instance
Experiment more often – Experimenting with different configurations and measuring them across several metrics helps you identify performance bottlenecks and take appropriate action

Deploy services to multiple regions and use serverless functions like AWS Lambda instead of running applications on an EC2 instance.
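
To give a feel for what a serverless function looks like, here is a minimal sketch of a Python handler in the style used by AWS Lambda. It assumes the function is invoked through an HTTP front end such as an API Gateway proxy integration, which is why it reads queryStringParameters from the event and returns a statusCode and body. The handler name lambda_handler and the greeting logic are illustrative choices.

```python
import json


def lambda_handler(event, context):
    """Return an HTTP-style JSON response greeting the caller."""
    # queryStringParameters may be missing or None when no query string is sent.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local invocation with a hand-written event; no AWS account is needed.
    print(lambda_handler({"queryStringParameters": {"name": "AWS"}}, None))
```

There is no server, operating system or scaling policy in this code. In a serverless setup those concerns are handled by the platform.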

Cost Optimization

The Cost Optimization pillar is another important part of the AWS Well-Architected Framework. It helps AWS customers deliver business value at the lowest possible cost.

Design Principles

Adopt a consumption model – Pay only for the resources you actually use, and scale them up or down with demand. There is no need to pay fixed charges based on forecasted demand
Measure overall efficiency – Measure your costs over time to keep track of the trends, and optimize wherever and whenever possible
Avoid spending money on operations – Leave data center operations to AWS and focus on your customers and business logic
Analyze and attribute expenditure – Attribute resources to their owners, and analyze and monitor the expenditure of each individual department or team. Implement resource tagging and resource groups to analyze expenditure more efficiently
Use managed and app-level services to reduce TCO – Using managed services lowers the overall cost of maintaining those services
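
The value of resource tagging for attributing expenditure can be shown with a small sketch. It assumes that cost records have already been exported into a list of dictionaries with a cost and a tags field. This record layout, the function name cost_by_tag and the sample figures are all hypothetical and do not represent the output format of any AWS service.

```python
from collections import defaultdict


def cost_by_tag(records, tag_key, untagged_label="untagged"):
    """Sum the cost of each record under the value of its `tag_key` tag."""
    totals = defaultdict(float)
    for record in records:
        # Resources without the tag are grouped so they stay visible.
        owner = record.get("tags", {}).get(tag_key, untagged_label)
        totals[owner] += record["cost"]
    return dict(totals)


if __name__ == "__main__":
    # Made-up monthly cost records for three resources.
    sample_records = [
        {"resource": "web-server", "cost": 10.0, "tags": {"team": "frontend"}},
        {"resource": "cdn", "cost": 2.5, "tags": {"team": "frontend"}},
        {"resource": "old-volume", "cost": 5.5, "tags": {}},
    ]
    print(cost_by_tag(sample_records, "team"))
```

Grouping untagged resources under their own label is deliberate: a large untagged total is usually the first sign that the tagging policy is not being followed.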

Use Cost Explorer to monitor current costs and forecast future ones. Visualize costs by resource and cut costs by removing unused resources. Use Budgets to get alerts based on your budgeted values.

Figure 4 – AWS Cost Explorer (Source)

Conclusion

In this article, we have looked at the AWS Well-Architected Framework and why it matters. The framework combines five areas that help AWS customers and users build robust, scalable architectures and solutions, and evaluate them. If you are an AWS Solutions Architect, you should always keep these five pillars in mind. Although this article focuses on design principles in AWS, the same concepts apply to any other cloud application architecture. You can read more about this in the official documentation.

Author
Aveek Das

Aveek is an experienced Data and Analytics Engineer, currently working in Dublin, Ireland. His main areas of technical interest include SQL Server, SSIS/ETL, SSAS, Python, Big Data tools like Apache Spark, Kafka, and cloud technologies such as AWS/Amazon and Azure.

He is a prolific author, with over 100 articles published on various technical blogs, including his own blog, and a frequent contributor to different technical forums.

In his leisure time, he enjoys amateur photography, mostly street imagery and still life. Some glimpses of his work can be found on Instagram. You can also find him on LinkedIn.
