Design Principles

To help organize the framework and make it more valuable, Amazon focused the framework around the following five pillars:

  1. Operational excellence

  2. Security

  3. Reliability

  4. Performance efficiency

  5. Cost optimization

Operational Excellence

The ability to run and monitor systems to deliver business value and to continually improve supporting processes and procedures.

The overall objective of this pillar is to make sure the systems you run and monitor deliver value toward the organization's business goals.

  • Perform operations in code.

  • Annotate documentation as much as possible.

  • Make frequent, small, reversible changes to the architecture in order to improve it.

  • Refine your operational procedures frequently in order to improve them.

  • Anticipate failures and have your recovery plans in place.

  • Learn from any failures that you might have in your architecture in AWS.
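As a rough illustration of "perform operations in code" and "small, reversible changes", the sketch below applies a change and rolls it back automatically when a health check fails. Every name in it (apply_reversible_change, the config dictionary, the 120-connection health rule) is hypothetical and stands in for real infrastructure tooling.

```python
from typing import Callable


def apply_reversible_change(
    apply: Callable[[], None],
    revert: Callable[[], None],
    is_healthy: Callable[[], bool],
) -> bool:
    """Apply a change, then roll it back if the health check fails.

    Returns True if the change was kept, False if it was reverted.
    """
    apply()
    if is_healthy():
        return True
    revert()
    return False


# Hypothetical example: a config value stands in for real infrastructure.
config = {"max_connections": 100}


def raise_limit() -> None:
    config["max_connections"] = 150


def restore_limit() -> None:
    config["max_connections"] = 100


def healthy() -> bool:
    # Assumed rule for illustration: anything above 120 counts as unhealthy.
    return config["max_connections"] <= 120


kept = apply_reversible_change(raise_limit, restore_limit, healthy)
print(kept, config)
```

Because the change and its rollback are both plain functions, the same procedure can be reviewed, versioned, and rehearsed like any other code.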

Security

The ability to protect information, systems, and assets while delivering business value through risk assessments and mitigation strategies.

The job of this pillar is to help protect your assets, your systems, and your information associated with AWS. This pillar should also assist you with risk assessments and your mitigation practices.

  • You should use strong identity practices in your architecture.

  • There should be full traceability in all operations.

  • Security should be implemented at every layer of your architecture.

  • There should be a concerted effort to automate as many of the security best practices as possible.

  • Information should be secured at rest as well as in transit.

  • You should prepare as much as possible for the inevitable security events in your architecture and cloud.
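One way to read "automate security best practices" is to check policies in code before they are deployed. The sketch below scans an IAM-style policy document, passed in as a Python dictionary, for Allow statements that grant every action. The function name find_wildcard_statements and the sample policy are made up for illustration, and a real review would look at far more than this one pattern.

```python
from typing import Any, Dict, List


def find_wildcard_statements(policy: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the Allow statements whose Action is (or includes) "*"."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may hold a single statement object instead of a list.
        statements = [statements]

    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "*" in actions:
            flagged.append(statement)
    return flagged


# Hypothetical policy used only to exercise the check.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "*"},
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
    ],
}

for statement in find_wildcard_statements(sample_policy):
    print("Overly broad statement:", statement)
```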

Reliability

The ability of a system to recover from infrastructure or service disruptions, dynamically acquire computing resources to meet demand, and mitigate disruptions such as misconfigurations or transient network issues.

This pillar consists of many important design principles that all center on ensuring your design can easily recover from service failures. It also ensures your architecture can add resources on demand as needed. Reliability in the cloud also means that disruptions can be mitigated with relative ease.

  • Test recovery.

  • Automate failure recovery as much as possible.

  • Automatically scale horizontally when needed.

  • Stop guessing at capacity for IT resources.

  • Manage changes through automation.
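For the transient network issues mentioned above, a common way to automate recovery is to retry with exponential backoff and random jitter. This is a minimal sketch under stated assumptions: retry_with_backoff and its parameters are hypothetical names, and ConnectionError stands in for whatever transient error a real client raises.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on ConnectionError with growing random delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    failures = 0
    while True:
        try:
            return operation()
        except ConnectionError:
            failures += 1
            if failures >= max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # The delay ceiling doubles after each failure, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (failures - 1))
            sleep(random.uniform(0, ceiling))
```

The sleep parameter is injectable so the function can be exercised without real waiting, for example by passing a list's append method to record the chosen delays.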

Performance Efficiency

The ability to use computing resources efficiently to meet system requirements, and to maintain that efficiency as demand changes and technologies evolve.

This pillar concerns itself with the use of AWS resources as efficiently as possible. The efficiency should be maintained as demand changes and technology evolves.

  • Democratize advanced technologies—meaning make them available to the masses.

  • Go global in minutes by deploying resources in multiple Regions.

  • Target serverless computing as much as possible.

  • Experiment freely and often.

  • Maintain mechanical sympathy—meaning match business goals to the appropriate technologies.

Cost Optimization

The goal of this pillar is quite simple—to save money and stop the wasting of investments in technology.

  • Adopt a consumption model; this emphasizes the OpEx approach to IT.

  • Measure the efficiency of your architecture closely.

  • Stop spending money needlessly in an attempt to solve IT problems.

  • Closely analyze the expenditures in your AWS implementation.

  • Use managed services as much as possible.
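The consumption model can be made concrete with a little arithmetic. The sketch below compares paying for peak capacity around the clock with paying only for what each hour actually uses. The demand profile and the rate are invented for illustration and are not real AWS prices.

```python
from typing import Sequence


def provisioned_cost(peak_units: int, hours: int, rate_per_unit_hour: float) -> float:
    """Cost of running peak capacity for every hour, used or not."""
    return peak_units * hours * rate_per_unit_hour


def consumption_cost(hourly_units: Sequence[int], rate_per_unit_hour: float) -> float:
    """Cost of paying only for the units used in each hour."""
    return sum(hourly_units) * rate_per_unit_hour


# Hypothetical day: 8 busy hours at 10 units, 16 quiet hours at 2 units.
demand = [10] * 8 + [2] * 16
rate = 0.5  # Assumed price per unit-hour, not a real AWS price.

fixed = provisioned_cost(max(demand), len(demand), rate)
on_demand = consumption_cost(demand, rate)
print(f"provisioned for peak: {fixed:.2f}")
print(f"pay per use:          {on_demand:.2f}")
print(f"difference:           {fixed - on_demand:.2f}")
```

The gap between the two figures is the idle capacity that the "measure efficiency" and "analyze expenditures" principles are meant to expose.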

Fault Tolerance and High Availability

High availability (HA) refers to the ability of your entire architecture to maintain an increased level of availability. You should note that fault tolerance is a subcomponent of high availability. There are two important considerations for high availability with AWS. First, HA should be achievable at a small fraction of what it costs in a traditional on-premises data center. Second, HA should be achievable with a minimum of human intervention. In fact, most consider HA to mean there is no human intervention.
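To see why redundancy raises availability, it helps to work the numbers. The sketch below uses the standard formulas for components in series (all must work) and in parallel (any one is enough). It assumes failures are independent, which real systems only approximate, and the 0.99 figure is an arbitrary example value.

```python
from math import prod
from typing import Iterable


def serial_availability(availabilities: Iterable[float]) -> float:
    """Availability when every component must be up at the same time."""
    return prod(availabilities)


def parallel_availability(availabilities: Iterable[float]) -> float:
    """Availability when a single working replica is enough."""
    return 1.0 - prod(1.0 - a for a in availabilities)


single = 0.99  # Assumed availability of one instance, for illustration only.
print(f"two in series:   {serial_availability([single, single]):.4f}")
print(f"two in parallel: {parallel_availability([single, single]):.4f}")
```

Chaining dependencies lowers the overall figure, while adding replicas behind a load balancer raises it, which is the reasoning behind spreading instances across failure domains.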

HA components - ELB, Elastic IP, Route 53, Auto Scaling, CloudWatch

Fault tolerance components - SQS, S3, SimpleDB

In the AWS Well-Architected Framework, we use these terms:

Component - the code, configuration, and AWS resources that together deliver against a requirement.

Workload - set of components that together deliver business value

Technology portfolio - collection of workloads that are required for the business to operate.

Architecture - how components work together in a workload. How components communicate and interact is often the focus of architecture diagrams.

When architecting workloads, you make trade-offs between pillars based on your business context. These business decisions can drive your engineering priorities. You might optimize to reduce cost at the expense of reliability in development environments, or, for mission-critical solutions, you might optimize reliability with increased costs. In ecommerce solutions, performance can affect revenue and customer propensity to buy. Security and operational excellence are generally not traded off against the other pillars.

General Design Principles

The Well-Architected Framework identifies a set of general design principles to facilitate good design in the cloud:

    • Stop guessing your capacity needs: When you make a capacity decision before you deploy a system, you might end up sitting on expensive idle resources or dealing with the performance implications of limited capacity. With cloud computing, these problems can go away. You can use as much or as little capacity as you need, and scale up and down automatically.

    • Test systems at production scale: In the cloud, you can create a production-scale test environment on demand, complete your testing, and then decommission the resources. Because you only pay for the test environment when it's running, you can simulate your live environment for a fraction of the cost of testing on premises.

    • Automate to make architectural experimentation easier: Automation allows you to create and replicate your systems at low cost and avoid the expense of manual effort. You can track changes to your automation, audit the impact, and revert to previous parameters when necessary.

    • Allow for evolutionary architectures: In a traditional environment, architectural decisions are often implemented as static, one-time events, with a few major versions of a system during its lifetime. As a business and its context continue to change, these initial decisions might hinder the system's ability to deliver changing business requirements. In the cloud, the capability to automate and test on demand lowers the risk of impact from design changes. This allows systems to evolve over time so that businesses can take advantage of innovations as a standard practice.

    • Drive architectures using data: In the cloud you can collect data on how your architectural choices affect the behavior of your workload. This lets you make fact-based decisions on how to improve your workload. Your cloud infrastructure is code, so you can use that data to inform your architecture choices and improvements over time.

    • Improve through game days: Test how your architecture and processes perform by regularly scheduling game days to simulate events in production. This will help you understand where improvements can be made and can help develop organizational experience in dealing with events.
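The first principle in this list, scaling up and down automatically instead of guessing capacity, can be sketched as a simple proportional rule: resize the fleet so the observed metric moves toward a target. This is an illustrative rule only, not the algorithm AWS Auto Scaling uses, and the function name, parameters, and limits are all assumptions.

```python
import math


def desired_capacity(
    current_instances: int,
    current_metric: float,
    target_metric: float,
    min_instances: int = 1,
    max_instances: int = 10,
) -> int:
    """Return an instance count that would bring the metric near its target."""
    if current_instances < 1:
        raise ValueError("current_instances must be at least 1")
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    # Scale the fleet in proportion to how far the metric is from the target,
    # rounding up so the fleet is never undersized.
    raw = current_instances * current_metric / target_metric
    return max(min_instances, min(max_instances, math.ceil(raw)))


# Hypothetical readings: average CPU utilization in percent, target of 60.
print(desired_capacity(4, 90, 60))   # load above target, so scale out
print(desired_capacity(4, 15, 60))   # load well below target, so scale in
print(desired_capacity(8, 120, 60))  # request exceeds max_instances, so clamp
```

Feeding the rule with measured data rather than forecasts also reflects the "drive architectures using data" principle.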

DevOps Tools -

https://www.youtube.com/watch?v=esEFaY0FDKc