1. Principles
1.1. Risk Management
1.1.1. Availability
1.1.1.1. Resiliency
1.1.1.1.1. Fault tolerance
1.1.1.1.2. Recovery
1.1.1.2. Reliability
1.1.1.2.1. Consistency
1.1.1.2.2. Stability
1.1.2. Security
1.1.2.1. Vulnerability response
1.1.2.2. Vulnerability mitigation
1.1.2.2.1. Static Analysis
1.1.3. Opportunity cost of diminishing risk
1.1.3.1. Capacity Planning
1.1.3.1.1. Infrastructure provisioning
1.1.3.1.2. Capacity cost/benefit
1.1.4. Failure Scenarios Testing with Gamedays
1.2. Simplicity
1.2.1. Self Service
1.2.2. Patterns
1.2.3. Reference Implementations
1.3. Service Level Objectives
1.3.1. Performance Measurement
1.3.1.1. Service Level Indicators
1.3.1.2. KPIs
1.3.2. Performance Monitoring
1.3.2.1. Error Budget
1.3.2.1.1. Velocity
1.3.2.1.2. Response strategy
1.4. Release Engineering
1.4.1. Branching Strategy
1.4.1.1. GitHub Flow
1.4.1.2. Release Branching
1.4.2. Release Strategy
1.4.2.1. Deployment Pipelines
1.4.2.2. Feature Flags
1.4.2.3. Change Markers
1.4.2.4. Release Patterns
1.4.2.5. Communicating Changes
1.4.3. Testing
1.4.3.1. Automated
1.4.3.2. Integration
1.4.3.3. Load
1.4.3.4. Environments
1.4.3.5. Production
1.4.4. Metrics
1.4.4.1. MTTR
1.4.4.2. Change Frequency
1.4.4.3. Lead time for changes
1.4.4.4. Change failure rate
1.5. Toil Reduction
1.5.1. Manual
1.5.2. Repetitive
1.5.3. Automatable
1.5.4. Tactical
1.5.5. Operational Work Linearly Scaling with Service Growth
1.5.6. Self Service
1.5.6.1. Patterns
1.5.6.2. Reference Implementations
1.6. Monitoring Distributed Systems/Observability
1.6.1. Telemetry
1.6.1.1. Monitoring/Alerting
1.6.1.1.1. KPIs
1.6.1.1.2. Service Levels
1.6.1.1.3. Golden Signals
1.6.2. Tracing
1.6.2.1. Transactions
1.6.2.2. Service Mapping
1.6.3. Logging
1.6.4. Documentation
1.6.4.1. Architectural Diagrams
1.6.4.2. Playbooks
1.6.4.3. User Journeys
1.6.4.4. Production Readiness Review
2. Lens
2.1. Architecture: WAF
2.1.1. Operational Excellence
2.1.1.1. Incident Response
2.1.1.2. Observability
2.1.2. Cost Optimization
2.1.2.1. Capacity Planning
2.1.2.2. Self Service
2.1.3. Reliability
2.1.4. Performance Efficiency
2.1.4.1. Service Level Objectives
2.1.4.2. Reliability
2.1.5. Sustainability
2.1.5.1. Toil Reduction
2.1.6. Security
2.1.6.1. Security Response
2.1.6.2. Release Strategy
2.2. Product: 3 Essentials
2.2.1. #3 What does success look like and am I achieving it?
2.3. Engineering: Workload Ownership
3. Practices
3.1. CAI Requirements
3.1.1. Production Readiness Reviews
3.2. Evolving SRE Engagement Model
3.2.1. Production Readiness Reviews
3.2.2. SRE Deep Dive Engagements
3.3. Incident Management
3.4. Security Vulnerability Management
3.5. Post Mortem Culture
3.6. Interrupt/Cognitive Load Management
3.6.1. Team Topology
3.6.2. DevOps Topologies
3.6.3. Workload Catalog
3.6.4. Workload Ownership
4. Actors
4.1. Architecture
4.2. Site Reliability Engineering
4.3. Delivery Stream
4.4. Release Train
4.5. Scrum Teams
4.6. Enablement teams
4.7. Engineering Leaders
5. Platforms/Tools
5.1. Platform Engineering (Engineering Enablement)
5.1.1. Common Tools
5.1.2. Self Service Management
5.1.3. Patterns
5.1.3.1. Modular architecture
5.1.3.2. Small, frequent deployments