Loosely coupled teams

DORA Architecture

1. Understand

1.1. Have a loosely coupled architecture and teams that make it possible to work independently

1.2. Working independently means

1.2.1. A small team

1.2.2. Responsible for a given service

1.2.3. Can

1.2.3.1. 1. Make large-scale changes to the system design

1.2.3.1.1. WITHOUT needing permission from anyone outside the team

1.2.3.2. 2. Complete work

1.2.3.2.1. WITHOUT communicating and coordinating with people outside the team

1.2.3.3. 3. Do most of their testing on demand

1.2.3.3.1. WITHOUT requiring an integrated test environment

1.2.3.4. 4. Release product / deploy service

1.2.3.4.1. INDEPENDENTLY of the other services it depends upon

1.3. So that it

1.3.1. increases developer productivity

1.3.2. improves deployment outcomes

2. Pitfalls

2.1. Big bang release

2.1.1. Meaning

2.1.1.1. Need to simultaneously release many services

2.1.1.2. Due to complex interdependencies

2.1.2. Testing usually requires a large, shared integration environment (see the next pitfall)

2.2. Large, long integration / acceptance tests

2.2.1. For 100s to 1000s of developers

2.2.2. Requires weeks to get the global test environment

2.2.3. Usually not well aligned with production

2.3. Bottleneck

2.3.1. one team with manual operations

2.3.2. on the critical path to release for other teams

3. Implement

3.1. Be strict on the 4 wanted outcomes:

3.1.1. Independently

3.1.1.1. Make a large design change

3.1.1.2. Complete work

3.1.1.3. Test

3.1.1.4. Release / deploy

3.2. Choose an archetype that facilitates the wanted outcomes, long term

3.2.1. See the architecture archetypes under Background

3.3. Do not assume the architecture archetype is enough

3.3.1. There is no perfect architecture for all products and all scales

3.3.2. It is possible to have a microservice architecture and still work in a tightly coupled way ...

3.4. Empower the team to

3.4.1. discuss the architecture

3.4.1.1. as architectural needs change too

3.4.2. experiment with ideas

3.4.3. choose their own tools

3.5. Have well-defined contracts between services (see the sketch after this list)

3.5.1. aka APIs

3.5.1.1. backward compatible as often as possible

3.5.1.2. versioned
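
For illustration, a minimal sketch of what a versioned, backward-compatible contract can look like in code; the Order service, the /v1 and /v2 paths, and the field names are hypothetical, not taken from the source.

```python
# Sketch of a versioned, backward-compatible service contract.
# All names (OrderV1, OrderV2, the /v1 and /v2 paths) are illustrative.
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class OrderV1:
    order_id: str
    amount_cents: int

@dataclass
class OrderV2(OrderV1):
    # v2 only ADDS optional fields, so v1 clients that ignore
    # unknown keys keep working (backward compatible).
    currency: str = "EUR"
    promo_code: Optional[str] = None

def handle(path: str, order: OrderV2) -> str:
    """Route by explicit version prefix; old versions stay available."""
    if path.startswith("/v1/orders"):
        # Serve only the fields the v1 contract promised.
        return json.dumps(asdict(OrderV1(order.order_id, order.amount_cents)))
    if path.startswith("/v2/orders"):
        return json.dumps(asdict(order))
    raise ValueError(f"unknown API version in {path}")

if __name__ == "__main__":
    o = OrderV2(order_id="42", amount_cents=1999, promo_code="WELCOME")
    print(handle("/v1/orders/42", o))  # old clients see the old shape
    print(handle("/v2/orders/42", o))  # new clients get the extra fields
```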

4. Improve

4.1. Hope for the best, plan for the worst.

4.1.1. Explore failure domains

4.1.1.1. one process

4.1.1.2. one instance

4.1.1.3. a rack

4.1.1.4. a data center

4.1.1.5. one binary deployed on x servers

4.1.1.6. a global config system

4.1.2. Counter measures

4.1.2.1. Decouple: monolith to micro svc

4.1.2.1.1. Re-architect

4.1.2.2. Progressive release

4.1.2.2.1. first, canary e.g. 1% traffic

4.1.2.2.2. then, progressive saturation: e.g. 10%, 25%, 50%, 100%

4.1.2.2.3. then next geo

4.1.2.2.4. enabled by automated, step-wise rollouts (see the sketch below)
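
A minimal sketch of the canary-then-saturation idea above, assuming an automated rollout driver with a metric-based health gate; the stage percentages, region names, and the health check are illustrative placeholders.

```python
# Sketch of a progressive rollout: canary first, then staged saturation,
# repeated per region. Stages, regions and the health gate are illustrative.
import random

STAGES = [1, 10, 25, 50, 100]          # % of traffic on the new version
REGIONS = ["eu-west", "us-east", "asia-se"]

def healthy(version: str, pct: int) -> bool:
    """Placeholder health gate: in practice, compare SLO metrics
    (error rate, latency) between canary and baseline."""
    return random.random() > 0.02       # simulate a rare bad rollout step

def roll_out(version: str) -> bool:
    for region in REGIONS:              # one geography at a time
        for pct in STAGES:
            print(f"{region}: routing {pct}% of traffic to {version}")
            if not healthy(version, pct):
                print(f"{region}: regression detected at {pct}%, rolling back")
                return False            # stop early, blast radius stays small
    return True

if __name__ == "__main__":
    roll_out("v2.3.1")
```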

4.1.2.3. Watch remaining global changes

4.1.2.3.1. e.g. config management

4.1.2.3.2. Have a specialized role

4.1.2.4. Spread risks

4.1.2.4.1. track and remove SPOFs (single points of failure)

4.1.2.4.2. transfer to a CSP-managed service, e.g. use GCP Pub/Sub, GCP Spanner ...

4.1.2.5. Mitigate impacts

4.1.2.5.1. Fail gracefully
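
A minimal sketch of "fail gracefully", assuming a slow or unavailable dependency and a cached fallback; the recommendation service and the cache are hypothetical stand-ins.

```python
# Sketch of graceful degradation: if a dependency fails or is slow,
# serve a cached or default answer instead of failing the whole request.
# The dependency and the cache here are illustrative stand-ins.

_cache = {"recommendations": ["default-item-1", "default-item-2"]}

def call_recommendation_service(user_id: str, timeout_s: float = 0.2) -> list:
    """Stand-in for a remote call that may be slow or unavailable."""
    raise TimeoutError("dependency did not answer in time")

def get_recommendations(user_id: str) -> list:
    try:
        fresh = call_recommendation_service(user_id)
        _cache["recommendations"] = fresh      # refresh the fallback
        return fresh
    except (TimeoutError, ConnectionError):
        # Degrade instead of propagating the failure to the user.
        return _cache["recommendations"]

if __name__ == "__main__":
    print(get_recommendations("user-42"))      # still answers, degraded
```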

5. Background

5.1. Inverse Conway Maneuver

5.1.1. Do not underestimate that

5.1.1.1. The inter-team communication patterns

5.1.1.2. affect the software design patterns

5.1.2. Meaning

5.1.2.1. tightly-coupled team communication

5.1.2.2. delivers tightly-coupled architectures

5.1.2.3. so what?

5.1.2.3.1. a small change

5.1.2.3.2. may result in coordinated changes across many services and teams

5.1.2.3.3. which slows delivery down

5.1.3. INVERSE

5.1.3.1. Loosely coupled team communication

5.1.3.2. delivers loosely coupled architectures

5.2. Architecture archetype

5.2.1. compare

5.3. Microservices

5.3.1. Monolith vs microservices

5.3.2. split by specific, well-bounded business functions

5.3.3. benefits

5.3.3.1. scalable

5.3.3.1.1. horizontal, vertical, and geographically

5.3.3.1.2. automatically

5.3.3.1.3. cattle vs pets

5.3.3.2. reliable

5.3.3.2.1. reduce blast radius, auto healing

5.3.3.3. agile

5.3.3.3.1. decouple team work along microservice boundaries

5.3.4. cost

5.3.4.1. additional complexity, e.g.

5.3.4.1.1. cascading failures (see the circuit-breaker sketch after this list)

5.3.4.1.2. network policies between micro svc

5.3.4.2. Watch limits proactively!
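
One common countermeasure for cascading failures between microservices is a circuit breaker; below is a minimal sketch of that generic pattern, with illustrative thresholds, not something prescribed by the source.

```python
# Minimal circuit-breaker sketch: stop calling a failing downstream service
# so its failures do not cascade through the whole system.
# max_failures and reset_after_s are illustrative thresholds.
import time

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after_s=30.0):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.failures = 0
        self.opened_at = None            # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after_s:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None        # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0                # success closes the circuit again
        return result

if __name__ == "__main__":
    breaker = CircuitBreaker(max_failures=3, reset_after_s=30.0)

    def flaky_downstream():
        raise ConnectionError("downstream unavailable")

    for _ in range(5):
        try:
            breaker.call(flaky_downstream)
        except Exception as e:
            # first 3 calls fail with ConnectionError, then the breaker
            # opens and the remaining calls fail fast with RuntimeError
            print(type(e).__name__, "-", e)
```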

5.4. Load balancing

5.4.1. Expectation on backends

5.4.1.1. homogeneous

5.4.1.2. interchangeable

5.4.1.3. spread on multiple instances

5.4.1.3.1. min 3

5.4.1.3.2. usually 100 - 1000

5.4.1.3.3. sometimes 10000+

5.4.2. Benefits

5.4.2.1. Availability

5.4.2.1.1. Exclude lame ducks, trigger recycling (see the sketch after this list)

5.4.2.2. Latency

5.4.2.2.1. Route to the closest region

5.4.2.3. Scalability

5.4.2.3.1. By adding more backend instances

5.4.2.3.2. based on meaningful metrics
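
A minimal sketch of client-side load balancing over homogeneous, interchangeable backends, skipping lame ducks; the backend addresses and the health probe are hypothetical.

```python
# Sketch of a client-side load balancer over homogeneous, interchangeable
# backends: round-robin, skipping instances whose health check fails
# ("lame ducks"). Addresses and health logic are illustrative.
import itertools

BACKENDS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]  # min 3
_rr = itertools.cycle(BACKENDS)

def is_healthy(backend: str) -> bool:
    """Stand-in for a /healthz probe or a load report check."""
    return backend != "10.0.0.2:8080"     # pretend one instance is a lame duck

def pick_backend() -> str:
    for _ in range(len(BACKENDS)):
        candidate = next(_rr)
        if is_healthy(candidate):
            return candidate
    raise RuntimeError("no healthy backend available")

if __name__ == "__main__":
    for _ in range(4):
        print("routing request to", pick_backend())
```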

5.5. Stateless

5.5.1. transfer state to an external component (see the sketch after this list)

5.5.2. e.g.

5.5.2.1. GCP Pub/Sub, BigQuery, Spanner, GCS

5.5.3. Benefits

5.5.3.1. Scalable

5.5.3.2. Reliable
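
A minimal sketch of a stateless handler, assuming all request-scoped state lives in an external store; here a plain dict stands in for a managed service such as those listed above.

```python
# Sketch of a stateless handler: no request-scoped state is kept in the
# process, so any instance can serve any request and instances can be
# added or removed freely. The dict stands in for an external managed
# store (database, key-value service, ...); names are illustrative.

EXTERNAL_STORE = {}   # imagine this lives outside the serving process

def handle_click(user_id: str) -> int:
    """Increment and return the user's click count.
    All state lives in the external store, not in this instance."""
    count = EXTERNAL_STORE.get(user_id, 0) + 1
    EXTERNAL_STORE[user_id] = count
    return count

if __name__ == "__main__":
    # Any replica running this code would give the same answer,
    # because the state is shared externally.
    print(handle_click("user-42"))
    print(handle_click("user-42"))
```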

5.6. CAP Theorem

5.6.1. Eric Brewer

5.6.2. A distributed system CANNOT provide all three of the following at the same time (see the sketch after this list)

5.6.2.1. C = Consistent: same data state everywhere

5.6.2.2. A = Available for reads and writes

5.6.2.3. P = tolerant to Partitions (e.g. losing a DC, a rack, a fiber ...)

5.6.3. Usually

5.6.3.1. P is a given requirement of distributed / cloud systems

5.6.3.2. Remains

5.6.3.2.1. AP

5.6.3.2.2. CP

5.6.4. Key SLO

5.6.4.1. Availability

5.6.4.1.1. cAp / Availability

5.6.4.2. Latency

5.6.4.2.1. caP / Partition Tolerant

5.6.4.3. Correctness

5.6.4.3.1. Cap / Consistency
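
A tiny, purely illustrative sketch of the CP vs AP choice during a partition: with two replicas that cannot talk to each other, a CP system rejects writes to stay consistent, while an AP system accepts them and tolerates temporary divergence.

```python
# Illustration of CP vs AP behavior under a network partition.
# The replicas, modes, and values are illustrative, not a real protocol.

class Replica:
    def __init__(self, name: str):
        self.name = name
        self.value = None

def write(replicas, value, partitioned: bool, mode: str):
    if partitioned and mode == "CP":
        # Consistency over availability: reject the write.
        raise RuntimeError("partition: write rejected to stay consistent")
    reachable = replicas[:1] if partitioned else replicas
    for r in reachable:                   # AP: write to whatever we can reach
        r.value = value

if __name__ == "__main__":
    a, b = Replica("a"), Replica("b")
    write([a, b], "x=1", partitioned=True, mode="AP")
    print(a.value, b.value)               # 'x=1' vs None: temporarily inconsistent
    try:
        write([a, b], "x=2", partitioned=True, mode="CP")
    except RuntimeError as e:
        print(e)                          # consistent, but unavailable for writes
```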

5.7. N+2 principle

5.7.1. N instances = what is needed to serve the load (incl. peak)

5.7.2. +1 instance to cover an unexpected failure

5.7.3. +1 more instance to allow planned offline maintenance (worked example below)
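
A small worked example of N+2 sizing; the load and per-instance capacity figures are illustrative.

```python
# Worked example of N+2 sizing. Numbers are illustrative.
peak_rps = 1200          # expected peak load
rps_per_instance = 300   # what one instance can serve

n = -(-peak_rps // rps_per_instance)   # ceiling division -> N = 4
provisioned = n + 2                    # +1 unexpected failure, +1 planned maintenance
print(f"serve peak with N={n}, provision N+2={provisioned} instances")
```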