Application Dependency Mapping
Apps fail in layers, not magic
Real-world analogy
Every app is a kitchen line: front-of-house takes the order, the kitchen cooks, the dishwasher keeps up, the fridge holds ingredients. If the fridge is empty, the customer thinks ‘the restaurant is broken’ — but you find the real culprit by walking the line.
What is it?
App dependency mapping is the operational discipline that turns ‘app is slow’ into ‘service X is slow because dependency Y timed out.’ Once you think this way, every outage becomes solvable, not mystical.
Real-world relevance
A CRM page loads but saves fail. Dependency map shows CRM → API → identity → DB → storage. Identity is fine. DB healthy. Storage returns 503s for some regions — a cloud provider incident. You communicate, wait, validate recovery. Zero wasted escalations.
Key points
- The 5-layer mental model — User → app frontend → app backend/API → supporting services (cache, queue, DB, file store, identity) → infra (OS, network, cloud). A failure usually lives in one layer; your job is to find it.
- Health endpoints — Most apps expose /health or /status. Use them. Also ping DBs, hit cache, check queues, verify identity providers separately. A ‘green frontend’ can hide a red dependency.
- The right first 5 questions — (1) Who is affected (one user, many, all)? (2) When did it start? (3) What changed? (4) Which dependency is failing? (5) What’s the error code / trace ID? Good answers prevent wasted hours.
- Logs, metrics, traces — Logs: what happened. Metrics: numbers over time. Traces: a request’s journey across services. All three together beat any one alone.
- Dependency diagrams — A one-page diagram for each app showing its dependencies saves the world during outages. Label external parties, identity providers, DNS, storage, secrets stores.
- Secrets and certificates — Apps fail spectacularly when a secret rotates unexpectedly or a certificate expires. Know where they live, their renewal cadence, and their owner. ‘Certs expired at midnight’ has caused many global outages.
- Tenants, regions, zones — Many SaaS apps have multi-tenant, multi-region architectures. An outage may affect only your tenant or only one region. Before declaring ‘it’s broken,’ establish the scope.
- Talking across teams — Junior IT is often the bridge between end users and multiple specialist teams (app, DB, network, cloud, vendor). Clear dependency language makes you the go-to person.
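The layered model above can be sketched as a top-to-bottom health walk: probe each layer in order and stop at the first failure. This is a minimal sketch with simulated checks standing in for real probes (HTTP /health, DB ping, cache GET); the layer names and results are illustrative, not real endpoints.

```python
# Walk the dependency layers top-down and report the first failing one.
# Each entry is (layer name, check function); the checks here are
# simulated stand-ins for real probes -- illustrative only.

def find_failing_layer(layers):
    """Return the name of the first failing layer, or None if all are healthy."""
    for name, check in layers:
        try:
            if not check():
                return name
        except Exception:
            # A probe that raises (timeout, refused connection) also counts as failing.
            return name
    return None

# Simulated probes: everything healthy except storage (e.g. a 503 from one region).
layers = [
    ("frontend", lambda: True),
    ("backend/API", lambda: True),
    ("identity", lambda: True),
    ("database", lambda: True),
    ("storage", lambda: False),  # the real culprit
]

print(find_failing_layer(layers))  # -> storage
```

The same loop is what you do mentally during an incident; writing it down as a script is the first step toward the monitors listed later in this page.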
Code example
// Dependency-mapping template (per app)
App name: BillingPortal
Owner team: Finance Systems
Criticality: Tier 1

External users: Customers via https://billing.contoso.com
Internal users: Finance ops team

Frontend: React SPA on CDN + WAF
Backend: REST API (container) in region A, zone 1+2

Dependencies (and failure signal):
- Identity: Entra (OIDC) -> login fails
- DB: PostgreSQL HA cluster -> 500 on save
- Cache: Redis -> slow reads
- Queue: RabbitMQ -> delayed events
- Storage: S3 bucket for invoices -> download fails
- Secrets: Key Vault -> startup crash
- Email: SMTP gateway -> no notifications
- External: Tax API -> partial failures

Monitors:
- /health endpoint per service
- synthetic user journey (login -> save -> download)
- logs + metrics + traces with correlation IDs
- on-call runbook for each failure mode
Line-by-line walkthrough
- 1. Dependency template
- 2. App name
- 3. Owner team
- 4. Criticality tier
- 5. Blank separator
- 6. External users
- 7. Internal users
- 8. Blank separator
- 9. Frontend description
- 10. Backend description
- 11. Blank separator
- 12. Dependencies list
- 13. Identity failure signal
- 14. DB failure signal
- 15. Cache failure signal
- 16. Queue failure signal
- 17. Storage failure signal
- 18. Secrets failure signal
- 19. Email failure signal
- 20. External API failure
- 21. Blank separator
- 22. Monitors
- 23. Per-service health endpoint
- 24. Synthetic user journey
- 25. Logs + metrics + traces + correlation IDs
- 26. On-call runbook per failure mode
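The "synthetic user journey" monitor from the template can be sketched as a script that runs the same steps a real user would and tags every step with one correlation ID, so a failure can be searched across logs, metrics, and traces. The step functions below are simulated placeholders; a real probe would call the actual login, save, and download endpoints.

```python
import uuid

def run_journey(steps):
    """Run each (name, fn) step under one correlation ID; stop at the first failure."""
    corr_id = uuid.uuid4().hex
    for name, step in steps:
        ok = step()
        # In a real monitor this line would go to structured logs, not stdout.
        print(f"corr={corr_id} step={name} ok={ok}")
        if not ok:
            return (corr_id, name)  # failing step, plus the ID to search logs for
    return (corr_id, None)

# Simulated steps mirroring the template's journey: login -> save -> download.
steps = [
    ("login", lambda: True),
    ("save", lambda: True),
    ("download", lambda: False),  # e.g. the invoice storage bucket returns 503
]

corr_id, failed = run_journey(steps)
print(f"failed step: {failed}")  # -> failed step: download
```

Because every step shares one correlation ID, the on-call engineer can jump straight from "download failed" to the matching backend and storage log lines.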
Spot the bug
App outage ticket: 'Everything is broken, please fix!' Junior tells the whole company via email: 'CRM down, we don’t know yet, working on it.'
Need a hint?
Which two pieces of discipline are missing?
Show answer
(1) Scope first — identify whether this affects all users, one region, or one tenant before announcing enterprise-wide. (2) Use structured comms — follow the incident comms process (Comms Lead, channel, update cadence) rather than an unreviewed company-wide email. Calm, accurate, timed updates beat panic.
Explain like I'm 5
Every big app is a team of smaller helpers. When something breaks, don’t blame the whole team — find which helper fell down, then you can fix it fast.
Fun fact
In many severe incidents, the ‘failure’ turns out to be an expired TLS certificate somewhere quiet, like an internal API or a DNS-validated domain. Certificate observability (renewal alerts + inventory) pays for itself the first time it saves a Sunday night.
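Certificate observability can start very small: parse a certificate's expiry timestamp and compute the days remaining. A minimal sketch, assuming the date format that Python's `ssl.getpeercert()` reports (`'Jun  1 12:00:00 2026 GMT'`); the sample expiry value and reference date are made up for illustration.

```python
from datetime import datetime, timezone

# ssl.getpeercert() reports expiry as e.g. 'Jun  1 12:00:00 2026 GMT'.
CERT_DATE_FMT = "%b %d %H:%M:%S %Y %Z"

def days_until_expiry(not_after, now=None):
    """Days until the certificate expires (negative means already expired)."""
    expires = datetime.strptime(not_after, CERT_DATE_FMT).replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).days

# Hypothetical cert expiry, checked against a fixed reference date.
ref = datetime(2026, 5, 1, tzinfo=timezone.utc)
print(days_until_expiry("Jun  1 12:00:00 2026 GMT", now=ref))  # -> 31
```

Wired into an inventory of hosts and a daily alert at, say, 30 days remaining, this tiny check is the renewal-alert half of the certificate observability described above.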
Hands-on challenge
Pick any app you use daily (email, bank app, streaming). Draw its likely dependency map: frontend, backend, identity, DB, cache, storage, external APIs. Mark how each failure would feel to a user.
More resources
- Observability basics (Honeycomb)
- OpenTelemetry intro (OpenTelemetry)
- Tracing and observability talks (YouTube search)