A notification service routes messages to the right channel (push, email, SMS) for the right user at the right time. The hard parts: per-user preferences, channel-specific delivery, retry on failure, and not waking the user at 3 AM. The naive implementation gets all four wrong.

API surface

Producers call send(user_id, template_id, params). The service resolves which channels the user has opted into, whether quiet hours apply, and whether dedup applies (was the same template sent to this user in the last hour?). It then dispatches per channel.
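The dedup check is the easiest of those steps to show in isolation. Here is a minimal in-memory sketch, assuming a one-hour window keyed on (user_id, template_id). The DedupCache class and should_send method are hypothetical names, and a real service would more likely keep this state in a shared store with expiry.

```python
import time
from typing import Dict, Optional, Tuple

DEDUP_WINDOW_SECONDS = 3600  # the "last hour" window described above


class DedupCache:
    """Hypothetical helper: remembers when each (user, template) pair was last sent."""

    def __init__(self, window_seconds: int = DEDUP_WINDOW_SECONDS) -> None:
        self.window_seconds = window_seconds
        self._last_sent: Dict[Tuple[str, str], float] = {}

    def should_send(
        self, user_id: str, template_id: str, now: Optional[float] = None
    ) -> bool:
        """Return True and record the send, or False if it is a duplicate."""
        now = time.time() if now is None else now
        key = (user_id, template_id)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window_seconds:
            return False  # same template went out inside the window
        self._last_sent[key] = now
        return True
```

Suppressed duplicates do not refresh the timestamp here, so the window is measured from the last message that actually went out.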

Channel adapters

Each channel (APNs, FCM, SendGrid, Twilio) has its own adapter implementing a uniform interface. Channels can fail independently: SMS down doesn't block push. Each adapter handles its own auth, rate-limiting, retry.
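One way to sketch the uniform interface and the failure isolation is below. ChannelAdapter, deliver, and the two fake adapters are illustrative names, not any vendor's SDK; real adapters would wrap the APNs, FCM, SendGrid, or Twilio clients behind the same method.

```python
from typing import Dict, List, Protocol


class ChannelAdapter(Protocol):
    """Hypothetical uniform interface every channel implements."""

    name: str

    def deliver(self, user_id: str, body: str) -> None:
        """Deliver one message, or raise on failure."""
        ...


class FakePushAdapter:
    name = "push"

    def __init__(self) -> None:
        self.sent: List[str] = []

    def deliver(self, user_id: str, body: str) -> None:
        self.sent.append(f"{user_id}:{body}")


class FailingSmsAdapter:
    name = "sms"

    def deliver(self, user_id: str, body: str) -> None:
        raise ConnectionError("SMS provider unavailable")


def dispatch(adapters: List[ChannelAdapter], user_id: str, body: str) -> Dict[str, str]:
    """Try every channel; one channel failing never stops the others."""
    results: Dict[str, str] = {}
    for adapter in adapters:
        try:
            adapter.deliver(user_id, body)
            results[adapter.name] = "ok"
        except Exception as exc:  # isolate the failure to this channel
            results[adapter.name] = f"failed: {exc}"
    return results
```

With the SMS adapter listed first and failing, push still goes out, which is the independence property described above.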

Retry strategy

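Exponential backoff with jitter, with base delays of 1s, 2s, 4s, 8s, 16s: at most 5 retries after the initial attempt. Once retries are exhausted, route the message to a dead-letter queue for human review. Don't retry permanent 4xx errors (user opted out, invalid token), since those won't succeed; 429 (rate limited) and 408 (request timeout) are the exceptions and are worth retrying.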
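A rough sketch of that retry loop follows, under a few stated assumptions. It reads the five listed delays as five retries after the first attempt. It uses "full jitter" (a uniform draw between zero and the exponential cap), which is one common variant and not necessarily the one intended here. It treats 408 and 429 as retryable even though they are 4xx. DeliveryError, send_with_retry, and the list-backed dead-letter queue are hypothetical.

```python
import random
import time
from typing import Callable, List

BASE_DELAY_SECONDS = 1.0
MAX_RETRIES = 5  # waits of up to 1, 2, 4, 8, 16 seconds
RETRYABLE_4XX = {408, 429}  # assumption: timeouts and rate limits are transient


class DeliveryError(Exception):
    """Hypothetical error an adapter raises, carrying an HTTP-style status."""

    def __init__(self, status: int) -> None:
        super().__init__(f"delivery failed with status {status}")
        self.status = status


def is_retryable(status: int) -> bool:
    return status >= 500 or status in RETRYABLE_4XX


def backoff_delay(retry_index: int, rng: random.Random) -> float:
    """Full jitter: a uniform draw from [0, base * 2**retry_index]."""
    return rng.uniform(0.0, BASE_DELAY_SECONDS * 2 ** retry_index)


def send_with_retry(
    deliver: Callable[[], None],
    dead_letter_queue: List[str],
    message_id: str,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    rng = random.Random()
    for retry_index in range(MAX_RETRIES + 1):
        try:
            deliver()
            return "delivered"
        except DeliveryError as exc:
            if not is_retryable(exc.status):
                return "permanent_failure"  # e.g. opted out, invalid token
            if retry_index == MAX_RETRIES:
                break  # retries exhausted
            sleep(backoff_delay(retry_index, rng))
    dead_letter_queue.append(message_id)  # left for human review
    return "dead_lettered"
```

Passing sleep as a parameter keeps the loop testable without real waiting. Permanent failures return immediately so the caller can act on them (for example, pruning an invalid token) without filling the dead-letter queue.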

Quiet hours + frequency caps

Per-user table: quiet hours (e.g., 22:00–07:00 local time) and max notifications per day. The service enforces both before dispatching, using the user's own time zone: midnight UTC is 05:30 in India, well inside that quiet window. Skip this and it becomes the most-complained-about bug.
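The subtle part is that a 22:00–07:00 window wraps past midnight, so a plain start <= t < end comparison is wrong. A small sketch of the check, assuming the caller passes the current time as an aware UTC datetime plus the user's tzinfo; in_quiet_hours and may_notify are hypothetical names:

```python
from datetime import datetime, time, timedelta, timezone, tzinfo


def in_quiet_hours(now_utc: datetime, user_tz: tzinfo, start: time, end: time) -> bool:
    """True if the user's local wall-clock time falls in [start, end)."""
    local = now_utc.astimezone(user_tz).time()
    if start <= end:
        return start <= local < end  # window inside one day, e.g. 13:00-15:00
    return local >= start or local < end  # window wraps midnight, e.g. 22:00-07:00


def may_notify(
    now_utc: datetime,
    user_tz: tzinfo,
    quiet_start: time,
    quiet_end: time,
    sent_today: int,
    daily_cap: int,
) -> bool:
    """Apply the frequency cap first, then the quiet-hours window."""
    if sent_today >= daily_cap:
        return False
    return not in_quiet_hours(now_utc, user_tz, quiet_start, quiet_end)


# Fixed +05:30 offset standing in for India; illustrative only.
IST = timezone(timedelta(hours=5, minutes=30))
midnight_utc = datetime(2024, 1, 15, 0, 0, tzinfo=timezone.utc)
blocked = in_quiet_hours(midnight_utc, IST, time(22, 0), time(7, 0))  # 05:30 local
```

For zones with daylight saving time, a named zone such as zoneinfo.ZoneInfo("Europe/Berlin") is a better fit than a fixed offset, since the offset changes during the year.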

Observability

Track per notification: sent timestamp, delivered (carrier ACK), opened (push), bounced (email). Per-user metric: notification fatigue score, computed as opens / sends over the last 30 days; a low ratio signals fatigue. Automatically throttle users whose ratio stays low.
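As a sketch of how the opens / sends ratio could drive throttling: the 5% threshold, the 20-send minimum, and the halving policy below are all illustrative assumptions, and EngagementWindow and daily_cap_for are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class EngagementWindow:
    """Counts over the trailing 30 days for one user."""

    sends: int
    opens: int


def engagement_ratio(window: EngagementWindow) -> float:
    """opens / sends; users with no sends are treated as fully engaged."""
    if window.sends == 0:
        return 1.0
    return window.opens / window.sends


def daily_cap_for(
    window: EngagementWindow,
    base_cap: int,
    low_threshold: float = 0.05,  # assumed cutoff for "low engagement"
    min_sends: int = 20,  # assumed minimum sample before throttling
) -> int:
    """Halve the daily cap for users who rarely open what they receive."""
    if window.sends < min_sends:
        return base_cap  # too little data to judge
    if engagement_ratio(window) < low_threshold:
        return max(1, base_cap // 2)
    return base_cap
```

Feeding the result back into the per-user daily cap keeps throttling inside the same enforcement path as the frequency limits.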

Channel adapters + retry with DLQ + quiet hours + fatigue throttling. Per-user state is the heart; everything else is plumbing.