Theoretical Foundations

Welcome to the curriculum workspace. Here you will find long-form technical guidelines outlining core architectural blueprints and implementation mechanics.

Module 13: Edge Gateways, Security, & Traffic Shaping

PREREQUISITE STATEMENT: Read this module after completing Module 12 (CQRS & Event Sourcing). Separating write and read pathways optimizes internal state management, but exposing these raw interfaces directly to the client network creates severe security risks, protocol mismatches, and data exposure. This module teaches you how to shield your internal microservices using an intelligent Edge API Gateway.

1. Introduction: The Hazards of Direct Backend Exposure

Exposing backend microservices directly to the public internet presents significant architectural vulnerabilities:

Security Sprawl: Every service must independently implement authentication, authorization check logic, SSL/TLS certificates, and IP blocklists. This increases code duplication and the probability of a security configuration error.
Cross-Origin Resource Sharing (CORS) Friction: Browsers calling multiple API domains need a CORS policy on every service, and non-simple cross-origin requests trigger an extra preflight (OPTIONS) round trip, which adds latency.
Protocol Mismatches: External clients (web, mobile, third-party integrations) typically speak REST/JSON over HTTP/1.1, whereas internal microservices are often optimized for binary protocols such as gRPC over HTTP/2. Browsers cannot call native gRPC endpoints directly, so exposing them requires an additional translation layer.
Chatty Client Communication: Exposing raw services forces clients to make several calls to construct a single screen (e.g. call the User service, the Order service, and the Catalog service), and each extra round trip over the public network adds latency.

An Edge API Gateway acts as a reverse proxy, serving as the single entry point for all client traffic. It isolates downstream services from client networks and handles cross-cutting concerns:

[ Client App ] --( Public HTTP/JSON )--> [ API GATEWAY ] --( Private gRPC/TCP )--> [ Backend Microservices ]

2. Distributed Authentication & Token Translation at the Edge

A primary responsibility of the API Gateway is validating client credentials at the network edge, preventing unauthorized traffic from saturating internal resources.

sequenceDiagram
    participant Client as Client Browser
    participant GW as API Gateway
    participant IDP as Identity Provider (IdP)
    participant AuthSvc as Backend Auth Service

    Client->>GW: 1. Request with Bearer Token (JWT)
    Note over GW: 2. Check JWKS Cache for Key
    alt JWKS Key not in Cache
        GW->>IDP: 3. Fetch JWKS Public Keys
        IDP-->>GW: 4. Return Public Keys
    end
    Note over GW: 5. Cryptographically Verify Signature
    GW->>AuthSvc: 6. Forward Request with X-User-Id / X-User-Roles
    AuthSvc-->>GW: 7. HTTP 200 OK (Processed)
    GW-->>Client: 8. HTTP 200 OK (Response)

A. JWT Signature Verification

Modern API gateways (e.g., Kong, Apigee, NGINX) validate JSON Web Tokens (JWT) cryptographically at the edge:

The client includes a signed JWT access token in the Authorization: Bearer <JWT> HTTP header.
The gateway retrieves the public keys of the Identity Provider (IdP) via a standard JWKS (JSON Web Key Set) endpoint.
The gateway caches these keys locally. It validates the signature, expiration time (exp), and audience claim (aud) of the incoming token without performing a database lookup or calling the authentication server on each request.
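The caching and claim checks described above can be sketched in Python as follows. This is a minimal illustration, not how any particular gateway implements it: `JwksCache`, `claims_are_valid`, the `fetch_jwks` callable, and the 300-second TTL are all hypothetical names and values chosen for the example. Signature verification itself is left out and would be delegated to a vetted JWT library.

```python
import time
from typing import Callable, Dict, Optional


class JwksCache:
    """Caches IdP public keys by key ID (kid); refetches on a miss or after the TTL."""

    def __init__(self, fetch_jwks: Callable[[], Dict[str, str]],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.time) -> None:
        # fetch_jwks is a hypothetical callable that queries the IdP's JWKS
        # endpoint and returns a {kid: public_key} mapping.
        self._fetch_jwks = fetch_jwks
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys: Dict[str, str] = {}
        self._fetched_at: Optional[float] = None

    def get_key(self, kid: str) -> Optional[str]:
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at > self._ttl
        if stale or kid not in self._keys:
            self._keys = self._fetch_jwks()
            self._fetched_at = now
        return self._keys.get(kid)


def claims_are_valid(claims: dict, expected_audience: str, now: float) -> bool:
    """Checks exp and aud only; assumes the signature was already verified."""
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        return False
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return expected_audience in audiences
```

Note that this sketch refetches whenever it sees an unknown kid. A production gateway would likely throttle those refetches so that tokens carrying random key IDs cannot be used to flood the IdP.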

B. The Token Translation Pattern

While external clients use signed JWTs (which contain user profile information and are heavy in size), passing these raw tokens internally to every microservice is inefficient.

The API Gateway implements the Token Translation Pattern:

The gateway validates the incoming external JWT token.
It strips the token from the header and translates it into lightweight, trusted header metadata (e.g., X-User-Id, X-User-Roles, X-Client-Type).
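Internal microservices trust these headers because the gateway is the only path into the internal network. That trust holds only if two conditions are met: the gateway removes any client-supplied X-User-* headers before setting its own, and the services are not reachable from outside except through the gateway. Otherwise an external client could send spoofed X- headers directly.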
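The translation step can be sketched in Python as below. The function name `translate_headers` and the claim names `sub` and `roles` are assumptions made for illustration; the claims your IdP actually issues may differ. The sketch drops the raw token and any identity headers the client tried to supply, then writes the trusted values from the verified claims.

```python
from typing import Any, Dict, Mapping

# Headers that only the gateway is allowed to set (names taken from the text above).
GATEWAY_OWNED = ("x-user-", "x-client-type")


def translate_headers(incoming: Mapping[str, str],
                      claims: Mapping[str, Any],
                      client_type: str) -> Dict[str, str]:
    """Builds the header set forwarded to internal services."""
    forwarded: Dict[str, str] = {}
    for name, value in incoming.items():
        lowered = name.lower()
        # Drop the raw JWT and anything a client could use to spoof identity.
        if lowered == "authorization" or lowered.startswith(GATEWAY_OWNED):
            continue
        forwarded[name] = value
    # Assumed claim names: "sub" for the user ID, "roles" for a list of roles.
    forwarded["X-User-Id"] = str(claims["sub"])
    forwarded["X-User-Roles"] = ",".join(claims.get("roles", []))
    forwarded["X-Client-Type"] = client_type
    return forwarded
```

For example, a request that arrives with a forged `X-User-Id: attacker` header leaves the gateway carrying only the ID taken from the verified token.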

3. Traffic Shaping & Throttling Algorithms

To protect backend services from Denial of Service (DoS) attacks, automated scrapers, and resource exhaustion, you must throttle traffic using Rate Limiting Algorithms.

             [ Token Bucket ]                        [ Leaky Bucket ]
         Tokens refilled at rate r               Requests exit at rate r
             +-----------+                           +-----------+
             |  Tokens   |                           |  Queue    |
             |  (Burst)  |                           | (Smooth)  |
             +-----------+                           +-----------+

A. Token Bucket

Logic: A bucket of capacity $B$ holds tokens. Tokens are added to the bucket at a constant rate $r$ tokens/sec. Each request consumes one token. If the bucket is empty, the request is rejected with HTTP 429 Too Many Requests.
The Redis Implementation: To scale rate limiters, we avoid running background timers to increment tokens. Instead, we store the token balance together with a timestamp of the last evaluation and recompute the balance lazily on each request: $$\text{Tokens}_{\text{current}} = \min(B, \text{Tokens}_{\text{last}} + (\text{Time}_{\text{now}} - \text{Time}_{\text{last}}) \times r)$$
Pros: Allows handling short bursts of traffic (up to the maximum capacity $B$).
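The lazy refill formula can be sketched in Python as follows. This is an in-memory, single-process illustration under stated assumptions: the class name `TokenBucket` and the injectable `clock` parameter are hypothetical, and a shared store such as Redis would additionally need the read-update-write sequence to run atomically.

```python
import time
from typing import Callable


class TokenBucket:
    """Token bucket with lazy refill: no background timer, two stored fields."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity        # B, maximum burst size
        self.refill_rate = refill_rate  # r, tokens added per second
        self._clock = clock
        self._tokens = capacity         # the bucket starts full
        self._last = clock()            # time of the last evaluation

    def allow(self) -> bool:
        now = self._clock()
        # Tokens_current = min(B, Tokens_last + (Time_now - Time_last) * r)
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.refill_rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False  # the caller would respond with HTTP 429
```

With `capacity=2` and `refill_rate=1`, two back-to-back requests pass, a third is rejected, and one more is admitted after a second has elapsed.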

B. Leaky Bucket

Logic: Requests enter a queue (the bucket) with capacity $C$. The bucket leaks requests at a constant, smooth rate $r$ requests/sec, distributing them to backend services. If requests arrive faster than they can leak, they queue up. If the queue is full, new requests overflow and are rejected.
Pros: Guarantees a smooth flow of traffic, protecting backend services from burst-induced spikes.
Cons: Increases request latency for clients because requests are queued rather than processed immediately.
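A minimal Python sketch of the queue-and-leak behavior follows. The class name `LeakyBucket`, the `offer`/`drain` method names, and the injectable `clock` are assumptions for illustration. In a real gateway, `drain` would be driven by a worker loop that forwards each released request to the backend.

```python
import time
from collections import deque
from typing import Callable, Deque, List


class LeakyBucket:
    """Bounded FIFO queue that releases requests at a constant rate."""

    def __init__(self, capacity: int, leak_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity    # C, maximum queued requests
        self.leak_rate = leak_rate  # r, requests released per second
        self._clock = clock
        self._queue: Deque[str] = deque()
        self._last_leak = clock()

    def offer(self, request_id: str) -> bool:
        """Queues a request; returns False when the bucket overflows."""
        if len(self._queue) >= self.capacity:
            return False
        if not self._queue:
            # Restart the leak clock so idle time does not build up credit.
            self._last_leak = self._clock()
        self._queue.append(request_id)
        return True

    def drain(self) -> List[str]:
        """Returns the requests that are due for release since the last leak."""
        now = self._clock()
        due = int((now - self._last_leak) * self.leak_rate)
        released: List[str] = []
        while due > 0 and self._queue:
            released.append(self._queue.popleft())
            self._last_leak += 1.0 / self.leak_rate
            due -= 1
        return released
```

Resetting the leak clock when the queue is empty is a design choice in this sketch: it ensures that a burst arriving after a quiet period is still spaced out at rate $r$ instead of being released all at once.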

C. Sliding Window Log

Logic: Tracks the exact timestamp of every request for a client in a sorted set (such as a Redis ZSET). When a request arrives:
1. Delete all log entries older than the current window limit ($\text{Time}_{\text{now}} - \text{Window}$).
2. Count the remaining elements in the set.
3. If the count is less than the limit, allow the request and write the new timestamp to the set.
Pros: Extremely accurate.
Cons: High memory usage; storing a timestamp for every single request can consume gigabytes of RAM in high-traffic scenarios.
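The three steps above can be sketched in Python as follows. An in-memory `deque` stands in for the sorted set here, which is an assumption made to keep the example self-contained; against Redis the same steps would typically map to the `ZREMRANGEBYSCORE`, `ZCARD`, and `ZADD` commands on a per-client key. The class name `SlidingWindowLog` is hypothetical.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLog:
    """Exact sliding-window limiter for a single client."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._log: Deque[float] = deque()  # request timestamps, oldest first

    def allow(self) -> bool:
        now = self._clock()
        cutoff = now - self.window
        # Step 1: delete entries that have fallen out of the window.
        while self._log and self._log[0] <= cutoff:
            self._log.popleft()
        # Step 2: count what remains. Step 3: admit and record if under the limit.
        if len(self._log) < self.limit:
            self._log.append(now)
            return True
        return False
```

Because only admitted requests are recorded, memory per client is bounded by the limit, but it still grows linearly with the limit and the number of clients.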

D. Sliding Window Counter

Logic: An approximation algorithm that divides time into fixed windows (e.g. 1 minute). It tracks request counts for the current window and the previous window.

If a request arrives, the algorithm calculates a weighted sum of the requests: $$\text{Rate} = \text{Count}_{\text{prev}} \times \left(1 - \frac{\text{Time}_{\text{window\_elapsed}}}{\text{Window\_Size}}\right) + \text{Count}_{\text{curr}}$$

If $\text{Rate}$ is less than the limit, the request is accepted and $\text{Count}_{\text{curr}}$ is incremented.
Pros: Low memory overhead (requires only two counters per client) and high execution speed.
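The weighted-sum check can be sketched in Python as below. The class name `SlidingWindowCounter` and the injectable `clock` are assumptions for illustration, and the sketch covers a single client in one process. Windows are aligned to multiples of the window size.

```python
import time
from typing import Callable


class SlidingWindowCounter:
    """Approximate sliding-window limiter that stores only two counters."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window = window_seconds
        self._clock = clock
        self._index = int(clock() // window_seconds)  # current fixed window
        self._curr = 0
        self._prev = 0

    def allow(self) -> bool:
        now = self._clock()
        index = int(now // self.window)
        if index != self._index:
            # Roll forward; the old count carries over only if exactly
            # one window boundary was crossed.
            self._prev = self._curr if index == self._index + 1 else 0
            self._curr = 0
            self._index = index
        elapsed = now - index * self.window
        # Rate = Count_prev * (1 - elapsed / Window_Size) + Count_curr
        rate = self._prev * (1 - elapsed / self.window) + self._curr
        if rate < self.limit:
            self._curr += 1
            return True
        return False
```

As a worked case: with a limit of 10 per 60 seconds, a client that used all 10 requests in the previous window and returns 30 seconds into the current one starts with $\text{Rate} = 10 \times (1 - 30/60) + 0 = 5$, so 5 more requests are admitted before the limit is reached.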

E. Rate Limiter Comparison Table

Algorithm	Memory Overhead	Handles Bursts?	Latency Impact	Edge Use Case
Token Bucket	Very Low (2 fields)	Yes (up to $B$)	Negligible	General API rate limiting
Leaky Bucket	Low (bounded by queue)	No (forces smooth rate)	High (queues requests)	Egress queue smoothing
Sliding Window Log	High (grows with requests)	Yes	Negligible	Strict SLA tracking
Sliding Window Counter	Very Low (2 counters)	Yes	Negligible	High-scale client throttling

4. SSL/TLS Termination & Protocol Translation

SSL Termination: The cryptographic TLS handshake is CPU-intensive. The API Gateway terminates TLS connections at the edge, decrypts the traffic, and forwards it to backend services over unencrypted TCP or HTTP/2 connections inside the VPC. This gives up in-transit encryption on the internal hop in exchange for lower overhead, so it is appropriate only where the internal network is trusted.
Protocol Translation: The gateway exposes standard REST endpoints to the internet while translating them into internal gRPC payloads. This keeps external developer integration simple while leveraging gRPC's binary serialization performance within the internal network.

5. Documentation Standard: Edge Traffic Shaping Architecture

Below is a declarative Kong API Gateway configuration detailing routing, JWT validation, and rate limiting rules for a checkout service:

_format_version: "3.0"
services:
  - name: checkout-service
    url: http://checkout-internal.vpc:50051
    routes:
      - name: public-checkout
        paths:
          - /v1/checkout
        strip_path: true
plugins:
  # 1. Edge JWT Verification
  - name: jwt
    config:
      uri_param_names:
        - jwt
      cookie_names:
        - session_token
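      # Kong's bundled jwt plugin accepts only exp and nbf in this list;
      # audience (aud) checks must be enforced by other means.
      claims_to_verify:
        - exp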

  # 2. Rate Limiting (100 requests per minute per consumer; no separate burst capacity is configured)
  - name: rate-limiting
    config:
      minute: 100
      policy: redis
      redis_host: rate-limit-store.vpc
      redis_port: 6379
      limit_by: consumer
      fault_tolerant: false

6. Hands-on Architecture Challenge

Scenario Description

A client application (ClientApp) communicates directly with an internal microservice (InternalService) across public endpoints. There is no access control validation, and no rate limiting is enforced to protect the server from resource exhaustion.

Your Goal:

Insert an APIGateway between the Client and the InternalService.
Within the gateway boundary, show a RateLimiter intercepting requests.
Show the RateLimiter querying a TokenBucketStore (Redis database).
Model the two routing flows:
- Allowed path: If tokens are available ($> 0$), decrement token count and forward the request to the InternalService.
- Blocked path: If the bucket is empty ($= 0$), return HTTP 429 (Too Many Requests) back to the client.
Draw this gateway architecture using the diagram editor's graph syntax.

7. Practice Challenge Template

Use this template in your sandbox to model the API Gateway traffic shaping system:

graph TD
    subgraph Legacy_Direct [Legacy Direct Access]
        ClientAppDirect[Client App] -->|Direct Raw Calls| InternalServiceDirect[Internal Service]
        style ClientAppDirect fill:#faa,stroke:#333,stroke-width:2px
        style InternalServiceDirect fill:#faa,stroke:#333,stroke-width:2px
    end

    subgraph Target_Gateway [Target Edge Gateway Architecture]
        ClientApp[Client App] -->|1. HTTP Request| APIGateway[Edge API Gateway]
        
        subgraph Gateway_Boundary [Gateway Boundary]
            APIGateway -->|2. Check Limit| RateLimiter[Rate Limiter]
            RateLimiter -->|3. Query Keys| RedisStore[(Redis Token Store)]
        end
        
        RateLimiter -->|4. If Tokens > 0 Forward| InternalService[Internal Service]
        RateLimiter -->|4. If Tokens = 0 Block| ErrorResponse[HTTP 429 Too Many Requests]
        
        style ClientApp fill:#9f9,stroke:#333,stroke-width:2px
        style APIGateway fill:#9ff,stroke:#333,stroke-width:3px
        style RateLimiter fill:#9ff,stroke:#333,stroke-width:2px
        style RedisStore fill:#9ff,stroke:#333,stroke-width:2px
        style InternalService fill:#9f9,stroke:#333,stroke-width:2px
        style ErrorResponse fill:#f99,stroke:#333,stroke-width:2px
    end

NEXT MODULE BRIDGE: Shielding your service boundaries with edge gateways manages incoming traffic volume, but internal network outages and database timeouts will still occur. Proceed to Module 14: Fault Tolerance & Self-Healing Infrastructure to discover how to build resilient systems using Circuit Breakers, Bulkheads, and Graceful Degradation strategies.
