API Gateway Design Patterns

Reverend Philip · Dec 23, 2025 · 9 min read

Implement API gateways for microservices. Handle routing, authentication, rate limiting, and request aggregation.

API gateways serve as the front door to your microservices, handling cross-cutting concerns like authentication, rate limiting, and request routing. This guide covers patterns for designing effective API gateways.

What Is an API Gateway?

The Problem

Without a gateway, clients must:

  • Know addresses of all services
  • Handle authentication with each service
  • Manage different protocols
  • Implement retry logic everywhere

This leads to duplicated logic and tight coupling between clients and your internal service architecture.

Client → Service A (auth, rate limit, logging)
      → Service B (auth, rate limit, logging)
      → Service C (auth, rate limit, logging)

The Solution

An API gateway centralizes these concerns, providing a single entry point for all client requests.

Client → API Gateway → Service A
                    → Service B
                    → Service C

The gateway handles cross-cutting concerns once.

Core Responsibilities

Request Routing

The gateway's most fundamental job is routing requests to the appropriate backend service based on the URL path. This configuration maps URL prefixes to internal service addresses.

# Kong configuration
services:
  - name: users-service
    url: http://users:8080
    routes:
      - paths: ["/api/users"]

  - name: orders-service
    url: http://orders:8080
    routes:
      - paths: ["/api/orders"]

  - name: products-service
    url: http://products:8080
    routes:
      - paths: ["/api/products"]

This routing abstraction lets you change internal service locations without affecting clients.
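The same prefix-to-upstream mapping can be sketched in a few lines of plain JavaScript. This is a minimal illustration, not a replacement for a real gateway; the service addresses mirror the Kong config above.

```javascript
// Minimal prefix router: the most specific (longest) matching
// prefix wins, mirroring the Kong route table above.
const routes = [
  { prefix: '/api/users',    upstream: 'http://users:8080' },
  { prefix: '/api/orders',   upstream: 'http://orders:8080' },
  { prefix: '/api/products', upstream: 'http://products:8080' },
];

function resolveUpstream(path) {
  // Sort a copy by prefix length so longer prefixes match first
  const match = [...routes]
    .sort((a, b) => b.prefix.length - a.prefix.length)
    .find((r) => path.startsWith(r.prefix));
  return match ? match.upstream : null; // null → no route, return 404
}
```

For example, `resolveUpstream('/api/orders/42')` returns `http://orders:8080`, while an unknown path returns `null` so the gateway can respond with 404 itself.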

Authentication

Centralized authentication at the gateway means individual services trust that incoming requests are already authenticated. This simplifies service development significantly.

# Verify JWT at gateway
plugins:
  - name: jwt
    config:
      secret_is_base64: false
      claims_to_verify:
        - exp
      header_names:
        - Authorization

For more complex authentication logic, you can implement it as middleware in your gateway application. The following example validates JWTs and extracts user information for downstream services.

// Laravel gateway middleware
class GatewayAuthentication
{
    public function handle(Request $request, Closure $next)
    {
        $token = $request->bearerToken();

        if (!$token) {
            return response()->json(['error' => 'Unauthorized'], 401);
        }

        try {
            // firebase/php-jwt v6+ takes the algorithm via a Key object
            // instead of the old array-of-algorithms third argument
            $payload = JWT::decode($token, new Key($this->publicKey, 'RS256'));
            $request->attributes->set('user_id', $payload->sub);
            $request->attributes->set('scopes', $payload->scopes);
        } catch (Exception $e) {
            return response()->json(['error' => 'Invalid token'], 401);
        }

        return $next($request);
    }
}

Notice how the middleware attaches user information to the request attributes. Backend services can then access this data without re-validating the token.
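One common way to hand those attributes to backend services is as trusted headers set by the gateway when it proxies the request. The header names below (`X-User-Id`, `X-Scopes`) are an illustrative convention, not a Kong or Laravel standard; backends must only trust them on traffic that arrives via the gateway.

```javascript
// Sketch: copy authenticated request attributes into headers before
// proxying, so backends read identity without re-validating the JWT.
function forwardHeaders(attributes) {
  return {
    'X-User-Id': String(attributes.user_id),
    'X-Scopes': attributes.scopes.join(' '),
  };
}
```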

Rate Limiting

Rate limiting protects your services from abuse and ensures fair resource allocation among clients. Configure limits based on your capacity and business requirements.

# Kong rate limiting
plugins:
  - name: rate-limiting
    config:
      minute: 100
      hour: 1000
      policy: redis
      redis_host: redis

Here is a Laravel implementation that supports different limits for authenticated and anonymous users.

// Laravel rate limiting
class GatewayRateLimiter
{
    public function handle(Request $request, Closure $next)
    {
        $key = $this->resolveKey($request);
        $limit = $this->getLimit($request);

        if (RateLimiter::tooManyAttempts($key, $limit)) {
            $retryAfter = RateLimiter::availableIn($key);
            return response()->json(
                ['error' => 'Rate limit exceeded'],
                429,
                ['Retry-After' => $retryAfter]
            );
        }

        RateLimiter::hit($key, 60);

        return $next($request);
    }

    private function resolveKey(Request $request): string
    {
        $userId = $request->attributes->get('user_id');
        return $userId ? "user:{$userId}" : "ip:{$request->ip()}";
    }

    private function getLimit(Request $request): int
    {
        // Authenticated users get a higher per-minute limit
        // (the values here are illustrative)
        return $request->attributes->get('user_id') ? 100 : 20;
    }
}

The Retry-After header tells clients exactly when they can retry, letting well-behaved API clients back off instead of polling the endpoint.

Request/Response Transformation

Sometimes you need to transform requests or responses to maintain backward compatibility or standardize formats. This middleware wraps legacy API responses in a consistent envelope format.

// Transform legacy API response
class TransformLegacyResponse
{
    public function handle(Request $request, Closure $next)
    {
        $response = $next($request);

        if ($request->is('api/v1/*')) {
            $data = json_decode($response->getContent(), true);

            // Transform to new format
            $transformed = [
                'data' => $data['result'] ?? $data,
                'meta' => [
                    'timestamp' => now()->toIso8601String(),
                    'version' => 'v1',
                ],
            ];

            $response->setContent(json_encode($transformed));
        }

        return $response;
    }
}

This pattern is particularly useful during API migrations, allowing you to maintain v1 compatibility while internal services return v2 formats.

Request Aggregation

Backend for Frontend (BFF)

Different clients often have different data needs. A mobile app might need a compact dashboard view, while a web app needs more detail. The BFF pattern creates specialized gateways for each client type.

Mobile App → Mobile BFF → Users Service
                       → Orders Service
                       → Products Service

Web App → Web BFF → Users Service
                 → Orders Service
                 → Products Service

The following controller aggregates data from multiple services into a single response optimized for mobile clients. Using parallel requests minimizes latency.

// Mobile BFF endpoint
class MobileDashboardController
{
    public function index(Request $request)
    {
        $userId = $request->attributes->get('user_id');

        // Parallel requests to backend services
        $responses = Http::pool(fn ($pool) => [
            $pool->get("http://users-service/users/{$userId}"),
            $pool->get("http://orders-service/users/{$userId}/orders?limit=5"),
            $pool->get("http://products-service/recommendations/{$userId}?limit=3"),
        ]);

        return response()->json([
            'user' => $responses[0]->json(),
            'recent_orders' => $responses[1]->json(),
            'recommendations' => $responses[2]->json(),
        ]);
    }
}

The Http::pool method executes all requests concurrently. Without this, three sequential 100ms requests would take 300ms total; with pooling, they complete in about 100ms.

GraphQL Gateway

GraphQL provides another approach to request aggregation, letting clients specify exactly what data they need in a single query.

# Single query aggregates multiple services
query Dashboard {
  user(id: "123") {          # Users service
    name
    email
  }
  orders(userId: "123") {    # Orders service
    id
    total
    status
  }
  recommendations(limit: 5) { # Products service
    id
    name
    price
  }
}

This approach shifts aggregation logic to the client, which can be more flexible but requires more sophisticated client implementations.

Circuit Breaker

Preventing Cascade Failures

When a backend service fails, you do not want the gateway to keep sending requests that pile up while it waits on timeouts. A circuit breaker detects failures and fails fast, giving the struggling service time to recover.

class CircuitBreakerMiddleware
{
    public function handle(Request $request, Closure $next)
    {
        $service = $this->resolveService($request);
        $circuitBreaker = $this->getCircuitBreaker($service);

        if ($circuitBreaker->isOpen()) {
            return $this->fallbackResponse($service);
        }

        try {
            $response = $next($request);

            if ($response->isServerError()) {
                $circuitBreaker->recordFailure();
            } else {
                $circuitBreaker->recordSuccess();
            }

            return $response;
        } catch (Exception $e) {
            $circuitBreaker->recordFailure();
            return $this->fallbackResponse($service);
        }
    }

    private function fallbackResponse(string $service)
    {
        return response()->json([
            'error' => 'Service temporarily unavailable',
            'service' => $service,
        ], 503);
    }
}

The circuit breaker has three states: closed (normal operation), open (failing fast), and half-open (testing if recovery occurred). This pattern prevents a single failing service from bringing down your entire system.
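The three-state transition logic can be sketched as a small class. The failure threshold and reset timeout are illustrative tunables, and the clock is injectable so the timeout behavior can be exercised without real waiting.

```javascript
// Sketch of the three states described above:
// closed → (failures reach threshold) → open
// open → (reset timeout elapses) → half-open
// half-open → closed on success, back to open on failure
class CircuitBreaker {
  constructor({ failureThreshold = 5, resetTimeoutMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;               // injectable clock for testing
    this.state = 'closed';
    this.failures = 0;
    this.openedAt = 0;
  }

  isOpen() {
    if (this.state === 'open' && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = 'half-open';   // let one trial request through
    }
    return this.state === 'open';
  }

  recordSuccess() {
    this.failures = 0;
    this.state = 'closed';
  }

  recordFailure() {
    this.failures += 1;
    if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
      this.state = 'open';        // fail fast until the timeout elapses
      this.openedAt = this.now();
    }
  }
}
```

A half-open breaker that sees another failure reopens immediately, so a still-broken backend only receives one probe per reset window.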

Load Balancing

Strategies

The gateway distributes traffic across multiple instances of backend services. Different strategies suit different scenarios.

# Nginx upstream configuration
upstream backend {
    # Round Robin (default)
    server backend1:8080;
    server backend2:8080;
    server backend3:8080;
}

upstream backend_weighted {
    # Weighted
    server backend1:8080 weight=5;
    server backend2:8080 weight=3;
    server backend3:8080 weight=2;
}

upstream backend_least {
    # Least Connections
    least_conn;
    server backend1:8080;
    server backend2:8080;
}

upstream backend_hash {
    # IP Hash (sticky sessions)
    ip_hash;
    server backend1:8080;
    server backend2:8080;
}

Round robin works well for stateless services. Weighted distribution helps when servers have different capacities. Least connections is best when request processing times vary significantly. IP hash provides session affinity without explicit session management.
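As a sketch, weighted selection can be implemented by expanding each server into weight-many rotation slots; the 5/3/2 split mirrors the nginx block above.

```javascript
// Sketch of weighted round-robin: each server appears in the rotation
// in proportion to its weight (same 5/3/2 split as the nginx config).
function weightedRoundRobin(servers) {
  const slots = servers.flatMap(({ host, weight }) => Array(weight).fill(host));
  let i = 0;
  return () => slots[i++ % slots.length]; // next pick in the rotation
}

const next = weightedRoundRobin([
  { host: 'backend1', weight: 5 },
  { host: 'backend2', weight: 3 },
  { host: 'backend3', weight: 2 },
]);
```

Note that this naive slot expansion sends several consecutive requests to the same server; nginx actually uses a smooth weighted round-robin variant that interleaves picks while preserving the same long-run proportions.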

Protocol Translation

REST to gRPC

Your gateway can expose REST APIs while backend services use more efficient protocols like gRPC. This gives clients a familiar REST interface while your internal services benefit from gRPC's performance.

class GrpcBridgeController
{
    public function getUser(Request $request, $id)
    {
        $client = new UserServiceClient('users-service:50051', [
            'credentials' => ChannelCredentials::createInsecure(),
        ]);

        $grpcRequest = new GetUserRequest();
        $grpcRequest->setId($id);

        [$response, $status] = $client->GetUser($grpcRequest)->wait();

        if ($status->code !== \Grpc\STATUS_OK) {
            return response()->json(['error' => $status->details], 500);
        }

        return response()->json([
            'id' => $response->getId(),
            'name' => $response->getName(),
            'email' => $response->getEmail(),
        ]);
    }
}

This translation layer isolates clients from protocol changes and lets you evolve internal communication patterns independently.

WebSocket Proxy

WebSocket connections require special handling since they upgrade from HTTP and maintain long-lived connections. The gateway authenticates the initial connection then proxies the WebSocket traffic.

// Gateway handles WebSocket upgrade
const WebSocket = require('ws');
const httpProxy = require('http-proxy');

const proxy = httpProxy.createProxyServer({
    target: 'ws://notifications-service:8080',
    ws: true
});

server.on('upgrade', (req, socket, head) => {
    // Authenticate the WebSocket connection before proxying;
    // parse the query string properly instead of splitting the URL
    const token = new URL(req.url, 'http://gateway').searchParams.get('token');
    if (!token || !validateToken(token)) {
        socket.destroy();
        return;
    }

    proxy.ws(req, socket, head);
});

Authenticating at connection time rather than per-message significantly reduces overhead for high-frequency WebSocket communication.

Caching

Response Caching

The gateway can cache responses to reduce backend load and improve latency. Only cache GET requests and respect cache headers from backend services.

class GatewayCacheMiddleware
{
    public function handle(Request $request, Closure $next)
    {
        if ($request->method() !== 'GET') {
            return $next($request);
        }

        $cacheKey = $this->cacheKey($request);
        $cached = Cache::get($cacheKey);

        if ($cached) {
            return response($cached['body'], 200, $cached['headers'])
                ->header('X-Cache', 'HIT');
        }

        $response = $next($request);

        if ($response->isSuccessful() && $this->isCacheable($response)) {
            $ttl = $this->parseCacheControl($response);
            Cache::put($cacheKey, [
                'body' => $response->getContent(),
                'headers' => $response->headers->all(),
            ], $ttl);
        }

        return $response->header('X-Cache', 'MISS');
    }

    private function isCacheable($response): bool
    {
        // Honor the backend's wishes: never cache private or no-store responses
        $cacheControl = $response->headers->get('Cache-Control', '');
        return !preg_match('/no-store|private/i', $cacheControl);
    }

    private function parseCacheControl($response): int
    {
        // Use the backend's max-age when present, otherwise a short default TTL
        $cacheControl = $response->headers->get('Cache-Control', '');
        return preg_match('/max-age=(\d+)/', $cacheControl, $m) ? (int) $m[1] : 60;
    }
}

The X-Cache header helps with debugging and lets clients know whether they received a cached response.

Observability

Request Logging

Comprehensive request logging at the gateway provides visibility into all API traffic. Include correlation IDs to trace requests across services.

class RequestLoggingMiddleware
{
    public function handle(Request $request, Closure $next)
    {
        $requestId = Str::uuid()->toString();
        $request->headers->set('X-Request-ID', $requestId);

        $startTime = microtime(true);

        $response = $next($request);

        $duration = (microtime(true) - $startTime) * 1000;

        Log::info('API Request', [
            'request_id' => $requestId,
            'method' => $request->method(),
            'path' => $request->path(),
            'status' => $response->status(),
            'duration_ms' => round($duration, 2),
            'user_id' => $request->attributes->get('user_id'),
            'ip' => $request->ip(),
        ]);

        return $response->header('X-Request-ID', $requestId);
    }
}

Returning the request ID in the response header lets clients include it in support requests, making debugging much easier.

Distributed Tracing

Distributed tracing follows a request through multiple services. The gateway initiates the trace and passes trace context to backend services.

class TracingMiddleware
{
    public function handle(Request $request, Closure $next)
    {
        $traceId = $request->header('X-Trace-ID') ?? Str::uuid()->toString();
        $spanId = Str::uuid()->toString();

        $request->headers->set('X-Trace-ID', $traceId);
        $request->headers->set('X-Span-ID', $spanId);

        $response = $next($request);

        return $response
            ->header('X-Trace-ID', $traceId)
            ->header('X-Span-ID', $spanId);
    }
}

When combined with tracing in backend services, this enables powerful debugging capabilities like seeing exactly which service caused a slow request.

Gateway Technologies

Technology        Type            Best For
----------------  --------------  -------------------
Kong              Full-featured   Enterprise, plugins
AWS API Gateway   Managed         AWS-native
Nginx             Lightweight     Simple routing
Envoy             Service mesh    Kubernetes
Laravel/Express   Custom          BFF, custom logic

Best Practices

  1. Keep gateways stateless - Scale horizontally
  2. Implement health checks - Monitor backend services
  3. Use circuit breakers - Prevent cascade failures
  4. Cache aggressively - Reduce backend load
  5. Log everything - Distributed tracing is essential
  6. Version your APIs - Support gradual migration
  7. Secure by default - Authenticate all requests
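Health checks (item 2 above) can be sketched as a gateway endpoint that probes each backend concurrently. The service list and the /health path are assumptions for illustration; real gateways usually run these probes on a timer rather than per request.

```javascript
// Sketch: aggregate backend health into a single gateway status map.
// The /health path and service names are illustrative assumptions.
// fetchFn is injectable so the probe logic can be tested offline.
async function checkBackends(services, fetchFn = fetch) {
  const entries = await Promise.all(
    services.map(async ({ name, url }) => {
      try {
        const res = await fetchFn(`${url}/health`);
        return [name, res.ok ? 'up' : 'down'];
      } catch {
        return [name, 'down']; // network error counts as down
      }
    })
  );
  return Object.fromEntries(entries);
}
```

The resulting map (e.g. `{ users: 'up', orders: 'down' }`) can feed both a gateway /status endpoint and the circuit breakers described earlier.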

Conclusion

API gateways simplify client interactions with microservices by centralizing cross-cutting concerns. Start with basic routing and authentication, then add rate limiting, caching, and circuit breakers as needed. Choose between managed services for simplicity or custom implementations for flexibility. The gateway becomes critical infrastructure; invest in monitoring and high availability.
