API gateways serve as the front door to your microservices, handling cross-cutting concerns like authentication, rate limiting, and request routing. This guide covers patterns for designing effective API gateways.
## What Is an API Gateway?

### The Problem
Without a gateway, clients must:
- Know addresses of all services
- Handle authentication with each service
- Manage different protocols
- Implement retry logic everywhere
This leads to duplicated logic and tight coupling between clients and your internal service architecture.
```
Client → Service A (auth, rate limit, logging)
       → Service B (auth, rate limit, logging)
       → Service C (auth, rate limit, logging)
```
### The Solution
An API gateway centralizes these concerns, providing a single entry point for all client requests.
```
Client → API Gateway → Service A
                     → Service B
                     → Service C
```
The gateway handles cross-cutting concerns once.
## Core Responsibilities

### Request Routing
The gateway's most fundamental job is routing requests to the appropriate backend service based on the URL path. This configuration maps URL prefixes to internal service addresses.
```yaml
# Kong configuration
services:
  - name: users-service
    url: http://users:8080
    routes:
      - paths: ["/api/users"]
  - name: orders-service
    url: http://orders:8080
    routes:
      - paths: ["/api/orders"]
  - name: products-service
    url: http://products:8080
    routes:
      - paths: ["/api/products"]
```
This routing abstraction lets you change internal service locations without affecting clients.
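Stripped of Kong's machinery, this kind of routing is a prefix lookup over a routing table (Kong's actual matching rules are more sophisticated, e.g. longest-prefix wins). A minimal first-match sketch, with the same illustrative service addresses:

```javascript
// Routing table mirroring the Kong configuration above
const routes = [
  { prefix: '/api/users', target: 'http://users:8080' },
  { prefix: '/api/orders', target: 'http://orders:8080' },
  { prefix: '/api/products', target: 'http://products:8080' },
];

// Return the upstream target for a request path, or null if no route matches
function resolveTarget(path) {
  const route = routes.find((r) => path.startsWith(r.prefix));
  return route ? route.target : null;
}

console.log(resolveTarget('/api/orders/42')); // → http://orders:8080
```

Because clients only ever see the prefixes, the `target` values can change freely as services move.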
### Authentication
Centralized authentication at the gateway means individual services trust that incoming requests are already authenticated. This simplifies service development significantly.
```yaml
# Verify JWT at gateway
plugins:
  - name: jwt
    config:
      secret_is_base64: false
      claims_to_verify:
        - exp
      header_names:
        - Authorization
```
For more complex authentication logic, you can implement it as middleware in your gateway application. The following example validates JWTs and extracts user information for downstream services.
```php
// Laravel gateway middleware
class GatewayAuthentication
{
    public function handle(Request $request, Closure $next)
    {
        $token = $request->bearerToken();

        if (!$token) {
            return response()->json(['error' => 'Unauthorized'], 401);
        }

        try {
            // firebase/php-jwt v6+ takes a Key object with the algorithm
            $payload = JWT::decode($token, new Key($this->publicKey, 'RS256'));
            $request->attributes->set('user_id', $payload->sub);
            $request->attributes->set('scopes', $payload->scopes);
        } catch (Exception $e) {
            return response()->json(['error' => 'Invalid token'], 401);
        }

        return $next($request);
    }
}
```
Notice how the middleware attaches user information to the request attributes. Backend services can then access this data without re-validating the token.
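When the gateway proxies the request onward, one common approach is to forward that identity as headers on the upstream request; the header names below (`x-user-id`, `x-scopes`) are assumptions, not a standard. A backend might then read them like this (sketch):

```javascript
// Hypothetical backend helper that trusts identity forwarded by the gateway.
// Header names are illustrative; pick one convention and enforce it everywhere.
function currentUser(headers) {
  const userId = headers['x-user-id'];
  if (!userId) {
    // The request bypassed the gateway (or the gateway failed to set identity)
    return { error: 'missing gateway identity', status: 401 };
  }
  return {
    id: userId,
    scopes: (headers['x-scopes'] || '').split(',').filter(Boolean),
  };
}

console.log(currentUser({ 'x-user-id': '123', 'x-scopes': 'orders:read' }));
```

For this trust to be safe, backends must only be reachable through the gateway (e.g. via network policy), so clients cannot forge these headers directly.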
### Rate Limiting
Rate limiting protects your services from abuse and ensures fair resource allocation among clients. Configure limits based on your capacity and business requirements.
```yaml
# Kong rate limiting
plugins:
  - name: rate-limiting
    config:
      minute: 100
      hour: 1000
      policy: redis
      redis_host: redis
```
Here is a Laravel implementation that supports different limits for authenticated and anonymous users.
```php
// Laravel rate limiting
class GatewayRateLimiter
{
    public function handle(Request $request, Closure $next)
    {
        $key = $this->resolveKey($request);
        $limit = $this->getLimit($request);

        if (RateLimiter::tooManyAttempts($key, $limit)) {
            $retryAfter = RateLimiter::availableIn($key);

            return response()->json(
                ['error' => 'Rate limit exceeded'],
                429,
                ['Retry-After' => $retryAfter]
            );
        }

        RateLimiter::hit($key, 60);

        return $next($request);
    }

    private function resolveKey(Request $request): string
    {
        $userId = $request->attributes->get('user_id');

        return $userId ? "user:{$userId}" : "ip:{$request->ip()}";
    }

    private function getLimit(Request $request): int
    {
        // Authenticated users get a higher limit (values are illustrative)
        return $request->attributes->get('user_id') ? 100 : 20;
    }
}
```
The Retry-After header tells clients exactly when they can retry, which is important for well-behaved API clients.
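Under the hood, a basic limiter counts attempts per key within a time window and derives Retry-After from the window's reset time. A fixed-window sketch (limit and window size are illustrative):

```javascript
// Fixed-window rate limiter: N attempts per key per window
class FixedWindowLimiter {
  constructor(limit, windowSeconds) {
    this.limit = limit;
    this.windowMs = windowSeconds * 1000;
    this.windows = new Map(); // key → { start, count }
  }

  // Record one attempt; returns { allowed, retryAfter (seconds) }
  hit(key, now = Date.now()) {
    let w = this.windows.get(key);
    if (!w || now - w.start >= this.windowMs) {
      // Start a fresh window for this key
      w = { start: now, count: 0 };
      this.windows.set(key, w);
    }
    if (w.count >= this.limit) {
      // Seconds until the window resets — exactly the Retry-After value
      const retryAfter = Math.ceil((w.start + this.windowMs - now) / 1000);
      return { allowed: false, retryAfter };
    }
    w.count += 1;
    return { allowed: true, retryAfter: 0 };
  }
}

const limiter = new FixedWindowLimiter(2, 60);
limiter.hit('user:1');
limiter.hit('user:1');
console.log(limiter.hit('user:1')); // → { allowed: false, retryAfter: 60 }
```

Fixed windows allow bursts at window boundaries; sliding-window or token-bucket variants smooth this out at the cost of more bookkeeping.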
### Request/Response Transformation
Sometimes you need to transform requests or responses to maintain backward compatibility or standardize formats. This middleware wraps legacy API responses in a consistent envelope format.
```php
// Transform legacy API response
class TransformLegacyResponse
{
    public function handle(Request $request, Closure $next)
    {
        $response = $next($request);

        if ($request->is('api/v1/*')) {
            $data = json_decode($response->getContent(), true);

            // Transform to new format
            $transformed = [
                'data' => $data['result'] ?? $data,
                'meta' => [
                    'timestamp' => now()->toIso8601String(),
                    'version' => 'v1',
                ],
            ];

            $response->setContent(json_encode($transformed));
        }

        return $response;
    }
}
```
This pattern is particularly useful during API migrations, allowing you to maintain v1 compatibility while internal services return v2 formats.
## Request Aggregation

### Backend for Frontend (BFF)
Different clients often have different data needs. A mobile app might need a compact dashboard view, while a web app needs more detail. The BFF pattern creates specialized gateways for each client type.
```
Mobile App → Mobile BFF → Users Service
                        → Orders Service
                        → Products Service

Web App → Web BFF → Users Service
                  → Orders Service
                  → Products Service
```
The following controller aggregates data from multiple services into a single response optimized for mobile clients. Using parallel requests minimizes latency.
```php
// Mobile BFF endpoint
class MobileDashboardController
{
    public function index(Request $request)
    {
        $userId = $request->attributes->get('user_id');

        // Parallel requests to backend services
        $responses = Http::pool(fn ($pool) => [
            $pool->get("http://users-service/users/{$userId}"),
            $pool->get("http://orders-service/users/{$userId}/orders?limit=5"),
            $pool->get("http://products-service/recommendations/{$userId}?limit=3"),
        ]);

        return response()->json([
            'user' => $responses[0]->json(),
            'recent_orders' => $responses[1]->json(),
            'recommendations' => $responses[2]->json(),
        ]);
    }
}
```
The Http::pool method executes all requests concurrently. Without this, three sequential 100ms requests would take 300ms total; with pooling, they complete in about 100ms.
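The same fan-out pattern in Node uses `Promise.all`; the fetchers below are stand-ins for real HTTP calls to the three services:

```javascript
// Aggregate a mobile dashboard from injected service fetchers.
// All requests start immediately, so total latency ≈ the slowest call.
async function fetchDashboard(userId, fetchers) {
  const [user, orders, recommendations] = await Promise.all([
    fetchers.user(userId),
    fetchers.orders(userId),
    fetchers.recommendations(userId),
  ]);
  return { user, recent_orders: orders, recommendations };
}
```

Injecting the fetchers also makes the aggregation trivially testable with fakes, which is useful given how many failure modes a fan-out endpoint has.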
### GraphQL Gateway
GraphQL provides another approach to request aggregation, letting clients specify exactly what data they need in a single query.
```graphql
# Single query aggregates multiple services
query Dashboard {
  user(id: "123") {           # Users service
    name
    email
  }
  orders(userId: "123") {     # Orders service
    id
    total
    status
  }
  recommendations(limit: 5) { # Products service
    id
    name
    price
  }
}
```
This approach shifts aggregation logic to the client, which can be more flexible but requires more sophisticated client implementations.
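On the gateway side, each top-level field typically maps to a resolver that calls one backend service. A sketch of such a resolver map, where `services` is a hypothetical set of backend clients injected at startup:

```javascript
// Gateway-side resolvers for the Dashboard query above.
// Each resolver delegates one field to one backend service client.
function buildResolvers(services) {
  return {
    user: ({ id }) => services.users.getUser(id),
    orders: ({ userId }) => services.orders.listForUser(userId),
    recommendations: ({ limit }) => services.products.recommend(limit),
  };
}
```

A GraphQL server then executes only the resolvers the client's query actually selects, which is what makes per-client aggregation flexible without per-client endpoints.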
## Circuit Breaker

### Preventing Cascade Failures

When a backend service fails, you do not want the gateway to keep sending it requests that pile up and time out. A circuit breaker detects failures and fails fast, giving the struggling service time to recover.
```php
class CircuitBreakerMiddleware
{
    public function handle(Request $request, Closure $next)
    {
        $service = $this->resolveService($request);
        $circuitBreaker = $this->getCircuitBreaker($service);

        if ($circuitBreaker->isOpen()) {
            return $this->fallbackResponse($service);
        }

        try {
            $response = $next($request);

            if ($response->isServerError()) {
                $circuitBreaker->recordFailure();
            } else {
                $circuitBreaker->recordSuccess();
            }

            return $response;
        } catch (Exception $e) {
            $circuitBreaker->recordFailure();

            return $this->fallbackResponse($service);
        }
    }

    private function fallbackResponse(string $service)
    {
        return response()->json([
            'error' => 'Service temporarily unavailable',
            'service' => $service,
        ], 503);
    }
}
```
The circuit breaker has three states: closed (normal operation), open (failing fast), and half-open (testing if recovery occurred). This pattern prevents a single failing service from bringing down your entire system.
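The middleware leaves the breaker object itself undefined; a minimal sketch implementing those three states might look like this (the threshold and timeout values are illustrative):

```javascript
// Circuit breaker with closed → open → half-open transitions
class CircuitBreaker {
  constructor({ failureThreshold = 5, resetTimeoutMs = 30000 } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.state = 'closed';
    this.failures = 0;
    this.openedAt = 0;
  }

  // Open circuits become half-open after the reset timeout,
  // letting one trial request through to test recovery.
  isOpen(now = Date.now()) {
    if (this.state === 'open' && now - this.openedAt >= this.resetTimeoutMs) {
      this.state = 'half-open';
    }
    return this.state === 'open';
  }

  recordSuccess() {
    // Any success (including the half-open trial) closes the circuit
    this.state = 'closed';
    this.failures = 0;
  }

  recordFailure(now = Date.now()) {
    this.failures += 1;
    // A failed half-open trial, or crossing the threshold, (re)opens it
    if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
      this.state = 'open';
      this.openedAt = now;
    }
  }
}
```

In a real gateway the breaker state usually lives in shared storage (e.g. Redis) so that all gateway instances agree on whether a service is failing.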
## Load Balancing

### Strategies
The gateway distributes traffic across multiple instances of backend services. Different strategies suit different scenarios.
```nginx
# Nginx upstream configuration
upstream backend {
    # Round Robin (default)
    server backend1:8080;
    server backend2:8080;
    server backend3:8080;
}

upstream backend_weighted {
    # Weighted
    server backend1:8080 weight=5;
    server backend2:8080 weight=3;
    server backend3:8080 weight=2;
}

upstream backend_least {
    # Least Connections
    least_conn;
    server backend1:8080;
    server backend2:8080;
}

upstream backend_hash {
    # IP Hash (sticky sessions)
    ip_hash;
    server backend1:8080;
    server backend2:8080;
}
```
Round robin works well for stateless services. Weighted distribution helps when servers have different capacities. Least connections is best when request processing times vary significantly. IP hash provides session affinity without explicit session management.
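Two of these strategies are small enough to sketch in code, which makes the trade-off concrete (server names are illustrative):

```javascript
// Round robin: cycle through servers in order, ignoring load
function roundRobin(servers) {
  let i = 0;
  return () => servers[i++ % servers.length];
}

// Least connections: pick the server with the fewest in-flight requests
function leastConnections(servers) {
  const active = new Map(servers.map((s) => [s, 0]));
  return {
    acquire() {
      // Sort by active count; stable sort keeps list order on ties
      const server = [...active.entries()].sort((a, b) => a[1] - b[1])[0][0];
      active.set(server, active.get(server) + 1);
      return server;
    },
    release(server) {
      active.set(server, active.get(server) - 1);
    },
  };
}
```

Round robin needs no bookkeeping at all; least connections has to track every request's start and finish, which is exactly why it pays off only when request durations vary.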
## Protocol Translation

### REST to gRPC
Your gateway can expose REST APIs while backend services use more efficient protocols like gRPC. This gives clients a familiar REST interface while your internal services benefit from gRPC's performance.
```php
class GrpcBridgeController
{
    public function getUser(Request $request, $id)
    {
        $client = new UserServiceClient('users-service:50051', [
            'credentials' => ChannelCredentials::createInsecure(),
        ]);

        $grpcRequest = new GetUserRequest();
        $grpcRequest->setId($id);

        [$response, $status] = $client->GetUser($grpcRequest)->wait();

        if ($status->code !== \Grpc\STATUS_OK) {
            return response()->json(['error' => $status->details], 500);
        }

        return response()->json([
            'id' => $response->getId(),
            'name' => $response->getName(),
            'email' => $response->getEmail(),
        ]);
    }
}
```
This translation layer isolates clients from protocol changes and lets you evolve internal communication patterns independently.
### WebSocket Proxy
WebSocket connections require special handling since they upgrade from HTTP and maintain long-lived connections. The gateway authenticates the initial connection then proxies the WebSocket traffic.
```javascript
// Gateway handles WebSocket upgrade
const http = require('http');
const httpProxy = require('http-proxy');

const proxy = httpProxy.createProxyServer({
  target: 'ws://notifications-service:8080',
  ws: true,
});

const server = http.createServer();

server.on('upgrade', (req, socket, head) => {
  // Authenticate the WebSocket connection before proxying
  const token = new URL(req.url, 'http://gateway').searchParams.get('token');
  if (!validateToken(token)) {
    socket.destroy();
    return;
  }
  proxy.ws(req, socket, head);
});

server.listen(8080);
```
Authenticating at connection time rather than per-message significantly reduces overhead for high-frequency WebSocket communication.
## Caching

### Response Caching
The gateway can cache responses to reduce backend load and improve latency. Only cache GET requests and respect cache headers from backend services.
```php
class GatewayCacheMiddleware
{
    public function handle(Request $request, Closure $next)
    {
        if ($request->method() !== 'GET') {
            return $next($request);
        }

        $cacheKey = $this->cacheKey($request);
        $cached = Cache::get($cacheKey);

        if ($cached) {
            return response($cached['body'], 200, $cached['headers'])
                ->header('X-Cache', 'HIT');
        }

        $response = $next($request);

        if ($response->isSuccessful() && $this->isCacheable($response)) {
            $ttl = $this->parseCacheControl($response);

            Cache::put($cacheKey, [
                'body' => $response->getContent(),
                'headers' => $response->headers->all(),
            ], $ttl);
        }

        return $response->header('X-Cache', 'MISS');
    }
}
```
The X-Cache header helps with debugging and lets clients know whether they received a cached response.
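The `parseCacheControl` helper is left undefined above; one plausible implementation derives the TTL from the backend's `Cache-Control` header and refuses to cache responses marked `no-store`, `no-cache`, or `private` (the default TTL here is illustrative):

```javascript
// Derive a cache TTL in seconds from a Cache-Control header value.
// Returns 0 for uncacheable responses.
function ttlFromCacheControl(header, defaultTtl = 60) {
  if (!header) return defaultTtl;
  // Respect backend directives that forbid shared caching
  if (/\b(no-store|no-cache|private)\b/.test(header)) return 0;
  const match = header.match(/max-age=(\d+)/);
  return match ? parseInt(match[1], 10) : defaultTtl;
}

console.log(ttlFromCacheControl('public, max-age=300')); // → 300
```

Letting backends drive TTLs this way keeps cache policy with the team that owns the data, rather than hardcoding it in the gateway.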
## Observability

### Request Logging
Comprehensive request logging at the gateway provides visibility into all API traffic. Include correlation IDs to trace requests across services.
```php
class RequestLoggingMiddleware
{
    public function handle(Request $request, Closure $next)
    {
        $requestId = Str::uuid()->toString();
        $request->headers->set('X-Request-ID', $requestId);

        $startTime = microtime(true);
        $response = $next($request);
        $duration = (microtime(true) - $startTime) * 1000;

        Log::info('API Request', [
            'request_id' => $requestId,
            'method' => $request->method(),
            'path' => $request->path(),
            'status' => $response->status(),
            'duration_ms' => round($duration, 2),
            'user_id' => $request->attributes->get('user_id'),
            'ip' => $request->ip(),
        ]);

        return $response->header('X-Request-ID', $requestId);
    }
}
```
Returning the request ID in the response header lets clients include it in support requests, making debugging much easier.
### Distributed Tracing
Distributed tracing follows a request through multiple services. The gateway initiates the trace and passes trace context to backend services.
```php
class TracingMiddleware
{
    public function handle(Request $request, Closure $next)
    {
        $traceId = $request->header('X-Trace-ID') ?? Str::uuid()->toString();
        $spanId = Str::uuid()->toString();

        $request->headers->set('X-Trace-ID', $traceId);
        $request->headers->set('X-Span-ID', $spanId);

        $response = $next($request);

        return $response
            ->header('X-Trace-ID', $traceId)
            ->header('X-Span-ID', $spanId);
    }
}
```
When combined with tracing in backend services, this enables powerful debugging capabilities like seeing exactly which service caused a slow request.
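A backend service participating in the trace would reuse the incoming trace ID and mint a fresh span ID for its own outbound calls; the `X-Parent-Span-ID` header below is an assumed name for linking spans, not something defined earlier in this guide:

```javascript
// Sketch: build outbound headers that continue an incoming trace.
// The trace ID stays constant end to end; each hop gets its own span.
function childTraceHeaders(incoming, newSpanId) {
  return {
    'X-Trace-ID': incoming['X-Trace-ID'],      // same trace across all hops
    'X-Parent-Span-ID': incoming['X-Span-ID'], // links this span to its caller
    'X-Span-ID': newSpanId,                    // this service's own span
  };
}
```

Standards like W3C Trace Context (`traceparent`) formalize exactly this propagation, so prefer them over ad hoc headers when your tooling supports it.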
## Gateway Technologies
| Technology | Type | Best For |
|---|---|---|
| Kong | Full-featured | Enterprise, plugins |
| AWS API Gateway | Managed | AWS-native |
| Nginx | Lightweight | Simple routing |
| Envoy | Service mesh | Kubernetes |
| Laravel/Express | Custom | BFF, custom logic |
## Best Practices
- **Keep gateways stateless**: scale horizontally
- **Implement health checks**: monitor backend services
- **Use circuit breakers**: prevent cascade failures
- **Cache aggressively**: reduce backend load
- **Log everything**: distributed tracing is essential
- **Version your APIs**: support gradual migration
- **Secure by default**: authenticate all requests
## Conclusion
API gateways simplify client interactions with microservices by centralizing cross-cutting concerns. Start with basic routing and authentication, then add rate limiting, caching, and circuit breakers as needed. Choose between managed services for simplicity or custom implementations for flexibility. The gateway becomes critical infrastructure; invest in monitoring and high availability.