Rate Limiting Guide
Overview
Rate limiting protects your MCP server from abuse by restricting the number of requests users can make within a time window. NitroStack provides built-in rate limiting via the @RateLimit decorator.
Basic Rate Limiting
Using @RateLimit Decorator
import { Tool, RateLimit } from 'nitrostack';
@Tool({ name: 'send_email' })
@RateLimit({ requests: 10, window: '1m' }) // 10 requests per minute
async sendEmail(input: any, ctx: ExecutionContext) {
await this.emailService.send(input);
return { success: true };
}
Rate Limit Options
interface RateLimitOptions {
requests: number; // Max requests allowed
window: string; // Time window ('1m', '1h', '1d')
key?: (ctx: ExecutionContext) => string; // Custom rate limit key
message?: string; // Custom error message
skipSuccessfulRequests?: boolean; // Only count failed requests
skipFailedRequests?: boolean; // Only count successful requests
}
Time Windows
Common Windows
// Per minute
@RateLimit({ requests: 60, window: '1m' })
// Per hour
@RateLimit({ requests: 1000, window: '1h' })
// Per day
@RateLimit({ requests: 10000, window: '1d' })
// Per week
@RateLimit({ requests: 50000, window: '7d' })
Window Formats
'1s' // 1 second
'30s' // 30 seconds
'1m' // 1 minute
'5m' // 5 minutes
'1h' // 1 hour
'12h' // 12 hours
'1d' // 1 day
'7d' // 7 days
Rate Limit Keys
Default (IP-Based)
// Limits by IP address
@RateLimit({ requests: 100, window: '1h' })
User-Based
@Tool({ name: 'create_post' })
@UseGuards(JWTGuard)
@RateLimit({
requests: 50,
window: '1h',
key: (ctx) => ctx.auth?.subject || 'anonymous'
})
async createPost(input: any, ctx: ExecutionContext) {
// Each user has their own limit
}
API Key-Based
@Tool({ name: 'api_call' })
@UseGuards(ApiKeyGuard)
@RateLimit({
requests: 1000,
window: '1h',
key: (ctx) => ctx.auth?.keyId || 'unknown'
})
async apiCall(input: any, ctx: ExecutionContext) {
// Each API key has its own limit
}
Custom Key
@Tool({ name: 'search' })
@RateLimit({
requests: 10,
window: '1m',
key: (ctx) => {
const userId = ctx.auth?.subject;
const endpoint = ctx.toolName;
return `${userId}:${endpoint}`;
}
})
async search(input: any, ctx: ExecutionContext) {
// Limit per user per endpoint
}
Tiered Rate Limits
By User Role
@Tool({ name: 'api_request' })
@UseGuards(JWTGuard)
@RateLimit({
requests: (ctx) => {
const role = ctx.auth?.role;
if (role === 'premium') return 10000;
if (role === 'pro') return 1000;
return 100; // free tier
},
window: '1h'
})
async apiRequest(input: any, ctx: ExecutionContext) {
// Different limits based on subscription
}
By Plan
const RATE_LIMITS = {
free: { requests: 100, window: '1h' },
basic: { requests: 1000, window: '1h' },
premium: { requests: 10000, window: '1h' },
enterprise: { requests: 100000, window: '1h' }
};
@Tool({ name: 'advanced_feature' })
@UseGuards(JWTGuard)
@RateLimit((ctx) => {
const plan = ctx.auth?.plan || 'free';
return RATE_LIMITS[plan];
})
async advancedFeature(input: any, ctx: ExecutionContext) {
// Dynamic limits based on plan
}
Multiple Rate Limits
Stacked Limits
@Tool({ name: 'expensive_operation' })
@RateLimit({ requests: 10, window: '1m' }) // Per minute
@RateLimit({ requests: 100, window: '1h' }) // Per hour
@RateLimit({ requests: 1000, window: '1d' }) // Per day
async expensiveOperation(input: any) {
// Must pass all rate limit checks
}
Error Handling
Custom Error Messages
@RateLimit({
requests: 10,
window: '1m',
message: 'Too many requests. Please wait before trying again.'
})
With Retry Information
@RateLimit({
requests: 10,
window: '1m',
message: (remaining, resetAt) =>
`Rate limit exceeded. ${remaining} requests remaining. Resets at ${resetAt}`
})
Advanced Patterns
Burst Allowance
@Injectable()
export class BurstRateLimiter {
@RateLimit({ requests: 10, window: '1s' }) // Burst
@RateLimit({ requests: 100, window: '1m' }) // Sustained
async handleRequest() {
// Allows bursts but limits sustained load
}
}
Adaptive Rate Limiting
@Injectable()
export class AdaptiveRateLimiter {
private systemLoad = 0;
@RateLimit({
requests: (ctx) => {
// Reduce limits under high load
// NOTE(review): decorator options are evaluated when the class is defined,
// so `this` is not bound to the instance here — in real code, bridge the
// load value through a shared service or ctx.metadata instead
if (this.systemLoad > 0.8) return 50;
if (this.systemLoad > 0.5) return 100;
return 200;
},
window: '1m'
})
async handleRequest() {
// Limits adjust based on system load
}
}
Geographic Rate Limiting
@RateLimit({
requests: (ctx) => {
const region = ctx.metadata.region;
// Higher limits for preferred regions
if (region === 'us-east') return 1000;
return 100;
},
window: '1h'
})
Storage Backends
In-Memory (Default)
// Fast but not distributed
// Lost on restart
// Single-server only
Redis
import { createClient } from 'redis';
@Injectable()
export class RedisRateLimiter {
private client = createClient({
url: process.env.REDIS_URL
});
async checkLimit(key: string, limit: number, window: number): Promise<boolean> {
const current = await this.client.incr(key);
if (current === 1) {
// First request, set expiry
await this.client.expire(key, window);
}
return current <= limit;
}
async getRemainingQuota(key: string, limit: number): Promise<number> {
const current = await this.client.get(key);
return limit - (parseInt(current || '0'));
}
}
Distributed Rate Limiting
@Injectable()
export class DistributedRateLimiter {
constructor(private redis: RedisService) {}
async checkLimit(
userId: string,
limit: number,
window: number
): Promise<boolean> {
const key = `rate_limit:${userId}`;
// Use Redis sliding window
const now = Date.now();
const windowStart = now - (window * 1000);
// Remove old entries
await this.redis.zremrangebyscore(key, 0, windowStart);
// Count current requests
const count = await this.redis.zcard(key);
if (count >= limit) {
return false;
}
// Add new request
await this.redis.zadd(key, now, `${now}-${Math.random()}`);
await this.redis.expire(key, window);
return true;
}
}
Response Headers
Include Rate Limit Info
@Tool({ name: 'api_endpoint' })
@RateLimit({ requests: 100, window: '1h' })
async apiEndpoint(input: any, ctx: ExecutionContext) {
const result = await this.processRequest(input);
// Add rate limit headers
ctx.metadata.rateLimitLimit = 100;
ctx.metadata.rateLimitRemaining = await this.getRemainingQuota(ctx);
ctx.metadata.rateLimitReset = await this.getResetTime(ctx);
return result;
}
Monitoring
Track Rate Limit Events
@Tool({ name: 'monitored_tool' })
@RateLimit({ requests: 100, window: '1h' })
async monitoredTool(input: any, ctx: ExecutionContext) {
try {
return await this.process(input);
} catch (error) {
// Note: the @RateLimit decorator rejects before this body runs, so this
// branch catches rate-limit errors raised by downstream calls in process()
if (error.code === 'RATE_LIMIT_EXCEEDED') {
ctx.emit('rate_limit.exceeded', {
userId: ctx.auth?.subject,
tool: ctx.toolName,
limit: 100
});
}
throw error;
}
}
Metrics Collection
@Injectable()
export class RateLimitMetrics {
private exceeded = 0;
private allowed = 0;
@OnEvent('rate_limit.exceeded')
handleExceeded() {
this.exceeded++;
}
@OnEvent('rate_limit.allowed')
handleAllowed() {
this.allowed++;
}
getMetrics() {
const total = this.exceeded + this.allowed;
return {
exceeded: this.exceeded,
allowed: this.allowed,
rejectionRate: total > 0 ? this.exceeded / total : 0
};
}
}
Best Practices
1. Set Appropriate Limits
// ✅ Good - Match resource consumption
@RateLimit({ requests: 1, window: '5s' }) // Very expensive operation
@RateLimit({ requests: 100, window: '1h' }) // Moderate operation
@RateLimit({ requests: 1000, window: '1h' }) // Light operation
// ❌ Avoid - Too restrictive or too lenient
@RateLimit({ requests: 1, window: '1h' }) // Too strict
@RateLimit({ requests: 1000000, window: '1s' }) // Too lenient
2. Use Per-User Limits
// ✅ Good - Per user
@RateLimit({
requests: 100,
window: '1h',
key: (ctx) => ctx.auth?.subject || ctx.metadata.ip
})
// ❌ Avoid - Keyless default (IP-based): users behind a shared IP/NAT pool one limit, and spoofable IPs weaken abuse protection
@RateLimit({ requests: 1000, window: '1h' })
3. Provide Clear Errors
// ✅ Good - Helpful message
@RateLimit({
requests: 10,
window: '1m',
message: 'Rate limit: 10 requests per minute. Please slow down.'
})
// ❌ Avoid - Generic message
@RateLimit({
requests: 10,
window: '1m',
message: 'Error'
})
4. Monitor and Adjust
// Track metrics
@OnEvent('rate_limit.exceeded')
async handleExceeded(data: any) {
await this.metrics.record('rate_limit_exceeded', {
userId: data.userId,
endpoint: data.tool
});
// Alert if too many users hitting limits
if (await this.metrics.getExceededRate() > 0.1) {
await this.alerts.send('Rate limits may be too strict');
}
}
5. Implement Graceful Degradation
@Tool({ name: 'search' })
@RateLimit({ requests: 100, window: '1h' })
async search(input: any, ctx: ExecutionContext) {
try {
return await this.fullSearch(input);
} catch (error) {
if (error.code === 'RATE_LIMIT_EXCEEDED') {
// Fall back to basic search when a downstream dependency (e.g. the full
// search backend) is rate-limited; the tool's own @RateLimit rejection
// happens before this body executes
return await this.basicSearch(input);
}
throw error;
}
}
Common Patterns
Email Sending
@Tool({ name: 'send_email' })
@RateLimit({ requests: 10, window: '1m' }) // Per minute
@RateLimit({ requests: 100, window: '1h' }) // Per hour
@RateLimit({ requests: 500, window: '1d' }) // Per day
async sendEmail(input: any) {
// Prevent email spam
}
API Calls
@Tool({ name: 'external_api' })
@RateLimit({
requests: 50,
window: '1m',
key: (ctx) => ctx.auth?.apiKey || 'anonymous'
})
async callExternalApi(input: any) {
// Comply with external API limits
}
File Uploads
@Tool({ name: 'upload_file' })
@RateLimit({ requests: 5, window: '1m' }) // Prevent abuse
async uploadFile(input: any) {
// Limit upload frequency
}
Troubleshooting
Users Hitting Limits
- Check if limits are too strict
- Verify window is appropriate
- Consider tiered plans
- Monitor legitimate usage patterns
Limits Not Working
- Verify decorator is applied
- Check rate limit key is correct
- Ensure storage backend is working
- Test with multiple requests
Performance Issues
- Use Redis for distributed systems
- Implement sliding windows
- Clean up expired keys
- Monitor storage size
Next Steps
Pro Tip: Start with generous limits and tighten based on actual usage patterns and resource availability!