# Rate Limits
Understand and manage API rate limits to optimize your application's performance.
## Rate Limit Tiers
| Tier | Monthly Requests | Per-Minute Limit | Burst Allowance | Price |
|---|---|---|---|---|
| Free | 1,000 | 10 | 20 | $0 |
| Starter | 25,000 | 60 | 120 | $19/month |
| Growth | 150,000 | 120 | 240 | $79/month |
| Business | 500,000 | 300 | 600 | $199/month |
**Burst Allowance:** a short-term allowance above the per-minute limit for handling traffic spikes. Burst capacity refills gradually while usage stays below the base rate.
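The burst behaviour described above matches a classic token bucket: capacity equals the burst allowance, and tokens refill at the base per-minute rate. The server's actual algorithm is not documented, so the following is an illustrative sketch of the documented behaviour, not the API's implementation:

```javascript
// Token-bucket sketch of burst allowance: capacity = burst allowance,
// refillPerMinute = base per-minute limit. Illustrative only; the
// server-side algorithm may differ.
class TokenBucket {
  constructor(capacity, refillPerMinute, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerMs = refillPerMinute / 60000; // tokens regained per ms
    this.tokens = capacity;                     // start with full burst capacity
    this.lastRefill = now;
  }

  // Returns true if a request is allowed at time `now`, consuming one token.
  tryRequest(now = Date.now()) {
    const elapsed = now - this.lastRefill;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

On the Starter tier, for example, `new TokenBucket(120, 60)` would allow a burst of 120 requests, after which capacity refills at roughly one request per second.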
## Rate Limit Headers
Every API response includes headers to help you monitor your rate limit status:
```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 59
X-RateLimit-Reset: 1642246800
Retry-After: 60

{
  "id": "mod_1a2b3c4d5e6f",
  "status": "completed",
  ...
}
```

- `X-RateLimit-Limit`: your per-minute request limit for the current tier
- `X-RateLimit-Remaining`: requests remaining in the current minute window
- `X-RateLimit-Reset`: Unix timestamp when the rate limit resets
- `Retry-After`: seconds to wait before retrying (sent on 429 errors)
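A small helper can turn these headers into numbers your client logic can act on. The header names follow the response above; the helper itself is illustrative:

```javascript
// Read the documented rate-limit headers from a fetch Response (or any
// object with a Headers-like .get()). Missing headers come back as null.
function parseRateLimitHeaders(headers) {
  const num = (name) => {
    const value = headers.get(name);
    return value === null ? null : parseInt(value, 10);
  };
  return {
    limit: num('X-RateLimit-Limit'),         // per-minute limit for your tier
    remaining: num('X-RateLimit-Remaining'), // requests left in this window
    reset: num('X-RateLimit-Reset'),         // Unix timestamp of window reset
    retryAfter: num('Retry-After'),          // seconds to wait (429 only)
  };
}
```

For example, `const { remaining } = parseRateLimitHeaders(response.headers);` lets you throttle proactively before a 429 ever occurs.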
## Handling Rate Limits
### 429 Too Many Requests

When you exceed your rate limit, the API returns a 429 status code with a `Retry-After` header indicating how long to wait before making another request.
### ✅ Best Practices

- Implement exponential backoff
- Monitor rate limit headers
- Use request queues for high volume
- Respect `Retry-After` headers
- Cache responses when possible
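Caching deserves a concrete shape: identical moderation input yields identical results within a short window, so a TTL cache keyed on request content can avoid spending rate-limit budget on duplicates. The TTL value here is an assumption; pick one that matches how often your moderation policies change:

```javascript
// Minimal TTL cache for API responses, keyed by request content.
// Cached entries expire after ttlMs milliseconds.
class ResponseCache {
  constructor(ttlMs = 60000) {
    this.ttlMs = ttlMs;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  // Returns the cached value, or undefined if absent or expired.
  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= now) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value, now = Date.now()) {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}
```

Check the cache before calling the API, and store the parsed result after each successful response.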
### 💡 Optimization Tips

- Batch requests when possible
- Spread requests evenly
- Use webhooks for async processing
- Monitor usage patterns
- Consider higher tiers for growth
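Both batching and spreading start by splitting work into fixed-size chunks, then pacing them out. The batch size and interval below are assumptions to adapt to your tier; this document does not specify a batch endpoint:

```javascript
// Split an array of items into chunks of at most `size` elements.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Send one chunk per interval to spread load evenly. At 60 requests/min,
// sending one item per second keeps you exactly at the base rate.
async function sendSpread(items, size, intervalMs, sendFn) {
  for (const batch of chunk(items, size)) {
    await sendFn(batch);
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```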
### ❌ What to Avoid

- Immediate retries after 429
- Ignoring rate limit headers
- Burst traffic without queuing
- Not handling 429 errors
- Aggressive retry strategies
## Implementation Examples
### Exponential Backoff
```javascript
// JavaScript - Exponential backoff implementation
class RateLimitHandler {
  constructor(baseDelay = 1000, maxDelay = 32000, maxRetries = 5) {
    this.baseDelay = baseDelay;
    this.maxDelay = maxDelay;
    this.maxRetries = maxRetries;
  }

  async makeRequest(requestFn) {
    let response;
    for (let attempt = 0; attempt < this.maxRetries; attempt++) {
      response = await requestFn();

      // Note: fetch() resolves (rather than throwing) on HTTP error
      // statuses, so 429 must be detected via response.status.
      if (response.status === 429 && attempt < this.maxRetries - 1) {
        // Prefer the Retry-After header; fall back to exponential backoff
        const retryAfter = response.headers.get('Retry-After');
        const delay = retryAfter
          ? parseInt(retryAfter, 10) * 1000
          : Math.min(this.baseDelay * Math.pow(2, attempt), this.maxDelay);
        console.log(`Rate limited. Waiting ${delay}ms before retry ${attempt + 1}`);
        await this.sleep(delay);
        continue;
      }

      // Warn when close to the per-minute limit
      const remaining = response.headers.get('X-RateLimit-Remaining');
      if (remaining && parseInt(remaining, 10) < 5) {
        console.warn('Approaching rate limit. Consider slowing down requests.');
      }
      return response;
    }
    // Retries exhausted: return the last (rate-limited) response
    return response;
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

// Usage
const rateLimiter = new RateLimitHandler();
const response = await rateLimiter.makeRequest(async () => {
  return fetch('https://api.moder8r.app/v1/moderate/text', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer m8r_sk_your_key_here',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ content: 'Text to moderate' })
  });
});
```

### Request Queue for High Volume
```javascript
// Request queue for high-throughput applications
class RequestQueue {
  constructor(requestsPerMinute = 60) {
    this.requestsPerMinute = requestsPerMinute;
    this.requestTimes = [];
    this.queue = [];
    this.processing = false;
  }

  async addRequest(requestFn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ requestFn, resolve, reject });
      this.processQueue();
    });
  }

  async processQueue() {
    if (this.processing || this.queue.length === 0) {
      return;
    }
    this.processing = true;

    while (this.queue.length > 0) {
      const now = Date.now();

      // Drop timestamps older than the 1-minute window
      this.requestTimes = this.requestTimes.filter(time => now - time < 60000);

      // If the window is full, wait until the oldest request ages out
      if (this.requestTimes.length >= this.requestsPerMinute) {
        const oldestRequest = Math.min(...this.requestTimes);
        const waitTime = 60000 - (now - oldestRequest);
        console.log(`Rate limit reached. Waiting ${waitTime}ms`);
        await this.sleep(waitTime);
        continue;
      }

      // Process next request
      const { requestFn, resolve, reject } = this.queue.shift();
      try {
        this.requestTimes.push(now);
        const result = await requestFn();
        resolve(result);
      } catch (error) {
        reject(error);
      }

      // Small delay between requests to avoid overwhelming the server
      await this.sleep(100);
    }
    this.processing = false;
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
```

## Monitoring and Scaling
### Usage Monitoring
- Track request patterns over time
- Set alerts at 80% of monthly quota
- Monitor error rates and 429 responses
- Analyze peak usage periods
- Use the `/usage` endpoint for real-time stats
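The 80% alert can be computed from whichever usage figures you track, whether from your own counters or from the `/usage` endpoint. The response shape of that endpoint is not shown here, so the field names below are assumptions for illustration:

```javascript
// Compute quota usage and whether an alert threshold has been crossed.
// `used` and `quota` would come from your own counters or a usage
// endpoint; the names are illustrative, not the API's schema.
function quotaStatus(used, quota, threshold = 0.8) {
  const fraction = quota > 0 ? used / quota : 0;
  return {
    fraction,
    alert: fraction >= threshold, // fire an alert at 80% by default
  };
}
```

On the Starter tier, for instance, `quotaStatus(20000, 25000)` reports 80% usage and raises the alert, leaving headroom to upgrade before requests start failing.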
### When to Upgrade
- Consistently hitting monthly limits
- Frequent 429 errors during normal operation
- Need for higher burst capacity
- Planning for traffic growth
- Requiring better SLA guarantees
**Pro tip:** Consider upgrading before you hit limits rather than after. This ensures consistent performance for your users and prevents unexpected service disruptions.