# Rate Limits
Understand and manage API rate limits to optimize your application's performance.
## Rate Limit Tiers
| Tier | Monthly Requests | Per-Minute Limit | Burst Allowance | Price |
|---|---|---|---|---|
| Free | 1,000 | 10 | 20 | $0 |
| Starter | 25,000 | 60 | 120 | $19/month |
| Growth | 150,000 | 120 | 240 | $79/month |
| Business | 500,000 | 300 | 600 | $199/month |
**Burst Allowance:** a short-term allowance above the per-minute limit for handling traffic spikes. Burst capacity refills gradually while usage stays below the base rate.
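The burst behaviour described above matches a classic token bucket: capacity equals the burst allowance, and tokens refill at the base per-minute rate. The server's actual algorithm is not documented, so the following is an illustrative sketch of the documented behaviour, not the API's implementation:

```javascript
// Token-bucket sketch of burst allowance: capacity = burst allowance,
// refillPerMinute = base per-minute limit. Illustrative only; the
// server-side algorithm may differ.
class TokenBucket {
  constructor(capacity, refillPerMinute, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerMs = refillPerMinute / 60000; // tokens regained per ms
    this.tokens = capacity;                     // start with full burst capacity
    this.lastRefill = now;
  }

  // Returns true if a request is allowed at time `now`, consuming one token.
  tryRequest(now = Date.now()) {
    const elapsed = now - this.lastRefill;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

On the Starter tier, for example, `new TokenBucket(120, 60)` would allow a burst of 120 requests, after which capacity refills at roughly one request per second.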
## Rate Limit Headers
Every API response includes headers to help you monitor your rate limit status:
```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 59
X-RateLimit-Reset: 1642246800
Retry-After: 60

{
  "id": "mod_1a2b3c4d5e6f",
  "status": "completed",
  ...
}
```

- `X-RateLimit-Limit`: your per-minute request limit for the current tier
- `X-RateLimit-Remaining`: requests remaining in the current minute window
- `X-RateLimit-Reset`: Unix timestamp when the rate limit resets
- `Retry-After`: seconds to wait before retrying (sent on 429 errors)
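A small helper can turn these headers into numbers your client logic can act on. The header names follow the response above; the helper itself is illustrative:

```javascript
// Read the documented rate-limit headers from a fetch Response (or any
// object with a Headers-like .get()). Missing headers come back as null.
function parseRateLimitHeaders(headers) {
  const num = (name) => {
    const value = headers.get(name);
    return value === null ? null : parseInt(value, 10);
  };
  return {
    limit: num('X-RateLimit-Limit'),         // per-minute limit for your tier
    remaining: num('X-RateLimit-Remaining'), // requests left in this window
    reset: num('X-RateLimit-Reset'),         // Unix timestamp of window reset
    retryAfter: num('Retry-After'),          // seconds to wait (429 only)
  };
}
```

For example, `const { remaining } = parseRateLimitHeaders(response.headers);` lets you throttle proactively before a 429 ever occurs.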
## Handling Rate Limits
### 429 Too Many Requests

When you exceed your rate limit, the API returns a 429 status code with a `Retry-After` header indicating how long to wait before making another request.
### ✅ Best Practices

- Implement exponential backoff
- Monitor rate limit headers
- Use request queues for high volume
- Respect `Retry-After` headers
- Cache responses when possible
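Caching deserves a concrete shape: identical moderation input yields identical results within a short window, so a TTL cache keyed on request content can avoid spending rate-limit budget on duplicates. The TTL value here is an assumption; pick one that matches how often your moderation policies change:

```javascript
// Minimal TTL cache for API responses, keyed by request content.
// Cached entries expire after ttlMs milliseconds.
class ResponseCache {
  constructor(ttlMs = 60000) {
    this.ttlMs = ttlMs;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  // Returns the cached value, or undefined if absent or expired.
  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= now) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value, now = Date.now()) {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}
```

Check the cache before calling the API, and store the parsed result after each successful response.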
### 💡 Optimization Tips

- Batch requests when possible
- Spread requests evenly
- Use webhooks for async processing
- Monitor usage patterns
- Consider higher tiers for growth
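Both batching and spreading start by splitting work into fixed-size chunks, then pacing them out. The batch size and interval below are assumptions to adapt to your tier; this document does not specify a batch endpoint:

```javascript
// Split an array of items into chunks of at most `size` elements.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Send one chunk per interval to spread load evenly. At 60 requests/min,
// sending one item per second keeps you exactly at the base rate.
async function sendSpread(items, size, intervalMs, sendFn) {
  for (const batch of chunk(items, size)) {
    await sendFn(batch);
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```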
### ❌ What to Avoid

- Immediate retries after 429
- Ignoring rate limit headers
- Burst traffic without queuing
- Not handling 429 errors
- Aggressive retry strategies
## Implementation Examples
### Exponential Backoff
```javascript
// JavaScript - Exponential backoff implementation
class RateLimitHandler {
  constructor(baseDelay = 1000, maxDelay = 32000, maxRetries = 5) {
    this.baseDelay = baseDelay;
    this.maxDelay = maxDelay;
    this.maxRetries = maxRetries;
  }

  async makeRequest(requestFn) {
    let response;
    for (let attempt = 0; attempt < this.maxRetries; attempt++) {
      response = await requestFn();

      // Note: fetch() resolves (rather than throwing) on HTTP error
      // statuses, so 429 must be detected via response.status.
      if (response.status === 429 && attempt < this.maxRetries - 1) {
        // Prefer the Retry-After header; fall back to exponential backoff
        const retryAfter = response.headers.get('Retry-After');
        const delay = retryAfter
          ? parseInt(retryAfter, 10) * 1000
          : Math.min(this.baseDelay * Math.pow(2, attempt), this.maxDelay);
        console.log(`Rate limited. Waiting ${delay}ms before retry ${attempt + 1}`);
        await this.sleep(delay);
        continue;
      }

      // Warn when close to the per-minute limit
      const remaining = response.headers.get('X-RateLimit-Remaining');
      if (remaining && parseInt(remaining, 10) < 5) {
        console.warn('Approaching rate limit. Consider slowing down requests.');
      }
      return response;
    }
    // Retries exhausted: return the last (rate-limited) response
    return response;
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

// Usage
const rateLimiter = new RateLimitHandler();
const response = await rateLimiter.makeRequest(async () => {
  return fetch('https://api.moder8r.app/v1/moderate/text', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer m8r_sk_your_key_here',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ content: 'Text to moderate' })
  });
});
```

### Request Queue for High Volume
```javascript
// Request queue for high-throughput applications
class RequestQueue {
  constructor(requestsPerMinute = 60) {
    this.requestsPerMinute = requestsPerMinute;
    this.requestTimes = [];
    this.queue = [];
    this.processing = false;
  }

  async addRequest(requestFn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ requestFn, resolve, reject });
      this.processQueue();
    });
  }

  async processQueue() {
    if (this.processing || this.queue.length === 0) {
      return;
    }
    this.processing = true;

    while (this.queue.length > 0) {
      const now = Date.now();

      // Drop timestamps older than the 1-minute window
      this.requestTimes = this.requestTimes.filter(time => now - time < 60000);

      // If the window is full, wait until the oldest request ages out
      if (this.requestTimes.length >= this.requestsPerMinute) {
        const oldestRequest = Math.min(...this.requestTimes);
        const waitTime = 60000 - (now - oldestRequest);
        console.log(`Rate limit reached. Waiting ${waitTime}ms`);
        await this.sleep(waitTime);
        continue;
      }

      // Process next request
      const { requestFn, resolve, reject } = this.queue.shift();
      try {
        this.requestTimes.push(now);
        const result = await requestFn();
        resolve(result);
      } catch (error) {
        reject(error);
      }

      // Small delay between requests to avoid overwhelming the server
      await this.sleep(100);
    }
    this.processing = false;
  }

  sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
```

## Monitoring and Scaling
### Usage Monitoring
- Track request patterns over time
- Set alerts at 80% of monthly quota
- Monitor error rates and 429 responses
- Analyze peak usage periods
- Use the `/usage` endpoint for real-time stats
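The 80% alert can be computed from whichever usage figures you track, whether from your own counters or from the `/usage` endpoint. The response shape of that endpoint is not shown here, so the field names below are assumptions for illustration:

```javascript
// Compute quota usage and whether an alert threshold has been crossed.
// `used` and `quota` would come from your own counters or a usage
// endpoint; the names are illustrative, not the API's schema.
function quotaStatus(used, quota, threshold = 0.8) {
  const fraction = quota > 0 ? used / quota : 0;
  return {
    fraction,
    alert: fraction >= threshold, // fire an alert at 80% by default
  };
}
```

On the Starter tier, for instance, `quotaStatus(20000, 25000)` reports 80% usage and raises the alert, leaving headroom to upgrade before requests start failing.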
### When to Upgrade
- Consistently hitting monthly limits
- Frequent 429 errors during normal operation
- Need for higher burst capacity
- Planning for traffic growth
- Requiring better SLA guarantees
**Pro tip:** Consider upgrading before you hit limits rather than after. This ensures consistent performance for your users and prevents unexpected service disruptions.