# Rate Limiting

The IoT Agent Mesh API implements comprehensive rate limiting to ensure fair usage, system stability, and optimal performance for all users. This guide covers rate limiting policies, headers, handling strategies, and best practices.

Overview

Rate limiting controls the number of API requests that can be made within a specific time window. The IoT Agent Mesh platform uses multiple rate limiting strategies:

  • Per-User Limits: Individual user request quotas
  • Per-Organization Limits: Organization-wide request quotas
  • Per-Endpoint Limits: Specific limits for different API endpoints
  • Per-IP Limits: Network-level protection against abuse
  • Concurrent Connection Limits: Real-time connection restrictions

Rate Limiting Policies

Standard Rate Limits

| Endpoint Category | Rate Limit | Window | Scope |
|---|---|---|---|
| Authentication | 10 requests | 1 minute | Per IP address |
| Organizations | 100 requests | 1 minute | Per user per org |
| Users | 100 requests | 1 minute | Per user per org |
| IoT Agents | 200 requests | 1 minute | Per user per org |
| Telemetry Read | 500 requests | 1 minute | Per user per org |
| Telemetry Write | 1000 requests | 1 minute | Per user per org |
| Alerts | 200 requests | 1 minute | Per user per org |
| Edge Functions | 500 requests | 1 minute | Per user per org |
| Real-time Connections | 50 connections | Concurrent | Per user |

Burst Allowance

All endpoints support burst traffic up to 2x the standard rate limit for short periods (up to 30 seconds), after which the standard rate applies.
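For example, under the standard 1,000 requests/minute telemetry write limit, a client can briefly send up to 2,000 requests before throttling resumes.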

Premium Tier Limits

| Plan | Multiplier | Additional Features |
|---|---|---|
| Starter | 1x | Standard limits |
| Professional | 5x | Higher burst allowance |
| Enterprise | 10x | Custom limits available |
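
For example, a Professional plan raises the standard telemetry write limit from 1,000 to 5,000 requests per minute.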

Rate Limiting Headers

Response Headers

Every API response includes rate limiting information in the headers:

HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1721750460
X-RateLimit-Window: 60
X-RateLimit-Policy: per-user-per-org

Header Descriptions

| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window | 1000 |
| X-RateLimit-Remaining | Requests remaining in the current window | 999 |
| X-RateLimit-Reset | Unix timestamp when the window resets | 1721750460 |
| X-RateLimit-Window | Window duration in seconds | 60 |
| X-RateLimit-Policy | Rate limiting policy applied | per-user-per-org |
| Retry-After | Seconds to wait before retrying (only on 429) | 30 |

Rate Limit Exceeded Response

When rate limits are exceeded, the API returns a 429 Too Many Requests status:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1721750460
X-RateLimit-Window: 60
Retry-After: 30
Content-Type: application/json

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded for telemetry ingestion",
    "details": {
      "limit": 1000,
      "window_seconds": 60,
      "policy": "per-user-per-org",
      "reset_at": "2024-07-23T17:01:00Z",
      "retry_after_seconds": 30
    },
    "request_id": "req_01H8K3L4M5N6O7P8Q9R0S1T2U3V4",
    "timestamp": "2024-07-23T17:00:00Z"
  },
  "status": 429
}
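
When handling a 429, the retry delay is available both in the Retry-After header and in the error body shown above. A minimal sketch of extracting it from a fetch-style Response (the helper name is illustrative):

// Illustrative helper: read the retry delay from a 429 response
async function getRetryDelaySeconds(response) {
  // Prefer the Retry-After header when present
  const header = response.headers.get('retry-after')
  if (header) return parseInt(header, 10)

  // Fall back to the structured error body
  const body = await response.json()
  return body.error?.details?.retry_after_seconds ?? 30
}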

Endpoint-Specific Rate Limits

Authentication Endpoints

| Endpoint | Rate Limit | Window | Scope | Notes |
|---|---|---|---|---|
| POST /auth/v1/token | 10 requests | 1 minute | Per IP | Sign in attempts |
| POST /auth/v1/signup | 5 requests | 1 minute | Per IP | Account creation |
| POST /auth/v1/recover | 3 requests | 1 minute | Per IP | Password reset |
| POST /auth/v1/token?grant_type=refresh_token | 20 requests | 1 minute | Per user | Token refresh |

Security Note: Authentication endpoints have stricter limits to prevent brute force attacks.

Organization Management

| Endpoint | Rate Limit | Window | Scope |
|---|---|---|---|
| GET /rest/v1/organizations | 100 requests | 1 minute | Per user |
| POST /rest/v1/organizations | 5 requests | 1 minute | Per user |
| PATCH /rest/v1/organizations | 30 requests | 1 minute | Per user per org |
| DELETE /rest/v1/organizations | 2 requests | 1 minute | Per user |

User Management

| Endpoint | Rate Limit | Window | Scope |
|---|---|---|---|
| GET /rest/v1/users | 100 requests | 1 minute | Per user per org |
| POST /functions/v1/invite-user | 10 requests | 1 minute | Per user per org |
| PATCH /rest/v1/users | 30 requests | 1 minute | Per user per org |
| GET /functions/v1/user-activity | 50 requests | 1 minute | Per user per org |

IoT Agent Management

| Endpoint | Rate Limit | Window | Scope |
|---|---|---|---|
| GET /rest/v1/agents | 200 requests | 1 minute | Per user per org |
| POST /rest/v1/agents | 20 requests | 1 minute | Per user per org |
| PATCH /rest/v1/agents | 100 requests | 1 minute | Per user per org |
| DELETE /rest/v1/agents | 10 requests | 1 minute | Per user per org |
| GET /functions/v1/agent-status | 1000 requests | 1 minute | Per user per org |
| POST /functions/v1/agents-bulk-register | 5 requests | 1 minute | Per user per org |

Telemetry Data

| Endpoint | Rate Limit | Window | Scope | Notes |
|---|---|---|---|---|
| GET /rest/v1/telemetry | 500 requests | 1 minute | Per user per org | Data queries |
| POST /rest/v1/telemetry | 1000 requests | 1 minute | Per user per org | Single ingestion |
| POST /functions/v1/telemetry-batch | 100 requests | 1 minute | Per user per org | Batch ingestion |
| GET /functions/v1/telemetry-latest | 200 requests | 1 minute | Per user per org | Latest data |
| GET /functions/v1/telemetry-aggregate | 100 requests | 1 minute | Per user per org | Aggregations |
| POST /functions/v1/telemetry-export | 10 requests | 1 hour | Per user per org | Data export |

High-Volume Ingestion: Contact support for custom telemetry ingestion limits if you need higher throughput.

Alert Management

| Endpoint | Rate Limit | Window | Scope |
|---|---|---|---|
| GET /rest/v1/alerts | 200 requests | 1 minute | Per user per org |
| POST /rest/v1/alert-rules | 20 requests | 1 minute | Per user per org |
| PATCH /rest/v1/alert-rules | 50 requests | 1 minute | Per user per org |
| POST /rest/v1/alerts/{id}/acknowledge | 100 requests | 1 minute | Per user per org |
| POST /rest/v1/alerts/{id}/resolve | 100 requests | 1 minute | Per user per org |
| GET /functions/v1/alert-statistics | 30 requests | 1 minute | Per user per org |

Edge Functions

| Function | Rate Limit | Window | Scope |
|---|---|---|---|
| POST /functions/v1/process-device-data | 1000 requests | 1 minute | Per user per org |
| POST /functions/v1/webhook-handler | 2000 requests | 1 minute | Per IP |
| POST /functions/v1/send-alert-notification | 200 requests | 1 minute | Per user per org |
| POST /functions/v1/generate-report | 10 requests | 1 hour | Per user per org |
| Custom functions | 500 requests | 1 minute | Per user per org |

Real-time Connections

| Connection Type | Limit | Scope | Notes |
|---|---|---|---|
| WebSocket connections | 50 concurrent | Per user | Real-time subscriptions |
| Channel subscriptions | 100 concurrent | Per user | Per connection |
| Message rate | 1000 messages per minute | Per connection | Incoming messages |

Rate Limiting Strategies

1. Token Bucket Algorithm

The primary rate limiter uses a token bucket approach (a client-side sketch follows the list):

  • Each user/organization gets a bucket with tokens
  • Each request consumes one token
  • Tokens are replenished at a steady rate
  • Burst traffic is allowed up to bucket capacity
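
A minimal client-side sketch of the same idea, useful for pacing outbound requests; capacity and refill rate are illustrative, based on the standard telemetry write limit:

// Illustrative client-side token bucket (not the platform's server implementation)
class TokenBucket {
  constructor(capacity = 1000, refillPerSecond = 1000 / 60) {
    this.capacity = capacity              // maximum burst size
    this.tokens = capacity
    this.refillPerSecond = refillPerSecond
    this.lastRefill = Date.now()
  }

  tryConsume() {
    // Replenish tokens at a steady rate, capped at bucket capacity
    const now = Date.now()
    const elapsed = (now - this.lastRefill) / 1000
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSecond)
    this.lastRefill = now

    if (this.tokens >= 1) {
      this.tokens -= 1                    // each request consumes one token
      return true
    }
    return false                          // caller should wait before retrying
  }
}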

2. Sliding Window

For more precise rate limiting, a sliding window approach is used (sketched after the list):

  • Tracks requests in rolling time windows
  • More accurate than fixed windows
  • Prevents burst at window boundaries
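
The same pattern on the client, as a sliding-window counter over request timestamps; limits are illustrative, and the Python example later in this guide implements the same idea:

// Illustrative sliding-window counter over request timestamps
class SlidingWindowLimiter {
  constructor(maxRequests = 500, windowMs = 60000) {
    this.maxRequests = maxRequests
    this.windowMs = windowMs
    this.timestamps = []
  }

  allow() {
    const now = Date.now()
    // Drop timestamps that have rolled out of the window
    this.timestamps = this.timestamps.filter(t => now - t < this.windowMs)

    if (this.timestamps.length < this.maxRequests) {
      this.timestamps.push(now)
      return true
    }
    return false
  }
}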

3. Hierarchical Limits

Multiple limits can apply simultaneously:

Request → IP Limit → User Limit → Organization Limit → Endpoint Limit → Allow/Deny
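
A sketch of that evaluation order on the client side; the individual limiter objects are hypothetical placeholders, each exposing an allow() check as in the sketches above:

// Hypothetical composition of the hierarchical checks shown above
function isRequestAllowed(request, limiters) {
  const checks = [
    limiters.ip,           // per-IP limit
    limiters.user,         // per-user limit
    limiters.organization, // per-organization limit
    limiters.endpoint      // per-endpoint limit
  ]
  // Deny as soon as any level is exhausted
  return checks.every(limiter => limiter.allow(request))
}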

Handling Rate Limits

Best Practices for Clients

1. Respect Rate Limit Headers

Always check rate limit headers and adjust behavior accordingly:

// JavaScript/TypeScript example
async function makeAPIRequest(requestFn) {
  try {
    const response = await requestFn()
    
    // Check rate limit headers
    const remaining = parseInt(response.headers['x-ratelimit-remaining'])
    const reset = parseInt(response.headers['x-ratelimit-reset'])
    
    if (remaining < 10) {
      const resetTime = new Date(reset * 1000)
      console.log(`Rate limit low. Resets at: ${resetTime}`)
      
      // Slow down requests
      await new Promise(resolve => setTimeout(resolve, 1000))
    }
    
    return response
    
  } catch (error) {
    if (error.status === 429) {
      await handleRateLimit(error) // helper (defined elsewhere) that waits per Retry-After
      return makeAPIRequest(requestFn) // Retry
    }
    throw error
  }
}

2. Implement Exponential Backoff

async function exponentialBackoff(requestFn, maxRetries = 5) {
  let attempt = 0
  
  while (attempt < maxRetries) {
    try {
      return await requestFn()
      
    } catch (error) {
      if (error.status === 429) {
        attempt++
        
        // Get retry delay from header or calculate
        const retryAfter = error.headers?.['retry-after'] || 
                          Math.pow(2, attempt) + Math.random()
        
        console.log(`Rate limited. Retrying in ${retryAfter}s (attempt ${attempt})`)
        await new Promise(resolve => setTimeout(resolve, retryAfter * 1000))
        
      } else {
        throw error
      }
    }
  }
  
  throw new Error('Max retries exceeded')
}

3. Request Queuing

Implement a queue to manage request rate:

class RequestQueue {
  constructor(requestsPerSecond = 10) {
    this.requestsPerSecond = requestsPerSecond
    this.queue = []
    this.processing = false
    this.lastRequestTime = 0
  }
  
  async enqueue(requestFn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ requestFn, resolve, reject })
      this.processQueue()
    })
  }
  
  async processQueue() {
    if (this.processing || this.queue.length === 0) return
    
    this.processing = true
    
    while (this.queue.length > 0) {
      const { requestFn, resolve, reject } = this.queue.shift()
      
      // Ensure minimum time between requests
      const now = Date.now()
      const timeSinceLastRequest = now - this.lastRequestTime
      const minInterval = 1000 / this.requestsPerSecond
      
      if (timeSinceLastRequest < minInterval) {
        await new Promise(r => setTimeout(r, minInterval - timeSinceLastRequest))
      }
      
      try {
        const result = await requestFn()
        resolve(result)
      } catch (error) {
        if (error.status === 429) {
          // Put request back at front of queue
          this.queue.unshift({ requestFn, resolve, reject })
          
          const retryAfter = (error.headers?.['retry-after'] || 30) * 1000
          await new Promise(r => setTimeout(r, retryAfter))
        } else {
          reject(error)
        }
      }
      
      this.lastRequestTime = Date.now()
    }
    
    this.processing = false
  }
}

// Usage
const requestQueue = new RequestQueue(15) // 15 requests per second

const result = await requestQueue.enqueue(() => 
  supabase.from('agents').select('*')
)

Language-Specific Examples

Python Rate Limiting

import time
import requests
from functools import wraps

class RateLimiter:
    def __init__(self, max_requests=100, time_window=60):
        self.max_requests = max_requests
        self.time_window = time_window
        self.requests = []
    
    def wait_if_needed(self):
        now = time.time()
        
        # Remove old requests outside the window
        self.requests = [req_time for req_time in self.requests 
                        if now - req_time < self.time_window]
        
        if len(self.requests) >= self.max_requests:
            # Calculate wait time
            oldest_request = min(self.requests)
            wait_time = self.time_window - (now - oldest_request)
            
            if wait_time > 0:
                time.sleep(wait_time + 0.1)  # Add small buffer
        
        self.requests.append(now)

def with_rate_limiting(rate_limiter):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            rate_limiter.wait_if_needed()
            
            max_retries = 3
            for attempt in range(max_retries):
                try:
                    response = func(*args, **kwargs)
                    
                    if hasattr(response, 'status_code') and response.status_code == 429:
                        retry_after = int(response.headers.get('retry-after', 30))
                        time.sleep(retry_after)
                        continue
                    
                    return response
                    
                except requests.exceptions.RequestException as e:
                    if attempt == max_retries - 1:
                        raise
                    time.sleep(2 ** attempt)  # Exponential backoff
            
            return response
        return wrapper
    return decorator

# Usage
rate_limiter = RateLimiter(max_requests=100, time_window=60)

@with_rate_limiting(rate_limiter)
def api_request(url, **kwargs):
    return requests.get(url, **kwargs)

Java Rate Limiting

import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class RateLimiter {
    private final Semaphore semaphore;
    private final int maxRequests;
    private final long timeWindowMs;
    private volatile long windowStart;
    private volatile int requestCount;
    
    public RateLimiter(int maxRequests, long timeWindowMs) {
        this.maxRequests = maxRequests;
        this.timeWindowMs = timeWindowMs;
        this.semaphore = new Semaphore(maxRequests);
        this.windowStart = System.currentTimeMillis();
        this.requestCount = 0;
    }
    
    public void acquire() throws InterruptedException {
        long now = System.currentTimeMillis();
        
        synchronized (this) {
            if (now - windowStart >= timeWindowMs) {
                // Reset window
                semaphore.release(maxRequests - semaphore.availablePermits());
                windowStart = now;
                requestCount = 0;
            }
        }
        
        semaphore.acquire();
        requestCount++;
    }
    
    public void release() {
        // Don't release immediately - let time window handle it
        // This prevents burst beyond the rate limit
    }
    
    public boolean tryAcquire(long timeout, TimeUnit unit) throws InterruptedException {
        return semaphore.tryAcquire(timeout, unit);
    }
}

// Usage with HTTP client
public class APIClient {
    private final RateLimiter rateLimiter;
    private final HttpClient httpClient;
    
    public APIClient() {
        this.rateLimiter = new RateLimiter(100, 60000); // 100 requests per minute
        this.httpClient = HttpClient.newHttpClient();
    }
    
    public HttpResponse<String> makeRequest(HttpRequest request) throws Exception {
        rateLimiter.acquire();
        
        try {
            HttpResponse<String> response = httpClient.send(
                request, 
                HttpResponse.BodyHandlers.ofString()
            );
            
            if (response.statusCode() == 429) {
                String retryAfter = response.headers()
                    .firstValue("retry-after")
                    .orElse("30");
                
                Thread.sleep(Integer.parseInt(retryAfter) * 1000);
                return makeRequest(request); // Retry
            }
            
            return response;
            
        } finally {
            rateLimiter.release();
        }
    }
}

Monitoring Rate Limits

Rate Limit Dashboard

Monitor your API usage through the dashboard:

// Get rate limit status for organization
async function getRateLimitStatus() {
  const response = await supabase.functions.invoke('rate-limit-status')
  
  return {
    current_usage: response.data.current_usage,
    limits: response.data.limits,
    reset_times: response.data.reset_times,
    burst_available: response.data.burst_available
  }
}

// Example response:
{
  "current_usage": {
    "telemetry_write": 750,
    "telemetry_read": 340,
    "agents": 45,
    "alerts": 12
  },
  "limits": {
    "telemetry_write": 1000,
    "telemetry_read": 500,
    "agents": 200,
    "alerts": 200
  },
  "reset_times": {
    "telemetry_write": "2024-07-23T17:01:00Z",
    "telemetry_read": "2024-07-23T17:01:00Z"
  },
  "burst_available": {
    "telemetry_write": 1250,
    "telemetry_read": 750
  }
}

Alerting on Rate Limits

Set up monitoring alerts:

// Monitor rate limit usage
function monitorRateLimits() {
  setInterval(async () => {
    try {
      const status = await getRateLimitStatus()
      
      Object.entries(status.current_usage).forEach(([endpoint, usage]) => {
        const limit = status.limits[endpoint]
        const percentage = (usage / limit) * 100
        
        if (percentage > 80) {
          console.warn(`Rate limit warning: ${endpoint} at ${percentage.toFixed(1)}%`)
          
          // Send alert to monitoring system
          sendAlert({
            type: 'rate_limit_warning',
            endpoint,
            usage,
            limit,
            percentage
          })
        }
      })
      
    } catch (error) {
      console.error('Failed to check rate limits:', error)
    }
  }, 30000) // Check every 30 seconds
}

Optimization Strategies

1. Batch Operations

Use batch endpoints to maximize efficiency:

// Instead of multiple single requests
for (const telemetryPoint of data) {
  await supabase.from('telemetry').insert(telemetryPoint)  // 100+ requests
}

// Use batch endpoint
await supabase.functions.invoke('telemetry-batch', {
  body: { telemetry: data }  // 1 request
})

2. Caching

Implement client-side caching:

class CachedAPIClient {
  constructor(ttlSeconds = 300) {
    this.cache = new Map()
    this.ttl = ttlSeconds * 1000
  }
  
  async get(key, fetchFn) {
    const cached = this.cache.get(key)
    
    if (cached && Date.now() - cached.timestamp < this.ttl) {
      return cached.data
    }
    
    const data = await fetchFn()
    this.cache.set(key, {
      data,
      timestamp: Date.now()
    })
    
    return data
  }
  
  async getAgents() {
    return this.get('agents', async () => {
      const { data } = await supabase.from('agents').select('*')
      return data
    })
  }
}

3. Request Optimization

Optimize queries to reduce API calls:

// Instead of separate requests
const agents = await supabase.from('agents').select('*')
const telemetry = await supabase.from('telemetry').select('*')
const alerts = await supabase.from('alerts').select('*')

// Use joins or batch queries
const { data } = await supabase
  .from('agents')
  .select(`
    *,
    latest_telemetry:telemetry(*),
    active_alerts:alerts(*)
  `)
  // Order and limit the embedded telemetry rows to fetch only the latest reading
  .order('created_at', { foreignTable: 'telemetry', ascending: false })
  .limit(1, { foreignTable: 'telemetry' })

4. Connection Pooling

For real-time connections, manage connection pools:

class ConnectionPool {
  constructor(maxConnections = 10) {
    this.maxConnections = maxConnections
    this.created = 0
    this.available = []
    this.waitingQueue = []
  }
  
  async getConnection() {
    // Reuse a previously released connection when one is available
    if (this.available.length > 0) {
      return this.available.pop()
    }
    
    if (this.created < this.maxConnections) {
      this.created++
      return supabase.channel(`conn-${Date.now()}`)
    }
    
    // Pool exhausted: wait for a connection to be released
    return new Promise(resolve => {
      this.waitingQueue.push(resolve)
    })
  }
  
  releaseConnection(connection) {
    if (this.waitingQueue.length > 0) {
      // Hand the connection directly to the next waiter
      const resolve = this.waitingQueue.shift()
      resolve(connection)
    } else {
      // No one waiting: keep the connection available for reuse
      this.available.push(connection)
    }
  }
}

Custom Rate Limits

Enterprise Custom Limits

Enterprise customers can request custom rate limits:

// Contact support to configure custom limits
const customLimits = {
  organization_id: "org_01H8K2M3N4P5Q6R7S8T9U0V1W2",
  custom_limits: {
    "telemetry_write": 10000,    // 10x standard limit
    "telemetry_read": 5000,      // 10x standard limit
    "real_time_connections": 500, // 10x standard limit
    "custom_endpoints": {
      "/functions/v1/custom-processing": 2000
    }
  },
  burst_multiplier: 3.0,  // Allow 3x burst instead of 2x
  notes: "High-volume IoT deployment with 10,000+ sensors"
}

API Key Specific Limits

Different API keys can have different limits:

// Generate API key with specific limits
const response = await supabase.functions.invoke('generate-api-key', {
  body: {
    name: 'High Volume Telemetry Key',
    permissions: ['telemetry:write'],
    rate_limits: {
      'telemetry_write': 5000,  // Custom limit for this key
      'burst_allowance': 2.5
    },
    expires_in_days: 365
  }
})

Error Recovery

Graceful Degradation

Handle rate limits gracefully in your application:

class ResilientAPIClient {
  constructor() {
    this.fallbackData = new Map()
    this.rateLimitActive = false
  }
  
  async fetchWithFallback(key, fetchFn, fallbackFn) {
    try {
      if (this.rateLimitActive) {
        // Use fallback immediately if rate limited
        return await fallbackFn()
      }
      
      const data = await fetchFn()
      
      // Cache successful responses
      this.fallbackData.set(key, {
        data,
        timestamp: Date.now()
      })
      
      return data
      
    } catch (error) {
      if (error.status === 429) {
        this.rateLimitActive = true
        
        // Reset flag after rate limit window
        const retryAfter = error.headers?.['retry-after'] || 60
        setTimeout(() => {
          this.rateLimitActive = false
        }, retryAfter * 1000)
        
        // Use fallback data
        return await fallbackFn()
      }
      
      throw error
    }
  }
  
  async getAgentsWithFallback() {
    return this.fetchWithFallback(
      'agents',
      () => supabase.from('agents').select('*'),
      () => {
        // Return cached data or basic fallback
        const cached = this.fallbackData.get('agents')
        if (cached && Date.now() - cached.timestamp < 300000) { // 5 min cache
          return cached.data
        }
        
        // Return minimal fallback data
        return [{ id: 'offline', name: 'Data temporarily unavailable' }]
      }
    )
  }
}

Best Practices Summary

Do's ✅

  1. Monitor Headers: Always check X-RateLimit-* headers
  2. Implement Backoff: Use exponential backoff for retries
  3. Use Batch Operations: Combine multiple operations when possible
  4. Cache Responses: Implement appropriate caching strategies
  5. Handle 429 Gracefully: Provide fallback behavior
  6. Monitor Usage: Track your rate limit consumption
  7. Plan for Bursts: Account for traffic spikes in your design

Don'ts ❌

  1. Don't Ignore 429: Always handle rate limit responses
  2. Don't Retry Immediately: Wait for the specified retry period
  3. Don't Hard Code Delays: Use the Retry-After header
  4. Don't Spam Requests: Implement proper request spacing
  5. Don't Skip Error Handling: Always handle rate limit errors
  6. Don't Ignore Warnings: Monitor when approaching limits
  7. Don't Assume Limits: Check current limits via API

Support and Escalation

When to Contact Support

  • Consistently hitting rate limits despite optimization
  • Need for custom enterprise rate limits
  • Suspected rate limiting bugs or issues
  • Questions about appropriate limits for your use case

Support Information

Include when contacting support:

  • Organization ID
  • API endpoint experiencing issues
  • Current request patterns and volumes
  • Rate limit headers from responses
  • Business justification for higher limits

Rate limiting ensures fair usage and optimal performance for all users. Following these guidelines will help you build robust, efficient applications that work well within the platform limits.
