Bot attack on Public API - rahul7838/quora.clone GitHub Wiki

Let me explain various strategies to protect your public APIs from bot attacks, building from basic protections to more sophisticated approaches.

At the most basic level, rate limiting is your first line of defense. Think of it like a bouncer at a club who keeps track of how many times someone tries to enter. You can implement rate limiting based on:

  1. IP Address rate limiting:
@Configuration
public class RateLimitConfig {
    @Bean
    public KeyResolver ipKeyResolver() {
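        // Key on the literal client IP; getHostName() can trigger a reverse DNS lookup on every request
        return exchange -> Mono.just(
            exchange.getRequest().getRemoteAddress().getAddress().getHostAddress());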
    }
}

# In application.yml
spring:
  cloud:
    gateway:
      routes:
        - id: auth-service
          uri: lb://auth-service
          predicates:
            - Path=/auth/**
          filters:
            - name: RequestRateLimiter
              args:
                key-resolver: "#{@ipKeyResolver}"
                redis-rate-limiter.replenishRate: 10
                redis-rate-limiter.burstCapacity: 20
                redis-rate-limiter.requestedTokens: 1
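
To make the three settings above more concrete, here is a minimal in-memory token bucket. It is only a sketch of the general idea, not the gateway's Redis-backed implementation: the class name `TokenBucket` and the injectable clock are assumptions made for illustration. `replenishRate` is how many tokens are added per second, `burstCapacity` is the most the bucket can hold, and each request spends `requestedTokens`.

```java
import java.util.function.LongSupplier;

/** Minimal in-memory token bucket (illustrative only, not the gateway's Redis implementation). */
final class TokenBucket {
    private final double replenishRate;   // tokens added per second
    private final double burstCapacity;   // maximum tokens the bucket can hold
    private final LongSupplier nanoClock; // injectable clock so the behavior can be tested
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(double replenishRate, double burstCapacity, LongSupplier nanoClock) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.nanoClock = nanoClock;
        this.tokens = burstCapacity; // start full so an initial burst is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Spends the tokens and returns true if enough are available; otherwise rejects the request. */
    synchronized boolean tryConsume(int requestedTokens) {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        // Refill for the elapsed time, but never beyond the burst capacity
        tokens = Math.min(burstCapacity, tokens + elapsedSeconds * replenishRate);
        lastRefillNanos = now;
        if (tokens >= requestedTokens) {
            tokens -= requestedTokens;
            return true;
        }
        return false;
    }
}
```

With `new TokenBucket(10, 20, System::nanoTime)`, a client can fire 20 requests at once, after which it is held to roughly 10 requests per second.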

But IP-based rate limiting alone isn't enough, as bots can rotate IP addresses. That's where CAPTCHA comes in. For your login API, you might implement it like this:

@PostMapping("/login")
public ResponseEntity<?> login(@RequestBody LoginRequest request, 
                             @RequestParam String captchaToken) {
    // First verify the CAPTCHA
    boolean isValidCaptcha = captchaService.verify(captchaToken);
    if (!isValidCaptcha) {
        return ResponseEntity.badRequest()
            .body(new ErrorResponse("Invalid CAPTCHA"));
    }
    
    // Proceed with normal login logic
    return authenticationService.login(request);
}
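
The `captchaService.verify` call above is not shown, so here is one way it might look. This sketch assumes Google reCAPTCHA and its `siteverify` endpoint; other CAPTCHA providers use different URLs and fields. The class name `CaptchaVerifier` and its helper methods are hypothetical, and the regex check stands in for a real JSON parser.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

final class CaptchaVerifier {
    // Assumption: Google reCAPTCHA; swap the endpoint and fields for another provider
    private static final URI VERIFY_URI =
        URI.create("https://www.google.com/recaptcha/api/siteverify");
    // Crude check for "success": true; a real service would parse the JSON properly
    private static final Pattern SUCCESS = Pattern.compile("\"success\"\\s*:\\s*true");

    private final String secretKey;
    private final HttpClient client = HttpClient.newHttpClient();

    CaptchaVerifier(String secretKey) {
        this.secretKey = secretKey;
    }

    /** Builds the URL-encoded form body the verification endpoint expects. */
    static String formBody(String secret, String token) {
        return "secret=" + URLEncoder.encode(secret, StandardCharsets.UTF_8)
            + "&response=" + URLEncoder.encode(token, StandardCharsets.UTF_8);
    }

    static boolean isSuccess(String json) {
        return json != null && SUCCESS.matcher(json).find();
    }

    /** Returns true only if the provider confirms the token; any failure is treated as invalid. */
    boolean verify(String token) {
        if (token == null || token.isBlank()) {
            return false;
        }
        HttpRequest request = HttpRequest.newBuilder(VERIFY_URI)
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(formBody(secretKey, token)))
            .build();
        try {
            HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200 && isSuccess(response.body());
        } catch (IOException e) {
            return false; // fail closed if the provider is unreachable
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Failing closed means a provider outage blocks logins; whether that trade-off is acceptable depends on your application.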

For even better protection, you can implement progressive security that gets stricter as suspicious behavior is detected:

@Service
public class ProgressiveSecurityService {
    private final LoadingCache<String, AttemptInfo> attemptCache;
    
    public ProgressiveSecurityService() {
        this.attemptCache = CacheBuilder.newBuilder()
            .expireAfterWrite(1, TimeUnit.HOURS)
            .build(new CacheLoader<String, AttemptInfo>() {
                @Override
                public AttemptInfo load(String key) {
                    return new AttemptInfo();
                }
            });
    }
    
    public SecurityLevel determineSecurityLevel(String identifier) {
        AttemptInfo info = attemptCache.getUnchecked(identifier);
        
        if (info.getAttempts() > 10) {
            return SecurityLevel.BLOCK; // Too many attempts
        } else if (info.getAttempts() > 5) {
            return SecurityLevel.REQUIRE_ADVANCED_CAPTCHA; // Stricter verification
        } else if (info.getAttempts() > 2) {
            return SecurityLevel.REQUIRE_SIMPLE_CAPTCHA; // Basic verification
        }
        
        return SecurityLevel.NORMAL;
    }
}
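
The service above reads attempt counts but does not show how they get recorded or what `AttemptInfo` looks like. Here is a self-contained sketch of that missing piece using only the JDK. The names `AttemptTracker`, `recordFailure`, and `reset` are assumptions, and the fixed one-hour window that starts at the first failure is a simplification of the cache expiry used above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

final class AttemptTracker {
    enum SecurityLevel { NORMAL, REQUIRE_SIMPLE_CAPTCHA, REQUIRE_ADVANCED_CAPTCHA, BLOCK }

    /** Immutable count of failures since the window started. */
    private static final class Window {
        final long startMillis;
        final int attempts;

        Window(long startMillis, int attempts) {
            this.startMillis = startMillis;
            this.attempts = attempts;
        }
    }

    private final Map<String, Window> windows = new ConcurrentHashMap<>();
    private final long windowMillis;
    private final LongSupplier clockMillis; // injectable clock, e.g. System::currentTimeMillis

    AttemptTracker(long windowMillis, LongSupplier clockMillis) {
        this.windowMillis = windowMillis;
        this.clockMillis = clockMillis;
    }

    /** Call on every failed login; starts a new window if the previous one has expired. */
    void recordFailure(String identifier) {
        long now = clockMillis.getAsLong();
        windows.merge(identifier, new Window(now, 1), (old, fresh) ->
            now - old.startMillis >= windowMillis
                ? fresh
                : new Window(old.startMillis, old.attempts + 1));
    }

    /** Call on a successful login so a legitimate user is not penalized for earlier typos. */
    void reset(String identifier) {
        windows.remove(identifier);
    }

    /** Same thresholds as determineSecurityLevel above. */
    SecurityLevel levelFor(String identifier) {
        Window w = windows.get(identifier);
        boolean expired = w == null || clockMillis.getAsLong() - w.startMillis >= windowMillis;
        int attempts = expired ? 0 : w.attempts;
        if (attempts > 10) return SecurityLevel.BLOCK;
        if (attempts > 5) return SecurityLevel.REQUIRE_ADVANCED_CAPTCHA;
        if (attempts > 2) return SecurityLevel.REQUIRE_SIMPLE_CAPTCHA;
        return SecurityLevel.NORMAL;
    }
}
```

The identifier could be an IP address, a username, or a browser fingerprint; which one works best depends on how the attackers vary their requests.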

Here are some additional protections you should consider:

  1. Request Headers Validation:
@Component
public class RequestValidationFilter implements WebFilter {
    @Override
    public Mono<Void> filter(ServerWebExchange exchange, WebFilterChain chain) {
        ServerHttpRequest request = exchange.getRequest();
        
        // Check for basic browser headers
        HttpHeaders headers = request.getHeaders();
        if (!headers.containsKey(HttpHeaders.USER_AGENT) ||
            !headers.containsKey(HttpHeaders.ACCEPT_LANGUAGE)) {
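            // Reject with an explicit status; completing the response alone would send an empty 200
            exchange.getResponse().setStatusCode(HttpStatus.BAD_REQUEST);
            return exchange.getResponse().setComplete();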
        }
        
        return chain.filter(exchange);
    }
}
  2. Implement Browser Fingerprinting:
@Service
public class BrowserFingerprintService {
    public String generateFingerprint(HttpServletRequest request) {
        StringBuilder fingerprint = new StringBuilder();
        
        // Combine various browser characteristics
        fingerprint.append(request.getHeader("User-Agent"))
                  .append(request.getHeader("Accept-Language"))
                  .append(request.getHeader("Accept-Encoding"))
                  .append(request.getHeader("Sec-Ch-Ua-Platform"));
        
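        // Use an explicit charset so the hash does not depend on the platform default
        return DigestUtils.md5DigestAsHex(fingerprint.toString().getBytes(StandardCharsets.UTF_8));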
    }
}
  3. Add Honeypot Fields:
public class SignupRequest {
    private String email;
    private String password;
    
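    // Honeypot field - rendered in the form but hidden via CSS, so real users leave it empty.
    // It must be read from the request body (so no @JsonIgnore), but never written back out.
    @JsonProperty(access = JsonProperty.Access.WRITE_ONLY)
    private String username_confirmation;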
    
    public boolean isBot() {
        // If this hidden field is filled, it's likely a bot
        return username_confirmation != null && !username_confirmation.isEmpty();
    }
}
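
Once `isBot()` returns true, you still have to decide what to send back. One option, sketched below, is to return the same response you would give a real user while quietly skipping account creation, so the bot gets no signal that it was caught. The handler class, the simplified `Signup` stand-in, and the in-memory account list are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class HoneypotSignupHandler {
    /** Simplified stand-in for SignupRequest, with the honeypot value already deserialized. */
    static final class Signup {
        final String email;
        final String usernameConfirmation; // honeypot value, expected to be empty for humans

        Signup(String email, String usernameConfirmation) {
            this.email = email;
            this.usernameConfirmation = usernameConfirmation;
        }

        boolean isBot() {
            return usernameConfirmation != null && !usernameConfirmation.isEmpty();
        }
    }

    private final List<String> createdAccounts = new ArrayList<>();

    /** Returns the same message either way so a bot cannot tell it was detected. */
    String handle(Signup signup) {
        if (!signup.isBot()) {
            createdAccounts.add(signup.email); // placeholder for real account creation
        }
        return "Signup received";
    }

    List<String> createdAccounts() {
        return List.copyOf(createdAccounts);
    }
}
```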

To make this system even more effective, you should implement these measures in layers:

  1. First layer: Basic rate limiting and header validation
  2. Second layer: Simple CAPTCHA for suspicious traffic
  3. Third layer: Advanced CAPTCHA or more stringent verification for highly suspicious patterns
  4. Fourth layer: Temporary IP blocks for confirmed bot behavior
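
As a rough sketch of how the four layers could fit into a single decision, the function below checks the cheap first-layer signals and then escalates based on a suspicion score. The `Signals` fields, the score itself, and the thresholds 30, 60, and 90 are made-up values for illustration; a real system would derive them from the checks described earlier.

```java
final class LayeredDefense {
    enum Verdict { ALLOW, REJECT, SIMPLE_CAPTCHA, ADVANCED_CAPTCHA, TEMP_BLOCK }

    /** Hypothetical per-request signals gathered by the individual protections. */
    static final class Signals {
        final boolean rateLimited;
        final boolean hasBrowserHeaders;
        final int suspicionScore; // 0 (clean) to 100 (confirmed bot), an assumed scale

        Signals(boolean rateLimited, boolean hasBrowserHeaders, int suspicionScore) {
            this.rateLimited = rateLimited;
            this.hasBrowserHeaders = hasBrowserHeaders;
            this.suspicionScore = suspicionScore;
        }
    }

    static Verdict decide(Signals s) {
        // First layer: cheap checks that need no user interaction
        if (s.rateLimited || !s.hasBrowserHeaders) return Verdict.REJECT;
        // Fourth layer: temporary block for confirmed bot behavior
        if (s.suspicionScore >= 90) return Verdict.TEMP_BLOCK;
        // Third layer: stricter verification for highly suspicious patterns
        if (s.suspicionScore >= 60) return Verdict.ADVANCED_CAPTCHA;
        // Second layer: simple CAPTCHA for mildly suspicious traffic
        if (s.suspicionScore >= 30) return Verdict.SIMPLE_CAPTCHA;
        return Verdict.ALLOW;
    }
}
```

Checking the strictest condition first means a request always gets the highest level it qualifies for.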

Remember to also keep logs of attack patterns to continuously improve your detection mechanisms:

@Service
public class SecurityAuditService {
    public void logSuspiciousActivity(String identifier, SecurityEvent event) {
        // Log details about the suspicious activity
        SecurityAudit audit = SecurityAudit.builder()
            .identifier(identifier)
            .eventType(event)
            .timestamp(Instant.now())
            .build();
            
        // Store for analysis and pattern detection
        securityAuditRepository.save(audit);
    }
}

Would you like me to explain more about implementing any of these security layers, or shall we discuss how to handle special cases like legitimate high-volume API consumers?

Rate Limit on Authenticated APIs

Yes, we absolutely should implement rate limiting even for authenticated APIs. Let me explain why through an example scenario and then break down the specific benefits. Imagine you have an e-commerce application with an authenticated API endpoint that lets users check product prices. Even though users must be logged in to use this API, without rate limiting, a single compromised user account could potentially make thousands of requests per second. This could happen if:

  1. The user's credentials are stolen and used maliciously
  2. The user themselves writes a script to constantly poll your API
  3. A bug in your frontend code accidentally triggers rapid repeated requests
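
For authenticated endpoints, the natural key for the limit is the user ID rather than the IP address. Here is a minimal fixed-window sketch of that idea using only the JDK. The class name `PerUserRateLimiter` and its parameters are assumptions, and in a multi-instance deployment the counters would need to live in shared storage such as Redis rather than in memory.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

final class PerUserRateLimiter {
    /** Mutable per-user counter, guarded by synchronizing on the instance. */
    private static final class Counter {
        long windowStartMillis;
        int count;

        Counter(long windowStartMillis) {
            this.windowStartMillis = windowStartMillis;
        }
    }

    private final Map<String, Counter> counters = new ConcurrentHashMap<>();
    private final int maxRequests;
    private final long windowMillis;
    private final LongSupplier clockMillis; // injectable clock, e.g. System::currentTimeMillis

    PerUserRateLimiter(int maxRequests, long windowMillis, LongSupplier clockMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
        this.clockMillis = clockMillis;
    }

    /** Returns true if this user is still within the limit for the current window. */
    boolean allow(String userId) {
        long now = clockMillis.getAsLong();
        Counter counter = counters.computeIfAbsent(userId, id -> new Counter(now));
        synchronized (counter) {
            if (now - counter.windowStartMillis >= windowMillis) {
                counter.windowStartMillis = now; // start a fresh window
                counter.count = 0;
            }
            if (counter.count >= maxRequests) {
                return false;
            }
            counter.count++;
            return true;
        }
    }
}
```

Because each user has a separate counter, one compromised or misbehaving account hits its limit without affecting anyone else.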
