Serverless Applications

Self-Healing Serverless Applications, AKA How to Build Systems That Are Resistant to Failure

Nate Taggart, CEO of Stackery. Gluecon, May 16, 2018

Common Serverless Failures for Lambda-Based Architectures

Runtime Errors
- Uncaught exception
- Timeout
- Bad state
Scaling Errors
- Concurrency limits (how high you can scale)
- Spawn limits (how fast you can scale)
- Bottlenecking
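
Two of the runtime errors above, uncaught exceptions and timeouts, can be partly guarded against inside the handler itself. The following is a minimal sketch, not something shown in the talk. `process_record`, the `records` event shape, the 500 ms safety margin, and `FakeContext` are all hypothetical names chosen for illustration. `context.get_remaining_time_in_millis()` is the Lambda context method that reports how much time is left before the invocation times out.

```python
SAFETY_MARGIN_MS = 500  # assumption: stop early when less than this remains


def process_record(record):
    """Hypothetical unit of work; raises ValueError on bad input."""
    if "id" not in record:
        raise ValueError("record has no id")
    return record["id"]


def handler(event, context):
    """Process records until finished or until the timeout is close."""
    processed, failed, unprocessed = [], [], []
    records = event.get("records", [])
    for index, record in enumerate(records):
        # Stop before the platform kills the invocation mid-record.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            unprocessed = records[index:]
            break
        try:
            processed.append(process_record(record))
        except ValueError as exc:
            # Catch the expected error so one bad record does not fail the batch.
            failed.append({"record": record, "error": str(exc)})
    return {"processed": processed, "failed": failed, "unprocessed": unprocessed}


class FakeContext:
    """Stand-in for the Lambda context object, for local runs only."""

    def __init__(self, remaining_ms):
        self.remaining_ms = remaining_ms

    def get_remaining_time_in_millis(self):
        return self.remaining_ms
```

Returning the unprocessed records, rather than timing out, leaves the caller free to re-queue them; whether that suits a given pipeline is a design decision this sketch does not settle.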

The Alternative: Self-Healing Serverless Apps

3 Design Principles:

Plan for failure
- Know the service limits during your planning
- Use self-throttling
- Consider alternatives to AWS Lambda
Standardize
- Introduce universal instrumentation
- Collect event-centric diagnostics
- Give everyone visibility
Fail Gracefully
- Reroute and unblock failed traffic
- Automate known solutions
- Notify a human
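
One possible way to combine "Standardize" and "Fail Gracefully" is a single wrapper applied to every handler. The sketch below is an illustration under stated assumptions, not the speaker's implementation. The `instrumented` decorator, the in-memory `dead_letters` and `alerts` lists (stand-ins for a real dead-letter queue and alerting channel), and the `resize_image` handler are all hypothetical.

```python
import functools
import json
import logging
import time

logger = logging.getLogger("self_healing")

# Assumption: in-memory stand-ins for a dead-letter queue and an alert channel.
dead_letters = []
alerts = []


def instrumented(handler):
    """Wrap a handler with the same diagnostics and failure handling everywhere."""

    @functools.wraps(handler)
    def wrapper(event, context=None):
        started = time.monotonic()
        try:
            result = handler(event, context)
        except Exception as exc:
            # Event-centric diagnostics: log the event that caused the failure.
            logger.error(
                "handler=%s failed event=%s error=%r",
                handler.__name__, json.dumps(event, default=str), exc,
            )
            dead_letters.append(event)  # reroute so the failed event is not lost
            alerts.append(f"{handler.__name__}: {exc}")  # notify a human
            return {"status": "failed", "error": str(exc)}
        elapsed_ms = (time.monotonic() - started) * 1000
        logger.info("handler=%s ok elapsed_ms=%.1f", handler.__name__, elapsed_ms)
        return {"status": "ok", "result": result}

    return wrapper


@instrumented
def resize_image(event, context=None):
    """Hypothetical handler used only to exercise the wrapper."""
    if "key" not in event:
        raise KeyError("key")
    return f"resized {event['key']}"
```

Returning a failure payload instead of re-raising is a choice made for this sketch. Re-raising would instead surface the error to the platform and leave retries to its own behavior.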

Research

DB connection pool
API Gateway
Python try/except and raise (exception handling, Python's equivalent of try-catch)
Kinesis Stream
Decorator Pattern
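
As a starting point for the try/except and raise item above, here is a minimal sketch of catching a transient failure, retrying with exponential backoff, and re-raising once the retries run out. `TransientError`, `call_with_retry`, `make_flaky`, and the default attempt count and delay are hypothetical, chosen only for illustration.

```python
import time


class TransientError(Exception):
    """Hypothetical error type for failures worth retrying."""


def call_with_retry(operation, attempts=3, base_delay=0.01):
    """Run operation, retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError as exc:
            if attempt == attempts:
                # Out of retries: raise a new error that keeps the original cause.
                raise RuntimeError(f"gave up after {attempts} attempts") from exc
            time.sleep(base_delay * 2 ** (attempt - 1))


def make_flaky(failures):
    """Build a hypothetical operation that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError("temporary outage")
        return state["calls"]

    return operation
```

Only `TransientError` is caught here; any other exception propagates immediately, which keeps genuine bugs from being hidden behind retries.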