AWS Lambda - keshavbaweja-git/guides GitHub Wiki
Types of serverless applications
-
Web applications
Serve the frontend code via Amazon S3 and CloudFront, or automate full deployment and hosting with AWS Amplify.
-
Web and mobile backends
API Gateway provides API endpoints for frontend code to invoke backend services. Amazon Cognito or APN partners like Auth0 provide integrated authentication and authorization.
-
Data processing
AWS Glue provides a managed, serverless Apache Spark environment for ETL. Amazon Athena provides serverless querying of large datasets. Stream processing is handled with Amazon Kinesis and Lambda.
-
IoT workloads
Lambda is an on-demand, serverless compute service that runs custom code in response to events. Many AWS services generate events and can act as an event source for Lambda invocations. Examples of events that can invoke Lambda:
- S3 Put/Delete event
- HTTP request via API Gateway
- Schedule managed by EventBridge rule
- IoT event
Lambda function invocations are limited to 15 minutes in duration.
Events passed to Lambda handler functions are JSON objects. They are immutable representations of facts or of changes in system state.
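As a minimal sketch of what "events are JSON objects" means in practice, the Python handler below reads the bucket name and object key out of an S3 notification event. The `sample_event` is a trimmed, hand-written example rather than a captured event, and the bucket and key names are assumptions made for illustration.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Return the bucket and key of every object named in an S3 event."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications (a space becomes '+').
        key = unquote_plus(s3["object"]["key"])
        objects.append({"bucket": s3["bucket"]["name"], "key": key})
    return objects


# Trimmed, illustrative event; real S3 events carry more fields.
sample_event = {
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "reports/2024+summary.csv", "size": 1024},
            },
        }
    ]
}

result = handler(sample_event, None)
```

The handler only reads from the event and never modifies it, which matches the idea that an event is an immutable record of something that already happened.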
Code that sits outside the handler function and runs before it is known as INIT or initialization code. This includes code that imports libraries and declares and initializes global variables.
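The split between initialization code and handler code can be sketched in Python as follows. The lookup table and the field names are hypothetical stand-ins for expensive setup work such as loading libraries or opening connections.

```python
import json
import time

# INIT code: runs once when the execution environment starts,
# before the first invocation reaches the handler.
INIT_TIMESTAMP = time.time()
LOOKUP_TABLE = json.loads('{"gold": 0.2, "silver": 0.1}')  # stand-in for costly setup


def handler(event, context):
    # Handler code: runs once per event and reuses what INIT prepared.
    tier = event.get("tier", "none")
    return {
        "discount": LOOKUP_TABLE.get(tier, 0.0),
        "environment_initialized_at": INIT_TIMESTAMP,
    }


first = handler({"tier": "gold"}, None)
second = handler({"tier": "bronze"}, None)
```

Both calls report the same `environment_initialized_at` value, because the module-level code ran only once for the two invocations.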
Benefits of event driven architecture
-
Replace inefficient polling mechanism
Polling wastes resources and delays inter-service communication.
-
Replace webhooks
Webhooks add complexity of custom authentication and authorization flows.
-
Reduce complexity, increase agility
Break down a monolith into a collection of loosely coupled event based services. Each service can be independently developed and deployed.
Add buffers between services, and remove the retry and circuit-breaker logic that synchronous, tightly coupled architectures require.
-
Improve scalability and extensibility
Scale services independently, add new services as required.
Trade offs of event driven architecture
-
Variable latency
An event driven architecture comprises multiple independent services communicating over a network, which introduces variable latency.
This architecture is not suitable for workloads like high-frequency trading or RPA applications that require consistent, low-latency (sub-millisecond) performance.
-
Eventual consistency
Event driven architectures are eventually consistent, because an event takes time to flow through different services. Transaction management needs to be handled differently than in a monolithic architecture.
Many workloads combine eventually consistent and strongly consistent requirements. AWS services like DynamoDB support strongly consistent behavior at the cost of higher latency and increased resource usage.
-
Returning value to caller
Event driven architectures by their nature are not designed for a request-response integration pattern. This might not be suitable for interactive clients of the system.
-
Increased complexity of monitoring & debugging
An effective log management and monitoring solution is required for an event driven architecture. Amazon CloudWatch provides these capabilities.
AWS X-Ray provides distributed tracing capability.
Design principles for serverless event driven applications
-
Use purpose built AWS services
Patterns and the AWS service built for each:
- Queue: Amazon SQS
- Publish/Subscribe: Amazon SNS
- Event streams: Amazon Kinesis
- Event bus: Amazon EventBridge
- Orchestration: AWS Step Functions
- API: Amazon API Gateway
-
Stateless functions
Lambda functions should not rely on state or file-system data being retained across invocations.
Initialization code is used to open connections to external data sources and to load libraries into the execution environment. Execution environments are reused across invocations, but they should not be used to maintain internal state such as counters.
-
Lambda function design
Build concise functions with a single responsibility. Lambda functions should not be used for orchestration or workflow management.
Global scope constants should be modeled as environment variables to allow updates without deployments.
Secrets or sensitive information should be stored in AWS Secrets Manager or AWS Systems Manager Parameter Store.
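A minimal Python sketch of reading configuration from environment variables is shown below. The variable names `TABLE_NAME`, `PAGE_SIZE`, and `DB_SECRET_ID` and their defaults are assumptions made for illustration. Only the identifier of the secret is kept in the environment; the secret value itself would be fetched from the secrets store at runtime, which this sketch leaves out.

```python
import os


def load_settings(environ=os.environ):
    """Build the function's settings from environment variables (names are hypothetical)."""
    return {
        "table_name": environ.get("TABLE_NAME", "orders-dev"),
        "page_size": int(environ.get("PAGE_SIZE", "25")),
        # Identifier only: the secret value stays in the secrets store.
        "secret_id": environ.get("DB_SECRET_ID", "example/db-credentials"),
    }


# Read once during initialization so every invocation reuses the parsed values.
SETTINGS = load_settings()


def handler(event, context):
    return {"table": SETTINGS["table_name"], "page_size": SETTINGS["page_size"]}


overridden = load_settings({"TABLE_NAME": "orders-prod", "PAGE_SIZE": "50"})
```

Passing the mapping as a parameter keeps the parsing logic easy to exercise outside Lambda, while the deployed function still reads `os.environ`.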
-
Prefer event driven designs over batch-based or polling-based designs.
-
Use AWS Step Functions for Orchestration
Move error handling, routing, and branching logic out of function code and into state machines declared as JSON.
This reduces the complexity of Lambda functions, makes workflows more robust and observable, and adds versioning support for workflows.
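To illustrate what "state machines declared as JSON" can look like, the Python sketch below builds a small Amazon States Language definition and serializes it. The state names, the order workflow, and the Lambda ARNs are hypothetical placeholders; the retry, catch, and choice fields show where error handling and branching live once they leave the function code.

```python
import json

# Hypothetical function ARNs used only for illustration.
VALIDATE_ARN = "arn:aws:lambda:us-east-1:123456789012:function:validate-order"
CHARGE_ARN = "arn:aws:lambda:us-east-1:123456789012:function:charge-payment"

# Retry and catch rules declared once and reused by both task states.
RETRY = [{"ErrorEquals": ["States.TaskFailed"], "IntervalSeconds": 2,
          "MaxAttempts": 3, "BackoffRate": 2.0}]
CATCH = [{"ErrorEquals": ["States.ALL"], "Next": "OrderFailed"}]

definition = {
    "Comment": "Illustrative order workflow",
    "StartAt": "ValidateOrder",
    "States": {
        "ValidateOrder": {"Type": "Task", "Resource": VALIDATE_ARN,
                          "Retry": RETRY, "Catch": CATCH, "Next": "IsApproved"},
        # Branching is expressed in the workflow, not inside a function.
        "IsApproved": {"Type": "Choice",
                       "Choices": [{"Variable": "$.approved", "BooleanEquals": True,
                                    "Next": "ChargePayment"}],
                       "Default": "OrderFailed"},
        "ChargePayment": {"Type": "Task", "Resource": CHARGE_ARN,
                          "Retry": RETRY, "Catch": CATCH, "End": True},
        "OrderFailed": {"Type": "Fail", "Error": "OrderFailed",
                        "Cause": "Validation, approval, or payment did not succeed"},
    },
}

definition_json = json.dumps(definition, indent=2)
```

Each Lambda function in such a workflow only does its own task; the retry counts, the failure route, and the approval branch can be changed by editing the definition rather than redeploying code.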
-
Handling retries and failures
Build idempotent functions, because a function can be invoked multiple times for the same event in failure and retry scenarios.
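One common way to make a handler idempotent is to record the result under a key taken from the event and return the recorded result on a repeat delivery. The Python sketch below assumes that approach. `InMemoryStore` is a stand-in so the example runs on its own; in a real function that record must live in a durable external store, since Lambda functions are stateless. The `order_id` and `amount` fields are hypothetical.

```python
class InMemoryStore:
    """Stand-in for a durable store; real state must live outside the function."""

    def __init__(self):
        self._results = {}

    def get(self, key):
        return self._results.get(key)

    def put_if_absent(self, key, value):
        # Keeps the first value written for a key and returns it.
        return self._results.setdefault(key, value)


STORE = InMemoryStore()
CHARGES = []  # records side effects so the example can show they happen once


def handler(event, context, store=STORE):
    key = event["order_id"]  # hypothetical idempotency key
    previous = store.get(key)
    if previous is not None:
        return previous  # duplicate delivery: skip the side effect
    CHARGES.append((key, event["amount"]))  # the side effect to protect
    result = {"order_id": key, "status": "charged", "amount": event["amount"]}
    return store.put_if_absent(key, result)


event = {"order_id": "order-1", "amount": 30}
first = handler(event, None)
second = handler(event, None)  # simulated retry of the same event
```

After both calls, `CHARGES` holds a single entry and both calls return the same result, which is the behavior a retried event needs.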
Lambda anti-patterns
- Lambda monolith
- Lambda as orchestrator
- Recursive event patterns (Lambda invocations in a loop)
- Lambda functions calling other Lambda functions
- Synchronous waiting in a single Lambda function
Lambda concurrency controls
-
Reserved concurrency
Sets the maximum number of concurrent instances for a function; no other function can use this reserved concurrency. There is no charge for configuring reserved concurrency. It applies to the function as a whole, including its versions and aliases.
-
Provisioned concurrency
Initializes a requested number of execution environments so that they are available to process invocation requests immediately. Configuring provisioned concurrency incurs charges on your AWS account.
Lambda also integrates with Application Auto Scaling, which can manage provisioned concurrency on a schedule or based on utilization.
Provisioned concurrency counts towards a function's reserved concurrency and regional quotas.
Accessing VPC resources
You can configure a Lambda function to connect to private subnets in a VPC in your account. When you connect a function to a VPC, Lambda creates an elastic network interface for each subnet in your function's VPC configuration.
Lambda functions can't connect directly to a VPC with dedicated instance tenancy. To connect to resources in a dedicated VPC, peer it to a second VPC with default tenancy.
aws lambda create-function \
--function-name my-function \
--runtime nodejs20.x \
--handler index.handler \
--zip-file fileb://function.zip \
--role arn:aws:iam::accountId:role/role-name \
--vpc-config SubnetIds=subnet1,subnet2,SecurityGroupIds=securityGroup1
Interface VPC endpoint for Lambda
To call the Lambda API or invoke functions from resources in your VPC without routing traffic over the internet, you can create an interface VPC endpoint for Lambda. Each interface endpoint is represented by one or more elastic network interfaces in your subnets.
-
Keep-alive for persistent connections
Lambda purges idle connections over time, and attempting to reuse an idle connection when invoking a function results in a connection error. Use a keep-alive directive to maintain persistent connections.
-
Billing considerations
There is no additional cost to access a Lambda function through an interface endpoint. Standard pricing for AWS PrivateLink applies to interface endpoints for Lambda. Your AWS account is billed for every hour an interface endpoint is provisioned in each AZ and for the data processed through the interface endpoint.
-
VPC Peering
VPC Peering is a networking connection between two VPCs, which can be in different accounts or regions. Once a VPC Peering connection has been established between two VPCs, an interface VPC endpoint for Lambda created in one VPC can be accessed by resources in the other VPC.
Configuring database access - RDS Proxy
You can create an Amazon RDS Proxy for your functions. A database proxy manages a pool of database connections and relays queries from a function.
Invocation modes
Synchronous invocation
A synchronous call goes from the caller to Lambda and on to the function. When a function is invoked in synchronous mode, Lambda waits for the function to complete and then returns the function's response or error object to the caller.
The caller needs to handle retries in cases of timeouts, throttling, and service or function errors.
AWS CLI and SDK automatically retry on client timeouts, throttling and service errors.
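For callers that manage retries themselves, the logic can be sketched in Python as a retry loop with exponential backoff and jitter. This is an illustration of the idea, not the retry algorithm used by the AWS CLI or SDKs. `ThrottledError`, `invoke_with_retries`, and the injected `invoke` callable are hypothetical names; `invoke` stands in for whatever performs the synchronous call.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error raised by the caller's invoke function on throttling."""


def invoke_with_retries(invoke, payload, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call invoke(payload), retrying throttled calls with backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return invoke(payload)
        except ThrottledError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Wait a random time up to an exponentially growing ceiling.
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))


# Usage: a stand-in invoke function that is throttled twice, then succeeds.
calls = {"count": 0}


def flaky_invoke(payload):
    calls["count"] += 1
    if calls["count"] < 3:
        raise ThrottledError("429")
    return {"ok": payload}


delays = []
response = invoke_with_retries(flaky_invoke, "ping", sleep=delays.append)
```

Injecting `sleep` lets the example record the waits instead of pausing; a real caller would leave the default in place.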
Asynchronous invocation
Several AWS services, such as SNS and S3, invoke Lambda functions asynchronously.
For asynchronous invocations, Lambda places the event on an internal queue and returns a success response to the client without additional information. A separate process reads events off the queue and invokes the function.
Function error
If the function returns an error, Lambda attempts to run it two more times, waiting one minute before the second attempt and two minutes before the third attempt. A function error may originate in the function code or in the runtime.
For throttling errors (429) and system errors (500), Lambda returns the event to the queue and attempts to run the function again for up to six hours.
It is possible for a function to receive an event multiple times as the internal queue is eventually consistent. Functions should be able to handle duplicate events.
It is possible for an event to age out and be deleted from the internal queue under heavy loads. Ensure that the function is configured with sufficient concurrency to handle heavy loads.
Invocation record
An invocation record contains details about the request and response in JSON. You can configure separate destinations for events that have been processed successfully and for events that have failed all processing attempts.
Destinations for invocation record
- SQS
- SNS
- EventBridge
- Lambda
Alternatively, SQS or SNS can be configured as a dead-letter queue for discarded events. For a dead-letter queue, Lambda sends only the content of the event, without details of the response.
Error handling configuration for asynchronous invocation
- Maximum age of event (up to 6 hours)
- Retry attempts (0 to 2)
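How the two settings interact with function errors can be sketched as a small Python model. It is a simplification for reasoning about the behavior, not a description of Lambda's internals: it simulates the one-minute and two-minute waits instead of sleeping, ignores throttling and system errors, and returns a hypothetical summary dictionary rather than a real invocation record.

```python
# Waits before the second and third attempts after a function error, in seconds.
RETRY_WAIT_SECONDS = [60, 120]


def deliver_async(event, function, retry_attempts=2, max_event_age=6 * 3600):
    """Model one asynchronous delivery; the returned fields are hypothetical."""
    if not 0 <= retry_attempts <= 2:
        raise ValueError("retry_attempts must be between 0 and 2")
    elapsed = 0  # simulated seconds since the event was queued
    attempts = 0
    last_error = None
    for attempt in range(retry_attempts + 1):
        if attempt > 0:
            elapsed += RETRY_WAIT_SECONDS[attempt - 1]
            if elapsed > max_event_age:
                # The event aged out before the next retry could run.
                return {"outcome": "event_age_exceeded", "attempts": attempts,
                        "error": last_error}
        attempts += 1
        try:
            response = function(event)
        except Exception as error:
            last_error = str(error)
        else:
            return {"outcome": "success", "attempts": attempts, "response": response}
    return {"outcome": "retries_exhausted", "attempts": attempts, "error": last_error}


def always_fail(event):
    raise RuntimeError("boom")


exhausted = deliver_async({"id": 1}, always_fail)
no_retries = deliver_async({"id": 1}, always_fail, retry_attempts=0)
aged_out = deliver_async({"id": 1}, always_fail, max_event_age=60)
```

With the defaults, a function that always fails is attempted three times before the event is handed to a failure destination or dead-letter queue; setting retry attempts to 0 or shortening the maximum age cuts that short.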
Event source mapping
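Services that Lambda reads from through an event source mapping
- DynamoDB stream
- SQS
- Kinesis
- MSK
- self-managed Apache Kafka
SNS and EventBridge are also event sources for Lambda, but they invoke functions asynchronously rather than through an event source mapping.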
aws lambda create-event-source-mapping \
--function-name my-function \
--batch-size 500 \
--starting-position LATEST \
--event-source-arn arn:aws:dynamodb:us-east-2:123456789012:table/my-table/stream/2019-06-10T19:26:16.525
If the function returns an error, the entire batch is reprocessed until the function succeeds or the items in the batch expire.
Lambda supports in-order processing for Kinesis, Kafka and SQS FIFO. For Kinesis and Kafka, in-order processing is available at the shard or partition level. For SQS FIFO, in-order processing is available at the message group level.
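A batch handler for an SQS event source can be sketched in Python as below. Letting an exception escape fails the invocation, so the whole batch is retried as described above; this is why per-message processing should be idempotent. The `order_id` field and the `process_message` helper are hypothetical, and `sample_event` is a trimmed, hand-written event.

```python
import json


def process_message(body):
    """Hypothetical business logic; raises on a malformed message."""
    if "order_id" not in body:
        raise ValueError("message is missing order_id")
    return body["order_id"]


def handler(event, context):
    processed = []
    # Records are handled in the order they appear in the batch.
    for record in event["Records"]:
        body = json.loads(record["body"])
        # An uncaught exception here fails the invocation, so the whole batch is retried.
        processed.append(process_message(body))
    return {"processed": processed}


# Trimmed, illustrative SQS event; real events carry more fields per record.
sample_event = {
    "Records": [
        {"messageId": "msg-1", "body": json.dumps({"order_id": "order-1"})},
        {"messageId": "msg-2", "body": json.dumps({"order_id": "order-2"})},
    ]
}

result = handler(sample_event, None)
```

Because a failure on the second message would cause the first message to be delivered again with the retried batch, a handler like this pairs naturally with the idempotency principle stated earlier.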