AWS Lambda - keshavbaweja-git/guides GitHub Wiki

Types of Serverless applications -

  1. Web applications

    Serve the frontend code via Amazon S3 and CloudFront, or automate full deployment and hosting with AWS Amplify.

  2. Web and mobile backends

    API Gateway provides API endpoints for frontend code to invoke backend services. Amazon Cognito or APN partners like Auth0 provide integrated authentication and authorization.

  3. Data processing

    AWS Glue for managed, serverless Apache Spark ETL jobs. Amazon Athena for serverless big data queries. Stream processing with Amazon Kinesis and Lambda.

  4. IoT workloads

Lambda is an on-demand, serverless compute service that runs custom code in response to events. Many AWS services generate events and can act as event sources for Lambda invocations. Examples of events that can invoke Lambda:

  • S3 Put/Delete event
  • HTTP request via API Gateway
  • Schedule managed by EventBridge rule
  • IoT event

Each Lambda function invocation is limited to 15 minutes in duration.

Events passed to Lambda handler functions are JSON objects. They are immutable representations of facts or changes in system state.
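As an illustration of an event arriving as a JSON object, here is a minimal Python sketch of a handler that reads an S3 Put notification. The bucket name, object key and the `sample_event` payload are made up for the example; only the `Records[].s3.bucket.name` and `Records[].s3.object.key` fields of the S3 notification format are relied on.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Return the bucket/key pairs named in an S3 notification event."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 notifications ("+" means space).
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append({"bucket": bucket, "key": key})
    return objects


# Hypothetical, trimmed-down event used only to exercise the handler locally.
sample_event = {
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "reports/2024+summary.csv"},
            },
        }
    ]
}

if __name__ == "__main__":
    print(handler(sample_event, None))
```

The handler only reads the event and never modifies it, which matches the view of an event as an immutable fact.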

Code that sits outside the handler function and runs before it is known as INIT or initialization code. This includes code that imports libraries and declares and initializes global variables.
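The split between initialization code and handler code might look like the following Python sketch. `LOOKUP_TABLE` and `INIT_TIMESTAMP` are hypothetical names; the dictionary stands in for something expensive to build, such as a loaded library or a client object.

```python
import time

# INIT code: runs once when the execution environment starts,
# before the first invocation of the handler.
INIT_TIMESTAMP = time.time()
LOOKUP_TABLE = {"a": 1, "b": 2}  # stand-in for an expensive load


def handler(event, context):
    # Handler code: runs on every invocation and reuses the objects
    # that were created during initialization.
    return {
        "value": LOOKUP_TABLE.get(event.get("key")),
        "initialized_at": INIT_TIMESTAMP,
    }
```

Two invocations served by the same execution environment report the same `initialized_at` value, because the module-level code is not run again.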

Benefits of event driven architecture

  1. Replace inefficient polling mechanism

    Polling results in resource wastage and delays in inter-service communication

  2. Replace webhooks

    Webhooks add complexity of custom authentication and authorization flows.

  3. Reduce complexity, increase agility

    Break down a monolith into a collection of loosely coupled event based services. Each service can be independently developed and deployed.

    Add buffers between services, and remove the complexity of the retry and circuit breaker logic that synchronous, tightly coupled architectures require.

  4. Improve scalability and extensibility

    Scale services independently, add new services as required.

Trade offs of event driven architecture

  1. Variable latency

    An event driven architecture comprises multiple independent services communicating over a network, which introduces variable latency.

    This architecture is not suitable for workloads like high-frequency trading or RPA applications that require consistent, low-latency (sub-millisecond) performance.

  2. Eventual consistency

    Event driven architectures are eventually consistent as an event flows through different services. Transaction management needs to be handled differently in comparison to a monolithic architecture.

    Many workloads combine eventually consistent and strongly consistent requirements. AWS services like DynamoDB support strongly consistent behavior at the cost of higher latency and increased resource usage.

  3. Returning value to caller

    Event driven architectures by their nature are not designed for a request-response integration pattern. This might not be suitable for interactive clients of the system.

  4. Increased complexity of monitoring & debugging

    An effective log management and monitoring solution is required for an event driven architecture. Amazon CloudWatch provides these capabilities.

    AWS X-Ray provides distributed tracing capability.

Design principles for serverless event driven applications

  1. Use purpose built AWS services

    Pattern            AWS service
    Queue              Amazon SQS
    Publish/Subscribe  Amazon SNS
    Event streams      Amazon Kinesis
    Event bus          Amazon EventBridge
    Orchestration      AWS Step Functions
    API                Amazon API Gateway

  2. Stateless functions

    Lambda functions should not depend on state or file-system data being retained across invocations.

    Initialization code is used to open connections to external data sources and load libraries in the execution environment. Execution environments are reused across invocations, but they should not be used to maintain internal state such as counters.

  3. Lambda function design

    Build concise functions with a single responsibility. Lambda functions should not be used for orchestration or workflow management.

    Global scope constants should be modeled as environment variables to allow updates without deployments.

    Secrets or sensitive information should be stored in AWS Secrets Manager or AWS Systems Manager Parameter Store.

  4. Adopt event driven design over batch based, polling design solutions.

  5. Use AWS Step Functions for Orchestration

    Move error handling, routing and branching logic out of function code and into state machines declared as JSON.

    Reduce complexity of Lambda functions, make workflows more robust and observable, add versioning support for workflows.

  6. Handling retries and failures

    Build idempotent functions, as they can be invoked multiple times in failure and retry scenarios.
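One way to make a function idempotent is to record the result for each request key and return the stored result when the same key arrives again. The Python sketch below assumes a hypothetical `idempotency_key` field on the event and uses an in-memory dictionary purely as a stand-in for a durable store. A real function would need an external store, because module-level memory is not shared between execution environments.

```python
# Stand-in for a durable store; NOT suitable for production use.
_processed = {}

# Records side effects so the example can show they happen only once.
CHARGES = []


def charge(event):
    """Hypothetical side effect that must not be repeated."""
    CHARGES.append(event["amount"])
    return {"charged": event["amount"]}


def handler(event, context):
    key = event["idempotency_key"]  # hypothetical field name
    if key in _processed:
        # Duplicate delivery or retry: return the earlier result
        # without repeating the side effect.
        return _processed[key]
    result = charge(event)
    _processed[key] = result
    return result
```

Invoking the handler twice with the same `idempotency_key` returns the same result and appends to `CHARGES` only once.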

Lambda anti-patterns

  1. Lambda monolith
  2. Lambda as orchestrator
  3. Recursive event patterns - Lambda executions in a loop
  4. Lambda functions calling other Lambda functions
  5. Synchronous waiting in a single Lambda function
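As an alternative to the "Lambda as orchestrator" anti-pattern, retry, error routing and branching can live in a Step Functions state machine declared as JSON. The Python sketch below builds such a definition as a dictionary and serializes it. The state names, the `validate-order` function ARN and the `$.valid` field are hypothetical; the structure uses the Amazon States Language fields `StartAt`, `States`, `Retry`, `Catch` and `Choices`.

```python
import json

definition = {
    "Comment": "Hypothetical order workflow",
    "StartAt": "ValidateOrder",
    "States": {
        "ValidateOrder": {
            "Type": "Task",
            # Hypothetical function ARN.
            "Resource": "arn:aws:lambda:us-east-2:123456789012:function:validate-order",
            # Retries are declared here instead of being coded in the function.
            "Retry": [
                {
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 2,
                    "MaxAttempts": 3,
                    "BackoffRate": 2.0,
                }
            ],
            # Error routing: any remaining error goes to the failure state.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "OrderFailed"}],
            "Next": "IsValid",
        },
        "IsValid": {
            "Type": "Choice",
            # Branching on the task output.
            "Choices": [
                {"Variable": "$.valid", "BooleanEquals": True, "Next": "OrderAccepted"}
            ],
            "Default": "OrderFailed",
        },
        "OrderAccepted": {"Type": "Succeed"},
        "OrderFailed": {"Type": "Fail", "Error": "OrderRejected", "Cause": "Validation failed"},
    },
}


def transition_targets(defn):
    """Collect every state name referenced by Next, Default, Catch or Choices."""
    targets = set()
    for state in defn["States"].values():
        for field in ("Next", "Default"):
            if field in state:
                targets.add(state[field])
        for rule in state.get("Catch", []) + state.get("Choices", []):
            targets.add(rule["Next"])
    return targets


if __name__ == "__main__":
    print(json.dumps(definition, indent=2))
```

The `transition_targets` helper is only a local sanity check that every referenced state exists; it is not part of any AWS API.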

Lambda concurrency controls

  1. Reserved concurrency

    The maximum number of concurrent instances for a function. No other function can use this reserved concurrency. There is no charge for configuring reserved concurrency. It applies to the function as a whole, including its versions and aliases.

  2. Provisioned concurrency

    Initializes a requested number of execution environments so that they are available to process invocation requests immediately. Configuring provisioned concurrency incurs charges to your AWS account.

    Lambda also integrates with Application Auto Scaling, which can be used to manage provisioned concurrency on a schedule or based on utilization.

    Provisioned concurrency counts towards a function's reserved concurrency and regional quotas.

Accessing VPC resources

You can configure a Lambda function to connect to private subnets in a VPC in your account. When you connect a function to a VPC, Lambda creates an elastic network interface for each subnet in your function's VPC configuration.

Lambda functions can't connect directly to a VPC with dedicated instance tenancy. To connect to resources in a dedicated VPC, peer it to a second VPC with default tenancy.

aws lambda create-function \
--function-name my-function \
--runtime nodejs20.x \
--handler index.handler \
--zip-file fileb://function.zip \
--role arn:aws:iam::accountId:role/role-name \
--vpc-config SubnetIds=subnet1,subnet2,SecurityGroupIds=securityGroup1

Interface VPC endpoint for Lambda

To invoke the Lambda API or Lambda functions from resources in your VPC without routing traffic over the internet, create a Lambda interface VPC endpoint. Each interface endpoint is represented by one or more elastic network interfaces in your subnets.

  1. Keep-alive for persistent connections

    Lambda purges idle connections over time. Attempting to reuse an idle connection when invoking a function results in a connection error. Use a keep-alive directive to maintain persistent connections.

  2. Billing considerations

    There is no additional cost to access a Lambda function through an interface endpoint. Standard pricing for AWS PrivateLink applies to interface endpoints for Lambda. Your AWS account is billed for every hour an interface endpoint is provisioned in each AZ and for the data processed through the interface endpoint.

  3. VPC Peering

    VPC peering is a networking connection between two VPCs. A VPC peering connection can also be established between VPCs in different accounts or Regions. Once a peering connection has been established, a VPC interface endpoint for Lambda created in one VPC can be accessed by resources in the other VPC.

Configuring database access - RDS Proxy

You can create an Amazon RDS Proxy for your functions. A database proxy manages a pool of database connections and relays queries from a function.

Invocation modes

Synchronous invocation

A synchronous call goes from the caller to Lambda and on to the function. When a function is invoked in synchronous mode, Lambda waits for the function to complete and then returns the function's response or error object to the caller.

The caller needs to handle retries in cases of timeouts, throttling, and service or function errors.

The AWS CLI and SDKs automatically retry on client timeouts, throttling and service errors.
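For a caller that has to manage its own retries, a common approach is exponential backoff with jitter. The Python sketch below is a generic illustration and not the retry logic built into the AWS CLI or SDKs. `invoke`, `ThrottledError` and the parameter names are hypothetical; `invoke` stands for whatever callable performs the synchronous invocation.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error raised by `invoke` when the call is throttled."""


def invoke_with_retries(invoke, payload, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call invoke(payload), retrying throttled calls with backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return invoke(payload)
        except ThrottledError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # The delay ceiling doubles on each attempt; the actual wait is
            # picked at random below the ceiling so callers do not retry in lockstep.
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
```

The `sleep` parameter is injectable so that the function can be exercised without real waiting.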

Asynchronous invocation

Several AWS services, like SNS and S3, invoke Lambda functions asynchronously.

For asynchronous invocations, Lambda places the event on an internal queue and returns a success response to the client without additional information. A separate process reads events off the queue and invokes the function.

Function error

In case of a function error, Lambda attempts to run the function two more times, with a one-minute wait before the second attempt and a two-minute wait before the third attempt. A function error may be caused by the function code or by the runtime.

For throttling errors (429) and system errors (500), Lambda returns the event to the queue and attempts to run the function again for up to six hours.

It is possible for a function to receive an event multiple times as the internal queue is eventually consistent. Functions should be able to handle duplicate events.

It is possible for an event to age out and be deleted from the internal queue under heavy loads. Ensure that the function is configured with sufficient concurrency to handle heavy loads.

Invocation record

An invocation record contains details about the request and response in JSON. You can configure separate destinations for events that have been processed successfully and for events that have failed all processing attempts. Destinations for invocation records:

  • SQS
  • SNS
  • EventBridge
  • Lambda

Alternatively, SQS or SNS can be configured as a dead-letter queue for discarded events. For a dead-letter queue, Lambda sends only the content of the event, without details of the response.

Error handling configuration for asynchronous invocation

  • Maximum age of event (up to 6 hours)
  • Retry attempts (0 to 2)

Event source mapping

Services that Lambda can read from through an event source mapping

  • DynamoDB stream
  • SQS
  • Kinesis
  • MSK
  • self-managed Apache Kafka

SNS and EventBridge invoke functions asynchronously instead of through an event source mapping.

aws lambda create-event-source-mapping \
--function-name my-function \
--batch-size 500 \
--starting-position LATEST \
--event-source-arn arn:aws:dynamodb:us-east-2:123456789012:table/my-table/stream/2019-06-10T19:26:16.525

If the function returns an error, the entire batch is reprocessed until the function succeeds or the items in the batch expire.
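The whole-batch retry behavior can be seen from the handler's side in the Python sketch below, which processes a batch of DynamoDB stream records. The `orderId` attribute and the `process_record` helper are hypothetical; the `eventName` and `dynamodb.NewImage` fields follow the DynamoDB stream record format.

```python
def process_record(record):
    """Hypothetical per-record logic: return the order ID of an inserted item."""
    if record["eventName"] != "INSERT":
        return None
    return record["dynamodb"]["NewImage"]["orderId"]["S"]


def handler(event, context):
    processed = []
    for record in event["Records"]:
        # An exception raised here fails the whole invocation, so the
        # entire batch is retried, including records already handled above.
        order_id = process_record(record)
        if order_id is not None:
            processed.append(order_id)
    return {"processed": processed}
```

Because records handled before the failure are delivered again on retry, the per-record logic should be safe to repeat.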

Lambda supports in-order processing for Kinesis, Kafka and SQS FIFO. For Kinesis and Kafka, in-order processing is available at the shard or partition level. For SQS FIFO, in-order processing is available at the message group level.