Cold Starts in AWS Lambda with Container Images in 2023

TL;DR: they're still slow.

A couple of years ago, Mikhail Shilkov wrote a blog post detailing AWS Lambda cold starts using Docker images. Let's revisit that in 2023.

I won't be as thorough as that post; he did excellent research across multiple runtimes. As a JavaScript developer, I'm much more interested in Node than in the other runtimes.

What is a cold start?

In AWS Lambda, the first time your function is invoked, an execution environment starts up to run it. Think of this as a Docker container; AWS won't say, but it may be exactly that. After your function executes, the environment stays warm for an unspecified period, and subsequent executions will be very fast. An execution environment can only handle a single execution at a time, so if your function is invoked again while the first invocation is still running, a new concurrent environment will start cold.

How does Lambda work?

You may already know this, so feel free to skip ahead. But let's take a deeper dive into what Lambda is.

Any code that takes an input and ultimately returns a single output can be run as a Lambda function. This could be as simple as a few lines or a full application. Indeed, the Serverless Framework helps you build an entire HTTP API by deploying each URL handler as a Lambda function. Lambda is also suitable for other event-driven architectures, like publisher-subscriber patterns.

It can be as simple as:

async function handler(event) {
  return "Hello world";
}
module.exports = { handler };

Or it can be a complex application with many classes and imported libraries - as long as the entry point is, like the above, a function that takes an input and returns an output.

Under the hood

A Lambda bundle can be a zip file. What Lambda does with that zip file depends greatly on the chosen runtime (language, version, etc.). For example, consider the Node.js 18.x runtime. AWS essentially starts a fairly generic Docker image containing Node.js, with your code mounted into that container. The runtime will require your application's main file, wait for it to initialize, and then call your handler function.
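
To make that concrete, here's a rough sketch of the loop the runtime performs, based on AWS's documented runtime API. This is not AWS's actual implementation; the endpoint paths come from the Lambda custom runtime docs, and error handling is omitted:

// Conceptual sketch only - not AWS's real code. "/var/task" is where Lambda
// places your bundle; AWS_LAMBDA_RUNTIME_API is set inside the environment.
const { handler } = await import("/var/task/index.js");
const api = process.env.AWS_LAMBDA_RUNTIME_API;

while (true) {
  // Long-poll the runtime API for the next event.
  const next = await fetch(`http://${api}/2018-06-01/runtime/invocation/next`);
  const requestId = next.headers.get("Lambda-Runtime-Aws-Request-Id");
  const event = await next.json();

  // Call the user's handler, then post the result back.
  const result = await handler(event);
  await fetch(`http://${api}/2018-06-01/runtime/invocation/${requestId}/response`, {
    method: "POST",
    body: JSON.stringify(result),
  });
}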

After your function completes, the container will remain running as a service for an unspecified period. During that time, it may receive more requests to run your function. It will only handle a single invocation at a time, so a subsequent request while the first one is still running will cause Lambda to spin up a second concurrent environment. The number of concurrent environments is limited by your AWS account's concurrency limit or your Lambda function's concurrency reservation. Warm environments can respond very quickly, often handling events in less than 10ms.

Because the same environment is reused for multiple invocations, you can take advantage of this by keeping external database connections open or caching data in memory. Subsequent invocations will reuse that open connection and cache.
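
The trick is that anything created at module scope is initialized once per execution environment and survives across warm invocations. A minimal sketch, assuming a hypothetical Postgres pool via the pg library (not something from this article's repo):

import { Pool } from "pg"; // hypothetical dependency, for illustration

// Module scope: runs once per (cold) execution environment.
const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const cache = new Map<string, unknown>();

export async function handler(event: { id: string }) {
  // Warm invocations reuse the open pool and any previously cached rows.
  if (!cache.has(event.id)) {
    const { rows } = await pool.query("SELECT * FROM items WHERE id = $1", [event.id]);
    cache.set(event.id, rows[0]);
  }
  return cache.get(event.id);
}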

Instead of providing your code as a zip file, you can provide an entire Docker image, as long as the image contains a runtime interface client or conforms to the Lambda runtime API (a minimal sketch follows this list). This provides a few benefits:

  • Use runtime versions that AWS no longer supports (outdated) or does not yet support (cutting-edge).

  • Use languages not supported by AWS.

  • Use your existing Docker image that you use on other platforms like ECS or Kubernetes.

  • Work around the 250MB limit of zip-based Lambda functions. Container images can be much larger.
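
For reference, a minimal container-image setup might look like the Dockerfile below, assuming AWS's public Node.js base image (which already bundles a runtime interface client). The paths are illustrative, not from any particular repo:

# Illustrative Dockerfile - the COPY path is an assumption.
FROM public.ecr.aws/lambda/nodejs:18

# Copy the compiled handler into the image's task root.
COPY dist/index.js ${LAMBDA_TASK_ROOT}/

# Tell the runtime interface client which module.export to invoke.
CMD ["index.handler"]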

Cold starts can be painful

The initial cold start can be relatively slow, depending on the runtime / container image. Consider an HTTP API where every route is a separate Lambda function. Cold starts are inevitable, so while you might serve most requests within 10ms, some consumers are going to experience 200-500ms delays on their HTTP requests. This can be jarring to consumers of your application.

On the other hand, it may not matter for event-driven architectures. If your Lambda handles an SQS queue, then you probably don't care whether each message is handled in 10ms or 500ms, as long as the queue is processed quickly overall.

Let's dive into the test

AWS says:

Lambda also optimizes the image and caches it close to where the function runs so cold start times are the same as for .zip archives.

So, the claim is that the cold start time for container images is the same as for .zip archives. In my testing, that isn't true.

On average, container images take twice as long to start, and up to 3x longer on the high end. For an HTTP handler, a 200ms start time might be acceptable once in a while, but a 700ms start time is brutal.

Methodology

Clone the git repo here.

This repo creates a lambda function with a very simple handler:

import { Context } from "aws-lambda";
import { setTimeout } from "timers/promises";

export async function handler(event: unknown, context: Context): Promise<void> {
  // Keep the execution environment busy for 5 seconds, then log the event.
  await setTimeout(5000);
  console.log(JSON.stringify(event));
}

The handler simply logs out the entire event as JSON, after first waiting 5 seconds. This wait gives you the chance to start multiple concurrent invocations of the Lambda function, forcing some cold starts, as in the sketch below.
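
A driver script along these lines can fire that burst of overlapping invocations (the function name is a placeholder, and the repo may do this differently):

import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

// Fire 10 invocations at once (ESM top-level await). Each execution
// environment handles only one invocation at a time, so most of these
// overlapping calls will hit cold environments.
const client = new LambdaClient({});
await Promise.all(
  Array.from({ length: 10 }, (_, i) =>
    client.send(
      new InvokeCommand({
        FunctionName: "cold-start-test", // placeholder name
        Payload: Buffer.from(JSON.stringify({ invocation: i })),
      })
    )
  )
);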

I created 4 Lambda functions:

  1. A .zip bundle with just this very simple handler.

  2. A .zip bundle with the handler plus a very large file, still under the total bundle-size limit for zip-based Lambdas. The large file contains random bytes, making it hard to compress.

  3. A Docker image with just this very simple handler.

  4. A Docker image with the handler and the large file.

I presume that a large factor in cold start time is the time to pull the Docker image or unzip the bundle. This is at least somewhat evident in the results.

Determining cold start time

AWS does not produce a metric for cold starts. Frankly, I'm not sure why; I suspect AWS would rather gloss over cold starts as an observability concern. The only way to measure them is via the REPORT lines that appear in Lambda logs: for a cold start, the REPORT line contains an "Init Duration" field, while warm starts do not have it.
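
For reference, a cold-start REPORT line looks roughly like this (the numbers are invented for illustration; only the field names matter):

REPORT RequestId: 3f8a2b1c-... Duration: 5012.34 ms Billed Duration: 5013 ms Memory Size: 128 MB Max Memory Used: 58 MB Init Duration: 412.56 ms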

You can query this from CloudWatch Insights:

fields @timestamp, @initDuration, @log
| filter @type = "REPORT"
| filter @initDuration > 0
| sort @timestamp desc
| stats avg(@initDuration) by @log

This gives you the average init duration for each function, grouped by log group.

Conclusion

AWS Lambda is a very useful service. It's fantastic for event handling, and it can work well for handling HTTP events (incoming requests).

Sometimes 500 milliseconds matters, and sometimes it doesn't.

You can use API Gateway + Lambda to build an entire HTTP API and even handle WebSocket connections. In those cases, an occasional extra 200ms might be okay for your users, but an extra 700ms might be disruptive and jittery.

For processing an SQS queue, events in a Kinesis stream, or any of the many events that come through an event bus, the extra jitter probably doesn't matter. Say you have a containerized application already running in ECS or k8s: it can be very easy to build an extra entrypoint into your application for events to be processed by Lambda. Reusing the same container image for Lambda can be very handy, as long as you're aware of the extra jitter.