Lambda@Edge AVIF conversion on CloudFront

When you cannot pre-generate every image variant at build time — user-uploaded content, a large legacy JPEG library, an admin who drops files straight into S3 — you need CloudFront to produce the modern format on demand. A CloudFront Function cannot do this: it has no filesystem, no network, and a sub-millisecond budget. Byte transformation belongs in Lambda@Edge on the origin-response event, where a full Node.js runtime can pull the original from S3, transcode it with sharp, and hand the derivative back for CloudFront to cache. This guide is part of AWS CloudFront Cache Behaviors for Media within CDN & Edge Media Delivery, and shows the exact function, the deployment dance, and the failure modes that catch every first-time deployer.

Why origin-response, and why the Accept header is awkward here

CloudFront fires Lambda@Edge on four events: viewer-request, origin-request, origin-response, and viewer-response. Transcoding belongs on origin-response — after CloudFront has fetched the original bytes from S3 but before it stores the result in its cache. Do it there and the derivative is what gets cached, so the expensive sharp encode runs once per POP per variant, not on every request.

The awkward part: by the time the origin-response event fires, the viewer’s original Accept header is not automatically present on the event object. Origin-response sees the origin request and the origin response, not the viewer request, so the header is only there if CloudFront forwarded it to the origin. There are two ways to carry viewer context through to the function:

  1. Normalize Accept in a viewer-request CloudFront Function (as in the Cache Policy guide) and make sure the Origin Request Policy forwards Accept. The function can then read request.headers.accept from the event, because origin-response exposes the origin request alongside the response.
  2. Add the CloudFront-Viewer-* headers to the Origin Request Policy so viewer context rides along. Those headers describe the viewer's connection and location rather than the Accept value, so for format negotiation forwarding the normalized Accept is simpler and keeps the cache key aligned.

This guide uses approach 1: a viewer-request function normalizes Accept to exactly one of image/avif, image/webp, or image/jpeg, an Origin Request Policy forwards it, and the origin-response Lambda reads that single token. Because the same normalized token is in the cache key, each derivative is cached under exactly the key the next matching viewer will look up.
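The normalization step can be sketched as a viewer-request CloudFront Function. This is a minimal illustration, not the function from the Cache Policy guide: the helper name normalizeAccept is hypothetical, and q-values are deliberately ignored, so a format counts as wanted whenever it is listed.

```javascript
// Viewer-request CloudFront Function (sketch, assumptions noted above).
// Collapses Accept to exactly one of three tokens so the cache key has
// at most three variants per image.
function normalizeAccept(value) {
  var accept = (value || '').toLowerCase();
  if (accept.indexOf('image/avif') !== -1) return 'image/avif';
  if (accept.indexOf('image/webp') !== -1) return 'image/webp';
  return 'image/jpeg'; // no modern format listed, or no Accept header at all
}

function handler(event) {
  var request = event.request;
  // CloudFront Functions expose each header as { value: '...' } under a lowercase name.
  var current = request.headers.accept ? request.headers.accept.value : '';
  request.headers.accept = { value: normalizeAccept(current) };
  return request;
}
```

Preferring AVIF over WebP whenever both are listed matches the order in which the origin-response handler checks the token.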

Figure: Origin-response AVIF transcoding flow. The viewer request is normalized to an Accept token; on a cache miss CloudFront forwards Accept to the S3 origin, which returns the hero.jpg bytes; the origin-response Lambda fetches the original JPEG, transcodes it to AVIF or WebP with sharp, and replaces the body; CloudFront caches the derivative under the Accept-keyed object, so the next matching viewer on the same POP gets a cache hit with no re-encode.

Prerequisite checklist

Exact solution

Step 1 — The origin-response function

// index.mjs — Lambda@Edge, origin-response trigger.
// Runs after CloudFront fetches the original from S3, before caching.
// Transcodes to the format named by the normalized Accept token.
import sharp from 'sharp';
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';

const s3 = new S3Client({ region: 'eu-west-1' }); // region of the origin bucket
const BUCKET = 'my-media-originals';

// sharp encode settings kept modest so a cold Lambda still finishes < 5s.
const AVIF = { quality: 50, effort: 4 };  // effort 4 balances size vs edge CPU budget
const WEBP = { quality: 80, effort: 4 };  // WebP quality scale is conventional (higher=better)

export const handler = async (event) => {
  const cf = event.Records[0].cf;
  const request = cf.request;
  const response = cf.response;

  // Only transcode successful image responses; pass others through untouched.
  if (response.status !== '200') return response;

  // The viewer-request function normalized Accept to one token; the Origin
  // Request Policy forwarded it, so it is present on the origin request here.
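  const accept = request.headers['accept']?.[0]?.value ?? 'image/jpeg';
  // Only the two modern tokens trigger a transcode. image/jpeg, or any value that
  // slipped through un-normalized, serves the original untouched.
  if (accept !== 'image/avif' && accept !== 'image/webp') return response;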

  // Fetch the original bytes from S3. request.uri is like "/images/hero.jpg" and may be
  // percent-encoded, so decode it before using it as the object key.
  const key = decodeURIComponent(request.uri.replace(/^\//, ''));
  const obj = await s3.send(new GetObjectCommand({ Bucket: BUCKET, Key: key }));
  const input = Buffer.from(await obj.Body.transformToByteArray());

  // Lambda@Edge caps a generated origin-response at 1 MB. The size check runs AFTER
  // the encode, further down; very large masters should be resized or pre-resized.
  const pipe = sharp(input).rotate(); // rotate() with no argument applies EXIF orientation
  let out, contentType;
  if (accept === 'image/avif') {
    out = await pipe.avif(AVIF).toBuffer();
    contentType = 'image/avif';
  } else {
    out = await pipe.webp(WEBP).toBuffer();
    contentType = 'image/webp';
  }

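  // The 1 MB limit covers headers plus the base64 body, and base64 is about a third
  // larger than the raw buffer, so measure the encoded string, not out.length.
  const body = out.toString('base64');
  if (body.length > 1_000_000) return response; // over the ceiling → fall back to original

  // Replace the response body + headers. Body must be base64 for binary payloads.
  response.body = body;
  response.bodyEncoding = 'base64';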
  response.headers['content-type']  = [{ key: 'Content-Type',  value: contentType }];
  response.headers['cache-control'] = [{ key: 'Cache-Control', value: 'public, max-age=31536000, immutable' }];
  return response;
};

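Warning: Lambda@Edge limits a response generated on an origin-facing event to 1 MB, headers and base64-encoded body included (the viewer-facing limit is 40 KB). Base64 adds about a third to the payload, so the derivative itself has to stay under roughly 750 KB. A 4000×3000 master can blow past that even as AVIF. Resize to a sane display dimension inside the same sharp pipeline (pipe.resize({ width: 1600 })) or pre-resize at upload. The size guard in Step 1 prevents a hard error, but if you skip the resize it silently serves the un-transcoded original for every oversized image.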
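The arithmetic behind that budget can be sketched in a few lines. The helper names and the 4,000-byte header allowance are illustrative assumptions, and the 1,000,000 threshold simply mirrors the guard used in Step 1 rather than an exact service figure.

```javascript
// Length of the base64 string produced from `rawBytes` bytes (padding included).
function base64Length(rawBytes) {
  return 4 * Math.ceil(rawBytes / 3);
}

// Largest raw buffer whose base64 form fits in `limit` characters once
// `headerBytes` have been reserved for the response headers.
function maxRawBytes(limit, headerBytes) {
  return Math.floor((limit - headerBytes) / 4) * 3;
}

const LIMIT = 1_000_000;        // threshold used by the Step 1 guard
const HEADER_ALLOWANCE = 4_000; // hypothetical allowance for headers

// True when a derivative of `rawBytes` bytes can be returned from the function.
function fitsEdgeLimit(rawBytes) {
  return base64Length(rawBytes) + HEADER_ALLOWANCE <= LIMIT;
}

console.log(maxRawBytes(LIMIT, HEADER_ALLOWANCE), fitsEdgeLimit(800_000));
```

With no header allowance the ceiling works out to 750,000 raw bytes, which is why a raw-buffer check against 1,000,000 lets through derivatives that CloudFront will still reject.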

Step 2 — Package with a runtime-correct sharp

# sharp ships prebuilt binaries per platform. Install the one matching the Lambda
# runtime and architecture, NOT your dev machine, or the module fails to load at runtime.
npm install --os=linux --cpu=x64 sharp   # Lambda@Edge runs x86_64 only; arm64 functions are not supported

# Zip the handler + node_modules together. Lambda@Edge does not support Layers,
# so sharp has to ship inside the deployment zip.
zip -r function.zip index.mjs node_modules

Step 3 — Create, publish a version, and associate

# Lambda@Edge MUST be created in us-east-1. The association references a numbered
# version (an ARN ending in :N), never $LATEST.
aws lambda create-function --region us-east-1 \
  --function-name avif-edge-transcoder \
  --runtime nodejs20.x --handler index.handler \
  --timeout 5 --memory-size 512 \
  --role arn:aws:iam::111122223333:role/lambda-edge-transcoder \
  --zip-file fileb://function.zip

# Publish an immutable version; capture its ARN for the association.
VERSION_ARN=$(aws lambda publish-version --region us-east-1 \
  --function-name avif-edge-transcoder \
  --query 'FunctionArn' --output text)

# Associate on the /images/* behavior, EventType=origin-response.
# Edit the distribution config's LambdaFunctionAssociations, then update-distribution
# with the current --if-match ETag (as in the parent guide's Step 3).
echo "Associate $VERSION_ARN on origin-response"
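The config edit itself can be sketched as a small transform over the DistributionConfig JSON. This assumes you have already saved the DistributionConfig object returned by aws cloudfront get-distribution-config; the function name associateOriginResponse is hypothetical.

```javascript
// Returns a copy of a CloudFront DistributionConfig with `versionArn` attached to the
// origin-response event of the cache behavior whose PathPattern equals `pathPattern`.
function associateOriginResponse(config, pathPattern, versionArn) {
  // Associations must reference a numbered version (ARN ending in :N), never $LATEST.
  if (!/:\d+$/.test(versionArn)) {
    throw new Error('Expected a numbered version ARN, got: ' + versionArn);
  }
  const next = JSON.parse(JSON.stringify(config)); // deep copy; input stays untouched
  const behaviors = (next.CacheBehaviors && next.CacheBehaviors.Items) || [];
  const behavior = behaviors.find((b) => b.PathPattern === pathPattern);
  if (!behavior) throw new Error('No cache behavior for ' + pathPattern);

  const existing = (behavior.LambdaFunctionAssociations &&
                    behavior.LambdaFunctionAssociations.Items) || [];
  // Only one function per event type: drop any previous origin-response entry.
  const items = existing.filter((a) => a.EventType !== 'origin-response');
  items.push({
    LambdaFunctionARN: versionArn,
    EventType: 'origin-response',
    IncludeBody: false, // request bodies are only relevant to request events
  });
  behavior.LambdaFunctionAssociations = { Quantity: items.length, Items: items };
  return next;
}
```

Write the returned object to a file and pass it to aws cloudfront update-distribution with --distribution-config and the --if-match ETag from the same get-distribution-config call.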

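Tradeoff: 512 MB of memory gives sharp enough headroom for a 1600px AVIF encode in roughly 300–700 ms on a warm container. Dropping to 128 MB saves cost but can roughly triple encode time, because Lambda allocates CPU in proportion to memory, and it risks hitting the 5 s timeout configured in Step 3 on a cold container. Origin-facing triggers can be given a timeout of up to 30 s if encodes need more room. For image transcoding, treat 512 MB as the practical floor.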

Verification

# AVIF-capable request → expect content-type: image/avif and, after warm-up,
# x-cache: Hit from cloudfront (the derivative, not the JPEG, is cached).
curl -sI -H 'Accept: image/avif,image/webp,*/*;q=0.8' \
  https://d111111abcdef8.cloudfront.net/images/hero.jpg \
  | grep -iE 'content-type|x-cache|content-length|age'

# WebP-only (Safari 14/15) → expect content-type: image/webp from a separate object.
curl -sI -H 'Accept: image/webp,*/*;q=0.8' \
  https://d111111abcdef8.cloudfront.net/images/hero.jpg \
  | grep -iE 'content-type|x-cache|content-length'

# Confirm the derivative is smaller than the JPEG master:
curl -s -H 'Accept: image/avif,*/*' -o /dev/null -w '%{size_download}\n' \
  https://d111111abcdef8.cloudfront.net/images/hero.jpg

Watch execution in CloudWatch Logs. Lambda@Edge writes logs in the AWS Region closest to the edge location that ran the function, which is not necessarily us-east-1. The log group is always named /aws/lambda/us-east-1.avif-edge-transcoder, so look for that name in the Region near your test client (e.g. eu-west-1). If a request never triggers the function, either the object was already cached or the association is on the wrong event or the wrong behavior.

Common mistakes

1. Reading Accept off the wrong event

Anti-pattern: Expecting the viewer’s Accept on the origin-response event without forwarding it.

Effect: request.headers.accept is undefined on the origin request, so the handler falls back to its default and every image is served in one format (with the Step 1 handler, the un-transcoded original).

Fix: Normalize Accept in a viewer-request CloudFront Function, include the header in the cache key, and forward it via the Origin Request Policy. The token is then present on the origin request that the origin-response event exposes.
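A defensive read of the token might look like the sketch below. The helper name readFormatToken is hypothetical; the event shape (headers as arrays of { key, value } under lowercase names) is the Lambda@Edge format used by the Step 1 handler, which differs from the single { value } object that CloudFront Functions use.

```javascript
// Tokens that should trigger a transcode; anything else serves the original.
const MODERN = new Set(['image/avif', 'image/webp']);

// Reads the normalized Accept token from a Lambda@Edge origin-response event.
function readFormatToken(event) {
  const request = event.Records[0].cf.request;
  const entry = request.headers['accept'];
  const value = entry && entry[0] ? entry[0].value : undefined;
  if (value === undefined) {
    // Most likely the cache policy or Origin Request Policy is not forwarding Accept.
    console.warn('Accept missing on origin request for ' + request.uri);
    return 'image/jpeg';
  }
  return MODERN.has(value) ? value : 'image/jpeg';
}
```

Logging the missing header makes the misconfiguration visible in CloudWatch instead of showing up only as images that never change format.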

2. Packaging the wrong sharp binary

Anti-pattern: npm install sharp on macOS/Windows and zipping the result.

Effect: Error: Could not load the "sharp" module using the linux-x64 runtime at cold start; every request 502s.

Fix: Install with --os=linux --cpu=x64, or build inside a Lambda-compatible Linux x86_64 container. Lambda@Edge supports neither arm64 functions nor Layers, so the x64 binary has to ship inside the deployment zip.

3. Ignoring the 1 MB body ceiling

Anti-pattern: Transcoding a full-resolution master and returning it directly.

Effect: CloudFront rejects the response with LambdaLimitExceeded and returns an error to the viewer. Only large files fail, so the problem looks intermittent.

Fix: Resize inside the sharp pipeline and check the size of the base64-encoded body (about a third larger than the raw buffer) against the 1 MB limit, falling back to the original as in Step 1.

4. Underestimating cold starts and replication lag

Anti-pattern: Deploying a new version and testing immediately, at 128 MB.

Effect: First requests time out (cold sharp init on low memory), and the old version still serves from POPs that have not replicated yet.

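Fix: Use ≥512 MB and wait for CloudFront to finish replicating the new version (distribution status Deployed) before judging behaviour. Replication is global and takes minutes. Lambda@Edge has no provisioned concurrency, so budget for an occasional cold start in the timeout rather than relying on the function staying warm.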

5. Region mismatch on the S3 client

Anti-pattern: Hardcoding us-east-1 in the S3 client when the bucket lives elsewhere.

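Effect: With AWS SDK v3 the GetObject typically fails with a PermanentRedirect error unless region redirects are enabled. Where the call does succeed, it adds an extra round trip and cross-region cost on every miss, or fails on a bucket with region-locked policies.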

Fix: Point the S3 client at the bucket’s region; the Lambda runs at the edge but S3 is regional.