Using AWS Lambda Layers to implement sub-millisecond static caches
In an AWS serverless environment, the default solution for very low latency data access is usually DynamoDB.
Read response times depend on a variety of factors, but for round numbers let's call it 10–20 milliseconds (what I have experienced) to perform a keyed lookup from Lambda. What's not to like?
This article discusses an approach to achieve sub-millisecond performance on datasets of many thousands of records. This opens the door to more use cases, offers more control of data and access, and of course there's raw performance.
TL;DR — Example
- 7000+ Keys
- 10MB total JavaScript file size (in a Lambda Layer)
- Lambda Memory Settings: 1536 MB
- Lookup performance: Less than 1/10 millisecond
Environment
AWS Lambda, Lambda Layers, Node.js 12.x
Sample Use Case
Determine entities in a chatbot scenario.
Entities may include: airports, cities, attractions, zip codes, profanity, stop words etc. Each dataset contains many hundreds or thousands of records and tens of megabytes of raw data.
For an incoming (tokenized) user input, perform many lookups to determine whether a word, or sequence of words, is a valid entity. This is a lookup-intensive task, so typical cache implementations could get slow. Shaving a few hundred milliseconds from such a task is what we're going for.
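As a sketch of that lookup loop (the entity data is inlined here for illustration; in the article's setup it would come from a layer module):

```javascript
// Scan tokenized input for known entities, longest phrase first.
// The `cities` object stands in for a layer-backed dataset.
const cities = { 'denver': true, 'new york': true, 'dallas': true };

function findEntities(tokens, maxWords = 2) {
  const found = [];
  for (let i = 0; i < tokens.length; i++) {
    // Try the longest candidate phrase first, then shorter ones
    for (let len = maxWords; len >= 1; len--) {
      const phrase = tokens.slice(i, i + len).join(' ');
      if (cities[phrase]) {
        found.push(phrase);
        break;
      }
    }
  }
  return found;
}

console.log(findEntities(['flights', 'to', 'new', 'york', 'from', 'dallas']));
```

Each candidate phrase is a single keyed lookup against an in-memory object, which is where the sub-millisecond numbers below come from.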
Data Preparation
To prepare the ground for creating a Lambda Layer, in your own filesystem:
- Create and cd to directory: nodejs
- npm init -y
- Create and cd to directory: node_modules
- Create and cd to directory: cacheData (named whatever you like)
- Create a JavaScript file, let’s call it: airportData.js
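The steps above can be scripted as follows (directory and file names match the examples in this article):

```shell
# Build the layer's directory skeleton
mkdir -p nodejs/node_modules/cacheData
(cd nodejs && npm init -y)                            # creates package.json
touch nodejs/node_modules/cacheData/airportData.js    # fill with the data below
```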
Sample airportData.js
module.exports = {
airports: {
"DEN": {
"airportName": "Denver International Airport",
"latitude": "39.86169815",
"longitude": "-104.6729965,
"countryCode": "US"
},
"DFW": {
"airportName": "Dallas Fort Worth Intl Airport",
"latitude": "32.896801",
"longitude": "-97.038002",
"countryCode": "US"
}
}
}
Create the Lambda Layer
Zip the above nodejs directory and name the zip file whatever you like; let’s go with layer1.zip for our purposes.
- In AWS Lambda dashboard: Create Layer
- Enter the layer name, e.g. layer1 (whatever you like)
- Upload the (layer1.zip) file
- Specify the runtime, e.g. Node.js 12.x
- Click Create button
Accessing the data from Lambda
In your calling function…
// Lambda knows how to find this
const airportData = require('cacheData/airportData'); // Reference the JS object containing the keys/data
const airports = airportData.airports;

let airport = airports['DFW']; // Keyed lookup
console.log('DFW Airport: ' + JSON.stringify(airport, null, 2));
Grow the Cache
Add more files and/or directories under this nodejs/node_modules directory structure to expand the number of cached data objects, then re-zip and upload into the already-created layer by creating a new version.
TIP: Don’t forget to point your lambda function to this newly created version to access the new/updated files.
Timing Utility
To verify your (sub millisecond) performance:
const startTimer = () => {
  return process.hrtime();
}

const reportTimer = (start) => {
  const endTime = process.hrtime(start);
  const seconds = endTime[0];
  // convert nanoseconds to milliseconds
  const ms = endTime[1] / 1000000;
  return `${seconds}s , ${ms}ms`;
}

const start1 = startTimer();
let airport = airports['DFW'];
console.log(reportTimer(start1) + ' for DFW Airport: ' + JSON.stringify(airport, null, 2));
NOTES
The size of the data objects may impact cold-start performance, although the effect should be minimal. I’ll leave that for you to verify in your own scenario.
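If cold start does become a concern, one mitigation is to defer loading a data module until its first use. A minimal sketch (the inline loader here stands in for require('cacheData/airportData'), which only exists inside the Lambda environment):

```javascript
// Wrap a loader so the (potentially large) data module is only
// evaluated on the first lookup, not during cold start.
function lazy(loader) {
  let cache;
  return () => {
    if (cache === undefined) cache = loader();
    return cache;
  };
}

// In Lambda this would be:
//   const getAirports = lazy(() => require('cacheData/airportData').airports);
const getAirports = lazy(() => ({
  DFW: { airportName: 'Dallas Fort Worth Intl Airport' }
}));

console.log(getAirports().DFW.airportName);
```

Subsequent invocations of a warm container reuse the cached object, so only the very first lookup pays the load cost.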
In my app, I have 14 JS objects (files) spread across 6 directories.
- Zip file size: 5MB
- Underlying data size: 30MB
- Cold start typically < 2s.
- Lookup performance: sub millisecond
Counterbalance
In case you think I’m obsessed with speed, here is a loaf of bread I baked recently. End to end, prep-to-completion time — approx 16 hours. Some things are worth waiting for. Response times on the Internet are not like bread!