I am a software engineer by trade. I am more at home writing software than I am with the operations serving it.

However, as tooling continues to improve, most software solutions have become a combination of code and configuration of best-in-class hosting tools.

I wanted to familiarize myself with some of AWS's service offerings.

Project overview

Goal:

Familiarize myself with AWS Lambda, Kinesis, API Gateway, and RDS. It quickly became apparent that I also needed to understand IAM, EC2, VPC, and other configuration areas in AWS.

Task:

Our simple task is to create a system that transforms first / last name combinations into full names, then stores them in a persistent data store for later retrieval. The system needs to accept both ad hoc requests via an API gateway and streams of events piped in from another service.
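The core transformation itself is one line of JavaScript; everything else in this project is plumbing around it. A minimal sketch:

```javascript
// The whole "business logic": combine a first and last name into a full name.
const buildFullName = ({ firstName, lastName }) => `${firstName} ${lastName}`;

console.log(buildFullName({ firstName: 'Gavin', lastName: 'Buerk' })); // Gavin Buerk
```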

Steps:

  1. Create the Lambda
  2. Create the RDS database
  3. Create the Kinesis stream
  4. Create API Gateway
  5. Wire all the pieces together - AWS console
  6. See the success via logs

Create the Lambda

Create an AWS account

We are working in AWS, so first you need to create an AWS account. Go here and follow the prompts:

https://console.aws.amazon.com/

Create Security Credentials

To be clear, we're not ready to dive into writing our Lambda. There is additional setup. We'll start with creating AWS credentials that will allow us to connect to and administer our AWS account locally. To create them, head to:

https://console.aws.amazon.com

Find your logged in account button and navigate to My Security Credentials:

Select Continue to Security Credentials to bypass the warning about IAM. In the future, you should definitely create IAM users and provision them accordingly, but root credentials work fine for this demo.

Select Access keys, then choose Create New Access Key

It will generate the key for you automatically. Grab the Access Key ID and Secret Access Key. These are going to live in a new file we're about to create.

Once you have copied the Access Key ID and Secret Access Key, create a file in a hidden .aws directory in your home directory called credentials. This is where the AWS tooling will look for your credentials.

mkdir -p ~/.aws/ && touch ~/.aws/credentials

Inside the credentials file, place content similar to the following, swapping in your access key ID and secret access key.

[default]
aws_access_key_id = (your access key id)
aws_secret_access_key = (your secret access key)

The [default] is important because it labels the credential set. It is what tells your local tooling which set to use. You can manage multiple sets, but the one labeled [default] is what will be used if you don't specify anything.
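If you do end up managing multiple credential sets, they live side by side in the same file under different labels, and most AWS tooling lets you select a non-default set with a --profile flag or the AWS_PROFILE environment variable. A hypothetical example (the personal profile name is made up):

```ini
[default]
aws_access_key_id = (your access key id)
aws_secret_access_key = (your secret access key)

[personal]
aws_access_key_id = (another access key id)
aws_secret_access_key = (another secret access key)
```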

Create Configuration

We need one more file inside our .aws folder. We will use this to define our default AWS region, which is required by the tool we're going to use below, Apex.

The file should be called config and will have contents similar to the following. Choose the region nearest to you; I am in Ohio, so I am using us-east-2. The list of available regions is here.

[default]
region=us-east-2

Install Apex

Next, we will install and configure Apex and use it to create a Lambda project.

Follow the instructions for installing Apex.

Create our Lambdas project

When you have the apex command configured on your path, navigate to a fresh folder where your project will live. From the empty directory:

apex init

Then give your project a name and description. If all goes well, your output should look something like this.

~/workspace/apex $ apex init


    _    ____  _______  __
   / \  |  _ \| ____\ \/ /
  / _ \ | |_) |  _|  \  /
 / ___ \|  __/| |___ /  \
/_/   \_\_|  |_____/_/\_\



 Enter the name of your project. It should be machine-friendly, as this
 is used to prefix your functions in Lambda.

 Project name: names

 Enter an optional description of your project.

 Project description: 

 [+] creating IAM names_lambda_function role
 [+] creating IAM names_lambda_logs policy
 [+] attaching policy to lambda_function role.
 [+] creating ./project.json
 [+] creating ./functions

 Setup complete, deploy those functions!

 $ apex deploy

Let's check out what was created for us:

~/workspace/apex $ find .
.
./project.json
./functions
./functions/hello
./functions/hello/index.js

We have a project.json, which contains apex specific configuration for deploying our Lambdas, and an example Lambda called hello inside functions/hello/index.js

Let's start with the project.json

~/workspace/apex $ cat project.json 
{
 "name": "names",
 "description": "",
 "memory": 128,
 "timeout": 5,
 "role": "arn:aws:iam::691975350058:role/names_lambda_function",
 "environment": {}
}

We can see the name and description we specified, along with a default memory (MB), timeout (seconds), and a role. Where did this role come from? Apex communicated with AWS during our init step above and created it. You can see it if you look under the Roles section here:

https://console.aws.amazon.com/iam

We'll see more about why that matters soon. For now, let's take a look at the default Lambda that was created for us:

~/workspace/apex $ cat functions/hello/index.js 
console.log('starting function')
exports.handle = function(e, ctx, cb) {
 console.log('processing event: %j', e)
 cb(null, { hello: 'world' })
}

Super simple. All we're doing is logging the start of a function instance, then setting up a function called handle that we export to AWS as our entry point to this Lambda. The function gets invoked when an event comes in. We log the event, then invoke the callback passed in: null is the error parameter, and {hello:'world'} is the body of the response.
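Because the handler is just an exported function, you can sanity-check the callback contract locally in plain Node before ever deploying. A quick sketch (the event here is a made-up fake; nothing in this snippet talks to AWS):

```javascript
// Same shape as the generated hello Lambda: (event, context, callback).
const handle = function (e, ctx, cb) {
  console.log('processing event: %j', e);
  cb(null, { hello: 'world' });
};

// Invoke it the way Lambda would, with a fake event and empty context.
handle({ test: true }, {}, (err, result) => {
  console.log(err, result); // null { hello: 'world' }
});
```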

Deploy and Test our Hello World Lambda

Next, let's deploy and see what happens when we invoke it in AWS:

~/workspace/apex $ apex deploy
 • creating function env= function=hello
 • created alias current env= function=hello version=1
 • function created env= function=hello name=names_hello version=1
~/workspace/apex $ apex invoke hello
{"hello":"world"}

Pretty simple. apex deploy deploys the application using the properties defined in the project.json and the credentials and config in the default sections of ~/.aws/credentials and ~/.aws/config

Have a look at your Lambda configuration through the AWS console here:

https://console.aws.amazon.com/lambda

We have a Lambda! But, it doesn't do much yet. Let's create a new one that talks to an RDS instance we will create.

Create our CreateFullName Lambda

To create a new Lambda, let's just create a new folder under our project, and a new index.js inside it. We'll call this one createFullName

mkdir -p functions/createFullName && touch functions/createFullName/index.js

We want to use async/await to tidy up our JavaScript. To do that, we need at least Node 8.10, so let's add this line to our project.json: "runtime":"nodejs8.10"

{
 "name": "names",
 "description": "",
 "memory": 128,
 "timeout": 5,
 "role": "arn:aws:iam::691975350058:role/names_Lambda_function",
 "environment": {},
 "runtime":"nodejs8.10"
}

Now, we can create an async Lambda, like so:

~/workspace/apex $ cat functions/createFullName/index.js 
exports.handle = async (event, context) => {
 return {hello:'create full name!'};
}
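Since the handler is now async, it simply returns a promise, and locally you can await it with no callback in sight (again, plain Node with a fake event; no AWS involved):

```javascript
// The async version of the handler just returns a value, which Node wraps in a promise.
const handle = async (event, context) => {
  return { hello: 'create full name!' };
};

// Tiny local harness (our own, not part of Apex): await the handler directly.
(async () => {
  const result = await handle({}, {});
  console.log(result); // { hello: 'create full name!' }
})();
```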

Deploy our CreateFullName Lambda

Sweet! Let's deploy and invoke it.

~/workspace/apex $ apex deploy
 • updating config env= function=createFullName
 • updating config env= function=hello
 • updating function env= function=createFullName
 • updating function env= function=hello
 • updated alias current env= function=createFullName version=2
 • function updated env= function=createFullName name=names_createFullName version=2
 • updated alias current env= function=hello version=2
 • function updated env= function=hello name=names_hello version=2
~/workspace/apex $ apex invoke createFullName
{"hello":"create full name!"}

Since the runtime engine changed, everything got updated. Apex makes it super simple.

Create the RDS DB

The next step in our project is to connect our Lambda to a database. However, we don't have a database to connect to yet. Let's take a moment to create an RDS DB for our Lambda to use.

Create the RDS Instance

Start by heading here

https://console.aws.amazon.com/rds

Choose Create database

For the purposes of this demo, I will choose the free tier of everything:

Choose MySQL, check the option for Free Usage Tier only, then choose Next


I took the defaults on the next pane, with the exception of the Settings section, where you need to name your DB and create an admin user and password

Again, we will take most of the defaults with a few exceptions:

Update the VPC Security Groups to use the default group.

Update the database options to specify a database to create. Mine is called namesdb:

Then, at the bottom, choose Create Database

Important: You will need to wait a few minutes, even after AWS tells you your database has been created. During this time, it is still spinning up. Wait until an endpoint becomes available in the Connect section of your instance's detail page:

Update security group to allow for connecting to the RDS instance from your local machine

Next, we need to allow for connecting to the database manually, so we can create a table to store our names. In order to accomplish this, we first need to modify the default security group to allow traffic inbound from our laptop. Head to EC2:

https://console.aws.amazon.com/ec2

Navigate to security groups. At this point, you should still only have one, the default:

Select the default security group, then choose Actions -> Edit inbound rules

Add a new rule, with protocol TCP, Port Range 3306, and select My IP from the Source drop down, which will automatically populate the source IP with your public IP. Don't forget to save.

Now, you can connect to your brand new RDS instance using the endpoint you found above. In my case, it was namesdb.cvlvyr54kv2f.us-east-2.rds.amazonaws.com on port 3306.

Create our table

Go create a table using your tool of choice; MySQL Workbench, the mysql CLI client, and SQuirreL are all fine options. Here is the DDL for the table we'll use:

create table fullNames (
 id int primary key AUTO_INCREMENT, 
 fullName varchar(255)
);

Update policies to allow our Lambda to communicate with our RDS instance

Next, we have to add a policy to the role our Lambda uses to execute. Head out to IAM:

https://console.aws.amazon.com/iam

Navigate to roles

Then find our role, the one used by the Lambda, created by Apex. In this case, it's names_Lambda_function:

Let's attach some policies.

Specifically, we need AWSLambdaVPCAccessExecutionRole. This policy allows our Lambda execution to create an ENI (Elastic Network Interface), which is required to connect to RDS (or other services in your VPC)

We are also going to need the AWSLambdaKinesisExecutionRole. This will allow us later to read from the Kinesis stream.

Update our Lambda project to use the correct VPC settings

After we have added these policies, we need to tell our apex project about the VPC we want to use to attach to RDS.

We need to find our VPC, Security Groups, and Subnets. They are located here:

https://console.aws.amazon.com/vpc

Locate each from this dashboard:

Then, add them to your apex project.json.

{
 "name": "names",
 "description": "",
 "memory": 128,
 "timeout": 5,
 "role": "arn:aws:iam::691975350058:role/names_Lambda_function",
 "environment": {},
 "runtime":"nodejs8.10",
 "vpc":{
 "securityGroups":[
 "sg-ba20dfd7"
 ],
 "subnets": [
 "subnet-5246d728",
 "subnet-b86f34d0",
 "subnet-11a45c5d"
 ]
 }
}

Back to our Lambda!

Now that we have an RDS instance and we're configured, we can continue writing our Lambda to store names in RDS.

Handle Node dependencies

Next, let's import a Node mysql client and connect to the DB. First, make sure you have npm installed and configured; I suggest nvm, but you can install npm from many different places. Once you have npm available on your path, let's initialize a project at the level of this Lambda and add mysql.

~/workspace/apex $ cd functions/createFullName/
~/workspace/apex/functions/createFullName $ npm init -y
~/workspace/apex/functions/createFullName $ npm i --save mysql

+ mysql@2.16.0
added 11 packages in 1.834s

+ mysql@2.16.0
added 11 packages in 1.834s

Update the Lambda to test connectivity to RDS

Now, we should finally be able to reach RDS. Let's write a Lambda that queries for the tables in our DB, just to make sure it's working:

const mysql = require('mysql');
const {promisify} = require('util');
const connection = mysql.createConnection({
 host:'namesdb.cvlvyr54kv2f.us-east-2.rds.amazonaws.com',
 user:'namesuser',
 password:'namespassword',
 database:'namesdb',
});
const queryFunc = promisify(connection.query.bind(connection));

exports.handle = async (event, context) => {
 console.log(`Event Received: ${JSON.stringify(event)}`);
 let results = await queryFunc('show tables;');
 return results;
}

Let's break this down. Our Lambda entry point simply logs the event that comes in, runs show tables; against the DB, then returns the result. The rest of the code sets up the connection with the required credentials and promisifies (yeah, that's a word now) the callback-style API of the mysql dependency. No more callbacks! Yay!

Let's see if it works...

~/workspace/apex/functions/createFullName $ cd ../../
~/workspace/apex $ apex deploy
 • config unchanged env= function=createFullName
 • updating function env= function=createFullName
 • config unchanged env= function=hello
 • code unchanged env= function=hello
 • updated alias current env= function=hello version=3
 • updated alias current env= function=createFullName version=10
 • function updated env= function=createFullName name=names_createFullName version=10
~/workspace/apex $ apex invoke createFullName
[{"Tables_in_namesdb":"fullNames"}]

Awesome. We can talk to RDS!

Create Kinesis stream

Our Lambda is ready to start receiving data from the outside world. Let's set up a Kinesis stream first.

Create the stream

We will use the web interface to create the Kinesis stream. Head here to get started:

https://console.aws.amazon.com/kinesis

Choose Get Started

Choose Create Data Stream

  1. Choose a name. I went with "names-feed" to reflect the type of data I expect to flow through this stream.
  2. Next, choose the number of shards to create. For this demo, I went with 1.
  3. Finally, choose Create Kinesis stream.

Wire the stream up to our Lambda

Let's wire up Kinesis to our Lambda. This can be done right from the Lambda detail page in the browser here:

https://console.aws.amazon.com/lambda

We need to scroll down and select Kinesis from the designer tab:

Next, add and save

Done!

Publish a test message to the Kinesis stream

We think we're wired up appropriately. Let's publish a Kinesis message and see if it kicks off our Lambda.

Install AWS cli

To publish Kinesis events, let's just use the AWS cli. Find and install it. Make sure it's on your path. You can start here.

The AWS cli will use the same ~/.aws/credentials file as apex, so there shouldn't need to be any further configuration. Let's publish an event.

Publish event

~/workspace/apex $ aws kinesis put-record --stream-name names-feed --data '{"firstName": "Gavin", "lastName": "Buerk"}' --partition-key 1

Hmmmm... did anything happen? Well, one way to check.

View the logs

From your local machine, you can use apex to view the remote logs of your Lambda:

~/workspace/apex $ apex logs

Success! You should see something along the lines of the following:

/aws/lambda/names_createFullName 2018-08-21T17:26:50.225Z 25d92553-c5a7-4376-8b57-d504e4babfe2 Event Received: {"Records":[{"kinesis":{"kinesisSchemaVersion":"1.0","partitionKey":"1","sequenceNumber":"49587498154421108460620632105989757294178284480329940994","data":"eyJmaXJzdE5hbWUiOiAiR2F2aW4iLCAibGFzdE5hbWUiOiAiQnVlcmsifQ==","approximateArrivalTimestamp":1534872408.836},"eventSource":"aws:kinesis","eventVersion":"1.0","eventID":"shardId-000000000000:49587498154421108460620632105989757294178284480329940994","eventName":"aws:kinesis:record","invokeIdentityArn":"arn:aws:iam::691975350058:role/names_Lambda_function","awsRegion":"us-east-2","eventSourceARN":"arn:aws:kinesis:us-east-2:691975350058:stream/names-feed"}]}

Ok, that's a lot. What's going on here? Well, we received an event from Kinesis. Let me make it more readable by formatting the JSON:

{
 "Records": [
 {
 "kinesis": {
 "kinesisSchemaVersion": "1.0",
 "partitionKey": "1",
 "sequenceNumber": "49587498154421108460620632105989757294178284480329940994",
 "data": "eyJmaXJzdE5hbWUiOiAiR2F2aW4iLCAibGFzdE5hbWUiOiAiQnVlcmsifQ==",
 "approximateArrivalTimestamp": 1534872408.836
 },
 "eventSource": "aws:kinesis",
 "eventVersion": "1.0",
 "eventID": "shardId-000000000000:49587498154421108460620632105989757294178284480329940994",
 "eventName": "aws:kinesis:record",
 "invokeIdentityArn": "arn:aws:iam::691975350058:role/names_Lambda_function",
 "awsRegion": "us-east-2",
 "eventSourceARN": "arn:aws:kinesis:us-east-2:691975350058:stream/names-feed"
 }
 ]
}

So, each record we receive has a kinesis key with a field called data. The contents of data, eyJmaXJzdE5hbWUiOiAiR2F2aW4iLCAibGFzdE5hbWUiOiAiQnVlcmsifQ== in the example above, is a base64-encoded version of the JSON object we passed to Kinesis! All we have to do is decode it, concatenate the first and last name (the main goal of our whole system), and store the result in the database.
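Decoding is a one-liner with Node's built-in Buffer. In isolation, using the exact payload from the log above:

```javascript
// The base64 payload from the Kinesis record's data field.
const data = 'eyJmaXJzdE5hbWUiOiAiR2F2aW4iLCAibGFzdE5hbWUiOiAiQnVlcmsifQ==';

// Decode base64 to a UTF-8 string, then parse the JSON.
const decoded = JSON.parse(Buffer.from(data, 'base64').toString());
console.log(decoded); // { firstName: 'Gavin', lastName: 'Buerk' }
```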

Update the Lambda to handle the Kinesis events

Let's update our Lambda accordingly.

const mysql = require('mysql');
const {promisify} = require('util');
const connection = mysql.createConnection({
 host:'namesdb.cvlvyr54kv2f.us-east-2.rds.amazonaws.com',
 user:'namesuser',
 password:'namespassword',
 database:'namesdb',
});
const queryFunc = promisify(connection.query.bind(connection));

exports.handle = async (event, context) => {
 console.log(`Event Received: ${JSON.stringify(event)}`);
 let inputObjects = event.Records.map(record => JSON.parse(Buffer.from(record.kinesis.data, 'base64').toString()));
 let fullNames = inputObjects.map(inputObject => inputObject.firstName + ' ' + inputObject.lastName);
 // Await the inserts (using a ? placeholder so a name can't inject SQL) before
 // querying, so the select below is guaranteed to see the new rows.
 await Promise.all(fullNames.map(fullName =>
  queryFunc('insert into fullNames (fullName) values (?);', [fullName])
 ));
 let results = await queryFunc('select * from fullNames;');
 console.log(`All full names in the table: ${JSON.stringify(results)}`);
 return results;
}

What is new here? We map our event records, base64-decode them, and parse them into JavaScript objects. Then we map each object to a full name and insert each full name into the DB via a SQL insert statement. Lastly, we query the DB and log the results, to make sure the save took place.

Test the system, end to end

apex deploy again, then run our Kinesis publish command again:

~/workspace/apex $ aws kinesis put-record --stream-name names-feed --data '{"firstName": "Gavin", "lastName": "Buerk"}' --partition-key 1

Let's check the logs:

/aws/lambda/names_createFullName START RequestId: b70f2054-3b98-4f4e-9252-ba8c5ffe6704 Version: $LATEST
/aws/lambda/names_createFullName 2018-08-21T17:51:03.315Z b70f2054-3b98-4f4e-9252-ba8c5ffe6704 Event Received: {"Records":[{"kinesis":{"kinesisSchemaVersion":"1.0","partitionKey":"1","sequenceNumber":"49587498154421108460620632202216625758044022817536606210","data":"eyJmaXJzdE5hbWUiOiAiR2F2aW4iLCAibGFzdE5hbWUiOiAiQnVlcmsifQ==","approximateArrivalTimestamp":1534873863.247},"eventSource":"aws:kinesis","eventVersion":"1.0","eventID":"shardId-000000000000:49587498154421108460620632202216625758044022817536606210","eventName":"aws:kinesis:record","invokeIdentityArn":"arn:aws:iam::691975350058:role/names_Lambda_function","awsRegion":"us-east-2","eventSourceARN":"arn:aws:kinesis:us-east-2:691975350058:stream/names-feed"}]}
/aws/lambda/names_createFullName 2018-08-21T17:51:03.320Z b70f2054-3b98-4f4e-9252-ba8c5ffe6704 All full names in the table: [{"id":1,"fullName":"Gavin Buerk"}]
/aws/lambda/names_createFullName END RequestId: b70f2054-3b98-4f4e-9252-ba8c5ffe6704
/aws/lambda/names_createFullName REPORT RequestId: b70f2054-3b98-4f4e-9252-ba8c5ffe6704 Duration: 5.61 ms Billed Duration: 100 ms Memory Size: 128 MB Max Memory Used: 25 MB

Presto! The Kinesis event triggered our Lambda; the Lambda parsed the event, concatenated the first and last names, stored the result in the DB, and queried the full-name table to find our full name safely stored away.

Create the API Gateway

So, what about the second input channel, the AWS API Gateway? I want to keep handling the Kinesis stream, like we've done already, but I also want to be able to handle a request like:

https://(some generated dns entry).amazonaws.com/default/names_createFullName/?firstName=Gavin&lastName=Buerk

How much has to change to get us there?

Create the gateway

First, we need to create an API gateway. This time, let's do so through the Lambda interface. That way, AWS can take care of some of the grunt work for us. Navigate out to Lambda, here:

https://console.aws.amazon.com/lambda

Let's edit the triggers for our Lambda to add a gateway:

Choose Create a new API, with Open as your security policy (which we would not do beyond demo purposes for a Lambda that writes to a database...), then add and save

That's basically it for configuration. Shortly, you should have your endpoint:

However, if we click on it, we get a {"message": "Internal server error"} response. What's going on here?

Update the Lambda to support the gateway

Well, the way our code is currently structured, we attempt to iterate over the Records key in the event, but our API events look different from our Kinesis events.

We have another problem, too. API Gateway expects a certain shape of object to be returned from a Lambda: specifically, it must have a statusCode, headers, and a body.

There are probably many ways to structure a solution so that you don't have to switch on the event type inside the Lambda itself, but for the sake of this demo, let's have our Lambda decide how to handle the event based on the fields present on it.

const mysql = require('mysql');
const {promisify} = require('util');
const connection = mysql.createConnection({
 host:'namesdb.cvlvyr54kv2f.us-east-2.rds.amazonaws.com',
 user:'namesuser',
 password:'namespassword',
 database:'namesdb',
});
const queryFunc = promisify(connection.query.bind(connection));

exports.handle = async (event, context) => {
 console.log(`Event Received: ${JSON.stringify(event)}`);
 if (event.Records) {
  await handleKinesis(event);
 } else if (event.queryStringParameters) {
  await handleApi(event);
 }

 let results = await queryFunc('select * from fullNames;');
 console.log(`All full names in the table: ${JSON.stringify(results)}`);
 return {
  statusCode: 200,
  headers: {},
  body: JSON.stringify(results)
 };
}

const buildFullName = ({firstName, lastName}) => firstName + ' ' + lastName;

// Return the insert's promise so callers can await it, and use a ? placeholder
// so a name can't inject SQL.
const insertFullNameInDB = fullName =>
 queryFunc('insert into fullNames (fullName) values (?);', [fullName]);

function handleApi(event) {
 let fullName = buildFullName(event.queryStringParameters);
 return insertFullNameInDB(fullName);
}

function handleKinesis(event) {
 let inputObjects = event.Records.map(record => JSON.parse(Buffer.from(record.kinesis.data, 'base64').toString()));
 let fullNames = inputObjects.map(buildFullName);
 return Promise.all(fullNames.map(insertFullNameInDB));
}

There's a lot there now. Let's dig in to see what is happening.

The basic differences: we now check whether the incoming event has Records; if it does, we handle it as a Kinesis event. Otherwise, we check whether it has queryStringParameters (provided by API Gateway); if it does, we handle it as an API event. The helper functions should be fairly self-explanatory: they still store the fullNames, just like before.

The last main difference is the return value. As discussed above, we have structured it to make the API Gateway accept it and forward it back to the client.
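The dispatch logic can be isolated and exercised with hand-built events. A sketch (both sample events below are trimmed-down fakes, not full AWS payloads):

```javascript
// Classify an incoming Lambda event by its shape.
function eventType(event) {
  if (event.Records) return 'kinesis';
  if (event.queryStringParameters) return 'api';
  return 'unknown';
}

const kinesisEvent = { Records: [{ kinesis: { data: 'base64...' } }] };
const apiEvent = { queryStringParameters: { firstName: 'Gavin', lastName: 'Buerk' } };

console.log(eventType(kinesisEvent)); // kinesis
console.log(eventType(apiEvent));     // api
console.log(eventType({}));           // unknown
```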

Test our API gateway

Let's test it out.

~/workspace/apex $ curl -XGET "https://w32n6d50pa.execute-api.us-east-2.amazonaws.com/default/names_createFullName?firstName=From&lastName=API"
[{"id":1,"fullName":"Gavin Buerk"},{"id":2,"fullName":"From API"}]

Now, let's make sure we can still receive from Kinesis:

~/workspace/apex $ aws kinesis put-record --stream-name names-feed --data '{"firstName": "From", "lastName": "Kinesis"}' --partition-key 1
~/workspace/apex $ curl -XGET "https://w32n6d50pa.execute-api.us-east-2.amazonaws.com/default/names_createFullName"
[{"id":1,"fullName":"Gavin Buerk"},{"id":2,"fullName":"From API"},{"id":3,"fullName":"From Kinesis"}]

Fantastic. We did it. Notice that, because of our if / else block, if neither the query parameters nor the Kinesis records are present on the request, the Lambda simply queries and returns what's there, bypassing the save logic. That's why we can curl to see what's in the DB.

Summary

AWS is both very complicated and very simple, in different ways. It's complicated because there is a myriad of services available, and you need to know a decent amount about each, including how to configure them and integrate between services. However, it's also very simple. They have thought of everything. You need a relational database? They have you covered. Need a queuing mechanism? Covered. Pub / Sub? Yes. NoSQL? You bet. The documentation is very good. All their services play nicely together. They offer great flexibility for expressing security policy. All of this, and they still offer an extremely low cost of entry.

What can you make by combining these fantastic building blocks in creative ways?