Lambda: Bees with Frickin' Laser Beams

Amazon Web Services' recently released Lambda service might be the crown jewel of their toolset.

Lambda runs your code so far out in the cloud that it almost takes the computers out of computing. Developers can run their Node.js code in an isolated environment that includes a few basic modules plus any number of custom modules you provide. The instance runs through the script before disappearing silently into the night -- though it can also report results! You can find a fuller Lambda overview here.
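
To get a feel for the shape of a Lambda function, here's about the smallest Node.js handler you could write -- the greeting and the name field in the payload are placeholders, nothing we'll use later:

// Lambda invokes exports.handler with the invocation payload (event) and a
// context object; context.succeed hands a result back to the caller.
exports.handler = function(event, context) {
  context.succeed('Hello, ' + (event.name || 'world'));
};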

Love at first invocation

We were lucky enough to use Lambda while it was still in "preview" mode, and we found it so useful that it was in production within the month. We also plan to open source our work, so keep an eye out for Imagineer. Imagineer aims to sever the tie between image sizing and developer effort, so that we can ask for whatever we need and either get it from S3 or have it created, served, and persisted to S3.

What makes Lambda most attractive to us is the explosive scalability. To demonstrate the power, I'll walk you through making a simple load testing project with the AWS Node SDK. This was heavily inspired by Chicago Tribune's super cool Bees with Machine Guns!. BwMG, written in Python, helps users spin up several EC2 instances to "attack" an endpoint and test its ability to handle the load. I used BwMG in load testing Imagineer, and found it as sweet as honey. The bit that stung me was the need to both spin up and spin down servers. They even include this warning in their README:

Please remember to [spin down the servers]—we aren’t responsible for your EC2 bills.

Another pain point is that EC2 instances are limited per account, and I had trouble getting enough to make it useful.

Setting goals

An example flow with BwMG might look like:

bees up -s 4 -g public -k frakkingtoasters
bees attack -n 10000 -c 250 -u http://www.ournewwebbyhotness.com/
bees down

This builds 4 servers to send 10,000 requests, 250 at a time, to the destination.

Our goal is to imitate this flow with a single line:

node bees.js -n 2000 -c 100 -u https://log.roadtrippers.com/

Assembling the hive

You will need authentication for AWS. I chose to use a credentials file at ~/.aws/credentials, but there are a number of options outlined here.
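
If you go the credentials file route, it's just a small INI-style file with your keys in it, along these lines:

[default]
aws_access_key_id = << your access key id >>
aws_secret_access_key = << your secret access key >>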

Create a directory for the project. I have lovingly named my project beeswithfrickinlaserbeams. We'll have 3 dependencies in our main script:

aws-sdk - Official API for interacting with AWS services, including Lambda.

adm-zip - Compression tools, because Lambda requires code to be provided in zip format.

bluebird - Promises will help us better organize the code.

Here's how my package.json looks:

{
  "name": "beeswithfrickinlaserbeams",
  "version": "1.0.0",
  "description": "Bees with frickin' laser beams",
  "dependencies": {
    "adm-zip": "^0.4.7",
    "aws-sdk": "^2.1.26",
    "bluebird": "^2.9.25"
  }
}

You can create that file in your project directory and install the dependencies with npm install.

Gathering the swarm

Create a file called bees.js. This is what we'll interact with through the command line.

At the top of the file, we'll include our dependencies. Lambda requires a region to be specified, so I'm using us-east-1.

var AWS = require('aws-sdk');
var lambda = new AWS.Lambda({region: 'us-east-1'});
var Promise = require('bluebird');
var AdmZip = require('adm-zip');

Next, we'll parse out the command line arguments. This isn't very important to the demonstration, so I didn't spend much time making it robust.

var CONCURRENT_JOB_LIMIT = 50;

// Pull -n (total requests), -c (concurrent requests), and -u (target URL)
// out of the command-line arguments.
var config = process.argv.reduce(function(memo, arg, index) {
  switch (arg) {
    case '-n':
      memo.totalRequests = process.argv[index + 1];
      break;
    case '-c':
      memo.concurrentRequests = process.argv[index + 1];
      memo.beamsPerBee = Math.ceil(
        1.0 * memo.concurrentRequests / CONCURRENT_JOB_LIMIT);
      break;
    case '-u':
      memo.url = process.argv[index + 1];
      break;
  }
  return memo;
}, {});

// Cap the number of simultaneous Lambda jobs, then work out how many
// waves it takes to hit the requested total.
if (config.concurrentRequests < CONCURRENT_JOB_LIMIT) {
  config.beeCount = config.concurrentRequests;
} else {
  config.beeCount = CONCURRENT_JOB_LIMIT;
}
config.iterations = Math.ceil(
  1.0 * config.totalRequests / config.concurrentRequests);
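
To make that concrete, here's roughly what config ends up holding for the invocation we're aiming for (-n 2000 -c 100): 50 bees, each firing 2 lasers, over 20 waves. The argv values arrive as strings, which is fine for the arithmetic we do with them.

// node bees.js -n 2000 -c 100 -u https://log.roadtrippers.com/
// config:
// {
//   totalRequests: '2000',
//   concurrentRequests: '100',
//   beamsPerBee: 2,     // ceil(100 / 50) concurrent requests per bee
//   url: 'https://log.roadtrippers.com/',
//   beeCount: 50,       // capped at CONCURRENT_JOB_LIMIT
//   iterations: 20      // ceil(2000 / 100) waves
// }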

Next, we'll create a configuration for a new Lambda function. You can create your function using their GUI, CLI tools, or directly from other code. You may notice we're working toward two other files, bee.js and laser.js.

// Create a zip file from the code for Lambda to consume
var zip = new AdmZip();
zip.addLocalFile('bee.js');
zip.addLocalFile('laser.js');

// Configure bee Lambda function
var createFunctionParams = {
  Code: {
    ZipFile: zip.toBuffer()
  },
  FunctionName: 'bee',
  Handler: 'bee.handler',
  Role: '<< your IAM ARN role >>',
  Runtime: 'nodejs',
  MemorySize: 1024,
  Timeout: 3
};

See the full explanation of options for createFunction here. Take note of the MemorySize and Timeout options: processes that grab too much memory or run too long are swatted, but cost grows with memory and runtime.
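
One more note on Role: it's the ARN of an IAM role that Lambda is allowed to assume. Setting up IAM is outside the scope of this post, but roughly speaking, the role's trust policy needs to name the Lambda service as a principal, something like:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}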

Now, create the function. We just want to confirm it's been created and calculate some parameters for our jobs to use. They need to know the target (url) and how many connections to make (beamsPerBee). Lastly, we kick off our recursive bee-invoking method. We'll hop to that next.

var createFunction = Promise.promisify(lambda.createFunction, lambda);
createFunction(createFunctionParams).
  then(function() {
  console.log('Releasing',
    config.beeCount,
    'bee(s) with',
    config.beamsPerBee,
    'frickin\' laser beam(s) each for',
    config.iterations,
    'attack(s).');

  var invokeParams = {
    FunctionName: 'bee',
    Payload: JSON.stringify({
      url: config.url,
      beamsPerBee: config.beamsPerBee
    })
  };

  // Start sending waves of requests
  sendInTheBees(config.iterations - 1, invokeParams);
});

The last part of this file is the sendInTheBees method. It will start a wave of jobs and perform calculations on the results. To match the createFunction earlier, we call deleteFunction at the end of our code. An ideal workflow would probably offer commands to create and delete the job.

It's not pretty, but let's get it over with so you can see the job itself!

var invoke = Promise.promisify(lambda.invoke, lambda);
function sendInTheBees(iterations, invokeParams, totals) {
  // Kick off all the jobs
  var invokedBees = [];
  for (var i = 0; i < config.beeCount; i++) {
    invokedBees.push(invoke(invokeParams));
  }
  var payload;

  Promise.all(invokedBees).then(function(results){
    // Run calculations on our results
    totals = results.reduce(function(memo, result) {
      payload = JSON.parse(result.Payload);
      memo.codes = payload.codes.reduce(function(m, v) {
        if (m.indexOf(v) === -1) m.push(v);
        return m;
      }, memo.codes);
      memo.time += payload.time * 1.0 / config.beeCount / config.iterations;
      memo.hits += config.beamsPerBee;
      return memo;
    }, totals || { time: 0, hits: 0, codes: [] });

    if (iterations > 0) {
      console.log('Sending in', iterations, 'more swarms.');
      sendInTheBees(iterations - 1, invokeParams, totals);
    } else {
      console.log('Sent', totals.hits, 'hits');
      console.log('Received codes:', totals.codes);
      console.log('Mean request time:', parseInt(totals.time), 'ms');
      lambda.deleteFunction({FunctionName: 'bee'}).send();
    }
  }).catch(function(e) {
    console.log(e);
    lambda.deleteFunction({FunctionName: 'bee'}).send();
  });
}

And that's it! Only a little over 100 lines so far.
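
One caveat: if a run dies before it reaches deleteFunction, the bee function is left behind in your account. A throwaway companion script (hypothetical, not part of the project) can clean it up by hand:

// cleanup.js -- delete a leftover bee function if a run died partway through
var AWS = require('aws-sdk');
var lambda = new AWS.Lambda({region: 'us-east-1'});

lambda.deleteFunction({FunctionName: 'bee'}, function(err) {
  if (err) console.log('Delete failed (maybe nothing to delete):', err.code);
  else console.log('Deleted the bee function.');
});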

Anatomy of a bee

Make a file named bee.js. This will hold the code that runs the Lambda job in the necessary handler format. The handler is passed an event payload and a context variable. We don't do much here except spawn child processes for each request and keep tabs on some runtime stats.

Ideally we would handle errors by calling context.fail; we'll leave that for another day, though there's a rough sketch of the idea after laser.js below.

var cp = require('child_process');

exports.handler = function handler(event, context) {
  var count = event.beamsPerBee;
  var codes = [];
  var time = 0;

  // Fork one laser per connection; each reports its status code and timing back.
  for (var i = 0; i < event.beamsPerBee; i++) {
    var child = cp.fork('laser.js').
      on('message', function(m) {
        if (codes.indexOf(m.code) === -1) codes.push(m.code);
        time += m.time * 1.0 / event.beamsPerBee;
        count--;
        if(count === 0) {
          context.succeed({
            time: time,
            codes: codes
          });
        }
      });
    child.send({url: event.url});
  }
};

One file to go! Create laser.js if you haven't already. This code lives only to make a request to the target.

process.on('message', function(m) {
  // Use the http or https module to match the target URL (default to plain http)
  var explicitProtocol = m.url.match(/^https|^http/);
  var protocol = require(explicitProtocol ? explicitProtocol[0] : 'http');
  var start = new Date();

  protocol.get(m.url, function(res) {
    process.send({ code: res.statusCode, time: new Date() - start });
    process.exit();
  });
});
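
As written, a failed request takes down its laser without anything being reported, so the bee never reaches its count and the invocation simply times out. A more defensive laser.js -- just a sketch, not something the project does today -- could drain the response and exit non-zero on errors, and bee.js could then watch each child's 'exit' event and report the failure with the context.fail call mentioned earlier:

// laser.js, defensive variant (sketch): exit non-zero if the request fails,
// and drain the response body so the socket is released promptly.
process.on('message', function(m) {
  var explicitProtocol = m.url.match(/^https|^http/);
  var protocol = require(explicitProtocol ? explicitProtocol[0] : 'http');
  var start = new Date();

  protocol.get(m.url, function(res) {
    res.resume();
    process.send({ code: res.statusCode, time: new Date() - start });
    process.exit();
  }).on('error', function() {
    process.exit(1);
  });
});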

Send in the bees

Let's put it to the test! I haven't quite matched the power of Bees with Machine Guns, but we can run it just as we set out to:

$ node bees.js -n 2000 -c 100 -u http://log.roadtrippers.com
Releasing 50 bee(s) with 2 frickin' laser beam(s) each for 20 attack(s).
Sending in 19 more swarms.
Sending in 18 more swarms.
Sending in 17 more swarms.
Sending in 16 more swarms.
Sending in 15 more swarms.
Sending in 14 more swarms.
Sending in 13 more swarms.
Sending in 12 more swarms.
Sending in 11 more swarms.
Sending in 10 more swarms.
Sending in 9 more swarms.
Sending in 8 more swarms.
Sending in 7 more swarms.
Sending in 6 more swarms.
Sending in 5 more swarms.
Sending in 4 more swarms.
Sending in 3 more swarms.
Sending in 2 more swarms.
Sending in 1 more swarms.
Sent 2000 hits
Received codes: [ 200 ]
Mean request time: 76 ms

As you can see, we were able to send several waves of requests to the target service. If you choose to follow along or use this project, please remember to use this only on servers you are responsible for. As mentioned in the BwMG README:

If you decide to use the Bees, please keep in mind the following important caveat: they are, more-or-less a distributed denial-of-service attack in a fancy package

Conclusion

We looked at what Lambda has to offer developers, saw how Roadtrippers currently uses it, and walked through a simple load testing project to get acquainted with the API.

If you are interested in the source code, it is available on our repo. If you have questions about how we use Lambda for Imagineer, or about anything else we do at Roadtrippers, don't hesitate to say hello. Thank you for reading!

“To me, boxing is like a ballet, except there's no music, no choreography and the dancers hit each other.” - Jack Handey