Event-Driven Blogging with AWS Lambda

12th May, 2015

I wrote a short post (and an even shorter follow-up) outlining the various tools I use to author and publish this blog. A few folks have been in touch asking for a bit more information on publishing with Lambda, so here I’ll dive into the details and hopefully start to provide some color on how to think about designing applications around AWS Lambda.

In the abstract, most application logic consists of three main components:

  1. Nouns: a collection of data models (people, places, things or ideas)
  2. Verbs: operators which act on those models, directly or indirectly (sing, dance, talk, save)
  3. Sentence structure: a way of controlling the interaction, ordering and results of nouns and verbs

I’ve found this a useful mental model when starting to whiteboard application architectures. Identifying the nouns (and adjectives), the verbs (and adverbs), and a way of composing them to solve a specific requirement is the essence of application architecture, and nailing it can lead to better software design. It’s also notoriously difficult to get right.

Lambda provides an environment optimized for verbs, which are ultimately implemented as discrete, stateless, transient functions which operate against data. It is as close to a raw execution environment for ‘doing’ as I’ve come across, and as a result I’ve found the nouns-and-verbs model useful in helping design, architect and implement applications which run in that environment.

With Lambda handling the verbs of our application, we can focus on the operations themselves, the descriptions of the nouns and how they all hang together in a sentence.

Take a blogging engine, for example

If I were asked to boil down a typical blogging engine to a single sentence, it would be:

An application which can process text, images and other media for display, publish that content to the web, and index it chronologically.

Of course, most blogging engines do this in addition to many other things (aggregation, sharing, social interaction, etc.), but at the core you want to take stuff, make it readable on the web, and put the most recent stuff at the top of a list of all the other stuff.

In a single sentence we have identified the nouns (the data we need to model - text, content, etc), the verbs (the functional operations we need to implement - process, publish and index), and the sequence of events which will orchestrate them. Not bad for something that would fit inside a tweet.

The Nouns: media and content

We’ve only really got two categories of things to think about in a blog: media (the raw text, images, etc.), and post content (the text, images, etc. processed for display on the web). To get more specific, I chose to work primarily in raw text formatted with Markdown for the media, and for the final post content, you can’t beat HTML.

Both are stored in S3, with metadata (date, title, etc.) stored in DynamoDB.

The Verbs: process, publish and index

Our blogging engine has three verbs, which we’ll implement as Lambda functions, using JavaScript and Node.js.

Process: convert raw Markdown into HTML

This function takes raw Markdown text and converts it to HTML content. The excellent marked library does all the heavy lifting here, including using the lexer to extract the post title from the first top-level heading. We stash the HTML in S3, and the metadata in DynamoDB.
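To give a flavor of the title extraction, here is a minimal sketch that walks a token list of the shape marked's lexer produces and returns the text of the first top-level heading. The helper names, the 'Untitled' fallback and the metadata field names are my assumptions for illustration, not the code that runs this blog.

```javascript
// Hypothetical helper: find the post title in a list of marked lexer tokens.
// Heading tokens from marked look like { type: 'heading', depth: 1, text: '...' }.
function extractTitle(tokens, fallback) {
  for (const token of tokens) {
    if (token.type === 'heading' && token.depth === 1) {
      return token.text;
    }
  }
  return fallback || 'Untitled';
}

// Hypothetical metadata record destined for DynamoDB; the field names are assumed.
function buildMetadata(key, tokens, publishedAt) {
  return {
    id: key,
    title: extractTitle(tokens),
    date: publishedAt.toISOString()
  };
}

// In the real function the tokens would come from marked.lexer(markdownText);
// they are written out by hand here so the sketch runs without dependencies.
const tokens = [
  { type: 'heading', depth: 1, text: 'Event-Driven Blogging' },
  { type: 'paragraph', text: 'I wrote a short post...' }
];
const metadata = buildMetadata('posts/lambda.md', tokens, new Date('2015-05-12T00:00:00Z'));
console.log(metadata.title);
```

Keeping the title logic separate from the S3 and DynamoDB calls makes it easy to exercise without touching any AWS service.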

Publish: publishes HTML to the web

This function combines the HTML content with the site template (the logo and menu on the left hand side), and publishes it to the web. The site templates are stored in a private bucket in S3, and Handlebars takes care of rendering before the final post is published to a static web site, also on S3.
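The rendering step might look roughly like the sketch below. The template markup and function names are assumptions, and the compiler is passed in as an argument so the example runs on its own; in practice that argument would be Handlebars.compile, which returns a function that takes the context object.

```javascript
// Hypothetical publish step: wrap the post HTML in the site template.
// `compile` is injected; with Handlebars it would be Handlebars.compile.
function renderPage(compile, templateSource, post) {
  const template = compile(templateSource);
  return template({ title: post.title, content: post.html });
}

// Assumed template. In Handlebars, triple braces leave the value unescaped,
// which is what we want for HTML that has already been rendered from Markdown.
const siteTemplate = '<html><head><title>{{title}}</title></head>' +
  '<body><nav>menu</nav><main>{{{content}}}</main></body></html>';

// Stand-in compiler for illustration only: plain substitution, no escaping.
function naiveCompile(source) {
  return (context) => source
    .replace('{{title}}', () => context.title)
    .replace('{{{content}}}', () => context.content);
}

const page = renderPage(naiveCompile, siteTemplate, {
  title: 'Hello',
  html: '<p>First post</p>'
});
console.log(page);
```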

Index: build a new home page

Finally, we want to index the content in reverse chronological order on the home page of the blog. We pull and sort the post metadata from DynamoDB, build the new home page using the relevant HTML snippets in S3, and render the whole thing to the home page using Handlebars.
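The ordering itself is a one-liner. This sketch assumes each metadata item read from DynamoDB carries an ISO 8601 date string in a field called date; the field name and the sample posts are made up for illustration.

```javascript
// Hypothetical index step: order post metadata newest first.
// Assumes each item has an ISO 8601 `date` string.
function newestFirst(posts) {
  // slice() copies the array so the caller's list is left untouched.
  return posts.slice().sort((a, b) => Date.parse(b.date) - Date.parse(a.date));
}

const posts = [
  { title: 'Older post', date: '2015-04-02T09:00:00Z' },
  { title: 'Newest post', date: '2015-05-12T09:00:00Z' },
  { title: 'Middle post', date: '2015-04-20T09:00:00Z' }
];

const ordered = newestFirst(posts);
console.log(ordered.map((post) => post.title));
```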

The Sentence

Lambda functions can be called directly via the API. We could construct a command line script or even another Lambda function to control the orchestration and application flow, but less code is usually better code (and no code is, therefore, the best), so in this case we’ll use S3 event notifications to orchestrate the application components.

The event notifications for new objects in each S3 bucket trigger a Lambda function for the next step of the application; we’re taking advantage of the automatic execution of Lambda functions in response to these events to drive our application process. In essence, the event notifications encode our sentence structure in this case.
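Each function therefore starts by working out which object triggered it. Here is a minimal sketch of that first step, assuming a current Node.js runtime; the bucket name, the sample event and the handler body are hypothetical, while the Records, s3.bucket.name and s3.object.key fields follow the S3 event notification format.

```javascript
// Pull bucket and key out of an S3 event notification delivered to Lambda.
// Keys arrive URL-encoded, with spaces as '+', so they are decoded here.
function objectsFromS3Event(event) {
  return (event.Records || []).map((record) => ({
    bucket: record.s3.bucket.name,
    key: decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '))
  }));
}

// Hypothetical handler; in a deployment it would be exported as exports.handler.
const handler = async (event) => {
  const created = objectsFromS3Event(event);
  // ... fetch each object, process it, and write the result to the next bucket,
  // whose own notification then triggers the following function in the chain.
  return created;
};

// Trimmed-down sample event with a made-up bucket name.
const sampleEvent = {
  Records: [
    { s3: { bucket: { name: 'example-html' }, object: { key: 'posts/hello+world.html' } } }
  ]
};
console.log(objectsFromS3Event(sampleEvent));
```

No function calls another directly; writing an object to the next bucket is the only hand-off, which is what lets the notifications act as the sentence structure.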

Wrapping up

Using Lambda and these simple application architecture guidelines, this blog runs on a little under 300 lines of JavaScript. Lambda, DynamoDB and S3 take care of virtually all the heavy lifting of processing, storing and serving both the blog and the blogging engine, without my having to manage or maintain servers, databases or storage. Hopefully that gives you a flavor of some of the design considerations of building applications with Lambda, and some of the implementation details of this humble site.