Having worked with the AWS ecosystem for nine years, I have witnessed many interesting leaps in cloud computing: the initial days of load balancers and clusters, the refreshing idea of Docker containers, and highly coarse-grained application services like Elastic Transcoder (ETS). But never have I felt more amazed and excited than with AWS Lambda and the now-growing ecosystem of serverless computing and function-as-a-service (FaaS). I think it is worthwhile sharing the story of how I discovered it and what a blessing it turned out to be.
It was early 2015, and we had just started exploring ETS for a product and its associated service. The system was a custom 4K camera which would capture and save 4K video in real time but upload it to the cloud (read: S3) in non-realtime due to its sheer size. We had to transcode the video in the cloud into a form that users could view and interact with. I was really excited to use ETS for the transcoding part of the system, as it replaced a huge blob of custom development and the associated operational burden of managing servers in the cloud and scaling them on demand.
However, very soon I was irked by the fact that we needed a way to trigger ETS jobs after our 4K videos were uploaded to S3, and a way to do some record keeping of which jobs were done, in progress, and so on. The very first idea that came to the table was: let's run a Tomcat server on EC2 to do the record keeping in MySQL and trigger the ETS jobs. The server would expose a REST API which our uploader application running on the custom camera would call. We were already running other Tomcat servers for similar record-keeping needs.
Yikes, we were back to square one… the same need to maintain scalable infrastructure in the cloud. What if that Tomcat server went down? In terms of functionality the server represented an iota, but it would be a critical link in the chain of reliable infrastructure. Nodes like this can run on micro instances, sit idle most of the time, and yet easily become a bottleneck during a burst of application usage.
Wishfully, I said: “It would have been nice if the ETS jobs could start all by themselves in response to an ‘upload completed’ event in an S3 bucket!” With this aspiration we started digging around, and we found that we could trigger Lambda functions when an upload to S3 completed and run some custom Node.js code to start our ETS jobs. OK, and what do we do when the ETS job is done? Well, ETS posts to an SNS topic both on completion and on error, and guess what? That can trigger Lambda functions too. These two finds got us thinking: hmm, that is so nice, we won't have to bother running that server anymore. We don't care about the load or multiplicity either, as it is a managed service after all: AWS has to deliver it whenever we ask for it. INFINITE scale! I am sure the skeptics are saying: OK, but at what price?
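To make the idea concrete, here is a minimal sketch (not our production code) of the S3-triggered half in Node.js. The pipeline ID is a placeholder and the preset shown is ETS's generic 720p system preset; the real handler would hand these parameters to `createJob()` on the AWS SDK's Elastic Transcoder client.

```javascript
const PIPELINE_ID = '1111111111111-abcde1';   // placeholder ETS pipeline ID
const PRESET_720P = '1351620000001-000010';   // ETS "Generic 720p" system preset

// Build the parameters for elastictranscoder.createJob() out of the
// S3 "ObjectCreated" event that invoked the Lambda function.
function buildTranscodeJob(s3Event) {
  const record = s3Event.Records[0];
  // S3 event keys are URL-encoded, with spaces encoded as '+'.
  const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));
  return {
    PipelineId: PIPELINE_ID,
    Input: { Key: key },
    // One 720p output next to the original, extension swapped for -720p.mp4.
    Outputs: [{ Key: key.replace(/\.[^.]+$/, '') + '-720p.mp4', PresetId: PRESET_720P }],
  };
}

// Inside the real Lambda handler we would then call, roughly:
//   new AWS.ElasticTranscoder().createJob(params, callback);
```

The SNS-triggered completion function is the mirror image: it parses the job status out of the SNS message and updates the record keeping.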
Do make it a point to read the Lambda pricing page. IT IS REAL! Think: when was the last time you saw prices in microcents? It costs literally nothing to run these functions. The free tier gives you 1 million requests and 400,000 GB-seconds per month. That is 3.2 million seconds of a 128 MB Node.js function. So not only do you get practically infinite scale, you get it at dirt-cheap rates.
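The 3.2-million-second figure is simple arithmetic, which a two-line check confirms (these are the 2015 free-tier numbers quoted above; current pricing may differ):

```javascript
// Free-tier math: 400,000 GB-seconds at a 128 MB memory allocation.
const freeGbSeconds = 400000;      // free compute per month (2015 free tier)
const memoryGb = 128 / 1024;       // 128 MB expressed in GB = 0.125
const freeSeconds = freeGbSeconds / memoryGb;
console.log(freeSeconds);          // 3200000 seconds, about 37 days of nonstop compute
```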
No need to babysit any server cluster in the cloud, and yet you have a fully working, highly scalable system. Sleep peacefully at night!
Over the next 18 months we changed our architectural thinking to get rid of any EC2 nodes whose job was record keeping or coordinating between various AWS services. Instead, we started architecting with the following stack:
- API Gateway
- Lambda Functions
- DynamoDB/SimpleDB and/or S3
Clients call API Gateway, which triggers Lambda functions, which manipulate data in DynamoDB/SimpleDB or S3.
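As a rough illustration of that flow (the table name, attributes, and request shape here are hypothetical, not our actual schema), a Lambda function behind API Gateway might turn the incoming request into a DynamoDB `PutItem` call like this:

```javascript
const TABLE_NAME = 'TranscodeJobs'; // hypothetical table name

// Build DynamoDB PutItem parameters from an API Gateway proxy event.
// The real handler would pass the result to dynamodb.putItem(params).
function buildPutItem(apiEvent) {
  const body = JSON.parse(apiEvent.body);  // API Gateway delivers the body as a string
  return {
    TableName: TABLE_NAME,
    Item: {
      videoKey:  { S: body.videoKey },           // S3 key of the uploaded video
      status:    { S: 'UPLOAD_COMPLETE' },       // record-keeping state
      updatedAt: { N: String(body.timestamp) },  // DynamoDB numbers are sent as strings
    },
  };
}
```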
This is an infrastructure stack which:
- Is not running 24x7, and hence does not fail due to disk outages or similar mishaps.
- Has no cost at rest! Well, other than the data storage itself.
- Scales to any load demand.
In the next tale I'll cover how we turned mission-critical, bossy, massive GPU nodes into servants of meek Lambda functions.