You didn’t hear it from me, but Cleo Engineers aren’t perfect. We have bugs. We have quite a few of them. Our relentless focus at Cleo is to make it as simple and joyful as possible for our users to level up their relationship with money.
But, when your error reporting tool (Rollbar) is frequently shouting at you, it’s difficult to prioritise which bugs to fix first. We knew we needed a solution.
Where to start? Fixing the bugs that hurt users the most, and the ones that damaged user retention the most, would make the biggest impact, so we decided to put all the bug data into our analytics database and compare it with user retention. At a high level, the approach was to copy data from Rollbar to our analytics database (a Redshift instance).
We needed some code that would run periodically, copy the data from Rollbar into Redshift and keep it up to date. There were two options for where to run it: Heroku and AWS Lambda. Heroku was a natural first choice because our main application ran on Heroku. Still, we laid out the pros and cons of each approach:
Heroku
Advantages:
No learning curve. Everyone at Cleo is familiar with it.
Disadvantages:
Redshift isn’t easily accessible from Heroku, so we’d need to introduce a significant architectural change.
AWS Lambda
Advantages:
An opportunity to learn something new.
We used it before for something similar.
Connecting with Redshift would be easier. Lambda and Redshift are both AWS services.
Disadvantages:
Potentially a big learning curve.
From a technical perspective, using a service on AWS meant that we could mitigate the risks and complications that might have arisen from talking to AWS services from outside AWS.
So without too much difficulty, we landed on AWS Lambda. We’re all about levelling up, so learning something new is always considered an opportunity here. One of our engineering principles at Cleo is to innovate the product, not the tech stack, which means we tend to lean towards the established way of doing things. However, because this was a non-user-facing feature, we felt we had more freedom to experiment and embrace new technology to enhance our tech stack. In fact, at the time of writing, Heimdallr* (our name for this project) has been extended to save CircleCI and Heroku build and release information to our Redshift instance.
How does it work?
Above is a Ruby app with a job on AWS Lambda that runs every minute, fetches the latest occurrences from Rollbar and writes them to Redshift.
Why are there so many middle men between the Lambda and Rollbar? Well, this Lambda lives in our Virtual Private Cloud (VPC), and a Lambda in a VPC has no public IP address, so on its own it can’t make requests to the internet. To make these requests, we need a NAT Gateway, which in turn requires an Internet Gateway. We route all outbound traffic from the Lambda to the NAT Gateway, and then we route all outbound traffic from the NAT Gateway to the Internet Gateway. We can then talk to Rollbar from the Lambda. This Stack Overflow answer explains it better.
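To make the hops concrete, here is a toy Ruby model of the routing described above. It is only a sketch: the subnet and component names are made up for illustration, and it assumes the common AWS arrangement of a Lambda in a private subnet and a NAT Gateway in a public one, which may not match our exact setup.

```ruby
# Hypothetical route tables, keyed by subnet. All names are illustrative only.
ROUTE_TABLES = {
  "private-subnet" => { "0.0.0.0/0" => "nat-gateway" },      # where the Lambda runs
  "public-subnet"  => { "0.0.0.0/0" => "internet-gateway" }  # where the NAT Gateway lives
}.freeze

# Which subnet each component sits in. The Internet Gateway is attached to the
# VPC itself, so it has no entry here and marks the end of the path.
LOCATION = {
  "lambda"      => "private-subnet",
  "nat-gateway" => "public-subnet"
}.freeze

# Follows the default route from a starting component until traffic leaves the VPC.
def outbound_path(start)
  path = [start]
  while (subnet = LOCATION[path.last])
    path << ROUTE_TABLES.fetch(subnet).fetch("0.0.0.0/0")
  end
  path
end

outbound_path("lambda") # => ["lambda", "nat-gateway", "internet-gateway"]
```

The point of the model is that neither default route reaches the internet by itself: the Lambda's traffic only gets out because the NAT Gateway's own subnet routes onwards to the Internet Gateway.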
The task to run was pretty simple. We needed to query the Rollbar API periodically and write to a table on Redshift to update it with new occurrences. We used the Ruby on Jets framework to simplify the process of getting our code on AWS. Using Jets meant we could write all of our Lambda in beautiful Ruby without having to deal with much boilerplate code. It comes with great documentation and some example apps. Its command-line tool made development easy, as we could run our Lambda functions locally in a Ruby console. Deployment was as simple as a single command. From this point, it was all about making sure we had any additional required configuration set up in the AWS console, and alerting set up through CloudWatch, to ensure our Lambda was functioning as expected.
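The core of a task like this can be sketched in plain Ruby. This is not our production code, just a minimal illustration under some loud assumptions: the payload shape (`result.instances` with `id`, `item_id` and `timestamp` fields) is our guess at what an occurrences response looks like, occurrence ids are assumed to only ever increase, and `OccurrenceRow`, `fetch_page` and `write_rows` are hypothetical names. The HTTP call and the Redshift insert are passed in as lambdas so the logic can be tried in a console without touching either service.

```ruby
require "json"

# Hypothetical shape of one row in the Redshift table.
OccurrenceRow = Struct.new(:occurrence_id, :item_id, :occurred_at)

# Parses a raw API payload (assumed shape) and keeps only the occurrences
# whose id is higher than the highest id already stored, oldest first.
def new_occurrence_rows(payload_json, last_seen_id)
  instances = JSON.parse(payload_json).dig("result", "instances") || []
  instances
    .select  { |i| i["id"] > last_seen_id }
    .sort_by { |i| i["id"] }
    .map     { |i| OccurrenceRow.new(i["id"], i["item_id"], Time.at(i["timestamp"]).utc) }
end

# One sync pass. `fetch_page` returns the JSON body as a string and
# `write_rows` persists the rows; both are stand-ins for the real calls.
# Returns the new high-water mark to use on the next run.
def sync_once(fetch_page:, write_rows:, last_seen_id:)
  rows = new_occurrence_rows(fetch_page.call, last_seen_id)
  write_rows.call(rows) unless rows.empty?
  rows.empty? ? last_seen_id : rows.last.occurrence_id
end
```

Because each pass only writes rows above the last id it saw, running it every minute stays cheap and avoids writing the same occurrence twice. In a Jets app, a method like this would sit inside a job class scheduled with something like `rate "1 minute"`.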
You might be wondering what part Norse mythology plays here. Heimdallr is a god in Norse mythology and provided inspiration and guidance for this project. The story goes that Heimdallr guarded the rainbow bridge Bifrost, which connected Asgard, the world of the gods, and Midgard, the world of humanity (the bridge being the connection between Rollbar and Redshift). He was said to sleep less than a bird (our Heimdallr is always running) and used his horn, Gjallarhorn, to alert the gods when enemies drew near (our Heimdallr alerts us when there is an error). This metaphor is tenuous at best and grossly misunderstood at worst, but I think the little joy it brings us outweighs any of its shortcomings.
Interested in working at Cleo? Check out our open roles, here.