Monitis transaction – a rapid implementation web-monitoring solution
Posted by Javier Tejero
What if you could be made aware of a broken user journey, from anywhere across the globe, in less than five minutes? And what if you could set up this monitoring within hours, if not minutes?
A high-traffic website presents multiple scalability problems, which usually bring a certain level of added complexity.
This is especially true when a distributed system is built by a large number of engineers, spread across a considerable number of squads that deploy code asynchronously. Because of that underlying complexity, you might not notice when your website breaks in unpredictable ways, e.g. for a given subset of users or for users in a certain location. Unfortunately, this has occasionally happened to us.
Thankfully, this is now a thing of the past, because Monitis Transaction helps our small team solve this challenge. In a nutshell, Monitis Transaction is a monitoring solution that uses Selenium, running on a grid of servers deployed across different regions and cities, to test User Journeys. If a journey breaks from any of those locations, you get an alert within five minutes. In our case, alerts go to email and VictorOps.
The good parts
Selenium is a widespread, well-known, incredibly useful tool. The User Journey is defined in terms of actions supported by Selenium, so the learning curve is almost zero if you already know how to use Selenium. Plus, there’s a Firefox plugin to help you create the User Journey, which is very handy, particularly for testing locally that your transaction works as expected.
So if you already know Selenium and you use the Firefox plugin you can literally create the User Journey in minutes.
Finally, the dashboard is really useful because you can pinpoint quickly why a specific journey failed. Not only will it tell you which step has failed, but it will save a screenshot of the rendered web page for your analysis later. The screenshot turns out to be priceless in most cases.
The bad parts – and how to overcome them
Firstly, you might want to keep the User Journey from triggering specific side effects such as analytics, which is not trivial unless you already have a proper mechanism in place.
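One possible mechanism, as a minimal Python sketch: let the monitoring journey carry a recognisable marker and skip analytics whenever it is present. This assumes your journey can set a cookie or a custom User-Agent. The marker names below are hypothetical and do not come from Monitis.

```python
# Hypothetical markers: neither name comes from Monitis; pick your own.
SYNTHETIC_COOKIE = "synthetic_monitor"
SYNTHETIC_UA_TOKEN = "SyntheticMonitor"


def is_synthetic_request(headers, cookies):
    """Return True when a request looks like it came from the monitoring journey.

    Assumes `headers` and `cookies` are plain dicts keyed by canonical names.
    """
    user_agent = headers.get("User-Agent", "")
    if SYNTHETIC_UA_TOKEN.lower() in user_agent.lower():
        return True
    return cookies.get(SYNTHETIC_COOKIE) == "1"


def should_track(headers, cookies):
    """Record analytics events only for real (non-synthetic) traffic."""
    return not is_synthetic_request(headers, cookies)
```

The same check can guard any other side effect you would rather not trigger from a monitoring run.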
Secondly, the Firefox plugin is handy, but it does not behave exactly like the real Selenium agents. For some complex User Journeys you might need an extra dose of patience. Also, in our case, we had to whitelist IP addresses from the Monitis cloud, otherwise our anti-bot systems would have blocked the transactions from running.
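The whitelisting idea can be sketched with Python's standard `ipaddress` module. This is an illustration only, not our actual anti-bot setup: the ranges are placeholder documentation addresses (RFC 5737), not real Monitis IPs, and the function name is made up.

```python
import ipaddress

# Placeholder ranges from the RFC 5737 documentation blocks, NOT real Monitis addresses.
MONITORING_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]


def is_allowed_monitor(client_ip):
    """Return True if client_ip falls inside one of the allow-listed networks."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # malformed input is never allow-listed
    return any(address in network for network in MONITORING_NETWORKS)
```

An anti-bot layer would call something like `is_allowed_monitor(request_ip)` before applying its usual checks.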
Last but not least: when testing with Selenium, it’s well known that false positives can arise, for example due to network issues. This is possibly the worst part, because it generates noise in the notifications, wastes resources and, worst of all, can cause a real production issue to be ignored through the ‘crying wolf’ effect.
While we are still in the early stages of solving this problem completely, we find that VictorOps helps alleviate it at the cost of delaying the notification by a few minutes: you only get notified when the issue has not resolved itself within 15-20 minutes.
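The delay rule itself is simple. Here is a minimal Python sketch of the idea. It is not how VictorOps implements it, and the function name and the shape of the results are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumed grace period, taken from the lower end of the 15-20 minute window.
GRACE_PERIOD = timedelta(minutes=15)


def should_notify(results, now, grace=GRACE_PERIOD):
    """Decide whether a failure has lasted long enough to notify someone.

    results: list of (timestamp, passed) tuples, oldest first.
    Returns True only if the latest run failed and the current unbroken
    streak of failures started at least `grace` ago.
    """
    streak_start = None
    for timestamp, passed in results:
        if passed:
            streak_start = None  # a passing run resets the streak
        elif streak_start is None:
            streak_start = timestamp  # first failure of a new streak
    return streak_start is not None and now - streak_start >= grace
```

A failure that clears on the next run never reaches the grace period, so a one-off network blip stays silent.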
User Journey example
I am part of the Hotels Front End squad, so our Transaction covers just the hotels search scenario. We use one single transaction to cover a complete user journey:
• The Skyscanner home page loads and the user can go to the Hotels vertical by using the ‘Hotels’ tab.
• Our Search Controls work as expected (this covers everything from the Autosuggest component to the date pickers).
• The user can perform a hotel search and the results page contains valid results.
• The Hotel Detail page also renders correctly.
While there are many components and parts of the website that are not tested, this is the most critical user journey for our squad.
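To give a feel for it, the journey above could be written as a list of Selenium-style steps. This is a hedged Python sketch, not our real transaction and not the Monitis format: the command names follow the Selenium IDE command/target/value style, while every locator, the search term and the `run_journey` helper are hypothetical.

```python
# Steps as (command, target, value) triples. Command names follow the
# Selenium IDE style; every locator and value here is hypothetical.
HOTEL_SEARCH_JOURNEY = [
    ("open", "/", ""),
    ("click", "link=Hotels", ""),
    ("type", "id=destination-input", "Edinburgh"),
    ("click", "id=search-button", ""),
    ("waitForElementPresent", "css=.hotel-result", ""),
    ("click", "css=.hotel-result a", ""),
    ("waitForElementPresent", "id=hotel-detail", ""),
]


def run_journey(steps, execute):
    """Run steps in order; return (True, None) or (False, index_of_failed_step).

    `execute` is any callable taking (command, target, value) and raising an
    exception on failure, e.g. a thin wrapper around a Selenium driver.
    """
    for index, (command, target, value) in enumerate(steps):
        try:
            execute(command, target, value)
        except Exception:
            return False, index  # stop at the first failing step
    return True, None
```

Returning the index of the failing step mirrors what makes the dashboard useful: you know which step of the journey broke.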
We run this in a single Monitis Transaction to keep costs down and so far it’s worked well for us. We run the transaction every 10 minutes from five different locations all over the world.