A polyglot architecture – Skyscanner’s frontend under the hood
Posted by Alex Bardas
A few months ago I had the pleasure – if I ignore the random weather – of travelling to London and attending a recruitment event called Silicon Milkroundabout with some of my colleagues from Skyscanner. The most frequent question I was asked was “What programming language do you use?” or “What’s Skyscanner written in?”
It’s an interesting question, because it’s hard to give a straight answer to it. Not because of some weird non-disclosure agreement (we’re pretty open about the technologies we use and we open source a lot of our projects – check out our GitHub), but because we use so many of them. Here’s a look at how and why.
Tribes, squads and different programming languages
Internally, Skyscanner is ‘squadified’, an organizational model described here, which means we’re organized in tribes – yes, we call them tribes, and each has a tribe leader, although I’m not sure that’s actually written on their business cards. Each tribe is divided into squads.
Squads are multidisciplinary teams that own a service and have a very high degree of autonomy, the main – and sometimes only – constraint being that they must respect contracts with other squads. Think of each squad as a mini-startup. Under this paradigm, squad members can pick their release cycle, their agile methodology – Scrum, Kanban – and of course, their technology stack, which makes things really interesting and very diverse.
At Skyscanner we have Java, Python, .NET, PHP, Ruby and Node.js squads, and the list goes on. We believe that the problem chooses the technology and not the other way around, so I can say we’re actually polyglots when it comes to programming languages. Which is really cool, but also really hard to explain at a recruitment event.
The Hotels Vertical
I work in the Hotels tribe, in the frontend squad, but other verticals share a similar architectural approach. Compiling a list of hotels and displaying it to the end user, with up-to-date prices from multiple global partners, without duplicates and with relevant images, goes well beyond issuing a “SELECT * FROM hotels WHERE price BETWEEN …” DB query. It’s a fine-tuned process that involves multiple teams, including:
• Partner engineering squad – they liaise with our partners and are responsible for retrieving information from them, such as hotel prices. Stack: mostly Python.
• Data squad – this squad creates the so-called “hotel data packages” which are used to display information to the users, making sure the information is consistent and without duplicates. Jacek’s One picture is worth a thousand words. So, how does it scale to a million pictures? provides a more in-depth view of what they do. Stack: Python.
• Geo squad – this squad maintains the Travel Knowledge Graph, a database system that represents the world as an ontology. This database can be queried directly using a language called DQL (Distributed Query Language). Stack: Python and a modified version of PostgreSQL.
• Search services squad – tasked with providing the best autocomplete results to the user, their algorithms try to guess the user’s intended destination even when there are typos involved. More on the subject in Ben’s Measuring Autosuggest Quality post. Stack: Java, Solr and Lucene.
• Backend squad – the backend communicates with all the squads described above and compiles a list of results that is made available to the frontend via a RESTful API. Stack: Python.
• Web Applications squad – they own the Scaffolding component described in detail below and some elements that are common across all Skyscanner’s pages, such as headers & footers. Stack: .NET / C#.
• INTLOC squad – our Internationalization & Localization squad. Their service allows all other squads to deliver a localized native experience to our global pool of users. Stack: .NET / C#.
Edge Side Includes in the frontend
The Skyscanner frontend is not a ‘site’ in the traditional sense of the word, with HTML code being generated by a single server-side technology and served to the user by Apache or nginx, but rather a collection of various ESI components developed and managed by different squads. The end result is that depending on where one might click on the page, the underlying HTML was generated by a different server-side technology owned by a different squad, located in a different part of the world.
So what exactly is an ESI component in our case? Well, it’s a self-contained entity that renders, styles and provides JS interaction for a piece of HTML. Each component has a unique URL – e.g. /hotels/search-box – and several endpoints, each with its own responsibility, as shown in the image below.
The endpoint’s name is appended to the component’s URL to create the endpoint’s URL. So for example, if I want to render the script tag for the hotels search box somewhere in the footer, I would issue a request to /hotels/search-box/script.
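The naming rule is simple enough to write down as a one-line helper. This is only a minimal sketch: the function name endpoint_url is made up for illustration, and the only values taken from the text are the component URL /hotels/search-box and the script endpoint.

```python
def endpoint_url(component_url: str, endpoint: str) -> str:
    """Append an endpoint name to a component URL (illustrative helper)."""
    # Normalise slashes so "/hotels/search-box/" and "script" still join cleanly.
    return f"{component_url.rstrip('/')}/{endpoint.strip('/')}"


# The example from the text: the script endpoint of the hotels search box.
print(endpoint_url("/hotels/search-box", "script"))
```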
In the current architecture, each public URL has a template mapped to it, with placeholders for ESI components. This template is pre-processed by a component called Scaffolding and sent to Varnish, which in turn requests all the ESI endpoints, applies caching rules and sends the end result back to the user. Because ESI URLs are internal and dispatched by Varnish, components cannot directly access information coming from the client, such as query string parameters or cookies. Instead, this information is requested via a special endpoint called requirements and injected by Scaffolding during pre-processing.
Here’s a simplified diagram that shows how an HTTP request is handled.
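To make the flow more concrete, here is a toy Python sketch of the two steps: pre-processing a template, then resolving its ESI placeholders. It is not how Scaffolding or Varnish are actually implemented. The esi:include tag comes from the ESI language itself, but the template, the html endpoint name, the REQUIREMENTS table, the idea of passing client values as query string parameters and the fake_fetch function are all assumptions made for illustration, and caching is left out entirely.

```python
import re
from urllib.parse import urlencode

# Hypothetical template mapped to a public URL, with ESI placeholders.
TEMPLATE = (
    "<html><body>"
    '<esi:include src="/hotels/search-box/html"/>'
    '<esi:include src="/hotels/search-box/script"/>'
    "</body></html>"
)

# Assumed result of asking each component's "requirements" endpoint which
# client values it needs (only query string parameters in this sketch).
REQUIREMENTS = {"/hotels/search-box": ["city"]}

ESI_TAG = re.compile(r'<esi:include src="([^"]+)"\s*/>')


def scaffold(template, client_params):
    """Pre-processing: inject the required client values into each ESI URL."""
    def inject(match):
        src = match.group(1)
        component = src.rsplit("/", 1)[0]  # strip the endpoint name
        wanted = REQUIREMENTS.get(component, [])
        params = {k: client_params[k] for k in wanted if k in client_params}
        if params:
            src = f"{src}?{urlencode(params)}"
        return f'<esi:include src="{src}"/>'
    return ESI_TAG.sub(inject, template)


def assemble(template, fetch):
    """Proxy step: replace each ESI tag with the fragment fetched from its URL."""
    return ESI_TAG.sub(lambda m: fetch(m.group(1)), template)


def fake_fetch(url):
    # Stand-in for an internal HTTP request to the squad that owns the component.
    return f"<!-- fragment from {url} -->"


page = assemble(scaffold(TEMPLATE, {"city": "London", "utm": "x"}), fake_fetch)
print(page)
```

Note that the component only ever sees the values it asked for: the utm parameter from the client never reaches the search box, because it is not listed in its requirements.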
In a nutshell…
Hundreds of engineers, working with different technologies in different geographical areas, independently release components at different times, and these components are assembled on the fly to render the site. It seems like a giant puzzle. And every single time I describe our architecture and way of working, I get the following question: “Does it actually work?” Yes! Amazingly well, and for 50 million users every month.
How we got here
Things were not always like this for the Hotels Frontend Squad. If you want to learn about our journey, have a look at my presentation ‘Skyscanner Journey – From code jungle to state of the art’, given at the PHP Barcelona Conference in 2015.
For more on being a polyglot, see Richard Lennox’s post on being a ‘Polyglot Technologist’ here.