
Artur Ejsmont is a guy with a really cool name, and a really cool book. He is also Head of Platform Engineering at Yahoo in Sydney.
Now to the book: it is hard for me to overstate how much relief I felt after reading this book. I am a self taught software developer and I had some serious wholes in my knowledge when it came to architecturing large full stack applications. I had of course come across disjointed articles here and there that covered some of the topics were are about to discuss, but this book simply does an incredible job of tying it all together. It gives the full picture and a much better intuition for how backend concepts are related.
Every line of the book packs knowledge and I had many “aha” moments while reading it. It’s not going to be possible to go into every area that the book covers, but I’m going to do my best at outlining this huge knowledge cloud!
“Front End Layer”
Artur mentions CDNs and caching as two ways to help the front layer scale.
Content Delivery Networks use GeoDNSs to point users to a local node of the network, they are ubiquitously used in web development because they enable a number of interesting benefits. One of the first benefits of CDNs is that they can be used to cache static content (HTML, CSS, JavaScript, fonts, images, videos, audio). This is helpful because your web servers then don’t have to do any work.
The other main benefit of CDNs is that they are closer to the user than your web servers are. Some DNSs have nodes in over 50 different locations over the worlds, and a CDN uses GeoDNS (a DNS service that can interpret to user’s physical location) to route a user to its closest node, drastically reducing latency. I’m going to write another book review shortly about a fantastic networking book I’m reading that goes into more depth about how DNSs drastically improve TCP/TLS latency, by reducing handshake and certificate negotiation times and avoiding the slow start algorithms inherent to TCP communications.
On top of the previously covered HTTP and media caching, the book also talks about browser caching (localStorage). An app can quite easily use localStorage to keep track of the local state of the application, and simply reload the last state when an application is closed and reopened. Google maps for instance might locally persist the latest map, so that the next time the user opens his app, it isn’t blank: there will already be a loaded map.
“Web Services”
The book goes in detail into load balancing. And caching. HTTP caching is of course useful and it avoids having to generate new content. But object caching, application entities that cannot be cached in NGINX, can benefit from being cached in a fast in memory database such as Redis (Redis is particularly useful because it has an expiry flag that can be set for each object).
Keeping web services stateless is vitally important as it enables them to be horizontally scalable: hide any number of servers behind a load balancer, and as long as they are stateless, your application can scale. This enables better scalability than vertical scaling (buying a more and more powerful service to handle all traffic) because there isn’t a limit on the number of web servers you can have. If your load balancer is becoming overwhelmed: use more than one load balancer and use your DNS server to route traffic to each load balancer.
“Data Layer”
The book has the best introduction to SQL scalability that I have come across. The author talks about read replication where multiple read databases reading form a master, to scale reads (but not writes). Master-master replication to improve reliability and avoid downtime. And vertical or horizontal sharding of databases.
Managing distributed SQL databases might be a bit of a hassle though, and the author also talks about some NoSQL databases, their pros and cons. NoSQL databases usually have a harder time guaranteeing ACID transactions, and usually offer a choice between high availability (low latency when getting and editing data) versus high consistency (the retrieved data is always accurate). There are many trade offs ad gotchas that have to be properly understood when designing and using distributed databases (CAP theorem, and more). I have plans to read more about it soon, if it’s any good I’ll be reviewing that too! The pros of NoSQL databases are often that they are able to store Tera to Peta bytes of data.
“Asynchronous Processing”
Ejsmont mainly talks about queues. He mentions that some services are publishers and others are workers. Queuing jobs should help your applications are using resources efficiently and avoid getting overwhelmed. The workers will only take the next job as soon as they are able to. Queues also encourage strong decoupling between parts of your application which is a sign of good engineering.
We have used asynchronous processes at TransferWise to process money orders by making them go through various stages (is there missing information? is the customer verified? is this a potential fraud? does we need to get more information about the recipient? are their enough funds on the account? did the payment go through?). We also use queues at Maple Inside with Google PubSub as part of our Event Driven Architecture.
There are however challenges to queues: it’s something extra to manage and deploy. Queues can sometimes crash if they become overwhelmed. And poisoned messages can systematically crash your whole application because they keep getting sent to new workers after crashing the previous ones.
“Searching for Data”
In one of the last parts of the book, Artur talks about search engines and how they might be useful for some applications. He mentions ElasticSearch in particular – a somewhat new but hot technology. Search engines might take in XML or JSON documents and are then able to efficient search through them. A common pattern for an eCommerce website that sells cars might be to add a new document to ElasticSearch every time a new car is added to the cars database. When a user wants to search for a specific car model, a list is returned by ElasticSearch, and the latest data for each carId is fetched from the cars database.
There is a good deal more information in the book that I haven’t touched on. I’m particularly fond of the book and think it might be one of my favorite books of the year. It should really be recommended reading for any (especially newby) engineers.
