Architecture Weekly Issue #67. Articles, books, and playlists on architecture and related topics. Split by sections, highlighted with complexity: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🍼 is an introduction to the topic or an overview. Now in Telegram as well.
WARNING 🇺🇦
It's already been a year since Russia's crazy, brutal and unjustified war against Ukraine. We condemn this war and want it to stop ASAP. We continue this newsletter so you can advance your skill and help the millions of Ukrainian people in any way possible. If you want to help directly, visit this fund.
This week we held a discussion on Disaster Recovery with Misha Druzhining. And you won't believe what happened in the middle of the broadcast.
Big thanks to Nikita, Anatoly, Oleksandr, Dima, Pavel B, Pavel, Robert, Roman, Iyri, Andrey, Lidia, Vladimir, August, Roman, Egor, Roman, Evgeniy and Nadia for supporting the newsletter. They receive early access to the articles, influence the content and participate in the closed group where we discuss architecture problems. They also see my daily updates on all the things I am working on. Join them at Patreon or Boosty!
Highlights
Database Sharding Explained 🤟
Sharding is an important technique for keeping the overall system reliable and performant. You can do it in a variety of ways, each of which can cause its own problems. The Architecture Notes blog has a free post explaining in detail what sharding is.
#db #sharding
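To make the idea concrete, here is a minimal sketch of one common approach, hash-based sharding. It is not taken from the Architecture Notes post; the shard names and the helper functions `shard_index`, `route` and `moved_fraction` are hypothetical and exist only for illustration.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to connection strings.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_index(key: str, num_shards: int) -> int:
    """Map a key to a shard number using a stable hash.

    Python's built-in hash() is salted per process, so it is not used here.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(key: str) -> str:
    """Return the (hypothetical) shard that owns the given key."""
    return SHARDS[shard_index(key, len(SHARDS))]


def moved_fraction(keys: list[str], old_count: int, new_count: int) -> float:
    """Fraction of keys that land on a different shard after resizing."""
    moved = sum(
        1 for k in keys
        if shard_index(k, old_count) != shard_index(k, new_count)
    )
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"user:{i}" for i in range(10_000)]
    print(route("user:42"))
    print(moved_fraction(keys, 4, 5))
```

The same key always routes to the same shard, which is the point of the scheme. The sketch also shows one of the problems such a scheme can cause: with plain modulo hashing, growing from 4 to 5 shards relocates most of the keys (about 80% in expectation), which is why approaches such as consistent hashing are often considered instead.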
How Tinder built its own API Gateway 👷‍♂️
Tinder tried multiple solutions for API Gateway, including AWS API Gateway, Apigee, Kong and others. But in the end, they decided they needed a bespoke solution to meet their requirements: scalable, reusable and configuration-based. So they took Spring Cloud Gateway and built their solution on top. Find out what they managed to achieve in the article on the Tinder Tech Blog.
Migrating Critical Traffic at Scale with No Downtime - Part 1 👷‍♂️
Bringing new infrastructure under production load is always a little risky. For Netflix, which wants to ensure an uninterrupted watching experience, doing it without downtime is a critical technical capability. In their latest blog post, they explain that real traffic replay plays a crucial role in testing new services, and that they built a special solution for it, including a replay server. Follow the article for the details!
#sre #casestudy
Follow-Up
Software Architecture Canvas 🍼
I am a big proponent of Solution Architecture Documents, RFCs and ADRs. But it's always good to take a fresh look. Patrick Roos shared a new format that enables a collaborative approach to architecture: the Canvas. I especially like the strong emphasis on the business case (top of green) and the risks and challenges (in blue). Give it a try!
#documentation
The Inner Workings of Distributed Databases 🤟
Alex Pelagenko begins the article with a nice analogy: he gets to the office by bike, but if it fails, should there be a replacement? The same happens with databases: if the first node fails, there should be a standby. But should the replication be sync or async? Should it be master-master replication? Alex considers several databases and uses sequence diagrams to show how they handle disconnection issues.
#db #timeseries
Building a large scale unsupervised model anomaly detection system 🤟
Lyft leverages tons of ML models to define a wide range of parameters, from ETAs to pricing. But they also need to understand whether those models perform well. The problem is that different models have different numbers of features and outputs, so they need to unify and process them efficiently. Find out how they do it in the blog post!
#ml
2023 State of Platform Engineering Report
The word DevOps is mentioned less frequently, while people speak more and more about Platform Engineering. Perforce is publishing its report on Platform Engineering, and among many valuable insights, you will find the statement that companies underinvest in product managers for their platforms. A platform is still a product, even if its users are your internal developers. Find the report download below, and while you're going through it, turn on the discussion about developer relations with Baruch Sadogursky here.
#devops #platformengineering
Passwords are no more
Passwords have a long history of problems: they are easy to brute-force, easy to phish and prone to social engineering attacks. With the zero-trust world coming, the passwordless approach has finally become publicly available with support from Google and Apple. Read the news post!
#security