Architecture Weekly Issue #16. Articles, books, and playlists on architecture and related topics. Every record has a complexity indicator: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🍼 is an introduction to the topic or an overview. Now on Telegram as well.
WARNING 🇺🇦
It has already been two and a half months of Russia's crazy, inhuman, unjustified war against Ukraine. We condemn this war and want it to stop ASAP. We continue this newsletter so you can advance your skills and help the millions of Ukrainian people in any way possible.
Fault Injection for Reliability Testing 👷‍♂️
We already shared an article about Chaos Engineering at Netflix. DoorDash goes further and introduces a tool that can analyze microservices and inject expected failures in order to test the output and measure resiliency. Details inside.
Load-balanced Brooklin Mirror Maker 🤟
Having multiple Kafka clusters in multiple data centers can be challenging from the replication perspective. LinkedIn was using Kafka MirrorMaker for it but experienced scaling issues, so they introduced their own open-source mirroring solution: Brooklin. Read on for the difficulties they faced afterwards and how they overcame them.
Operation-Based SLOs 🍼
Service Level Objectives are targets for how a service behaves in terms of latency, availability, etc. Defining them is a good SRE practice. However, in the microservice world a product usually consists of multiple services, so it is hard to connect product metrics to the SLOs of individual microservices. That's why Zalando introduced business SLOs or, as they call them, Operation-Based SLOs. Read further.
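To make the idea concrete, here is a minimal sketch of checking an availability SLO per business operation instead of per microservice. It is not Zalando's implementation: the `OperationResult` record, the `place_order` operation name, and the targets are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationResult:
    operation: str   # hypothetical business operation name, e.g. "place_order"
    succeeded: bool  # True if the operation succeeded end to end


def availability(results, operation):
    """Share of successful runs of one operation, or None if it never ran."""
    relevant = [r for r in results if r.operation == operation]
    if not relevant:
        return None
    return sum(r.succeeded for r in relevant) / len(relevant)


def meets_slo(results, operation, target):
    """Compare the observed availability of an operation with its target."""
    observed = availability(results, operation)
    return observed is not None and observed >= target


# 999 successful and 1 failed "place_order" run, regardless of which
# microservice behind the operation caused the failure.
results = [OperationResult("place_order", True)] * 999
results.append(OperationResult("place_order", False))

print(meets_slo(results, "place_order", target=0.995))   # looser target
print(meets_slo(results, "place_order", target=0.9995))  # stricter target
```

The point of the sketch is that the unit of measurement is the operation the customer cares about, so a failure counts once no matter how many services were involved.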
Combining Event Driven Architecture with Microservices Style 👷‍♂️
A huge article by IBM from 2020 covering architecture considerations for both styles, their complexities, an architecture blueprint, patterns, technology stack, and deployment practices.
Fault Tolerance via Idempotence 🤟
A complex white paper that introduces a language for talking about process failures and idempotence, and then proves that idempotence leads to correct transaction behavior in a distributed system. A lot of monads inside.
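For readers who want the intuition before the formalism, here is a minimal sketch of why idempotence makes retries safe. It does not follow the paper's language or proofs: the `Account`, `LostReply`, and `call_with_retries` names are hypothetical, and deduplication by request ID is just one common way to get idempotence.

```python
class Account:
    """Deposit is idempotent per request_id: replays do not change state."""

    def __init__(self, balance=0):
        self.balance = balance
        self._applied = {}  # request_id -> balance right after that request

    def deposit(self, request_id, amount):
        if request_id in self._applied:
            # Already applied: return the recorded result, change nothing.
            return self._applied[request_id]
        self.balance += amount
        self._applied[request_id] = self.balance
        return self.balance


class LostReply:
    """Simulated failure: the first call is applied, but its reply is lost."""

    def __init__(self, account):
        self.account = account
        self.calls = 0

    def deposit(self, request_id, amount):
        self.calls += 1
        result = self.account.deposit(request_id, amount)
        if self.calls == 1:
            raise ConnectionError("reply lost")
        return result


def call_with_retries(operation, attempts=3):
    """Blindly retry on connection errors; safe only for idempotent calls."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except ConnectionError as error:
            last_error = error
    raise last_error


account = Account()
flaky = LostReply(account)
final = call_with_retries(lambda: flaky.deposit("req-1", 50))
print(final, account.balance, flaky.calls)
```

The caller cannot tell whether the first attempt was applied, so it simply retries; because the deposit is keyed by `request_id`, the money is added once even though the call ran twice.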
Books for Great Software Architects 🍼
This series of articles describes an architecture learning path based on great books. It provides a clear sequence of materials and helps avoid reading the same material twice. It is not finished yet, but it looks good. We also recommend adding Learning Domain-Driven Design by Vladik Khononov to the DDD thread.
How Scentbird moved to a new payment service 👷‍♂️
An exciting story written by my pal Andrew Rebrov, the CTO of Scentbird, on how they worked out what they expect from a subscription service, how much it costs to build one, and how to migrate the users. Must read.
Warp: Lightweight Multi-Key Transactions for Key-Value Stores 🤟
Murat continues to review computer science papers. This time it is about distributed transactions for NoSQL systems. There are interesting design decisions around a chained communication pattern and optimistic concurrency control. It looks like 2PC, but the paper's authors report good performance (75% of a non-transactional solution).
SoundCloud Chronicles the End of the Public API Strangler 🍼
A short story about an 8-year-long migration to a fully-fledged Backend For Frontend inside SoundCloud. The Strangler pattern with telemetry is safe, but it also brings non-obvious risks: an unhealthy codebase for many years, security concerns, and added complexity for feature development.
Delivering Large-Scale Platform Reliability 🍼
The Roblox team describes what they do for reliability. First, it is based on measurements across the whole product lifecycle, from CI to client experience. They also have architectural reviews and aggressive low-latency policies for clients. Look at how closely they watch the SLAs of internal dependencies. The Monthly Reliability Report is a great instrument for sharing information with the whole team. All of this looks very open and promising after a history of huge downtime.
The newsletter is supported by 5 premium subscribers. This helps pay for the hosting and mailing services, but it doesn't cover them completely, and we still need at least another 5. If you liked the newsletter, please consider supporting it as well by signing up for a premium subscription.
Brought to you by Vladimir @vvsevolodovich Ivanov and Ilya @puzan Zonov