Architecture Weekly Issue #158. Articles, books, and playlists on architecture and related topics. Split into sections and marked by complexity: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🍼 is an introduction to the topic or an overview. Now on Telegram and Substack as well.

Highlights

The trouble with Leader Elections 🤟

Leader election is a resiliency mechanism: once a node goes down, we can elect another and move on. However, it can still lead to multiple concurrent runs of the same task, or to the task not running at all. Find the details and possible ways out inside.
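To make the "multiple concurrent runs" problem concrete, here is a minimal sketch of one well-known mitigation, fencing tokens. It is not necessarily what the linked article proposes, and the names `LockService` and `FencedStore` are hypothetical, made up for this illustration. The idea: every election hands out a strictly increasing token, and the protected resource rejects writes that carry an older token than one it has already seen.

```python
class LockService:
    """Hypothetical election service: each election returns a larger token."""

    def __init__(self):
        self._term = 0

    def elect(self):
        # A new leader always gets a strictly higher token than any previous one.
        self._term += 1
        return self._term


class FencedStore:
    """Hypothetical resource that rejects writes carrying a stale token."""

    def __init__(self):
        self.highest_token = 0
        self.value = None

    def write(self, token, value):
        # A deposed leader that still believes it is in charge arrives
        # with an old token and is turned away here.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.value = value
        return True


lock = LockService()
store = FencedStore()

old_leader = lock.elect()   # node A becomes leader
first = store.write(old_leader, "job-1 by A")

new_leader = lock.elect()   # A is presumed dead, node B is elected
second = store.write(new_leader, "job-1 by B")

# A was only paused, wakes up, and tries to finish its work.
late = store.write(old_leader, "late write by A")
```

In this run the late write from the old leader is rejected, so the store keeps the value written by the new leader. The check lives in the resource, not in the leaders, because a paused leader cannot reliably know that it has been replaced.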

The Trouble with Leader Elections (in distributed systems)
A collection of posts by Joe Magerramov

#distributedsystems

Introducing Impressions at Netflix 👷‍♂️

From a YouTuber's perspective, the video cover is what sells your content to the user. Netflix is a content creation company as well as a video platform, so they have both perspectives. Learn how they decide how to present their content to the user and how they work out what to present in the first place.

Introducing Impressions at Netflix
Part 1: Creating the Source of Truth for Impressions

#casestudy

You need to be more strategic 🍼

I frequently say this to senior engineers and engineering managers, meaning that delivering features is not enough. You need to know how you're going to evolve the system according to business goals. Dan Pupius elaborates on it and mentions some good relevant topics.

“You need to be more strategic”
A primer on strategy for software engineers

#strategy

System Design Course Cohort #5 is open!

Typical system design courses teach technical skills but often overlook the connection to business problems. This course fills that gap, emphasizing the importance of recognizing and addressing business priorities with technical approaches. Learn to go beyond load balancing options and performance tactics by focusing on solving real business challenges. The course has been completed by 50+ engineers, with great feedback!

SIGN UP HERE

Follow-Up

A Quick Crash Course on Stateful vs Stateless Architecture 👷‍♂️

I rarely include this kind of primer, but this one deserves some respect for its final passage: don't make one final choice for the whole system, but rather decide which architecture quanta should be built in one style and which should go with the other.

A Quick Crash Course on Stateful vs Stateless Architecture - SWE Quiz
SWE Quiz - Become the smartest dev on your team.

#architecture

Amazon EKS Cluster Reliability with Node Auto Repair Hands On Guide 👷‍♂️

Typically with Kubernetes you think about Pod liveness. But pods do not run in a vacuum; they run on nodes. What happens if a node goes down? Then node repair is required. This is what Node Auto Repair in EKS is for. Find out how to configure it and make sure it works.

🎅Boost Your Amazon EKS Cluster Reliability with Node Auto Repair🤶
🎄Wishing You a Happy Christmas and a Reliable Kubernetes Experience🎄

#aws #eks

Communication is the new bottleneck for the OLTP databases 🤟

Back in 2008, the major bottlenecks for databases were CPU and storage I/O. After 17 years of advancements, this is no longer the case: the biggest problem now lies in network communication, as databases spend the same amount of time returning the response as running the query itself. Check out the paper with the research!

#paper #db

Stripe Ledger 👷‍♂️

Stripe processed 300 million transactions a day last Cyber Monday. To do this correctly, they built a system called “Ledger” to track and verify every movement of money. The article shows how they designed it to make sure all numbers match, even when lots of transactions happen at the same time. They also discuss the challenges of making Ledger reliable, secure, and easy to grow as Stripe expands.
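As a rough illustration of what "all numbers match" can mean, here is a minimal sketch of a double-entry invariant: every transaction is a set of entries whose amounts sum to zero, so money is only ever moved between accounts, never created or lost. This is not Stripe's code. The `Entry` and `Ledger` classes, the account names, and the sign convention (positive for debit, negative for credit) are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    account: str
    amount_cents: int  # assumed convention: positive = debit, negative = credit


class Ledger:
    """Hypothetical append-only ledger that only accepts balanced transactions."""

    def __init__(self):
        self._transactions = []

    def post(self, entries):
        # The core invariant: the entries of one transaction must sum to zero.
        if sum(e.amount_cents for e in entries) != 0:
            raise ValueError("unbalanced transaction")
        self._transactions.append(tuple(entries))

    def balances(self):
        # Balances are derived by replaying the log, never stored separately.
        totals = defaultdict(int)
        for transaction in self._transactions:
            for entry in transaction:
                totals[entry.account] += entry.amount_cents
        return dict(totals)


ledger = Ledger()
ledger.post([Entry("cash", 1000), Entry("merchant_payable", -1000)])
ledger.post([Entry("merchant_payable", 970), Entry("cash", -970)])

try:
    ledger.post([Entry("cash", 500), Entry("merchant_payable", -400)])
    rejected = False
except ValueError:
    rejected = True

balances = ledger.balances()
```

Because each accepted transaction sums to zero, the sum of all account balances is zero as well, which gives a cheap global check that nothing went missing.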

Ledger: Stripe’s system for tracking and validating money movement
Technical details on how Stripe built Ledger, a state-of-the-art money movement tracking system, including how teams at Stripe interact with the data quality metrics that underlie our payment processing network.

#casestudy

Redefining Data Engineering with Go and Apache Arrow 👷‍♂️

And finally, an interesting speculation on rethinking data engineering. Thomas F McGeehan argues, and rightfully so, that the traditional data processing pipeline consists of multiple steps: reading the data in a row-oriented format, serializing it, passing it over the network, deserializing it, converting it to a column-based format, etc. The author shows how it can be optimized by reading the data right into a column-based binary format and consuming it as an in-memory source in DuckDB.
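For readers new to the row versus column distinction, here is a tiny pure-Python sketch of the conversion step the traditional pipeline pays for. It uses no Arrow or DuckDB APIs; the sample data and the `rows_to_columns` helper are made up for illustration, and it assumes every row has the same keys.

```python
# Row-oriented: one record per item, as most sources and APIs deliver data.
rows = [
    {"id": 1, "city": "Oslo", "temp_c": 3.5},
    {"id": 2, "city": "Cairo", "temp_c": 21.0},
    {"id": 3, "city": "Lima", "temp_c": 14.5},
]


def rows_to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


# Column-oriented: one list per field, the shape analytical engines prefer.
columns = rows_to_columns(rows)

# An aggregate now touches a single column instead of every whole row.
average_temp = sum(columns["temp_c"]) / len(columns["temp_c"])
```

The pivot walks every value of every row once. Reading straight into a columnar layout, as the article describes, avoids that extra pass along with the serialization steps around it.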

Redefining Data Engineering with Go and Apache Arrow
Breaking Bottlenecks and Unlocking True Performance

#dataengineering

Big thanks to Nikita, Constantin, Anatoly, Oleksandr, Dima, Pavel B, Pavel, Robert, Roman, Iyri, Andrey, Lidia, Vladimir, August, Roman, Egor, Roman, Evgeniy, Nadia, Daria, Dzmitry, Mikhail, Nikita, Dmytro, Denis and Mikhail for supporting the newsletter on Patreon! If you like the newsletter, feel free to support it there - with one-time support for example!