Architecture Weekly #138

Architecture Weekly Issue #138. Articles, books, and playlists on architecture and related topics. Split by sections, highlighted with complexity: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🍼 is an introduction to the topic or an overview. Now on Telegram and Substack as well.

Sponsored

Depot Managed GitHub Actions runners offer caching that's 10x faster than GitHub's own solution, and at half the cost. It’s the secret to boosting your CI/CD pipelines without breaking the bank. Find out how it works!

Highlights

PostgreSQL and UUID as primary key 👷‍♂️

An absolute must-read if you're using UUIDs as primary keys, not only in PostgreSQL but anywhere. Apparently, the type of the ID column and the UUID version you're using drastically affect performance in tables with at least hundreds of thousands of rows.

PostgreSQL and UUID as primary key

#db #performance
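
To make the UUID version point above concrete, here is a minimal sketch of a time-ordered, UUIDv7-style generator. It assumes the RFC 9562 layout (a 48-bit millisecond timestamp followed by random bits); the function name `uuid7_sketch` is hypothetical and the code is an illustration, not the approach from the linked article. Random UUIDv4 values land all over a B-tree index, while time-ordered values sort by creation time, so new rows tend to go to the end of the index.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a UUIDv7-style value: 48-bit millisecond timestamp, then random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = (unix_ms & 0xFFFFFFFFFFFF) << 80      # timestamp in the top 48 bits
    value |= rand
    # Overwrite the version nibble (bits 76..79) with 7.
    value &= ~(0xF << 76)
    value |= 0x7 << 76
    # Overwrite the variant bits (bits 62..63) with binary 10.
    value &= ~(0x3 << 62)
    value |= 0x2 << 62
    return uuid.UUID(int=value)


# IDs created at later timestamps compare greater than earlier ones.
ids = [uuid7_sketch(ms) for ms in (1_000, 2_000, 3_000)]
print(ids == sorted(ids))
```

Two IDs generated within the same millisecond are ordered only by their random bits, so this sketch does not guarantee strict monotonicity.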

Pushy to the Limit: Evolving Netflix's WebSocket proxy for the future 👷‍♂️

Handling tens of millions of messages sent to TV devices is not an easy task, and Netflix has to handle even more. Initially, Pushy was well suited to the volume they had, but the number of devices and messages grew 10 times, so changes to the system were required. Read Netflix's article explaining the changes and the motivation behind them.

Pushy to the Limit: Evolving Netflix’s WebSocket proxy for the future
Pushy is Netflix’s WebSocket server that maintains persistent WebSocket connections with devices running the Netflix application. This…

#casestudy

Rearchitecting: Redis to SQLite 👷‍♂️

Wafris is a Web Application Firewall, deployed as middleware with Rails applications. They store the WAF rules in Redis, at least until they realized the network latencies were hurting the product. So they... migrated to SQLite! Find the motivation inside!

Rearchitecting: Redis to SQLite | Wafris
Learn how we approached migrating our Wafris v1 client based on Redis to a new faster, easier to use SQLite architecture.

#db #casestudy
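
As a rough illustration of why an embedded database removes the network hop mentioned above, here is a minimal sketch of a rule lookup against SQLite running inside the application process. The `blocked_ips` table and the `is_blocked` helper are hypothetical and are not Wafris's actual schema or code.

```python
import sqlite3

# Hypothetical schema: one table of blocked IPs (not Wafris's actual layout).
conn = sqlite3.connect(":memory:")  # a real deployment would open a local file
conn.execute("CREATE TABLE blocked_ips (ip TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO blocked_ips (ip) VALUES (?)",
    [("203.0.113.7",), ("198.51.100.23",)],
)
conn.commit()


def is_blocked(connection, ip):
    """Return True if the IP is on the blocklist; the lookup never leaves the process."""
    row = connection.execute(
        "SELECT 1 FROM blocked_ips WHERE ip = ?", (ip,)
    ).fetchone()
    return row is not None
```

Every request check here is a local read, whereas the Redis-based design described in the article needed a network round trip for the same lookup.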

Follow-Up

Legacy Modernization meets GenAI 👷‍♂️

Improving a legacy codebase takes a lot of effort and analysis: understanding the parts of the system and the relationships between them, building a capability map, and so on. LLMs, meanwhile, are most widely used for code generation. But you can apply GenAI to legacy modernization too: it's great at summarizing large texts (which codebases essentially are) and providing insights. Read the whole post on the topic on Martin Fowler's blog.

Legacy Modernization meets GenAI
Lessons from building and using a GenAI tool to assist legacy modernization.

#genai #architecture #refactoring

How to create Architectural Decision Records - and how not to 🍼

I have talked a lot about the necessity of ADRs, but I missed this brilliant piece on HOW to actually write the records and, even more importantly, how not to write them. For example, you don't want to sugarcoat your decision by deliberately omitting its bad consequences.

How to create Architectural Decision Records (ADRs) — and how not to
How to document the design decisions that make or break your project

#documentation #adr

A Beginner's Guide to the OpenTelemetry Collector 🍼

You can send traces, logs and metrics straight to the observability backend, but that introduces risks such as vendor lock-in. You might also need to filter the data you send, for example for privacy reasons. The OpenTelemetry Collector exists to help with that. Find out why and how to use it.

A Beginner’s Guide to the OpenTelemetry Collector | Better Stack Community
This article explores why the OpenTelemetry Collector is a popular choice for building efficient and adaptable observability pipelines

#observability
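
The idea of putting a filtering step between your applications and the backend can be sketched in a few lines. This is a conceptual illustration only: the real Collector is configured in YAML with receivers, processors and exporters, and the names below (`SENSITIVE_KEYS`, `redact`, `run_pipeline`) are hypothetical.

```python
# Hypothetical attribute names that must not leave the system.
SENSITIVE_KEYS = {"user.email", "http.request.header.authorization"}


def redact(record):
    """Processor step: drop attributes that should not reach any backend."""
    attrs = {k: v for k, v in record["attributes"].items() if k not in SENSITIVE_KEYS}
    return {**record, "attributes": attrs}


def run_pipeline(records, processors, exporters):
    """Take received records, apply each processor in order, then fan out to exporters."""
    for record in records:
        for process in processors:
            record = process(record)
        for export in exporters:
            export(record)


# Two stand-in "backends": switching or adding a vendor only touches this list.
backend_a, backend_b = [], []
run_pipeline(
    [{"name": "GET /orders", "attributes": {"user.email": "a@example.com", "http.route": "/orders"}}],
    processors=[redact],
    exporters=[backend_a.append, backend_b.append],
)
```

Because applications only talk to the pipeline, changing the backend or the privacy rules does not require touching application code.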

How Notion built and grew the data lake 👷‍♂️

Up to 2021, Notion stored everything in a single PostgreSQL instance. Then the growth happened, the time for AI features came along, and analytics tasks demanded separate infrastructure. So they built a data lake based on Kafka, Apache Hudi and Apache Spark, storing all the data in S3. Grab the details inside!

How Notion built and grew our data lake to keep up with rapid growth

#olap #casestudy

The new collaborative text editing algorithm 🤟

Google Docs and other online editors use special data structures and algorithms to give you the collaborative experience. But the current ones are slow and/or inefficient in some cases (see How to design Google Docs). The new algorithm, Eg-walker, is better because it combines the strengths of CRDTs (Conflict-free Replicated Data Types) and Operational Transformation (OT), offering efficient memory usage, lower latency, and fast conflict resolution. It achieves these improvements by using an event graph structure that scales well with document size and reduces the overhead of storing and merging edits, making it highly suitable for real-time and offline collaborative editing. Read the paper here!

#paper

WARNING 🇺🇦

The brutal and unjustified war against Ukraine has already been going on for 2 years. If you want to help Ukraine directly, visit this fund.

Big thanks to Nikita, Constantin, Anatoly, Oleksandr, Dima, Pavel B, Pavel, Robert, Roman, Iyri, Andrey, Lidia, Vladimir, August, Roman, Egor, Roman, Evgeniy, Nadia, Daria, Dzmitry, Mikhail, Nikita, Dmytro, Denis and Mikhail for supporting the newsletter. They receive early access to the articles and videos, influence the content, and participate in a closed group where we discuss architecture problems. Join them on Patreon or Boosty!