Architecture Weekly #130

Architecture Weekly Issue #130. Articles, books, and playlists on architecture and related topics. Split into sections and marked by complexity: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🍼 is an introduction to the topic or an overview. Now on Telegram and Substack as well.

Highlights

Migrating from ECS onto K8s in less than a year at Figma 👷‍♂️

Figma was already running their workloads in containers a year ago using ECS, but realized they were missing several crucial pieces of functionality, such as autoscaling and stateful storage. They decided to migrate to Kubernetes, which is a challenging task at Figma's scale. Find out more details in their engineering blog, though unfortunately without numbers.

How We Migrated onto K8s in Less Than 12 months | Figma Blog
Migrating onto Kubernetes can take years. Here’s why we decided it was worth undertaking, and how we moved a majority of our core services.

#k8s #casestudy

Analytical Table Format comparisons 🤟

OLAP is still pretty unmapped territory for me, which is why it's so interesting to figure out what the storage formats for analytical databases are and how they compare with each other. Jack Vanlightly compares 4 formats: Delta Lake, Apache Iceberg, Hudi and Paimon, and draws comprehensive diagrams!

Table format comparisons - How do the table formats represent the canonical set of files? — Jack Vanlightly
This is the first in a series of short comparisons of table format internals. While I have written in some detail about each, I think it’s interesting to look at what is the same or similar and what sets them apart from each other. Question: How do the table formats represent the canonical list of

#bigdata #db

The Testing Pyramid is upside-down 🍼

The classical testing pyramid suggests having tons of unit tests and only a few end-to-end ones, and people follow it because writing unit tests is much cheaper. Yet are they that useful? Perhaps the complexity of modern software requires flipping the pyramid upside down and putting the emphasis on e2e tests first. Find the considerations in the article.

The Testing Pyramid is upside-down
Conventional wisdom says you should have a lot of unit tests and a few end-to-end tests. What if that was exactly backwards?

#quality

Follow-Up

Kebab vs Cake organization 🍼

At Bolt the teams are named after customer journeys, like Rider Experience Team or Eater Experience Team. This is a sign that we have a so-called Cake organization. The opposite of that is a Kebab organization, where each team handles a piece of the functionality. Alex Ewerlöf analyses both approaches and lists pros and cons for each. Team Topologies, leaky abstractions: all of that impacts the organizational structure in the end.

Kebab vs Cake organization
The 2 most common organization architectures, their key characteristics, pros, and cons with an example

#ea

QA slows work down 🍼

You might have heard that QA activities slow things down, but it's quite the opposite: not applying proper quality control measures will slow you down, as you will have to do tons of rework. Vitaly Sharovatov tackles the issue in the Qase blog.

QA myth busting: QA slows work down
The idea that QA is a bottleneck is myth, so why do we keep hearing it? Let’s break down how this myth is perpetuated.

#qa

DynamoDB Backup & Restore 👷‍♂️

Serverless products are great because they handle the operational work for you, providing scalability and reliability. But that does not mean everything is available out of the box: even for serverless storage like DynamoDB you need to configure backups. Check out the options you have.

DynamoDB Backup & Restore - Step-by-Step Guide
Learn how to backup and restore your DynamoDB table(s) including cross-account and cross-region backups and restore.

#db #serverless

The long way towards resilience - Part 1 🍼

Uwe Friedrichsen starts a new series in his blog, this time about building resilient systems. He starts with the cornerstone question: what is resilience in the first place? And no, a software system merely being able to work for a long time is not resilience.

The long way towards resilience - Part 1
What is resilience?

#resilience

Long-running processes in Modern Architectures 👷‍♂️

When I was working for a bank, we dealt with calculating credit scores for businesses. The process was long-running and multistep, requiring waiting, retries and persisting state. That's why I liked this article in the InfoQ blog, mentioning BPMN and all those long-running tactics.

Are You Done Yet? Mastering Long-Running Processes in Modern Architectures
In this article, Bernd Ruecker explores the importance of long-running processes in various applications, particularly in distributed systems. He emphasizes the value of asynchronous communication and explores strategies like Centers of Excellence, along with visual tools like BPMN for enhancing com…

#architecture
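
The waiting, retries and persisted state mentioned for long-running processes above can be illustrated with a minimal sketch. It is not taken from the article: the step names, the JSON state file and the `run_process` helper are all hypothetical, and a real system would more likely rely on a workflow engine and pause between retries.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical step names for a business credit-scoring flow (an assumption).
STEPS = ["collect_documents", "request_bureau_report", "calculate_score"]


def load_state(path: Path) -> dict:
    """Return the persisted process state, or a fresh one if none exists."""
    if path.exists():
        return json.loads(path.read_text())
    return {"next_step": 0, "attempts": 0, "results": {}}


def save_state(path: Path, state: dict) -> None:
    """Persist the state so the process can resume after a crash or a wait."""
    path.write_text(json.dumps(state))


def run_process(path: Path, handlers: dict, max_attempts: int = 3) -> dict:
    """Run the remaining steps in order, checkpointing after every attempt."""
    state = load_state(path)
    while state["next_step"] < len(STEPS):
        name = STEPS[state["next_step"]]
        try:
            result = handlers[name](state["results"])
        except Exception:
            state["attempts"] += 1
            save_state(path, state)
            if state["attempts"] >= max_attempts:
                raise  # give up; the saved state still allows a later resume
            continue  # retry the same step (a real system would wait here)
        state["results"][name] = result
        state["next_step"] += 1
        state["attempts"] = 0
        save_state(path, state)
    return state


if __name__ == "__main__":
    # Each handler receives the results of the steps completed so far.
    demo_handlers = {
        "collect_documents": lambda results: "docs",
        "request_bureau_report": lambda results: "report",
        "calculate_score": lambda results: 700,
    }
    with tempfile.TemporaryDirectory() as tmp:
        final = run_process(Path(tmp) / "state.json", demo_handlers)
        print(final["results"])
```

Because the state is written after every attempt, a restarted process picks up at the step where it stopped instead of starting over.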

WARNING 🇺🇦

The brutal and unjustified war against Ukraine has already been going on for 2 years. If you want to help Ukraine directly, visit this fund.

Big thanks to Nikita, Constantin, Anatoly, Oleksandr, Dima, Pavel B, Pavel, Robert, Roman, Iyri, Andrey, Lidia, Vladimir, August, Roman, Egor, Roman, Evgeniy, Nadia, Daria, Dzmitry, Mikhail, Nikita, Dmytro, Denis and Mikhail for supporting the newsletter. They receive early access to the articles, influence the content and participate in the closed group where we discuss the architecture problems. Join them at Patreon or Boosty!