Architecture Weekly #119

Architecture Weekly Issue #119. Articles, books, and playlists on architecture and related topics. Split by sections, highlighted with complexity: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🍼 is an introduction to the topic or an overview. Now on Telegram and Substack as well.

Highlights

Post Mortem of Google Cloud deletion of customer's data 🍼

UniSuper, an Australian client of Google Cloud, discovered last month that all of their data, VMs, and network configurations had vanished overnight. After the initial shock they contacted Google Cloud and managed to restore everything thanks to the backups in Cloud Storage. This week GCP shared the post mortem for this case. TL;DR: don't use deprecated internal tools :)

#cloud #disasterrecovery

How Kubernetes picks which pods to delete during scale-in 👷‍♂️

Scaling down means deciding which pods to delete. The answer is not obvious and, moreover, it is not documented. Riccardo Padovani studied the source code to figure out which rules ReplicaSet applies when deleting pods. Find them in the post.

How Kubernetes picks which pods to delete during scale-in
Have you ever wondered how K8s choose which pods to delete when a deployment is scaled down? Given it is not documented, I dived in the source code to learn.
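To give a feel for what such rules look like, here is a minimal sketch of a ranking function. It is not taken from the post: the `Pod` class, `deletion_key`, and `pick_victims` are hypothetical names, and the criteria are a simplified subset of how I read the upstream controller code (unscheduled first, then by phase, readiness, deletion cost, and age). The real controller has more criteria, so check the post for the full list.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Lower rank means "deleted earlier". Simplified, assumed ordering.
PHASE_RANK = {"Pending": 0, "Unknown": 1, "Running": 2}


@dataclass
class Pod:
    """Hypothetical, stripped-down view of a pod."""
    name: str
    node: Optional[str]   # None means the pod is not scheduled yet
    phase: str            # "Pending", "Unknown" or "Running"
    ready: bool
    deletion_cost: int    # pod-deletion-cost annotation value, 0 if unset
    created: datetime


def deletion_key(pod: Pod):
    """Sort key: pods that sort first are removed first."""
    return (
        pod.node is not None,       # unscheduled pods go first
        PHASE_RANK[pod.phase],      # Pending, then Unknown, then Running
        pod.ready,                  # not-ready before ready
        pod.deletion_cost,          # lower cost before higher cost
        -pod.created.timestamp(),   # newer pods before older ones
    )


def pick_victims(pods, count):
    """Return the names of the `count` pods that would be deleted first."""
    return [p.name for p in sorted(pods, key=deletion_key)[:count]]
```

With one pending pod, one running but not-ready pod, and two ready pods of different ages, `pick_victims(pods, 3)` returns the pending pod, then the not-ready one, then the newer of the two ready pods.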

#k8s #scalability

13 Tips to Improve PostgreSQL Insert Performance 👷‍♂️

Ingest performance is crucial for many PostgreSQL use cases such as application monitoring, analytics, and IoT monitoring. All of them are append-heavy, so optimizing your database's ingestion speed is vital. Engineers at Timescale have extensive experience in this kind of performance optimization, and in this article they show how to improve PostgreSQL insert performance.

13 Tips to Improve PostgreSQL Database Insert Performance
Some of these may surprise you, but all 13 ways will improve ingest (INSERT) performance using PostgreSQL and TimescaleDB.

#performance #db

Follow-Up

Mapping the Mind of an LLM 🍼

We typically think of LLMs as black boxes, which is why this post from the authors of the Claude model is so important. The main takeaway is that inside the model there are neurons and features: a neuron can be activated while working on many different tasks, whereas a feature is specific to one job. So we can reason about the behavior of an LLM and modify its behavior by enforcing the related features. A great advance, to my taste!

Mapping the Mind of a Large Language Model
We have identified how millions of concepts are represented inside Claude Sonnet, one of our deployed large language models. This is the first ever detailed look inside a modern, production-grade large language model.
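A toy sketch of the neurons-versus-features idea, under heavy assumptions: the 4-neuron space, the feature names, and the direction vectors below are all made up for illustration, and a real model works at a vastly larger scale. It only shows that one neuron can take part in several features, and that pushing along a feature direction changes the activation.

```python
# Each feature is a direction over 4 toy "neurons".
# Names and numbers are invented for this example.
FEATURES = {
    "bridge": [0.5, 0.5, 0.0, 0.0],
    "bug":    [0.0, 0.5, 0.5, 0.0],
}


def compose(strengths):
    """Activation vector = sum of strength * feature direction."""
    activation = [0.0] * 4
    for name, strength in strengths.items():
        for i, weight in enumerate(FEATURES[name]):
            activation[i] += strength * weight
    return activation


def steer(activation, feature, delta):
    """Push an activation vector along one feature direction."""
    return [a + delta * w for a, w in zip(activation, FEATURES[feature])]


bridge_only = compose({"bridge": 2.0})   # [1.0, 1.0, 0.0, 0.0]
bug_only = compose({"bug": 2.0})         # [0.0, 1.0, 1.0, 0.0]
steered = steer(bridge_only, "bug", 2.0)  # [1.0, 2.0, 1.0, 0.0]
```

Neuron 1 is active in both `bridge_only` and `bug_only`, so looking at that neuron alone does not tell you which concept is present, while the feature strengths do.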

#llm #ml

Designing an incident response process 🍼

You probably know about on-call practices, observability solutions, and pagers. Grafana Labs put together a holistic view of the whole incident process: it explains why we have alerts in the first place, then goes on to explain how to design responses and write post mortems. Great guide!

Call me, maybe: designing an incident response process | Grafana Labs
An incident response process outlines the steps your team needs to take when an incident occurs. Use the tips and cheat sheet in this post to help formulate yours.

#observability

Serverless Event Sourcing & CQRS 👷‍♂️

There are many talks on Event Sourcing and even more on CQRS, but separately. Content on doing the two together is much harder to find. Luckily, the Serverless Advocate blog has a comprehensive guide to both, with the source code of a system implementing both concepts. Great stuff!

Serverless Event Sourcing & CQRS (Part 1)
An example of event sourcing and CQRS in serverless, with code examples in TypeScript and the AWS CDK. In Part 1 we cover Event Sourcing.
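For readers new to the combination, here is a minimal in-memory sketch of how the two ideas fit together. It is not the article's code (that one uses TypeScript and the AWS CDK); the bank-account domain and the names `Event`, `EventStore`, `BalanceView`, and `handle_withdraw` are assumptions made for illustration. The write side validates a command against state rebuilt from events, and the read side keeps a separate projection for queries.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    account_id: str
    kind: str     # "deposited" or "withdrawn"
    amount: int


@dataclass
class EventStore:
    """Append-only log; subscribers are notified of every new event."""
    events: list = field(default_factory=list)
    subscribers: list = field(default_factory=list)

    def append(self, event):
        self.events.append(event)
        for handler in self.subscribers:
            handler(event)


class BalanceView:
    """Read model (query side): a projection kept up to date from events."""

    def __init__(self):
        self.balances = {}

    def apply(self, event):
        sign = 1 if event.kind == "deposited" else -1
        current = self.balances.get(event.account_id, 0)
        self.balances[event.account_id] = current + sign * event.amount


def handle_withdraw(store, account_id, amount):
    """Command side: rebuild state from the log, validate, append an event."""
    balance = sum(
        e.amount if e.kind == "deposited" else -e.amount
        for e in store.events
        if e.account_id == account_id
    )
    if amount > balance:
        raise ValueError("insufficient funds")
    store.append(Event(account_id, "withdrawn", amount))


store = EventStore()
view = BalanceView()
store.subscribers.append(view.apply)
store.append(Event("acc-1", "deposited", 100))
handle_withdraw(store, "acc-1", 30)
# view.balances is now {"acc-1": 70}; the log holds both events.
```

In a serverless setup the log and the projection would typically live in separate managed services, but the split between the command handler and the read model stays the same.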

#serverless

Amazon MemoryDB: A Fast and Durable Memory-First Cloud Database 🤟

This is not an ad for a new service, but Murat's review of a new paper. Redis (or now Valkey) is a well-known caching solution and... not much more than that, because of consistency issues when running in a cluster setup. AWS decided to fix that and offers the result as MemoryDB. Find the technical details inside!

Amazon MemoryDB: A Fast and Durable Memory-First Cloud Database
Key-value stores have simple semantics which make it cumbersome to perform complex operations involving multiple keys. Redis addresses this ...

#db

Database scaling and optimizations in a microservice architecture 👷‍♂️

Beyond using indexes, there are other performance tactics to employ, such as sharding, replication, and caching. Find the post on LinkedIn about those tactics, with configs for MongoDB and PostgreSQL.

Database scaling and optimization in a microservices architecture
Introduction Databases are critical for information systems, especially in high-load microservices architectures. Properly designed and optimized bases ensure high performance, scalability, and stability of the system.

#db #performance

WARNING 🇺🇦

The brutal and unjustified war against Ukraine has been going on for 2 years already. If you want to help Ukraine directly, visit this fund.

Big thanks to Nikita, Anatoly, Oleksandr, Dima, Pavel B, Pavel, Robert, Roman, Iyri, Andrey, Lidia, Vladimir, August, Roman, Egor, Roman, Evgeniy, Nadia, Daria, Dzmitry, Mikhail, Nikita, Dmytro, Denis and Mikhail for supporting the newsletter. They receive early access to the articles, influence the content and participate in the closed group where we discuss the architecture problems. They also see my daily updates on all the things I am working on. Join them at Patreon or Boosty!