Architecture Weekly #92

Architecture Weekly Issue #92. Articles, books, and playlists on architecture and related topics. Split by sections and marked by complexity: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🍼 is an introduction to the topic or an overview. Now on Telegram as well.

WARNING 🇺🇦

It's already been a year and a half since Russia started its crazy, brutal and unjustified war against Ukraine. We condemn this war and want it to stop ASAP. We continue this newsletter so you can advance your skills and help the millions of Ukrainian people in any way possible. If you want to help directly, visit this fund.

Big thanks to Nikita, Anatoly, Oleksandr, Dima, Pavel B, Pavel, Robert, Roman, Iyri, Andrey, Lidia, Vladimir, August, Roman, Egor, Roman, Evgeniy, Nadia, Daria and Dzmitry for supporting the newsletter. They receive early access to the articles, influence the content and participate in the closed group where we discuss the architecture problems. They also see my daily updates on all the things I am working on. Join them at Patreon or Boosty!  

Highlights

Fitness functions: yay or nay? 🍼

"Fitness function" is a term coined by Neal Ford in the book "Building Evolutionary Architectures". Fitness functions can really help in governing software architecture, but what are they precisely? What are their pros and cons? Find out in my new video!
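To make the idea a bit more tangible, here is a minimal sketch of what one fitness function could look like: an automated check that fails when a layering rule is broken. It is my own illustration, not an example from the book or the video. The module names, the `SOURCES` code base and the `RULES` table are all hypothetical, and a real project would read sources from disk and run the check in CI.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level package names imported by a piece of source code."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Only absolute imports are considered in this sketch.
            found.add(node.module.split(".")[0])
    return found


def layering_violations(
    sources: dict[str, str], forbidden: dict[str, set[str]]
) -> list[tuple[str, str]]:
    """List (module, forbidden_import) pairs that break the layering rules."""
    violations = []
    for module, source in sources.items():
        layer = module.split(".")[0]  # assumption: first name segment is the layer
        banned = forbidden.get(layer, set())
        for imported in sorted(imported_modules(source) & banned):
            violations.append((module, imported))
    return violations


# Hypothetical code base: module name -> source text.
SOURCES = {
    "domain.order": "import dataclasses\nfrom infrastructure.db import Session\n",
    "infrastructure.db": "import sqlite3\n",
}

# Hypothetical rule: the domain layer must not depend on infrastructure.
RULES = {"domain": {"infrastructure"}}

if __name__ == "__main__":
    for module, imported in layering_violations(SOURCES, RULES):
        print(f"{module} must not import {imported}")
```

Wired into a test suite, a check like this turns an architectural decision into something the build can enforce continuously instead of a rule that lives only in a wiki page.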

#video #architecture #governance

Post Mortem on Cloudflare Control Plane and Analytics Outage 👷‍♂️

A little over a week ago, Cloudflare had an outage that lasted more than 24 hours. They shared a long Post Mortem blog post. In short, one of their 3 datacenters went down due to a power failure. Cloudflare had prepared for some unavailability in this DC, but had never tested having it fully down, and only during the disaster did they discover that some critical services were deployed only to this DC. Fascinating story inside :)

Post Mortem on Cloudflare Control Plane and Analytics Outage
Beginning on Thursday, November 2, 2023, at 11:43 UTC Cloudflare’s control plane and analytics services experienced an outage. Here are the details

#disasterrecovery #incident

Architecture Kata Entries 👷‍♂️

Architecture Kata is an exercise in which several teams of architects design a solution for a given problem. This practice is extremely useful for improving architecture skills. In this repo you will find the outcomes of several katas and can learn some tricks from the results.

SoftwareArchitectureResources/Resources/OReillyKata.md at main · tekiegirl/SoftwareArchitectureResources
A collection of resources for supporting and learning about software architecture - tekiegirl/SoftwareArchitectureResources

#architecturekata

Follow-Up

The architecture of today's LLM applications

Yeah-yeah, large language models, ChatGPT, bla-bla. But what technical components are inside the applications that leverage LLMs? Nicole Choi from GitHub delves into the architecture of modern LLM apps, showing the 3 big parts: the UI for user input and output, embeddings, and the LLM host and API. Find an awesome post here.

The architecture of today’s LLM applications
Here’s everything you need to know to build your first LLM app and problem spaces you can start exploring today.

#llm #ai

Towards Modern Development of Cloud Applications 🤟

This paper critiques the prevalent microservices-based architecture for distributed applications, highlighting its tendency to conflate logical (code writing) and physical (code deployment) boundaries. The authors propose a new programming methodology that separates these aspects. Their approach entails writing applications as logical monoliths and relying on an automated runtime for distribution and deployment decisions. According to the authors, this method reduces application latency (by up to 15×) and cost (by up to 9×) compared to traditional practices.

#paper #distributedsystems

Why you should probably be using SQLite 👷‍♂️

SQLite is famous for its use in mobile apps, and also for its test-to-code ratio. But you have probably never thought of using SQLite as a distributed database. With the recent development of LiteFS, you might reevaluate your choice. Find out why inside.

Why you should probably be using SQLite
Where you store your application data has enormous impacts on your entire application. There are implications on the entire stack based on what you decide to...

#db

Scaling Raft 🤟

I am sharing a video from Hydra, the conference on distributed systems. In this video Konstantin Osipov, Director of Engineering at ScyllaDB, shows what strong consistency means in an eventually consistent database, explains what Raft is, and covers how Raft is implemented in ScyllaDB.

#db #video

Our Journey Adopting SPIFFE/SPIRE at Scale

This post from Uber's Blog discusses the shift in security architecture towards Zero Trust networking in response to the limitations of traditional perimeter-based security models. This shift has been driven by the rapid adoption of distributed system architectures, decomposition into microservices, and migration to cloud-based infrastructures. The article details Uber's implementation of SPIFFE (Secure Production Identity Framework For Everyone) and SPIRE (SPIFFE Runtime Environment), highlighting their role in providing a framework for Zero Trust security and supporting heterogeneous deployments across multiple cloud platforms.

#security

Where we're going, we don't need threads

Testing distributed systems is much more challenging than making sure some piece of code does what's expected. The order of events is uncertain, which makes the tests unpredictable, and enforcing an arbitrary order does not really help. Instead, you can actually simulate your system... More in the blog post by Red Planet Labs!
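As a rough illustration of the simulation idea, here is a toy sketch (my own, not Red Planet Labs' implementation). Two hypothetical clients perform a non-atomic read-then-write increment, and a single-threaded scheduler driven by a seeded random generator decides which client takes the next step. The same seed always replays the same interleaving, so a failing order of events can be reproduced on demand.

```python
import random


def simulate(seed: int) -> tuple[int, list[str]]:
    """Run two non-atomic increments under a seeded, single-threaded scheduler."""
    rng = random.Random(seed)  # the only source of nondeterminism
    register = {"value": 0}

    def client(name: str):
        # Each yield is a point where the scheduler may switch to another client.
        seen = register["value"]
        yield f"{name} read {seen}"
        register["value"] = seen + 1
        yield f"{name} wrote {seen + 1}"

    runnable = [client("A"), client("B")]
    trace = []
    while runnable:
        task = rng.choice(runnable)  # the scheduler picks who runs next
        try:
            trace.append(next(task))
        except StopIteration:
            runnable.remove(task)  # this client has finished
    return register["value"], trace


if __name__ == "__main__":
    for seed in range(5):
        final, trace = simulate(seed)
        print(seed, final, trace)
```

A final value of 1 means the interleaving lost an update. Because the whole run is a pure function of the seed, rerunning `simulate` with that seed gives the exact same trace, which is what makes this style of test debuggable.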

Where we’re going, we don’t need threads: Simulating Distributed Systems
Testing distributed systems is hard. You probably already knew that. If you’ve spent even a little time writing distributed systems, you know that the relevant algorithms and techniques are complex…

#distributedsystems #testing