Architecture Weekly #41
Architecture Weekly Issue #41. Articles, books, and playlists on architecture and related topics. Split by sections, highlighted with complexity: 🤟 means hardcore, 👷‍♂️ is technically applicable right away, 🍼 is an introduction to the topic or an overview. Now on Telegram as well.
WARNING 🇺🇦
It's already been 249 days since Russia's crazy, brutal, and unjustified war against Ukraine began. We condemn this war and want it to stop ASAP. We continue this newsletter so you can advance your skills and help the millions of Ukrainian people in any way possible. If you want to help directly, visit this fund.
Video
Kafka Consumer Lag Monitoring 👷‍♂️
Consumer lag is the difference between the latest offset written to a partition by producers and the offset the consumer has reached. This lag matters when you process near-real-time data: if it grows too large, the data may be stale by the time it is consumed and effectively useless. That's why you need to monitor the consumer lag of Kafka clusters. Sematext explains the causes of lag and suggests its own monitoring solution for detecting it.
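To make the definition concrete, here is a minimal sketch of the lag arithmetic in plain Python. It does not use any Kafka client: the function names and the offset dictionaries are hypothetical stand-ins for whatever your monitoring tool reads from the cluster.

```python
from typing import Dict


def consumer_lag(log_end_offsets: Dict[int, int],
                 committed_offsets: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: latest written offset minus the consumer's offset."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Assumption: a partition without a committed offset counts as fully unread.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end_offset - committed, 0)
    return lag


def total_lag(log_end_offsets: Dict[int, int],
              committed_offsets: Dict[int, int]) -> int:
    """Sum of the lag over all partitions of a topic."""
    return sum(consumer_lag(log_end_offsets, committed_offsets).values())
```

An alert would typically fire on the total, or on a single partition whose lag keeps growing.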
#monitoring #kafka
Message delivery and deduplication strategies 👷‍♂️
We spoke about the Outbox pattern last week. Continuing the message delivery narrative, I am sharing an article on the pitfalls of relying on a unique ID per message to achieve idempotency. In short, it requires transactionality, which can be tricky in a distributed system. More details inside.
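The transactionality requirement can be illustrated with a small sketch. It assumes the simplest possible setup, where the deduplication table and the business data live in the same database (here an in-memory SQLite store), so recording the message ID and applying its effect commit or roll back together. The table names and the `handle` function are hypothetical and not taken from the article.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a dedup table and one business row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE balance (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)")
    conn.execute("INSERT INTO balance (id, amount) VALUES (1, 0)")
    conn.commit()
    return conn


def handle(conn: sqlite3.Connection, message_id: str, amount: int) -> bool:
    """Apply a message at most once; return False if it was a duplicate."""
    try:
        with conn:  # one transaction: commits on success, rolls back on error
            # The primary key rejects an ID that has already been processed.
            conn.execute("INSERT INTO processed (message_id) VALUES (?)", (message_id,))
            conn.execute("UPDATE balance SET amount = amount + ? WHERE id = 1", (amount,))
        return True
    except sqlite3.IntegrityError:
        return False
```

Once the side effect lives in a different system than the dedup table, this single transaction is no longer available, which is where the difficulties start.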
#messaging #idempotency
Presto Speed Up with Alluxio Local Cache at Uber 🤟
Analytics at big companies such as Bolt or Uber is a load-heavy task. Uber runs half a million analytical queries a day against its set of Presto clusters. But even load balancing and splitting the load between on-demand and scheduled jobs do not fulfil all the requirements. Uber introduced local caching with the Alluxio cache library, together with consistent hashing for nodes, cache filters and cache metadata, to solve some of the issues raised.
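Consistent hashing matters for a per-node cache because it keeps most keys on the same node when the set of nodes changes, so cached data stays useful. Below is a generic sketch of a hash ring with virtual nodes. It is not Uber's or Presto's implementation, and the class, node and key names are hypothetical.

```python
import bisect
import hashlib
from typing import List, Tuple


class HashRing:
    """Consistent hash ring; assumes at least one node is given."""

    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring: List[Tuple[int, str]] = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first point after the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, self._hash(key)) % len(self._points)
        return self._ring[idx][1]
```

When a node is removed, only the keys it owned move to other nodes; every other key keeps its assignment.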
#presto #bigdata #performance
Thoughtworks Technology Radar 👷‍♂️
Thoughtworks published a new edition of the Technology Radar this week. Important highlights include Camunda, Svelte, and Kotlin Gradle DSL in Trial. Threat Modelling finally received the Adopt verdict, alongside Cognitive Load for teams. Check out the full radar below!
#techradar #radar
Dynamic Security Testing in CI/CD pipeline 🍼
Dynamic security testing is the practice of running an app and trying to find and exploit potential vulnerabilities in it. Following the 'shift-left' approach, we want to conduct security testing as early as possible in the development lifecycle. Grab a short note by Akira Brand on strategies for integrating DAST and on approaches to avoid.
#security #cicd
What is OpenTelemetry? 🍼
Observability is an important property of a software system. OpenTelemetry is a standard plus a set of libraries and software components intended to abstract obtaining monitoring data and sending it to a collection solution. Find out about its architecture, principles and current state on the page.
#observability
TiDB architecture overview 🤟
At Bolt we are migrating from MySQL to TiDB for the sake of scalability and storage efficiency. Get to know what TiDB offers and how it works under the hood!
#database
A different flavor of the distributed transaction 👷‍♂️
In this talk Martin Stefanko demonstrates a Java library that implements the Saga pattern for you in a very convenient way. Basically, if you need to perform three actions in a distributed transaction, the library helps you do so with minimal coding effort, handling the rollbacks and the necessary context. Watch the full video!
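For readers new to the pattern, the core idea of a saga can be sketched in a few lines: each action is paired with a compensation, and when a step fails, the compensations of the steps already completed run in reverse order. This is a generic in-process illustration, not the API of the library from the talk. `run_saga` and the `Step` type are hypothetical names.

```python
from typing import Callable, List, Tuple

# A step is an (action, compensation) pair of zero-argument callables.
Step = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run the steps in order; on failure, undo the completed ones in reverse."""
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # The failed step did not complete, so only earlier steps are undone.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True
```

A real implementation also has to persist saga state and retry compensations, which is the part such libraries take off your hands.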
#distributedsystems
List of foundational distributed systems papers 🤟
Murat has been a great source of overviews of distributed systems papers. I am sharing a list of foundational papers on time, consensus and other topics that are absolutely necessary to grok distributed systems well.
#distributedsystems #whitepaper
Failure Types 🍼
Dominik Tornow, whose newsletter I am subscribed to, shared his own article about transient, intermittent and permanent failures and how they can also be classified in the spatial dimension, i.e. where in the system they should be handled. A good guide to handling application errors.
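One practical consequence of the temporal classification is that only failures expected to go away are worth retrying. The sketch below assumes the caller can already tell the two kinds apart. The exception classes and `call_with_retry` are hypothetical names, not taken from the article.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """A failure expected to disappear on its own, e.g. a timeout."""


class PermanentError(Exception):
    """A failure that will repeat on every attempt, e.g. invalid input."""


def call_with_retry(operation: Callable[[], T],
                    attempts: int = 3,
                    delay_seconds: float = 0.0) -> T:
    """Retry transient failures; let permanent ones propagate immediately."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # retry budget exhausted, escalate to the caller
            time.sleep(delay_seconds)
    raise ValueError("attempts must be at least 1")
```

A `PermanentError` is deliberately not caught, so it reaches whichever layer is able to handle it.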
#applicationarchitecture #errorhandling
Like the newsletter? Consider helping to run it at Patreon or Boosty. Patrons and Boosty subscribers of a certain level also get access to a private Architecture Community. Big thanks to Nikita, Anatoly, Oleksandr, Dima, Pavel and Robert for already supporting the newsletter.