Kafka: The Definitive Guide

Neha Narkhede, Gwen Shapira, Todd Palino

Genres: Programming

Year of publication: 2019

Year of reading: 2020

My rating: Normal

Number of reads: 1

Total pages: 320

Summary (pages): 11

Original language of publication: English

Translations to other languages: Russian, Chinese

General Description

A 320-page book in 11 chapters. Alongside the text, it includes many diagrams and charts, as well as embedded code snippets. The difficulty level is intermediate. Each chapter ends with a recap of a few sentences.

Brief Overview

The first two chapters can be considered introductory. The first chapter introduces Kafka: the pub/sub principle, its niche, and areas of application. The second chapter covers installing Kafka in detail, from choosing an OS and installing ZooKeeper to memory, disk, network, and processor requirements. Together, these two introductory chapters account for about 20% of the book.

Since Kafka is built on the pub/sub principle, it is logical that producers and consumers are discussed right after installation. This is the focus of the next two chapters. I’m not a Kafka expert, so I can’t judge how deep the coverage goes, but I can note some of the areas covered: creating and configuring consumers and producers, synchronous and asynchronous message sending, serialization, delivery guarantees, and working with offsets.

The next chapter goes deeper into Kafka’s internal architecture. It discusses replication mechanisms, working with indexes, failure handling, and more.

Then comes a chapter on reliable data delivery. It opens with reliability guarantees, using ACID as a reference point, and devotes a significant portion to replication and the configuration options that control it.

Next is a chapter on building data pipelines. It starts with a discussion of the requirements for such systems, followed by an examination of Kafka Connect, working with data from MySQL and Elasticsearch, and a few words on alternatives to Kafka Connect.

The following chapter focuses on replicating data between Kafka clusters using MirrorMaker. It covers migration scenarios, backup, and fault tolerance across geographically distributed clusters.

After that, there are two chapters on Kafka administration and monitoring. I won’t go into detail here; those who are interested can read them for themselves. Overall, these chapters seemed neither boring nor overly complex.

Finally, the last chapter is dedicated to stream processing. It covers the basics of Kafka Streams and building real-time data processing applications.

Opinion

A decent book on Kafka. Even though I hadn’t worked with Kafka before reading it, the material was understandable and informative. Unfortunately, much of it is easy to forget without practical experience, but the notes I took while reading helped me refresh my memory quickly. If you’re a backend programmer and plan to keep growing as a specialist, I would recommend this book, even if you’re already familiar with another message broker.