Hey this is a draft on my understanding of Kafka :)

Apache Kafka is a distributed streaming platform that lets you publish and subscribe to streams of records. Viewed another way, it is an enterprise messaging system. It is fast, horizontally scalable, and fault tolerant. Kafka has four core APIs: the Producer, Consumer, Streams, and Connect APIs.
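
To make the Producer API a bit more concrete, here is a minimal sketch of publishing a single record from Java. The broker address localhost:9092 and the topic name user-events are just assumptions for illustration; swap in your own bootstrap server and topic, and make sure the kafka-clients library is on the classpath.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Assumed broker address; replace with your own bootstrap server.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // Publish one record to an assumed topic named "user-events".
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("user-events", "user-42", "page_view:/home"));
            producer.flush();
        }
    }
}
```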

Apache Kafka uses Apache Zookeeper to maintain and coordinate the Apache Kafka brokers. A version of Apache Zookeeper is bundled with Apache Kafka. 

Use cases

Kafka is used for the following use cases:

  1. Messaging System: Kafka is used as an enterprise messaging system to decouple source and target systems that exchange data. Compared to JMS, Kafka provides high throughput through partitions and fault tolerance through replication (a minimal consumer sketch follows after the notes below).
  2. Web Activity Tracking: tracking user-journey events on a website for analytics and offline data processing.
  3. Log Aggregation: processing logs from various systems, especially in distributed environments with microservices architectures where services are deployed on different hosts. The logs from these systems need to be aggregated and made available in a central place for analysis.
  4. Metrics Collection: Kafka is used to collect metrics from various systems and networks for operations monitoring. Kafka metrics reporters are available for monitoring tools like Ganglia, Graphite, etc.
Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of some of its components.

The Java Message Service (JMS) API is a messaging standard that allows application components based on the Java Platform, Enterprise Edition (Java EE) to create, send, receive, and read messages. It enables distributed communication that is loosely coupled, reliable, and asynchronous.
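
To show the decoupling idea from the messaging use case above, here is a minimal consumer sketch in Java: a target system that subscribes to a topic and reads records independently of whoever produced them. The broker address localhost:9092, the group id analytics-service, and the topic user-events are assumptions for illustration.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("group.id", "analytics-service");         // assumed consumer group name
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("auto.offset.reset", "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Subscribe to the assumed topic and keep polling for new records.
            consumer.subscribe(Collections.singletonList("user-events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                            record.partition(), record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```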

Broker

An instance in a Kafka cluster is called a broker.

In a Kafka cluster, connecting to any one broker gives you access to the entire cluster. The broker we connect to first in order to discover the cluster is known as a bootstrap server. Each broker is identified by a numeric ID within the cluster. Three brokers is a good starting point for a cluster, but there are clusters with hundreds of brokers.

A topic is a logical name to which records are published. Internally, a topic is divided into partitions to which the data is written, and these partitions are distributed across the brokers in the cluster.
For example, if a topic has three partitions and the cluster has three brokers, each broker holds one partition. Data published to a partition is append-only, and each record is assigned an incrementing offset.
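
As a sketch of how such a topic might be created, here is an AdminClient example in Java that asks for three partitions and a replication factor of three, so each of the three brokers ends up with a share of the data. The bootstrap server address and the topic name user-events are assumptions for illustration.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed bootstrap server

        try (AdminClient admin = AdminClient.create(props)) {
            // Assumed topic name; three partitions spread across the brokers,
            // each partition replicated to three brokers for fault tolerance.
            NewTopic topic = new NewTopic("user-events", 3, (short) 3);
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```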