Apache Kafka has always been high on my list of things to explore, but since there are quite a few things high on that list, Kafka never actually made it to the very top. Until just recently, that is, when I was asked to give the broker a try and see whether or not it meets a project’s needs. Two projects, to be honest. You should have seen my face when I heard it.
I compiled Apache Kafka from the sources, connected it to Spark Streaming and even attempted to answer a few questions on StackOverflow (How to use Kafka in Flink using Scala? and How to monitor Kafka broker using jmxtrans?), not to mention reading tons of articles and watching videos about the tool. I developed a pretty strong sense of which use cases are the sweet spot for Apache Kafka.
With the team at Codilime I’m developing the DeepSense.io platform, where we have just used Ansible to automate deployment. We’ve also been evaluating Docker and/or Vagrant. All to ease the deployment of DeepSense.io.
That’s the moment when these two needs converged - exploring Apache Kafka and Docker (among the other tools) for three separate projects! Amazing, isn’t it? I could finally explore how Docker might ease exploration of products and deployment. I knew Docker could ease my developer life, but it’s only now that I really saw it. I would now dockerize everything. When I was told about the images wurstmeister/kafka and wurstmeister/zookeeper I couldn’t have been happier. Running Apache Kafka using Docker finally became a no-brainer and such a pleasant experience.
I then thought I’d share the love so it’s not only mine and others could benefit from it, too.
Since I’m on Mac OS X, the steps to run Apache Kafka using Docker rely on boot2docker, a lightweight Linux distribution for platforms that don’t natively support Docker, namely Mac OS X and Windows.
You’re going to use the images wurstmeister/kafka and wurstmeister/zookeeper.
You can run containers off the images in background or foreground. Depending on your Unix skills, it means one or two terminals. Let’s use two terminals, one for each server - Apache Kafka and Apache Zookeeper. I’m going to explain the role of Apache Zookeeper in another blog post.
Here come the steps to run Apache Kafka using Docker. It’s assumed you’ve got the boot2docker and docker tools installed.
➜ ~ boot2docker version
Boot2Docker-cli version: v1.7.1
Git commit: 8fdc6f5
➜ ~ docker --version
Docker version 1.7.1, build 786b29d
I’m a big fan of homebrew and highly recommend it to anyone using Mac OS X. Plenty of ready-to-use packages are just a brew install away, docker and boot2docker included.
Running Kafka on two Docker images
(Mac OS X and Windows users only) Execute boot2docker up to start the tiny Linux core on Mac OS.
➜ ~ boot2docker up
Waiting for VM and Docker daemon to start...
.o
Started.
Writing /Users/jacek/.boot2docker/certs/boot2docker-vm/ca.pem
Writing /Users/jacek/.boot2docker/certs/boot2docker-vm/cert.pem
Writing /Users/jacek/.boot2docker/certs/boot2docker-vm/key.pem
To connect the Docker client to the Docker daemon, please set:
export DOCKER_HOST=tcp://192.168.59.104:2376
export DOCKER_CERT_PATH=/Users/jacek/.boot2docker/certs/boot2docker-vm
export DOCKER_TLS_VERIFY=1
(Mac OS X and Windows users only) Execute $(boot2docker shellinit) to have the terminal set up and let docker know where the tiny Linux core is running (via boot2docker). You have to do the step in any terminal you open to work with Docker so the exports above are set. Should you face communication issues with docker commands, remember the step.
➜ ~ $(boot2docker shellinit)
Writing /Users/jacek/.boot2docker/certs/boot2docker-vm/ca.pem
Writing /Users/jacek/.boot2docker/certs/boot2docker-vm/cert.pem
Writing /Users/jacek/.boot2docker/certs/boot2docker-vm/key.pem
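If docker commands fail with connection errors, a quick check is whether the three variables printed by boot2docker up are present in the current terminal. The following is a minimal sketch of such a check; docker_env_ready is a hypothetical helper name, not something boot2docker or docker provides.

```shell
# Hypothetical helper (not part of boot2docker or docker): report which of
# the variables exported by $(boot2docker shellinit) are missing.
# Returns 0 when all three are set and non-empty, 1 otherwise.
docker_env_ready() {
  _missing=0
  [ -n "${DOCKER_HOST:-}" ]       || { echo "missing: DOCKER_HOST" >&2; _missing=1; }
  [ -n "${DOCKER_CERT_PATH:-}" ]  || { echo "missing: DOCKER_CERT_PATH" >&2; _missing=1; }
  [ -n "${DOCKER_TLS_VERIFY:-}" ] || { echo "missing: DOCKER_TLS_VERIFY" >&2; _missing=1; }
  return "$_missing"
}
```

Called right after $(boot2docker shellinit), docker_env_ready should print nothing and return 0; in a fresh terminal it lists whichever variables are not set.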
Run docker ps to verify the terminal is configured properly for Docker.
➜ ~ docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
No containers are running at this time. It’s going to change soon once you start the containers for Zookeeper first and then Kafka.
Create an account on Docker Hub and execute docker login to store the credentials. With the step you don’t have to repeat them for docker pull to pull images off the public hub of Docker images. (Public images such as the two used here can also be pulled without logging in, so the step is optional.) Think of the Docker Hub as the GitHub for Docker images. Refer to the documentation Using the Docker Hub for more up-to-date information.
Run docker pull wurstmeister/zookeeper to pull the Zookeeper image off Docker Hub (might take a few minutes to download):
➜ ~ docker pull wurstmeister/zookeeper
Pulling repository wurstmeister/zookeeper
a3075a3d32da: Download complete
...
840840289a0d: Download complete
e7381f1a45cf: Download complete
5a6fc057f418: Download complete
Status: Downloaded newer image for wurstmeister/zookeeper:latest
You will see hashes of respective layers printed out to the console. It’s expected.
Execute docker pull wurstmeister/kafka to pull the Kafka image off Docker Hub (might take a few minutes to download):
➜ ~ docker pull wurstmeister/kafka
latest: Pulling from wurstmeister/kafka
428b411c28f0: Pull complete
...
422705fe88c8: Pull complete
02bb7ca441d8: Pull complete
0f9a08061516: Pull complete
24fc32f98556: Already exists
Digest: sha256:06150c136dcfe6e4fbbf37731a2119ea17a953c75902e52775b5511b3572aa1f
Status: Downloaded newer image for wurstmeister/kafka:latest
Verify that the two images - wurstmeister/kafka and wurstmeister/zookeeper - are downloaded. From the command line execute docker images:
➜ ~ docker images
REPOSITORY               TAG      IMAGE ID       CREATED        VIRTUAL SIZE
wurstmeister/kafka       latest   24fc32f98556   3 weeks ago    477.6 MB
wurstmeister/zookeeper   latest   a3075a3d32da   9 months ago   451 MB
You can now execute docker run --name zookeeper -p 2181:2181 -t wurstmeister/zookeeper in one terminal to boot Zookeeper up. Remember $(boot2docker shellinit) if you’re on Mac OS X or Windows.
➜ ~ docker run --name zookeeper -p 2181:2181 -t wurstmeister/zookeeper
JMX enabled by default
Using config: /opt/zookeeper-3.4.6/bin/../conf/zoo.cfg
2015-07-17 19:10:40,419 [myid:] - INFO [main:QuorumPeerConfig@103] - Reading configuration from: /opt/zookeeper-3.4.6/bin/../conf/zoo.cfg
...
2015-07-17 19:10:40,452 [myid:] - INFO [main:ZooKeeperServer@773] - maxSessionTimeout set to -1
2015-07-17 19:10:40,464 [myid:] - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181
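This gives you Zookeeper listening on port 2181. Check it out by telnetting to port 2181 at the IP address of the Docker host (the boot2docker VM on Mac OS X). The check relies on the port being published as 2181:2181, as in the transcript above; with a bare -p 2181 Docker picks a random host port instead.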
➜ ~ telnet `boot2docker ip` 2181
Trying 192.168.59.103...
Connected to 192.168.59.103.
Escape character is '^]'.
Execute docker run --name kafka -e HOST_IP=localhost -e KAFKA_ADVERTISED_PORT=9092 -e KAFKA_BROKER_ID=1 -e ZK=zk -p 9092 --link zookeeper:zk -t wurstmeister/kafka in another terminal. Remember $(boot2docker shellinit) if you’re on Mac OS X or Windows.
➜ ~ docker run --name kafka -e HOST_IP=localhost -e KAFKA_ADVERTISED_PORT=9092 -e KAFKA_BROKER_ID=1 -e ZK=zk -p 9092 --link zookeeper:zk -t wurstmeister/kafka
[2015-07-17 19:32:35,865] INFO Verifying properties (kafka.utils.VerifiableProperties)
[2015-07-17 19:32:35,891] INFO Property advertised.port is overridden to 9092 (kafka.utils.VerifiableProperties)
[2015-07-17 19:32:35,891] INFO Property broker.id is overridden to 1 (kafka.utils.VerifiableProperties)
...
[2015-07-17 19:32:35,894] INFO Property zookeeper.connect is overridden to 172.17.0.5:2181 (kafka.utils.VerifiableProperties)
[2015-07-17 19:32:35,895] INFO Property zookeeper.connection.timeout.ms is overridden to 6000 (kafka.utils.VerifiableProperties)
[2015-07-17 19:32:35,924] INFO [Kafka Server 1], starting (kafka.server.KafkaServer)
[2015-07-17 19:32:35,925] INFO [Kafka Server 1], Connecting to zookeeper on 172.17.0.5:2181 (kafka.server.KafkaServer)
[2015-07-17 19:32:35,934] INFO Starting ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2015-07-17 19:32:35,939] INFO Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT (org.apache.zookeeper.ZooKeeper)
...
[2015-07-17 19:32:36,093] INFO Awaiting socket connections on 0.0.0.0:9092. (kafka.network.Acceptor)
[2015-07-17 19:32:36,095] INFO [Socket Server on Broker 1], Started (kafka.network.SocketServer)
[2015-07-17 19:32:36,146] INFO Will not load MX4J, mx4j-tools.jar is not in the classpath (kafka.utils.Mx4jLoader$)
[2015-07-17 19:32:36,172] INFO 1 successfully elected as leader (kafka.server.ZookeeperLeaderElector)
[2015-07-17 19:32:36,253] INFO Registered broker 1 at path /brokers/ids/1 with address 61c359a3136b:9092. (kafka.utils.ZkUtils$)
[2015-07-17 19:32:36,270] INFO [Kafka Server 1], started (kafka.server.KafkaServer)
[2015-07-17 19:32:36,318] INFO New leader is 1 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
You’re now a happy user of Apache Kafka on your computer using Docker. Check the status of the containers using docker ps:
➜ ~ docker ps
CONTAINER ID   IMAGE                    COMMAND                CREATED         STATUS         PORTS                                                NAMES
0b34a9927004   wurstmeister/kafka       "/bin/sh -c start-ka   2 minutes ago   Up 2 minutes   0.0.0.0:32769->9092/tcp                              kafka
14fd32558b1c   wurstmeister/zookeeper   "/bin/sh -c '/usr/sb   4 minutes ago   Up 4 minutes   22/tcp, 2888/tcp, 3888/tcp, 0.0.0.0:32768->2181/tcp   zookeeper
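The PORTS column shows that a port published with only the container side given (as in -p 9092) ends up on a host port that Docker picks, here 32769 for Kafka. The docker port command prints that mapping for a single container port. Below is a minimal sketch for extracting just the host port from it; host_port_from_mapping and published_port are hypothetical helper names, and the sketch assumes docker port prints a single address:port line.

```shell
# Hypothetical helper: strip everything up to the last ':' from an
# address such as "0.0.0.0:32769", leaving only the host port.
host_port_from_mapping() {
  printf '%s\n' "${1##*:}"
}

# Hypothetical wrapper: ask Docker for the mapping of a container port
# and print only the host port.
# Usage: published_port kafka 9092
published_port() {
  host_port_from_mapping "$(docker port "$1" "$2")"
}
```

With the containers listed above, published_port kafka 9092 would be expected to print 32769. On Mac OS X and Windows, clients outside Docker connect to that port at the address reported by boot2docker ip rather than at localhost.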
Once you’re done with your journey into Apache Kafka, stop the containers using docker stop kafka zookeeper (or docker stop $(docker ps -q) if the only running containers are kafka and zookeeper).
➜ ~ docker stop kafka zookeeper
kafka
zookeeper
Running docker ps shows no running containers afterwards:
➜ ~ docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
The containers are only stopped, not removed, so they are still ready to be booted up again. Use docker ps -a to see them:
➜ ~ docker ps -a
CONTAINER ID   IMAGE                    COMMAND                CREATED        STATUS                        PORTS   NAMES
7dde25ff7ec2   wurstmeister/kafka       "/bin/sh -c start-ka   15 hours ago   Exited (137) 16 seconds ago           kafka
b7b4b675b9c0   wurstmeister/zookeeper   "/bin/sh -c '/usr/sb   16 hours ago   Exited (137) 5 seconds ago            zookeeper
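Booting the stopped containers up again is a matter of docker start, in the same order as before: Zookeeper first, then Kafka, since the kafka container is linked to zookeeper. A minimal sketch follows; restart_kafka_stack is a hypothetical function name, and the DOCKER variable is a hypothetical override added here only so the function can be dry-run without a Docker daemon.

```shell
# Hypothetical helper: start the two stopped containers in dependency order.
# DOCKER is an illustrative override (e.g. DOCKER=echo for a dry run);
# it defaults to the real docker client.
restart_kafka_stack() {
  # Zookeeper has to be running before the linked Kafka container starts.
  "${DOCKER:-docker}" start zookeeper || return 1
  "${DOCKER:-docker}" start kafka
}
```

Unlike docker run, docker start reuses the existing containers together with the options they were created with, so the names, links and environment variables do not have to be repeated.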
(Mac OS X and Windows users only) Finally, stop the boot2docker VM using boot2docker down.
Summary
With these two Docker images - wurstmeister/kafka and wurstmeister/zookeeper - you can run Apache Kafka without changing your local workstation much to install it together with the necessary components like Apache ZooKeeper. You don’t need to worry about upgrading the software and its dependencies, except for docker itself (and boot2docker if you’re lucky to be on Mac OS). That saves you from spending time on installation and keeps your machine and the software working properly. Moreover, the Docker images can be deployed to other machines, where they give the software inside the same consistent environment.
Let me know what you think about the topic [1] of the blog post in the Comments section below or contact me at jacek@japila.pl. Follow the author as @jaceklaskowski on Twitter, too.
[1] pun intended