Dockerfile for Apache Kafka
The image is available directly from https://registry.hub.docker.com/
http://wurstmeister.github.io/kafka-docker/
- install docker
- modify the KAFKA_ADVERTISED_HOST_NAME in docker-compose.yml to match your docker host IP (Note: Do not use localhost or 127.0.0.1 as the host IP if you want to run multiple brokers.)
- if you want to customize any Kafka parameters, simply add them as environment variables in docker-compose.yml, e.g. in order to increase the message.max.bytes parameter set the environment to KAFKA_MESSAGE_MAX_BYTES: 2000000. To turn off automatic topic creation set KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'false'
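The two examples above suggest a naming convention: take the Kafka property name, upper-case it, replace dots with underscores, and prefix it with KAFKA_. The sketch below reverses that mapping so you can check which property a variable would target. The helper name env_to_property is hypothetical, and the rule is inferred from the examples rather than taken from the image's startup script.

```shell
# Hypothetical helper: derive the Kafka property name from an environment
# variable name, assuming the convention implied by the examples above.
env_to_property() {
  name="${1#KAFKA_}"            # drop the KAFKA_ prefix
  printf '%s\n' "$name" \
    | tr '[:upper:]' '[:lower:]' \
    | tr '_' '.'                # lowercase, then underscores to dots
}

env_to_property KAFKA_MESSAGE_MAX_BYTES          # message.max.bytes
env_to_property KAFKA_AUTO_CREATE_TOPICS_ENABLE  # auto.create.topics.enable
```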
Start a cluster:
docker-compose up -d
The default docker-compose.yml can launch multiple Kafka brokers, giving each broker a unique port number. Depending on your use case this might not be desirable. If you need to use specific ports and broker ids, modify the docker-compose configuration accordingly. To add more brokers, run:
docker-compose scale kafka=3
The alternative docker-compose-single-broker.yml can be used to deploy a single Kafka broker:
docker-compose -f docker-compose-single-broker.yml up
If you don't specify a broker id in your docker-compose file, it will automatically be generated (see https://issues.apache.org/jira/browse/KAFKA-1070). This allows scaling up and down. In this case it is recommended to use the --no-recreate option of docker-compose to ensure that containers are not re-created and thus keep their names and ids.
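As a rough sketch of how scaling and --no-recreate might be combined, the wrapper below passes both to docker-compose up. It assumes a docker-compose version whose up command accepts --scale (used here in place of the separate scale command shown earlier) and a service named kafka. The function name scale_kafka and the DRY_RUN switch are hypothetical.

```shell
# Hypothetical wrapper: scale the kafka service without re-creating
# containers that already exist. Set DRY_RUN=1 to print the command
# instead of running it.
scale_kafka() {
  brokers="$1"
  ${DRY_RUN:+echo} docker-compose up -d --no-recreate --scale "kafka=$brokers"
}
```

For example, running scale_kafka 3 would bring the cluster to three brokers, and setting DRY_RUN=1 first lets you inspect the command before it touches any containers.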
If you want Kafka to create topics automatically at startup, modify the scripts\kafka\kafka_init.sh script to include the topics you want to create. The script is run by a Kafka init container upon startup. The default script creates all necessary topics with a replication factor of 1. If you want to change the replication factor, modify the script accordingly.
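The contents of the init script are not shown here, so the following is only an illustration of what a topic-creation line with a different replication factor could look like. It assumes Kafka's kafka-topics.sh tool is on the PATH inside the init container and that the Kafka version accepts --bootstrap-server. The function name create_topic, the BOOTSTRAP_SERVER variable with its kafka:9092 default, and the DRY_RUN switch are all hypothetical.

```shell
# Hypothetical helper for an init script: create one topic.
# BOOTSTRAP_SERVER defaults to kafka:9092, which is an assumed address.
# Set DRY_RUN=1 to print the command instead of running it.
create_topic() {
  topic="$1"; partitions="$2"; replication="$3"
  ${DRY_RUN:+echo} kafka-topics.sh --create \
    --bootstrap-server "${BOOTSTRAP_SERVER:-kafka:9092}" \
    --topic "$topic" \
    --partitions "$partitions" \
    --replication-factor "$replication"
}
```

A call such as create_topic orders 3 2 would request three partitions with two replicas each. Kafka rejects a replication factor larger than the number of available brokers, so a value above 1 only works once the cluster has been scaled accordingly.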
You can configure the advertised hostname in different ways:
- explicitly, using KAFKA_ADVERTISED_HOST_NAME
- via a command, using HOSTNAME_COMMAND, e.g. HOSTNAME_COMMAND: "route -n | awk '/UG[ \t]/{print $$2}'"
When using commands, make sure you review the "Variable Substitution" section in https://docs.docker.com/compose/compose-file/
If KAFKA_ADVERTISED_HOST_NAME is specified, it takes precedence over HOSTNAME_COMMAND.
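The precedence rule can be pictured with a small shell sketch. This is not the image's actual startup logic, only an illustration of the behaviour described above; the function name resolve_advertised_host is hypothetical. Note that in docker-compose.yml a literal dollar sign is written as $$, so the awk program in the example above reaches the container as {print $2}.

```shell
# Hypothetical sketch of the precedence described above:
# an explicit KAFKA_ADVERTISED_HOST_NAME wins; otherwise the output of
# HOSTNAME_COMMAND is used.
resolve_advertised_host() {
  if [ -n "${KAFKA_ADVERTISED_HOST_NAME:-}" ]; then
    printf '%s\n' "$KAFKA_ADVERTISED_HOST_NAME"
  elif [ -n "${HOSTNAME_COMMAND:-}" ]; then
    eval "$HOSTNAME_COMMAND"    # run the user-supplied command
  fi
}
```

With both variables set, the function prints the explicit hostname and never runs the command, which matches the rule that KAFKA_ADVERTISED_HOST_NAME overrides HOSTNAME_COMMAND.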
For AWS deployment, you can use the Metadata service to get the container host's IP:
HOSTNAME_COMMAND=wget -t3 -T2 -qO- http://169.254.169.254/latest/meta-data/local-ipv4
Reference: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html