This Vagrant project creates a VM that runs a single-node big data cluster.
The versions of the components that the VM is provisioned with are defined in the file scripts/versions.sh.
The versions are as follows:
- Hadoop 2.7.7
- Hive 2.3.4
- HBase 1.4.8
- Spark 2.0.2
- Scala 2.11.12
- ZooKeeper 3.5.7
- Phoenix 4.14.0
- Kafka 2.4.1
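As a rough illustration of how a versions file of this kind is typically laid out, the sketch below pins two of the versions listed above as shell variables and derives an archive name from one of them. The variable names (HADOOP_VERSION, SPARK_VERSION, HADOOP_ARCHIVE) are hypothetical and are not taken from scripts/versions.sh; check that file for the names it really uses.

```shell
#!/bin/sh
# Hypothetical variable names, for illustration only.
# The real names live in scripts/versions.sh.
HADOOP_VERSION=2.7.7
SPARK_VERSION=2.0.2

# Derive an archive file name from the pinned version so that a
# version bump only needs to touch one line.
HADOOP_ARCHIVE="hadoop-${HADOOP_VERSION}.tar.gz"

echo "Hadoop archive: ${HADOOP_ARCHIVE}"
echo "Spark version:  ${SPARK_VERSION}"
```

Keeping every version in one file means a provisioning script can source it once and reuse the values everywhere.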
- Download and install VirtualBox.
- Download and install Vagrant.
- Clone this project with Git, and change directory (cd) into the project directory.
- Run vagrant up to create the VM.
- Run vagrant ssh to get into your VM.
- Run vagrant destroy when you want to destroy and get rid of the VM.
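Before running vagrant up, it can help to confirm that the prerequisites are actually on your PATH. The following is a minimal sketch, not part of this project; the helper name have_tool is made up for the example, and it assumes that VirtualBox installed its VBoxManage command-line tool in a directory on your PATH.

```shell
#!/bin/sh
# have_tool is a hypothetical helper for this example.
# It returns 0 if the named command is on PATH, non-zero otherwise.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# VBoxManage is the command-line tool that ships with VirtualBox.
for tool in VBoxManage vagrant git; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

If any tool is reported as missing, install it (or fix your PATH) before creating the VM.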
A list of available Vagrant boxes is shown at http://www.vagrantbox.es.
The Vagrant box is downloaded to the ~/.vagrant.d/boxes directory. On Windows, this is C:/Users/{your-username}/.vagrant.d/boxes.
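To see which boxes have already been downloaded, you can list that directory. The sketch below assumes a Unix-like shell; the function name boxes_dir is invented for the example. It also honors the VAGRANT_HOME environment variable, which Vagrant uses in place of ~/.vagrant.d when it is set.

```shell
#!/bin/sh
# boxes_dir is a hypothetical helper for this example.
# VAGRANT_HOME overrides the default ~/.vagrant.d location when set.
boxes_dir() {
    printf '%s\n' "${VAGRANT_HOME:-$HOME/.vagrant.d}/boxes"
}

dir="$(boxes_dir)"
if [ -d "$dir" ]; then
    # One entry per downloaded box.
    ls "$dir"
else
    echo "no boxes directory at $dir yet"
fi
```

The vagrant box list command reports the same information in a friendlier format once Vagrant is installed.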
To stop the VM and preserve all setup/data within the VM:
vagrant halt
or
vagrant suspend
Issue a vagrant up command again to restart the VM from where you left off.
To completely wipe the VM so that the vagrant up command gives you a fresh machine:
vagrant destroy
Then issue the vagrant up command as usual.
Here are some useful links to the various web UIs:
- YARN resource manager
- HBase
- Job history
- HDFS
- Spark history server
- Spark context UI (if a Spark context is running)
Substitute the IP address of the container or VirtualBox VM for the hostname hadoop
if necessary.
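The host substitution can be sketched as a small script that builds each UI address from a single host value. The port numbers below are the upstream default ports for these component versions and are an assumption; this project's configuration may use different ones. The helper ui_url and the UI_HOST variable are names invented for the example.

```shell
#!/bin/sh
# Host name or IP address of the VM; "hadoop" is the name used above.
UI_HOST="${UI_HOST:-hadoop}"

# ui_url is a hypothetical helper: $1 = host, $2 = port.
ui_url() {
    printf 'http://%s:%s/\n' "$1" "$2"
}

# Ports are upstream defaults (assumed, not confirmed for this project).
echo "YARN resource manager: $(ui_url "$UI_HOST" 8088)"
echo "HBase:                 $(ui_url "$UI_HOST" 16010)"
echo "Job history:           $(ui_url "$UI_HOST" 19888)"
echo "HDFS:                  $(ui_url "$UI_HOST" 50070)"
echo "Spark history server:  $(ui_url "$UI_HOST" 18080)"
echo "Spark context UI:      $(ui_url "$UI_HOST" 4040)"
```

To target the VM by address instead of by name, set UI_HOST to the VM's IP address before running the script.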